SEO Rant: 2008

Saturday, 25 October 2008

Free links on Blogger

Blogger, like many other popular services, offers an option to post by email. This lets you post when you're out and about. You pick a special email address, and when you send something there, it appears on your blog; the subject becomes the post title, and the body becomes the post. Blogger's service doesn't offer HTML, but it will auto-link any URLs that you put in the email subject.

According to Blogger's help:

The format of the email address is username.secretword@blogger.com . Note that this email address must be kept secret.

Of course, a few of these have leaked into the public domain, onto mailing lists, which is why you see some sites at blogspot filled with ads for cialis and the like. One of these, for battlefornaboo.blogspot.com, is "robitforrock dot starwars at blogger dot com". Enjoy!

Free eBooks

Yeah, it's been a while. There's a great archive of eBooks around - these ones are actually good, copies of published prints on computing. Grab the dir with:

wget -r -np http://debian.yaako.org/ebook/

Friday, 6 June 2008

How not to keyword spin

Unique content's a good thing, right? Not always.

To get well-written text, you need to invest some time, or pay someone skilled to do it for you. Both are expensive and slow exercises, so many SEOs choose to take a shortcut and "spin" articles. This generally involves taking one source text, and altering the language inside it to "create" a new article. The better spinning programs are aware of things like grammar rules, word frequencies in various languages, set phrases and idioms, and so forth. The worse - and predominant - spinners simply perform synonym replacement, which produces this kind of mapping:

"We walked to the large house" would be replaced by "We ambulated to the gigantic residence".

The latter looks ugly, is awkward to read, and generally doesn't fulfil any kind of quality standard - though it is different, thus helping create unique content, and the meaning remains pretty much the same. The massive failing here, though, that utterly defeats the goal of those using spinners, is how trivial the product is to detect.

In every language, there are common words, rare words, and everything between. The probability of individual words occurring in a piece of text is fairly constant, being skewed a bit depending on the document's type and domain (agricultural reports will be more likely to contain terms about farming, horticulture, and plant and chemical proper nouns, for example). The core set of terms, and their frequencies, will remain the same.

The direct synonym replacement used by typical spinning programs (spinners) has two problems. Firstly, one must bear context in mind when picking alternative words. For example, "junk" (used as a noun) can also be:

"boat, clutter, debris, discard, dope, dreck, dump, flotsam, garbage, jetsam, jettison, litter, refuse, rubbish, salvage, scrap, ship, trash, waste"

Depending on whether we're using this word to talk about a sailing vessel or a piece of rubbish, we can divide this set of alternatives into two distinct groups with different meanings. Simply using a thesaurus to pick a random replacement word will often change the meaning of a sentence. "I thought your product was a heap of junk" is not semantically equivalent to "I thought your product was a heap of boat". Note how simplistic substitution also makes the sentence grammatically incorrect in this case.
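To make the first problem concrete, here's a minimal sketch of the kind of context-blind substitution described above. It isn't taken from any real spinner; the tiny thesaurus, the function name and the word lists are all made up for illustration.

```python
import random

# Hypothetical mini-thesaurus. The "junk" entries are a subset of the list
# quoted above, deliberately mixing the "boat" sense with the "rubbish" sense.
THESAURUS = {
    "junk": ["boat", "ship", "rubbish", "trash", "scrap"],
    "walked": ["ambulated", "strolled"],
    "large": ["gigantic", "big"],
    "house": ["residence", "home"],
}

def naive_spin(text, rng):
    """Swap every word found in the thesaurus for a randomly chosen synonym.

    No part-of-speech tagging, no word-sense check, no frequency check:
    this is the cheap approach, not a recommendation.
    """
    out = []
    for word in text.split():
        choices = THESAURUS.get(word.lower())
        out.append(rng.choice(choices) if choices else word)
    return " ".join(out)

# Example call (output depends on the random seed):
#   naive_spin("I thought your product was a heap of junk", random.Random())
```

Run it a few times on the "heap of junk" sentence and sooner or later it hands back a heap of boat, because nothing in the code knows which sense of the word was meant.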

The second problem with direct synonym replacement is that it doesn't care about the probability distribution of words in a language. This always leads to the inclusion of rarer words, and exclusion of more common ones. In the example above, we used ambulate instead of walk; the former is a comparatively rare word. Using it makes the sentence more awkward, and harder to read (protip: always use the simplest language that you can).

Just to show how easy spun pages are to spot, let's find one, and take it apart, then see how abnormal it is. We're going to first find how frequent words are in English, and then use them to compare a spun article to a previous post on my blog.

The reference frequency list we use to represent general English comes from the British National Corpus. This is in British English, so we'll make things fairer by Anglicising the spun document, making "color" into "colour", "center" into "centre", and "-ize" into "-ise".
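For the curious, the Anglicising step could look something like the sketch below. This is an assumption about how one might do it, not the exact process used for the data here; the word map, the exception list and the function names are hypothetical, and inflected forms ("colors", "optimized") are not handled.

```python
import re

# Hypothetical word-level mapping; a real list would be much longer.
US_TO_UK = {"color": "colour", "center": "centre"}

# Words ending in "-ize" that must not be rewritten (far from exhaustive).
IZE_EXCEPTIONS = {"size", "prize", "seize", "maize", "capsize"}

def anglicise_word(word):
    """Return a British spelling for a single word, lower-cased."""
    lower = word.lower()
    if lower in US_TO_UK:
        return US_TO_UK[lower]
    if lower.endswith("ize") and lower not in IZE_EXCEPTIONS:
        return lower[:-3] + "ise"
    return lower

def anglicise(text):
    """Lower-case the text, drop punctuation and Anglicise each word."""
    words = re.findall(r"[a-z'-]+", text.lower())
    return " ".join(anglicise_word(w) for w in words)
```

The exception list matters: a blind "-ize" to "-ise" swap would happily turn "size" into "sise", which would then look like a very rare word indeed.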

We'll mathematically compare both the spun and unspun text against this reference model of the English language. This can be done by first building a list of the words used in a document, and then counting how many times each one occurs in the document. Dividing the count of a word by the total number of words gives the probability that any random token from the document will match that word. We'll also have a list of these probabilities from the British National Corpus (BNC). To compare the two, we'll take the absolute difference between measured and reference probabilities for each term, and express that as a percentage of the reference likelihood. As these percentages get pretty high, and to reduce the impact of any anomalous data, we'll also measure the log of the difference measure.
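The original number crunching was done by hand and in a spreadsheet, so treat the following as a rough sketch of the method rather than the code that produced the tables. The function names and the tokenising regex are my own assumptions, and words missing from the reference list are simply skipped. The log is taken base 10 on the difference as a ratio (not a percentage), which is consistent with the logdiff column in the tables.

```python
import math
import re
from collections import Counter

def word_probabilities(text):
    """Map each word to (count, count / total number of tokens)."""
    tokens = re.findall(r"[a-z][a-z'-]*", text.lower())
    counts = Counter(tokens)
    total = sum(counts.values())
    return {word: (count, count / total) for word, count in counts.items()}

def compare_to_reference(text, reference):
    """Return rows of (word, freq, prob, refprob, difference, logdiff).

    `reference` maps a word to its probability in the reference corpus.
    `difference` is |prob - refprob| / refprob as a ratio; multiply by 100
    for a percentage. Rows come back sorted with the biggest difference first.
    """
    rows = []
    for word, (count, prob) in word_probabilities(text).items():
        ref_prob = reference.get(word)
        if not ref_prob:
            continue  # assumption: words without a reference value are skipped
        difference = abs(prob - ref_prob) / ref_prob
        log_diff = math.log10(difference) if difference > 0 else float("-inf")
        rows.append((word, count, prob, ref_prob, difference, log_diff))
    return sorted(rows, key=lambda row: row[4], reverse=True)
```

Feed it a document and a dictionary of reference probabilities and the head of the returned list is the set of words that stick out the most.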

Spun document

Taken from Good Articles Recommend Top Rank by SEO, an almost illegible and probably spun document (it may possibly be badly translated by someone with a newfound love for thesauri, though given the topic domain - SEO - this seems only minimally likely).

word	freq	prob	bncprob	difference	logdiff
seo	5	0.0125	9.00E-09	138888791.10%	6.142667198
spell-check	1	0.0025	1.80E-08	13888789.11%	5.142664384
overusing	1	0.0025	1.90E-08	13157794.88%	5.119183112
copywriting	2	0.005	5.70E-08	8771829.78%	4.943090196
scruffily	1	0.0025	5.90E-08	4237188.04%	4.627077738
overeat	1	0.0025	8.80E-08	2840809.03%	4.453442039
well-crafted	1	0.0025	1.38E-07	1811494.22%	4.258036952
proofread	1	0.0025	3.05E-07	819572.12%	3.913587174

Full dataset

We can see a few words sticking out here. Some give an indication of the document's topic (SEO, copywriting) while others are quite bizarre (scruffily, overeat). The difference column shows the magnitude of frequency variation from what's expected - a difference of zero means that a word occurs just as frequently in this text as it does in the British National Corpus; a difference of 100% means that the word occurs twice as often as expected. Note how the words that stick out hugely aren't that congruent as a set - overeating and scruffiness have little to do with copywriting, spell checking and proofreading.

Differences measured this way will be skewed rapidly by any rarer words that come into a document, and every document that has something to say will have to cover some topics using rarer words that don't fit the curve perfectly - this would be expected. So, using this measure, a non-zero difference score is inevitable; logs have been taken to smooth differences in scale. What is significant is where and how much the differences are.

The mean difference from standard English for the spun document is 947349.85%, and the mean of the logs of the difference measure is ~1.46. These numbers show us how different the words in the spun document are from what would be expected in general language.

Unspun document

Taken from my overall vaguely positive Ubercart review.

word	freq	prob	bncprob	difference	logdiff
cron	2	0.002444988	9.00E-09	27166431.27%	5.434032591
www	2	0.002444988	1.90E-08	12868256.85%	5.109519721
php	1	0.001222494	2.90E-08	4215396.07%	4.624838387
firewall	1	0.001222494	3.80E-08	3216989.14%	4.507449594
uploading	1	0.001222494	3.90E-08	3134499.74%	4.496168238
todos	1	0.001222494	4.80E-08	2546762.24%	4.405988403
upload	1	0.001222494	5.60E-08	2182924.83%	4.33903878
metadata	1	0.001222494	5.80E-08	2107648.07%	4.323798095
poin	1	0.001222494	5.80E-08	2107648.07%	4.323798095

Full dataset

We can guess from the top differences here that the document is related to computing and fairly technical. The biggest differences in word probability are in the range of, say, 1e6 - 2.7e7, a lot less than the top four in the spun document, which were from 8e6 all the way up to 1.4e8. The mean difference (524475.55%) is roughly half that of the spun document, and the mean of the log differences is again significantly smaller (1.18).

Comparison

For good measure, and to illustrate this point clearly, here's a graph. The red line is the spun document, the blue one the unspun one. For a document that completely followed average word frequency, you'd see a line at y=0 (i.e., a flatline).

Visual comparison of terms in a spun and unspun document

This shows that the spun document uses English consistently more unusually than the human-written (unspun) document; the red line is higher than the blue one, and the higher a point is, the more it varies from the British National Corpus's survey of English usage. For reference, that covers ~100 million words in about 4000 texts, so it's a fairly good source of comparison data.

We all know about term frequencies (TF); it seems fair to guess that search engines have models of these, and that it's not computationally intense for them to use TF as one tool to distinguish spam from useful content. When one can pick out spun content so easily (this system took ~40 minutes of coding and juggling in Excel to make it look pretty, for one guy), there's really no point bothering to add it to your site.

Of course, a sophisticated document spinner is definitely possible to construct. My point here is, the cheap and common ones only provide a massive bright flag that your site is spam. Avoid them.

Data

A full set of all the produced data, in a pretty and readable format, including the full keyword data, and a large graph, is available online here. The texts actually used for comparison are here (unspun) and here (spun).

Further information

If you feel like exploring English word frequencies and getting into that long tail, I can't recommend anything more highly than Wordcount.org.

Tuesday, 3 June 2008

No search engine traffic here please, and half your visitors can f*** off too

You gotta love sites that hate spiders as much as this. Open up IE (if you have it) and visit www.fdms.com. No problem, right? I mean, it's a pretty bad site, with some clear SEO problems, sucky design, lack of content, etc, but you can see something, at least (hopefully).

So, that's how things are meant to work - far from stunning, admittedly, but vaguely functional. Now, switch to something quite popular, say Firefox or Opera, and try the same. Oh! What's this?

IE4, you say.. that's what, a decade old this year? And I need a newer browser than that? Visitors aren't even given a chance to try their luck and saunter on in regardless. If developers are designing sites with IE4 as their target browser, something has to be seriously wrong. It fails with plenty of standards, meaning lots of exclusionist code bodges on the inside, and has had less than 0.5% market share with savvy users (see this list at W3Schools); even ignoring the bias that might be in a list gathered there, IE4 hasn't shipped with any OS since Windows 98, 10 years ago. What the fuck are these guys on?

Anyway, luckily, hilarity ensues. Looking at the URL of this error page, we have http://www.fdms.com/ns_win.asp. Now, if they were detecting our browser using JavaScript, we'd still be able to see some kind of content somewhere - but we get HTTP-redirected to an error page. This looks like HTTP level user-agent sniffing. I wonder what the search engines think of that?
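If you want to check this kind of thing yourself rather than swapping browsers, a few lines of script will do it. This is only a sketch: the helper names are mine, the user-agent strings are illustrative examples of what each client might send, and there's no guarantee that any given site still behaves the way it did when this was written.

```python
import urllib.request

# Illustrative user-agent strings (assumptions, not captured from real traffic).
USER_AGENTS = {
    "ie": "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1)",
    "firefox": ("Mozilla/5.0 (Windows; U; Windows NT 5.1; en-GB; "
                "rv:1.8.1.14) Gecko/20080404 Firefox/2.0.0.14"),
    "googlebot": ("Mozilla/5.0 (compatible; Googlebot/2.1; "
                  "+http://www.google.com/bot.html)"),
}

def build_request(url, user_agent):
    """Build a GET request that claims to come from the given browser."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})

def final_url(url, user_agent, timeout=10):
    """Fetch the URL, following HTTP redirects, and return where we ended up."""
    request = build_request(url, user_agent)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.geturl()

def sniffing_report(url):
    """Map each user-agent name to the URL it finally lands on."""
    return {name: final_url(url, ua) for name, ua in USER_AGENTS.items()}

# Example (makes real network requests):
#   for name, landed in sniffing_report("http://www.fdms.com/").items():
#       print(name, "->", landed)
```

If the landing URLs differ by user-agent alone, the server is sniffing at the HTTP level, and a spider is getting whatever the "wrong browser" crowd gets.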

Fantastic! We need JavaScript and IE to browse this site, and if we're a spider, we can get lost. So what's the result?

Abject failure - 6 results, all of which save one are error pages and login links. Fantastic - keeping FDMS's developers where they belong - AS FAR AWAY FROM THE WEB AS POSSIBLE!

Monday, 12 May 2008

Free fashion related links

Run a fashion site? Want some free linkage? Leave me a comment, with the keyword you want - make sure I can contact you via it, perhaps include your URL (or a link to somewhere that has a contact form for you, don't make me think) - and I'll hook you up with some good deep links inside text, from a white-hat site.

Saturday, 10 May 2008

Screw Lipsum

People love (I mean LOVE) to fill up their mediocre templates with text from Lipsum:

Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum

But this stuff is so dull! Surely it's preferable to at least fill your templates with something more legible and interesting:

Knight Rider, a shadowy flight into the dangerous world of a man who does not exist. Michael Knight, a young loner on a crusade to champion the cause of the innocent, the helpless in a world of criminals who operate above the law.

Children of the sun, see your time has just begun, searching for your ways, through adventures every day. Every day and night, with the condor in flight, with all your friends in tow, you search for the Cities of Gold. Ah-ah-ah-ah-ah... wishing for The Cities of Gold. Ah-ah-ah-ah-ah... some day we will find The Cities of Gold. Do-do-do-do ah-ah-ah, do-do-do-do, Cities of Gold. Do-do-do-do, Cities of Gold. Ah-ah-ah-ah-ah... some day we will find The Cities of Gold.

Malevole's Text Generator has the filler text hookup.

Friday, 9 May 2008

Ubercart initial review

Ubercart is an open-source ecommerce solution. It's not osCommerce derived, which immediately fills me with hope. It's got a silly name, but great looking demo sites. It's based on Drupal, and plugs in as a module.

The install process looks promising; Step 1's a remote installer! This needs FTP details, a MySQL login, and the hostname of the destination site. Great stuff. After setting up vsftpd dual logging in order to work out why it doesn't work, it turns out it's expecting passive mode to just work. Punching a firewall hole for the install source host's IP fixed things, and the installer then runs a few tests and becomes familiar with its environment.

Step 2: configuration. Here we set up an admin login, and are given options to configure Drupal and the cart, mainly asked to add or drop modules. As we have no idea about Ubercart or Drupal at this point, the huge lists of modules to install or ignore are fairly overwhelming here, but they are hidden to start off with, so that's alright. Slightly disturbing is the choice not to install PayPal, Order, Payment, Shipping and Stock modules by default!

So, after confirming this, a progress bar appears, and chugs along happily as data is pumped onto your server. Watching Ubercart remote install is a bit like watching insects having sex; there's definitely a bit of protrusion and insertion. The install process uploads PgSql files, even though it's working via MySQL; interesting. Files also go up individually. Surely uploading a zip then unpacking it server side would be faster? It already checked phpinfo so it knows we have the gzip and bz2 libs compiled in; there's no excuse.

Anyway, after a good looking upload that went smoothly to 100% without any ftp errors, a message appears that the installation has failed. The user account we set up doesn't work, either. Panic stations!

Luckily, the Drupal basic config screen could be much worse, and signing up as the first user is easy. Trying to administer the site, we find:

PHP Fatal error: Allowed memory size of 16777216 bytes exhausted (tried to allocate 7680 bytes) in /var/www/x/html/sites/all/modules/cck/content.install on line 158, referer: http://www.x.com/?q=user/1

Bah! That could've been easily detected and catered for, or alerts raised, by the installer. Well, never mind; it's easily fixed.

So, we're in, kinda. Looks like a basic Drupal install more than anything else, and there are certainly no pointers as to what to do next. Well, the status section has some flagged status items - easily corrected, again. The Remote installer could've checked for .htaccess function and fixed the rewrites itself, but no problem. Adding a crontab for the site's user doesn't satisfy the cron poin - is cron job detection flaky?

And then we get to.. this horrendous thing - each line is a link to a page of configuration settings to adjust so that you're up and running:

For some reason, the crash reporting is set by default to report your URL, sales, order and product volumes. URL is understandable, but that other data? No thanks!

Product data contains no link to actually add an item. How do we do this? Via the "create content" link at the top of the admin leftnav, obviously! This screen is again filled with detailed settings - fine for the advanced user if they need them, but skipped to start off with. Ubercart requires you to set your own SKUs, with no option to have them autogenerated. Sigh.

So; now we have a shop, with an item, and not much else. The contact info entered hasn't gone onto a nice contact info page; in fact there's no way of contacting the site owners yet. More manual work. The currency symbol seems dead set on following the price, too (e.g. 2.50$) - very odd! It'd be nice to get rid of the login links everywhere.

At this point, there's no add to basket button on the product page. The item has a setting for "how many to add into the basket at once"; the modules that look appropriate are set up; and the guide on the site has been followed, until it got to the point where 20 pages of lists of todos are required to proceed. Given experiences so far, it looks like some special tweak has to be done to let people add to the god damn basket.

I want a shop that installs, I set up basic metadata, add products to, and people go buy. I don't want to learn Drupal; I don't want to hunt for hours for settings to enable basic ecommerce system requirements; I don't want to force my visitors to log in; I just want to sell things, online.

Ubercart - you looked hot, but it turned out that it was all make-up. I'm glad it rained.

Saturday, 26 April 2008

SearchMe

SearchMe have a nice, shiny interface. They're based just down the road from Google, and are crawling the net at the moment. I've had quite a fun afternoon checking that I'm at the top for the words I want in this engine too (I am!) and flagging competitor sites as porn. Fantastic stuff. Take a look (watch out, may crash lesser browsers and will definitely break lesser machines):

Searchme Visual Search

They've picked up a great way of building category lists and collecting stats (and indeed feedback) for it, too. Enjoy!

Thursday, 17 April 2008

BT Web Clicks = Bad

BT Web Clicks are sending out an AMAZING, unique offer!

They then proceeded to ask me “what do we rank for on google?”, my response was “your company name, unless you request otherwise”.

They then went on to mentioned that “the man from BT” can get us listed “at the top” of the search engine for “our keywords”.

Interesting! What does Google say about companies offering that?

No one can guarantee a #1 ranking on Google.

So.. BT (British Telecom) are officially offering a large pile of fail - and this is just the tip of the iceberg; inside are hidden £15k/month costs, outrageous consultancy fees, targeting at the meek and lonely, and to top it all, a cornucopia of grammatical errors. Watch out for this one, people - they may be cold calling in your area now.

James Wade has the full scoop: BT Web Clicks review.

Monday, 14 April 2008

Sunday, 13 April 2008

Opera Pagerank widget

If you use Opera (think FF but faster and no memory leaks), you'll probably miss LivePR. There's an almost-as-good replacement available, and it's really easy to install:

Pagerank in Opera

It's well worth spending the extra 30 seconds to set up the custom JavaScript plugin mentioned there, too.

Flash SEO

Flash is the leading vector graphics technology used in thousands of interactive and design-focused websites. According to estimates, over 98% of Internet users have Flash player software installed in their browsers.
Well-created Flash animations can make sites more attractive, provide a better experience for users than text-based content, and allow designers to show off their creativity and skills.
However, Flash websites are usually much slower to load, very often lack the content users are looking for, do not cater for users with disabilities, and cannot achieve the same search engine rankings as their text-based counterparts.
Some of these shortcomings can be overcome, others cannot. The following post deals with both the drawbacks and possible remedies.

1. Shortcomings

1.1 Download speed

Despite the spread of broadband and falling connection prices, the vast majority of Internet users still use slow dial-up connections. Flash is not bandwidth friendly and these users have to wait around for a Flash site to load. Many people are not willing to do that even for the most beautiful Flash site.

1.2 Lack of content

Flash is the medium for creativity and design, not for accommodating large amounts of text or information in general. Not many content-focused sites are likely to use it. The problem is that most people use the Net to search for information. As they would not expect to find much of it on a Flash website, they are likely to avoid them. Content is what most users are looking for, and a website lacking it will be unsuccessful regardless of its appearance.

1.3 Accessibility issues

Flash is unfriendly to the screen readers used by visually impaired users. Screen readers work with plain HTML text and cannot properly read images or text embedded within Flash animations.

1.4 Search engines

Search engines were designed to index and work with HTML documents, not Flash or other non-HTML formats.
Although Google and FAST search engines are now able to crawl some Flash sites, they are still a long way from being able to retrieve and index the content in full. No other search engine has even this very limited ability and so cannot index any information at all.

2. Solutions

Many suggestions abound about how to overcome these shortcomings. A few can partially improve the situation, some are likely to get a site penalised or banned by search engines, and the influence of others on rankings has still to be gauged.

2.1 'Non-solutions'

Doorway pages

Building ‘doorway’ HTML pages no longer works, because search engines now hate them almost as much as they are quick to detect them. In the past, doorway pages were used for ‘keyword stuffing’ and cloaking, and as a result, they trigger anti-spam filters on all important search engines.

<noframes> / <noscript> tag

Another suggested trick is to wrap the Flash content in invisible framesets and fill the <noframes> tag with ‘alternative’, indexable content in plain-text format. Although a true replication of Flash content in HTML format might seem a legitimate use of this tag, search engines are unable - as well as unwilling - to reach this conclusion.
Search engines succeed or fail on their ability to provide useful information. They are unlikely to understand why any site information should be presented in one form to them, and in another to potential users.

CSS invisible layers

Placing content on a Cascading Style Sheets (CSS) layer set to be invisible is yet another risky plan. It also offers search engine spiders content that is different from what human users see and, therefore, could be interpreted as ‘spamming’.
Until recently search engine spiders avoided external CSS style sheets and would have been unable to discover the “display: none” declarations. However, in the past few weeks Googlebot has begun to crawl style sheets, ignoring all “robots” exclusion statements. Of course, it is impossible to be absolutely certain why it was re-programmed this way, but searching for hidden content is a likely reason.

CSS z-index

Another possibility is to add text content to a CSS layer and to position it either off-screen by negative margin, or behind the Flash content by putting both on layers and setting a lower “z-index” value to the content layer. A site’s visitor will see the Flash movie in a browser, but search engine spiders will find the ‘keyword-rich’ text in the source code.
Many experts argue that if the off-screen content truly reflects the Flash content, there cannot be issues with spamming. They might be correct. Then again, they might not. If search engine spiders cannot see the Flash content, they cannot establish whether it really matches the plain-text content. And because they cannot do that, they could never know whether the ‘alternative’ content is there to help them, or to fool them.

2.2 A possibility

HTML version

In theory it is always possible to reproduce a Flash site in a ‘plain’ HTML version and offer search engine spiders and visitors the choice.
In practice it doesn’t seem to be a very cost- and time-efficient or labour-saving solution.

2.3 Recommended solutions

<applet> tag

Using the <applet> element for embedding Flash is an ‘old-school’ technique that has featured on the W3C “deprecated-feature” list for quite a long time.
However, it does allow textual content to be placed between the opening and closing tags, and in the “alternative description” inside the tag itself. That is about the only reason why it is listed here.

<embed> tag

Another tag for embedding Flash movies is the <embed> tag. Although the <embed> tag is now deprecated too, it is the only tag that works equally well across browsers.
Its sibling - the <noembed></noembed> tag – allows alternative text description to be placed in between. This is its legitimate use.

<object> tag

The <object> tag is the latest element for embedding Flash movies. It is recommended by the W3C. However, it is likely to crash some versions of IE 4.01 and is not supported by older versions of NN.
The text alternative is placed between the opening and closing tag.

Macromedia MX features

Macromedia Flash MX 2004 comes with many great features designed to improve accessibility, and as a consequence, also search engine spiderability, such as:

Providing text equivalents for all visual elements via “Name” and “Description” fields. The name field is used for shorter text equivalents (similar to the alt attribute of <img>), the description field is used for longer descriptions (similar to the longdesc attribute in HTML). Both can provide search engine spiders with useful information about the site’s content and relationships between individual site elements.

Specifying reading order of Flash content. When reading the text equivalents in a Flash movie, a spider does not necessarily have to read the content in the order of the visual layout. Specifying the correct order helps spiders associate the content of individual site elements and map the internal structure.

Captioning audio content. Captions added to narrative audio provide spiders with useful information about the audio’s content and its relationship to the rest of the page.

Google and FAST (AllTheWeb / Yahoo!) search engines are able to extract some content from Flash pages. FAST uses the Macromedia Flash search engine software developer's kit (SDK), which was designed to convert text and links from a Flash file into HTML for indexing. What technology Google employs is not known.
Both seem to be able to follow embedded links. It remains unclear to what extent they are able to extract the textual information from a page, or from the text equivalents.
Further, XML and XSL can be shoehorned into a supporting role for Flash SEO.

Links

Both Google and FAST can extract some links from Flash files and, as a result, crawl a certain proportion of their content. It enables them to at least partially map the site’s internal structure and analyse relationships between individual pages. In addition, it allows Google to distribute some of the PageRank allocated to the home page throughout the rest of the site.
This represents an opportunity for well-structured Flash websites with good internal structure to improve their rankings in Google and FAST search engines’ networks.

Page titles and meta descriptions

All search engines use page titles and meta descriptions in conjunction with the page’s content for ranking algorithms.
As most spiders cannot read the content of a Flash file and, therefore, cannot match it to the page’s title and metadata, it is unlikely that even the most carefully written titles and descriptions could significantly improve a site’s ranking.

3. Conclusion

By adopting correct design techniques, it is possible to overcome some of the inherent shortcomings of Flash. A few search engine spiders will then be able to crawl such a site and at least partially index its content, which means an increase in site rankings in these search engines. The site will also become more accessible to individuals with disabilities.
However, even the best optimised Flash website can improve its rankings only in comparison to another Flash website. Regardless of the thoroughness in design, or advancements in search engine retrieval technologies, a Flash website will never outrank a well-optimised HTML site.
Search engines work with text and reward sites that provide it. Therefore, if search engine visibility is a factor, a site cannot be designed in Flash. It is as simple as that.

4. Recommendations

Do not create a website entirely in Flash

Build a website in HTML instead

Use Flash movies as a supporting visual medium for plain text content

Saturday, 12 April 2008

Meta Keywords Wordpress plugin

Ahh.. nothing like a good, theory-centric, hype-free approach to SEO. There's an extremely sensible and (fairly) lightweight Meta Keywords plugin rattling around over at data dump.

It seemed to me that a basic meta keywords tag should be trivial to include. According to Google’s word of mouth advice, meta keyword tags (apart from being mainly unimportant) shouldn’t contain more than six or seven keyphrases (not keywords, keyphrases - more than one word can go there) separated by commas.

Sounds simple, right? This plugin reads your document’s content and title, strips out crap (formatting, useless words such as “the”, invisible characters) and then performs n-gram analysis to get an idea of term frequency. That’s a fancy way of saying that it finds the most oft-used words and phrases in your post. The top six (or seven, or even twenty - you can edit a constant at the top of the plugin to alter this) phrases are then plonked out into a nice meta keywords tag in your HTML header. Easy as pie!
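To give a feel for what that involves, here's a rough Python sketch of the same idea. It is not the plugin's code (that's a WordPress plugin, so PHP); the stopword list, the constant name, the function names and the tie-breaking rule are all assumptions made for illustration. One simplification to be aware of: stopwords are removed before the n-grams are built, so a phrase can span a removed word.

```python
import re
from collections import Counter

MAX_KEYPHRASES = 6  # hypothetical stand-in for the plugin's editable constant
STOPWORDS = {"the", "a", "an", "and", "of", "to", "in", "is", "it",
             "for", "on", "with"}

def extract_keyphrases(title, content, limit=MAX_KEYPHRASES, max_n=3):
    """Return the most frequent 1- to max_n-word phrases in title + content."""
    # Strip tags, lower-case, and keep only alphanumeric words.
    text = re.sub(r"<[^>]+>", " ", f"{title} {content}").lower()
    words = [w for w in re.findall(r"[a-z0-9]+", text) if w not in STOPWORDS]

    # Count every n-gram for n = 1..max_n.
    counts = Counter()
    for n in range(1, max_n + 1):
        for i in range(len(words) - n + 1):
            counts[" ".join(words[i:i + n])] += 1

    # Most frequent first; on ties prefer longer phrases, then alphabetical.
    ranked = sorted(counts.items(),
                    key=lambda item: (-item[1], -len(item[0].split()), item[0]))
    return [phrase for phrase, _ in ranked[:limit]]

def meta_keywords_tag(title, content):
    """Build a comma-separated meta keywords tag from the top phrases."""
    phrases = extract_keyphrases(title, content)
    return '<meta name="keywords" content="{}" />'.format(", ".join(phrases))
```

Calling meta_keywords_tag() with a post's title and body gives you a tag ready to drop into the page header; changing MAX_KEYPHRASES plays the role of the constant mentioned above.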

Go and have a look for yourself, and leech the thing.
Meta Keywords plugin for WordPress

Free backlink analysis

Yeah, you heard that right. Debra at thelinkspiel got a hookup with Receptional. Get over there and see how many authority links you have (if you didn't already know!).

Tool URL: http://labs.receptional.com/links/

User & pass: thelinkspiel

One thing this tool could do with is some kind of drill down into each category, where it showed you the domains that were providing links in each instance, or even the full URIs. The data is a little opaque as it stands. Still - have a try!

Update - if you're after other free backlink analysis tools, you could do worse than:

SEObook backlink tool

Domainpop backlink checker

Thursday, 10 April 2008

T-Mobile UK SEO Audit

Well, I've had this internal document on my desk for a few years, and it's kinda heavy - and certainly not pertinent enough any more to cause damage. So, here's a high-grade commercial SEO Audit, worth somewhere around 1500GBP at the time of production. Enjoy.

T-Mobile UK SEO Audit

I suppose the most shocking thing here is the quality of the site - thankfully it's improved since then - it's really incomprehensible that a company of this proportion failed so epically. Or, well, not.

You might like to take this audit and use it as a framework for your own; some things are well out of date, there are factual errors, and typos, but the formatting works, and the principles are still the same. Let me know what works for you!

Thursday, 27 March 2008

Blatant Plug

The Digital Point forums are full of crazy people who can't read specs, the WebmasterWorld ones full of crazy people with huge egos. However! There's a rather good SEO Forum starting up - http://forums.seohelp.org - run by experts who have maintained a free help service in their spare time over the past four years. Drop in!

Wednesday, 26 March 2008

AMEX advice for idiots

AMEX have been blasting out some fairly self-contradictory advice recently, according to WebProNews. They suggest that small businesses avoid employing SEO guys:

don’t waste money on so-called Search Engine Optimization (S.E.O.) specialists. Search engines are very quick to penalize sites that try to trick their filtering techniques, and once your site has been put on Google’s blacklist, it will take forever to get off.

Obviously the SEO people have been up in arms over this rather broad tarring. While of course there are many terrible SEO firms out there, and even more that over-charge, it's fairly easy for anyone to spot a dodgy outfit thanks to Google's published SEO guidelines.

Anyway, the bit where this gets "interesting" is about here, where AMEX then proceed to dish out SEO advice themselves!

using clean U.R.L.’s like yourdomain.com/store/widgets instead of yourdomain.com/store.php?id=42&categoryID=widgets will increase your chances of getting indexed in a search engine.

Sound advice indeed. Shame about the utterly conflicting points of view presented here, though. Tell you what, AMEX: how about you stick to raping me on chargeback costs, and I'll stick to what I do best, too.

Monday, 17 March 2008

Google forgets everyone's News settings

Google News is great. You can view a bunch of categories of news, add and remove these categories for various regions (e.g. take out US entertainment, add Global politics, that kinda thing), as well as control how many stories show for each section. Plus, you can show stories relevant to a specific keyword (say, "SEO", or "free chocolate"). There's even this great thing where you can drag and drop different news sections to create a custom layout.

Yeah.. got a big downside though. Every once in a while, Google deletes all your settings. Last weekend was just one such occasion!

From: David Allen
Date: Tue, 11 Mar 2008 09:41:27 -0700 (PDT)
Local: Tues, Mar 11 2008 4:41 pm
Subject: My customized News page is gone
I had an elaborately edited personal Google News page, including a key
custom section. It is all gone! I get just the plain vanilla 'stock'
News page. Help!

He's not the only one:

Hi,
I logged in to my Google news today, and all my personalisations have
disappeared! I had a heavily customised layout, and now I just have a
default one with "Recommended for you" added. This is terrible - I put
months of tuning into my layout, which I haven't touched since mid
2007. Is it recoverable?

Way to go Google! I'm sure you won't do anything like that with my 6GB of Gmail. Get backing up, sheeple..
