Posts filed under 'SEO Discussions'

October 2007 Pagerank Update Underway

Hi everyone –

For those of you who follow these things, just letting you know that the October 2007 Google PageRank (PR) update got underway on October 27.

It’s been a long time coming, and to be honest I thought that it might never come. Showing pagerank (or at least the toolbar version – TBPR) is one of those things that I often think serves no real useful purpose for the average webmaster.

It’s really just a measure of how many folks link to you (and how many link to them… recursively) and NOT – despite what the toolbar says when you hover over it – a measure of how important Google thinks your site is. If you want to know that, just check your visitor stats.

I’ll take that further – I think the little green bar probably helps erode the quality of the internet as a whole by encouraging the abuse of the Google algorithm through link exchange / paid links etc.

It’s like crack for webmasters – it causes a kind of ‘PR fixation’ amongst the SEO and webmaster community. I think that’s to the detriment of inexperienced webmasters, as it tends to sidetrack them from the other aspects of SEO – like writing good content, for instance. Every new webmaster has suffered from that syndrome at some point or another.

I think there are a number of reasons not to worry too much about Pagerank. I described my views about this around the last PR update in April, and I found this informative article about why pagerank isn’t something to worry about too much a while back.

Of course the whole issue does tend to polarise people. Some folks are firmly of the view that PR has ZERO impact upon your position relative to other sites in the search index, whereas others (including me) believe it’s still quite an important measure – simply because it is an effective way to measure popularity algorithmically, and it has no really accurate peer at present. You can see one such holy war regarding pagerank in this thread – in the red corner we have Cass-Hacks, in the blue corner we have dockarl (me).

Anyway, here’s hoping that your PR moved in the right direction – but if it didn’t, do not despair!

Cheers,

doc

7 comments October 27th, 2007

Danger! Multiple domain names, 1 site – why it is bad

I’ve been on a hiatus from writing here, so I thought I might break the trend by talking about the practice of registering multiple domain names for one site to ‘corner the market’ – jealously guarding your URL to ensure no-one uses a variation.

An example might be registering mysite.com, and then being seduced by the offer (GoDaddy does this regularly) to register variants of your new domain name (e.g. .biz, .net, .org) at a ‘special discount’ – they don’t offer fries just yet, but domain sellers really are the masters of the up-sell.

I consider registering more than one domain a bit pointless

The days of people memorising a URL and typing it into a browser are pretty much over. Except for a few notable and brilliant exceptions with catchy names like utheguru.com and oyoy.eu (and other less successful or well-known sites such as Google and YouTube), most people get to a site the new-fangled way – by following links or doing a search. So, in essence, you’re probably paying extra for not much benefit.

Furthermore, the practice can have insidious side effects – you can actually shoot yourself in the foot.

Multiple domains = Multiple sources of links

When presented with duplicate content, Google often seems to pick one page as the ‘original’ and treat the others as unimportant copies – and those copies don’t rank well.

You could end up with a situation where Google chooses a page from each of your site copies as the ‘original’, and your search traffic ends up spread between all four domains.

Registering Multiple domains for the same site can actually be bad for business

Links to your site naturally tend to come with traffic – and a lot of traffic generally comes from search… so you’ll also end up with your incoming links spread between all the copies of your site.

In such a circumstance, synergy (the whole being greater than the sum of its parts) does NOT apply. You end up with four sites, each with a quarter of the links they should have, rather than one strong site that aggregates all the power of the incoming links in one place. End result? You don’t rank as well as you could.

How to use your multiple domains ‘the right way’

Best practice is to use something called a 301 redirect – rather than having four actual copies of your site all competing with each other, a 301 redirect seamlessly sends visitors (and Google) from the spare domains to the ‘main’ URL you want to rank well. If you google “how to do a 301 redirect” you should be on your way to understanding that a bit better.
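
Just to make that concrete, here’s a minimal sketch of what such a redirect might look like in an Apache .htaccess file on one of the spare domains – the domain names are placeholders, and it assumes your host runs Apache with mod_rewrite enabled:

    # .htaccess on the spare domain (e.g. mysite.net)
    # Send every request, with a permanent (301) redirect, to the main domain
    RewriteEngine On
    RewriteCond %{HTTP_HOST} !^www\.mysite\.com$ [NC]
    RewriteRule ^(.*)$ http://www.mysite.com/$1 [R=301,L]

That way anyone (human or googlebot) who lands on a variant domain ends up at the one canonical URL, and all the link value pools in a single place.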

50 comments September 2nd, 2007

Supplementals have been abolished

In a case of short term pain for long term gain (scratch that: long term pain for short term gain), everyone’s favorite search engine has abolished the supplemental index.

But before you go running around your office whooping with delight like I did this morning – STOP. Google hasn’t abolished the supps, they’ve just stopped telling us which pages are in supps.

What’s that mean to the average punter?

Well, it means fewer questions on the webmaster forums starting with ‘why are my pages all in the supplemental index’, and less time spent by ‘mom and pop’ sites worrying about it.

Possibly a good move.

Me, well, I’m skeptical about the move. The overriding stated aim of Google is to return quality results. I’ve seen plenty of quality pages in the supplemental index – google has stated repeatedly that the biggest reason for a page being in the supps is NOT a perceived lack of quality, but rather a lack of pagerank.

It’s nice to know which pages are there so that we can make an effort to bring them into the main index where they belong. Google should be adding MORE tools to help genuine webmasters assess how they can improve their index penetration, not fewer.

It’s a case of ‘need to know’ – Google now no longer reckons we ‘need to know’ which pages their algorithms consider unworthy of a place in the main index. My initial feeling about that move is that it seems a little paternalistic.

Google has eviscerated the ONLY tool that goes any way toward explaining why a page might be performing poorly.

My take? If they are going to stop tagging pages as supplemental they should just abolish the supplemental index altogether – if a page is being crawled but isn’t in the index, well, we know it sucks – so why lump it in with other results? Put differently, why show us pages in a site: search if they’re not going to rank anyway?

At the moment I’m leaning towards thinking this might have been a (short term) backwards step, although it wouldn’t surprise me if we see some new tools in the Google webmaster tools arsenal to help deal with this prob.

ADDENDUM:- Richard Hearne (www.redcardinal.ie) put it best recently on the google webmaster help forums –

“Of course Google would rather we didn’t discuss or even consider this supplemental index.  Then again if Google was serious about fixing issues like these they would scrap the supplemental index… or give us back the supplemental tag so that we can try to fix these issues ourselves. “

5 comments August 1st, 2007

Get Out of The Supplemental Index In 3 Steps

Escape the Supplemental Index

So you have found yourself in the Google supplemental index and you want to escape.

Fair enough – unless you are a webmaster / blogger it’s hard to understand just how frustrating it is to find your hard work ‘binned’ to the supplemental index – but worry no more, it’s easier to get out of the supplemental index than you may think.

In this, part two of my ongoing series on the supplemental index (see part one here – The Google Supplemental Index – A Primer), I’ll be giving you three key steps you can take to get your web page out of the supplemental index and stay out.

STEP 1 – Duplicate Content causes Supplementals

Pick a few key pages on your site, and run them through ‘copyscape’ (www.copyscape.com). If copyscape says you have duplicate content on your pages, this could be the reason for the supplemental status of your pages.

Edit the pages, make them more unique, put any quotes in a <blockquote> tag, and try again. Then move to Step 2.
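
If you’re not sure what that looks like in practice, here’s a tiny sketch (the URL is just a placeholder):

    <blockquote cite="http://www.example.com/original-article">
      <p>The couple of sentences you're quoting from the other site go here.</p>
    </blockquote>
    <p>...followed by your own unique commentary on the quote.</p>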

STEP 2 – Backlinks, Backlinks and Backlinks

So you have a page in the supplementals, it is brimming with unique content, and you just can’t wait to get it out – it’s not hard. I have used this technique many, many times, and if done correctly you’ll find it helps bring your whole site from the ‘infant’ status I spoke about in my previous article to ‘adolescence’.

  • Find a page on your site that is in the supplementals, that has heaps of unique content, and note down the url of that page.
  • Find a site that has PR3 or better, and allows you to post your url.
    • If you don’t know what Pagerank is, I define it in my article about nofollow
    • Don’t know how to discover pagerank? You can do so by getting Firefox with Google Toolbar (download it from my toolbar to the right)
  • Post your URL on that page, using descriptive anchor text (e.g. if your page is about widgets, the link should say ‘widgets’ if possible). Try to make your link a deep link – like www.utheguru.com/301-redirects instead of just www.utheguru.com (there’s a small HTML sketch of the difference just after this list).
  • Can’t find somewhere you can post a link? Some tips:-
    • Your host’s forum / bulletin board (make sure that they aren’t no-following links).
    • A friend with an established website (a link from the first page is always best)
    • Another of your own websites (I’ve done this before and it works)
    • Paid editorial.
    • DO NOT subscribe to link exchange schemes, ‘free’ directory listings or other such ‘offers’. At best, they don’t work, at worst, they can get you penalized.
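
As promised above, here’s roughly what the difference between a wasted link and a descriptive deep link looks like in HTML (the anchor text is just an example):

    <!-- Weak: generic anchor text pointing at the home page -->
    <a href="http://www.utheguru.com/">click here</a>

    <!-- Better: descriptive anchor text pointing deep into the site -->
    <a href="http://www.utheguru.com/301-redirects">how to do a 301 redirect</a>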

This strategy has worked without fail for me.

Use it, and expect your target page to be out of the supplementals within a week or less.

Some people call it giving a page ‘link juice’, or ‘link love’ – whatever you call it, it works.

STEP 3 – Submit a Sitemap to Google

Google Webmaster Central, and Matt Cutts’ video about Webmaster Tools, will bring you up to speed on this process.

To generate the sitemap for submission, I highly recommend the following free tool.

Why submit a sitemap? Well, you’ve gone to the effort of getting Google ‘interested’ in your site, so you want to give it the best chance possible of indexing your site properly.

A sitemap will help it do this.
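
If you’ve never seen one, a bare-bones XML sitemap is nothing more mysterious than this (the URLs and dates are placeholders):

    <?xml version="1.0" encoding="UTF-8"?>
    <urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
      <url>
        <loc>http://www.example.com/</loc>
        <lastmod>2007-07-20</lastmod>
        <changefreq>weekly</changefreq>
      </url>
      <url>
        <loc>http://www.example.com/your-supplemental-page</loc>
        <lastmod>2007-07-19</lastmod>
      </url>
    </urlset>

Upload it to your site root and point Google at it through Webmaster Tools.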

Tomorrow, in part three of this series, I’ll be talking about strategies that will help to KEEP your site indexed.

This advice should help you to progress to a ‘mature site’ that is crawled and indexed regularly, without the need for further intervention to keep new pages from going supplemental.

Cheers,

TheDuck

24 comments July 20th, 2007

July 2007 Pagerank (PR) Update Underway

Nope – actually the July 2007 PR update is not underway at all. It’s almost impossible to know when the next PR update will get underway, and trying to guess the date is a bit pointless.

When is the next Pagerank (PR) Update?

As per usual, people have been trying to guess when the next Pagerank (PR) update will happen, and some even swear that it is happening now (as postulated by some commentators on Matt Cutts’ blog). I personally see absolutely no evidence that there is a July toolbar PR update underway – and in fact, if previous trends are any guide, the next toolbar pagerank update probably won’t happen until August 2007 at the earliest.

There are a few little ‘ripples in cyberspace’ that suggest SOMETHING is happening, but I’ve checked all the datacentres for a number of my sites, and I can say quite definitively that a toolbar Pagerank update is not happening right at the moment.

Will the next PR update be in August?

Who knows. The next pagerank push could be in August, it could be in January 2008, or I might be totally wrong and it could be happening right now – but, in any case, don’t get a fixation about PR updates. Why?

It’s a common misconception that all your link building goes unrewarded until toolbar PR is updated – that’s just not true. Real PR (that which Google uses internally) is a dynamic, constantly changing beastie – Google just keeps the real value a secret so that webmasters like us don’t go crazy watching our PR go up and down like a yo-yo in between updates.

If you’d like to know why NOT TO WORRY about PR updates, and how to improve your Page Rank between now and the next one, please see my post about the last pagerank update.

11 comments July 16th, 2007

What is Buzz?

I’ve been thinking of doing another ‘big’ post about SEO and a little strategy that’s jumped out at me recently.

Since sometimes doing one big post is a bit overwhelming, I’ve decided to just write about one little minuscule part of the post first to whet your appetite.

Buzz is the noise that bees make.

My first experience with bee keeping was on my school camp, about age 16. My particular school had a 10 week outdoor education curriculum – every grade 10 class would head out to the school farm (“Ironbark”) for 10 weeks – the aim was to be pretty self sufficient – we had to milk the cows for our milk, make our own butter, bale hay, keep (and ultimately cut the heads off and eat) our own poultry, and of course beekeeping was one of the cool things we got to do too.

The (two) Birds and the Bees

Quite early on I volunteered to head out with one of the local bee keepers to learn all about robbing bee hives – of course, there was an ulterior motive. There were only three positions on the “bee team” and the other two had been taken by the two prettiest girls in the class – Shae and Natalie Alexander (who in my opinion at the time was a complete SPUNK) 😀 .

I figured that, on balance, the very real possibility of being stung to death by a marauding swarm of angry bees was probably offset by the chance to spend an entire day with them 🙂

So.. off we went. I, being the gentleman that I was, let the girls take the very best beekeeping overalls. I was left with a very moth-eaten pair of blue mechanics overalls.

Handed a roll of masking tape I went about patching the 101 holes in the overalls and set to work. When we first cracked the hive open I remember the beautiful low hum coming out of the hive as we puffed the smoke over the bees.

Sweetness turns to Sadness

Everything went quite ok for about the first 4 (out of 10) frames – we brushed off the bees, replaced each honey filled frame with a frame of fake comb called ‘foundation’ and moved to the next frame.

By the 5th frame, however, the bees were starting to get pretty darn angry. It didn’t matter how much more smoke I puffed over the bees – the low hum was steadily increasing, and gradually more and more guard bees started shooting out kamikaze-style and belting into my head net.

I think we had about three frames to go when things started to get really crazy – the hum was now something more akin to an F-16 ratcheting up for take-off. The inevitable happened – I’d missed patching a hole, and a bee got inside my overalls and stung me – ouch! But I was super Matt – there was no way I was going to moan about it in front of the two prettiest girls in the class 😉

The thing I failed to realise, though, was that when a bee releases its sting it also releases a scent.

Honey, I’ve lost my pants

Before I knew it, I had virtually every bee in the hive clinging to my blue overalls screaming bloody murder.

The girls (along with the beekeeper) cleared out, hopped in the truck and locked the doors. After initially trying in vain to get them to let me in the truck (there was NO WAY they were going to let me in with all those bees 🙂 ), I finally realised I was going to have to get myself out of that particular situation on my own – so I blindly galloped down the hill, stripping off my clothing as I went – heading for the farm dam.

“Splash” – I belly-flopped into the muddy dam (I was about 6 feet tall and 65kg then – skinny as a rake – I must have looked a sight running down the hill with a swarm of bees chasing me and only a hat to ensure my modesty).

I think I spent about half an hour in the dam, popping up every 30 seconds or so for a gulp of air, before the bees finally decided they’d had their pound of flesh and headed home 😀 . The girls thought it was absolutely hilarious – I still hear about ‘Matt and the bees’ occasionally when I run into old school friends. We counted 65 stings on the way home.

Beee vereee vereee careful wit zee bees

Stupidly, after that introduction, I became a bee keeper – I still have about 10 hives.

I’ve learnt a few things about bees since – if you move slowly, methodically, it is actually possible to raid a hive without any protective clothing, nets or smoke at all – it’s not hard.

If you move quickly though, or you happen to accidentally squish a bee, you’re in deep trouble. The bee next to the one that has just been squished tends to tell his neighbor (buzz – buzz), the neighbor then buzzes to his neighbors, and generally in a matter of seconds you have a hive full of very irate bees.

Bees, Buzz and SEO

Buzz is their form of communication. Buzz is how they get things done. Buzz is the very thing that binds the hive into the co-operative society that it is.

In short, buzz is like an amplifier – in no time flat a buzz from a solitary individual in the hive is capable of mobilizing the forces of the whole hive to a dedicated purpose.

Buzz, my friends, is a powerful force.

__________________

2 comments July 11th, 2007

My Recent Job Interview with Google at the ‘Plex’

Outside The GooglePlex

Well I kept it a big secret from you all because I didn’t want to jinx myself – but I was invited recently to an interview with Google (it was the ‘exciting little company‘ I spoke about a couple months back in a post about an upcoming interview).

The position was to be based at Mountain View, California, and as part of the great Webmaster support team – along with neat and very bright people like Adam Lasnik, Vanessa Fox, Aaron D’Souza and Matt Cutts.

The position was ‘Webmaster Trends Analyst’, something I felt uniquely attracted to – I’ve a strong background in stats (from my undergraduate degree and time running scientific trials with the Sugar industry), have run several ecommerce sites and have a Master’s Degree in Computer and Comms Engineering – as well as being a regular poster on the Google Webmaster Forums – so I love hearing about what other folks are up to.

It was an exciting opportunity – so accordingly I took some valuable time off my PhD to prepare – before hopping on the plane for the 13 hour flight to San Francisco.

It was a great experience, but unfortunately I didn’t get the position.

I was disappointed.

As I wrote to one of my contacts about it:-

“so, either I’ll start looking for work again or I’ll bite my bum, put the pedal to the metal and get back into the PhD.

G was going to be a great fit because working with people like yourself would have been a ‘learning’ experience rather than just a job – I hate the 9-5 ‘office worker’ style culture of uni, but love the learning side.

My main problem when it comes to being hired is that of previous job experience..

I start to look like a jack of all trades but an expert at none.

Imho I thought that would be what would get me the job with G, as I’ve been told it makes me a pretty powerful educator – and a great interface between nutty engineer / scientist types and the general public.”

But let me take a step back here for a moment – I need to emphasise that I found the whole experience incredibly rejuvenating and, regardless of the fact I wasn’t successful, I still feel honoured.

If Google were to turn around today and say they wanted to employ me, I’d say yes in a heartbeat.

Why? Because any company that actually recruits internationally for a position known as ‘Webmaster Trends Analyst’ is a company that has a conscience. I don’t see such a position advertised at Yahoo. I don’t see such a position advertised at MSN… actually, I don’t see such a position advertised ANYWHERE.

When I was going through the interviews, one of the interviewers (and I hope I’m not out of line here) actually spoke about the fact that Google pulls together information from heaps of different sources (blogs, forums etc) on a regular basis and tries to quantitatively assess ‘webmaster sentiment’ from those qualitative signals – and uses it as an early warning system to alert them if things (like an algorithm change, for instance) have had any unforeseen impact. That made me sit back and go ‘wow’.

I count myself very lucky to have been interviewed by a great company with a social conscience like Google and dearly hope an opportunity pops up soon and I get another crack at it (You can contact me if you know of one).

But enough of that – the whole experience was a complete blast – let me show you a few photos.

This first photo is the centre of the Googleplex – it’s a neat place. I like the fact that I seem to have captured a black crow in mid-flight right below the Google sign 🙂

The GooglePlex

Took this photo on a toilet break at the ‘plex’ – judging from the pace of the interview I figured that time is a commodity in short supply at the Googleplex, but this pic (right above the urinal) really rammed it home: “Testing on the Toilet” – an A4 page giving thought-provoking code tips to the engineers. 🙂

“Testing on the Toilet” - Google takes their debugging seriously :)

I got the opportunity to do a fair bit of sight seeing while I was there…

Highway 101

The Golden Gate Bridge (with me in front of it).

You’ll notice in all of these photos that I’m wearing one of two shirts – bloody Qantas sent all my luggage to Helsinki on the way over, so I had only a pair of shorts, a pair of moleskins, the shirt I wore on the plane and one I bought for the interview (this one) for the whole trip – don’t get me started about QANTAS.

The Golden Gate

Across the Golden gate bridge from San Fran is a beaut spot called ‘Reyes Point’ – here’s a photo looking back towards San Fran from there (with the Golden Gate Bridge in the background).

A View From Reyes Point Across the Bay to Golden Gate Bridge

A pretty flower at Reyes point – I believe the plant is called Pigface – why, I don’t know 🙂

A Pretty Flower - Why do they call this plant ‘Pig Face’?

A ‘Hummer Limousine’ – Wow!

Hummer - Limousine

Another pretty flower in San Fran (are you a REAL Aussie! That’s so COOL! I want a photo with you!!) – the people were very friendly at “Kell’s Bar” – I love a good Irish Pub, and this one was a beauty – it’s just off Columbus.

Friendly People

The owner (right) and head barman of Kell’s Bar..

The Staff at Kell’s

They shouted me quite a few Guinnesses – here I think that magical dark brew is starting to have its curative effects 🙂 (I am not too sure whether the spooky red eyes were caused by the camera or the gazillion pints of Guinness)

16 Gazillion Pints of Guinness Later..

The morning after – one of those famous cable trams in San Fran.

One of those cable cars San Fran is famous for

My Hotel was right in the centre of SF (Sutter and Powell) – I got it for a nightly rate of like $69 – it was fantastic

My Hotel

Just a pretty picture of one of the brass fire hydrants they have all over the place in San Fran.

Just a Pretty Picture

Some San Fran street art – this was in Chinatown – San Fran has the best Chinatown I’ve seen in any international city – I felt like I was back in Beijing.

Some San Fran Street Art

On the way out of San Fran – you can see the city itself and the Bay Bridge top RHS.

I loved San Fran – it was such a vibrant colourful city – I hope to go back there someday soon.

San Fran From Above - on my way home

QANTAS strikes again – I had to wait 18 hours for my flight back – by the time I took this photo (the time on my watch is AM, by the way) we’d been locked in LA airport with no food or refreshments, waiting for a plane that was recursively only going to be ‘another twenty minutes’ all night.

Some rather unfortunate baggage handler had managed to run into the wing with the mobile stairs, causing severe damage to the port side aileron.

I felt sorry for the parents with little kids – the time on my watch is AM – roughly 20 hours after the plane was meant to leave.

Bloody Qantas.. 5am in the morning - still waiting

33 comments June 18th, 2007

Spam or Not? Duplicate Content, Different Domains, Different Language

Hi Folks,

A little while back I asked the following question on the Google Webmasters help forum:-

Is it OK to duplicate content in a different language?

Nobody could really give me a solid answer at the time. At the risk of setting off a new wave of ‘language spamming’, it seems it is. The following pronouncement from Matt Cutts (Google) seems to confirm it.

Matt: Having content from two different domains isn’t risky if they are in different languages (for example, Chinese and English), but if you have the exact same content on two different domains, it’s better to use a permanent redirect from the duplicate domains to a single preferred domain. (See this interview with Matt Cutts for the full-length version.)

Language Spamming ??

What do you people think about that? To me, it’s a very significant admission of a potential major future web-spam weakness, given the availability of (relatively accurate) online translation tools like Babel-Fish etc. It also presents enormous SEO possibilities for crawlers / spammers.

Apart from the obvious inferences, I have a few others:-

  • Can Googlebot ‘understand’ foreign language words in an english site?
  • If so, what effect do these foreign language words have upon a site’s ‘relevance score’…
  • hmmm…

Bye,

doc

それは別の言語の内容を重複させることは良いか ?

它是好复制内容在一种另外语言吗?

Ist es OKAY, Inhalt in einer anderen Sprache zu kopieren?

¿Es ACEPTABLE duplicar el contenido en una diversa lengua?

7 comments May 9th, 2007

April 2007 Pagerank Update is Underway!

The April 2007 Pagerank Update

It’s official – as of a few hours ago new PRs are starting to filter through the system – the April 2007 Toolbar PR update is underway! Here are a few insights about the update and what it means to you, plus some tips and tricks you might not know..

Why is my Pagerank Jumping Around?

When a toolbar PR update happens, it doesn’t happen all at once – Google has many ‘datacenters’, and your new PR will ‘percolate’ between those datacentres over the next few days to a week.

The PR shown in your toolbar is usually taken from a relatively random datacenter – for that reason, you’ll tend to see your toolbar PR jumping around a lot. This isn’t an indication of any kind of penalty, or anything unusual – it’s just a sign that the PR update is underway.

You can see your PR over the various datacentres at http://www.oy-oy.eu.

What is Pagerank (PR)?

PR, or pagerank, is one of the factors used by Google to calculate the importance of your site. Importance is different to relevance – you can have a very low PR site and still outrank much higher PR sites whose content is less relevant to the user’s search than yours.

People tend to get fixated on PR as it is one of the most visible forms of ‘feedback’ from Google about how your site building efforts are going – and since it only gets updated 3 or 4 times a year, people with active sites (including me) tend to look forward to it.

Should I worry too much about PR?

No. A few reasons:-

  • RELEVANCE almost always beats PR if you want good search engine positioning – such things as the words that people use to link back to your site, words in your page, your page title and headings, and words in your URL all give Google clues about the relevance of your site. Some people claim that there are 200+ factors such as these that Google uses to calculate relevance.
  • Pagerank is generally out of date – it is really, in its most basic form, just a snapshot measure of how many other sites link back to you (and how many sites link back to them) – there’s a sketch of the original formula just after this list, for the curious.
  • You can have a PR 0 site and still beat much higher PR sites in a Google search if you concentrate on RELEVANCE.
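
For the curious, that ‘recursive’ flavour is easiest to see in the original PageRank formula from Brin and Page’s paper (d is a damping factor, usually quoted as 0.85):

    PR(A) = (1 - d) + d * ( PR(T1)/C(T1) + ... + PR(Tn)/C(Tn) )

    where T1..Tn are the pages linking to page A, and C(Ti) is the number of
    outbound links on page Ti.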

As time has gone by, Google has got much better at gleaning ‘relevance’ from a page – and with that enhanced ability, the relative importance of PR (which was probably once the major contributor to search engine positioning) has been diluted by these other factors. It is still a factor, though, and it is worth aiming to improve your PR.

Tips and Tricks to Improve your PR

Well, it’s too late now for this update, but if you’d like to work towards improving your PR (and site traffic) you need to get more sites linking to you, preferably sites with high PR. Here are some tips off the top of my head:-

  1. LINK OUT – link to sites that interest you. This has two effects – it makes your site much more informative for your readers, and it also helps other sites (the target of your link) learn about you. Whilst it is counter-intuitive that linking out will improve the number of sites linking to you, it does. Why? Because it tends to increase your readership. A site with lots of readers becomes a site that people want to link to. Also, people with active sites tend to spend a lot of time monitoring who is linking to them – write an interesting article which links to their high PR site, and it’s likely they’ll come and check out your site – if you are lucky, you might get a link from their high PR site back to you as a thank you.
  2. WRITE UNIQUE, INFORMATIVE, INTERESTING ARTICLES – if you ever do a Google search for something and you can’t find what you’re looking for easily you have a great opportunity. Find the answer, and write about it. Chances are other people are asking the same question – and you’ll attract links if you write a good quality blog entry about it. Sites that just regurgitate / duplicate information easily found elsewhere won’t tend to get lots of links.
  3. WRITE SOMETHING CONTROVERSIAL – this is one strategy fitting under the general banner ‘link bait’. My best performing pages are those that have controversial content :).
  4. USE SOCIAL NETWORKING TOOLS – Things like mybloglog, feedburner, digg-it etc are a great way to improve your following and traffic. I can pretty much guarantee that links to my site increase proportionally with the amount of traffic I receive.
  5. MAKE A USEFUL TOOL – many of my links come from my wordpress theme, Blix-Krieg. If you put something on your site that is useful, you will attract links.
  6. USE YOUR HOST – Many web hosting companies have online forums for their users – often, these forums have obscenely high PR. Write something genuinely interesting, and link back to it from your host’s forum. This is also often a great way to help trigger an initial crawl on a new site (see my series on the supplemental index for more info on this).
  7. BE A GUEST POSTER. Many sites (including mine) allow users to submit their own articles for inclusion – take advantage of the opportunity – write an interesting, relevant article and ask the owner of a high PR site if they’d like to include it – with a link to your own site in the body.

Also check out this page on the top 13 things that won’t affect your pagerank by JLH. Actually JLH is an example of a successful blogger who applies a lot of these principles – he writes great articles that are often interesting, controversial and informative all at the same time. He links liberally. He uses a broad array of social networking tools.

Now – could I please ask you folks a favour? I’ve written a WTF at technorati – I’d appreciate your votes – it’s my first experiment with social networking 🙂  Click this link to vote.

Any other suggestions, feel free to post – hell, why not add your url to your comments – I remove no-follow from all comments after 14 days if they are relevant.

All the best,

Matt

27 comments April 28th, 2007

Google-Bombing Still Works

A Google Bomb

Well – that’s a provocative title – but at least they do work sometimes.. let me explain more..

What is Google Bombing?

Some argue that the first widely known Googlebomb was created by a men’s magazine, which used the anchor text (anchor text is the words I use in a link) ‘dumb motherfu**er’ in a link to a site selling George W Bush merchandise.

In fact George seems to have been the target of quite a few Google-bombs.

My first experience of a Google Bomb was when I was sent an email suggesting I type the words ‘miserable failure’ into a Google search back in about 2003 – the resultant page was, of course, George W. Bush’s White House page.

Definition: GoogleBomb

Strictly defined, a good Google Bomb should be constructed in such a way that a site returned for a given phrase does not even have that phrase in its content. The theory is that if enough sites link to a site using a particular word or phrase, Google will simply assume that the site must in fact be about that phrase – even if the phrase isn’t on the target page.

So, of course, George Bush’s site doesn’t in fact have the words ‘miserable failure’ on it at all, but it (once) ranked first place for that phrase in any Google search because of the viral campaign launched to get thousands of webmasters to link back to that site with those words.

The coining of the actual phrase ‘Google Bomb’ is credited to a fellow by the name of Adam Mathes, who linked the phrase ‘talentless hack’ to a friend’s website.. this was documented on the site www.uber.nu, which unfortunately seems to be down now.

Google bombs the GoogleBomb

I used to love the Google Bomb so much that I registered (and still own) a site that I hoped would become a place where people could suggest and democratically vote upon potential political / humorous / educational googlebombs.

I was intoxicated by their potential power to educate and cause giggles, and perhaps even lead to real change – but, alas, on January 25th, 2007, things came to a premature halt.

On that day, Google announced in this post on their official blog that the glory days had come to an end – Google had created an algorithm that would curtail the impact of Google Bombs. Matt Cutts also spoke about the algorithm change in this article..

I was very sad and disappointed – but all good things must come to an end.

Some practical examples of good anchor text selection in action

These aren’t really examples of ‘Google Bombs’ per-se, but they do show the power of good anchor text selection.

For example, linking back to my lingerie site with the anchor text ‘brassiere’ brought my site from 5th page to 2nd position very quickly for that highly competitive word, even though I don’t have it on my site anywhere.

As another example, I once had an occasion where the inventor of a product which I was manufacturing and promoting (and into which I had sunk hundreds of thousands of dollars in cash and man-hours as seed funding) was becoming difficult / adversarial towards us, and had appointed a competitor without our knowledge.

The person was interviewed on national TV, and through sheer pig-headedness, chose to promote the product under a different name to what we had been promoting it as for several years. It was just a spiteful attempt to send search traffic to a competitor who had been appointed without our knowledge.

GoogleBombs / anchor text work very quickly

This taught me a little lesson about Google Bombs (or more specifically in this case, good use of anchor text) – because they are distributed in nature, they can work very, very rapidly to alter search results.

Luckily, in this case, we had a head start. I had noticed in my logs about three days before that we were receiving hits from the website of a media organisation.. this led me to the website, and I noted that it was a ‘draft’ page detailing the upcoming interview, in which this person referred to our product by another name.

Subsequently, I went to the person’s website and discovered that the ‘buy this product’ link pointed to a new site, with a URL containing the ‘new name’, which was in fact 302 redirecting to our site – in what I suspected at the time (correctly) was an amateurish attempt to nick our PR in preparation for an assault upon our business. On consulting my site logs I discovered we had been receiving hits with this site as HTTP_REFERER for at least 3 months..

Their cover was blown. I spent the next few days starting to optimize my site for ‘our new name’, and had 301 redirected their 302 redirect using some clever .htaccess tricks to a site I knew was likely to get their new site banned or at the least seriously retard their hijack attempt.

I also got some friends with regularly crawled sites to link to the ‘rogue site’ to give Google a clear shot at indexing the ‘new’ content – this might seem like a low act – but, remember, this was a defensive action rather than offensive.

Through the use of googlebombs (it helps having webmaster friends) and the fact I already had an established brand and high crawl rate, I was able to quickly (in less than 24 hours) rank first place for the product’s ‘newly invented’ name, and take advantage of the media exposure.

I noted on the night before the interview aired, that the 302 redirect was removed – but the damage had already been done. It actually took them about 3 months to get reincluded in Google, so my counter-attack seemed to have worked.

We also ranked first place for the person’s name for about the next 6 months, which, of course, made us the villains, not the person attempting to steal our rankings 😉

So – how does the new anti-GoogleBomb algo work?

I thought possibly Google may have changed their algorithm in such a way that a googlebomb for a word or phrase that either seemed contextually irrelevant, or didn’t exist on the target page would no longer rank. This made me a little worried that what had been a previously powerful seo tool for some of my commercial sites would no longer work.

As a lot of you know, I have a pretty successful WordPress theme called Blix Krieg.

When people install my theme, there is a link in the footer back to this site, and also one of my commercial sites (www.jaisaben.com). The anchor text for the link back to my other site is ‘by theDuck’.

Just recently, I was checking Google Webmaster tools and I found that a fair percentage of traffic coming to my other site now comes from people searching for the phrase ‘theDuck’. Who the hell searches for “theDuck” – I dunno – but it seems quite a few do.

Sure – it’s not a highly competitive phrase, but it does prove to me that Google hasn’t deprecated the value of inbound link anchor text outright – and whatever their new anti-googlebomb algorithm is, it probably has very little to do with contextual relevance.

Certainly, my Jaisaben site does not have the phrase ‘theDuck’ on it anywhere – it has nothing about ducks on it at all – and yet it now ranks in second place for a search for the phrase ‘theduck’, purely and simply because of anchor text.

Lessons I have Learnt

This has taught me a few lessons:-

  • Next time I release a wordpress theme, I’ll use real, valuable keywords in my footer anchor-text rather than nonsense words (mesothelioma anyone? – see this link ).
  • Anchor text should still be in anyone’s SEO arsenal.
  • It would appear (at least for uncompetitive phrases) that having heaps of sites linking back to you with the exact same anchor text doesn’t cause any penalty, contrary to what other seo’s have said.
  • However the new Google-bombing algorithm works, it is not simple, and it still leaves enough latitude to use anchor text in clever and powerful ways.
  • Never form a business partnership with a mad person, no matter how good their ideas seem to be, unless you have the patience of a Saint and the bank account of Bill Gates. The same probably also applies to personal relationships 🙂

All the best!

theDuck

7 comments April 9th, 2007

Matt Cutts Blog has been HACKED!

Dark Seo has Hacked Cutts’ Site

Hi everyone – sorry for the long time between drinks 🙂 I’ve been working hard getting back into my PhD…

Matt Cutts Blog Hijacked

Just an interesting little tid-bit today. It seems that Google’s famous unofficial blogger, Matt Cutts, has had his site hijacked by a bunch of hackers calling themselves the Dark Seo Team.

It may be that the Hijack has been resolved by the time you read this, so just in case, if you click the thumbnail above, you can see what his site looked like when I visited today.

Is this a WordPress Vulnerability?

It seems that no-one, not even Matt, is immune to being hacked. I wonder whether this has anything to do with Matt’s recent upgrade to the latest version of WordPress?

I guess we will see soon enough!

Dark SEO

It’s interesting that the hack is from a mob calling themselves Dark SEO..

Interesting, I say, because I think I saw warning signs that they were planning something funky about three weeks back.

Dark SEO had Cloaked Copies of Matt Cutts’ Page Weeks Ago

How? Well I was having a brief look at my Google webmaster tools ‘links’ section back then, and I noticed that one of my pages (Damn Ugly Websites) was being linked to from a site with a fairly suspicious URL (click here to see the site). I checked out the site and pretty quickly realised they were running a cleverly cloaked copy of Matt’s Site…

What is Cloaking?

What’s Cloaking? Well, basically, if you follow the link, you’ll see it looks like an innocuous web page to us, the human reader, but it seems that if you happen to be googlebot, that site presents completely different content – a copy of Matt’s Site. The perpetrators? None other than darkseoteam.com, who have now Hijacked Matt’s Site.

I did send a message to Matt suggesting he should check it out.. I think my exact words were “Matt, I found this URL – you should probably check it out” – Wow – if it turns out that was the first stage in their hack attack, I will be feeling very prescient indeed :).

But DAMN – If they were clever enough to hack his site, surely they should have been clever enough to put some adsense units on there as well 🙂

NOTE: He got the better of me.. this was an April Fools Joke from Matt 🙂

Cheers,

theDuck

5 comments April 1st, 2007

Ad Selection?

Just a quick post today.. been busy with work..

I was reading the local (online) newspaper today and came across this article.

What struck me was not so much the article, but rather the ad that had been placed next to it.. sometimes automatic ad placements send an unintended message – (click the picture to make it readable).

For those readers who aren’t familiar with Rugby, it’s kind of equivalent to American football.

Get Involved with Rugby!

2 comments March 15th, 2007

Can I be penalised by being linked to from a bad neighbourhood?

Scraper Sites - Benevolent or Otherwise?

The ‘Scraper Site’ – Benevolent Friend or Deadly Foe?

One of our regulars, Susie J, left the following question for me this morning –

Can you have a bad link? I checked my inlinks through technorati. A few stood out with question marks. Here’s a couple of them:
cold remedies
cancer research

These sites do not have any of their own content — just a list of other sites. There is a link to my site to a specific article — but it does not identify my site by name.

Hiya Susie – these are called ‘scraper’ sites.

I’ve got several of them linking back to me too.

There are a number of things you need to consider first before you get too worried about them.

Links from a Bad Neighbourhood – Good or Bad?

Is it bad to have them linking back to you? Well, there are a number of different perspectives on that.

I’d say this right off the bat – Google knows that you can’t help who links to you, so it is impossible to get an official Google ‘penalty’ from such a site linking to you.

If that were possible, I could set up a mean link farm violating every one of Google’s webmaster guidelines, and get my competitors struck off Google’s index just by linking to them from my Uber-evil site.

The only exception, of course, is if you link back to the scrapers, in which case it is possible (but unlikely) that Google may consider you’re participating in some link exchange scheme with them and you might get penalised – that’s called linking to a ‘bad neighbourhood’.

Whether links from these sites are good or bad from an SEO perspective is a different matter.

What’s their game?

I had a discussion about this with a few of my SEO friends a few months back, and the general consensus was that those sites are trying to get good search engine positioning by fooling Google into thinking that they are authorities on a particular topic – such as the common cold, in this instance.

Since they link back to me, I don’t get overly perturbed about them, but I have been puzzled about what their game is – because:-

  1. They can’t be after Pagerank – who’s going to link back to a site with no real information? (except people like us, wondering why they are linking to us – but you’ll note I nofollowed the links to them; there’s a quick HTML sketch of that just after this list)
  2. They aren’t stealing content – they are acknowledging the source of the content.
  3. They aren’t MFOA (made for adsense) as they (mostly) aren’t displaying ads YET.
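
If you’re wondering what ‘nofollowing’ a link actually looks like, it’s just a rel attribute on the anchor tag – here’s a sketch with a made-up URL:

    <a href="http://www.some-scraper-site.com/cold-remedies" rel="nofollow">cold remedies</a>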

So what’s their game? Well Susie, I got your message this morning right after I got back from the gym. I’ve just had a shower (my thinking place) and I believe I may have their strategy sussed.

I reckon they have the same opinion as me – make your outlinks count. Whilst linking out to other sites does, by definition, reduce your pagerank, the effect on your search engine positioning can actually be positive.

This is somewhere along the same lines as ‘it’s not what you know, it’s who you know’ – if you link to a lot of other sites about a topic you start to look like an authority in that topic.

A Devious Black-Hat Scheme..

Search engine positioning is really a combination of relevance and pagerank – so in this case, they are trying to gain relevance in the topic of ‘the common cold’.

I think their strategy might go something like this.

  1. Use adwords to find some lucrative keywords (for instance, I would imagine competition for the keyphrase ‘the common cold’ would be fierce, so it would be lucrative).
  2. Crawl the net looking for articles about ‘the common cold’ – or better still, just do a Google or technorati search for the phrase.
  3. Take small snippets of those articles, and link back to the origin, thus reducing the likelihood of being reported as spammers (after all, everyone likes being linked to).
  4. Cobble together a large number of snippets in such a way that it’s unlikely that the density of information from any one source is suspiciously high on the page (thus avoiding the possibility of triggering a spam flag or duplicate content penalty from Google – and being deindexed or sent to supplemental).
  5. Wait to be crawled by Google.

So now, what do they have – they’ve got a keyword rich page, full of relevant links to topical pages about the common cold.. If I’m an automated robot I’m beginning to figure ‘hey, this looks like an interesting page about the common cold’.

So, they’ve got relevance – all they are now missing for good search engine positioning for the phrase ‘the common cold’ is pagerank (PR). Easily fixed – buy a link from a high pagerank site, or indeed (since these people likely have heaps of sites) throw a link at the page from several of your high pagerank sites, preferably in a related field.

Now Comes the Traffic.

VOILA! You’ve got pagerank and relevance – you suddenly appear to Google to be an authority on the topic of ‘the common cold’.

So hopefully, since you’re now the new authority on the common cold, you’ve got great search engine positioning too – and with positioning comes traffic – lots of traffic.

Sir Richard Branson started his empire by standing out the front of potential locations for his record stores, and physically counting the number of people walking past each site per day. He knew that the more people walking past the better – this is the online equivalent.

Think about it – the two links you sent me are scraper sites about cancer and the common cold. Hands up anyone that doesn’t know someone who’s had a cold this winter? Hands up anyone that doesn’t know of someone affected by cancer?

These keywords weren’t chosen by accident – they both have potentially very high traffic!

Money – lots of money, with Adsense.

Here’s where the brilliance lies – since the site doesn’t really give any answers, the first thing people are going to want to do when they get to the site is go elsewhere – so, what to do with all this traffic?

BRING ON THE ADSENSE. Scatter adsense all over the site and make clicking the ads the only real way of escaping. Remove the links back to the original sites (after all, you only had them there to make yourself look legit and stop people from reporting you as spammers) and you’ve successfully run the black-hat gauntlet and probably made a motza on your lucrative keyword.

These schemes are all about maximizing traffic and hence financial reward.

They don’t expect to be around long before they are taken down or detected. This is probably the reason they choose very high traffic keywords – so that they can make hay while the sun is shining.

So is having links from scraper sites bad for me?

From a net usability perspective, sure, these sites are bad for everyone.

I can remember when I first started surfing the net way back in the mid nineties, you could search for just about anything and it would return a multitude of links to porn – back then all you really had to do to game a search engine was to have heaps of ‘keywords’ on your page (a favourite tactic was to have a huge list of smut related words at the bottom of the page). Luckily Google’s algo has matured and that just doesn’t work anymore.

Plus – as of last year, the majority of web users are using the web for commerce and business rather than porn, which had dominated searches for the entire public history of the net (says a lot about human nature, hey?). So these days, the majority of these schemes are in it for adsense income.

Don’t know about you, but I don’t want to go back to the bad old days where search results are dominated by useless crud – only this time it’s useless crud with adsense ads rather than crud asking for your credit card number or offering ‘free previews’. Luckily, so far, Google seems to be keeping pace with the spammers and (whilst there is doubtless still loads of money to be made) things have become a whole lot harder for them.

The verdict – good or bad? From the individual short term perspective of your site, being linked to by these sites probably has no effect (at worst) and perhaps even a small positive effect (at best) on your pagerank.

It’s those that steal your content and don’t link back to you that are bad, as (occasionally) Google deems their version of your content the ‘original’ and cans your site to the supplementals as a plagiarised copy.

What can I do about plagiarised content?

A good way to check for copies of your content online is to use the tool called COPYSCAPE.

If it really irritates you that these sites have copies of your material, there are a number of things you can do about it.

First and foremost, most of these sites use some form of spider to harvest your content.

You can try banning the rogue spiders using robots.txt as described in this article, but that approach only works for the ‘well behaved’ bots – those that obey robots.txt. Furthermore, many of these bots seem to harvest their information directly from technorati, so there is nothing you can do about that.
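
For the well-behaved ones, the robots.txt entry is simple enough – here’s a sketch using a made-up bot name:

    # robots.txt - lives in your site root
    # Block one misbehaving crawler by its user-agent string
    User-agent: BadScraperBot
    Disallow: /

    # Let everyone else crawl the whole site
    User-agent: *
    Disallow: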

The second approach is to report the sites as a spam site to Google (you can do that in Google webmaster tools – it’s under the ‘tools’ menu described in this article). This gives Google a ‘heads up’ that the site is a spam site.

As for me personally – now that I’ve realised their game, I’ll be reporting these sites.

This goes against my ‘all publicity is good publicity’ ethos, but what the heck – why should they be making money at the expense of legitimate sites?

All the best,

theDuck

8 comments March 12th, 2007

Targeting your Buyers – Google Adsense Tips, Tricks and Latest Gossip Part 3

In part three of this series about some tips and tricks I learnt during my recent visits to the Google Adsense Conference in Brisbane, Australia, I’m going to write about ‘targeting’ your ads to your content.

People get grumpy about the relatively low income they receive from adsense ads on their blogs – essentially, they want to know how to make money from their blogs, and having heard the rags-to-riches stories of bloggers making hundreds of thousands of dollars from advertising online, they want a slice of the action – and why the hell not?

So now I need you to take a deep breath whilst I take you through some of the latest guidelines for helping google to serve ads that are more likely to be of interest to your customers.

Basically, to have a site that makes money you need at least three things:-

  1. Website Traffic.
  2. Relevant Content.
  3. Relevant Ads, and a high click through rate from those ads.

Those are the three key ingredients for making money from Google Adsense, and lacking any one of them will be to the detriment of the others.

Assuming you have good traffic (and I’ll be writing tutorials on that in the near future) and compelling content, all that remains is to encourage a high ‘click thru rate’ on your site.

I have to admit, on this site the content is generally great and informational, but I’ve previously simply relied on the google adsense code to serve ads that are likely to be clicked on. I’ve tried out image ads, I’ve tried out text ads, I’ve tried out different colours and positions for the ads – and I’ve seen only minor changes from doing so.

A lot of the time, though, I think I go a bit too far by being too solutions-oriented – for instance, someone has a problem, I’ll try to solve it for them, they leave happy, I get good feedback from them and probably generate traffic through referrals. That’s all great, and it’s a part of my growth strategy at this early stage of my blog.

So, having the content, I hope the ads will be clicked and everything will be fine from there – but I’ve been finding this isn’t the case. On my other commercial sites I end up with click-through ratios (CTR – the percentage of visitors that click on ads) of less than 1%, whereas on my product-based sites I get closer to 10%.

I started to think about the reasons this may be – perhaps my readers are ‘ad savvy’ and have a form of blindness to the ad content, or perhaps I am already providing what they need – information – and they have no need to follow my ads to get more of it.

Someone suggested to me that I should leave articles I write ‘hanging’ so that the reader feels compelled to look at the ads to find more information – not a bad idea, but it goes against the ethos of this blog to an extent. I think the real answer is to write my articles in such a way that they want to take the next step, and offer, in the advertisements, companies and individuals that may help them do so.

Enter, stage right, adsense section targeting – this was released last year, and is an incredibly simple way to ensure that your ads are ‘micro-targeted’ to the niche group viewing your pages. So why haven’t we all heard about it? I think a lot of website and SEO people have kept this one close to their chests, as it’s a fantastic tool that can really help dramatically increase your returns and make the SEO people look worth their weight in gold.

But you don’t need to be a big shot blogger to implement this code – all you need to do is place tags around the content you think is most appropriate to your audience.

The tags are <!-- google_ad_section_start --> and <!-- google_ad_section_end -->

When the google adsense bot sees the <!-- google_ad_section_start --> tag, it expects that any information appearing between that tag and the end tag should be used when it considers what sort of ads to serve.
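
In practice, a chunk of a page marked up for section targeting looks something like this (the copy is obviously just placeholder text):

    <!-- google_ad_section_start -->
    <h2>Choosing a database server for a growing site</h2>
    <p>The paragraphs packed with the keywords you actually want mediabot
    to weigh - servers, databases, user interfaces and so on - go in here.</p>
    <!-- google_ad_section_end -->

    <p>Navigation, boilerplate and anything off-topic stays outside the tags.</p>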

So, for example, I have taken an excerpt from a recent Wired magazine article about how Yahoo has missed the boat when it comes to website advertising.

I found a paragraph in there that speaks about Hollywood, TV shows, theaters, TV sets – all things that aren’t really spoken about much in the rest of the article, but that I think will bring commercial ads about technology – things that some of the geeks reading this blog might be interested in.

    The truth is that when Semel worked in Hollywood, he understood more about how movies and TV shows made it to theaters and TV sets than virtually anyone else on the planet. Early in his career, during stints in New York, Cleveland, and Los Angeles, all Semel did was sell movies to theater chain owners. He’d show up at each theater — there were only a handful of national chains then — with a list of the movies Warner was going to release over the next few months, and each owner would bid on the movies he wanted.

In the next paragraph, I’ve found interesting information about the infrastructure of Yahoo – keywords like servers, technology, redesigning a database, redesigning a user interface – all are rock solid keywords that should hopefully trigger ‘mediabot’ to deliver an interesting combination of consumer products ads and advertising for high grade database and server technology.

    But now, despite Semel’s achievements in Hollywood and early success at Yahoo, Silicon Valley is buzzing with a familiar refrain: Wouldn’t an executive with a little more technology savvy be a better fit? Semel has been Yahoo’s CEO for nearly six years, yet he has never acquired an intuitive sense of the company’s plumbing. He understands how to do deals and partnerships, he gets how to market Yahoo’s brand, and he knows how to tap Yahoo’s giant user base to sell brand advertising to corporations. But the challenges of integrating two giant computer systems or redesigning a database or redoing a user interface? Many who have met with him at Yahoo say he still doesn’t know the right questions to ask about technology. “Terry could never pound the table and say, ‘This is where we need to go, guys,’” one former Yahoo executive says. “On those subjects, he always had to have someone next to him explaining why it was important.” One could have made a convincing argument two years ago that such deep technical knowledge didn’t matter much. But now we have empirical evidence: At Yahoo, the marketers rule, and at Google the engineers rule. And for that, Yahoo is finally paying the price.

The Lesson endeth for today – tomorrow we will see the results and expand upon them to make some money 🙂 Don’t be alarmed if it doesn’t look like it’s worked at first – it can take 24 to 48 hours.. patience 🙂

15 comments March 12th, 2007

Pulling pages out of supplemental 101 – Test

This post follows on from my tutorial about pulling pages out of the supplemental index.

A reader at Google Webmaster Help Forums has asked me if it would be possible to post a link to his site about classical music to try and pull one of his pages out of the supplementals.

SMc writes:-

I have had difficulty with one page from my site that insists on staying in the supplemental index. I had a mistyped URL that I subsequently made a 301 redirect back to the correct link. Now both the bad URL and the good are in and have remained in the supplemental for ages and I can’t seem to shift it. Would it be possible for you to throw a link at that page for me to try to force it back out?

Your site has a couple of other probs that may be causing the supps SMc (in particular check for duplicate content), but let’s try it and see if it works.

Here is a link to SMc’s Classical Music Site – By the way SMc – some tips:-

  1. I don’t remember where it was that I read this, but google prefers short URLs – both the physical length of the URL and the ‘depth’ of the URL (depth of directories) should be kept to a minimum. I think it was Vanessa Fox who talked about that – if someone has a link, please post it.
  2. You should keep the number of links on a page to less than 100, if possible (see Google’s Webmaster Guidelines) –
  3. Every page should be 2 or 3 links from the home page.
  4. Links from ‘related’ websites with high PR probably carry more weight than links from ‘unrelated’ sites (ie mine versus a music site).
  5. Use copyscape to check for duplicate content on your pages (see my primer on the causes of supplementals here).

Cheers,

theDuck

_________________

Follow-up – it worked 🙂

M

Add comment March 10th, 2007

Out Ranking Matt Cutts

A while ago one of my online buddies, JLH, wrote this tongue in cheek post (or should that be boast? 🙂 ), in which he pointed out that he now outranked Google’s own famous Blogger, Matt Cutts for one of his posts.

JLH and I have been having a light-hearted game of one-upmanship for a while now (see the now infamous Banalities of Bananas post here, and my even more ridiculous second attempt to beat JLH on the lucrative ‘Banal Bananas’ keywords here). JLH ultimately prevailed in the Banal Bananas stakes, so I’ve been wracking my brains about ways to beat him since.. so JLH – here’s my chance to match you on this one…

We now outrank Matt Cutts for the search “How to get out of the supplemental index” – granted, it’s probably a temporary fluke, but I thought I’d get mileage from it while I can 🙂

UtheGuru Beats Matt Cutts for Supplementals Post

Cheers and Have a great day,

theDuck

______________

Update – for the moment we seem to be holding in 2nd position for the above search, and getting some nice traffic too… must have done something right 🙂


Add comment March 8th, 2007

Supplementals and the Supplemental Index – a Primer

One of the most annoying (and mysterious) of all seo problems for many bloggers and website owners is the dreaded supplemental index.

In this tutorial / primer, I’m going to aim to give you an idea of what supplementals are, why they occur, how to identify them and how to solve the problems associated with them.

What is a Supplemental?

A supplemental result is defined by Google as follows:-

A supplemental result is just like a regular web result, except that it’s pulled from our supplemental index. We’re able to place fewer restraints on sites that we crawl for this supplemental index than we do on sites that are crawled for our main index. For example, the number of parameters in a URL might exclude a site from being crawled for inclusion in our main index; however, it could still be crawled and added to our supplemental index.

So, translated into plain English, supplementals are those pages that Google considers not important enough to include in their main index, but not so bad or useless that they aren’t worth indexing at all.

How do I know if I have supplementals?

Firstly, go to www.google.com and enter the search site:www.utheguru.com (replace utheguru.com with your own url). The site: in front of your url is known as a search modifier – there are lots of different search modifiers, but in this case we’re using the site: modifier to tell google to return all pages it has indexed from www.utheguru.com.

There are a few misconceptions about what constitutes a supplemental result. Some people think that supplementals are what is returned when you click on the “repeat the search with the omitted results included” link at the end of a google search. This is not the case.

That link actually shows ‘similar’ content that google thinks might not be relevant to your search, and that content can be supplemental, or non-supplemental in nature.

Actually, a supplemental result is one where the words “Supplemental Result” appear just under the ‘snippet’ (the short description of a site) in a google search. The supplemental results usually appear in the later pages of a site: search, following the main indexed pages. If you click on the thumbnail below, you can see examples of both.

Google Site Search Instructions

Why Do I have Supplemental Results?

Supplementals usually occur for one of the following reasons (in order of increasing likelihood):-

Duplicate content from other sites – have you quoted content from other peoples websites? Does this content make up a large proportion of your page?

Google has sophisticated duplicate content filters that will detect this – remember, it’s ok to quote other sites, but make sure you also have enough good original content on your site to ensure google doesn’t think you are just plagiarising.

A general rule is that no more than 50% of any given page should be quotes. If you are concerned about whether you may have too much duplicate content, head over to a site called copyscape (www.copyscape.com) and run your page through their tool.

Duplicate content from your own site – it is a sad fact that many content management systems (CMS), while great at helping beginners spend their time writing good original content rather than trying to learn web design and HTML, really lag behind when it comes to being search engine friendly.

WordPress is one example of a CMS, and it will generally put duplicates of your content all over the joint – for instance, you’ll find this article on the front page of my blog, under the SEO discussions category, and in the archive for March on this site, and they’ll all have different URLs. Find out about avoiding duplicate content in CMS like wordpress here.

Another cause of duplicate content can be canonicalization issues – that is, where the www and non-www versions of your site are indexed as separate websites, when in fact they are the same. Read more about them in our primer on canonicalization issues here.

Not enough pagerank – is your site more than a few months old? Do you have many other sites linking to you?

If the answer to any of these questions is no, it’s likely that you are in the ‘sandbox’, a kind of purgatory between being indexed and being deindexed.

Some people claim the ‘sandbox’ is an actual step one needs to go through (ie 3 months of not being indexed) while Google gains trust in your site, but that’s just not the case – it’s more about how many people link to you rather than any deliberate ‘temporary ban’ on indexing for new sites.

Don’t believe me? I have one site (www.jaisaben.com) which is almost entirely supplemental – that’s because it is very much a ‘niche’ site, and I haven’t bothered working on it too much – it’s been in the supplementals for months and months – eventually, one day, when it gets enough people linking to it, it will suddenly pop into the main index.

This site (www.utheguru.com) is almost entirely indexed, and was within weeks of me starting it. Why? Because it has content that other sites like linking to – as a result, Google considers it an important site, and makes pages I write available in their main index within days.

Is Having Supplementals a Bad Thing?

It can be. Are you presenting ‘niche’ content? If that’s the case, your pages will still be returned as answers to a google search whether they are supplemental or not.

If you are presenting mainstream content, supplementals can be a very bad thing. They make it very unlikely that your pages will be returned by a google search (other than using the site: modifier) at all.

Some people say that once your pages are in the supplemental index, they’ll be there for at least three months (until ‘supplemental bot’ comes for a visit) or perhaps forever. This may have been true in the past, but not anymore. Whether the supplemental index is the end of the road for your site is completely up to you.

My advice? Everyone should aim to have at least 80% of their ‘content’ pages in the main index. It is not that difficult to do.

Supplementals 101 – Bot Behaviour

First, a bit of ‘bot behavioural psychology’ :). I’ve been observing bot behaviour on this site, and others, for many years. During that time I’ve noticed they tend to behave in a set pattern:-

Bot behaviour and the ‘Infant Site’

  • When a site is first submitted, the bots will come and have a fairly deep look at the site, and usually within a few weeks you’ll find your index page listed.
  • From that point on, bots will continue to visit regularly, to check for interesting new content, but they seem unusually reluctant to add new content to the google index.
  • At this early stage, it’s very difficult to get anything other than your main page indexed.
  • Googlebot will keep on visiting your site pretty regularly, and at some stage or another you’ll notice some of your other pages appearing in the index, but they will be mainly supplemental.
  • This frustrating cycle will continue forever unless you get the bot really interested by achieving a ‘threshold’ of new inlinks.
  • Once a site has a ‘threshold number of inlinks’ the bot will start to treat your site as ‘an adolescent site’.

Bot behaviour and the ‘Adolescent Site’

  • A site reaches adolescence when it has achieved a threshold number of other sites linking to it – this number doesn’t necessarily have to be large – even 1 link from an ‘authority site’ (page rank 3 or higher) seems to be enough to get a site to this stage.
  • During this stage, ‘deep crawls’ of the site become more frequent.
  • New pages appear in the Google index rapidly after they have been crawled, and usually get a ‘honeymoon’ period – Google figures it will give your new pages the benefit of the doubt, and lists your new page in the main index until it has done a thorough crawl of other sites, and seen whether other pages link to it or not.
  • If Google can’t find other sites linking to your new page, it will often drop back to supplemental status within a few days to a week.
  • During adolescence, the majority of your pages will be in supplementals, and you’ll find that those pages that are indexed are pages that have been directly linked to by other sites.

Bot behaviour and the ‘Mature Site’

  • At some stage Googlebot starts to get really interested in your site, crawls become much more regular, and virtually all new original content is indexed. I’ve heard people say that this is due to the ‘trust factor’ – which I suspect is probably a combination of the number and quality of other sites linking to yours, and the number of clicks to your site from google searches, indicating relevance. That is the stage this site (utheguru) has now reached, and I generally find any new article I write is included in the main index within a day, and stays there, regardless of whether other sites link directly to it or not.
  • I call this stage ‘the mature site’, and this is where you should aim to be. Don’t listen to people who say it’s hard – this site is only 2 months old.

In part 2 of this article, I provide strategies that will help you get your pages out of the supplemental index quickly. You can read the next stage of this article here.

{Other Search Phrases – supplemental hell and the misspelt version supplimentals.}

7 comments March 5th, 2007

“Rel=nofollow” steals from the good, do-follow is robin hood.

Did you know that it is possible to ‘steal pagerank‘ by ‘comment spamming‘ – for those of you who aren’t familiar, a few definitions:-

Pagerank – PageRank (aka PR) is one of the methods Google uses to determine the relevance or importance of a Web page. PageRank is a vote, by all the other Web pages on the Internet, about how important a Web page is. A link to a Web page counts as a vote of support. If there are no incoming links to a Web page then there is no support.

Comment Spam – Link spam (also called blog spam or comment spam) is a form of spamming or spamdexing that recently became publicized most often when targeting weblogs (or blogs), but also affects wikis (where it is often called wikispam), guestbooks, and online discussion boards. Any web application that displays hyperlinks submitted by visitors or the referring URLs of web visitors may be a target.

In short, it is possible to use ‘comment spam’ to gain pagerank by writing comments linking back to your own website or blog on high PR blogs, forums and websites that allow user comments.
Great, right? Not necessarily. The ‘gotcha’ is that it can reduce the PR of the originating blog through a process known as ‘bleeding pagerank‘. In effect, these user contributed comments look to Google like a ‘vote’ for the target web-page by the originating webpage.
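To see why, it helps to look at how the ‘vote’ is shared out. The formula from the original PageRank paper goes roughly like this:

    PR(A) = (1 - d) + d * ( PR(T1)/C(T1) + ... + PR(Tn)/C(Tn) )

where T1…Tn are the pages linking to page A, C(T) is the number of outgoing links on page T, and d is a damping factor (usually quoted as 0.85). The important bit is the division by C(T): each page splits its vote across every link it carries, so a pile of extra comment links on a high PR post dilutes the share passed through each of the other links on that page – which is the ‘bleeding’ described above.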

Enter stage right, the NOFOLLOW attribute. Nofollow was introduced to allow website owners to ‘choose’ which links on their pages should be counted as ‘votes’ for pagerank calculation – as per this background to nofollow from Google.
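For anyone who hasn’t seen it in the wild, the attribute is just an extra rel value on an ordinary link – the URL and anchor text below are placeholders:

    A normal link, which can pass pagerank: <a href="http://www.example.com/">great resource</a>
    The nofollow'd version, which Google won't count as a vote: <a href="http://www.example.com/" rel="nofollow">great resource</a>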

For that reason and others, we’ve recently seen a number of large websites implement nofollow on the majority of their outbound links (wikipedia is a prominent example), and wordpress is now set up to nofollow all user comments by default.

So why do I care? Well, I think it’s well accepted that the introduction of nofollow has caused huge fluctuations in the search engine positioning of various websites, as the effect of this change has filtered through the google index.. a bit like the ‘butterfly effect’, such changes to a well established algorithm can amplify throughout the system and cause something called ‘hysteresis’ – or instability in the algorithm – while the whole system gets back to some form of equilibrium.

I wouldn’t mind betting a million bucks that a large proportion of sites that are reporting huge recent drops in their search engine rankings are probably victims of this effect – even if your own site didn’t rely to a great extent on wikipedia links or comment spam links for its pagerank, it could be quite possible that a website that links to you did – and so on, ad infinitum.

As with any system in a state of flux, I’d predict the google index will reach a new equilibrium relatively quickly (within a few months) and people will adopt new ways of gaining pagerank – but as someone with experience in this area (I did a lot of work that used similar types of ‘reward algorithms’ to google’s in my previous incarnation as an Agricultural Scientist), I see unintended side effects of this change down the track. Here’s a little extract from a note I wrote to a Googler recently:-

Your previous missives about nofollow spoke of the fact that it is a great thing, and that backlinks can be built by other white-hat SEO techniques. I’d have to say that, in a lot of cases (one example would be a blog) the backlinks actually only start to build once the content is searchable, so those of us that have rapidly evolving sites designed to answer questions of the moment never ever get listed, even though they provide great answers and unique information – comments on blogs, for example.

My basic feeling is that the big, older, more well established domains are getting bigger and the smaller ones are getting smaller because they never get a chance to have their pages crawled because they are either nofollowed or put in supplemental hell because of ‘a small number of backlinks’, which will only get worse with nofollow.

Could be worthy of a future article – are we getting supp’d because G thinks we are spammers, or are we getting Supp’d because of some other reason? It’s a massive problem that’s diluting the value of Google, for research purposes imho.

Your second point, about not giving any value to links from Blogs is another thing about the (apparent) algo of Google that I find flawed (apart from the growing incidence of supp’s).

In my daily life as a computer and communications engineer with a fairly tangential degree (agricultural science) as well, I’ve learnt a fair bit about gathering information. Obviously, in science in general, the tradition has been that knowledge is built up through lit review and original research.

The original research is then recorded in peer reviewed papers. Any good scientist knows that a good paper is one that references as many other papers about the topic as possible. This means that paper can be a first stop for anyone wanting to know what work has been done before in that area of research.

If you take this model, and apply it to the google algo, the lit review is the google search, the peer review is the comments, the paper is the blog, and the references are the forward links.

Google is about information, and building knowledge. For hundreds and hundreds of years, humanity has built knowledge using the above peer review system. It works.

So, what am I getting at – 3 things –

  1. Blogs are pretty damn close to the peer review system, closer than static pages, IMHO.
  2. By no-following links, bloggers and Google risk penalizing new knowledge rather than encouraging it.
  3. Google needs to consider the effect this will have on its algorithm – new sites and people with great ideas need to be indexed to provide more balanced content and move information forward, rather than remaining static.

I also spoke in my letter about the fact that I believe that sites should be rewarded, not penalized, under the PR system for linking to other sites with great information. To an extent I think they already probably are –

I’ll write more about my thoughts tomorrow in part two of this article –

Ciao,

TheDuck

10 comments March 3rd, 2007

MSN Second Place Ranking for “Windows XP”!

One of our regulars sent me an email just recently about a fellow who boasted about how easy it had been for him to get to number two in the MSN search engine for what one would assume would be ‘highly competitive’ keyphrases, including Windows XP, Windows Media Player and Internet Explorer.

He mentioned that the traffic from MSN had been ‘unbelievable’ as a result, as you would assume it would be, but I think now that his ‘good fortune’ has been exposed, he might find that the sheer number of bloggers writing about it could push him out of second place pretty quickly 😀

Nonetheless, his story triggered my interest a little, so I did a little background work and identified the site in question – Yahoo shows it only has around 630 backlinks, the site content isn’t particularly compelling and there aren’t any obvious search engine optimization strategies being employed.. a look using the www.alexa.com traffic tool (see below) doesn’t show this site doing particularly well traffic wise either, so I’m guessing one of two things:-

  1. His top 2 position in MSN is only very recent, and therefore the spike in traffic he talks of hasn’t been incorporated in the Alexa graph yet.
  2. No-one uses MSN

Both may in fact be true – but still, I’d love to work out how he’s achieved this – it raises the interesting possibility that MSN uses completely different metrics for determining what constitutes an ‘important site’ – I’ll be watching the situation over the coming weeks to see whether he retains his good ranking for long.

Cheers,

TheDuck

http://www.utheguru.com estimated traffic versus the site in question

2 comments March 1st, 2007

Canonicalization and 301 redirects

I’ve been noticing TOO MANY SITES having what’s called canonicalization problems – where both a www and non-www version of the site are indexed.. this is a real problem if you want to maximize traffic and pagerank – in this article I talk about the problem and how to overcome it.

I’d suggest if you’re serious about site optimization, you read here about the 301 redirect.
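To give you a taste of what the fix usually looks like: on an Apache server with mod_rewrite available, a 301 redirect from the non-www to the www version of a site can be set up in the .htaccess file with something like the sketch below (swap in your own domain – treat it as illustrative rather than copy-and-paste ready):

    RewriteEngine On
    RewriteCond %{HTTP_HOST} ^utheguru\.com$ [NC]
    RewriteRule ^(.*)$ http://www.utheguru.com/$1 [R=301,L]

This tells any visitor (or bot) arriving at utheguru.com that the page has permanently moved to the www version, so google eventually folds the two into one listing.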

Continue Reading 15 comments February 1st, 2007
