“Rel=nofollow” steals from the good, do-follow is Robin Hood.

March 3rd, 2007

Did you know that it is possible to ‘steal PageRank’ by ‘comment spamming’? For those of you who aren’t familiar, a few definitions:

Pagerank – PageRank (aka PR) is one of the methods Google uses to determine the relevance or importance of a Web page. PageRank is a vote, by all the other Web pages on the Internet, about how important a Web page is. A link to a Web page counts as a vote of support. If there are no incoming links to a Web page then there is no support.
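To make the ‘voting’ idea concrete, here is a minimal sketch of the simplified, textbook form of PageRank. It is not Google's production algorithm. The damping factor of 0.85 is the value usually quoted from the original PageRank paper; the function name `pagerank` and the four-page web are made up for illustration, and every link target is assumed to appear as a key in the dictionary.

```python
def pagerank(links, damping=0.85, iterations=100):
    """Simplified PageRank by repeated vote passing.

    links maps each page to the list of pages it links to.
    """
    pages = list(links)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Rank held by pages with no outgoing links is shared evenly.
        dangling = sum(rank[p] for p in pages if not links[p])
        new_rank = {p: (1.0 - damping) / n + damping * dangling / n for p in pages}
        # Each page splits its vote equally among the pages it links to.
        for page, targets in links.items():
            for target in targets:
                new_rank[target] += damping * rank[page] / len(targets)
        rank = new_rank
    return rank


# Hypothetical web: pages "c" and "d" link out but nobody links to them.
web = {"a": ["b"], "b": ["a"], "c": ["a"], "d": ["a"]}
ranks = pagerank(web)
for page in sorted(ranks, key=ranks.get, reverse=True):
    print(page, round(ranks[page], 4))
```

In this toy web, "a" collects votes from three pages and ends up on top, while "c" and "d", which have no incoming links, are left with only the small baseline share every page gets.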

Comment Spam – Link spam (also called blog spam or comment spam) is a form of spamming or spamdexing that recently became publicized most often when targeting weblogs (or blogs), but also affects wikis (where it is often called wikispam), guestbooks, and online discussion boards. Any web application that displays hyperlinks submitted by visitors or the referring URLs of web visitors may be a target.

In short, it is possible to use ‘comment spam’ to gain PageRank by writing comments that link back to your own website or blog on high-PR blogs, forums and websites that allow user comments.
Great, right? Not necessarily. The ‘gotcha’ is that it can reduce the PR of the originating blog through a process known as ‘bleeding PageRank’. In effect, these user-contributed comments look to Google like a ‘vote’ for the target web page by the originating web page.
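The ‘bleeding’ effect can be sketched with the simplified, textbook form of PageRank (not Google's production algorithm). The page names, the three-page web and the damping factor of 0.85 are assumptions made purely for illustration. The only difference between the two link graphs is a single comment link from the blog post to the commenter's site.

```python
def pagerank(links, damping=0.85, iterations=100):
    """Simplified PageRank; links maps each page to the pages it links to."""
    pages = list(links)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Rank held by pages with no outgoing links is shared evenly.
        dangling = sum(rank[p] for p in pages if not links[p])
        new_rank = {p: (1.0 - damping) / n + damping * dangling / n for p in pages}
        for page, targets in links.items():
            for target in targets:
                new_rank[target] += damping * rank[page] / len(targets)
        rank = new_rank
    return rank


def blog_total(rank):
    """Combined rank of the two hypothetical blog pages."""
    return rank["blog_home"] + rank["blog_post"]


# Before: the blog's two pages link only to each other.
before = {
    "blog_home": ["blog_post"],
    "blog_post": ["blog_home"],
    "commenter_site": [],
}
# After: a comment on the post adds a followed link to the commenter's site.
after = {
    "blog_home": ["blog_post"],
    "blog_post": ["blog_home", "commenter_site"],
    "commenter_site": [],
}

rank_before = pagerank(before)
rank_after = pagerank(after)
print("blog before:", round(blog_total(rank_before), 3))
print("blog after: ", round(blog_total(rank_after), 3))
print("commenter before:", round(rank_before["commenter_site"], 3))
print("commenter after: ", round(rank_after["commenter_site"], 3))
```

Under these assumptions the blog's combined rank falls and the commenter's site rises once the comment link is counted, because the post's vote is now split between its own home page and an outside site.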

Enter stage right, the NOFOLLOW attribute. Nofollow was introduced to allow website owners to ‘choose’ which links on their pages should be counted as ‘votes’ for pagerank calculation – as per this background to nofollow from Google.
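In markup terms, nofollow is a value of a link's `rel` attribute, e.g. `<a href="..." rel="nofollow">`. As a rough sketch of how a crawler might separate the links that count as votes from those that don't, the snippet below uses Python's standard `html.parser` module. The class name `LinkCollector`, the helper `split_links` and the sample comment HTML are hypothetical, and this makes no claim about how Google's own crawler is written.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Sorts <a href> links into followed and nofollowed lists."""

    def __init__(self):
        super().__init__()
        self.followed = []
        self.nofollowed = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr = dict(attrs)
        href = attr.get("href")
        if not href:
            return  # named anchors and empty links carry no vote
        # rel may hold several space-separated tokens, e.g. "external nofollow".
        rel_tokens = (attr.get("rel") or "").lower().split()
        if "nofollow" in rel_tokens:
            self.nofollowed.append(href)
        else:
            self.followed.append(href)


def split_links(html):
    """Return (followed, nofollowed) link targets found in an HTML fragment."""
    parser = LinkCollector()
    parser.feed(html)
    parser.close()
    return parser.followed, parser.nofollowed


# Hypothetical comment markup with one ordinary link and one nofollowed link.
comment_html = (
    '<p>Nice post, see <a href="http://example.com/guide">this guide</a> and '
    '<a href="http://example.org/shop" rel="external nofollow">my shop</a>.</p>'
)
followed, nofollowed = split_links(comment_html)
print("counted as votes:", followed)
print("ignored:", nofollowed)
```

Only the links in the first list would be fed into a PageRank-style calculation; the second list is what the site owner has chosen not to vouch for.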

For that reason and others, we’ve recently seen a number of large websites implement nofollow on the majority of their outbound links (Wikipedia is a prominent example), and WordPress is now set up to ‘nofollow’ all links in user comments by default.

So why do I care? Well, I think it’s well accepted that the introduction of nofollow has caused huge fluctuations in the search engine positioning of various websites, as the effect of this change has filtered through the Google index. A bit like the ‘butterfly effect’, such changes to a well-established algorithm can amplify throughout the system and cause something akin to ‘hysteresis’, a lagging instability in the algorithm, while the whole system gets back to some form of equilibrium.

I wouldn’t mind betting a million bucks that a large proportion of the sites reporting huge recent drops in their search engine rankings are victims of this effect. Even if your own site didn’t rely to a great extent on Wikipedia links or comment spam links for its PageRank, it is quite possible that a website that links to you did, and so on ad infinitum.

As for any system in a state of flux, I’d predict the Google index will reach a new equilibrium relatively quickly (within a few months or so) and people will adopt new ways of gaining PageRank.

But as someone with experience in this area (I did a lot of work that used similar types of ‘reward algorithms’ to Google’s in my previous incarnation as an Agricultural Scientist), I see unintended side effects of this change down the track. Here’s a little extract from a note I wrote to a Googler recently:

Your previous missives about nofollow spoke of the fact that it is a great thing, and that backlinks can be built by other white-hat SEO techniques. I’d have to say that, in a lot of cases (one example would be a blog) the backlinks actually only start to build once the content is searchable, so those of us that have rapidly evolving sites designed to answer questions of the moment never ever get listed, even though they provide great answers and unique information – comments on blogs, for example.

My basic feeling is that the big, older, more well established domains are getting bigger and the smaller ones are getting smaller because they never get a chance to have their pages crawled because they are either nofollowed or put in supplemental hell because of ‘a small number of backlinks’, which will only get worse with nofollow.

Could be worthy of a future article – are we getting supp’d because G thinks we are spammers, or are we getting supp’d for some other reason? It’s a massive problem that’s diluting the value of Google for research purposes, IMHO.

Your second point, about not giving any value to links from Blogs is another thing about the (apparent) algo of Google that I find flawed (apart from the growing incidence of supp’s).

In my daily life as a computer and communications engineer with a fairly tangential degree (agricultural science) as well, I’ve learnt a fair bit about gathering information. Obviously, in science in general, the tradition has been that knowledge is built up through lit review and original research.

The original research is then recorded in peer-reviewed papers. Any good scientist knows that a good paper is one that references as many other papers about the topic as possible. This means that the paper can be a first stop for anyone wanting to know what work has been done before in the area of research.

If you take this model and apply it to the Google algo, the lit review is the Google search, the peer review is the comments, the paper is the blog, and the references are the forward links.

Google is about information, and building knowledge. For hundreds and hundreds of years, humanity has built knowledge using the above peer review system. It works.

So, what am I getting at – 3 things –

  1. Blogs are pretty damn close to the peer review system, closer than static pages, IMHO.
  2. By no-following links, bloggers and Google risk penalizing new knowledge rather than encouraging it.
  3. Google needs to consider the effect this will have on its algorithm – new sites and people with great ideas need to be indexed to provide more balanced content and move information forward, rather than remaining static.

I also spoke in my letter about the fact that I believe that sites should be rewarded, not penalized, under the PR system for linking to other sites with great information. To an extent I think they already probably are –

I’ll write more about my thoughts tomorrow in part two of this article –

Ciao,

TheDuck


Entry Filed under: SEO Discussions


10 Comments

  • 1. John  |  March 3rd, 2007 at 3:48 am

    How much is a million “bucks” in real world currency? I’d love to take you up on that :D. I have only seen a handful of sites that I could trace back to the Wikipedia nofollow changeover.

    My only complaint about the “rel=nofollow” is that browsers do not signal it by default. I have no problem with allowing sites to link in a way that does not “vouch” for the link, but this lack of trust needs to be visible to the user. If this were so, then I am fairly sure that the Wikipedia would not have adopted such a wide-spread use of the rel=nofollows. They would have worked on a system to determine the “trustability” of links, either from age or peer review.

    The same goes for blogs – if your blog is static and just collecting comment spam, it should have nofollows in the comments. The owner would want to make sure that this is so because users would see that he does not trust those links. If a blog is “living” and properly maintained, then the owner might choose to nofollow the links in the beginning, but allow them to become static links over time – assuming they are not deleted or manually marked as being “tricky”.

    Once everyone sees the rel=nofollow and knows what it means, people will start to use it in a responsible way.

    My only other wish is that there be a differentiation between a “bots should not follow” nofollow and an “untrusted link” nofollow.

  • 2. DuckMan  |  March 3rd, 2007 at 4:24 am

    Under what circumstances, John, would you think that ‘bot should not follow’ would be warranted? I’m guessing for testing purposes?

    Also J, I’ve taken your suggestion and implemented JLH’s do follow policies (except for the bit about no-following developers’ links in the footer which I had to slap him around a bit over 🙂 )

    Currently I age links on this site for 7 days before they become followable – really keen to get people to link good high quality posts, informational pieces and tools that add to the debate.

    M

  • 3. JohnMu  |  March 3rd, 2007 at 6:48 am

    Did you see this: http://www.marketingpilgrim.com/2007/03/googles-lasnik-wishes-nofollow-didnt-exist.html ? 🙂

    “Bots should not follow” would be perfect for the obvious endless crawl areas, eg calendar scripts. Ever try to crawl a site with a calendar script? Try it.

    Good job on implementing JLH’s linking policies! It doesn’t have to be the same, but it’s good to have a policy.

  • 4. DuckMan  |  March 4th, 2007 at 2:32 am

    That’s a really interesting article John – I reckon Adam could feel a bit ripped off about the melodramatic title and first few paragraphs – but hey – good link bait, and interesting to hear the diversity of views on the issue in their comments.

    You know my views on no-follow – I think similar could be achieved with a much simpler solution that requires no webmaster input – but that’s for another article once I’ve thought it through properly 😉

    M

  • 5. JLH  |  March 29th, 2007 at 7:45 pm

    I updated my policy, changed my stance on those links 🙂

  • 6. Media Training  |  October 2nd, 2008 at 8:36 pm

    I feel ‘comment inbound’ links work to a certain extent, but I do feel that Google does not place much weight on them and it won’t be long before they are ironed out like the Wiki links. Do Follow won’t last forever!

  • 7. Sohbet  |  October 9th, 2008 at 8:15 am

    thanks

  • 8. resimler  |  January 14th, 2009 at 11:08 am

    thanks all

  • 9. yzen  |  February 24th, 2009 at 8:23 am

    thanks for everything!

  • 10. T42p  |  April 18th, 2009 at 9:27 am

    good article, thanks
