The Biggest Problem Of Google Search Post Panda Algorithmic Update
By Amit Banerjee on May 9th, 2011

It has been nearly one month since Google rolled out the famous Panda update, which is geared towards returning high-quality sites in search results instead of content farms, scraper sites and sites that don’t produce original content.

Since the day Google’s Panda update went global, a lot of websites have seen an overwhelming change in their organic traffic. A lot of high-traffic websites practically disappeared from Google search, while Google’s own properties saw an increase in search traffic. Searchmetrics has the complete details about the early losers and winners of Google’s Panda update.

Recently, Google gave webmasters affected by the Panda update some additional guidelines on how they can focus on delivering the best possible user experience. Google clearly says that the Panda algorithmic update incorporates user feedback as a strong signal, so website owners should focus on making their content or service as user-friendly as possible.

Some Questions To Ask and Ponder Upon

The Webmaster Tools blog post lists all the questions site owners should ponder. I will highlight some specific questions from that article here:

  • Would you trust the information presented in this article? As a user, I judge trust by looking at the cover and the symbol associated with it. If I have never heard of a website before, and it has an ugly design and bombards me with popups and subscription boxes every second, I most likely won’t trust the content or the service it provides. There is also a good chance that I will tell my friends not to visit that website. Inference: Blog design matters. So does usability.
  • Is this the sort of page you’d want to bookmark, share with a friend, or recommend? Google wants to gauge the feel-good factor of a particular page by judging whether users would want to share the content with their friends or family members. I think this is doubtful, because not all users can judge the usefulness of every page at first glance. Example: Consider a set of users who are searching for information about Mother’s Day. All they want is to read quotations and facts about Mother’s Day, and if you take them to this Wikipedia page, chances are that many won’t read it top to bottom. They might instead prefer this page, which has a list of quotations about Mother’s Day. The fact is that not all users can judge the usefulness of a webpage, which is why we need search engines to return the best possible page. Inference: No doubt Google is taking social sharing as one of its ranking signals, which can be gamed thanks to retweet clubs, spamming each other’s Facebook fan pages, paying someone to get 100 stumbles; the list is endless.
  • Was the article edited well, or does it appear sloppy or hastily produced? Google wants to know the grammatical level of a webpage and how well written it is. This is obvious, because search engines don’t want to return a page which is sloppy. But there is no guarantee that a badly written page does not contain useful information. I know a couple of friends who don’t have good writing skills, but they are really good at their subject. They don’t know how to write like a seasoned blogger, who on the other hand knows how to collect information from different sources, mask it and produce a blog post.
  • Does this article have an excessive amount of ads that distract from or interfere with the main content?

There are more questions in the blog post, and the probable answers vary from one webmaster to another. However, the biggest consequence of Google’s Panda update remains unaddressed: scraper sites are outranking the original sources of the content they copy.

Google’s Algorithm Doesn’t Make Exceptions

An algorithm works the same way for everyone. Whether you are a spammer or a genuine source, an algorithm doesn’t care. It will continue to work the way it is designed, and this is exactly where the problem begins.

There is no way a machine or a computer program can accurately determine whether John or Harry wrote a piece of content. Before the Panda update was rolled out, Google did a good job of keeping the scrapers away and showing the original article on top.

Here are a few examples which show that Google’s so-called algorithm is not able to differentiate between the real source and the copycats.

Example #1: Matt Cutts’s Personal Blog

Performing a search for this string (not an exact match) from Matt’s post on overdoing URL removals shows the following result:

[Screenshot: spam site ranking ahead]

So what do we have here?

1. Matt’s blog post is nowhere to be found on the first page.

2. The same scraper site holds the first three results on the SERP, which contradicts Google’s own claim that it tends to show content from different domains in search results. There is more: Google thinks that this piece of content is exclusively genuine and unique to the spam site, so it shows the little suggestion “Read more content from this source”. Ridiculous!

3. Google also offers a translated version of another spam site that has copied the original article completely, from top to bottom.

Now consider the following facts:

  • Matt’s blog is highly informative and considered an authority site on the subject.
  • Matt’s blog is a trusted source among users.
  • There are ZERO advertisements on the blog.
  • It has good quality backlinks, high domain age, good social influence and a decent design.
  • Google PageRank: 7.

So why is Matt’s page not shown at all in the SERPs?

Example #2: Search Engine Land

Performing a search for this string (not an exact match) from Greg’s post at SearchEngineLand shows the following results:

[Screenshot: scraped content]

The same thing holds true for SearchEngineLand’s article. The original article is nowhere to be found on the first page, while Google thinks it is fine to return even the auto-generated RSS feed on the scraper site. And why does Googlebot fail to read the title “Latest News”?

For the record, SearchEngineLand has Google PR 7, and it’s a high quality site with genuine content and reports on search analysis. In fact, it is one of the oldest sites to break developments and news about search engines. In this case, the latter is not giving due credit to the former.

Example #3: Techcrunch.com

Performing a search for this string (not an exact match) from Alexia’s post on Techcrunch shows the following result:

[Screenshot: spam sites ranking ahead]

Techcrunch’s original article is not shown anywhere on the first page, while the first result takes you to an ad-laden page with no content.

Note: On performing the above example searches, you might see different results on the search result pages. The search ranking for a particular phrase can change at any second, and it also depends on your geographical location, amongst other factors. If you are seeing different results from those shown in the screenshots above, you might want to check the following video, where I have performed the above example searches one by one:

Now, looking at the above examples, we come back to the same refrain: “Physician, first heal thyself.”

The sole purpose of Google’s new algorithm was to remove content farms, scraper sites and people who blindly copy the whole or part of an article from the original source. But the results tell a different story: there are numerous occasions when the original content is nowhere to be found in the search results. And it’s not only us saying this; the folks at SEOmoz and Ubergizmo have produced their reports here and here.

Judging the quality of content comes later. First you should find out who the real source is, and Google is failing terribly here.

If you are a webmaster and find that scrapers are outranking you for the content you have written, I am afraid there is not much you can do. It is the algorithm that is supposed to detect and differentiate between the source and the scraper, and if it is failing to do its job, nothing is in your hands.

You can file DMCA complaints and take down the scraper sites one by one, but this is impossible for sites having thousands of pages.
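
To make the source-versus-scraper problem concrete, here is a minimal Python sketch of one naive way a program could try to attribute an article. It is only an illustration and not how Google works: the function names, the sample URLs, the `first_seen` field and the 0.8 threshold are all assumptions made up for this example. It groups pages whose text is nearly identical to the article (using word shingles and Jaccard similarity) and then treats the page that was seen first as the probable source.

```python
from datetime import datetime


def shingles(text, k=5):
    """Return the set of k-word shingles (overlapping word runs) in text."""
    words = text.lower().split()
    if len(words) < k:
        return {" ".join(words)} if words else set()
    return {" ".join(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets: size of overlap / size of union."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def rank_by_first_seen(article_text, pages, threshold=0.8):
    """Return URLs of pages that are near-duplicates of article_text,
    ordered by when each page was first seen (earliest first).

    Assumption: the earliest page is the probable source. This fails
    whenever a scraper happens to be crawled before the original.
    """
    target = shingles(article_text)
    matches = [p for p in pages
               if jaccard(target, shingles(p["text"])) >= threshold]
    matches.sort(key=lambda p: p["first_seen"])
    return [p["url"] for p in matches]


# Hypothetical sample data, invented for this example.
ARTICLE = ("Google rolled out the Panda update to push content farms "
           "and scraper sites down the rankings")

pages = [
    {"url": "https://scraper.example/copy",
     "text": ARTICLE + " read more",
     "first_seen": datetime(2011, 5, 2)},
    {"url": "https://original.example/post",
     "text": ARTICLE,
     "first_seen": datetime(2011, 5, 1)},
    {"url": "https://unrelated.example/soup",
     "text": "A recipe for lentil soup with cumin, garlic and lemon juice",
     "first_seen": datetime(2011, 4, 1)},
]

ranked = rank_by_first_seen(ARTICLE, pages)
print(ranked)
```

Finding the copies is the easy part. Deciding which copy came first depends entirely on the `first_seen` timestamps, and a crawler that reaches the scraper before the original would get the attribution wrong.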

Come on, Google! We have circled back to the point where we were before: meaningful results. That’s what Google search is known for, and everybody used to appreciate that. You have changed the rules of the game, but please don’t take away the real players and let robots dominate the search results.

Wake up!

Author: Amit Banerjee
Amit has been writing for Techie Buzz since early 2009 and keeps a close eye on web apps, Google and all things tech. He also writes at his own tech blog, Ampercent. Follow him on Twitter @amit_banerjee

Amit Banerjee can be contacted at amit@techie-buzz.com.
  • http://www.shoutmeloud.com Harsh Agrawal

    Nice article, Amit. No doubt the new Panda algo has changed many businesses, and if Google doesn’t fix it, soon the Internet will be bombarded with auto RSS feed blogs, or Bing might take search share.
    I have already started shifting to Bing search.

  • http://webtrickz.com Mayur

    Nice writeup, Amit. The examples (screenshots) given here are precise and make clear how this Panda update is a curse for genuine webmasters.

    I’m really hoping to get an answer to this: “Why are scraper sites outranking the original content?”

  • http://www.techarraz.com Chinmoy Kanjilal

    This raises some serious questions about Google’s Panda update. Pushing real topic sources down the search result page is one thing, but sending every kind of source to oblivion, whether it is an authoritative domain or a trusted source, is outrageous.

  • http://techofweb.com Atul Bansal

    Nice article, Amit. It’s true that Google’s new algorithm is ranking the scraper sites ahead of original content. I don’t know what changes Google has implemented, or what the future of Google will be.

    After this Panda update, I can see such examples in a vast number of places on the net, showing that Google’s search indexing and ranking have become worse than before.

 