Duplicate Content Problems with the Drupal Quicktabs Module
I'm a big fan of the Quicktabs module, which makes it easy to create tabbed content in Drupal. However, there is an SEO problem you should be aware of when using it.
On my Pitbulls.org site, I noticed Google indexing multiple versions of the same page. This is known as duplicate content, and it can hurt your rankings. Since I had already modified my robots.txt file specifically to prevent this sort of thing, I was a little confused.
The cause turned out to be simple. Quicktabs creates "unique" URLs for the pages it is displayed on so that it can perform the correct action when a user clicks a tab.
So on my site, for instance, the "Most Popular" tab is linked with "http://www.pitbulls.org/content/welcome-pitbullsorg?quicktabs_1=0#quicktabs-1"
And Google indexes this link, thinking it's a separate page from http://www.pitbulls.org/content/welcome-pitbullsorg. But it's not.
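You can see exactly where the duplication comes from by taking the two URLs apart. A minimal sketch in Python, using the standard library's URL parser (the URLs are the real ones from my site):

```python
from urllib.parse import urlparse

# The two URLs Google was indexing as if they were separate pages.
original = "http://www.pitbulls.org/content/welcome-pitbullsorg"
tabbed = "http://www.pitbulls.org/content/welcome-pitbullsorg?quicktabs_1=0#quicktabs-1"

a, b = urlparse(original), urlparse(tabbed)

# The path is identical; only the query string and fragment differ.
# That is exactly why the content behind both URLs is the same.
print(a.path == b.path)      # True
print(b.query)               # quicktabs_1=0
print(b.fragment)            # quicktabs-1
```

Since search engines treat distinct query strings as distinct pages, every tab link becomes another copy of the same content.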
Don't worry. The solution is simple.
In your robots.txt file, put the line: Disallow: /*quicktabs_*
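Google treats "*" in a robots.txt rule as a wildcard matching any run of characters. As a rough sketch of those semantics (the blocked_by helper is hypothetical, for illustration only; it is not part of any library), you can check which URLs the rule above catches:

```python
import re

def blocked_by(pattern: str, path: str) -> bool:
    """Rough sketch of Google's robots.txt wildcard matching:
    '*' matches any run of characters, and the rule matches
    from the start of the URL path (query string included)."""
    regex = "^" + re.escape(pattern).replace(r"\*", ".*")
    return re.match(regex, path) is not None

rule = "/*quicktabs_*"

# The tabbed URL is blocked from crawling...
print(blocked_by(rule, "/content/welcome-pitbullsorg?quicktabs_1=0"))  # True
# ...while the plain page stays crawlable.
print(blocked_by(rule, "/content/welcome-pitbullsorg"))                # False
```

So the rule blocks every URL containing a quicktabs query variable without touching the normal pages.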
And this little problem is solved.
UPDATE: After some more research, the above might not resolve the problem completely. Google cannot crawl the URL, but it can still see it and may put it in the index. A robots.txt block is better than nothing, but links may still pass PageRank to a page you don't want to rank.
So there are two better ways to solve this. If you use either of them, do not also use Disallow in robots.txt, because then the bot will never crawl the page and see what you are doing.
One is the canonical tag. This tells search engines to look to another page for the "real" content. It is also a useful tag for pages of sortable data: the data is essentially the same, just in a different order, so you don't want search engines to treat the sorted views as separate pages. For example, let's say you had http://www.example.com/data, and sorting the data produced the URL http://www.example.com/data?sort=blahblahblah.
Since it's a different URL, a search engine will treat it as a different page. But you don't want that. So you add:
<link rel="canonical" href="http://www.example.com/data"/>
between the <head> tags of the http://www.example.com/data?sort=blahblahblah page so that it points back to the core page.
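For simple cases like this, the canonical target is just the URL with the query string dropped. A minimal sketch (canonical_url is a hypothetical helper, not a real library function):

```python
from urllib.parse import urlsplit, urlunsplit

def canonical_url(url: str) -> str:
    """Strip the query string and fragment, keeping scheme, host, and path.
    Suitable for generating the href of a canonical tag when query
    variables only change presentation, not content."""
    parts = urlsplit(url)
    return urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))

sorted_view = "http://www.example.com/data?sort=blahblahblah"
print(canonical_url(sorted_view))  # http://www.example.com/data
```

Note that this only makes sense when the query variables don't change the actual content; a URL like /data?page=2 would need different handling.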
You can read more about it at http://www.google.com/support/webmasters/bin/answer.py?answer=139394
The Nodewords module can help you add canonical tags to nodes, so even when the path carries extra query variables, it will point to the correct canonical URL.
The other is the robots meta tag. If a page has a robots "NOINDEX, FOLLOW" meta tag, search engines will crawl the page but will not add it to the index, so the page will not show up in the results and will not count as duplicate content.
<meta name="robots" content="noindex, follow">
The "follow" is very important. It allows any links on the page to continue to pass link juice to other pages.
Nodewords can also help you add this tag to nodes and pages, but it is less useful in this case because you can't easily target only the URLs with extra query strings. And misuse of the NOINDEX tag can have severe consequences for your site.
So I recommend you use the "canonical URLs" solution.