Best Robots.txt For WordPress
As has been discussed many times before, robots.txt plays a very important role in search engine optimization. Search engine robots look at your robots.txt before crawling your site. WordPress, even though it is very search-engine friendly with its permalinks, seems to have problems with the latest search engine algorithms. This is caused by what we call a “duplicate page filter”: when different URLs point to the same content, search engines like Google may treat the content as copied and can penalize your whole blog!
For example, the following two URLs point to the same content in WordPress:
domain.com/category/category-name/post-name
domain.com/category-name/post-name
To keep the duplicate content filter from hitting your WordPress blog, I have come up with this exclusive robots.txt file, so you can get the most out of search engines.
User-agent: Googlebot
Disallow: /wp-content/
Disallow: /trackback/
Disallow: /wp-admin/
Disallow: /feed/
Disallow: /archives/
Disallow: /sitemap.xml
Disallow: /index.php
Disallow: /*?
Disallow: /*.php$
Disallow: /*.js$
Disallow: /*.inc$
Disallow: /*.css$
Disallow: */feed/
Disallow: */trackback/
Disallow: /page/
Disallow: /tag/
Disallow: /category/

User-agent: Googlebot-Image
Disallow: /wp-includes/

User-agent: Mediapartners-Google*
Disallow:

User-agent: ia_archiver
Disallow: /

User-agent: duggmirror
Disallow: /
This code lets Google Images index all files except the ones in the includes folder, lets the Google AdSense bot visit every page of your blog, and makes Googlebot ignore the other, unnecessary duplicate-content pages. It also blocks ia_archiver and duggmirror from the whole site. Adding this robots.txt file can increase your traffic by letting Google pay more attention to your important pages and discard the duplicate ones.
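If you want to check which URLs the Googlebot rules would block before uploading the file, a rough Python sketch like the one below can help. It is only an approximation: it assumes Google-style pattern matching (`*` matches any run of characters, a trailing `$` anchors the end of the URL, anything else is a plain prefix), it looks only at the Googlebot group, and the names `rule_to_regex` and `is_blocked` are made up for this example.

```python
import re

# Disallow rules from the "User-agent: Googlebot" group of the file above.
GOOGLEBOT_DISALLOW = [
    "/wp-content/", "/trackback/", "/wp-admin/", "/feed/", "/archives/",
    "/sitemap.xml", "/index.php", "/*?", "/*.php$", "/*.js$", "/*.inc$",
    "/*.css$", "*/feed/", "*/trackback/", "/page/", "/tag/", "/category/",
]


def rule_to_regex(rule):
    """Turn one Disallow pattern into a compiled regex.

    Assumed semantics: '*' matches any run of characters, a trailing '$'
    anchors the end of the URL, and the rest is a literal prefix.
    """
    anchored = rule.endswith("$")
    body = rule[:-1] if anchored else rule
    # Escape the literal pieces and join them with '.*' where '*' stood.
    pattern = ".*".join(re.escape(part) for part in body.split("*"))
    return re.compile(pattern + (r"\Z" if anchored else ""))


def is_blocked(path, rules=GOOGLEBOT_DISALLOW):
    """Return True if any Disallow rule matches `path` from its start."""
    return any(rule_to_regex(rule).match(path) for rule in rules)


if __name__ == "__main__":
    # The two duplicate URLs from the example, plus a few typical extras.
    for path in [
        "/category/category-name/post-name",
        "/category-name/post-name",
        "/category-name/post-name/feed/",
        "/?p=123",
        "/wp-includes/js/jquery.js",
    ]:
        print(path, "->", "blocked" if is_blocked(path) else "allowed")
```

Of the two duplicate URLs from the example, only the `/category/...` version is blocked under these assumptions, so Googlebot is left with a single copy of each post.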


