Block Google and bots using htaccess and robots.txt
Keeping some pages out of the SERPs is just as important as getting your pages listed in them. This is especially true when you run a web development company and want to show clients work-in-progress sites on a demo server.
Solution 1 : Password protection
Protecting the site with an htaccess password is the best way to block anyone else from accessing it. But that is not always possible when you need a demo audience to test the site.
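As a quick sketch of what that looks like (the file paths and the "Demo Site" realm name are placeholders, not taken from this post, so adjust them to your server):

```apache
# .htaccess in the demo site's document root
AuthType Basic
AuthName "Demo Site"
# Keep the password file OUTSIDE the web root
AuthUserFile /home/user/.htpasswd
Require valid-user
```

Create the password file once with `htpasswd -c /home/user/.htpasswd demo` and share the credentials with your demo audience.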
Solution 2 : Robots.txt
Another solution Google provides is a robots.txt file that tells bots not to crawl your pages or list them in results. But that's not always enough: Google's Matt Cutts has confirmed that Google may still include pages from such sites in its results if it thinks they are relevant.
User-agent: *
Disallow: /
Solution 3 : Using .htaccess RewriteCond
So the solution is to block Google and similar bots from accessing your site at all. To do that, put the following code in your .htaccess file:
RewriteEngine on
RewriteCond %{HTTP_USER_AGENT} AltaVista [OR]
RewriteCond %{HTTP_USER_AGENT} Googlebot [OR]
RewriteCond %{HTTP_USER_AGENT} msnbot [OR]
RewriteCond %{HTTP_USER_AGENT} Slurp
RewriteRule ^.*$ http://htmlremix.com [R=301,L]
Change the URL in the last line to your main site, so that your main site gets the SEO benefit if someone links to the blocked site.
Download this htaccess file to block other common bots as well.
Great effort and collection which are very useful to many people, thanks for sharing the info with us
$userAgent = $_SERVER['HTTP_USER_AGENT'];
if (stristr(strtolower($userAgent), 'googlebot')) {
    // e.g. redirect the bot away, or exit
}
Ravi, I think we can keep the htaccess in the directory which we do not want to be indexed. Right? I do not know whether this will be the right solution or not. It's just a thought...
Good post
how can we block google from indexing a specific directory using .htaccess?
Can you please help
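One way to do this (a sketch only; "private" is a placeholder directory name, not from the original post) is to reuse the same RewriteCond technique from the article, but limit the rule to that directory:

```apache
# .htaccess in the site root — block Googlebot from one directory
RewriteEngine on
RewriteCond %{HTTP_USER_AGENT} Googlebot
# Return 403 Forbidden for anything under /private/
RewriteRule ^private/ - [F,L]
```

Alternatively, if mod_headers is enabled, an .htaccess file inside that directory with `Header set X-Robots-Tag "noindex, nofollow"` asks Google not to index it while still allowing the pages to load.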
Thanks bro..
i will try this step.. 🙂