How do I STOP search engines indexing pages
12-12-2006, 01:36 PM
Strange question for this forum, but I have certain pages which I don't want search engines to index. Can I do this?
Also, I have AdSense for Search on my site, and again I don't want that to find certain pages. Does anyone know how to stop it from finding them?
Thanks, James
Re: How do I STOP search engines indexing pages
12-12-2006, 01:56 PM
Build yourself a 'robots.txt' file and exclude the relevant folders and files.
Info here: www.robotstxt.org and here: www.searchtools.com.
I use a lot of exclusions, usually for development files and client projects. And even if you don't want to exclude anything, a robots.txt file is always a good idea: it's one of the first things a robot looks for when it indexes a site.
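To make the exclusion idea concrete, here is a minimal sketch in Python. The robots.txt content, the folder names and the example.com URLs are all made up for illustration; only the `User-agent` and `Disallow` directives come from the robots.txt convention itself. It uses the standard library's `urllib.robotparser` to check the rules the way a well-behaved robot would.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content: the folder and file names are
# illustrative only, not taken from anyone's real site.
ROBOTS_TXT = """\
User-agent: *
Disallow: /dev/
Disallow: /clients/
Disallow: /private-page.html
"""


def is_allowed(robots_txt: str, url: str, agent: str = "*") -> bool:
    """Return True if a compliant robot named `agent` may fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    for path in ("/index.html", "/dev/test.html", "/private-page.html"):
        url = "http://www.example.com" + path
        print(path, "allowed" if is_allowed(ROBOTS_TXT, url) else "blocked")
```

The file itself would be saved as plain text at the root of the site. Note that this only models robots that choose to obey the file.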
Re: How do I STOP search engines indexing pages
14-12-2006, 01:18 AM
Quote:
"Anyone can access your robots file, not just robots. For example, typing http://www.google.com/robots.txt will get you Google's own robots.txt file. I notice that some new webmasters seem to think that they can list their secret directories in their robots.txt file to prevent that directory from being accessed. Far from it. Listing a directory in a robots.txt file often attracts attention to the directory! In fact, some spiders (like certain spammers' email harvesting robots) make it a point to check the robots.txt for excluded directories to spider."
I want to keep my download pages secret to avoid people stealing the ebooks. Someone would just need to access the robots.txt file to find out all the download pages. Thinking about it, if I put them in a directory called "downloads" I could exclude that.
Thanks Clare, I will look them up. Thanks for the help guys.
James
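The point in the quote can be shown with a few lines of Python. This is only a sketch: the robots.txt text and the `disallowed_paths` helper are hypothetical, written to show how little effort it takes for a visitor (or a badly behaved spider) to turn an exclusion list into a list of places to look.

```python
# Hypothetical robots.txt for a site that tries to hide its downloads.
ROBOTS_TXT = """\
# keep the robots out of these
User-agent: *
Disallow: /downloads/
Disallow: /dev/
"""


def disallowed_paths(robots_txt: str) -> list[str]:
    """Collect every non-empty Disallow value from robots.txt text."""
    paths = []
    for raw in robots_txt.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments
        field, sep, value = line.partition(":")
        if sep and field.strip().lower() == "disallow" and value.strip():
            paths.append(value.strip())
    return paths


if __name__ == "__main__":
    # Anyone who can read the file can read this list too.
    print(disallowed_paths(ROBOTS_TXT))
```

Excluding a single "downloads" directory hides the individual file names from the list, but the directory name itself is still published.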
Re: How do I STOP search engines indexing pages
14-12-2006, 09:05 AM
Use robots.txt to block the spiders and your .htaccess files to password protect the directories you don't want people to access.
Tutorial
Reading the robots.txt file for any website is of course quite legal, but try to access some of the directories listed in the file and you will usually get an 'access denied' message.
The spider issue is easy to overcome, certainly when it comes to email addresses. If you look at my contact page (link below) and view the source code, you will see that I've hidden the email address so it can't be harvested.
Using 'nofollow' and 'noindex' can work, but there is nothing to stop a search engine ignoring the meta tag and indexing the page, in the same way it could ignore the robots.txt instructions. Password protecting specific directories is the only foolproof way to prevent access.
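For anyone unsure what the 'noindex' and 'nofollow' hint looks like, here is a rough Python sketch. The page markup and the `robots_directives` helper are invented for illustration; the only real convention shown is the `<meta name="robots" content="...">` tag. The code reads the tag the way a cooperative crawler might, which also shows why it is only a request: a crawler that never runs this check simply indexes the page.

```python
from html.parser import HTMLParser

# Hypothetical page: the markup is illustrative only.
PAGE = """<html><head>
<meta name="robots" content="noindex, nofollow">
<title>Download</title>
</head><body>...</body></html>"""


class RobotsMetaParser(HTMLParser):
    """Collects directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("name") or "").lower() != "robots":
            return
        content = attr.get("content") or ""
        for part in content.split(","):
            if part.strip():
                self.directives.add(part.strip().lower())


def robots_directives(html: str) -> set:
    """Return the set of robots meta directives found in `html`."""
    parser = RobotsMetaParser()
    parser.feed(html)
    parser.close()
    return parser.directives


if __name__ == "__main__":
    found = robots_directives(PAGE)
    print("index this page?", "noindex" not in found)
```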
Re: How do I STOP search engines indexing pages
14-12-2006, 10:08 AM
It might be useful to visit the Matt Cutts site for info about the robots.txt file and preventing access to areas of your site. For those who don't know, Matt Cutts is a sort of 'Anti Spam PR Engineer' for Google. Most of his comments come straight from the horse's mouth, as it were.
The site is also pretty good because he seems to link to the most accurate sites in the outside world that talk about SEO, .htaccess, robots.txt and all the other wiggly internet bits of interest.