An Exclusion URL or Exclusion Mask tells the spider where NOT to go. This is useful when you have data that you do not want crawled and therefore do not want appearing on the search results pages. You can enter an entire URL, or you can enter a URL Mask. An Exclusion Mask lets you enter just part of a URL; if the spider encounters a hyperlink that contains the Exclusion Mask, it skips that URL.
NOTE: Exclusion URLs and Masks are particularly useful if you do not have access to edit the robots.txt file. You can toggle whether the spider honors robots.txt in the Spidering Options.
Enter only one Exclusion URL or Mask per line.
Example:
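(a representative set of entries; example.com stands in for your own domain)
http://www.example.com/old_site/
private.html
/private_directory/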
In the example above, the first entry is an entire URL of a directory that will not be spidered.
The second entry instructs the spider not to crawl any document named private.html, regardless of its directory.
The third entry instructs the spider not to crawl any document in any directory named /private_directory/.