Interface Summary | |
IDownloadFilter | A filter that indicates whether a URL should be downloaded. |
IIndexFilter | Indicates whether a URL or filename should be indexed. |
ISperowiderFilter | A convenience interface that wraps all required Sperowider filter interfaces. |
ISpiderFilter | A filter for spidering URLs found in web pages. |
Class Summary | |
ADumbIndexFilter | A temporary way of implementing index filtering for Sperowider. |
AIncludeExcludeFilter | Provides a basic frame for file/url filtering. |
BlocksAllFilter | A URL Filter that says "no" to every candidate URL. |
NoHopRegexSperowiderFilter | This class functions as a filter that implements No-Hop logic using Regex for downloading and spidering. |
NoHopSimpleSperowiderFilter | This class functions as a filter that implements No-Hop logic using Regex for downloading and spidering. |
OneHopRegexSperowiderFilter | This class functions as a filter that implements One-Hop logic using Regex for downloading and spidering. |
OneHopSimpleSperowiderFilter | This class functions as a filter that implements One-Hop logic using Regex for downloading and spidering. |
PatternMatcher | Does regex-style pattern matching in support of regex-based URL filters like RegexFilter. |
RegexFilter | A regex-based implementation of AIncludeExcludeFilter. |
RegexURLFilter | Deprecated. Use NoHopRegexSperowiderFilter instead of this class. |
SimpleFilter | An implementation of AIncludeExcludeFilter that uses the filter rules from SimpleMatcher. |
SimpleMatcher | Does simple-style pattern matching in support of simple URL filters like SimpleFilter. |
SimpleURLFilter | Deprecated. Use NoHopSimpleSperowiderFilter instead of this class. |
URLFilter | Deprecated. Use NoHopSimpleSperowiderFilter instead. |
URL filter interfaces and implementations that allow control over Simple Spider spidering decisions, and over Sperowider spidering, rectifying, downloading, and indexing decisions.
The best place to start in this package is NoHopRegexSperowiderFilter, which is the most commonly used URL filter. It uses regex to select which URLs should be downloaded, and it indexes all files that have been downloaded.
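
The behavior described above (regex-driven download decisions, with every downloaded file indexed) can be illustrated with a small standalone sketch. This is not the Sperowider API: the class name RegexDownloadFilterSketch, its constructor, and the methods shouldDownload and shouldIndex are hypothetical names chosen for illustration, and the include/exclude semantics are an assumption based on the AIncludeExcludeFilter summary. Consult the NoHopRegexSperowiderFilter Javadoc for the real signatures.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

/**
 * Hypothetical sketch of a regex include/exclude URL filter.
 * It does not implement or extend any Sperowider interface.
 */
public class RegexDownloadFilterSketch {
    private final List<Pattern> includes = new ArrayList<>();
    private final List<Pattern> excludes = new ArrayList<>();

    public RegexDownloadFilterSketch(List<String> includeRegexes, List<String> excludeRegexes) {
        for (String regex : includeRegexes) {
            includes.add(Pattern.compile(regex));
        }
        for (String regex : excludeRegexes) {
            excludes.add(Pattern.compile(regex));
        }
    }

    /** Assumed rule: download if some include pattern matches the whole URL and no exclude pattern does. */
    public boolean shouldDownload(String url) {
        return matchesAny(includes, url) && !matchesAny(excludes, url);
    }

    /** Mirrors "indexes all files that have been downloaded": every downloaded file is accepted. */
    public boolean shouldIndex(String downloadedFilename) {
        return true;
    }

    private static boolean matchesAny(List<Pattern> patterns, String candidate) {
        for (Pattern pattern : patterns) {
            if (pattern.matcher(candidate).matches()) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        RegexDownloadFilterSketch filter = new RegexDownloadFilterSketch(
                Arrays.asList("https?://example\\.com/.*"), // illustrative include rule
                Arrays.asList(".*\\.zip"));                  // illustrative exclude rule
        System.out.println(filter.shouldDownload("http://example.com/docs/index.html"));
        System.out.println(filter.shouldDownload("http://example.com/archive.zip"));
    }
}
```

In this sketch the exclude list takes precedence over the include list, which is a common convention for include/exclude filters; whether Sperowider resolves conflicts the same way is not stated in this package description.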