org.erowid.sperowider.simple
Class SimplePageSpider

java.lang.Object
  extended by org.erowid.sperowider.ASpiderBase
      extended by org.erowid.sperowider.simple.SimplePageSpider

public class SimplePageSpider
extends ASpiderBase

Takes the location of an HTML page, indexes the page, and collects the list of URLs it links to.

Version:
$Id: SimplePageSpider.java,v 1.4 2005/01/16 05:03:01 gurustu Exp $
Author:
sstatman@real.com

Field Summary
 
Fields inherited from class org.erowid.sperowider.ASpiderBase
ALREADY_GRABBED, BAD_HTTP_RESPONSE, EXCEPTION, FILTER_FAILURE, SPEROWIDER_USER_AGENT, SPEROWIDER_USER_AGENT_NAME, SPEROWIDER_USER_AGENT_VERSION, SUCCESS
 
Constructor Summary
SimplePageSpider(ISimpleSpiderModel model, ISimpleSpiderFilter urlFilter, IndexWriter writer)
          Constructs a page handler with a given data store.
 
Method Summary
 int handleConnection(String sourceUrl, HttpURLConnection connection)
          Loads the HTML page, parses it for links, and indexes it.
 void handleConnectionException(String sourceUrl, Throwable e)
          Logs the error.
 
Methods inherited from class org.erowid.sperowider.ASpiderBase
getDownloadStatisticCount, getHttpResponseCodeCount, getTotalDownloadAttempts, getTotalHttpAttempts, spider
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

SimplePageSpider

public SimplePageSpider(ISimpleSpiderModel model,
                        ISimpleSpiderFilter urlFilter,
                        IndexWriter writer)
Constructs a page handler with a given data store.

Method Detail

handleConnectionException

public void handleConnectionException(String sourceUrl,
                                      Throwable e)
Logs the error.

Specified by:
handleConnectionException in class ASpiderBase

handleConnection

public int handleConnection(String sourceUrl,
                            HttpURLConnection connection)
Loads the HTML page, parses it for links, and indexes it.

Specified by:
handleConnection in class ASpiderBase
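
The description of handleConnection is terse, so as a rough illustration of the "parses for links" step alone, here is a minimal sketch. It is not Sperowider's implementation: the class LinkExtractionSketch and the method extractLinks are hypothetical names, and the regex-based matching is an assumption chosen for brevity (a real spider would more likely use an HTML parser). Loading the page from the HttpURLConnection and indexing it are left out.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper for illustration only; not part of Sperowider. */
public class LinkExtractionSketch {

    // Matches a quoted href attribute inside an <a ...> tag, case-insensitively.
    // Assumption: attributes are quoted; unquoted hrefs are ignored by this sketch.
    private static final Pattern HREF = Pattern.compile(
            "<a\\b[^>]*?\\shref\\s*=\\s*([\"'])(.*?)\\1",
            Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

    /** Returns absolute URLs for the anchors found in html, resolved against sourceUrl. */
    public static List<String> extractLinks(String sourceUrl, String html) {
        URI base = URI.create(sourceUrl);
        List<String> links = new ArrayList<>();
        Matcher m = HREF.matcher(html);
        while (m.find()) {
            String href = m.group(2).trim();
            if (href.isEmpty() || href.startsWith("#")) {
                continue; // skip empty targets and same-page fragments
            }
            try {
                links.add(base.resolve(href).toString());
            } catch (IllegalArgumentException ex) {
                // Malformed href: skip it rather than abort the whole page.
            }
        }
        return links;
    }

    public static void main(String[] args) {
        String html = "<a href=\"/a.html\">A</a> <A HREF='b.html'>B</A>";
        for (String link : extractLinks("http://example.org/dir/index.html", html)) {
            System.out.println(link);
        }
        // prints:
        // http://example.org/a.html
        // http://example.org/dir/b.html
    }
}
```

Resolving each href against the source URL means relative links come out as absolute URLs, which is the form a URL filter or a download queue would typically need.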

Sperowider is © 2005 Erowid.org