org.erowid.sperowider
Class TextHtmlHandler

java.lang.Object
  extended by org.erowid.sperowider.AHandler
      extended by org.erowid.sperowider.TextHtmlHandler

public class TextHtmlHandler
extends AHandler

This class downloads and spiders HTML files.

Version:
: $Header: /cvsroot/sperowider/SPEROWIDER_MODULE/javasource/org/erowid/sperowider/TextHtmlHandler.java,v 1.29 2005/05/21 08:51:34 gurustu Exp $
Author:
: $Author: gurustu $

Constructor Summary
TextHtmlHandler()
           
 
Method Summary
 void download(HttpURLConnection connection, String fileName, String originalURL)
          Downloads files and adds found URLs to the rectification queue.
 String[] getReplaceableFilenameSuffixes()
          Returns ".shtml", ".php", ".asp", ".jsp", and ".do".
 String getRequiredFilenameSuffix()
          All downloaded HTML files should end with ".html".
 void rectify(String filename)
          Replaces URLs found in text/html files with local file references.
 
Methods inherited from class org.erowid.sperowider.AHandler
addURLToModel, getFileOutputStream, getRequiredFilenamePrefix, getSperowiderContext, setSperowiderContext, stampFile, urlFoundInRectify, urlFoundInSpider
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

TextHtmlHandler

public TextHtmlHandler()
Method Detail

rectify

public void rectify(String filename)
             throws IOException
Replaces URLs found in text/html files with local file references. This class relies on HTMLShredder and URLMongler to find URLs.

Specified by:
rectify in class AHandler
Throws:
IOException
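
The replacement step described above can be illustrated with a minimal sketch. It is not the Sperowider implementation: the APIs of HTMLShredder and URLMongler are not shown on this page, so the sketch swaps in a simple regular expression, and the class name RectifySketch, the method signature, and the URL-to-local-file map are all hypothetical. It handles only double-quoted href and src attributes.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration only; not part of org.erowid.sperowider.
public class RectifySketch {
    // Simplification: matches only double-quoted href/src attribute values.
    private static final Pattern LINK =
        Pattern.compile("(href|src)=\"([^\"]*)\"", Pattern.CASE_INSENSITIVE);

    // Replaces each URL that has a known local copy with its local file name.
    // URLs missing from the map are left unchanged.
    public static String rectify(String html, Map<String, String> urlToLocalFile) {
        Matcher m = LINK.matcher(html);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String url = m.group(2);
            String local = urlToLocalFile.get(url);
            String target = (local != null) ? local : url;
            String attr = m.group(1) + "=\"" + target + "\"";
            // quoteReplacement guards against '$' and '\' in the target.
            m.appendReplacement(out, Matcher.quoteReplacement(attr));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        Map<String, String> known = new HashMap<>();
        known.put("http://example.org/page.php", "page.html");
        System.out.println(
            rectify("<a href=\"http://example.org/page.php\">x</a>", known));
    }
}
```

A real HTML parser would be more robust than a regular expression here; the sketch only shows the shape of the URL-to-local-reference substitution.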

download

public void download(HttpURLConnection connection,
                     String fileName,
                     String originalURL)
              throws IOException
Downloads files and adds found URLs to the rectification queue. This class relies on HTMLShredder and URLMongler to find URLs.

Specified by:
download in class AHandler
Throws:
IOException
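
As a rough sketch of the spidering half of this method, the example below collects link targets from an HTML string and resolves them against the page's original URL. Everything in it is an assumption made for illustration: the class name SpiderSketch, the regular expression (standing in for HTMLShredder and URLMongler, whose APIs are not documented here), and the returned list, which stands in for the rectification queue.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration only; not part of org.erowid.sperowider.
public class SpiderSketch {
    // Simplification: matches only double-quoted href attribute values.
    private static final Pattern HREF =
        Pattern.compile("href=\"([^\"]*)\"", Pattern.CASE_INSENSITIVE);

    // Returns the absolute form of every href found in the page.
    // URI.resolve throws IllegalArgumentException for malformed links;
    // a real spider would need to catch and skip those.
    public static List<String> findUrls(String html, String originalURL) {
        URI base = URI.create(originalURL);
        List<String> found = new ArrayList<>();
        Matcher m = HREF.matcher(html);
        while (m.find()) {
            found.add(base.resolve(m.group(1)).toString());
        }
        return found;
    }
}
```

Resolving against originalURL matters because relative links such as page.php only identify a document once combined with the URL of the page they appear on.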

getRequiredFilenameSuffix

public String getRequiredFilenameSuffix()
All downloaded HTML files should end with ".html".

Overrides:
getRequiredFilenameSuffix in class AHandler

getReplaceableFilenameSuffixes

public String[] getReplaceableFilenameSuffixes()
Returns ".shtml", ".php", ".asp", ".jsp", and ".do".

Overrides:
getReplaceableFilenameSuffixes in class AHandler
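
One plausible way the required suffix and the replaceable suffixes could work together is sketched below. How AHandler actually combines them is not shown on this page, so the class SuffixSketch, the normalize method, and the rule of appending ".html" when no replaceable suffix matches are assumptions.

```java
// Hypothetical illustration only; not part of org.erowid.sperowider.
public class SuffixSketch {
    // Values taken from the method descriptions above.
    static final String REQUIRED = ".html";
    static final String[] REPLACEABLE = {".shtml", ".php", ".asp", ".jsp", ".do"};

    // Returns a file name that ends with the required suffix.
    public static String normalize(String fileName) {
        if (fileName.endsWith(REQUIRED)) {
            return fileName; // already in the required form
        }
        for (String suffix : REPLACEABLE) {
            if (fileName.endsWith(suffix)) {
                // Swap the replaceable suffix for the required one.
                String stem = fileName.substring(0, fileName.length() - suffix.length());
                return stem + REQUIRED;
            }
        }
        // Assumed fallback: append the required suffix.
        return fileName + REQUIRED;
    }

    public static void main(String[] args) {
        System.out.println(normalize("index.php"));  // prints index.html
        System.out.println(normalize("faq.shtml"));  // prints faq.html
        System.out.println(normalize("about"));      // prints about.html
    }
}
```

Normalizing names this way would let a locally saved copy of a dynamically generated page open as static HTML in a browser.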

Sperowider is © 2005 Erowid.org