This interface defines the core model for data tracking. It allows multiple ways of managing data (such as a memory-based model versus a database-backed model) to be defined without touching the underlying codebase.
An explanation of the various methods and the order in which they're used will be included here later.
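As a rough illustration of the memory-based option mentioned above, the following sketch backs a few of the mapping methods with plain collections. The class name InMemoryModelSketch is hypothetical, and it deliberately does not implement the full interface. Returning null for an unmapped URL, treating the argument of getFileNameForURL as a real URL, and returning true from grabForSpidering only on the first call are assumptions rather than documented behavior.

```java
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical, partial memory-based model: only the mapping and grab methods.
class InMemoryModelSketch {
    // found URL -> real URL, as set by mapFoundURLToRealURL
    private final Map<String, String> realByFound = new HashMap<>();
    // real URL -> file name, as set by mapRealURLToFileName
    private final Map<String, String> fileByReal = new HashMap<>();
    // real URLs that have already been grabbed for download
    private final Set<String> grabbed = new LinkedHashSet<>();

    public void mapFoundURLToRealURL(String foundURL, String realURL) {
        realByFound.put(foundURL, realURL);
    }

    public String getRealURLForFoundURL(String foundURL) {
        return realByFound.get(foundURL); // assumption: null when unmapped
    }

    // The parameter keeps the name used in the interface signature, although
    // the method summary describes it as the "real" URL.
    public void mapRealURLToFileName(String foundURL, String fileName) {
        fileByReal.put(foundURL, fileName);
    }

    public String getFileNameForURL(String url) {
        return fileByReal.get(url); // assumption: keyed by real URL, null when unmapped
    }

    // Assumption: true the first time a URL is grabbed, false on later calls.
    public boolean grabForSpidering(String realURL) {
        return grabbed.add(realURL);
    }

    public int getGrabbedUrlCount() {
        return grabbed.size();
    }

    public static void main(String[] args) {
        InMemoryModelSketch model = new InMemoryModelSketch();
        model.mapFoundURLToRealURL("../index.html", "http://example.org/index.html");
        model.mapRealURLToFileName("http://example.org/index.html", "index.html");

        String real = model.getRealURLForFoundURL("../index.html");
        System.out.println(model.getFileNameForURL(real));
        System.out.println(model.grabForSpidering(real));
        System.out.println(model.grabForSpidering(real));
    }
}
```

A database-backed model could expose the same methods while replacing the maps and the set with queries, which is the kind of substitution the interface is meant to allow.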
Method Summary | |
void |
addFileToRectificationQueue(String fileName)
Adds a filename to the rectification queue |
void |
addFoundURL(String foundIn,
String found,
boolean excludeFromDownloadQueue)
The Downloader calls this when it finds a URL in a downloaded page. |
void |
destroy()
Called by the Sperowider to close all open resources |
String |
getFileForRectifying()
Returns a file to be rectified; this will be done after the downloads are all done |
String |
getFileNameForURL(String url)
Returns the filename for a mapped URL. |
List |
getFoundURLs(String sourceURL)
Returns a List of String objects that are the URLs that the passed-in URL references. |
int |
getGrabbedUrlCount()
The count of URLs that have been grabbed for download. |
int |
getInvalidURLCount()
The count of all bad URLs, both found and real. |
Collection |
getInvalidURLs()
Returns the collection of invalid URLs |
String |
getRealURLForFoundURL(String foundURL)
Returns the mapping data as set by mapFoundURLToRealURL(String, String) |
int |
getRectifiedHTMLFileCount()
The count of all HTML files that have been "rectified", that is, processed to replace all found URLs with relative URLs to the mapped file names. |
List |
getSourceURLs(String foundURL)
Returns a List of String objects that are the URLs in which the passed-in URL is found. |
int |
getUncheckedUrlCount()
A count of URLs that have not yet been checked. |
int |
getUnRectifiedFileCount()
The count of downloaded HTML files that are not yet rectified. |
String |
getUnspideredUrl()
Returns a URL that has yet to be downloaded |
boolean |
grabForSpidering(String realURL)
If this URL has already been downloaded, return false. |
boolean |
isSpiderMapSupported()
Implementing classes should return true if they are capable of handling calls to getSourceURLs(String) and getFoundURLs(String), false otherwise. |
void |
mapFoundURLToRealURL(String foundURL,
String realURL)
Maps a found URL to a "real URL". |
void |
mapRealURLToFileName(String foundURL,
String fileName)
Maps a "real" URL to a file name. |
void |
markInvalidURL(String givenURL,
int responseCode,
String message)
Mark a URL as invalid |
Methods inherited from interface org.erowid.sperowider.IInitializableObject |
init |
Method Detail |
public void addFoundURL(String foundIn, String found, boolean excludeFromDownloadQueue)
public String getUnspideredUrl()
public void mapFoundURLToRealURL(String foundURL, String realURL)
public String getRealURLForFoundURL(String foundURL)
Returns the mapping data as set by mapFoundURLToRealURL(String, String).
public void mapRealURLToFileName(String foundURL, String fileName)
public void addFileToRectificationQueue(String fileName)
public boolean grabForSpidering(String realURL)
public void markInvalidURL(String givenURL, int responseCode, String message)
public String getFileForRectifying()
public String getFileNameForURL(String url)
public List getSourceURLs(String foundURL) throws UnsupportedOperationException
Models that do not support this method may throw UnsupportedOperationException rather than return a valid value. Those models that do throw the exception should return false for isSpiderMapSupported().
Throws:
UnsupportedOperationException - If the model does not support this method
public List getFoundURLs(String sourceURL) throws UnsupportedOperationException
Models that do not support this method may throw UnsupportedOperationException rather than return a valid value. Those models that do throw the exception should return false for isSpiderMapSupported().
Throws:
UnsupportedOperationException - If the model does not support this method
public boolean isSpiderMapSupported()
Implementing classes should return true if they are capable of handling calls to getSourceURLs(String) and getFoundURLs(String), false otherwise.
public Collection getInvalidURLs()
public void destroy()
public int getUncheckedUrlCount()
public int getGrabbedUrlCount()
public int getInvalidURLCount()
public int getUnRectifiedFileCount()
public int getRectifiedHTMLFileCount()
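The relationship between isSpiderMapSupported(), getSourceURLs(String) and getFoundURLs(String) can be sketched from the caller's side as follows. This is only an illustration: SpiderMapQueries, NoSpiderMapModel and SpiderMapCaller are hypothetical names, the interface here is a cut-down stand-in for the real one, and List<String> is used where the documented signatures return a raw List. Falling back to an empty list is one possible choice for the caller, not documented behavior.

```java
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for the capability flag and the two spider-map methods.
interface SpiderMapQueries {
    boolean isSpiderMapSupported();
    List<String> getSourceURLs(String foundURL) throws UnsupportedOperationException;
    List<String> getFoundURLs(String sourceURL) throws UnsupportedOperationException;
}

// A model without spider-map support: it reports false and throws on both queries.
class NoSpiderMapModel implements SpiderMapQueries {
    public boolean isSpiderMapSupported() {
        return false;
    }

    public List<String> getSourceURLs(String foundURL) {
        throw new UnsupportedOperationException("spider map not supported");
    }

    public List<String> getFoundURLs(String sourceURL) {
        throw new UnsupportedOperationException("spider map not supported");
    }
}

class SpiderMapCaller {
    // Checks the capability flag first, so unsupported models are never queried.
    static List<String> sourcesOrEmpty(SpiderMapQueries model, String foundURL) {
        if (!model.isSpiderMapSupported()) {
            return Collections.emptyList();
        }
        return model.getSourceURLs(foundURL);
    }

    public static void main(String[] args) {
        List<String> sources =
                sourcesOrEmpty(new NoSpiderMapModel(), "http://example.org/page.html");
        System.out.println(sources.size());
    }
}
```

Checking the flag before calling keeps the exception for genuinely unexpected cases, while a caller that skips the check would need to catch UnsupportedOperationException itself.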