Class Webcrawler.Crawler.HTMLConstants
java.lang.Object
|
+----Webcrawler.Crawler.HTMLConstants
- public class HTMLConstants
- extends Object
Defines useful constants for parsing HTML files.
-
charentity
- HTML represents certain characters by a named code (a character entity) for that character.
-
linkTags
- Tags such as A HREF and FRAME SRC
-
loadableTags
- Tags such as BODY BACKGROUND and IMG SRC
-
titleElement
- Miscellaneous Elements and tags
-
whiteSpaces
- Contains all characters which have to be ignored by the HTMLParser
-
HTMLConstants()
-
-
getAttributeForLinkElement(String)
-
-
getAttributeForLoadableElement(String)
-
-
getCharEntity(char)
-
loadableTags
public static HTMLConstants.ConstantTag loadableTags[]
- Tags such as BODY BACKGROUND and IMG SRC
linkTags
public static HTMLConstants.ConstantTag linkTags[]
- Tags such as A HREF and FRAME SRC
titleElement
public static final String titleElement
- Miscellaneous Elements and tags
charentity
protected static String charentity[]
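- HTML represents certain characters by a named code (a character entity) for that character.
For example, the German Ä is represented as &Auml; and has the Unicode code point 196.
Given Ä: converting it to int yields 196, so look in charentity[196 - 160], i.e. charentity[36], which holds "Auml".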
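The offset lookup described above can be sketched as follows. This is a minimal illustration, not the actual HTMLConstants source: the class name CharEntitySketch, the method name lookup, the table size of 96 entries (code points 160 to 255), and the choice to return null for characters outside the table are all assumptions. Only the offset of 160 and the entry "Auml" at index 36 come from the description above.

```java
// Hypothetical sketch of an offset-based entity table; not the real HTMLConstants code.
class CharEntitySketch {
    // Code point of the first character covered by the table (from the description above).
    static final int OFFSET = 160;

    // Assumed table: index i holds the entity name for code point OFFSET + i.
    // Only a few entries are filled in here for illustration.
    static final String[] TABLE = new String[96];
    static {
        TABLE[160 - OFFSET] = "nbsp";
        TABLE[196 - OFFSET] = "Auml"; // index 36
        TABLE[228 - OFFSET] = "auml"; // index 68
    }

    // Returns the entity name for c, or null if c is outside the table
    // or has no entry (returning null is an assumption of this sketch).
    static String lookup(char c) {
        int index = c - OFFSET;
        if (index < 0 || index >= TABLE.length) {
            return null;
        }
        return TABLE[index];
    }
}
```

With this sketch, CharEntitySketch.lookup('\u00C4') resolves Ä through index 196 - 160 = 36, while a plain ASCII character such as 'A' falls below the offset and yields null.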
whiteSpaces
public static String whiteSpaces
- Contains all characters which have to be ignored by the HTMLParser
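A parser might consult such a string as sketched below. This is only an illustration under stated assumptions: the actual contents of whiteSpaces are not documented on this page, so the value " \t\r\n" is a guess, and the class WhiteSpaceSketch and the method skipWhiteSpace are hypothetical names that do not appear in the package.

```java
// Hypothetical sketch of how a parser could skip ignorable characters.
class WhiteSpaceSketch {
    // Assumed contents; the real value of HTMLConstants.whiteSpaces is not documented here.
    static final String WHITE_SPACES = " \t\r\n";

    // Returns the index of the first character at or after 'from'
    // that does not occur in WHITE_SPACES (or text.length() if none).
    static int skipWhiteSpace(String text, int from) {
        int i = from;
        while (i < text.length() && WHITE_SPACES.indexOf(text.charAt(i)) >= 0) {
            i++;
        }
        return i;
    }
}
```

Keeping the ignorable characters in one string means the set can be changed in a single place, at the cost of a linear scan per character, which is negligible for a handful of characters.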
HTMLConstants
public HTMLConstants()
getAttributeForLoadableElement
public String getAttributeForLoadableElement(String element)
- Returns:
- the attribute (e.g. BACKGROUND) for the given element (e.g. BODY), or null if no element matches.
getAttributeForLinkElement
public String getAttributeForLinkElement(String element)
- Returns:
- the attribute (e.g. SRC) for the given element (e.g. FRAME), or null if no element matches.
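The two lookup methods, getAttributeForLoadableElement and getAttributeForLinkElement, both map an element name to an attribute name and return null when nothing matches. A minimal sketch of that behaviour follows. It is not the actual implementation: the class TagLookupSketch, the nested Tag type (a stand-in for HTMLConstants.ConstantTag, whose fields are not documented here), the helper attributeFor, and the case-insensitive comparison are all assumptions. The element and attribute pairs are the ones named in the field descriptions.

```java
// Hypothetical sketch of an element-to-attribute lookup; not the real HTMLConstants code.
class TagLookupSketch {
    // Stand-in for HTMLConstants.ConstantTag: an element name paired with an attribute name.
    static final class Tag {
        final String element;
        final String attribute;

        Tag(String element, String attribute) {
            this.element = element;
            this.attribute = attribute;
        }
    }

    // Pairs taken from the field descriptions on this page.
    static final Tag[] LINK_TAGS = {
        new Tag("A", "HREF"),
        new Tag("FRAME", "SRC")
    };
    static final Tag[] LOADABLE_TAGS = {
        new Tag("BODY", "BACKGROUND"),
        new Tag("IMG", "SRC")
    };

    // Returns the attribute for the given element, or null if no element matches.
    // Ignoring case is an assumption of this sketch.
    static String attributeFor(Tag[] tags, String element) {
        if (element == null) {
            return null;
        }
        for (Tag tag : tags) {
            if (tag.element.equalsIgnoreCase(element)) {
                return tag.attribute;
            }
        }
        return null;
    }
}
```

Under these assumptions, attributeFor(LINK_TAGS, "FRAME") yields "SRC", attributeFor(LOADABLE_TAGS, "BODY") yields "BACKGROUND", and an element such as "TABLE" yields null in either table.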
getCharEntity
public String getCharEntity(char c)
- Returns:
- the HTML entity name for the given character (e.g. auml for ä)