Package org.htmlparser.scanners
Class ScriptScanner
java.lang.Object
org.htmlparser.scanners.TagScanner
org.htmlparser.scanners.CompositeTagScanner
org.htmlparser.scanners.ScriptScanner
- All Implemented Interfaces:
Serializable, Scanner
The ScriptScanner handles script CDATA.
Method Summary
Methods inherited from class org.htmlparser.scanners.CompositeTagScanner
addChild, createVirtualEndTag, finishTag, isTagToBeEndedFor
-
Field Details
-
STRICT
public static boolean STRICT
Strict parsing of CDATA flag. If this flag is set to true, script is parsed without regard to quotes. This means that erroneous script such as:
document.write("</script>");
will be parsed in strict accordance with appendix B.3.2 "Specifying non-HTML data" of the HTML 4.01 Specification and hence will be split into two or more nodes. Correct JavaScript would escape the ETAGO:
document.write("<\/script>");
If true, CDATA parsing stops at the first ETAGO ("</") whether or not it is quoted. If false, balanced quotes (either single or double) shield an ETAGO. Because quotes can also appear within single-line or multiline comments, comments are parsed as well. In most cases, users prefer non-strict handling, since so much broken script exists in the wild.
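The difference between the two modes can be illustrated with a small stand-alone sketch. It is not the library's implementation: the class CdataEndSketch and the method findEnd are hypothetical names, and skipping comments entirely in non-strict mode is a simplifying assumption made here so that quotes inside comments are ignored.

```java
// Hypothetical sketch, not part of org.htmlparser.
final class CdataEndSketch {

    /**
     * Returns the index of the ETAGO ("</") that would end the CDATA,
     * or -1 if none is found.
     */
    static int findEnd(String text, boolean strict) {
        char quote = 0; // 0 means "not inside a quoted string"
        int n = text.length();
        int i = 0;
        while (i < n) {
            char c = text.charAt(i);
            if (quote != 0) {
                // Inside a string (only reachable in non-strict mode).
                if (c == '\\') { i += 2; continue; } // skip the escaped character
                if (c == quote) quote = 0;           // balanced closing quote
                i++;
                continue;
            }
            if (!strict) {
                if (c == '"' || c == '\'') { quote = c; i++; continue; }
                // Assumption: comments are skipped whole, so quotes in them are ignored.
                if (text.startsWith("//", i)) {
                    int newline = text.indexOf('\n', i);
                    if (newline < 0) return -1;
                    i = newline + 1;
                    continue;
                }
                if (text.startsWith("/*", i)) {
                    int close = text.indexOf("*/", i + 2);
                    if (close < 0) return -1;
                    i = close + 2;
                    continue;
                }
            }
            if (c == '<' && i + 1 < n && text.charAt(i + 1) == '/') return i;
            i++;
        }
        return -1;
    }

    public static void main(String[] args) {
        String script = "document.write(\"</script>\");</script>";
        // Strict mode stops at the quoted ETAGO; non-strict mode skips it.
        System.out.println("strict:     " + findEnd(script, true));
        System.out.println("non-strict: " + findEnd(script, false));
    }
}
```

With the erroneous script shown above, the strict call stops at the ETAGO inside the string literal, while the non-strict call only stops at the ETAGO that follows the statement.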
-
-
Constructor Details
-
ScriptScanner
public ScriptScanner()
Create a script scanner.
-
-
Method Details
-
scan
Scan for script. Accumulates text from the page until </[a-zA-Z] is encountered.
- Specified by:
scan in interface Scanner
- Overrides:
scan in class CompositeTagScanner
- Parameters:
tag - The tag this scanner is responsible for.
lexer - The source of CDATA.
stack - The parse stack, not used.
- Returns:
The resultant tag (may be unchanged).
- Throws:
ParserException - if an unrecoverable problem occurs.
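The stopping condition </[a-zA-Z] can be tried out in isolation with java.util.regex. The sketch below is only an illustration of that pattern, not the scanner's code; the class EndTagOpenSketch and its methods are hypothetical names, and quote and comment handling are deliberately left out.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch, not part of org.htmlparser.
final class EndTagOpenSketch {

    // "</" followed immediately by an ASCII letter, as in the description of scan.
    private static final Pattern ETAGO_NAME = Pattern.compile("</[a-zA-Z]");

    /** Returns the index of the first "</" followed by a letter, or -1 if there is none. */
    static int firstEndTagOpen(String text) {
        Matcher m = ETAGO_NAME.matcher(text);
        return m.find() ? m.start() : -1;
    }

    /** Returns the text accumulated before the terminating sequence (all of it if none is found). */
    static String accumulate(String text) {
        int end = firstEndTagOpen(text);
        return end < 0 ? text : text.substring(0, end);
    }

    public static void main(String[] args) {
        // "</ " is not followed by a letter, so accumulation continues past it.
        System.out.println(accumulate("if (a </ b) go();</script>"));
    }
}
```

A "</" that is not followed by a letter does not end the accumulation, so only the final </script> terminates the text in this example.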
-