SGML::StripParser - strip tags from an SGML instance
    use SGML::StripParser;
    $parser = new SGML::StripParser;
    $parser->parse_data(\*STDIN);
SGML::StripParser strips SGML tags from document instances and translates entity references for special characters and character references to ASCII (or the character set specified by the set_charset method). The parse_data method is used to specify the input filehandle of the SGML document instance. By default, output will go to STDOUT, but the output filehandle can be changed by the set_outhandle method.
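A minimal sketch of sending the stripped text to a file rather than STDOUT, using only the new, parse_data, and set_outhandle methods named in this document. It assumes that set_outhandle accepts a glob reference in the same form that parse_data does in the synopsis; the file names input.sgml and output.txt are hypothetical placeholders.

```perl
use strict;
use warnings;
use SGML::StripParser;

# Hypothetical file names, used only for illustration.
open(IN,  '<', 'input.sgml') or die "Cannot read input.sgml: $!";
open(OUT, '>', 'output.txt') or die "Cannot write output.txt: $!";

my $parser = new SGML::StripParser;

# Assumption: set_outhandle takes a glob reference, as parse_data does.
$parser->set_outhandle(\*OUT);

# Strip tags from IN; the resulting text goes to OUT instead of STDOUT.
$parser->parse_data(\*IN);

close(IN);
close(OUT) or die "Cannot close output.txt: $!";
```

The output handle is set before parse_data is called, on the assumption that output is written while the instance is being parsed.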
The following class methods are defined:
Instantiate a new SGML::StripParser object.
The following methods are defined:
Process the document instance specified by FILEHANDLE.
Set the output filehandle to FILEHANDLE.
If $boolean is a true value, anchor URLs in HTML documents will be included in the output.
Use $charset as the character set while processing. By default, ASCII is assumed, so entity references for special characters and character references are mapped to ASCII text. set_charset allows the entity references and character references to be interpreted under a different character set. Only the ISO-8859 character sets (1-10) are supported.
Set the list of parameter entities in @names to "INCLUDE". This method may be useful for instances that have marked sections with parameter entity references for the status keyword.
Set the list of parameter entities in @names to "IGNORE". This method may be useful for instances that have marked sections with parameter entity references for the status keyword.
SGML::StripParser is derived from the SGML::Parser class. Hence, it has the same parsing capabilities and limitations as the SGML::Parser class.
This software is part of the perlSGML package; see http://www.oac.uci.edu/indiv/ehood/perlSGML.html