.::HTML(3)     User Contributed Perl Documentation     .::HTML(3)
NAME
       XML::Driver::HTML - SAX Driver for non-well-formed HTML.
SYNOPSIS
         use XML::Driver::HTML;
         $driver = new XML::Driver::HTML(
               'Handler' => $some_sax_filter_or_handler,
               'Source' => $some_PerlSAX_like_hash
               );
         $driver->parse();
       or
         use XML::Driver::HTML;
         $driver = new XML::Driver::HTML();
         $driver->parse(
               'Handler' => $some_sax_filter_or_handler,
               'Source' => $some_PerlSAX_like_hash
               );
         $driver->parse(
               'Handler' => $some_other_sax_filter_or_handler,
               'Source' => $some_other_source
               );
DESCRIPTION
       XML::Driver::HTML is a SAX driver for HTML. The HTML input
       does not need to be well formed, because XML::Driver::HTML
       generates its SAX events by walking an HTML::TreeBuilder
       object. The simplest use is as a filter from HTML to XHTML,
       with XML::Handler::YAWriter as the SAX handler:
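           use IO::File;
           use XML::Handler::YAWriter;
           use XML::Driver::HTML;

           # Writer that serializes the SAX events to standard output.
           my $ya = new XML::Handler::YAWriter(
               'Output' => new IO::File ( ">-" ),
               'Pretty' => {
                   'NoWhiteSpace'=>1,
                   'NoComments'=>1,
                   'AddHiddenNewline'=>1,
                   'AddHiddenAttrTab'=>1,
                   }
               );
           # Driver that reads HTML from standard input.
           my $html = new XML::Driver::HTML(
               'Handler' => $ya,
               'Source' => { 'ByteStream' => new IO::File ( "<-" ) }
               );
           $html->parse();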
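
       Any object with PerlSAX-style callback methods can take the
       place of the YAWriter. The following is a minimal sketch,
       assuming the driver sends the usual PerlSAX events
       (start_element with a hash holding Name and Attributes,
       characters with a hash holding Data). The package name
       My::TagCounter is hypothetical, and the events are fed in by
       hand so that the sketch runs without the driver installed;
       the driver may call further callbacks not shown here.

```perl
use strict;
use warnings;

# Hypothetical handler: counts start tags and collects character data.
package My::TagCounter;

sub new {
    my ($class) = @_;
    return bless { count => {}, text => '' }, $class;
}

sub start_document { }
sub end_document   { my ($self) = @_; return $self }
sub start_element  { my ($self, $el) = @_; $self->{count}{ $el->{Name} }++ }
sub end_element    { }
sub characters     { my ($self, $chars) = @_; $self->{text} .= $chars->{Data} }

package main;

my $handler = My::TagCounter->new;

# Simulated events, in the order a SAX driver is expected to send them.
$handler->start_document({});
$handler->start_element({ Name => 'p', Attributes => {} });
$handler->characters({ Data => 'Hello' });
$handler->end_element({ Name => 'p' });
$handler->end_document({});

# Report how often each element name was seen.
print "$_: $handler->{count}{$_}\n" for sort keys %{ $handler->{count} };
```

       With the driver installed, such an object would be passed as
       the 'Handler' option in place of $ya.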
       METHODS
       new Creates a new XML::Driver::HTML object. Default
           options for parsing, described below, are passed as
           key-value pairs or as a single hash.  Options may be
           changed directly in the object.
       parse
           Parses a document.  Options, described below, are
           passed as key-value pairs or as a single hash.
           Options passed to parse() override the default options
           in the parser object for the duration of the parse.
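
       The override rule for parse() can be pictured as a hash
       merge. This is only a sketch of the documented behaviour,
       not the module's actual code; effective_options is a
       hypothetical helper, and "a single hash" is assumed to mean
       a hash reference.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of XML::Driver::HTML: returns the options
# in effect for one parse() call without modifying the stored defaults.
sub effective_options {
    my ($defaults, @args) = @_;
    # Accept either a single hash reference or a flat key-value list.
    my %call = (@args == 1 && ref $args[0] eq 'HASH') ? %{ $args[0] } : @args;
    # Later keys win, so per-call options override the defaults.
    return { %$defaults, %call };
}

my %defaults = (
    Handler => 'default_handler',
    Source  => { String => '<p>a' },
);
my $opts = effective_options(\%defaults, Handler => 'other_handler');

print "$opts->{Handler}\n";     # value used for this call
print "$defaults{Handler}\n";   # stored default, unchanged
```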
       OPTIONS
       The following options are supported by XML::Driver::HTML:
       Handler
           Default SAX Handler to receive events
       Source
           Hash containing the input source for parsing.  The
           `Source' hash may contain the following parameters:
           ByteStream
               The raw byte stream (file handle) containing the
               document.
           String
               A string containing the document.
           SystemId
               The system identifier (URI) of the document.
           Encoding
               A string describing the character encoding.
           If more than one of `ByteStream', `String', or
           `SystemId' is given, preference goes first to
           `ByteStream', then to `String', then to `SystemId'.
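
       The preference order among the `Source' keys can be
       sketched as follows. pick_source is a hypothetical helper
       that only illustrates the documented order; it is not a
       function of the module.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the name of the Source key that would be
# used, following the documented order ByteStream, String, SystemId.
sub pick_source {
    my ($source) = @_;
    for my $key (qw(ByteStream String SystemId)) {
        return $key if defined $source->{$key};
    }
    return;    # no usable input given
}

# Both String and SystemId are present, so String is preferred.
my $chosen = pick_source({ String => '<p>Hi', SystemId => 'index.html' });
print "$chosen\n";
```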
NOTES
       XML::Driver::HTML requires Perl 5.6 to convert from
       ISO-8859-1 to UTF-8.
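
       For orientation, the same kind of conversion can be
       sketched with the Encode module that ships with later Perl
       releases. This is an assumption for illustration only; the
       mechanism XML::Driver::HTML itself uses under Perl 5.6 is
       not described here.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# "cafe" with an e-acute, as four ISO-8859-1 bytes.
my $latin1 = "caf\xE9";

# Decode the Latin-1 bytes to characters, then encode them as UTF-8 bytes.
my $utf8 = encode('UTF-8', decode('ISO-8859-1', $latin1));

# The accented character occupies two bytes in UTF-8.
printf "%d bytes in, %d bytes out\n", length($latin1), length($utf8);
```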
BUGS
       not yet implemented:
           Interpretation of SystemId as a URI
           XHTML document type
       other bugs:
           HTML::Parser and HTML::TreeBuilder bugs concerning DOCTYPE and CSS
AUTHOR
         Michael Koehne, Kraehe@Copyleft.De
         (c) 2000 NotSoFree License
SEE ALSO
       the XML::Parser::PerlSAX manpage and the HTML::TreeBuilder
       manpage
2000-05-09                 perl v5.6.0                          3