=head1 NAME

XML::LibXML::Simple - XML::LibXML clone of XML::Simple::XMLin()

=head1 INHERITANCE

 XML::LibXML::Simple
   is an Exporter

=head1 SYNOPSIS

  use XML::LibXML::Simple   qw(XMLin);
  my $xml = XMLin(XML-DATA, OPTIONS);

Or the Object Oriented way:

  use XML::LibXML::Simple   ();
  my $xs  = XML::LibXML::Simple->new(OPTIONS);
  my $ref = $xs->XMLin(XML-DATA, OPTIONS);

=head1 DESCRIPTION

This module is a blunt rewrite of XML::Simple (by Grant McLean) to
use the XML::LibXML parser for XML structures, where the original
uses plain Perl or SAX parsers.

=head1 METHODS

=head2 Constructors

XML::LibXML::Simple-E<gt>B<new>(OPTIONS)

=over 4

Instantiate an object on which C<XMLin()> can be called.  You can
provide OPTIONS to this constructor (to be reused for each call to
XMLin) and with each call of XMLin (to be used once).

For XML-DATA and descriptions of the OPTIONS see the L</DETAILS>
section of this manual page.

=back

=head2 Translators

$obj-E<gt>B<XMLin>(XML-DATA, OPTIONS)

=over 4

For XML-DATA and descriptions of the OPTIONS see the L</DETAILS>
section of this manual page.

=back

=head1 FUNCTIONS

The functions C<XMLin> (exported implicitly) and C<xml_in>
(exported on request) simply call
C<< XML::LibXML::Simple->new->XMLin() >> with the provided parameters.

=head1 DETAILS

=head2 Differences with XML::Simple

In general, the output and the options are equivalent, although this
module has some differences with XML::Simple to be aware of.

=over 4

=item .

Only C<XMLin()> is supported: if you want to write XML then use a
schema (for instance with XML::Compile).  Do not attempt to create XML
by hand!  If you still think you need it, then have a look at XMLout()
as implemented by XML::Simple or any of a zillion template systems.

=item .

IMO, you should use a templating system if you want variables filled
in in the input: it is not a task for this module.

=item .

Also, empty elements are not removed: being empty has a meaning which
should not be ignored.

=item .

There are a few small differences in the result of the C<ForceArray>
option, because XML::Simple seems to behave inconsistently.

=back

=head2 Parameter XML-DATA

As first parameter to C<XMLin()>, you must provide the XML message to
be translated into a Perl structure.  Choose one of the following:

=over 4

=item A filename

If the filename contains no directory components, C<XMLin()> will look
for the file in each directory in the SearchPath (see OPTIONS below)
and in the current directory.  eg:

  $ref = XMLin('/etc/params.xml');

Note, the filename C<< - >> (dash) can be used to parse from STDIN.

=item undef

If there is no XML specifier, C<XMLin()> will check the script
directory and each of the SearchPath directories for a file with the
same name as the script but with the extension '.xml'.  Note: if you
wish to specify options, you must specify the value 'undef'.  eg:

  $ref = XMLin(undef, ForceArray => 1);

=item A string of XML

A string containing XML (recognised by the presence of '<' and '>'
characters) will be parsed directly.  eg:

  $ref = XMLin('<opt username="bob" password="flurp" />');

=item An IO::Handle object

An IO::Handle object will be read to EOF and its contents parsed.  eg:

  $fh  = IO::File->new('/etc/params.xml');
  $ref = XMLin($fh);

=back

=head2 OPTIONS

C<XMLin()> supports most options defined by XML::Simple, so the
interface is quite compatible.  Minor changes apply.  This explanation
is extracted from the XML::Simple manual-page.

=over 4

=item *

check out C<ForceArray> because you'll almost certainly want to turn
it on

=item *

make sure you know what the C<KeyAttr> option does and what its
default value is because it may surprise you otherwise.

=item *

Option names are case-insensitive so you can use the mixed case
versions shown here; you can add underscores between the words
(eg: key_attr) if you like.

=back

In alphabetic order:

=over 4

=item ContentKey => 'keyname' I<# seldom used>

When text content is parsed to a hash value, this option lets you
specify a name for the hash key to override the default 'content'.
So for example:

  XMLin('<opt one="1">Text</opt>', ContentKey => 'text')

will parse to:

  { 'one' => 1, 'text' => 'Text' }

instead of:

  { 'one' => 1, 'content' => 'Text' }

You can also prefix your selected key name with a '-' character to
have C<XMLin()> try a little harder to eliminate unnecessary 'content'
keys after array folding.  For example:

  XMLin(
    '<opt><item name="one">First</item><item name="two">Second</item></opt>',
    KeyAttr    => {item => 'name'},
    ForceArray => [ 'item' ],
    ContentKey => '-content'
  )

will parse to:

  {
    'item' => {
      'one' => 'First',
      'two' => 'Second'
    }
  }

rather than this (without the '-'):

  {
    'item' => {
      'one' => { 'content' => 'First' },
      'two' => { 'content' => 'Second' }
    }
  }

=item ForceArray => 1 I<# important>

This option should be set to '1' to force nested elements to be
represented as arrays even when there is only one.  Eg, with
ForceArray enabled, this XML:

  <opt>
    <name>value</name>
  </opt>

would parse to this:

  {
    'name' => [
                'value'
              ]
  }

instead of this (the default):

  {
    'name' => 'value'
  }

This option is especially useful if the data structure is likely to
be written back out as XML and the default behaviour of rolling
single nested elements up into attributes is not desirable.

If you are using the array folding feature, you should almost
certainly enable this option.  If you do not, single nested elements
will not be parsed to arrays and therefore will not be candidates for
folding to a hash.  (Given that the default value of 'KeyAttr'
enables array folding, this option should probably have been enabled
by default as well.)

=item ForceArray => [ names ] I<# important>

This alternative (and preferred) form of the 'ForceArray' option
allows you to specify a list of element names which should always be
forced into an array representation, rather than the 'all or nothing'
approach above.

It is also possible to include compiled regular expressions in the
list: any element name which matches the pattern will be forced to an
array.  If the list contains only a single regex, then it is not
necessary to enclose it in an arrayref.
Eg:

  ForceArray => qr/_list$/

=item ForceContent => 1 I<# seldom used>

When C<XMLin()> parses elements which have text content as well as
attributes, the text content must be represented as a hash value
rather than a simple scalar.  This option allows you to force text
content to always parse to a hash value even when there are no
attributes.  So for example:

  XMLin('<opt><x>text1</x><y a="2">text2</y></opt>', ForceContent => 1)

will parse to:

  {
    'x' => {           'content' => 'text1' },
    'y' => { 'a' => 2, 'content' => 'text2' }
  }

instead of:

  {
    'x' => 'text1',
    'y' => { 'a' => 2, 'content' => 'text2' }
  }

=item GroupTags => { grouping tag => grouped tag } I<# handy>

You can use this option to eliminate extra levels of indirection in
your Perl data structure.  For example this XML:

  <opt>
   <searchpath>
     <dir>/usr/bin</dir>
     <dir>/usr/local/bin</dir>
     <dir>/usr/X11/bin</dir>
   </searchpath>
  </opt>

would normally be read into a structure like this:

  {
    searchpath => {
       dir => [ '/usr/bin', '/usr/local/bin', '/usr/X11/bin' ]
    }
  }

But when read in with the appropriate value for 'GroupTags':

  my $opt = XMLin($xml, GroupTags => { searchpath => 'dir' });

it will return this simpler structure:

  {
    searchpath => [ '/usr/bin', '/usr/local/bin', '/usr/X11/bin' ]
  }

The grouping element (C<< <searchpath> >> in the example) must not
contain any attributes or elements other than the grouped element.

You can specify multiple 'grouping element' to 'grouped element'
mappings in the same hashref.  If this option is combined with
C<KeyAttr>, the array folding will occur first and then the grouped
element names will be eliminated.

=item KeepRoot => 1 I<# handy>

In its attempt to return a data structure free of superfluous detail
and unnecessary levels of indirection, C<XMLin()> normally discards
the root element name.  Setting the 'KeepRoot' option to '1' will
cause the root element name to be retained.  So after executing this
code:

  $config = XMLin('<config tempdir="/tmp" />', KeepRoot => 1)

you'll be able to reference the tempdir as
C<$config-E<gt>{config}-E<gt>{tempdir}> instead of the default
C<$config-E<gt>{tempdir}>.

=item KeyAttr => [ list ] I<# important>

This option controls the 'array folding' feature which translates
nested elements from an array to a hash.  It also controls the
'unfolding' of hashes to arrays.

For example, this XML:

  <users>
    <user login="grep" fullname="Gary R Epstein" />
    <user login="stty" fullname="Simon T Tyson" />
  </users>

would, by default, parse to this:

  {
    'user' => [
       {
         'login'    => 'grep',
         'fullname' => 'Gary R Epstein'
       },
       {
         'login'    => 'stty',
         'fullname' => 'Simon T Tyson'
       }
    ]
  }

If the option 'KeyAttr => "login"' were used to specify that the
'login' attribute is a key, the same XML would parse to:

  {
    'user' => {
       'stty' => {
         'fullname' => 'Simon T Tyson'
       },
       'grep' => {
         'fullname' => 'Gary R Epstein'
       }
    }
  }

The key attribute names should be supplied in an arrayref if there is
more than one.  C<XMLin()> will attempt to match attribute names in
the order supplied.

Note 1: The default value for 'KeyAttr' is
C<< ['name', 'key', 'id'] >>.  If you do not want folding on input or
unfolding on output you must set this option to an empty list to
disable the feature.

Note 2: If you wish to use this option, you should also enable the
C<ForceArray> option.  Without 'ForceArray', a single nested element
will be rolled up into a scalar rather than an array and therefore
will not be folded (since only arrays get folded).

=item KeyAttr => { list } I<# important>

This alternative (and preferred) method of specifying the key
attributes allows more fine grained control over which elements are
folded and on which attributes.  For example the option
'KeyAttr => { package => "id" }' will cause any package elements to
be folded on the 'id' attribute.  No other elements which have an
'id' attribute will be folded at all.

Two further variations are made possible by prefixing a '+' or a '-'
character to the attribute name:

The option 'KeyAttr => { user => "+login" }' will cause this XML:

  <opt>
    <user login="grep" fullname="Gary R Epstein" />
    <user login="stty" fullname="Simon T Tyson" />
  </opt>

to parse to this data structure:

  {
    'user' => {
       'stty' => {
         'fullname' => 'Simon T Tyson',
         'login'    => 'stty'
       },
       'grep' => {
         'fullname' => 'Gary R Epstein',
         'login'    => 'grep'
       }
    }
  }

The '+' indicates that the value of the key attribute should be
copied rather than moved to the folded hash key.

A '-' prefix would produce this result:

  {
    'user' => {
       'stty' => {
         'fullname' => 'Simon T Tyson',
         '-login'   => 'stty'
       },
       'grep' => {
         'fullname' => 'Gary R Epstein',
         '-login'   => 'grep'
       }
    }
  }

=item NoAttr => 1 I<# handy>

When used with C<XMLin()>, any attributes in the XML will be ignored.

=item NormaliseSpace => 0 | 1 | 2 I<# handy>

This option controls how whitespace in text content is handled.
Recognised values for the option are:

=over 8

=item 0 (default)

whitespace is passed through unaltered (except of course for the
normalisation of whitespace in attribute values which is mandated by
the XML recommendation)

=item 1

whitespace is normalised in any value used as a hash key (normalising
means removing leading and trailing whitespace and collapsing
sequences of whitespace characters to a single space)

=item 2

whitespace is normalised in all text content

=back

Note: you can spell this option with a 'z' if that is more natural
for you.

=item SearchPath => [ list ] I<# handy>

If you pass C<XMLin()> a filename, but the filename includes no
directory component, you can use this option to specify which
directories should be searched to locate the file.  You might use
this option to search first in the user's home directory, then in a
global directory such as /etc.

If a filename is provided to C<XMLin()> but SearchPath is not
defined, the file is assumed to be in the current directory.

If the first parameter to C<XMLin()> is undefined, the default
SearchPath will contain only the directory in which the script itself
is located.
Otherwise the default SearchPath will be empty.

=item ValueAttr => [ names ] I<# in - handy>

Use this option to deal with elements which always have a single
attribute and no content.  Eg:

  <opt>
    <colour value="red" />
    <size   value="XXL" />
  </opt>

Setting C<< ValueAttr => [ 'value' ] >> will cause the above XML to
parse to:

  {
    colour => 'red',
    size   => 'XXL'
  }

instead of this (the default):

  {
    colour => { value => 'red' },
    size   => { value => 'XXL' }
  }

=back

=head1 COPYRIGHT

The interface design and large parts of the documentation were taken
from the L<XML::Simple> module, written by
Grant McLean E<lt>grantm@cpan.orgE<gt>

This version was composed by Mark Overmeer.