DTDReader(3)          User Contributed Perl Documentation         DTDReader(3)

NAME
    XML::Simple::DTDReader - Simple XML file reading based on their DTDs

SYNOPSIS
        use XML::Simple::DTDReader;

        my $ref = XMLin("data.xml");

    Or the object-oriented way:

        require XML::Simple::DTDReader;

        my $xsd = XML::Simple::DTDReader->new;
        my $ref = $xsd->XMLin("data.xml");

DESCRIPTION
    XML::Simple::DTDReader aims to be an XML::Simple drop-in replacement,
    but with several aspects of the module's behavior controlled by the
    XML's DTD. Specifically, array folding and array forcing are inferred
    from the DTD.

    Currently, only "XMLin" is supported; support for "XMLout" is planned
    for later releases.

  XMLin()
    Parses XML-formatted data and returns a reference to a data structure
    which contains the same information in a more readily accessible form.
    (Skip down to "EXAMPLES" for sample code.) The XML must have a valid
    <!DOCTYPE> declaration.

    "XMLin()" accepts an optional XML specifier, which can be one of the
    following:

    A filename
        If the filename contains no directory components, "XMLin()" will
        look for the file in the current directory. The filename '-' can
        be used to parse from STDIN. For example:

            $ref = XMLin('/etc/params.xml');

    undef
        If there is no XML specifier, "XMLin()" will check the script
        directory for a file with the same name as the script but with the
        extension '.xml'. For example:

            $ref = XMLin();

    A string of XML
        A string containing XML (recognized by the presence of '<' and '>'
        characters) will be parsed directly. For example:

            $ref = XMLin('<opt username="bob" password="flurp" />');

    An IO::Handle object
        An IO::Handle object will be read to EOF and its contents parsed.
        For example:

            use IO::File;

            $fh  = IO::File->new('/etc/params.xml');
            $ref = XMLin($fh);

    Currently, none of XML::Simple's many options are supported.
    Support for "ContentKey", "ForceContent", "KeepRoot", "SearchPath", and
    "ValueAttr" is planned for future releases.

    XML::Simple::DTDReader is able to deal with inline and external DTDs.
    Inline DTDs take the form:

        <?xml version="1.0" encoding="UTF-8" ?>
        <!DOCTYPE greeting [
          <!ELEMENT greeting (#PCDATA)>
        ]>
        <greeting>Hello, world!</greeting>

    External DTDs are either "system" DTDs or "public" DTDs. System DTDs
    are of the form:

        <?xml version="1.0"?>
        <!DOCTYPE greeting SYSTEM "hello.dtd">
        <greeting>Hello, world!</greeting>

    The path in the external system identifier "hello.dtd" is relative to
    the directory of the XML file in question, or to the current working
    directory if the XML does not come from a file or the path to the file
    cannot be determined.
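
The lookup rule described above can be sketched in a few lines of Perl. This is only an illustration of the documented behavior, not the module's actual code: the helper name resolve_system_id and the sample paths are hypothetical and were introduced for this example.

```perl
use strict;
use warnings;
use Cwd qw(getcwd);
use File::Basename qw(dirname);
use File::Spec;

# Hypothetical helper (not part of XML::Simple::DTDReader): work out
# where a relative system identifier such as "hello.dtd" would be found.
sub resolve_system_id {
    my ($system_id, $xml_path) = @_;

    # An absolute identifier needs no base directory.
    return $system_id if File::Spec->file_name_is_absolute($system_id);

    # Use the XML file's directory when the file path is known,
    # otherwise fall back to the current working directory.
    my $base = defined $xml_path ? dirname($xml_path) : getcwd();
    return File::Spec->catfile($base, $system_id);
}

# XML read from a known file: the DTD is looked up next to that file.
print resolve_system_id('hello.dtd', '/etc/xml/greeting.xml'), "\n";

# XML read from a string or handle: the DTD is looked up in the cwd.
print resolve_system_id('hello.dtd'), "\n";
```

In practice this means a script that parses XML from a string should either chdir to the directory holding the DTD or use an absolute system identifier.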

    Public DTDs take the form:

        <?xml version="1.0"?>
        <!DOCTYPE svg PUBLIC "-//W3C//DTD SVG 1.0//EN"
          "http://www.w3.org/TR/2001/REC-SVG-20010904/DTD/svg10.dtd">
        <svg>
          <path d="M202,702l1,-3l7,-3l3,1l3,7l-1,3l-7,4l-3,-1l-3,-8z" />
        </svg>

    Two properties of the DTD are used by XML::Simple::DTDReader when
    determining the final structure of the data: repeated elements and ID
    attributes. In the DTD, specifications of the form "element+" or
    "element*" will lead to the key "element" mapping to an anonymous
    array. This is perhaps best illustrated with an example:

        <?xml version="1.0" encoding="iso-8859-1"?>
        <!DOCTYPE data [
          <!ELEMENT data (stuff+)>
          <!ELEMENT stuff (name,other*)>
          <!ELEMENT name (#PCDATA)>
          <!ELEMENT other (#PCDATA)>
        ]>
        <data>
          <stuff>
            <name>Moose</name>
            <other>Value</other>
          </stuff>
          <stuff>
            <name>Thingy</name>
            <other>Value</other>
            <other>Value2</other>
          </stuff>
        </data>

    ...will map to the data structure:

        {
          stuff => [
            {
              name  => "Moose",
              other => ["Value"],
            },
            {
              name  => "Thingy",
              other => ["Value", "Value2"],
            }
          ]
        }
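
Because the DTD fixes which keys hold arrays, calling code can dereference them without first checking ref(). The following sketch walks the structure shown above. To stay runnable without the module installed it uses a literal copy of that structure instead of calling XMLin(); the "|| []" guard is an assumption, since the documentation does not say whether a "stuff" element with no "other" children gets an empty array or no key at all.

```perl
use strict;
use warnings;

# Structure copied from the example above; in real use it would be the
# return value of XMLin() rather than a literal.
my $ref = {
    stuff => [
        { name => "Moose",  other => ["Value"] },
        { name => "Thingy", other => ["Value", "Value2"] },
    ],
};

# "stuff+" and "other*" both map to array references, even when only
# one element is present, so the same loop handles every record.
my @lines;
for my $stuff (@{ $ref->{stuff} }) {
    # Assumption: guard against a missing "other" key for records
    # that contain no <other> elements.
    my @others = @{ $stuff->{other} || [] };
    push @lines, sprintf "%s: %s", $stuff->{name}, join(", ", @others);
}

print "$_\n" for @lines;
```

Note that "name" stays a plain scalar, because the DTD declares it without "+" or "*".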

    The other aspect of the DTD that affects the data structure is ID
    attributes. In XML, ID values are unique across a whole document,
    which is a stricter form of Perl's requirement that keys be unique
    within a hash. Hence, the presence of an attribute of type ID will
    cause that layer of the data to be folded into a hash, with the value
    of the ID attribute as the key. This is, again, best illustrated by
    example:

        <?xml version="1.0" encoding="iso-8859-1"?>
        <!DOCTYPE data [
          <!ELEMENT data (stuff+)>
          <!ELEMENT stuff (name)>
          <!ATTLIST stuff attrib ID #REQUIRED>
          <!ELEMENT name (#PCDATA)>
        ]>
        <data>
          <stuff attrib="first">
            <name>Moose</name>
          </stuff>
          <stuff attrib="second">
            <name>Thingy</name>
          </stuff>
        </data>

    ...will lead to the data structure:

        {
          stuff => {
            first => {
              name   => "Moose",
              attrib => "first"
            },
            second => {
              name   => "Thingy",
              attrib => "second"
            }
          }
        }
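
Folding on the ID attribute turns a search through a list into a direct hash lookup. The sketch below shows both a lookup by ID and an iteration over all IDs. As before, it works on a literal copy of the structure shown above so that it runs without the module installed.

```perl
use strict;
use warnings;

# Structure copied from the example above; in real use it would be the
# return value of XMLin() rather than a literal.
my $ref = {
    stuff => {
        first  => { name => "Moose",  attrib => "first"  },
        second => { name => "Thingy", attrib => "second" },
    },
};

# Direct lookup by ID value, with no need to scan a list.
my $name = $ref->{stuff}{second}{name};
print "second is named $name\n";

# Perl hashes have no defined order, so sort the IDs for stable output.
my @ids = sort keys %{ $ref->{stuff} };
print "$_ => $ref->{stuff}{$_}{name}\n" for @ids;
```

One consequence of the folding is that document order is lost: a Perl hash does not remember the order in which the "stuff" elements appeared in the XML.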

    XML::Simple::DTDReader recognizes most ELEMENT types, with the
    exception of mixed data (#PCDATA intermixed with elements) or ANY data.
    Attempts to parse DTDs describing elements with these types will result
    in an error.

    XML::Simple::DTDReader is stricter than XML::Simple when parsing
    documents: not only must the documents be well-formed, they must also
    be valid against the DTD specified. XML::Simple::DTDReader will die
    with an appropriate message if it encounters a parsing or validation
    error.
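
Since errors are reported with die, a caller that wants to keep running has to trap them with eval. The following is a minimal sketch of that pattern. The wrapper name try_parse is hypothetical, and the block uses a stand-in parser that simply dies so that it runs without the module installed; the commented lines show how the real "XMLin" would be plugged in.

```perl
use strict;
use warnings;

# Hypothetical wrapper: run a parser callback and turn a die() into a
# returned error string instead of letting it end the program.
sub try_parse {
    my ($parser, $source) = @_;
    my $ref;
    my $ok = eval { $ref = $parser->($source); 1 };
    return $ok ? ($ref, undef) : (undef, $@ || 'unknown parse error');
}

# With the module installed, the callback would wrap XMLin:
#   use XML::Simple::DTDReader;
#   my ($ref, $err) = try_parse(sub { XMLin($_[0]) }, 'data.xml');

# Stand-in parser that fails, used here only to exercise the wrapper.
my ($ref, $err) = try_parse(
    sub { die "validation error: unexpected element\n" },
    'data.xml',
);

print defined $err ? "parse failed: $err" : "parsed ok\n";
```

Assigning inside the eval and returning 1 avoids mistaking a parser that legitimately returns a false value for a failure.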

EXAMPLES
    See the "t/" directory of the distribution for a number of example XML
    files and the Perl data structures they map to.

BUGS
    None currently known, but I'm sure there are several.

  Contact Info
    Alex Vandiver : alexmv@mit.edu

  Copyright
    Copyright (C) 2003 Alex Vandiver. All rights reserved. This package
    is free software; you can redistribute it and/or modify it under the
    same terms as Perl itself.

perl v5.34.0                      2022-01-21                      DTDReader(3)