Schema(3)          User Contributed Perl Documentation          Schema(3)

NAME
XML::Validator::Schema - validate XML against a subset of W3C XML
Schema

SYNOPSIS
    use XML::SAX::ParserFactory;
    use XML::Validator::Schema;

    #
    # create a new validator object, using foo.xsd
    #
    $validator = XML::Validator::Schema->new(file => 'foo.xsd');

    #
    # create a SAX parser and assign the validator as a Handler
    #
    $parser = XML::SAX::ParserFactory->parser(Handler => $validator);

    #
    # validate foo.xml against foo.xsd
    #
    eval { $parser->parse_uri('foo.xml') };
    die "File failed validation: $@" if $@;

DESCRIPTION
This module allows you to validate XML documents against a W3C XML
Schema. It does not implement the full W3C XML Schema recommendation
(http://www.w3.org/XML/Schema), but a useful subset. See the SCHEMA
SUPPORT section below.

IMPORTANT NOTE: To get line and column numbers in the error messages
generated by this module you must install XML::Filter::ExceptionLocator
and use XML::SAX::ExpatXS as your SAX parser. This module is much more
useful if you can tell where your errors are, so using these modules is
highly recommended!

INTERFACE
• "XML::Validator::Schema->new(file => 'file.xsd', cache => 1)"

  Call this method to create a new XML::Validator::Schema object. The
  only required option is "file", which must provide a path to an XML
  Schema document.

  Setting the optional "cache" parameter to 1 causes
  XML::Validator::Schema to keep a copy of the schema parse tree in
  memory. The tree will be reused on subsequent calls with the same
  "file" parameter, as long as the mtime on the schema file hasn't
  changed. This can save a lot of time if you're validating many
  documents against a single schema.

  Since XML::Validator::Schema is a SAX filter you will normally pass
  this object to a SAX parser:

      $validator = XML::Validator::Schema->new(file => 'foo.xsd');
      $parser = XML::SAX::ParserFactory->parser(Handler => $validator);

  Then you can proceed to validate files using the parser:

      eval { $parser->parse_uri('foo.xml') };
      die "File failed validation: $@" if $@;

  Setting the optional "debug" parameter to 1 causes
  XML::Validator::Schema to output elements and associated attributes
  while parsing and validating the XML document. This provides useful
  information on the position where the validation failed (although
  not as useful as the line and column numbers included when
  XML::Filter::ExceptionLocator and XML::SAX::ExpatXS are used).

RATIONALE
I'm writing a piece of software which uses Xerces/C++
(http://xml.apache.org/xerces-c/) to validate documents against XML
Schema schemas. This works very well, but I'd like to release my
project to the world. Requiring users to install Xerces is simply too
onerous a requirement; few will have it already and the Xerces
installation system leaves much to be desired.

On CPAN, the only available XML Schema validator is XML::Schema.
Unfortunately, this module isn't ready for use as it lacks the ability
to actually parse the XML Schema document format! I looked into
enhancing XML::Schema but I must admit that I'm not smart enough to
understand the code... One day, when XML::Schema is completed, I will
replace this module with a wrapper around it.

This module represents my attempt to support enough XML Schema syntax
to be useful without attempting to tackle the full standard. I'm sure
this will mean that it can't be used in all situations, but hopefully
that won't prevent it from being used at all.

SCHEMA SUPPORT
Supported Elements
The following elements are supported by the XML Schema parser. If you
don't see an element or an attribute here then you definitely can't use
it in a schema document.

You can expect that the schema document parser will produce an error if
you include elements which are not supported. However, unsupported
attributes may be silently ignored. This should not be misconstrued as
a feature and will eventually be fixed.

All of these elements must be in the http://www.w3.org/2001/XMLSchema
namespace, either using a default namespace or a prefix.

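As a rough illustration of the namespace requirement, here is a minimal
schema skeleton written with a prefix. The element name "note" is made
up for this sketch, and the form-default attributes are set to
"unqualified", the only value this document lists as supported. It has
not been checked against the module itself.

```xml
<?xml version="1.0"?>
<!-- Prefixed form: every schema element carries the xs: prefix. -->
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           elementFormDefault="unqualified"
           attributeFormDefault="unqualified">
  <!-- "note" is a hypothetical element name used only for illustration -->
  <xs:element name="note" type="xs:string"/>
</xs:schema>
```

The default-namespace alternative would instead declare
xmlns="http://www.w3.org/2001/XMLSchema" on an unprefixed <schema> root.
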
<schema>

    Supported attributes: targetNamespace, elementFormDefault,
    attributeFormDefault

    Notes: the only supported value for elementFormDefault and
    attributeFormDefault is "unqualified". As such, targetNamespace
    is essentially ignored.

<element name="foo">

    Supported attributes: name, type, minOccurs, maxOccurs, ref

<attribute>

    Supported attributes: name, type, use, ref

<sequence>

    Supported attributes: minOccurs, maxOccurs

<choice>

    Supported attributes: minOccurs, maxOccurs

<all>

    Supported attributes: minOccurs, maxOccurs

<complexType>

    Supported attributes: name

<simpleContent>

    The only supported sub-element is <extension>.

<extension>

    Supported attributes: base

    Notes: only allowed inside <simpleContent>

<simpleType>

    Supported attributes: name

<restriction>

    Supported attributes: base

    Notes: only allowed inside <simpleType>

<whiteSpace>

    Supported attributes: value

<pattern>

    Supported attributes: value

<enumeration>

    Supported attributes: value

<length>

    Supported attributes: value

<minLength>

    Supported attributes: value

<maxLength>

    Supported attributes: value

<minInclusive>

    Supported attributes: value

<minExclusive>

    Supported attributes: value

<maxInclusive>

    Supported attributes: value

<maxExclusive>

    Supported attributes: value

<totalDigits>

    Supported attributes: value

<fractionDigits>

    Supported attributes: value

<annotation>

<documentation>

    Supported attributes: name

<union>

    Supported attributes: memberTypes

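To show how the elements listed above might fit together, here is a
sketch of a small schema that stays inside the documented subset. All
names in it (order, item, skuType, quantityType, priceType, itemType)
are hypothetical, and the schema has not been run through the module.
Named types are declared before the elements that use them.

```xml
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           elementFormDefault="unqualified"
           attributeFormDefault="unqualified">

  <xs:annotation>
    <xs:documentation>Hypothetical order schema, for illustration only.</xs:documentation>
  </xs:annotation>

  <!-- simpleType + restriction + pattern facet -->
  <xs:simpleType name="skuType">
    <xs:restriction base="xs:string">
      <xs:pattern value="[A-Z]{3}[0-9]{4}"/>
    </xs:restriction>
  </xs:simpleType>

  <!-- simpleType + restriction + range facet -->
  <xs:simpleType name="quantityType">
    <xs:restriction base="xs:positiveInteger">
      <xs:maxInclusive value="999"/>
    </xs:restriction>
  </xs:simpleType>

  <!-- simpleContent + extension: text content plus one attribute -->
  <xs:complexType name="priceType">
    <xs:simpleContent>
      <xs:extension base="xs:decimal">
        <xs:attribute name="currency" type="xs:string" use="required"/>
      </xs:extension>
    </xs:simpleContent>
  </xs:complexType>

  <!-- complexType + sequence, with an optional trailing element -->
  <xs:complexType name="itemType">
    <xs:sequence>
      <xs:element name="sku" type="skuType"/>
      <xs:element name="quantity" type="quantityType"/>
      <xs:element name="price" type="priceType"/>
      <xs:element name="comment" type="xs:string" minOccurs="0"/>
    </xs:sequence>
  </xs:complexType>

  <!-- root element: one or more items, plus a required id attribute -->
  <xs:element name="order">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="item" type="itemType" maxOccurs="unbounded"/>
      </xs:sequence>
      <xs:attribute name="id" type="xs:string" use="required"/>
    </xs:complexType>
  </xs:element>
</xs:schema>
```
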
Simple Type Support
Supported built-in types are:

string

normalizedString

token

NMTOKEN

    Notes: the spec says NMTOKEN should only be used for attributes,
    but this rule is not enforced.

boolean

decimal

    Notes: the enumeration facet is not supported on decimal or any
    types derived from decimal.

integer

int

short

byte

unsignedInt

unsignedShort

unsignedByte

positiveInteger

negativeInteger

nonPositiveInteger

nonNegativeInteger

dateTime

    Notes: Although dateTime correctly validates the lexical format it
    does not offer comparison facets (min*, max*, enumeration).

double

    Notes: Although double correctly validates the lexical format it
    does not offer comparison facets (min*, max*, enumeration). Also,
    minimum and maximum constraints as described in the spec are not
    checked.

float

    Notes: The restrictions on double support apply to float as well.

duration

time

date

gYearMonth

gYear

gMonthDay

gDay

gMonth

hexBinary

base64Binary

anyURI

QName

NOTATION

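The notes on individual types above suggest a few habits when writing a
schema for this module. The sketch below, with hypothetical names
(reading, percentType, sizeType), keeps range facets on a decimal
type, puts enumeration on a string-derived type rather than on decimal,
and uses dateTime and double unrestricted so that only their lexical
format is relied upon. It is an untested illustration, not a statement
of what the module accepts.

```xml
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           elementFormDefault="unqualified"
           attributeFormDefault="unqualified">

  <!-- decimal with range and digit facets; no enumeration on decimal -->
  <xs:simpleType name="percentType">
    <xs:restriction base="xs:decimal">
      <xs:minInclusive value="0"/>
      <xs:maxInclusive value="100"/>
      <xs:fractionDigits value="2"/>
    </xs:restriction>
  </xs:simpleType>

  <!-- enumeration placed on a token-derived type instead -->
  <xs:simpleType name="sizeType">
    <xs:restriction base="xs:token">
      <xs:enumeration value="small"/>
      <xs:enumeration value="medium"/>
      <xs:enumeration value="large"/>
    </xs:restriction>
  </xs:simpleType>

  <xs:element name="reading">
    <xs:complexType>
      <xs:sequence>
        <!-- dateTime and double used without min*/max*/enumeration facets -->
        <xs:element name="taken" type="xs:dateTime"/>
        <xs:element name="value" type="xs:double"/>
        <xs:element name="humidity" type="percentType"/>
        <xs:element name="size" type="sizeType"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
```
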
Miscellaneous Details
Other known deviations from the specification:

• Patterns specified in pattern simpleType restrictions are Perl
  regexes with none of the XML Schema extensions available.

• No effort is made to prevent the declaration of facets which
  "loosen" the restrictions on a type. This is a bug and will be
  fixed in a future release. Until then, types which attempt to
  loosen restrictions on their base type will behave unpredictably.

• No attempt has been made to exclude content models which are
  ambiguous, as the spec demands. In fact, I don't see any
  compelling reason to do so, aside from strict compliance to the
  spec. The content model implementation uses regular expressions
  which should be able to handle loads of ambiguity without
  significant performance problems.

• Marking a facet "fixed" has no effect.

• SimpleTypes must come after their base types in the schema body.
  For example, this is ok:

      <xs:simpleType name="foo">
          <xs:restriction base="xs:string">
              <xs:minLength value="10"/>
          </xs:restriction>
      </xs:simpleType>
      <xs:simpleType name="foo_bar">
          <xs:restriction base="foo">
              <xs:length value="10"/>
          </xs:restriction>
      </xs:simpleType>

  But this is not:

      <xs:simpleType name="foo_bar">
          <xs:restriction base="foo">
              <xs:length value="10"/>
          </xs:restriction>
      </xs:simpleType>
      <xs:simpleType name="foo">
          <xs:restriction base="xs:string">
              <xs:minLength value="10"/>
          </xs:restriction>
      </xs:simpleType>

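One way to stay clear of several of the deviations above at once is
sketched here, using hypothetical type and element names (codeType,
shortCodeType, code). The pattern sticks to plain character classes and
quantifiers, which read the same way in Perl and in XML Schema regular
expressions; the derived type only tightens its base; and the base type
is declared first. This is an untested example under those assumptions.

```xml
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           elementFormDefault="unqualified"
           attributeFormDefault="unqualified">

  <!-- Base type comes first in the schema body. -->
  <xs:simpleType name="codeType">
    <xs:restriction base="xs:string">
      <xs:maxLength value="12"/>
      <!-- Portable pattern: no \i, \c, \p{Is...} block escapes
           or character class subtraction. -->
      <xs:pattern value="[A-Za-z_][A-Za-z0-9_]*"/>
    </xs:restriction>
  </xs:simpleType>

  <!-- Derived type comes second and only tightens maxLength (12 to 8). -->
  <xs:simpleType name="shortCodeType">
    <xs:restriction base="codeType">
      <xs:maxLength value="8"/>
    </xs:restriction>
  </xs:simpleType>

  <xs:element name="code" type="shortCodeType"/>
</xs:schema>
```
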
CAVEATS
Here are a few gotchas that you should know about:

• No Unicode testing has been performed, although it seems possible
  that the module will handle Unicode data correctly.

• Namespace processing is almost entirely missing from the module.

• Little work has been done to ensure that invalid schemas fail
  gracefully. Until that is done you may want to develop your
  schemas using a more mature validator (like Xerces or XML Spy)
  before using them with this module.

BUGS
Please use "rt.cpan.org" to report bugs in this module:

    http://rt.cpan.org

Please note that I will delete bugs which merely point out the lack of
support for a particular feature of XML Schema. Those are feature
requests, and believe me, I know we've got a long way to go.

SUPPORT
This module is supported on the perl-xml mailing-list. Please join the
list if you have questions, suggestions or patches:

    http://listserv.activestate.com/mailman/listinfo/perl-xml

CVS
If you'd like to help develop XML::Validator::Schema you'll want to
check out a copy of the CVS tree:

    http://sourceforge.net/cvs/?group_id=89764

CREDITS
The following people have contributed bug reports, test cases and/or
code:

    Russell B Cecala (aka Plankton)
    David Wheeler
    Toby Long-Leather
    Mathieu
    h.bridge@fasol.fujitsu.com
    michael.jacob@schering.de
    josef@clubphoto.com
    adamk@ali.as
    Jean Flouret

AUTHOR
Sam Tregar <sam@tregar.com>

COPYRIGHT AND LICENSE
Copyright (C) 2002-2003 Sam Tregar

This program is free software; you can redistribute it and/or modify it
under the same terms as Perl 5 itself.

A NOTE ON DEVELOPMENT METHODOLOGY
This module isn't just an XML Schema validator, it's also a test of the
Test Driven Development methodology. I've been writing tests while I
develop code for a while now, but TDD goes further by requiring tests
to be written before code. One consequence of this is that the module
code may seem naive; it really is just enough code to pass the current
test suite. If I'm doing it right then there shouldn't be a single
line of code that isn't directly related to passing a test. As I add
functionality (by way of writing tests) I'll refactor the code a great
deal, but I won't add code only to support future development.

For more information I recommend "Test Driven Development: By Example"
by Kent Beck.

SEE ALSO
XML::Schema

http://www.w3.org/XML/Schema

http://xml.apache.org/xerces-c/

perl v5.34.0                      2021-07-23                      Schema(3)