Schema(3)             User Contributed Perl Documentation            Schema(3)

NAME

       XML::Validator::Schema - validate XML against a subset of W3C XML
       Schema

SYNOPSIS

         use XML::SAX::ParserFactory;
         use XML::Validator::Schema;

         #
         # create a new validator object, using foo.xsd
         #
         my $validator = XML::Validator::Schema->new(file => 'foo.xsd');

         #
         # create a SAX parser and assign the validator as a Handler
         #
         my $parser = XML::SAX::ParserFactory->parser(Handler => $validator);

         #
         # validate foo.xml against foo.xsd
         #
         eval { $parser->parse_uri('foo.xml') };
         die "File failed validation: $@" if $@;

DESCRIPTION

       This module allows you to validate XML documents against a W3C XML
       Schema.  It does not implement the full W3C XML Schema recommendation
       (http://www.w3.org/XML/Schema), only a useful subset.  See the SCHEMA
       SUPPORT section below.

       IMPORTANT NOTE: To get line and column numbers in the error messages
       generated by this module you must install XML::Filter::ExceptionLocator
       and use XML::SAX::ExpatXS as your SAX parser.  This module is much more
       useful if you can tell where your errors are, so using these modules is
       highly recommended!
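
       One way to make sure XML::SAX::ExpatXS is the parser in use is to name
       it in $XML::SAX::ParserPackage, which XML::SAX consults before picking
       a default parser.  The sketch below assumes XML::SAX::ExpatXS and
       XML::Filter::ExceptionLocator are both installed; the helper name
       validate_with_expatxs is hypothetical and not part of this module.

```perl
use strict;
use warnings;
use XML::SAX::ParserFactory;
use XML::Validator::Schema;

# Validate $xml_file against $xsd_file with XML::SAX::ExpatXS as the parser.
# Returns nothing on success; any validation error propagates as an exception.
sub validate_with_expatxs {
    my ($xsd_file, $xml_file) = @_;

    my $validator = XML::Validator::Schema->new(file => $xsd_file);

    # Request ExpatXS by name, for the duration of this call only.
    local $XML::SAX::ParserPackage = 'XML::SAX::ExpatXS';
    my $parser = XML::SAX::ParserFactory->parser(Handler => $validator);

    $parser->parse_uri($xml_file);
    return;
}
```

       Call it inside an eval, in the same way as the SYNOPSIS does, for
       example: eval { validate_with_expatxs('foo.xsd', 'foo.xml'); 1 }
       or die "File failed validation: $@";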

INTERFACE

       ·   "XML::Validator::Schema->new(file => 'file.xsd', cache => 1)"

           Call this method to create a new XML::Validator::Schema object.  The
           only required option is "file", which must provide a path to an XML
           Schema document.

           Setting the optional "cache" parameter to 1 causes
           XML::Validator::Schema to keep a copy of the schema parse tree in
           memory.  The tree will be reused on subsequent calls with the same
           "file" parameter, as long as the mtime on the schema file hasn't
           changed.  This can save a lot of time if you're validating many
           documents against a single schema.

           Since XML::Validator::Schema is a SAX filter you will normally pass
           this object to a SAX parser:

             my $validator = XML::Validator::Schema->new(file => 'foo.xsd');
             my $parser = XML::SAX::ParserFactory->parser(Handler => $validator);

           Then you can proceed to validate files using the parser:

             eval { $parser->parse_uri('foo.xml') };
             die "File failed validation: $@" if $@;

           Setting the optional "debug" parameter to 1 causes
           XML::Validator::Schema to output elements and associated attributes
           while parsing and validating the XML document.  This provides useful
           information on the position where the validation failed (although
           not as useful as the line and column numbers included when
           XML::Filter::ExceptionLocator and XML::SAX::ExpatXS are used).
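
       As a sketch of how the "cache" option might be used when checking
       several documents, the script below writes a small schema and two
       documents to a temporary directory and validates each one with a fresh
       validator object.  The schema, the element names, the file names and
       the write_file helper are made up for illustration; only new(),
       parser() and parse_uri() come from the text above.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use XML::SAX::ParserFactory;
use XML::Validator::Schema;

# Write $content to $path and return the path.
sub write_file {
    my ($path, $content) = @_;
    open my $fh, '>', $path or die "Cannot write $path: $!";
    print {$fh} $content;
    close $fh or die "Cannot close $path: $!";
    return $path;
}

my $dir = tempdir(CLEANUP => 1);

# A tiny schema built only from elements listed under SCHEMA SUPPORT.
my $xsd = write_file("$dir/note.xsd", <<'XSD');
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="note">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="priority" type="xs:int"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
XSD

my %docs = (
    good => write_file("$dir/good.xml", '<note><priority>3</priority></note>'),
    bad  => write_file("$dir/bad.xml",  '<note><priority>high</priority></note>'),
);

for my $name (sort keys %docs) {
    # With cache => 1, later calls to new() for the same file can reuse the
    # schema parse tree instead of parsing the schema again.
    my $validator = XML::Validator::Schema->new(file => $xsd, cache => 1);
    my $parser    = XML::SAX::ParserFactory->parser(Handler => $validator);

    if (eval { $parser->parse_uri($docs{$name}); 1 }) {
        print "$name: valid\n";
    }
    else {
        print "$name: failed validation: $@\n";
    }
}
```

       Creating the validator inside the loop is deliberate: the cache is
       described as applying to repeated calls with the same "file"
       parameter, so this is the case where it should help.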

RATIONALE

       I'm writing a piece of software which uses Xerces/C++ (
       http://xml.apache.org/xerces-c/ ) to validate documents against XML
       Schema schemas.  This works very well, but I'd like to release my
       project to the world.  Requiring users to install Xerces is simply too
       onerous a requirement; few will have it already and the Xerces
       installation system leaves much to be desired.

       On CPAN, the only available XML Schema validator is XML::Schema.
       Unfortunately, that module isn't ready for use as it lacks the ability
       to actually parse the XML Schema document format!  I looked into
       enhancing XML::Schema but I must admit that I'm not smart enough to
       understand the code...  One day, when XML::Schema is completed I will
       replace this module with a wrapper around it.

       This module represents my attempt to support enough XML Schema syntax
       to be useful without attempting to tackle the full standard.  I'm sure
       this will mean that it can't be used in all situations, but hopefully
       that won't prevent it from being used at all.

SCHEMA SUPPORT

   Supported Elements
       The following elements are supported by the XML Schema parser.  If you
       don't see an element or an attribute here then you definitely can't use
       it in a schema document.

       You can expect that the schema document parser will produce an error if
       you include elements which are not supported.  However, unsupported
       attributes may be silently ignored.  This should not be misconstrued as
       a feature and will eventually be fixed.

       All of these elements must be in the http://www.w3.org/2001/XMLSchema
       namespace, either using a default namespace or a prefix.

         <schema>

           Supported attributes: targetNamespace, elementFormDefault,
           attributeFormDefault

           Notes: the only supported values for elementFormDefault and
           attributeFormDefault are "unqualified".  As such, targetNamespace
           is essentially ignored.

         <element name="foo">

           Supported attributes: name, type, minOccurs, maxOccurs, ref

         <attribute>

           Supported attributes: name, type, use, ref

         <sequence>

           Supported attributes: minOccurs, maxOccurs

         <choice>

           Supported attributes: minOccurs, maxOccurs

         <all>

           Supported attributes: minOccurs, maxOccurs

         <complexType>

           Supported attributes: name

         <simpleContent>

           The only supported sub-element is <extension>.

         <extension>

           Supported attributes: base

           Notes: only allowed inside <simpleContent>

         <simpleType>

           Supported attributes: name

         <restriction>

           Supported attributes: base

           Notes: only allowed inside <simpleType>

         <whiteSpace>

           Supported attributes: value

         <pattern>

           Supported attributes: value

         <enumeration>

           Supported attributes: value

         <length>

           Supported attributes: value

         <minLength>

           Supported attributes: value

         <maxLength>

           Supported attributes: value

         <minInclusive>

           Supported attributes: value

         <minExclusive>

           Supported attributes: value

         <maxInclusive>

           Supported attributes: value

         <maxExclusive>

           Supported attributes: value

         <totalDigits>

           Supported attributes: value

         <fractionDigits>

           Supported attributes: value

         <annotation>

         <documentation>

           Supported attributes: name

         <union>

           Supported attributes: memberTypes

   Simple Type Support
       Supported built-in types are:

         string

         normalizedString

         token

         NMTOKEN

           Notes: the spec says NMTOKEN should only be used for attributes,
           but this rule is not enforced.

         boolean

         decimal

           Notes: the enumeration facet is not supported on decimal or any
           types derived from decimal.

         integer

         int

         short

         byte

         unsignedInt

         unsignedShort

         unsignedByte

         positiveInteger

         negativeInteger

         nonPositiveInteger

         nonNegativeInteger

         dateTime

           Notes: Although dateTime correctly validates the lexical format it
           does not offer comparison facets (min*, max*, enumeration).

         double

           Notes: Although double correctly validates the lexical format it
           does not offer comparison facets (min*, max*, enumeration).  Also,
           minimum and maximum constraints as described in the spec are not
           checked.

         float

           Notes: The restrictions on double support apply to float as well.

         duration

         time

         date

         gYearMonth

         gYear

         gMonthDay

         gDay

         gMonth

         hexBinary

         base64Binary

         anyURI

         QName

         NOTATION

   Miscellaneous Details
       Other known deviations from the specification:

       ·   Patterns specified in pattern simpleType restrictions are Perl
           regexes with none of the XML Schema extensions available.

       ·   No effort is made to prevent the declaration of facets which
           "loosen" the restrictions on a type.  This is a bug and will be
           fixed in a future release.  Until then types which attempt to
           loosen restrictions on their base class will behave unpredictably.

       ·   No attempt has been made to exclude content models which are
           ambiguous, as the spec demands.  In fact, I don't see any
           compelling reason to do so, aside from strict compliance to the
           spec.  The content model implementation uses regular expressions
           which should be able to handle loads of ambiguity without
           significant performance problems.

       ·   Marking a facet "fixed" has no effect.

       ·   SimpleTypes must come after their base types in the schema body.
           For example, this is ok:

               <xs:simpleType name="foo">
                   <xs:restriction base="xs:string">
                       <xs:minLength value="10"/>
                   </xs:restriction>
               </xs:simpleType>
               <xs:simpleType name="foo_bar">
                   <xs:restriction base="foo">
                       <xs:length value="10"/>
                   </xs:restriction>
               </xs:simpleType>

           But this is not:

               <xs:simpleType name="foo_bar">
                   <xs:restriction base="foo">
                       <xs:length value="10"/>
                   </xs:restriction>
               </xs:simpleType>
               <xs:simpleType name="foo">
                   <xs:restriction base="xs:string">
                       <xs:minLength value="10"/>
                   </xs:restriction>
               </xs:simpleType>
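
       The ordering rule and the pattern caveat can be tried out with a short
       script.  This is only a sketch: the type names five_chars and zip_code
       and the element name zip are hypothetical, the pattern deliberately
       uses syntax that Perl and XML Schema regexes share, and parse_string()
       is the generic XML::SAX parser method rather than anything specific to
       this module.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);
use XML::SAX::ParserFactory;
use XML::Validator::Schema;

# Base type first, derived type second: the order this module requires.
my ($xsd_fh, $xsd_path) = tempfile(SUFFIX => '.xsd', UNLINK => 1);
print {$xsd_fh} <<'XSD';
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:simpleType name="five_chars">
    <xs:restriction base="xs:string">
      <xs:length value="5"/>
    </xs:restriction>
  </xs:simpleType>
  <xs:simpleType name="zip_code">
    <xs:restriction base="five_chars">
      <xs:pattern value="[0-9]+"/>
    </xs:restriction>
  </xs:simpleType>
  <xs:element name="zip" type="zip_code"/>
</xs:schema>
XSD
close $xsd_fh or die "Cannot close $xsd_path: $!";

my $validator = XML::Validator::Schema->new(file => $xsd_path);
my $parser    = XML::SAX::ParserFactory->parser(Handler => $validator);

# One document that should satisfy both types, one that is too short.
for my $xml ('<zip>90210</zip>', '<zip>9021</zip>') {
    my $ok = eval { $parser->parse_string($xml); 1 };
    print $ok ? "accepted: $xml\n" : "rejected: $xml ($@)\n";
}
```

       Swapping the two simpleType definitions in the schema gives the
       unsupported ordering described in the last item above.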

CAVEATS

       Here are a few gotchas that you should know about:

       ·   No Unicode testing has been performed, although it seems possible
           that the module will handle Unicode data correctly.

       ·   Namespace processing is almost entirely missing from the module.

       ·   Little work has been done to ensure that invalid schemas fail
           gracefully.  Until that is done you may want to develop your
           schemas using a more mature validator (like Xerces or XML Spy)
           before using them with this module.

BUGS

       Please use "rt.cpan.org" to report bugs in this module:

         http://rt.cpan.org

       Please note that I will delete bugs which merely point out the lack of
       support for a particular feature of XML Schema.  Those are feature
       requests, and believe me, I know we've got a long way to go.

SUPPORT

       This module is supported on the perl-xml mailing-list.  Please join the
       list if you have questions, suggestions or patches:

         http://listserv.activestate.com/mailman/listinfo/perl-xml

CVS

       If you'd like to help develop XML::Validator::Schema you'll want to
       check out a copy of the CVS tree:

         http://sourceforge.net/cvs/?group_id=89764

CREDITS

       The following people have contributed bug reports, test cases and/or
       code:

         Russell B Cecala (aka Plankton)
         David Wheeler
         Toby Long-Leather
         Mathieu
         h.bridge@fasol.fujitsu.com
         michael.jacob@schering.de
         josef@clubphoto.com
         adamk@ali.as
         Jean Flouret

AUTHOR

       Sam Tregar <sam@tregar.com>

       Copyright (C) 2002-2003 Sam Tregar

       This program is free software; you can redistribute it and/or modify it
       under the same terms as Perl 5 itself.

A NOTE ON DEVELOPMENT METHODOLOGY

       This module isn't just an XML Schema validator, it's also a test of the
       Test Driven Development methodology.  I've been writing tests while I
       develop code for a while now, but TDD goes further by requiring tests
       to be written before code.  One consequence of this is that the module
       code may seem naive; it really is just enough code to pass the current
       test suite.  If I'm doing it right then there shouldn't be a single
       line of code that isn't directly related to passing a test.  As I add
       functionality (by way of writing tests) I'll refactor the code a great
       deal, but I won't add code only to support future development.

       For more information I recommend "Test Driven Development: By Example"
       by Kent Beck.

SEE ALSO

       XML::Schema

       http://www.w3.org/XML/Schema

       http://xml.apache.org/xerces-c/

perl v5.28.0                      2008-01-31                         Schema(3)