Schema(3)             User Contributed Perl Documentation            Schema(3)

NAME

       XML::Validator::Schema - validate XML against a subset of W3C XML
       Schema

SYNOPSIS

         use XML::SAX::ParserFactory;
         use XML::Validator::Schema;

         #
         # create a new validator object, using foo.xsd
         #
         $validator = XML::Validator::Schema->new(file => 'foo.xsd');

         #
         # create a SAX parser and assign the validator as a Handler
         #
         $parser = XML::SAX::ParserFactory->parser(Handler => $validator);

         #
         # validate foo.xml against foo.xsd
         #
         eval { $parser->parse_uri('foo.xml') };
         die "File failed validation: $@" if $@;

DESCRIPTION

       This module allows you to validate XML documents against a W3C XML
       Schema.  It does not implement the full W3C XML Schema recommendation
       (http://www.w3.org/XML/Schema), only a useful subset.  See the
       SCHEMA SUPPORT section below.

       IMPORTANT NOTE: To get line and column numbers in the error messages
       generated by this module you must install XML::Filter::ExceptionLocator
       and use XML::SAX::ExpatXS as your SAX parser.  This module is much more
       useful if you can tell where your errors are, so using these modules is
       highly recommended!

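       A minimal sketch of the setup described in the note above.  It assumes
       XML::SAX::ExpatXS and XML::Filter::ExceptionLocator are installed and
       that installing the latter is all this module needs in order to use it.
       $XML::SAX::ParserPackage is XML::SAX::ParserFactory's documented way to
       request a specific parser; the helper name new_locating_parser is made
       up for this example.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of XML::Validator::Schema): build a SAX
# parser for the given schema file that is forced to use XML::SAX::ExpatXS.
sub new_locating_parser {
    my ($schema_file) = @_;
    no warnings 'once';

    # Loaded at call time so this file still compiles where the modules
    # are not installed.
    require XML::SAX::ParserFactory;
    require XML::Validator::Schema;

    # Ask the factory for ExpatXS instead of whatever parser is the default.
    local $XML::SAX::ParserPackage = 'XML::SAX::ExpatXS';

    my $validator = XML::Validator::Schema->new(file => $schema_file);
    return XML::SAX::ParserFactory->parser(Handler => $validator);
}
```

       Calling new_locating_parser('foo.xsd')->parse_uri('foo.xml') inside an
       eval should then, per the note above, yield error messages that carry
       line and column numbers.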

INTERFACE

       ·   "XML::Validator::Schema->new(file => 'file.xsd', cache => 1)"

           Call this method to create a new XML::Validator::Schema object.  The
           only required option is "file", which must provide a path to an XML
           Schema document.

           Setting the optional "cache" parameter to 1 causes
           XML::Validator::Schema to keep a copy of the schema parse tree in
           memory.  The tree will be reused on subsequent calls with the same
           "file" parameter, as long as the mtime on the schema file hasn't
           changed.  This can save a lot of time if you're validating many
           documents against a single schema.

           Since XML::Validator::Schema is a SAX filter you will normally pass
           this object to a SAX parser:

             $validator = XML::Validator::Schema->new(file => 'foo.xsd');
             $parser = XML::SAX::ParserFactory->parser(Handler => $validator);

           Then you can proceed to validate files using the parser:

             eval { $parser->parse_uri('foo.xml') };
             die "File failed validation: $@" if $@;

           Setting the optional "debug" parameter to 1 causes
           XML::Validator::Schema to output elements and associated attributes
           while parsing and validating the XML document.  This provides useful
           information on the position where the validation failed (although
           not as useful as the line and column numbers included when
           XML::Filter::ExceptionLocator and XML::SAX::ExpatXS are used).

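       The options above can be combined to validate a batch of documents
       against one schema.  The following is only a sketch: validate_files is
       a made-up helper, and it builds a fresh validator for every document
       because the text above describes cache reuse in terms of repeated
       calls to new() with the same "file" parameter.

```perl
use strict;
use warnings;

# Hypothetical helper: validate several documents against one schema and
# return a hash mapping each failing file name to its error message.
sub validate_files {
    my ($schema_file, @xml_files) = @_;

    # Loaded at call time so this file still compiles where the modules
    # are not installed.
    require XML::SAX::ParserFactory;
    require XML::Validator::Schema;

    my %errors;
    for my $xml_file (@xml_files) {
        # cache => 1 lets each construction after the first reuse the
        # schema parse tree instead of parsing the schema again.
        my $validator = XML::Validator::Schema->new(
            file  => $schema_file,
            cache => 1,
        );
        my $parser = XML::SAX::ParserFactory->parser(Handler => $validator);

        # parse_uri dies when validation fails; record the message.
        eval { $parser->parse_uri($xml_file); 1 }
            or $errors{$xml_file} = "$@";
    }
    return %errors;
}
```

       An empty result means every document passed.  Adding debug => 1 to the
       constructor arguments would additionally print the elements and
       attributes as they are processed.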

RATIONALE

       I'm writing a piece of software which uses Xerces/C++ (
       http://xml.apache.org/xerces-c/ ) to validate documents against XML
       Schema schemas.  This works very well, but I'd like to release my
       project to the world.  Requiring users to install Xerces is simply too
       onerous a requirement; few will have it already and the Xerces
       installation system leaves much to be desired.

       On CPAN, the only available XML Schema validator is XML::Schema.
       Unfortunately, this module isn't ready for use as it lacks the ability
       to actually parse the XML Schema document format!  I looked into
       enhancing XML::Schema but I must admit that I'm not smart enough to
       understand the code...  One day, when XML::Schema is completed I will
       replace this module with a wrapper around it.

       This module represents my attempt to support enough XML Schema syntax
       to be useful without attempting to tackle the full standard.  I'm sure
       this will mean that it can't be used in all situations, but hopefully
       that won't prevent it from being used at all.

SCHEMA SUPPORT

       Supported Elements

       The following elements are supported by the XML Schema parser.  If you
       don't see an element or an attribute here then you definitely can't use
       it in a schema document.

       You can expect that the schema document parser will produce an error if
       you include elements which are not supported.  However, unsupported
       attributes may be silently ignored.  This should not be misconstrued as
       a feature and will eventually be fixed.

       All of these elements must be in the http://www.w3.org/2001/XMLSchema
       namespace, either using a default namespace or a prefix.

         <schema>

           Supported attributes: targetNamespace, elementFormDefault,
           attributeFormDefault

           Notes: the only supported values for elementFormDefault and
           attributeFormDefault are "unqualified."  As such, targetNamespace
           is essentially ignored.

         <element name="foo">

           Supported attributes: name, type, minOccurs, maxOccurs, ref

         <attribute>

           Supported attributes: name, type, use, ref

         <sequence>

           Supported attributes: minOccurs, maxOccurs

         <choice>

           Supported attributes: minOccurs, maxOccurs

         <all>

           Supported attributes: minOccurs, maxOccurs

         <complexType>

           Supported attributes: name

         <simpleContent>

         <extension>

           Supported attributes: base

           Notes: only allowed inside <simpleContent>

         <simpleType>

           Supported attributes: name

         <restriction>

           Supported attributes: base

         <whiteSpace>

           Supported attributes: value

         <pattern>

           Supported attributes: value

         <enumeration>

           Supported attributes: value

         <length>

           Supported attributes: value

         <minLength>

           Supported attributes: value

         <maxLength>

           Supported attributes: value

         <minInclusive>

           Supported attributes: value

         <minExclusive>

           Supported attributes: value

         <maxInclusive>

           Supported attributes: value

         <maxExclusive>

           Supported attributes: value

         <totalDigits>

           Supported attributes: value

         <fractionDigits>

           Supported attributes: value

         <annotation>

         <documentation>

           Supported attributes: name

       Simple Type Support

       Supported built-in types are:

         string

         normalizedString

         token

         NMTOKEN

           Notes: the spec says NMTOKEN should only be used for attributes,
           but this rule is not enforced.

         boolean

         decimal

           Notes: the enumeration facet is not supported on decimal or any
           types derived from decimal.

         integer

         int

         short

         byte

         unsignedInt

         unsignedShort

         unsignedByte

         positiveInteger

         negativeInteger

         nonPositiveInteger

         nonNegativeInteger

         dateTime

           Notes: Although dateTime correctly validates the lexical format it
           does not offer comparison facets (min*, max*, enumeration).

         double

           Notes: Although double correctly validates the lexical format it
           does not offer comparison facets (min*, max*, enumeration).  Also,
           minimum and maximum constraints as described in the spec are not
           checked.

         float

           Notes: The restrictions on double support apply to float as well.

         duration

         time

         date

         gYearMonth

         gYear

         gMonthDay

         gDay

         gMonth

         hexBinary

         base64Binary

         anyURI

         QName

         NOTATION

       Miscellaneous Details

       Other known deviations from the specification:

       ·   Patterns specified in pattern simpleType restrictions are Perl
           regexes with none of the XML Schema extensions available.

       ·   No effort is made to prevent the declaration of facets which
           "loosen" the restrictions on a type.  This is a bug and will be
           fixed in a future release.  Until then, types which attempt to
           loosen restrictions on their base class will behave unpredictably.

       ·   No attempt has been made to exclude content models which are
           ambiguous, as the spec demands.  In fact, I don't see any
           compelling reason to do so, aside from strict compliance to the
           spec.  The content model implementation uses regular expressions
           which should be able to handle loads of ambiguity without
           significant performance problems.

       ·   Marking a facet "fixed" has no effect.

       ·   SimpleTypes must come after their base types in the schema body.
           For example, this is ok:

               <xs:simpleType name="foo">
                   <xs:restriction base="xs:string">
                       <xs:minLength value="10"/>
                   </xs:restriction>
               </xs:simpleType>
               <xs:simpleType name="foo_bar">
                   <xs:restriction base="foo">
                       <xs:length value="10"/>
                   </xs:restriction>
               </xs:simpleType>

           But this is not:

               <xs:simpleType name="foo_bar">
                   <xs:restriction base="foo">
                       <xs:length value="10"/>
                   </xs:restriction>
               </xs:simpleType>
               <xs:simpleType name="foo">
                   <xs:restriction base="xs:string">
                       <xs:minLength value="10"/>
                   </xs:restriction>
               </xs:simpleType>

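       Because pattern facets are handed to Perl's regex engine, a pattern can
       be tried out in plain Perl before it goes into a schema.  The sketch
       below assumes a pattern has to match the whole value, as XML Schema
       specifies; whether this module anchors patterns in exactly this way is
       an assumption, and matches_pattern is a made-up helper.

```perl
use strict;
use warnings;

# Hypothetical helper: true if $value as a whole matches $pattern.
# Anchoring with \A...\z mirrors XML Schema's whole-value rule; it is an
# assumption here, not something taken from the module's source.
sub matches_pattern {
    my ($pattern, $value) = @_;
    return $value =~ /\A(?:$pattern)\z/ ? 1 : 0;
}

# A pattern that uses only syntax shared by Perl and XML Schema.
my $zip = '\d{5}(-\d{4})?';

for my $value ('12345', '12345-6789', '1234') {
    my $verdict = matches_pattern($zip, $value) ? 'ok' : 'rejected';
    printf "%-12s %s\n", $value, $verdict;
}
```

       Patterns that rely on XML Schema-only regex syntax cannot be checked
       this way and, per the first item above, will not behave as the spec
       describes inside this module either.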

CAVEATS

       Here are a few gotchas that you should know about:

       ·   No Unicode testing has been performed, although it seems possible
           that the module will handle Unicode data correctly.

       ·   Namespace processing is almost entirely missing from the module.

       ·   Little work has been done to ensure that invalid schemas fail
           gracefully.  Until that is done you may want to develop your
           schemas using a more mature validator (like Xerces or XML Spy)
           before using them with this module.

BUGS

       Please use "rt.cpan.org" to report bugs in this module:

         http://rt.cpan.org

       Please note that I will delete bugs which merely point out the lack of
       support for a particular feature of XML Schema.  Those are feature
       requests, and believe me, I know we've got a long way to go.

SUPPORT

       This module is supported on the perl-xml mailing list.  Please join the
       list if you have questions, suggestions or patches:

         http://listserv.activestate.com/mailman/listinfo/perl-xml

CVS

       If you'd like to help develop XML::Validator::Schema you'll want to
       check out a copy of the CVS tree:

         http://sourceforge.net/cvs/?group_id=89764

CREDITS

       The following people have contributed bug reports, test cases and/or
       code:

         Russell B Cecala (aka Plankton)
         David Wheeler
         Toby Long-Leather
         Mathieu
         h.bridge@fasol.fujitsu.com
         michael.jacob@schering.de
         josef@clubphoto.com
         adamk@ali.as
         Jean Flouret

AUTHOR

       Sam Tregar <sam@tregar.com>

       Copyright (C) 2002-2003 Sam Tregar

       This program is free software; you can redistribute it and/or modify it
       under the same terms as Perl 5 itself.

A NOTE ON DEVELOPMENT METHODOLOGY

       This module isn't just an XML Schema validator; it's also a test of the
       Test Driven Development methodology.  I've been writing tests while I
       develop code for a while now, but TDD goes further by requiring tests
       to be written before code.  One consequence of this is that the module
       code may seem naive; it really is just enough code to pass the current
       test suite.  If I'm doing it right then there shouldn't be a single
       line of code that isn't directly related to passing a test.  As I add
       functionality (by way of writing tests) I'll refactor the code a great
       deal, but I won't add code only to support future development.

       For more information I recommend "Test Driven Development: By Example"
       by Kent Beck.

SEE ALSO

       XML::Schema

       http://www.w3.org/XML/Schema

       http://xml.apache.org/xerces-c/

perl v5.8.8                       2004-11-04                         Schema(3)