XMLWF(1)                                                              XMLWF(1)

NAME

       xmlwf - Determines if an XML document is well-formed

SYNOPSIS

       xmlwf  [ -s]  [ -n]  [ -p]  [ -x]  [ -e encoding]  [ -w]  [ -d output-dir]
              [ -c]  [ -m]  [ -r]  [ -t]  [ -v]  [ file ...]

DESCRIPTION

       xmlwf uses the Expat library to determine whether an XML document is
       well-formed.  It is non-validating.

       If you do not specify any files on the command line, and you have a
       recent version of xmlwf, the input is read from standard input.

WELL-FORMED DOCUMENTS

       A well-formed document must adhere to the following rules:

       · The file begins with an XML declaration.  For instance, <?xml
         version="1.0" standalone="yes"?>.  NOTE: xmlwf does not currently
         check for a valid XML declaration.

       · Every element is either empty (<tag/>) or has a start tag with a
         corresponding end tag.

       · There is exactly one root element.  This element must contain all
         other elements in the document.  Only comments, white space, and
         processing instructions may come after the close of the root
         element.

       · All elements nest properly.

       · All attribute values are enclosed in quotes (either single or
         double).

       If the document has a DTD, and it strictly complies with that DTD, then
       the document is also considered valid.  xmlwf is a non-validating
       parser: it does not check the document against the DTD.  However, it
       does support external entities (see the -x option).

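       The rules above can be tried out without xmlwf itself.  The following
       is a minimal sketch that uses Python's standard xml.parsers.expat
       bindings to the same Expat library; the function name check_well_formed
       is made up for this illustration and the error text is whatever Expat
       reports, not necessarily what xmlwf prints.

```python
import xml.parsers.expat as expat


def check_well_formed(data: bytes):
    """Return None if data is well-formed, else a one-line error description."""
    parser = expat.ParserCreate()
    try:
        parser.Parse(data, True)  # True marks this as the final chunk
    except expat.ExpatError as err:
        # err.lineno is the line and err.offset the column of the error.
        return f"{err.lineno}:{err.offset}: {expat.ErrorString(err.code)}"
    return None


if __name__ == "__main__":
    samples = [
        b"<a><b/></a>",   # nests properly
        b"<a><b></a>",    # <b> has no end tag
        b"<a/><b/>",      # two root elements
        b"<a x=1/>",      # attribute value not quoted
    ]
    for sample in samples:
        print(sample, "->", check_well_formed(sample) or "well-formed")
```

       Like xmlwf, this sketch accepts a document that has no XML declaration.
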

OPTIONS

       When an option includes an argument, you may specify the argument
       either separately ("-d output") or concatenated with the option
       ("-doutput").  xmlwf supports both.

       -c     If the input file is well-formed and xmlwf doesn't encounter any
              errors, the input file is simply copied to the output directory
              unchanged.  This implies no namespaces (turns off -n) and
              requires -d to specify an output directory.

       -d output-dir
              Specifies a directory to contain transformed representations of
              the input files.  By default, -d outputs a canonical
              representation (described below).  You can select different
              output formats using -c and -m.

              The output filenames will be exactly the same as the input
              filenames, or "STDIN" if the input is coming from standard
              input.  Therefore, you must be careful that the output file
              does not go into the same directory as the input file.
              Otherwise, xmlwf will delete the input file before it generates
              the output file (just like running cat < file > file in most
              shells).

              Two structurally equivalent XML documents have a byte-for-byte
              identical canonical XML representation.  Note that ignorable
              white space is considered significant and is treated
              equivalently to data.  More on canonical XML can be found at
              http://www.jclark.com/xml/canonxml.html .

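              Because -d can overwrite an input file, a calling script may
              want to check for the collision first.  This is a minimal
              sketch; the helper name collides is hypothetical, and it
              assumes the output file is named after the base name of the
              input file.

```python
import os


def collides(input_file: str, output_dir: str) -> bool:
    """Return True if writing into output_dir would clobber input_file."""
    # Assumption: the output keeps the input's base name inside output_dir.
    target = os.path.join(output_dir, os.path.basename(input_file))
    # realpath resolves relative paths and symlinks before comparing.
    return os.path.realpath(target) == os.path.realpath(input_file)


if __name__ == "__main__":
    print(collides(os.path.join("in", "a.xml"), "in"))
    print(collides(os.path.join("in", "a.xml"), "out"))
```

              A script would refuse to run xmlwf -d when collides returns
              True for any of its input files.
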
       -e encoding
              Specifies the character encoding for the document, overriding
              any document encoding declaration.  xmlwf supports four
              built-in encodings: US-ASCII, UTF-8, UTF-16, and ISO-8859-1.
              Also see the -w option.

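              The effect of overriding the encoding can be seen with Python's
              Expat bindings, whose ParserCreate function takes an optional
              encoding name.  This is a sketch of the idea rather than of
              xmlwf's own code; parse_text is a name invented here.

```python
import xml.parsers.expat as expat


def parse_text(data: bytes, encoding=None) -> str:
    """Parse data and return its character data, optionally forcing an encoding."""
    chunks = []
    parser = expat.ParserCreate(encoding)  # None lets the document decide
    parser.CharacterDataHandler = chunks.append
    parser.Parse(data, True)
    return "".join(chunks)


if __name__ == "__main__":
    latin1 = b"<doc>caf\xe9</doc>"  # 0xE9 is e-acute in ISO-8859-1
    print(parse_text(latin1, "ISO-8859-1"))
    try:
        parse_text(latin1)  # read as UTF-8 by default, so 0xE9 is invalid
    except expat.ExpatError as err:
        print("without an override:", expat.ErrorString(err.code))
```

              The same bytes fail to parse when no encoding is forced,
              because a lone 0xE9 byte is not valid UTF-8.
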
       -m     Outputs an XML file that completely describes the input file,
              including character positions.  Requires -d to specify an
              output directory.

       -n     Turns on namespace processing.  -c disables namespaces.

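              What namespace processing changes can be illustrated with
              Python's Expat bindings: giving ParserCreate a namespace
              separator makes Expat report each element name as the
              namespace URI, the separator, and the local name.  The helper
              element_names and the URI urn:example are made up for this
              sketch.

```python
import xml.parsers.expat as expat


def element_names(data: bytes, namespaces: bool = False):
    """Return the element names Expat reports for data."""
    names = []
    # Passing a separator switches Expat into namespace-aware mode.
    separator = " " if namespaces else None
    parser = expat.ParserCreate(namespace_separator=separator)
    parser.StartElementHandler = lambda name, attrs: names.append(name)
    parser.Parse(data, True)
    return names


if __name__ == "__main__":
    doc = b'<a:doc xmlns:a="urn:example"><a:item/></a:doc>'
    print(element_names(doc))                   # prefixed names as written
    print(element_names(doc, namespaces=True))  # URI plus local name
```

              With namespace processing on, a prefix that was never declared
              becomes a parse error; without it, the prefix is just part of
              the name.
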
       -p     Tells xmlwf to process external DTDs and parameter entities.

              Normally xmlwf never parses parameter entities.  -p tells it to
              always parse them.  -p implies -x.

       -r     Normally xmlwf memory-maps the XML file before parsing; this can
              result in faster parsing on many platforms.  -r turns off
              memory-mapping and uses normal file I/O calls instead.
              Memory-mapping is automatically turned off when reading from
              standard input.

              Use of memory-mapping can cause some platforms to report
              substantially higher memory usage for xmlwf, but this appears
              to be an artifact of how the operating system reports memory;
              there is no leak in xmlwf.

       -s     Prints an error if the document is not standalone.  A document
              is standalone if it has no external subset and no references to
              parameter entities.

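              Expat exposes this check through a "not standalone" handler:
              when the handler returns 0, parsing stops with an error.  The
              sketch below uses the NotStandaloneHandler attribute of
              Python's Expat bindings; is_standalone and the file name
              doc.dtd are hypothetical, and the DTD is never opened.

```python
import xml.parsers.expat as expat


def is_standalone(data: bytes) -> bool:
    """Return False if Expat reports that the document is not standalone."""
    parser = expat.ParserCreate()
    # Returning 0 asks Expat to fail with its "not standalone" error.
    parser.NotStandaloneHandler = lambda: 0
    try:
        parser.Parse(data, True)
    except expat.ExpatError as err:
        if expat.ErrorString(err.code) == expat.errors.XML_ERROR_NOT_STANDALONE:
            return False
        raise  # some other well-formedness error
    return True


if __name__ == "__main__":
    print(is_standalone(b"<doc/>"))
    print(is_standalone(b'<!DOCTYPE doc SYSTEM "doc.dtd"><doc/>'))
```

              The second document names an external subset, so it is
              reported as not standalone.
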
       -t     Turns on timings.  This tells Expat to parse the entire file,
              but not perform any processing.  This gives a fairly accurate
              idea of the raw speed of Expat itself without client overhead.
              -t turns off most of the output options (-d, -m, -c, ...).

       -v     Prints the version of the Expat library being used, including
              some information on the compile-time configuration of the
              library, and then exits.

       -w     Enables support for Windows code pages.  Normally, xmlwf will
              report an error if it runs across an encoding that it is not
              equipped to handle itself.  With -w, xmlwf will try to use a
              Windows code page.  See also -e.

       -x     Turns on parsing external entities.

              Non-validating parsers are not required to resolve external
              entities, or even expand entities at all.  Expat always expands
              internal entities (?), but external entity parsing must be
              enabled explicitly.

              External entities are simply entities that obtain their data
              from outside the XML file currently being parsed.

              This is an example of an internal entity:

              <!ENTITY vers '1.0.2'>

              And here are some examples of external entities:

              <!ENTITY header SYSTEM "header-&vers;.xml">  (parsed)
              <!ENTITY logo SYSTEM "logo.png" NDATA PNG>   (unparsed)

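              The difference between the two kinds of entity can be observed
              through Python's Expat bindings.  In this sketch the internal
              entity arrives as ordinary character data, while the external
              one only triggers a callback that receives its system
              identifier.  The names trace_entities and header.xml are
              assumptions for illustration, and header.xml is never read.

```python
import xml.parsers.expat as expat

DOC = b"""<!DOCTYPE doc [
  <!ENTITY vers '1.0.2'>
  <!ENTITY header SYSTEM "header.xml">
]>
<doc>&vers;&header;</doc>"""


def trace_entities(data: bytes):
    """Return (character data seen, system IDs of external entities referenced)."""
    text, external = [], []

    def on_external(context, base, system_id, public_id):
        # A real handler would open system_id and parse it with a sub-parser.
        external.append(system_id)
        return 1  # non-zero tells Expat to carry on

    parser = expat.ParserCreate()
    parser.CharacterDataHandler = text.append
    parser.ExternalEntityRefHandler = on_external
    parser.Parse(data, True)
    return "".join(text), external


if __name__ == "__main__":
    print(trace_entities(DOC))
```

              If no external entity handler is installed, the reference to
              header is simply not resolved, which corresponds to running
              without -x.
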
       --     (Two hyphens.)  Terminates the list of options.  This is only
              needed if a filename starts with a hyphen.  For example:

              xmlwf -- -myfile.xml

              will run xmlwf on the file -myfile.xml.

       Older versions of xmlwf do not support reading from standard input.

OUTPUT

       If an input file is not well-formed, xmlwf prints a single line
       describing the problem to standard output.  If a file is well-formed,
       xmlwf outputs nothing.  Note that the result code is not set.

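       Since the result code is not set, a calling program has to look at
       standard output instead.  The following is a minimal sketch of such a
       wrapper, assuming the behaviour described in this manual page; the
       function name is_well_formed is hypothetical, and the command is a
       parameter so the logic can be exercised without xmlwf installed.

```python
import subprocess


def is_well_formed(path: str, command=("xmlwf",)) -> bool:
    """Run command on path and treat empty standard output as well-formed."""
    result = subprocess.run([*command, path], capture_output=True, text=True)
    # Assumption from this page: any output at all is an error description.
    return result.stdout.strip() == ""
```

       A caller would use is_well_formed("document.xml") and might also
       inspect the captured output to report the error line to the user.
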

BUGS

       According to the W3C standard, an XML file without a declaration at the
       beginning is not considered well-formed.  However, xmlwf allows this to
       pass.

       xmlwf returns a 0 (no error) result, even if the file is not
       well-formed.  There is no good way for a program to use xmlwf to
       quickly check a file; it must parse xmlwf's standard output.

       The errors should go to standard error, not standard output.

       There should be a way to get -d to send its output to standard output
       rather than forcing the user to send it to a file.

       I have no idea why anyone would want to use the -d, -c, and -m options.
       If someone could explain it to me, I'd like to add this information to
       this manpage.

ALTERNATIVES

       Here are some XML validators on the web:

       http://www.hcrc.ed.ac.uk/~richard/xml-check.html
       http://www.stg.brown.edu/service/xmlvalid/
       http://www.scripting.com/frontier5/xml/code/xmlValidator.html
       http://www.xml.com/pub/a/tools/ruwf/check.html

SEE ALSO

       The Expat home page:        http://www.libexpat.org/
       The W3 XML specification:   http://www.w3.org/TR/REC-xml

AUTHOR

       This manual page was written by Scott Bronson <bronson@rinspin.com> for
       the Debian GNU/Linux system (but may be used by others).  Permission is
       granted to copy, distribute and/or modify this document under the terms
       of the GNU Free Documentation License, Version 1.1.

                                24 January 2003                       XMLWF(1)