dcm2xml(1)                        OFFIS DCMTK                        dcm2xml(1)

NAME
dcm2xml - Convert DICOM file and data set to XML

SYNOPSIS
dcm2xml [options] dcmfile-in [xmlfile-out]

DESCRIPTION
The dcm2xml utility converts the contents of a DICOM file (file format
or raw data set) to XML (Extensible Markup Language). The DTD (Document
Type Definition) is described in the file dcm2xml.dtd.

If dcm2xml reads a raw data set (DICOM data without a file format
meta-header), it will attempt to guess the transfer syntax by examining
the first few bytes of the file. It is not always possible to guess the
transfer syntax correctly, so it is better to convert a data set to a
file format whenever possible (using the dcmconv utility). It is also
possible to use the -f and -t[ieb] options to force dcm2xml to read a
data set with a particular transfer syntax.

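As a minimal sketch (not part of dcm2xml itself), the following Python
helper assembles a command line that forces a raw data set to be read
with a particular transfer syntax, using only the -f and -te/-tb/-ti
options described in this manual. The helper name, the dictionary keys
and the file names are hypothetical; the function only builds the
argument list and does not run the tool.

```python
# Hypothetical helper: builds (but does not run) a dcm2xml command line.
XFER_OPTIONS = {
    "explicit-little": "-te",  # --read-xfer-little
    "explicit-big": "-tb",     # --read-xfer-big
    "implicit-little": "-ti",  # --read-xfer-implicit
}


def build_dataset_command(infile, xfer, outfile=None):
    """Return argv for reading a raw data set with a forced transfer syntax."""
    argv = ["dcm2xml", "-f", XFER_OPTIONS[xfer], infile]
    if outfile is not None:
        argv.append(outfile)  # when omitted, the XML is written to stdout
    return argv


if __name__ == "__main__":
    print(build_dataset_command("dataset.bin", "implicit-little", "out.xml"))
```

The resulting list could be passed to subprocess.run() on a system
where dcm2xml is installed.
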
PARAMETERS
dcmfile-in   DICOM input filename to be converted

xmlfile-out  XML output filename (default: stdout)

OPTIONS
general options
  -h    --help
          print this help text and exit

        --version
          print version information and exit

        --arguments
          print expanded command line arguments

  -q    --quiet
          quiet mode, print no warnings and errors

  -v    --verbose
          verbose mode, print processing details

  -d    --debug
          debug mode, print debug information

  -ll   --log-level  [l]evel: string constant
          (fatal, error, warn, info, debug, trace)
          use level l for the logger

  -lc   --log-config  [f]ilename: string
          use config file f for the logger

input options
  input file format:

    +f    --read-file
            read file format or data set (default)

    +fo   --read-file-only
            read file format only

    -f    --read-dataset
            read data set without file meta information

  input transfer syntax:

    -t=   --read-xfer-auto
            use TS recognition (default)

    -td   --read-xfer-detect
            ignore TS specified in the file meta header

    -te   --read-xfer-little
            read with explicit VR little endian TS

    -tb   --read-xfer-big
            read with explicit VR big endian TS

    -ti   --read-xfer-implicit
            read with implicit VR little endian TS

  long tag values:

    +M    --load-all
            load very long tag values (e.g. pixel data)

    -M    --load-short
            do not load very long values (default)

    +R    --max-read-length  [k]bytes: integer (4..4194302, default: 4)
            set threshold for long values to k kbytes

processing options
  character set:

    +Cr   --charset-require
            require declaration of extended charset (default)

    +Ca   --charset-assume  [c]harset: string constant
            (latin-1 to -5, cyrillic, arabic, greek, hebrew)
            assume charset c if no extended charset declared

    +Cc   --charset-check-all
            check all data elements with string values
            (default: only PN, LO, LT, SH, ST and UT)

output options
  XML structure:

    +Xd   --add-dtd-reference
            add reference to document type definition (DTD)

    +Xe   --embed-dtd-content
            embed document type definition into XML document

    +Xf   --use-dtd-file  [f]ilename: string
            use specified DTD file (only with +Xe)
            (default: /usr/local/share/dcmtk/dcm2xml.dtd)

    +Xn   --use-xml-namespace
            add XML namespace declaration to root element

  DICOM data elements:

    +Wn   --write-element-name
            write name of the DICOM data elements (default)

    -Wn   --no-element-name
            do not write name of the DICOM data elements

    +Wb   --write-binary-data
            write binary data of OB and OW elements
            (default: off, be careful with --load-all)

    +Eh   --encode-hex
            encode binary data as hex numbers (default)

    +Eb   --encode-base64
            encode binary data as Base64 (RFC 2045, MIME)

NOTES
The basic structure of the XML output created from a DICOM image file
looks like the following:

  <?xml version="1.0" encoding="ISO-8859-1"?>
  <!DOCTYPE file-format SYSTEM "dcm2xml.dtd">
  <file-format xmlns="http://dicom.offis.de/dcmtk">
    <meta-header xfer="1.2.840.10008.1.2.1" name="LittleEndianExplicit">
      <element tag="0002,0000" vr="UL" vm="1" len="4"
               name="MetaElementGroupLength">
        166
      </element>
      ...
      <element tag="0002,0013" vr="SH" vm="1" len="16"
               name="ImplementationVersionName">
        OFFIS_DCMTK_353
      </element>
    </meta-header>
    <data-set xfer="1.2.840.10008.1.2" name="LittleEndianImplicit">
      <element tag="0008,0005" vr="CS" vm="1" len="10"
               name="SpecificCharacterSet">
        ISO_IR 100
      </element>
      ...
      <sequence tag="0028,3010" vr="SQ" card="2" name="VOILUTSequence">
        <item card="3">
          <element tag="0028,3002" vr="xs" vm="3" len="6"
                   name="LUTDescriptor">
            256\0\8
          </element>
          ...
        </item>
        ...
      </sequence>
      ...
      <element tag="7fe0,0010" vr="OW" vm="1" len="262144"
               name="PixelData" loaded="no" binary="hidden">
      </element>
    </data-set>
  </file-format>

The 'file-format' and 'meta-header' tags are absent for DICOM data
sets.

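The structure above can be read back with any XML parser. The
following Python sketch uses the standard library to list the elements
of the data set. The SAMPLE document is a shortened, illustrative
stand-in for real dcm2xml output, and the function name is
hypothetical. The namespace is taken from the root element, so the
sketch is meant to work both with and without the XML namespace
declaration.

```python
import xml.etree.ElementTree as ET

# Shortened stand-in for real dcm2xml output (illustrative data only).
SAMPLE = b"""<?xml version="1.0" encoding="ISO-8859-1"?>
<file-format xmlns="http://dicom.offis.de/dcmtk">
<meta-header xfer="1.2.840.10008.1.2.1" name="LittleEndianExplicit">
<element tag="0002,0000" vr="UL" vm="1" len="4" name="MetaElementGroupLength">166</element>
</meta-header>
<data-set xfer="1.2.840.10008.1.2" name="LittleEndianImplicit">
<element tag="0008,0005" vr="CS" vm="1" len="10" name="SpecificCharacterSet">ISO_IR 100</element>
<element tag="7fe0,0010" vr="OW" vm="1" len="262144" name="PixelData" loaded="no" binary="hidden"></element>
</data-set>
</file-format>
"""


def list_elements(xml_bytes):
    """Return (tag, name, value) for each element inside the data set."""
    root = ET.fromstring(xml_bytes)
    # Derive the namespace prefix ("{uri}") from the root tag, if any.
    ns = root.tag[: root.tag.index("}") + 1] if root.tag.startswith("{") else ""
    # The root is <file-format> for files and <data-set> for raw data sets.
    data_set = root if root.tag == ns + "data-set" else root.find(ns + "data-set")
    if data_set is None:
        return []
    rows = []
    for elem in data_set.iter(ns + "element"):
        value = (elem.text or "").strip()
        if elem.get("loaded") == "no" or elem.get("binary") == "hidden":
            value = None  # value was not loaded or not written to the XML
        rows.append((elem.get("tag"), elem.get("name"), value))
    return rows


if __name__ == "__main__":
    for row in list_elements(SAMPLE):
        print(row)
```

Elements nested inside sequences are included as well, because iter()
walks the whole subtree of the data set.
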
Character Encoding
The XML encoding is determined automatically from the DICOM attribute
(0008,0005) 'Specific Character Set' (if present) using the following
mapping:

  ASCII         "ISO_IR 6"    =>  "UTF-8"
  UTF-8         "ISO_IR 192"  =>  "UTF-8"
  ISO Latin 1   "ISO_IR 100"  =>  "ISO-8859-1"
  ISO Latin 2   "ISO_IR 101"  =>  "ISO-8859-2"
  ISO Latin 3   "ISO_IR 109"  =>  "ISO-8859-3"
  ISO Latin 4   "ISO_IR 110"  =>  "ISO-8859-4"
  ISO Latin 5   "ISO_IR 148"  =>  "ISO-8859-9"
  Cyrillic      "ISO_IR 144"  =>  "ISO-8859-5"
  Arabic        "ISO_IR 127"  =>  "ISO-8859-6"
  Greek         "ISO_IR 126"  =>  "ISO-8859-7"
  Hebrew        "ISO_IR 138"  =>  "ISO-8859-8"

Multiple character sets are not supported (only the first attribute
value is mapped in case of value multiplicity).

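The mapping can be expressed as a simple lookup table. The Python
sketch below copies the table from this manual and, as described, maps
only the first value of a multi-valued attribute. The function name is
hypothetical, and returning None for an absent or unlisted value is an
assumption of this sketch; the manual does not say what dcm2xml does
in those cases.

```python
# Mapping copied from the table in this manual.
XML_ENCODINGS = {
    "ISO_IR 6": "UTF-8",
    "ISO_IR 192": "UTF-8",
    "ISO_IR 100": "ISO-8859-1",
    "ISO_IR 101": "ISO-8859-2",
    "ISO_IR 109": "ISO-8859-3",
    "ISO_IR 110": "ISO-8859-4",
    "ISO_IR 148": "ISO-8859-9",
    "ISO_IR 144": "ISO-8859-5",
    "ISO_IR 127": "ISO-8859-6",
    "ISO_IR 126": "ISO-8859-7",
    "ISO_IR 138": "ISO-8859-8",
}


def xml_encoding(specific_character_set):
    """Map a (0008,0005) value to an XML encoding name, or None."""
    if not specific_character_set:
        return None  # assumption: behaviour for an absent attribute is left open
    # Only the first value is mapped when several are given.
    first = specific_character_set.split("\\")[0].strip()
    return XML_ENCODINGS.get(first)


if __name__ == "__main__":
    print(xml_encoding("ISO_IR 148"))
```
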
XML Encoding
Attributes with very large value fields (e.g. pixel data) are not
loaded by default. They can be identified by the additional attribute
'loaded' with a value of 'no' (see example above). The command line
option --load-all forces all value fields to be loaded, including the
very long ones.

Furthermore, binary information of OB and OW attributes is not written
to the XML output file by default. These elements can be identified by
the additional attribute 'binary' with a value of 'hidden' (default is
'no'). The command line option --write-binary-data causes binary value
fields to be printed as well (the attribute value is then 'yes' or
'base64'). However, be careful when using this option together with
--load-all because of the large amounts of pixel data that might be
printed to the output.

Multiple values (i.e. where the DICOM value multiplicity is greater
than 1) are separated by a backslash '\' (except for Base64 encoded
data). The 'len' attribute indicates the number of bytes for the
particular value field as stored in the DICOM data set, i.e. it might
deviate from the XML encoded value length, e.g. because non-significant
padding has been removed. If this attribute is missing in 'sequence' or
'item' start tags, the corresponding DICOM element has been stored with
undefined length.

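A consumer of the XML output has to undo the backslash separation
itself. The following Python sketch shows one way to do this, leaving
Base64 data untouched, and how the absence of the 'len' attribute can
be checked. Both function names are hypothetical, and the attributes
are assumed to be available as a plain dictionary.

```python
def split_values(text, binary=None):
    """Split an element's text into its values.

    `binary` is the value of the element's 'binary' attribute, if any.
    Base64 data is returned as a single value because it is not
    backslash-separated.
    """
    stripped = (text or "").strip()
    if not stripped:
        return []
    if binary == "base64":
        return [stripped]
    return stripped.split("\\")


def has_undefined_length(attributes):
    """For 'sequence' or 'item' start tags: True if 'len' is missing."""
    return "len" not in attributes


if __name__ == "__main__":
    print(split_values("256\\0\\8"))
```
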
LOGGING
The level of logging output of the various command line tools and
underlying libraries can be specified by the user. By default, only
errors and warnings are written to the standard error stream. With
option --verbose, informational messages such as processing details are
also reported. Option --debug can be used to get more details on the
internal activity, e.g. for debugging purposes. Other logging levels
can be selected using option --log-level. In --quiet mode, only fatal
errors are reported. In such very severe error events, the application
will usually terminate. For more details on the different logging
levels, see the documentation of module 'oflog'.

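The relationship between these options and the reported messages can
be illustrated with a small Python sketch. It is a model of the
description above, not the oflog implementation. It assumes that the
level names accepted by --log-level are ordered from most to least
severe and that at most one of the three flags is given; the function
names are hypothetical.

```python
# Level names as listed for --log-level; assumed to run from most to
# least severe.
LEVELS = ["fatal", "error", "warn", "info", "debug", "trace"]


def effective_level(quiet=False, verbose=False, debug=False):
    """Return the least severe level that is still reported."""
    if debug:
        return "debug"
    if verbose:
        return "info"
    if quiet:
        return "fatal"
    return "warn"  # default: errors and warnings only


def is_reported(message_level, threshold):
    """True if a message of `message_level` passes the threshold."""
    return LEVELS.index(message_level) <= LEVELS.index(threshold)


if __name__ == "__main__":
    print(is_reported("info", effective_level(verbose=True)))
```
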
If the logging output should be written to a file (optionally with
logfile rotation), to syslog (Unix) or to the event log (Windows),
option --log-config can be used. This configuration file also allows
directing only certain messages to a particular output stream and
filtering certain messages based on the module or application where
they are generated. An example configuration file is provided in
<etcdir>/logger.cfg.

COMMAND LINE
All command line tools use the following notation for parameters:
square brackets enclose optional values (0-1), three trailing dots
indicate that multiple values are allowed (1-n), and a combination of
both means 0 to n values.

Command line options are distinguished from parameters by a leading '+'
or '-' sign. Usually, the order and position of command line options
are arbitrary (i.e. they can appear anywhere). However, if options are
mutually exclusive, the rightmost appearance is used. This behaviour
conforms to the standard evaluation rules of common Unix shells.

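The "rightmost appearance wins" rule can be sketched in a few lines of
Python. This is an illustration of the rule only, not the DCMTK
command line parser; the function name is hypothetical, and the caller
is assumed to know which options form a mutually exclusive group.

```python
def resolve_exclusive(argv, group):
    """Return the rightmost option from `group` found in argv, or None."""
    chosen = None
    for arg in argv:
        if arg in group:
            chosen = arg  # a later occurrence overrides an earlier one
    return chosen


if __name__ == "__main__":
    encodings = {"+Eh", "--encode-hex", "+Eb", "--encode-base64"}
    print(resolve_exclusive(["+Wb", "+Eh", "in.dcm", "+Eb"], encodings))
```
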
In addition, one or more command files can be specified using an '@'
sign as a prefix to the filename (e.g. @command.txt). Such a command
argument is replaced by the content of the corresponding text file
(multiple whitespace characters are treated as a single separator
unless they appear between two quotation marks) prior to any further
evaluation. Please note that a command file cannot contain another
command file. This simple but effective approach allows common
combinations of options/parameters to be summarized and avoids long
and confusing command lines (an example is provided in file
<datadir>/dumppat.txt).

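The expansion of command files can be approximated as follows. This
Python sketch is not the DCMTK implementation: it uses shlex.split()
for quote-aware splitting, which also interprets backslashes and
single quotes and may therefore differ from the real parser in detail.
The function name is hypothetical. In line with the rule that a
command file cannot contain another command file, the sketch simply
leaves nested '@' entries unexpanded; how dcm2xml itself reacts to
such an entry is not covered here.

```python
import shlex


def expand_command_files(argv):
    """Replace each '@file' argument by the arguments stored in that file."""
    expanded = []
    for arg in argv:
        if arg.startswith("@") and len(arg) > 1:
            with open(arg[1:], encoding="utf-8") as handle:
                # Runs of whitespace act as one separator; quoted text
                # stays together. Nested '@file' entries are not expanded.
                expanded.extend(shlex.split(handle.read()))
        else:
            expanded.append(arg)
    return expanded
```

For example, a file containing the line `+Wb +Eb "my file.dcm"` would
contribute the three arguments +Wb, +Eb and my file.dcm.
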
ENVIRONMENT
The dcm2xml utility will attempt to load DICOM data dictionaries
specified in the DCMDICTPATH environment variable. By default, i.e. if
the DCMDICTPATH environment variable is not set, the file
<datadir>/dicom.dic will be loaded unless the dictionary is built into
the application (default for Windows).

The default behaviour should be preferred and the DCMDICTPATH
environment variable only used when alternative data dictionaries are
required. The DCMDICTPATH environment variable has the same format as
the Unix shell PATH variable in that a colon (':') separates entries.
On Windows systems, a semicolon (';') is used as a separator. The data
dictionary code will attempt to load each file specified in the
DCMDICTPATH environment variable. It is an error if no data dictionary
can be loaded.

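The format of DCMDICTPATH can be illustrated with a short Python
sketch that splits the variable into its entries. The function name
and the example paths are hypothetical, and returning None when the
variable is unset merely stands for "use the default dictionary" as
described above.

```python
import os


def dictionary_paths(environ=os.environ, windows=(os.name == "nt")):
    """Return the dictionary files listed in DCMDICTPATH, or None if unset."""
    value = environ.get("DCMDICTPATH")
    if not value:
        return None  # default: <datadir>/dicom.dic or the built-in dictionary
    separator = ";" if windows else ":"
    # Skip empty entries such as those produced by a trailing separator.
    return [entry for entry in value.split(separator) if entry]


if __name__ == "__main__":
    print(dictionary_paths({"DCMDICTPATH": "/opt/dict/dicom.dic:/opt/dict/private.dic"},
                           windows=False))
```
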
FILES
<datadir>/dcm2xml.dtd - Document Type Definition (DTD) file

SEE ALSO
xml2dcm(1), dcmconv(1)

COPYRIGHT
Copyright (C) 2002-2010 by OFFIS e.V., Escherweg 2, 26121 Oldenburg,
Germany.

Version 3.6.0                     6 Jan 2011                     dcm2xml(1)