dcm2xml(1)                        OFFIS DCMTK                       dcm2xml(1)

NAME
dcm2xml - Convert DICOM file and data set to XML

SYNOPSIS
dcm2xml [options] dcmfile-in [xmlfile-out]

DESCRIPTION
The dcm2xml utility converts the contents of a DICOM file (file format
or raw data set) to XML (Extensible Markup Language). There are two
output formats. The first one is specific to DCMTK, with its DTD
(Document Type Definition) described in the file dcm2xml.dtd. The
second one is the 'Native DICOM Model', which is specified for the
DICOM Application Hosting service found in DICOM part 19.

If dcm2xml reads a raw data set (DICOM data without a file format
meta-header), it attempts to guess the transfer syntax by examining the
first few bytes of the file. The transfer syntax cannot always be
guessed correctly, so it is better to convert a data set to a file
format whenever possible (using the dcmconv utility). It is also
possible to use the -f and -t[ieb] options to force dcm2xml to read a
data set with a particular transfer syntax.
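
The following Python sketch shows one way to assemble such a command line for a raw data set. The helper name build_dataset_args, the dictionary keys, and the file names are hypothetical; only the option names -f, -ti, -te and -tb come from this page. Nothing is executed; the function only returns the argument list.

```python
from typing import List, Optional

# Transfer syntax flags as listed under "input transfer syntax" on this page.
# The dictionary keys are illustrative names, not DCMTK terms.
_TS_FLAGS = {
    "implicit": "-ti",         # implicit VR little endian
    "explicit-little": "-te",  # explicit VR little endian
    "explicit-big": "-tb",     # explicit VR big endian
}


def build_dataset_args(infile: str, ts: str,
                       outfile: Optional[str] = None) -> List[str]:
    """Return an argument list that reads infile as a raw data set (-f)
    with a fixed transfer syntax instead of relying on guessing."""
    if ts not in _TS_FLAGS:
        raise ValueError(f"unknown transfer syntax key: {ts!r}")
    args = ["dcm2xml", "-f", _TS_FLAGS[ts], infile]
    if outfile is not None:
        args.append(outfile)  # without it, dcm2xml writes to stdout
    return args


print(build_dataset_args("dataset.bin", "implicit", "dataset.xml"))
```

Where dcm2xml is installed, the returned list could be handed to subprocess.run.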

PARAMETERS
  dcmfile-in   DICOM input filename to be converted

  xmlfile-out  XML output filename (default: stdout)

OPTIONS
general options
  -h    --help
          print this help text and exit

        --version
          print version information and exit

        --arguments
          print expanded command line arguments

  -q    --quiet
          quiet mode, print no warnings and errors

  -v    --verbose
          verbose mode, print processing details

  -d    --debug
          debug mode, print debug information

  -ll   --log-level  [l]evel: string constant
          (fatal, error, warn, info, debug, trace)
          use level l for the logger

  -lc   --log-config  [f]ilename: string
          use config file f for the logger

input options
input file format:

  +f    --read-file
          read file format or data set (default)

  +fo   --read-file-only
          read file format only

  -f    --read-dataset
          read data set without file meta information

input transfer syntax:

  -t=   --read-xfer-auto
          use TS recognition (default)

  -td   --read-xfer-detect
          ignore TS specified in the file meta header

  -te   --read-xfer-little
          read with explicit VR little endian TS

  -tb   --read-xfer-big
          read with explicit VR big endian TS

  -ti   --read-xfer-implicit
          read with implicit VR little endian TS

long tag values:

  +M    --load-all
          load very long tag values (e.g. pixel data)

  -M    --load-short
          do not load very long values (default)

  +R    --max-read-length  [k]bytes: integer (4..4194302, default: 4)
          set threshold for long values to k kbytes

processing options
specific character set:

  +Cr   --charset-require
          require declaration of extended charset (default)

  +Ca   --charset-assume  [c]harset: string
          assume charset c if no extended charset declared

  +Cc   --charset-check-all
          check all data elements with string values
          (default: only PN, LO, LT, SH, ST, UC and UT)

          # this option is only used for the mapping to an appropriate
          # XML character encoding, but not for the conversion to UTF-8

  +U8   --convert-to-utf8
          convert all element values that are affected
          by Specific Character Set (0008,0005) to UTF-8

          # requires support from an underlying character encoding library
          # (see output of --version on which one is available)

output options
general XML format:

  -dtk  --dcmtk-format
          output in DCMTK-specific format (default)

  -nat  --native-format
          output in Native DICOM Model format (part 19)

  +Xn   --use-xml-namespace
          add XML namespace declaration to root element

DCMTK-specific format (not with --native-format):

  +Xd   --add-dtd-reference
          add reference to document type definition (DTD)

  +Xe   --embed-dtd-content
          embed document type definition into XML document

  +Xf   --use-dtd-file  [f]ilename: string
          use specified DTD file (only with +Xe)
          (default: /usr/local/share/dcmtk/dcm2xml.dtd)

  +Wn   --write-element-name
          write name of the DICOM data elements (default)

  -Wn   --no-element-name
          do not write name of the DICOM data elements

  +Wb   --write-binary-data
          write binary data of OB and OW elements
          (default: off, be careful with --load-all)

encoding of binary data:

  +Eh   --encode-hex
          encode binary data as hex numbers
          (default for DCMTK-specific format)

  +Eu   --encode-uuid
          encode binary data as a UUID reference
          (default for Native DICOM Model)

  +Eb   --encode-base64
          encode binary data as Base64 (RFC 2045, MIME)
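
The restrictions noted in the option list ('not with --native-format', 'only with +Xe') can be checked before dcm2xml is invoked. The sketch below is a hypothetical pre-flight check and not part of DCMTK; the function name check_output_options and the message texts are assumptions, and the function does not model the case where both -dtk and -nat are given.

```python
# Options grouped under "DCMTK-specific format (not with --native-format)".
DCMTK_ONLY = {
    "+Xd", "--add-dtd-reference",
    "+Xe", "--embed-dtd-content",
    "+Xf", "--use-dtd-file",
    "+Wn", "--write-element-name",
    "-Wn", "--no-element-name",
    "+Wb", "--write-binary-data",
}
NATIVE = {"-nat", "--native-format"}
USE_DTD_FILE = {"+Xf", "--use-dtd-file"}
EMBED_DTD = {"+Xe", "--embed-dtd-content"}


def check_output_options(args):
    """Return a list of human-readable problems found in args."""
    opts = set(args)
    problems = []
    if opts & NATIVE:
        # Report every DCMTK-only option that was combined with -nat.
        for opt in sorted(opts & DCMTK_ONLY):
            problems.append(f"{opt} is listed as not usable with --native-format")
    if (opts & USE_DTD_FILE) and not (opts & EMBED_DTD):
        problems.append("+Xf/--use-dtd-file is documented as 'only with +Xe'")
    return problems


print(check_output_options(["-nat", "+Wb", "in.dcm"]))
```

An empty list means that none of the documented restrictions were violated.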

DCMTK FORMAT
The basic structure of the DCMTK-specific XML output created from a
DICOM file looks like the following:

  <?xml version="1.0" encoding="ISO-8859-1"?>
  <!DOCTYPE file-format SYSTEM "dcm2xml.dtd">
  <file-format xmlns="http://dicom.offis.de/dcmtk">
    <meta-header xfer="1.2.840.10008.1.2.1" name="Little Endian Explicit">
      <element tag="0002,0000" vr="UL" vm="1" len="4"
               name="MetaElementGroupLength">
        166
      </element>
      ...
      <element tag="0002,0013" vr="SH" vm="1" len="16"
               name="ImplementationVersionName">
        OFFIS_DCMTK_353
      </element>
    </meta-header>
    <data-set xfer="1.2.840.10008.1.2" name="Little Endian Implicit">
      <element tag="0008,0005" vr="CS" vm="1" len="10"
               name="SpecificCharacterSet">
        ISO_IR 100
      </element>
      ...
      <sequence tag="0028,3010" vr="SQ" card="2" name="VOILUTSequence">
        <item card="3">
          <element tag="0028,3002" vr="xs" vm="3" len="6"
                   name="LUTDescriptor">
            256\0\8
          </element>
          ...
        </item>
        ...
      </sequence>
      ...
      <element tag="7fe0,0010" vr="OW" vm="1" len="262144"
               name="PixelData" loaded="no" binary="hidden">
      </element>
    </data-set>
  </file-format>

The 'file-format' and 'meta-header' tags are absent for DICOM data
sets.
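
As an illustration of how this structure might be consumed, the following Python sketch lists the top-level elements of the data set using the standard library. The embedded sample is a shortened, hand-written variant of the structure above, and the function name list_elements is hypothetical. The sketch assumes the namespace declaration shown above is present on the root element.

```python
import xml.etree.ElementTree as ET

# Shortened sample following the structure shown above (no DOCTYPE).
SAMPLE = b"""<?xml version="1.0" encoding="ISO-8859-1"?>
<file-format xmlns="http://dicom.offis.de/dcmtk">
  <meta-header xfer="1.2.840.10008.1.2.1" name="Little Endian Explicit">
    <element tag="0002,0013" vr="SH" vm="1" len="16"
             name="ImplementationVersionName">OFFIS_DCMTK_353</element>
  </meta-header>
  <data-set xfer="1.2.840.10008.1.2" name="Little Endian Implicit">
    <element tag="0008,0005" vr="CS" vm="1" len="10"
             name="SpecificCharacterSet">ISO_IR 100</element>
    <element tag="7fe0,0010" vr="OW" vm="1" len="262144"
             name="PixelData" loaded="no" binary="hidden"></element>
  </data-set>
</file-format>
"""

NS = {"d": "http://dicom.offis.de/dcmtk"}


def list_elements(xml_bytes):
    """Return (tag, vr, name, text) for each top-level data set element."""
    root = ET.fromstring(xml_bytes)
    # For a raw data set the root is 'data-set' itself, because the
    # 'file-format' and 'meta-header' tags are absent.
    if root.tag.endswith("data-set"):
        dataset = root
    else:
        dataset = root.find("d:data-set", NS)
    rows = []
    for el in dataset.findall("d:element", NS):
        rows.append((el.get("tag"), el.get("vr"), el.get("name"),
                     (el.text or "").strip()))
    return rows


for row in list_elements(SAMPLE):
    print(row)
```

Elements nested in sequences are not visited here; a recursive walk over 'sequence' and 'item' children would be needed for that.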

XML Encoding
Attributes with very large value fields (e.g. pixel data) are not
loaded by default. They can be identified by the additional attribute
'loaded' with a value of 'no' (see example above). The command line
option --load-all forces all value fields to be loaded, including the
very long ones.

Furthermore, binary information of OB and OW attributes is not written
to the XML output file by default. These elements can be identified by
the additional attribute 'binary' with a value of 'hidden' (default is
'no'). The command line option --write-binary-data causes binary value
fields to be printed as well (attribute value is 'yes' or 'base64').
However, be careful when using this option together with --load-all
because of the large amounts of pixel data that might be printed to the
output. Please note that in this context element values with a VR of OD
or OF are not regarded as 'binary information'.

Multiple values (i.e. where the DICOM value multiplicity is greater
than 1) are separated by a backslash '\' (except for Base64 encoded
data). The 'len' attribute indicates the number of bytes for the
particular value field as stored in the DICOM data set, i.e. it might
deviate from the XML encoded value length, e.g. because non-significant
padding has been removed. If this attribute is missing in 'sequence' or
'item' start tags, the corresponding DICOM element has been stored with
undefined length.
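
The attribute conventions described in this section can be turned into a small reader. The Python sketch below is illustrative only: the function name element_values and the returned status strings are assumptions, and the sample elements are hand-written without a namespace declaration.

```python
import xml.etree.ElementTree as ET


def element_values(el):
    """Return (status, values) for one <element> of the DCMTK format.

    status is 'not-loaded', 'hidden', 'base64' or 'values'.
    """
    if el.get("loaded") == "no":
        return "not-loaded", []   # value field was too long to be loaded
    binary = el.get("binary", "no")
    if binary == "hidden":
        return "hidden", []       # binary data was not written
    text = (el.text or "").strip()
    if binary == "base64":
        return "base64", [text]   # Base64 data is not backslash-separated
    # Multiple values are separated by a backslash.
    return "values", text.split("\\") if text else []


lut = ET.fromstring(
    r'<element tag="0028,3002" vr="xs" vm="3" len="6" '
    r'name="LUTDescriptor">256\0\8</element>'
)
print(element_values(lut))
```

Note that the 'loaded' check comes first, since an element such as the PixelData example above carries both loaded="no" and binary="hidden".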

NATIVE DICOM MODEL FORMAT
The description of the Native DICOM Model format can be found in the
DICOM standard, part 19 ('Application Hosting').

Bulk Data
Binary data, i.e. DICOM element values with Value Representations (VR)
of OB or OW, as well as OD, OF and UN values, are by default not
written to the XML output because of their size. Instead, for each
element, a new Universally Unique Identifier (UUID) is generated and
written as an attribute of a <BulkData> XML element. So far, there is
no possibility to write an additional file to hold the binary data for
each of the binary data chunks. This is not required by the standard;
however, it might be useful for implementing an Application Hosting
interface, so this feature may be available in future versions of
dcm2xml.

In addition, Supplement 163 (Store Over the Web by Representational
State Transfer Services) introduces a new <InlineBinary> XML element
that allows for encoding binary data as Base64. Currently, the command
line option --encode-base64 enables this encoding for the following
VRs: OB, OD, OF, OW, and UN.
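
A consumer of the Native DICOM Model output has to distinguish plain values, <BulkData> references and <InlineBinary> content. The following Python sketch shows one possible way to do so. The sample document is hand-written after the element names used in DICOM part 19 (NativeDicomModel, DicomAttribute, Value) and is not actual dcm2xml output; the UUID is made up, and no XML namespace is assumed.

```python
import base64
import xml.etree.ElementTree as ET

# Hand-written illustration, not produced by dcm2xml.
SAMPLE = """<NativeDicomModel>
  <DicomAttribute tag="00080005" vr="CS" keyword="SpecificCharacterSet">
    <Value number="1">ISO_IR 100</Value>
  </DicomAttribute>
  <DicomAttribute tag="7FE00010" vr="OW" keyword="PixelData">
    <BulkData uuid="6f0a1c2e-8b7d-4c55-9a3e-2d1f0b9c7e11"/>
  </DicomAttribute>
  <DicomAttribute tag="00420011" vr="OB" keyword="EncapsulatedDocument">
    <InlineBinary>SGVsbG8=</InlineBinary>
  </DicomAttribute>
</NativeDicomModel>"""


def classify(attr):
    """Return ('bulk', uuid), ('inline', bytes) or ('values', [str, ...])."""
    bulk = attr.find("BulkData")
    if bulk is not None:
        return "bulk", bulk.get("uuid")          # reference only, no data
    inline = attr.find("InlineBinary")
    if inline is not None:
        return "inline", base64.b64decode(inline.text or "")
    return "values", [v.text or "" for v in attr.findall("Value")]


root = ET.fromstring(SAMPLE)
for attr in root.findall("DicomAttribute"):
    print(attr.get("keyword"), classify(attr))
```

Since no additional file is written for the bulk data, a 'bulk' result carries only the identifier and the binary content itself is not recoverable from the XML output.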

Known Issues
In addition to what is written in the above section on 'Bulk Data',
there are further known issues with the current implementation of the
Native DICOM Model format. For example, large element values with a VR
other than OB, OD, OF, OW or UN are currently never written as bulk
data, although it might be useful, e.g. for very long text elements
(especially UT) or very long numeric fields (of various VRs).

NOTES
Character Encoding
The XML encoding is determined automatically from the DICOM attribute
(0008,0005) 'Specific Character Set' using the following mapping:

  ASCII         (ISO_IR 6)    =>  "UTF-8"
  UTF-8         "ISO_IR 192"  =>  "UTF-8"
  ISO Latin 1   "ISO_IR 100"  =>  "ISO-8859-1"
  ISO Latin 2   "ISO_IR 101"  =>  "ISO-8859-2"
  ISO Latin 3   "ISO_IR 109"  =>  "ISO-8859-3"
  ISO Latin 4   "ISO_IR 110"  =>  "ISO-8859-4"
  ISO Latin 5   "ISO_IR 148"  =>  "ISO-8859-9"
  Cyrillic      "ISO_IR 144"  =>  "ISO-8859-5"
  Arabic        "ISO_IR 127"  =>  "ISO-8859-6"
  Greek         "ISO_IR 126"  =>  "ISO-8859-7"
  Hebrew        "ISO_IR 138"  =>  "ISO-8859-8"

If this DICOM attribute is missing in the input file, although needed,
option --charset-assume can be used to specify an appropriate character
set manually (using one of the DICOM defined terms). For reasons of
backward compatibility with previous versions of this tool, the
following terms are also supported and mapped automatically to the
associated DICOM defined terms: latin-1, latin-2, latin-3, latin-4,
latin-5, cyrillic, arabic, greek, hebrew.

Multiple character sets using code extension techniques are not
supported. If needed, option --convert-to-utf8 can be used to convert
the DICOM file or data set to UTF-8 encoding prior to the conversion to
XML format. This is also useful for DICOMDIR files where each directory
record can have a different character set.
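
For tools that post-process the XML output, the mapping above can be expressed as a lookup table. The Python sketch below mirrors the table and the list of legacy terms; the function name xml_encoding_for is hypothetical, and pairing each legacy term with the defined term of the same name in the table is an assumption drawn from that table.

```python
# DICOM defined term -> XML encoding, as listed in the table above.
XML_ENCODING = {
    "ISO_IR 6": "UTF-8",
    "ISO_IR 192": "UTF-8",
    "ISO_IR 100": "ISO-8859-1",
    "ISO_IR 101": "ISO-8859-2",
    "ISO_IR 109": "ISO-8859-3",
    "ISO_IR 110": "ISO-8859-4",
    "ISO_IR 148": "ISO-8859-9",
    "ISO_IR 144": "ISO-8859-5",
    "ISO_IR 127": "ISO-8859-6",
    "ISO_IR 126": "ISO-8859-7",
    "ISO_IR 138": "ISO-8859-8",
}

# Legacy terms accepted for backward compatibility (assumed pairing).
LEGACY_TERMS = {
    "latin-1": "ISO_IR 100",
    "latin-2": "ISO_IR 101",
    "latin-3": "ISO_IR 109",
    "latin-4": "ISO_IR 110",
    "latin-5": "ISO_IR 148",
    "cyrillic": "ISO_IR 144",
    "arabic": "ISO_IR 127",
    "greek": "ISO_IR 126",
    "hebrew": "ISO_IR 138",
}


def xml_encoding_for(term):
    """Return the XML encoding for a charset term, or None if unmapped."""
    defined = LEGACY_TERMS.get(term.lower(), term)
    # Values with several backslash-separated terms (code extensions)
    # are not in the table and therefore yield None.
    return XML_ENCODING.get(defined)


print(xml_encoding_for("ISO_IR 148"))
print(xml_encoding_for("greek"))
```

A result of None corresponds to the unsupported cases, for which --convert-to-utf8 is the suggested route.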

LOGGING
The level of logging output of the various command line tools and
underlying libraries can be specified by the user. By default, only
errors and warnings are written to the standard error stream. With
option --verbose, informational messages such as processing details are
also reported. Option --debug can be used to get more details on the
internal activity, e.g. for debugging purposes. Other logging levels
can be selected using option --log-level. In --quiet mode only fatal
errors are reported. In such very severe error events, the application
will usually terminate. For more details on the different logging
levels, see documentation of module 'oflog'.

If the logging output should be written to a file (optionally with
logfile rotation), to syslog (Unix) or to the event log (Windows),
option --log-config can be used. This configuration file also allows
for directing only certain messages to a particular output stream and
for filtering certain messages based on the module or application where
they are generated. An example configuration file is provided in
<etcdir>/logger.cfg.

COMMAND LINE
All command line tools use the following notation for parameters:
square brackets enclose optional values (0-1), three trailing dots
indicate that multiple values are allowed (1-n), a combination of both
means 0 to n values.

Command line options are distinguished from parameters by a leading '+'
or '-' sign, respectively. Usually, order and position of command line
options are arbitrary (i.e. they can appear anywhere). However, if
options are mutually exclusive the rightmost appearance is used. This
behavior conforms to the standard evaluation rules of common Unix
shells.
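
The 'rightmost appearance is used' rule can be illustrated with a few lines of Python. This is a sketch of the rule only, not of the DCMTK command line parser; the function name resolve_exclusive is hypothetical, and the option group is taken from the 'input transfer syntax' options on this page.

```python
def resolve_exclusive(args, group):
    """Return the option from group that appears last in args, or None."""
    chosen = None
    for arg in args:
        if arg in group:
            chosen = arg  # a later occurrence overrides an earlier one
    return chosen


# Short forms of the mutually exclusive input transfer syntax options.
TS_OPTIONS = {"-t=", "-td", "-te", "-tb", "-ti"}

print(resolve_exclusive(["-te", "in.dcm", "-ti"], TS_OPTIONS))
```

In this example -ti would take effect, because it is the rightmost option of its group.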

In addition, one or more command files can be specified using an '@'
sign as a prefix to the filename (e.g. @command.txt). Such a command
argument is replaced by the content of the corresponding text file
(multiple whitespace characters are treated as a single separator
unless they appear between two quotation marks) prior to any further
evaluation. Please note that a command file cannot contain another
command file. This simple but effective approach allows one to
summarize common combinations of options/parameters and avoids long and
confusing command lines (an example is provided in file
<datadir>/dumppat.txt).
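
The expansion of command files can be approximated as follows. This Python sketch uses shlex.split for tokenizing, which follows POSIX shell quoting rules and may therefore differ in detail from the DCMTK implementation (for instance in the handling of backslashes); the function name expand_command_files is hypothetical.

```python
import shlex
from pathlib import Path


def expand_command_files(args):
    """Replace each '@file' argument by the tokens found in that file."""
    expanded = []
    for arg in args:
        if arg.startswith("@") and len(arg) > 1:
            text = Path(arg[1:]).read_text()
            # shlex.split treats runs of whitespace as one separator and
            # keeps text between quotation marks together.
            # Tokens starting with '@' are NOT expanded again, since a
            # command file cannot contain another command file.
            expanded.extend(shlex.split(text))
        else:
            expanded.append(arg)
    return expanded
```

For example, a file command.txt containing the line `-v  +M "my file.dcm"` would turn `["dcm2xml", "@command.txt", "out.xml"]` into `["dcm2xml", "-v", "+M", "my file.dcm", "out.xml"]`.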

ENVIRONMENT
The dcm2xml utility will attempt to load DICOM data dictionaries
specified in the DCMDICTPATH environment variable. By default, i.e. if
the DCMDICTPATH environment variable is not set, the file
<datadir>/dicom.dic will be loaded unless the dictionary is built into
the application (default for Windows).

The default behavior should be preferred and the DCMDICTPATH
environment variable only used when alternative data dictionaries are
required. The DCMDICTPATH environment variable has the same format as
the Unix shell PATH variable in that a colon (':') separates entries.
On Windows systems, a semicolon (';') is used as a separator. The data
dictionary code will attempt to load each file specified in the
DCMDICTPATH environment variable. It is an error if no data dictionary
can be loaded.
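
To inspect which dictionary files a given DCMDICTPATH setting refers to, the value can be split in the same way as PATH. The Python sketch below is an illustration; the function name dictionary_paths and the example paths are hypothetical. It relies on os.pathsep, which is ':' on Unix and ';' on Windows, matching the separators described above.

```python
import os


def dictionary_paths(env=None, sep=os.pathsep):
    """Return the dictionary files named in DCMDICTPATH, in order."""
    env = os.environ if env is None else env
    value = env.get("DCMDICTPATH")
    if not value:
        return []  # not set: the default dictionary applies instead
    return [entry for entry in value.split(sep) if entry]


example = {"DCMDICTPATH": "/opt/dict/dicom.dic:/opt/dict/private.dic"}
print(dictionary_paths(example, sep=":"))
```

An empty result only means that the variable is unset or empty; it does not tell whether a built-in or default dictionary is available.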

FILES
<datadir>/dcm2xml.dtd - Document Type Definition (DTD) file

SEE ALSO
xml2dcm(1), dcmconv(1)

COPYRIGHT
Copyright (C) 2002-2016 by OFFIS e.V., Escherweg 2, 26121 Oldenburg,
Germany.

Version 3.6.2                   Fri Jul 14 2017                     dcm2xml(1)