dcm2xml(1)                        OFFIS DCMTK                       dcm2xml(1)

NAME

       dcm2xml - Convert DICOM file and data set to XML

SYNOPSIS

       dcm2xml [options] dcmfile-in [xmlfile-out]

DESCRIPTION

       The dcm2xml utility converts the contents of a DICOM file (file format
       or raw data set) to XML (Extensible Markup Language). There are two
       output formats. The first one is specific to DCMTK, with its DTD
       (Document Type Definition) described in the file dcm2xml.dtd. The
       second one is the 'Native DICOM Model', which is specified for the
       DICOM Application Hosting service in DICOM part 19.

       If dcm2xml reads a raw data set (DICOM data without a file format
       meta-header), it attempts to guess the transfer syntax by examining
       the first few bytes of the file. The transfer syntax cannot always be
       guessed correctly, so it is better to convert a data set to a file
       format whenever possible (using the dcmconv utility). Alternatively,
       the -f and -t[ieb] options force dcm2xml to read a data set with a
       particular transfer syntax.
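
       As an illustration of forcing the input interpretation, the
       following Python sketch assembles a dcm2xml argument list for a raw
       data set. It only builds the list and does not run the tool. The
       helper name build_input_args and the keys 'implicit', 'little' and
       'big' are assumptions made for this example; the flags themselves
       (-f, -ti, -te, -tb) are the ones listed under OPTIONS.

```python
def build_input_args(dcmfile_in, xmlfile_out=None, raw_dataset=False, xfer=None):
    """Return a dcm2xml argument list (hypothetical helper, nothing is executed)."""
    # Keys are illustrative names; values are the documented short options.
    xfer_flags = {"implicit": "-ti", "little": "-te", "big": "-tb"}
    args = ["dcm2xml"]
    if raw_dataset:
        args.append("-f")  # --read-dataset: no file meta information expected
    if xfer is not None:
        if xfer not in xfer_flags:
            raise ValueError(f"unknown transfer syntax key: {xfer!r}")
        args.append(xfer_flags[xfer])
    args.append(dcmfile_in)
    if xmlfile_out is not None:
        args.append(xmlfile_out)  # omitted: output goes to stdout
    return args


print(build_input_args("raw.dcm", "out.xml", raw_dataset=True, xfer="implicit"))
```

       The resulting list could be passed to subprocess.run() on a system
       where dcm2xml is installed.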

PARAMETERS

       dcmfile-in   DICOM input filename to be converted

       xmlfile-out  XML output filename (default: stdout)

OPTIONS

   general options
         -h    --help
                 print this help text and exit

               --version
                 print version information and exit

               --arguments
                 print expanded command line arguments

         -q    --quiet
                 quiet mode, print no warnings and errors

         -v    --verbose
                 verbose mode, print processing details

         -d    --debug
                 debug mode, print debug information

         -ll   --log-level  [l]evel: string constant
                 (fatal, error, warn, info, debug, trace)
                 use level l for the logger

         -lc   --log-config  [f]ilename: string
                 use config file f for the logger

   input options
       input file format:

         +f    --read-file
                 read file format or data set (default)

         +fo   --read-file-only
                 read file format only

         -f    --read-dataset
                 read data set without file meta information

       input transfer syntax:

         -t=   --read-xfer-auto
                 use TS recognition (default)

         -td   --read-xfer-detect
                 ignore TS specified in the file meta header

         -te   --read-xfer-little
                 read with explicit VR little endian TS

         -tb   --read-xfer-big
                 read with explicit VR big endian TS

         -ti   --read-xfer-implicit
                 read with implicit VR little endian TS

       long tag values:

         +M    --load-all
                 load very long tag values (e.g. pixel data)

         -M    --load-short
                 do not load very long values (default)

         +R    --max-read-length  [k]bytes: integer (4..4194302, default: 4)
                 set threshold for long values to k kbytes

   processing options
       specific character set:

         +Cr   --charset-require
                 require declaration of extended charset (default)

         +Ca   --charset-assume  [c]harset: string
                 assume charset c if no extended charset declared

         +Cc   --charset-check-all
                 check all data elements with string values
                 (default: only PN, LO, LT, SH, ST, UC and UT)

                 # this option is only used for the mapping to an appropriate
                 # XML character encoding, but not for the conversion to UTF-8

         +U8   --convert-to-utf8
                 convert all element values that are affected
                 by Specific Character Set (0008,0005) to UTF-8

                 # requires support from an underlying character encoding library
                 # (see output of --version on which one is available)

   output options
       general XML format:

         -dtk  --dcmtk-format
                 output in DCMTK-specific format (default)

         -nat  --native-format
                 output in Native DICOM Model format (part 19)

         +Xn   --use-xml-namespace
                 add XML namespace declaration to root element

       DCMTK-specific format (not with --native-format):

         +Xd   --add-dtd-reference
                 add reference to document type definition (DTD)

         +Xe   --embed-dtd-content
                 embed document type definition into XML document

         +Xf   --use-dtd-file  [f]ilename: string
                 use specified DTD file (only with +Xe)
                 (default: /usr/local/share/dcmtk/dcm2xml.dtd)

         +Wn   --write-element-name
                 write name of the DICOM data elements (default)

         -Wn   --no-element-name
                 do not write name of the DICOM data elements

         +Wb   --write-binary-data
                 write binary data of OB and OW elements
                 (default: off, be careful with --load-all)

       encoding of binary data:

         +Eh   --encode-hex
                 encode binary data as hex numbers
                 (default for DCMTK-specific format)

         +Eu   --encode-uuid
                 encode binary data as a UUID reference
                 (default for Native DICOM Model)

         +Eb   --encode-base64
                 encode binary data as Base64 (RFC 2045, MIME)
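
       The interplay of the output format and the default binary encoding
       listed above can be sketched in Python as follows. The helper
       build_output_args and its keyword names are assumptions for
       illustration only. Rejecting +Xd together with the native format is a
       choice made in this sketch, based on the 'not with --native-format'
       note; how dcm2xml itself reacts to that combination is not stated
       here.

```python
FORMAT_FLAGS = {"dcmtk": "-dtk", "native": "-nat"}
ENCODING_FLAGS = {"hex": "+Eh", "uuid": "+Eu", "base64": "+Eb"}
# Defaults as documented: hex for the DCMTK format, UUID for the native model.
DEFAULT_ENCODING = {"dcmtk": "hex", "native": "uuid"}


def build_output_args(fmt="dcmtk", encoding=None, add_dtd_reference=False):
    """Return (option list, effective binary encoding); hypothetical helper."""
    if fmt not in FORMAT_FLAGS:
        raise ValueError(f"unknown format: {fmt!r}")
    if encoding is not None and encoding not in ENCODING_FLAGS:
        raise ValueError(f"unknown encoding: {encoding!r}")
    if fmt == "native" and add_dtd_reference:
        # Sketch-level policy: +Xd belongs to the DCMTK-specific format.
        raise ValueError("+Xd applies to the DCMTK-specific format only")
    args = [FORMAT_FLAGS[fmt]]
    if add_dtd_reference:
        args.append("+Xd")
    if encoding is not None:
        args.append(ENCODING_FLAGS[encoding])
    effective = encoding if encoding is not None else DEFAULT_ENCODING[fmt]
    return args, effective


print(build_output_args("native", "base64"))
print(build_output_args())
```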

DCMTK Format

       The basic structure of the DCMTK-specific XML output created from a
       DICOM file looks like the following:

       <?xml version="1.0" encoding="ISO-8859-1"?>
       <!DOCTYPE file-format SYSTEM "dcm2xml.dtd">
       <file-format xmlns="http://dicom.offis.de/dcmtk">
         <meta-header xfer="1.2.840.10008.1.2.1" name="Little Endian Explicit">
           <element tag="0002,0000" vr="UL" vm="1" len="4"
                    name="MetaElementGroupLength">
             166
           </element>
           ...
           <element tag="0002,0013" vr="SH" vm="1" len="16"
                    name="ImplementationVersionName">
             OFFIS_DCMTK_353
           </element>
         </meta-header>
         <data-set xfer="1.2.840.10008.1.2" name="Little Endian Implicit">
           <element tag="0008,0005" vr="CS" vm="1" len="10"
                    name="SpecificCharacterSet">
             ISO_IR 100
           </element>
           ...
           <sequence tag="0028,3010" vr="SQ" card="2" name="VOILUTSequence">
             <item card="3">
               <element tag="0028,3002" vr="xs" vm="3" len="6"
                        name="LUTDescriptor">
                 256\0\8
               </element>
               ...
             </item>
             ...
           </sequence>
           ...
           <element tag="7fe0,0010" vr="OW" vm="1" len="262144"
                    name="PixelData" loaded="no" binary="hidden">
           </element>
         </data-set>
       </file-format>

       The 'file-format' and 'meta-header' tags are absent for DICOM data
       sets.

   XML Encoding
       Attributes with very large value fields (e.g. pixel data) are not
       loaded by default. They can be identified by the additional attribute
       'loaded' with a value of 'no' (see example above). The command line
       option --load-all forces all value fields to be loaded, including the
       very long ones.

       Furthermore, binary information of OB and OW attributes is not written
       to the XML output file by default. These elements can be identified by
       the additional attribute 'binary' with a value of 'hidden' (default is
       'no'). The command line option --write-binary-data causes binary value
       fields to be written as well (attribute value is 'yes' or 'base64').
       Be careful when using this option together with --load-all because
       large amounts of pixel data might be written to the output. Please
       note that in this context element values with a VR of OD or OF are not
       regarded as 'binary information'.

       Multiple values (i.e. where the DICOM value multiplicity is greater
       than 1) are separated by a backslash '\' (except for Base64 encoded
       data). The 'len' attribute indicates the number of bytes for the
       particular value field as stored in the DICOM data set, i.e. it might
       deviate from the XML encoded value length, e.g. because
       non-significant padding has been removed. If this attribute is missing
       in 'sequence' or 'item' start tags, the corresponding DICOM element has
       been stored with undefined length.
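
       A consumer of the DCMTK-specific format might read the attributes
       described above roughly as in the following Python sketch. It uses
       only the standard library. The embedded sample is a shortened,
       hand-written data set in the style of the example above, not actual
       tool output, and the function names are assumptions. Namespace
       prefixes are stripped so that the code works with or without
       --use-xml-namespace.

```python
import xml.etree.ElementTree as ET

# Hand-written sample in the style of the DCMTK-specific format.
SAMPLE = rb"""<?xml version="1.0" encoding="ISO-8859-1"?>
<data-set xfer="1.2.840.10008.1.2" name="Little Endian Implicit">
  <element tag="0008,0005" vr="CS" vm="1" len="10" name="SpecificCharacterSet">
    ISO_IR 100
  </element>
  <element tag="0028,3002" vr="xs" vm="3" len="6" name="LUTDescriptor">
    256\0\8
  </element>
  <element tag="7fe0,0010" vr="OW" vm="1" len="262144"
           name="PixelData" loaded="no" binary="hidden">
  </element>
</data-set>
"""


def local_name(tag):
    """Drop a '{namespace}' prefix if present."""
    return tag.rsplit("}", 1)[-1]


def summarize_elements(xml_bytes):
    """Collect tag, VR, values and the 'loaded'/'binary' state per element."""
    rows = []
    for node in ET.fromstring(xml_bytes).iter():
        if local_name(node.tag) != "element":
            continue
        text = (node.text or "").strip()
        binary = node.get("binary", "no")  # documented default is 'no'
        if not text:
            values = []
        elif binary == "base64":
            values = [text]  # Base64 data is not backslash-separated
        else:
            values = text.split("\\")
        rows.append({
            "tag": node.get("tag"),
            "vr": node.get("vr"),
            "values": values,
            "loaded": node.get("loaded") != "no",
            "binary": binary,
        })
    return rows


for row in summarize_elements(SAMPLE):
    print(row)
```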

Native DICOM Model Format

       The description of the Native DICOM Model format can be found in the
       DICOM standard, part 19 ('Application Hosting').

   Bulk Data
       Binary data, i.e. DICOM element values with Value Representations (VR)
       of OB or OW, as well as OD, OF and UN values, are by default not
       written to the XML output because of their size. Instead, for each
       element, a new Universally Unique Identifier (UUID) is generated and
       written as an attribute of a <BulkData> XML element. So far, there is
       no possibility to write an additional file to hold the binary data for
       each of the binary data chunks. This is not required by the standard;
       however, it might be useful for implementing an Application Hosting
       interface, so this feature may be available in future versions of
       dcm2xml.

       In addition, Supplement 163 (Store Over the Web by Representational
       State Transfer Services) introduces a new <InlineBinary> XML element
       that allows for encoding binary data as Base64. Currently, the command
       line option --encode-base64 enables this encoding for the following
       VRs: OB, OD, OF, OW, and UN.

   Known Issues
       In addition to what is written in the above section on 'Bulk Data',
       there are further known issues with the current implementation of the
       Native DICOM Model format. For example, large element values with a VR
       other than OB, OD, OF, OW or UN are currently never written as bulk
       data, although it might be useful, e.g. for very long text elements
       (especially UT) or very long numeric fields (of various VRs).
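
       The two ways of representing binary values in the Native DICOM Model
       (a <BulkData> UUID reference or <InlineBinary> Base64 content) can be
       told apart as in the following Python sketch. The fragment is
       hand-written for illustration and is an assumption about typical
       output: the element and attribute names (DicomAttribute, tag, vr,
       keyword) follow DICOM part 19, the UUID is made up, and no XML
       namespace is declared.

```python
import base64
import xml.etree.ElementTree as ET

# Hand-written fragment; the UUID and values are invented for this example.
FRAGMENT = """<NativeDicomModel>
  <DicomAttribute tag="00420011" vr="OB" keyword="EncapsulatedDocument">
    <InlineBinary>SGVsbG8=</InlineBinary>
  </DicomAttribute>
  <DicomAttribute tag="7FE00010" vr="OW" keyword="PixelData">
    <BulkData uuid="5a2a2b8e-0c1f-4a53-9d0c-2f6d3f1f7a10"/>
  </DicomAttribute>
</NativeDicomModel>"""


def binary_values(xml_text):
    """Map attribute tag -> ('inline', bytes) or ('bulk', uuid string)."""
    result = {}
    for attr in ET.fromstring(xml_text).iter("DicomAttribute"):
        inline = attr.find("InlineBinary")
        bulk = attr.find("BulkData")
        if inline is not None:
            # Base64 content as enabled by --encode-base64
            result[attr.get("tag")] = ("inline", base64.b64decode(inline.text or ""))
        elif bulk is not None:
            # Only a reference; the data itself is not part of the XML
            result[attr.get("tag")] = ("bulk", bulk.get("uuid"))
    return result


print(binary_values(FRAGMENT))
```

       Since dcm2xml does not write the referenced bulk data to a separate
       file, a 'bulk' entry in this sketch cannot be resolved to actual
       bytes.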

NOTES

   Character Encoding
       The XML encoding is determined automatically from the DICOM attribute
       (0008,0005) 'Specific Character Set' using the following mapping:

       ASCII         (ISO_IR 6)    =>  "UTF-8"
       UTF-8         "ISO_IR 192"  =>  "UTF-8"
       ISO Latin 1   "ISO_IR 100"  =>  "ISO-8859-1"
       ISO Latin 2   "ISO_IR 101"  =>  "ISO-8859-2"
       ISO Latin 3   "ISO_IR 109"  =>  "ISO-8859-3"
       ISO Latin 4   "ISO_IR 110"  =>  "ISO-8859-4"
       ISO Latin 5   "ISO_IR 148"  =>  "ISO-8859-9"
       Cyrillic      "ISO_IR 144"  =>  "ISO-8859-5"
       Arabic        "ISO_IR 127"  =>  "ISO-8859-6"
       Greek         "ISO_IR 126"  =>  "ISO-8859-7"
       Hebrew        "ISO_IR 138"  =>  "ISO-8859-8"

       If this DICOM attribute is missing in the input file, although needed,
       option --charset-assume can be used to specify an appropriate character
       set manually (using one of the DICOM defined terms). For reasons of
       backward compatibility with previous versions of this tool, the
       following terms are also supported and mapped automatically to the
       associated DICOM defined terms: latin-1, latin-2, latin-3, latin-4,
       latin-5, cyrillic, arabic, greek, hebrew.

       Multiple character sets using code extension techniques are not
       supported. If needed, option --convert-to-utf8 can be used to convert
       the DICOM file or data set to UTF-8 encoding prior to the conversion to
       XML format. This is also useful for DICOMDIR files where each directory
       record can have a different character set.
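
       The mapping table in this section can be expressed as a simple lookup,
       as in the following Python sketch. The pairing of the legacy terms
       (latin-1, cyrillic, ...) with DICOM defined terms is inferred from the
       names in the first column of the table and is therefore an
       assumption, as is the function name xml_encoding_for.

```python
# DICOM defined term -> XML encoding, copied from the table above.
XML_ENCODING = {
    "ISO_IR 6": "UTF-8",
    "ISO_IR 192": "UTF-8",
    "ISO_IR 100": "ISO-8859-1",
    "ISO_IR 101": "ISO-8859-2",
    "ISO_IR 109": "ISO-8859-3",
    "ISO_IR 110": "ISO-8859-4",
    "ISO_IR 148": "ISO-8859-9",
    "ISO_IR 144": "ISO-8859-5",
    "ISO_IR 127": "ISO-8859-6",
    "ISO_IR 126": "ISO-8859-7",
    "ISO_IR 138": "ISO-8859-8",
}

# Legacy terms accepted by --charset-assume (pairing inferred from the table).
LEGACY_TERMS = {
    "latin-1": "ISO_IR 100",
    "latin-2": "ISO_IR 101",
    "latin-3": "ISO_IR 109",
    "latin-4": "ISO_IR 110",
    "latin-5": "ISO_IR 148",
    "cyrillic": "ISO_IR 144",
    "arabic": "ISO_IR 127",
    "greek": "ISO_IR 126",
    "hebrew": "ISO_IR 138",
}


def xml_encoding_for(charset):
    """Return the XML encoding name for a defined term or legacy term."""
    term = LEGACY_TERMS.get(charset, charset)
    if "\\" in term:
        # Multiple character sets (code extensions) are not supported.
        raise ValueError("code extension techniques are not supported")
    try:
        return XML_ENCODING[term]
    except KeyError:
        raise ValueError(f"no XML encoding known for {charset!r}") from None


print(xml_encoding_for("ISO_IR 148"))
print(xml_encoding_for("greek"))
```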

LOGGING

       The level of logging output of the various command line tools and
       underlying libraries can be specified by the user. By default, only
       errors and warnings are written to the standard error stream. With
       option --verbose, informational messages such as processing details
       are also reported. Option --debug can be used to get more details on
       the internal activity, e.g. for debugging purposes. Other logging
       levels can be selected using option --log-level. In --quiet mode, only
       fatal errors are reported. In such very severe error events, the
       application will usually terminate. For more details on the different
       logging levels, see the documentation of module 'oflog'.

       If the logging output should be written to a file (optionally with
       logfile rotation), to syslog (Unix) or to the event log (Windows),
       option --log-config can be used. This configuration file also allows
       for directing only certain messages to a particular output stream and
       for filtering certain messages based on the module or application
       where they are generated. An example configuration file is provided in
       <etcdir>/logger.cfg.

COMMAND LINE

       All command line tools use the following notation for parameters:
       square brackets enclose optional values (0-1), three trailing dots
       indicate that multiple values are allowed (1-n), a combination of both
       means 0 to n values.

       Command line options are distinguished from parameters by a leading '+'
       or '-' sign. Usually, order and position of command line options are
       arbitrary (i.e. they can appear anywhere). However, if options are
       mutually exclusive, the rightmost appearance is used. This behavior
       conforms to the standard evaluation rules of common Unix shells.

       In addition, one or more command files can be specified using an '@'
       sign as a prefix to the filename (e.g. @command.txt). Such a command
       argument is replaced by the content of the corresponding text file
       (multiple whitespace characters are treated as a single separator
       unless they appear between two quotation marks) prior to any further
       evaluation. Please note that a command file cannot contain another
       command file. This simple but effective approach allows one to
       summarize common combinations of options/parameters and avoids long
       and confusing command lines (an example is provided in file
       <datadir>/dumppat.txt).
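
       The command file mechanism can be approximated with the following
       Python sketch. It is not the DCMTK implementation: shlex.split() is
       used as a stand-in for the tokenizer, so details such as escape
       characters and single quotes may behave differently. The function
       name expand_command_files is an assumption for this example.

```python
import shlex
import tempfile
from pathlib import Path


def expand_command_files(argv):
    """Replace each '@file' argument by the whitespace-separated file content."""
    expanded = []
    for arg in argv:
        if arg.startswith("@") and len(arg) > 1:
            text = Path(arg[1:]).read_text()
            # Tokens read from the file are taken as-is and not expanded
            # again, so a command file cannot pull in another command file.
            expanded.extend(shlex.split(text))
        else:
            expanded.append(arg)
    return expanded


with tempfile.TemporaryDirectory() as tmp:
    command_file = Path(tmp) / "command.txt"
    command_file.write_text('+U8   -nat\n"my file.dcm"\n')
    print(expand_command_files(["dcm2xml", "@" + str(command_file), "out.xml"]))
```

       In the usage part, the quoted file name survives as a single argument
       while the run of spaces between +U8 and -nat acts as one separator.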

ENVIRONMENT

       The dcm2xml utility will attempt to load DICOM data dictionaries
       specified in the DCMDICTPATH environment variable. By default, i.e. if
       the DCMDICTPATH environment variable is not set, the file
       <datadir>/dicom.dic will be loaded unless the dictionary is built into
       the application (default for Windows).

       The default behavior should be preferred and the DCMDICTPATH
       environment variable only used when alternative data dictionaries are
       required. The DCMDICTPATH environment variable has the same format as
       the Unix shell PATH variable in that a colon (':') separates entries.
       On Windows systems, a semicolon (';') is used as a separator. The data
       dictionary code will attempt to load each file specified in the
       DCMDICTPATH environment variable. It is an error if no data dictionary
       can be loaded.
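
       How DCMDICTPATH is split into individual dictionary files can be
       illustrated with the following Python sketch. The function name
       dictionary_candidates is an assumption, the placeholder
       '<datadir>/dicom.dic' is kept literally as in the text above, and
       treating an empty variable like an unset one is a simplification made
       for this example.

```python
import os


def dictionary_candidates(environ=None, default="<datadir>/dicom.dic",
                          separator=os.pathsep):
    """List the dictionary files that would be tried, in order."""
    env = os.environ if environ is None else environ
    value = env.get("DCMDICTPATH")
    if not value:
        # Not set (or empty, by assumption): fall back to the default file.
        return [default]
    # os.pathsep is ':' on Unix and ';' on Windows, matching the text above.
    return [entry for entry in value.split(separator) if entry]


print(dictionary_candidates({"DCMDICTPATH": "/opt/dicom.dic:/opt/private.dic"},
                            separator=":"))
print(dictionary_candidates({}))
```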

FILES

       <datadir>/dcm2xml.dtd - Document Type Definition (DTD) file

SEE ALSO

       xml2dcm(1), dcmconv(1)

COPYRIGHT

       Copyright (C) 2002-2016 by OFFIS e.V., Escherweg 2, 26121 Oldenburg,
       Germany.



Version 3.6.2                   Fri Jul 14 2017                     dcm2xml(1)