dos2unix(1) 2017-03-10 dos2unix(1)

NAME
dos2unix - DOS/Mac to Unix and vice versa text file format converter

SYNOPSIS
    dos2unix [options] [FILE ...] [-n INFILE OUTFILE ...]
    unix2dos [options] [FILE ...] [-n INFILE OUTFILE ...]

DESCRIPTION
The Dos2unix package includes the utilities "dos2unix" and "unix2dos" to
convert plain text files in DOS or Mac format to Unix format and vice
versa.

In DOS/Windows text files a line break, also known as a newline, is a
combination of two characters: a Carriage Return (CR) followed by a
Line Feed (LF). In Unix text files a line break is a single character:
the Line Feed (LF). In Mac text files, prior to Mac OS X, a line break
was a single Carriage Return (CR) character. Nowadays Mac OS uses Unix
style (LF) line breaks.

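The three conventions are easiest to tell apart by looking at the raw
bytes. The following sketch uses only printf(1), od(1), and wc(1). The
file names and the temporary directory are made up for the illustration.

```shell
# Work in a throwaway directory so no existing files are touched.
tmp=$(mktemp -d)

# The same two lines of text, written with each line-break convention.
printf 'one\r\ntwo\r\n' > "$tmp/dos.txt"    # DOS/Windows: CR LF
printf 'one\ntwo\n'     > "$tmp/unix.txt"   # Unix: LF
printf 'one\rtwo\r'     > "$tmp/mac.txt"    # Mac before Mac OS X: CR

# od -c shows CR as \r and LF as \n.
for f in dos unix mac; do
    echo "$f.txt:"
    od -c "$tmp/$f.txt"
done

# The DOS file is one byte longer per line than the other two.
wc -c "$tmp/dos.txt" "$tmp/unix.txt" "$tmp/mac.txt"
```
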
Binary files are automatically skipped, unless conversion is forced.

Non-regular files, such as directories and FIFOs, are automatically
skipped.

Symbolic links and their targets are by default kept untouched.
Symbolic links can optionally be replaced, or the output can be written
to the symbolic link target. Symbolic links on Windows are not
supported. Windows symbolic links are always replaced, keeping the
targets unchanged.

Dos2unix was modelled after dos2unix under SunOS/Solaris and has
similar conversion modes.

OPTIONS
--
    Treat all following options as file names. Use this option if you
    want to convert files whose names start with a dash. For instance,
    to convert a file named "-foo", you can use this command:

        dos2unix -- -foo

    Or in new file mode:

        dos2unix -n -- -foo out.txt

-ascii
    Convert only line breaks. This is the default conversion mode.

-iso
    Conversion between DOS and ISO-8859-1 character set. See also
    section CONVERSION MODES.

-1252
    Use Windows code page 1252 (Western European).

-437
    Use DOS code page 437 (US). This is the default code page used for
    ISO conversion.

-850
    Use DOS code page 850 (Western European).

-860
    Use DOS code page 860 (Portuguese).

-863
    Use DOS code page 863 (French Canadian).

-865
    Use DOS code page 865 (Nordic).

-7
    Convert 8 bit characters to 7 bit space.

-c, --convmode CONVMODE
    Set conversion mode. CONVMODE is one of ascii, 7bit, iso, or mac,
    with ascii being the default.

-f, --force
    Force conversion of binary files.

-h, --help
    Display help and exit.

-k, --keepdate
    Keep the date stamp of the output file the same as that of the
    input file.

-L, --license
    Display the program's license.

-l, --newline
    Add an additional newline.

    dos2unix: Only DOS line breaks are changed to two Unix line breaks.
    In Mac mode only Mac line breaks are changed to two Unix line
    breaks.

    unix2dos: Only Unix line breaks are changed to two DOS line breaks.
    In Mac mode Unix line breaks are changed to two Mac line breaks.

-m, --add-bom
    Write a UTF-8 Byte Order Mark in the output file. Never use this
    option when the output encoding is other than UTF-8. See also
    section UNICODE.

-n, --newfile INFILE OUTFILE ...
    New file mode. Convert file INFILE and write output to file
    OUTFILE. File names must be given in pairs and wildcard names
    should not be used or you will lose your files.

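To see why wildcards are dangerous in new file mode, recall that the
shell expands the pattern before dos2unix starts, so the expanded names
are consumed as INFILE OUTFILE pairs. The sketch below only prints the
pairing and converts nothing. "show_pairs" is a hypothetical helper
written for this illustration, not part of dos2unix.

```shell
# Hypothetical helper: print how an argument list would be read as
# INFILE OUTFILE pairs. It does not run dos2unix and touches no files.
show_pairs() {
    while [ "$#" -ge 2 ]; do
        printf '%s -> %s\n' "$1" "$2"
        shift 2
    done
}

# What "dos2unix -n *.txt" could expand to in a directory with four files:
show_pairs a.txt b.txt c.txt d.txt
```

In this case b.txt and d.txt would be overwritten with the converted
contents of a.txt and c.txt.
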
    The person who starts the conversion in new file (paired) mode will
    be the owner of the converted file. The read/write permissions of
    the new file will be the permissions of the original file minus the
    umask(1) of the person who runs the conversion.

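As a worked example of "permissions minus the umask": the permission
bits of the original file are combined with the complement of the
umask. The mode 0666 and the umask 0027 below are assumed values chosen
for illustration only.

```shell
# Assumed example values: original file mode and the converting user's umask.
orig_mode=0666
user_umask=0027

# Clear every permission bit that is set in the umask.
new_mode=$(printf '%o' $(( orig_mode & ~user_umask )))
echo "$new_mode"    # prints 640
```

With these values the new file would be readable and writable by its
owner, readable by the group, and inaccessible to others.
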
-o, --oldfile FILE ...
    Old file mode. Convert file FILE and write the output back to it,
    replacing the original contents. The program runs in this mode by
    default. Wildcard names may be used.

    In old file (in-place) mode the converted file gets the same owner,
    group, and read/write permissions as the original file. This also
    holds when the file is converted by another user who has write
    permissions on the file (e.g. user root). The conversion will be
    aborted when it is not possible to preserve the original values.
    Change of owner could mean that the original owner is not able to
    read the file any more. Change of group could be a security risk,
    because the file could be made readable for persons for whom it is
    not intended. Preservation of owner, group, and read/write
    permissions is only supported on Unix.

-q, --quiet
    Quiet mode. Suppress all warnings and messages. The return value is
    zero, except when wrong command-line options are used.

-s, --safe
    Skip binary files (default).

-F, --follow-symlink
    Follow symbolic links and convert the targets.

-R, --replace-symlink
    Replace symbolic links with converted files (original target files
    remain unchanged).

-S, --skip-symlink
    Keep symbolic links and targets unchanged (default).

-V, --version
    Display version information and exit.

MAC MODE
In normal mode line breaks are converted from DOS to Unix and vice
versa. Mac line breaks are not converted.

In Mac mode line breaks are converted from Mac to Unix and vice versa.
DOS line breaks are not changed.

To run in Mac mode use the command-line option "-c mac" or use the
commands "mac2unix" or "unix2mac".

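For a file that contains only Mac (CR) line breaks, the effect of Mac
mode on the bytes can be approximated with tr(1). This illustrates the
byte-level change and is not a description of how dos2unix is
implemented. Unlike Mac mode, tr would also alter the CR in DOS (CR LF)
line breaks.

```shell
# Mac to Unix: every CR becomes LF (compare "dos2unix -c mac" or "mac2unix").
mac_to_unix=$(printf 'one\rtwo\r' | tr '\r' '\n' | od -An -c)

# Unix to Mac: every LF becomes CR (compare "unix2dos -c mac" or "unix2mac").
unix_to_mac=$(printf 'one\ntwo\n' | tr '\n' '\r' | od -An -c)

# od -c prints LF as \n and CR as \r, so the change is visible.
echo "$mac_to_unix"
echo "$unix_to_mac"
```
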
CONVERSION MODES
Conversion modes ascii, 7bit, and iso are similar to those of
dos2unix/unix2dos under SunOS/Solaris.

ascii
    In mode "ascii" only line breaks are converted. This is the default
    conversion mode.

    Although the name of this mode is ASCII, which is a 7 bit standard,
    the actual mode is 8 bit. Always use this mode when converting
    Unicode UTF-8 files.

7bit
    In this mode all 8 bit non-ASCII characters (with values from 128
    to 255) are converted to a 7 bit space.

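The character substitution of 7bit mode, taken on its own, can be
imitated with tr(1) in the C locale. This sketch covers only the
replacement of bytes 128 to 255 by a space. The line-break conversion
that dos2unix performs as well is left out.

```shell
# Input: "caf", then byte 0xE9 (e with acute accent in Latin-1), then CR LF.
# In the C locale tr works on single bytes; \200-\377 is octal for 128-255.
result=$(printf 'caf\351\r\n' | LC_ALL=C tr '\200-\377' ' ' | od -An -tx1)

# Hex dump of the result; 20 is the space that replaced byte e9.
echo "$result"
```
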
iso
    Characters are converted between a DOS character set (code page)
    and ISO character set ISO-8859-1 (Latin-1) on Unix. DOS characters
    without an ISO-8859-1 equivalent, for which conversion is not
    possible, are converted to a dot. The same applies to ISO-8859-1
    characters without a DOS counterpart.

    When only option "-iso" is used dos2unix will try to determine the
    active code page. When this is not possible dos2unix will use
    default code page CP437, which is mainly used in the USA. To force
    a specific code page use options "-437" (US), "-850" (Western
    European), "-860" (Portuguese), "-863" (French Canadian), or "-865"
    (Nordic). Windows code page CP1252 (Western European) is also
    supported with option "-1252". For other code pages use dos2unix in
    combination with iconv(1). Iconv can convert between a long list
    of character encodings.

    Never use ISO conversion on Unicode text files. It will corrupt
    UTF-8 encoded files.

    Some examples:

    Convert from DOS default code page to Unix Latin-1

        dos2unix -iso -n in.txt out.txt

    Convert from DOS CP850 to Unix Latin-1

        dos2unix -850 -n in.txt out.txt

    Convert from Windows CP1252 to Unix Latin-1

        dos2unix -1252 -n in.txt out.txt

    Convert from Windows CP1252 to Unix UTF-8 (Unicode)

        iconv -f CP1252 -t UTF-8 in.txt | dos2unix > out.txt

    Convert from Unix Latin-1 to DOS default code page

        unix2dos -iso -n in.txt out.txt

    Convert from Unix Latin-1 to DOS CP850

        unix2dos -850 -n in.txt out.txt

    Convert from Unix Latin-1 to Windows CP1252

        unix2dos -1252 -n in.txt out.txt

    Convert from Unix UTF-8 (Unicode) to Windows CP1252

        unix2dos < in.txt | iconv -f UTF-8 -t CP1252 > out.txt

    See also <http://czyborra.com/charsets/codepages.html> and
    <http://czyborra.com/charsets/iso8859.html>.

UNICODE
Encodings
    There exist different Unicode encodings. On Unix and Linux Unicode
    files are typically encoded in UTF-8 encoding. On Windows Unicode
    text files can be encoded in UTF-8, UTF-16, or UTF-16 big endian,
    but are mostly encoded in UTF-16 format.

Conversion
    Unicode text files can have DOS, Unix or Mac line breaks, like
    regular text files.

    All versions of dos2unix and unix2dos can convert UTF-8 encoded
    files, because UTF-8 was designed for backward compatibility with
    ASCII.

    Dos2unix and unix2dos with Unicode UTF-16 support can read little
    and big endian UTF-16 encoded text files. To see if dos2unix was
    built with UTF-16 support type "dos2unix -V".

    The Windows versions of dos2unix and unix2dos always convert UTF-16
    encoded files to UTF-8 encoded files. Unix versions of
    dos2unix/unix2dos convert UTF-16 encoded files to the locale
    character encoding when it is set to UTF-8. Use the locale(1)
    command to find out what the locale character encoding is.

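Since Unix builds only convert UTF-16 input when the locale encoding is
UTF-8, a script may want to check this before calling dos2unix. A
minimal sketch, assuming locale(1) supports the "charmap" keyword as
specified by POSIX. "locale_is_utf8" is a hypothetical helper name.

```shell
# Hypothetical helper: succeed if the current locale's character encoding
# is UTF-8. If locale(1) is missing, the answer is "no".
locale_is_utf8() {
    charmap=$(locale charmap 2>/dev/null)
    case "$charmap" in
        UTF-8|utf-8|UTF8|utf8) return 0 ;;
        *) return 1 ;;
    esac
}

if locale_is_utf8; then
    echo "locale encoding is UTF-8: UTF-16 input can be converted"
else
    echo "locale encoding is not UTF-8: UTF-16 input will be skipped"
fi
```
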
    Because UTF-8 formatted text files are well supported on both
    Windows and Unix, dos2unix and unix2dos have no option to write
    UTF-16 files. All UTF-16 characters can be encoded in UTF-8, so
    conversion from UTF-16 to UTF-8 is lossless. UTF-16 files will be
    skipped on Unix when the locale character encoding is not UTF-8, to
    prevent accidental loss of text. When a UTF-16 to UTF-8 conversion
    error occurs, for instance when the UTF-16 input file contains an
    error, the file will be skipped.

    ISO and 7-bit mode conversion do not work on UTF-16 files.

Byte Order Mark
    On Windows Unicode text files typically have a Byte Order Mark
    (BOM), because many Windows programs (including Notepad) add BOMs
    by default. See also <http://en.wikipedia.org/wiki/Byte_order_mark>.

    On Unix Unicode files typically don't have a BOM. It is assumed
    that text files are encoded in the locale character encoding.

    Dos2unix can only detect if a file is in UTF-16 format if the file
    has a BOM. When a UTF-16 file doesn't have a BOM, dos2unix will see
    the file as a binary file.

    Use dos2unix in combination with iconv(1) to convert a UTF-16 file
    without BOM.

    Dos2unix never writes a BOM in the output file, unless you use
    option "-m".

    Unix2dos writes a BOM in the output file when the input file has a
    BOM, or when option "-m" is used.

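Whether a file starts with a BOM can be checked by inspecting its first
bytes. The sketch below recognizes the UTF-8 BOM (EF BB BF) and the two
UTF-16 BOMs (FF FE for little endian, FE FF for big endian). It makes no
attempt to tell UTF-16 apart from UTF-32. "bom_type" is a hypothetical
helper, and it assumes head(1) accepts the widely available "-c" option.

```shell
bom_type() {
    # First three bytes as lowercase hex without separators, e.g. "efbbbf".
    sig=$(head -c 3 "$1" | od -An -tx1 | tr -d ' \n')
    case "$sig" in
        efbbbf*) echo "UTF-8" ;;
        fffe*)   echo "UTF-16LE" ;;
        feff*)   echo "UTF-16BE" ;;
        *)       echo "none" ;;
    esac
}

# Sample files in a throwaway directory.
tmp=$(mktemp -d)
printf '\357\273\277hello\n' > "$tmp/utf8bom.txt"   # EF BB BF + text
printf '\377\376h\000i\000'  > "$tmp/utf16le.txt"   # FF FE + "hi" in UTF-16LE
printf 'hello\n'             > "$tmp/plain.txt"     # no BOM

bom_type "$tmp/utf8bom.txt"
bom_type "$tmp/utf16le.txt"
bom_type "$tmp/plain.txt"
```
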
Unicode examples
    Convert from Windows UTF-16 (with BOM) to Unix UTF-8

        dos2unix -n in.txt out.txt

    Convert from Windows UTF-16 (without BOM) to Unix UTF-8

        iconv -f UTF-16 -t UTF-8 in.txt | dos2unix > out.txt

    Convert from Unix UTF-8 to Windows UTF-8 with BOM

        unix2dos -m -n in.txt out.txt

    Convert from Unix UTF-8 to Windows UTF-16

        unix2dos < in.txt | iconv -f UTF-8 -t UTF-16 > out.txt

EXAMPLES
Read input from 'stdin' and write output to 'stdout'.

    dos2unix
    dos2unix -l -c mac

Convert and replace a.txt. Convert and replace b.txt.

    dos2unix a.txt b.txt
    dos2unix -o a.txt b.txt

Convert and replace a.txt in ascii conversion mode.

    dos2unix a.txt

Convert and replace a.txt in ascii conversion mode. Convert and
replace b.txt in 7bit conversion mode.

    dos2unix a.txt -c 7bit b.txt
    dos2unix -c ascii a.txt -c 7bit b.txt
    dos2unix -ascii a.txt -7 b.txt

Convert a.txt from Mac to Unix format.

    dos2unix -c mac a.txt
    mac2unix a.txt

Convert a.txt from Unix to Mac format.

    unix2dos -c mac a.txt
    unix2mac a.txt

Convert and replace a.txt while keeping the original date stamp.

    dos2unix -k a.txt
    dos2unix -k -o a.txt

Convert a.txt and write to e.txt.

    dos2unix -n a.txt e.txt

Convert a.txt and write to e.txt, keeping the date stamp of e.txt the
same as that of a.txt.

    dos2unix -k -n a.txt e.txt

Convert and replace a.txt. Convert b.txt and write to e.txt.

    dos2unix a.txt -n b.txt e.txt
    dos2unix -o a.txt -n b.txt e.txt

Convert c.txt and write to e.txt. Convert and replace a.txt. Convert
and replace b.txt. Convert d.txt and write to f.txt.

    dos2unix -n c.txt e.txt -o a.txt b.txt -n d.txt f.txt

RECURSIVE CONVERSION
Use dos2unix in combination with the find(1) and xargs(1) commands to
recursively convert text files in a directory tree structure. For
instance, to convert all .txt files in the directory tree under the
current directory type:

    find . -name '*.txt' | xargs dos2unix

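A find(1) and xargs(1) pipeline splits file names at whitespace, so it
can misbehave for names that contain spaces. One alternative, sketched
here, lets find(1) pass the names directly with "-exec ... {} +". The
first find command is a dry run that only lists the matching files. The
directory and file names are invented for the example, and the
conversion step runs only if dos2unix is installed.

```shell
# Build a small sample tree in a throwaway directory.
tmp=$(mktemp -d)
mkdir -p "$tmp/sub dir"
printf 'a\r\n' > "$tmp/plain.txt"
printf 'b\r\n' > "$tmp/sub dir/with space.txt"
printf 'c\r\n' > "$tmp/skip.log"

# Dry run: list what would be converted, one name per line.
matches=$(cd "$tmp" && find . -type f -name '*.txt' -exec printf '%s\n' {} + | sort)
echo "$matches"

# Real run: find hands the names to dos2unix without word splitting.
if command -v dos2unix >/dev/null 2>&1; then
    (cd "$tmp" && find . -type f -name '*.txt' -exec dos2unix {} +)
fi
```
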
LOCALIZATION
LANG
    The primary language is selected with the environment variable
    LANG. The LANG variable consists of several parts. The first part
    is the language code in lowercase letters. The second is optional
    and is the country code in capital letters, preceded by an
    underscore. There is also an optional third part: the character
    encoding, preceded by a dot. A few examples for POSIX standard
    type shells:

        export LANG=nl               Dutch
        export LANG=nl_NL            Dutch, The Netherlands
        export LANG=nl_BE            Dutch, Belgium
        export LANG=es_ES            Spanish, Spain
        export LANG=es_MX            Spanish, Mexico
        export LANG=en_US.iso88591   English, USA, Latin-1 encoding
        export LANG=en_GB.UTF-8      English, UK, UTF-8 encoding

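The three parts of a LANG value can be separated with standard shell
parameter expansion. This is a sketch for illustration only.
"parse_lang" is a hypothetical helper, and it ignores the "@modifier"
suffix that some locale names carry.

```shell
# Hypothetical helper: split a LANG value into language, country, encoding.
parse_lang() {
    value=$1
    encoding=
    country=
    # Optional third part: everything after the first dot.
    case $value in
        *.*) encoding=${value#*.}; value=${value%%.*} ;;
    esac
    # Optional second part: everything after the underscore.
    case $value in
        *_*) country=${value#*_}; value=${value%%_*} ;;
    esac
    printf 'language=%s country=%s encoding=%s\n' "$value" "$country" "$encoding"
}

parse_lang en_GB.UTF-8    # prints language=en country=GB encoding=UTF-8
parse_lang nl             # prints language=nl country= encoding=
```
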
    For a complete list of language and country codes see the gettext
    manual:
    <http://www.gnu.org/software/gettext/manual/gettext.html#Language-Codes>

    On Unix systems you can use the command locale(1) to get locale
    specific information.

LANGUAGE
    With the LANGUAGE environment variable you can specify a priority
    list of languages, separated by colons. Dos2unix gives preference
    to LANGUAGE over LANG. For instance, first Dutch and then German:
    "LANGUAGE=nl:de". You have to first enable localization, by setting
    LANG (or LC_ALL) to a value other than "C", before you can use a
    language priority list through the LANGUAGE variable. See also the
    gettext manual:
    <http://www.gnu.org/software/gettext/manual/gettext.html#The-LANGUAGE-variable>

    If you select a language which is not available you will get the
    standard English messages.

DOS2UNIX_LOCALEDIR
    With the environment variable DOS2UNIX_LOCALEDIR the LOCALEDIR set
    during compilation can be overridden. LOCALEDIR is used to find the
    language files. The GNU default value is "/usr/local/share/locale".
    Option --version will display the LOCALEDIR that is used.

    Example (POSIX shell):

        export DOS2UNIX_LOCALEDIR=$HOME/share/locale

RETURN VALUE
On success, zero is returned. When a system error occurs the last
system error will be returned. For other errors 1 is returned.

The return value is always zero in quiet mode, except when wrong
command-line options are used.

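In a script, the return value can be used to react to failed
conversions. The sketch below wraps an arbitrary command so that it can
be tried without dos2unix installed. "run_conversion" is a hypothetical
helper, and the commented usage line assumes a file named a.txt.

```shell
# Hypothetical helper: run a command and report on its return value.
run_conversion() {
    "$@"
    status=$?
    if [ "$status" -eq 0 ]; then
        echo "success"
    else
        echo "failed with return value $status" >&2
    fi
    return "$status"
}

# Typical use (not executed here):
#   run_conversion dos2unix -k a.txt || exit 1

# Demonstration with stand-in commands:
run_conversion true
run_conversion false || echo "the caller can react to the failure"
```

Because quiet mode returns zero even when a conversion fails, a check
like this is only meaningful when dos2unix runs without "-q".
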
STANDARDS
<http://en.wikipedia.org/wiki/Text_file>

<http://en.wikipedia.org/wiki/Carriage_return>

<http://en.wikipedia.org/wiki/Newline>

<http://en.wikipedia.org/wiki/Unicode>

AUTHORS
Benjamin Lin - <blin@socs.uts.edu.au>, Bernd Johannes Wuebben (mac2unix
mode) - <wuebben@kde.org>, Christian Wurll (add extra newline) -
<wurll@ira.uka.de>, Erwin Waterlander - <waterlan@xs4all.nl>
(Maintainer)

Project page: <http://waterlan.home.xs4all.nl/dos2unix.html>

SourceForge page: <http://sourceforge.net/projects/dos2unix/>

Freecode: <http://freecode.com/projects/dos2unix>

SEE ALSO
file(1) find(1) iconv(1) locale(1) xargs(1)

dos2unix 2012-09-15 dos2unix(1)