UTRAC(1)                         Alliance MCA                         UTRAC(1)

NAME

       utrac - recognize and convert charset and end-of-line of text files

SYNOPSIS

       utrac [OPTION] [FILE]

DESCRIPTION

       Utrac is a tool (and a library) that recognizes the charset and the
       end-of-line type used in a text file. It can also convert them. For
       8-bit charsets, recognition is not certain, so Utrac can also help the
       user choose the correct charset, for instance by filtering the text
       and displaying only the lines that matter.

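       The uncertainty with 8-bit charsets can be illustrated with a short
       Python sketch. It does not use Utrac; the sample bytes, the helper
       decodes_as() and the candidate list are assumptions chosen for the
       illustration. The same bytes decode without error under several
       8-bit charsets, so validity alone cannot single one out, whereas
       ASCII and UTF-8 reject them outright.

```python
# Sample input: "cafe a 5 EUR" with accents, as ISO-8859-15 bytes (assumed data).
data = b"caf\xe9 \xe0 5 \xa4"


def decodes_as(raw: bytes, charset: str):
    """Return the decoded text, or None if raw is not valid in charset."""
    try:
        return raw.decode(charset)
    except UnicodeDecodeError:
        return None


candidates = ["ascii", "utf-8", "iso-8859-1", "iso-8859-15", "cp1252"]
results = {name: decodes_as(data, name) for name in candidates}

for name, text in results.items():
    # ASCII and UTF-8 give None; the 8-bit charsets all "succeed",
    # but they disagree on what byte 0xA4 means.
    print(f"{name:12} {text!r}")
```

       Only a human reader, or a statistical score, can tell that the euro
       sign is more plausible here than the generic currency sign.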

OPTIONS

       With no FILE, read standard input. With no OPTION, recognize the
       charset and write the converted text to standard output.

       -p, --print-charset
              Print the name of the charset that best suits the input file.

       -P, --print-all-charset
              Print a ranked list of charsets. The first column is the score
              with locale bonus (language and system), the second is the raw
              score, the third is the checksum of all extended characters (to
              show which charsets produce the same results), and the fourth
              is the charset name (charsets share a line if their score with
              bonus and their checksum are identical).
              If the recognition is certain (ASCII or UTF-8), print only one
              name.

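       The idea behind the checksum column can be sketched in Python as
       follows. This is not Utrac's algorithm: CRC32 is a stand-in for
       whatever checksum Utrac computes, and ext_checksum() and
       group_by_checksum() are hypothetical helpers. The sketch assumes
       single-byte charsets, so the extended bytes can be decoded on their
       own.

```python
import zlib
from collections import defaultdict


def ext_checksum(raw: bytes, charset: str):
    """CRC32 of the extended (high-bit) bytes as decoded by charset.

    Returns None when the bytes cannot be decoded with charset.
    CRC32 is an assumed stand-in, not Utrac's actual checksum.
    """
    ext = bytes(b for b in raw if b >= 0x80)
    try:
        decoded = ext.decode(charset)
    except UnicodeDecodeError:
        return None
    return zlib.crc32(decoded.encode("utf-8"))


def group_by_checksum(raw: bytes, charsets):
    """Map each checksum to the list of charsets that produce it."""
    groups = defaultdict(list)
    for name in charsets:
        checksum = ext_checksum(raw, name)
        if checksum is not None:
            groups[checksum].append(name)
    return dict(groups)


data = b"caf\xe9 \xe0 5 \xa4"
groups = group_by_checksum(data, ["iso-8859-1", "iso-8859-15", "cp1252"])
for checksum, names in groups.items():
    # Charsets listed together decode this input identically.
    print(f"{checksum:08X}  {', '.join(names)}")
```

       Charsets that land in the same group are interchangeable for this
       particular input, so the user only has to choose between groups.
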
       -f, --from
              Force the input charset (this disables recognition) and/or EOL
              type. For instance, "UTF-8/CRLF".

       -t, --to
              Select the output charset and/or EOL type, in the same format
              as for -f.

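       What a conversion between two "CHARSET/EOL" pairs amounts to can be
       sketched in Python. This is an illustration, not Utrac's code:
       parse_spec() and convert() are hypothetical, and the EOL names "CR"
       and "LF" are assumptions (only "CRLF" appears above).

```python
# EOL names; only "CRLF" is taken from the man page, the others are assumed.
EOLS = {"CR": "\r", "LF": "\n", "CRLF": "\r\n"}


def parse_spec(spec: str):
    """Split 'CHARSET/EOL' into (charset, eol); either part may be absent."""
    charset, eol = None, None
    for part in spec.split("/"):
        if part.upper() in EOLS:
            eol = EOLS[part.upper()]
        elif part:
            charset = part
    return charset, eol


def convert(raw: bytes, from_spec: str, to_spec: str) -> bytes:
    """Re-encode raw from one charset/EOL pair to another.

    In this sketch both specs must name a charset; the EOL type is
    only rewritten when both specs name one.
    """
    src_charset, src_eol = parse_spec(from_spec)
    dst_charset, dst_eol = parse_spec(to_spec)
    text = raw.decode(src_charset)
    if src_eol and dst_eol:
        text = text.replace(src_eol, dst_eol)
    return text.encode(dst_charset)


converted = convert(b"caf\xe9\r\nth\xe9\r\n", "ISO-8859-1/CRLF", "UTF-8/LF")
print(converted)
```

       Forcing the input side, as -f does, skips the guessing step
       entirely, which is why it disables recognition.
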
       -L, --language
              Select language. All charsets that fit this language will get a
              bonus during recognition. If none is specified, the LC_*
              variables are used.

       -S, --system
              Select system. All charsets that fit this system will get a
              bonus during recognition.

       -x, --ext-chars
              Print lines with extended characters (try to print each extended
              character no more than once).

       -d, --distribution
              Print the distribution, i.e. the count of each 8-bit character.

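       The two filters above can be approximated in Python. The sketch
       assumes that an "extended" or "8-bit" character is a byte with the
       high bit set, and that -x keeps a line only when it shows an
       extended byte not seen before; both are guesses at the behaviour,
       and distribution() and lines_with_new_ext_chars() are hypothetical
       names.

```python
from collections import Counter


def distribution(raw: bytes) -> Counter:
    """Count the occurrences of each byte value >= 0x80."""
    return Counter(b for b in raw if b >= 0x80)


def lines_with_new_ext_chars(raw: bytes):
    """Yield lines that contain an extended byte not yet shown."""
    seen = set()
    for line in raw.splitlines():
        ext = {b for b in line if b >= 0x80}
        if ext - seen:
            seen |= ext
            yield line


data = b"abc\ncaf\xe9\nth\xe9\n\xe0 la carte\n"

dist = distribution(data)
for byte, count in sorted(dist.items()):
    print(f"0x{byte:02X}  {count}")

kept = list(lines_with_new_ext_chars(data))
for line in kept:
    # Printed as raw bytes, since the charset is still unknown.
    print(line)
```

       Showing each extended character once keeps the output short enough
       for a person to check by eye which charset renders it correctly.
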
       -a, --all-ext-chars
              Print each extended character of the file in each different
              charset (UTF-8 output is recommended).

       -c, --colors
              (with -x or -a) Use color.

       -b, --bar
              Display a progress bar.

       -i, --info
              Print default/chosen parameters.

       -l, --list
              List charsets/EOLs/languages/systems.

       -h, --help
              Print some help.

       -v, --version
              Print version.

FILES

       charset.dat
              This file should be located in /usr/local/share/utrac/ or
              /usr/share/utrac/. It contains information about charsets and
              their related charmaps. If you want to add new charsets (they
              must be 8-bit and ASCII-compatible), see the script merge.pl
              in the Utrac source directory.

BUGS

       Utrac is still a beta version, so you can expect to find some bugs.
       Please report them to <antoine@alliancemca.net>. If you have a text
       file that Utrac does not recognize well, please send it in to help
       improve the recognition algorithm.

AUTHOR

       Written by Antoine Calando <antoine@alliancemca.net>.

       Copyright © 2004 Alliance MCA.
       This is free software; see the source for copying conditions.  There is
       NO warranty; not even for MERCHANTABILITY or FITNESS FOR  A  PARTICULAR
       PURPOSE.

SEE ALSO

       You can find more documentation at http://utrac.sourceforge.net

Utrac 0.3                        January 2005                         UTRAC(1)