utf8trans(1) docbook2X utf8trans(1)

NAME
utf8trans - Transliterate UTF-8 characters according to a table

SYNOPSIS
utf8trans charmap [file]...

DESCRIPTION
utf8trans transliterates characters in the specified files (or standard
input, if no files are specified) and writes the output to standard
output. All input and output is in the UTF-8 encoding.

This program is usually used to render characters in Unicode text files
as markup escapes or ASCII transliterations. (It is not intended
for general charset conversions.) It provides functionality similar to
the character maps in XSLT 2.0 (Extensible Stylesheet Language
Transformations, version 2.0).
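
The idea of table-driven transliteration can be illustrated with a short Python sketch. This is not how utf8trans itself is implemented; the function name transliterate and the table EXAMPLE_TABLE are hypothetical and exist only for illustration. Characters that have no entry in the table pass through unchanged.

```python
# Sketch only: mimics table-driven transliteration in Python.
# It is not the utf8trans implementation.

def transliterate(text: str, table: dict) -> str:
    """Replace each character whose code point is a key in `table`."""
    # str.translate accepts a dict mapping code points (int) to strings.
    return text.translate(table)

# Hypothetical table: Unicode code point -> replacement string.
EXAMPLE_TABLE = {
    0x00A9: "(c)",  # COPYRIGHT SIGN
    0x00E9: "e",    # LATIN SMALL LETTER E WITH ACUTE
}

print(transliterate("Caf\u00e9 \u00a9 2006", EXAMPLE_TABLE))  # Cafe (c) 2006
```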

OPTIONS
-m, --modify
    Modify the given files in place, replacing their contents with
    the transliterated output instead of sending it to standard output.

    This option is useful for efficient transliteration of many
    files at once.

--help
    Show brief usage information and exit.

--version
    Show version and exit.

USAGE
Translation follows the rules in the character map, which is read from
the file named by the charmap argument. The file has the following format:

1. Each line represents a translation entry, except for blank lines
   and comment lines, which are ignored.

2. Any amount of whitespace (space or tab) may precede the start of an
   entry.

3. Comment lines begin with #; the rest of the line is ignored.

4. Each entry consists of the Unicode codepoint of the character to
   translate, in hexadecimal, followed by one space or tab, followed by
   the translation string, up to the end of the line.

5. The translation string is taken literally, including any leading
   and trailing spaces (except the delimiter between the codepoint and
   the translation string), and all types of characters. The newline
   at the end is not included.
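
The format rules above can be sketched as a small Python parser. This is one reading of the rules, not the parser utf8trans uses: the name parse_charmap and the sample data are hypothetical, and the handling of whitespace-only lines and of entries with no delimiter is an assumption, since the rules do not cover those cases.

```python
def parse_charmap(text: str) -> dict:
    """Parse character-map text into {code point: translation string}."""
    table = {}
    # Split on "\n" only, so other characters survive inside translations.
    for raw in text.split("\n"):
        line = raw.lstrip(" \t")  # rule 2: leading whitespace is skipped
        # Rules 1 and 3: skip blank lines and comment lines.
        # Assumption: a whitespace-only line counts as blank.
        if not line or line.startswith("#"):
            continue
        # Rule 4: the hexadecimal code point runs up to the first space or tab.
        end = 0
        while end < len(line) and line[end] not in " \t":
            end += 1
        codepoint = int(line[:end], 16)  # raises ValueError on bad input
        # Rule 5: everything after the single delimiter is taken literally.
        # Assumption: an entry with no delimiter maps to the empty string.
        table[codepoint] = line[end + 1:]
    return table

# Hypothetical sample: a comment, a blank line, and three entries.
# The last entry maps U+00A0 to a single space (delimiter plus one space).
SAMPLE = "# sample character map\n\nA9 (c)\n  2026\t...\nA0  \n"

print(parse_charmap(SAMPLE))
```

The resulting dictionary maps 0xA9 to "(c)", 0x2026 to "..." and 0xA0 to a single space, which shows that only the first space or tab after the codepoint acts as the delimiter.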

The above format is intentionally restrictive, to keep utf8trans simple.
If an XML-based format is desired, the xmlcharmap2utf8trans script that
comes with the docbook2X distribution converts character maps in
XSLT 2.0 format to the utf8trans format.
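
To show what such a conversion involves, here is a minimal Python sketch that reads xsl:output-character elements from an XSLT 2.0 character map and emits lines in the utf8trans format. It is not the xmlcharmap2utf8trans script and does not reflect how that script works or is invoked; the function name xslt_charmap_to_utf8trans and the sample stylesheet are hypothetical.

```python
import xml.etree.ElementTree as ET

XSL_NS = "http://www.w3.org/1999/XSL/Transform"

def xslt_charmap_to_utf8trans(xml_text: str) -> str:
    """Convert xsl:output-character entries to utf8trans charmap lines."""
    root = ET.fromstring(xml_text)
    lines = []
    for elem in root.iter("{%s}output-character" % XSL_NS):
        char = elem.get("character")
        string = elem.get("string")
        # Skip entries that lack either attribute or name several characters.
        if char is None or string is None or len(char) != 1:
            continue
        # The utf8trans format cannot express a newline in a translation.
        if "\n" in string:
            raise ValueError("newline not representable: U+%04X" % ord(char))
        # Hexadecimal codepoint, one space as delimiter, literal translation.
        lines.append("%X %s" % (ord(char), string))
    return "".join(line + "\n" for line in lines)

# Hypothetical input stylesheet.
SAMPLE_XSLT = """<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:character-map name="example">
    <xsl:output-character character="&#xA9;" string="(c)"/>
    <xsl:output-character character="&#xA0;" string="&amp;nbsp;"/>
  </xsl:character-map>
</xsl:stylesheet>"""

print(xslt_charmap_to_utf8trans(SAMPLE_XSLT), end="")
```

For the sample stylesheet this prints two entries, `A9 (c)` and `A0 &nbsp;`.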

LIMITATIONS
· utf8trans does not work with binary files, because malformed UTF-8
  sequences in the input are substituted with U+FFFD characters. However,
  null characters in the input are handled correctly. This limitation
  may be removed in the future.

· There is no way to include a newline or null in the substitution
  string.
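
The first limitation can be illustrated in Python, whose UTF-8 decoder offers a comparable replacement mode. This is only an analogy: utf8trans may differ in how many U+FFFD characters it emits for a given malformed sequence, and the helper name decode_with_replacement is hypothetical. The point is that the substitution is lossy, so binary data does not survive a round trip, while a null byte does.

```python
def decode_with_replacement(data: bytes) -> str:
    """Decode UTF-8, substituting U+FFFD for malformed input."""
    return data.decode("utf-8", errors="replace")

binary_input = b"ok\xff\x00end"   # 0xFF never occurs in valid UTF-8
decoded = decode_with_replacement(binary_input)
round_trip = decoded.encode("utf-8")

print(repr(decoded))                # 'ok\ufffd\x00end'
print(round_trip == binary_input)   # False: the 0xFF byte was replaced
```

The second limitation follows from the file format itself: an entry ends at the end of the line, so a translation string has no way to contain a newline.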

AUTHOR
Steve Cheng <stevecheng@users.sourceforge.net>.

docbook2X 0.8.7 18 April 2006 utf8trans(1)