DETEX(1)                General Commands Manual               DETEX(1)

NAME
detex - a filter to strip TeX commands from a .tex file.

SYNOPSIS
detex [ -clnrstw1 ] [ -e environment-list ] [ filename[.tex] ... ]

DESCRIPTION
Detex reads each file in sequence, removes all comments and TeX control
sequences, and writes the remainder to standard output. All text in
math mode and display mode is removed. By default, detex follows \input
commands. If a file cannot be opened, a warning message is printed and
the command is ignored. If the -n option is used, no \input or \include
commands are processed, which allows single-file processing. If no
input file is given on the command line, detex reads from standard
input.

If the magic sequence ``\begin{document}'' appears in the text, detex
assumes it is dealing with LaTeX source and recognizes additional
constructs used in LaTeX. These include the \include and \includeonly
commands. The -l option forces LaTeX mode and the -t option forces TeX
mode, regardless of input content.
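
The mode selection described above can be pictured with a short Python
sketch. It is an illustration only, not detex's actual (f)lex code: the
function name choose_mode and its forced parameter are hypothetical
stand-ins for the -l and -t options, and the sketch inspects the whole
text at once rather than switching modes while scanning.

```python
import re

# Matches \begin{document}, allowing white space before the brace
# (assumption: this mirrors the white-space tolerance the manual
# describes for recognizing LaTeX environments).
BEGIN_DOCUMENT = re.compile(r"\\begin\s*\{document\}")


def choose_mode(source, forced=None):
    """Return "latex" or "tex" for the given source text.

    `forced` is a hypothetical stand-in for the -l ("latex") and
    -t ("tex") options; None means neither option was given.
    """
    if forced in ("latex", "tex"):
        return forced
    return "latex" if BEGIN_DOCUMENT.search(source) else "tex"
```

With no forced mode, the result depends only on whether the magic
sequence is present; a forced mode always wins.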

Text in certain LaTeX environments is ignored. The default environments
are array, eqnarray, equation, longtable, picture, tabular and
verbatim. The -e option specifies a comma-separated environment-list of
environments to ignore. The list replaces the defaults, so specifying
an empty list causes no environments to be ignored.
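
The "replace, do not extend" behavior of the environment list can be
sketched in Python as follows. The function name ignored_environments
is hypothetical, and trimming white space around each name is an
assumption; the default names are the ones listed above.

```python
# Default environments, as listed in the manual text above.
DEFAULT_ENVIRONMENTS = (
    "array", "eqnarray", "equation", "longtable",
    "picture", "tabular", "verbatim",
)


def ignored_environments(env_list=None):
    """Return the environments to ignore.

    env_list is the comma-separated argument of -e, or None when the
    option is absent. A given list replaces the defaults entirely, so
    an empty string yields an empty result.
    """
    if env_list is None:
        return list(DEFAULT_ENVIRONMENTS)
    # Assumption: surrounding white space and empty items are dropped.
    return [name.strip() for name in env_list.split(",") if name.strip()]
```

For example, an invocation such as `detex -e figure,table paper.tex`
(paper.tex being a placeholder file name) corresponds to
ignored_environments("figure,table"), which no longer contains
verbatim or any other default.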

The -c option can be used in LaTeX mode to have detex echo the
arguments to \cite, \ref, and \pageref macros. This can be useful when
sending the output to a style checker.

Detex assumes the standard character classes are being used for TeX.
Detex allows white space between control sequences and magic characters
like `{' when recognizing things like LaTeX environments.

The -r option naively replaces $..$, $$..$$, \(..\) and \[..\] with
placeholder words (specifically, "noun" and "verbs") in a way that
keeps sentences readable.

If the -w flag is given, the output is a word list, one `word' (a
string of two or more letters and apostrophes beginning with a letter)
per line, with all other characters ignored. Without -w the output
follows the original, with the deletions mentioned above. Newline
characters are preserved where possible so that the lines of output
match the input as closely as possible.
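
The definition of a `word' given above maps directly onto a regular
expression. The following Python sketch applies that definition to
already stripped text; the name word_list is hypothetical, and limiting
"letters" to ASCII letters is an assumption.

```python
import re

# A letter followed by one or more letters or apostrophes, i.e. a
# string of two or more such characters that begins with a letter.
WORD = re.compile(r"[A-Za-z][A-Za-z']+")


def word_list(text):
    """Return the words of `text`, in order, one list item per word."""
    return WORD.findall(text)
```

Single letters such as `a' or `x' are dropped because a word needs at
least two characters.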

The -1 option prefixes each printed line with `filename:linenumber:'
indicating where that line comes from in the original (La)TeX
document.
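
A tool that consumes this output, such as a style checker wrapper,
could split the prefix off again. The Python sketch below assumes the
prefix has exactly the `filename:linenumber:' shape described above and
that the file name itself contains no colon; parse_prefixed and the
sample line are hypothetical.

```python
def parse_prefixed(line):
    """Split 'filename:linenumber:text' into its three parts.

    Only the first two colons are treated as separators, so colons in
    the remaining text are kept as they are.
    """
    filename, number, text = line.split(":", 2)
    return filename, int(number), text
```

For instance, parse_prefixed("paper.tex:12:Some text: more") yields the
file name, the integer 12, and the text after the second colon.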

The TEXINPUTS environment variable is used to find \input and \include
files. Like TeX, detex interprets a leading or trailing `:' as the
default TEXINPUTS. It does not support the `//' directory expansion
magic sequence.
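
The leading and trailing `:' rule can be sketched in Python as below.
This is an illustration under stated assumptions, not detex's code: the
name expand_texinputs is hypothetical, the default path is passed in by
the caller because its real value is not given here, and treating an
unset or empty variable as "use the default" is an assumption.

```python
def expand_texinputs(value, default):
    """Return the list of directories to search for input files.

    value   -- contents of TEXINPUTS, or None if unset
    default -- colon-separated default path (supplied by the caller)
    """
    default_dirs = [d for d in default.split(":") if d]
    if not value:
        # Assumption: unset or empty means "use the default path".
        return default_dirs
    dirs = []
    if value.startswith(":"):
        dirs.extend(default_dirs)          # leading ':' -> defaults first
    dirs.extend(d for d in value.split(":") if d)
    if value.endswith(":") and len(value) > 1:
        dirs.extend(default_dirs)          # trailing ':' -> defaults last
    return dirs
```

With a placeholder default of ".:/usr/share/texmf", the value
":/home/me/tex" searches the default directories first and
/home/me/tex last, while "/home/me/tex:" reverses that order.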

Detex now handles the basic TeX ligatures as a special case, replacing
them with acceptable character substitutes. This avoids the spelling
errors that merely removing them would introduce. The ligatures are
\aa, \ae, \oe, \ss, \o, \l (and their upper-case equivalents). The
special "dotless" characters \i and \j are also replaced with i and j
respectively.
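
As a rough picture of this substitution, the Python sketch below
replaces each listed control sequence with its own name minus the
backslash. That choice of substitute is an assumption, since the exact
replacement strings are not given here; replace_ligatures is a
hypothetical name, and braces around the control sequence are left
untouched.

```python
import re

# The control sequences named in the text, plus upper-case forms.
# The lookahead stops \o from matching inside \omega, \i inside \item,
# and so on.
LIGATURE = re.compile(r"\\(aa|AA|ae|AE|oe|OE|ss|o|O|l|L|i|j)(?![A-Za-z])")


def replace_ligatures(text):
    """Replace each ligature control sequence with plain letters.

    Assumption: the substitute is the control sequence name without
    its backslash (\\ae -> ae, \\ss -> ss, \\i -> i).
    """
    return LIGATURE.sub(lambda match: match.group(1), text)
```

Removing `{\oe}' outright would turn `{\oe}uvre' into `uvre'; keeping
`oe' leaves a string a spelling checker has a chance of accepting.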

Note that previous versions of detex replaced control sequences with a
space character to prevent words from running together. However, this
caused accents in the middle of words to break words, generating
undesirable "spelling errors". Therefore, the new version merely
removes these accents. The old behavior can essentially be duplicated
by using the -s option.
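
The difference between the two behaviors is easy to see on a word with
an accent in the middle. The Python sketch below handles only the
accent control symbols \" \' \` \^ and \~ and is not detex's
implementation; strip_accents and its use_space parameter (standing in
for -s) are hypothetical.

```python
import re

# A backslash followed by one of the accent characters " ' ` ^ ~
ACCENT = re.compile(r"""\\["'`^~]""")


def strip_accents(text, use_space=False):
    """Remove accent control symbols from `text`.

    use_space=True mimics the old behavior (replace with a space);
    the default mimics the newer behavior (remove outright).
    """
    return ACCENT.sub(" " if use_space else "", text)
```

Applied to `Schr\"odinger', the default keeps the word in one piece
(Schrodinger), whereas use_space=True splits it into two fragments.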

SEE ALSO
tex(1)

DIAGNOSTICS
Nesting of \input is allowed, but the number of open files must not
exceed the system's limit on the number of simultaneously open files.
Detex ignores unrecognized option characters after printing a warning
message.

AUTHOR
Originally written by Daniel Trinkle, Computer Science Department,
Purdue University.

Maintained by Piotr Kubowicz <https://github.com/pkubowicz/opendetex>.

BUGS
Detex is not a TeX interpreter (it essentially reads the input with a
(f)lex program), so it is easily confused by some constructs. Most
errors result in too much rather than too little output.

Running LaTeX source without a ``\begin{document}'' through detex may
produce errors.

Suggestions for improvements are (mildly) encouraged.

TeX Live                   15 September 2022                  DETEX(1)