DETEX(1)                    General Commands Manual                   DETEX(1)

NAME

       detex - a filter to strip TeX commands from a .tex file.

SYNOPSIS

       detex [ -clnrstw1 ] [ -e environment-list ] [ filename[.tex] ... ]

DESCRIPTION

       Detex reads each file in sequence, removes all comments and TeX control
       sequences, and writes the remainder to the standard output.  All text in
       math mode and display mode is removed.  By default, detex follows \input
       commands.  If a file cannot be opened, a warning message is printed and
       the command is ignored.  If the -n option is used, no \input or \include
       commands are processed, which allows single-file processing.  If no
       input file is given on the command line, detex reads from standard
       input.

       If the magic sequence ``\begin{document}'' appears in the text, detex
       assumes it is dealing with LaTeX source and recognizes additional
       constructs used in LaTeX.  These include the \include and \includeonly
       commands.  The -l option can be used to force LaTeX mode and the -t
       option can be used to force TeX mode regardless of input content.

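       The mode selection described above can be sketched in Python.  This is
       an illustration only, not detex's implementation: the name choose_mode
       and its parameters are hypothetical, letting -t win when both -l and -t
       are given is an assumption, and the sketch scans the whole text at once
       without skipping comments.

```python
import re

# Allows white space between \begin and the opening brace, as the manual
# says detex does when recognizing LaTeX environments.
BEGIN_DOCUMENT = re.compile(r"\\begin\s*\{document\}")


def choose_mode(source, force_latex=False, force_tex=False):
    """Return "latex" or "tex" for the given source text (hypothetical helper)."""
    if force_tex:      # corresponds to -t; assumed to win if both flags are set
        return "tex"
    if force_latex:    # corresponds to -l
        return "latex"
    # No flag given: the magic sequence decides.
    return "latex" if BEGIN_DOCUMENT.search(source) else "tex"
```
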
       Text in various LaTeX environments is ignored.  The default environments
       are array, eqnarray, equation, longtable, picture, tabular and verbatim.
       The -e option can be used to specify a comma-separated environment-list
       of environments to ignore.  The list replaces the defaults, so
       specifying an empty list effectively causes no environments to be
       ignored.

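       The "replaces, not extends" behaviour of -e can be illustrated with a
       short Python sketch.  The function name ignored_environments is
       hypothetical, and trimming white space around each name is an
       assumption that the manual does not state.

```python
# Default environments listed in the manual.
DEFAULT_IGNORED = (
    "array", "eqnarray", "equation", "longtable",
    "picture", "tabular", "verbatim",
)


def ignored_environments(env_list=None):
    """Return the set of environments to ignore.

    env_list is the argument to -e, or None when -e was not given.
    """
    if env_list is None:
        return set(DEFAULT_IGNORED)
    # The list replaces the defaults, so an empty string yields an empty set.
    return {name.strip() for name in env_list.split(",") if name.strip()}
```

       With this sketch, ignored_environments("") returns an empty set, which
       matches the statement that an empty list causes no environments to be
       ignored.
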
       The -c option can be used in LaTeX mode to have detex echo the arguments
       to \cite, \ref, and \pageref macros.  This can be useful when sending
       the output to a style checker.

       Detex assumes the standard character classes are being used for TeX.
       Detex allows white space between control sequences and magic characters
       like `{' when recognizing things like LaTeX environments.

       The -r option naively replaces $..$, $$..$$, \(..\) and \[..\] with
       nouns and verbs (specifically, the words "noun" and "verbs") in a way
       that keeps sentences readable.

       If the -w flag is given, the output is a word list, one `word' (a string
       of two or more letters and apostrophes beginning with a letter) per
       line, with all other characters ignored.  Without -w the output follows
       the original, with the deletions mentioned above.  Newline characters
       are preserved where possible so that the lines of output match the
       input as closely as possible.

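       The definition of a `word' used by -w maps naturally onto a regular
       expression.  The Python sketch below is an illustration, not detex's
       code; the name word_list is hypothetical, and treating "letters" as
       ASCII letters only is an assumption.

```python
import re

# A letter followed by one or more letters or apostrophes:
# two or more characters in total, beginning with a letter.
WORD = re.compile(r"[A-Za-z][A-Za-z']+")


def word_list(text):
    """Return the words found in text, in order of appearance."""
    return WORD.findall(text)


if __name__ == "__main__":
    # One word per line, as in the -w output format.
    print("\n".join(word_list("It's a test, isn't it? 42 x")))
```

       Single letters such as "a" and "x" and the number 42 are dropped, since
       none of them is a string of two or more letters.
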
       The -1 option prefixes each printed line with `filename:linenumber:`,
       indicating where that line comes from in the original (La)TeX document.

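       A tool that consumes -1 output has to split the prefix from the text.
       The Python sketch below shows one way to do that.  The name
       split_prefixed_line is hypothetical, and the sketch assumes that the
       file name itself does not contain a `:digits:' sequence.  Whether a
       space follows the second colon is not stated here, so the text is
       returned unmodified.

```python
import re

# filename, then ":", a decimal line number, ":", then the rest of the line.
PREFIX = re.compile(r"^(?P<filename>.+?):(?P<line>\d+):(?P<text>.*)$")


def split_prefixed_line(line):
    """Split one line of `detex -1` output into (filename, line number, text).

    Returns None if the line does not carry the expected prefix.
    """
    match = PREFIX.match(line.rstrip("\n"))
    if match is None:
        return None
    return match.group("filename"), int(match.group("line")), match.group("text")
```
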
       The TEXINPUTS environment variable is used to find \input and \include
       files.  Like TeX, detex interprets a leading or trailing `:' as the
       default TEXINPUTS.  It does not support the `//' directory expansion
       magic sequence.

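       The handling of a leading or trailing `:' can be sketched in Python as
       follows.  The name search_dirs is hypothetical, and the default shown
       is a placeholder rather than detex's compiled-in default.  Falling back
       to the default when the variable is unset or empty, and dropping empty
       components in the middle of the list, are assumptions.

```python
# Placeholder only: the real default is fixed when detex is built.
ASSUMED_DEFAULT = (".",)


def search_dirs(texinputs, default=ASSUMED_DEFAULT):
    """Expand a TEXINPUTS value into an ordered list of directories."""
    if not texinputs:
        return list(default)
    parts = texinputs.split(":")
    last = len(parts) - 1
    dirs = []
    for index, part in enumerate(parts):
        if part:
            dirs.append(part)
        elif index in (0, last):
            # A leading or trailing ':' stands for the default search path.
            dirs.extend(default)
        # Empty components elsewhere are dropped (assumption).
    return dirs
```

       For example, with the default set to /d, the value "/a:/b:" expands to
       /a, /b, /d, while "/a:/b" expands to /a, /b only.
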
       Detex now handles the basic TeX ligatures as a special case, replacing
       the ligatures with acceptable character substitutes.  This eliminates
       spelling errors introduced by merely removing them.  The ligatures are
       \aa, \ae, \oe, \ss, \o, \l (and their upper-case equivalents).  The
       special "dotless" characters \i and \j are also replaced with i and j
       respectively.

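       The substitution idea can be illustrated with the Python sketch below.
       Only the replacements for \i and \j are given above; the other
       substitutes in the table (for example "ae" for \ae), the upper-case
       variants, and the name replace_special_characters are assumptions made
       for illustration.  The sketch leaves braces and white space alone.

```python
import re

# Assumed substitutes; the manual only says "acceptable character substitutes".
BASE = {"aa": "a", "ae": "ae", "oe": "oe", "ss": "ss", "o": "o", "l": "l"}

SUBSTITUTES = dict(BASE)
# Upper-case equivalents (assumed to map to upper-case substitutes).
SUBSTITUTES.update({name.upper(): text.upper() for name, text in BASE.items()})
# Dotless i and j, as stated in the manual.
SUBSTITUTES.update({"i": "i", "j": "j"})

# Longest names first, and never match a prefix of a longer control word
# such as \label or \item.
CONTROL = re.compile(
    r"\\(" + "|".join(sorted(SUBSTITUTES, key=len, reverse=True)) + r")(?![A-Za-z])"
)


def replace_special_characters(text):
    """Replace the listed control sequences with plain-letter substitutes."""
    return CONTROL.sub(lambda match: SUBSTITUTES[match.group(1)], text)
```
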
       Note that previous versions of detex would replace control sequences
       with a space character to prevent words from running together.  However,
       this caused accents in the middle of words to break words, generating
       "spelling errors" that were not desirable.  Therefore, the new version
       merely removes these accents.  The old functionality can be essentially
       duplicated by using the -s option.

SEE ALSO

       tex(1)

DIAGNOSTICS

       Nesting of \input is allowed but the number of opened files must not
       exceed the system's limit on the number of simultaneously opened files.
       Detex ignores unrecognized option characters after printing a warning
       message.

AUTHOR

       Originally written by Daniel Trinkle, Computer Science Department,
       Purdue University.

       Maintained by Piotr Kubowicz <https://github.com/pkubowicz/opendetex>.

BUGS

       Detex is not a TeX interpreter (it essentially reads the input with a
       (f)lex program), so it is easily confused by some constructs.  Most
       errors result in too much rather than too little output.

       Running LaTeX source without a ``\begin{document}'' through detex may
       produce errors.

       Suggestions for improvements are (mildly) encouraged.

TeX Live                       15 September 2022                      DETEX(1)