docstrip(n)                Literate programming tool               docstrip(n)

______________________________________________________________________________


## NAME

       docstrip - Docstrip style source code extraction


## SYNOPSIS

       package require Tcl 8.4

       package require docstrip ?1.2?

       docstrip::extract text terminals ?option value ...?

       docstrip::sourcefrom filename terminals ?option value ...?

______________________________________________________________________________


## DESCRIPTION

       Docstrip is a tool created to support a brand of Literate Programming.
       It is most common in the (La)TeX community, where it is used for
       pretty much everything from the LaTeX core and up, but there is nothing
       about docstrip which prevents using it for other types of software.

       In short, the basic principle of literate programming is that program
       source should primarily be written and structured to suit the
       developers (and advanced users who want to peek "under the hood"), not
       to suit the whims of a compiler or corresponding source code consumer.
       This means literate sources often need some kind of "translation" to
       an illiterate form that dumb software can understand. The docstrip Tcl
       package handles this translation.

       Even for those who do not wholeheartedly subscribe to the philosophy
       behind literate programming, docstrip can bring greater clarity, in
       particular to:

       ·      programs employing non-obvious mathematics

       ·      projects where separate pieces of code, perhaps in different
              languages, need to be closely coordinated.

       It helps the first by providing access to much more powerful
       typographical features for source code comments than are possible in
       plain text. It helps the second because all the separate pieces of
       code can be kept next to each other in the same source file.

       The way it works is that the programmer directly edits only one or
       several "master" source code files, from which docstrip generates the
       more traditional "source" files that compilers or the like would
       expect. The master sources typically contain a large amount of
       documentation of the code, sometimes even in places where the code
       consumers would not allow any comments. The etymology of "docstrip" is
       that this documentation was stripped away (although "code extraction"
       might be a better description, as it has always been a matter of
       copying selected pieces of the master source rather than deleting text
       from it). The docstrip Tcl package contains a reimplementation of the
       basic extraction functionality from the docstrip program, and thus
       makes it possible for a Tcl interpreter to read and interpret the
       master source files directly.

       Readers who are not previously familiar with docstrip but want to know
       more about it may consult the following sources.

       [1]    The tclldoc package and class,
              http://ctan.org/tex-archive/macros/latex/contrib/tclldoc/.

       [2]    The DocStrip utility,
              http://ctan.org/tex-archive/macros/latex/base/docstrip.dtx.

       [3]    The doc and shortvrb Packages,
              http://ctan.org/tex-archive/macros/latex/base/doc.dtx.

       [4]    Chapter 14 of The LaTeX Companion (second edition),
              Addison-Wesley, 2004; ISBN 0-201-36299-6.


## FILE FORMAT

       The basic units docstrip operates on are the lines of a master source
       file. Extraction consists of selecting some of these lines to be copied
       from input text to output text. The basic distinction is that between
       code lines (which are copied and do not begin with a percent character)
       and comment lines (which begin with a percent character and are not
       copied).

                 docstrip::extract [join {
                    {% comment}
                    {% more comment !"#$%&/(}
                    {some command}
                    { % blah $blah "Not a comment."}
                    {% abc; this is comment}
                    {# def; this is code}
                    {ghi}
                    {% jkl}
                 } \n] {}

       returns the same sequence of lines as

                 join {
                    {some command}
                    { % blah $blah "Not a comment."}
                    {# def; this is code}
                    {ghi} ""
                 } \n

       It does not matter to docstrip what format is used for the
       documentation in the comment lines, but in order to do better than
       plain text comments, one typically uses some markup language. Most
       commonly LaTeX is used, as that is a very established standard and also
       provides the best support for mathematical formulae, but the
       docstrip::util package also gives some support for doctools-like
       markup.

       Besides the basic code and comment lines, there are also guard lines,
       which begin with the two characters '%<', and meta-comment lines, which
       begin with the two characters '%%'. Within guard lines there is
       furthermore the distinction between verbatim guard lines, which begin
       with '%<<', and ordinary guard lines, where the '%<' is not followed by
       another '<'. The last category is by far the most common.

       Ordinary guard lines condition extraction of the code line(s) they
       guard by the value of a boolean expression; the guarded block of code
       lines will only be included if the expression evaluates to true. The
       syntax of an ordinary guard line is one of

                 '%' '<' STARSLASH EXPRESSION '>'
                 '%' '<' PLUSMINUS EXPRESSION '>' CODE

       where

                 STARSLASH  ::=  '*' | '/'
                 PLUSMINUS  ::=  | '+' | '-'
                 EXPRESSION ::= SECONDARY | SECONDARY ',' EXPRESSION
                              | SECONDARY '|' EXPRESSION
                 SECONDARY  ::= PRIMARY | PRIMARY '&' SECONDARY
                 PRIMARY    ::= TERMINAL | '!' PRIMARY | '(' EXPRESSION ')'
                 CODE       ::= { any character except end-of-line }

       Comma and vertical bar both denote 'or'. Ampersand denotes 'and'.
       Exclamation mark denotes 'not'. A TERMINAL can be any nonempty string
       of characters not containing '>', '&', '|', comma, '(', or ')',
       although the docstrip manual is a bit restrictive and only guarantees
       proper operation for strings of letters (although even the LaTeX core
       sources make heavy use also of digits in TERMINALs). The second
       argument of docstrip::extract is the list of those TERMINALs that
       should count as having the value 'true'; all other TERMINALs count as
       being 'false' when guard expressions are evaluated.

       In the case of a '%<*EXPRESSION>' guard, the lines guarded are all
       lines up to the next '%</EXPRESSION>' guard with the same EXPRESSION
       (compared as strings). The blocks of code delimited by such '*' and '/'
       guard lines must be properly nested.

                 set text [join {
                    {begin}
                    {%<*foo>}
                    {1}
                    {%<*bar>}
                    {2}
                    {%</bar>}
                    {%<*!bar>}
                    {3}
                    {%</!bar>}
                    {4}
                    {%</foo>}
                    {5}
                    {%<*bar>}
                    {6}
                    {%</bar>}
                    {end}
                 } \n]
                 set res [docstrip::extract $text foo]
                 append res [docstrip::extract $text {foo bar}]
                 append res [docstrip::extract $text bar]

       sets $res to the result of

                 join {
                    {begin}
                    {1}
                    {3}
                    {4}
                    {5}
                    {end}
                    {begin}
                    {1}
                    {2}
                    {4}
                    {5}
                    {6}
                    {end}
                    {begin}
                    {5}
                    {6}
                    {end} ""
                 } \n

       In guard lines without a '*', '/', '+', or '-' modifier after the '%<',
       the guard applies only to the CODE following the '>' on that single
       line. A '+' modifier is equivalent to no modifier. A '-' modifier is
       like the case with no modifier, but the expression is implicitly
       negated, i.e., the CODE of a '%<-' guard line is only included if the
       expression evaluates to false.

       Metacomment lines are "comment lines which should not be stripped
       away", but be extracted like code lines; these are sometimes used for
       copyright notices and similar material. The '%%' prefix is however not
       kept, but substituted by the current -metaprefix, which is customarily
       set to some "comment until end of line" character (or character
       sequence) of the language of the code being extracted.

                 set text [join {
                    {begin}
                    {%<foo> foo}
                    {%<+foo>plusfoo}
                    {%<-foo>minusfoo}
                    {middle}
                    {%% some metacomment}
                    {%<*foo>}
                    {%%another metacomment}
                    {%</foo>}
                    {end}
                 } \n]
                 set res [docstrip::extract $text foo -metaprefix {# }]
                 append res [docstrip::extract $text bar -metaprefix {#}]

       sets $res to the result of

                 join {
                    {begin}
                    { foo}
                    {plusfoo}
                    {middle}
                    {#  some metacomment}
                    {# another metacomment}
                    {end}
                    {begin}
                    {minusfoo}
                    {middle}
                    {# some metacomment}
                    {end} ""
                 } \n

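       The guard expression operators described above can be tried out in
       isolation. The following is a minimal sketch, assuming the docstrip
       package from tcllib is installed; the terminal names a and b and the
       code words on each line are made up for this illustration.

```tcl
package require docstrip

# Hypothetical master text: four single-line guards over terminals a and b.
set text [join {
    {%<a&b>both}
    {%<a|b>either}
    {%<a,b>comma}
    {%<!a>not-a}
} \n]

# Only the terminal "a" counts as true; "b" is false.
set result [docstrip::extract $text {a}]
puts -nonewline $result
```

       With a as the only true terminal, the 'a&b' and '!a' guards are false,
       so only the lines "either" and "comma" should be extracted.
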
       Verbatim guards can be used to force code line interpretation of a
       block of lines even if some of them happen to look like any other type
       of lines to docstrip. A verbatim guard has the form '%<<END-TAG' and
       the verbatim block is terminated by the first line that is exactly
       '%END-TAG'.

                 set text [join {
                    {begin}
                    {%<*myblock>}
                    {some stupid()}
                    {   #computer<program>}
                    {%<<QQQ-98765}
                    {% These three lines are copied verbatim (including percents}
                    {%% even if -metaprefix is something different than %%).}
                    {%</myblock>}
                    {%QQQ-98765}
                    {   using*strange@programming<language>}
                    {%</myblock>}
                    {end}
                 } \n]
                 set res [docstrip::extract $text myblock -metaprefix {# }]
                 append res [docstrip::extract $text {}]

       sets $res to the result of

                 join {
                    {begin}
                    {some stupid()}
                    {   #computer<program>}
                    {% These three lines are copied verbatim (including percents}
                    {%% even if -metaprefix is something different than %%).}
                    {%</myblock>}
                    {   using*strange@programming<language>}
                    {end}
                    {begin}
                    {end} ""
                 } \n

       The processing of verbatim guards takes place also inside blocks of
       lines which due to some outer block guard will not be copied.

       The final piece of docstrip syntax is that extraction stops at a line
       that is exactly "\endinput"; this is often used to avoid copying random
       whitespace at the end of a file. In the unlikely case that one wants
       such a code line, one can protect it with a verbatim guard.

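       The interaction between "\endinput" and verbatim guards can be
       sketched as follows. This is an illustration only, assuming the
       docstrip package from tcllib is installed; the end tag STOP and the
       line contents are made up.

```tcl
package require docstrip

# Hypothetical master text: the first \endinput is protected by a verbatim
# guard and is copied as code; the second one stops the extraction.
set text [join {
    {first}
    {%<<STOP}
    {\endinput}
    {%STOP}
    {second}
    {\endinput}
    {never reached}
} \n]

set result [docstrip::extract $text {}]
puts -nonewline $result
```

       The extracted text should consist of the lines "first", "\endinput"
       and "second"; the line after the unprotected "\endinput" is not
       copied.
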


## COMMANDS

       The package defines two commands.

       docstrip::extract text terminals ?option value ...?
              The extract command docstrips the text and returns the extracted
              lines of code, as a string with each line terminated with a
              newline. The terminals is the list of those guard expression
              terminals which should evaluate to true. The available options
              are:

              -annotate lines
                     Requests the specified number of lines of annotation to
                     follow each extracted line in the result. Defaults to 0.
                     Annotation lines are mostly useful when the extracted
                     lines are to undergo some further transformation. A first
                     annotation line is a list of three elements: line type,
                     prefix removed in extraction, and prefix inserted in
                     extraction. The line type is one of: 'V' (verbatim), 'M'
                     (metacomment), '+' (+ or no modifier guard line), '-' (-
                     modifier guard line), '.' (normal line). A second
                     annotation line is the source line number. A third
                     annotation line is the current stack of block guards.
                     Requesting more than three lines of annotation is
                     currently not supported.

              -metaprefix string
                     The string by which the '%%' prefix of a metacomment line
                     will be replaced. Defaults to '%%'. For Tcl code this
                     would typically be '#'.

              -onerror keyword
                     Controls what will be done when a format error in the
                     text being processed is detected. The settings are:

                     ignore Just ignore the error; continue as if nothing
                            happened.

                     puts   Write an error message to stderr, then continue
                            processing.

                     throw  Throw an error. The -errorcode is set to a list
                            whose first element is DOCSTRIP, second element is
                            the type of error, and third element is the line
                            number where the error is detected. This is the
                            default.

              -trimlines boolean
                     Controls whether spaces at the end of a line should be
                     trimmed away before the line is processed. Defaults to
                     true.

              It should be remarked that the terminals are often called
              "options" in the context of the docstrip program, since these
              specify which optional code fragments should be included.

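       The -annotate and -onerror options described above might be used
       along the following lines. This is a sketch rather than a reference:
       it assumes the docstrip package from tcllib is installed, the sample
       texts and variable names are made up, and it assumes that an
       end-of-block guard without a matching start counts as a format error.

```tcl
package require docstrip

# -annotate 2: each extracted line is followed by two annotation lines,
# the type/prefix list and the source line number.
set text [join {
    {alpha}
    {% a comment line}
    {beta}
} \n]
set annotated [docstrip::extract $text {} -annotate 2]
set linenos {}
# The result ends with a newline, so drop the empty element after it.
foreach {codeline annot lineno} [lrange [split $annotated \n] 0 end-1] {
    lappend linenos $lineno
    puts "source line $lineno ($annot): $codeline"
}

# -onerror throw (the default): a format error becomes a Tcl error whose
# errorCode is a list starting with DOCSTRIP.
set broken [join {
    {code}
    {%</nosuchblock>}
} \n]
set failed [catch {docstrip::extract $broken {}} message]
if {$failed} {
    set errcode $::errorCode
    puts "docstrip error: $message (errorCode: $errcode)"
}
```

       Reading ::errorCode rather than the options dictionary of catch keeps
       the sketch usable on Tcl 8.4, which is the minimum version the package
       requires.
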
       docstrip::sourcefrom filename terminals ?option value ...?
              The sourcefrom command is a docstripping emulation of source. It
              opens the file filename, reads it, closes it, docstrips the
              contents as specified by the terminals, and evaluates the result
              in the local context of the caller, during which time the info
              script value will be the filename. The options are passed on to
              fconfigure to configure the file before its contents are read.
              The -metaprefix is set to '#', all other extract options have
              their default values.

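       A possible use of sourcefrom is sketched below. The file name
       demo_master.dtx, the procedure demo_greet and the terminals pkg and
       test are hypothetical; the sketch assumes the docstrip package from
       tcllib is installed and that the current directory is writable.

```tcl
package require docstrip

# Write a tiny master file; its name is made up for this example.
set path [file join [pwd] demo_master.dtx]
set chan [open $path w]
puts $chan [join {
    {% Documentation for the demo procedure.}
    {%<*pkg>}
    {proc demo_greet {name} {return "Hello, $name"}}
    {%</pkg>}
    {%<*test>}
    {error "test code must not be evaluated here"}
    {%</test>}
} \n]
close $chan

# Evaluate only the code guarded by the "pkg" terminal. The -encoding
# option is handed on to fconfigure before the file is read.
docstrip::sourcefrom $path {pkg} -encoding utf-8
set greeting [demo_greet world]
puts $greeting

file delete $path
```

       Because the test terminal is not listed, the block that raises an
       error is never evaluated, and only demo_greet is defined in the
       caller's context.
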


## DOCUMENT STRUCTURE

       The file format (as described above) determines whether a master source
       code file can be processed correctly by docstrip, but the usefulness of
       the format also depends in no small part on the code and comment lines
       together constituting a well-formed document.

       For a document format that does not require any non-Tcl software, see
       the ddt2man command in the docstrip::util package. It is suggested that
       files employing that document format are given the suffix ".ddt", to
       distinguish them from the more traditional LaTeX-based ".dtx" files.

       Master source files with ".dtx" extension are usually set up so that
       they can be typeset directly by latex without any support from other
       files. This is achieved by beginning the file with the lines

                 % \iffalse
                 %<*driver>
                 \documentclass{tclldoc}
                 \begin{document}
                 \DocInput{filename.dtx}
                 \end{document}
                 %</driver>
                 % \fi

       or some variation thereof. The trick is that the file gets read twice.
       With normal LaTeX reading rules, the first two lines are comments and
       therefore ignored. The third line is the document preamble, the fourth
       line begins the document body, and the sixth line ends the document, so
       LaTeX stops there; non-comments below that point in the file are never
       subjected to the normal LaTeX reading rules. Before that, however, the
       \DocInput command on the fifth line is processed, and that does two
       things: it changes the interpretation of '%' from "comment" to
       "ignored", and it inputs the file specified in the argument (which is
       normally the name of the file the command is in). It is during this
       second reading of the file that the comments and code in it are
       typeset.

       The function of the \iffalse ... \fi is to skip lines two to seven on
       this second time through; this is similar to the "if 0 { ... }" idiom
       for block comments in Tcl code, and it is needed here because (amongst
       other things) the \documentclass command may only be executed once. The
       function of the <driver> guards is to prevent this short piece of LaTeX
       code from being extracted by docstrip. The total effect is that the
       file can function both as a LaTeX document and as a docstrip master
       source code file.

       It is not necessary to use the tclldoc document class, but that does
       provide a number of features that are convenient for ".dtx" files
       containing Tcl code. More information on this matter can be found in
       the references above.

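       The effect of the <driver> guards on extraction can be checked with a
       miniature master text. The following is only a sketch, assuming the
       docstrip package from tcllib is installed; the terminal pkg and the
       procedure demo_answer are made up for the illustration.

```tcl
package require docstrip

# A miniature ".dtx"-style master: LaTeX driver first, then guarded Tcl code.
set master [join {
    {% \iffalse}
    {%<*driver>}
    {\documentclass{tclldoc}}
    {\begin{document}}
    {\DocInput{filename.dtx}}
    {\end{document}}
    {%</driver>}
    {% \fi}
    {% Documentation of the procedure, to be typeset by LaTeX.}
    {%<*pkg>}
    {proc demo_answer {} {return 42}}
    {%</pkg>}
} \n]

# With only "pkg" true, the driver block is skipped entirely.
set code [docstrip::extract $master {pkg}]
# With only "driver" true, the four LaTeX lines are extracted instead.
set driver [docstrip::extract $master {driver}]
puts -nonewline $code
```

       Extracting with the pkg terminal should yield just the proc
       definition, while the driver terminal yields the LaTeX driver lines,
       so the two uses of the file do not interfere with each other.
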


## SEE ALSO

       docstrip_util


## KEYWORDS

       .dtx, LaTeX, docstrip, documentation, literate programming, source


## CATEGORY

       Documentation tools

## COPYRIGHT

       Copyright (c) 2003–2010 Lars Hellström <Lars dot Hellstrom at residenset dot net>

tcllib                                1.2                          docstrip(n)