docstrip(n)                Literate programming tool               docstrip(n)

______________________________________________________________________________

NAME

       docstrip - Docstrip style source code extraction

SYNOPSIS

       package require Tcl 8.4

       package require docstrip ?1.2?

       docstrip::extract text terminals ?option value ...?

       docstrip::sourcefrom filename terminals ?option value ...?

_________________________________________________________________

DESCRIPTION

       Docstrip is a tool created to support a brand of Literate Programming.
       It is most common in the (La)TeX community, where it is used for
       pretty much everything from the LaTeX core and up, but nothing about
       docstrip prevents using it for other types of software.

       In short, the basic principle of literate programming is that program
       source should primarily be written and structured to suit the
       developers (and advanced users who want to peek "under the hood"), not
       to suit the whims of a compiler or corresponding source code consumer.
       This means literate sources often need some kind of "translation" to
       an illiterate form that dumb software can understand. The docstrip Tcl
       package handles this translation.

       Even for those who do not wholeheartedly subscribe to the philosophy
       behind literate programming, docstrip can bring greater clarity, in
       particular to:

       ·      programs employing non-obvious mathematics

       ·      projects where separate pieces of code, perhaps in different
              languages, need to be closely coordinated.

       It helps with the first by providing access to much more powerful
       typographical features for source code comments than are possible in
       plain text, and with the second because all the separate pieces of
       code can be kept next to each other in the same source file.

       The way it works is that the programmer directly edits only one or
       several "master" source code files, from which docstrip generates the
       more traditional "source" files that compilers or the like would
       expect. The master sources typically contain a large amount of
       documentation of the code, sometimes even in places where the code
       consumers would not allow any comments. The etymology of "docstrip" is
       that this documentation was stripped away (although "code extraction"
       might be a better description, as it has always been a matter of
       copying selected pieces of the master source rather than deleting text
       from it). The docstrip Tcl package contains a reimplementation of the
       basic extraction functionality from the docstrip program, and thus
       makes it possible for a Tcl interpreter to read and interpret the
       master source files directly.

       Readers who are not already familiar with docstrip but want to know
       more about it may consult the following sources.

       [1]    The tclldoc package and class,
              http://tug.org/tex-archive/macros/latex/contrib/tclldoc/.

       [2]    The DocStrip utility,
              http://tug.org/tex-archive/macros/latex/base/docstrip.dtx.

       [3]    The doc and shortvrb Packages,
              http://tug.org/tex-archive/macros/latex/base/doc.dtx.

       [4]    Chapter 14 of The LaTeX Companion (second edition),
              Addison-Wesley, 2004; ISBN 0-201-36299-6.

FILE FORMAT

       The basic units docstrip operates on are the lines of a master source
       file. Extraction consists of selecting some of these lines to be
       copied from input text to output text. The basic distinction is that
       between code lines (which do not begin with a percent character and
       are copied) and comment lines (which begin with a percent character
       and are not copied).

          docstrip::extract [join {
            {% comment}
            {% more comment !"#$%&/(}
            {some command}
            { % blah $blah "Not a comment."}
            {% abc; this is comment}
            {# def; this is code}
            {ghi}
            {% jkl}
          } \n] {}

       returns the same sequence of lines as

          join {
            {some command}
            { % blah $blah "Not a comment."}
            {# def; this is code}
            {ghi} ""
          } \n

       It does not matter to docstrip what format is used for the
       documentation in the comment lines, but in order to do better than
       plain text comments, one typically uses some markup language. Most
       commonly LaTeX is used, as that is a very established standard and
       also provides the best support for mathematical formulae, but the
       docstrip::util package also gives some support for doctools-like
       markup.

       Besides the basic code and comment lines, there are also guard lines,
       which begin with the two characters '%<', and metacomment lines, which
       begin with the two characters '%%'. Among guard lines there is the
       further distinction between verbatim guard lines, which begin with
       '%<<', and ordinary guard lines, where the '%<' is not followed by
       another '<'. The last category is by far the most common.

       An ordinary guard line conditions extraction of the code line(s) it
       guards on the value of a boolean expression; the guarded block of code
       lines will only be included if the expression evaluates to true. The
       syntax of an ordinary guard line is one of

           '%' '<' STARSLASH EXPRESSION '>'
           '%' '<' PLUSMINUS EXPRESSION '>' CODE

       where

           STARSLASH  ::=  '*' | '/'
           PLUSMINUS  ::=  '+' | '-' | (empty)
           EXPRESSION ::= SECONDARY | SECONDARY ',' EXPRESSION
                        | SECONDARY '|' EXPRESSION
           SECONDARY  ::= PRIMARY | PRIMARY '&' SECONDARY
           PRIMARY    ::= TERMINAL | '!' PRIMARY | '(' EXPRESSION ')'
           CODE       ::= { any character except end-of-line }

       Comma and vertical bar both denote 'or'. Ampersand denotes 'and'.
       Exclamation mark denotes 'not'. A TERMINAL can be any nonempty string
       of characters not containing '>', '&', '|', comma, '(', or ')',
       although the docstrip manual is a bit restrictive and only guarantees
       proper operation for strings of letters (even though the LaTeX core
       sources also make heavy use of digits in TERMINALs). The second
       argument of docstrip::extract is the list of those TERMINALs that
       should count as having the value 'true'; all other TERMINALs count as
       being 'false' when guard expressions are evaluated.
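       The grammar above is small enough to evaluate in a few lines of plain
       Tcl. The following is a minimal sketch, not the code used by the
       docstrip package: the procedure name guardValue is hypothetical, and a
       well-formed EXPRESSION is assumed. It replaces every TERMINAL by 1 or
       0 and hands the result to expr, whose '!', '&', and '|' operators have
       the same relative precedence as in the grammar.

```tcl
# Hypothetical helper, not part of the docstrip package: returns 1 if the
# guard EXPRESSION is true for the given list of true TERMINALs, else 0.
proc guardValue {expression terminals} {
    # Comma is a synonym for '|'.
    set expression [string map {, |} $expression]
    set numeric ""
    # Split into single operator characters and maximal runs of other
    # characters; each such run is one TERMINAL.
    foreach token [regexp -all -inline {[()|&!]|[^()|&!]+} $expression] {
        if {[string length $token] == 1 && [string first $token "()|&!"] >= 0} {
            append numeric $token
        } else {
            # A TERMINAL is true exactly when it is listed in $terminals.
            append numeric [expr {[lsearch -exact $terminals $token] >= 0}]
        }
    }
    # $numeric now contains only 0, 1, and operators, so expr can take it.
    return [expr $numeric]
}

puts [guardValue {foo&!bar} {foo}]        ;# prints 1
puts [guardValue {foo,bar} {baz}]         ;# prints 0
puts [guardValue {!(foo|bar)&baz} {baz}]  ;# prints 1
```

       An empty or malformed expression makes expr raise an error in this
       sketch, which a real implementation would have to report as a format
       error instead.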

       In the case of a '%<*EXPRESSION>' guard, the lines guarded are all
       lines up to the next '%</EXPRESSION>' guard with the same EXPRESSION
       (compared as strings). The blocks of code delimited by such '*' and
       '/' guard lines must be properly nested.

          set text [join {
             {begin}
             {%<*foo>}
             {1}
             {%<*bar>}
             {2}
             {%</bar>}
             {%<*!bar>}
             {3}
             {%</!bar>}
             {4}
             {%</foo>}
             {5}
             {%<*bar>}
             {6}
             {%</bar>}
             {end}
          } \n]
          set res [docstrip::extract $text foo]
          append res [docstrip::extract $text {foo bar}]
          append res [docstrip::extract $text bar]

       sets $res to the result of

          join {
             {begin}
             {1}
             {3}
             {4}
             {5}
             {end}
             {begin}
             {1}
             {2}
             {4}
             {5}
             {6}
             {end}
             {begin}
             {5}
             {6}
             {end} ""
          } \n

       In guard lines without a '*', '/', '+', or '-' modifier after the
       '%<', the guard applies only to the CODE following the '>' on that
       single line. A '+' modifier is equivalent to no modifier. A '-'
       modifier is like the case with no modifier, but the expression is
       implicitly negated, i.e., the CODE of a '%<-' guard line is only
       included if the expression evaluates to false.

       Metacomment lines are "comment lines which should not be stripped
       away" but instead be extracted like code lines; these are sometimes
       used for copyright notices and similar material. The '%%' prefix is
       however not kept, but replaced by the current -metaprefix, which is
       customarily set to some "comment until end of line" character (or
       character sequence) of the language of the code being extracted.

          set text [join {
             {begin}
             {%<foo> foo}
             {%<+foo>plusfoo}
             {%<-foo>minusfoo}
             {middle}
             {%% some metacomment}
             {%<*foo>}
             {%%another metacomment}
             {%</foo>}
             {end}
          } \n]
          set res [docstrip::extract $text foo -metaprefix {# }]
          append res [docstrip::extract $text bar -metaprefix {#}]

       sets $res to the result of

          join {
             {begin}
             { foo}
             {plusfoo}
             {middle}
             {#  some metacomment}
             {# another metacomment}
             {end}
             {begin}
             {minusfoo}
             {middle}
             {# some metacomment}
             {end} ""
          } \n

       Verbatim guards can be used to force code line interpretation of a
       block of lines even if some of them happen to look like some other
       type of line to docstrip. A verbatim guard has the form '%<<END-TAG'
       and the verbatim block is terminated by the first line that is exactly
       '%END-TAG'.

          set text [join {
             {begin}
             {%<*myblock>}
             {some stupid()}
             {   #computer<program>}
             {%<<QQQ-98765}
             {% These three lines are copied verbatim (including percents}
             {%% even if -metaprefix is something different than %%).}
             {%</myblock>}
             {%QQQ-98765}
             {   using*strange@programming<language>}
             {%</myblock>}
             {end}
          } \n]
          set res [docstrip::extract $text myblock -metaprefix {# }]
          append res [docstrip::extract $text {}]

       sets $res to the result of

          join {
             {begin}
             {some stupid()}
             {   #computer<program>}
             {% These three lines are copied verbatim (including percents}
             {%% even if -metaprefix is something different than %%).}
             {%</myblock>}
             {   using*strange@programming<language>}
             {end}
             {begin}
             {end} ""
          } \n

       The processing of verbatim guards also takes place inside blocks of
       lines which, due to some outer block guard, will not be copied.

       The final piece of docstrip syntax is that extraction stops at a line
       that is exactly "\endinput"; this is often used to avoid copying
       random whitespace at the end of a file. In the unlikely case that one
       wants such a code line, one can protect it with a verbatim guard.
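       As a small illustration of the last point, the following sketch shows
       extraction stopping at "\endinput", and a verbatim guard protecting
       such a line. It assumes that the docstrip package from Tcllib is
       installed; the end tag STOP and the variable names are arbitrary
       choices made for this example.

```tcl
package require docstrip

# Extraction stops at the first line that is exactly \endinput.
set text [join {
   {kept}
   {\endinput}
   {not copied}
} \n]
set stopped [docstrip::extract $text {}]
# $stopped is expected to hold only the line "kept".

# Inside a verbatim block, \endinput is treated as an ordinary code line,
# so extraction is expected to continue past it.
set text [join {
   {%<<STOP}
   {\endinput}
   {%STOP}
   {copied as well}
} \n]
set protected [docstrip::extract $text {}]

puts -nonewline $stopped
puts -nonewline $protected
```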

COMMANDS

       The package defines two commands.

       docstrip::extract text terminals ?option value ...?
              The extract command docstrips the text and returns the extracted
              lines of code, as a string with each line terminated by a
              newline. The terminals argument is the list of those guard
              expression terminals which should evaluate to true. The
              available options are:

              -annotate lines
                     Requests the specified number of lines of annotation to
                     follow each extracted line in the result. Defaults to 0.
                     Annotation lines are mostly useful when the extracted
                     lines are to undergo some further transformation. A first
                     annotation line is a list of three elements: line type,
                     prefix removed in extraction, and prefix inserted in
                     extraction. The line type is one of: 'V' (verbatim), 'M'
                     (metacomment), '+' (+ or no modifier guard line), '-' (-
                     modifier guard line), '.' (normal line). A second
                     annotation line is the source line number. A third
                     annotation line is the current stack of block guards.
                     Requesting more than three lines of annotation is
                     currently not supported.

              -metaprefix string
                     The string by which the '%%' prefix of a metacomment line
                     will be replaced. Defaults to '%%'. For Tcl code this
                     would typically be '#'.

              -onerror keyword
                     Controls what will be done when a format error in the
                     text being processed is detected. The settings are:

                     ignore Just ignore the error; continue as if nothing
                            happened.

                     puts   Write an error message to stderr, then continue
                            processing.

                     throw  Throw an error. ::errorCode is set to a list whose
                            first element is DOCSTRIP, second element is the
                            type of error, and third element is the line
                            number where the error is detected. This is the
                            default.

              -trimlines boolean
                     Controls whether spaces at the end of a line should be
                     trimmed away before the line is processed. Defaults to
                     true.
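       The options described above can be combined freely. The sketch below
       assumes that the docstrip package from Tcllib is installed; the sample
       texts and variable names are made up for illustration. The first part
       requests two annotation lines per extracted line, and the second part
       traps the error that a mismatched block guard should raise and reads
       the ::errorCode fields.

```tcl
package require docstrip

set text [join {
   {%% A metacomment}
   {%<foo>guarded code}
   {plain code}
} \n]

# With -annotate 2, every extracted line is followed by two annotation
# lines: the {type removed-prefix inserted-prefix} list and the line number.
set result [docstrip::extract $text foo -metaprefix {#} -annotate 2]
# Drop the empty element after the final newline, then group in threes.
foreach {code annotation lineno} [lrange [split $result \n] 0 end-1] {
   puts "line $lineno, type [lindex $annotation 0]: $code"
}

# The closing guard does not match the opening one. This should count as a
# format error, which the default -onerror throw turns into a Tcl error.
set broken [join {
   {%<*foo>}
   {code}
   {%</bar>}
} \n]
set failed [catch {docstrip::extract $broken foo} message]
if {$failed} {
   foreach {tag type line} $::errorCode break
   puts "$tag error of type $type at line $line: $message"
}
```

       Passing -onerror puts or -onerror ignore to the second call would let
       extraction continue past the bad guard instead of raising an error.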

       It should be remarked that the terminals are often called "options" in
       the context of the docstrip program, since these specify which
       optional code fragments should be included.

       docstrip::sourcefrom filename terminals ?option value ...?
              The sourcefrom command is a docstripping emulation of source. It
              opens the file filename, reads it, closes it, docstrips the
              contents as specified by the terminals, and evaluates the result
              in the local context of the caller, during which time the info
              script value will be the filename. The options are passed on to
              fconfigure to configure the file before its contents are read.
              The -metaprefix is set to '#'; all other extract options have
              their default values.
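       A minimal sketch of sourcefrom in use, assuming the Tcllib docstrip
       package is installed and the current directory is writable. The file
       name demo_master.dtx, the terminals pkg and test, and the greet
       procedure are all hypothetical and exist only to make the example
       self-contained.

```tcl
package require docstrip

# Write a tiny master file to the current directory (hypothetical name).
set master [file join [pwd] demo_master.dtx]
set channel [open $master w]
puts $channel [join {
   {% Documentation for the greet command.}
   {%<*pkg>}
   {proc greet {name} {return "Hello, $name!"}}
   {%</pkg>}
   {%<*test>}
   {error "test code must not be loaded"}
   {%</test>}
} \n]
close $channel

# Load only the code guarded by 'pkg'. The trailing option/value pair is
# handed to fconfigure when the file is read.
docstrip::sourcefrom $master {pkg} -encoding utf-8
set greeting [greet world]
puts $greeting

file delete $master
```

       Because 'test' is not among the terminals, the error command in the
       second block is never extracted, let alone evaluated.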

DOCUMENT STRUCTURE

       The file format (as described above) determines whether a master
       source code file can be processed correctly by docstrip, but the
       usefulness of the format also depends to no small extent on the code
       and comment lines together constituting a well-formed document.

       For a document format that does not require any non-Tcl software, see
       the ddt2man command in the docstrip::util package. It is suggested
       that files employing that document format be given the suffix ".ddt",
       to distinguish them from the more traditional LaTeX-based ".dtx"
       files.

       Master source files with the ".dtx" extension are usually set up so
       that they can be typeset directly by latex without any support from
       other files. This is achieved by beginning the file with the lines

          % \iffalse
          %<*driver>
          \documentclass{tclldoc}
          \begin{document}
          \DocInput{filename.dtx}
          \end{document}
          %</driver>
          % \fi

       or some variation thereof. The trick is that the file gets read twice.
       With normal LaTeX reading rules, the first two lines are comments and
       therefore ignored. The third line is the document preamble, the fourth
       line begins the document body, and the sixth line ends the document,
       so LaTeX stops there; non-comments below that point in the file are
       never subjected to the normal LaTeX reading rules. Before that,
       however, the \DocInput command on the fifth line is processed, and
       that does two things: it changes the interpretation of '%' from
       "comment" to "ignored", and it inputs the file specified in the
       argument (which is normally the name of the file the command is in).
       It is during this second reading of the file that the comments and
       code in it are typeset.

       The function of the \iffalse ... \fi is to skip lines two to seven on
       this second time through; this is similar to the "if 0 { ... }" idiom
       for block comments in Tcl code, and it is needed here because (amongst
       other things) the \documentclass command may only be executed once.
       The function of the <driver> guards is to prevent this short piece of
       LaTeX code from being extracted by docstrip. The total effect is that
       the file can function both as a LaTeX document and as a docstrip
       master source code file.

       It is not necessary to use the tclldoc document class, but it does
       provide a number of features that are convenient for ".dtx" files
       containing Tcl code. More information on this matter can be found in
       the references above.
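       The effect of the driver guards on extraction can be checked directly.
       In this sketch (again assuming the Tcllib docstrip package is
       installed) the terminal name pkg and the Tcl code inside its block are
       made up for the example; only the driver preamble is taken from the
       text above.

```tcl
package require docstrip

set master [join {
   {% \iffalse}
   {%<*driver>}
   {\documentclass{tclldoc}}
   {\begin{document}}
   {\DocInput{filename.dtx}}
   {\end{document}}
   {%</driver>}
   {% \fi}
   {% Some documentation of the code below.}
   {%<*pkg>}
   {package provide example 1.0}
   {%</pkg>}
} \n]

# 'driver' is not listed as a terminal, so the LaTeX preamble is skipped
# and only the code in the pkg block is returned.
set code [docstrip::extract $master {pkg}]
puts -nonewline $code
```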

SEE ALSO

       docstrip_util

COPYRIGHT

       Copyright (c) 2003-2005 Lars Hellström <Lars dot Hellstrom at residenset dot net>

docstrip                              1.2                          docstrip(n)