GPP(1)                  General Commands Manual                  GPP(1)

GPP - Generic Preprocessor

gpp [-{o|O} outfile] [-I/include/path] [-Dname=val ...]
    [-z|+z] [-x] [-m] [-C|-T|-H|-X|-P|-U ... [-M ...]]
    [-n|+n] [+c<n> str1 str2] [+s<n> str1 str2 c]
    [-c str1] [--nostdinc] [--nocurinc]
    [--curdirinclast] [--warninglevel n]
    [--includemarker str] [--include file]
    [infile]

gpp --help

gpp --version
21
22
24 GPP is a general-purpose preprocessor with customizable syntax, suit‐
25 able for a wide range of preprocessing tasks. Its independence from any
26 programming language makes it much more versatile than cpp, while its
27 syntax is lighter and more flexible than that of m4.
28
29
GPP is targeted at common preprocessing tasks for which cpp is not
suitable and which need no very sophisticated features. So that it can
process text files and source code in a variety of languages equally
well, the syntax used by GPP is fully customizable. The handling of
comments and strings is especially advanced.
35
36
37 Initially, GPP only understands a minimal set of built-in macros,
38 called meta-macros. These meta-macros allow the definition of user
39 macros as well as some basic operations forming the core of the prepro‐
40 cessing system, including conditional tests, arithmetic evaluation,
41 wildcard matching (globbing), and syntax specification. All user macro
42 definitions are global -- i.e., they remain valid until explicitly
43 removed; meta-macros cannot be redefined. With each user macro defini‐
44 tion GPP keeps track of the corresponding syntax specification so that
45 a macro can be safely invoked regardless of any subsequent change in
46 operating mode.
47
48
49 In addition to macros, GPP understands comments and strings, whose syn‐
50 tax and behavior can be widely customized to fit any particular pur‐
51 pose. Internally comments and strings are the same construction, so
52 everything that applies to comments applies to strings as well.
53
54
56 GPP recognizes the following command-line switches and options. Note
57 that the -nostdinc, -nocurinc, -curdirinclast, -warninglevel, and
58 -includemarker options from version 2.1 and earlier are deprecated and
59 should not be used. Use the "long option" variants instead (--nostd‐
60 inc, etc.).
61
62 -h --help
63 Print a short help message.
64
65 --version
66 Print version information.
67
68 -o outfile
69 Specify a file to which all output should be sent (by default,
70 everything is sent to standard output).
71
72 -O outfile
Specify a file to which all output should be sent; output is
simultaneously sent to stdout.
75
76 -I/include/path
77 Specify a path where the #include meta-macro will look for
78 include files if they are not present in the current directory.
79 The default is /usr/include if no -I option is specified. Multi‐
80 ple -I options may be specified to look in several directories.
81
82 -Dname=val
83 Define the user macro name as equal to val. This is strictly
84 equivalent to using the #define meta-macro, but makes it possi‐
85 ble to define macros from the command-line. If val makes refer‐
86 ences to arguments or other macros, it should conform to the
87 syntax of the mode specified on the command-line. Starting with
88 version 2.1, macro argument naming is allowed on the command-
89 line. The syntax is as follows: -Dmacro(arg1,...)=definition.
90 The arguments are specified in C-style syntax, without any
91 whitespace, but the definition should still conform to the syn‐
92 tax of the mode specified on the command-line.
93
94 +z Set text mode to Unix mode (LF terminator). Any CR character in
95 the input is systematically discarded. This is the default under
96 Unix systems.
97
98 -z Set text mode to DOS mode (CR-LF terminator). In this mode all
99 CR characters are removed from the input, and all output LF
100 characters are converted to CR-LF. This is the default if GPP is
101 compiled with the WIN_NT option.
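
The two text modes can be pictured with a short Python sketch. It only
mirrors the behavior described for +z and -z above; the function names
read_text and write_text are hypothetical and are not part of GPP.

```python
def read_text(raw: str) -> str:
    """Discard every CR on input, as described for both +z and -z."""
    return raw.replace("\r", "")


def write_text(text: str, dos_mode: bool) -> str:
    """Convert LF to CR-LF on output only when DOS mode (-z) is active."""
    return text.replace("\n", "\r\n") if dos_mode else text


source = "line one\r\nline two\n"
print(repr(write_text(read_text(source), dos_mode=False)))
print(repr(write_text(read_text(source), dos_mode=True)))
```

In this model the input side is identical in both modes; only the
output conversion differs.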
102
103 -x Enable the use of the #exec meta-macro. Since #exec includes the
104 output of an arbitrary shell command line, it may cause a poten‐
105 tial security threat, and is thus disabled unless this option is
106 specified.
107
108 -m Enable automatic mode switching to the cpp compatibility mode if
109 the name of an included file ends in `.h' or `.c'. This makes it
110 possible to include C header files with only minor modifica‐
111 tions.
112
113 -n Prevent newline or whitespace characters from being removed from
114 the input when they occur as the end of a macro call or of a
115 comment. By default, when a newline or whitespace character
116 forms the end of a macro or a comment it is parsed as part of
117 the macro call or comment and therefore removed from output. Use
118 the -n option to keep the last character in the input stream if
119 it was whitespace or a newline. This is activated in cpp and
120 Prolog modes.
121
122 +n The opposite of -n. This is the default in all modes except cpp
123 and Prolog. Note that +n must be placed after -C or -P in order
124 to have any effect.
125
126 -U arg1 ... arg9
127 User-defined mode. The nine following command-line arguments are
128 taken to be respectively the macro start sequence, the macro end
129 sequence for a call without arguments, the argument start
130 sequence, the argument separator, the argument end sequence, the
131 list of characters to stack for argument balancing, the list of
132 characters to unstack, the string to be used for referring to an
133 argument by number, and finally the quote character (if there is
134 none an empty string should be provided). These settings apply
135 both to user macros and to meta-macros, unless the -M option is
136 used to define other settings for meta-macros. See the section
137 on syntax specification for more details.
138
139 -M arg1 ... arg7
User-defined mode specifications for meta-macros. This option
can only be used together with -U. The seven following
command-line arguments are taken to be, respectively, the macro
start sequence, the macro end sequence for a call without
arguments, the argument start sequence, the argument separator,
the argument end sequence, the list of characters to stack for
argument balancing, and the list of characters to unstack. See
below for more details.
148
149 (default mode)
150 The default mode is a vaguely cpp-like mode, but it does not
151 handle comments, and presents various incompatibilities with
152 cpp. Typical meta-macros and user macros look like this:
153
    #define x y
    macro(arg,...)
156
157 This mode is equivalent to
158
    -U "" "" "(" "," ")" "(" ")" "#" "\\"
    -M "#" "\n" " " " " "\n" "(" ")"
161
162
163 -C cpp compatibility mode. This is the mode where GPP's behavior is
164 the closest to that of cpp. Unlike in the default mode, meta-
165 macro expansion occurs only at the beginning of lines, and C
166 comments and strings are understood. This mode is equivalent to
167
    -n -U "" "" "(" "," ")" "(" ")" "#" ""
    -M "\n#\w" "\n" " " " " "\n" "" ""
    +c "/*" "*/" +c "//" "\n" +c "\\\n" ""
    +s "\"" "\"" "\\" +s "'" "'" "\\"
172
173
174 -T TeX-like mode. In this mode, typical meta-macros and user macros
175 look like this:
176
    \define{x}{y}
    \macro{arg}{...}
179
180 No comments are understood. This mode is equivalent to
181
    -U "\\" "" "{" "}{" "}" "{" "}" "#" "@"
183
184
185 -H HTML-like mode. In this mode, typical meta-macros and user
186 macros look like this:
187
    <#define x|y>
    <#macro arg|...>
190
191 No comments are understood. This mode is equivalent to
192
    -U "<#" ">" "\B" "|" ">" "<" ">" "#" "\\"
194
195
196 -X XHTML-like mode. In this mode, typical meta-macros and user
197 macros look like this:
198
    <#define x|y/>
    <#macro arg|.../>
201
202 No comments are understood. This mode is equivalent to
203
    -U "<#" "/>" "\B" "|" "/>" "<" ">" "#" "\\"
205
206
207 -P Prolog-compatible cpp-like mode. This mode differs from the cpp
208 compatibility mode by its handling of comments, and is equiva‐
209 lent to
210
    -n -U "" "" "(" "," ")" "(" ")" "#" ""
    -M "\n#\w" "\n" " " " " "\n" "" ""
    +ccss "\!o/*" "*/" +ccss "%" "\n" +ccii "\\\n" ""
    +s "\"" "\"" "" +s "\!#'" "'" ""
215
216
217 +c<n> str1 str2
218 Specify comments. Any unquoted occurrence of str1 will be inter‐
219 preted as the beginning of a comment. All input up to the first
220 following occurrence of str2 will be discarded. This option may
221 be used multiple times to specify different types of comment
222 delimiters. The optional parameter <n> can be specified to alter
223 the behavior of the comment and, e.g., turn it into a string or
224 make it ignored under certain circumstances, see below.
225
226 -c str1
227 Un-specify comments or strings. The comment/string specification
228 whose start sequence is str1 is removed. This is useful to alter
229 the built-in comment specifications of a standard mode -- e.g.,
230 the cpp compatibility mode.
231
232 +s<n> str1 str2 c
233 Specify strings. Any unquoted occurrence of str1 will be inter‐
234 preted as the beginning of a string. All input up to the first
235 following occurrence of str2 will be output as is without any
236 evaluation. The delimiters themselves are output. If c is non-
237 empty, its first character is used as a string-quote character
238 -- i.e., a character whose presence immediately before an occur‐
239 rence of str2 prevents it from terminating the string. The
240 optional parameter <n> can be specified to alter the behavior of
241 the string and, e.g., turn it into a comment, enable macro eval‐
242 uation inside the string, or make the string specification
243 ignored under certain circumstances. See below.
244
245 -s str1
246 Un-specify comments or strings. Identical to -c.
247
248 --include file
Process file before infile.
250
251 --nostdinc
252 Do not look for include files in the standard directory
253 /usr/include.
254
255 --nocurinc
256 Do not look for include files in the current directory.
257
258 --curdirinclast
259 Look for include files in the current directory after the direc‐
260 tories specified by -I rather than before them.
261
262 --warninglevel n
263 Set warning level to n (0, 1 or 2). Default is 2 (most verbose).
264
265 --includemarker str
Keep track of #include directives by inserting a marker in the
output stream. The format of the marker is determined by str,
which must contain three occurrences of the character % (or
equivalently ?). The first occurrence is replaced with the line
number, the second with the file name, and the third with 1, 2
or blank. When this option is specified in default, cpp or
Prolog mode, GPP does its best to ensure that line numbers are
the same in the output as in the input by inserting blank lines
in the place of definitions or comments.
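
The placeholder substitution described for --includemarker can be
sketched in Python as follows. This is only an illustration of the
rule "three occurrences of % or ?, filled in order"; the function name
format_marker and the sample marker strings are hypothetical, and how
GPP itself treats extra placeholders is not covered here.

```python
def format_marker(template: str, line: int, filename: str, flag: str = "") -> str:
    """Fill the placeholders (% or ?) in order: line number, file name, flag."""
    values = iter([str(line), filename, flag])
    out = []
    for ch in template:
        if ch in "%?":
            # Assumption: placeholders beyond the third are left untouched.
            out.append(next(values, ch))
        else:
            out.append(ch)
    return "".join(out)


print(format_marker('#line % "%" %', 12, "defs.inc", "1"))
```

With the sample arguments the call prints `#line 12 "defs.inc" 1`.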
275
276 infile Specify an input file from which GPP reads its input. If no
277 input file is specified, input is read from standard input.
278
279
280
282 The syntax of a macro call is as follows: it must start with a sequence
283 of characters matching the macro start sequence as specified in the
284 current mode, followed immediately by the name of the macro, which must
285 be a valid identifier -- i.e., a sequence of letters, digits, or under‐
286 scores ("_"). The macro name must be followed by a short macro end
287 sequence if the macro has no arguments, or by a sequence of arguments
288 initiated by an argument start sequence. The various arguments are then
289 separated by an argument separator, and the macro ends with a long
290 macro end sequence.
291
292
293 In all cases, the parameters of the current context -- i.e., the argu‐
294 ments passed to the body being evaluated -- can be referred to by using
295 an argument reference sequence followed by a digit between 1 and 9.
296 Alternatively, macro parameters may be named (see below). Furthermore,
297 to avoid interference between the GPP syntax and the contents of the
298 input file, a quote character is provided. The quote character can be
299 used to prevent the interpretation of a macro call, comment, or string
300 as anything but plain text. The quote character "protects" the follow‐
301 ing character, and always gets removed during evaluation. Two consecu‐
302 tive quote characters evaluate as a single quote character.
303
304
305 Finally, to facilitate proper argument delimitation, certain characters
306 can be "stacked" when they occur in a macro argument, so that the argu‐
307 ment separator or macro end sequence are not parsed if the argument
308 body is not balanced. This allows nesting macro calls without using
309 quotes. If an improperly balanced argument is needed, quote characters
310 should be added in front of some stacked characters to make it bal‐
311 anced.
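
The effect of stacking on argument delimitation can be illustrated
with a small Python sketch. It assumes the default user-macro settings
(separator ",", stacked character "(", unstacked character ")", quote
character "\") and simplifies matters by dropping the quote character
while splitting, whereas GPP removes it during evaluation. The name
split_args is hypothetical.

```python
def split_args(body: str, sep: str = ",", push: str = "(", pop: str = ")",
               quote: str = "\\") -> list:
    """Split an argument list on `sep`, ignoring separators that are nested
    inside stacked characters or protected by the quote character."""
    args, current, depth, i = [], [], 0, 0
    while i < len(body):
        ch = body[i]
        if quote and ch == quote and i + 1 < len(body):
            # The quote character protects the next character.
            current.append(body[i + 1])
            i += 2
            continue
        if ch in push:
            depth += 1
        elif ch in pop and depth > 0:
            depth -= 1
        elif ch == sep and depth == 0:
            # A separator only counts when the argument is balanced.
            args.append("".join(current))
            current = []
            i += 1
            continue
        current.append(ch)
        i += 1
    args.append("".join(current))
    return args


print(split_args("f(a,b),c"))    # nested call stays in one argument
print(split_args("a\\(b,c"))     # quoted parenthesis is not stacked
```

The first call shows why nested macro calls need no quoting; the second
shows a quote character making an unbalanced argument acceptable.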
312
313
314 The macro construction sequences described above can be different for
315 meta-macros and for user macros: this is the case in cpp mode, for
316 example. Note that, since meta-macros can only have up to two argu‐
317 ments, the delimitation rules for the second argument are somewhat
318 sloppier, and unquoted argument separator sequences are allowed in the
319 second argument of a meta-macro.
320
321
322 Unless one of the standard operating modes is selected, the above syn‐
323 tax sequences can be specified either on the command-line, using the -M
324 and -U options respectively for meta-macros and user macros, or inside
325 an input file via the #mode meta and #mode user meta-macro calls. In
326 both cases the mode description consists of nine parameters for user
327 macro specifications, namely the macro start sequence, the short macro
328 end sequence, the argument start sequence, the argument separator, the
329 long macro end sequence, the string listing characters to stack, the
330 string listing characters to unstack, the argument reference sequence,
331 and finally the quote character. As explained below, these sequences
332 should be supplied using the syntax of C strings; they must start with
333 a non-alphanumeric character, and in the first five strings special
334 matching sequences can be used (see below). If the argument correspond‐
335 ing to the quote character is the empty string, that argument's func‐
336 tionality is disabled. For meta-macro specifications there are only
337 seven parameters, as the argument reference sequence and quote charac‐
338 ter are shared with the user macro syntax.
339
340
341 The structure of a comment/string is as follows: it must start with a
342 sequence of characters matching the given comment/string start
343 sequence, and always ends at the first occurrence of the comment/string
344 end sequence, unless it is preceded by an odd number of occurrences of
345 the string-quote character (if such a character has been specified).
346 In certain cases comment/strings can be specified to enable macro eval‐
347 uation inside the comment/string; in that case, if a quote character
348 has been defined for macros it can be used as well to prevent the com‐
349 ment/string from ending, with the difference that the macro quote char‐
350 acter is always removed from output whereas the string-quote character
351 is always output. Also note that under certain circumstances a com‐
352 ment/string specification can be disabled, in which case the com‐
353 ment/string start sequence is simply ignored. Finally, it is possible
354 to specify a string warning character whose presence inside a com‐
355 ment/string will cause GPP to output a warning (this is useful to
356 locate unterminated strings in cpp mode). Note that input files are
357 not allowed to contain unterminated comments/strings.
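
The termination rule (the first end sequence not preceded by an odd
number of string-quote characters) can be sketched in Python. The
function name find_end is hypothetical, and the sketch ignores macro
evaluation and the string warning character.

```python
def find_end(text: str, start: int, end_seq: str, string_quote: str = "") -> int:
    """Return the index of the end sequence that terminates a comment/string
    whose body starts at `start`, or -1 if it is unterminated."""
    pos = text.find(end_seq, start)
    while pos != -1:
        # Count the run of string-quote characters directly before the match.
        run = 0
        while (string_quote and pos - run - 1 >= start
               and text[pos - run - 1] == string_quote):
            run += 1
        if run % 2 == 0:
            return pos            # even run: the end sequence is effective
        pos = text.find(end_seq, pos + 1)
    return -1


sample = r'"a\"b" tail'
print(find_end(sample, 1, '"', "\\"))
```

For the sample above the first closing quote is protected by a single
backslash, so the string ends at the second one (index 5).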
358
359
360 A comment/string specification can be declared from within the input
361 file using the #mode comment meta-macro call (or equivalently #mode
362 string), in which case the number of C strings to be given as arguments
363 to describe the comment/string can be anywhere between two and four:
364 the first two arguments (mandatory) are the start sequence and the end
365 sequence, and can make use of the special matching sequences (see
366 below). They may not start with alphanumeric characters. The first
367 character of the third argument, if there is one, is used as the
368 string-quote character (use an empty string to disable the functional‐
369 ity), and the first character of the fourth argument, if there is one,
370 is used as the string-warning character. A specification may also be
371 given from the command-line, in which case there must be two arguments
372 if using the +c option and three if using the +s option.
373
374
375 The behavior of a comment/string is specified by a three-character mod‐
376 ifier string, which may be passed as an optional argument either to the
377 +c/+s command-line options or to the #mode comment/#mode string meta-
378 macros. If no modifier string is specified, the default value is "ccc"
379 for comments and "sss" for strings. The first character corresponds to
380 the behavior inside meta-macro calls (including user-macro definitions
381 since these come inside a #define meta-macro call), the second charac‐
382 ter corresponds to the behavior inside user-macro parameters, and the
383 third character corresponds to the behavior outside of any macro call.
384 Each of these characters can take the following values:
385
386
387 i disable the comment/string specification.
388
389 c comment (neither evaluated nor output).
390
391 s string (the string and its delimiter sequences are output as-
392 is).
393
394 q quoted string (the string is output as-is, without the delimiter
395 sequences).
396
397 C evaluated comment (macros are evaluated, but output is dis‐
398 carded).
399
400 S evaluated string (macros are evaluated, delimiters are output).
401
402 Q evaluated quoted string (macros are evaluated, delimiters are
403 not output).
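
As a reading aid, the modifier table above can be written down as a
small Python lookup. The tuple layout (macros evaluated, body output,
delimiters output) and the names MODIFIERS, CONTEXTS and describe are
choices made for this sketch only.

```python
# (macros evaluated?, body output?, delimiters output?) for each modifier.
MODIFIERS = {
    "i": None,                     # specification disabled in this context
    "c": (False, False, False),
    "s": (False, True, True),
    "q": (False, True, False),
    "C": (True, False, False),
    "S": (True, True, True),
    "Q": (True, True, False),
}
CONTEXTS = ("meta-macro call", "user-macro parameter", "outside any macro")


def describe(modifier: str) -> dict:
    """Map each of the three contexts to the behavior its character selects."""
    if len(modifier) != 3 or any(ch not in MODIFIERS for ch in modifier):
        raise ValueError(f"bad modifier string: {modifier!r}")
    return dict(zip(CONTEXTS, (MODIFIERS[ch] for ch in modifier)))


print(describe("css"))
```

For example, "css" (as in the Prolog mode option +ccss) describes a
comment inside meta-macro calls that behaves as a string elsewhere.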
404
405
406 Important note: any occurrence of a comment/string start sequence
407 inside another comment/string is always ignored, even if macro evalua‐
408 tion is enabled. In other words, comments/strings cannot be nested. In
409 particular, the `Q' modifier can be a convenient way of defining a syn‐
410 tax for temporarily disabling all comment and string specifications.
411
412
413 Syntax specification strings should always be provided as C strings,
414 whether they are given as arguments to a #mode meta-macro call or on
415 the command-line of a Unix shell. If command-line arguments are given
416 via another method than a standard Unix shell, then the shell behavior
417 must be emulated -- i.e., the surrounding "" quotes should be removed,
418 all occurrences of `\\' should be replaced by a single backslash, and
419 similarly `\"' should be replaced by `"'. Sequences like `\n' are rec‐
420 ognized by GPP and should be left as is.
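
A caller that builds the argument list without a Unix shell could
emulate the quoting rules above along these lines. This Python sketch
covers only the three rules stated in the paragraph; the function name
emulate_shell_quoting is hypothetical.

```python
def emulate_shell_quoting(arg: str) -> str:
    """Turn a double-quoted argument, as it would be typed in a Unix shell,
    into the string that would be handed to GPP."""
    if len(arg) >= 2 and arg[0] == '"' and arg[-1] == '"':
        arg = arg[1:-1]                  # remove the surrounding "" quotes
    out, i = [], 0
    while i < len(arg):
        if arg[i] == "\\" and i + 1 < len(arg) and arg[i + 1] in '\\"':
            out.append(arg[i + 1])       # \\ becomes \ and \" becomes "
            i += 2
        else:
            out.append(arg[i])           # anything else, e.g. \n, is kept as is
            i += 1
    return "".join(out)


print(emulate_shell_quoting(r'"\n#\w"'))
```

Sequences such as \n and \w pass through unchanged, so GPP still sees
them and interprets them itself.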
421
422
423 Special sequences matching certain subsets of the character set can be
424 used. They are of the form `\x', where x is one of:
425
426
427 b matches any sequence of one or more spaces or tab characters
428 (`\b' is identical to ` ').
429
430 w matches any sequence of zero or more spaces or tab characters.
431
432 B matches any sequence of one or more spaces, tabs or newline
433 characters.
434
435 W matches any sequence of zero or more spaces, tabs or newline
436 characters.
437
438 a an alphabetic character (`a' to `z' and `A' to `Z').
439
440 A an alphabetic character, or a space, tab or newline.
441
442 # a digit (`0' to `9').
443
444 i an identifier character. The set of matched characters is cus‐
445 tomizable using the #mode charset id command. The default set‐
446 ting matches alphanumeric characters and underscores (`a' to
447 `z', `A' to `Z', `0' to `9' and `_').
448
449 t a tab character.
450
451 n a newline character.
452
453 o an operator character. The set of matched characters is custom‐
454 izable using the #mode charset op command. The default setting
455 matches all characters in "+-*/\^<>=`~:.?@#&!%|", except in Pro‐
456 log mode where `!', `%' and `|' are not matched.
457
458 O an operator character or a parenthesis character. The set of
459 additional matched characters in comparison with `\o' is custom‐
460 izable using the #mode charset par command. The default setting
461 is to have the characters in "()[]{}" as parentheses.
462
463
464 Moreover, all of these matching subsets except `\w' and `\W' can be
465 negated by inserting a `!' -- i.e., by writing `\!x' instead of `\x'.
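
To get a feel for what these sequences match, the following Python
sketch translates a subset of them into regular expressions. It is an
approximation under several assumptions: only the default character
sets are used, the operator classes \o and \O are left out, negation is
modelled only for the single-character classes, and the context-check
rule for start sequences is ignored. The names SINGLE, RUNS and
to_regex are hypothetical.

```python
import re

# Single-character classes, written as regex character-class bodies.
SINGLE = {
    "a": "A-Za-z",
    "A": "A-Za-z \\t\\n",
    "#": "0-9",
    "i": "A-Za-z0-9_",
    "t": "\\t",
    "n": "\\n",
}
# Sequences that match a run of whitespace.
RUNS = {
    "b": "[ \\t]+",
    "w": "[ \\t]*",
    "B": "[ \\t\\n]+",
    "W": "[ \\t\\n]*",
}


def to_regex(spec: str) -> str:
    """Translate a GPP-style matching sequence into a Python regex string."""
    out, i = [], 0
    while i < len(spec):
        ch = spec[i]
        if ch != "\\" or i + 1 >= len(spec):
            out.append(re.escape(ch))
            i += 1
            continue
        i += 1                                   # skip the backslash
        negate = spec[i] == "!"
        if negate:
            i += 1
            if i >= len(spec):
                raise ValueError("dangling negation")
        key = spec[i]
        if key in SINGLE:
            out.append("[" + ("^" if negate else "") + SINGLE[key] + "]")
        elif negate:
            raise ValueError("negation of this sequence is not modelled here")
        elif key in RUNS:
            out.append(RUNS[key])
        else:
            out.append(re.escape(key))           # an escaped literal character
        i += 1
    return "".join(out)


# The cpp-mode meta-macro start sequence, applied to a line start.
print(bool(re.fullmatch(to_regex(r"\n#\w"), "\n#  ")))
```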
466
467
468 Note an important distinctive feature of start sequences: when the
469 first character of a macro or comment/string start sequence is ' ' or
470 one of the above special sequences, it is not taken to be part of the
471 sequence itself but is used instead as a context check: for example a
472 start sequence beginning with '\n' matches only at the beginning of a
473 line, but the matching newline character is not taken to be part of the
474 sequence. Similarly a start sequence beginning with ' ' matches only
475 if some whitespace is present, but the matching whitespace is not con‐
476 sidered to be part of the start sequence and is therefore sent to out‐
477 put. If a context check is performed at the very beginning of a file
478 (or more generally of any body to be evaluated), the result is the same
479 as matching with a newline character (this makes it possible for a cpp-
480 mode file to start with a meta-macro call).
481
482
483 Two special syntax rules were added in version 2.1. First, argument
484 references (#n) are no longer evaluated when they are outside of macro
485 calls and definitions. However, they are no longer allowed to appear
486 (unless protected by quote characters) inside a call to a defined user
487 macro; the current behavior (backwards compatible) is to remove them
488 silently from the input if that happens.
489
490
491 Second, if the end sequence (either for macros or comments) consists of
492 a single newline character, and if delimitation rules lead to evalua‐
493 tion in a context where the final newline character is absent, GPP
494 silently ignores the missing newline instead of producing an error. The
495 main consequence is that meta-macro calls can now be nested in a simple
496 way in standard, cpp and Prolog modes.
497
498
499
501 Input is read sequentially and interpreted according to the rules of
502 the current mode. All input text is first matched against the specified
503 comment/string start sequences of the current mode (except those which
504 are disabled by the 'i' modifier), unless the body being evaluated is
505 the contents of a comment/string whose modifier enables macro evalua‐
506 tion. The most recently defined comment/string specifications are
507 checked for first. Important note: comments may not appear between the
508 name of a macro and its arguments (doing so results in undefined behav‐
509 ior).
510
511
512 Anything that is not a comment/string is then matched against a possi‐
513 ble meta-macro call, and if that fails too, against a possible user-
514 macro call. All remaining text undergoes substitution of argument ref‐
515 erence sequences by the relevant argument text (empty unless the body
516 being evaluated is the definition of a user macro) and removal of the
517 quote character if there is one.
518
519
520 Note that meta-macro arguments are passed to the meta-macro prior to
521 any evaluation (although the meta-macro may choose to evaluate them,
522 see meta-macro descriptions below). In the case of the #mode meta-
523 macro, GPP temporarily adds a comment/string specification to enable
524 recognition of C strings ("...") and prevent any evaluation inside
525 them, so no interference of the characters being put in the C string
526 arguments to #mode with the current syntax is to be feared.
527
528
529 On the other hand, the arguments to a user macro are systematically
530 evaluated, and then passed as context parameters to the macro defini‐
531 tion body, which gets evaluated with that environment. The only excep‐
532 tion is when the macro definition is empty, in which case its arguments
533 are not evaluated. Note that GPP temporarily switches back to the mode
534 in which the macro was defined in order to evaluate it, so it is per‐
535 fectly safe to change the operating mode between the time a macro is
defined and the time when it is called. Conversely, if a user macro
wishes to work with the current mode instead of the one that was used
to define it, it needs to start with a #mode restore call and end with
a #mode save call.
540
541
542 A user macro may be defined with named arguments (see #define descrip‐
543 tion below). In that case, when the macro definition is being evalu‐
544 ated, each named parameter causes a temporary virtual user-macro defi‐
545 nition to be created; such a macro may be called only without arguments
546 and simply returns the text of the corresponding argument.
547
548
549 Note that, since macros are evaluated when they are called rather than
550 when they are defined, any attempt to call a recursive macro causes
551 undefined behavior except in the very specific case when the macro uses
552 #undef to erase itself after finitely many loop iterations.
553
554
555 Finally, a special case occurs when a user macro whose definition does
556 not involve any arguments (neither named arguments nor the argument
557 reference sequence) is called in a mode where the short user-macro end
558 sequence is empty (e.g., cpp or TeX mode). In that case it is assumed
559 to be an alias macro: its arguments are first evaluated in the current
560 mode as usual, but instead of being passed to the macro definition as
561 parameters (which would cause them to be discarded) they are actually
562 appended to the macro definition, using the syntax rules of the mode in
563 which the macro was defined, and the resulting text is evaluated again.
564 It is therefore important to note that, in the case of a macro alias,
565 the arguments actually get evaluated twice in two potentially different
566 modes.
567
568
570 These macros are always predefined. Their actual calling sequence
571 depends on the current mode; here we use cpp-like notation.
572
573
574 #define x y
575 This defines the user macro x as y. y can be any valid GPP
576 input, and may for example refer to other macros. x must be an
577 identifier (i.e., a sequence of alphanumeric characters and
578 '_'), unless named arguments are specified. If x is already
579 defined, the previous definition is overwritten. If no second
580 argument is given, x will be defined as a macro that outputs
581 nothing. Neither x nor y are evaluated; the macro definition is
582 only evaluated when it is called, not when it is declared.
583
584 It is also possible to name the arguments in a macro definition:
585 in that case, the argument x should be a user-macro call whose
586 arguments are all identifiers. These identifiers become avail‐
587 able as user-macros inside the macro definition; these virtual
588 macros must be called without arguments, and evaluate to the
589 corresponding macro parameter.
590
591 #defeval x y
592 This acts in a similar way to #define, but the second argument y
593 is evaluated immediately. Since user macro definitions are also
594 evaluated each time they are called, this means that the macro y
595 will undergo two successive evaluations. The usefulness of
596 #defeval is considerable as it is the only way to evaluate some‐
597 thing more than once, which may be needed to force evaluation of
598 the arguments of a meta-macro that normally doesn't perform any
evaluation. However, since all argument references evaluated at
define-time are understood as the arguments of the body in which
the macro is being defined, and not as the arguments of the macro
itself, one usually has to use the quote character to prevent
immediate evaluation of argument references.
604
605 #undef x
606 This removes any existing definition of the user macro x.
607
608 #ifdef x
609 This begins a conditional block. Everything that follows is
610 evaluated only if the identifier x is defined, and until either
611 a #else or a #endif statement is reached. Note, however, that
612 the commented text is still scanned thoroughly, so its syntax
613 must be valid. It is in particular legal to have the #else or
614 #endif statement ending the conditional block appear only as the
615 result of a user-macro expansion and not explicitly in the
616 input.
617
618 #ifndef x
619 This begins a conditional block. Everything that follows is
620 evaluated only if the identifier x is not defined.
621
622 #ifeq x y
623 This begins a conditional block. Everything that follows is
624 evaluated only if the results of the evaluations of x and y are
625 identical as character strings. Any leading or trailing white‐
626 space is ignored for the comparison. Note that in cpp-mode any
627 unquoted whitespace character is understood as the end of the
628 first argument, so it is necessary to be careful.
629
630 #ifneq x y
631 This begins a conditional block. Everything that follows is
632 evaluated only if the results of the evaluations of x and y are
633 not identical (even up to leading or trailing whitespace).
634
635 #else This toggles the logical value of the current conditional block.
636 What follows is evaluated if and only if the preceding input was
637 commented out.
638
639 #endif This ends a conditional block started by a #if... meta-macro.
640
641 #include file
642 This causes GPP to open the specified file and evaluate its con‐
643 tents, inserting the resulting text in the current output. All
644 defined user macros are still available in the included file,
645 and reciprocally all macros defined in the included file will be
646 available in everything that follows. The include file is looked
647 for first in the current directory, and then, if not found, in
648 one of the directories specified by the -I command-line option
649 (or /usr/include if no directory was specified). Note that, for
650 compatibility reasons, it is possible to put the file name
651 between "" or <>.
652
The order in which the various directories are searched for
include files is affected by the --nostdinc, --nocurinc and
--curdirinclast command-line options.
656
657 Upon including a file, GPP immediately saves a copy of the cur‐
658 rent operating mode onto the mode stack, and restores the oper‐
659 ating mode at the end of the included file. The included file
660 may override this behavior by starting with a #mode restore call
661 and ending with a #mode push call. Additionally, when the -m
662 command line option is specified, GPP will automatically switch
663 to the cpp compatibility mode upon including a file whose name
664 ends with either '.c' or '.h'.
665
666 #exec command
667 This causes GPP to execute the specified command line and
668 include its standard output in the current output. Note that,
669 for security reasons, this meta-macro is disabled unless the -x
670 command line flag was specified. If use of #exec is not
671 allowed, a warning message is printed and the output is left
672 blank. Note that the specified command line is evaluated before
673 being executed, thus allowing the use of macros in the command-
674 line. However, the output of the command is included verbatim
675 and not evaluated. If you need the output to be evaluated, you
676 must use #defeval (see above) to cause a double evaluation.
677
678 #eval expr
679 The #eval meta-macro attempts to evaluate expr first by expand‐
680 ing macros (normal GPP evaluation) and then by performing arith‐
681 metic evaluation and/or wildcard matching. The syntax and oper‐
682 ator precedence for arithmetic expressions are the same as in C;
683 the only missing operators are <<, >>, ?:, and the assignment
684 operators.
685
686 POSIX-style wildcard matching ('globbing') is available only on
687 POSIX implementations and can be invoked with the =~ operator.
688 In brief, a '?' matches any single character, a '*' matches any
689 string (including the empty string), and '[...]' matches any one
690 of the characters enclosed in brackets. A '[...]' class is com‐
691 plemented when the first character in the brackets is '!'. The
692 characters in a '[...]' class can also be specified as a range
693 using the '-' character -- e.g., '[F-N]' is equivalent to
694 '[FGHIJKLMN]'.
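
The pattern rules above are the usual shell-style ones, so they can be tried out with Python's fnmatch module, which follows the same conventions for '?', '*', '[...]' and '[!...]'. This is an illustration in Python, not GPP's own matcher, and the table named cases is hypothetical.

```python
from fnmatch import fnmatchcase

# Each entry: (text, pattern, expected result of the match)
cases = [
    ("gpp.c", "gpp.?", True),    # '?' matches exactly one character
    ("gpp.c", "*.c", True),      # '*' matches any string
    ("", "*", True),             # ... including the empty string
    ("K", "[F-N]", True),        # range: same as [FGHIJKLMN]
    ("Z", "[F-N]", False),
    ("x", "[!abc]", True),       # '!' complements the class
    ("a", "[!abc]", False),
]

# Case-sensitive matching of each text against its pattern.
results = [fnmatchcase(text, pattern) for text, pattern, _ in cases]
```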
695
696 If unable to assign a numerical value to the result, the
697 returned text is simply the result of macro expansion without
698 any arithmetic evaluation. The only exceptions to this rule are
699 the comparison operators ==, !=, <, >, <=, and >= which, if one
700 of the sides does not evaluate to a number, perform string com‐
701 parison instead (ignoring trailing and leading spaces). Addi‐
702 tionally, the length(...) arithmetic operator returns the length
703 in characters of its evaluated argument.
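
The fallback from numeric to string comparison can be sketched as follows. This is a rough Python model, not GPP's evaluator: the helper names to_number and gpp_equal are hypothetical, and treating values as plain decimal integers is an assumption made to keep the sketch short.

```python
def to_number(text):
    """Return an int if text is a plain decimal integer, else None.
    (Integer-only parsing is an assumption of this sketch.)"""
    try:
        return int(text.strip())
    except ValueError:
        return None


def gpp_equal(left, right):
    """Model of '==': numeric when both sides are numbers, otherwise a
    string comparison that ignores leading and trailing spaces."""
    a, b = to_number(left), to_number(right)
    if a is not None and b is not None:
        return a == b
    return left.strip() == right.strip()


numeric_match = gpp_equal("10", " 10 ")     # both sides are numbers
string_match = gpp_equal(" foo ", "foo")    # falls back to string comparison
mixed = gpp_equal("foo", "7")               # one side is not a number
```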
704
705 Inside arithmetic expressions, the defined(...) special user
706 macro is also available: it takes only one argument, which is
707 not evaluated, and returns 1 if it is the name of a user macro
708 and 0 otherwise.
709
710 #if expr
711 This meta-macro invokes the arithmetic/globbing evaluator in the
712 same manner as #eval and compares the result of evaluation with
713 the string "0" in order to begin a conditional block. In partic‐
714 ular note that the logical value of expr is always true when it
715 cannot be evaluated to a number.
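
A consequence worth spelling out is that any result other than the string "0" starts a true branch, including text that is not a number at all. As a minimal Python model (the function name if_condition is hypothetical, and the argument stands for the text already produced by the evaluator):

```python
def if_condition(evaluated):
    """Model of the #if test: the block is taken unless the evaluated
    expression is exactly the string "0"."""
    return evaluated != "0"


false_case = if_condition("0")       # the only false value
numeric_case = if_condition("42")    # non-zero number: true
text_case = if_condition("foo")      # not a number: still true
```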
716
717 #elif expr
718 This meta-macro can be used to avoid nested #if conditions. #if
719 ... #elif ... #endif is equivalent to #if ... #else #if ...
720 #endif #endif.
721
722 #mode keyword ...
723 This meta-macro controls GPP's operating mode. See below for a
724 list of #mode commands.
725
726 #line This meta-macro evaluates to the line number of the current
727 input file.
728
729 #file This meta-macro evaluates to the filename of the current input
730 file as it appears on the command line or in the argument to
731 #include. If GPP is reading its input from stdin, then #file
732 evaluates to `stdin'.
733
734 #date fmt
735 This meta-macro evaluates to the current date and time as for‐
736 matted by the specified format string fmt. See the section DATE
737 AND TIME CONVERSION SPECIFIERS below.
738
739 #error msg
740 This meta-macro causes an error message with the current file‐
741 name and line number, and with the text msg, to be printed to
742 the standard error device. Subsequent processing is then
743 aborted.
744
745 #warning msg
746 This meta-macro causes a warning message with the current file‐
747 name and line number, and with the text msg, to be printed to
748 the standard error device. Subsequent processing is then
749 resumed.
750
751
752
753 The key to GPP's flexibility is the #mode meta-macro. Its first argu‐
754 ment is always one of a list of available keywords (see below); its
755 second argument is always a sequence of words separated by whitespace.
756 Apart from possibly the first of them, each of these words is always a
757 delimiter or syntax specifier, and should be provided as a C string
758 delimited by double quotes (" "). The various special matching
759 sequences listed in the section on syntax specification are available.
760 Any #mode command is parsed in a mode where "..." is understood to be a
761 C-style string, so it is safe to put any character inside these
762 strings. Also note that the first argument of #mode (the keyword) is
763 never evaluated, while the second argument is evaluated (except of
764 course for the contents of C strings), so that the syntax specification
765 may be obtained as the result of a macro evaluation.
766
767
768 The available #mode commands are:
769
770
771 #mode save / #mode push
772 Push the current mode specification onto the mode stack.
773
774 #mode restore / #mode pop
775 Pop mode specification from the mode stack.
776
777 #mode standard name
778 Select one of the standard modes. The only argument must be one
779 of: default (default mode); cpp, C (cpp mode); tex, TeX (tex
780 mode); html, HTML (html mode); xhtml, XHTML (xhtml mode); pro‐
781 log, Prolog (prolog mode). The mode name must be given directly,
782 not as a C string.
783
784 #mode user "s1" ... "s9"
785 Specify user macro syntax. The 9 arguments, all of them C
786 strings, are the mode specification for user macros (see the -U
787 command-line option and the section on syntax specification).
788 The meta-macro specification is not affected.
789
790 #mode meta {user | "s1" ... "s7"}
791 Specify meta-macro syntax. Either the only argument is user
792 (not as a string), and the user-macro mode specifications are
793 copied into the meta-macro mode specifications, or there must be
794 seven string arguments, whose significance is the same as for
795 the -M command-line option (see section on syntax specifica‐
796 tion).
797
798 #mode quote ["c"]
799 With no argument or "" as argument, removes the quote character
800 specification and disables the quoting functionality. With one
801 string argument, the first character of the string is taken to
802 be the new quote character. The quote character can be neither
803 alphanumeric nor '_', nor can it be one of the special matching
804 sequences.
805
806 #mode comment [xxx] "start" "end" ["c" ["c"]]
807 Add a comment specification. Optionally a first argument con‐
808 sisting of three characters not enclosed in " " can be used to
809 specify a comment/string modifier (see the section on syntax
810 specification). The default modifier is ccc. The first two
811 string arguments are used as comment start and end sequences
812 respectively. The third string argument is optional and can be
813 used to specify a string-quote character. (If it is "", the
814 functionality is disabled.) The fourth string argument is
815 optional and can be used to specify a string delimitation warn‐
816 ing character. (If it is "", the functionality is disabled.)
817
818 #mode string [xxx] "start" "end" ["c" ["c"]]
819 Add a string specification. Identical to #mode comment except
820 that the default modifier is sss.
821
822 #mode nocomment / #mode nostring ["start"]
823 With no argument, remove all comment/string specifications. With
824 one string argument, delete the comment/string specification
825 whose start sequence is the argument.
826
827 #mode preservelf { on | off | 1 | 0 }
828 Equivalent to the -n command-line switch. If the argument is on
829 or 1, any newline or whitespace character terminating a macro
830 call or a comment/string is left in the input stream for further
831 processing. If the argument is off or 0 this feature is dis‐
832 abled.
833
834 #mode charset { id | op | par } "string"
835 Specify the character sets to be used for matching the \o, \O
836 and \i special sequences. The first argument must be one of id
837 (the set matched by \i), op (the set matched by \o) or par (the
838 set matched by \O in addition to the one matched by \o).
839 "string" is a C string which lists all characters to put in the
840 set. It may contain only the special matching sequences \a, \A,
841 \b, \B, and \# (the other sequences and the negated sequences
842 are not allowed). When a '-' is found between two non-special
843 characters, this adds all characters in between (e.g. "A-Z" corre‐
844 sponds to all uppercase characters). To have '-' in the matched
845 set, either put it in first or last position or place it next to
846 a \x sequence.
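
The range rule for #mode charset can be illustrated with a small Python sketch. It is a simplified model under stated assumptions: the function name expand_charset is hypothetical, and the special sequences (\a, \A, \b, \B, \#) are deliberately not handled.

```python
def expand_charset(spec):
    """Expand a charset string such as "A-Z_" into a set of characters.
    A '-' between two characters denotes a range; a '-' in first or last
    position is taken literally. Backslash sequences are not modelled."""
    chars = set()
    i = 0
    while i < len(spec):
        if spec[i] == "-" and 0 < i < len(spec) - 1:
            low, high = spec[i - 1], spec[i + 1]
            # Add every character from low to high, inclusive.
            chars.update(chr(c) for c in range(ord(low), ord(high) + 1))
            i += 2
        else:
            chars.add(spec[i])
            i += 1
    return chars


uppercase = expand_charset("A-Z")
leading_dash = expand_charset("-a")
trailing_dash = expand_charset("a-")
```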
847
848
849
851 Ordinary characters placed in the format string are copied without
852 conversion. Conversion specifiers are introduced by a `%' character,
853 and are replaced as follows:
854
855
856 %a The abbreviated weekday name according to the current locale.
857
858 %A The full weekday name according to the current locale.
859
860 %b The abbreviated month name according to the current locale.
861
862 %B The full month name according to the current locale.
863
864 %c The preferred date and time representation for the current
865 locale.
866
867 %d The day of the month as a decimal number (range 01 to 31).
868
869 %F Equivalent to %Y-%m-%d (the ISO 8601 date format).
870
871 %H The hour as a decimal number using a 24-hour clock (range 00 to
872 23).
873
874 %I The hour as a decimal number using a 12-hour clock (range 01 to
875 12).
876
877 %j The day of the year as a decimal number (range 001 to 366).
878
879 %m The month as a decimal number (range 01 to 12).
880
881 %M The minute as a decimal number (range 00 to 59).
882
883 %p Either `AM' or `PM' according to the given time value, or
884 the corresponding strings for the current locale. Noon is
885 treated as `pm' and midnight as `am'.
886
887 %R The time in 24-hour notation (%H:%M).
888
889 %S The second as a decimal number (range 00 to 61).
890
891 %U The week number of the current year as a decimal number,
892 range 00 to 53, starting with the first Sunday as the first
893 day of week 01.
894
895 %w The day of the week as a decimal, range 0 to 6, Sunday being
896 0.
897
898 %W The week number of the current year as a decimal number,
899 range 00 to 53, starting with the first Monday as the first
900 day of week 01.
901
902 %x The preferred date representation for the current locale with‐
903 out the time.
904
905 %X The preferred time representation for the current locale with‐
906 out the date.
907
908 %y The year as a decimal number without a century (range 00 to
909 99).
910
911 %Y The year as a decimal number including the century.
912
913 %Z The time zone name or abbreviation.
914
915 %% A literal `%' character.
916
917
918
919 Depending on the C compiler and library used to compile GPP, there may
920 be more conversion specifiers available. Consult your compiler's docu‐
921 mentation for the strftime() function. Note, however, that any conver‐
922 sion specifiers not listed above may not be portable across installa‐
923 tions of GPP.
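
Since the specifiers are those of strftime(), a format string can be tried out with any tool that wraps the same C library function. The Python sketch below does so for a fixed moment and uses only locale-independent specifiers from the list above. It is an illustration rather than GPP output, and the variable names are hypothetical.

```python
import time

# Fixed moment: 2004-02-29 13:05:09, a Sunday, day 60 of the year.
# Tuple layout: (year, month, day, hour, minute, second, weekday, yearday, dst)
# Python counts weekdays from Monday=0, so Sunday is 6.
moment = (2004, 2, 29, 13, 5, 9, 6, 60, -1)

iso_date = time.strftime("%Y-%m-%d", moment)    # year, month, day
clock = time.strftime("%H:%M:%S", moment)       # 24-hour time
day_of_year = time.strftime("%j", moment)       # three digits, 001 to 366
short_year = time.strftime("%y", moment)        # year without century
weekday_num = time.strftime("%w", moment)       # Sunday is 0
percent = time.strftime("100%%", moment)        # '%%' yields a literal '%'
```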
924
925
927 Here is a basic self-explanatory example in standard or cpp mode:
928
929
930 #define FOO This is
931 #define BAR a message.
932 #define concat #1 #2
933 concat(FOO,BAR)
934 #ifeq (concat(foo,bar)) (foo bar)
935 This is output.
936 #else
937 This is not output.
938 #endif
939
940 Using argument naming, the concat macro could alternatively be defined
941 as
942
943
944 #define concat(x,y) x y
945
946 In TeX mode and using argument naming, the same example becomes:
947
948
949 \define{FOO}{This is}
950 \define{BAR}{a message.}
951 \define{\concat{x}{y}}{\x \y}
952 \concat{\FOO}{\BAR}
953 \ifeq{\concat{foo}{bar}}{foo bar}
954 This is output.
955 \else
956 This is not output.
957 \endif
958
959 In HTML mode and without argument naming, one gets similarly:
960
961
962 <#define FOO|This is>
963 <#define BAR|a message.>
964 <#define concat|#1 #2>
965 <#concat <#FOO>|<#BAR>>
966 <#ifeq <#concat foo|bar>|foo bar>
967 This is output.
968 <#else>
969 This is not output.
970 <#endif>
971
972 The following example (in standard mode) illustrates the use of the
973 quote character:
974
975
976 #define FOO This is \
977 a multiline definition.
978 #define BLAH(x) My argument is x
979 BLAH(urf)
980 \BLAH(urf)
981
982 Note that the multiline definition is also valid in cpp and Prolog
983 modes despite the absence of a quote character, because '\' followed by
984 a newline is then interpreted as a comment and discarded.
985
986
987 In cpp mode, C strings and comments are understood as such, as illus‐
988 trated by the following example:
989
990
991 #define BLAH foo
992 BLAH "BLAH" /* BLAH */
993 'It\'s a /*string*/ !'
994
995 The main difference between Prolog mode and cpp mode is the handling of
996 strings and comments: in Prolog, a '...' string may not begin immedi‐
997 ately after a digit, and a /*...*/ comment may not begin immediately
998 after an operator character. Furthermore, comments are not removed from
999 the output unless they occur in a #command.
1000
1001
1002 The differences between cpp mode and default mode are deeper: in
1003 default mode #commands may start anywhere, while in cpp mode they must
1004 be at the beginning of a line; the default mode has no knowledge of
1005 comments and strings, but has a quote character ('\'), while cpp mode
1006 has extensive comment/string specifications but no quote character.
1007 Moreover, the arguments to meta-macros need to be correctly parenthe‐
1008 sized in default mode, while no such checking is performed in cpp mode.
1009
1010
1011 This makes it easier to nest meta-macro calls in default mode than in
1012 cpp mode. For example, consider the following HTML mode input, which
1013 tests for the availability of the #exec command:
1014
1015
1016 <#ifeq <#exec echo blah>|blah
1017 > #exec allowed <#else> #exec not allowed <#endif>
1018
1019 There is no cpp mode equivalent, while in default mode it can be easily
1020 translated as
1021
1022
1023 #ifeq (#exec echo blah
1024 ) (blah
1025 )
1026 \#exec allowed
1027 #else
1028 \#exec not allowed
1029 #endif
1030
1031 In order to nest meta-macro calls in cpp mode it is necessary to modify
1032 the mode description, either by changing the meta-macro call syntax, or
1033 more elegantly by defining a silent string and using the fact that the
1034 context at the beginning of an evaluated string is a newline character:
1035
1036
1037 #mode string QQQ "$" "$"
1038 #ifeq $#exec echo blah
1039 $ $blah
1040 $
1041 \#exec allowed
1042 #else
1043 \#exec not allowed
1044 #endif
1045
1046 Note, however, that comments/strings cannot be nested ("..." inside
1047 $...$ would go undetected), so one needs to be careful about what to
1048 include inside such a silent evaluated string. In this example, the
1049 loose meta-macro nesting introduced in version 2.1 makes it possible to
1050 use the following simpler version:
1051
1052
1053 #ifeq blah #exec echo -n blah
1054 \#exec allowed
1055 #else
1056 \#exec not allowed
1057 #endif
1058
1059 Remember that macros without arguments are actually understood to be
1060 aliases when they are called with arguments, as illustrated by the fol‐
1061 lowing example (default or cpp mode):
1062
1063
1064 #define DUP(x) x x
1065 #define FOO and I said: DUP
1066 FOO(blah)
1067
1068 The usefulness of the #defeval meta-macro is shown by the following
1069 example in HTML mode:
1070
1071
1072 <#define APPLY|<#defeval TEMP|<\##1 \#1>><#TEMP #2>>
1073 <#define <#foo x>|<#x> and <#x>>
1074 <#APPLY foo|BLAH>
1075
1076 The reason why #defeval is needed is that, since everything is evalu‐
1077 ated in a single pass, the input that will result in the desired macro
1078 call needs to be generated by a first evaluation of the arguments
1079 passed to APPLY before being evaluated a second time.
1080
1081
1082 To translate this example into default mode, one needs to resort to
1083 parenthesizing in order to nest the #defeval call inside the definition
1084 of APPLY, but needs to do so without outputting the parentheses. The
1085 easiest solution is
1086
1087
1088 #define BALANCE(x) x
1089 #define APPLY(f,v) BALANCE(#defeval TEMP f
1090 TEMP(v))
1091 #define foo(x) x and x
1092 APPLY(\foo,BLAH)
1093
1094 As explained above, the simplest version in cpp mode relies on defining
1095 a silent evaluated string to play the role of the BALANCE macro.
1096
1097
1098 The following example (default or cpp mode) demonstrates arithmetic
1099 evaluation:
1100
1101
1102 #define x 4
1103 The answer is:
1104 #eval x*x + 2*(16-x) + 1998%x
1105
1106 #if defined(x)&&!(3*x+5>17)
1107 This should be output.
1108 #endif
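
The arithmetic in this example can be checked by hand or with the equivalent Python expressions below. The operators used here behave the same way in C and Python for these positive operands. The variable names are only for illustration.

```python
x = 4

# x*x + 2*(16-x) + 1998%x  =  16 + 24 + 2
answer = x * x + 2 * (16 - x) + 1998 % x

# defined(x) is true because x was defined above; 3*x+5 is 17, and
# 17 > 17 is false, so the negation makes the whole condition true.
condition = not (3 * x + 5 > 17)
```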
1109
1110 To finish, here are some examples involving mode switching. The fol‐
1111 lowing example is self-explanatory (starting in default mode):
1112
1113
1114 #mode push
1115 #define f(x) x x
1116 #mode standard tex
1117 \f{blah}
1118 \mode{string}{"$" "$"}
1119 \mode{comment}{"/*" "*/"}
1120 $\f{urf}$ /* blah */
1121 \define{FOO}{bar/* and some more */}
1122 \mode{pop}
1123 f($FOO$)
1124
1125 A good example where a user-defined mode becomes useful is the GPP
1126 source of this document (available with GPP's source code distribu‐
1127 tion).
1128
1129
1130 Another interesting application is selectively forcing evaluation of
1131 macros in C strings when in cpp mode. For example, consider the follow‐
1132 ing input:
1133
1134
1135 #define blah(x) "and he said: x"
1136 blah(foo)
1137
1138 Obviously one would want the parameter x to be expanded inside the
1139 string. There are several ways around this problem:
1140
1141
1142 #mode push
1143 #mode nostring "\""
1144 #define blah(x) "and he said: x"
1145 #mode pop
1146
1147 #mode quote "`"
1148 #define blah(x) `"and he said: x`"
1149
1150 #mode string QQQ "$$" "$$"
1151 #define blah(x) $$"and he said: x"$$
1152
1153 The first method is very natural, but has the inconvenience of being
1154 lengthy and neutralizing string semantics, so that having an unevalu‐
1155 ated instance of 'x' in the string, or an occurrence of '/*', would be
1156 impossible without resorting to further contortions.
1157
1158 The second method is slightly more efficient because the local presence
1159 of a quote character makes it easier to control what is evaluated and
1160 what isn't, but has the drawback that it is sometimes impossible to
1161 find a reasonable quote character without having to either signifi‐
1162 cantly alter the source file or enclose it inside a #mode push/pop con‐
1163 struct. For example, any occurrence of '/*' in the string would have to
1164 be quoted.
1165
1166 The last method demonstrates the efficiency of evaluated strings in the
1167 context of selective evaluation: since comments/strings cannot be
1168 nested, any occurrence of '"' or '/*' inside the '$$' gets output as
1169 plain text, as expected inside a string, and only macro evaluation is
1170 enabled. Also note that there is much more freedom in the choice of a
1171 string delimiter than in the choice of a quote character.
1172
1173
1174 Starting with version 2.1, meta-macro calls can be nested more effi‐
1175 ciently in default, cpp and Prolog modes. This makes it easy to make a
1176 user version of a meta-macro, or to increment a counter:
1177
1178
1179 #define myeval #eval #1
1180
1181 #define x 1
1182 #defeval x #eval x+1
1183
1184
1185
1187 Here are some examples of advanced constructions using GPP. They tend
1188 to be pretty awkward and should be considered as evidence of GPP's lim‐
1189 itations.
1190
1191
1192 The first example is a recursive macro. The main problem is that (since
1193 GPP evaluates everything) a recursive macro must be very careful about
1194 the way in which recursion is terminated in order to avoid undefined
1195 behavior (most of the time GPP will simply crash). In particular, rely‐
1196 ing on a #if/#else/#endif construct to end recursion is not possible
1197 and results in an infinite loop, because GPP scans user macro calls
1198 even in the unevaluated branch of the conditional block. A safe way to
1199 proceed is for example as follows (we give the example in TeX mode):
1200
1201
1202 \define{countdown}{
1203 \if{#1}
1204 #1...
1205 \define{loop}{\countdown}
1206 \else
1207 Done.
1208 \define{loop}{}
1209 \endif
1210 \loop{\eval{#1-1}}
1211 }
1212 \countdown{10}
1213
1214
1215 Another example, in cpp mode:
1216
1217
1218 #mode string QQQ "$" "$"
1219 #define triangle(x,y) y \
1220 $#if length(y)<x$ $#define iter triangle$ $#else$ \
1221 $#define iter$ $#endif
1222 $ iter(x,*y)
1223 triangle(20)
1224
1225
1226 The following is an (unfortunately very weak) attempt at implementing
1227 functional abstraction in GPP (in standard mode). Understanding this
1228 example and why it can't be made much simpler is an exercise left to
1229 the curious reader.
1230
1231
1232 #mode string "`" "`" "\\"
1233 #define ASIS(x) x
1234 #define SILENT(x) ASIS()
1235 #define EVAL(x,f,v) SILENT(
1236 #mode string QQQ "`" "`" "\\"
1237 #defeval TEMP0 x
1238 #defeval TEMP1 (
1239 \#define \TEMP2(TEMP0) f
1240 )
1241 TEMP1
1242 )TEMP2(v)
1243 #define LAMBDA(x,f,v) SILENT(
1244 #ifneq (v) ()
1245 #define TEMP3(a,b,c) EVAL(a,b,c)
1246 #else
1247 #define TEMP3(a,b,c) \LAMBDA(a,b)
1248 #endif
1249 )TEMP3(x,f,v)
1250 #define EVALAMBDA(x,y) SILENT(
1251 #defeval TEMP4 x
1252 #defeval TEMP5 y
1253 )
1254 #define APPLY(f,v) SILENT(
1255 #defeval TEMP6 ASIS(\EVA)f
1256 TEMP6
1257 )EVAL(TEMP4,TEMP5,v)
1258
1259 This yields the following results:
1260
1261
1262 LAMBDA(z,z+z)
1263 => LAMBDA(z,z+z)
1264
1265 LAMBDA(z,z+z,2)
1266 => 2+2
1267
1268 #define f LAMBDA(y,y*y)
1269 f
1270 => LAMBDA(y,y*y)
1271
1272 APPLY(f,blah)
1273 => blah*blah
1274
1275 APPLY(LAMBDA(t,t t),(t t))
1276 => (t t) (t t)
1277
1278 LAMBDA(x,APPLY(f,(x+x)),urf)
1279 => (urf+urf)*(urf+urf)
1280
1281 APPLY(APPLY(LAMBDA(x,LAMBDA(y,x*y)),foo),bar)
1282 => foo*bar
1283
1284 #define test LAMBDA(y,`#ifeq y urf
1285 y is urf#else
1286 y is not urf#endif
1287 `)
1288 APPLY(test,urf)
1289 => urf is urf
1290
1291 APPLY(test,foo)
1292 => foo is not urf
1293
1294
1295
1297 strftime(3), glob(7), m4(1V), cpp(1)
1298
1299 GPP home page: http://www.nothingisreal.com/gpp/
1300
1301
1303 GPP was written by Denis Auroux <auroux@math.mit.edu>. Since version
1304 2.12 it has been maintained by Tristan Miller <psychonaut@nothingis‐
1305 real.com>.
1306
1307
1309 Copyright (C) 1996-2001 Denis Auroux.
1310
1311 Copyright (C) 2003, 2004 Tristan Miller.
1312
1313 Permission is granted to anyone to make or distribute verbatim copies
1314 of this document as received, in any medium, provided that the copy‐
1315 right notice and this permission notice are preserved, thus giving the
1316 recipient permission to redistribute in turn.
1317
1318 Permission is granted to distribute modified versions of this document,
1319 or of portions of it, under the above conditions, provided also that
1320 they carry prominent notices stating who last changed them.
1321
1322
1323
1324 GPP(1)