GPP(1)                    General Commands Manual                    GPP(1)

GPP - Generic Preprocessor
7
8
gpp [-{o|O} outfile] [-I/include/path ...]
    [-Dname=val ...] [-z|+z] [-x] [-m]
    [-C|-T|-H|-X|-P|-U ... [-M ...]]
    [-n|+n] [+c<n> str1 str2] [+s<n> str1 str2 c]
    [-c str1] [--nostdinc] [--nocurinc]
    [--curdirinclast] [--warninglevel n]
    [--includemarker str] [--include file]
    [infile]

gpp --help

gpp --version
22
23
24
26 GPP is a general-purpose preprocessor with customizable syntax, suit‐
27 able for a wide range of preprocessing tasks. Its independence from any
28 programming language makes it much more versatile than cpp, while its
29 syntax is lighter and more flexible than that of m4.
30
31
GPP is targeted at common preprocessing tasks for which cpp is not
suitable and which do not require very sophisticated features. To
process text files and source code in a variety of languages equally
efficiently, the syntax used by GPP is fully customizable. The handling
of comments and strings is especially advanced.
37
38
39 Initially, GPP only understands a minimal set of built-in macros,
40 called meta-macros. These meta-macros allow the definition of user
41 macros as well as some basic operations forming the core of the prepro‐
42 cessing system, including conditional tests, arithmetic evaluation,
43 wildcard matching (globbing), and syntax specification. All user macro
44 definitions are global -- i.e., they remain valid until explicitly re‐
45 moved; meta-macros cannot be redefined. With each user macro definition
46 GPP keeps track of the corresponding syntax specification so that a
47 macro can be safely invoked regardless of any subsequent change in op‐
48 erating mode.
49
50
51 In addition to macros, GPP understands comments and strings, whose syn‐
52 tax and behavior can be widely customized to fit any particular pur‐
53 pose. Internally comments and strings are the same construction, so
54 everything that applies to comments applies to strings as well.
55
56
GPP recognizes the following command-line switches and options. Note
that the -nostdinc, -nocurinc, -curdirinclast, -warninglevel, and
-includemarker options from version 2.1 and earlier are deprecated and
should not be used. Use the "long option" variants instead
(--nostdinc, etc.).
63
64 -h --help
65 Print a short help message.
66
67 --version
68 Print version information.
69
70 -o outfile
71 Specify a file to which all output should be sent (by default,
72 everything is sent to standard output).
73
74 -O outfile
75 Specify a file to which all output should be sent; output is si‐
76 multaneously sent to stdout.
77
78 -I/include/path
79 Specify a path where the #include meta-macro will look for in‐
80 clude files if they are not present in the current directory.
81 The default is /usr/include if no -I option is specified. Multi‐
82 ple -I options may be specified to look in several directories.
83
84 -Dname=val
85 Define the user macro name as equal to val. This is strictly
86 equivalent to using the #define meta-macro, but makes it possi‐
87 ble to define macros from the command-line. If val makes refer‐
88 ences to arguments or other macros, it should conform to the
89 syntax of the mode specified on the command-line. Starting with
90 version 2.1, macro argument naming is allowed on the command-
91 line. The syntax is as follows: -Dmacro(arg1,...)=definition.
92 The arguments are specified in C-style syntax, without any
93 whitespace, but the definition should still conform to the syn‐
94 tax of the mode specified on the command-line.
95
96 +z Set text mode to Unix mode (LF terminator). Any CR character in
97 the input is systematically discarded. This is the default under
98 Unix systems.
99
100 -z Set text mode to DOS mode (CR-LF terminator). In this mode all
101 CR characters are removed from the input, and all output LF
102 characters are converted to CR-LF. This is the default if GPP is
103 compiled with the WIN_NT option.
104
-x     Enable the use of the #exec meta-macro. Since #exec includes the
       output of an arbitrary shell command line, it is a potential
       security risk, and is thus disabled unless this option is
       specified.
109
110 -m Enable automatic mode switching to the cpp compatibility mode if
111 the name of an included file ends in `.h' or `.c'. This makes it
112 possible to include C header files with only minor modifica‐
113 tions.
114
115 -n Prevent newline or whitespace characters from being removed from
116 the input when they occur as the end of a macro call or of a
117 comment. By default, when a newline or whitespace character
118 forms the end of a macro or a comment it is parsed as part of
119 the macro call or comment and therefore removed from output. Use
120 the -n option to keep the last character in the input stream if
121 it was whitespace or a newline. This is activated in cpp and
122 Prolog modes.
123
124 +n The opposite of -n. This is the default in all modes except cpp
125 and Prolog. Note that +n must be placed after -C or -P in order
126 to have any effect.
127
128 -U arg1 ... arg9
129 User-defined mode. The nine following command-line arguments are
130 taken to be respectively the macro start sequence, the macro end
131 sequence for a call without arguments, the argument start se‐
132 quence, the argument separator, the argument end sequence, the
133 list of characters to stack for argument balancing, the list of
134 characters to unstack, the string to be used for referring to an
135 argument by number, and finally the quote character (if there is
136 none an empty string should be provided). These settings apply
137 both to user macros and to meta-macros, unless the -M option is
138 used to define other settings for meta-macros. See the section
139 on syntax specification for more details.
140
141 -M arg1 ... arg7
142 User-defined mode specifications for meta-macros. This option
143 can only be used together with -U. The seven following command-
144 line arguments are taken to be respectively the macro start se‐
145 quence, the macro end sequence for a call without arguments, the
146 argument start sequence, the argument separator, the argument
147 end sequence, the list of characters to stack for argument bal‐
148 ancing, and the list of characters to unstack. See below for
149 more details.
150
151 (default mode)
152 The default mode is a vaguely cpp-like mode, but it does not
153 handle comments, and presents various incompatibilities with
154 cpp. Typical meta-macros and user macros look like this:
155
156
    #define x y
    macro(arg,...)
159
160 This mode is equivalent to
161
162
    -U "" "" "(" "," ")" "(" ")" "#" "\\"
    -M "#" "\n" " " " " "\n" "(" ")"
165
166
167 -C cpp compatibility mode. This is the mode where GPP's behavior is
168 the closest to that of cpp. Unlike in the default mode, meta-
169 macro expansion occurs only at the beginning of lines, and C
170 comments and strings are understood. This mode is equivalent to
171
172
    -n -U "" "" "(" "," ")" "(" ")" "#" ""
    -M "\n#\w" "\n" " " " " "\n" "" ""
    +c "/*" "*/" +c "//" "\n" +c "\\\n" ""
    +s "\"" "\"" "\\" +s "'" "'" "\\"
177
178
179 -T TeX-like mode. In this mode, typical meta-macros and user macros
180 look like this:
181
182
    \define{x}{y}
    \macro{arg}{...}
185
186 No comments are understood. This mode is equivalent to
187
188
    -U "\\" "" "{" "}{" "}" "{" "}" "#" "@"
190
191
192 -H HTML-like mode. In this mode, typical meta-macros and user
193 macros look like this:
194
195
    <#define x|y>
    <#macro arg|...>
198
199 No comments are understood. This mode is equivalent to
200
201
    -U "<#" ">" "\B" "|" ">" "<" ">" "#" "\\"
203
204
205 -X XHTML-like mode. In this mode, typical meta-macros and user
206 macros look like this:
207
208
    <#define x|y/>
    <#macro arg|.../>
211
212 No comments are understood. This mode is equivalent to
213
214
    -U "<#" "/>" "\B" "|" "/>" "<" ">" "#" "\\"
216
217
218 -P Prolog-compatible cpp-like mode. This mode differs from the cpp
219 compatibility mode by its handling of comments, and is equiva‐
220 lent to
221
222
    -n -U "" "" "(" "," ")" "(" ")" "#" ""
    -M "\n#\w" "\n" " " " " "\n" "" ""
    +ccss "\!o/*" "*/" +ccss "%" "\n" +ccii "\\\n" ""
    +s "\"" "\"" "" +s "\!#'" "'" ""
227
228
229 +c<n> str1 str2
230 Specify comments. Any unquoted occurrence of str1 will be inter‐
231 preted as the beginning of a comment. All input up to the first
232 following occurrence of str2 will be discarded. This option may
233 be used multiple times to specify different types of comment de‐
234 limiters. The optional parameter <n> can be specified to alter
235 the behavior of the comment and, e.g., turn it into a string or
236 make it ignored under certain circumstances, see below.
237
238 -c str1
239 Un-specify comments or strings. The comment/string specification
240 whose start sequence is str1 is removed. This is useful to alter
241 the built-in comment specifications of a standard mode -- e.g.,
242 the cpp compatibility mode.
243
244 +s<n> str1 str2 c
245 Specify strings. Any unquoted occurrence of str1 will be inter‐
246 preted as the beginning of a string. All input up to the first
247 following occurrence of str2 will be output as is without any
248 evaluation. The delimiters themselves are output. If c is non-
249 empty, its first character is used as a string-quote character
250 -- i.e., a character whose presence immediately before an occur‐
251 rence of str2 prevents it from terminating the string. The op‐
252 tional parameter <n> can be specified to alter the behavior of
253 the string and, e.g., turn it into a comment, enable macro eval‐
254 uation inside the string, or make the string specification ig‐
255 nored under certain circumstances. See below.
256
257 -s str1
258 Un-specify comments or strings. Identical to -c.
259
260 --include file
Process file before infile.
262
263 --nostdinc
264 Do not look for include files in the standard directory /usr/in‐
265 clude.
266
267 --nocurinc
268 Do not look for include files in the current directory.
269
270 --curdirinclast
271 Look for include files in the current directory after the direc‐
272 tories specified by -I rather than before them.
273
274 --warninglevel n
275 Set warning level to n (0, 1 or 2). Default is 2 (most verbose).
276
277 --includemarker str
Keep track of #include directives by inserting a marker in the
279 output stream. The format of the marker is determined by str,
280 which must contain three occurrences of the character % (or
281 equivalently ?). The first occurrence is replaced with the line
282 number, the second with the file name, and the third with 1, 2
283 or blank. When this option is specified in default, cpp or Pro‐
284 log mode, GPP does its best to ensure that line numbers are the
285 same in the output as in the input by inserting blank lines in
286 the place of definitions or comments.
287
288 infile Specify an input file from which GPP reads its input. If no in‐
289 put file is specified, input is read from standard input.
290
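The interaction of -I, --nostdinc, --nocurinc and --curdirinclast
described above can be summarized by a short Python sketch. It only
models the lookup order as documented here and is not GPP's actual
code; the function name include_search_path and its parameters are
hypothetical, and "." stands for the current directory.

```python
def include_search_path(include_dirs, nostdinc=False, nocurinc=False,
                        curdirinclast=False):
    """Return the directories searched for an include file, in order.

    Hypothetical helper that models the option descriptions only.
    """
    dirs = list(include_dirs)        # directories given with -I
    if not dirs and not nostdinc:
        dirs = ["/usr/include"]      # documented default when no -I is given
    if nocurinc:
        return dirs                  # --nocurinc: skip the current directory
    if curdirinclast:
        return dirs + ["."]          # --curdirinclast: current directory last
    return ["."] + dirs              # default: current directory first


print(include_search_path(["inc"]))
print(include_search_path(["inc"], curdirinclast=True))
print(include_search_path([], nocurinc=True))
```

Whether --nostdinc changes anything once -I directories are given is
not stated in the option descriptions; the sketch assumes it does not.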
291
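As an illustration of the --includemarker template rule, the following
Python sketch fills the three placeholders in order. The helper name
format_include_marker and the sample templates are hypothetical; the
value of the third field (1, 2 or blank) is chosen by GPP and is simply
passed in here.

```python
def format_include_marker(template, line, filename, flag=""):
    """Fill the three % (or ?) placeholders of a marker template.

    Hypothetical helper: the first placeholder receives the line number,
    the second the file name, the third the flag ("1", "2" or "").
    """
    values = iter([str(line), filename, flag])
    out = []
    for ch in template:
        if ch in "%?":
            # Placeholders beyond the third are left untouched (assumption).
            out.append(next(values, ch))
        else:
            out.append(ch)
    return "".join(out)


print(format_include_marker('#line % "%" %', 12, "defs.h", "1"))
```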
292
294 The syntax of a macro call is as follows: it must start with a sequence
295 of characters matching the macro start sequence as specified in the
296 current mode, followed immediately by the name of the macro, which must
297 be a valid identifier -- i.e., a sequence of letters, digits, or under‐
298 scores ("_"). The macro name must be followed by a short macro end se‐
299 quence if the macro has no arguments, or by a sequence of arguments
300 initiated by an argument start sequence. The various arguments are then
301 separated by an argument separator, and the macro ends with a long
302 macro end sequence.
303
304
305 In all cases, the parameters of the current context -- i.e., the argu‐
306 ments passed to the body being evaluated -- can be referred to by using
307 an argument reference sequence followed by a digit between 1 and 9.
308 Alternatively, macro parameters may be named (see below). Furthermore,
309 to avoid interference between the GPP syntax and the contents of the
310 input file, a quote character is provided. The quote character can be
311 used to prevent the interpretation of a macro call, comment, or string
312 as anything but plain text. The quote character "protects" the follow‐
313 ing character, and always gets removed during evaluation. Two consecu‐
314 tive quote characters evaluate as a single quote character.
315
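The removal rule for the quote character can be sketched in Python as
follows. This is only a model of the behavior described above (each
quote character protects the next character and is itself dropped); it
does not reproduce GPP's parser, and the name remove_quotes is
hypothetical.

```python
def remove_quotes(text, quote="\\"):
    """Drop each quote character and keep the character it protects."""
    out = []
    i = 0
    while i < len(text):
        if text[i] == quote and i + 1 < len(text):
            out.append(text[i + 1])  # protected character, kept literally
            i += 2
        else:
            # Ordinary character; a lone trailing quote is kept (assumption).
            out.append(text[i])
            i += 1
    return "".join(out)


print(remove_quotes("\\#define"))   # quote protects '#'
print(remove_quotes("a\\\\b"))      # two quotes yield one
print(remove_quotes("x@{y", "@"))   # '@' is the quote character in TeX mode
```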
316
317 Finally, to facilitate proper argument delimitation, certain characters
318 can be "stacked" when they occur in a macro argument, so that the argu‐
319 ment separator or macro end sequence are not parsed if the argument
320 body is not balanced. This allows nesting macro calls without using
321 quotes. If an improperly balanced argument is needed, quote characters
322 should be added in front of some stacked characters to make it bal‐
323 anced.
324
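The balancing rule can be illustrated with a small Python sketch that
splits an argument list only where the stacked characters are balanced.
It assumes a single-character separator and uses the default-mode
settings ("," as separator, "(" and ")" as stacked and unstacked
characters, "\" as quote character); the function name split_arguments
is hypothetical and the code is not GPP's actual algorithm.

```python
def split_arguments(body, sep=",", push="(", pop=")", quote="\\"):
    """Split body at separators that occur at nesting depth zero."""
    args, current, depth, i = [], [], 0, 0
    while i < len(body):
        ch = body[i]
        if quote and ch == quote and i + 1 < len(body):
            # A quoted character never affects balancing or splitting.
            current.append(body[i:i + 2])
            i += 2
            continue
        if ch in push:
            depth += 1
        elif ch in pop and depth > 0:
            depth -= 1
        if ch == sep and depth == 0:
            args.append("".join(current))
            current = []
        else:
            current.append(ch)
        i += 1
    args.append("".join(current))
    return args


print(split_arguments("max(a,b),c"))
print(split_arguments("a\\(,b"))
```

In the second call the quoted "(" is not stacked, so the separator that
follows it still ends the first argument.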
325
326 The macro construction sequences described above can be different for
327 meta-macros and for user macros: this is the case in cpp mode, for ex‐
328 ample. Note that, since meta-macros can only have up to two arguments,
329 the delimitation rules for the second argument are somewhat sloppier,
330 and unquoted argument separator sequences are allowed in the second ar‐
331 gument of a meta-macro.
332
333
334 Unless one of the standard operating modes is selected, the above syn‐
335 tax sequences can be specified either on the command-line, using the -M
336 and -U options respectively for meta-macros and user macros, or inside
337 an input file via the #mode meta and #mode user meta-macro calls. In
338 both cases the mode description consists of nine parameters for user
339 macro specifications, namely the macro start sequence, the short macro
340 end sequence, the argument start sequence, the argument separator, the
341 long macro end sequence, the string listing characters to stack, the
342 string listing characters to unstack, the argument reference sequence,
343 and finally the quote character. As explained below, these sequences
344 should be supplied using the syntax of C strings; they must start with
345 a non-alphanumeric character, and in the first five strings special
346 matching sequences can be used (see below). If the argument correspond‐
347 ing to the quote character is the empty string, that argument's func‐
348 tionality is disabled. For meta-macro specifications there are only
349 seven parameters, as the argument reference sequence and quote charac‐
350 ter are shared with the user macro syntax.
351
352
353 The structure of a comment/string is as follows: it must start with a
354 sequence of characters matching the given comment/string start se‐
355 quence, and always ends at the first occurrence of the comment/string
356 end sequence, unless it is preceded by an odd number of occurrences of
357 the string-quote character (if such a character has been specified). In
358 certain cases comment/strings can be specified to enable macro evalua‐
359 tion inside the comment/string; in that case, if a quote character has
360 been defined for macros it can be used as well to prevent the com‐
361 ment/string from ending, with the difference that the macro quote char‐
362 acter is always removed from output whereas the string-quote character
363 is always output. Also note that under certain circumstances a com‐
364 ment/string specification can be disabled, in which case the com‐
365 ment/string start sequence is simply ignored. Finally, it is possible
366 to specify a string warning character whose presence inside a com‐
367 ment/string will cause GPP to output a warning (this is useful to lo‐
368 cate unterminated strings in cpp mode). Note that input files are not
369 allowed to contain unterminated comments/strings.
370
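The "odd number of string-quote characters" rule can be made concrete
with a Python sketch that locates the end of a comment/string. It
models only the rule stated above (no macro evaluation, no warning
character); the function name find_string_end is hypothetical.

```python
def find_string_end(text, start, end_seq, string_quote=""):
    """Return the index of the terminating end sequence, or -1.

    An occurrence of end_seq is skipped when it is immediately preceded
    by an odd number of string-quote characters.
    """
    pos = text.find(end_seq, start)
    while pos != -1:
        n = 0
        while (string_quote and pos - n - 1 >= start
               and text[pos - n - 1] == string_quote):
            n += 1
        if n % 2 == 0:
            return pos
        pos = text.find(end_seq, pos + 1)
    return -1


# Search starts just after the opening delimiter at index 0.
print(find_string_end('"a\\"b" c', 1, '"', "\\"))
print(find_string_end('"a\\\\" c', 1, '"', "\\"))
```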
371
372 A comment/string specification can be declared from within the input
373 file using the #mode comment meta-macro call (or equivalently #mode
374 string), in which case the number of C strings to be given as arguments
375 to describe the comment/string can be anywhere between two and four:
376 the first two arguments (mandatory) are the start sequence and the end
377 sequence, and can make use of the special matching sequences (see be‐
378 low). They may not start with alphanumeric characters. The first char‐
379 acter of the third argument, if there is one, is used as the string-
380 quote character (use an empty string to disable the functionality), and
381 the first character of the fourth argument, if there is one, is used as
382 the string-warning character. A specification may also be given from
383 the command-line, in which case there must be two arguments if using
384 the +c option and three if using the +s option.
385
386
387 The behavior of a comment/string is specified by a three-character mod‐
388 ifier string, which may be passed as an optional argument either to the
389 +c/+s command-line options or to the #mode comment/#mode string meta-
390 macros. If no modifier string is specified, the default value is "ccc"
391 for comments and "sss" for strings. The first character corresponds to
392 the behavior inside meta-macro calls (including user-macro definitions
393 since these come inside a #define meta-macro call), the second charac‐
394 ter corresponds to the behavior inside user-macro parameters, and the
395 third character corresponds to the behavior outside of any macro call.
396 Each of these characters can take the following values:
397
398
399 i disable the comment/string specification.
400
401 c comment (neither evaluated nor output).
402
403 s string (the string and its delimiter sequences are output as-
404 is).
405
406 q quoted string (the string is output as-is, without the delimiter
407 sequences).
408
409 C evaluated comment (macros are evaluated, but output is dis‐
410 carded).
411
412 S evaluated string (macros are evaluated, delimiters are output).
413
414 Q evaluated quoted string (macros are evaluated, delimiters are
415 not output).
416
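The modifier table can be read as a lookup keyed by context. The Python
sketch below encodes the seven values listed above; the names
BEHAVIORS, CONTEXTS and behavior are hypothetical. For instance, the
Prolog mode specification +ccss appears to be the +c option followed by
the modifier string "css": a comment inside meta-macro calls, a string
elsewhere.

```python
# (macros evaluated, content output, delimiters output); None = disabled
BEHAVIORS = {
    "i": None,
    "c": (False, False, False),
    "s": (False, True, True),
    "q": (False, True, False),
    "C": (True, False, False),
    "S": (True, True, True),
    "Q": (True, True, False),
}

# Position of each context in the three-character modifier string.
CONTEXTS = ("meta", "user", "text")


def behavior(modifier, context):
    """Look up how a comment/string behaves in the given context."""
    return BEHAVIORS[modifier[CONTEXTS.index(context)]]


print(behavior("css", "meta"))   # comment inside meta-macro calls
print(behavior("css", "user"))   # string inside user-macro parameters
print(behavior("cii", "text"))   # disabled outside of macro calls
```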
417
418 Important note: any occurrence of a comment/string start sequence in‐
419 side another comment/string is always ignored, even if macro evaluation
420 is enabled. In other words, comments/strings cannot be nested. In par‐
421 ticular, the `Q' modifier can be a convenient way of defining a syntax
422 for temporarily disabling all comment and string specifications.
423
424
425 Syntax specification strings should always be provided as C strings,
426 whether they are given as arguments to a #mode meta-macro call or on
the command-line of a Unix shell. If command-line arguments are given
by some method other than a standard Unix shell, then the shell behavior
must be emulated -- i.e., the surrounding "" quotes should be removed,
all occurrences of `\\' should be replaced by a single backslash, and
similarly `\"' should be replaced by `"'. Sequences like `\n' are
recognized by GPP and should be left as is.
433
434
435 Special sequences matching certain subsets of the character set can be
436 used. They are of the form `\x', where x is one of:
437
438
439 b matches any sequence of one or more spaces or tab characters
440 (`\b' is identical to ` ').
441
442 w matches any sequence of zero or more spaces or tab characters.
443
444 B matches any sequence of one or more spaces, tabs or newline
445 characters.
446
447 W matches any sequence of zero or more spaces, tabs or newline
448 characters.
449
450 a an alphabetic character (`a' to `z' and `A' to `Z').
451
452 A an alphabetic character, or a space, tab or newline.
453
454 # a digit (`0' to `9').
455
456 i an identifier character. The set of matched characters is cus‐
457 tomizable using the #mode charset id command. The default set‐
458 ting matches alphanumeric characters and underscores (`a' to
459 `z', `A' to `Z', `0' to `9' and `_').
460
461 t a tab character.
462
463 n a newline character.
464
465 o an operator character. The set of matched characters is custom‐
466 izable using the #mode charset op command. The default setting
467 matches all characters in "+-*/\^<>=`~:.?@#&!%|", except in Pro‐
468 log mode where `!', `%' and `|' are not matched.
469
470 O an operator character or a parenthesis character. The set of ad‐
471 ditional matched characters in comparison with `\o' is customiz‐
472 able using the #mode charset par command. The default setting is
473 to have the characters in "()[]{}" as parentheses.
474
475
476 Moreover, all of these matching subsets except `\w' and `\W' can be
477 negated by inserting a `!' -- i.e., by writing `\!x' instead of `\x'.
478
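The single-character classes and their negation can be sketched in
Python as set membership tests. The sketch uses the default character
sets listed above (not the Prolog-mode or #mode charset variants) and
leaves out the run-matching sequences `\b', `\w', `\B' and `\W'; the
names CLASSES and matches are hypothetical.

```python
import string

OPERATORS = set("+-*/\\^<>=`~:.?@#&!%|")

# Default character sets for the single-character matching sequences.
CLASSES = {
    "a": set(string.ascii_letters),
    "A": set(string.ascii_letters) | set(" \t\n"),
    "#": set(string.digits),
    "i": set(string.ascii_letters + string.digits + "_"),
    "t": {"\t"},
    "n": {"\n"},
    "o": OPERATORS,
    "O": OPERATORS | set("()[]{}"),
}


def matches(spec, ch):
    """Test one character against a spec such as '\\a' or '\\!#'."""
    negated = spec.startswith("\\!")
    letter = spec[-1]
    return (ch in CLASSES[letter]) != negated


print(matches("\\a", "x"), matches("\\!#", "7"), matches("\\O", "{"))
```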
479
480 Note an important distinctive feature of start sequences: when the
481 first character of a macro or comment/string start sequence is ` ' or
482 one of the above special sequences, it is not taken to be part of the
483 sequence itself but is used instead as a context check: for example a
484 start sequence beginning with `\n' matches only at the beginning of a
485 line, but the matching newline character is not taken to be part of the
486 sequence. Similarly a start sequence beginning with ` ' matches only
487 if some whitespace is present, but the matching whitespace is not con‐
488 sidered to be part of the start sequence and is therefore sent to out‐
489 put. If a context check is performed at the very beginning of a file
490 (or more generally of any body to be evaluated), the result is the same
491 as matching with a newline character (this makes it possible for a cpp-
492 mode file to start with a meta-macro call).
493
494
495 Two special syntax rules were added in version 2.1. First, argument
496 references (#n) are no longer evaluated when they are outside of macro
497 calls and definitions. However, they are no longer allowed to appear
498 (unless protected by quote characters) inside a call to a defined user
499 macro; the current behavior (backwards compatible) is to remove them
500 silently from the input if that happens.
501
502
503 Second, if the end sequence (either for macros or comments) consists of
504 a single newline character, and if delimitation rules lead to evalua‐
505 tion in a context where the final newline character is absent, GPP
506 silently ignores the missing newline instead of producing an error. The
507 main consequence is that meta-macro calls can now be nested in a simple
508 way in standard, cpp and Prolog modes.
509
510
511
513 Input is read sequentially and interpreted according to the rules of
514 the current mode. All input text is first matched against the specified
515 comment/string start sequences of the current mode (except those which
516 are disabled by the `i' modifier), unless the body being evaluated is
517 the contents of a comment/string whose modifier enables macro evalua‐
518 tion. The most recently defined comment/string specifications are
519 checked for first. Important note: comments may not appear between the
520 name of a macro and its arguments (doing so results in undefined behav‐
521 ior).
522
523
524 Anything that is not a comment/string is then matched against a possi‐
525 ble meta-macro call, and if that fails too, against a possible user-
526 macro call. All remaining text undergoes substitution of argument ref‐
527 erence sequences by the relevant argument text (empty unless the body
528 being evaluated is the definition of a user macro) and removal of the
529 quote character if there is one.
530
531
532 Note that meta-macro arguments are passed to the meta-macro prior to
533 any evaluation (although the meta-macro may choose to evaluate them,
534 see meta-macro descriptions below). In the case of the #mode meta-
535 macro, GPP temporarily adds a comment/string specification to enable
536 recognition of C strings ("...") and prevent any evaluation inside
537 them, so no interference of the characters being put in the C string
538 arguments to #mode with the current syntax is to be feared.
539
540
541 On the other hand, the arguments to a user macro are systematically
542 evaluated, and then passed as context parameters to the macro defini‐
543 tion body, which gets evaluated with that environment. The only excep‐
544 tion is when the macro definition is empty, in which case its arguments
545 are not evaluated. Note that GPP temporarily switches back to the mode
546 in which the macro was defined in order to evaluate it, so it is per‐
547 fectly safe to change the operating mode between the time a macro is
defined and the time when it is called. Conversely, if a user macro
needs to work with the current mode instead of the one that was used
to define it, it must start with a #mode restore call and end with a
#mode save call.
552
553
554 A user macro may be defined with named arguments (see #define descrip‐
555 tion below). In that case, when the macro definition is being evalu‐
556 ated, each named parameter causes a temporary virtual user-macro defi‐
557 nition to be created; such a macro may be called only without arguments
558 and simply returns the text of the corresponding argument.
559
560
561 Note that, since macros are evaluated when they are called rather than
562 when they are defined, any attempt to call a recursive macro causes un‐
563 defined behavior except in the very specific case when the macro uses
564 #undef to erase itself after finitely many loop iterations.
565
566
567 Finally, a special case occurs when a user macro whose definition does
568 not involve any arguments (neither named arguments nor the argument
569 reference sequence) is called in a mode where the short user-macro end
570 sequence is empty (e.g., cpp or TeX mode). In that case it is assumed
571 to be an alias macro: its arguments are first evaluated in the current
572 mode as usual, but instead of being passed to the macro definition as
573 parameters (which would cause them to be discarded) they are actually
574 appended to the macro definition, using the syntax rules of the mode in
575 which the macro was defined, and the resulting text is evaluated again.
576 It is therefore important to note that, in the case of a macro alias,
577 the arguments actually get evaluated twice in two potentially different
578 modes.
579
580
582 These macros are always predefined. Their actual calling sequence de‐
583 pends on the current mode; here we use cpp-like notation.
584
585
586 #define x y
587 This defines the user macro x as y. y can be any valid GPP in‐
588 put, and may for example refer to other macros. x must be an
589 identifier (i.e., a sequence of alphanumeric characters and
`_'), unless named arguments are specified. If x is already
defined, the previous definition is overwritten. If no second
argument is given, x will be defined as a macro that outputs
nothing. Neither x nor y is evaluated; the macro definition is only
evaluated when it is called, not when it is declared.
595
596 It is also possible to name the arguments in a macro definition:
597 in that case, the argument x should be a user-macro call whose
598 arguments are all identifiers. These identifiers become avail‐
599 able as user-macros inside the macro definition; these virtual
600 macros must be called without arguments, and evaluate to the
601 corresponding macro parameter.
602
603 #defeval x y
This acts in a similar way to #define, but the second argument y
is evaluated immediately. Since user macro definitions are also
evaluated each time they are called, this means that the text y
will undergo two successive evaluations. #defeval is particularly
useful because it is the only way to evaluate something more than
once, which may be needed to force evaluation of the arguments of
a meta-macro that normally doesn't perform any evaluation.
However, since all argument references evaluated at define-time
are understood as the arguments of the body in which the macro is
being defined and not as the arguments of the macro itself, one
usually has to use the quote character to prevent immediate
evaluation of argument references.
616
617 #undef x
618 This removes any existing definition of the user macro x.
619
620 #ifdef x
621 This begins a conditional block. Everything that follows is
622 evaluated only if the identifier x is defined, and until either
623 a #else or a #endif statement is reached. Note, however, that
624 the commented text is still scanned thoroughly, so its syntax
625 must be valid. It is in particular legal to have the #else or
626 #endif statement ending the conditional block appear only as the
627 result of a user-macro expansion and not explicitly in the in‐
628 put.
629
630 #ifndef x
631 This begins a conditional block. Everything that follows is
632 evaluated only if the identifier x is not defined.
633
634 #ifeq x y
635 This begins a conditional block. Everything that follows is
636 evaluated only if the results of the evaluations of x and y are
637 identical as character strings. Any leading or trailing white‐
638 space is ignored for the comparison. Note that in cpp-mode any
639 unquoted whitespace character is understood as the end of the
640 first argument, so it is necessary to be careful.
641
642 #ifneq x y
643 This begins a conditional block. Everything that follows is
644 evaluated only if the results of the evaluations of x and y are
645 not identical (even up to leading or trailing whitespace).
646
647 #else This toggles the logical value of the current conditional block.
648 What follows is evaluated if and only if the preceding input was
649 commented out.
650
651 #endif This ends a conditional block started by a #if... meta-macro.
652
653 #include file
654 This causes GPP to open the specified file and evaluate its con‐
655 tents, inserting the resulting text in the current output. All
656 defined user macros are still available in the included file,
657 and reciprocally all macros defined in the included file will be
658 available in everything that follows. The include file is looked
659 for first in the current directory, and then, if not found, in
660 one of the directories specified by the -I command-line option
661 (or /usr/include if no directory was specified). Note that, for
662 compatibility reasons, it is possible to put the file name be‐
663 tween "" or <>.
664
The order in which the various directories are searched for
include files is affected by the --nostdinc, --nocurinc and
--curdirinclast command-line options.
668
669 Upon including a file, GPP immediately saves a copy of the cur‐
670 rent operating mode onto the mode stack, and restores the oper‐
671 ating mode at the end of the included file. The included file
672 may override this behavior by starting with a #mode restore call
673 and ending with a #mode push call. Additionally, when the -m
674 command line option is specified, GPP will automatically switch
675 to the cpp compatibility mode upon including a file whose name
676 ends with either `.c' or `.h'.
677
678 #exec command
679 This causes GPP to execute the specified command line and in‐
680 clude its standard output in the current output. Note that, for
681 security reasons, this meta-macro is disabled unless the -x com‐
682 mand line flag was specified. If use of #exec is not allowed, a
683 warning message is printed and the output is left blank. Note
684 that the specified command line is evaluated before being exe‐
685 cuted, thus allowing the use of macros in the command-line. How‐
686 ever, the output of the command is included verbatim and not
687 evaluated. If you need the output to be evaluated, you must use
688 #defeval (see above) to cause a double evaluation.
689
690 #eval expr
691 The #eval meta-macro attempts to evaluate expr first by expand‐
692 ing macros (normal GPP evaluation) and then by performing arith‐
693 metic evaluation and/or wildcard matching. The syntax and oper‐
694 ator precedence for arithmetic expressions are the same as in C;
695 the only missing operators are <<, >>, ?:, and the assignment
696 operators.
697
698 POSIX-style wildcard matching (`globbing') is available only on
699 POSIX implementations and can be invoked with the =~ operator.
700 In brief, a `?' matches any single character, a `*' matches any
701 string (including the empty string), and `[...]' matches any one
702 of the characters enclosed in brackets. A `[...]' class is com‐
703 plemented when the first character in the brackets is `!'. The
704 characters in a `[...]' class can also be specified as a range
705 using the `-' character -- e.g., `[F-N]' is equivalent to
706 `[FGHIJKLMN]'.
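
The wildcard rules above can be tried out with Python's fnmatch module,
which follows the same shell-style conventions. This is only an analogy
for experimentation; details of GPP's =~ operator may differ from
Python's matcher.

```python
from fnmatch import fnmatchcase

# (text, pattern, expected) triples covering each rule described above
examples = [
    ("cat", "c?t", True),      # '?' matches exactly one character
    ("ct", "c?t", False),
    ("", "*", True),           # '*' also matches the empty string
    ("H", "[F-N]", True),      # range inside a class
    ("Z", "[F-N]", False),
    ("x", "[!abc]", True),     # '!' complements the class
    ("a", "[!abc]", False),
]

results = [fnmatchcase(text, pattern) for text, pattern, _ in examples]
```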
707
708 If GPP cannot assign a numerical value to the result, the returned
709 text is simply the result of macro expansion without any
710 arithmetic evaluation. The only exceptions to this rule are the
711 comparison operators ==, !=, <, >, <=, and >= which, if one of
712 the sides does not evaluate to a number, perform string compari‐
713 son instead (ignoring trailing and leading spaces). Addition‐
714 ally, the length(...) arithmetic operator returns the length in
715 characters of its evaluated argument.
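
The fallback from numeric to string comparison can be sketched as
follows, using == as the example operator. This is a hypothetical
model, not GPP's evaluator: the helper names are invented, and numbers
are assumed to be plain decimal integers.

```python
def to_number(text):
    """Return the integer value of `text`, or None if it is not a number.

    Assumption: only plain decimal integers count as numbers here.
    """
    try:
        return int(text.strip())
    except ValueError:
        return None

def compare_eq(left, right):
    """Model of ==: numeric when both sides are numbers, else textual."""
    a, b = to_number(left), to_number(right)
    if a is not None and b is not None:
        return a == b
    # String comparison ignores leading and trailing spaces.
    return left.strip() == right.strip()
```

Under this model, "07" and "7" compare equal as numbers, while " foo "
and "foo" compare equal as strings.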
716
717 Inside arithmetic expressions, the defined(...) special user
718 macro is also available: it takes only one argument, which is
719 not evaluated, and returns 1 if it is the name of a user macro
720 and 0 otherwise.
721
722 #if expr
723 This meta-macro invokes the arithmetic/globbing evaluator in the
724 same manner as #eval and compares the result of evaluation with
725 the string "0" in order to begin a conditional block. In partic‐
726 ular note that the logical value of expr is always true when it
727 cannot be evaluated to a number.
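
The truth rule can be summarized by the following one-line Python
model. The function name is hypothetical, and an exact comparison with
"0" (no whitespace trimming) is an assumption.

```python
def if_condition(evaluated):
    """Model of #if: anything other than the string "0" is true."""
    return evaluated != "0"
```

In particular, if_condition("foo") is true in this model, which
mirrors the remark that a non-numeric expr always counts as true.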
728
729 #elif expr
730 This meta-macro can be used to avoid nested #if conditions. #if
731 ... #elif ... #endif is equivalent to #if ... #else #if ...
732 #endif #endif.
733
734 #mode keyword ...
735 This meta-macro controls GPP's operating mode. See below for a
736 list of #mode commands.
737
738 #line This meta-macro evaluates to the line number of the current in‐
739 put file.
740
741 #file This meta-macro evaluates to the filename of the current input
742 file as it appears on the command line or in the argument to
743 #include. If GPP is reading its input from stdin, then #file
744 evaluates to `stdin'.
745
746 #date fmt
747 This meta-macro evaluates to the current date and time as for‐
748 matted by the specified format string fmt. See the section DATE
749 AND TIME CONVERSION SPECIFIERS below.
750
751 #error msg
752 This meta-macro causes an error message with the current file‐
753 name and line number, and with the text msg, to be printed to
754 the standard error device. Subsequent processing is then
755 aborted.
756
757 #warning msg
758 This meta-macro causes a warning message with the current file‐
759 name and line number, and with the text msg, to be printed to
760 the standard error device. Subsequent processing is then re‐
761 sumed.
762
763
764
765 The key to GPP's flexibility is the #mode meta-macro. Its first argu‐
766 ment is always one of a list of available keywords (see below); its
767 second argument is always a sequence of words separated by whitespace.
768 Apart from possibly the first of them, each of these words is always a
769 delimiter or syntax specifier, and should be provided as a C string de‐
770 limited by double quotes (" "). The various special matching sequences
771 listed in the section on syntax specification are available. Any #mode
772 command is parsed in a mode where "..." is understood to be a C-style
773 string, so it is safe to put any character inside these strings. Also
774 note that the first argument of #mode (the keyword) is never evaluated,
775 while the second argument is evaluated (except of course for the con‐
776 tents of C strings), so that the syntax specification may be obtained
777 as the result of a macro evaluation.
778
779
780 The available #mode commands are:
781
782
783 #mode save / #mode push
784 Push the current mode specification onto the mode stack.
785
786 #mode restore / #mode pop
787 Pop mode specification from the mode stack.
788
789 #mode standard name
790 Select one of the standard modes. The only argument must be one
791 of: default (default mode); cpp, C (cpp mode); tex, TeX
792 (TeX mode); html, HTML (html mode); xhtml, XHTML (xhtml mode);
793 prolog, Prolog (prolog mode). The mode name must be given di‐
794 rectly, not as a C string.
795
796 #mode user "s1" ... "s9"
797 Specify user macro syntax. The 9 arguments, all of them C
798 strings, are the mode specification for user macros (see the -U
799 command-line option and the section on syntax specification).
800 The meta-macro specification is not affected.
801
802 #mode meta {user | "s1" ... "s7"}
803 Specify meta-macro syntax. Either the only argument is user
804 (not as a string), and the user-macro mode specifications are
805 copied into the meta-macro mode specifications, or there must be
806 seven string arguments, whose significance is the same as for
807 the -M command-line option (see section on syntax specifica‐
808 tion).
809
810 #mode quote ["c"]
811 With no argument or "" as argument, removes the quote character
812 specification and disables the quoting functionality. With one
813 string argument, the first character of the string is taken to
814 be the new quote character. The quote character can be neither
815 alphanumeric nor `_', nor can it be one of the special matching
816 sequences.
817
818 #mode comment [xxx] "start" "end" ["c" ["c"]]
819 Add a comment specification. Optionally a first argument con‐
820 sisting of three characters not enclosed in " " can be used to
821 specify a comment/string modifier (see the section on syntax
822 specification). The default modifier is ccc. The first two
823 string arguments are used as comment start and end sequences re‐
824 spectively. The third string argument is optional and can be
825 used to specify a string-quote character. (If it is "", the
826 functionality is disabled.) The fourth string argument is op‐
827 tional and can be used to specify a string delimitation warning
828 character. (If it is "", the functionality is disabled.)
829
830 #mode string [xxx] "start" "end" ["c" ["c"]]
831 Add a string specification. Identical to #mode comment except
832 that the default modifier is sss.
833
834 #mode nocomment / #mode nostring ["start"]
835 With no argument, remove all comment/string specifications. With
836 one string argument, delete the comment/string specification
837 whose start sequence is the argument.
838
839 #mode preservelf { on | off | 1 | 0 }
840 Equivalent to the -n command-line switch. If the argument is on
841 or 1, any newline or whitespace character terminating a macro
842 call or a comment/string is left in the input stream for further
843 processing. If the argument is off or 0 this feature is dis‐
844 abled.
845
846 #mode charset { id | op | par } "string"
847 Specify the character sets to be used for matching the \o, \O
848 and \i special sequences. The first argument must be one of id
849 (the set matched by \i), op (the set matched by \o) or par (the
850 set matched by \O in addition to the one matched by \o).
851 "string" is a C string which lists all characters to put in the
852 set. It may contain only the special matching sequences \a, \A,
853 \b, \B, and \# (the other sequences and the negated sequences
854 are not allowed). When a `-' is found between two non-special
855 characters, this adds all characters in between (e.g., "A-Z"
856 corresponds to all uppercase characters). To have `-' in the matched
857 set, either put it in first or last position or place it next to
858 a \x sequence.
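
The range rule for `-' can be illustrated with a small Python sketch.
It is a simplified, hypothetical model (the name expand_charset is
invented) that handles only ordinary characters; the special matching
sequences such as \a or \# are not modeled.

```python
def expand_charset(spec):
    """Expand a charset string such as "A-Z_" into a set of characters.

    A '-' between two characters denotes a range; a '-' in first or
    last position is taken literally. Special \\x sequences are not
    handled in this sketch.
    """
    chars = set()
    i = 0
    while i < len(spec):
        if i + 2 < len(spec) and spec[i + 1] == "-":
            low, high = spec[i], spec[i + 2]
            chars.update(chr(c) for c in range(ord(low), ord(high) + 1))
            i += 3
        else:
            chars.add(spec[i])
            i += 1
    return chars
```

For instance, "A-D" expands to the four letters A, B, C and D, while
"-az" and "az-" both yield the three characters `-', `a' and `z'.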
859
860
861
863 Ordinary characters placed in the format string are copied without con‐
864 version. Conversion specifiers are introduced by a `%' character, and
865 are replaced as follows:
866
867
868 %a The abbreviated weekday name according to the current locale.
869
870 %A The full weekday name according to the current locale.
871
872 %b The abbreviated month name according to the current locale.
873
874 %B The full month name according to the current locale.
875
876 %c The preferred date and time representation for the current lo‐
877 cale.
878
879 %d The day of the month as a decimal number (range 01 to 31).
880
881 %F Equivalent to %Y-%m-%d (the ISO 8601 date format).
882
883 %H The hour as a decimal number using a 24-hour clock (range 00 to
884 23).
885
886 %I The hour as a decimal number using a 12-hour clock (range 01 to
887 12).
888
889 %j The day of the year as a decimal number (range 001 to 366).
890
891 %m The month as a decimal number (range 01 to 12).
892
893 %M The minute as a decimal number (range 00 to 59).
894
895 %p Either `AM' or `PM' according to the given time value, or
896 the corresponding strings for the current locale. Noon is
897 treated as `PM' and midnight as `AM'.
898
899 %R The time in 24-hour notation (%H:%M).
900
901 %S The second as a decimal number (range 00 to 61).
902
903 %U The week number of the current year as a decimal number,
904 range 00 to 53, starting with the first Sunday as the first
905 day of week 01.
906
907 %w The day of the week as a decimal, range 0 to 6, Sunday being
908 0.
909
910 %W The week number of the current year as a decimal number,
911 range 00 to 53, starting with the first Monday as the first
912 day of week 01.
913
914 %x The preferred date representation for the current locale with‐
915 out the time.
916
917 %X The preferred time representation for the current locale with‐
918 out the date.
919
920 %y The year as a decimal number without a century (range 00 to
921 99).
922
923 %Y The year as a decimal number including the century.
924
925 %Z The time zone name or abbreviation.
926
927 %% A literal `%' character.
928
929
930
931 Depending on the C compiler and library used to compile GPP, there may
932 be more conversion specifiers available. Consult your compiler's docu‐
933 mentation for the strftime() function. Note, however, that any conver‐
934 sion specifiers not listed above may not be portable across installa‐
935 tions of GPP.
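
Because these specifiers are those of strftime(), they can be explored
outside GPP. The Python sketch below formats an arbitrarily chosen
fixed timestamp with a few locale-independent specifiers from the list
above; it illustrates the specifiers only and says nothing about how
GPP itself calls strftime().

```python
from datetime import datetime

# Arbitrary fixed timestamp, chosen for illustration (a Saturday).
moment = datetime(2020, 2, 29, 13, 5, 9)

iso_date = moment.strftime("%Y-%m-%d")   # the expansion of %F
clock = moment.strftime("%H:%M")         # the expansion of %R
day_of_year = moment.strftime("%j")      # zero-padded to three digits
weekday = moment.strftime("%w")          # Sunday is 0
literal = moment.strftime("100%%")       # %% yields a literal percent sign
```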
936
937
939 Here is a basic self-explanatory example in standard or cpp mode:
940
941
942
943 #define FOO This is
944 #define BAR a message.
945 #define concat #1 #2
946 concat(FOO,BAR)
947 #ifeq (concat(foo,bar)) (foo bar)
948 This is output.
949 #else
950 This is not output.
951 #endif
952
953 Using argument naming, the concat macro could alternatively be defined
954 as
955
956
957
958 #define concat(x,y) x y
959
960 In TeX mode and using argument naming, the same example becomes:
961
962
963
964 \define{FOO}{This is}
965 \define{BAR}{a message.}
966 \define{\concat{x}{y}}{\x \y}
967 \concat{\FOO}{\BAR}
968 \ifeq{\concat{foo}{bar}}{foo bar}
969 This is output.
970 \else
971 This is not output.
972 \endif
973
974 In HTML mode and without argument naming, one gets similarly:
975
976
977
978 <#define FOO|This is>
979 <#define BAR|a message.>
980 <#define concat|#1 #2>
981 <#concat <#FOO>|<#BAR>>
982 <#ifeq <#concat foo|bar>|foo bar>
983 This is output.
984 <#else>
985 This is not output.
986 <#endif>
987
988 The following example (in standard mode) illustrates the use of the
989 quote character:
990
991
992
993 #define FOO This is \
994 a multiline definition.
995 #define BLAH(x) My argument is x
996 BLAH(urf)
997 \BLAH(urf)
998
999 Note that the multiline definition is also valid in cpp and Prolog
1000 modes despite the absence of quote character, because `\' followed by a
1001 newline is then interpreted as a comment and discarded.
1002
1003
1004 In cpp mode, C strings and comments are understood as such, as illus‐
1005 trated by the following example:
1006
1007
1008
1009 #define BLAH foo
1010 BLAH "BLAH" /* BLAH */
1011 `It\'s a /*string*/ !'
1012
1013 The main difference between Prolog mode and cpp mode is the handling of
1014 strings and comments: in Prolog, a `...' string may not begin immedi‐
1015 ately after a digit, and a /*...*/ comment may not begin immediately
1016 after an operator character. Furthermore, comments are not removed from
1017 the output unless they occur in a #command.
1018
1019
1020 The differences between cpp mode and default mode are deeper: in de‐
1021 fault mode #commands may start anywhere, while in cpp mode they must be
1022 at the beginning of a line; the default mode has no knowledge of com‐
1023 ments and strings, but has a quote character (`\'), while cpp mode has
1024 extensive comment/string specifications but no quote character. More‐
1025 over, the arguments to meta-macros need to be correctly parenthesized
1026 in default mode, while no such checking is performed in cpp mode.
1027
1028
1029 This makes it easier to nest meta-macro calls in default mode than in
1030 cpp mode. For example, consider the following HTML mode input, which
1031 tests for the availability of the #exec command:
1032
1033
1034
1035 <#ifeq <#exec echo blah>|blah
1036 > #exec allowed <#else> #exec not allowed <#endif>
1037
1038 There is no cpp mode equivalent, while in default mode it can be easily
1039 translated as
1040
1041
1042
1043 #ifeq (#exec echo blah
1044 ) (blah
1045 )
1046 \#exec allowed
1047 #else
1048 \#exec not allowed
1049 #endif
1050
1051 In order to nest meta-macro calls in cpp mode it is necessary to modify
1052 the mode description, either by changing the meta-macro call syntax, or
1053 more elegantly by defining a silent string and using the fact that the
1054 context at the beginning of an evaluated string is a newline character:
1055
1056
1057
1058 #mode string QQQ "$" "$"
1059 #ifeq $#exec echo blah
1060 $ $blah
1061 $
1062 \#exec allowed
1063 #else
1064 \#exec not allowed
1065 #endif
1066
1067 Note, however, that comments/strings cannot be nested ("..." inside
1068 $...$ would go undetected), so one needs to be careful about what to
1069 include inside such a silent evaluated string. In this example, the
1070 loose meta-macro nesting introduced in version 2.1 makes it possible to
1071 use the following simpler version:
1072
1073
1074
1075 #ifeq blah #exec echo -n blah
1076 \#exec allowed
1077 #else
1078 \#exec not allowed
1079 #endif
1080
1081 Remember that macros without arguments are actually understood to be
1082 aliases when they are called with arguments, as illustrated by the fol‐
1083 lowing example (default or cpp mode):
1084
1085
1086
1087 #define DUP(x) x x
1088 #define FOO and I said: DUP
1089 FOO(blah)
1090
1091 The usefulness of the #defeval meta-macro is shown by the following ex‐
1092 ample in HTML mode:
1093
1094
1095
1096 <#define APPLY|<#defeval TEMP|<\##1 \#1>><#TEMP #2>>
1097 <#define <#foo x>|<#x> and <#x>>
1098 <#APPLY foo|BLAH>
1099
1100 The reason why #defeval is needed is that, since everything is evalu‐
1101 ated in a single pass, the input that will result in the desired macro
1102 call needs to be generated by a first evaluation of the arguments
1103 passed to APPLY before being evaluated a second time.
1104
1105
1106 To translate this example into default mode, one needs to resort to
1107 parenthesizing in order to nest the #defeval call inside the definition
1108 of APPLY, but needs to do so without outputting the parentheses. The
1109 easiest solution is
1110
1111
1112
1113 #define BALANCE(x) x
1114 #define APPLY(f,v) BALANCE(#defeval TEMP f
1115 TEMP(v))
1116 #define foo(x) x and x
1117 APPLY(\foo,BLAH)
1118
1119 As explained above, the simplest version in cpp mode relies on defining
1120 a silent evaluated string to play the role of the BALANCE macro.
1121
1122
1123 The following example (default or cpp mode) demonstrates arithmetic
1124 evaluation:
1125
1126
1127
1128 #define x 4
1129 The answer is:
1130 #eval x*x + 2*(16-x) + 1998%x
1131
1132 #if defined(x)&&!(3*x+5>17)
1133 This should be output.
1134 #endif
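
Since the operators used here have the same meaning and precedence in
C and in Python for these operand values, the arithmetic of the example
above can be checked by hand with the following sketch. The variable
names answer and condition are introduced for illustration only.

```python
x = 4

# x*x + 2*(16-x) + 1998%x  ->  16 + 24 + 2
answer = x * x + 2 * (16 - x) + 1998 % x

# defined(x) holds, and 3*x+5 is exactly 17, so 3*x+5>17 is false
# and its negation is true.
condition = not (3 * x + 5 > 17)
```

With x defined as 4, the expression evaluates to 42 and the condition
of the #if holds.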
1135
1136 To finish, here are some examples involving mode switching. The fol‐
1137 lowing example is self-explanatory (starting in default mode):
1138
1139
1140
1141 #mode push
1142 #define f(x) x x
1143 #mode standard tex
1144 \f{blah}
1145 \mode{string}{"$" "$"}
1146 \mode{comment}{"/*" "*/"}
1147 $\f{urf}$ /* blah */
1148 \define{FOO}{bar/* and some more */}
1149 \mode{pop}
1150 f($FOO$)
1151
1152 A good example where a user-defined mode becomes useful is the GPP
1153 source of this document (available with GPP's source code distribu‐
1154 tion).
1155
1156
1157 Another interesting application is selectively forcing evaluation of
1158 macros in C strings when in cpp mode. For example, consider the follow‐
1159 ing input:
1160
1161
1162
1163 #define blah(x) "and he said: x"
1164 blah(foo)
1165
1166 Obviously one would want the parameter x to be expanded inside the
1167 string. There are several ways around this problem:
1168
1169
1170
1171 #mode push
1172 #mode nostring "\""
1173 #define blah(x) "and he said: x"
1174 #mode pop
1175
1176 #mode quote "`"
1177 #define blah(x) `"and he said: x`"
1178
1179 #mode string QQQ "$$" "$$"
1180 #define blah(x) $$"and he said: x"$$
1181
1182 The first method is very natural, but has the inconvenience of being
1183 lengthy and neutralizing string semantics, so that having an unevalu‐
1184 ated instance of `x' in the string, or an occurrence of `/*', would be
1185 impossible without resorting to further contortions.
1186
1187 The second method is slightly more efficient because the local presence
1188 of a quote character makes it easier to control what is evaluated and
1189 what isn't, but has the drawback that it is sometimes impossible to
1190 find a reasonable quote character without having to either signifi‐
1191 cantly alter the source file or enclose it inside a #mode push/pop con‐
1192 struct. For example, any occurrence of `/*' in the string would have to
1193 be quoted.
1194
1195 The last method demonstrates the efficiency of evaluated strings in the
1196 context of selective evaluation: since comments/strings cannot be
1197 nested, any occurrence of `"' or `/*' inside the `$$' gets output as
1198 plain text, as expected inside a string, and only macro evaluation is
1199 enabled. Also note that there is much more freedom in the choice of a
1200 string delimiter than in the choice of a quote character.
1201
1202
1203 Starting with version 2.1, meta-macro calls can be nested more effi‐
1204 ciently in default, cpp and Prolog modes. This makes it easy to make a
1205 user version of a meta-macro, or to increment a counter:
1206
1207
1208
1209 #define myeval #eval #1
1210
1211 #define x 1
1212 #defeval x #eval x+1
1213
1214
1215
1217 Here are some examples of advanced constructions using GPP. They tend
1218 to be pretty awkward and should be considered as evidence of GPP's lim‐
1219 itations.
1220
1221
1222 The first example is a recursive macro. The main problem is that (since
1223 GPP evaluates everything) a recursive macro must be very careful about
1224 the way in which recursion is terminated in order to avoid undefined
1225 behavior (most of the time GPP will simply crash). In particular, rely‐
1226 ing on a #if/#else/#endif construct to end recursion is not possible
1227 and results in an infinite loop, because GPP scans user macro calls
1228 even in the unevaluated branch of the conditional block. A safe way to
1229 proceed is for example as follows (we give the example in TeX mode):
1230
1231
1232
1233 \define{countdown}{
1234 \if{#1}
1235 #1...
1236 \define{loop}{\countdown}
1237 \else
1238 Done.
1239 \define{loop}{}
1240 \endif
1241 \loop{\eval{#1-1}}
1242 }
1243 \countdown{10}
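
The shape of this technique (choose what `loop' means inside the
conditional, then call it unconditionally afterwards) can be mirrored
in Python as below. This is only an analogy: Python never evaluates
the branch that is not taken, so it does not suffer from the problem
described above, and the output list is a stand-in for GPP's output
stream.

```python
def countdown(n, out):
    """Analog of the TeX-mode countdown macro; appends text to `out`."""
    if n != 0:
        out.append(f"{n}...")
        loop = countdown                  # keep recursing
    else:
        out.append("Done.")
        loop = lambda n, out: None        # empty definition ends recursion
    # The call itself is unconditional, as in the macro version.
    loop(n - 1, out)
```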
1244
1245
1246 Another example, in cpp mode:
1247
1248
1249
1250 #mode string QQQ "$" "$"
1251 #define triangle(x,y) y \
1252 $#if length(y)<x$ $#define iter triangle$ $#else$ \
1253 $#define iter$ $#endif
1254 $ iter(x,*y)
1255 triangle(20)
1256
1257
1258 The following is an (unfortunately very weak) attempt at implementing
1259 functional abstraction in GPP (in standard mode). Understanding this
1260 example and why it can't be made much simpler is an exercise left to
1261 the curious reader.
1262
1263
1264
1265 #mode string "`" "`" "\\"
1266 #define ASIS(x) x
1267 #define SILENT(x) ASIS()
1268 #define EVAL(x,f,v) SILENT(
1269 #mode string QQQ "`" "`" "\\"
1270 #defeval TEMP0 x
1271 #defeval TEMP1 (
1272 \#define \TEMP2(TEMP0) f
1273 )
1274 TEMP1
1275 )TEMP2(v)
1276 #define LAMBDA(x,f,v) SILENT(
1277 #ifneq (v) ()
1278 #define TEMP3(a,b,c) EVAL(a,b,c)
1279 #else
1280 #define TEMP3(a,b,c) \LAMBDA(a,b)
1281 #endif
1282 )TEMP3(x,f,v)
1283 #define EVALAMBDA(x,y) SILENT(
1284 #defeval TEMP4 x
1285 #defeval TEMP5 y
1286 )
1287 #define APPLY(f,v) SILENT(
1288 #defeval TEMP6 ASIS(\EVA)f
1289 TEMP6
1290 )EVAL(TEMP4,TEMP5,v)
1291
1292 This yields the following results:
1293
1294
1295
1296 LAMBDA(z,z+z)
1297 => LAMBDA(z,z+z)
1298
1299 LAMBDA(z,z+z,2)
1300 => 2+2
1301
1302 #define f LAMBDA(y,y*y)
1303 f
1304 => LAMBDA(y,y*y)
1305
1306 APPLY(f,blah)
1307 => blah*blah
1308
1309 APPLY(LAMBDA(t,t t),(t t))
1310 => (t t) (t t)
1311
1312 LAMBDA(x,APPLY(f,(x+x)),urf)
1313 => (urf+urf)*(urf+urf)
1314
1315 APPLY(APPLY(LAMBDA(x,LAMBDA(y,x*y)),foo),bar)
1316 => foo*bar
1317
1318 #define test LAMBDA(y,`#ifeq y urf
1319 y is urf#else
1320 y is not urf#endif
1321 `)
1322 APPLY(test,urf)
1323 => urf is urf
1324
1325 APPLY(test,foo)
1326 => foo is not urf
1327
1328
1329
1331 strftime(3), glob(7), m4(1V), cpp(1)
1332
1333 GPP home page: https://logological.org/gpp/
1334
1335
1337 GPP was written by Denis Auroux <auroux@math.mit.edu>. Since version
1338 2.12 it has been maintained by Tristan Miller
1339 <tristan@logological.org>.
1340
1341
1343 Copyright (C) 1996-2001 Denis Auroux.
1344
1345 Copyright (C) 2003-2020 Tristan Miller.
1346
1347 Permission is granted to anyone to make or distribute verbatim copies
1348 of this document as received, in any medium, provided that the copy‐
1349 right notice and this permission notice are preserved, thus giving the
1350 recipient permission to redistribute in turn.
1351
1352 Permission is granted to distribute modified versions of this document,
1353 or of portions of it, under the above conditions, provided also that
1354 they carry prominent notices stating who last changed them.
1355
1356
1357
1358 GPP(1)