GPP(1) General Commands Manual GPP(1)

NAME
GPP - Generic Preprocessor

SYNOPSIS
gpp [-{o|O} outfile] [-I/include/path ...]
    [-Dname=val ...] [-z|+z] [-x] [-m]
    [-C|-T|-H|-X|-P|-U ... [-M ...]]
    [-n|+n] [+c<n> str1 str2] [+s<n> str1 str2 c]
    [-c str1] [--nostdinc] [--nocurinc]
    [--curdirinclast] [--warninglevel n]
    [--includemarker str] [--include file]
    [infile]

gpp --help

gpp --version

DESCRIPTION
26 GPP is a general-purpose preprocessor with customizable syntax, suit‐
27 able for a wide range of preprocessing tasks. Its independence from any
28 programming language makes it much more versatile than cpp, while its
29 syntax is lighter and more flexible than that of m4.
30
31
GPP is targeted at all common preprocessing tasks where cpp is not
suitable and where very sophisticated features are not needed. So that
it can process text files and source code in a variety of languages
equally efficiently, the syntax used by GPP is fully customizable. The
handling of comments and strings is especially advanced.
37
38
39 Initially, GPP only understands a minimal set of built-in macros,
40 called meta-macros. These meta-macros allow the definition of user
41 macros as well as some basic operations forming the core of the prepro‐
42 cessing system, including conditional tests, arithmetic evaluation,
43 wildcard matching (globbing), and syntax specification. All user macro
44 definitions are global -- i.e., they remain valid until explicitly re‐
45 moved; meta-macros cannot be redefined. With each user macro definition
46 GPP keeps track of the corresponding syntax specification so that a
47 macro can be safely invoked regardless of any subsequent change in op‐
48 erating mode.
49
50
51 In addition to macros, GPP understands comments and strings, whose syn‐
52 tax and behavior can be widely customized to fit any particular pur‐
53 pose. Internally comments and strings are the same construction, so
54 everything that applies to comments applies to strings as well.
55
OPTIONS
GPP recognizes the following command-line switches and options. Note
that the -nostdinc, -nocurinc, -curdirinclast, -warninglevel, and
-includemarker options from version 2.1 and earlier are deprecated and
should not be used. Use the "long option" variants instead (--nostdinc,
etc.).
63
64 -h --help
65 Print a short help message.
66
67 --version
68 Print version information.
69
70 -o outfile
71 Specify a file to which all output should be sent (by default,
72 everything is sent to standard output).
73
74 -O outfile
75 Specify a file to which all output should be sent; output is si‐
76 multaneously sent to stdout.
77
78 -I/include/path
79 Specify a path where the #include meta-macro will look for in‐
80 clude files if they are not present in the current directory.
81 The default is /usr/include if no -I option is specified. Multi‐
82 ple -I options may be specified to look in several directories.
83
84 -Dname=val
85 Define the user macro name as equal to val. This is strictly
86 equivalent to using the #define meta-macro, but makes it possi‐
87 ble to define macros from the command-line. If val makes refer‐
88 ences to arguments or other macros, it should conform to the
89 syntax of the mode specified on the command-line. Starting with
90 version 2.1, macro argument naming is allowed on the command-
91 line. The syntax is as follows: -Dmacro(arg1,...)=definition.
92 The arguments are specified in C-style syntax, without any
93 whitespace, but the definition should still conform to the syn‐
94 tax of the mode specified on the command-line.
95
96 +z Set text mode to Unix mode (LF terminator). Any CR character in
97 the input is systematically discarded. This is the default under
98 Unix systems.
99
100 -z Set text mode to DOS mode (CR-LF terminator). In this mode all
101 CR characters are removed from the input, and all output LF
102 characters are converted to CR-LF. This is the default if GPP is
103 compiled with the WIN_NT option.
104
-x Enable the use of the #exec meta-macro. Because #exec includes the
output of an arbitrary shell command line, it is a potential security
risk and is therefore disabled unless this option is specified.
109
110 -m Enable automatic mode switching to the cpp compatibility mode if
111 the name of an included file ends in `.h' or `.c'. This makes it
112 possible to include C header files with only minor modifica‐
113 tions.
114
115 -n Prevent newline or whitespace characters from being removed from
116 the input when they occur as the end of a macro call or of a
117 comment. By default, when a newline or whitespace character
118 forms the end of a macro or a comment it is parsed as part of
119 the macro call or comment and therefore removed from output. Use
120 the -n option to keep the last character in the input stream if
121 it was whitespace or a newline. This is activated in cpp and
122 Prolog modes.
123
124 +n The opposite of -n. This is the default in all modes except cpp
125 and Prolog. Note that +n must be placed after -C or -P in order
126 to have any effect.
127
128 -U arg1 ... arg9
129 User-defined mode. The nine following command-line arguments are
130 taken to be respectively the macro start sequence, the macro end
131 sequence for a call without arguments, the argument start se‐
132 quence, the argument separator, the argument end sequence, the
133 list of characters to stack for argument balancing, the list of
134 characters to unstack, the string to be used for referring to an
135 argument by number, and finally the quote character (if there is
136 none an empty string should be provided). These settings apply
137 both to user macros and to meta-macros, unless the -M option is
138 used to define other settings for meta-macros. See the section
139 on syntax specification for more details.
140
141 -M arg1 ... arg7
142 User-defined mode specifications for meta-macros. This option
143 can only be used together with -U. The seven following command-
144 line arguments are taken to be respectively the macro start se‐
145 quence, the macro end sequence for a call without arguments, the
146 argument start sequence, the argument separator, the argument
147 end sequence, the list of characters to stack for argument bal‐
148 ancing, and the list of characters to unstack. See below for
149 more details.
150
151 (default mode)
152 The default mode is a vaguely cpp-like mode, but it does not
153 handle comments, and presents various incompatibilities with
154 cpp. Typical meta-macros and user macros look like this:
155
156
157 #define x y
158 macro(arg,...)
159
160 This mode is equivalent to
161
162
163 -U "" "" "(" "," ")" "(" ")" "#" "\\"
164 -M "#" "\n" " " " " "\n" "(" ")"
165
166
167 -C cpp compatibility mode. This is the mode where GPP's behavior is
168 the closest to that of cpp. Unlike in the default mode, meta-
169 macro expansion occurs only at the beginning of lines, and C
170 comments and strings are understood. This mode is equivalent to
171
172
173 -n -U "" "" "(" "," ")" "(" ")" "#" ""
174 -M "\n#\w" "\n" " " " " "\n" "" ""
175 +c "/*" "*/" +c "//" "\n" +c "\\\n" ""
176 +s "\"" "\"" "\\" +s "'" "'" "\\"
177
178
179 -T TeX-like mode. In this mode, typical meta-macros and user macros
180 look like this:
181
182
183 \define{x}{y}
184 \macro{arg}{...}
185
186 No comments are understood. This mode is equivalent to
187
188
189 -U "\\" "" "{" "}{" "}" "{" "}" "#" "@"
190
191
192 -H HTML-like mode. In this mode, typical meta-macros and user
193 macros look like this:
194
195
196 <#define x|y>
197 <#macro arg|...>
198
199 No comments are understood. This mode is equivalent to
200
201
202 -U "<#" ">" "\B" "|" ">" "<" ">" "#" "\\"
203
204
205 -X XHTML-like mode. In this mode, typical meta-macros and user
206 macros look like this:
207
208
209 <#define x|y/>
210 <#macro arg|.../>
211
212 No comments are understood. This mode is equivalent to
213
214
215 -U "<#" "/>" "\B" "|" "/>" "<" ">" "#" "\\"
216
217
218 -P Prolog-compatible cpp-like mode. This mode differs from the cpp
219 compatibility mode by its handling of comments, and is equiva‐
220 lent to
221
222
223 -n -U "" "" "(" "," ")" "(" ")" "#" ""
224 -M "\n#\w" "\n" " " " " "\n" "" ""
225 +ccss "\!o/*" "*/" +ccss "%" "\n" +ccii "\\\n" ""
226 +s "\"" "\"" "" +s "\!#'" "'" ""
227
228
229 +c<n> str1 str2
230 Specify comments. Any unquoted occurrence of str1 will be inter‐
231 preted as the beginning of a comment. All input up to the first
232 following occurrence of str2 will be discarded. This option may
233 be used multiple times to specify different types of comment de‐
234 limiters. The optional parameter <n> can be specified to alter
235 the behavior of the comment and, e.g., turn it into a string or
236 make it ignored under certain circumstances, see below.
237
238 -c str1
239 Un-specify comments or strings. The comment/string specification
240 whose start sequence is str1 is removed. This is useful to alter
241 the built-in comment specifications of a standard mode -- e.g.,
242 the cpp compatibility mode.
243
244 +s<n> str1 str2 c
245 Specify strings. Any unquoted occurrence of str1 will be inter‐
246 preted as the beginning of a string. All input up to the first
247 following occurrence of str2 will be output as is without any
248 evaluation. The delimiters themselves are output. If c is non-
249 empty, its first character is used as a string-quote character
250 -- i.e., a character whose presence immediately before an occur‐
251 rence of str2 prevents it from terminating the string. The op‐
252 tional parameter <n> can be specified to alter the behavior of
253 the string and, e.g., turn it into a comment, enable macro eval‐
254 uation inside the string, or make the string specification ig‐
255 nored under certain circumstances. See below.
256
257 -s str1
258 Un-specify comments or strings. Identical to -c.
259
260 --include file
Process file before infile.
262
263 --nostdinc
264 Do not look for include files in the standard directory /usr/in‐
265 clude.
266
267 --nocurinc
268 Do not look for include files in the current directory.
269
270 --curdirinclast
271 Look for include files in the current directory after the direc‐
272 tories specified by -I rather than before them.
273
274 --warninglevel n
275 Set warning level to n (0, 1 or 2). Default is 2 (most verbose).
276
277 --includemarker str
Keep track of #include directives by inserting a marker in the output
stream. The format of the marker is determined by str, which must
contain three occurrences of the character % (or equivalently ?). The
first occurrence is replaced with the line number, the second with the
file name, and the third with 1, 2 or blank. When this option is
specified in default, cpp or Prolog mode, GPP does its best to ensure
that line numbers are the same in the output as in the input by
inserting blank lines in the place of definitions or comments.
287
288 infile Specify an input file from which GPP reads its input. If no in‐
289 put file is specified, input is read from standard input.
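
The way -I, --nostdinc, --nocurinc and --curdirinclast interact can be
summarized with a short Python sketch. It is only a model of the search
order as this manual describes it, not GPP's own code: the function name
and its parameters are hypothetical, and it assumes that /usr/include is
consulted only when no -I option was given.

```python
def include_search_dirs(include_dirs=(), nostdinc=False,
                        nocurinc=False, curdirinclast=False):
    """Return the directories searched for an include file, in order.

    Hypothetical model of the rules described in this manual.
    """
    # -I directories, falling back to /usr/include when none are given
    # (assumption: --nostdinc only suppresses this fallback).
    dirs = list(include_dirs)
    if not dirs and not nostdinc:
        dirs = ["/usr/include"]
    # --nocurinc removes the current directory from the search entirely.
    if nocurinc:
        return dirs
    # The current directory comes first unless --curdirinclast is set.
    return dirs + ["."] if curdirinclast else ["."] + dirs
```

For instance, include_search_dirs(["inc"], curdirinclast=True) yields
the -I directory first and the current directory last.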
290
SYNTAX SPECIFICATION
294 The syntax of a macro call is as follows: it must start with a sequence
295 of characters matching the macro start sequence as specified in the
296 current mode, followed immediately by the name of the macro, which must
297 be a valid identifier -- i.e., a sequence of letters, digits, or under‐
298 scores ("_"). The macro name must be followed by a short macro end se‐
299 quence if the macro has no arguments, or by a sequence of arguments
300 initiated by an argument start sequence. The various arguments are then
301 separated by an argument separator, and the macro ends with a long
302 macro end sequence.
303
304
305 In all cases, the parameters of the current context -- i.e., the argu‐
306 ments passed to the body being evaluated -- can be referred to by using
307 an argument reference sequence followed by a digit between 1 and 9.
308 Alternatively, macro parameters may be named (see below). Furthermore,
309 to avoid interference between the GPP syntax and the contents of the
310 input file, a quote character is provided. The quote character can be
311 used to prevent the interpretation of a macro call, comment, or string
312 as anything but plain text. The quote character "protects" the follow‐
313 ing character, and always gets removed during evaluation. Two consecu‐
314 tive quote characters evaluate as a single quote character.
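
The effect of the quote character on plain text can be modeled in a few
lines of Python. This is a sketch of the rule stated above (the quote
character protects the next character and is itself removed), not GPP's
implementation; remove_quotes is a hypothetical name and the backslash
default merely mirrors the default mode.

```python
def remove_quotes(text, quote="\\"):
    """Drop each quote character and keep the character it protects."""
    out = []
    i = 0
    while i < len(text):
        if text[i] == quote and i + 1 < len(text):
            out.append(text[i + 1])  # protected character is copied verbatim
            i += 2
        elif text[i] == quote:
            i += 1                   # a trailing quote character is just removed
        else:
            out.append(text[i])
            i += 1
    return "".join(out)
```

Because the protected character is copied without further inspection,
two consecutive quote characters naturally collapse into one.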
315
316
Finally, to facilitate proper argument delimitation, certain characters
can be "stacked" when they occur in a macro argument, so that the
argument separator or macro end sequence is not parsed while the
argument body is unbalanced. This allows nesting macro calls without
using quotes. If an improperly balanced argument is needed, quote
characters should be added in front of some stacked characters to make
it balanced.
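
As an illustration of argument balancing, the following Python sketch
splits the text that follows an argument start sequence into arguments.
It assumes single-character delimiters and uses the default-mode values
as defaults; split_arguments is a hypothetical helper, not part of GPP.

```python
def split_arguments(text, sep=",", end=")", stack="(", unstack=")", quote="\\"):
    """Split the text following an argument start sequence into arguments.

    Returns (arguments, index just past the macro end sequence).
    Hypothetical model: single-character delimiters only.
    """
    args, current, depth, i = [], [], 0, 0
    while i < len(text):
        ch = text[i]
        if quote and ch == quote and i + 1 < len(text):
            current.append(text[i:i + 2])   # a quoted character never counts
            i += 2
            continue
        if depth == 0 and ch == end:
            args.append("".join(current))   # balanced: the call ends here
            return args, i + 1
        if depth == 0 and ch == sep:
            args.append("".join(current))   # balanced: next argument starts
            current = []
        else:
            if ch in stack:
                depth += 1                  # e.g. "(" opens a nested level
            elif ch in unstack and depth > 0:
                depth -= 1                  # e.g. ")" closes it
            current.append(ch)
        i += 1
    raise ValueError("unterminated macro call")
```

With these defaults, the text "f(a,b),c)" is split into the two
arguments "f(a,b)" and "c": the inner comma and parenthesis are skipped
because the nesting depth is not zero when they are read.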
324
325
326 The macro construction sequences described above can be different for
327 meta-macros and for user macros: this is the case in cpp mode, for ex‐
328 ample. Note that, since meta-macros can only have up to two arguments,
329 the delimitation rules for the second argument are somewhat sloppier,
330 and unquoted argument separator sequences are allowed in the second ar‐
331 gument of a meta-macro.
332
333
334 Unless one of the standard operating modes is selected, the above syn‐
335 tax sequences can be specified either on the command-line, using the -M
336 and -U options respectively for meta-macros and user macros, or inside
337 an input file via the #mode meta and #mode user meta-macro calls. In
338 both cases the mode description consists of nine parameters for user
339 macro specifications, namely the macro start sequence, the short macro
340 end sequence, the argument start sequence, the argument separator, the
341 long macro end sequence, the string listing characters to stack, the
342 string listing characters to unstack, the argument reference sequence,
343 and finally the quote character. As explained below, these sequences
344 should be supplied using the syntax of C strings; they must start with
345 a non-alphanumeric character, and in the first five strings special
346 matching sequences can be used (see below). If the argument correspond‐
347 ing to the quote character is the empty string, that argument's func‐
348 tionality is disabled. For meta-macro specifications there are only
349 seven parameters, as the argument reference sequence and quote charac‐
350 ter are shared with the user macro syntax.
351
352
353 The structure of a comment/string is as follows: it must start with a
354 sequence of characters matching the given comment/string start se‐
355 quence, and always ends at the first occurrence of the comment/string
356 end sequence, unless it is preceded by an odd number of occurrences of
357 the string-quote character (if such a character has been specified). In
358 certain cases comment/strings can be specified to enable macro evalua‐
359 tion inside the comment/string; in that case, if a quote character has
360 been defined for macros it can be used as well to prevent the com‐
361 ment/string from ending, with the difference that the macro quote char‐
362 acter is always removed from output whereas the string-quote character
363 is always output. Also note that under certain circumstances a com‐
364 ment/string specification can be disabled, in which case the com‐
365 ment/string start sequence is simply ignored. Finally, it is possible
366 to specify a string warning character whose presence inside a com‐
367 ment/string will cause GPP to output a warning (this is useful to lo‐
368 cate unterminated strings in cpp mode). Note that input files are not
369 allowed to contain unterminated comments/strings.
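
The rule for locating the end of a comment/string can be sketched in
Python as follows. The helper find_end is hypothetical and only models
the string-quote rule stated above (an end sequence preceded by an odd
number of string-quote characters does not terminate); it ignores
macro evaluation and warning characters.

```python
def find_end(text, start, end_seq, string_quote=""):
    """Return the index of the end sequence that closes a comment/string.

    `start` is the index just past the start sequence. Hypothetical helper.
    """
    pos = text.find(end_seq, start)
    while pos != -1:
        # Count the string-quote characters immediately before the candidate.
        n = 0
        while (string_quote and pos - n - 1 >= start
               and text[pos - n - 1] == string_quote):
            n += 1
        if n % 2 == 0:
            return pos                      # not protected: this is the end
        pos = text.find(end_seq, pos + 1)   # protected: keep looking
    raise ValueError("unterminated comment/string")
```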
370
371
372 A comment/string specification can be declared from within the input
373 file using the #mode comment meta-macro call (or equivalently #mode
374 string), in which case the number of C strings to be given as arguments
375 to describe the comment/string can be anywhere between two and four:
376 the first two arguments (mandatory) are the start sequence and the end
377 sequence, and can make use of the special matching sequences (see be‐
378 low). They may not start with alphanumeric characters. The first char‐
379 acter of the third argument, if there is one, is used as the string-
380 quote character (use an empty string to disable the functionality), and
381 the first character of the fourth argument, if there is one, is used as
382 the string-warning character. A specification may also be given from
383 the command-line, in which case there must be two arguments if using
384 the +c option and three if using the +s option.
385
386
387 The behavior of a comment/string is specified by a three-character mod‐
388 ifier string, which may be passed as an optional argument either to the
389 +c/+s command-line options or to the #mode comment/#mode string meta-
390 macros. If no modifier string is specified, the default value is "ccc"
391 for comments and "sss" for strings. The first character corresponds to
392 the behavior inside meta-macro calls (including user-macro definitions
393 since these come inside a #define meta-macro call), the second charac‐
394 ter corresponds to the behavior inside user-macro parameters, and the
395 third character corresponds to the behavior outside of any macro call.
396 Each of these characters can take the following values:
397
398
399 i disable the comment/string specification.
400
401 c comment (neither evaluated nor output).
402
403 s string (the string and its delimiter sequences are output as-
404 is).
405
406 q quoted string (the string is output as-is, without the delimiter
407 sequences).
408
409 C evaluated comment (macros are evaluated, but output is dis‐
410 carded).
411
412 S evaluated string (macros are evaluated, delimiters are output).
413
414 Q evaluated quoted string (macros are evaluated, delimiters are
415 not output).
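
The modifier values listed above can be restated as a small lookup
table. The Python sketch below is a reading aid rather than GPP code;
the names MODIFIERS and describe_modifier, and the tuple layout, are
assumptions made for illustration.

```python
# Meaning of each modifier character:
# (macros evaluated, body output, delimiters output)
MODIFIERS = {
    "c": (False, False, False),  # comment
    "s": (False, True, True),    # string
    "q": (False, True, False),   # quoted string
    "C": (True, False, False),   # evaluated comment
    "S": (True, True, True),     # evaluated string
    "Q": (True, True, False),    # evaluated quoted string
}

def describe_modifier(modifier):
    """Map a three-character modifier string to per-context behavior."""
    if len(modifier) != 3:
        raise ValueError("modifier must have exactly three characters")
    contexts = ("meta-macro call", "user-macro parameters",
                "outside macro calls")
    result = {}
    for context, ch in zip(contexts, modifier):
        if ch == "i":
            result[context] = None            # specification disabled here
        else:
            result[context] = MODIFIERS[ch]   # KeyError for unknown characters
    return result
```

For example, describe_modifier("css") describes a specification that
acts as a comment inside meta-macro calls and as a string elsewhere.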
416
417
418 Important note: any occurrence of a comment/string start sequence in‐
419 side another comment/string is always ignored, even if macro evaluation
420 is enabled. In other words, comments/strings cannot be nested. In par‐
421 ticular, the `Q' modifier can be a convenient way of defining a syntax
422 for temporarily disabling all comment and string specifications.
423
424
Syntax specification strings should always be provided as C strings,
whether they are given as arguments to a #mode meta-macro call or on
the command-line of a Unix shell. If command-line arguments are
supplied by some means other than a standard Unix shell, the shell's
behavior must be emulated -- i.e., the surrounding "" quotes should be
removed, all occurrences of `\\' should be replaced by a single
backslash, and similarly `\"' should be replaced by `"'. Sequences like
`\n' are recognized by GPP and should be left as is.
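
When GPP is launched from a program rather than from a shell, one way
to emulate the shell behavior described above is Python's standard
shlex module, which follows POSIX quoting rules. The sketch below is an
illustration only; shell_unquote is a hypothetical wrapper and the
sample arguments are arbitrary.

```python
import shlex

def shell_unquote(arguments):
    """Split a string of quoted arguments the way a POSIX shell would."""
    # Inside "...", shlex treats a backslash as special only before
    # another backslash or a double quote, so \n and \w pass through.
    return shlex.split(arguments)

# Three arguments as they would be typed in a shell.
args = shell_unquote(r'"\\" "\n#\w" "\""')
```

Here args holds a single backslash, the four characters \n#\w left
untouched for GPP to interpret, and a lone double quote.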
433
434
435 Special sequences matching certain subsets of the character set can be
436 used. They are of the form `\x', where x is one of:
437
438
439 b matches any sequence of one or more spaces or tab characters
440 (`\b' is identical to ` ').
441
442 w matches any sequence of zero or more spaces or tab characters.
443
444 B matches any sequence of one or more spaces, tabs or newline
445 characters.
446
447 W matches any sequence of zero or more spaces, tabs or newline
448 characters.
449
450 a an alphabetic character (`a' to `z' and `A' to `Z').
451
452 A an alphabetic character, or a space, tab or newline.
453
454 # a digit (`0' to `9').
455
456 i an identifier character. The set of matched characters is cus‐
457 tomizable using the #mode charset id command. The default set‐
458 ting matches alphanumeric characters and underscores (`a' to
459 `z', `A' to `Z', `0' to `9' and `_').
460
461 t a tab character.
462
463 n a newline character.
464
465 o an operator character. The set of matched characters is custom‐
466 izable using the #mode charset op command. The default setting
467 matches all characters in "+-*/\^<>=`~:.?@#&!%|", except in Pro‐
468 log mode where `!', `%' and `|' are not matched.
469
470 O an operator character or a parenthesis character. The set of ad‐
471 ditional matched characters in comparison with `\o' is customiz‐
472 able using the #mode charset par command. The default setting is
473 to have the characters in "()[]{}" as parentheses.
474
475
476 Moreover, all of these matching subsets except `\w' and `\W' can be
477 negated by inserting a `!' -- i.e., by writing `\!x' instead of `\x'.
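
To make the special matching sequences more concrete, the following
Python sketch translates a sequence into an ordinary regular
expression. It is an approximation under stated assumptions, not GPP's
matcher: it uses the default character sets listed above, ignores #mode
charset changes and the Prolog-mode exception, and does not model the
negated forms of the whitespace-run sequences. The name to_regex is
hypothetical.

```python
import re

OPERATORS = "+-*/\\^<>=`~:.?@#&!%|"   # default set for the o sequence
PARENS = "()[]{}"                      # default extra set for the O sequence

# Single-character classes, written as regex character-class bodies.
SINGLE = {
    "a": "a-zA-Z",
    "A": "a-zA-Z \\t\\n",
    "#": "0-9",
    "i": "a-zA-Z0-9_",
    "t": "\\t",
    "n": "\\n",
    "o": "".join(re.escape(c) for c in OPERATORS),
    "O": "".join(re.escape(c) for c in OPERATORS + PARENS),
}
# Whitespace runs (one-or-more and zero-or-more).
RUNS = {"b": "[ \\t]+", "w": "[ \\t]*", "B": "[ \\t\\n]+", "W": "[ \\t\\n]*"}

def to_regex(seq):
    """Translate a GPP matching sequence into a regular expression."""
    out, i = [], 0
    while i < len(seq):
        ch = seq[i]
        if ch == " ":
            out.append(RUNS["b"])            # a literal space behaves like b
        elif ch == "\\" and i + 1 < len(seq):
            i += 1
            negate = seq[i] == "!" and i + 1 < len(seq)
            if negate:
                i += 1
            code = seq[i]
            if code in SINGLE:
                out.append("[" + ("^" if negate else "") + SINGLE[code] + "]")
            elif code in RUNS and not negate:
                out.append(RUNS[code])
            elif negate:
                raise ValueError("negated form not modeled: " + code)
            else:
                out.append(re.escape(code))  # any other escaped character
        else:
            out.append(re.escape(ch))        # ordinary literal character
        i += 1
    return "".join(out)
```

For example, the meta-macro start sequence of cpp mode, written
"\n#\w", becomes a pattern that matches a newline, a `#', and any run
of spaces or tabs.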
478
479
480 Note an important distinctive feature of start sequences: when the
481 first character of a macro or comment/string start sequence is ` ' or
482 one of the above special sequences, it is not taken to be part of the
483 sequence itself but is used instead as a context check: for example a
484 start sequence beginning with `\n' matches only at the beginning of a
485 line, but the matching newline character is not taken to be part of the
486 sequence. Similarly a start sequence beginning with ` ' matches only
487 if some whitespace is present, but the matching whitespace is not con‐
488 sidered to be part of the start sequence and is therefore sent to out‐
489 put. If a context check is performed at the very beginning of a file
490 (or more generally of any body to be evaluated), the result is the same
491 as matching with a newline character (this makes it possible for a cpp-
492 mode file to start with a meta-macro call).
493
494
495 Two special syntax rules were added in version 2.1. First, argument
496 references (#n) are no longer evaluated when they are outside of macro
497 calls and definitions. However, they are no longer allowed to appear
498 (unless protected by quote characters) inside a call to a defined user
499 macro; the current behavior (backwards compatible) is to remove them
500 silently from the input if that happens.
501
502
503 Second, if the end sequence (either for macros or comments) consists of
504 a single newline character, and if delimitation rules lead to evalua‐
505 tion in a context where the final newline character is absent, GPP
506 silently ignores the missing newline instead of producing an error. The
507 main consequence is that meta-macro calls can now be nested in a simple
508 way in standard, cpp and Prolog modes.
509
EVALUATION RULES
513 Input is read sequentially and interpreted according to the rules of
514 the current mode. All input text is first matched against the specified
515 comment/string start sequences of the current mode (except those which
516 are disabled by the `i' modifier), unless the body being evaluated is
517 the contents of a comment/string whose modifier enables macro evalua‐
518 tion. The most recently defined comment/string specifications are
519 checked for first. Important note: comments may not appear between the
520 name of a macro and its arguments (doing so results in undefined behav‐
521 ior).
522
523
524 Anything that is not a comment/string is then matched against a possi‐
525 ble meta-macro call, and if that fails too, against a possible user-
526 macro call. All remaining text undergoes substitution of argument ref‐
527 erence sequences by the relevant argument text (empty unless the body
528 being evaluated is the definition of a user macro) and removal of the
529 quote character if there is one.
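
The substitution of argument references can be pictured with the
following Python sketch. It assumes the default reference sequence `#',
treats references to missing arguments as empty, and ignores the quote
character; substitute_arguments is a hypothetical helper, not GPP's
evaluator.

```python
import re

def substitute_arguments(body, args, ref="#"):
    """Replace each argument reference (ref + digit 1-9) with its text."""
    def replace(match):
        index = int(match.group(1)) - 1
        # Assumption: a reference beyond the supplied arguments is empty.
        return args[index] if index < len(args) else ""
    return re.sub(re.escape(ref) + r"([1-9])", replace, body)
```

Outside of any user macro definition the argument list is empty, so
every reference is replaced by nothing in this model.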
530
531
Note that meta-macro arguments are passed to the meta-macro prior to
any evaluation (although the meta-macro may choose to evaluate them;
see the meta-macro descriptions below). In the case of the #mode
meta-macro, GPP temporarily adds a comment/string specification to
enable recognition of C strings ("...") and prevent any evaluation
inside them, so the characters in the C string arguments to #mode
cannot interfere with the current syntax.
539
540
On the other hand, the arguments to a user macro are systematically
evaluated, and then passed as context parameters to the macro
definition body, which gets evaluated with that environment. The only
exception is when the macro definition is empty, in which case its
arguments are not evaluated. Note that GPP temporarily switches back to
the mode in which the macro was defined in order to evaluate it, so it
is perfectly safe to change the operating mode between the time a macro
is defined and the time when it is called. Conversely, if a user macro
wishes to work with the current mode instead of the one that was used
to define it, it needs to start with a #mode restore call and end with
a #mode save call.
552
553
554 A user macro may be defined with named arguments (see #define descrip‐
555 tion below). In that case, when the macro definition is being evalu‐
556 ated, each named parameter causes a temporary virtual user-macro defi‐
557 nition to be created; such a macro may be called only without arguments
558 and simply returns the text of the corresponding argument.
559
560
561 Note that, since macros are evaluated when they are called rather than
562 when they are defined, any attempt to call a recursive macro causes un‐
563 defined behavior except in the very specific case when the macro uses
564 #undef to erase itself after finitely many loop iterations.
565
566
567 Finally, a special case occurs when a user macro whose definition does
568 not involve any arguments (neither named arguments nor the argument
569 reference sequence) is called in a mode where the short user-macro end
570 sequence is empty (e.g., cpp or TeX mode). In that case it is assumed
571 to be an alias macro: its arguments are first evaluated in the current
572 mode as usual, but instead of being passed to the macro definition as
573 parameters (which would cause them to be discarded) they are actually
574 appended to the macro definition, using the syntax rules of the mode in
575 which the macro was defined, and the resulting text is evaluated again.
576 It is therefore important to note that, in the case of a macro alias,
577 the arguments actually get evaluated twice in two potentially different
578 modes.
579
META-MACROS
582 These macros are always predefined. Their actual calling sequence de‐
583 pends on the current mode; here we use cpp-like notation.
584
585
586 #define x y
This defines the user macro x as y. y can be any valid GPP input, and
may for example refer to other macros. x must be an identifier (i.e., a
sequence of alphanumeric characters and `_'), unless named arguments
are specified. If x is already defined, the previous definition is
overwritten. If no second argument is given, x will be defined as a
macro that outputs nothing. Neither x nor y is evaluated; the macro
definition is only evaluated when it is called, not when it is
declared.
595
596 It is also possible to name the arguments in a macro definition:
597 in that case, the argument x should be a user-macro call whose
598 arguments are all identifiers. These identifiers become avail‐
599 able as user-macros inside the macro definition; these virtual
600 macros must be called without arguments, and evaluate to the
601 corresponding macro parameter.
602
603 #defeval x y
This acts in a similar way to #define, but the second argument y is
evaluated immediately. Since user macro definitions are also evaluated
each time they are called, this means that the macro y will undergo two
successive evaluations. #defeval is very useful because it is the only
way to evaluate something more than once, which may be needed to force
evaluation of the arguments of a meta-macro that normally doesn't
perform any evaluation. However, since all argument references
evaluated at define-time are understood as the arguments of the body in
which the macro is being defined and not as the arguments of the macro
itself, one usually has to use the quote character to prevent immediate
evaluation of argument references.
616
617 #undef x
618 This removes any existing definition of the user macro x.
619
620 #ifdef x
621 This begins a conditional block. Everything that follows is
622 evaluated only if the identifier x is defined, and until either
623 a #else or a #endif statement is reached. Note, however, that
624 the commented text is still scanned thoroughly, so its syntax
625 must be valid. It is in particular legal to have the #else or
626 #endif statement ending the conditional block appear only as the
627 result of a user-macro expansion and not explicitly in the in‐
628 put.
629
630 #ifndef x
631 This begins a conditional block. Everything that follows is
632 evaluated only if the identifier x is not defined.
633
634 #ifeq x y
635 This begins a conditional block. Everything that follows is
636 evaluated only if the results of the evaluations of x and y are
637 identical as character strings. Any leading or trailing white‐
638 space is ignored for the comparison. Note that in cpp-mode any
639 unquoted whitespace character is understood as the end of the
640 first argument, so it is necessary to be careful.
641
642 #ifneq x y
643 This begins a conditional block. Everything that follows is
644 evaluated only if the results of the evaluations of x and y are
645 not identical (even up to leading or trailing whitespace).
646
647 #else This toggles the logical value of the current conditional block.
648 What follows is evaluated if and only if the preceding input was
649 commented out.
650
651 #endif This ends a conditional block started by a #if... meta-macro.
652
653 #include file
654 This causes GPP to open the specified file and evaluate its con‐
655 tents, inserting the resulting text in the current output. All
656 defined user macros are still available in the included file,
657 and reciprocally all macros defined in the included file will be
658 available in everything that follows. The include file is looked
659 for first in the current directory, and then, if not found, in
660 one of the directories specified by the -I command-line option
661 (or /usr/include if no directory was specified). Note that, for
662 compatibility reasons, it is possible to put the file name be‐
663 tween "" or <>.
664
The order in which the various directories are searched for include
files is affected by the --nostdinc, --nocurinc and --curdirinclast
command-line options.
668
669 Upon including a file, GPP immediately saves a copy of the cur‐
670 rent operating mode onto the mode stack, and restores the oper‐
671 ating mode at the end of the included file. The included file
672 may override this behavior by starting with a #mode restore call
673 and ending with a #mode push call. Additionally, when the -m
674 command line option is specified, GPP will automatically switch
675 to the cpp compatibility mode upon including a file whose name
676 ends with either `.c' or `.h'.
677
678 #sinclude file
679 This is a "silent" version of the #include meta-macro that does
680 not emit an error in the event that the specified file does not
681 exist or cannot be opened.
682
683 #exec command
684 This causes GPP to execute the specified command line and in‐
685 clude its standard output in the current output. Note that, for
686 security reasons, this meta-macro is disabled unless the -x com‐
687 mand line flag was specified. If use of #exec is not allowed, a
688 warning message is printed and the output is left blank. Note
689 that the specified command line is evaluated before being exe‐
690 cuted, thus allowing the use of macros in the command-line. How‐
691 ever, the output of the command is included verbatim and not
692 evaluated. If you need the output to be evaluated, you must use
693 #defeval (see above) to cause a double evaluation.
694
695 #eval expr
696 The #eval meta-macro attempts to evaluate expr first by expand‐
697 ing macros (normal GPP evaluation) and then by performing arith‐
698 metic evaluation and/or wildcard matching. The syntax and oper‐
699 ator precedence for arithmetic expressions are the same as in C;
700 the only missing operators are <<, >>, ?:, and the assignment
701 operators.
702
703 POSIX-style wildcard matching (`globbing') is available only on
704 POSIX implementations and can be invoked with the =~ operator.
705 In brief, a `?' matches any single character, a `*' matches any
706 string (including the empty string), and `[...]' matches any one
707 of the characters enclosed in brackets. A `[...]' class is com‐
708 plemented when the first character in the brackets is `!'. The
709 characters in a `[...]' class can also be specified as a range
710 using the `-' character -- e.g., `[F-N]' is equivalent to
711 `[FGHIJKLMN]'.
712
713 If unable to assign a numerical value to the result, the re‐
714 turned text is simply the result of macro expansion without any
715 arithmetic evaluation. The only exceptions to this rule are the
716 comparison operators ==, !=, <, >, <=, and >= which, if one of
717 the sides does not evaluate to a number, perform string compari‐
718 son instead (ignoring trailing and leading spaces). Addition‐
719 ally, the length(...) arithmetic operator returns the length in
720 characters of its evaluated argument.
721
722 Inside arithmetic expressions, the defined(...) special user
723 macro is also available: it takes only one argument, which is
724 not evaluated, and returns 1 if it is the name of a user macro
725 and 0 otherwise.
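
The wildcard rules given above for the =~ operator can be tried out with Python's fnmatch module, which follows the same basic conventions for `?', `*' and `[...]'. This is an illustration only: it does not call GPP, and GPP's matcher may differ in details not described here.

```python
from fnmatch import fnmatchcase

# (text, pattern, expected) triples mirroring the rules described above.
cases = [
    ("gpp.c", "*.c", True),          # `*' matches any string
    (".c", "*.c", True),             # ... including the empty string
    ("a1", "a?", True),              # `?' matches exactly one character
    ("a", "a?", False),
    ("Hello", "[F-N]ello", True),    # same as [FGHIJKLMN]ello
    ("Hello", "[!F-N]ello", False),  # `!' complements the class
    ("Yello", "[!F-N]ello", True),
]
for text, pattern, expected in cases:
    assert fnmatchcase(text, pattern) == expected
    print(f"{text!r} =~ {pattern!r}: {expected}")
```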
726
727 #if expr
728 This meta-macro invokes the arithmetic/globbing evaluator in the
729 same manner as #eval and compares the result of evaluation with
730        the string "0" in order to begin a conditional block. In partic‐
731        ular, note that the logical value of expr is always true when it
732        cannot be evaluated to a number.
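
The rule above amounts to a single string comparison, as the following Python sketch shows. The function name if_is_true is hypothetical, and comparing the text exactly (without trimming whitespace) is an assumption of the sketch.

```python
def if_is_true(result: str) -> bool:
    """Logical value of `#if expr', given the evaluator's output text."""
    # The block is taken unless the evaluated text is the string "0"
    # (exact comparison is an assumption of this sketch).
    return result != "0"

for text in ("0", "1", "-3", "foo"):
    print(repr(text), "->", if_is_true(text))
```

Only "0" yields a false condition; text such as foo, which cannot be evaluated to a number, selects the #if branch.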
733
734 #elif expr
735 This meta-macro can be used to avoid nested #if conditions. #if
736 ... #elif ... #endif is equivalent to #if ... #else #if ...
737 #endif #endif.
738
739 #mode keyword ...
740 This meta-macro controls GPP's operating mode. See below for a
741 list of #mode commands.
742
743 #line This meta-macro evaluates to the line number of the current in‐
744 put file.
745
746 #file This meta-macro evaluates to the filename of the current input
747 file as it appears on the command line or in the argument to
748 #include. If GPP is reading its input from stdin, then #file
749 evaluates to `stdin'.
750
751 #date fmt
752 This meta-macro evaluates to the current date and time as for‐
753 matted by the specified format string fmt. See the section DATE
754 AND TIME CONVERSION SPECIFIERS below.
755
756 #error msg
757 This meta-macro causes an error message with the current file‐
758 name and line number, and with the text msg, to be printed to
759 the standard error device. Subsequent processing is then
760 aborted.
761
762 #warning msg
763 This meta-macro causes a warning message with the current file‐
764 name and line number, and with the text msg, to be printed to
765 the standard error device. Subsequent processing is then re‐
766 sumed.
767
768
769
770 The key to GPP's flexibility is the #mode meta-macro. Its first argu‐
771 ment is always one of a list of available keywords (see below); its
772 second argument is always a sequence of words separated by whitespace.
773 Apart from possibly the first of them, each of these words is always a
774 delimiter or syntax specifier, and should be provided as a C string de‐
775 limited by double quotes (" "). The various special matching sequences
776 listed in the section on syntax specification are available. Any #mode
777 command is parsed in a mode where "..." is understood to be a C-style
778 string, so it is safe to put any character inside these strings. Also
779 note that the first argument of #mode (the keyword) is never evaluated,
780 while the second argument is evaluated (except of course for the con‐
781 tents of C strings), so that the syntax specification may be obtained
782 as the result of a macro evaluation.
783
784
785 The available #mode commands are:
786
787
788 #mode save / #mode push
789 Push the current mode specification onto the mode stack.
790
791 #mode restore / #mode pop
792 Pop mode specification from the mode stack.
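
The mode stack behind push and pop can be pictured as an ordinary last-in, first-out stack. The Python sketch below is a model only: the class name ModeStack is hypothetical, a mode specification is reduced to an opaque string, and popping an empty stack simply raises an exception here, which says nothing about what GPP does in that case.

```python
class ModeStack:
    """Toy model of GPP's current mode plus its stack of saved modes."""

    def __init__(self, mode):
        self.current = mode
        self._saved = []

    def push(self):
        # #mode save / #mode push: remember the current mode.
        self._saved.append(self.current)

    def pop(self):
        # #mode restore / #mode pop: return to the most recently saved mode.
        self.current = self._saved.pop()


modes = ModeStack("default")
modes.push()
modes.current = "tex"        # e.g. after #mode standard tex
modes.pop()
print(modes.current)         # prints: default
```

Including a file behaves like a push before the file is read and a pop at its end, which is why a mode change made inside an included file does not normally leak into the including file.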
793
794 #mode standard name
795 Select one of the standard modes. The only argument must be one
796 of: default (default mode); cpp, C (cpp mode); tex, TeX
797 (TeX mode); html, HTML (html mode); xhtml, XHTML (xhtml mode);
798 prolog, Prolog (prolog mode). The mode name must be given di‐
799 rectly, not as a C string.
800
801 #mode user "s1" ... "s9"
802 Specify user macro syntax. The 9 arguments, all of them C
803 strings, are the mode specification for user macros (see the -U
804 command-line option and the section on syntax specification).
805 The meta-macro specification is not affected.
806
807 #mode meta {user | "s1" ... "s7"}
808 Specify meta-macro syntax. Either the only argument is user
809 (not as a string), and the user-macro mode specifications are
810 copied into the meta-macro mode specifications, or there must be
811 seven string arguments, whose significance is the same as for
812 the -M command-line option (see section on syntax specifica‐
813 tion).
814
815 #mode quote ["c"]
816 With no argument or "" as argument, removes the quote character
817 specification and disables the quoting functionality. With one
818 string argument, the first character of the string is taken to
819 be the new quote character. The quote character can be neither
820 alphanumeric nor `_', nor can it be one of the special matching
821 sequences.
822
823 #mode comment [xxx] "start" "end" ["c" ["c"]]
824 Add a comment specification. Optionally a first argument con‐
825 sisting of three characters not enclosed in " " can be used to
826 specify a comment/string modifier (see the section on syntax
827 specification). The default modifier is ccc. The first two
828 string arguments are used as comment start and end sequences re‐
829 spectively. The third string argument is optional and can be
830 used to specify a string-quote character. (If it is "", the
831 functionality is disabled.) The fourth string argument is op‐
832 tional and can be used to specify a string delimitation warning
833 character. (If it is "", the functionality is disabled.)
834
835 #mode string [xxx] "start" "end" ["c" ["c"]]
836 Add a string specification. Identical to #mode comment except
837 that the default modifier is sss.
838
839 #mode nocomment / #mode nostring ["start"]
840 With no argument, remove all comment/string specifications. With
841 one string argument, delete the comment/string specification
842 whose start sequence is the argument.
843
844 #mode preservelf { on | off | 1 | 0 }
845 Equivalent to the -n command-line switch. If the argument is on
846 or 1, any newline or whitespace character terminating a macro
847 call or a comment/string is left in the input stream for further
848 processing. If the argument is off or 0 this feature is dis‐
849 abled.
850
851 #mode charset { id | op | par } "string"
852 Specify the character sets to be used for matching the \o, \O
853 and \i special sequences. The first argument must be one of id
854 (the set matched by \i), op (the set matched by \o) or par (the
855 set matched by \O in addition to the one matched by \o).
856 "string" is a C string which lists all characters to put in the
857 set. It may contain only the special matching sequences \a, \A,
858 \b, \B, and \# (the other sequences and the negated sequences
859        are not allowed). When a `-' is found between two non-special
860        characters, this adds all characters in between (e.g. "A-Z" corre‐
861        sponds to all uppercase characters). To have `-' in the matched
862 set, either put it in first or last position or place it next to
863 a \x sequence.
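
The range rule for #mode charset can be sketched in Python as follows. This is a simplified model, not GPP's parser: the function name expand_charset is hypothetical, ranges are assumed to include both end points, and the special sequences (\a, \A, \b, \B, \#) are not handled.

```python
def expand_charset(spec):
    """Expand a plain charset string such as "A-Z_" into a set of characters."""
    chars = set()
    i = 0
    while i < len(spec):
        # "x-y" with a character on each side of `-' denotes a range.
        if i + 2 < len(spec) and spec[i + 1] == "-":
            lo, hi = ord(spec[i]), ord(spec[i + 2])
            chars.update(chr(c) for c in range(lo, hi + 1))
            i += 3
        else:
            # A `-' in first or last position falls through to here
            # and is added literally.
            chars.add(spec[i])
            i += 1
    return chars
```

For instance, expand_charset("A-Z") yields the 26 uppercase letters, while expand_charset("az-") yields just `a', `z' and `-'.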
864
865
866
867 DATE AND TIME CONVERSION SPECIFIERS
868 Ordinary characters placed in the format string are copied without con‐
869 version. Conversion specifiers are introduced by a `%' character, and
870 are replaced as follows:
871
872
873 %a The abbreviated weekday name according to the current locale.
874
875 %A The full weekday name according to the current locale.
876
877 %b The abbreviated month name according to the current locale.
878
879 %B The full month name according to the current locale.
880
881 %c The preferred date and time representation for the current lo‐
882 cale.
883
884 %d The day of the month as a decimal number (range 01 to 31).
885
886 %F Equivalent to %Y-%m-%d (the ISO 8601 date format).
887
888 %H The hour as a decimal number using a 24-hour clock (range 00 to
889 23).
890
891 %I The hour as a decimal number using a 12-hour clock (range 01 to
892 12).
893
894 %j The day of the year as a decimal number (range 001 to 366).
895
896 %m The month as a decimal number (range 01 to 12).
897
898 %M The minute as a decimal number (range 00 to 59).
899
900 %p Either `AM' or `PM' according to the given time value, or
901 the corresponding strings for the current locale. Noon is
902 treated as `PM' and midnight as `AM'.
903
904 %R The time in 24-hour notation (%H:%M).
905
906 %S The second as a decimal number (range 00 to 61).
907
908 %U The week number of the current year as a decimal number,
909 range 00 to 53, starting with the first Sunday as the first
910 day of week 01.
911
912 %w The day of the week as a decimal, range 0 to 6, Sunday being
913 0.
914
915 %W The week number of the current year as a decimal number,
916 range 00 to 53, starting with the first Monday as the first
917 day of week 01.
918
919 %x The preferred date representation for the current locale with‐
920 out the time.
921
922 %X The preferred time representation for the current locale with‐
923 out the date.
924
925 %y The year as a decimal number without a century (range 00 to
926 99).
927
928 %Y The year as a decimal number including the century.
929
930  %Z     The time zone name or abbreviation.
931
932 %% A literal `%' character.
933
934
935
936 Depending on the C compiler and library used to compile GPP, there may
937 be more conversion specifiers available. Consult your compiler's docu‐
938 mentation for the strftime() function. Note, however, that any conver‐
939 sion specifiers not listed above may not be portable across installa‐
940 tions of GPP.
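
The specifiers follow the conventions of the C strftime() function, so their effect can be previewed outside GPP. The Python sketch below formats a fixed, arbitrarily chosen moment; it assumes a Python build whose strftime relies on the platform's C library, and it sticks to numeric specifiers so that the result does not depend on the locale.

```python
from datetime import datetime

moment = datetime(2023, 3, 5, 14, 7, 9)   # a Sunday afternoon
for fmt in ("%Y-%m-%d", "%H:%M", "%I", "%j", "%w", "%U", "%W", "%y"):
    print(fmt, "->", moment.strftime(fmt))
```

For this date the loop prints 2023-03-05, 14:07, 02, 064, 0, 10, 09 and 23 respectively. The difference between %U (10) and %W (09) arises because 1 January 2023 was a Sunday, so week 01 begins one day earlier when weeks are counted from Sunday.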
941
942
943 EXAMPLES
944 Here is a basic self-explanatory example in standard or cpp mode:
945
946
947
948 #define FOO This is
949 #define BAR a message.
950 #define concat #1 #2
951 concat(FOO,BAR)
952 #ifeq (concat(foo,bar)) (foo bar)
953 This is output.
954 #else
955 This is not output.
956 #endif
957
958 Using argument naming, the concat macro could alternatively be defined
959 as
960
961
962
963 #define concat(x,y) x y
964
965 In TeX mode and using argument naming, the same example becomes:
966
967
968
969 \define{FOO}{This is}
970 \define{BAR}{a message.}
971 \define{\concat{x}{y}}{\x \y}
972 \concat{\FOO}{\BAR}
973 \ifeq{\concat{foo}{bar}}{foo bar}
974 This is output.
975 \else
976 This is not output.
977 \endif
978
979 In HTML mode and without argument naming, one gets similarly:
980
981
982
983 <#define FOO|This is>
984 <#define BAR|a message.>
985 <#define concat|#1 #2>
986 <#concat <#FOO>|<#BAR>>
987 <#ifeq <#concat foo|bar>|foo bar>
988 This is output.
989 <#else>
990 This is not output.
991 <#endif>
992
993 The following example (in standard mode) illustrates the use of the
994 quote character:
995
996
997
998 #define FOO This is \
999 a multiline definition.
1000 #define BLAH(x) My argument is x
1001 BLAH(urf)
1002 \BLAH(urf)
1003
1004  Note that the multiline definition is also valid in cpp and Prolog
1005  modes despite the absence of a quote character, because `\' followed by
1006  a newline is then interpreted as a comment and discarded.
1007
1008
1009 In cpp mode, C strings and comments are understood as such, as illus‐
1010 trated by the following example:
1011
1012
1013
1014 #define BLAH foo
1015 BLAH "BLAH" /* BLAH */
1016    'It\'s a /*string*/ !'
1017
1018 The main difference between Prolog mode and cpp mode is the handling of
1019  strings and comments: in Prolog, a '...' string may not begin immedi‐
1020  ately after a digit, and a /*...*/ comment may not begin immediately
1021 after an operator character. Furthermore, comments are not removed from
1022 the output unless they occur in a #command.
1023
1024
1025 The differences between cpp mode and default mode are deeper: in de‐
1026 fault mode #commands may start anywhere, while in cpp mode they must be
1027 at the beginning of a line; the default mode has no knowledge of com‐
1028 ments and strings, but has a quote character (`\'), while cpp mode has
1029 extensive comment/string specifications but no quote character. More‐
1030 over, the arguments to meta-macros need to be correctly parenthesized
1031 in default mode, while no such checking is performed in cpp mode.
1032
1033
1034 This makes it easier to nest meta-macro calls in default mode than in
1035 cpp mode. For example, consider the following HTML mode input, which
1036 tests for the availability of the #exec command:
1037
1038
1039
1040 <#ifeq <#exec echo blah>|blah
1041 > #exec allowed <#else> #exec not allowed <#endif>
1042
1043 There is no cpp mode equivalent, while in default mode it can be easily
1044 translated as
1045
1046
1047
1048 #ifeq (#exec echo blah
1049 ) (blah
1050 )
1051 \#exec allowed
1052 #else
1053 \#exec not allowed
1054 #endif
1055
1056 In order to nest meta-macro calls in cpp mode it is necessary to modify
1057 the mode description, either by changing the meta-macro call syntax, or
1058 more elegantly by defining a silent string and using the fact that the
1059 context at the beginning of an evaluated string is a newline character:
1060
1061
1062
1063 #mode string QQQ "$" "$"
1064 #ifeq $#exec echo blah
1065 $ $blah
1066 $
1067 \#exec allowed
1068 #else
1069 \#exec not allowed
1070 #endif
1071
1072 Note, however, that comments/strings cannot be nested ("..." inside
1073 $...$ would go undetected), so one needs to be careful about what to
1074 include inside such a silent evaluated string. In this example, the
1075 loose meta-macro nesting introduced in version 2.1 makes it possible to
1076 use the following simpler version:
1077
1078
1079
1080 #ifeq blah #exec echo -n blah
1081 \#exec allowed
1082 #else
1083 \#exec not allowed
1084 #endif
1085
1086 Remember that macros without arguments are actually understood to be
1087 aliases when they are called with arguments, as illustrated by the fol‐
1088 lowing example (default or cpp mode):
1089
1090
1091
1092 #define DUP(x) x x
1093 #define FOO and I said: DUP
1094 FOO(blah)
1095
1096 The usefulness of the #defeval meta-macro is shown by the following ex‐
1097 ample in HTML mode:
1098
1099
1100
1101 <#define APPLY|<#defeval TEMP|<\##1 \#1>><#TEMP #2>>
1102 <#define <#foo x>|<#x> and <#x>>
1103 <#APPLY foo|BLAH>
1104
1105 The reason why #defeval is needed is that, since everything is evalu‐
1106 ated in a single pass, the input that will result in the desired macro
1107 call needs to be generated by a first evaluation of the arguments
1108 passed to APPLY before being evaluated a second time.
1109
1110
1111  To translate this example into default mode, one needs to resort to
1112  parenthesizing in order to nest the #defeval call inside the definition
1113  of APPLY, but needs to do so without outputting the parentheses. The
1114  easiest solution is
1115
1116
1117
1118 #define BALANCE(x) x
1119 #define APPLY(f,v) BALANCE(#defeval TEMP f
1120 TEMP(v))
1121 #define foo(x) x and x
1122 APPLY(\foo,BLAH)
1123
1124  As explained above, the simplest version in cpp mode relies on defining
1125 a silent evaluated string to play the role of the BALANCE macro.
1126
1127
1128 The following example (default or cpp mode) demonstrates arithmetic
1129 evaluation:
1130
1131
1132
1133 #define x 4
1134 The answer is:
1135 #eval x*x + 2*(16-x) + 1998%x
1136
1137 #if defined(x)&&!(3*x+5>17)
1138 This should be output.
1139 #endif
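
The values involved in this example can be checked by hand. The Python lines below are used purely as a calculator (they do not run GPP); for these non-negative operands Python's operators agree with the C semantics that #eval follows.

```python
x = 4
value = x * x + 2 * (16 - x) + 1998 % x   # 16 + 24 + 2
condition = not (3 * x + 5 > 17)          # 17 > 17 is false, so this is true
print(value, condition)                    # prints: 42 True
```

The #eval line therefore produces 42, and since x is defined and the negated comparison holds, the #if block is output.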
1140
1141 To finish, here are some examples involving mode switching. The fol‐
1142 lowing example is self-explanatory (starting in default mode):
1143
1144
1145
1146 #mode push
1147 #define f(x) x x
1148 #mode standard tex
1149 \f{blah}
1150 \mode{string}{"$" "$"}
1151 \mode{comment}{"/*" "*/"}
1152 $\f{urf}$ /* blah */
1153 \define{FOO}{bar/* and some more */}
1154 \mode{pop}
1155 f($FOO$)
1156
1157 A good example where a user-defined mode becomes useful is the GPP
1158 source of this document (available with GPP's source code distribu‐
1159 tion).
1160
1161
1162 Another interesting application is selectively forcing evaluation of
1163 macros in C strings when in cpp mode. For example, consider the follow‐
1164 ing input:
1165
1166
1167
1168 #define blah(x) "and he said: x"
1169 blah(foo)
1170
1171 Obviously one would want the parameter x to be expanded inside the
1172 string. There are several ways around this problem:
1173
1174
1175
1176 #mode push
1177 #mode nostring "\""
1178 #define blah(x) "and he said: x"
1179 #mode pop
1180
1181 #mode quote "`"
1182 #define blah(x) `"and he said: x`"
1183
1184 #mode string QQQ "$$" "$$"
1185 #define blah(x) $$"and he said: x"$$
1186
1187 The first method is very natural, but has the inconvenience of being
1188 lengthy and neutralizing string semantics, so that having an unevalu‐
1189 ated instance of `x' in the string, or an occurrence of `/*', would be
1190 impossible without resorting to further contortions.
1191
1192 The second method is slightly more efficient because the local presence
1193 of a quote character makes it easier to control what is evaluated and
1194 what isn't, but has the drawback that it is sometimes impossible to
1195 find a reasonable quote character without having to either signifi‐
1196 cantly alter the source file or enclose it inside a #mode push/pop con‐
1197 struct. For example, any occurrence of `/*' in the string would have to
1198 be quoted.
1199
1200 The last method demonstrates the efficiency of evaluated strings in the
1201 context of selective evaluation: since comments/strings cannot be
1202 nested, any occurrence of `"' or `/*' inside the `$$' gets output as
1203 plain text, as expected inside a string, and only macro evaluation is
1204 enabled. Also note that there is much more freedom in the choice of a
1205 string delimiter than in the choice of a quote character.
1206
1207
1208 Starting with version 2.1, meta-macro calls can be nested more effi‐
1209 ciently in default, cpp and Prolog modes. This makes it easy to make a
1210 user version of a meta-macro, or to increment a counter:
1211
1212
1213
1214 #define myeval #eval #1
1215
1216 #define x 1
1217 #defeval x #eval x+1
1218
1219
1220
1221 ADVANCED EXAMPLES
1222 Here are some examples of advanced constructions using GPP. They tend
1223 to be pretty awkward and should be considered as evidence of GPP's lim‐
1224 itations.
1225
1226
1227 The first example is a recursive macro. The main problem is that (since
1228 GPP evaluates everything) a recursive macro must be very careful about
1229 the way in which recursion is terminated in order to avoid undefined
1230 behavior (most of the time GPP will simply crash). In particular, rely‐
1231 ing on a #if/#else/#endif construct to end recursion is not possible
1232 and results in an infinite loop, because GPP scans user macro calls
1233 even in the unevaluated branch of the conditional block. A safe way to
1234 proceed is for example as follows (we give the example in TeX mode):
1235
1236
1237
1238 \define{countdown}{
1239 \if{#1}
1240 #1...
1241 \define{loop}{\countdown}
1242 \else
1243 Done.
1244 \define{loop}{}
1245 \endif
1246 \loop{\eval{#1-1}}
1247 }
1248 \countdown{10}
1249
1250
1251 Another example, in cpp mode:
1252
1253
1254
1255 #mode string QQQ "$" "$"
1256 #define triangle(x,y) y \
1257 $#if length(y)<x$ $#define iter triangle$ $#else$ \
1258 $#define iter$ $#endif
1259 $ iter(x,*y)
1260 triangle(20)
1261
1262
1263 The following is an (unfortunately very weak) attempt at implementing
1264 functional abstraction in GPP (in standard mode). Understanding this
1265 example and why it can't be made much simpler is an exercise left to
1266 the curious reader.
1267
1268
1269
1270 #mode string "`" "`" "\\"
1271 #define ASIS(x) x
1272 #define SILENT(x) ASIS()
1273 #define EVAL(x,f,v) SILENT(
1274 #mode string QQQ "`" "`" "\\"
1275 #defeval TEMP0 x
1276 #defeval TEMP1 (
1277 \#define \TEMP2(TEMP0) f
1278 )
1279 TEMP1
1280 )TEMP2(v)
1281 #define LAMBDA(x,f,v) SILENT(
1282 #ifneq (v) ()
1283 #define TEMP3(a,b,c) EVAL(a,b,c)
1284 #else
1285 #define TEMP3(a,b,c) \LAMBDA(a,b)
1286 #endif
1287 )TEMP3(x,f,v)
1288 #define EVALAMBDA(x,y) SILENT(
1289 #defeval TEMP4 x
1290 #defeval TEMP5 y
1291 )
1292 #define APPLY(f,v) SILENT(
1293 #defeval TEMP6 ASIS(\EVA)f
1294 TEMP6
1295 )EVAL(TEMP4,TEMP5,v)
1296
1297 This yields the following results:
1298
1299
1300
1301 LAMBDA(z,z+z)
1302 => LAMBDA(z,z+z)
1303
1304 LAMBDA(z,z+z,2)
1305 => 2+2
1306
1307 #define f LAMBDA(y,y*y)
1308 f
1309 => LAMBDA(y,y*y)
1310
1311 APPLY(f,blah)
1312 => blah*blah
1313
1314 APPLY(LAMBDA(t,t t),(t t))
1315 => (t t) (t t)
1316
1317 LAMBDA(x,APPLY(f,(x+x)),urf)
1318 => (urf+urf)*(urf+urf)
1319
1320 APPLY(APPLY(LAMBDA(x,LAMBDA(y,x*y)),foo),bar)
1321 => foo*bar
1322
1323 #define test LAMBDA(y,`#ifeq y urf
1324 y is urf#else
1325 y is not urf#endif
1326 `)
1327 APPLY(test,urf)
1328 => urf is urf
1329
1330 APPLY(test,foo)
1331 => foo is not urf
1332
1333
1334
1335 SEE ALSO
1336 strftime(3), glob(7), m4(1V), cpp(1)
1337
1338 GPP home page: https://logological.org/gpp/
1339
1340
1341 AUTHOR
1342 GPP was written by Denis Auroux <auroux@math.mit.edu>. Since version
1343 2.12 it has been maintained by Tristan Miller <tristan@logologi‐
1344 cal.org>.
1345
1346
1347 COPYRIGHT
1348 Copyright (C) 1996-2001 Denis Auroux.
1349
1350 Copyright (C) 2003-2023 Tristan Miller.
1351
1352 Permission is granted to anyone to make or distribute verbatim copies
1353 of this document as received, in any medium, provided that the copy‐
1354 right notice and this permission notice are preserved, thus giving the
1355 recipient permission to redistribute in turn.
1356
1357 Permission is granted to distribute modified versions of this document,
1358 or of portions of it, under the above conditions, provided also that
1359 they carry prominent notices stating who last changed them.
1360
1361
1362
1363 GPP(1)