MH-FORMAT(5)                  File Formats Manual                 MH-FORMAT(5)

NAME
mh-format - formatting language for nmh message system

DESCRIPTION
9 Several nmh commands utilize either a format string or a format file
10 during their execution. For example, scan uses a format string to gen‐
11 erate its listing of messages; repl uses a format file to generate mes‐
12 sage replies, and so on.
13
14 There are a number of scan listing formats available, including
15 nmh/etc/scan.time, nmh/etc/scan.size, and nmh/etc/scan.timely. Look in
16 /etc/nmh for other scan and repl format files which may have been writ‐
17 ten at your site.
18
19 You can have your local nmh expert write new format commands or modify
20 existing ones, or you can try your hand at it yourself. This manual
21 section explains how to do that. Note: some familiarity with the C
22 printf routine is assumed.
23
24 A format string consists of ordinary text combined with special, multi-
25 character, escape sequences which begin with `%'. When specifying a
26 format string, the usual C backslash characters are honored: `\b',
27 `\f', `\n', `\r', and `\t'. Continuation lines in format files end
28 with `\' followed by the newline character. A literal `%' can be in‐
29 serted into a format file by using the sequence `%%'.
30
31 SYNTAX
32 Format strings are built around escape sequences. There are three
33 types of escape sequence: header components, built-in functions, and
34 flow control. Comments may be inserted in most places where a function
35 argument is not expected. A comment begins with `%;' and ends with a
36 (non-escaped) newline.
37
38 Component escapes
39 A component escape is specified as `%{component}', and exists for each
40 header in the message being processed. For example, `%{date}' refers
41 to the “Date:” field of the message. All component escapes have a
42 string value. Such values are usually compressed by converting any
43 control characters (tab and newline included) to spaces, then eliding
44 any leading or multiple spaces. Some commands, however, may interpret
45 some component escapes differently; be sure to refer to each command's
46 manual entry for details. Some commands (such as ap(8) and mhl(1)) use
47 a special component `%{text}' to refer to the text being processed; see
48 their respective man pages for details and examples.
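
The compression described above can be pictured with a short Python
sketch. It is only an illustration of the documented behavior, not nmh's
implementation; the function name compress_component is hypothetical,
and treating every ASCII control character as a space is an assumption.

```python
import re


def compress_component(raw: str) -> str:
    """Mimic the documented compression of a component value."""
    # Control characters (tab and newline included) become spaces.
    spaced = re.sub(r"[\x00-\x1f\x7f]", " ", raw)
    # Runs of spaces collapse to one; leading spaces are dropped.
    return re.sub(r" +", " ", spaced).lstrip(" ")


print(repr(compress_component("  Re:\tweekly\n  status   report")))
# prints 'Re: weekly status report'
```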
49
50 Function escapes
51 A function escape is specified as `%(function)'. All functions are
52 built-in, and most have a string or integer value. A function escape
53 may take an argument. The argument follows the function escape (and
54 any separating whitespace is discarded) as in the following example:
55
56 %(function argument)
57
58 In addition to literal numbers or strings, the argument to a function
59 escape can be another function, or a component, or a control escape.
60 When the argument is a function or a component, the argument is speci‐
61 fied without a leading `%'. When the argument is a control escape, it
62 is specified with a leading `%'.
63
64 Control escapes
65 A control escape is one of: `%<', `%?', `%|', or `%>'. These are com‐
66 bined into the conditional execution construct:
67
%< condition  format-text
%? condition  format-text
    ...
%|            format-text
%>
73
74 (Extra white space is shown here only for clarity.) These constructs,
75 which may be nested without ambiguity, form a general if-elseif-else-
76 endif block where only one of the format-texts is interpreted. In
77 other words, `%<' is like the "if", `%?' is like the "elseif", `%|' is
78 like "else", and `%>' is like "endif".
79
A `%<' or `%?' control escape causes its condition to be evaluated.
This condition is a component or function. For components and functions
whose value is an integer, the condition is true if it is non-zero, and
false if zero. For components and functions whose value is a string, the
condition is true if it is a non-empty string, and false if it is an
empty string.
86
87 The `%?' control escape is optional, and can be used multiple times in
88 a conditional block. The `%|' control escape is also optional, but may
89 only be used once.
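
The if-elseif-else-endif behavior can be modeled in a few lines of
Python. This is a sketch under stated assumptions, not the nmh
evaluator: is_true and select_branch are hypothetical names, each
condition is passed in as an already computed value, and an absent `%|'
clause is assumed to produce no text.

```python
from typing import Optional, Sequence, Tuple, Union

Value = Union[int, str]


def is_true(value: Value) -> bool:
    """Integers are true when non-zero, strings when non-empty."""
    if isinstance(value, int):
        return value != 0
    return value != ""


def select_branch(clauses: Sequence[Tuple[Value, str]],
                  else_text: Optional[str] = None) -> str:
    """Return the format-text of the first clause whose condition holds.

    clauses models the %< clause followed by any %? clauses;
    else_text models the optional %| clause.
    """
    for condition, text in clauses:
        if is_true(condition):
            return text
    return else_text if else_text is not None else ""


# Models %<{replied}-%?{encrypted}E%| %> for a message that has an
# Encrypted: header but no Replied: header.
print(repr(select_branch([("", "-"), ("yes", "E")], " ")))  # prints 'E'
```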
90
91 Function escapes
92 Functions expecting an argument generally require an argument of a par‐
93 ticular type. In addition to the integer and string types, these in‐
94 clude:
95
Argument   Description              Example Syntax
literal    A literal number         %(func 1234)
           or string                %(func text string)
comp       Any component            %(func{in-reply-to})
date       A date component         %(func{date})
addr       An address component     %(func{from})
expr       Nothing                  %(func)
           or a subexpression       %(func(func2))
           or control escape        %(func %<{reply-to}%|%{from}%>)
105
106 The date and addr types have the same syntax as the component type,
107 comp, but require a header component which is a date, or address,
108 string, respectively.
109
110 Most arguments not of type expr are required. When escapes are nested
111 (via expr arguments), evaluation is done from innermost to outermost.
112 As noted above, for the expr argument type, functions and components
113 are written without a leading `%'. Control escape arguments must use a
114 leading `%', preceded by a space.
115
116 For example,
117
118 %<(mymbox{from}) To: %{to}%>
119
120 writes the value of the header component “From:” to the internal reg‐
121 ister named str; then (mymbox) reads str and writes its result to the
122 internal register named num; then the control escape, `%<', evaluates
123 num. If num is non-zero, the string “To:” is printed followed by the
124 value of the header component “To:”.
125
126 Evaluation
127 The evaluation of format strings is performed by a small virtual ma‐
128 chine. The machine is capable of evaluating nested expressions (as de‐
129 scribed above) and, in addition, has an integer register num, and a
130 text string register str. When a function escape that accepts an op‐
131 tional argument is processed, and the argument is not present, the cur‐
132 rent value of either num or str is substituted as the argument: the
133 register used depends on the function, as listed below.
134
135 Component escapes write the value of their message header in str.
136 Function escapes write their return value in num for functions return‐
137 ing integer or boolean values, and in str for functions returning
138 string values. (The boolean type is a subset of integers, with usual
139 values 0=false and 1=true.) Control escapes return a boolean value,
140 setting num to 1 if the last explicit condition evaluated by a `%<' or
141 `%?' control escape succeeded, and 0 otherwise.
142
143 All component escapes, and those function escapes which return an inte‐
144 ger or string value, evaluate to their value as well as setting str or
145 num. Outermost escape expressions in these forms will print their
146 value, but outermost escapes which return a boolean value do not result
147 in printed output.
148
149 Functions
150 The function escapes may be roughly grouped into a few categories.
151
Function    Argument Return   Description
msg                  integer  message number
cur                  integer  message is current (0 or 1)
unseen               integer  message is unseen (0 or 1)
size                 integer  size of message
strlen               integer  length of str
width                integer  column width of terminal
charleft             integer  bytes left in output buffer
timenow              integer  seconds since the Unix epoch
me                   string   the user's mailbox (username)
myhost               string   the user's local hostname
myname               string   the user's name
localmbox            string   the complete local mailbox
eq          literal  boolean  num == arg
ne          literal  boolean  num != arg
gt          literal  boolean  num > arg
match       literal  boolean  str contains arg
amatch      literal  boolean  str starts with arg
plus        literal  integer  arg plus num
minus       literal  integer  arg minus num
multiply    literal  integer  num multiplied by arg
divide      literal  integer  num divided by arg
modulo      literal  integer  num modulo arg
num         literal  integer  Set num to arg.
num                  integer  Set num to zero.
lit         literal  string   Set str to arg.
lit                  string   Clear str.
getenv      literal  string   Set str to environment value of arg
profile     literal  string   Set str to profile or context
                              component arg value
nonzero     expr     boolean  num is non-zero
zero        expr     boolean  num is zero
null        expr     boolean  str is empty
nonnull     expr     boolean  str is non-empty
void        expr              Set str or num
comp        comp     string   Set str to component text
compval     comp     integer  Set num to “atoi(comp)”
decode      expr     string   decode str as RFC 2047 (MIME-encoded)
                              component
unquote     expr     string   remove RFC 2822 quotes from str
trim        expr              trim trailing whitespace from str
trimr       expr     string   Like %(trim), also returns string
kilo        expr     string   express in SI units: 15.9K, 2.3M, etc.
                              %(kilo) scales by factors of 1000,
kibi        expr     string   express in IEC units: 15.5Ki, 2.2Mi.
                              %(kibi) scales by factors of 1024.
ordinal     expr     string   Output ordinal suffix based on value
                              of num (st, nd, rd, th)
putstr      expr              print str
putstrf     expr              print str in a fixed width
putnum      expr              print num
putnumf     expr              print num in a fixed width
putlit      expr              print str without space compression
zputlit     expr              print str without space compression;
                              str must occupy no width on display
bold                 string   set terminal bold mode
underline            string   set terminal underlined mode
standout             string   set terminal standout mode
resetterm            string   reset all terminal attributes
hascolor             boolean  terminal supports color
fgcolor     literal  string   set terminal foreground color
bgcolor     literal  string   set terminal background color
formataddr  expr              append arg to str as a
                              (comma separated) address list
concataddr  expr              append arg to str as a
                              (comma separated) address list,
                              including duplicates,
                              see Special Handling
putaddr     literal           print str address list with
                              arg as optional label;
                              get line width from num
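
The operand order of the arithmetic functions is easy to get wrong:
(minus) is listed as "arg minus num", while (divide) and (modulo) take
num as the left operand. The following Python sketch restates the table
entries; the function names mirror the table, but the code is only an
illustration and assumes non-negative operands, so that rounding of
negative quotients does not come into play.

```python
def plus(num: int, arg: int) -> int:
    return arg + num


def minus(num: int, arg: int) -> int:
    # Listed as "arg minus num": the literal is the left operand.
    return arg - num


def divide(num: int, arg: int) -> int:
    # Listed as "num divided by arg"; non-negative operands assumed.
    return num // arg


def modulo(num: int, arg: int) -> int:
    # Listed as "num modulo arg"; non-negative operands assumed.
    return num % arg


num = 3
print(minus(num, 10))   # models %(minus 10) with num == 3; prints 7
print(divide(num, 10))  # models %(divide 10) with num == 3; prints 0
```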
223
224 The (me) function returns the username of the current user. The (my‐
225 host) function returns the localname entry in mts.conf, or the local
226 hostname if localname is not configured. The (myname) function will
227 return the value of the SIGNATURE environment variable if set, other‐
228 wise it will return the passwd GECOS field (truncated at the first
229 comma if it contains one) for the current user. The (localmbox) func‐
230 tion will return the complete form of the local mailbox, suitable for
231 use in a “From” header. It will return the “Local-Mailbox” profile en‐
232 try if there is one; if not, it will be equivalent to:
233
234 %(myname) <%(me)@%(myhost)>
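
The fallback chain for (myname) and (localmbox) can be sketched in
Python as follows. The inputs are passed in explicitly rather than read
from the environment, the profile, or the passwd database, and an empty
SIGNATURE or Local-Mailbox value is assumed to count as unset; the
parameter names are hypothetical.

```python
from typing import Optional


def myname(signature: Optional[str], gecos: str) -> str:
    """SIGNATURE wins if set; otherwise GECOS up to the first comma."""
    if signature:
        return signature
    return gecos.split(",", 1)[0]


def localmbox(local_mailbox: Optional[str], signature: Optional[str],
              gecos: str, me: str, myhost: str) -> str:
    """Use the Local-Mailbox profile entry if present, else build one."""
    if local_mailbox:
        return local_mailbox
    return f"{myname(signature, gecos)} <{me}@{myhost}>"


print(localmbox(None, None, "Jane Doe,Room 12,555-0100",
                "jane", "example.com"))
# prints Jane Doe <jane@example.com>
```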
235
236 The following functions require a date component as an argument:
237
Function    Argument Return   Description
sec         date     integer  seconds of the minute
min         date     integer  minutes of the hour
hour        date     integer  hours of the day (0-23)
wday        date     integer  day of the week (Sun=0)
day         date     string   day of the week (abbrev.)
weekday     date     string   day of the week
sday        date     integer  day of the week known?
                              (1=explicit,0=implicit,-1=unknown)
mday        date     integer  day of the month
yday        date     integer  day of the year
mon         date     integer  month of the year
month       date     string   month of the year (abbrev.)
lmonth      date     string   month of the year
year        date     integer  year (may be > 100)
zone        date     integer  timezone in minutes
tzone       date     string   timezone string
szone       date     integer  timezone explicit?
                              (1=explicit,0=implicit,-1=unknown)
date2local  date              coerce date to local timezone
date2gmt    date              coerce date to GMT
dst         date     integer  daylight savings in effect? (0 or 1)
clock       date     integer  seconds since the Unix epoch
rclock      date     integer  seconds prior to current time
tws         date     string   official RFC 822 rendering
pretty      date     string   user-friendly rendering
nodate      date     integer  returns 1 if date is invalid
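
To get a feel for the values these functions produce, the Python
standard library can compute rough analogues for a sample “Date:”
header. This is an illustration only: date_fields is a hypothetical
helper, it relies on Python's own date parser rather than nmh's, and
edge cases (missing weekday, unknown timezone) may be treated
differently by nmh.

```python
from email.utils import parsedate_to_datetime


def date_fields(header: str) -> dict:
    """Compute rough analogues of a few date functions."""
    dt = parsedate_to_datetime(header)
    offset = dt.utcoffset()
    return {
        "hour": dt.hour,
        "mday": dt.day,
        "mon": dt.month,
        "year": dt.year,
        # Python counts Monday as 0; the table uses Sunday as 0.
        "wday": (dt.weekday() + 1) % 7,
        # Timezone offset expressed in minutes, as for (zone).
        "zone": int(offset.total_seconds() // 60) if offset is not None else 0,
        # Seconds since the Unix epoch, as for (clock).
        "clock": int(dt.timestamp()),
    }


fields = date_fields("Sat, 10 Jan 2015 12:34:56 -0500")
print(fields["mon"], fields["mday"], fields["wday"], fields["zone"])
# prints 1 10 6 -300
```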
265
266 The following functions require an address component as an argument.
267 The return value of functions noted with `*' is computed from the first
268 address present in the header component.
269
Function    Argument Return   Description
proper      addr     string   official RFC 822 rendering
friendly    addr     string   user-friendly rendering
addr        addr     string   mbox@host or host!mbox rendering*
pers        addr     string   the personal name*
note        addr     string   commentary text*
mbox        addr     string   the local mailbox*
mymbox      addr     integer  list has the user's address? (0 or 1)
getmymbox   addr     string   the user's (first) address,
                              with personal name
getmyaddr   addr     string   the user's (first) address,
                              without personal name
host        addr     string   the host domain*
nohost      addr     integer  no host was present (0 or 1)*
type        addr     integer  host type* (0=local,1=network,
                              -1=uucp,2=unknown)
path        addr     string   any leading host route*
ingrp       addr     integer  address was inside a group (0 or 1)*
gname       addr     string   name of group*
289
290 (A clarification on (mymbox{comp}) is in order. This function checks
291 each of the addresses in the header component “comp” against the user's
292 mailbox name and any “Alternate-Mailboxes”. It returns true if any ad‐
293 dress matches. However, it also returns true if the “comp” header is
294 not present in the message. If needed, the (null) function can be used
295 to explicitly test for this case.)
296
The (friendly{comp}) call will return any double-quoted “personal name”
(that is, anything before <>) if one is present. If there's no personal
name but there is a “note” (comments string after an email address), it
will return that. If there is neither of those it will just return the
bare email address.
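
The preference order used by (friendly) resembles what Python's
email.utils.parseaddr offers, which makes for a compact illustration.
This is an approximation, not nmh's parser: parseaddr reports a
personal name and a trailing comment through the same return value, so
the sketch cannot distinguish the first two steps, and the helper name
friendly is reused here purely for readability.

```python
from email.utils import parseaddr


def friendly(address: str) -> str:
    """Prefer a personal name or comment, else the bare address."""
    name, mailbox = parseaddr(address)
    return name if name else mailbox


for sample in ('"Jane Doe" <jane@example.com>',
               'jane@example.com (Jane)',
               'jane@example.com'):
    print(friendly(sample))
```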
302
303
304 Formatting
305 When a function or component escape is interpreted and the result will
306 be printed immediately, an optional field width can be specified to
307 print the field in exactly a given number of characters. For example,
308 a numeric escape like %4(size) will print at most 4 digits of the mes‐
309 sage size; overflow will be indicated by a `?' in the first position
310 (like `?234'). A string escape like %4(me) will print the first 4
311 characters and truncate at the end. Short fields are padded at the
312 right with the fill character (normally, a blank). If the field width
313 argument begins with a leading zero, then the fill character is set to
314 a zero.
315
316 The functions (putnumf) and (putstrf) print their result in exactly the
317 number of characters specified by their leading field width argument.
318 For example, %06(putnumf(size)) will print the message size in a field
319 six characters wide filled with leading zeros; %14(putstrf{from}) will
320 print the “From:” header component in fourteen characters with trailing
321 spaces added as needed. Using a negative value for the field width
322 causes right-justification within the field, with padding on the left
323 up to the field width. Padding is with spaces except for a left-padded
324 putnumf when the width starts with zero. The functions (putnum) and
325 (putstr) are somewhat special: they print their result in the minimum
326 number of characters required, and ignore any leading field width argu‐
327 ment. The (putlit) function outputs the exact contents of the str reg‐
328 ister without any changes such as duplicate space removal or control
329 character conversion. Similarly, the (zputlit) function outputs the
330 exact contents of the str register, but requires that those contents
331 not occupy any output width. It can therefore be used for outputting
332 terminal escape sequences.
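
The field width rules can be summarized with a small Python model. It
is a sketch based on the examples in this section, not on the nmh
source: fit_string and fit_number are hypothetical names, and
right-justifying numbers for a positive width is an assumption drawn
from the %06(putnumf(size)) example, which fills with leading zeros.

```python
def fit_string(text: str, width: int, fill: str = " ") -> str:
    """Truncate or pad text to exactly abs(width) characters."""
    size = abs(width)
    clipped = text[:size]
    # Positive width pads on the right; negative width pads on the left.
    if width < 0:
        return clipped.rjust(size, fill)
    return clipped.ljust(size, fill)


def fit_number(value: int, width: int, zero_fill: bool = False) -> str:
    """Right-justify a non-negative number; width must be positive."""
    digits = str(value)
    if len(digits) > width:
        # Overflow: keep the low-order digits, mark the first position.
        return "?" + digits[-(width - 1):] if width > 1 else "?"
    return digits.rjust(width, "0" if zero_fill else " ")


print(repr(fit_number(51234, 4)))                 # prints '?234'
print(repr(fit_number(1234, 6, zero_fill=True)))  # prints '001234'
print(repr(fit_string("alexander", 4)))           # prints 'alex'
print(repr(fit_string("jane", -6)))               # prints '  jane'
```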
333
334 There are a limited number of function escapes to output terminal es‐
335 cape sequences. These sequences are retrieved from the terminfo(5)
336 database according to the current terminal setting. The (bold), (un‐
337 derline), and (standout) escapes set bold mode, underline mode, and
338 standout mode respectively. (hascolor) can be used to determine if the
339 current terminal supports color. (fgcolor) and (bgcolor) set the fore‐
340 ground and background colors respectively. Both of these escapes take
341 one literal argument, the color name, which can be one of: black, red,
342 green, yellow, blue, magenta, cyan, white. (resetterm) resets all ter‐
343 minal attributes to their default setting. These terminal escapes
344 should be used in conjunction with (zputlit) (preferred) or (putlit),
345 as the normal (putstr) function will strip out control characters.
346
347 The available output width is kept in an internal register; any output
348 exceeding this width will be truncated. The one exception to this is
349 that (zputlit) functions will still be executed if a terminal reset
350 code is being placed at the end of a line.
351
352 Special Handling
353 Some functions have different behavior depending on the command they
354 are invoked from.
355
356 In repl the (formataddr) function stores all email addresses encoun‐
357 tered into an internal cache and will use this cache to suppress dupli‐
358 cate addresses. If you need to create an address list that includes
359 previously-seen addresses you may use the (concataddr) function, which
360 is identical to (formataddr) in all other respects. Note that (con‐
361 cataddr) does not add addresses to the duplicate-suppression cache.
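
The difference between the two functions in repl can be modeled with a
toy Python class. It only illustrates the duplicate-suppression rule
described above; AddressBuilder is a hypothetical name, and comparing
addresses as exact strings is a simplifying assumption (real address
matching is more involved).

```python
from typing import List, Set


class AddressBuilder:
    """Toy model of the address list built up in str during repl."""

    def __init__(self) -> None:
        self.addresses: List[str] = []
        self._seen: Set[str] = set()

    def formataddr(self, address: str) -> None:
        # Skip addresses already recorded in the cache.
        if address in self._seen:
            return
        self._seen.add(address)
        self.addresses.append(address)

    def concataddr(self, address: str) -> None:
        # Always append, and leave the cache untouched.
        self.addresses.append(address)

    def value(self) -> str:
        return ", ".join(self.addresses)


builder = AddressBuilder()
builder.formataddr("a@example.com")
builder.formataddr("a@example.com")   # suppressed as a duplicate
builder.concataddr("a@example.com")   # appended regardless
builder.formataddr("b@example.com")
print(builder.value())
# prints a@example.com, a@example.com, b@example.com
```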
362
363 Other Hints and Tips
364 Sometimes, the writer of a format function is confused because output
365 is duplicated. The general rule to remember is simple: If a function
366 or component escape begins with a `%', it will generate text in the
367 output file. Otherwise, it will not.
368
369 A good example is a simple attempt to generate a To: header based on
370 the From: and Reply-To: headers:
371
%(formataddr %<{reply-to}%|%{from}%>)%(putaddr To: )
373
374 Unfortunately, if the Reply-to: header is not present, the output line
375 will be something like:
376
377 My From User <from@example.com>To: My From User <from@example.com>
378
379 What went wrong? When performing the test for the if clause (%<), the
380 component is not output because it is considered an argument to the if
381 statement (so the rule about not starting with % applies). But the
382 component escape in our else statement (everything after the `%|') is
383 not an argument to anything; it begins with a %, and thus the value of
384 that component is output. This also has the side effect of setting the
385 str register, which is later picked up by the (formataddr) function and
386 then output by (putaddr). The example format string above has another
387 bug: there should always be a valid width value in the num register
388 when (putaddr) is called, otherwise bad formatting can take place.
389
390 The solution is to use the (void) function; this will prevent the func‐
391 tion or component from outputting any text. With this in place (and
392 using (width) to set the num register for the width) a better implemen‐
393 tation would look like:
394
%(formataddr %<{reply-to}%|%(void{from})%>)%(void(width))%(putaddr To: )
396
397 It should be noted here that the side effects of function and component
398 escapes are still in force and, as a result, each component test in the
399 if-elseif-else-endif clause sets the str register.
400
401 As an additional note, the (formataddr) and (concataddr) functions have
402 special behavior when it comes to the str register. The starting point
403 of the register is saved and is used to build up entries in the address
404 list.
405
406 You will find the fmttest(1) utility invaluable when debugging problems
407 with format strings.
408
409 Examples
410 With all the above in mind, here is a breakdown of the default format
411 string for scan. The first part is:
412
413 %4(msg)%<(cur)+%| %>%<{replied}-%?{encrypted}E%| %>
414
415 which says that the message number should be printed in four digits.
416 If the message is the current message then a `+', else a space, should
417 be printed; if a “Replied:” field is present then a `-', else if an
418 “Encrypted:” field is present then an `E', otherwise a space, should be
419 printed. Next:
420
421 %02(mon{date})/%02(mday{date})
422
423 the month and date are printed in two digits (zero filled) separated by
424 a slash. Next,
425
426 %<{date} %|*%>
427
If a “Date:” field is present, a space is printed; otherwise a `*' is
printed. Next,
430
431 %<(mymbox{from})%<{to}To:%14(decode(friendly{to}))%>%>
432
433 if the message is from me, and there is a “To:” header, print “To:”
434 followed by a “user-friendly” rendering of the first address in the
435 “To:” field; any MIME-encoded characters are decoded into the actual
436 characters. Continuing,
437
438 %<(zero)%17(decode(friendly{from}))%>
439
440 if either of the above two tests failed, then the “From:” address is
441 printed in a mime-decoded, “user-friendly” format. And finally,
442
443 %(decode{subject})%<{body}<<%{body}>>%>
444
445 the mime-decoded subject and initial body (if any) are printed.
446
447 For a more complicated example, consider a possible replcomps format
448 file.
449
450 %(lit)%(formataddr %<{reply-to}
451
452 This clears str and formats the “Reply-To:” header if present. If not
453 present, the else-if clause is executed.
454
455 %?{from}%?{sender}%?{return-path}%>)\
456
457 This formats the “From:”, “Sender:” and “Return-Path:” headers, stop‐
458 ping as soon as one of them is present. Next:
459
460 %<(nonnull)%(void(width))%(putaddr To: )\n%>\
461
462 If the formataddr result is non-null, it is printed as an address (with
463 line folding if needed) in a field width wide, with a leading label of
464 “To:”.
465
466 %(lit)%(formataddr{to})%(formataddr{cc})%(formataddr(me))\
467
468 str is cleared, and the “To:” and “Cc:” headers, along with the user's
469 address (depending on what was specified with the “-cc” switch to repl)
470 are formatted.
471
472 %<(nonnull)%(void(width))%(putaddr cc: )\n%>\
473
474 If the result is non-null, it is printed as above with a leading label
475 of “cc:”.
476
477 %<{fcc}Fcc: %{fcc}\n%>\
478
479 If a -fcc folder switch was given to repl (see repl(1) for more details
480 about %{fcc}), an “Fcc:” header is output.
481
482 %<{subject}Subject: Re: %{subject}\n%>\
483
484 If a subject component was present, a suitable reply subject is output.
485
486 %<{message-id}In-Reply-To: %{message-id}\n%>\
487 %<{message-id}References: %<{references} %{references}%>\
488 %{message-id}\n%>
489 --------
490
If a message-id component was present, an “In-Reply-To:” header is
output including the message-id, followed by a “References:” header with
references, if present, and the message-id. As with all plain text, the
row of dashes is output as-is.
495
496 This last part is a good example for a little more elaboration. Here's
497 that part again in pseudo-code:
498
if (comp_exists(message-id)) then
     print ("In-Reply-To: ")
     print (message-id.value)
     print ("\n")
endif
if (comp_exists(message-id)) then
     print ("References: ")
     if (comp_exists(references)) then
          print (references.value)
     endif
     print (message-id.value)
     print ("\n")
endif
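
For readers who prefer something runnable, the same logic can be
written as a short Python function. It is an idealized sketch:
reply_headers is a hypothetical name, the components are supplied as a
plain dictionary, and the exact spacing produced by the format file may
differ from the single spaces used here.

```python
from typing import Dict, List


def reply_headers(components: Dict[str, str]) -> str:
    """Build In-Reply-To: and References: from the original message."""
    lines: List[str] = []
    message_id = components.get("message-id", "")
    if message_id:
        lines.append(f"In-Reply-To: {message_id}")
        references = components.get("references", "")
        # References: carries the old references (if any), then the id.
        chain = f"{references} {message_id}" if references else message_id
        lines.append(f"References: {chain}")
    return "".join(line + "\n" for line in lines)


print(reply_headers({"message-id": "<2@example.com>",
                     "references": "<1@example.com>"}), end="")
```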
512
513 One more example: Currently, nmh supports very large message numbers,
514 and it is not uncommon for a folder to have far more than 10000 mes‐
515 sages. Nonetheless (as noted above) the various scan format strings,
516 inherited from older MH versions, are generally hard-coded to 4 digits
517 for the message number. Thereafter, formatting problems occur. The nmh
518 format strings can be modified to behave more sensibly with larger mes‐
519 sage numbers:
520
521 %(void(msg))%<(gt 9999)%(msg)%|%4(msg)%>
522
The current message number is placed in num. (Note that (msg) is a
function escape which returns an integer; it is not a component.) The
(gt) conditional is used to test whether the message number has 5 or
more digits. If so, it is printed at full width; otherwise it is printed
in 4 digits.
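
The effect of that format string can be mimicked in Python. The sketch
assumes that %4(msg) right-justifies the number in four columns, as scan
listings normally show; scan_msgnum is a hypothetical name.

```python
def scan_msgnum(msg: int) -> str:
    """Model %(void(msg))%<(gt 9999)%(msg)%|%4(msg)%>."""
    if msg > 9999:
        # Five or more digits: print at full width.
        return str(msg)
    # Otherwise use the traditional four-column field.
    return str(msg).rjust(4)


print(repr(scan_msgnum(42)))      # prints '  42'
print(repr(scan_msgnum(123456)))  # prints '123456'
```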
528
SEE ALSO
scan(1), repl(1), fmttest(1)

CONTEXT
None

nmh-1.8                          2015-01-10                      MH-FORMAT(5)