STATES(1)                            STATES                            STATES(1)

NAME
       states - awk-alike text processing tool

SYNOPSIS
       states [-hvV] [-D var=val] [-f file] [-o outputfile] [-p path]
              [-s startstate] [-W level] [filename ...]

DESCRIPTION
       States is an awk-alike text processing tool with some state machine
       extensions. It is designed for program source code highlighting and
       for similar tasks where state information helps input processing.

       At any point in time, states is in exactly one state. Each state
       resembles awk's work environment: it has regular expressions which
       are matched against the input and actions which are executed when a
       match is found. From the action blocks, states can perform state
       transitions: it can move to another state from which the processing
       is continued. State transitions are recorded so states can return to
       the calling state once the current state has finished.

       The biggest difference between states and awk, besides the state
       machine extensions, is that states is not line-oriented. It matches
       regular expression tokens from the input and, once a match is
       processed, it continues processing from the current position, not
       from the beginning of the next input line.

OPTIONS
       -D var=val, --define=var=val
              Define variable var to have string value val. Command line
              definitions override variable definitions found in the
              config file.

       -f file, --file=file
              Read state definitions from file file. By default, states
              tries to read state definitions from the file states.st in
              the current working directory.

       -h, --help
              Print a short help message and exit.

       -o file, --output=file
              Save output to file file instead of printing it to stdout.

       -p path, --path=path
              Set the load path to path. The load path defaults to the
              directory from which the state definitions file is loaded.

       -s state, --state=state
              Start execution from state state. This definition overrides
              the start state resolved from the start block.

       -v, --verbose
              Increase the program verbosity.

       -V, --version
              Print the states version and exit.

       -W level, --warning=level
              Set the warning level to level. Possible values for level
              are:

              light  light warnings (default)

              all    all warnings

STATES PROGRAM FILES
       States program files can contain one start block, startrules and
       namerules blocks that specify the initial state, state definitions,
       and expressions.

       The start block is the main() of the states program: it is executed
       on script startup for each input file and it can perform any
       initialization the script needs. It normally also calls the
       check_startrules() and check_namerules() primitives, which resolve
       the initial state from the input file name or from the data found at
       the beginning of the input file. Here is a sample start block which
       initializes two variables and does the standard start state
       resolving:

           start
           {
             a = 1;
             msg = "Hello, world!";
             check_startrules ();
             check_namerules ();
           }

       Once the start block is processed, the input processing is continued
       from the initial state.

       The initial state is resolved from the information found in the
       startrules and namerules blocks. Both blocks contain pairs of a
       regular expression and a symbol. When the regular expression matches
       the name or the beginning of the input file, the initial state is
       named by the corresponding symbol. For example, the following start
       and name rules can distinguish C and Fortran files:

           namerules
           {
             /\.(c|h)$/    c;
             /\.[fF]$/     fortran;
           }

           startrules
           {
             /-\*- [cC] -\*-/      c;
             /-\*- fortran -\*-/   fortran;
           }

       If these rules are used with the previously shown start block,
       states first checks the beginning of the input file. If it has the
       string -*- c -*- or -*- C -*-, the file is assumed to contain C code
       and the processing is started from the state called c. If the
       beginning of the input file has the string -*- fortran -*-, the
       initial state is fortran. If none of the start rules matched, the
       name of the input file is matched against the namerules. If the name
       ends with the suffix .c or .h, we go to state c. If the suffix is .f
       or .F, the initial state is fortran.

       If both the start and name rules fail to resolve the start state,
       states just copies its input to output unmodified.

       The start state can also be specified from the command line with the
       option -s, --state.

       State definitions have the following syntax:

           state name { expr {statements} ... }

       where name is the name of the state, expr is a regular expression, a
       special expression or a symbol, and statements is a list of
       statements. When the expression expr is matched from the input, the
       statement block is executed. The statement block can call states'
       primitives, user-defined subroutines, other states, etc. Once the
       block is executed, the input processing is continued from the
       current input position (which might have been changed if the
       statement block called other states).
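
       A minimal sketch of this syntax, not taken from any shipped
       definitions file: the two states below wrap C comments in <i> and
       </i> markers. The state names and the markers are illustrative. The
       sketch assumes that $0 (the n = 0 case of $n) holds the whole text
       of the latest match.

           /* Hypothetical state: entered after the opening of a comment. */
           state c_comment
           {
             /\*\// {
               /* End of comment: print it, close the marker, go back. */
               print ($0, "</i>");
               return;
             }
           }

           /* Hypothetical top-level state for C input. */
           state c
           {
             /\/\*/ {
               /* Start of comment: open the marker, switch states. */
               print ("<i>", $0);
               call (c_comment);
             }
           }

       Once c_comment returns, matching resumes in state c at the input
       position that follows the end of the comment.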

       Special expressions BEGIN and END can be used in the place of expr.
       Expression BEGIN matches the beginning of the state: its block is
       called when the state is entered. Expression END matches the end of
       the state: its block is executed when states leaves the state.
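
       As an illustration, the hypothetical state below uses BEGIN and END
       to bracket whatever the state prints. It is a sketch only: the state
       name rest_of_line and the <i> markers are made up for this example,
       and it relies on a return statement to leave the state.

           /* Hypothetical state: handle the rest of the current line. */
           state rest_of_line
           {
             BEGIN {
               /* Runs once, when the state is entered. */
               print ("<i>");
             }
             /(.*)\n/ {
               /* $1 is the text of the line without the newline. */
               print ($1);
               return;
             }
             END {
               /* Runs once, when the state is left. */
               print ("</i>\n");
             }
           }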

       If expr is a symbol, its value is looked up from the global
       environment. If the value is a regular expression, it is matched
       against the input; otherwise the rule is ignored.

       The states program file can also have top-level expressions. They
       are evaluated after the program file is parsed but before any input
       files are processed or the start block is evaluated.
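
       A short sketch that combines a top-level expression with a symbol
       used as a rule expression. The names keyword_re and keywords are
       hypothetical, and $0 is assumed to hold the whole text of the latest
       match (the n = 0 case of $n).

           /* Top-level expression: bind a regular expression to a
              global symbol before any input is processed. */
           keyword_re = /\b(if|else|while)\b/;

           /* Hypothetical state that uses the symbol as its rule. */
           state keywords
           {
             keyword_re {
               print ("<b>", $0, "</b>");
             }
           }

       Because the symbol is looked up from the global environment, the
       same rule could be retargeted by assigning a different regular
       expression to keyword_re.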

PRIMITIVE FUNCTIONS
       call (symbol)
              Move to state symbol and continue input file processing from
              that state. The function returns whatever the symbol state's
              terminating return statement returned.

       calln (name)
              Like call, but the argument name is evaluated and its value
              must be a string. For example, this function can be used to
              call a state whose name is stored in a variable.

       check_namerules ()
              Try to resolve the start state from the namerules rules. The
              function returns 1 if the start state was resolved or 0
              otherwise.

       check_startrules ()
              Try to resolve the start state from the startrules rules.
              The function returns 1 if the start state was resolved or 0
              otherwise.

       concat (str, ...)
              Concatenate the argument strings and return the result as a
              new string.

       float (any)
              Convert the argument to a floating point number.

       getenv (str)
              Get the value of the environment variable str. Returns an
              empty string if the variable str is undefined.

       int (any)
              Convert the argument to an integer number.

       length (item, ...)
              Count the length of the argument strings or lists.

       list (any, ...)
              Create a new list which contains the items any, ...

       panic (any, ...)
              Report a non-recoverable error and exit with status 1. The
              function never returns.

       print (any, ...)
              Convert the arguments to strings and print them to the
              output.

       range (source, start, end)
              Return a sub-range of source starting from position start
              (inclusive) to end (exclusive). Argument source can be a
              string or a list.

       regexp (string)
              Convert string string to a new regular expression.

       regexp_syntax (char, syntax)
              Modify regular expression character syntaxes by assigning
              the new syntax syntax to character char. Possible values for
              syntax are:

              'w'    character is a word constituent

              ' '    character isn't a word constituent

       regmatch (string, regexp)
              Check if string string matches regular expression regexp.
              The function returns a boolean success status and sets the
              sub-expression registers $n.

       regsub (string, regexp, subst)
              Search for regular expression regexp in string string and
              replace the matching substring with string subst. Returns
              the resulting string. The substitution string subst can
              contain $n references to the n:th parenthesized
              sub-expression.

       regsuball (string, regexp, subst)
              Like regsub, but replace all matches of regular expression
              regexp in string string with string subst.

       require_state (symbol)
              Check that the state symbol is defined. If the required
              state is undefined, the function tries to autoload it. If
              the loading fails, the program terminates with an error
              message.

       split (regexp, string)
              Split string string into a list, treating matches of regular
              expression regexp as item separators.

       sprintf (fmt, ...)
              Format the arguments according to fmt and return the result
              as a string.

       strcmp (str1, str2)
              Perform a case-sensitive comparison of strings str1 and
              str2. The function returns a value that is:

              -1     string str1 is less than str2

              0      strings are equal

              1      string str1 is greater than str2

       string (any)
              Convert the argument to a string.

       strncmp (str1, str2, num)
              Perform a case-sensitive comparison of strings str1 and
              str2, comparing at most num characters.

       substring (str, start, end)
              Return a substring of string str starting from position
              start (inclusive) to end (exclusive).
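
       The sketch below shows how a few of these primitives might be
       combined inside action blocks. Everything named here is
       hypothetical: the state dispatch, the @lang marker in the input,
       and the lang_ states that calln would enter. $0 is assumed to hold
       the whole text of the latest match (the n = 0 case of $n).

           /* Hypothetical state that picks another state at run time. */
           state dispatch
           {
             /@lang ([a-z]+)\n/ {
               /* Build the state name from the first sub-expression,
                  for example "lang_c", and call that state by name. */
               calln (concat ("lang_", $1));
             }
             /[A-Za-z_]+/ {
               /* Print the word with every underscore replaced. */
               print (regsuball ($0, /_/, "-"));
             }
           }

       calln is used instead of call because the state name is only known
       as a string once the rule has matched.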

BUILTIN VARIABLES
       $.     current input line number

       $n     the n:th parenthesized regular expression sub-expression
              from the latest state regular expression or from the
              regmatch primitive

       $`     everything before the matched regular expression. This is
              usable when used with the regmatch primitive; the contents
              of this variable are undefined when used in action blocks to
              refer to the data before the block's regular expression.

       $B     an alias for $`

       argv   list of input file names

       filename
              name of the current input file

       program
              name of the program (usually states)

       version
              program version string
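
       As a small sketch, the hypothetical state below tags every TODO
       found in the input with the file name and line number taken from
       these variables. The state name todo and the output format are made
       up, and $0 is assumed to hold the whole text of the latest match
       (the n = 0 case of $n).

           /* Hypothetical state: annotate TODO markers with position. */
           state todo
           {
             /TODO/ {
               print ("[", filename, ":", $., "] ", $0);
             }
           }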

FILES
       /usr/share/enscript/hl/*.st
              enscript's states definitions

SEE ALSO
       awk(1), enscript(1)

AUTHOR
       Markku Rossi <mtr@iki.fi> <http://www.iki.fi/~mtr/>

       GNU Enscript WWW home page: <http://www.iki.fi/~mtr/genscript/>

STATES                           Oct 23, 1998                          STATES(1)