STATES(1)                           STATES                           STATES(1)

NAME

       states - awk-alike text processing tool

SYNOPSIS

       states [-hvV] [-D var=val] [-f file] [-o outputfile] [-p path]
       [-s startstate] [-W level] [filename ...]

DESCRIPTION

       States is an awk-alike text processing tool with some state machine
       extensions. It is designed for program source code highlighting and
       for similar tasks where state information helps input processing.

       At any given time, states is in exactly one state. Each state is quite
       similar to awk's work environment: it has regular expressions which
       are matched against the input and actions which are executed when a
       match is found. From the action blocks, states can perform state
       transitions; it can move to another state, from which processing
       continues. State transitions are recorded, so states can return to
       the calling state once the current state has finished.

       The biggest difference between states and awk, besides the state
       machine extensions, is that states is not line-oriented. It matches
       regular expression tokens from the input and, once a match is
       processed, it continues processing from the current position, not
       from the beginning of the next input line.
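
       The difference from line-oriented processing can be illustrated with
       a small Python model. This is only a sketch of the matching loop
       described above, not the states implementation: the scan function,
       the rule table, and the choice to copy unmatched text through
       unchanged are all assumptions made for the illustration.

```python
import re

def scan(text, rules):
    """Find the earliest rule match at or after the current position, run its
    action, then continue from the end of that match (not the next line)."""
    out = []
    pos = 0
    while pos < len(text):
        best = None
        for pattern, action in rules:
            m = pattern.search(text, pos)
            # Skip empty matches so the position always advances.
            if m and m.end() > m.start():
                if best is None or m.start() < best[0].start():
                    best = (m, action)
        if best is None:
            out.append(text[pos:])       # assumption: unmatched tail is copied
            break
        m, action = best
        out.append(text[pos:m.start()])  # assumption: text before a match is copied
        out.append(action(m))
        pos = m.end()
    return "".join(out)

# Hypothetical rules: wrap C comments and string literals in markers.
RULES = [
    (re.compile(r"/\*.*?\*/", re.S), lambda m: "<comment>" + m.group(0) + "</comment>"),
    (re.compile(r'"[^"\n]*"'), lambda m: "<string>" + m.group(0) + "</string>"),
]

SAMPLE = 'x = "a"; /* spans\ntwo lines */ y = "b";\n'
RESULT = scan(SAMPLE, RULES)
```

       In this model the comment token crosses a line boundary and the two
       string literals share their lines with other tokens; each token is
       handled where it occurs in the input.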

OPTIONS

       -D var=val, --define=var=val
               Define variable var to have string value val. Command line
               definitions override variable definitions found in the
               config file.

       -f file, --file=file
               Read state definitions from file file. By default, states
               tries to read state definitions from the file states.st in
               the current working directory.

       -h, --help
               Print a short help message and exit.

       -o file, --output=file
               Save output to file file instead of printing it to stdout.

       -p path, --path=path
               Set the load path to path. The load path defaults to the
               directory from which the state definitions file is loaded.

       -s state, --state=state
               Start execution from state state. This definition overrides
               the start state resolved from the start block.

       -v, --verbose
               Increase the program verbosity.

       -V, --version
               Print the states version and exit.

       -W level, --warning=level
               Set the warning level to level. Possible values for level are:

               light   light warnings (default)

               all     all warnings

STATES PROGRAM FILES

       States program files can contain one start block, startrules and
       namerules blocks to specify the initial state, state definitions,
       and expressions.

       The start block is the main() of the states program. It is executed
       on script startup for each input file and it can perform any
       initialization the script needs. It normally also calls the
       check_startrules() and check_namerules() primitives, which resolve
       the initial state from the input file name or from the data found at
       the beginning of the input file. Here is a sample start block which
       initializes two variables and does the standard start state
       resolving:

              start
              {
                a = 1;
                msg = "Hello, world!";
                check_startrules ();
                check_namerules ();
              }

       Once the start block is processed, input processing continues from
       the initial state.

       The initial state is resolved from the information found in the
       startrules and namerules blocks. Both blocks contain regular
       expression - symbol pairs. When a regular expression matches the
       name of the input file (namerules) or the beginning of the input
       file (startrules), the initial state is named by the corresponding
       symbol. For example, the following start and name rules can
       distinguish C and Fortran files:

              namerules
              {
                /\.(c|h)$/    c;
                /\.[fF]$/     fortran;
              }

              startrules
              {
                /-\*- [cC] -\*-/      c;
                /-\*- fortran -\*-/   fortran;
              }

       If these rules are used with the previously shown start block, states
       first checks the beginning of the input file. If it has the string
       -*- c -*- or -*- C -*-, the file is assumed to contain C code and
       processing is started from the state called c. If the beginning of
       the input file has the string -*- fortran -*-, the initial state is
       fortran. If none of the start rules matched, the name of the input
       file is matched against the namerules. If the name ends with the
       suffix .c or .h, we go to state c. If the suffix is .f or .F, the
       initial state is fortran.

       If both the start and name rules fail to resolve the start state,
       states just copies its input to output unmodified.

       The start state can also be specified from the command line with the
       -s (--state) option.
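
       The resolution order described above can be modelled in a few lines
       of Python. This is a sketch, not the states implementation: the
       function names are hypothetical, the rule that the first matching
       pattern wins is an assumption, and the head argument stands for
       whatever portion of the beginning of the file states inspects, which
       this page does not specify.

```python
import re

# Rules copied from the namerules and startrules example above.
NAMERULES = [
    (re.compile(r"\.(c|h)$"), "c"),
    (re.compile(r"\.[fF]$"), "fortran"),
]
STARTRULES = [
    (re.compile(r"-\*- [cC] -\*-"), "c"),
    (re.compile(r"-\*- fortran -\*-"), "fortran"),
]

def first_match(rules, subject):
    """Return the symbol of the first rule whose pattern is found in subject
    (assumption: earlier rules take precedence)."""
    for pattern, symbol in rules:
        if pattern.search(subject):
            return symbol
    return None

def resolve_start_state(filename, head, forced=None):
    """Model of the order: -s wins, then startrules, then namerules.

    Returns None when nothing matches, which corresponds to copying the
    input to the output unmodified.
    """
    if forced is not None:          # models the -s / --state option
        return forced
    return first_match(STARTRULES, head) or first_match(NAMERULES, filename)
```

       For example, resolve_start_state("prog.f", "/* -*- c -*- */") yields
       c in this model, because the start rules are consulted before the
       name rules.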

       State definitions have the following syntax:

       state name { expr {statements} ... }

       where name is the symbol naming the state, expr is a regular
       expression, a special expression, or a symbol, and statements is a
       list of statements. When the expression expr is matched from the
       input, the statement block is executed. The statement block can call
       states' primitives, user-defined subroutines, other states, etc.
       Once the block is executed, input processing continues from the
       current input position (which might have changed if the statement
       block called other states).

       Special expressions BEGIN and END can be used in place of expr.
       Expression BEGIN matches the beginning of the state; its block is
       called when the state is entered. Expression END matches the end of
       the state; its block is executed when states leaves the state.

       If expr is a symbol, its value is looked up from the global
       environment. If the value is a regular expression, it is matched
       against the input; otherwise the rule is ignored.

       The states program file can also have top-level expressions. They
       are evaluated after the program file is parsed but before any input
       files are processed or the start block is evaluated.
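
       How state calls, BEGIN and END blocks, and the shared input position
       interact can be sketched with a small Python model. Everything in it
       is hypothetical (the Machine class, the Return exception standing in
       for a return statement, the state table); it also assumes that text
       not matched by any rule is copied to the output and that END runs
       when the input ends. It is not the states implementation.

```python
import re

class Return(Exception):
    """Raised by an action to leave the current state (models `return`)."""

def _noop(mach):
    pass

class Machine:
    def __init__(self, text, states):
        self.text, self.states = text, states
        self.pos = 0            # one input position shared by all states
        self.out = []

    def _earliest(self, rules):
        """Return (match, action) for the rule matching earliest, or None."""
        best = None
        for pattern, action in rules:
            m = pattern.search(self.text, self.pos)
            if m and m.end() > m.start():
                if best is None or m.start() < best[0].start():
                    best = (m, action)
        return best

    def call(self, name):
        """Enter state `name` and process input until it returns."""
        state = self.states[name]
        state.get("BEGIN", _noop)(self)
        try:
            while self.pos < len(self.text):
                hit = self._earliest(state["rules"])
                if hit is None:
                    self.out.append(self.text[self.pos:])
                    self.pos = len(self.text)
                    break
                m, action = hit
                self.out.append(self.text[self.pos:m.start()])
                self.pos = m.end()
                action(self, m)   # may call other states, moving self.pos
        except Return:
            pass
        state.get("END", _noop)(self)

def open_comment(mach, m):
    mach.out.append(m.group(0))
    mach.call("comment")          # the caller resumes after "comment" returns

def close_comment(mach, m):
    mach.out.append(m.group(0))
    raise Return

STATES = {
    "code": {"rules": [(re.compile(r"/\*"), open_comment)]},
    "comment": {
        "BEGIN": lambda mach: mach.out.append("{"),
        "END": lambda mach: mach.out.append("}"),
        "rules": [(re.compile(r"\*/"), close_comment)],
    },
}

mach = Machine("a /* note */ b", STATES)
mach.call("code")
RESULT = "".join(mach.out)
```

       In this model the braces emitted by the BEGIN and END blocks bracket
       the time spent in the comment state, and the code state resumes at
       the position where the comment state stopped.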

PRIMITIVE FUNCTIONS

       call (symbol)
               Move to state symbol and continue input file processing from
               that state. The function returns whatever the symbol state's
               terminating return statement returned.

       calln (name)
               Like call, but the argument name is evaluated and its value
               must be a string. For example, this function can be used to
               call a state whose name is stored in a variable.

       check_namerules ()
               Try to resolve the start state from the namerules rules. The
               function returns 1 if the start state was resolved or 0
               otherwise.

       check_startrules ()
               Try to resolve the start state from the startrules rules. The
               function returns 1 if the start state was resolved or 0
               otherwise.

       concat (str, ...)
               Concatenate the argument strings and return the result as a
               new string.

       float (any)
               Convert the argument to a floating point number.

       getenv (str)
               Get the value of the environment variable str. Returns an
               empty string if the variable str is undefined.

       int (any)
               Convert the argument to an integer number.

       length (item, ...)
               Count the length of the argument strings or lists.

       list (any, ...)
               Create a new list which contains the items any, ...

       panic (any, ...)
               Report a non-recoverable error and exit with status 1. The
               function never returns.

       print (any, ...)
               Convert the arguments to strings and print them to the output.

       range (source, start, end)
               Return a sub-range of source starting from position start
               (inclusive) to end (exclusive). The argument source can be a
               string or a list.

       regexp (string)
               Convert the string string to a new regular expression.

       regexp_syntax (char, syntax)
               Modify regular expression character syntaxes by assigning the
               new syntax syntax to the character char. Possible values for
               syntax are:

               'w'     character is a word constituent

               ' '     character isn't a word constituent

       regmatch (string, regexp)
               Check if the string string matches the regular expression
               regexp. The function returns a boolean success status and
               sets the sub-expression registers $n.

       regsub (string, regexp, subst)
               Search for the regular expression regexp in the string string
               and replace the matching substring with the string subst.
               Returns the resulting string. The substitution string subst
               can contain $n references to the n:th parenthesized
               sub-expression.

       regsuball (string, regexp, subst)
               Like regsub, but replace all matches of the regular expression
               regexp in the string string with the string subst.

       require_state (symbol)
               Check that the state symbol is defined. If the required state
               is undefined, the function tries to autoload it. If the
               loading fails, the program terminates with an error message.

       split (regexp, string)
               Split the string string into a list, treating matches of the
               regular expression regexp as item separators.

       sprintf (fmt, ...)
               Format the arguments according to fmt and return the result as
               a string.

       strcmp (str1, str2)
               Perform a case-sensitive comparison of the strings str1 and
               str2. The function returns a value that is:

               -1      string str1 is less than str2

               0       strings are equal

               1       string str1 is greater than str2

       string (any)
               Convert the argument to a string.

       strncmp (str1, str2, num)
               Perform a case-sensitive comparison of the strings str1 and
               str2, comparing at most num characters.

       substring (str, start, end)
               Return a substring of the string str starting from position
               start (inclusive) to end (exclusive).
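
       The argument conventions of a few of these primitives can be
       mirrored in Python, which may help when reading them. The functions
       below are models written for this illustration, not the states
       primitives themselves. They assume zero-based positions (this page
       does not state the base), Python's regular expression syntax, and
       Python's string ordering, which may differ from states in corner
       cases.

```python
import re

def range_(source, start, end):
    """Model of range and substring: start inclusive, end exclusive.
    Works for strings and lists. Zero-based positions are an assumption."""
    return source[start:end]

def strcmp(str1, str2):
    """Case-sensitive three-way comparison returning -1, 0, or 1."""
    return (str1 > str2) - (str1 < str2)

def strncmp(str1, str2, num):
    """Like strcmp, but compares at most num characters."""
    return strcmp(str1[:num], str2[:num])

def _expander(subst):
    """Build a replacement function that expands $n references in subst."""
    def expand(match):
        return re.sub(r"\$(\d)",
                      lambda ref: match.group(int(ref.group(1))) or "",
                      subst)
    return expand

def regsub(string, regexp, subst):
    """Replace only the first match of regexp."""
    return re.sub(regexp, _expander(subst), string, count=1)

def regsuball(string, regexp, subst):
    """Replace every match of regexp."""
    return re.sub(regexp, _expander(subst), string)

def split(regexp, string):
    """Note the argument order: separator pattern first, then the string."""
    return re.split(regexp, string)
```

       In this model, regsub("a-b-c", "-", "+") changes only the first
       separator while regsuball changes both, and range_("abcdef", 1, 4)
       selects the three characters at positions 1, 2, and 3.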

BUILTIN VARIABLES

       $.      current input line number

       $n      the n:th parenthesized regular expression sub-expression from
               the latest state regular expression or from the regmatch
               primitive

       $`      everything before the matched regular expression. This is
               usable with the regmatch primitive; the contents of this
               variable are undefined when it is used in action blocks to
               refer to the data before the block's regular expression.

       $B      an alias for $`

       argv    list of input file names

       filename
               name of the current input file

       program
               name of the program (usually states)

       version
               program version string
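
       The relationship between regmatch, the $n registers, and $` can be
       pictured with the following Python model. It is a sketch under
       assumptions: the function returns the registers as a dictionary
       instead of setting variables, it treats regmatch as a search
       anywhere in the string (suggested by the existence of $`, but not
       stated here), and it fills in $0 as the whole match, which this page
       does not document.

```python
import re

def regmatch(string, regexp):
    """Return (matched, registers); registers models $n, $` and $B."""
    m = re.search(regexp, string)
    if m is None:
        return False, {}
    # $1..$n are the parenthesized sub-expressions; $0 is an assumption.
    registers = {"$" + str(i): m.group(i) for i in range(m.re.groups + 1)}
    registers["$`"] = string[:m.start()]   # everything before the match
    registers["$B"] = registers["$`"]      # documented alias for $`
    return True, registers

OK, REGS = regmatch("set key=value", r"(\w+)=(\w+)")
```

       Here REGS["$1"] and REGS["$2"] hold the two parenthesized parts of
       the match, and REGS["$`"] holds the text that precedes it.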

FILES

       /usr/share/enscript/hl/*.st             enscript's states definitions

SEE ALSO

       awk(1), enscript(1)

AUTHOR

       Markku Rossi <mtr@iki.fi> <http://www.iki.fi/~mtr/>

       GNU Enscript WWW home page: <http://www.iki.fi/~mtr/genscript/>



STATES                           Oct 23, 1998                        STATES(1)