1grammar::fa::op(n)   Finite automaton operations and usage  grammar::fa::op(n)
2
3
4
5______________________________________________________________________________
6

NAME

8       grammar::fa::op - Operations on finite automatons
9

SYNOPSIS

11       package require Tcl  8.4
12
13       package require snit
14
15       package require struct::list
16
17       package require struct::set
18
19       package require grammar::fa::op  ?0.4.1?
20
21       ::grammar::fa::op::constructor cmd
22
23       ::grammar::fa::op::reverse fa
24
25       ::grammar::fa::op::complete fa ?sink?
26
27       ::grammar::fa::op::remove_eps fa
28
29       ::grammar::fa::op::trim fa ?what?
30
31       ::grammar::fa::op::determinize fa ?mapvar?
32
33       ::grammar::fa::op::minimize fa ?mapvar?
34
35       ::grammar::fa::op::complement fa
36
37       ::grammar::fa::op::kleene fa
38
39       ::grammar::fa::op::optional fa
40
41       ::grammar::fa::op::union fa fb ?mapvar?
42
43       ::grammar::fa::op::intersect fa fb ?mapvar?
44
45       ::grammar::fa::op::difference fa fb ?mapvar?
46
47       ::grammar::fa::op::concatenate fa fb ?mapvar?
48
49       ::grammar::fa::op::fromRegex fa regex ?over?
50
51       ::grammar::fa::op::toRegexp fa
52
53       ::grammar::fa::op::toRegexp2 fa
54
55       ::grammar::fa::op::toTclRegexp regexp symdict
56
57       ::grammar::fa::op::simplifyRegexp regexp
58
59______________________________________________________________________________
60

DESCRIPTION

62       This  package provides a number of complex operations on finite automa‐
63       tons (Short: FA), as provided by the package grammar::fa.  The  package
64       does  not provide the ability to create and/or manipulate such FAs, nor
65       the ability to execute a FA for a stream of symbols.  Use the  packages
66       grammar::fa  and  grammar::fa::interpreter  for  that.  Another package
67       related to this is grammar::fa::compiler  which  turns  a  FA  into  an
68       executor class which has the definition of the FA hardwired into it.
69
70       For  more  information  about  what  a  finite automaton is see section
71       FINITE AUTOMATONS in package grammar::fa.
72

API

74       The package exports the API described here.  All commands modify  their
75       first  argument.  I.e. whatever FA they compute is stored back into it.
76       Some of the operations will construct an automaton whose states are all
77       new, but related to the states in the source automaton(s). These opera‐
78       tions take variable names as optional arguments where they  will  store
79       mappings  which  describe  the  relationship(s).  The operations can be
80       loosely partitioned into structural and language operations. The latter
81       are  defined  in terms of the language the automaton(s) accept, whereas
82       the former are defined in terms of the  structural  properties  of  the
83       involved automaton(s). Some operations are both.  Structure operations
84
85       ::grammar::fa::op::constructor cmd
86              This  command has to be called by the user of the package before
87              any other operations is performed, to establish a command  which
88              can  be  used to construct a FA container object. If this is not
89              done several operations will fail as they  are  unable  to  con‐
90              struct  internal  and  transient containers to hold state and/or
91              partial results.
92
93              Any container class using this package  for  complex  operations
94              should set its own class command as the constructor. See package
95              grammar::fa for an example.
96
97       ::grammar::fa::op::reverse fa
98              Reverses the fa. This is done by reversing the direction of  all
99              transitions and swapping the sets of start and final states. The
100              language of fa changes unpredictably.
101
102       ::grammar::fa::op::complete fa ?sink?
103              Completes the fa complete, but nothing is  done  if  the  fa  is
104              already  complete.  This implies that only the first in a series
105              of multiple consecutive complete operations on fa  will  perform
106              anything. The remainder will be null operations.
107
108              The language of fa is unchanged by this operation.
109
110              This is done by adding a single new state, the sink, and transi‐
111              tions from all other states to that sink for  all  symbols  they
112              have  no  transitions  for.  The sink itself is made complete by
113              adding loop transitions for all symbols.
114
115              Note: When a FA has epsilon-transitions transitions over a  sym‐
116              bol for a state S can be indirect, i.e. not attached directly to
117              S, but to a state in the epsilon-closure of S. The  symbols  for
118              such indirect transitions count when computing completeness of a
119              state. In other words, these indirectly reached symbols are  not
120              missing.
121
122              The  argument  sink provides the name for the new state and most
123              not be present in the fa if specified. If the name is not speci‐
124              fied  the command will name the state "sinkn", where n is set so
125              that there are no collisions with existing states.
126
127              Note that the sink state is not useful by definition.  In  other
128              words,  while  the FA becomes complete, it is also not useful in
129              the strict sense as it has a state from which no final state can
130              be reached.
131
132       ::grammar::fa::op::remove_eps fa
133              Removes all epsilon-transitions from the fa in such a manner the
134              the language of fa is unchanged. However nothing is done if  the
135              fa is already epsilon-free.  This implies that only the first in
136              a series of multiple consecutive complete operations on fa  will
137              perform anything. The remainder will be null operations.
138
139              Note:  This  operation may cause states to become unreachable or
140              not useful. These states are not removed by this operation.  Use
141              ::grammar::fa::op::trim for that instead.
142
143       ::grammar::fa::op::trim fa ?what?
144              Removes unwanted baggage from fa.  The legal values for what are
145              listed below. The command defaults to !reachable|!useful  if  no
146              specific argument was given.
147
148              !reachable
149                     Removes  all  states which are not reachable from a start
150                     state.
151
152              !useful
153                     Removes all states which are  unable  to  reach  a  final
154                     state.
155
156              !reachable&!useful
157
158              !(reachable|useful)
159                     Removes  all  states which are not reachable from a start
160                     state and are unable to reach a final state.
161
162              !reachable|!useful
163
164              !(reachable&useful)
165                     Removes all states which are not reachable from  a  start
166                     state or are unable to reach a final state.
167
168
169       ::grammar::fa::op::determinize fa ?mapvar?
170              Makes   the  fa  deterministic  without  changing  the  language
171              accepted by the fa. However nothing is done if the fa is already
172              deterministic.  This  implies that only the first in a series of
173              multiple consecutive complete operations on fa will perform any‐
174              thing. The remainder will be null operations.
175
176              The  command will store a dictionary describing the relationship
177              between the new states of the resulting dfa and  the  states  of
178              the  input  nfa in mapvar, if it has been specified. Keys of the
179              dictionary are the handles for the states of the resulting  dfa,
180              values are sets of states from the input nfa.
181
182              Note:  An  empty dictionary signals that the command was able to
183              make the fa deterministic without performing a full subset  con‐
184              struction,  just  by  removing  states and shuffling transitions
185              around (As part of making the FA epsilon-free).
186
187              Note: The algorithm fails to make the FA  deterministic  in  the
188              technical  sense if the FA has no start state(s), because deter‐
189              minism requires the FA to have exactly  one  start  states.   In
190              that  situation  we  make  a  best effort; and the missing start
191              state will be the only condition preventing the generated result
192              from  being deterministic.  It should also be noted that in this
193              case the possibilities for trimming states from the FA are  also
194              severely reduced as we cannot declare states unreachable.
195
196       ::grammar::fa::op::minimize fa ?mapvar?
197              Creates  a  FA  which accepts the same language as fa, but has a
198              minimal number of states. Uses Brzozowski's method to accomplish
199              this.
200
201              The  command will store a dictionary describing the relationship
202              between the new states of  the  resulting  minimal  fa  and  the
203              states of the input fa in mapvar, if it has been specified. Keys
204              of the dictionary are the handles for the states of the  result‐
205              ing minimal fa, values are sets of states from the input fa.
206
207              Note:  An  empty dictionary signals that the command was able to
208              minimize the fa without  having  to  compute  new  states.  This
209              should happen if and only if the input FA was already minimal.
210
211              Note: If the algorithm has no start or final states to work with
212              then the result might be technically minimal, but  have  a  very
213              unexpected structure.  It should also be noted that in this case
214              the possibilities for trimming states from the FA are  also  se‐
215              verely reduced as we cannot declare states unreachable.
216
217       Language  operations  All  operations  in this section require that all
218       input FAs have at least one start and at least one final state.  Other‐
219       wise  the language of the FAs will not be defined, making the operation
220       senseless (as it operates on the languages of the FAs in a defined man‐
221       ner).
222
223       ::grammar::fa::op::complement fa
224              Complements  fa.  This is possible if and only if fa is complete
225              and deterministic. The resulting FA  accepts  the  complementary
226              language  of  fa. In other words, all inputs not accepted by the
227              input are accepted by the result, and vice versa.
228
229              The result will have all states and transitions  of  the  input,
230              and different final states.
231
232       ::grammar::fa::op::kleene fa
233              Applies  Kleene's  closure  to fa.  The resulting FA accepts all
234              strings S for which we can find a natural number n (0 inclusive)
235              and  strings  A1 ... An in the language of fa such that S is the
236              concatenation of A1 ... An.  In other words, the language of the
237              result  is  the infinite union over finite length concatenations
238              over the language of fa.
239
240              The result will have all states and transitions  of  the  input,
241              and new start and final states.
242
243       ::grammar::fa::op::optional fa
244              Makes  the  fa optional. In other words it computes the FA which
245              accepts the language of fa and the empty the word  (epsilon)  as
246              well.
247
248              The  result  will  have all states and transitions of the input,
249              and new start and final states.
250
251       ::grammar::fa::op::union fa fb ?mapvar?
252              Combines the FAs fa and fb such that the  resulting  FA  accepts
253              the union of the languages of the two FAs.
254
255              The result will have all states and transitions of the two input
256              FAs, and new start and final states.  All  states  of  fb  which
257              exist in fa as well will be renamed, and the mapvar will contain
258              a mapping from the old states of fb to the new ones, if present.
259
260              It should be noted that the result  will  be  non-deterministic,
261              even if the inputs are deterministic.
262
263       ::grammar::fa::op::intersect fa fb ?mapvar?
264              Combines  the  FAs  fa and fb such that the resulting FA accepts
265              the intersection of the languages  of  the  two  FAs.  In  other
266              words,  the result will accept a word if and only if the word is
267              accepted by both fa and fb. The result will be useful,  but  not
268              necessarily deterministic or minimal.
269
270              The  command will store a dictionary describing the relationship
271              between the new states of the resulting  fa  and  the  pairs  of
272              states  of  the  input  FAs in mapvar, if it has been specified.
273              Keys of the dictionary are the handles for  the  states  of  the
274              resulting  fa,  values  are  pairs of states from the input FAs.
275              Pairs are represented by lists. The first element in  each  pair
276              will be a state in fa, the second element will be drawn from fb.
277
278       ::grammar::fa::op::difference fa fb ?mapvar?
279              Combines  the  FAs  fa and fb such that the resulting FA accepts
280              the difference of the languages of the two FAs. In other  words,
281              the  result  will  accept  a  word  if  and  only if the word is
282              accepted by fa, but not by fb. This can also be expressed as the
283              intersection of fa with the complement of fb. The result will be
284              useful, but not necessarily deterministic or minimal.
285
286              The command will store a dictionary describing the  relationship
287              between  the  new  states  of  the resulting fa and the pairs of
288              states of the input FAs in mapvar, if  it  has  been  specified.
289              Keys  of  the  dictionary  are the handles for the states of the
290              resulting fa, values are pairs of states  from  the  input  FAs.
291              Pairs  are  represented by lists. The first element in each pair
292              will be a state in fa, the second element will be drawn from fb.
293
294       ::grammar::fa::op::concatenate fa fb ?mapvar?
295              Combines the FAs fa and fb such that the  resulting  FA  accepts
296              the cross-product of the languages of the two FAs. I.e. a word W
297              will be accepted by the result if there are two words  A  and  B
298              accepted by fa, and fb resp. and W is the concatenation of A and
299              B.
300
301              The result FA will be non-deterministic.
302
303       ::grammar::fa::op::fromRegex fa regex ?over?
304              Generates a non-deterministic FA which accepts the same language
305              as  the regular expression regex. If the over is specified it is
306              treated as the set of symbols the regular expression and the au‐
307              tomaton  are defined over. The command will compute the set from
308              the "S" constructors in regex when over was not specified.  This
309              set  is  important if and only if the complement operator "!" is
310              used in regex as the complementary language of an  FA  is  quite
311              different for different sets of symbols.
312
313              The  regular  expression  is represented by a nested list, which
314              forms a syntax tree. The following structures are legal:
315
316              {S x}  Atomic regular expression. Everything else is constructed
317                     from these. Accepts the Symbol "x".
318
319              {. A1 A2 ...}
320                     Concatenation  operator. Accepts the concatenation of the
321                     regular expressions A1, A2, etc.
322
323                     Note that this operator accepts zero or  more  arguments.
324                     With  zero arguments the represented language is epsilon,
325                     the empty word.
326
327              {| A1 A2 ...}
328                     Choice operator, also called "Alternative".  Accepts  all
329                     input accepted by at least one of the regular expressions
330                     A1, A2, etc. In other words, the union of A1, A2.
331
332                     Note that this operator accepts zero or  more  arguments.
333                     With zero arguments the represented language is the empty
334                     language, the language without words.
335
336              {& A1 A2 ...}
337                     Intersection operator, logical  and.  Accepts  all  input
338                     accepted  which is accepted by all of the regular expres‐
339                     sions A1, A2, etc. In other words,  the  intersection  of
340                     A1, A2.
341
342              {? A}  Optionality operator. Accepts the empty word and anything
343                     from the regular expression A.
344
345              {* A}  Kleene closure. Accepts the empty  word  and  any  finite
346                     concatenation of words accepted by the regular expression
347                     A.
348
349              {+ A}  Positive Kleene closure. Accepts any finite concatenation
350                     of  words  accepted  by the regular expression A, but not
351                     the empty word.
352
353              {! A}  Complement operator. Accepts any word not accepted by the
354                     regular expression A. Note that the complement depends on
355                     the set of symbol the result should  run  over.  See  the
356                     discussion of the argument over before.
357
358       ::grammar::fa::op::toRegexp fa
359              This  command  generates  and returns a regular expression which
360              accepts the same language as the finite automaton fa. The  regu‐
361              lar  expression is in the format as described above, for ::gram‐
362              mar::fa::op::fromRegex.
363
364       ::grammar::fa::op::toRegexp2 fa
365              This   command   has   the   same   functionality   as   ::gram‐
366              mar::fa::op::toRegexp,  but  uses  a different algorithm to sim‐
367              plify the generated regular expressions.
368
369       ::grammar::fa::op::toTclRegexp regexp symdict
370              This command generates and returns a regular expression  in  Tcl
371              syntax  for  the regular expression regexp, if that is possible.
372              regexp  is  in  the  same  format   as   expected   by   ::gram‐
373              mar::fa::op::fromRegex.
374
375              The command will fail and throw an error if regexp contains com‐
376              plementation and intersection operations.
377
378              The argument symdict is a dictionary  mapping  symbol  names  to
379              pairs of syntactic type and Tcl-regexp. If a symbol occurring in
380              the regexp is not listed in this dictionary then  single-charac‐
381              ter  symbols are considered to designate themselves whereas mul‐
382              tiple-character symbols are considered to be a  character  class
383              name.
384
385       ::grammar::fa::op::simplifyRegexp regexp
386              This  command  simplifies  a  regular expression by applying the
387              following algorithm first to the main expression and then recur‐
388              sively to all sub-expressions:
389
390              [1]    Convert the expression into a finite automaton.
391
392              [2]    Minimize the automaton.
393
394              [3]    Convert the automaton back to a regular expression.
395
396              [4]    Choose  the shorter of original expression and expression
397                     from the previous step.
398

EXAMPLES

BUGS, IDEAS, FEEDBACK

401       This document, and the package it describes, will  undoubtedly  contain
402       bugs and other problems.  Please report such in the category grammar_fa
403       of the Tcllib Trackers [http://core.tcl.tk/tcllib/reportlist].   Please
404       also  report any ideas for enhancements you may have for either package
405       and/or documentation.
406
407       When proposing code changes, please provide unified diffs, i.e the out‐
408       put of diff -u.
409
410       Note  further  that  attachments  are  strongly  preferred over inlined
411       patches. Attachments can be made by going  to  the  Edit  form  of  the
412       ticket  immediately  after  its  creation, and then using the left-most
413       button in the secondary navigation bar.
414

KEYWORDS

416       automaton, finite automaton, grammar, parsing, regular expression, reg‐
417       ular grammar, regular languages, state, transducer
418

CATEGORY

420       Grammars and finite automata
421
423       Copyright (c) 2004-2008 Andreas Kupries <andreas_kupries@users.sourceforge.net>
424
425
426
427
428tcllib                                0.4                   grammar::fa::op(n)
Impressum