1GVPR(1)                     General Commands Manual                    GVPR(1)
2
3
4

NAME

6       gvpr - graph pattern scanning and processing language
7       ( previously known as gpr )
8

SYNOPSIS

10       gvpr  [-icnqV?]   [ -o outfile ] [ -a args ] [ 'prog' | -f progfile ] [
11       files ]
12

DESCRIPTION

14       gvpr is a graph stream editor inspired by awk.  It copies input  graphs
15       to  its  output,  possibly transforming their structure and attributes,
16       creating new graphs, or  printing  arbitrary  information.   The  graph
17       model  is that provided by libcgraph(3).  In particular, gvpr reads and
18       writes graphs using the dot language.
19
20       Basically, gvpr traverses each input graph,  denoted  by  $G,  visiting
21       each  node  and  edge, matching it with the predicate‐action rules sup‐
22       plied in the input program.  The rules are  evaluated  in  order.   For
23       each  predicate  evaluating  to  true, the corresponding action is per‐
24       formed.  During the traversal, the current node or edge  being  visited
25       is denoted by $.
26
27       For  each  input graph, there is a target subgraph, denoted by $T, ini‐
28       tially empty and used to accumulate  chosen  entities,  and  an  output
29       graph,  $O,  used  for final processing and then written to output.  By
30       default, the output graph is the target graph.  The output graph can be
31       set in the program or, in a limited sense, on the command line.
32

OPTIONS

34       The following options are supported:
35
36       -a args
37              The  string args is split into whitespace‐separated tokens, with
38              the individual tokens available as strings in the  gvpr  program
39              as  ARGV[0],...,ARGV[ARGC-1].  Whitespace characters within sin‐
40              gle or double quoted substrings, or preceded by a backslash, are
41              ignored  as separators.  In general, a backslash character turns
42              off any special meaning of the following character.   Note  that
43              the tokens derived from multiple -a flags are concatenated.
44
45       -c     Use the source graph as the output graph.
46
47       -i     Derive  the  node‐induced subgraph extension of the output graph
48              in the context of its root graph.
49
50       -o outfile
51              Causes the output stream to be written to the specified file; by
52              default, output is written to stdout.
53
54       -f progfile
55              Use the contents of the specified file as the program to execute
56              on the input. If progfile contains a slash character,  the  name
57              is  taken  as the pathname of the file. Otherwise, gvpr will use
58              the directories specified in the environment  variable  GVPRPATH
59              to  look  for  the  file.  If -f is not given, gvpr will use the
60              first non‐option argument as the program.
61
62       -q     Turns off warning messages.
63
64       -n     Turns off graph read-ahead. By default, the variable $NG is  set
65              to  the  next graph to be processed. This requires a read of the
66              next graph before processing the current graph, which may  block
67              is  the  next graph is only generated in response to some action
68              pertaining to the processing of the current graph.
69
70       -V     Causes the program to print version information and exit.
71
72       -?     Causes the program to print usage information and exit.
73

OPERANDS

75       The following operand is supported:
76
77       files   Names of files containing 1 or more graphs in the dot language.
78               If  no  -f  option is given, the first name is removed from the
79               list and used as the input program. If the  list  of  files  is
80               empty, stdin will be used.
81

PROGRAMS

83       A  gvpr  program consists of a list of predicate‐action clauses, having
84       one of the forms:
85
86              BEGIN { action }
87
88              BEG_G { action }
89
90              N [ predicate ] { action }
91
92              E [ predicate ] { action }
93
94              END_G { action }
95
96              END { action }
97
98       A program can contain at most one of each of the BEGIN, END_G  and  END
99       clauses.   There  can  be  any number of BEG_G, N and E statements, the
100       first applied to graphs, the second  to  nodes,  the  third  to  edges.
101       These  are  separated  into  blocks,  a block consisting of an optional
102       BEG_G statement and all N and E statements up to the next BEG_G  state‐
103       ment, if any.  The top‐level semantics of a gvpr program are:
104
105              Evaluate the BEGIN clause, if any.
106              For each input graph G {
107                  For each block {
108                      Set G as the current graph and current object.
109                      Evaluate the BEG_G clause, if any.
110                      For each node and edge in G {
111                          Set the node or edge as the current object.
112                          Evaluate the N or E clauses, as appropriate.
113                      }
114                  }
115                  Set G as the current object.
116                  Evaluate the END_G clause, if any.
117              }
118              Evaluate the END clause, if any.
119
120       The  actions  of  the BEGIN, BEG_G, END_G and END clauses are performed
121       when the clauses are evaluated.  For N or E clauses, either the  predi‐
122       cate  or  action  may  be  omitted.   If  there is no predicate with an
123       action, the action is performed on every node or edge, as  appropriate.
124       If  there is no action and the predicate evaluates to true, the associ‐
125       ated node or edge is added to the target graph.
126
127       The blocks are evaluated in the order in which they  occur.   Within  a
128       block,  the  N  clauses  (E clauses, respectively) are evaluated in the
129       order in which the occur. Note, though, that within a  block,  N  or  E
130       clauses may be interlaced, depending on the traversal order.
131
132       Predicates  and  actions  are  sequences of statements in the C dialect
133       supported by the expr(3) library.  The only difference  between  predi‐
134       cates  and  actions is that the former must have a type that may inter‐
135       preted as either true or false.  Here the usual C  convention  is  fol‐
136       lowed, in which a non‐zero value is considered true. This would include
137       non‐empty strings and non‐empty references to nodes, edges,  etc.  How‐
138       ever, if a string can be converted to an integer, this value is used.
139
140       In  addition  to  the usual C base types (void, int, char, float, long,
141       unsigned and double), gvpr provides string as a synonym for char*,  and
142       the  graph‐based  types  node_t,  edge_t, graph_t and obj_t.  The obj_t
143       type can be viewed as a supertype of the other 3  concrete  types;  the
144       correct base type is maintained dynamically.  Besides these base types,
145       the only other supported type expressions are (associative) arrays.
146
147       Constants follow C syntax, but strings may be quoted with either  "..."
148       or '...'.  gvpr accepts C++ comments as well as cpp‐type comments.  For
149       the latter, if a line begins with a '#' character, the rest of the line
150       is ignored.
151
152       A statement can be a declaration of a function, a variable or an array,
153       or an executable statement. For declarations, there is a single  scope.
154       Array declarations have the form:
155
156               type array [ type0 ]
157
158       where   type0   is optional. If it is supplied, the parser will enforce
159       that all array subscripts have the specified type. If it  is  not  sup‐
160       plied,  objects of all types can be used as subscripts.  As in C, vari‐
161       ables and arrays must be declared. In particular, an  undeclared  vari‐
162       able will be interpreted as the name of an attribute of a node, edge or
163       graph, depending on the context.
164
165       Executable statements can be one of the following:
166              { [ statement ... ] }
167              expression              // commonly var = expression
168              if( expression ) statement [ else statement ]
169              for( expression ; expression ; expression ) statement
170              for( array [ var ]) statement
171              forr( array [ var ]) statement
172              while( expression ) statement
173              switch( expression ) case statements
174              break [ expression ]
175              continue [ expression ]
176              return [ expression ]
177       Items in brackets are optional.
178
179       In the second form of the for statement and  the  forr  statement,  the
180       variable  var  is  set  to each value used as an index in the specified
181       array and then the associated statement is evaluated. For  numeric  and
182       string  indices,  the  indices  are returned in increasing (decreasing)
183       numeric or lexicographic order for for (forr, respectively).  This  can
184       be used for sorting.
185
186       Function definitions can only appear in the BEGIN clause.
187
188       Expressions  include the usual C expressions.  String comparisons using
189       == and != treat the right hand operand as a pattern for the purpose  of
190       regular  expression  matching.   Patterns use ksh(1) file match pattern
191       syntax.  (For simple string equality, use the strcmp function.
192
193       gvpr will attempt to use an expression as a string or numeric value  as
194       appropriate.  Both  C-like casts and function templates will cause con‐
195       versions to be performed, if possible.
196
197       Expressions of graphical type (i.e., graph_t,  node_t,  edge_t,  obj_t)
198       may  be followed by a field reference in the form of .name. The result‐
199       ing value is the value of the attribute named name of the given object.
200       In  addition,  in certain contexts an undeclared, unmodified identifier
201       is taken to be an attribute name. Specifically, such identifiers denote
202       attributes  of  the  current  node  or  edge,  respectively, in N and E
203       clauses, and the current graph in BEG_G and END_G clauses.
204
205       As usual in the libcgraph(3) model, attributes are  string‐valued.   In
206       addition, gvpr supports certain pseudo‐attributes of graph objects, not
207       necessarily string‐valued. These reflect intrinsic  properties  of  the
208       graph objects and cannot be set by the user.
209
210       head : node_t
211              the head of an edge.
212
213       tail : node_t
214              the tail of an edge.
215
216       name : string
217              the  name of an edge, node or graph. The name of an edge has the
218              form "<tail‐name><edge‐op><head‐name>[<key>]",  where  <edge‐op>
219              is  "->"  or  "--" depending on whether the graph is directed or
220              not. The bracket part [<key>] only appears if  the  edge  has  a
221              non‐trivial key.
222
223       indegree : int
224              the indegree of a node.
225
226       outdegree : int
227              the outdegree of a node.
228
229       degree : int
230              the degree of a node.
231
232       root : graph_t
233              the root graph of an object. The root of a root graph is itself.
234
235       parent : graph_t
236              the  parent  graph  of a subgraph. The parent of a root graph is
237              NULL
238
239       n_edges : int
240              the number of edges in the graph
241
242       n_nodes : int
243              the number of nodes in the graph
244
245       directed : int
246              true (non‐zero) if the graph is directed
247
248       strict : int
249              true (non‐zero) if the graph is strict
250

BUILT‐IN FUNCTIONS

252       The following functions are built into gvpr. Those functions  returning
253       references to graph objects return NULL in case of failure.
254
255   Graphs and subgraph
256       graph(s : string, t : string) : graph_t
257              creates  a  graph whose name is s and whose type is specified by
258              the string t. Ignoring case, the characters U, D, S, N have  the
259              interpretation  undirected,  directed,  strict,  and non‐strict,
260              respectively. If t is empty, a  directed,  non‐strict  graph  is
261              generated.
262
263       subg(g : graph_t, s : string) : graph_t
264              creates  a  subgraph  in  graph  g  with name s. If the subgraph
265              already exists, it is returned.
266
267       isSubg(g : graph_t, s : string) : graph_t
268              returns the subgraph in graph g with name s, if  it  exists,  or
269              NULL otherwise.
270
271       fstsubg(g : graph_t) : graph_t
272              returns the first subgraph in graph g, or NULL if none exists.
273
274       nxtsubg(sg : graph_t) : graph_t
275              returns the next subgraph after sg, or NULL.
276
277       isDirect(g : graph_t) : int
278              returns true if and only if g is directed.
279
280       isStrict(g : graph_t) : int
281              returns true if and only if g is strict.
282
283       nNodes(g : graph_t) : int
284              returns the number of nodes in g.
285
286       nEdges(g : graph_t) : int
287              returns the number of edges in g.
288
289   Nodes
290       node(sg : graph_t, s : string) : node_t
291              creates  a  node  in  graph  g of name s. If such a node already
292              exists, it is returned.
293
294       subnode(sg : graph_t, n : node_t) : node_t
295              inserts the node n into the subgraph g. Returns the node.
296
297       fstnode(g : graph_t) : node_t
298              returns the first node in graph g, or NULL if none exists.
299
300       nxtnode(n : node_t) : node_t
301              returns the next node after n in the root graph, or NULL.
302
303       nxtnode_sg(sg : graph_t, n : node_t) : node_t
304              returns the next node after n in sg, or NULL.
305
306       isNode(sg : graph_t, s : string) : node_t
307              looks for a node in (sub)graph sg of name  s.  If  such  a  node
308              exists, it is returned. Otherwise, NULL is returned.
309
310       isSubnode(sg : graph_t, n : node_t) : int
311              returns  non-zero  if node n is in (sub)graph sg, or zero other‐
312              wise.
313
314       indegreeOf(sg : graph_t, n : node_t) : int
315              returns the indegree of node n in (sub)graph sg.
316
317       outdegreeOf(sg : graph_t, n : node_t) : int
318              returns the outdegree of node n in (sub)graph sg.
319
320       degreeOf(sg : graph_t, n : node_t) : int
321              returns the degree of node n in (sub)graph sg.
322
323   Edges
324       edge(t : node_t, h : node_t, s : string) : edge_t
325              creates an edge with tail node t, head node h and name s in  the
326              root  graph. If the graph is undirected, the distinction between
327              head and tail nodes is unimportant.  If  such  an  edge  already
328              exists, it is returned.
329
330       edge_sg(sg : graph_t, t : node_t, h : node_t, s : string) : edge_t
331              creates  an  edge  with  tail  node t, head node h and name s in
332              (sub)graph sg (and all parent graphs). If  the  graph  is  undi‐
333              rected,  the distinction between head and tail nodes is unimpor‐
334              tant.  If such an edge already exists, it is returned.
335
336       subedge(g : graph_t, e : edge_t) : edge_t
337              inserts the edge e into the subgraph g. Returns the edge.
338
339       isEdge(t : node_t, h : node_t, s : string) : edge_t
340              looks for an edge with tail node t, head node h and name  s.  If
341              the  graph  is undirected, the distinction between head and tail
342              nodes is unimportant.  If such an edge exists, it  is  returned.
343              Otherwise, NULL is returned.
344
345       isEdge_sg(sg : graph_t, t : node_t, h : node_t, s : string) : edge_t
346              looks  for  an  edge with tail node t, head node h and name s in
347              (sub)graph sg. If  the  graph  is  undirected,  the  distinction
348              between  head  and  tail  nodes is unimportant.  If such an edge
349              exists, it is returned. Otherwise, NULL is returned.
350
351       isSubedge(g : graph_t, e : edge_t) : int
352              returns non-zero if edge e is in (sub)graph sg, or  zero  other‐
353              wise.
354
355       fstout(n : node_t) : edge_t
356              returns the first outedge of node n in the root graph.
357
358       fstout_sg(sg : graph_t, n : node_t) : edge_t
359              returns the first outedge of node n in (sub)graph sg.
360
361       nxtout(e : edge_t) : edge_t
362              returns the next outedge after e in the root graph.
363
364       nxtout_sg(sg : graph_t, e : edge_t) : edge_t
365              returns the next outedge after e in graph sg.
366
367       fstin(n : node_t) : edge_t
368              returns the first inedge of node n in the root graph.
369
370       fstin_sg(sg : graph_t, n : node_t) : edge_t
371              returns the first inedge of node n in graph sg.
372
373       nxtin(e : edge_t) : edge_t
374              returns the next inedge after e in the root graph.
375
376       nxtin_sg(sg : graph_t, e : edge_t) : edge_t
377              returns the next inedge after e in graph sg.
378
379       fstedge(n : node_t) : edge_t
380              returns the first edge of node n in the root graph.
381
382       fstedge_sg(sg : graph_t, n : node_t) : edge_t
383              returns the first edge of node n in graph sg.
384
385       nxtedge(e : edge_t, node_t) : edge_t
386              returns the next edge after e in the root graph.
387
388       nxtedge_sg(sg : graph_t, e : edge_t, node_t) : edge_t
389              returns the next edge after e in the graph sg.
390
391       opp(e : edge_t, node_t) : node_t
392              returns  the node on the edge e not equal to n.  Returns NULL if
393              n is not a node of e.  This can be useful when using fstedge and
394              nxtedge to enumerate the neighbors of n.
395
396   Graph I/O
397       write(g : graph_t) : void
398              prints g in dot format onto the output stream.
399
400       writeG(g : graph_t, fname : string) : void
401              prints g in dot format into the file fname.
402
403       fwriteG(g : graph_t, fd : int) : void
404              prints g in dot format onto the open stream denoted by the inte‐
405              ger fd.
406
407       readG(fname : string) : graph_t
408              returns a graph read from the file fname. The graph should be in
409              dot format. If no graph can be read, NULL is returned.
410
411       freadG(fd : int) : graph_t
412              returns  the  next  graph read from the open stream fd.  Returns
413              NULL at end of file.
414
415   Graph miscellany
416       delete(g : graph_t, x : obj_t) : void
417              deletes object x from graph g.  If g is NULL, the function  uses
418              the  root graph of x.  If x is a graph or subgraph, it is closed
419              unless x is locked.
420
421       isIn(g : graph_t, x : obj_t) : int
422              returns true if x is in subgraph g.
423
424       cloneG(g : graph_t, s : string) : graph_t
425              creates a clone of graph g with name of s.  If s is "", the cre‐
426              ated graph has the same name as g.
427
428       clone(g : graph_t, x : obj_t) : obj_t
429              creates  a clone of object x in graph g.  In particular, the new
430              object has the same name/value attributes and structure  as  the
431              original  object.   If  an object with the same key as x already
432              exists, its attributes are overlaid by those of x and the object
433              is  returned.   If an edge is cloned, both endpoints are implic‐
434              itly cloned.  If a graph is cloned, all nodes,  edges  and  sub‐
435              graphs  are  implicitly cloned.  If x is a graph, g may be NULL,
436              in which case the cloned object will be a  new  root  graph.  In
437              this case, the call is equivalent to cloneG(x,"").
438
439       copy(g : graph_t, x : obj_t) : obj_t
440              creates  a copy of object x in graph g, where the new object has
441              the same name/value attributes as the original  object.   If  an
442              object with the same key as x already exists, its attributes are
443              overlaid by those of x and the object is  returned.   Note  that
444              this  is  a  shallow  copy.  If x is a graph, none of its nodes,
445              edges or subgraphs are copied into the new graph.  If  x  is  an
446              edge,  the  endpoints are created if necessary, but they are not
447              cloned.  If x is a graph, g may  be  NULL,  in  which  case  the
448              cloned object will be a new root graph.
449
450       copyA(src : obj_t, tgt : obj_t) : int
451              copies  the  attributes of object src to object tgt, overwriting
452              any attribute values tgt may initially have.
453
454       induce(g : graph_t) : void
455              extends g to its node‐induced subgraph  extension  in  its  root
456              graph.
457
458       hasAttr(src : obj_t, name : string) : int
459              returns  non-zero  if  object src has an attribute whose name is
460              name. It returns 0 otherwise.
461
462       isAttr(g : graph_t, kind : string, name : string) : int
463              returns non-zero if an attribute name has been defined in g  for
464              objects  of  the  given kind. For nodes, edges, and graphs, kind
465              should be "N", "E", and "G", respectively.  It returns 0  other‐
466              wise.
467
468       aget(src : obj_t, name : string) : string
469              returns  the value of attribute name in object src. This is use‐
470              ful for those cases when name conflicts with one of the keywords
471              such  as  "head"  or  "root".   If  the  attribute  has not been
472              declared in the graph, the function will initialize  it  with  a
473              default  value  of "". To avoid this, one should use the hasAttr
474              or isAttr function to check that the attribute exists.
475
476       aset(src : obj_t, name : string, value : string) : int
477              sets the value  of  attribute  name  in  object  src  to  value.
478              Returns 0 on success, non‐zero on failure. See aget above.
479
480       getDflt(g : graph_t, kind : string, name : string) : string
481              returns  the  default value of attribute name in objects in g of
482              the given kind. For nodes, edges, and  graphs,  kind  should  be
483              "N",  "E", and "G", respectively.  If the attribute has not been
484              declared in the graph, the function will initialize  it  with  a
485              default  value  of  "". To avoid this, one should use the isAttr
486              function to check that the attribute exists.
487
488       setDflt(g : graph_t, kind : string, name : string, value  :  string)  :
489       int
490              sets  the default value of attribute name to value in objects in
491              g of the given kind. For nodes, edges, and graphs,  kind  should
492              be  "N", "E", and "G", respectively.  Returns 0 on success, non‐
493              zero on failure. See getDflt above.
494
495       fstAttr(g : graph_t, kind : string) : string
496              returns the name of the first attribute of objects in g  of  the
497              given  kind.  For  nodes, edges, and graphs, kind should be "N",
498              "E", and "G", respectively.  If there  are  no  attributes,  the
499              string "" is returned.
500
501       nxtAttr(g : graph_t, kind : string, name : string) : string
502              returns  the  name  of the next attribute of objects in g of the
503              given kind after the attribute name.  The argument name must  be
504              the  name  of  an  existing  attribute; it will typically be the
505              return value of an previous call to  fstAttr  or  nxtAttr.   For
506              nodes,  edges,  and  graphs,  kind  should be "N", "E", and "G",
507              respectively.  If there are no attributes left, the string "" is
508              returned.
509
510       compOf(g : graph_t, n : node_t) : graph_t
511              returns  the  connected component of the graph g containing node
512              n, as a subgraph of g. The subgraph only contains the nodes. One
513              can  use induce to add the edges. The function fails and returns
514              NULL if n is not in g. Connectivity is based on  the  underlying
515              undirected graph of g.
516
517       kindOf(obj : obj_t) : string
518              returns  an indication of what kind of graph object is the argu‐
519              ment.  For nodes, edges, and graphs, it returns should  be  "N",
520              "E", and "G", respectively.
521
522       lock(g : graph_t, v : int) : int
523              implements  graph  locking  on  root graphs. If the integer v is
524              positive, the graph is set so that future calls to  delete  have
525              no  immediate  effect.   If v is zero, the graph is unlocked. If
526              there has been a call to delete the graph while it  was  locked,
527              the graph is closed.  If v is negative, nothing is done.  In all
528              cases, the previous lock value is returned.
529
530   Strings
531       sprintf(fmt : string, ...) : string
532              returns the string resulting from formatting the values  of  the
533              expressions  occurring after fmt according to the printf(3) for‐
534              mat fmt
535
536       gsub(str : string, pat : string) : string
537
538       gsub(str : string, pat : string, repl : string) : string
539              returns str with all substrings matching pat deleted or replaced
540              by repl, respectively.
541
542       sub(str : string, pat : string) : string
543
544       sub(str : string, pat : string, repl : string) : string
545              returns  str with the leftmost substring matching pat deleted or
546              replaced by repl, respectively. The characters '^' and  '$'  may
547              be used at the beginning and end, respectively, of pat to anchor
548              the pattern to the beginning or end of str.
549
550       substr(str : string, idx : int) : string
551
552       substr(str : string, idx : int, len : int) : string
553              returns the substring of str starting at position idx to the end
554              of  the  string or of length len, respectively.  Indexing starts
555              at 0. If idx is negative or idx is greater than  the  length  of
556              str, a fatal error occurs. Similarly, in the second case, if len
557              is negative or idx + len is greater than the length  of  str,  a
558              fatal error occurs.
559
560       strcmp(s1 : string, s2 : string) : int
561              provides the standard C function strcmp(3).
562
563       length(s : string) : int
564              returns the length of string s.
565
566       index(s : string, t : string) : int
567
568       rindex(s : string, t : string) : int
569              returns  the  index of the character in string s where the left‐
570              most (rightmost) copy of string t can be found, or -1  if  t  is
571              not a substring of s.
572
573       match(s : string, p : string) : int
574              returns  the  index of the character in string s where the left‐
575              most match of pattern p can be found, or -1 if no substring of s
576              matches p.
577
578       toupper(s : string) : string
579              returns  a version of s with the alphabetic characters converted
580              to upper-case.
581
582       tolower(s : string) : string
583              returns a version of s with the alphabetic characters  converted
584              to lower-case.
585
586       canon(s : string) : string
587              returns  a  version of s appropriate to be used as an identifier
588              in a dot file.
589
590       html(g : graph_t, s : string) : string
591              returns a ``magic'' version  of s as an HTML string.  This  will
592              typically  be  used  to  attach  an  HTML-like  label to a graph
593              object. Note that the returned string lives in g. In particular,
594              it will be freed when g is closed, and to act as an HTML string,
595              it has to be used with an object of g. In  addition,  note  that
596              the  angle bracket quotes should not be part of s. These will be
597              added if g is written in concrete DOT format.
598
599       ishtml(s : string) : int
600              returns non-zero if and only if s is an HTML string.
601
602       xOf(s : string) : string
603              returns the string "x" if s has the form "x,y", where both x and
604              y are numeric.
605
606       yOf(s : string) : string
607              returns the string "y" if s has the form "x,y", where both x and
608              y are numeric.
609
610       llOf(s : string) : string
611              returns   the   string   "llx,lly"   if   s   has    the    form
612              "llx,lly,urx,ury",  where  all  of  llx,  lly,  urx, and ury are
613              numeric.
614
615       urOf(s)
616              urOf(s : string) : string returns the string "urx,ury" if s  has
617              the  form "llx,lly,urx,ury", where all of llx, lly, urx, and ury
618              are numeric.
619
620       sscanf(s : string, fmt : string, ...) : int
621              scans the string s, extracting values according to the sscanf(3)
622              format  fmt.   The  values are stored in the addresses following
623              fmt, addresses having the form &v,  where  v  is  some  declared
624              variable  of the correct type.  Returns the number of items suc‐
625              cessfully scanned.
626
627       split(s : string, arr : array, seps : string) : int
628
629       split(s : string, arr : array) : int
630
631       tokens(s : string, arr : array, seps : string) : int
632
633       tokens(s : string, arr : array) : int
634              The split function breaks the string s into  fields,  while  the
635              tokens function breaks the string into tokens.  A field consists
636              of all non-separator characters between two separator characters
637              or  the beginning or end of the string. Thus, a field may be the
638              empty string. A token is a maximal, non-empty substring not con‐
639              taining  a  separator  character.   The separator characters are
640              those given in the seps argument.  If seps is not provided,  the
641              default  value  is  " \t\n".  The functions return the number of
642              fields or tokens.
643
644              The fields and tokens are stored  in  the  argument  array.  The
645              array  must be string-valued and, if an index type is specified,
646              it must be int. The entries are indexed by consecutive integers,
647              starting  at  0.  Any values already stored in the array will be
648              either overwritten, or  still  be  present  after  the  function
649              returns.
650
651   I/O
652       print(...) : void
653              print(  expr, ... ) prints a string representation of each argu‐
654              ment in turn onto stdout, followed by a newline.
655
656       printf(fmt : string, ...) : int
657
658       printf(fd : int, fmt : string, ...) : int
659              prints the string resulting from formatting the  values  of  the
660              expressions following fmt according to the printf(3) format fmt.
661              Returns 0 on success.  By default, it prints on stdout.  If  the
662              optional  integer  fd  is  given,  output is written on the open
663              stream associated with fd.
664
665       scanf(fmt : string, ...) : int
666
667       scanf(fd : int, fmt : string, ...) : int
668              scans in values from an input stream according to  the  scanf(3)
669              format  fmt.   The  values are stored in the addresses following
670              fmt, addresses having the form &v,  where  v  is  some  declared
671              variable  of the correct type.  By default, it reads from stdin.
672              If the optional integer fd is given, input is read from the open
673              stream associated with fd.  Returns the number of items success‐
674              fully scanned.
675
676       openF(s : string, t : string) : int
677              opens the file s as an I/O stream. The string argument t  speci‐
678              fies  how  the file is opened. The arguments are the same as for
679              the C function fopen(3).  It returns  an  integer  denoting  the
680              stream, or -1 on error.
681
682              As  usual, streams 0, 1 and 2 are already open as stdin, stdout,
683              and stderr, respectively. Since gvpr may use stdin to  read  the
684              input graphs, the user should avoid using this stream.
685
686       closeF(fd : int) : int
687              closes the open stream denoted by the integer fd.  Streams  0, 1
688              and 2 cannot be closed.  Returns 0 on success.
689
690       readL(fd : int) : string
691              returns the next line read from the input stream fd. It  returns
692              the  empty string "" on end of file. Note that the newline char‐
693              acter is left in the returned string.
694
695   Math
696       exp(d : double) : double
697              returns e to the dth power.
698
699       log(d : double) : double
700              returns the natural log of d.
701
702       sqrt(d : double) : double
703              returns the square root of the double d.
704
705       pow(d : double, x : double) : double
706              returns d raised to the xth power.
707
708       cos(d : double) : double
709              returns the cosine of d.
710
711       sin(d : double) : double
712              returns the sine of d.
713
714       atan2(y : double, x : double) : double
715              returns the arctangent of y/x in the range -pi to pi.
716
717       MIN(y : double, x : double) : double
718              returns the minimum of y and x.
719
720       MAX(y : double, x : double) : double
721              returns the maximum of y and x.
722
723   Associative Arrays
724       # arr : int
725              returns the number of elements in the array arr.
726
727       idx in arr : int
728              returns 1 if a value has been set for index  idx  in  the  array
729              arr.  It returns 0 otherwise.
730
731       unset(v : array, idx) : int
732              removes  the  item  indexed  by  idx.  It  returns 1 if the item
733              existed, 0 otherwise.
734
735       unset(v : array) : void
736              re-initializes the array.
737
738   Miscellaneous
739       exit(v : int) : void
740              causes gvpr to exit with the exit code v.
741
742       system(cmd : string) : int
743              provides the standard C function system(3).  It executes cmd  in
744              the user's shell environment, and returns the exit status of the
745              shell.
746
747       rand() : double
748              returns a pseudo‐random double between 0 and 1.
749
750       srand() : int
751
752       srand(v : int) : int
753              sets a seed for the random number generator. The optional  argu‐
754              ment gives the seed; if it is omitted, the current time is used.
755              The previous seed value is  returned.  srand  should  be  called
756              before any calls to rand.
757
758       colorx(color : string, fmt : string) : string
759              translates  a  color from one format to another. The color argu‐
760              ment should be a color in one of the recognized string represen‐
761              tations. The fmt value should be one of "RGB", "RGBA", "HSV", or
762              "HSVA".  An empty string is returned on error.
763

BUILT‐IN VARIABLES

765       gvpr provides certain special, built‐in variables, whose values are set
766       automatically  by  gvpr  depending on the context. Except as noted, the
767       user cannot modify their values.
768
769       $ : obj_t
770              denotes the current object (node, edge, graph) depending on  the
771              context.  It is not available in BEGIN or END clauses.
772
773       $F : string
774              is the name of the current input file.
775
776       $G : graph_t
777              denotes  the  current graph being processed. It is not available
778              in BEGIN or END clauses.
779
780       $NG : graph_t
781              denotes the next graph to be processed. If $NG is NULL, the cur‐
782              rent  graph  $G  is the last graph. Note that if the input comes
783              from stdin, the last graph cannot be determined until the  input
784              pipe is closed.  It is not available in BEGIN or END clauses, or
785              if the -n flag is used.
786
787       $O : graph_t
788              denotes the output graph. Before graph traversal, it is initial‐
789              ized to the target graph. After traversal and any END_G actions,
790              if it refers to a non‐empty graph, that graph  is  printed  onto
791              the  output stream.  It is only valid in N, E and END_G clauses.
792              The output graph may be set by the user.
793
794       $T : graph_t
795              denotes the current target graph. It is a subgraph of $G and  is
796              available only in N, E and END_G clauses.
797
798       $tgtname : string
799              denotes  the name of the target graph.  By default, it is set to
800              "gvpr_result".  If used multiple times during the  execution  of
801              gvpr,  the name will be appended with an integer.  This variable
802              may be set by the user.
803
804       $tvroot : node_t
805              indicates the starting  node  for  a  (directed  or  undirected)
806              depth‐first or breadth‐first traversal of the graph (cf. $tvtype
807              below).  The default value is NULL for each input graph.   After
808              the  traversal  at  the  given root, if the value of $tvroot has
809              changed, a new traversal  will  begin  with  the  new  value  of
810              $tvroot. Also, set $tvnext below.
811
812       $tvnext : node_t
813              indicates  the next starting node for a (directed or undirected)
814              depth‐first or breadth‐first traversal of the graph (cf. $tvtype
815              below).  If a traversal finishes and the $tvroot but the $tvnext
816              has been set but not used, this node will be used  as  the  next
817              choice  for  $tvroot.   The default value is NULL for each input
818              graph.
819
820       $tvedge : edge_t
821              For BFS and DFS traversals, this is set  to  the  edge  used  to
822              arrive  at  the current node or edge. At the beginning of a tra‐
823              versal, or for other traversal types, the value is NULL.
824
825       $tvtype : tvtype_t
826              indicates how gvpr traverses a graph. It can only  take  one  of
827              the  constant  values  with  the  previx  "TV_" described below.
828              TV_flat is the default.
829
830              In the underlying graph library cgraph(3), edges  in  undirected
831              graphs  are  given an arbitrary direction. This is used for tra‐
832              versals, such as TV_fwd, requiring directed edges.
833
834       ARGC : int
835              denotes the number of arguments specified by the  -a  args  com‐
836              mand‐line argument.
837
838       ARGV : string array
839              denotes the array of arguments specified by the -a args command‐
840              line argument. The ith argument is given by ARGV[i].
841

BUILT‐IN CONSTANTS

843       There are several symbolic constants defined by gvpr.
844
845       NULL : obj_t
846              a null object reference, equivalent to 0.
847
848       TV_flat : tvtype_t
849              a simple, flat traversal, with graph objects  visited  in  seem‐
850              ingly arbitrary order.
851
852       TV_ne : tvtype_t
853              a traversal which first visits all of the nodes, then all of the
854              edges.
855
856       TV_en : tvtype_t
857              a traversal which first visits all of the edges, then all of the
858              nodes.
859
860       TV_dfs : tvtype_t
861       TV_postdfs : tvtype_t
862       TV_prepostdfs : tvtype_t
863              a  traversal  of  the  graph  using  a depth‐first search on the
864              underlying undirected graph.  To do  the  traversal,  gvpr  will
865              check  the  value of $tvroot. If this has the same value that it
866              had previously (at the start, the previous value is  initialized
867              to  NULL.),  gvpr  will  simply look for some unvisited node and
868              traverse its connected component. On the other hand, if  $tvroot
869              has changed, its connected component will be toured, assuming it
870              has not been previously visited or, if $tvroot is NULL, the tra‐
871              versal will stop. Note that using TV_dfs and $tvroot, it is pos‐
872              sible to create an infinite loop.
873
874              By default, the traversal is done in pre-order. That is, a  node
875              is  visited  before  all of its unvisited edges. For TV_postdfs,
876              all of a node's unvisited edges are visited before the node. For
877              TV_prepostdfs,  a node is visited twice, before and after all of
878              its unvisited edges.
879
880       TV_fwd : tvtype_t
881       TV_postfwd : tvtype_t
882       TV_prepostfwd : tvtype_t
883              A traversal of the graph using a depth‐first search on the graph
884              following  only  forward arcs.  The choice of roots for the tra‐
885              versal is the same as described for TV_dfs above.  The different
886              order  of visitation specified by TV_fwd, TV_postfwd and TV_pre‐
887              postfwd are the same as those specified by the analogous traver‐
888              sals TV_dfs, TV_postdfs and TV_prepostdfs.
889
890       TV_rev : tvtype_t
891       TV_postrev : tvtype_t
892       TV_prepostrev : tvtype_t
893              A traversal of the graph using a depth‐first search on the graph
894              following only reverse arcs.  The choice of roots for  the  tra‐
895              versal is the same as described for TV_dfs above.  The different
896              order of visitation specified by TV_rev, TV_postrev and  TV_pre‐
897              postrev are the same as those specified by the analogous traver‐
898              sals TV_dfs, TV_postdfs and TV_prepostdfs.
899
900       TV_bfs : tvtype_t
901              A traversal of the graph using a  breadth‐first  search  on  the
902              graph ignoring edge directions. See the item on TV_dfs above for
903              the role of $tvroot.
904

EXAMPLES

906              gvpr -i 'N[color=="blue"]' file.gv
907
908       Generate the node‐induced subgraph of all nodes with color blue.
909
910              gvpr -c 'N[color=="blue"]{color = "red"}' file.gv
911
912       Make all blue nodes red.
913
914              BEGIN { int n, e; int tot_n = 0; int tot_e = 0; }
915              BEG_G {
916                n = nNodes($G);
917                e = nEdges($G);
918                printf ("%d nodes %d edges %s\n", n, e, $G.name);
919                tot_n += n;
920                tot_e += e;
921              }
922              END { printf ("%d nodes %d edges total\n", tot_n, tot_e) }
923
924       Version of the program gc.
925
926              gvpr -c ""
927
928       Equivalent to nop.
929
930              BEG_G { graph_t g = graph ("merge", "S"); }
931              E {
932                node_t h = clone(g,$.head);
933                node_t t = clone(g,$.tail);
934                edge_t e = edge(t,h,"");
935                e.weight = e.weight + 1;
936              }
937              END_G { $O = g; }
938
939       Produces a  strict  version  of  the  input  graph,  where  the  weight
940       attribute  of an edge indicates how many edges from the input graph the
941       edge represents.
942
943              BEGIN {node_t n; int deg[]}
944              E{deg[head]++; deg[tail]++; }
945              END_G {
946                for (deg[n]) {
947                  printf ("deg[%s] = %d\n", n.name, deg[n]);
948                }
949              }
950
951       Computes the degrees of nodes with edges.
952
953              BEGIN {
954                int i, indent;
955                int seen[string];
956                void prInd (int cnt) {
957                  for (i = 0; i < cnt; i++) printf ("  ");
958                }
959              }
960              BEG_G {
961
962                 $tvtype = TV_prepostfwd;
963                 $tvroot = node($,ARGV[0]);
964              }
965              N {
966                if (seen[$.name]) indent--;
967                else {
968                  prInd(indent);
969                    print ($.name);
970                  seen[$.name] = 1;
971                  indent++;
972                }
973              }
974
975       Prints the depth-first traversal of the graph, starting with  the  node
976       whose name is ARGV[0], as an indented list.
977

ENVIRONMENT

979       GVPRPATH
980              Colon‐separated  list  of directories to be searched to find the
981              file specified by the -f option. gvpr has a default  list  built
982              in.  If  GVPRPATH  is  not defined, the default list is used. If
983              GVPRPATH starts with colon, the  list  is  formed  by  appending
984              GVPRPATH  to  the default list. If GVPRPATH ends with colon, the
985              list is formed by appending the default list to GVPRPATH. Other‐
986              wise, GVPRPATH is used for the list.
987
988       On  Windows systems, replace ``colon'' with ``semicolon'' in the previ‐
989       ous paragraph.
990

BUGS AND WARNINGS

992       Scripts should be careful deleting nodes  during  N{}  and  E{}  blocks
993       using  BFS  and  DFS  traversals  as these rely on stacks and queues of
994       nodes.
995
996       When the program is given as a command line argument, the  usual  shell
997       interpretation  takes place, which may affect some of the special names
998       in gvpr. To avoid this, it is  best  to  wrap  the  program  in  single
999       quotes.
1000
1001       If  string  constants  contain  pattern metacharacters that you want to
1002       escape to avoid pattern matching, two backslashes will probably be nec‐
1003       essary,  as  a  single backslash will be lost when the string is origi‐
1004       nally scanned. Usually, it is simpler to use strcmp  to  avoid  pattern
1005       matching.
1006
1007       As  of  24  April  2008, gvpr switched to using a new, underlying graph
1008       library, which uses the simpler model that there is only one copy of  a
1009       node,  not  one  copy  for  each subgraph logically containing it. This
1010       means that iterators such as nxtnode cannot traverse a  subgraph  using
1011       just  a node argument. For this reason, subgraph traversal requires new
1012       functions ending in "_sg", which also take  a  subgraph  argument.  The
1013       versions without that suffix will always traverse the root graph.
1014
1015       There  is a single global scope, except for formal function parameters,
1016       and even these can interfere with the type system. Also, the extent  of
1017       all  variables  is the entire life of the program.  It might be prefer‐
1018       able for scope to reflect the natural nesting of the  clauses,  or  for
1019       the  program to at least reset locally declared variables.  For now, it
1020       is advisable to use distinct names for all variables.
1021
1022       If a function ends with a complex statement, such as an  IF  statement,
1023       with  each  branch  doing  a return, type checking may fail.  Functions
1024       should use a return at the end.
1025
1026       The expr library does not support  string  values  of  (char*)0.   This
1027       means  we can't distinguish between "" and (char*)0 edge keys.  For the
1028       purposes of looking up and  creating  edges,  we  translate  ""  to  be
1029       (char*)0,  since this latter value is necessary in order to look up any
1030       edge with a matching head and tail.
1031
1032       Related to this, strings converted to integers act like char  pointers,
1033       getting  the  value  0  or  1  depending on whether the string consists
1034       solely of zeroes or not. Thus, the ((int)"2") evaluates to 1.
1035
1036       The language inherits the usual C problems such as dangling  references
1037       and the confusion between '=' and '=='.
1038

AUTHOR

1040       Emden R. Gansner <erg@research.att.com>
1041

SEE ALSO

1043       awk(1), gc(1), dot(1), nop(1), expr(3), cgraph(3)
1044
1045
1046
1047                                  4 May 2012                           GVPR(1)
Impressum