1GVPR(1)                     General Commands Manual                    GVPR(1)
2
3
4

NAME

6       gvpr - graph pattern scanning and processing language
7

SYNOPSIS

9       gvpr  [-icnqV?]   [ -o outfile ] [ -a args ] [ 'prog' | -f progfile ] [
10       files ]
11

DESCRIPTION

13       gvpr (previously known as gpr) is a graph  stream  editor  inspired  by
14       awk.  It copies input graphs to its output, possibly transforming their
15       structure and attributes, creating new graphs,  or  printing  arbitrary
16       information.   The  graph  model  is that provided by libcgraph(3).  In
17       particular, gvpr reads and writes graphs using the dot language.
18
19       Basically, gvpr traverses each input graph,  denoted  by  $G,  visiting
20       each  node  and  edge, matching it with the predicate‐action rules sup‐
21       plied in the input program.  The rules are  evaluated  in  order.   For
22       each  predicate  evaluating  to  true, the corresponding action is per‐
23       formed.  During the traversal, the current node or edge  being  visited
24       is denoted by $.
25
26       For  each  input graph, there is a target subgraph, denoted by $T, ini‐
27       tially empty and used to accumulate  chosen  entities,  and  an  output
28       graph,  $O,  used  for final processing and then written to output.  By
29       default, the output graph is the target graph.  The output graph can be
30       set in the program or, in a limited sense, on the command line.
31

OPTIONS

33       The following options are supported:
34
35       -a args
36              The  string args is split into whitespace‐separated tokens, with
37              the individual tokens available as strings in the  gvpr  program
38              as  ARGV[0],...,ARGV[ARGC-1].  Whitespace characters within sin‐
39              gle or double quoted substrings, or preceded by a backslash, are
40              ignored  as separators.  In general, a backslash character turns
41              off any special meaning of the following character.   Note  that
42              the tokens derived from multiple -a flags are concatenated.
43
44       -c     Use the source graph as the output graph.
45
46       -i     Derive  the  node‐induced subgraph extension of the output graph
47              in the context of its root graph.
48
49       -o outfile
50              Causes the output stream to be written to the specified file; by
51              default, output is written to stdout.
52
53       -f progfile
54              Use the contents of the specified file as the program to execute
55              on the input. If progfile contains a slash character,  the  name
56              is  taken  as the pathname of the file. Otherwise, gvpr will use
57              the directories specified in the environment  variable  GVPRPATH
58              to  look  for  the  file.  If -f is not given, gvpr will use the
59              first non‐option argument as the program.
60
61       -q     Turns off warning messages.
62
63       -n     Turns off graph read-ahead. By default, the variable $NG is  set
64              to  the  next graph to be processed. This requires a read of the
65              next graph before processing the current graph, which may  block
66              if  the  next graph is only generated in response to some action
67              pertaining to the processing of the current graph.
68
69       -V     Causes the program to print version information and exit.
70
71       -?     Causes the program to print usage information and exit.
72

OPERANDS

74       The following operand is supported:
75
76       files   Names of files containing 1 or more graphs in the dot language.
77               If  no  -f  option is given, the first name is removed from the
78               list and used as the input program. If the  list  of  files  is
79               empty, stdin will be used.
80

PROGRAMS

82       A  gvpr  program consists of a list of predicate‐action clauses, having
83       one of the forms:
84
85              BEGIN { action }
86
87              BEG_G { action }
88
89              N [ predicate ] { action }
90
91              E [ predicate ] { action }
92
93              END_G { action }
94
95              END { action }
96
97       A program can contain at most one of each of the BEGIN, END_G  and  END
98       clauses.   There  can  be  any number of BEG_G, N and E statements, the
99       first applied to graphs, the second  to  nodes,  the  third  to  edges.
100       These  are  separated  into  blocks,  a block consisting of an optional
101       BEG_G statement and all N and E statements up to the next BEG_G  state‐
102       ment, if any.  The top‐level semantics of a gvpr program are:
103
104              Evaluate the BEGIN clause, if any.
105              For each input graph G {
106                  For each block {
107                      Set G as the current graph and current object.
108                      Evaluate the BEG_G clause, if any.
109                      For each node and edge in G {
110                          Set the node or edge as the current object.
111                          Evaluate the N or E clauses, as appropriate.
112                      }
113                  }
114                  Set G as the current object.
115                  Evaluate the END_G clause, if any.
116              }
117              Evaluate the END clause, if any.
118
119       The  actions  of  the BEGIN, BEG_G, END_G and END clauses are performed
120       when the clauses are evaluated.  For N or E clauses, either the  predi‐
121       cate  or  action  may  be  omitted.   If  there is no predicate with an
122       action, the action is performed on every node or edge, as  appropriate.
123       If  there is no action and the predicate evaluates to true, the associ‐
124       ated node or edge is added to the target graph.
125
126       The blocks are evaluated in the order in which they  occur.   Within  a
127       block,  the  N  clauses  (E clauses, respectively) are evaluated in the
128       order in which the occur. Note, though, that within a  block,  N  or  E
129       clauses may be interlaced, depending on the traversal order.
130
131       Predicates  and  actions  are  sequences of statements in the C dialect
132       supported by the expr(3) library.  The only difference  between  predi‐
133       cates  and  actions is that the former must have a type that may inter‐
134       preted as either true or false.  Here the usual C  convention  is  fol‐
135       lowed, in which a non‐zero value is considered true. This would include
136       non‐empty strings and non‐empty references to nodes, edges,  etc.  How‐
137       ever, if a string can be converted to an integer, this value is used.
138
139       In  addition  to  the usual C base types (void, int, char, float, long,
140       unsigned and double), gvpr provides string as a synonym for char*,  and
141       the  graph‐based  types  node_t,  edge_t, graph_t and obj_t.  The obj_t
142       type can be viewed as a supertype of the other 3  concrete  types;  the
143       correct base type is maintained dynamically.  Besides these base types,
144       the only other supported type expressions are (associative) arrays.
145
146       Constants follow C syntax, but strings may be quoted with either  "..."
147       or '...'.  gvpr accepts C++ comments as well as cpp‐type comments.  For
148       the latter, if a line begins with a '#' character, the rest of the line
149       is ignored.
150
151       A statement can be a declaration of a function, a variable or an array,
152       or an executable statement. For declarations, there is a single  scope.
153       Array declarations have the form:
154
155               type array [ type0 ]
156
157       where   type0   is optional. If it is supplied, the parser will enforce
158       that all array subscripts have the specified type. If it  is  not  sup‐
159       plied,  objects of all types can be used as subscripts.  As in C, vari‐
160       ables and arrays must be declared. In particular, an  undeclared  vari‐
161       able will be interpreted as the name of an attribute of a node, edge or
162       graph, depending on the context.
163
164       Executable statements can be one of the following:
165
166              { [ statement ... ] }
167              expression                                              // commonly var = expression
168              if( expression ) statement [ else statement ]
169              for( expression ; expression ; expression ) statement
170              for( array [ var ]) statement
171              forr( array [ var ]) statement
172              while( expression ) statement
173              switch( expression ) case statements
174              break [ expression ]
175              continue [ expression ]
176              return [ expression ]
177       Items in brackets are optional.
178
179       In the second form of the for statement and  the  forr  statement,  the
180       variable  var  is  set  to each value used as an index in the specified
181       array and then the associated statement is evaluated. For  numeric  and
182       string  indices,  the  indices  are returned in increasing (decreasing)
183       numeric or lexicographic order for for (forr, respectively).  This  can
184       be used for sorting.
185
186       Function definitions can only appear in the BEGIN clause.
187
188       Expressions  include the usual C expressions.  String comparisons using
189       == and != treat the right hand operand as a pattern for the purpose  of
190       regular  expression  matching.   Patterns use ksh(1) file match pattern
191       syntax.  (For simple string equality, use the strcmp function.
192
193       gvpr will attempt to use an expression as a string or numeric value  as
194       appropriate.  Both  C-like casts and function templates will cause con‐
195       versions to be performed, if possible.
196
197       Expressions of graphical type (i.e., graph_t,  node_t,  edge_t,  obj_t)
198       may  be followed by a field reference in the form of .name. The result‐
199       ing value is the value of the attribute named name of the given object.
200       In  addition,  in certain contexts an undeclared, unmodified identifier
201       is taken to be an attribute name. Specifically, such identifiers denote
202       attributes  of  the  current  node  or  edge,  respectively, in N and E
203       clauses, and the current graph in BEG_G and END_G clauses.
204
205       As usual in the libcgraph(3) model, attributes are  string‐valued.   In
206       addition, gvpr supports certain pseudo‐attributes of graph objects, not
207       necessarily string‐valued. These reflect intrinsic  properties  of  the
208       graph objects and cannot be set by the user.
209
210       head : node_t
211              the head of an edge.
212
213       tail : node_t
214              the tail of an edge.
215
216       name : string
217              the  name of an edge, node or graph. The name of an edge has the
218              form "<tail‐name><edge‐op><head‐name>[<key>]",  where  <edge‐op>
219              is  "->"  or  "--" depending on whether the graph is directed or
220              not. The bracket part [<key>] only appears if  the  edge  has  a
221              non‐trivial key.
222
223       indegree : int
224              the indegree of a node.
225
226       outdegree : int
227              the outdegree of a node.
228
229       degree : int
230              the degree of a node.
231
232       X : douible
233              the  X  coordinate  of  a  node.  (Assumes  the  node  has a pos
234              attribute.)
235
236       Y : douible
237              the Y coordinate  of  a  node.  (Assumes  the  node  has  a  pos
238              attribute.)
239
240       root : graph_t
241              the root graph of an object. The root of a root graph is itself.
242
243       parent : graph_t
244              the  parent  graph  of a subgraph. The parent of a root graph is
245              NULL
246
247       n_edges : int
248              the number of edges in the graph
249
250       n_nodes : int
251              the number of nodes in the graph
252
253       directed : int
254              true (non‐zero) if the graph is directed
255
256       strict : int
257              true (non‐zero) if the graph is strict
258

BUILT‐IN FUNCTIONS

260       The following functions are built into gvpr. Those functions  returning
261       references to graph objects return NULL in case of failure.
262
263   Graphs and subgraph
264       graph(s : string, t : string) : graph_t
265              creates  a  graph whose name is s and whose type is specified by
266              the string t. Ignoring case, the characters U, D, S, N have  the
267              interpretation  undirected,  directed,  strict,  and non‐strict,
268              respectively. If t is empty, a  directed,  non‐strict  graph  is
269              generated.
270
271       subg(g : graph_t, s : string) : graph_t
272              creates  a  subgraph  in  graph  g  with name s. If the subgraph
273              already exists, it is returned.
274
275       isSubg(g : graph_t, s : string) : graph_t
276              returns the subgraph in graph g with name s, if  it  exists,  or
277              NULL otherwise.
278
279       fstsubg(g : graph_t) : graph_t
280              returns the first subgraph in graph g, or NULL if none exists.
281
282       nxtsubg(sg : graph_t) : graph_t
283              returns the next subgraph after sg, or NULL.
284
285       isDirect(g : graph_t) : int
286              returns true if and only if g is directed.
287
288       isStrict(g : graph_t) : int
289              returns true if and only if g is strict.
290
291       nNodes(g : graph_t) : int
292              returns the number of nodes in g.
293
294       nEdges(g : graph_t) : int
295              returns the number of edges in g.
296
297   Nodes
298       node(sg : graph_t, s : string) : node_t
299              creates  a  node  in  graph  g of name s. If such a node already
300              exists, it is returned.
301
302       subnode(sg : graph_t, n : node_t) : node_t
303              inserts the node n into the subgraph g. Returns the node.
304
305       fstnode(g : graph_t) : node_t
306              returns the first node in graph g, or NULL if none exists.
307
308       nxtnode(n : node_t) : node_t
309              returns the next node after n in the root graph, or NULL.
310
311       nxtnode_sg(sg : graph_t, n : node_t) : node_t
312              returns the next node after n in sg, or NULL.
313
314       isNode(sg : graph_t, s : string) : node_t
315              looks for a node in (sub)graph sg of name  s.  If  such  a  node
316              exists, it is returned. Otherwise, NULL is returned.
317
318       isSubnode(sg : graph_t, n : node_t) : int
319              returns  non-zero  if node n is in (sub)graph sg, or zero other‐
320              wise.
321
322       indegreeOf(sg : graph_t, n : node_t) : int
323              returns the indegree of node n in (sub)graph sg.
324
325       outdegreeOf(sg : graph_t, n : node_t) : int
326              returns the outdegree of node n in (sub)graph sg.
327
328       degreeOf(sg : graph_t, n : node_t) : int
329              returns the degree of node n in (sub)graph sg.
330
331   Edges
332       edge(t : node_t, h : node_t, s : string) : edge_t
333              creates an edge with tail node t, head node h and name s in  the
334              root  graph. If the graph is undirected, the distinction between
335              head and tail nodes is unimportant.  If  such  an  edge  already
336              exists, it is returned.
337
338       edge_sg(sg : graph_t, t : node_t, h : node_t, s : string) : edge_t
339              creates  an  edge  with  tail  node t, head node h and name s in
340              (sub)graph sg (and all parent graphs). If  the  graph  is  undi‐
341              rected,  the distinction between head and tail nodes is unimpor‐
342              tant.  If such an edge already exists, it is returned.
343
344       subedge(g : graph_t, e : edge_t) : edge_t
345              inserts the edge e into the subgraph g. Returns the edge.
346
347       isEdge(t : node_t, h : node_t, s : string) : edge_t
348              looks for an edge with tail node t, head node h and name  s.  If
349              the  graph  is undirected, the distinction between head and tail
350              nodes is unimportant.  If such an edge exists, it  is  returned.
351              Otherwise, NULL is returned.
352
353       isEdge_sg(sg : graph_t, t : node_t, h : node_t, s : string) : edge_t
354              looks  for  an  edge with tail node t, head node h and name s in
355              (sub)graph sg. If  the  graph  is  undirected,  the  distinction
356              between  head  and  tail  nodes is unimportant.  If such an edge
357              exists, it is returned. Otherwise, NULL is returned.
358
359       isSubedge(g : graph_t, e : edge_t) : int
360              returns non-zero if edge e is in (sub)graph sg, or  zero  other‐
361              wise.
362
363       fstout(n : node_t) : edge_t
364              returns the first outedge of node n in the root graph.
365
366       fstout_sg(sg : graph_t, n : node_t) : edge_t
367              returns the first outedge of node n in (sub)graph sg.
368
369       nxtout(e : edge_t) : edge_t
370              returns the next outedge after e in the root graph.
371
372       nxtout_sg(sg : graph_t, e : edge_t) : edge_t
373              returns the next outedge after e in graph sg.
374
375       fstin(n : node_t) : edge_t
376              returns the first inedge of node n in the root graph.
377
378       fstin_sg(sg : graph_t, n : node_t) : edge_t
379              returns the first inedge of node n in graph sg.
380
381       nxtin(e : edge_t) : edge_t
382              returns the next inedge after e in the root graph.
383
384       nxtin_sg(sg : graph_t, e : edge_t) : edge_t
385              returns the next inedge after e in graph sg.
386
387       fstedge(n : node_t) : edge_t
388              returns the first edge of node n in the root graph.
389
390       fstedge_sg(sg : graph_t, n : node_t) : edge_t
391              returns the first edge of node n in graph sg.
392
393       nxtedge(e : edge_t, node_t) : edge_t
394              returns the next edge after e in the root graph.
395
396       nxtedge_sg(sg : graph_t, e : edge_t, node_t) : edge_t
397              returns the next edge after e in the graph sg.
398
399       opp(e : edge_t, node_t) : node_t
400              returns  the node on the edge e not equal to n.  Returns NULL if
401              n is not a node of e.  This can be useful when using fstedge and
402              nxtedge to enumerate the neighbors of n.
403
404   Graph I/O
405       write(g : graph_t) : void
406              prints g in dot format onto the output stream.
407
408       writeG(g : graph_t, fname : string) : void
409              prints g in dot format into the file fname.
410
411       fwriteG(g : graph_t, fd : int) : void
412              prints g in dot format onto the open stream denoted by the inte‐
413              ger fd.
414
415       readG(fname : string) : graph_t
416              returns a graph read from the file fname. The graph should be in
417              dot format. If no graph can be read, NULL is returned.
418
419       freadG(fd : int) : graph_t
420              returns  the  next  graph read from the open stream fd.  Returns
421              NULL at end of file.
422
423   Graph miscellany
424       delete(g : graph_t, x : obj_t) : void
425              deletes object x from graph g.  If g is NULL, the function  uses
426              the  root graph of x.  If x is a graph or subgraph, it is closed
427              unless x is locked.
428
429       isIn(g : graph_t, x : obj_t) : int
430              returns true if x is in subgraph g.
431
432       cloneG(g : graph_t, s : string) : graph_t
433              creates a clone of graph g with name of s.  If s is "", the cre‐
434              ated graph has the same name as g.
435
436       clone(g : graph_t, x : obj_t) : obj_t
437              creates  a clone of object x in graph g.  In particular, the new
438              object has the same name/value attributes and structure  as  the
439              original  object.   If  an object with the same key as x already
440              exists, its attributes are overlaid by those of x and the object
441              is  returned.   If an edge is cloned, both endpoints are implic‐
442              itly cloned.  If a graph is cloned, all nodes,  edges  and  sub‐
443              graphs  are  implicitly cloned.  If x is a graph, g may be NULL,
444              in which case the cloned object will be a  new  root  graph.  In
445              this case, the call is equivalent to cloneG(x,"").
446
447       copy(g : graph_t, x : obj_t) : obj_t
448              creates  a copy of object x in graph g, where the new object has
449              the same name/value attributes as the original  object.   If  an
450              object with the same key as x already exists, its attributes are
451              overlaid by those of x and the object is  returned.   Note  that
452              this  is  a  shallow  copy.  If x is a graph, none of its nodes,
453              edges or subgraphs are copied into the new graph.  If  x  is  an
454              edge,  the  endpoints are created if necessary, but they are not
455              cloned.  If x is a graph, g may  be  NULL,  in  which  case  the
456              cloned object will be a new root graph.
457
458       copyA(src : obj_t, tgt : obj_t) : int
459              copies  the  attributes of object src to object tgt, overwriting
460              any attribute values tgt may initially have.
461
462       induce(g : graph_t) : void
463              extends g to its node‐induced subgraph  extension  in  its  root
464              graph.
465
466       hasAttr(src : obj_t, name : string) : int
467              returns  non-zero  if  object src has an attribute whose name is
468              name. It returns 0 otherwise.
469
470       isAttr(g : graph_t, kind : string, name : string) : int
471              returns non-zero if an attribute name has been defined in g  for
472              objects  of  the  given kind. For nodes, edges, and graphs, kind
473              should be "N", "E", and "G", respectively.  It returns 0  other‐
474              wise.
475
476       aget(src : obj_t, name : string) : string
477              returns  the value of attribute name in object src. This is use‐
478              ful for those cases when name conflicts with one of the keywords
479              such  as  "head"  or  "root".   If  the  attribute  has not been
480              declared in the graph, the function will initialize  it  with  a
481              default  value  of "". To avoid this, one should use the hasAttr
482              or isAttr function to check that the attribute exists.
483
484       aset(src : obj_t, name : string, value : string) : int
485              sets the value  of  attribute  name  in  object  src  to  value.
486              Returns 0 on success, non‐zero on failure. See aget above.
487
488       getDflt(g : graph_t, kind : string, name : string) : string
489              returns  the  default value of attribute name in objects in g of
490              the given kind. For nodes, edges, and  graphs,  kind  should  be
491              "N",  "E", and "G", respectively.  If the attribute has not been
492              declared in the graph, the function will initialize  it  with  a
493              default  value  of  "". To avoid this, one should use the isAttr
494              function to check that the attribute exists.
495
496       setDflt(g : graph_t, kind : string, name : string, value  :  string)  :
497       int
498              sets  the default value of attribute name to value in objects in
499              g of the given kind. For nodes, edges, and graphs,  kind  should
500              be  "N", "E", and "G", respectively.  Returns 0 on success, non‐
501              zero on failure. See getDflt above.
502
503       fstAttr(g : graph_t, kind : string) : string
504              returns the name of the first attribute of objects in g  of  the
505              given  kind.  For  nodes, edges, and graphs, kind should be "N",
506              "E", and "G", respectively.  If there  are  no  attributes,  the
507              string "" is returned.
508
509       nxtAttr(g : graph_t, kind : string, name : string) : string
510              returns  the  name  of the next attribute of objects in g of the
511              given kind after the attribute name.  The argument name must  be
512              the  name  of  an  existing  attribute; it will typically be the
513              return value of an previous call to  fstAttr  or  nxtAttr.   For
514              nodes,  edges,  and  graphs,  kind  should be "N", "E", and "G",
515              respectively.  If there are no attributes left, the string "" is
516              returned.
517
518       compOf(g : graph_t, n : node_t) : graph_t
519              returns  the  connected component of the graph g containing node
520              n, as a subgraph of g. The subgraph only contains the nodes. One
521              can  use induce to add the edges. The function fails and returns
522              NULL if n is not in g. Connectivity is based on  the  underlying
523              undirected graph of g.
524
525       kindOf(obj : obj_t) : string
526              returns an indication of the type of obj.  For nodes, edges, and
527              graphs, it returns "N", "E", and "G", respectively.
528
529       lock(g : graph_t, v : int) : int
530              implements graph locking on root graphs. If  the  integer  v  is
531              positive,  the  graph is set so that future calls to delete have
532              no immediate effect.  If v is zero, the graph  is  unlocked.  If
533              there  has  been a call to delete the graph while it was locked,
534              the graph is closed.  If v is negative, nothing is done.  In all
535              cases, the previous lock value is returned.
536
537   Strings
538       sprintf(fmt : string, ...) : string
539              returns  the  string resulting from formatting the values of the
540              expressions occurring after fmt according to the printf(3)  for‐
541              mat fmt
542
543       gsub(str : string, pat : string) : string
544
545       gsub(str : string, pat : string, repl : string) : string
546              returns str with all substrings matching pat deleted or replaced
547              by repl, respectively.
548
549       sub(str : string, pat : string) : string
550
551       sub(str : string, pat : string, repl : string) : string
552              returns str with the leftmost substring matching pat deleted  or
553              replaced  by  repl, respectively. The characters '^' and '$' may
554              be used at the beginning and end, respectively, of pat to anchor
555              the pattern to the beginning or end of str.
556
557       substr(str : string, idx : int) : string
558
559       substr(str : string, idx : int, len : int) : string
560              returns the substring of str starting at position idx to the end
561              of the string or of length len, respectively.   Indexing  starts
562              at  0.  If  idx is negative or idx is greater than the length of
563              str, a fatal error occurs. Similarly, in the second case, if len
564              is  negative  or  idx + len is greater than the length of str, a
565              fatal error occurs.
566
567       strcmp(s1 : string, s2 : string) : int
568              provides the standard C function strcmp(3).
569
570       length(s : string) : int
571              returns the length of string s.
572
573       index(s : string, t : string) : int
574
575       rindex(s : string, t : string) : int
576              returns the index of the character in string s where  the  left‐
577              most  (rightmost)  copy  of string t can be found, or -1 if t is
578              not a substring of s.
579
580       match(s : string, p : string) : int
581              returns the index of the character in string s where  the  left‐
582              most match of pattern p can be found, or -1 if no substring of s
583              matches p.
584
585       toupper(s : string) : string
586              returns a version of s with the alphabetic characters  converted
587              to upper-case.
588
589       tolower(s : string) : string
590              returns  a version of s with the alphabetic characters converted
591              to lower-case.
592
593       canon(s : string) : string
594              returns a version of s appropriate to be used as  an  identifier
595              in a dot file.
596
597       html(g : graph_t, s : string) : string
598              returns  a  ``magic'' version  of s as an HTML string. This will
599              typically be used to  attach  an  HTML-like  label  to  a  graph
600              object. Note that the returned string lives in g. In particular,
601              it will be freed when g is closed, and to act as an HTML string,
602              it  has  to  be used with an object of g. In addition, note that
603              the angle bracket quotes should not be part of s. These will  be
604              added if g is written in concrete DOT format.
605
606       ishtml(s : string) : int
607              returns non-zero if and only if s is an HTML string.
608
609       xOf(s : string) : string
610              returns the string "x" if s has the form "x,y", where both x and
611              y are numeric.
612
613       yOf(s : string) : string
614              returns the string "y" if s has the form "x,y", where both x and
615              y are numeric.
616
617       llOf(s : string) : string
618              returns    the    string   "llx,lly"   if   s   has   the   form
619              "llx,lly,urx,ury", where all of  llx,  lly,  urx,  and  ury  are
620              numeric.
621
622       urOf(s)
623              urOf(s  : string) : string returns the string "urx,ury" if s has
624              the form "llx,lly,urx,ury", where all of llx, lly, urx, and  ury
625              are numeric.
626
627       sscanf(s : string, fmt : string, ...) : int
628              scans the string s, extracting values according to the sscanf(3)
629              format fmt.  The values are stored in  the  addresses  following
630              fmt,  addresses  having  the  form  &v, where v is some declared
631              variable of the correct type.  Returns the number of items  suc‐
632              cessfully scanned.
633
634       split(s : string, arr : array, seps : string) : int
635
636       split(s : string, arr : array) : int
637
638       tokens(s : string, arr : array, seps : string) : int
639
640       tokens(s : string, arr : array) : int
641              The  split  function  breaks the string s into fields, while the
642              tokens function breaks the string into tokens.  A field consists
643              of all non-separator characters between two separator characters
644              or the beginning or end of the string. Thus, a field may be  the
645              empty string. A token is a maximal, non-empty substring not con‐
646              taining a separator character.   The  separator  characters  are
647              those  given in the seps argument.  If seps is not provided, the
648              default value is " \t\n".  The functions return  the  number  of
649              fields or tokens.
650
651              The  fields  and  tokens  are  stored in the argument array. The
652              array must be string-valued and have int as its index type.  The
653              entries  are indexed by consecutive integers, starting at 0. Any
654              values already stored in the array will be  either  overwritten,
655              or still be present after the function returns.
656
657   I/O
658       print(...) : void
659              print(  expr, ... ) prints a string representation of each argu‐
660              ment in turn onto stdout, followed by a newline.
661
662       printf(fmt : string, ...) : int
663
664       printf(fd : int, fmt : string, ...) : int
665              prints the string resulting from formatting the  values  of  the
666              expressions following fmt according to the printf(3) format fmt.
667              Returns 0 on success.  By default, it prints on stdout.  If  the
668              optional  integer  fd  is  given,  output is written on the open
669              stream associated with fd.
670
671       scanf(fmt : string, ...) : int
672
673       scanf(fd : int, fmt : string, ...) : int
674              scans in values from an input stream according to  the  scanf(3)
675              format  fmt.   The  values are stored in the addresses following
676              fmt, addresses having the form &v,  where  v  is  some  declared
677              variable  of the correct type.  By default, it reads from stdin.
678              If the optional integer fd is given, input is read from the open
679              stream associated with fd.  Returns the number of items success‐
680              fully scanned.
681
682       openF(s : string, t : string) : int
683              opens the file s as an I/O stream. The string argument t  speci‐
684              fies  how  the file is opened. The arguments are the same as for
685              the C function fopen(3).  It returns  an  integer  denoting  the
686              stream, or -1 on error.
687
688              As  usual, streams 0, 1 and 2 are already open as stdin, stdout,
689              and stderr, respectively. Since gvpr may use stdin to  read  the
690              input graphs, the user should avoid using this stream.
691
692       closeF(fd : int) : int
693              closes the open stream denoted by the integer fd.  Streams  0, 1
694              and 2 cannot be closed.  Returns 0 on success.
695
696       readL(fd : int) : string
697              returns the next line read from the input stream fd. It  returns
698              the  empty string "" on end of file. Note that the newline char‐
699              acter is left in the returned string.
700
701   Math
702       exp(d : double) : double
703              returns e to the dth power.
704
705       log(d : double) : double
706              returns the natural log of d.
707
708       sqrt(d : double) : double
709              returns the square root of the double d.
710
711       pow(d : double, x : double) : double
712              returns d raised to the xth power.
713
714       cos(d : double) : double
715              returns the cosine of d.
716
717       sin(d : double) : double
718              returns the sine of d.
719
720       atan2(y : double, x : double) : double
721              returns the arctangent of y/x in the range -pi to pi.
722
723       MIN(y : double, x : double) : double
724              returns the minimum of y and x.
725
726       MAX(y : double, x : double) : double
727              returns the maximum of y and x.
728
729   Associative Arrays
730       # arr : int
731              returns the number of elements in the array arr.
732
733       idx in arr : int
734              returns 1 if a value has been set for index  idx  in  the  array
735              arr.  It returns 0 otherwise.
736
737       unset(v : array, idx) : int
738              removes  the  item  indexed  by  idx.  It  returns 1 if the item
739              existed, 0 otherwise.
740
741       unset(v : array) : void
742              re-initializes the array.
743
744   Miscellaneous
745       exit(v : int) : void
746              causes gvpr to exit with the exit code v.
747
748       system(cmd : string) : int
749              provides the standard C function system(3).  It executes cmd  in
750              the user's shell environment, and returns the exit status of the
751              shell.
752
753       rand() : double
754              returns a pseudo‐random double between 0 and 1.
755
756       srand() : int
757
758       srand(v : int) : int
759              sets a seed for the random number generator. The optional  argu‐
760              ment gives the seed; if it is omitted, the current time is used.
761              The previous seed value is  returned.  srand  should  be  called
762              before any calls to rand.
763
764       colorx(color : string, fmt : string) : string
765              translates  a  color from one format to another. The color argu‐
766              ment should be a color in one of the recognized string represen‐
767              tations. The fmt value should be one of "RGB", "RGBA", "HSV", or
768              "HSVA".  An empty string is returned on error.
769

BUILT‐IN VARIABLES

771       gvpr provides certain special, built‐in variables, whose values are set
772       automatically  by  gvpr  depending on the context. Except as noted, the
773       user cannot modify their values.
774
775       $ : obj_t
776              denotes the current object (node, edge, graph) depending on  the
777              context.  It is not available in BEGIN or END clauses.
778
779       $F : string
780              is the name of the current input file.
781
782       $G : graph_t
783              denotes  the  current graph being processed. It is not available
784              in BEGIN or END clauses.
785
786       $NG : graph_t
787              denotes the next graph to be processed. If $NG is NULL, the cur‐
788              rent  graph  $G  is the last graph. Note that if the input comes
789              from stdin, the last graph cannot be determined until the  input
790              pipe is closed.  It is not available in BEGIN or END clauses, or
791              if the -n flag is used.
792
793       $O : graph_t
794              denotes the output graph. Before graph traversal, it is initial‐
795              ized to the target graph. After traversal and any END_G actions,
796              if it refers to a non‐empty graph, that graph  is  printed  onto
797              the  output stream.  It is only valid in N, E and END_G clauses.
798              The output graph may be set by the user.
799
800       $T : graph_t
801              denotes the current target graph. It is a subgraph of $G and  is
802              available only in N, E and END_G clauses.
803
804       $tgtname : string
805              denotes  the name of the target graph.  By default, it is set to
806              "gvpr_result".  If used multiple times during the  execution  of
807              gvpr,  the name will be appended with an integer.  This variable
808              may be set by the user.
809
810       $tvroot : node_t
811              indicates the starting  node  for  a  (directed  or  undirected)
812              depth‐first or breadth‐first traversal of the graph (cf. $tvtype
813              below).  The default value is NULL for each input graph.   After
814              the  traversal  at  the  given root, if the value of $tvroot has
815              changed, a new traversal  will  begin  with  the  new  value  of
816              $tvroot. Also, set $tvnext below.
817
818       $tvnext : node_t
819              indicates  the next starting node for a (directed or undirected)
820              depth‐first or breadth‐first traversal of the graph (cf. $tvtype
821              below).   If  a  traversal finishes and the $tvroot has not been
822              reset but the $tvnext has been set but not used, this node  will
823              be  used  as  the next choice for $tvroot.  The default value is
824              NULL for each input graph.
825
826       $tvedge : edge_t
827              For BFS and DFS traversals, this is set  to  the  edge  used  to
828              arrive  at  the current node or edge. At the beginning of a tra‐
829              versal, or for other traversal types, the value is NULL.
830
831       $tvtype : tvtype_t
832              indicates how gvpr traverses a graph. It can only  take  one  of
833              the  constant  values  with  the  previx  "TV_" described below.
834              TV_flat is the default.
835
836              In the underlying graph library cgraph(3), edges  in  undirected
837              graphs  are  given an arbitrary direction. This is used for tra‐
838              versals, such as TV_fwd, requiring directed edges.
839
840       ARGC : int
841              denotes the number of arguments specified by the  -a  args  com‐
842              mand‐line argument.
843
844       ARGV : string array
845              denotes the array of arguments specified by the -a args command‐
846              line argument. The ith argument is given by ARGV[i].
847

BUILT‐IN CONSTANTS

849       There are several symbolic constants defined by gvpr.
850
851       NULL : obj_t
852              a null object reference, equivalent to 0.
853
854       TV_flat : tvtype_t
855              a simple, flat traversal, with graph objects  visited  in  seem‐
856              ingly arbitrary order.
857
858       TV_ne : tvtype_t
859              a traversal which first visits all of the nodes, then all of the
860              edges.
861
862       TV_en : tvtype_t
863              a traversal which first visits all of the edges, then all of the
864              nodes.
865
866       TV_dfs : tvtype_t
867       TV_postdfs : tvtype_t
868       TV_prepostdfs : tvtype_t
869              a  traversal  of  the  graph  using  a depth‐first search on the
870              underlying undirected graph.  To do  the  traversal,  gvpr  will
871              check  the  value of $tvroot. If this has the same value that it
872              had previously (at the start, the previous value is  initialized
873              to  NULL.),  gvpr  will  simply look for some unvisited node and
874              traverse its connected component. On the other hand, if  $tvroot
875              has changed, its connected component will be toured, assuming it
876              has not been previously visited or, if $tvroot is NULL, the tra‐
877              versal will stop. Note that using TV_dfs and $tvroot, it is pos‐
878              sible to create an infinite loop.
879
880              By default, the traversal is done in pre-order. That is, a  node
881              is  visited  before  all of its unvisited edges. For TV_postdfs,
882              all of a node's unvisited edges are visited before the node. For
883              TV_prepostdfs,  a node is visited twice, before and after all of
884              its unvisited edges.
885
886       TV_fwd : tvtype_t
887       TV_postfwd : tvtype_t
888       TV_prepostfwd : tvtype_t
889              A traversal of the graph using a depth‐first search on the graph
890              following  only  forward arcs.  The choice of roots for the tra‐
891              versal is the same as described for TV_dfs above.  The different
892              order  of visitation specified by TV_fwd, TV_postfwd and TV_pre‐
893              postfwd are the same as those specified by the analogous traver‐
894              sals TV_dfs, TV_postdfs and TV_prepostdfs.
895
896       TV_rev : tvtype_t
897       TV_postrev : tvtype_t
898       TV_prepostrev : tvtype_t
899              A traversal of the graph using a depth‐first search on the graph
900              following only reverse arcs.  The choice of roots for  the  tra‐
901              versal is the same as described for TV_dfs above.  The different
902              order of visitation specified by TV_rev, TV_postrev and  TV_pre‐
903              postrev are the same as those specified by the analogous traver‐
904              sals TV_dfs, TV_postdfs and TV_prepostdfs.
905
906       TV_bfs : tvtype_t
907              A traversal of the graph using a  breadth‐first  search  on  the
908              graph ignoring edge directions. See the item on TV_dfs above for
909              the role of $tvroot.
910

EXAMPLES

912              gvpr -i 'N[color=="blue"]' file.gv
913
914       Generate the node‐induced subgraph of all nodes with color blue.
915
916              gvpr -c 'N[color=="blue"]{color = "red"}' file.gv
917
918       Make all blue nodes red.
919
920              BEGIN { int n, e; int tot_n = 0; int tot_e = 0; }
921              BEG_G {
922                n = nNodes($G);
923                e = nEdges($G);
924                printf ("%d nodes %d edges %s\n", n, e, $G.name);
925                tot_n += n;
926                tot_e += e;
927              }
928              END { printf ("%d nodes %d edges total\n", tot_n, tot_e) }
929
930       Version of the program gc.
931
932              gvpr -c ""
933
934       Equivalent to nop.
935
936              BEG_G { graph_t g = graph ("merge", "S"); }
937              E {
938                node_t h = clone(g,$.head);
939                node_t t = clone(g,$.tail);
940                edge_t e = edge(t,h,"");
941                e.weight = e.weight + 1;
942              }
943              END_G { $O = g; }
944
945       Produces a  strict  version  of  the  input  graph,  where  the  weight
946       attribute  of an edge indicates how many edges from the input graph the
947       edge represents.
948
949              BEGIN {node_t n; int deg[]}
950              E{deg[head]++; deg[tail]++; }
951              END_G {
952                for (deg[n]) {
953                  printf ("deg[%s] = %d\n", n.name, deg[n]);
954                }
955              }
956
957       Computes the degrees of nodes with edges.
958
959              BEGIN {
960                int i, indent;
961                int seen[string];
962                void prInd (int cnt) {
963                  for (i = 0; i < cnt; i++) printf ("  ");
964                }
965              }
966              BEG_G {
967
968                 $tvtype = TV_prepostfwd;
969                 $tvroot = node($,ARGV[0]);
970              }
971              N {
972                if (seen[$.name]) indent--;
973                else {
974                  prInd(indent);
975                    print ($.name);
976                  seen[$.name] = 1;
977                  indent++;
978                }
979              }
980
981       Prints the depth-first traversal of the graph, starting with  the  node
982       whose name is ARGV[0], as an indented list.
983

ENVIRONMENT

985       GVPRPATH
986              Colon‐separated  list  of directories to be searched to find the
987              file specified by the -f option. gvpr has a default  list  built
988              in.  If  GVPRPATH  is  not defined, the default list is used. If
989              GVPRPATH starts with colon, the  list  is  formed  by  appending
990              GVPRPATH  to  the default list. If GVPRPATH ends with colon, the
991              list is formed by appending the default list to GVPRPATH. Other‐
992              wise, GVPRPATH is used for the list.
993
994       On  Windows systems, replace ``colon'' with ``semicolon'' in the previ‐
995       ous paragraph.
996

BUGS AND WARNINGS

998       Scripts should be careful deleting nodes  during  N{}  and  E{}  blocks
999       using  BFS  and  DFS  traversals  as these rely on stacks and queues of
1000       nodes.
1001
1002       When the program is given as a command line argument, the  usual  shell
1003       interpretation  takes place, which may affect some of the special names
1004       in gvpr. To avoid this, it is  best  to  wrap  the  program  in  single
1005       quotes.
1006
1007       If  string  constants  contain  pattern metacharacters that you want to
1008       escape to avoid pattern matching, two backslashes will probably be nec‐
1009       essary,  as  a  single backslash will be lost when the string is origi‐
1010       nally scanned. Usually, it is simpler to use strcmp  to  avoid  pattern
1011       matching.
1012
1013       As  of  24  April  2008, gvpr switched to using a new, underlying graph
1014       library, which uses the simpler model that there is only one copy of  a
1015       node,  not  one  copy  for  each subgraph logically containing it. This
1016       means that iterators such as nxtnode cannot traverse a  subgraph  using
1017       just  a node argument. For this reason, subgraph traversal requires new
1018       functions ending in "_sg", which also take  a  subgraph  argument.  The
1019       versions without that suffix will always traverse the root graph.
1020
1021       There  is a single global scope, except for formal function parameters,
1022       and even these can interfere with the type system. Also, the extent  of
1023       all  variables  is the entire life of the program.  It might be prefer‐
1024       able for scope to reflect the natural nesting of the  clauses,  or  for
1025       the  program to at least reset locally declared variables.  For now, it
1026       is advisable to use distinct names for all variables.
1027
1028       If a function ends with a complex statement, such as an  IF  statement,
1029       with  each  branch  doing  a return, type checking may fail.  Functions
1030       should use a return at the end.
1031
1032       The expr library does not support  string  values  of  (char*)0.   This
1033       means  we can't distinguish between "" and (char*)0 edge keys.  For the
1034       purposes of looking up and  creating  edges,  we  translate  ""  to  be
1035       (char*)0,  since this latter value is necessary in order to look up any
1036       edge with a matching head and tail.
1037
1038       Related to this, strings converted to integers act like char  pointers,
1039       getting  the  value  0  or  1  depending on whether the string consists
1040       solely of zeroes or not. Thus, the ((int)"2") evaluates to 1.
1041
1042       The language inherits the usual C problems such as dangling  references
1043       and the confusion between '=' and '=='.
1044

AUTHOR

1046       Emden R. Gansner <erg@research.att.com>
1047

SEE ALSO

1049       awk(1), gc(1), dot(1), nop(1), expr(3), cgraph(3)
1050
1051
1052
1053                                29 August 2013                         GVPR(1)
Impressum