1GVPR(1) General Commands Manual GVPR(1)
2
3
4
6 gvpr - graph pattern scanning and processing language
7
9 gvpr [-icnqV?] [ -o outfile ] [ -a args ] [ 'prog' | -f progfile ] [
10 files ]
11
13 gvpr (previously known as gpr) is a graph stream editor inspired by
14 awk. It copies input graphs to its output, possibly transforming their
15 structure and attributes, creating new graphs, or printing arbitrary
16 information. The graph model is that provided by libcgraph(3). In
17 particular, gvpr reads and writes graphs using the dot language.
18
19 Basically, gvpr traverses each input graph, denoted by $G, visiting
20 each node and edge, matching it with the predicate‐action rules sup‐
21 plied in the input program. The rules are evaluated in order. For
22 each predicate evaluating to true, the corresponding action is per‐
23 formed. During the traversal, the current node or edge being visited
24 is denoted by $.
25
26 For each input graph, there is a target subgraph, denoted by $T, ini‐
27 tially empty and used to accumulate chosen entities, and an output
28 graph, $O, used for final processing and then written to output. By
29 default, the output graph is the target graph. The output graph can be
30 set in the program or, in a limited sense, on the command line.
31
33 The following options are supported:
34
35 -a args
36 The string args is split into whitespace‐separated tokens, with
37 the individual tokens available as strings in the gvpr program
38 as ARGV[0],...,ARGV[ARGC-1]. Whitespace characters within sin‐
39 gle or double quoted substrings, or preceded by a backslash, are
40 ignored as separators. In general, a backslash character turns
41 off any special meaning of the following character. Note that
42 the tokens derived from multiple -a flags are concatenated.
43
44 -c Use the source graph as the output graph.
45
46 -i Derive the node‐induced subgraph extension of the output graph
47 in the context of its root graph.
48
49 -o outfile
50 Causes the output stream to be written to the specified file; by
51 default, output is written to stdout.
52
53 -f progfile
54 Use the contents of the specified file as the program to execute
55 on the input. If progfile contains a slash character, the name
56 is taken as the pathname of the file. Otherwise, gvpr will use
57 the directories specified in the environment variable GVPRPATH
58 to look for the file. If -f is not given, gvpr will use the
59 first non‐option argument as the program.
60
61 -q Turns off warning messages.
62
63 -n Turns off graph read-ahead. By default, the variable $NG is set
64 to the next graph to be processed. This requires a read of the
65 next graph before processing the current graph, which may block
66 if the next graph is only generated in response to some action
67 pertaining to the processing of the current graph.
68
69 -V Causes the program to print version information and exit.
70
71 -? Causes the program to print usage information and exit.
72
74 The following operand is supported:
75
76 files Names of files containing 1 or more graphs in the dot language.
77 If no -f option is given, the first name is removed from the
78 list and used as the input program. If the list of files is
79 empty, stdin will be used.
80
82 A gvpr program consists of a list of predicate‐action clauses, having
83 one of the forms:
84
85 BEGIN { action }
86
87 BEG_G { action }
88
89 N [ predicate ] { action }
90
91 E [ predicate ] { action }
92
93 END_G { action }
94
95 END { action }
96
97 A program can contain at most one of each of the BEGIN, END_G and END
98 clauses. There can be any number of BEG_G, N and E statements, the
99 first applied to graphs, the second to nodes, the third to edges.
100 These are separated into blocks, a block consisting of an optional
101 BEG_G statement and all N and E statements up to the next BEG_G state‐
102 ment, if any. The top‐level semantics of a gvpr program are:
103
104 Evaluate the BEGIN clause, if any.
105 For each input graph G {
106 For each block {
107 Set G as the current graph and current object.
108 Evaluate the BEG_G clause, if any.
109 For each node and edge in G {
110 Set the node or edge as the current object.
111 Evaluate the N or E clauses, as appropriate.
112 }
113 }
114 Set G as the current object.
115 Evaluate the END_G clause, if any.
116 }
117 Evaluate the END clause, if any.
118
119 The actions of the BEGIN, BEG_G, END_G and END clauses are performed
120 when the clauses are evaluated. For N or E clauses, either the predi‐
121 cate or action may be omitted. If there is no predicate with an
122 action, the action is performed on every node or edge, as appropriate.
123 If there is no action and the predicate evaluates to true, the associ‐
124 ated node or edge is added to the target graph.
125
126 The blocks are evaluated in the order in which they occur. Within a
127 block, the N clauses (E clauses, respectively) are evaluated in the
128 order in which the occur. Note, though, that within a block, N or E
129 clauses may be interlaced, depending on the traversal order.
130
131 Predicates and actions are sequences of statements in the C dialect
132 supported by the expr(3) library. The only difference between predi‐
133 cates and actions is that the former must have a type that may inter‐
134 preted as either true or false. Here the usual C convention is fol‐
135 lowed, in which a non‐zero value is considered true. This would include
136 non‐empty strings and non‐empty references to nodes, edges, etc. How‐
137 ever, if a string can be converted to an integer, this value is used.
138
139 In addition to the usual C base types (void, int, char, float, long,
140 unsigned and double), gvpr provides string as a synonym for char*, and
141 the graph‐based types node_t, edge_t, graph_t and obj_t. The obj_t
142 type can be viewed as a supertype of the other 3 concrete types; the
143 correct base type is maintained dynamically. Besides these base types,
144 the only other supported type expressions are (associative) arrays.
145
146 Constants follow C syntax, but strings may be quoted with either "..."
147 or '...'. gvpr accepts C++ comments as well as cpp‐type comments. For
148 the latter, if a line begins with a '#' character, the rest of the line
149 is ignored.
150
151 A statement can be a declaration of a function, a variable or an array,
152 or an executable statement. For declarations, there is a single scope.
153 Array declarations have the form:
154
155 type array [ type0 ]
156
157 where type0 is optional. If it is supplied, the parser will enforce
158 that all array subscripts have the specified type. If it is not sup‐
159 plied, objects of all types can be used as subscripts. As in C, vari‐
160 ables and arrays must be declared. In particular, an undeclared vari‐
161 able will be interpreted as the name of an attribute of a node, edge or
162 graph, depending on the context.
163
164 Executable statements can be one of the following:
165
166 { [ statement ... ] }
167 expression // commonly var = expression
168 if( expression ) statement [ else statement ]
169 for( expression ; expression ; expression ) statement
170 for( array [ var ]) statement
171 forr( array [ var ]) statement
172 while( expression ) statement
173 switch( expression ) case statements
174 break [ expression ]
175 continue [ expression ]
176 return [ expression ]
177 Items in brackets are optional.
178
179 In the second form of the for statement and the forr statement, the
180 variable var is set to each value used as an index in the specified
181 array and then the associated statement is evaluated. For numeric and
182 string indices, the indices are returned in increasing (decreasing)
183 numeric or lexicographic order for for (forr, respectively). This can
184 be used for sorting.
185
186 Function definitions can only appear in the BEGIN clause.
187
188 Expressions include the usual C expressions. String comparisons using
189 == and != treat the right hand operand as a pattern for the purpose of
190 regular expression matching. Patterns use ksh(1) file match pattern
191 syntax. (For simple string equality, use the strcmp function.
192
193 gvpr will attempt to use an expression as a string or numeric value as
194 appropriate. Both C-like casts and function templates will cause con‐
195 versions to be performed, if possible.
196
197 Expressions of graphical type (i.e., graph_t, node_t, edge_t, obj_t)
198 may be followed by a field reference in the form of .name. The result‐
199 ing value is the value of the attribute named name of the given object.
200 In addition, in certain contexts an undeclared, unmodified identifier
201 is taken to be an attribute name. Specifically, such identifiers denote
202 attributes of the current node or edge, respectively, in N and E
203 clauses, and the current graph in BEG_G and END_G clauses.
204
205 As usual in the libcgraph(3) model, attributes are string‐valued. In
206 addition, gvpr supports certain pseudo‐attributes of graph objects, not
207 necessarily string‐valued. These reflect intrinsic properties of the
208 graph objects and cannot be set by the user.
209
210 head : node_t
211 the head of an edge.
212
213 tail : node_t
214 the tail of an edge.
215
216 name : string
217 the name of an edge, node or graph. The name of an edge has the
218 form "<tail‐name><edge‐op><head‐name>[<key>]", where <edge‐op>
219 is "->" or "--" depending on whether the graph is directed or
220 not. The bracket part [<key>] only appears if the edge has a
221 non‐trivial key.
222
223 indegree : int
224 the indegree of a node.
225
226 outdegree : int
227 the outdegree of a node.
228
229 degree : int
230 the degree of a node.
231
232 X : double
233 the X coordinate of a node. (Assumes the node has a pos
234 attribute.)
235
236 Y : double
237 the Y coordinate of a node. (Assumes the node has a pos
238 attribute.)
239
240 root : graph_t
241 the root graph of an object. The root of a root graph is itself.
242
243 parent : graph_t
244 the parent graph of a subgraph. The parent of a root graph is
245 NULL
246
247 n_edges : int
248 the number of edges in the graph
249
250 n_nodes : int
251 the number of nodes in the graph
252
253 directed : int
254 true (non‐zero) if the graph is directed
255
256 strict : int
257 true (non‐zero) if the graph is strict
258
260 The following functions are built into gvpr. Those functions returning
261 references to graph objects return NULL in case of failure.
262
263 Graphs and subgraph
264 graph(s : string, t : string) : graph_t
265 creates a graph whose name is s and whose type is specified by
266 the string t. Ignoring case, the characters U, D, S, N have the
267 interpretation undirected, directed, strict, and non‐strict,
268 respectively. If t is empty, a directed, non‐strict graph is
269 generated.
270
271 subg(g : graph_t, s : string) : graph_t
272 creates a subgraph in graph g with name s. If the subgraph
273 already exists, it is returned.
274
275 isSubg(g : graph_t, s : string) : graph_t
276 returns the subgraph in graph g with name s, if it exists, or
277 NULL otherwise.
278
279 fstsubg(g : graph_t) : graph_t
280 returns the first subgraph in graph g, or NULL if none exists.
281
282 nxtsubg(sg : graph_t) : graph_t
283 returns the next subgraph after sg, or NULL.
284
285 isDirect(g : graph_t) : int
286 returns true if and only if g is directed.
287
288 isStrict(g : graph_t) : int
289 returns true if and only if g is strict.
290
291 nNodes(g : graph_t) : int
292 returns the number of nodes in g.
293
294 nEdges(g : graph_t) : int
295 returns the number of edges in g.
296
297 Nodes
298 node(sg : graph_t, s : string) : node_t
299 creates a node in graph g of name s. If such a node already
300 exists, it is returned.
301
302 subnode(sg : graph_t, n : node_t) : node_t
303 inserts the node n into the subgraph g. Returns the node.
304
305 fstnode(g : graph_t) : node_t
306 returns the first node in graph g, or NULL if none exists.
307
308 nxtnode(n : node_t) : node_t
309 returns the next node after n in the root graph, or NULL.
310
311 nxtnode_sg(sg : graph_t, n : node_t) : node_t
312 returns the next node after n in sg, or NULL.
313
314 isNode(sg : graph_t, s : string) : node_t
315 looks for a node in (sub)graph sg of name s. If such a node
316 exists, it is returned. Otherwise, NULL is returned.
317
318 isSubnode(sg : graph_t, n : node_t) : int
319 returns non-zero if node n is in (sub)graph sg, or zero other‐
320 wise.
321
322 indegreeOf(sg : graph_t, n : node_t) : int
323 returns the indegree of node n in (sub)graph sg.
324
325 outdegreeOf(sg : graph_t, n : node_t) : int
326 returns the outdegree of node n in (sub)graph sg.
327
328 degreeOf(sg : graph_t, n : node_t) : int
329 returns the degree of node n in (sub)graph sg.
330
331 Edges
332 edge(t : node_t, h : node_t, s : string) : edge_t
333 creates an edge with tail node t, head node h and name s in the
334 root graph. If the graph is undirected, the distinction between
335 head and tail nodes is unimportant. If such an edge already
336 exists, it is returned.
337
338 edge_sg(sg : graph_t, t : node_t, h : node_t, s : string) : edge_t
339 creates an edge with tail node t, head node h and name s in
340 (sub)graph sg (and all parent graphs). If the graph is undi‐
341 rected, the distinction between head and tail nodes is unimpor‐
342 tant. If such an edge already exists, it is returned.
343
344 subedge(g : graph_t, e : edge_t) : edge_t
345 inserts the edge e into the subgraph g. Returns the edge.
346
347 isEdge(t : node_t, h : node_t, s : string) : edge_t
348 looks for an edge with tail node t, head node h and name s. If
349 the graph is undirected, the distinction between head and tail
350 nodes is unimportant. If such an edge exists, it is returned.
351 Otherwise, NULL is returned.
352
353 isEdge_sg(sg : graph_t, t : node_t, h : node_t, s : string) : edge_t
354 looks for an edge with tail node t, head node h and name s in
355 (sub)graph sg. If the graph is undirected, the distinction
356 between head and tail nodes is unimportant. If such an edge
357 exists, it is returned. Otherwise, NULL is returned.
358
359 isSubedge(g : graph_t, e : edge_t) : int
360 returns non-zero if edge e is in (sub)graph sg, or zero other‐
361 wise.
362
363 fstout(n : node_t) : edge_t
364 returns the first outedge of node n in the root graph.
365
366 fstout_sg(sg : graph_t, n : node_t) : edge_t
367 returns the first outedge of node n in (sub)graph sg.
368
369 nxtout(e : edge_t) : edge_t
370 returns the next outedge after e in the root graph.
371
372 nxtout_sg(sg : graph_t, e : edge_t) : edge_t
373 returns the next outedge after e in graph sg.
374
375 fstin(n : node_t) : edge_t
376 returns the first inedge of node n in the root graph.
377
378 fstin_sg(sg : graph_t, n : node_t) : edge_t
379 returns the first inedge of node n in graph sg.
380
381 nxtin(e : edge_t) : edge_t
382 returns the next inedge after e in the root graph.
383
384 nxtin_sg(sg : graph_t, e : edge_t) : edge_t
385 returns the next inedge after e in graph sg.
386
387 fstedge(n : node_t) : edge_t
388 returns the first edge of node n in the root graph.
389
390 fstedge_sg(sg : graph_t, n : node_t) : edge_t
391 returns the first edge of node n in graph sg.
392
393 nxtedge(e : edge_t, node_t) : edge_t
394 returns the next edge after e in the root graph.
395
396 nxtedge_sg(sg : graph_t, e : edge_t, node_t) : edge_t
397 returns the next edge after e in the graph sg.
398
399 opp(e : edge_t, node_t) : node_t
400 returns the node on the edge e not equal to n. Returns NULL if
401 n is not a node of e. This can be useful when using fstedge and
402 nxtedge to enumerate the neighbors of n.
403
404 Graph I/O
405 write(g : graph_t) : void
406 prints g in dot format onto the output stream.
407
408 writeG(g : graph_t, fname : string) : void
409 prints g in dot format into the file fname.
410
411 fwriteG(g : graph_t, fd : int) : void
412 prints g in dot format onto the open stream denoted by the inte‐
413 ger fd.
414
415 readG(fname : string) : graph_t
416 returns a graph read from the file fname. The graph should be in
417 dot format. If no graph can be read, NULL is returned.
418
419 freadG(fd : int) : graph_t
420 returns the next graph read from the open stream fd. Returns
421 NULL at end of file.
422
423 Graph miscellany
424 delete(g : graph_t, x : obj_t) : void
425 deletes object x from graph g. If g is NULL, the function uses
426 the root graph of x. If x is a graph or subgraph, it is closed
427 unless x is locked.
428
429 isIn(g : graph_t, x : obj_t) : int
430 returns true if x is in subgraph g.
431
432 cloneG(g : graph_t, s : string) : graph_t
433 creates a clone of graph g with name of s. If s is "", the cre‐
434 ated graph has the same name as g.
435
436 clone(g : graph_t, x : obj_t) : obj_t
437 creates a clone of object x in graph g. In particular, the new
438 object has the same name/value attributes and structure as the
439 original object. If an object with the same key as x already
440 exists, its attributes are overlaid by those of x and the object
441 is returned. If an edge is cloned, both endpoints are implic‐
442 itly cloned. If a graph is cloned, all nodes, edges and sub‐
443 graphs are implicitly cloned. If x is a graph, g may be NULL,
444 in which case the cloned object will be a new root graph. In
445 this case, the call is equivalent to cloneG(x,"").
446
447 copy(g : graph_t, x : obj_t) : obj_t
448 creates a copy of object x in graph g, where the new object has
449 the same name/value attributes as the original object. If an
450 object with the same key as x already exists, its attributes are
451 overlaid by those of x and the object is returned. Note that
452 this is a shallow copy. If x is a graph, none of its nodes,
453 edges or subgraphs are copied into the new graph. If x is an
454 edge, the endpoints are created if necessary, but they are not
455 cloned. If x is a graph, g may be NULL, in which case the
456 cloned object will be a new root graph.
457
458 copyA(src : obj_t, tgt : obj_t) : int
459 copies the attributes of object src to object tgt, overwriting
460 any attribute values tgt may initially have.
461
462 induce(g : graph_t) : void
463 extends g to its node‐induced subgraph extension in its root
464 graph.
465
466 hasAttr(src : obj_t, name : string) : int
467 returns non-zero if object src has an attribute whose name is
468 name. It returns 0 otherwise.
469
470 isAttr(g : graph_t, kind : string, name : string) : int
471 returns non-zero if an attribute name has been defined in g for
472 objects of the given kind. For nodes, edges, and graphs, kind
473 should be "N", "E", and "G", respectively. It returns 0 other‐
474 wise.
475
476 aget(src : obj_t, name : string) : string
477 returns the value of attribute name in object src. This is use‐
478 ful for those cases when name conflicts with one of the keywords
479 such as "head" or "root". If the attribute has not been
480 declared in the graph, the function will initialize it with a
481 default value of "". To avoid this, one should use the hasAttr
482 or isAttr function to check that the attribute exists.
483
484 aset(src : obj_t, name : string, value : string) : int
485 sets the value of attribute name in object src to value.
486 Returns 0 on success, non‐zero on failure. See aget above.
487
488 getDflt(g : graph_t, kind : string, name : string) : string
489 returns the default value of attribute name in objects in g of
490 the given kind. For nodes, edges, and graphs, kind should be
491 "N", "E", and "G", respectively. If the attribute has not been
492 declared in the graph, the function will initialize it with a
493 default value of "". To avoid this, one should use the isAttr
494 function to check that the attribute exists.
495
496 setDflt(g : graph_t, kind : string, name : string, value : string) :
497 int
498 sets the default value of attribute name to value in objects in
499 g of the given kind. For nodes, edges, and graphs, kind should
500 be "N", "E", and "G", respectively. Returns 0 on success, non‐
501 zero on failure. See getDflt above.
502
503 fstAttr(g : graph_t, kind : string) : string
504 returns the name of the first attribute of objects in g of the
505 given kind. For nodes, edges, and graphs, kind should be "N",
506 "E", and "G", respectively. If there are no attributes, the
507 string "" is returned.
508
509 nxtAttr(g : graph_t, kind : string, name : string) : string
510 returns the name of the next attribute of objects in g of the
511 given kind after the attribute name. The argument name must be
512 the name of an existing attribute; it will typically be the
513 return value of an previous call to fstAttr or nxtAttr. For
514 nodes, edges, and graphs, kind should be "N", "E", and "G",
515 respectively. If there are no attributes left, the string "" is
516 returned.
517
518 compOf(g : graph_t, n : node_t) : graph_t
519 returns the connected component of the graph g containing node
520 n, as a subgraph of g. The subgraph only contains the nodes. One
521 can use induce to add the edges. The function fails and returns
522 NULL if n is not in g. Connectivity is based on the underlying
523 undirected graph of g.
524
525 kindOf(obj : obj_t) : string
526 returns an indication of the type of obj. For nodes, edges, and
527 graphs, it returns "N", "E", and "G", respectively.
528
529 lock(g : graph_t, v : int) : int
530 implements graph locking on root graphs. If the integer v is
531 positive, the graph is set so that future calls to delete have
532 no immediate effect. If v is zero, the graph is unlocked. If
533 there has been a call to delete the graph while it was locked,
534 the graph is closed. If v is negative, nothing is done. In all
535 cases, the previous lock value is returned.
536
537 Strings
538 sprintf(fmt : string, ...) : string
539 returns the string resulting from formatting the values of the
540 expressions occurring after fmt according to the printf(3) for‐
541 mat fmt
542
543 gsub(str : string, pat : string) : string
544
545 gsub(str : string, pat : string, repl : string) : string
546 returns str with all substrings matching pat deleted or replaced
547 by repl, respectively.
548
549 sub(str : string, pat : string) : string
550
551 sub(str : string, pat : string, repl : string) : string
552 returns str with the leftmost substring matching pat deleted or
553 replaced by repl, respectively. The characters '^' and '$' may
554 be used at the beginning and end, respectively, of pat to anchor
555 the pattern to the beginning or end of str.
556
557 substr(str : string, idx : int) : string
558
559 substr(str : string, idx : int, len : int) : string
560 returns the substring of str starting at position idx to the end
561 of the string or of length len, respectively. Indexing starts
562 at 0. If idx is negative or idx is greater than the length of
563 str, a fatal error occurs. Similarly, in the second case, if len
564 is negative or idx + len is greater than the length of str, a
565 fatal error occurs.
566
567 strcmp(s1 : string, s2 : string) : int
568 provides the standard C function strcmp(3).
569
570 length(s : string) : int
571 returns the length of string s.
572
573 index(s : string, t : string) : int
574
575 rindex(s : string, t : string) : int
576 returns the index of the character in string s where the left‐
577 most (rightmost) copy of string t can be found, or -1 if t is
578 not a substring of s.
579
580 match(s : string, p : string) : int
581 returns the index of the character in string s where the left‐
582 most match of pattern p can be found, or -1 if no substring of s
583 matches p.
584
585 toupper(s : string) : string
586 returns a version of s with the alphabetic characters converted
587 to upper-case.
588
589 tolower(s : string) : string
590 returns a version of s with the alphabetic characters converted
591 to lower-case.
592
593 canon(s : string) : string
594 returns a version of s appropriate to be used as an identifier
595 in a dot file.
596
597 html(g : graph_t, s : string) : string
598 returns a ``magic'' version of s as an HTML string. This will
599 typically be used to attach an HTML-like label to a graph
600 object. Note that the returned string lives in g. In particular,
601 it will be freed when g is closed, and to act as an HTML string,
602 it has to be used with an object of g. In addition, note that
603 the angle bracket quotes should not be part of s. These will be
604 added if g is written in concrete DOT format.
605
606 ishtml(s : string) : int
607 returns non-zero if and only if s is an HTML string.
608
609 xOf(s : string) : string
610 returns the string "x" if s has the form "x,y", where both x and
611 y are numeric.
612
613 yOf(s : string) : string
614 returns the string "y" if s has the form "x,y", where both x and
615 y are numeric.
616
617 llOf(s : string) : string
618 returns the string "llx,lly" if s has the form
619 "llx,lly,urx,ury", where all of llx, lly, urx, and ury are
620 numeric.
621
622 urOf(s)
623 urOf(s : string) : string returns the string "urx,ury" if s has
624 the form "llx,lly,urx,ury", where all of llx, lly, urx, and ury
625 are numeric.
626
627 sscanf(s : string, fmt : string, ...) : int
628 scans the string s, extracting values according to the sscanf(3)
629 format fmt. The values are stored in the addresses following
630 fmt, addresses having the form &v, where v is some declared
631 variable of the correct type. Returns the number of items suc‐
632 cessfully scanned.
633
634 split(s : string, arr : array, seps : string) : int
635
636 split(s : string, arr : array) : int
637
638 tokens(s : string, arr : array, seps : string) : int
639
640 tokens(s : string, arr : array) : int
641 The split function breaks the string s into fields, while the
642 tokens function breaks the string into tokens. A field consists
643 of all non-separator characters between two separator characters
644 or the beginning or end of the string. Thus, a field may be the
645 empty string. A token is a maximal, non-empty substring not con‐
646 taining a separator character. The separator characters are
647 those given in the seps argument. If seps is not provided, the
648 default value is " \t\n". The functions return the number of
649 fields or tokens.
650
651 The fields and tokens are stored in the argument array. The
652 array must be string-valued and have int as its index type. The
653 entries are indexed by consecutive integers, starting at 0. Any
654 values already stored in the array will be either overwritten,
655 or still be present after the function returns.
656
657 I/O
658 print(...) : void
659 print( expr, ... ) prints a string representation of each argu‐
660 ment in turn onto stdout, followed by a newline.
661
662 printf(fmt : string, ...) : int
663
664 printf(fd : int, fmt : string, ...) : int
665 prints the string resulting from formatting the values of the
666 expressions following fmt according to the printf(3) format fmt.
667 Returns 0 on success. By default, it prints on stdout. If the
668 optional integer fd is given, output is written on the open
669 stream associated with fd.
670
671 scanf(fmt : string, ...) : int
672
673 scanf(fd : int, fmt : string, ...) : int
674 scans in values from an input stream according to the scanf(3)
675 format fmt. The values are stored in the addresses following
676 fmt, addresses having the form &v, where v is some declared
677 variable of the correct type. By default, it reads from stdin.
678 If the optional integer fd is given, input is read from the open
679 stream associated with fd. Returns the number of items success‐
680 fully scanned.
681
682 openF(s : string, t : string) : int
683 opens the file s as an I/O stream. The string argument t speci‐
684 fies how the file is opened. The arguments are the same as for
685 the C function fopen(3). It returns an integer denoting the
686 stream, or -1 on error.
687
688 As usual, streams 0, 1 and 2 are already open as stdin, stdout,
689 and stderr, respectively. Since gvpr may use stdin to read the
690 input graphs, the user should avoid using this stream.
691
692 closeF(fd : int) : int
693 closes the open stream denoted by the integer fd. Streams 0, 1
694 and 2 cannot be closed. Returns 0 on success.
695
696 readL(fd : int) : string
697 returns the next line read from the input stream fd. It returns
698 the empty string "" on end of file. Note that the newline char‐
699 acter is left in the returned string.
700
701 Math
702 exp(d : double) : double
703 returns e to the dth power.
704
705 log(d : double) : double
706 returns the natural log of d.
707
708 sqrt(d : double) : double
709 returns the square root of the double d.
710
711 pow(d : double, x : double) : double
712 returns d raised to the xth power.
713
714 cos(d : double) : double
715 returns the cosine of d.
716
717 sin(d : double) : double
718 returns the sine of d.
719
720 atan2(y : double, x : double) : double
721 returns the arctangent of y/x in the range -pi to pi.
722
723 MIN(y : double, x : double) : double
724 returns the minimum of y and x.
725
726 MAX(y : double, x : double) : double
727 returns the maximum of y and x.
728
729 Associative Arrays
730 # arr : int
731 returns the number of elements in the array arr.
732
733 idx in arr : int
734 returns 1 if a value has been set for index idx in the array
735 arr. It returns 0 otherwise.
736
737 unset(v : array, idx) : int
738 removes the item indexed by idx. It returns 1 if the item
739 existed, 0 otherwise.
740
741 unset(v : array) : void
742 re-initializes the array.
743
744 Miscellaneous
745 exit(v : int) : void
746 causes gvpr to exit with the exit code v.
747
748 system(cmd : string) : int
749 provides the standard C function system(3). It executes cmd in
750 the user's shell environment, and returns the exit status of the
751 shell.
752
753 rand() : double
754 returns a pseudo‐random double between 0 and 1.
755
756 srand() : int
757
758 srand(v : int) : int
759 sets a seed for the random number generator. The optional argu‐
760 ment gives the seed; if it is omitted, the current time is used.
761 The previous seed value is returned. srand should be called
762 before any calls to rand.
763
764 colorx(color : string, fmt : string) : string
765 translates a color from one format to another. The color argu‐
766 ment should be a color in one of the recognized string represen‐
767 tations. The fmt value should be one of "RGB", "RGBA", "HSV", or
768 "HSVA". An empty string is returned on error.
769
771 gvpr provides certain special, built‐in variables, whose values are set
772 automatically by gvpr depending on the context. Except as noted, the
773 user cannot modify their values.
774
775 $ : obj_t
776 denotes the current object (node, edge, graph) depending on the
777 context. It is not available in BEGIN or END clauses.
778
779 $F : string
780 is the name of the current input file.
781
782 $G : graph_t
783 denotes the current graph being processed. It is not available
784 in BEGIN or END clauses.
785
786 $NG : graph_t
787 denotes the next graph to be processed. If $NG is NULL, the cur‐
788 rent graph $G is the last graph. Note that if the input comes
789 from stdin, the last graph cannot be determined until the input
790 pipe is closed. It is not available in BEGIN or END clauses, or
791 if the -n flag is used.
792
793 $O : graph_t
794 denotes the output graph. Before graph traversal, it is initial‐
795 ized to the target graph. After traversal and any END_G actions,
796 if it refers to a non‐empty graph, that graph is printed onto
797 the output stream. It is only valid in N, E and END_G clauses.
798 The output graph may be set by the user.
799
800 $T : graph_t
801 denotes the current target graph. It is a subgraph of $G and is
802 available only in N, E and END_G clauses.
803
804 $tgtname : string
805 denotes the name of the target graph. By default, it is set to
806 "gvpr_result". If used multiple times during the execution of
807 gvpr, the name will be appended with an integer. This variable
808 may be set by the user.
809
810 $tvroot : node_t
811 indicates the starting node for a (directed or undirected)
812 depth‐first or breadth‐first traversal of the graph (cf. $tvtype
813 below). The default value is NULL for each input graph. After
814 the traversal at the given root, if the value of $tvroot has
815 changed, a new traversal will begin with the new value of
816 $tvroot. Also, set $tvnext below.
817
818 $tvnext : node_t
819 indicates the next starting node for a (directed or undirected)
820 depth‐first or breadth‐first traversal of the graph (cf. $tvtype
821 below). If a traversal finishes and the $tvroot has not been
822 reset but the $tvnext has been set but not used, this node will
823 be used as the next choice for $tvroot. The default value is
824 NULL for each input graph.
825
826 $tvedge : edge_t
827 For BFS and DFS traversals, this is set to the edge used to
828 arrive at the current node or edge. At the beginning of a tra‐
829 versal, or for other traversal types, the value is NULL.
830
831 $tvtype : tvtype_t
832 indicates how gvpr traverses a graph. It can only take one of
833 the constant values with the prefix "TV_" described below.
834 TV_flat is the default.
835
836 In the underlying graph library cgraph(3), edges in undirected
837 graphs are given an arbitrary direction. This is used for tra‐
838 versals, such as TV_fwd, requiring directed edges.
839
840 ARGC : int
841 denotes the number of arguments specified by the -a args com‐
842 mand‐line argument.
843
844 ARGV : string array
845 denotes the array of arguments specified by the -a args command‐
846 line argument. The ith argument is given by ARGV[i].
847
849 There are several symbolic constants defined by gvpr.
850
851 NULL : obj_t
852 a null object reference, equivalent to 0.
853
854 TV_flat : tvtype_t
855 a simple, flat traversal, with graph objects visited in seem‐
856 ingly arbitrary order.
857
858 TV_ne : tvtype_t
859 a traversal which first visits all of the nodes, then all of the
860 edges.
861
862 TV_en : tvtype_t
863 a traversal which first visits all of the edges, then all of the
864 nodes.
865
866 TV_dfs : tvtype_t
867 TV_postdfs : tvtype_t
868 TV_prepostdfs : tvtype_t
869 a traversal of the graph using a depth‐first search on the
870 underlying undirected graph. To do the traversal, gvpr will
871 check the value of $tvroot. If this has the same value that it
872 had previously (at the start, the previous value is initialized
873 to NULL.), gvpr will simply look for some unvisited node and
874 traverse its connected component. On the other hand, if $tvroot
875 has changed, its connected component will be toured, assuming it
876 has not been previously visited or, if $tvroot is NULL, the tra‐
877 versal will stop. Note that using TV_dfs and $tvroot, it is pos‐
878 sible to create an infinite loop.
879
880 By default, the traversal is done in pre-order. That is, a node
881 is visited before all of its unvisited edges. For TV_postdfs,
882 all of a node's unvisited edges are visited before the node. For
883 TV_prepostdfs, a node is visited twice, before and after all of
884 its unvisited edges.
885
886 TV_fwd : tvtype_t
887 TV_postfwd : tvtype_t
888 TV_prepostfwd : tvtype_t
889 A traversal of the graph using a depth‐first search on the graph
890 following only forward arcs. The choice of roots for the tra‐
891 versal is the same as described for TV_dfs above. The different
892 order of visitation specified by TV_fwd, TV_postfwd and TV_pre‐
893 postfwd are the same as those specified by the analogous traver‐
894 sals TV_dfs, TV_postdfs and TV_prepostdfs.
895
896 TV_rev : tvtype_t
897 TV_postrev : tvtype_t
898 TV_prepostrev : tvtype_t
899 A traversal of the graph using a depth‐first search on the graph
900 following only reverse arcs. The choice of roots for the tra‐
901 versal is the same as described for TV_dfs above. The different
902 order of visitation specified by TV_rev, TV_postrev and TV_pre‐
903 postrev are the same as those specified by the analogous traver‐
904 sals TV_dfs, TV_postdfs and TV_prepostdfs.
905
906 TV_bfs : tvtype_t
907 A traversal of the graph using a breadth‐first search on the
908 graph ignoring edge directions. See the item on TV_dfs above for
909 the role of $tvroot.
910
912 gvpr -i 'N[color=="blue"]' file.gv
913
914 Generate the node‐induced subgraph of all nodes with color blue.
915
916 gvpr -c 'N[color=="blue"]{color = "red"}' file.gv
917
918 Make all blue nodes red.
919
920 BEGIN { int n, e; int tot_n = 0; int tot_e = 0; }
921 BEG_G {
922 n = nNodes($G);
923 e = nEdges($G);
924 printf ("%d nodes %d edges %s\n", n, e, $G.name);
925 tot_n += n;
926 tot_e += e;
927 }
928 END { printf ("%d nodes %d edges total\n", tot_n, tot_e) }
929
930 Version of the program gc.
931
932 gvpr -c ""
933
934 Equivalent to nop.
935
936 BEG_G { graph_t g = graph ("merge", "S"); }
937 E {
938 node_t h = clone(g,$.head);
939 node_t t = clone(g,$.tail);
940 edge_t e = edge(t,h,"");
941 e.weight = e.weight + 1;
942 }
943 END_G { $O = g; }
944
945 Produces a strict version of the input graph, where the weight
946 attribute of an edge indicates how many edges from the input graph the
947 edge represents.
948
949 BEGIN {node_t n; int deg[]}
950 E{deg[head]++; deg[tail]++; }
951 END_G {
952 for (deg[n]) {
953 printf ("deg[%s] = %d\n", n.name, deg[n]);
954 }
955 }
956
957 Computes the degrees of nodes with edges.
958
959 BEGIN {
960 int i, indent;
961 int seen[string];
962 void prInd (int cnt) {
963 for (i = 0; i < cnt; i++) printf (" ");
964 }
965 }
966 BEG_G {
967
968 $tvtype = TV_prepostfwd;
969 $tvroot = node($,ARGV[0]);
970 }
971 N {
972 if (seen[$.name]) indent--;
973 else {
974 prInd(indent);
975 print ($.name);
976 seen[$.name] = 1;
977 indent++;
978 }
979 }
980
981 Prints the depth-first traversal of the graph, starting with the node
982 whose name is ARGV[0], as an indented list.
983
985 GVPRPATH
986 Colon‐separated list of directories to be searched to find the
987 file specified by the -f option. gvpr has a default list built
988 in. If GVPRPATH is not defined, the default list is used. If
989 GVPRPATH starts with colon, the list is formed by appending
990 GVPRPATH to the default list. If GVPRPATH ends with colon, the
991 list is formed by appending the default list to GVPRPATH. Other‐
992 wise, GVPRPATH is used for the list.
993
994 On Windows systems, replace ``colon'' with ``semicolon'' in the previ‐
995 ous paragraph.
996
998 Scripts should be careful deleting nodes during N{} and E{} blocks
999 using BFS and DFS traversals as these rely on stacks and queues of
1000 nodes.
1001
1002 When the program is given as a command line argument, the usual shell
1003 interpretation takes place, which may affect some of the special names
1004 in gvpr. To avoid this, it is best to wrap the program in single
1005 quotes.
1006
1007 If string constants contain pattern metacharacters that you want to
1008 escape to avoid pattern matching, two backslashes will probably be nec‐
1009 essary, as a single backslash will be lost when the string is origi‐
1010 nally scanned. Usually, it is simpler to use strcmp to avoid pattern
1011 matching.
1012
1013 As of 24 April 2008, gvpr switched to using a new, underlying graph
1014 library, which uses the simpler model that there is only one copy of a
1015 node, not one copy for each subgraph logically containing it. This
1016 means that iterators such as nxtnode cannot traverse a subgraph using
1017 just a node argument. For this reason, subgraph traversal requires new
1018 functions ending in "_sg", which also take a subgraph argument. The
1019 versions without that suffix will always traverse the root graph.
1020
1021 There is a single global scope, except for formal function parameters,
1022 and even these can interfere with the type system. Also, the extent of
1023 all variables is the entire life of the program. It might be prefer‐
1024 able for scope to reflect the natural nesting of the clauses, or for
1025 the program to at least reset locally declared variables. For now, it
1026 is advisable to use distinct names for all variables.
1027
1028 If a function ends with a complex statement, such as an IF statement,
1029 with each branch doing a return, type checking may fail. Functions
1030 should use a return at the end.
1031
1032 The expr library does not support string values of (char*)0. This
1033 means we can't distinguish between "" and (char*)0 edge keys. For the
1034 purposes of looking up and creating edges, we translate "" to be
1035 (char*)0, since this latter value is necessary in order to look up any
1036 edge with a matching head and tail.
1037
1038 Related to this, strings converted to integers act like char pointers,
1039 getting the value 0 or 1 depending on whether the string consists
1040 solely of zeroes or not. Thus, the ((int)"2") evaluates to 1.
1041
1042 The language inherits the usual C problems such as dangling references
1043 and the confusion between '=' and '=='.
1044
1046 Emden R. Gansner <erg@research.att.com>
1047
1049 awk(1), gc(1), dot(1), nop(1), expr(3), cgraph(3)
1050
1051
1052
1053 29 August 2013 GVPR(1)