1GVPR(1) General Commands Manual GVPR(1)
2
3
4
6 gvpr - graph pattern scanning and processing language
7 ( previously known as gpr )
8
10 gvpr [-icnqV?] [ -o outfile ] [ -a args ] [ 'prog' | -f progfile ] [
11 files ]
12
14 gvpr is a graph stream editor inspired by awk. It copies input graphs
15 to its output, possibly transforming their structure and attributes,
16 creating new graphs, or printing arbitrary information. The graph
17 model is that provided by libcgraph(3). In particular, gvpr reads and
18 writes graphs using the dot language.
19
20 Basically, gvpr traverses each input graph, denoted by $G, visiting
21 each node and edge, matching it with the predicate‐action rules sup‐
22 plied in the input program. The rules are evaluated in order. For
23 each predicate evaluating to true, the corresponding action is per‐
24 formed. During the traversal, the current node or edge being visited
25 is denoted by $.
26
27 For each input graph, there is a target subgraph, denoted by $T, ini‐
28 tially empty and used to accumulate chosen entities, and an output
29 graph, $O, used for final processing and then written to output. By
30 default, the output graph is the target graph. The output graph can be
31 set in the program or, in a limited sense, on the command line.
32
34 The following options are supported:
35
36 -a args
37 The string args is split into whitespace‐separated tokens, with
38 the individual tokens available as strings in the gvpr program
39 as ARGV[0],...,ARGV[ARGC-1]. Whitespace characters within sin‐
40 gle or double quoted substrings, or preceded by a backslash, are
41 ignored as separators. In general, a backslash character turns
42 off any special meaning of the following character. Note that
43 the tokens derived from multiple -a flags are concatenated.
44
45 -c Use the source graph as the output graph.
46
47 -i Derive the node‐induced subgraph extension of the output graph
48 in the context of its root graph.
49
50 -o outfile
51 Causes the output stream to be written to the specified file; by
52 default, output is written to stdout.
53
54 -f progfile
55 Use the contents of the specified file as the program to execute
56 on the input. If progfile contains a slash character, the name
57 is taken as the pathname of the file. Otherwise, gvpr will use
58 the directories specified in the environment variable GVPRPATH
59 to look for the file. If -f is not given, gvpr will use the
60 first non‐option argument as the program.
61
62 -q Turns off warning messages.
63
64 -n Turns off graph read-ahead. By default, the variable $NG is set
65 to the next graph to be processed. This requires a read of the
66 next graph before processing the current graph, which may block
67 is the next graph is only generated in response to some action
68 pertaining to the processing of the current graph.
69
70 -V Causes the program to print version information and exit.
71
72 -? Causes the program to print usage information and exit.
73
75 The following operand is supported:
76
77 files Names of files containing 1 or more graphs in the dot language.
78 If no -f option is given, the first name is removed from the
79 list and used as the input program. If the list of files is
80 empty, stdin will be used.
81
83 A gvpr program consists of a list of predicate‐action clauses, having
84 one of the forms:
85
86 BEGIN { action }
87
88 BEG_G { action }
89
90 N [ predicate ] { action }
91
92 E [ predicate ] { action }
93
94 END_G { action }
95
96 END { action }
97
98 A program can contain at most one of each of the BEGIN, END_G and END
99 clauses. There can be any number of BEG_G, N and E statements, the
100 first applied to graphs, the second to nodes, the third to edges.
101 These are separated into blocks, a block consisting of an optional
102 BEG_G statement and all N and E statements up to the next BEG_G state‐
103 ment, if any. The top‐level semantics of a gvpr program are:
104
105 Evaluate the BEGIN clause, if any.
106 For each input graph G {
107 For each block {
108 Set G as the current graph and current object.
109 Evaluate the BEG_G clause, if any.
110 For each node and edge in G {
111 Set the node or edge as the current object.
112 Evaluate the N or E clauses, as appropriate.
113 }
114 }
115 Set G as the current object.
116 Evaluate the END_G clause, if any.
117 }
118 Evaluate the END clause, if any.
119
120 The actions of the BEGIN, BEG_G, END_G and END clauses are performed
121 when the clauses are evaluated. For N or E clauses, either the predi‐
122 cate or action may be omitted. If there is no predicate with an
123 action, the action is performed on every node or edge, as appropriate.
124 If there is no action and the predicate evaluates to true, the associ‐
125 ated node or edge is added to the target graph.
126
127 The blocks are evaluated in the order in which they occur. Within a
128 block, the N clauses (E clauses, respectively) are evaluated in the
129 order in which the occur. Note, though, that within a block, N or E
130 clauses may be interlaced, depending on the traversal order.
131
132 Predicates and actions are sequences of statements in the C dialect
133 supported by the expr(3) library. The only difference between predi‐
134 cates and actions is that the former must have a type that may inter‐
135 preted as either true or false. Here the usual C convention is fol‐
136 lowed, in which a non‐zero value is considered true. This would include
137 non‐empty strings and non‐empty references to nodes, edges, etc. How‐
138 ever, if a string can be converted to an integer, this value is used.
139
140 In addition to the usual C base types (void, int, char, float, long,
141 unsigned and double), gvpr provides string as a synonym for char*, and
142 the graph‐based types node_t, edge_t, graph_t and obj_t. The obj_t
143 type can be viewed as a supertype of the other 3 concrete types; the
144 correct base type is maintained dynamically. Besides these base types,
145 the only other supported type expressions are (associative) arrays.
146
147 Constants follow C syntax, but strings may be quoted with either "..."
148 or '...'. gvpr accepts C++ comments as well as cpp‐type comments. For
149 the latter, if a line begins with a '#' character, the rest of the line
150 is ignored.
151
152 A statement can be a declaration of a function, a variable or an array,
153 or an executable statement. For declarations, there is a single scope.
154 Array declarations have the form:
155
156 type array [ type0 ]
157
158 where type0 is optional. If it is supplied, the parser will enforce
159 that all array subscripts have the specified type. If it is not sup‐
160 plied, objects of all types can be used as subscripts. As in C, vari‐
161 ables and arrays must be declared. In particular, an undeclared vari‐
162 able will be interpreted as the name of an attribute of a node, edge or
163 graph, depending on the context.
164
165 Executable statements can be one of the following:
166 { [ statement ... ] }
167 expression // commonly var = expression
168 if( expression ) statement [ else statement ]
169 for( expression ; expression ; expression ) statement
170 for( array [ var ]) statement
171 forr( array [ var ]) statement
172 while( expression ) statement
173 switch( expression ) case statements
174 break [ expression ]
175 continue [ expression ]
176 return [ expression ]
177 Items in brackets are optional.
178
179 In the second form of the for statement and the forr statement, the
180 variable var is set to each value used as an index in the specified
181 array and then the associated statement is evaluated. For numeric and
182 string indices, the indices are returned in increasing (decreasing)
183 numeric or lexicographic order for for (forr, respectively). This can
184 be used for sorting.
185
186 Function definitions can only appear in the BEGIN clause.
187
188 Expressions include the usual C expressions. String comparisons using
189 == and != treat the right hand operand as a pattern for the purpose of
190 regular expression matching. Patterns use ksh(1) file match pattern
191 syntax. (For simple string equality, use the strcmp function.
192
193 gvpr will attempt to use an expression as a string or numeric value as
194 appropriate. Both C-like casts and function templates will cause con‐
195 versions to be performed, if possible.
196
197 Expressions of graphical type (i.e., graph_t, node_t, edge_t, obj_t)
198 may be followed by a field reference in the form of .name. The result‐
199 ing value is the value of the attribute named name of the given object.
200 In addition, in certain contexts an undeclared, unmodified identifier
201 is taken to be an attribute name. Specifically, such identifiers denote
202 attributes of the current node or edge, respectively, in N and E
203 clauses, and the current graph in BEG_G and END_G clauses.
204
205 As usual in the libcgraph(3) model, attributes are string‐valued. In
206 addition, gvpr supports certain pseudo‐attributes of graph objects, not
207 necessarily string‐valued. These reflect intrinsic properties of the
208 graph objects and cannot be set by the user.
209
210 head : node_t
211 the head of an edge.
212
213 tail : node_t
214 the tail of an edge.
215
216 name : string
217 the name of an edge, node or graph. The name of an edge has the
218 form "<tail‐name><edge‐op><head‐name>[<key>]", where <edge‐op>
219 is "->" or "--" depending on whether the graph is directed or
220 not. The bracket part [<key>] only appears if the edge has a
221 non‐trivial key.
222
223 indegree : int
224 the indegree of a node.
225
226 outdegree : int
227 the outdegree of a node.
228
229 degree : int
230 the degree of a node.
231
232 root : graph_t
233 the root graph of an object. The root of a root graph is itself.
234
235 parent : graph_t
236 the parent graph of a subgraph. The parent of a root graph is
237 NULL
238
239 n_edges : int
240 the number of edges in the graph
241
242 n_nodes : int
243 the number of nodes in the graph
244
245 directed : int
246 true (non‐zero) if the graph is directed
247
248 strict : int
249 true (non‐zero) if the graph is strict
250
252 The following functions are built into gvpr. Those functions returning
253 references to graph objects return NULL in case of failure.
254
255 Graphs and subgraph
256 graph(s : string, t : string) : graph_t
257 creates a graph whose name is s and whose type is specified by
258 the string t. Ignoring case, the characters U, D, S, N have the
259 interpretation undirected, directed, strict, and non‐strict,
260 respectively. If t is empty, a directed, non‐strict graph is
261 generated.
262
263 subg(g : graph_t, s : string) : graph_t
264 creates a subgraph in graph g with name s. If the subgraph
265 already exists, it is returned.
266
267 isSubg(g : graph_t, s : string) : graph_t
268 returns the subgraph in graph g with name s, if it exists, or
269 NULL otherwise.
270
271 fstsubg(g : graph_t) : graph_t
272 returns the first subgraph in graph g, or NULL if none exists.
273
274 nxtsubg(sg : graph_t) : graph_t
275 returns the next subgraph after sg, or NULL.
276
277 isDirect(g : graph_t) : int
278 returns true if and only if g is directed.
279
280 isStrict(g : graph_t) : int
281 returns true if and only if g is strict.
282
283 nNodes(g : graph_t) : int
284 returns the number of nodes in g.
285
286 nEdges(g : graph_t) : int
287 returns the number of edges in g.
288
289 Nodes
290 node(sg : graph_t, s : string) : node_t
291 creates a node in graph g of name s. If such a node already
292 exists, it is returned.
293
294 subnode(sg : graph_t, n : node_t) : node_t
295 inserts the node n into the subgraph g. Returns the node.
296
297 fstnode(g : graph_t) : node_t
298 returns the first node in graph g, or NULL if none exists.
299
300 nxtnode(n : node_t) : node_t
301 returns the next node after n in the root graph, or NULL.
302
303 nxtnode_sg(sg : graph_t, n : node_t) : node_t
304 returns the next node after n in sg, or NULL.
305
306 isNode(sg : graph_t, s : string) : node_t
307 looks for a node in (sub)graph sg of name s. If such a node
308 exists, it is returned. Otherwise, NULL is returned.
309
310 isSubnode(sg : graph_t, n : node_t) : int
311 returns non-zero if node n is in (sub)graph sg, or zero other‐
312 wise.
313
314 indegreeOf(sg : graph_t, n : node_t) : int
315 returns the indegree of node n in (sub)graph sg.
316
317 outdegreeOf(sg : graph_t, n : node_t) : int
318 returns the outdegree of node n in (sub)graph sg.
319
320 degreeOf(sg : graph_t, n : node_t) : int
321 returns the degree of node n in (sub)graph sg.
322
323 Edges
324 edge(t : node_t, h : node_t, s : string) : edge_t
325 creates an edge with tail node t, head node h and name s in the
326 root graph. If the graph is undirected, the distinction between
327 head and tail nodes is unimportant. If such an edge already
328 exists, it is returned.
329
330 edge_sg(sg : graph_t, t : node_t, h : node_t, s : string) : edge_t
331 creates an edge with tail node t, head node h and name s in
332 (sub)graph sg (and all parent graphs). If the graph is undi‐
333 rected, the distinction between head and tail nodes is unimpor‐
334 tant. If such an edge already exists, it is returned.
335
336 subedge(g : graph_t, e : edge_t) : edge_t
337 inserts the edge e into the subgraph g. Returns the edge.
338
339 isEdge(t : node_t, h : node_t, s : string) : edge_t
340 looks for an edge with tail node t, head node h and name s. If
341 the graph is undirected, the distinction between head and tail
342 nodes is unimportant. If such an edge exists, it is returned.
343 Otherwise, NULL is returned.
344
345 isEdge_sg(sg : graph_t, t : node_t, h : node_t, s : string) : edge_t
346 looks for an edge with tail node t, head node h and name s in
347 (sub)graph sg. If the graph is undirected, the distinction
348 between head and tail nodes is unimportant. If such an edge
349 exists, it is returned. Otherwise, NULL is returned.
350
351 isSubedge(g : graph_t, e : edge_t) : int
352 returns non-zero if edge e is in (sub)graph sg, or zero other‐
353 wise.
354
355 fstout(n : node_t) : edge_t
356 returns the first outedge of node n in the root graph.
357
358 fstout_sg(sg : graph_t, n : node_t) : edge_t
359 returns the first outedge of node n in (sub)graph sg.
360
361 nxtout(e : edge_t) : edge_t
362 returns the next outedge after e in the root graph.
363
364 nxtout_sg(sg : graph_t, e : edge_t) : edge_t
365 returns the next outedge after e in graph sg.
366
367 fstin(n : node_t) : edge_t
368 returns the first inedge of node n in the root graph.
369
370 fstin_sg(sg : graph_t, n : node_t) : edge_t
371 returns the first inedge of node n in graph sg.
372
373 nxtin(e : edge_t) : edge_t
374 returns the next inedge after e in the root graph.
375
376 nxtin_sg(sg : graph_t, e : edge_t) : edge_t
377 returns the next inedge after e in graph sg.
378
379 fstedge(n : node_t) : edge_t
380 returns the first edge of node n in the root graph.
381
382 fstedge_sg(sg : graph_t, n : node_t) : edge_t
383 returns the first edge of node n in graph sg.
384
385 nxtedge(e : edge_t, node_t) : edge_t
386 returns the next edge after e in the root graph.
387
388 nxtedge_sg(sg : graph_t, e : edge_t, node_t) : edge_t
389 returns the next edge after e in the graph sg.
390
391 opp(e : edge_t, node_t) : node_t
392 returns the node on the edge e not equal to n. Returns NULL if
393 n is not a node of e. This can be useful when using fstedge and
394 nxtedge to enumerate the neighbors of n.
395
396 Graph I/O
397 write(g : graph_t) : void
398 prints g in dot format onto the output stream.
399
400 writeG(g : graph_t, fname : string) : void
401 prints g in dot format into the file fname.
402
403 fwriteG(g : graph_t, fd : int) : void
404 prints g in dot format onto the open stream denoted by the inte‐
405 ger fd.
406
407 readG(fname : string) : graph_t
408 returns a graph read from the file fname. The graph should be in
409 dot format. If no graph can be read, NULL is returned.
410
411 freadG(fd : int) : graph_t
412 returns the next graph read from the open stream fd. Returns
413 NULL at end of file.
414
415 Graph miscellany
416 delete(g : graph_t, x : obj_t) : void
417 deletes object x from graph g. If g is NULL, the function uses
418 the root graph of x. If x is a graph or subgraph, it is closed
419 unless x is locked.
420
421 isIn(g : graph_t, x : obj_t) : int
422 returns true if x is in subgraph g.
423
424 cloneG(g : graph_t, s : string) : graph_t
425 creates a clone of graph g with name of s. If s is "", the cre‐
426 ated graph has the same name as g.
427
428 clone(g : graph_t, x : obj_t) : obj_t
429 creates a clone of object x in graph g. In particular, the new
430 object has the same name/value attributes and structure as the
431 original object. If an object with the same key as x already
432 exists, its attributes are overlaid by those of x and the object
433 is returned. If an edge is cloned, both endpoints are implic‐
434 itly cloned. If a graph is cloned, all nodes, edges and sub‐
435 graphs are implicitly cloned. If x is a graph, g may be NULL,
436 in which case the cloned object will be a new root graph. In
437 this case, the call is equivalent to cloneG(x,"").
438
439 copy(g : graph_t, x : obj_t) : obj_t
440 creates a copy of object x in graph g, where the new object has
441 the same name/value attributes as the original object. If an
442 object with the same key as x already exists, its attributes are
443 overlaid by those of x and the object is returned. Note that
444 this is a shallow copy. If x is a graph, none of its nodes,
445 edges or subgraphs are copied into the new graph. If x is an
446 edge, the endpoints are created if necessary, but they are not
447 cloned. If x is a graph, g may be NULL, in which case the
448 cloned object will be a new root graph.
449
450 copyA(src : obj_t, tgt : obj_t) : int
451 copies the attributes of object src to object tgt, overwriting
452 any attribute values tgt may initially have.
453
454 induce(g : graph_t) : void
455 extends g to its node‐induced subgraph extension in its root
456 graph.
457
458 hasAttr(src : obj_t, name : string) : int
459 returns non-zero if object src has an attribute whose name is
460 name. It returns 0 otherwise.
461
462 isAttr(g : graph_t, kind : string, name : string) : int
463 returns non-zero if an attribute name has been defined in g for
464 objects of the given kind. For nodes, edges, and graphs, kind
465 should be "N", "E", and "G", respectively. It returns 0 other‐
466 wise.
467
468 aget(src : obj_t, name : string) : string
469 returns the value of attribute name in object src. This is use‐
470 ful for those cases when name conflicts with one of the keywords
471 such as "head" or "root". If the attribute has not been
472 declared in the graph, the function will initialize it with a
473 default value of "". To avoid this, one should use the hasAttr
474 or isAttr function to check that the attribute exists.
475
476 aset(src : obj_t, name : string, value : string) : int
477 sets the value of attribute name in object src to value.
478 Returns 0 on success, non‐zero on failure. See aget above.
479
480 getDflt(g : graph_t, kind : string, name : string) : string
481 returns the default value of attribute name in objects in g of
482 the given kind. For nodes, edges, and graphs, kind should be
483 "N", "E", and "G", respectively. If the attribute has not been
484 declared in the graph, the function will initialize it with a
485 default value of "". To avoid this, one should use the isAttr
486 function to check that the attribute exists.
487
488 setDflt(g : graph_t, kind : string, name : string, value : string) :
489 int
490 sets the default value of attribute name to value in objects in
491 g of the given kind. For nodes, edges, and graphs, kind should
492 be "N", "E", and "G", respectively. Returns 0 on success, non‐
493 zero on failure. See getDflt above.
494
495 fstAttr(g : graph_t, kind : string) : string
496 returns the name of the first attribute of objects in g of the
497 given kind. For nodes, edges, and graphs, kind should be "N",
498 "E", and "G", respectively. If there are no attributes, the
499 string "" is returned.
500
501 nxtAttr(g : graph_t, kind : string, name : string) : string
502 returns the name of the next attribute of objects in g of the
503 given kind after the attribute name. The argument name must be
504 the name of an existing attribute; it will typically be the
505 return value of an previous call to fstAttr or nxtAttr. For
506 nodes, edges, and graphs, kind should be "N", "E", and "G",
507 respectively. If there are no attributes left, the string "" is
508 returned.
509
510 compOf(g : graph_t, n : node_t) : graph_t
511 returns the connected component of the graph g containing node
512 n, as a subgraph of g. The subgraph only contains the nodes. One
513 can use induce to add the edges. The function fails and returns
514 NULL if n is not in g. Connectivity is based on the underlying
515 undirected graph of g.
516
517 kindOf(obj : obj_t) : string
518 returns an indication of what kind of graph object is the argu‐
519 ment. For nodes, edges, and graphs, it returns should be "N",
520 "E", and "G", respectively.
521
522 lock(g : graph_t, v : int) : int
523 implements graph locking on root graphs. If the integer v is
524 positive, the graph is set so that future calls to delete have
525 no immediate effect. If v is zero, the graph is unlocked. If
526 there has been a call to delete the graph while it was locked,
527 the graph is closed. If v is negative, nothing is done. In all
528 cases, the previous lock value is returned.
529
530 Strings
531 sprintf(fmt : string, ...) : string
532 returns the string resulting from formatting the values of the
533 expressions occurring after fmt according to the printf(3) for‐
534 mat fmt
535
536 gsub(str : string, pat : string) : string
537
538 gsub(str : string, pat : string, repl : string) : string
539 returns str with all substrings matching pat deleted or replaced
540 by repl, respectively.
541
542 sub(str : string, pat : string) : string
543
544 sub(str : string, pat : string, repl : string) : string
545 returns str with the leftmost substring matching pat deleted or
546 replaced by repl, respectively. The characters '^' and '$' may
547 be used at the beginning and end, respectively, of pat to anchor
548 the pattern to the beginning or end of str.
549
550 substr(str : string, idx : int) : string
551
552 substr(str : string, idx : int, len : int) : string
553 returns the substring of str starting at position idx to the end
554 of the string or of length len, respectively. Indexing starts
555 at 0. If idx is negative or idx is greater than the length of
556 str, a fatal error occurs. Similarly, in the second case, if len
557 is negative or idx + len is greater than the length of str, a
558 fatal error occurs.
559
560 strcmp(s1 : string, s2 : string) : int
561 provides the standard C function strcmp(3).
562
563 length(s : string) : int
564 returns the length of string s.
565
566 index(s : string, t : string) : int
567
568 rindex(s : string, t : string) : int
569 returns the index of the character in string s where the left‐
570 most (rightmost) copy of string t can be found, or -1 if t is
571 not a substring of s.
572
573 match(s : string, p : string) : int
574 returns the index of the character in string s where the left‐
575 most match of pattern p can be found, or -1 if no substring of s
576 matches p.
577
578 toupper(s : string) : string
579 returns a version of s with the alphabetic characters converted
580 to upper-case.
581
582 tolower(s : string) : string
583 returns a version of s with the alphabetic characters converted
584 to lower-case.
585
586 canon(s : string) : string
587 returns a version of s appropriate to be used as an identifier
588 in a dot file.
589
590 html(g : graph_t, s : string) : string
591 returns a ``magic'' version of s as an HTML string. This will
592 typically be used to attach an HTML-like label to a graph
593 object. Note that the returned string lives in g. In particular,
594 it will be freed when g is closed, and to act as an HTML string,
595 it has to be used with an object of g. In addition, note that
596 the angle bracket quotes should not be part of s. These will be
597 added if g is written in concrete DOT format.
598
599 ishtml(s : string) : int
600 returns non-zero if and only if s is an HTML string.
601
602 xOf(s : string) : string
603 returns the string "x" if s has the form "x,y", where both x and
604 y are numeric.
605
606 yOf(s : string) : string
607 returns the string "y" if s has the form "x,y", where both x and
608 y are numeric.
609
610 llOf(s : string) : string
611 returns the string "llx,lly" if s has the form
612 "llx,lly,urx,ury", where all of llx, lly, urx, and ury are
613 numeric.
614
615 urOf(s)
616 urOf(s : string) : string returns the string "urx,ury" if s has
617 the form "llx,lly,urx,ury", where all of llx, lly, urx, and ury
618 are numeric.
619
620 sscanf(s : string, fmt : string, ...) : int
621 scans the string s, extracting values according to the sscanf(3)
622 format fmt. The values are stored in the addresses following
623 fmt, addresses having the form &v, where v is some declared
624 variable of the correct type. Returns the number of items suc‐
625 cessfully scanned.
626
627 split(s : string, arr : array, seps : string) : int
628
629 split(s : string, arr : array) : int
630
631 tokens(s : string, arr : array, seps : string) : int
632
633 tokens(s : string, arr : array) : int
634 The split function breaks the string s into fields, while the
635 tokens function breaks the string into tokens. A field consists
636 of all non-separator characters between two separator characters
637 or the beginning or end of the string. Thus, a field may be the
638 empty string. A token is a maximal, non-empty substring not con‐
639 taining a separator character. The separator characters are
640 those given in the seps argument. If seps is not provided, the
641 default value is " \t\n". The functions return the number of
642 fields or tokens.
643
644 The fields and tokens are stored in the argument array. The
645 array must be string-valued and, if an index type is specified,
646 it must be int. The entries are indexed by consecutive integers,
647 starting at 0. Any values already stored in the array will be
648 either overwritten, or still be present after the function
649 returns.
650
651 I/O
652 print(...) : void
653 print( expr, ... ) prints a string representation of each argu‐
654 ment in turn onto stdout, followed by a newline.
655
656 printf(fmt : string, ...) : int
657
658 printf(fd : int, fmt : string, ...) : int
659 prints the string resulting from formatting the values of the
660 expressions following fmt according to the printf(3) format fmt.
661 Returns 0 on success. By default, it prints on stdout. If the
662 optional integer fd is given, output is written on the open
663 stream associated with fd.
664
665 scanf(fmt : string, ...) : int
666
667 scanf(fd : int, fmt : string, ...) : int
668 scans in values from an input stream according to the scanf(3)
669 format fmt. The values are stored in the addresses following
670 fmt, addresses having the form &v, where v is some declared
671 variable of the correct type. By default, it reads from stdin.
672 If the optional integer fd is given, input is read from the open
673 stream associated with fd. Returns the number of items success‐
674 fully scanned.
675
676 openF(s : string, t : string) : int
677 opens the file s as an I/O stream. The string argument t speci‐
678 fies how the file is opened. The arguments are the same as for
679 the C function fopen(3). It returns an integer denoting the
680 stream, or -1 on error.
681
682 As usual, streams 0, 1 and 2 are already open as stdin, stdout,
683 and stderr, respectively. Since gvpr may use stdin to read the
684 input graphs, the user should avoid using this stream.
685
686 closeF(fd : int) : int
687 closes the open stream denoted by the integer fd. Streams 0, 1
688 and 2 cannot be closed. Returns 0 on success.
689
690 readL(fd : int) : string
691 returns the next line read from the input stream fd. It returns
692 the empty string "" on end of file. Note that the newline char‐
693 acter is left in the returned string.
694
695 Math
696 exp(d : double) : double
697 returns e to the dth power.
698
699 log(d : double) : double
700 returns the natural log of d.
701
702 sqrt(d : double) : double
703 returns the square root of the double d.
704
705 pow(d : double, x : double) : double
706 returns d raised to the xth power.
707
708 cos(d : double) : double
709 returns the cosine of d.
710
711 sin(d : double) : double
712 returns the sine of d.
713
714 atan2(y : double, x : double) : double
715 returns the arctangent of y/x in the range -pi to pi.
716
717 MIN(y : double, x : double) : double
718 returns the minimum of y and x.
719
720 MAX(y : double, x : double) : double
721 returns the maximum of y and x.
722
723 Associative Arrays
724 # arr : int
725 returns the number of elements in the array arr.
726
727 idx in arr : int
728 returns 1 if a value has been set for index idx in the array
729 arr. It returns 0 otherwise.
730
731 unset(v : array, idx) : int
732 removes the item indexed by idx. It returns 1 if the item
733 existed, 0 otherwise.
734
735 unset(v : array) : void
736 re-initializes the array.
737
738 Miscellaneous
739 exit(v : int) : void
740 causes gvpr to exit with the exit code v.
741
742 system(cmd : string) : int
743 provides the standard C function system(3). It executes cmd in
744 the user's shell environment, and returns the exit status of the
745 shell.
746
747 rand() : double
748 returns a pseudo‐random double between 0 and 1.
749
750 srand() : int
751
752 srand(v : int) : int
753 sets a seed for the random number generator. The optional argu‐
754 ment gives the seed; if it is omitted, the current time is used.
755 The previous seed value is returned. srand should be called
756 before any calls to rand.
757
758 colorx(color : string, fmt : string) : string
759 translates a color from one format to another. The color argu‐
760 ment should be a color in one of the recognized string represen‐
761 tations. The fmt value should be one of "RGB", "RGBA", "HSV", or
762 "HSVA". An empty string is returned on error.
763
765 gvpr provides certain special, built‐in variables, whose values are set
766 automatically by gvpr depending on the context. Except as noted, the
767 user cannot modify their values.
768
769 $ : obj_t
770 denotes the current object (node, edge, graph) depending on the
771 context. It is not available in BEGIN or END clauses.
772
773 $F : string
774 is the name of the current input file.
775
776 $G : graph_t
777 denotes the current graph being processed. It is not available
778 in BEGIN or END clauses.
779
780 $NG : graph_t
781 denotes the next graph to be processed. If $NG is NULL, the cur‐
782 rent graph $G is the last graph. Note that if the input comes
783 from stdin, the last graph cannot be determined until the input
784 pipe is closed. It is not available in BEGIN or END clauses, or
785 if the -n flag is used.
786
787 $O : graph_t
788 denotes the output graph. Before graph traversal, it is initial‐
789 ized to the target graph. After traversal and any END_G actions,
790 if it refers to a non‐empty graph, that graph is printed onto
791 the output stream. It is only valid in N, E and END_G clauses.
792 The output graph may be set by the user.
793
794 $T : graph_t
795 denotes the current target graph. It is a subgraph of $G and is
796 available only in N, E and END_G clauses.
797
798 $tgtname : string
799 denotes the name of the target graph. By default, it is set to
800 "gvpr_result". If used multiple times during the execution of
801 gvpr, the name will be appended with an integer. This variable
802 may be set by the user.
803
804 $tvroot : node_t
805 indicates the starting node for a (directed or undirected)
806 depth‐first or breadth‐first traversal of the graph (cf. $tvtype
807 below). The default value is NULL for each input graph. After
808 the traversal at the given root, if the value of $tvroot has
809 changed, a new traversal will begin with the new value of
810 $tvroot. Also, set $tvnext below.
811
812 $tvnext : node_t
813 indicates the next starting node for a (directed or undirected)
814 depth‐first or breadth‐first traversal of the graph (cf. $tvtype
815 below). If a traversal finishes and the $tvroot but the $tvnext
816 has been set but not used, this node will be used as the next
817 choice for $tvroot. The default value is NULL for each input
818 graph.
819
820 $tvedge : edge_t
821 For BFS and DFS traversals, this is set to the edge used to
822 arrive at the current node or edge. At the beginning of a tra‐
823 versal, or for other traversal types, the value is NULL.
824
825 $tvtype : tvtype_t
826 indicates how gvpr traverses a graph. It can only take one of
827 the constant values with the previx "TV_" described below.
828 TV_flat is the default.
829
830 In the underlying graph library cgraph(3), edges in undirected
831 graphs are given an arbitrary direction. This is used for tra‐
832 versals, such as TV_fwd, requiring directed edges.
833
834 ARGC : int
835 denotes the number of arguments specified by the -a args com‐
836 mand‐line argument.
837
838 ARGV : string array
839 denotes the array of arguments specified by the -a args command‐
840 line argument. The ith argument is given by ARGV[i].
841
843 There are several symbolic constants defined by gvpr.
844
845 NULL : obj_t
846 a null object reference, equivalent to 0.
847
848 TV_flat : tvtype_t
849 a simple, flat traversal, with graph objects visited in seem‐
850 ingly arbitrary order.
851
852 TV_ne : tvtype_t
853 a traversal which first visits all of the nodes, then all of the
854 edges.
855
856 TV_en : tvtype_t
857 a traversal which first visits all of the edges, then all of the
858 nodes.
859
860 TV_dfs : tvtype_t
861 TV_postdfs : tvtype_t
862 TV_prepostdfs : tvtype_t
863 a traversal of the graph using a depth‐first search on the
864 underlying undirected graph. To do the traversal, gvpr will
865 check the value of $tvroot. If this has the same value that it
866 had previously (at the start, the previous value is initialized
867 to NULL.), gvpr will simply look for some unvisited node and
868 traverse its connected component. On the other hand, if $tvroot
869 has changed, its connected component will be toured, assuming it
870 has not been previously visited or, if $tvroot is NULL, the tra‐
871 versal will stop. Note that using TV_dfs and $tvroot, it is pos‐
872 sible to create an infinite loop.
873
874 By default, the traversal is done in pre-order. That is, a node
875 is visited before all of its unvisited edges. For TV_postdfs,
876 all of a node's unvisited edges are visited before the node. For
877 TV_prepostdfs, a node is visited twice, before and after all of
878 its unvisited edges.
879
880 TV_fwd : tvtype_t
881 TV_postfwd : tvtype_t
882 TV_prepostfwd : tvtype_t
883 A traversal of the graph using a depth‐first search on the graph
884 following only forward arcs. The choice of roots for the tra‐
885 versal is the same as described for TV_dfs above. The different
886 order of visitation specified by TV_fwd, TV_postfwd and TV_pre‐
887 postfwd are the same as those specified by the analogous traver‐
888 sals TV_dfs, TV_postdfs and TV_prepostdfs.
889
890 TV_rev : tvtype_t
891 TV_postrev : tvtype_t
892 TV_prepostrev : tvtype_t
893 A traversal of the graph using a depth‐first search on the graph
894 following only reverse arcs. The choice of roots for the tra‐
895 versal is the same as described for TV_dfs above. The different
896 order of visitation specified by TV_rev, TV_postrev and TV_pre‐
897 postrev are the same as those specified by the analogous traver‐
898 sals TV_dfs, TV_postdfs and TV_prepostdfs.
899
900 TV_bfs : tvtype_t
901 A traversal of the graph using a breadth‐first search on the
902 graph ignoring edge directions. See the item on TV_dfs above for
903 the role of $tvroot.
904
906 gvpr -i 'N[color=="blue"]' file.gv
907
908 Generate the node‐induced subgraph of all nodes with color blue.
909
910 gvpr -c 'N[color=="blue"]{color = "red"}' file.gv
911
912 Make all blue nodes red.
913
914 BEGIN { int n, e; int tot_n = 0; int tot_e = 0; }
915 BEG_G {
916 n = nNodes($G);
917 e = nEdges($G);
918 printf ("%d nodes %d edges %s\n", n, e, $G.name);
919 tot_n += n;
920 tot_e += e;
921 }
922 END { printf ("%d nodes %d edges total\n", tot_n, tot_e) }
923
924 Version of the program gc.
925
926 gvpr -c ""
927
928 Equivalent to nop.
929
930 BEG_G { graph_t g = graph ("merge", "S"); }
931 E {
932 node_t h = clone(g,$.head);
933 node_t t = clone(g,$.tail);
934 edge_t e = edge(t,h,"");
935 e.weight = e.weight + 1;
936 }
937 END_G { $O = g; }
938
939 Produces a strict version of the input graph, where the weight
940 attribute of an edge indicates how many edges from the input graph the
941 edge represents.
942
943 BEGIN {node_t n; int deg[]}
944 E{deg[head]++; deg[tail]++; }
945 END_G {
946 for (deg[n]) {
947 printf ("deg[%s] = %d\n", n.name, deg[n]);
948 }
949 }
950
951 Computes the degrees of nodes with edges.
952
953 BEGIN {
954 int i, indent;
955 int seen[string];
956 void prInd (int cnt) {
957 for (i = 0; i < cnt; i++) printf (" ");
958 }
959 }
960 BEG_G {
961
962 $tvtype = TV_prepostfwd;
963 $tvroot = node($,ARGV[0]);
964 }
965 N {
966 if (seen[$.name]) indent--;
967 else {
968 prInd(indent);
969 print ($.name);
970 seen[$.name] = 1;
971 indent++;
972 }
973 }
974
975 Prints the depth-first traversal of the graph, starting with the node
976 whose name is ARGV[0], as an indented list.
977
979 GVPRPATH
980 Colon‐separated list of directories to be searched to find the
981 file specified by the -f option. gvpr has a default list built
982 in. If GVPRPATH is not defined, the default list is used. If
983 GVPRPATH starts with colon, the list is formed by appending
984 GVPRPATH to the default list. If GVPRPATH ends with colon, the
985 list is formed by appending the default list to GVPRPATH. Other‐
986 wise, GVPRPATH is used for the list.
987
988 On Windows systems, replace ``colon'' with ``semicolon'' in the previ‐
989 ous paragraph.
990
992 Scripts should be careful deleting nodes during N{} and E{} blocks
993 using BFS and DFS traversals as these rely on stacks and queues of
994 nodes.
995
996 When the program is given as a command line argument, the usual shell
997 interpretation takes place, which may affect some of the special names
998 in gvpr. To avoid this, it is best to wrap the program in single
999 quotes.
1000
1001 If string constants contain pattern metacharacters that you want to
1002 escape to avoid pattern matching, two backslashes will probably be nec‐
1003 essary, as a single backslash will be lost when the string is origi‐
1004 nally scanned. Usually, it is simpler to use strcmp to avoid pattern
1005 matching.
1006
1007 As of 24 April 2008, gvpr switched to using a new, underlying graph
1008 library, which uses the simpler model that there is only one copy of a
1009 node, not one copy for each subgraph logically containing it. This
1010 means that iterators such as nxtnode cannot traverse a subgraph using
1011 just a node argument. For this reason, subgraph traversal requires new
1012 functions ending in "_sg", which also take a subgraph argument. The
1013 versions without that suffix will always traverse the root graph.
1014
1015 There is a single global scope, except for formal function parameters,
1016 and even these can interfere with the type system. Also, the extent of
1017 all variables is the entire life of the program. It might be prefer‐
1018 able for scope to reflect the natural nesting of the clauses, or for
1019 the program to at least reset locally declared variables. For now, it
1020 is advisable to use distinct names for all variables.
1021
1022 If a function ends with a complex statement, such as an IF statement,
1023 with each branch doing a return, type checking may fail. Functions
1024 should use a return at the end.
1025
1026 The expr library does not support string values of (char*)0. This
1027 means we can't distinguish between "" and (char*)0 edge keys. For the
1028 purposes of looking up and creating edges, we translate "" to be
1029 (char*)0, since this latter value is necessary in order to look up any
1030 edge with a matching head and tail.
1031
1032 Related to this, strings converted to integers act like char pointers,
1033 getting the value 0 or 1 depending on whether the string consists
1034 solely of zeroes or not. Thus, the ((int)"2") evaluates to 1.
1035
1036 The language inherits the usual C problems such as dangling references
1037 and the confusion between '=' and '=='.
1038
1040 Emden R. Gansner <erg@research.att.com>
1041
1043 awk(1), gc(1), dot(1), nop(1), expr(3), cgraph(3)
1044
1045
1046
1047 4 May 2012 GVPR(1)