CTAGS-CLIENT-TOOLS(7) Universal Ctags CTAGS-CLIENT-TOOLS(7)

6 ctags-client-tools - Hints for developing a tool using ctags command
7 and tags output
8
10 ctags [options] [file(s)]
11 etags [options] [file(s)]
12
13
A client tool is a tool that runs the ctags command and/or reads a
tags file generated by the ctags command. This man page gathers hints
for people who develop client tools.
18
20 Pseudo-tags, stored in a tag file, indicate how ctags generated the
21 tags file: whether the tags file is sorted or not, which version of
22 tags file format is used, the name of tags generator, and so on. The
23 opposite term for pseudo-tags is regular-tags. A regular-tag is for a
24 language object in an input file. A pseudo-tag is for the tags file it‐
25 self. Client tools may use pseudo-tags as reference for processing reg‐
26 ular-tags.
27
A pseudo-tag is stored in a tags file in the same format as
regular-tags, as described in tags(5), except that pseudo-tag names are
prefixed with "!_". For general information about pseudo-tags, see
"TAG FILE INFORMATION" in tags(5).
32
33 An example of a pseudo tag:
34
35 !_TAG_PROGRAM_NAME Universal Ctags /Derived from Exuberant Ctags/
36
37 The value, "Universal Ctags", associated with the pseudo tag TAG_PRO‐
38 GRAM_NAME, is used in the field for input file. The description, "De‐
39 rived from Exuberant Ctags", is used in the field for pattern.
40
41 Universal Ctags extends the naming scheme of the classical pseudo-tags
42 available in Exuberant Ctags for emitting language specific information
43 as pseudo tags:
44
45 !_{pseudo-tag-name}!{language-name} {associated-value} /{description}/
46
47 The language-name is appended to the pseudo-tag name with a separator,
48 "!".
49
50 An example of pseudo tag with a language suffix:
51
52 !_TAG_KIND_DESCRIPTION!C f,function /function definitions/
53
This pseudo-tag says "the function kind of the C language was enabled
when this tags file was generated." The --pseudo-tags option enables or
disables individual pseudo-tags. When using it, specify only the tag
name (TAG_KIND_DESCRIPTION here), without the prefix ("!_") or the
language suffix ("!C").
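
The naming scheme above can be taken apart mechanically. The following
is a minimal sketch in Python, not part of ctags; the names PseudoTag and
parse_pseudo_tag are hypothetical. It assumes the line comes from a tags
file in which no regular tag name starts with "!_".

```python
from typing import NamedTuple, Optional


class PseudoTag(NamedTuple):
    name: str                # e.g. "TAG_KIND_DESCRIPTION", without "!_"
    language: Optional[str]  # e.g. "C", or None when there is no suffix
    value: str               # the field normally holding the input file
    description: str         # the field normally holding the pattern


def parse_pseudo_tag(line: str) -> Optional[PseudoTag]:
    """Split one pseudo-tag line; return None for other lines."""
    if not line.startswith("!_"):
        return None
    parts = line.rstrip("\n").split("\t")
    if len(parts) < 3:
        return None
    full_name, value, pattern = parts[0], parts[1], parts[2]
    # Everything after the first "!" separator is the suffix. For
    # TAG_ROLE_DESCRIPTION the suffix is "{language}!{kind}" and is
    # returned unsplit here.
    name, _, language = full_name[2:].partition("!")
    if pattern.endswith(';"'):
        pattern = pattern[:-2]
    if len(pattern) >= 2 and pattern[0] == "/" and pattern[-1] == "/":
        description = pattern[1:-1]
    else:
        description = pattern
    return PseudoTag(name, language or None, value, description)
```

For the example line above, the sketch yields the name
TAG_KIND_DESCRIPTION, the language C, the value "f,function", and the
description "function definitions".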
59
60 Options for Pseudo-tags
61 --extras=+p (or --extras=+{pseudo})
62 Forces writing pseudo-tags.
63
ctags emits pseudo-tags by default when writing tags to a regular
file (e.g. "tags"). However, when -o - or -f - is specified to
write tags to standard output, ctags doesn't emit pseudo-tags.
--extras=+p or --extras=+{pseudo} forces pseudo-tags to be
written.
69
70 --list-pseudo-tags
71 Lists available types of pseudo-tags and shows whether they are
72 enabled or disabled.
73
Running ctags with the --list-pseudo-tags option lists the
available pseudo-tags. Some of the pseudo-tags newly introduced
in the Universal Ctags project are disabled by default. Use
--pseudo-tags=... to enable them.
78
79 --pseudo-tags=[+|-]names|*
80 Specifies a list of pseudo-tag types to include in the output.
81
The parameter is a set of pseudo-tag names. Valid pseudo-tag
names can be listed with --list-pseudo-tags. Surround each name
in the set with braces, like "{TAG_PROGRAM_AUTHOR}". You don't
have to include the "!_" pseudo-tag prefix when specifying a
name in the argument of the --pseudo-tags option.

Pseudo-tags don't have a notation using one-letter flags.

If a name is preceded by '+' or '-', that pseudo-tag is added to
or removed from the current set. Otherwise the names replace any
current settings. All pseudo-tags are included if '*' is given.
94
95 --fields=+E (or --fields=+{extras})
96 Attach "extras:pseudo" field to pseudo-tags.
97
98 An example of pseudo tags with the field:
99
100 !_TAG_PROGRAM_NAME Universal Ctags /Derived from Exuberant Ctags/ extras:pseudo
101
If the name of a regular tag in a tags file starts with "!_", a
client tool cannot otherwise tell whether the tag is a
regular-tag or a pseudo-tag. The field attached with this option
lets the tool distinguish them.
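
A minimal sketch of such a check in Python follows. The function name
has_pseudo_extra is hypothetical. The sketch assumes that the tags file
was generated with --fields=+E and that an extras: field may list
several extra names separated by commas. It also ignores the
possibility of literal tabs inside the pattern field.

```python
def has_pseudo_extra(line: str) -> bool:
    """Return True if the line carries "pseudo" in its extras: field."""
    # Skip the three fixed columns (name, input, pattern) and look at
    # the extension fields, which have the form "key:value".
    for field in line.rstrip("\n").split("\t")[3:]:
        key, sep, value = field.partition(":")
        if sep and key == "extras" and "pseudo" in value.split(","):
            return True
    return False
```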
106
107 List of notable pseudo-tags
108 Running ctags with --list-pseudo-tags option lists available types of
109 pseudo-tags with short descriptions. This subsection shows hints for
110 using notable ones.
111
112 TAG_EXTRA_DESCRIPTION (new in Universal Ctags)
113 Indicates the names and descriptions of enabled extras:
114
115 !_TAG_EXTRA_DESCRIPTION {extra-name} /description/
116 !_TAG_EXTRA_DESCRIPTION!{language-name} {extra-name} /description/
117
118 If your tool relies on some extra tags (extras), refer to the
119 pseudo-tags of this type. A tool can reject the tags file that
120 doesn't include expected extras, and raise an error in an early
121 stage of processing.
122
123 An example of the pseudo-tags:
124
125 $ ctags --extras=+p --pseudo-tags='{TAG_EXTRA_DESCRIPTION}' -o - input.c
126 !_TAG_EXTRA_DESCRIPTION anonymous /Include tags for non-named objects like lambda/
127 !_TAG_EXTRA_DESCRIPTION fileScope /Include tags of file scope/
128 !_TAG_EXTRA_DESCRIPTION pseudo /Include pseudo tags/
129 !_TAG_EXTRA_DESCRIPTION subparser /Include tags generated by subparsers/
130 ...
131
From the output, a client tool can tell that the "{anonymous}",
"{fileScope}", "{pseudo}", and "{subparser}" extras are enabled.

Universal Ctags version 6.0 turns on this pseudo tag by default.
137
138 TAG_FIELD_DESCRIPTION (new in Universal Ctags)
139 Indicates the names and descriptions of enabled fields:
140
141 !_TAG_FIELD_DESCRIPTION {field-name} /description/
142 !_TAG_FIELD_DESCRIPTION!{language-name} {field-name} /description/
143
144 If your tool relies on some fields, refer to the pseudo-tags of
145 this type. A tool can reject a tags file that doesn't include
146 expected fields, and raise an error in an early stage of pro‐
147 cessing.
148
149 An example of the pseudo-tags:
150
151 $ ctags --fields-C=+'{macrodef}' --extras=+p --pseudo-tags='{TAG_FIELD_DESCRIPTION}' -o - input.c
152 !_TAG_FIELD_DESCRIPTION file /File-restricted scoping/
153 !_TAG_FIELD_DESCRIPTION input /input file/
154 !_TAG_FIELD_DESCRIPTION name /tag name/
155 !_TAG_FIELD_DESCRIPTION pattern /pattern/
156 !_TAG_FIELD_DESCRIPTION typeref /Type and name of a variable or typedef/
157 !_TAG_FIELD_DESCRIPTION!C macrodef /macro definition/
158 ...
159
From the output, a client tool can tell that the "{file}",
"{input}", "{name}", "{pattern}", and "{typeref}" fields are
enabled. These fields are common to all languages. In addition to
the common fields, the tool can tell that the "{macrodef}" field
of the C language is also enabled.

Universal Ctags version 6.0 turns on this pseudo tag by default.
168
169 TAG_FILE_ENCODING (new in Universal Ctags)
170 TBW
171
172 TAG_FILE_FORMAT
173 See also tags(5).
174
175 TAG_FILE_SORTED
176 See also tags(5).
177
178 TAG_KIND_DESCRIPTION (new in Universal Ctags)
179 Indicates the names and descriptions of enabled kinds:
180
181 !_TAG_KIND_DESCRIPTION!{language-name} {kind-letter},{kind-name} /description/
182
183 If your tool relies on some kinds, refer to the pseudo-tags of
184 this type. A tool can reject the tags file that doesn't include
185 expected kinds, and raise an error in an early stage of process‐
186 ing.
187
188 Kinds are language specific, so a language name is always ap‐
189 pended to the tag name as suffix.
190
191 An example of the pseudo-tags:
192
193 $ ctags --extras=+p --kinds-C=vfm --pseudo-tags='{TAG_KIND_DESCRIPTION}' -o - input.c
194 !_TAG_KIND_DESCRIPTION!C f,function /function definitions/
195 !_TAG_KIND_DESCRIPTION!C m,member /struct, and union members/
196 !_TAG_KIND_DESCRIPTION!C v,variable /variable definitions/
197 ...
198
From the output, a client tool can tell that the "{function}",
"{member}", and "{variable}" kinds of the C language are enabled.
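
The early rejection described above could look like the following
Python sketch. The names enabled_kinds and missing_kinds are
hypothetical, and the sketch assumes the lines of the tags file are
already available as strings.

```python
from typing import Iterable, Set, Tuple

KIND_PREFIX = "!_TAG_KIND_DESCRIPTION!"


def enabled_kinds(lines: Iterable[str]) -> Set[Tuple[str, str]]:
    """Collect (language, kind-name) pairs from the pseudo-tags."""
    kinds = set()
    for line in lines:
        if not line.startswith(KIND_PREFIX):
            continue
        parts = line.rstrip("\n").split("\t")
        if len(parts) < 2:
            continue
        language = parts[0][len(KIND_PREFIX):]
        # The value has the form "{kind-letter},{kind-name}".
        _letter, _, kind_name = parts[1].partition(",")
        kinds.add((language, kind_name))
    return kinds


def missing_kinds(lines: Iterable[str],
                  required: Iterable[Tuple[str, str]]) -> Set[Tuple[str, str]]:
    """Return the required (language, kind-name) pairs that are absent."""
    return set(required) - enabled_kinds(lines)
```

A tool would raise its error when missing_kinds returns a non-empty
set. The same approach should carry over to TAG_FIELD_DESCRIPTION and
TAG_EXTRA_DESCRIPTION, where the language suffix is optional and the
whole value is the field or extra name.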
201
Universal Ctags version 6.0 turns on this pseudo tag by default.
204
205 TAG_KIND_SEPARATOR (new in Universal Ctags)
206 TBW
207
208 TAG_OUTPUT_EXCMD (new in Universal Ctags)
Indicates the type of EX command specified with the --excmd option.
210
211 TAG_OUTPUT_FILESEP (new in Universal Ctags)
212 TBW
213
214 TAG_OUTPUT_MODE (new in Universal Ctags)
215 TBW
216
217 TAG_OUTPUT_VERSION (new in Universal Ctags 6.0)
218 Indicates the language-common interface version of the output:
219
220 !_TAG_OUTPUT_VERSION {current}.{age} /.../
221
The public interface includes common fields, common extras, and
pseudo tags.

The maintainers of Universal Ctags may update the numbers
"{current}" and "{age}" in the same manner as explained for
TAG_PARSER_VERSION.
228
229 TAG_PARSER_VERSION (new in Universal Ctags 6.0)
230 Indicates the interface version of the parser:
231
232 !_TAG_PARSER_VERSION!{language-name} {current}.{age} /.../
233
234 The public interfaces include kinds, roles, language specific
235 fields, and language specific extras.
236
The maintainer of the parser for "{language-name}" may update
the numbers "{current}" and "{age}" according to the following
rules:
239
240 • If kinds, roles, language specific fields, and/or language
241 specific extras have been added, removed or changed since last
242 release, increment "{current}".
243
244 • If they have been added since last release, increment "{age}".
245
246 • If they have been removed since last release, set "{age}" to
247 0.
248
This concept is based on the versioning in libtool (see "7.2
Libtool’s versioning system"). Universal Ctags simplifies the
concept by dropping "revision" from the libtool scheme.
253
254 Manual pages for languages may document changes that increase
255 the number of "{current}".
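
This man page does not say how a client tool should compare the
numbers. One possible reading, borrowed from libtool and therefore an
assumption, is that a parser reporting "{current}.{age}" still supports
every interface number from current minus age up to current. The
function names in this Python sketch are hypothetical.

```python
from typing import Tuple


def parse_interface_version(value: str) -> Tuple[int, int]:
    """Split a "{current}.{age}" value into two integers."""
    current, _, age = value.partition(".")
    return int(current), int(age)


def supports_interface(value: str, wanted: int) -> bool:
    """Check whether the interface number the tool was written for is
    within the range [current - age, current] (libtool convention,
    assumed here rather than documented by ctags)."""
    current, age = parse_interface_version(value)
    return current - age <= wanted <= current
```

Under this reading, a tool written against interface 2 accepts a parser
reporting "3.1" but not one reporting "3.0".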
256
257 TAG_PATTERN_LENGTH_LIMIT (new in Universal Ctags)
258 TBW
259
260 TAG_PROC_CWD (new in Universal Ctags)
261 Indicates the working directory of ctags during processing.
262
This pseudo-tag helps a client tool resolve the absolute paths
of the input files for tag entries, even when they are tagged
with relative paths.
266
267 An example of the pseudo-tags:
268
269 $ cat tags
270 !_TAG_PROC_CWD /tmp/ //
271 main input.c /^int main (void) { return 0; }$/;" f typeref:typename:int
272 ...
273
From the regular tag for "main", the client tool can tell that
"main" is in "input.c". However, that is a relative path. If
the directory where ctags ran and the directory where the client
tool runs differ, the client tool cannot find "input.c" in the
file system. In that case, TAG_PROC_CWD gives the tool a hint:
"input.c" may be in "/tmp".
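
A minimal Python sketch of this resolution follows. The function name
resolve_input_path is hypothetical, and the value of TAG_PROC_CWD is
assumed to have been read from the tags file already.

```python
import os.path
from typing import Optional


def resolve_input_path(input_field: str, proc_cwd: Optional[str]) -> str:
    """Make the input field absolute using the TAG_PROC_CWD value."""
    # Absolute paths need no help; without TAG_PROC_CWD there is
    # nothing to resolve against, so the field is returned unchanged.
    if os.path.isabs(input_field) or not proc_cwd:
        return input_field
    return os.path.normpath(os.path.join(proc_cwd, input_field))
```

The result is only a hint: the tool should still check that the file
exists, since the tags file may have been moved after it was generated.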
280
281 TAG_PROGRAM_NAME
282 Indicates the name of program generating this tags file.
283
284 TAG_PROGRAM_VERSION
285 Indicates the version of program generating this tags file.
286
287 TAG_ROLE_DESCRIPTION (new in Universal Ctags)
288 Indicates the names and descriptions of enabled roles:
289
290 !_TAG_ROLE_DESCRIPTION!{language-name}!{kind-name} {role-name} /description/
291
292 If your tool relies on some roles, refer to the pseudo-tags of
293 this type. Note that a role owned by a disabled kind is not
294 listed even if the role itself is enabled.
295
297 TBW
298
Universal Ctags can run multiple parsers on a single input file, so the
tags generated for one file may belong to different languages. The
language/l field shows the language of each tag.
303
304 $ cat /tmp/foo.html
305 <html>
306 <script>var x = 1</script>
307 <h1>title</h1>
308 </html>
309 $ ./ctags -o - --extras=+g /tmp/foo.html
310 title /tmp/foo.html /^ <h1>title<\/h1>$/;" h
311 x /tmp/foo.html /var x = 1/;" v
312 $ ./ctags -o - --extras=+g --fields=+l /tmp/foo.html
313 title /tmp/foo.html /^ <h1>title<\/h1>$/;" h language:HTML
314 x /tmp/foo.html /var x = 1/;" v language:JavaScript
315
See readtags(1) for how to use readtags. This section discusses some
notable topics for client tools.
319
320 Build Filter/Sorter Expressions
readtags recognizes certain escape sequences in expressions. For
example, a client tool searching for a tag named a\?b might use a
filter expression like '(eq? $name "a\?b")'. Because readtags
translates \? into a single ?, the expression actually searches for
a?b.

Another problem: if the client tool talks to readtags through a shell
rather than running it directly as a subprocess, a single quote that
appears in a filter expression (which is itself wrapped in single
quotes) terminates the expression. This produces a broken expression
and may even cause unintended shell injection. Single quotes can be
escaped using '"'"'.
331
So, inside the expressions, client tools need to:

• Replace \ with \\.

• Replace ' with '"'"', if they talk to readtags through a shell.

If the expression also contains strings, " in the strings needs to be
replaced with \".
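
The replacements above can be sketched in Python as follows. The
function names are hypothetical, and the filter built by name_filter is
only an illustration.

```python
def quote_string(s: str) -> str:
    """Render s as a string literal inside a readtags expression."""
    # Backslashes must be doubled first; otherwise the backslash added
    # in front of each double quote would be doubled as well.
    return '"' + s.replace("\\", "\\\\").replace('"', '\\"') + '"'


def quote_for_shell(expr: str) -> str:
    """Wrap a whole expression in single quotes for a POSIX shell."""
    return "'" + expr.replace("'", "'\"'\"'") + "'"


def name_filter(name: str) -> str:
    """Build a filter expression matching tags with exactly this name."""
    return "(eq? $name " + quote_string(name) + ")"
```

quote_for_shell is needed only when the command line is passed through
a shell. A tool that runs readtags directly as a subprocess with an
argument list can pass the result of name_filter unchanged.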
340
341 Another thing to notice is that missing fields are represented by #f,
342 and applying string operators to them will produce an error. You should
343 always check if a field is missing before applying string operators.
344 See the "Filtering" section in readtags(1) to know how to do this. Run
345 "readtags -H filter" to see which operators take string arguments.
346
347 Build Filter/Sorter Expressions using Lisp Languages
348 Client tools written in Lisp could build the expression using lists.
349 prin1 (in Common Lisp style Lisps) and write (in Scheme style Lisps)
350 can translate the list into a string that can be directly used. For ex‐
351 ample, in EmacsLisp:
352
353 (let ((name "hi"))
354 (prin1 `(eq? $name ,name)))
355 => "(eq\\? $name "hi")"
356
357 The "?" is escaped, and readtags can handle it.
358
Escape sequences produced by write in Scheme style Lisps are exactly
those supported by readtags, so any legal readtags expression can be
used. Common Lisp style Lisps may produce escape sequences that
readtags does not recognize, like \#, so symbols that contain "#" can't
be used. Readtags provides some aliases for these Lisps, so they should:
364
365 • Use true for #t.
366
367 • Use false for #f.
368
369 • Use nil or () for ().
370
371 • Use (string->regexp "PATTERN") for #/PATTERN/. Use (string->regexp
372 "PATTERN" :case-fold true) for #/PATTERN/i. Notice that string->reg‐
373 exp doesn't require escaping "/" in the pattern.
374
375 Notice that if the client tool talks to readtags through a shell, then
376 in the produced string, ' still needs to be replaced by '"'"' to pre‐
377 vent broken expressions and shell injection.
378
379 Parse Readtags Output
380 In the output of readtags, tabs can appear in all field values (e.g.,
381 the tag name itself could contain tabs), which makes it hard to split
382 the line into fields. Client tools should use the -E option, which
383 keeps the escape sequences in the tags file, so the only field that
384 could contain tabs is the pattern field.
385
386 The pattern field could:
387
388 • Use a line number. It will look like number;" (e.g. 10;").
389
390 • Use a search pattern. It will look like /pattern/;" or ?pattern?;".
391 Notice that the search pattern could contain tabs.
392
393 • Combine these two, like number;/pattern/;" or number;?pattern?;".
394
395 These are true for tags files using extended format, which is the de‐
396 fault one. The legacy format (i.e. --format=1) doesn't include the
397 semicolons. It's old and barely used, so we won't discuss it here.
398
399 Client tools could split the line using the following steps:
400
401 • Find the first 2 tabs in the line, so we get the name and input
402 field.
403
404 • From the 2nd tab:
405
406 • If a / follows, then the pattern delimiter is /.
407
408 • If a ? follows, then the pattern delimiter is ?.
409
410 • If a number follows, then:
411
412 • If a ;/ follows the number, then the delimiter is /.
413
414 • If a ;? follows the number, then the delimiter is ?.
415
416 • If a ;" follows the number, then the field uses only line num‐
417 ber, and there's no pattern delimiter (since there's no regex
418 pattern). In this case the pattern field ends at the 3rd tab.
419
420 • After the opening delimiter, find the next unescaped pattern delim‐
421 iter, and that's the closing delimiter. It will be followed by ;" and
422 then a tab. That's the end of the pattern field. By "unescaped pat‐
423 tern delimiter", we mean there's an even number (including 0) of
424 backslashes before it.
425
426 • From here, split the rest of the line into fields by tabs.
427
428 Then, the escape sequences in fields other than the pattern field
429 should be translated. See "Proposal" in tags(5) to know about all the
430 escape sequences.
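
The splitting steps above can be sketched in Python as follows. The
function name split_tag_line is hypothetical. The sketch assumes
extended-format output produced with -E, handles regular tags only, and
leaves the translation of escape sequences in the other fields to the
caller.

```python
import re
from typing import List, Tuple


def split_tag_line(line: str) -> Tuple[str, str, str, List[str]]:
    """Split a line into (name, input, pattern, extension fields)."""
    line = line.rstrip("\n")
    # The first two tabs delimit the name and input fields.
    name, input_file, rest = line.split("\t", 2)

    number = re.match(r"\d+", rest)
    pos = number.end() if number else 0
    if number and rest.startswith(';"', pos):
        # Line number only: there is no pattern delimiter.
        end = pos + 2
    else:
        if number:
            if not rest.startswith(";", pos):
                raise ValueError("malformed pattern field")
            pos += 1
        delim = rest[pos:pos + 1]
        if delim not in ("/", "?"):
            raise ValueError("unknown pattern delimiter")
        i = pos + 1
        while i < len(rest):
            if rest[i] == "\\":
                i += 2  # a backslash escapes the next character
                continue
            if rest[i] == delim:
                break   # the closing, unescaped delimiter
            i += 1
        else:
            raise ValueError("unterminated pattern")
        if not rest.startswith(';"', i + 1):
            raise ValueError('missing ;" after pattern')
        end = i + 3
    pattern = rest[:end]
    tail = rest[end:]
    # The tail is empty or starts with a tab, so drop the empty head.
    fields = tail.split("\t")[1:] if tail else []
    return name, input_file, pattern, fields
```

Skipping the character after each backslash while scanning from left to
right is equivalent to requiring an even number of backslashes before
the closing delimiter.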
431
432 Make Use of the Pattern Field
433 The pattern field specifies how to find a tag in its source file. The
434 code generating this field seems to have a long history, so there are
435 some pitfalls and it's a bit hard to handle. A client tool could simply
436 require the line: field and jump to the line it specifies, to avoid us‐
437 ing the pattern field. But anyway, we'll discuss how to make the best
438 use of it here.
439
440 You should take the words here merely as suggestions, and not stan‐
441 dards. A client tool could definitely develop better (or simpler) ways
442 to use the pattern field.
443
444 From the last section, we know the pattern field could contain a line
445 number and a search pattern. When it only contains the line number,
446 handling it is easy: you simply go to that line.
447
The search pattern resembles an EX command, but as we'll see later,
it's actually not a valid one, so some manual work is required to
process it.
451
452 The search pattern could look like /pat/, called "forward search pat‐
453 tern", or ?pat?, called "backward search pattern". Using a search pat‐
454 tern means even if the source file is updated, as long as the part con‐
455 taining the tag doesn't change, we could still locate the tag correctly
456 by searching.
457
When the pattern field contains only the search pattern, you just
search for it. The search direction (forward/backward) doesn't matter,
as it's decided solely by whether the -B option is enabled, and not by
the actual context. You could always start the search from, say, the
beginning of the file.

When both the search pattern and the line number are present, you
could make good use of the line number by going to the line first,
then searching for the nearest occurrence of the pattern. A way to do
this is to search both forward and backward for the pattern, and when
there is an occurrence on both sides, go to the nearer one.
469
The advantage is that when there are multiple identical lines in
the source file (e.g. the COMMON block in Fortran), this helps us
find the correct one, even after the source file is updated and the tag
position has shifted by a few lines.
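
The nearest-occurrence search could be sketched in Python as below. The
function name nearest_match is hypothetical. The sketch assumes the
source file has been read into a list of lines and that the search
pattern has already been converted to a compiled regular expression.

```python
import re
from typing import List, Optional


def nearest_match(lines: List[str], regex: "re.Pattern[str]",
                  line_number: int) -> Optional[int]:
    """Return the 1-based number of the matching line nearest to
    line_number, or None if no line matches."""
    if not lines:
        return None
    # Clamp the recorded line number into the current file.
    start = min(max(line_number, 1), len(lines)) - 1
    for distance in range(len(lines)):
        # Look one step further in both directions on each round.
        for index in (start - distance, start + distance):
            if 0 <= index < len(lines) and regex.search(lines[index]):
                return index + 1
    return None
```

When two occurrences are equally near, this sketch picks the earlier
line. That is an arbitrary choice.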
474
475 Now let's discuss how to search for the pattern. After you trim the /
476 or ? around it, the pattern resembles a regex pattern. It should be a
477 regex pattern, as required by being a valid EX command, but it's actu‐
478 ally not, as you'll see below.
479
480 It could begin with a ^, which means the pattern starts from the begin‐
481 ning of a line. It could also end with an unescaped $ which means the
482 pattern ends at the end of a line. Let's keep this information, and
483 trim them too.
484
485 Now the remaining part is the actual string containing the tag. Some
486 characters are escaped:
487
• \ (the backslash itself).
489
490 • $, but only at the end of the string.
491
492 • /, but only in forward search patterns.
493
494 • ?, but only in backward search patterns.
495
You need to unescape these to get the literal string. Now you could
convert this literal string to a regexp that matches it (by escaping,
like re.escape in Python or regexp-quote in Elisp), add back ^ and/or $
if the pattern originally had them, and finally search for the tag
using this regexp.
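
These steps could be sketched in Python as follows. The function name
pattern_to_regex is hypothetical. Its argument is assumed to be the
search pattern including its delimiters, with any line number and the
trailing ;" already removed.

```python
import re


def pattern_to_regex(pattern: str) -> "re.Pattern[str]":
    """Convert "/pat/" or "?pat?" into a compiled regular expression."""
    if len(pattern) < 2 or pattern[0] not in "/?" or pattern[-1] != pattern[0]:
        raise ValueError("not a search pattern")
    delim = pattern[0]
    body = pattern[1:-1]

    anchored_start = body.startswith("^")
    if anchored_start:
        body = body[1:]

    # A trailing $ is an anchor only if it is unescaped, that is, if an
    # even number of backslashes precedes it.
    anchored_end = False
    if body.endswith("$"):
        before = body[:-1]
        backslashes = len(before) - len(before.rstrip("\\"))
        if backslashes % 2 == 0:
            anchored_end = True
            body = before

    # Undo the escaping of \, $ and the delimiter.
    literal = []
    i = 0
    while i < len(body):
        if body[i] == "\\" and i + 1 < len(body) and body[i + 1] in ("\\", "$", delim):
            literal.append(body[i + 1])
            i += 2
        else:
            literal.append(body[i])
            i += 1

    return re.compile(("^" if anchored_start else "")
                      + re.escape("".join(literal))
                      + ("$" if anchored_end else ""))
```

The resulting regexp is meant to be applied to one source line at a
time.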
501
502 Remark: About a Previous Format of the Pattern Field
In some earlier versions of Universal Ctags, the line number in the
pattern field was the actual line number minus one, for forward search
patterns; or plus one, for backward search patterns. The idea was to
resemble an EX command: you go to the line, then search forward/backward
for the pattern, and you can always find the correct one. But this
defeats the purpose of using a search pattern: to tolerate file updates.
For example, if the tag is at line 50, then under this scheme the
pattern field would be:
511
512 49;/pat/;"
513
Then let's assume that some code above is removed, and the tag is now
at line 45. Now you can't find it if you search forward from line 49.

For this reason, Universal Ctags now uses the actual line number. A
client tool could distinguish the two schemes by the TAG_OUTPUT_EXCMD
pseudo tag: it's "combine" for the old scheme, and "combineV2" for the
present scheme. But there's probably no need to treat them differently,
since "search for the nearest occurrence from the line" gives good
results with both schemes.
523
525 See ctags-json-output(5).
526
528 Version 6.0
529 • ctags enables TAG_KIND_DESCRIPTION, TAG_ROLE_DESCRIPTION,
530 TAG_FIELD_DESCRIPTION, and TAG_EXTRA_DESCRIPTION pseudo tags by de‐
531 fault.
532
533 • TAG_PARSER_VERSION is introduced.
534
536 ctags(1), ctags-lang-python(7), ctags-incompatibilities(7), tags(5),
537 ctags-json-output(5), readtags(1), 7.2 Libtool’s versioning system
538 <https://www.gnu.org/software/libtool/manual/libtool.html#Libtool-ver‐
539 sioning>
540

6.0.0 CTAGS-CLIENT-TOOLS(7)