CTAGS-CLIENT-TOOLS(7) Universal Ctags CTAGS-CLIENT-TOOLS(7)
2
3
4
6 ctags-client-tools - Hints for developing a tool using ctags command
7 and tags output
8
10 ctags [options] [file(s)]
11 etags [options] [file(s)]
12
13
A client tool is a tool that runs the ctags command and/or reads a
tags file generated by the ctags command. This man page gathers hints
for people who develop client tools.
18
20 Pseudo-tags, stored in a tag file, indicate how ctags generated the
21 tags file: whether the tags file is sorted or not, which version of
22 tags file format is used, the name of tags generator, and so on. The
23 opposite term for pseudo-tags is regular-tags. A regular-tag is for a
24 language object in an input file. A pseudo-tag is for the tags file it‐
25 self. Client tools may use pseudo-tags as reference for processing reg‐
26 ular-tags.
27
28 A pseudo-tag is stored in a tags file in the same format as regu‐
29 lar-tags as described in tags(5), except that pseudo-tag names are pre‐
30 fixed with "!_". For the general information about pseudo-tags, see
31 "TAG FILE INFORMATION" in tags(5).
32
33 An example of a pseudo tag:
34
35 !_TAG_PROGRAM_NAME Universal Ctags /Derived from Exuberant Ctags/
36
The value, "Universal Ctags", associated with the pseudo tag
"TAG_PROGRAM_NAME", is stored in the field for the input file. The
description, "Derived from Exuberant Ctags", is stored in the field for
the pattern.
40
41 Universal Ctags extends the naming scheme of the classical pseudo-tags
42 available in Exuberant Ctags for emitting language specific information
43 as pseudo tags:
44
45 !_{pseudo-tag-name}!{language-name} {associated-value} /{description}/
46
47 The language-name is appended to the pseudo-tag name with a separator,
48 "!".
49
50 An example of pseudo tag with a language suffix:
51
52 !_TAG_KIND_DESCRIPTION!C f,function /function definitions/
53
54 This pseudo-tag says "the function kind of C language is enabled when
55 generating this tags file." --pseudo-tags is the option for en‐
56 abling/disabling individual pseudo-tags. When enabling/disabling a
57 pseudo tag with the option, specify the tag name only "TAG_KIND_DE‐
58 SCRIPTION", without the prefix ("!_") or the suffix ("!C").
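
The naming scheme above can be taken apart mechanically. The following
is a minimal sketch in Python, not part of ctags itself; the function
name parse_pseudo_tag and the returned dictionary keys are hypothetical.
It assumes the line is tab-separated as in tags(5), and it treats every
line starting with "!_" as a pseudo-tag, which is only a heuristic.

```python
def parse_pseudo_tag(line):
    """Split one pseudo-tag line into its parts.

    Returns None when the line does not look like a pseudo-tag.
    """
    if not line.startswith("!_"):
        return None
    columns = line.rstrip("\n").split("\t")
    if len(columns) < 3:
        return None

    # The name column is "!_NAME", "!_NAME!LANG" or "!_NAME!LANG!KIND".
    parts = columns[0][2:].split("!")

    # The description sits in the pattern column, wrapped in slashes.
    description = columns[2]
    if description.endswith(';"'):
        description = description[:-2]
    if len(description) >= 2 and description[0] == "/" and description[-1] == "/":
        description = description[1:-1]

    return {
        "name": parts[0],
        "language": parts[1] if len(parts) > 1 else None,
        "kind": parts[2] if len(parts) > 2 else None,
        "value": columns[1],
        "description": description,
    }
```

The "name" entry is the form accepted by --pseudo-tags, that is, without
the "!_" prefix and without the language suffix.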
59
60 Options for Pseudo-tags
61 --extras=+p (or --extras=+{pseudo})
62 Forces writing pseudo-tags.
63
ctags emits pseudo-tags by default when writing tags to a regular
file (e.g. "tags"). However, when -o - or -f - is specified to
write tags to standard output, ctags doesn't emit pseudo-tags.
--extras=+p or --extras=+{pseudo} forces pseudo-tags to be
written.
69
70 --list-pseudo-tags
71 Lists available types of pseudo-tags and shows whether they are
72 enabled or disabled.
73
Running ctags with the --list-pseudo-tags option lists available
pseudo-tags. Some of the pseudo-tags newly introduced by the
Universal Ctags project are disabled by default. Use
--pseudo-tags=... to enable them.
78
79 --pseudo-tags=[+|-]names|*
80 Specifies a list of pseudo-tag types to include in the output.
81
82 The parameters are a set of pseudo tag names. Valid pseudo tag
83 names can be listed with --list-pseudo-tags. Surround each name
84 in the set with braces, like "{TAG_PROGRAM_AUTHOR}". You don't
85 have to include the "!_" pseudo tag prefix when specifying a
86 name in the option argument for --pseudo-tags= option.
87
Pseudo-tags don't have a notation using one-letter flags.

If a name is preceded by either the '+' or '-' character, that
pseudo-tag is added to or removed from the current set. Otherwise
the names replace any current settings. All entries are included
if '*' is given.
94
95 --fields=+E (or --fields=+{extras})
96 Attach "extras:pseudo" field to pseudo-tags.
97
98 An example of pseudo tags with the field:
99
100 !_TAG_PROGRAM_NAME Universal Ctags /Derived from Exuberant Ctags/ extras:pseudo
101
102 If the name of a normal tag in a tag file starts with "!_", a
103 client tool cannot distinguish whether the tag is a regular-tag
104 or pseudo-tag. The fields attached with this option help the
105 tool distinguish them.
106
107 List of notable pseudo-tags
108 Running ctags with --list-pseudo-tags option lists available types of
109 pseudo-tags with short descriptions. This subsection shows hints for
110 using notable ones.
111
112 TAG_EXTRA_DESCRIPTION (new in Universal Ctags)
113 Indicates the names and descriptions of enabled extras:
114
115 !_TAG_EXTRA_DESCRIPTION {extra-name} /description/
116 !_TAG_EXTRA_DESCRIPTION!{language-name} {extra-name} /description/
117
118 If your tool relies on some extra tags (extras), refer to the
119 pseudo-tags of this type. A tool can reject the tags file that
120 doesn't include expected extras, and raise an error in an early
121 stage of processing.
122
123 An example of the pseudo-tags:
124
125 $ ctags --extras=+p --pseudo-tags='{TAG_EXTRA_DESCRIPTION}' -o - input.c
126 !_TAG_EXTRA_DESCRIPTION anonymous /Include tags for non-named objects like lambda/
127 !_TAG_EXTRA_DESCRIPTION fileScope /Include tags of file scope/
128 !_TAG_EXTRA_DESCRIPTION pseudo /Include pseudo tags/
129 !_TAG_EXTRA_DESCRIPTION subparser /Include tags generated by subparsers/
130 ...
131
From this output, a client tool can tell that the "{anonymous}",
"{fileScope}", "{pseudo}", and "{subparser}" extras are enabled.
134
135 TAG_FIELD_DESCRIPTION (new in Universal Ctags)
136 Indicates the names and descriptions of enabled fields:
137
138 !_TAG_FIELD_DESCRIPTION {field-name} /description/
139 !_TAG_FIELD_DESCRIPTION!{language-name} {field-name} /description/
140
141 If your tool relies on some fields, refer to the pseudo-tags of
142 this type. A tool can reject a tags file that doesn't include
143 expected fields, and raise an error in an early stage of pro‐
144 cessing.
145
146 An example of the pseudo-tags:
147
148 $ ctags --fields-C=+'{macrodef}' --extras=+p --pseudo-tags='{TAG_FIELD_DESCRIPTION}' -o - input.c
149 !_TAG_FIELD_DESCRIPTION file /File-restricted scoping/
150 !_TAG_FIELD_DESCRIPTION input /input file/
151 !_TAG_FIELD_DESCRIPTION name /tag name/
152 !_TAG_FIELD_DESCRIPTION pattern /pattern/
153 !_TAG_FIELD_DESCRIPTION typeref /Type and name of a variable or typedef/
154 !_TAG_FIELD_DESCRIPTION!C macrodef /macro definition/
155 ...
156
From this output, a client tool can tell that the "{file}",
"{input}", "{name}", "{pattern}", and "{typeref}" fields are
enabled. These fields are common to all languages. In addition to
the common fields, the tool can tell that the "{macrodef}" field
of the C language is also enabled.
162
163 TAG_FILE_ENCODING (new in Universal Ctags)
164 TBW
165
166 TAG_FILE_FORMAT
167 See also tags(5).
168
169 TAG_FILE_SORTED
170 See also tags(5).
171
172 TAG_KIND_DESCRIPTION (new in Universal Ctags)
173 Indicates the names and descriptions of enabled kinds:
174
175 !_TAG_KIND_DESCRIPTION!{language-name} {kind-letter},{kind-name} /description/
176
177 If your tool relies on some kinds, refer to the pseudo-tags of
178 this type. A tool can reject the tags file that doesn't include
179 expected kinds, and raise an error in an early stage of process‐
180 ing.
181
182 Kinds are language specific, so a language name is always ap‐
183 pended to the tag name as suffix.
184
185 An example of the pseudo-tags:
186
187 $ ctags --extras=+p --kinds-C=vfm --pseudo-tags='{TAG_KIND_DESCRIPTION}' -o - input.c
188 !_TAG_KIND_DESCRIPTION!C f,function /function definitions/
189 !_TAG_KIND_DESCRIPTION!C m,member /struct, and union members/
190 !_TAG_KIND_DESCRIPTION!C v,variable /variable definitions/
191 ...
192
From this output, a client tool can tell that the "{function}",
"{member}", and "{variable}" kinds of the C language are enabled.
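
Rejecting a tags file early can be sketched as follows. This is an
illustration only: the function names enabled_kinds and
check_required_kinds are hypothetical, and the code assumes the tags
file was generated with the TAG_KIND_DESCRIPTION pseudo-tag enabled.

```python
def enabled_kinds(tags_lines, language):
    """Collect the long kind names listed for one language."""
    prefix = "!_TAG_KIND_DESCRIPTION!" + language + "\t"
    kinds = set()
    for line in tags_lines:
        if line.startswith(prefix):
            # The value column has the form "letter,name".
            value = line.rstrip("\n").split("\t")[1]
            _letter, _comma, name = value.partition(",")
            kinds.add(name)
    return kinds


def check_required_kinds(tags_lines, language, required):
    """Raise ValueError when a kind the tool relies on is not enabled."""
    missing = set(required) - enabled_kinds(tags_lines, language)
    if missing:
        raise ValueError(
            "tags file lacks %s kinds: %s" % (language, ", ".join(sorted(missing)))
        )
```

A tool would call check_required_kinds once, right after opening the
tags file and before processing any regular tag.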
195
196 TAG_KIND_SEPARATOR (new in Universal Ctags)
197 TBW
198
199 TAG_OUTPUT_EXCMD (new in Universal Ctags)
Indicates the type of EX command specified with the --excmd option.
201
202 TAG_OUTPUT_FILESEP (new in Universal Ctags)
203 TBW
204
205 TAG_OUTPUT_MODE (new in Universal Ctags)
206 TBW
207
208 TAG_PATTERN_LENGTH_LIMIT (new in Universal Ctags)
209 TBW
210
211 TAG_PROC_CWD (new in Universal Ctags)
212 Indicates the working directory of ctags during processing.
213
214 This pseudo-tag helps a client tool solve the absolute paths for
215 the input files for tag entries even when they are tagged with
216 relative paths.
217
218 An example of the pseudo-tags:
219
220 $ cat tags
221 !_TAG_PROC_CWD /tmp/ //
222 main input.c /^int main (void) { return 0; }$/;" f typeref:typename:int
223 ...
224
From the regular tag for "main", the client tool can tell that
"main" is in "input.c". However, that is a relative path. If the
directory where ctags runs and the directory where the client
tool runs are different, the client tool cannot find "input.c"
in the file system. In that case, TAG_PROC_CWD gives the tool a
hint: "input.c" may be in "/tmp".
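
A minimal sketch of this lookup in Python follows. The function names
proc_cwd and resolve_input are hypothetical, and the code assumes the
tags file contains the TAG_PROC_CWD pseudo-tag; when it does not, the
path is returned unchanged.

```python
import os.path


def proc_cwd(tags_lines):
    """Return the value of the TAG_PROC_CWD pseudo-tag, or None."""
    for line in tags_lines:
        if line.startswith("!_TAG_PROC_CWD\t"):
            return line.rstrip("\n").split("\t")[1]
    return None


def resolve_input(path, cwd):
    """Turn a relative input path into an absolute one using cwd."""
    if cwd is None or os.path.isabs(path):
        return path
    return os.path.normpath(os.path.join(cwd, path))
```

The result is still only a hint: the file may have been moved since
ctags ran.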
231
232 TAG_PROGRAM_NAME
233 TBW
234
235 TAG_ROLE_DESCRIPTION (new in Universal Ctags)
236 Indicates the names and descriptions of enabled roles:
237
238 !_TAG_ROLE_DESCRIPTION!{language-name}!{kind-name} {role-name} /description/
239
240 If your tool relies on some roles, refer to the pseudo-tags of
241 this type. Note that a role owned by a disabled kind is not
242 listed even if the role itself is enabled.
243
245 TBW
246
Universal Ctags can run multiple parsers on a single input file, so the
tags emitted for one file may belong to different languages. The
language/l field can be used to show the language for each tag.
251
252 $ cat /tmp/foo.html
253 <html>
254 <script>var x = 1</script>
255 <h1>title</h1>
256 </html>
257 $ ./ctags -o - --extras=+g /tmp/foo.html
258 title /tmp/foo.html /^ <h1>title<\/h1>$/;" h
259 x /tmp/foo.html /var x = 1/;" v
260 $ ./ctags -o - --extras=+g --fields=+l /tmp/foo.html
261 title /tmp/foo.html /^ <h1>title<\/h1>$/;" h language:HTML
262 x /tmp/foo.html /var x = 1/;" v language:JavaScript
263
See readtags(1) to learn how to use readtags. This section discusses
some notable topics for client tools.
267
268 Build Filter/Sorter Expressions
Certain escape sequences in expressions are recognized by readtags. For
example, suppose you search for a tag named a\?b with a filter
expression like '(eq? $name "a\?b")'. Because readtags translates \?
into a single ?, it actually searches for a?b.

Another problem is that if a single quote appears in a filter
expression (which is itself wrapped in single quotes), it terminates
the expression, producing a broken expression, and may even cause
unintended shell injection. Single quotes can be escaped using '"'"'.
278
279 So, client tools need to:
280
281 • Replace \ by \\
282
283 • Replace ' by '"'"'
284
285 inside the expressions. If the expression also contains strings, " in
286 the strings needs to be replaced by \".
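
The replacements above can be sketched in Python as follows. The
function names are hypothetical, and the command line built at the end
is only one possible way to call readtags (it uses the -t, -Q and -l
options described in readtags(1)).

```python
def readtags_string(text):
    """Quote text as a string literal inside a readtags expression."""
    # Backslashes first, so the ones added for " are not doubled again.
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return '"' + escaped + '"'


def shell_single_quote(text):
    """Wrap text in single quotes for a POSIX shell."""
    return "'" + text.replace("'", "'\"'\"'") + "'"


def name_filter_command(tag_name, tags_file="tags"):
    """Build a shell command that lists the tags named tag_name."""
    expression = "(eq? $name " + readtags_string(tag_name) + ")"
    return (
        "readtags -t " + shell_single_quote(tags_file)
        + " -Q " + shell_single_quote(expression)
        + " -l"
    )
```

A tool that starts readtags without going through a shell (for example
by passing an argument list to subprocess.run) does not need the
single-quote step, but the escaping inside the expression is still
required.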
287
Client tools written in Lisp could build the expression using lists.
prin1 (in Common Lisp style Lisps) and write (in Scheme style Lisps)
can translate the list into a string that can be used directly. For
example, in Emacs Lisp:
292
293 (let ((name "hi"))
294 (prin1 `(eq? $name ,name)))
295 => "(eq\\? $name "hi")"
296
The "?" is escaped, and readtags can handle it. Scheme style Lisps
should do proper escaping, so the expression readtags gets is just the
expression passed into write. Common Lisp style Lisps may produce
escape sequences that readtags does not recognize, like \#. Readtags
provides some aliases for these Lisps:
302
303 • Use true for #t.
304
305 • Use false for #f.
306
307 • Use nil or () for ().
308
309 • Use (string->regexp "PATTERN") for #/PATTERN/. Use (string->regexp
310 "PATTERN" :case-fold true) for #/PATTERN/i. Notice that string->reg‐
311 exp doesn't require escaping "/" in the pattern.
312
313 Notice that even when the client tool uses this method, ' still needs
314 to be replaced by '"'"' to prevent broken expressions and shell injec‐
315 tion.
316
317 Another thing to notice is that missing fields are represented by #f,
318 and applying string operators to them will produce an error. You should
319 always check if a field is missing before applying string operators.
320 See the "Filtering" section in readtags(1) to know how to do this. Run
321 "readtags -H filter" to see which operators take string arguments.
322
323 Parse Readtags Output
324 In the output of readtags, tabs can appear in all field values (e.g.,
325 the tag name itself could contain tabs), which makes it hard to split
326 the line into fields. Client tools should use the -E option, which
327 keeps the escape sequences in the tags file, so the only field that
328 could contain tabs is the pattern field.
329
330 The pattern field could:
331
332 • Use a line number. It will look like number;" (e.g. 10;").
333
334 • Use a search pattern. It will look like /pattern/;" or ?pattern?;".
335 Notice that the search pattern could contain tabs.
336
337 • Combine these two, like number;/pattern/;" or number;?pattern?;".
338
339 These are true for tags files using extended format, which is the de‐
340 fault one. The legacy format (i.e. --format=1) doesn't include the
341 semicolons. It's old and barely used, so we won't discuss it here.
342
343 Client tools could split the line using the following steps:
344
345 • Find the first 2 tabs in the line, so we get the name and input
346 field.
347
348 • From the 2nd tab:
349
350 • If a / follows, then the pattern delimiter is /.
351
352 • If a ? follows, then the pattern delimiter is ?.
353
354 • If a number follows, then:
355
356 • If a ;/ follows the number, then the delimiter is /.
357
358 • If a ;? follows the number, then the delimiter is ?.
359
360 • If a ;" follows the number, then the field uses only line number,
361 and there's no pattern delimiter (since there's no regex pat‐
362 tern). In this case the pattern field ends at the 3rd tab.
363
364 • After the opening delimiter, find the next unescaped pattern delim‐
365 iter, and that's the closing delimiter. It will be followed by ;" and
366 then a tab. That's the end of the pattern field. By "unescaped pat‐
367 tern delimiter", we mean there's an even number (including 0) of
368 backslashes before it.
369
370 • From here, split the rest of the line into fields by tabs.
371
372 Then, the escape sequences in fields other than the pattern field
373 should be translated. See "Proposal" in tags(5) to know about all the
374 escape sequences.
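
The splitting steps above can be sketched in Python as follows. This is
an illustration under the stated assumptions (extended format, output
produced with the -E option), not a reference implementation; the
function name split_tag_line is hypothetical. The returned pattern
excludes the trailing ;" and the other fields are returned still
escaped.

```python
def split_tag_line(line):
    """Split one line of readtags -E output.

    Returns (name, input_file, pattern, fields).
    """
    name, input_file, rest = line.rstrip("\n").split("\t", 2)

    # Optional leading line number.
    pos = 0
    while pos < len(rest) and rest[pos].isdigit():
        pos += 1

    if pos > 0 and rest.startswith(';"', pos):
        # Line number only: there is no pattern delimiter.
        pattern_end = pos
    else:
        if pos > 0:
            if not rest.startswith(";", pos):
                raise ValueError("malformed pattern field: " + rest)
            pos += 1  # skip the ; between the number and the pattern
        delimiter = rest[pos:pos + 1]
        if delimiter not in ("/", "?"):
            raise ValueError("unknown pattern delimiter: " + rest)
        # Walk to the closing delimiter. Skipping the character after
        # each backslash keeps escaped delimiters from ending the field.
        pos += 1
        while rest[pos:pos + 1] != delimiter:
            if pos >= len(rest):
                raise ValueError("unterminated pattern: " + rest)
            pos += 2 if rest[pos] == "\\" else 1
        pattern_end = pos + 1

    end = pattern_end + 2 if rest.startswith(';"', pattern_end) else pattern_end
    pattern, tail = rest[:pattern_end], rest[end:]
    fields = tail[1:].split("\t") if tail.startswith("\t") else []
    return name, input_file, pattern, fields
```

Consuming a backslash together with the character after it is
equivalent to counting an even number of backslashes before the
delimiter.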
375
376 Make Use of the Pattern Field
377 The pattern field specifies how to find a tag in its source file. The
378 code generating this field seems to have a long history, so there are
379 some pitfalls and it's a bit hard to handle. A client tool could simply
380 require the line: field and jump to the line it specifies, to avoid us‐
381 ing the pattern field. But anyway, we'll discuss how to make the best
382 use of it here.
383
384 You should take the words here merely as suggestions, and not stan‐
385 dards. A client tool could definitely develop better (or simpler) ways
386 to use the pattern field.
387
388 From the last section, we know the pattern field could contain a line
389 number and a search pattern. When it only contains the line number,
390 handling it is easy: you simply go to that line.
391
The search pattern resembles an EX command, but as we'll see later,
it's actually not a valid one, so some manual work is required to
process it.
395
396 The search pattern could look like /pat/, called "forward search pat‐
397 tern", or ?pat?, called "backward search pattern". Using a search pat‐
398 tern means even if the source file is updated, as long as the part con‐
399 taining the tag doesn't change, we could still locate the tag correctly
400 by searching.
401
402 When the pattern field only contains the search pattern, you just
403 search for it. The search direction (forward/backward) doesn't matter,
404 as it's decided solely by whether the -B option is enabled, and not the
405 actual context. You could always start the search from say the begin‐
406 ning of the file.
407
When both the search pattern and the line number are present, you
could make good use of the line number, by going to the line first,
then searching for the nearest occurrence of the pattern. A way to do
this is to search both forward and backward for the pattern, and when
there is an occurrence on both sides, go to the nearer one.
413
414 What's good about this is when there are multiple identical lines in
415 the source file (e.g. the COMMON block in Fortran), this could help us
416 find the correct one, even after the source file is updated and the tag
417 position is shifted by a few lines.
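
A minimal sketch of the "nearest occurrence" idea in Python follows.
The function name find_nearest_line is hypothetical. Instead of two
separate searches it scans all lines and keeps the match closest to the
recorded line number, which gives the same result; it assumes the
search pattern has already been compiled into a regular expression.

```python
import re


def find_nearest_line(lines, regex, line_number):
    """Return the 1-based number of the matching line closest to
    line_number, or None when no line matches."""
    matches = [n for n, text in enumerate(lines, start=1) if regex.search(text)]
    if not matches:
        return None
    # On a tie, min() keeps the first candidate, i.e. the earlier line.
    return min(matches, key=lambda n: abs(n - line_number))
```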
418
419 Now let's discuss how to search for the pattern. After you trim the /
420 or ? around it, the pattern resembles a regex pattern. It should be a
421 regex pattern, as required by being a valid EX command, but it's actu‐
422 ally not, as you'll see below.
423
424 It could begin with a ^, which means the pattern starts from the begin‐
425 ning of a line. It could also end with an unescaped $ which means the
426 pattern ends at the end of a line. Let's keep this information, and
427 trim them too.
428
429 Now the remaining part is the actual string containing the tag. Some
430 characters are escaped:
431
432 • \.
433
434 • $, but only at the end of the string.
435
436 • /, but only in forward search patterns.
437
438 • ?, but only in backward search patterns.
439
440 You need to unescape these to get the literal string. Now you could
441 convert this literal string to a regexp that matches it (by escaping,
442 like re.escape in Python or regexp-quote in Elisp), and assemble it
443 with ^ or $ if the pattern originally has it, and finally search for
444 the tag using this regexp.
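
The conversion described above can be sketched in Python as follows.
The function name pattern_to_regex is hypothetical, and the input is
assumed to be the search pattern alone (for example /^int x;$/), with
any line number and the trailing ;" already removed.

```python
import re


def pattern_to_regex(pattern):
    """Compile a ctags search pattern into a Python regular expression."""
    if len(pattern) < 2 or pattern[0] not in ("/", "?") or pattern[-1] != pattern[0]:
        raise ValueError("not a search pattern: " + pattern)
    delimiter, body = pattern[0], pattern[1:-1]

    at_line_start = body.startswith("^")
    if at_line_start:
        body = body[1:]

    # A trailing $ anchors the pattern only when it is not escaped.
    at_line_end = False
    if body.endswith("$"):
        backslashes = len(body) - 1 - len(body[:-1].rstrip("\\"))
        if backslashes % 2 == 0:
            at_line_end = True
            body = body[:-1]

    # Unescape \\, the delimiter, and a $ at the very end of the string.
    literal = []
    i = 0
    while i < len(body):
        following = body[i + 1:i + 2]
        if body[i] == "\\" and (
            following in ("\\", delimiter)
            or (following == "$" and i + 2 == len(body))
        ):
            literal.append(following)
            i += 2
        else:
            literal.append(body[i])
            i += 1

    regex = re.escape("".join(literal))
    if at_line_start:
        regex = "^" + regex
    if at_line_end:
        regex = regex + "$"
    # MULTILINE lets ^ and $ match at every line of a whole-file string.
    return re.compile(regex, re.MULTILINE)
```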
445
446 Remark: About a Previous Format of the Pattern Field
In some earlier versions of Universal Ctags, the line number in the
pattern field was the actual line number minus one, for forward search
patterns; or plus one, for backward search patterns. The idea was to
resemble an EX command: you go to the line, then search forward/backward
for the pattern, and you can always find the correct one. But this
defeats the purpose of using a search pattern: to tolerate file updates.
For example, if the tag is at line 50, then according to this scheme
the pattern field would be:
455
456 49;/pat/;"
457
Then let's assume that some code above is removed, and the tag is now
at line 45. Now you can't find it if you search forward from line 49.

For this reason, Universal Ctags now uses the actual line number. A
client tool could distinguish the two schemes by the TAG_OUTPUT_EXCMD
pseudo tag: it's "combine" for the old scheme, and "combineV2" for the
present scheme. But probably there's no need to treat them differently,
since "search for the nearest occurrence from the line" gives good
results with both schemes.
467
469 Universal Ctags supports JSON (strictly speaking JSON Lines) output
470 format if the ctags executable is built with libjansson. JSON output
471 goes to standard output by default.
472
473 Format
474 Each JSON line represents a tag.
475
476 $ ctags --extras=+p --output-format=json --fields=-s input.py
477 {"_type": "ptag", "name": "JSON_OUTPUT_VERSION", "path": "0.0", "pattern": "in development"}
478 {"_type": "ptag", "name": "TAG_FILE_SORTED", "path": "1", "pattern": "0=unsorted, 1=sorted, 2=foldcase"}
479 ...
480 {"_type": "tag", "name": "Klass", "path": "/tmp/input.py", "pattern": "/^class Klass:$/", "language": "Python", "kind": "class"}
481 {"_type": "tag", "name": "method", "path": "/tmp/input.py", "pattern": "/^ def method(self):$/", "language": "Python", "kind": "member", "scope": "Klass", "scopeKind": "class"}
482 ...
483
484 A key not starting with _ is mapped to a field of ctags. "--out‐
485 put-format=json --list-fields" options list the fields.
486
487 A key starting with _ represents meta information of the JSON line.
488 Currently only _type key is used. If the value for the key is tag, the
489 JSON line represents a normal tag. If the value is ptag, the line rep‐
490 resents a pseudo-tag.
491
The output format may change in the future. The JSON_OUTPUT_VERSION
pseudo-tag gives client tools a chance to handle such changes. The
current version is "0.0". A client tool can extract the version from
the path key of the pseudo-tag.
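
Reading this output can be sketched in Python as follows. The function
name read_json_tags is hypothetical; the code relies only on the _type,
name and path keys described above.

```python
import json


def read_json_tags(lines):
    """Separate JSON Lines output into pseudo-tags and normal tags.

    Returns (version, ptags, tags), where version is the value of the
    JSON_OUTPUT_VERSION pseudo-tag or None when it is absent.
    """
    ptags, tags = [], []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        entry = json.loads(line)
        if entry.get("_type") == "ptag":
            ptags.append(entry)
        elif entry.get("_type") == "tag":
            tags.append(entry)
    version = None
    for ptag in ptags:
        if ptag.get("name") == "JSON_OUTPUT_VERSION":
            version = ptag.get("path")
    return version, ptags, tags
```

A tool could compare the returned version with the versions it was
written for and refuse, or warn about, anything else.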
496
The JSON output format is newly designed and does not have the
limitations of the default tags file format.

• The values for the kind key are represented as long names.
One-letter flags are not used.
502
503 • Scope names and scope kinds have distinguished keys: scope and
504 scopeKind. They are combined in the default tags file format.
505
506 Data type used in a field
Values for most keys are represented as JSON strings. However, some of
them are represented as a string, integer, and/or boolean.

The "--output-format=json --list-fields" options show which data types
are used for each field in JSON.
513
514 $ ctags --output-format=json --list-fields
515 #LETTER NAME ENABLED LANGUAGE JSTYPE FIXED DESCRIPTION
516 F input yes NONE s-- no input file
517 ...
518 P pattern yes NONE s-b no pattern
519 ...
520 f file yes NONE --b no File-restricted scoping
521 ...
522 e end no NONE -i- no end lines of various items
523 ...
524
The JSTYPE column shows the data types.
526
527 's' string
528
529 'i' integer
530
531 'b' boolean (true or false)
532
For example, the value for the pattern field of ctags takes a string or
a boolean value.
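
A client tool could read the JSTYPE column once and use it to validate
values. The following Python sketch assumes the column layout shown in
the listing above; the names parse_jstypes and has_expected_type are
hypothetical, and fields are keyed by name only, which ignores the
LANGUAGE column.

```python
JSTYPE_FLAGS = {"s": str, "i": int, "b": bool}


def parse_jstypes(listing):
    """Map each field name to the Python types its JSON value may take."""
    types = {}
    for line in listing.splitlines():
        columns = line.split(None, 6)
        if line.startswith("#") or len(columns) < 6:
            continue  # header line or something that is not a table row
        name, jstype = columns[1], columns[4]
        types[name] = tuple(JSTYPE_FLAGS[c] for c in jstype if c in JSTYPE_FLAGS)
    return types


def has_expected_type(types, key, value):
    """Check a JSON value against the types listed for its key."""
    # type() is compared directly because bool is a subclass of int.
    return type(value) in types.get(key, (str,))
```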
535
537 ctags(1), ctags-lang-python(7), ctags-incompatibilities(7), tags(5),
538 readtags(1)
539
540
541
542
5.9.0 CTAGS-CLIENT-TOOLS(7)