CTAGS-CLIENT-TOOLS(7)           Universal Ctags          CTAGS-CLIENT-TOOLS(7)

NAME

       ctags-client-tools - Hints for developing a tool that uses the ctags
       command and tags output
8

SYNOPSIS

10       ctags [options] [file(s)]
11       etags [options] [file(s)]
12
13

DESCRIPTION

       A client tool is a tool that runs the ctags command and/or reads a
       tags file generated by the ctags command. This man page gathers hints
       for people who develop client tools.
18

PSEUDO-TAGS

20       Pseudo-tags, stored in a tag file, indicate  how  ctags  generated  the
21       tags  file:  whether  the  tags file is sorted or not, which version of
22       tags file format is used, the name of tags generator, and  so  on.  The
23       opposite  term  for pseudo-tags is regular-tags. A regular-tag is for a
24       language object in an input file. A pseudo-tag is for the tags file it‐
25       self. Client tools may use pseudo-tags as reference for processing reg‐
26       ular-tags.
27
28       A pseudo-tag is stored in a tags file  in  the  same  format  as  regu‐
29       lar-tags as described in tags(5), except that pseudo-tag names are pre‐
30       fixed with "!_". For the general  information  about  pseudo-tags,  see
31       "TAG FILE INFORMATION" in tags(5).
32
33       An example of a pseudo tag:
34
35          !_TAG_PROGRAM_NAME      Universal Ctags /Derived from Exuberant Ctags/
36
       The value, "Universal Ctags", associated with the pseudo tag
       TAG_PROGRAM_NAME, is stored in the field that holds the input file in
       a regular tag. The description, "Derived from Exuberant Ctags", is
       stored in the field that holds the pattern.
40
41       Universal  Ctags extends the naming scheme of the classical pseudo-tags
42       available in Exuberant Ctags for emitting language specific information
43       as pseudo tags:
44
45          !_{pseudo-tag-name}!{language-name}     {associated-value}      /{description}/
46
47       The  language-name is appended to the pseudo-tag name with a separator,
48       "!".
49
50       An example of pseudo tag with a language suffix:
51
52          !_TAG_KIND_DESCRIPTION!C        f,function      /function definitions/
53
       This pseudo-tag says "the function kind of the C language is enabled
       when generating this tags file." --pseudo-tags is the option for
       enabling/disabling individual pseudo-tags. When enabling/disabling a
       pseudo tag with the option, specify only the tag name
       (TAG_KIND_DESCRIPTION), without the prefix ("!_") or the suffix ("!C").
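
       The naming scheme above can be taken apart mechanically. The following
       is a minimal Python sketch, not part of ctags; the function name
       parse_pseudo_tag and the returned dictionary keys are hypothetical. It
       assumes tab-separated lines as described in tags(5), and it cannot tell
       a pseudo-tag from a regular tag whose name happens to start with "!_".

```python
def parse_pseudo_tag(line):
    """Split one tags-file line into pseudo-tag parts, or return None."""
    if not line.startswith("!_"):
        return None
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 3:
        return None
    # The name field is "!_NAME", "!_NAME!LANGUAGE" or "!_NAME!LANGUAGE!KIND".
    parts = fields[0][2:].split("!")
    # The description sits in the pattern field, wrapped in slashes.
    description = fields[2]
    if description.endswith(';"'):
        description = description[:-2]
    if len(description) >= 2 and description[0] == "/" and description[-1] == "/":
        description = description[1:-1]
    return {
        "name": parts[0],
        "language": parts[1] if len(parts) > 1 else None,
        "kind": parts[2] if len(parts) > 2 else None,
        "value": fields[1],
        "description": description,
    }
```

       The "name" entry is the form that --pseudo-tags expects, since the
       prefix and the language suffix have already been removed.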
59
60   Options for Pseudo-tags
61       --extras=+p (or --extras=+{pseudo})
62              Forces writing pseudo-tags.
63
              ctags emits pseudo-tags by default when writing tags to a
              regular file (e.g. "tags"). However, when -o - or -f - is
              specified for writing tags to standard output, ctags doesn't
              emit pseudo-tags. --extras=+p or --extras=+{pseudo} forces
              pseudo-tags to be written.
69
70       --list-pseudo-tags
71              Lists available types of pseudo-tags and shows whether they  are
72              enabled or disabled.
73
              Running ctags with the --list-pseudo-tags option lists the
              available pseudo-tags. Some of the pseudo-tags newly introduced
              in the Universal Ctags project are disabled by default. Use
              --pseudo-tags=... to enable them.
78
79       --pseudo-tags=[+|-]names|*
80              Specifies a list of pseudo-tag types to include in the output.
81
82              The parameters are a set of pseudo tag names. Valid  pseudo  tag
83              names  can be listed with --list-pseudo-tags. Surround each name
84              in the set with braces, like "{TAG_PROGRAM_AUTHOR}".  You  don't
85              have  to  include  the  "!_" pseudo tag prefix when specifying a
86              name in the option argument for --pseudo-tags= option.
87
              Pseudo-tags don't have a notation using one-letter flags.

              If a name is preceded by either the '+' or '-' character, that
              pseudo-tag is added to or removed from the current settings.
              Otherwise the names replace any current settings. All entries
              are included if '*' is given.
94
95       --fields=+E (or --fields=+{extras})
96              Attach "extras:pseudo" field to pseudo-tags.
97
98              An example of pseudo tags with the field:
99
100                 !_TAG_PROGRAM_NAME      Universal Ctags /Derived from Exuberant Ctags/  extras:pseudo
101
102              If  the  name of a regular tag in a tag file starts with "!_", a
103              client tool cannot distinguish whether the tag is a  regular-tag
104              or  pseudo-tag.   The  fields attached with this option help the
105              tool distinguish them.
106
107   List of notable pseudo-tags
108       Running ctags with --list-pseudo-tags option lists available  types  of
109       pseudo-tags  with  short  descriptions. This subsection shows hints for
110       using notable ones.
111
112       TAG_EXTRA_DESCRIPTION (new in Universal Ctags)
113              Indicates the names and descriptions of enabled extras:
114
115                 !_TAG_EXTRA_DESCRIPTION       {extra-name}    /description/
116                 !_TAG_EXTRA_DESCRIPTION!{language-name}       {extra-name}    /description/
117
118              If your tool relies on some extra tags (extras),  refer  to  the
119              pseudo-tags  of  this type. A tool can reject the tags file that
120              doesn't include expected extras, and raise an error in an  early
121              stage of processing.
122
123              An example of the pseudo-tags:
124
125                 $ ctags --extras=+p --pseudo-tags='{TAG_EXTRA_DESCRIPTION}' -o - input.c
126                 !_TAG_EXTRA_DESCRIPTION       anonymous       /Include tags for non-named objects like lambda/
127                 !_TAG_EXTRA_DESCRIPTION       fileScope       /Include tags of file scope/
128                 !_TAG_EXTRA_DESCRIPTION       pseudo  /Include pseudo tags/
129                 !_TAG_EXTRA_DESCRIPTION       subparser       /Include tags generated by subparsers/
130                 ...
131
132              A client tool can know "{anonymous}", "{fileScope}", "{pseudo}",
133              and "{subparser}" extras are enabled from the output.
134
135              Universal Ctags version 6.0 will turn on this pseudo tag by  de‐
136              fault.
137
138       TAG_FIELD_DESCRIPTION (new in Universal Ctags)
139              Indicates the names and descriptions of enabled fields:
140
141                 !_TAG_FIELD_DESCRIPTION       {field-name}    /description/
142                 !_TAG_FIELD_DESCRIPTION!{language-name}       {field-name}    /description/
143
144              If  your tool relies on some fields, refer to the pseudo-tags of
145              this type.  A tool can reject a tags file that  doesn't  include
146              expected  fields,  and  raise an error in an early stage of pro‐
147              cessing.
148
149              An example of the pseudo-tags:
150
151                 $ ctags --fields-C=+'{macrodef}' --extras=+p --pseudo-tags='{TAG_FIELD_DESCRIPTION}' -o - input.c
152                 !_TAG_FIELD_DESCRIPTION       file    /File-restricted scoping/
153                 !_TAG_FIELD_DESCRIPTION       input   /input file/
154                 !_TAG_FIELD_DESCRIPTION       name    /tag name/
155                 !_TAG_FIELD_DESCRIPTION       pattern /pattern/
156                 !_TAG_FIELD_DESCRIPTION       typeref /Type and name of a variable or typedef/
157                 !_TAG_FIELD_DESCRIPTION!C     macrodef        /macro definition/
158                 ...
159
              A client tool can know from the output that the "{file}",
              "{input}", "{name}", "{pattern}", and "{typeref}" fields are
              enabled. These fields are common to all languages. In addition
              to the common fields, the tool can know that the "{macrodef}"
              field of the C language is also enabled.
165
166              Universal Ctags version 6.0 will turn on this pseudo tag by  de‐
167              fault.
168
169       TAG_FILE_ENCODING (new in Universal Ctags)
170              TBW
171
172       TAG_FILE_FORMAT
173              See also tags(5).
174
175       TAG_FILE_SORTED
176              See also tags(5).
177
178       TAG_KIND_DESCRIPTION (new in Universal Ctags)
179              Indicates the names and descriptions of enabled kinds:
180
181                 !_TAG_KIND_DESCRIPTION!{language-name}        {kind-letter},{kind-name}       /description/
182
183              If  your  tool relies on some kinds, refer to the pseudo-tags of
184              this type.  A tool can reject the tags file that doesn't include
185              expected kinds, and raise an error in an early stage of process‐
186              ing.
187
188              Kinds are language specific, so a language name is   always  ap‐
189              pended to the tag name as suffix.
190
191              An example of the pseudo-tags:
192
193                 $ ctags --extras=+p --kinds-C=vfm --pseudo-tags='{TAG_KIND_DESCRIPTION}' -o - input.c
194                 !_TAG_KIND_DESCRIPTION!C      f,function      /function definitions/
195                 !_TAG_KIND_DESCRIPTION!C      m,member        /struct, and union members/
196                 !_TAG_KIND_DESCRIPTION!C      v,variable      /variable definitions/
197                 ...
198
199              A  client  tool  can  know "{function}", "{member}", and "{vari‐
200              able}" kinds of C language are enabled from the output.
201
202              Universal Ctags version 6.0 will turn on this pseudo tag by  de‐
203              fault.
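
              The early rejection described above could be sketched in Python
              as follows. This is an illustration under assumptions, not ctags
              code: the helper names enabled_kinds and missing_kinds are
              hypothetical, and the input is assumed to be the lines of a tags
              file that contains TAG_KIND_DESCRIPTION pseudo-tags.

```python
def enabled_kinds(tag_lines, language):
    """Collect kind names listed by TAG_KIND_DESCRIPTION for one language."""
    # The trailing tab keeps "C" from also matching "C++".
    prefix = "!_TAG_KIND_DESCRIPTION!" + language + "\t"
    kinds = set()
    for line in tag_lines:
        if line.startswith(prefix):
            value = line.rstrip("\n").split("\t")[1]   # e.g. "f,function"
            _letter, _, name = value.partition(",")
            kinds.add(name)
    return kinds


def missing_kinds(tag_lines, language, required):
    """Return the required kind names that the tags file does not list."""
    return set(required) - enabled_kinds(tag_lines, language)
```

              A tool could call missing_kinds() once after opening the tags
              file and raise an error if the result is not empty.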
204
205       TAG_KIND_SEPARATOR (new in Universal Ctags)
206              TBW
207
208       TAG_OUTPUT_EXCMD (new in Universal Ctags)
209              Indicates the specified type of EX command with --excmd option.
210
211       TAG_OUTPUT_FILESEP (new in Universal Ctags)
212              TBW
213
214       TAG_OUTPUT_MODE (new in Universal Ctags)
215              TBW
216
217       TAG_OUTPUT_VERSION (new in Universal Ctags 6.0)
              Indicates the language-common interface version of the output:

                 !_TAG_OUTPUT_VERSION  {current}.{age} /.../

              The public interface includes common fields, common extras, and
              pseudo tags.

              The maintainer of Universal Ctags may update the numbers,
              "{current}" and "{age}", in the same manner as explained in
              TAG_PARSER_VERSION.
228
229       TAG_PARSER_VERSION (new in Universal Ctags 6.0)
230              Indicates the interface version of the parser:
231
232                 !_TAG_PARSER_VERSION!{language-name}  {current}.{age} /.../
233
234              The  public  interfaces  include kinds, roles, language specific
235              fields, and language specific extras.
236
237              The maintainer of the parser for "${language-name}"  may  update
238              the numbers, "{current}" and "{age}" in the following rules:
239
240              • If  kinds,  roles,  language  specific fields, and/or language
241                specific extras have been added, removed or changed since last
242                release, increment "{current}".
243
244              • If they have been added since last release, increment "{age}".
245
246              • If  they  have been removed since last release, set "{age}" to
247                0.
248
              This concept is based on the versioning in libtool (7.2
              Libtool’s versioning system). In Universal Ctags, the concept is
              simplified by removing "revision" from the versioning used in
              libtool.
253
254              Manual  pages  for  languages may document changes that increase
255              the number of "{current}".
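
              This page does not spell out how a client should compare the two
              numbers. One possible reading, borrowed from libtool, is that a
              value "{current}.{age}" covers every interface number from
              current - age up to current. The Python sketch below implements
              only that assumed reading; the function name supports_interface
              is hypothetical.

```python
def supports_interface(version_value, wanted):
    """Check a "{current}.{age}" value against the interface a tool expects.

    Assumption (libtool-style): interfaces current - age .. current are
    all still supported by the parser that wrote the tags file.
    """
    current, age = (int(number) for number in version_value.split("."))
    return current - age <= wanted <= current
```

              For example, with the value "3.2" a tool written against
              interface 1, 2 or 3 would be accepted, and a tool written
              against interface 0 or 4 would be rejected.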
256
257       TAG_PATTERN_LENGTH_LIMIT (new in Universal Ctags)
258              TBW
259
260       TAG_PROC_CWD (new in Universal Ctags)
261              Indicates the working directory of ctags during processing.
262
              This pseudo-tag helps a client tool resolve the absolute paths
              of the input files for tag entries even when they are tagged
              with relative paths.
266
267              An example of the pseudo-tags:
268
269                 $ cat tags
270                 !_TAG_PROC_CWD        /tmp/   //
271                 main  input.c /^int main (void) { return 0; }$/;"     f       typeref:typename:int
272                 ...
273
              From the regular tag for "main", the client tool can know that
              "main" is in "input.c". However, that is a relative path. So if
              the directory where ctags ran and the directory where the client
              tool runs are different, the client tool cannot find "input.c"
              in the file system. In that case, TAG_PROC_CWD gives the tool a
              hint: "input.c" may be in "/tmp".
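
              A minimal Python sketch of this lookup follows. The function
              name resolve_input is hypothetical, and POSIX-style paths are
              assumed, which is why posixpath is used instead of os.path.

```python
import posixpath


def resolve_input(proc_cwd, input_field):
    """Turn the input field of a tag into an absolute path.

    proc_cwd is the value of the TAG_PROC_CWD pseudo-tag.
    """
    if posixpath.isabs(input_field):
        return input_field              # already absolute, nothing to do
    return posixpath.normpath(posixpath.join(proc_cwd, input_field))
```

              The result is only a hint: the file may have moved since ctags
              ran, so a tool should still check that the path exists.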
280
281       TAG_PROGRAM_NAME
282              Indicates the name of program generating this tags file.
283
284       TAG_PROGRAM_VERSION
285              Indicates the version of program generating this tags file.
286
287       TAG_ROLE_DESCRIPTION (new in Universal Ctags)
288              Indicates the names and descriptions of enabled roles:
289
290                 !_TAG_ROLE_DESCRIPTION!{language-name}!{kind-name}    {role-name}     /description/
291
292              If  your  tool relies on some roles, refer to the pseudo-tags of
293              this type. Note that a role owned by  a  disabled  kind  is  not
294              listed even if the role itself is enabled.
295

REDUNDANT-KINDS

297       TBW
298

MULTIPLE-LANGUAGES FOR AN INPUT FILE

       Universal Ctags can run multiple parsers on one input file. That means
       a parser that runs other parsers may output tags for different
       languages. The language/l field can be used to show the language for
       each tag.
303
304          $ cat /tmp/foo.html
305          <html>
306          <script>var x = 1</script>
307          <h1>title</h1>
308          </html>
309          $ ./ctags -o - --extras=+g /tmp/foo.html
310          title   /tmp/foo.html   /^  <h1>title<\/h1>$/;" h
311          x       /tmp/foo.html   /var x = 1/;"   v
312          $ ./ctags -o - --extras=+g --fields=+l /tmp/foo.html
313          title   /tmp/foo.html   /^  <h1>title<\/h1>$/;" h       language:HTML
314          x       /tmp/foo.html   /var x = 1/;"   v       language:JavaScript
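
       A client tool could read the language field back as in the Python
       sketch below. The function name language_of is hypothetical. The sketch
       splits naively on tabs, so it assumes the pattern field contains no
       tab; see "Parse Readtags Output" for a more careful way to split a
       line.

```python
def language_of(tag_line):
    """Return the value of the language field of a tag line, or None."""
    # Fields after name, input and pattern are the extension fields.
    for field in tag_line.rstrip("\n").split("\t")[3:]:
        if field.startswith("language:"):
            return field[len("language:"):]
    return None
```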
315

UTILIZING READTAGS

317       See  readtags(1)  to know how to use readtags. This section is for dis‐
318       cussing some notable topics for client tools.
319
320   Build Filter/Sorter Expressions
       readtags recognizes certain escape sequences in expressions. For
       example, when searching for a tag named a\?b, a filter expression like
       '(eq? $name "a\?b")' does not do what it appears to: \? is translated
       into a single ? by readtags, so it actually searches for a?b.
325
       Another problem: if the client tool talks to readtags through a shell
       rather than directly as a subprocess, a single quote appearing in a
       filter expression (which is itself wrapped in single quotes) terminates
       the expression. This produces a broken expression and may even cause
       unintended shell injection. Single quotes can be escaped using '"'"'.

       So, inside the expressions, client tools need to:

       • Replace \ by \\

       • Replace ' by '"'"', if it talks to readtags through a shell.

       If the expression also contains strings, " in the strings needs to be
       replaced by \".
340
341       Another thing to notice is that missing fields are represented  by  #f,
342       and applying string operators to them will produce an error. You should
343       always check if a field is missing before  applying  string  operators.
344       See  the "Filtering" section in readtags(1) to know how to do this. Run
345       "readtags -H filter" to see which operators take string arguments.
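
       The replacement rules above could be sketched in Python as follows.
       The helper names string_literal, name_filter and readtags_shell_command
       are hypothetical. The sketch relies on shlex.quote from the Python
       standard library, which wraps its argument in single quotes and
       replaces each ' by '"'"'. It does not add the check for missing (#f)
       fields.

```python
import shlex


def string_literal(text):
    """Quote text as a string inside a readtags expression."""
    # Replace backslashes first, so that the backslashes added for the
    # double quotes are not doubled again.
    return '"' + text.replace("\\", "\\\\").replace('"', '\\"') + '"'


def name_filter(name):
    """Build a filter expression matching tags whose name equals `name`."""
    return "(eq? $name " + string_literal(name) + ")"


def readtags_shell_command(tags_file, expression):
    """Build a command line that is safe to hand to a shell."""
    return "readtags -t {} -Q {} -l".format(
        shlex.quote(tags_file), shlex.quote(expression))
```

       When the tool starts readtags directly as a subprocess with an argument
       list, the shell quoting step is not needed and only string_literal()
       applies.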
346
347   Build Filter/Sorter Expressions using Lisp Languages
348       Client tools written in Lisp could build the  expression  using  lists.
349       prin1  (in  Common  Lisp style Lisps) and write (in Scheme style Lisps)
350       can translate the list into a string that can be directly used. For ex‐
351       ample, in EmacsLisp:
352
353          (let ((name "hi"))
354            (prin1 `(eq? $name ,name)))
355          => "(eq\\? $name "hi")"
356
357       The "?" is escaped, and readtags can handle it.
358
       Escape sequences produced by write in Scheme style Lisps are exactly
       those supported by readtags, so any legal readtags expressions can be
       used. Common Lisp style Lisps may produce escape sequences that are
       unrecognized by readtags, like \#, so symbols that contain "#" can't be
       used. Readtags provides some aliases for these Lisps, so they should:
364
365       • Use true for #t.
366
367       • Use false for #f.
368
369       • Use nil or () for ().
370
371       • Use  (string->regexp  "PATTERN")  for #/PATTERN/. Use (string->regexp
372         "PATTERN" :case-fold true) for #/PATTERN/i. Notice that  string->reg‐
373         exp doesn't require escaping "/" in the pattern.
374
375       Notice  that if the client tool talks to readtags through a shell, then
376       in the produced string, ' still needs to be replaced by '"'"'  to  pre‐
377       vent broken expressions and shell injection.
378
379   Parse Readtags Output
380       In  the  output of readtags, tabs can appear in all field values (e.g.,
381       the tag name itself could contain tabs), which makes it hard  to  split
382       the  line  into  fields.  Client  tools should use the -E option, which
383       keeps the escape sequences in the tags file, so  the  only  field  that
384       could contain tabs is the pattern field.
385
386       The pattern field could:
387
388       • Use a line number. It will look like number;" (e.g. 10;").
389
390       • Use  a  search pattern. It will look like /pattern/;" or ?pattern?;".
391         Notice that the search pattern could contain tabs.
392
393       • Combine these two, like number;/pattern/;" or number;?pattern?;".
394
395       These are true for tags files using extended format, which is  the  de‐
396       fault  one.   The  legacy  format (i.e. --format=1) doesn't include the
397       semicolons. It's old and barely used, so we won't discuss it here.
398
399       Client tools could split the line using the following steps:
400
401       • Find the first 2 tabs in the line, so  we  get  the  name  and  input
402         field.
403
404       • From the 2nd tab:
405
406         • If a / follows, then the pattern delimiter is /.
407
408         • If a ? follows, then the pattern delimiter is ?.
409
410         • If a number follows, then:
411
412              • If a ;/ follows the number, then the delimiter is /.
413
414              • If a ;? follows the number, then the delimiter is ?.
415
416              • If a ;" follows the number, then the field uses only line num‐
417                ber, and there's no pattern delimiter (since there's no  regex
418                pattern). In this case the pattern field ends at the 3rd tab.
419
420       • After  the  opening delimiter, find the next unescaped pattern delim‐
421         iter, and that's the closing delimiter. It will be followed by ;" and
422         then  a tab.  That's the end of the pattern field. By "unescaped pat‐
423         tern delimiter", we mean there's an  even  number  (including  0)  of
424         backslashes before it.
425
426       • From here, split the rest of the line into fields by tabs.
427
428       Then,  the  escape  sequences  in  fields  other than the pattern field
429       should be translated. See "Proposal" in tags(5) to know about  all  the
430       escape sequences.
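
       The splitting steps above could be sketched in Python as follows. The
       function name split_tag_line is hypothetical. The sketch assumes one
       line of readtags -E output in the extended format, and it leaves the
       translation of escape sequences in the other fields to the caller.

```python
def split_tag_line(line):
    """Split one line into (name, input, pattern field, other fields)."""
    name, input_file, rest = line.rstrip("\n").split("\t", 2)

    pos = 0
    while pos < len(rest) and rest[pos].isdigit():
        pos += 1                         # skip an optional line number
    if pos:
        if rest.startswith(';"', pos):   # line number only, e.g. 10;"
            return name, input_file, rest[:pos + 2], rest[pos + 2:].split("\t")[1:]
        if not rest.startswith(";", pos):
            raise ValueError("unexpected pattern field: " + rest)
        pos += 1                         # skip ";" in number;/pattern/;"

    delimiter = rest[pos:pos + 1]
    if delimiter not in ("/", "?"):
        raise ValueError("unexpected pattern field: " + rest)
    pos += 1

    # Find the closing delimiter: the first one preceded by an even
    # number (including 0) of backslashes.
    backslashes = 0
    while pos < len(rest):
        char = rest[pos]
        if char == delimiter and backslashes % 2 == 0:
            break
        backslashes = backslashes + 1 if char == "\\" else 0
        pos += 1
    else:
        raise ValueError("unterminated pattern: " + rest)
    if not rest.startswith(';"', pos + 1):
        raise ValueError("missing ;\" after pattern: " + rest)

    end = pos + 3                        # closing delimiter plus ;"
    return name, input_file, rest[:end], rest[end:].split("\t")[1:]
```

       Because the pattern field is located by its delimiters and not by
       counting tabs, a tab inside the search pattern does not break the
       split.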
431
432   Make Use of the Pattern Field
433       The  pattern  field specifies how to find a tag in its source file. The
434       code generating this field seems to have a long history, so  there  are
435       some pitfalls and it's a bit hard to handle. A client tool could simply
436       require the line: field and jump to the line it specifies, to avoid us‐
437       ing  the  pattern field. But anyway, we'll discuss how to make the best
438       use of it here.
439
440       You should take the words here merely as  suggestions,  and  not  stan‐
441       dards.  A client tool could definitely develop better (or simpler) ways
442       to use the pattern field.
443
444       From the last section, we know the pattern field could contain  a  line
445       number  and  a  search  pattern. When it only contains the line number,
446       handling it is easy: you simply go to that line.
447
       The search pattern resembles an EX command, but as we'll see later,
       it's actually not a valid one, so some manual work is required to
       process it.
451
452       The search pattern could look like /pat/, called "forward  search  pat‐
453       tern",  or ?pat?, called "backward search pattern". Using a search pat‐
454       tern means even if the source file is updated, as long as the part con‐
455       taining the tag doesn't change, we could still locate the tag correctly
456       by searching.
457
458       When the pattern field only  contains  the  search  pattern,  you  just
459       search  for it. The search direction (forward/backward) doesn't matter,
460       as it's decided solely by whether the -B option is enabled, and not the
461       actual  context.  You could always start the search from say the begin‐
462       ning of the file.
463
       When both the search pattern and the line number are present, you
       could make good use of the line number by going to that line first,
       then searching for the nearest occurrence of the pattern. A way to do
       this is to search both forward and backward for the pattern, and when
       there is an occurrence on both sides, go to the nearer one.
469
470       What's good about this is when there are multiple  identical  lines  in
471       the  source file (e.g. the COMMON block in Fortran), this could help us
472       find the correct one, even after the source file is updated and the tag
473       position is shifted by a few lines.
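
       The "nearest occurrence" search could be sketched in Python as
       follows. The function name nearest_match and its parameters are
       hypothetical: lines is the source file split into lines, and matches is
       any predicate that tells whether a line matches the search pattern.
       When two occurrences are equally near, this sketch picks the earlier
       one, which is an arbitrary choice.

```python
def nearest_match(lines, recorded_line, matches):
    """Return the 1-based number of the matching line nearest to
    recorded_line, or None if no line matches."""
    start = min(max(recorded_line, 1), len(lines))
    for distance in range(len(lines)):
        # Look one step further backward and forward on each round.
        for candidate in (start - distance, start + distance):
            if 1 <= candidate <= len(lines) and matches(lines[candidate - 1]):
                return candidate
    return None
```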
474
475       Now  let's  discuss how to search for the pattern. After you trim the /
476       or ? around it, the pattern resembles a regex pattern. It should  be  a
477       regex  pattern, as required by being a valid EX command, but it's actu‐
478       ally not, as you'll see below.
479
480       It could begin with a ^, which means the pattern starts from the begin‐
481       ning  of  a line. It could also end with an unescaped $ which means the
482       pattern ends at the end of a line. Let's  keep  this  information,  and
483       trim them too.
484
485       Now  the  remaining  part is the actual string containing the tag. Some
486       characters are escaped:
487
       • \.

       • $, but only at the end of the string.

       • /, but only in forward search patterns.

       • ?, but only in backward search patterns.
495
496       You need to unescape these to get the literal  string.  Now  you  could
497       convert  this  literal string to a regexp that matches it (by escaping,
498       like re.escape in Python or regexp-quote in  Elisp),  and  assemble  it
499       with  ^  or  $ if the pattern originally has it, and finally search for
500       the tag using this regexp.
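
       A Python sketch of this conversion follows. The function name
       pattern_to_regex is hypothetical. Its argument is assumed to be the
       search pattern including its delimiters, with any leading line number
       and the trailing ;" already removed.

```python
import re


def pattern_to_regex(search_pattern):
    """Convert /pat/ or ?pat? from a tags file into a compiled regex."""
    delimiter = search_pattern[0]
    body = search_pattern[1:-1]

    anchored_start = body.startswith("^")
    if anchored_start:
        body = body[1:]

    # A trailing $ is an anchor only if it is unescaped, i.e. preceded
    # by an even number (including 0) of backslashes.
    anchored_end = False
    if body.endswith("$"):
        head = body[:-1]
        if (len(head) - len(head.rstrip("\\"))) % 2 == 0:
            anchored_end = True
            body = head

    # Unescape \\, the escaped delimiter, and \$ at the very end.
    literal = []
    i = 0
    while i < len(body):
        if body[i] == "\\" and i + 1 < len(body):
            following = body[i + 1]
            if (following == "\\" or following == delimiter
                    or (following == "$" and i + 2 == len(body))):
                literal.append(following)
                i += 2
                continue
        literal.append(body[i])
        i += 1

    regex = re.escape("".join(literal))
    return re.compile(("^" if anchored_start else "") + regex
                      + ("$" if anchored_end else ""))
```

       The compiled regex can then be applied to each line of the source
       file, for example with its search() method.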
501
502   Remark: About a Previous Format of the Pattern Field
       In some earlier versions of Universal Ctags, the line number in the
       pattern field was the actual line number minus one for forward search
       patterns, or plus one for backward search patterns. The idea was to
       resemble an EX command: you go to the line, then search
       forward/backward for the pattern, and you can always find the correct
       one. But this defeats the purpose of using a search pattern: to
       tolerate file updates. For example, if the tag is at line 50, then
       according to this scheme the pattern field should be:
511
512          49;/pat/;"
513
       Then let's assume that some code above is removed, and the tag is now
       at line 45. Now you can't find it if you search forward from line 49.

       For this reason, Universal Ctags switched to using the actual line
       number. A client tool could distinguish the two schemes by the
       TAG_OUTPUT_EXCMD pseudo tag: it's "combine" for the old scheme, and
       "combineV2" for the present scheme. But there's probably no need to
       treat them differently, since "search for the nearest occurrence from
       the line" gives good results on both schemes.
523

JSON OUTPUT

525       See ctags-json-output(5).
526

CHANGES

528   Version 6.0
529       • ctags     enables     TAG_KIND_DESCRIPTION,     TAG_ROLE_DESCRIPTION,
530         TAG_FIELD_DESCRIPTION, and TAG_EXTRA_DESCRIPTION pseudo tags  by  de‐
531         fault.
532
       • TAG_PARSER_VERSION is introduced.
534

SEE ALSO

536       ctags(1),  ctags-lang-python(7),  ctags-incompatibilities(7),  tags(5),
537       ctags-json-output(5),  readtags(1),  7.2  Libtool’s  versioning  system
538       <https://www.gnu.org/software/libtool/manual/libtool.html#Libtool-ver‐
539       sioning>
540
541
542
543
6.0.0                                                    CTAGS-CLIENT-TOOLS(7)