CTAGS-CLIENT-TOOLS(7)           Universal Ctags          CTAGS-CLIENT-TOOLS(7)

NAME

6       ctags-client-tools  -  Hints  for developing a tool using ctags command
7       and tags output
8

SYNOPSIS

10       ctags [options] [file(s)]
11       etags [options] [file(s)]
12
13

DESCRIPTION

15       Client tool means a tool running the ctags  command  and/or  reading  a
16       tags  file generated by ctags command.  This man page gathers hints for
17       people who develop client tools.
18

PSEUDO-TAGS

20       Pseudo-tags, stored in a tag file, indicate  how  ctags  generated  the
21       tags  file:  whether  the  tags file is sorted or not, which version of
22       tags file format is used, the name of tags generator, and  so  on.  The
23       opposite  term  for pseudo-tags is regular-tags. A regular-tag is for a
24       language object in an input file. A pseudo-tag is for the tags file it‐
25       self. Client tools may use pseudo-tags as reference for processing reg‐
26       ular-tags.
27
28       A pseudo-tag is stored in a tags file  in  the  same  format  as  regu‐
29       lar-tags as described in tags(5), except that pseudo-tag names are pre‐
30       fixed with "!_". For the general  information  about  pseudo-tags,  see
31       "TAG FILE INFORMATION" in tags(5).
32
33       An example of a pseudo tag:
34
35          !_TAG_PROGRAM_NAME      Universal Ctags /Derived from Exuberant Ctags/
36
       The value, "Universal Ctags", associated with the pseudo tag
       "TAG_PROGRAM_NAME", is used in the field for input file. The
       description, "Derived from Exuberant Ctags", is used in the field
       for pattern.
40
41       Universal  Ctags extends the naming scheme of the classical pseudo-tags
42       available in Exuberant Ctags for emitting language specific information
43       as pseudo tags:
44
45          !_{pseudo-tag-name}!{language-name}     {associated-value}      /{description}/
46
47       The  language-name is appended to the pseudo-tag name with a separator,
48       "!".
49
50       An example of pseudo tag with a language suffix:
51
52          !_TAG_KIND_DESCRIPTION!C        f,function      /function definitions/
53
       This pseudo-tag says "the function kind of C language is enabled when
       generating this tags file." The --pseudo-tags option enables or
       disables individual pseudo-tags. When enabling or disabling a pseudo
       tag with the option, specify only the tag name, "TAG_KIND_DESCRIPTION",
       without the prefix ("!_") or the suffix ("!C").
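
       The naming scheme above can be taken apart mechanically. The
       following Python sketch is only an illustration, not part of ctags:
       the names PseudoTag and parse_pseudo_tag are hypothetical, and the
       sketch assumes tab-separated fields and does not unescape anything
       inside the description.

```python
from typing import NamedTuple, Optional, Tuple


class PseudoTag(NamedTuple):
    name: str                  # e.g. "TAG_KIND_DESCRIPTION", without "!_"
    suffixes: Tuple[str, ...]  # e.g. ("C",); empty when there is no suffix
    value: str                 # stored where a regular tag has its input file
    description: str           # stored where a regular tag has its pattern


def parse_pseudo_tag(line: str) -> Optional[PseudoTag]:
    """Return a PseudoTag for a line starting with "!_", else None."""
    if not line.startswith("!_"):
        return None
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 3:
        return None
    # "TAG_KIND_DESCRIPTION!C" -> name "TAG_KIND_DESCRIPTION", suffixes ("C",)
    name, *suffixes = fields[0][2:].split("!")
    description = fields[2]
    # Drop a trailing ;" if present, then the surrounding slashes.
    if description.endswith(';"'):
        description = description[:-2]
    if len(description) >= 2 and description[0] == description[-1] == "/":
        description = description[1:-1]
    return PseudoTag(name, tuple(suffixes), fields[1], description)
```

       Keeping the suffixes as a tuple leaves room for pseudo-tags that
       carry more than one "!"-separated part after the name.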
59
60   Options for Pseudo-tags
61       --extras=+p (or --extras=+{pseudo})
62              Forces writing pseudo-tags.
63
              ctags emits pseudo-tags by default when writing tags to a
              regular file (e.g. "tags"). However, when -o - or -f - is
              specified for writing tags to standard output, ctags doesn't
              emit pseudo-tags. --extras=+p or --extras=+{pseudo} forces
              pseudo-tags to be written.
69
70       --list-pseudo-tags
71              Lists available types of pseudo-tags and shows whether they  are
72              enabled or disabled.
73
              Running ctags with the --list-pseudo-tags option lists the
              available pseudo-tags. Some of the pseudo-tags newly introduced
              in the Universal Ctags project are disabled by default. Use
              --pseudo-tags=... to enable them.
78
79       --pseudo-tags=[+|-]names|*
80              Specifies a list of pseudo-tag types to include in the output.
81
              The parameters are a set of pseudo tag names. Valid pseudo tag
              names can be listed with --list-pseudo-tags. Surround each name
              in the set with braces, like "{TAG_PROGRAM_AUTHOR}". You don't
              have to include the "!_" pseudo tag prefix when specifying a
              name in the argument of the --pseudo-tags option.

              Pseudo-tags don't have a notation using one-letter flags.

              If a name is preceded by either the '+' or '-' character, that
              pseudo-tag is added to or removed from the current settings.
              Otherwise the names replace any current settings. All entries
              are included if '*' is given.
94
95       --fields=+E (or --fields=+{extras})
96              Attach "extras:pseudo" field to pseudo-tags.
97
98              An example of pseudo tags with the field:
99
100                 !_TAG_PROGRAM_NAME      Universal Ctags /Derived from Exuberant Ctags/  extras:pseudo
101
102              If  the  name  of a normal tag in a tag file starts with "!_", a
103              client tool cannot distinguish whether the tag is a  regular-tag
104              or  pseudo-tag.   The  fields attached with this option help the
105              tool distinguish them.
106
107   List of notable pseudo-tags
108       Running ctags with --list-pseudo-tags option lists available  types  of
109       pseudo-tags  with  short  descriptions. This subsection shows hints for
110       using notable ones.
111
112       TAG_EXTRA_DESCRIPTION (new in Universal Ctags)
113              Indicates the names and descriptions of enabled extras:
114
115                 !_TAG_EXTRA_DESCRIPTION       {extra-name}    /description/
116                 !_TAG_EXTRA_DESCRIPTION!{language-name}       {extra-name}    /description/
117
118              If your tool relies on some extra tags (extras),  refer  to  the
119              pseudo-tags  of  this type. A tool can reject the tags file that
120              doesn't include expected extras, and raise an error in an  early
121              stage of processing.
122
123              An example of the pseudo-tags:
124
125                 $ ctags --extras=+p --pseudo-tags='{TAG_EXTRA_DESCRIPTION}' -o - input.c
126                 !_TAG_EXTRA_DESCRIPTION       anonymous       /Include tags for non-named objects like lambda/
127                 !_TAG_EXTRA_DESCRIPTION       fileScope       /Include tags of file scope/
128                 !_TAG_EXTRA_DESCRIPTION       pseudo  /Include pseudo tags/
129                 !_TAG_EXTRA_DESCRIPTION       subparser       /Include tags generated by subparsers/
130                 ...
131
132              A client tool can know "{anonymous}", "{fileScope}", "{pseudo}",
133              and "{subparser}" extras are enabled from the output.
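
              The early rejection described above could look like the
              following Python sketch. The function names are hypothetical,
              and the sketch only looks at the language-independent form
              of the pseudo-tag (no "!{language-name}" suffix).

```python
def enabled_extras(tag_lines):
    """Collect extra names from !_TAG_EXTRA_DESCRIPTION pseudo-tags."""
    extras = set()
    for line in tag_lines:
        fields = line.rstrip("\n").split("\t")
        if fields[0] == "!_TAG_EXTRA_DESCRIPTION" and len(fields) >= 2:
            extras.add(fields[1])
    return extras


def check_required_extras(tag_lines, required):
    """Raise ValueError when an extra the tool relies on is not listed."""
    missing = set(required) - enabled_extras(tag_lines)
    if missing:
        raise ValueError(
            "tags file lacks required extras: " + ", ".join(sorted(missing))
        )
```

              The same approach applies to TAG_FIELD_DESCRIPTION and
              TAG_KIND_DESCRIPTION.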
134
135       TAG_FIELD_DESCRIPTION (new in Universal Ctags)
136              Indicates the names and descriptions of enabled fields:
137
138                 !_TAG_FIELD_DESCRIPTION       {field-name}    /description/
139                 !_TAG_FIELD_DESCRIPTION!{language-name}       {field-name}    /description/
140
141              If your tool relies on some fields, refer to the pseudo-tags  of
142              this  type.   A tool can reject a tags file that doesn't include
143              expected fields, and raise an error in an early  stage  of  pro‐
144              cessing.
145
146              An example of the pseudo-tags:
147
148                 $ ctags --fields-C=+'{macrodef}' --extras=+p --pseudo-tags='{TAG_FIELD_DESCRIPTION}' -o - input.c
149                 !_TAG_FIELD_DESCRIPTION       file    /File-restricted scoping/
150                 !_TAG_FIELD_DESCRIPTION       input   /input file/
151                 !_TAG_FIELD_DESCRIPTION       name    /tag name/
152                 !_TAG_FIELD_DESCRIPTION       pattern /pattern/
153                 !_TAG_FIELD_DESCRIPTION       typeref /Type and name of a variable or typedef/
154                 !_TAG_FIELD_DESCRIPTION!C     macrodef        /macro definition/
155                 ...
156
              A client tool can know from the output that the "{file}",
              "{input}", "{name}", "{pattern}", and "{typeref}" fields are
              enabled. These fields are common to all languages. In addition
              to the common fields, the tool can know that the "{macrodef}"
              field of the C language is also enabled.
162
163       TAG_FILE_ENCODING (new in Universal Ctags)
164              TBW
165
166       TAG_FILE_FORMAT
167              See also tags(5).
168
169       TAG_FILE_SORTED
170              See also tags(5).
171
172       TAG_KIND_DESCRIPTION (new in Universal Ctags)
173              Indicates the names and descriptions of enabled kinds:
174
175                 !_TAG_KIND_DESCRIPTION!{language-name}        {kind-letter},{kind-name}       /description/
176
177              If  your  tool relies on some kinds, refer to the pseudo-tags of
178              this type.  A tool can reject the tags file that doesn't include
179              expected kinds, and raise an error in an early stage of process‐
180              ing.
181
              Kinds are language specific, so a language name is always
              appended to the tag name as a suffix.
184
185              An example of the pseudo-tags:
186
187                 $ ctags --extras=+p --kinds-C=vfm --pseudo-tags='{TAG_KIND_DESCRIPTION}' -o - input.c
188                 !_TAG_KIND_DESCRIPTION!C      f,function      /function definitions/
189                 !_TAG_KIND_DESCRIPTION!C      m,member        /struct, and union members/
190                 !_TAG_KIND_DESCRIPTION!C      v,variable      /variable definitions/
191                 ...
192
193              A  client  tool  can  know "{function}", "{member}", and "{vari‐
194              able}" kinds of C language are enabled from the output.
195
196       TAG_KIND_SEPARATOR (new in Universal Ctags)
197              TBW
198
199       TAG_OUTPUT_EXCMD (new in Universal Ctags)
200              Indicates the specified type of EX command with --excmd option.
201
202       TAG_OUTPUT_FILESEP (new in Universal Ctags)
203              TBW
204
205       TAG_OUTPUT_MODE (new in Universal Ctags)
206              TBW
207
208       TAG_PATTERN_LENGTH_LIMIT (new in Universal Ctags)
209              TBW
210
211       TAG_PROC_CWD (new in Universal Ctags)
212              Indicates the working directory of ctags during processing.
213
214              This pseudo-tag helps a client tool solve the absolute paths for
215              the  input  files for tag entries even when they are tagged with
216              relative paths.
217
218              An example of the pseudo-tags:
219
220                 $ cat tags
221                 !_TAG_PROC_CWD        /tmp/   //
222                 main  input.c /^int main (void) { return 0; }$/;"     f       typeref:typename:int
223                 ...
224
              From the regular tag for "main", the client tool can know that
              "main" is in "input.c". However, that is a relative path. So if
              the directory where ctags ran and the directory where the
              client tool runs are different, the client tool cannot find
              "input.c" in the file system. In that case, TAG_PROC_CWD gives
              the tool a hint: "input.c" may be in "/tmp".
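
              A minimal Python sketch of that resolution follows. The
              function name is hypothetical, and POSIX-style paths are
              assumed (hence posixpath rather than os.path).

```python
import posixpath


def resolve_input_path(input_field, proc_cwd):
    """Turn the input field of a tag into an absolute path when possible.

    proc_cwd is the value of the TAG_PROC_CWD pseudo-tag, or None when the
    tags file doesn't contain that pseudo-tag.
    """
    if proc_cwd is None or posixpath.isabs(input_field):
        return input_field
    return posixpath.normpath(posixpath.join(proc_cwd, input_field))
```

              The result is only a hint: the source tree may have moved
              since ctags ran.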
231
232       TAG_PROGRAM_NAME
233              TBW
234
235       TAG_ROLE_DESCRIPTION (new in Universal Ctags)
236              Indicates the names and descriptions of enabled roles:
237
238                 !_TAG_ROLE_DESCRIPTION!{language-name}!{kind-name}    {role-name}     /description/
239
240              If  your  tool relies on some roles, refer to the pseudo-tags of
241              this type. Note that a role owned by  a  disabled  kind  is  not
242              listed even if the role itself is enabled.
243

REDUNDANT-KINDS

245       TBW
246

MULTIPLE-LANGUAGES FOR AN INPUT FILE

       Universal Ctags can run multiple parsers on one input file. That
       means a parser that runs other parsers may output tags for different
       languages. The language/l field can be used to show the language for
       each tag.
251
          $ cat /tmp/foo.html
          <html>
          <script>var x = 1</script>
          <h1>title</h1>
          </html>
          $ ./ctags -o - --extras=+g /tmp/foo.html
          title   /tmp/foo.html   /^  <h1>title<\/h1>$/;" h
          x       /tmp/foo.html   /var x = 1/;"   v
          $ ./ctags -o - --extras=+g --fields=+l /tmp/foo.html
          title   /tmp/foo.html   /^  <h1>title<\/h1>$/;" h       language:HTML
          x       /tmp/foo.html   /var x = 1/;"   v       language:JavaScript
263

UTILIZING READTAGS

265       See  readtags(1)  to know how to use readtags. This section is for dis‐
266       cussing some notable topics for client tools.
267
268   Build Filter/Sorter Expressions
269       Certain escape sequences in expressions are recognized by readtags. For
270       example,  when searching for a tag that matches a\?b, if using a filter
271       expression like '(eq? $name "a\?b")', since \?  is  translated  into  a
272       single ? by readtags, it actually searches for a?b.
273
       Another problem is that if a single quote appears in a filter
       expression (which is itself wrapped in single quotes), it terminates
       the expression, producing a broken expression, and may even cause
       unintended shell injection. Single quotes can be escaped using '"'"'.
278
279       So, client tools need to:
280
281       • Replace \ by \\
282
283       • Replace ' by '"'"'
284
285       inside the expressions. If the expression also contains strings,  "  in
286       the strings needs to be replaced by \".
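
       The replacements above can be sketched in Python as follows. The
       helper names are hypothetical. Backslashes are doubled before double
       quotes are escaped, so that the backslashes added for the quotes are
       not doubled again.

```python
def quote_string(s):
    """Render s as a string literal inside a readtags expression."""
    return '"' + s.replace("\\", "\\\\").replace('"', '\\"') + '"'


def shell_single_quote(expr):
    """Wrap expr in single quotes for a POSIX shell command line."""
    return "'" + expr.replace("'", "'\"'\"'") + "'"


def name_filter(name):
    """Build a shell-quoted filter matching tags whose name equals name."""
    return shell_single_quote("(eq? $name " + quote_string(name) + ")")
```

       A tool that starts readtags without a shell, passing the arguments
       as a list, only needs quote_string; the single-quote replacement
       exists for the shell's benefit.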
287
288       Client  tools  written  in Lisp could build the expression using lists.
289       prin1 (in Common Lisp style Lisps) and write (in  Scheme  style  Lisps)
290       can translate the list into a string that can be directly used. For ex‐
291       ample, in EmacsLisp:
292
          (let ((name "hi"))
            (prin1 `(eq? $name ,name)))
          => "(eq\\? $name "hi")"
296
       The "?" is escaped, and readtags can handle it. Scheme style Lisps
       should do proper escaping so the expression readtags gets is just the
       expression passed into write. Common Lisp style Lisps may produce
       escape sequences that readtags doesn't recognize, like \#. Readtags
       provides some aliases for these Lisps:
302
303       • Use true for #t.
304
305       • Use false for #f.
306
307       • Use nil or () for ().
308
309       • Use (string->regexp "PATTERN") for  #/PATTERN/.  Use  (string->regexp
310         "PATTERN"  :case-fold true) for #/PATTERN/i. Notice that string->reg‐
311         exp doesn't require escaping "/" in the pattern.
312
313       Notice that even when the client tool uses this method, '  still  needs
314       to  be replaced by '"'"' to prevent broken expressions and shell injec‐
315       tion.
316
317       Another thing to notice is that missing fields are represented  by  #f,
318       and applying string operators to them will produce an error. You should
319       always check if a field is missing before  applying  string  operators.
320       See  the "Filtering" section in readtags(1) to know how to do this. Run
321       "readtags -H filter" to see which operators take string arguments.
322
323   Parse Readtags Output
324       In the output of readtags, tabs can appear in all field  values  (e.g.,
325       the  tag  name itself could contain tabs), which makes it hard to split
326       the line into fields. Client tools should  use  the  -E  option,  which
327       keeps  the  escape  sequences  in the tags file, so the only field that
328       could contain tabs is the pattern field.
329
330       The pattern field could:
331
332       • Use a line number. It will look like number;" (e.g. 10;").
333
334       • Use a search pattern. It will look like /pattern/;"  or  ?pattern?;".
335         Notice that the search pattern could contain tabs.
336
337       • Combine these two, like number;/pattern/;" or number;?pattern?;".
338
339       These  are  true for tags files using extended format, which is the de‐
340       fault one.  The legacy format (i.e.  --format=1)  doesn't  include  the
341       semicolons. It's old and barely used, so we won't discuss it here.
342
343       Client tools could split the line using the following steps:
344
345       • Find  the  first  2  tabs  in  the line, so we get the name and input
346         field.
347
348       • From the 2nd tab:
349
350         • If a / follows, then the pattern delimiter is /.
351
352         • If a ? follows, then the pattern delimiter is ?.
353
354         • If a number follows, then:
355
356           • If a ;/ follows the number, then the delimiter is /.
357
358           • If a ;? follows the number, then the delimiter is ?.
359
360           • If a ;" follows the number, then the field uses only line number,
361             and  there's  no  pattern  delimiter (since there's no regex pat‐
362             tern). In this case the pattern field ends at the 3rd tab.
363
364       • After the opening delimiter, find the next unescaped  pattern  delim‐
365         iter, and that's the closing delimiter. It will be followed by ;" and
366         then a tab.  That's the end of the pattern field. By "unescaped  pat‐
367         tern  delimiter",  we  mean  there's  an even number (including 0) of
368         backslashes before it.
369
370       • From here, split the rest of the line into fields by tabs.
371
372       Then, the escape sequences in  fields  other  than  the  pattern  field
373       should  be  translated. See "Proposal" in tags(5) to know about all the
374       escape sequences.
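
       The splitting steps above could be sketched in Python like this.
       The function name is hypothetical, the input is assumed to be one
       line of "readtags -E" output in the extended format, and the sketch
       stops before translating escape sequences.

```python
def split_tag_line(line):
    """Split a line into [name, input, pattern, other fields...]."""
    name, input_field, rest = line.rstrip("\n").split("\t", 2)
    i = 0
    while i < len(rest) and rest[i].isdigit():  # optional line number
        i += 1
    if i > 0:
        if rest[i:i + 2] == ';"':
            # Line number only: the pattern field ends at the next tab.
            pattern, _, tail = rest.partition("\t")
            return [name, input_field, pattern] + (tail.split("\t") if tail else [])
        if rest[i:i + 1] != ";":
            raise ValueError("unrecognized pattern field: " + rest)
        i += 1  # skip the ";" between the number and the delimiter
    if i >= len(rest) or rest[i] not in "/?":
        raise ValueError("unrecognized pattern field: " + rest)
    delimiter = rest[i]
    # The closing delimiter is the next delimiter character preceded by an
    # even number (including 0) of backslashes.
    j, backslashes = i + 1, 0
    while j < len(rest):
        if rest[j] == "\\":
            backslashes += 1
        else:
            if rest[j] == delimiter and backslashes % 2 == 0:
                break
            backslashes = 0
        j += 1
    else:
        raise ValueError("unterminated pattern: " + rest)
    end = j + 1
    if rest[end:end + 2] == ';"':
        end += 2
    tail = rest[end:]
    fields = tail[1:].split("\t") if tail.startswith("\t") else []
    return [name, input_field, rest[:end]] + fields
```

       The pattern field is returned as it appears in the line, including
       any line number and the trailing ;" marker.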
375
376   Make Use of the Pattern Field
377       The pattern field specifies how to find a tag in its source  file.  The
378       code  generating  this field seems to have a long history, so there are
379       some pitfalls and it's a bit hard to handle. A client tool could simply
380       require the line: field and jump to the line it specifies, to avoid us‐
381       ing the pattern field. But anyway, we'll discuss how to make  the  best
382       use of it here.
383
384       You  should  take  the  words here merely as suggestions, and not stan‐
385       dards. A client tool could definitely develop better (or simpler)  ways
386       to use the pattern field.
387
388       From  the  last section, we know the pattern field could contain a line
389       number and a search pattern. When it only  contains  the  line  number,
390       handling it is easy: you simply go to that line.
391
       The search pattern resembles an EX command, but as we'll see later,
       it's actually not a valid one, so some manual work is required to
       process it.
395
396       The  search  pattern could look like /pat/, called "forward search pat‐
397       tern", or ?pat?, called "backward search pattern". Using a search  pat‐
398       tern means even if the source file is updated, as long as the part con‐
399       taining the tag doesn't change, we could still locate the tag correctly
400       by searching.
401
402       When  the  pattern  field  only  contains  the search pattern, you just
403       search for it. The search direction (forward/backward) doesn't  matter,
404       as it's decided solely by whether the -B option is enabled, and not the
405       actual context. You could always start the search from say  the  begin‐
406       ning of the file.
407
       When both the search pattern and the line number are present, you
       could make good use of the line number by going to the line first,
       then searching for the nearest occurrence of the pattern. A way to do
       this is to search both forward and backward for the pattern, and when
       there is an occurrence on both sides, go to the nearer one.
413
414       What's  good  about  this is when there are multiple identical lines in
415       the source file (e.g. the COMMON block in Fortran), this could help  us
416       find the correct one, even after the source file is updated and the tag
417       position is shifted by a few lines.
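
       One simple way to find the nearest occurrence, sketched in Python
       under the assumption that the whole source file fits in memory, is
       to scan every line and keep the match closest to the recorded line
       number. The function name and the matches callback are hypothetical;
       when two matches are equally near, this sketch picks the earlier
       one.

```python
def nearest_match(lines, hint_lineno, matches):
    """Return the 1-based number of the matching line nearest to
    hint_lineno, or None when no line matches.

    matches is a callable taking the text of one line and returning a
    true value when that line matches the search pattern.
    """
    best = None
    for lineno, text in enumerate(lines, start=1):
        if matches(text):
            if best is None or abs(lineno - hint_lineno) < abs(best - hint_lineno):
                best = lineno
    return best
```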
418
419       Now let's discuss how to search for the pattern. After you trim  the  /
420       or  ?  around it, the pattern resembles a regex pattern. It should be a
421       regex pattern, as required by being a valid EX command, but it's  actu‐
422       ally not, as you'll see below.
423
424       It could begin with a ^, which means the pattern starts from the begin‐
425       ning of a line. It could also end with an unescaped $ which  means  the
426       pattern  ends  at  the  end of a line. Let's keep this information, and
427       trim them too.
428
429       Now the remaining part is the actual string containing  the  tag.  Some
430       characters are escaped:
431
       • \ (a backslash).

       • $, but only at the end of the string.

       • /, but only in forward search patterns.

       • ?, but only in backward search patterns.
439
440       You  need  to  unescape  these to get the literal string. Now you could
441       convert this literal string to a regexp that matches it  (by  escaping,
442       like  re.escape  in  Python  or regexp-quote in Elisp), and assemble it
443       with ^ or $ if the pattern originally has it, and  finally  search  for
444       the tag using this regexp.
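
       The conversion described above could be sketched in Python as
       follows. The function name is hypothetical, and the argument is
       assumed to be the bare search pattern (/pat/ or ?pat?), with any
       line number and the trailing ;" already removed.

```python
import re


def search_pattern_to_regex(pattern):
    """Compile a ctags search pattern into a regex matching one line."""
    if len(pattern) < 2 or pattern[0] not in "/?" or pattern[-1] != pattern[0]:
        raise ValueError("not a search pattern: " + pattern)
    delimiter, body = pattern[0], pattern[1:-1]
    anchored_start = body.startswith("^")
    if anchored_start:
        body = body[1:]
    # A trailing $ is an anchor only when it is unescaped, i.e. preceded
    # by an even number of backslashes.
    anchored_end = False
    if body.endswith("$"):
        stripped = body[:-1]
        backslashes = len(stripped) - len(stripped.rstrip("\\"))
        if backslashes % 2 == 0:
            anchored_end, body = True, stripped
    # Unescape \\, the escaped delimiter, and \$ at the end of the string.
    literal, i = [], 0
    while i < len(body):
        if body[i] == "\\" and i + 1 < len(body):
            nxt = body[i + 1]
            if nxt in ("\\", delimiter) or (nxt == "$" and i + 2 == len(body)):
                literal.append(nxt)
                i += 2
                continue
        literal.append(body[i])
        i += 1
    regex = re.escape("".join(literal))
    return re.compile(("^" if anchored_start else "") + regex
                      + ("$" if anchored_end else ""))
```

       The compiled regex is meant to be applied to one source line at a
       time with its search method.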
445
446   Remark: About a Previous Format of the Pattern Field
       In some earlier versions of Universal Ctags, the line number in the
       pattern field was the actual line number minus one for forward search
       patterns, or plus one for backward search patterns. The idea was to
       resemble an EX command: you go to the line, then search
       forward/backward for the pattern, and you always find the correct
       one. But this defeats the purpose of using a search pattern: to
       tolerate file updates. For example, if the tag is at line 50, then
       according to this scheme the pattern field should be:
455
456          49;/pat/;"
457
       Then let's assume that some code above the tag is removed, and the
       tag is now at line 45. Now you can't find it if you search forward
       from line 49.
460
       For this reason, Universal Ctags switched to using the actual line
       number. A client tool could distinguish the two schemes by the
       TAG_OUTPUT_EXCMD pseudo tag: it's "combine" for the old scheme, and
       "combineV2" for the present scheme. But there's probably no need to
       treat them differently, since "search for the nearest occurrence from
       the line" gives good results with both schemes.
467

JSON OUTPUT

469       Universal Ctags supports JSON (strictly  speaking  JSON  Lines)  output
470       format  if  the ctags executable is built with libjansson.  JSON output
471       goes to standard output by default.
472
473   Format
474       Each JSON line represents a tag.
475
          $ ctags --extras=+p --output-format=json --fields=-s input.py
          {"_type": "ptag", "name": "JSON_OUTPUT_VERSION", "path": "0.0", "pattern": "in development"}
          {"_type": "ptag", "name": "TAG_FILE_SORTED", "path": "1", "pattern": "0=unsorted, 1=sorted, 2=foldcase"}
          ...
          {"_type": "tag", "name": "Klass", "path": "/tmp/input.py", "pattern": "/^class Klass:$/", "language": "Python", "kind": "class"}
          {"_type": "tag", "name": "method", "path": "/tmp/input.py", "pattern": "/^    def method(self):$/", "language": "Python", "kind": "member", "scope": "Klass", "scopeKind": "class"}
          ...
483
       A key not starting with _ is mapped to a field of ctags. The
       "--output-format=json --list-fields" options list the fields.
486
487       A  key  starting  with  _ represents meta information of the JSON line.
488       Currently only _type key is used. If the value for the key is tag,  the
489       JSON  line represents a normal tag. If the value is ptag, the line rep‐
490       resents a pseudo-tag.
491
       The output format may change in the future. The JSON_OUTPUT_VERSION
       pseudo-tag gives client tools a chance to handle such changes. The
       current version is "0.0". A client tool can extract the version from
       the path key of the pseudo-tag.
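
       Reading this output could be sketched in Python as follows. The
       function names are hypothetical. The sketch only relies on the
       _type, name, and path keys shown in the example above.

```python
import json


def read_json_tags(lines):
    """Separate pseudo-tags from regular tags in JSON Lines output.

    Returns (ptags, tags): ptags maps a pseudo-tag name to the list of
    entries with that name, and tags is the list of regular tag entries.
    """
    ptags, tags = {}, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        entry = json.loads(line)
        if entry.get("_type") == "ptag":
            ptags.setdefault(entry["name"], []).append(entry)
        elif entry.get("_type") == "tag":
            tags.append(entry)
    return ptags, tags


def json_output_version(ptags):
    """Return the JSON_OUTPUT_VERSION value, or None when it is absent."""
    entries = ptags.get("JSON_OUTPUT_VERSION")
    return entries[0]["path"] if entries else None
```

       A tool can compare the returned version with the versions it knows
       and refuse to continue when the format is newer than expected.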
496
       The JSON output format is newly designed and doesn't have the
       limitations of the default tags file format.

       • The values for the kind key are represented with long-name flags.
         One-letter flags are not used.
502
503       • Scope  names  and  scope  kinds  have  distinguished  keys: scope and
504         scopeKind.  They are combined in the default tags file format.
505
506   Data type used in a field
       Values for most keys are represented as JSON strings. However, some
       of them are represented as string, integer, and/or boolean types.

       The "--output-format=json --list-fields" options show which data
       types are used in each field of the JSON output.
513
          $ ctags --output-format=json --list-fields
          #LETTER NAME           ENABLED LANGUAGE         JSTYPE FIXED DESCRIPTION
          F       input          yes     NONE             s--    no    input file
          ...
          P       pattern        yes     NONE             s-b    no    pattern
          ...
          f       file           yes     NONE             --b    no    File-restricted scoping
          ...
          e       end            no      NONE             -i-    no    end lines of various items
          ...
524
525       JSTYPE column shows the data types.
526
527       's'    string
528
529       'i'    integer
530
531       'b'    boolean (true or false)
532
       For example, the value of the pattern field of ctags is a string or
       a boolean value.
535

SEE ALSO

537       ctags(1),  ctags-lang-python(7),  ctags-incompatibilities(7),  tags(5),
538       readtags(1)
539
540
541
542
5.9.0                                                    CTAGS-CLIENT-TOOLS(7)