DEBUGINFOD(8)               System Manager's Manual              DEBUGINFOD(8)

NAME

       debuginfod - debuginfo-related http file-server daemon

SYNOPSIS

       debuginfod [OPTION]... [PATH]...

DESCRIPTION

       debuginfod serves debuginfo-related artifacts over HTTP.  It periodically
       scans a set of directories for ELF/DWARF files and their associated
       source code, as well as archive files containing the above, to build an
       index by their buildid.  This index is used when remote clients use the
       HTTP webapi to fetch these files by the same buildid.

       If a debuginfod cannot service a given buildid artifact request itself,
       and it is configured with information about upstream debuginfod servers,
       it queries them for the same information, just as debuginfod-find would.
       If successful, it locally caches then relays the file content to the
       original requester.

       Indexing the given PATHs proceeds using multiple threads.  One thread
       periodically traverses all the given PATHs logically or physically (see
       the -L option).  Duplicate PATHs are ignored.  You may use a file name
       for a PATH, but source code indexing may be incomplete; prefer using a
       directory that contains the binaries.  The traversal thread enumerates
       all matching files (see the -I and -X options) into a work queue.  A
       collection of scanner threads (see the -c option) wait at the work
       queue to analyze files in parallel.

       If the -F option is given, each file is scanned as an ELF/DWARF file.
       Source files are matched with DWARF files based on the AT_comp_dir
       (compilation directory) attributes inside them.  Caution: source files
       listed in the DWARF may be paths anywhere in the file system, and
       debuginfod will readily serve their content on demand.  (Imagine a
       doctored DWARF file that lists /etc/passwd as a source file.)  If this
       is a concern, audit your binaries with tools such as:

       % eu-readelf -wline BINARY | sed -n '/^Directory.table/,/^File.name.table/p'
       or
       % eu-readelf -wline BINARY | sed -n '/^Directory.table/,/^Line.number/p'
       or even use debuginfod itself:
       % debuginfod -vvv -d :memory: -F BINARY 2>&1 | grep 'recorded.*source'
       ^C

       If any of the -R, -U, or -Z options is given, each file is scanned as an
       archive file that may contain ELF/DWARF/source files.  Archive files are
       recognized by extension.  If -R is given, ".rpm" files are scanned; if
       -U is given, ".deb" and ".ddeb" files are scanned; if -Z is given, the
       listed extensions are scanned.  Because of complications such as
       DWZ-compressed debuginfo, debuginfod may require two traversal passes to
       identify all source code.  Source files for RPMs are only served from
       other RPMs, so the caution for -F does not apply.  Note that due to
       Debian/Ubuntu packaging policies & mechanisms, debuginfod cannot resolve
       source files for DEB/DDEB at all.

       If no PATH is listed, or none of the scanning options is given, then
       debuginfod will simply serve content that it accumulated into its index
       in all previous runs, and federate to any upstream debuginfod servers.

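       As a minimal sketch of the extension rule for archives described above,
       the following shell function reports which scanning option would pick
       up a given file name.  The function name archive_option is hypothetical
       and not part of debuginfod, and extensions added with -Z are not
       considered.

```shell
# archive_option FILE
# Hypothetical helper: print the built-in archive scanning option that
# recognizes FILE by its name extension, or "none" if neither -R nor -U
# would treat it as an archive.
archive_option() {
    case $1 in
        *.rpm)         echo "-R" ;;
        *.deb|*.ddeb)  echo "-U" ;;
        *)             echo "none" ;;
    esac
}
```

       For instance, "archive_option foo-debuginfo.rpm" prints -R, which
       suggests starting debuginfod with -R on the directory holding that
       file.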

OPTIONS

       -F     Activate ELF/DWARF file scanning.  The default is off.

       -Z EXT -Z EXT=CMD
              Activate an additional pattern in archive scanning.  Files with
              name extension EXT (include the dot) will be processed.  If CMD
              is given, it is invoked with the file name added to its argument
              list, and should produce a common archive on its standard
              output.  Otherwise, the file is read as if CMD were "cat".
              Since debuginfod internally uses libarchive to read archive
              files, it can accept a wide range of archive formats and
              compression modes.  The default is no additional patterns.  This
              option may be repeated.

       -R     Activate RPM patterns in archive scanning.  The default is off.
              Equivalent to -Z .rpm=cat, since libarchive can natively process
              RPM archives.  If your version of libarchive is much older than
              2020, be aware that some distributions have switched to an
              incompatible zstd compression for their payload.  You may
              experiment with -Z .rpm='(rpm2cpio|zstdcat)<' instead of -R.

       -U     Activate DEB/DDEB patterns in archive scanning.  The default is
              off.  Equivalent to -Z .deb='dpkg-deb --fsys-tarfile'
              -Z .ddeb='dpkg-deb --fsys-tarfile'.

       -d FILE --database=FILE
              Set the path of the sqlite database used to store the index.
              This file is disposable in the sense that a later rescan will
              repopulate data.  It will contain absolute file path names, so
              it may not be portable across machines.  It may be frequently
              read/written, so it should be on a fast filesystem.  It should
              not be shared across machines or users, to maximize sqlite
              locking performance.  The default database file is
              $HOME/.debuginfod.sqlite.

       -D SQL --ddl=SQL
              Execute the given sqlite statement after the database is opened
              and initialized as extra DDL (SQL data definition language).
              This may be useful to tune performance-related pragmas or
              indexes.  May be repeated.  The default is nothing extra.

       -p NUM --port=NUM
              Set the TCP port number (0 < NUM < 65536) on which debuginfod
              should listen, to service HTTP requests.  Both IPv4 and IPv6
              sockets are opened, if possible.  The webapi is documented
              below.  The default port number is 8002.

       -I REGEX --include=REGEX -X REGEX --exclude=REGEX
              Govern the inclusion and exclusion of file names under the
              search paths.  The regular expressions are interpreted as
              unanchored POSIX extended REs, thus may include alternation.
              They are evaluated against the full path of each file, based on
              its realpath(3) canonicalization.  By default, all files are
              included and none are excluded.  A file that matches both
              include and exclude REGEX is excluded.  (The contents of archive
              files are not subject to inclusion or exclusion filtering: they
              are all processed.)  Only the last of each type of regular
              expression given is used.

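       The interplay of -I and -X can be tried out before starting a scan.
       The sketch below is only an approximation of the rule stated above: it
       uses grep -E for unanchored POSIX extended matching, and it assumes the
       caller passes an already canonicalized path.  The function name
       would_scan is hypothetical, and the patterns ".*" (include everything)
       and "^$" (exclude nothing) are assumed stand-ins for the defaults.

```shell
# would_scan PATH INCLUDE_RE EXCLUDE_RE
# Hypothetical helper: succeed (exit status 0) if PATH matches INCLUDE_RE
# and does not match EXCLUDE_RE; fail otherwise.
would_scan() {
    # Must match the include pattern (unanchored extended regex).
    printf '%s\n' "$1" | grep -Eq -- "$2" || return 1
    # A path matching both patterns is excluded.
    if printf '%s\n' "$1" | grep -Eq -- "$3"; then
        return 1
    fi
    return 0
}
```

       For example, would_scan /srv/pkgs/a.rpm '\.(debug|rpm)$' '/tmp/'
       succeeds, while the same call for /tmp/a.rpm fails because the
       exclusion takes precedence.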

       -t SECONDS --rescan-time=SECONDS
              Set the rescan time for the file and archive directories.  This
              is the amount of time the traversal thread will wait after
              finishing a scan, before doing it again.  A rescan for unchanged
              files is fast (because the index also stores the file mtimes).
              A time of zero is acceptable, and means that only one initial
              scan should be performed.  The default rescan time is 300
              seconds.  Receiving a SIGUSR1 signal triggers a new scan,
              independent of the rescan time (including if it was zero),
              interrupting a groom pass (if any).

       -g SECONDS --groom-time=SECONDS
              Set the groom time for the index database.  This is the amount
              of time the grooming thread will wait after finishing a grooming
              pass before doing it again.  A groom operation quickly rescans
              all previously scanned files, only to see if they are still
              present and current, so it can deindex obsolete files.  See also
              the DATA MANAGEMENT section.  The default groom time is 86400
              seconds (1 day).  A time of zero is acceptable, and means that
              only one initial groom should be performed.  Receiving a SIGUSR2
              signal triggers a new grooming pass, independent of the groom
              time (including if it was zero), interrupting a rescan pass (if
              any).

       -G     Run an extraordinary maximal-grooming pass at debuginfod
              startup.  This pass can take considerable time, because it tries
              to remove any debuginfo-unrelated content from the
              archive-related parts of the index.  It should not be run if any
              recent archive-related indexing operations were aborted early.
              It can take considerable space, because it finishes up with an
              sqlite "vacuum" operation, which repacks the database file by
              triplicating it temporarily.  The default is not to do
              maximal-grooming.  See also the DATA MANAGEMENT section.

       -c NUM --concurrency=NUM
              Set the concurrency limit for the scanning queue threads, which
              work together to process archives & files located by the
              traversal thread.  This is important for controlling
              CPU-intensive operations like parsing an ELF file and especially
              decompressing archives.  The default is the number of processors
              on the system; the minimum is 1.

       -L     Traverse symbolic links encountered during traversal of the
              PATHs, including across devices - as in find -L.  The default is
              to traverse the physical directory structure only, stay on the
              same device, and ignore symlinks - as in find -P -xdev.
              Caution: a loop in the symbolic directory tree might lead to
              infinite traversal.

       --fdcache-fds=NUM --fdcache-mbs=MB --fdcache-prefetch=NUM2
              Configure limits on a cache that keeps recently extracted files
              from archives.  Up to NUM requested files and up to a total of
              MB megabytes will be kept extracted, in order to avoid having to
              decompress their archives over and over again.  In addition, up
              to NUM2 other files from an archive may be prefetched into the
              cache before they are even requested.  The default NUM, NUM2,
              and MB values depend on the concurrency of the system, and on
              the available disk space on the $TMPDIR or /tmp filesystem.
              This is because that is where the most recently used extracted
              files are kept.  Grooming cleans this cache.

       --fdcache-mintmp=NUM
              Configure a disk space threshold for emergency flushing of the
              cache.  The filesystem holding the cache is checked
              periodically.  If the available space falls below the given
              percentage, the cache is flushed, and the fdcache will stay
              disabled until the next groom cycle.  This mechanism, along with
              a few associated /metrics on the webapi, is intended to give an
              operator notice about storage scarcity - which can translate to
              RAM scarcity if the disk happens to be on a RAM virtual disk.
              The default threshold is 25%.

       -v     Increase verbosity of logging to the standard error file
              descriptor.  May be repeated to increase details.  The default
              verbosity is 0.

WEBAPI

       debuginfod's webapi resembles ordinary file service, where a GET request
       with a path containing a known buildid results in a file.  Unknown
       buildid / request combinations result in HTTP error codes.  This file
       service resemblance is intentional, so that an installation can take
       advantage of standard HTTP management infrastructure.

       There are three buildid-based requests.  In each case, the buildid is
       encoded as a lowercase hexadecimal string.  For example, for a program
       /bin/ls, look at the ELF note GNU_BUILD_ID:

       % readelf -n /bin/ls | grep -A4 build.id
       Note section [ 4] '.note.gnu.buildid' of 36 bytes at offset 0x340:
       Owner          Data size  Type
       GNU                   20  GNU_BUILD_ID
       Build ID: 8713b9c3fb8a720137a4a08b325905c7aaf8429d

       Then the hexadecimal BUILDID is simply:

       8713b9c3fb8a720137a4a08b325905c7aaf8429d

   /buildid/BUILDID/debuginfo
       If the given buildid is known to the server, this request will result
       in a binary object that contains the customary .*debug_* sections.
       This may be a split debuginfo file as created by strip, or it may be an
       original unstripped executable.

   /buildid/BUILDID/executable
       If the given buildid is known to the server, this request will result
       in a binary object that contains the normal executable segments.  This
       may be an executable stripped by strip, or it may be an original
       unstripped executable.  ET_DYN shared libraries are considered to be a
       type of executable.

   /buildid/BUILDID/source/SOURCE/FILE
       If the given buildid is known to the server, this request will result
       in a binary object that contains the source file mentioned.  The path
       should be absolute.  Relative path names commonly appear in the DWARF
       file's source directory, but these paths are relative to individual
       compilation unit AT_comp_dir paths, and yet an executable is made up of
       multiple CUs.  Therefore, to disambiguate, debuginfod expects source
       queries to prefix relative path names with the CU
       compilation-directory, followed by a mandatory "/".

       Note: the caller may or may not elide ../ or /./ or extraneous ///
       sorts of path components in the directory names.  debuginfod accepts
       both forms.  Specifically, debuginfod canonicalizes path names
       according to RFC3986 section 5.2.4 (Remove Dot Segments), plus reducing
       any // to / in the path.

       For example:

       #include <stdio.h>               /buildid/BUILDID/source/usr/include/stdio.h
       /path/to/foo.c                   /buildid/BUILDID/source/path/to/foo.c
       ../bar/foo.c AT_comp_dir=/zoo/   /buildid/BUILDID/source/zoo//../bar/foo.c

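       A client that prefers to send tidy source paths can normalize them
       first.  The following is a simplified sketch rather than debuginfod's
       own code: it splits an absolute path on "/", ignores empty and "."
       segments, and lets ".." drop the preceding segment.  Under a strict
       reading of RFC 3986 an empty segment still counts as a segment, so
       inputs that mix "//" with ".." may come out differently there.  The
       names canon_path and source_url are hypothetical, as is the server
       prefix passed as BASE, and no URL escaping is attempted.

```shell
# canon_path PATH
# Hypothetical helper: print a normalized form of an absolute PATH.
# Empty segments (from "//") and "." are skipped; ".." removes the
# previously kept segment.  Note: leaves globbing enabled on return.
canon_path() {
    out=
    old_ifs=$IFS
    IFS=/
    set -f                      # no globbing while splitting on "/"
    for seg in $1; do
        case $seg in
            ''|.) ;;
            ..)   out=${out%/*} ;;
            *)    out="$out/$seg" ;;
        esac
    done
    set +f
    IFS=$old_ifs
    printf '%s\n' "${out:-/}"
}

# source_url BASE BUILDID PATH
# Hypothetical helper: print the webapi URL for a source file, given a
# server prefix BASE such as http://localhost:8002 (8002 being the
# documented default port).
source_url() {
    printf '%s/buildid/%s/source%s\n' "${1%/}" "$2" "$(canon_path "$3")"
}
```

       The printed URL can then be handed to any HTTP client.  Since
       debuginfod accepts both the elided and the unelided forms, this step is
       optional.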

   /metrics
       This endpoint returns a Prometheus formatted text/plain dump of a
       variety of statistics about the operation of the debuginfod server.
       The exact set of metrics and their meanings may change in future
       versions.  Caution: configuration information (path names, versions)
       may be disclosed.

DATA MANAGEMENT

       debuginfod stores its index in an sqlite database in a densely packed
       set of interlinked tables.  While the representation is as efficient as
       we have been able to make it, it still takes a considerable amount of
       data to record all debuginfo-related data of potentially a great many
       files.  This section offers some advice about the implications.

       As a general explanation for size, consider that when debuginfod
       indexes ELF/DWARF files, it stores their names, their referenced source
       file names, and their buildids.  When indexing archives, it stores
       every file name of or in an archive, every buildid, plus every source
       file name referenced from a DWARF file.  (Indexing archives takes more
       space because the source files often reside in separate subpackages
       that may not be indexed at the same pass, so extra metadata has to be
       kept.)

       Getting down to numbers, in the case of Fedora RPMs (essentially,
       gzip-compressed cpio files), the sqlite index database tends to be from
       0.5% to 3% of their size.  It's larger for binaries that are assembled
       out of a great many source files, or packages that carry much
       debuginfo-unrelated content.  It may be even larger during the indexing
       phase due to temporary sqlite write-ahead-logging files; these are
       checkpointed (cleaned out and removed) at shutdown.  It may be helpful
       to apply tight -I or -X regular-expression constraints to exclude files
       from scanning that you know have no debuginfo-relevant content.

       As debuginfod runs, it periodically rescans its target directories, and
       any new content found is added to the database.  Old content, such as
       data for files that have disappeared or that have been replaced with
       newer versions, is removed at a periodic grooming pass.  This means
       that the sqlite files grow fast during initial indexing, slowly during
       index rescans, and periodically shrink during grooming.  An optional
       one-shot maximal grooming pass is also available.  It removes
       debuginfo-unrelated data from the archive content index, such as file
       names found in archives ("archive sdef" records) that are not referred
       to as source files from any binaries found in archives ("archive sref"
       records).  This can save considerable disk space.  However, it is slow
       and temporarily requires up to twice the database size as free space.
       Worse: it may result in missing source-code info if the archive
       traversals were interrupted, so that not all source file references
       were known.  Use it rarely to polish a complete index.

       You should ensure that ample disk space remains available.  (The flood
       of error messages on -ENOSPC is ugly and nagging.  But, like for most
       other errors, debuginfod will resume when resources permit.)  If
       necessary, debuginfod can be stopped, the database file moved or
       removed, and debuginfod restarted.

       sqlite offers several performance-related options in the form of
       pragmas.  Some may be useful to fine-tune the defaults plus the
       debuginfod extras.  The -D option may be useful to tell debuginfod to
       execute the given bits of SQL after the basic schema creation commands.
       For example, the "synchronous", "cache_size", "auto_vacuum", "threads",
       "journal_mode" pragmas may be fun to tweak via -D, if you're searching
       for peak performance.  The "optimize", "wal_checkpoint" pragmas may be
       useful to run periodically, outside debuginfod.  The default settings
       are performance- rather than reliability-oriented, so a hardware crash
       might corrupt the database.  In these cases, it may be necessary to
       manually delete the sqlite database and start over.

       As debuginfod changes in the future, we may have no choice but to
       change the database schema in an incompatible manner.  If this happens,
       new versions of debuginfod will issue SQL statements to drop all prior
       schema & data, and start over.  So, disk space will not be wasted for
       retaining a no-longer-usable dataset.

       In summary, if your system can bear a 0.5%-3% index-to-archive-dataset
       size ratio, and slow growth afterwards, you should not need to worry
       about disk space.  If a system crash corrupts the database, or you want
       to force debuginfod to reset and start over, simply erase the sqlite
       file before restarting debuginfod.

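       To put the 0.5%-3% ratio into concrete terms, the sketch below turns an
       archive dataset size into a rough index size range.  It merely applies
       the two percentages quoted above with integer arithmetic; the function
       name index_size_range is hypothetical, and real index sizes may fall
       outside this range for the reasons given earlier in this section.

```shell
# index_size_range ARCHIVE_SIZE
# Hypothetical helper: print the low (0.5%) and high (3%) index size
# estimates for a dataset of ARCHIVE_SIZE, in the same unit as the input.
index_size_range() {
    low=$(( $1 * 5 / 1000 ))
    high=$(( $1 * 3 / 100 ))
    printf '%s %s\n' "$low" "$high"
}
```

       For example, feeding it the size in kilobytes reported by du -sk for an
       archive directory gives the two estimates in kilobytes as well.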

SECURITY

       debuginfod does not include any particular security features.  While it
       is robust with respect to inputs, some abuse is possible.  It forks a
       new thread for each incoming HTTP request, which could lead to a
       denial-of-service in terms of RAM, CPU, disk I/O, or network I/O.  If
       this is a problem, users are advised to install debuginfod with an
       HTTPS reverse-proxy front-end that enforces site policies for
       firewalling, authentication, integrity, authorization, and load
       control.  The /metrics webapi endpoint is probably not appropriate for
       disclosure to the public.

       When relaying queries to upstream debuginfods, debuginfod does not
       include any particular security features.  It trusts that the binaries
       returned by the debuginfods are accurate.  Therefore, the list of
       servers should include only trustworthy ones.  If accessed across HTTP
       rather than HTTPS, the network should be trustworthy.  Authentication
       information through the internal libcurl library is not currently
       enabled.

ENVIRONMENT VARIABLES

       TMPDIR This environment variable points to a file system to be used for
              temporary files.  The default is /tmp.

       DEBUGINFOD_URLS
              This environment variable contains a list of URL prefixes for
              trusted debuginfod instances.  Alternate URL prefixes are
              separated by space.  Avoid referential loops that cause a server
              to contact itself, directly or indirectly - the results would be
              hilarious.

       DEBUGINFOD_TIMEOUT
              This environment variable governs the timeout for each
              debuginfod HTTP connection.  A server that fails to provide at
              least 100K of data within this many seconds is skipped.  The
              default is 90 seconds.  (Zero or negative means "no timeout".)

       DEBUGINFOD_CACHE_PATH
              This environment variable governs the location of the cache
              where downloaded files are kept.  It is cleaned periodically as
              this program is reexecuted.  If XDG_CACHE_HOME is set then
              $XDG_CACHE_HOME/debuginfod_client is the default location,
              otherwise $HOME/.cache/debuginfod_client is used.  For more
              information regarding the client cache see
              debuginfod_find_debuginfo(3).

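       The cache location rules for DEBUGINFOD_CACHE_PATH can be summarized as
       a short shell sketch.  It follows only what is stated above, treats an
       empty variable the same as an unset one (an assumption), and uses the
       hypothetical function name cache_dir; see
       debuginfod_find_debuginfo(3) for the authoritative behaviour.

```shell
# cache_dir
# Hypothetical helper: print the cache directory implied by the
# environment.  An explicit DEBUGINFOD_CACHE_PATH wins, then
# XDG_CACHE_HOME, then the $HOME/.cache fallback.
cache_dir() {
    if [ -n "${DEBUGINFOD_CACHE_PATH:-}" ]; then
        printf '%s\n' "$DEBUGINFOD_CACHE_PATH"
    elif [ -n "${XDG_CACHE_HOME:-}" ]; then
        printf '%s\n' "$XDG_CACHE_HOME/debuginfod_client"
    else
        printf '%s\n' "$HOME/.cache/debuginfod_client"
    fi
}
```

       Running cache_dir in the environment that debuginfod will inherit
       shows where relayed upstream content is expected to accumulate.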

FILES

       $HOME/.debuginfod.sqlite
                           Default database file.

       $XDG_CACHE_HOME/debuginfod_client
                           Default cache directory for content from upstream
                           debuginfods.  If XDG_CACHE_HOME is not set then
                           $HOME/.cache/debuginfod_client is used.

SEE ALSO

       debuginfod-find(1), sqlite3(1),
       https://prometheus.io/docs/instrumenting/exporters/

                                                                 DEBUGINFOD(8)