DEBUGINFOD(8) System Manager's Manual DEBUGINFOD(8)

debuginfod - debuginfo-related http file-server daemon

debuginfod [OPTION]... [PATH]...

debuginfod serves debuginfo-related artifacts over HTTP. It periodi‐
cally scans a set of directories for ELF/DWARF files and their associ‐
ated source code, as well as archive files containing the above, to
build an index by their buildid. This index is used when remote
clients use the HTTP webapi, to fetch these files by the same buildid.

If a debuginfod cannot service a given buildid artifact request itself,
and it is configured with information about upstream debuginfod
servers, it queries them for the same information, just as debuginfod-
find would. If successful, it locally caches then relays the file con‐
tent to the original requester.

Indexing the given PATHs proceeds using multiple threads. One thread
periodically traverses all the given PATHs logically or physically (see
the -L option). Duplicate PATHs are ignored. You may use a file name
for a PATH, but source code indexing may be incomplete; prefer using a
directory that contains the binaries. The traversal thread enumerates
all matching files (see the -I and -X options) into a work queue. A
collection of scanner threads (see the -c option) wait at the work
queue to analyze files in parallel.

If the -F option is given, each file is scanned as an ELF/DWARF file.
Source files are matched with DWARF files based on the AT_comp_dir
(compilation directory) attributes inside them. Caution: source files
listed in the DWARF may be paths anywhere in the file system, and de‐
buginfod will readily serve their content on demand. (Imagine a doc‐
tored DWARF file that lists /etc/passwd as a source file.) If this is
a concern, audit your binaries with tools such as:

% eu-readelf -wline BINARY | sed -n '/^Directory.table/,/^File.name.table/p'
or
% eu-readelf -wline BINARY | sed -n '/^Directory.table/,/^Line.number/p'
or even use debuginfod itself:
% debuginfod -vvv -d :memory: -F BINARY 2>&1 | grep 'recorded.*source'
^C

If any of the -R, -U, or -Z options is given, each file is scanned as
an archive file that may contain ELF/DWARF/source files. Archive files
are recognized by extension. If -R is given, ".rpm" files are scanned;
if -U is given, ".deb" and ".ddeb" files are scanned; if -Z is given,
the listed extensions are scanned. Because of complications such as
DWZ-compressed debuginfo, debuginfod may require two traversal passes
to identify all source code. Source files for RPMs are only served from
other RPMs, so the caution for -F does not apply. Note that due to De‐
bian/Ubuntu packaging policies & mechanisms, debuginfod cannot resolve
source files for DEB/DDEB at all.

If no PATH is listed, or none of the scanning options is given, then
debuginfod will simply serve content that it accumulated into its index
in all previous runs, periodically groom the database, and federate to
any upstream debuginfod servers. In passive mode, debuginfod will only
serve content from a read-only index and federated upstream servers,
but will not scan or groom.

-F Activate ELF/DWARF file scanning. The default is off.

-Z EXT -Z EXT=CMD
Activate an additional pattern in archive scanning. Files with
name extension EXT (include the dot) will be processed. If CMD
is given, it is invoked with the file name added to its argument
list, and should produce a common archive on its standard out‐
put. Otherwise, the file is read as if CMD were "cat". Since
debuginfod internally uses libarchive to read archive files, it
can accept a wide range of archive formats and compression
modes. The default is no additional patterns. This option may
be repeated.

-R Activate RPM patterns in archive scanning. The default is off.
Equivalent to -Z .rpm=cat, since libarchive can natively process
RPM archives. If your version of libarchive is much older than
2020, be aware that some distributions have switched to an in‐
compatible zstd compression for their payload. You may experi‐
ment with -Z .rpm='(rpm2cpio|zstdcat)<' instead of -R.

-U Activate DEB/DDEB patterns in archive scanning. The default is
off. Equivalent to -Z .deb='dpkg-deb --fsys-tarfile'
-Z .ddeb='dpkg-deb --fsys-tarfile'.

-d FILE --database=FILE
Set the path of the sqlite database used to store the index.
This file is disposable in the sense that a later rescan will
repopulate data. It will contain absolute file path names, so
it may not be portable across machines. It may be frequently
read/written, so it should be on a fast filesystem. It should
not be shared across machines or users, to maximize sqlite lock‐
ing performance. For quick testing, the magic string ":memory:"
selects a one-time memory-only database. The default database
file is $HOME/.debuginfod.sqlite.

--passive
Set the server to passive mode, where it only services webapi
requests, including participating in federation. It performs no
scanning, no grooming, and so only opens the sqlite database
read-only. This way a database can be safely shared between an
active scanner/groomer server and multiple passive ones, thereby
sharing service load. Archive pattern options must still be
given, so debuginfod can recognize file name extensions for un‐
packing.

-D SQL --ddl=SQL
Execute given sqlite statement after the database is opened and
initialized as extra DDL (SQL data definition language). This
may be useful to tune performance-related pragmas or indexes.
May be repeated. The default is nothing extra.

-p NUM --port=NUM
Set the TCP port number (0 < NUM < 65536) on which debuginfod
should listen, to service HTTP requests. Both IPv4 and IPv6
sockets are opened, if possible. The webapi is documented be‐
low. The default port number is 8002.

-I REGEX --include=REGEX -X REGEX --exclude=REGEX
Govern the inclusion and exclusion of file names under the
search paths. The regular expressions are interpreted as unan‐
chored POSIX extended REs, thus may include alternation. They
are evaluated against the full path of each file, based on its
realpath(3) canonicalization. By default, all files are includ‐
ed and none are excluded. A file that matches both include and
exclude REGEX is excluded. (The contents of archive files are
not subject to inclusion or exclusion filtering: they are all
processed.) Only the last of each type of regular expression
given is used.
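
The selection rule described for -I and -X can be sketched in a few
lines of Python. This is only an illustration, not debuginfod's own
code: Python's re syntax merely approximates POSIX extended REs, and
the function name is_selected and the sample patterns are hypothetical.

```python
import os
import re

def is_selected(path, include=".*", exclude=None):
    """Return True if path would pass an -I/-X style filter (sketch)."""
    # Match against the canonical full path, analogous to realpath(3).
    full = os.path.realpath(path)
    # Patterns are unanchored, so use search() rather than match().
    if re.search(include, full) is None:
        return False
    # A file matching both include and exclude is excluded.
    if exclude is not None and re.search(exclude, full) is not None:
        return False
    return True
```

For instance, is_selected(p, include=r"\.(rpm|deb)$", exclude="debugsource")
keeps only package files whose canonical path does not contain
"debugsource".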

-t SECONDS --rescan-time=SECONDS
Set the rescan time for the file and archive directories. This
is the amount of time the traversal thread will wait after fin‐
ishing a scan, before doing it again. A rescan for unchanged
files is fast (because the index also stores the file mtimes).
A time of zero is acceptable, and means that only one initial
scan should be performed. The default rescan time is 300 seconds.
Receiving a SIGUSR1 signal triggers a new scan, independent of
the rescan time (including if it was zero), interrupting a groom
pass (if any).

-r Apply the -I and -X options during groom cycles, so that files
excluded by the regexes are removed from the index. These parameters
are in addition to what normally qualifies a file for grooming, not
a replacement.

-g SECONDS --groom-time=SECONDS
Set the groom time for the index
database. This is the amount of time the grooming thread will
wait after finishing a grooming pass before doing it again. A
groom operation quickly rescans all previously scanned files,
only to see if they are still present and current, so it can
deindex obsolete files. See also the DATA MANAGEMENT section.
The default groom time is 86400 seconds (1 day). A time of zero
is acceptable, and means that only one initial groom should be
performed. Receiving a SIGUSR2 signal triggers a new grooming
pass, independent of the groom time (including if it was zero),
interrupting a rescan pass (if any).

-G Run an extraordinary maximal-grooming pass at debuginfod start‐
up. This pass can take considerable time, because it tries to
remove any debuginfo-unrelated content from the archive-related
parts of the index. It should not be run if any recent archive-
related indexing operations were aborted early. It can take
considerable space, because it finishes up with an sqlite "vacu‐
um" operation, which repacks the database file by triplicating
it temporarily. The default is not to do maximal-grooming. See
also the DATA MANAGEMENT section.

-c NUM --concurrency=NUM
Set the concurrency limit for the scanning queue threads, which
work together to process archives & files located by the traver‐
sal thread. This is important for controlling CPU-intensive opera‐
tions like parsing an ELF file and especially decompressing ar‐
chives. The default is the number of processors on the system;
the minimum is 1.

-C -C=NUM --connection-pool --connection-pool=NUM
Set the size of the pool of threads serving webapi queries. The
following table summarizes the interpretation of this option and
its optional NUM parameter.

no option   clone new thread for every request, no fixed pool
-C          use a fixed thread pool sized automatically
-C=NUM      use a fixed thread pool sized NUM, minimum 2

The first mode is useful for friendly bursty traffic. The sec‐
ond mode is a simple and safe configuration based on the number
of processors. The third mode is suitable for tuned load-limit‐
ing configurations facing unruly traffic.

-L Traverse symbolic links encountered during traversal of the
PATHs, including across devices - as in find -L. The default is
to traverse the physical directory structure only, stay on the
same device, and ignore symlinks - as in find -P -xdev. Cau‐
tion: a loop in the symbolic directory tree might lead to infi‐
nite traversal.

--fdcache-fds=NUM --fdcache-mbs=MB --fdcache-prefetch=NUM2
Configure limits on a cache that keeps recently extracted files
from archives. Up to NUM requested files and up to a total of
MB megabytes will be kept extracted, in order to avoid having to
decompress their archives over and over again. In addition, up
to NUM2 other files from an archive may be prefetched into the
cache before they are even requested. The default NUM, NUM2,
and MB values depend on the concurrency of the system, and on
the available disk space on the $TMPDIR or /tmp filesystem.
This is because that is where the most recently used extracted
files are kept. Grooming cleans this cache.

--fdcache-prefetch-fds=NUM --fdcache-prefetch-mbs=MB
Configure how many file descriptors (fds) and megabytes (mbs)
are allocated to the prefetch fdcache. If unspecified, the values of
--fdcache-prefetch-fds and --fdcache-prefetch-mbs depend on the
concurrency of the system and on the available disk space on the
$TMPDIR. Allocating more to the prefetch cache will improve perfor‐
mance in environments where different parts of several large archives
are being accessed.

--fdcache-mintmp=NUM
Configure a disk space threshold for emergency flushing of the
cache. The filesystem holding the cache is checked periodical‐
ly. If the available space falls below the given percentage,
the cache is flushed, and the fdcache will stay disabled until
the next groom cycle. This mechanism, along with a few associated
/metrics on the webapi, is intended to give an operator notice
about storage scarcity - which can translate to RAM scarcity if
the disk happens to be on a RAM virtual disk. The default
threshold is 25%.
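
An operator who wants to anticipate the emergency flush could run a
similar check from a monitoring script. The sketch below is an
assumption about how such a percentage might be computed (free space
relative to filesystem size); debuginfod's exact formula is not stated
here, and the function names are hypothetical.

```python
import shutil

def below_min_free(total_bytes, free_bytes, min_percent=25):
    """True when free space is under min_percent of the filesystem size."""
    # Integer arithmetic avoids floating-point rounding at the boundary.
    return free_bytes * 100 < min_percent * total_bytes

def cache_dir_is_low(path="/tmp", min_percent=25):
    """Check the filesystem holding path (e.g. $TMPDIR or /tmp)."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free
    return below_min_free(usage.total, usage.free, min_percent)
```

Calling cache_dir_is_low() with the directory that holds the fdcache
gives an early warning before the 25% default threshold is crossed.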

--forwarded-ttl-limit=NUM
Configure the limit on X-Forwarded-For hops. If X-Forwarded-For ex‐
ceeds NUM hops, debuginfod will not delegate a local lookup miss to
upstream debuginfods. The default limit is 8.

-v Increase verbosity of logging to the standard error file de‐
scriptor. May be repeated to increase details. The default
verbosity is 0.

debuginfod's webapi resembles ordinary file service, where a GET re‐
quest with a path containing a known buildid results in a file. Un‐
known buildid / request combinations result in HTTP error codes. This
file service resemblance is intentional, so that an installation can
take advantage of standard HTTP management infrastructure.

Upon finding a file in an archive or simply in the database, some cus‐
tom HTTP headers are added to the response. For files in the database
X-DEBUGINFOD-FILE and X-DEBUGINFOD-SIZE are added. X-DEBUGINFOD-FILE
is simply the unescaped filename and X-DEBUGINFOD-SIZE is the size of
the file. For files found in archives, in addition to X-DEBUGINFOD-FILE
and X-DEBUGINFOD-SIZE, X-DEBUGINFOD-ARCHIVE is added.
X-DEBUGINFOD-ARCHIVE is the name of the archive the file was found in.
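
A client can use these headers to report where a file came from. The
following sketch assumes the headers have already been collected into
a mapping by whatever HTTP library is in use, and that the size is a
decimal integer; the function name describe_origin and the sample
values are hypothetical.

```python
def describe_origin(headers):
    """Summarize debuginfod's custom response headers (sketch)."""
    # HTTP header names are case-insensitive, so normalize before lookup.
    h = {name.upper(): value for name, value in headers.items()}
    size = h.get("X-DEBUGINFOD-SIZE")
    return {
        "file": h.get("X-DEBUGINFOD-FILE"),
        # Assumption: the size header carries a decimal integer.
        "size": int(size) if size is not None else None,
        # Present only when the file was found inside an archive.
        "archive": h.get("X-DEBUGINFOD-ARCHIVE"),
    }
```

A result whose "archive" entry is None indicates a file served directly
from the database rather than unpacked from an archive.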

There are three requests. In each case, the buildid is encoded as a
lowercase hexadecimal string. For example, for a program /bin/ls, look
at the ELF note GNU_BUILD_ID:

% readelf -n /bin/ls | grep -A4 build.id
Note section [ 4] '.note.gnu.buildid' of 36 bytes at offset 0x340:
  Owner          Data size  Type
  GNU                   20  GNU_BUILD_ID
    Build ID: 8713b9c3fb8a720137a4a08b325905c7aaf8429d

Then the hexadecimal BUILDID is simply:

8713b9c3fb8a720137a4a08b325905c7aaf8429d
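
If a tool already holds the raw build ID bytes from the ELF note, the
lowercase hexadecimal form can be produced and checked as sketched
below. The helper names buildid_hex and is_valid_buildid are
hypothetical, and the validity check only tests for lowercase
hexadecimal digits.

```python
import re

def buildid_hex(raw):
    """Encode raw build ID bytes as a lowercase hexadecimal string."""
    return bytes(raw).hex()  # bytes.hex() always emits lowercase digits

def is_valid_buildid(text):
    """True if text consists only of lowercase hexadecimal digits."""
    return re.fullmatch(r"[0-9a-f]+", text) is not None
```

A string copied from tool output in uppercase can be normalized with
str.lower() before it is placed in a request path.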

/buildid/BUILDID/debuginfo
If the given buildid is known to the server, this request will result
in a binary object that contains the customary .*debug_* sections.
This may be a split debuginfo file as created by strip, or it may be an
original unstripped executable.

/buildid/BUILDID/executable
If the given buildid is known to the server, this request will result
in a binary object that contains the normal executable segments. This
may be an executable stripped by strip, or it may be an original un‐
stripped executable. ET_DYN shared libraries are considered to be a
type of executable.

/buildid/BUILDID/source/SOURCE/FILE
If the given buildid is known to the server, this request will result
in a binary object that contains the source file mentioned. The path
should be absolute. Relative path names commonly appear in the DWARF
file's source directory, but these paths are relative to individual
compilation unit AT_comp_dir paths, and yet an executable is made up of
multiple CUs. Therefore, to disambiguate, debuginfod expects source
queries to prefix relative path names with the CU compilation-directo‐
ry, followed by a mandatory "/".

Note: the caller may or may not elide ../ or /./ or extraneous ///
sorts of path components in the directory names. debuginfod accepts
both forms. Specifically, debuginfod canonicalizes path names accord‐
ing to RFC3986 section 5.2.4 (Remove Dot Segments), plus reducing any
// to / in the path.

For example:

#include <stdio.h>               /buildid/BUILDID/source/usr/include/stdio.h
/path/to/foo.c                   /buildid/BUILDID/source/path/to/foo.c
../bar/foo.c AT_comp_dir=/zoo/   /buildid/BUILDID/source/zoo//../bar/foo.c
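
The canonicalization described in the note can be approximated on the
client side. The sketch below is a stack-based variant in the spirit
of RFC3986 section 5.2.4 for absolute file paths, not a literal
transcription of the RFC loop or of debuginfod's code. Collapsing
repeated slashes before removing dot segments is an assumption of this
sketch, and the function name canonicalize is hypothetical.

```python
import re

def canonicalize(path):
    """Remove dot segments and repeated slashes from an absolute path."""
    # Assumption: reduce any run of slashes to a single "/" first.
    path = re.sub(r"/{2,}", "/", path)
    out = []
    for segment in path.split("/"):
        if segment == ".":
            continue                  # "/./" contributes nothing
        if segment == "..":
            # Drop the previous segment, but never climb above the root.
            if out and out[-1] != "":
                out.pop()
            continue
        out.append(segment)
    return "/".join(out)
```

Under these assumptions, the third example path /zoo//../bar/foo.c
reduces to /bar/foo.c. Since debuginfod accepts both forms, this step
is optional for a client.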

Note: the client should %-escape characters in /SOURCE/FILE that are
not shown as "unreserved" in section 2.3 of RFC3986. Some characters
that will be escaped include "+", "\", "$", "!", the 'space' character,
and ";". RFC3986 includes a more comprehensive list of these charac‐
ters.
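
As an illustration of the escaping rule, Python's urllib.parse.quote
leaves the RFC3986 unreserved characters alone and %-escapes the rest;
passing safe="/" keeps the path separators readable. The helper name
source_request_path is hypothetical, and the sketch assumes an
absolute source path.

```python
from urllib.parse import quote

def source_request_path(buildid, source_path):
    """Build a /buildid/BUILDID/source/... request path (sketch)."""
    # Letters, digits and "-._~" are unreserved; "/" is kept as separator.
    escaped = quote(source_path, safe="/")
    # source_path is assumed absolute, so it already starts with "/".
    return "/buildid/" + buildid + "/source" + escaped
```

For a source file under /usr/include/c++/, each "+" is sent as %2B and
a space in a file name as %20.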

/metrics
This endpoint returns a Prometheus formatted text/plain dump of a vari‐
ety of statistics about the operation of the debuginfod server. The
exact set of metrics and their meanings may change in future versions.
Caution: configuration information (path names, versions) may be dis‐
closed.
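
A monitoring script can read this dump with a few lines of code. The
sketch below handles only the simplest shape of the Prometheus text
format (a metric name with optional labels, one space, a numeric
value, no timestamp). The function name parse_metrics is hypothetical;
no specific debuginfod metric names are assumed, since the set may
change.

```python
def parse_metrics(text):
    """Parse simple Prometheus text lines into a {metric: value} dict."""
    metrics = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and "# HELP" / "# TYPE" comment lines.
        if not line or line.startswith("#"):
            continue
        # Assumption: the value is the last space-separated field.
        name, _, value = line.rpartition(" ")
        metrics[name] = float(value)
    return metrics
```

The returned keys include any label set exactly as printed by the
server, so they can be compared between two scrapes to watch a counter
grow.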

debuginfod stores its index in an sqlite database in a densely packed
set of interlinked tables. While the representation is as efficient as
we have been able to make it, it still takes a considerable amount of
data to record all debuginfo-related data of potentially a great many
files. This section offers some advice about the implications.

As a general explanation for size, consider that when debuginfod indexes
ELF/DWARF files, it stores their names, their referenced source file
names, and their buildids. When indexing archives, it stores
every file name of or in an archive, every buildid, plus every source
file name referenced from a DWARF file. (Indexing archives takes more
space because the source files often reside in separate subpackages
that may not be indexed at the same pass, so extra metadata has to be
kept.)

Getting down to numbers, in the case of Fedora RPMs (essentially, gzip-
compressed cpio files), the sqlite index database tends to be from 0.5%
to 3% of their size. It's larger for binaries that are assembled out
of a great many source files, or packages that carry much debuginfo-un‐
related content. It may be even larger during the indexing phase due
to temporary sqlite write-ahead-logging files; these are checkpointed
(cleaned out and removed) at shutdown. It may be helpful to apply
tight -I or -X regular-expression constraints to exclude files from
scanning that you know have no debuginfo-relevant content.
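
The 0.5% to 3% range translates directly into a rough disk budget. The
following sketch only restates that arithmetic; the function name
estimate_index_bytes is hypothetical, and the result excludes the
temporary write-ahead-logging files mentioned above.

```python
def estimate_index_bytes(archive_bytes, low_permille=5, high_permille=30):
    """Return (low, high) index size estimates for an archive set.

    The defaults of 5 and 30 per mille correspond to 0.5% and 3%.
    """
    low = archive_bytes * low_permille // 1000
    high = archive_bytes * high_permille // 1000
    return low, high
```

For example, 1 TB (10**12 bytes) of RPMs suggests an index of roughly
5 GB to 30 GB.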

As debuginfod runs in normal active mode, it periodically rescans its
target directories, and any new content found is added to the database.
Old content, such as data for files that have disappeared or that have
been replaced with newer versions, is removed at a periodic grooming
pass. This means that the sqlite files grow fast during initial index‐
ing, slowly during index rescans, and periodically shrink during groom‐
ing. There is also an optional one-shot maximal grooming pass
available. It removes debuginfo-unrelated data from the
archive content index, such as file names found in archives ("archive
sdef" records) that are not referred to as source files from any bina‐
ries found in archives ("archive sref" records). This can save consid‐
erable disk space. However, it is slow and temporarily requires up to
twice the database size as free space. Worse: it may result in missing
source-code info if the archive traversals were interrupted, so that
not all source file references were known. Use it rarely to polish a
complete index.

You should ensure that ample disk space remains available. (The flood
of error messages on -ENOSPC is ugly and nagging. But, like for most
other errors, debuginfod will resume when resources permit.) If neces‐
sary, debuginfod can be stopped, the database file moved or removed,
and debuginfod restarted.

sqlite offers several performance-related options in the form of prag‐
mas. Some may be useful to fine-tune the defaults plus the debuginfod
extras. The -D option may be useful to tell debuginfod to execute the
given bits of SQL after the basic schema creation commands. For exam‐
ple, the "synchronous", "cache_size", "auto_vacuum", "threads", and
"journal_mode" pragmas may be fun to tweak via -D, if you're searching
for peak performance. The "optimize" and "wal_checkpoint" pragmas may
be useful to run periodically, outside debuginfod. The default settings
are performance- rather than reliability-oriented, so a hardware crash
might corrupt the database. In these cases, it may be necessary to
manually delete the sqlite database and start over.

As debuginfod changes in the future, we may have no choice but to
change the database schema in an incompatible manner. If this happens,
new versions of debuginfod will issue SQL statements to drop all prior
schema & data, and start over. So, disk space will not be wasted for
retaining a no-longer-useable dataset.

In summary, if your system can bear a 0.5%-3% index-to-archive-dataset
size ratio, and slow growth afterwards, you should not need to worry
about disk space. If a system crash corrupts the database, or you want
to force debuginfod to reset and start over, simply erase the sqlite
file before restarting debuginfod.

In contrast, in passive mode, all scanning and grooming is disabled,
and the index database remains read-only. This makes the database more
suitable for sharing between servers or sites with simple one-way
replication, and data management considerations are generally moot.

debuginfod does not include any particular security features. While it
is robust with respect to inputs, some abuse is possible. By default, it
forks a new thread for each incoming HTTP request, which could lead to a
denial-of-service in terms of RAM, CPU, disk I/O, or network I/O. If
this is a problem, users are advised to install debuginfod with an HTTPS
reverse-proxy front-end that enforces site policies for firewalling,
authentication, integrity, authorization, and load control. The
/metrics webapi endpoint is probably not appropriate for disclosure to
the public.

When relaying queries to upstream debuginfods, debuginfod does not in‐
clude any particular security features. It trusts that the binaries
returned by the debuginfods are accurate. Therefore, the list of
servers should include only trustworthy ones. If accessed across HTTP
rather than HTTPS, the network should be trustworthy. Authentication
information through the internal libcurl library is not currently en‐
abled.

$HOME/.debuginfod.sqlite
Default database file.

debuginfod-find(1) sqlite3(1)
https://prometheus.io/docs/instrumenting/exporters/

DEBUGINFOD(8)