DEBUGINFOD(8)            System Manager's Manual            DEBUGINFOD(8)

NAME
debuginfod - debuginfo-related http file-server daemon

SYNOPSIS
debuginfod [OPTION]... [PATH]...

DESCRIPTION
debuginfod serves debuginfo-related artifacts over HTTP. It periodically
scans a set of directories for ELF/DWARF files and their associated
source code, as well as archive files containing the above, to build an
index by their buildid. This index is used when remote clients use the
HTTP webapi to fetch these files by the same buildid.

If a debuginfod cannot service a given buildid artifact request itself,
and it is configured with information about upstream debuginfod servers,
it queries them for the same information, just as debuginfod-find would.
If successful, it locally caches then relays the file content to the
original requester.

Indexing the given PATHs proceeds using multiple threads. One thread
periodically traverses all the given PATHs logically or physically (see
the -L option). Duplicate PATHs are ignored. You may use a file name
for a PATH, but source code indexing may be incomplete; prefer using a
directory that contains the binaries. The traversal thread enumerates
all matching files (see the -I and -X options) into a work queue. A
collection of scanner threads (see the -c option) wait at the work queue
to analyze files in parallel.

If the -F option is given, each file is scanned as an ELF/DWARF file.
Source files are matched with DWARF files based on the AT_comp_dir
(compilation directory) attributes inside them. Caution: source files
listed in the DWARF may have paths anywhere in the file system, and
debuginfod will readily serve their content on demand. (Imagine a
doctored DWARF file that lists /etc/passwd as a source file.) If this
is a concern, audit your binaries with tools such as:

    % eu-readelf -wline BINARY | sed -n '/^Directory.table/,/^File.name.table/p'
or
    % eu-readelf -wline BINARY | sed -n '/^Directory.table/,/^Line.number/p'
or even use debuginfod itself:
    % debuginfod -vvv -d :memory: -F BINARY 2>&1 | grep 'recorded.*source'
    ^C

If any of the -R, -U, or -Z options is given, each file is scanned as
an archive file that may contain ELF/DWARF/source files. Archive files
are recognized by extension. If -R is given, ".rpm" files are scanned;
if -U is given, ".deb" and ".ddeb" files are scanned; if -Z is given,
the listed extensions are scanned. Because of complications such as
DWZ-compressed debuginfo, debuginfod may require two traversal passes
to identify all source code. Source files for RPMs are only served from
other RPMs, so the caution for -F does not apply. Note that due to
Debian/Ubuntu packaging policies & mechanisms, debuginfod cannot
resolve source files for DEB/DDEB at all.

If no PATH is listed, or none of the scanning options is given, then
debuginfod will simply serve content that it accumulated into its index
in all previous runs, periodically groom the database, and federate to
any upstream debuginfod servers. In passive mode, debuginfod will only
serve content from a read-only index and federated upstream servers,
but will not scan or groom.

OPTIONS
-F     Activate ELF/DWARF file scanning. The default is off.

-Z EXT -Z EXT=CMD
       Activate an additional pattern in archive scanning. Files with
       name extension EXT (include the dot) will be processed. If CMD
       is given, it is invoked with the file name added to its argument
       list, and should produce a common archive on its standard
       output. Otherwise, the file is read as if CMD were "cat". Since
       debuginfod internally uses libarchive to read archive files, it
       can accept a wide range of archive formats and compression
       modes. The default is no additional patterns. This option may
       be repeated.
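
As a rough illustration of the EXT=CMD contract described above, the
following sketch appends a file name to a command's argument list and
leaves the resulting archive stream on standard output. The helper name
archive_stream and the use of "sh -c" are assumptions made for this
example; they do not describe how debuginfod itself runs CMD.

```shell
# Hypothetical helper: run CMD with FILE appended to its argument list,
# leaving the resulting archive stream on standard output.
archive_stream() {
    cmd=$1
    file=$2
    # The file name is passed as a positional parameter so that spaces
    # and other special characters in it survive unchanged.
    sh -c "$cmd \"\$1\"" sh "$file"
}

# With CMD set to "cat", the file is read as-is.
# Example (bsdtar is the libarchive command-line tool):
#   archive_stream cat /path/to/package.rpm | bsdtar -tf -
```

A CMD ending in "<", such as the '(rpm2cpio|zstdcat)<' form mentioned
under -R, works with this sketch because the appended file name becomes
the redirection source.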

-R     Activate RPM patterns in archive scanning. The default is off.
       Equivalent to -Z .rpm=cat, since libarchive can natively process
       RPM archives. If your version of libarchive is much older than
       2020, be aware that some distributions have switched to an
       incompatible zstd compression for their payload. You may
       experiment with -Z .rpm='(rpm2cpio|zstdcat)<' instead of -R.

-U     Activate DEB/DDEB patterns in archive scanning. The default is
       off. Equivalent to -Z .deb='dpkg-deb --fsys-tarfile'
       -Z .ddeb='dpkg-deb --fsys-tarfile'.

-d FILE --database=FILE
       Set the path of the sqlite database used to store the index.
       This file is disposable in the sense that a later rescan will
       repopulate data. It will contain absolute file path names, so
       it may not be portable across machines. It may be frequently
       read/written, so it should be on a fast filesystem. It should
       not be shared across machines or users, to maximize sqlite
       locking performance. For quick testing, the magic string
       ":memory:" selects a one-time memory-only database. The default
       database file is $HOME/.debuginfod.sqlite.

--passive
       Set the server to passive mode, where it only services webapi
       requests, including participating in federation. It performs no
       scanning and no grooming, and so only opens the sqlite database
       read-only. This way a database can be safely shared between an
       active scanner/groomer server and multiple passive ones, thereby
       sharing service load. Archive pattern options must still be
       given, so debuginfod can recognize file name extensions for
       unpacking.

-D SQL --ddl=SQL
       Execute the given sqlite statement after the database is opened
       and initialized, as extra DDL (SQL data definition language).
       This may be useful to tune performance-related pragmas or
       indexes. May be repeated. The default is nothing extra.

-p NUM --port=NUM
       Set the TCP port number (0 < NUM < 65536) on which debuginfod
       should listen, to service HTTP requests. Both IPv4 and IPv6
       sockets are opened, if possible. The webapi is documented below.
       The default port number is 8002.

-I REGEX --include=REGEX -X REGEX --exclude=REGEX
       Govern the inclusion and exclusion of file names under the
       search paths. The regular expressions are interpreted as
       unanchored POSIX extended REs, thus may include alternation.
       They are evaluated against the full path of each file, based on
       its realpath(3) canonicalization. By default, all files are
       included and none are excluded. A file that matches both include
       and exclude REGEX is excluded. (The contents of archive files
       are not subject to inclusion or exclusion filtering: they are
       all processed.) Only the last of each type of regular expression
       given is used.
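
To preview how a pair of -I and -X expressions would treat a given
path, the rule above can be approximated with grep -E, which also uses
unanchored POSIX extended REs. The helper name would_scan and its
convention that an empty exclude expression excludes nothing are
assumptions made for this sketch.

```shell
# Hypothetical helper: succeed if CANDIDATE would be scanned under the
# given include and exclude regexes. A path matching both is excluded.
# Usage: would_scan CANDIDATE INCLUDE_REGEX EXCLUDE_REGEX
would_scan() {
    candidate=$1
    include=$2
    exclude=$3
    # Must match the include expression (unanchored, extended RE).
    printf '%s\n' "$candidate" | grep -Eq -- "$include" || return 1
    # Must not match the exclude expression; empty means "exclude none".
    if [ -n "$exclude" ] && printf '%s\n' "$candidate" | grep -Eq -- "$exclude"; then
        return 1
    fi
    return 0
}

# Example: keep RPMs, but skip anything under an "old" directory.
#   would_scan /srv/rpms/old/foo.rpm '\.rpm$' '/old/' || echo skipped
```

Pass the canonical (realpath) form of the path, since that is what the
expressions are evaluated against.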

-t SECONDS --rescan-time=SECONDS
       Set the rescan time for the file and archive directories. This
       is the amount of time the traversal thread will wait after
       finishing a scan, before doing it again. A rescan for unchanged
       files is fast (because the index also stores the file mtimes).
       A time of zero is acceptable, and means that only one initial
       scan should be performed. The default rescan time is 300
       seconds. Receiving a SIGUSR1 signal triggers a new scan,
       independent of the rescan time (including if it was zero),
       interrupting a groom pass (if any).

-r     Apply the -I and -X regexes during groom cycles, so that files
       excluded by the regexes are removed from the index. These
       parameters are in addition to what normally qualifies a file for
       grooming, not a replacement.

-g SECONDS --groom-time=SECONDS
       Set the groom time for the index database. This is the amount
       of time the grooming thread will wait after finishing a grooming
       pass before doing it again. A groom operation quickly rescans
       all previously scanned files, only to see if they are still
       present and current, so it can deindex obsolete files. See also
       the DATA MANAGEMENT section. The default groom time is 86400
       seconds (1 day). A time of zero is acceptable, and means that
       only one initial groom should be performed. Receiving a SIGUSR2
       signal triggers a new grooming pass, independent of the groom
       time (including if it was zero), interrupting a rescan pass (if
       any).

-G     Run an extraordinary maximal-grooming pass at debuginfod
       startup. This pass can take considerable time, because it tries
       to remove any debuginfo-unrelated content from the
       archive-related parts of the index. It should not be run if any
       recent archive-related indexing operations were aborted early.
       It can take considerable space, because it finishes up with an
       sqlite "vacuum" operation, which repacks the database file by
       triplicating it temporarily. The default is not to do
       maximal-grooming. See also the DATA MANAGEMENT section.

-c NUM --concurrency=NUM
       Set the concurrency limit for the scanning queue threads, which
       work together to process archives & files located by the
       traversal thread. This is important for controlling
       CPU-intensive operations like parsing an ELF file and especially
       decompressing archives. The default is the number of processors
       on the system; the minimum is 1.

-L     Traverse symbolic links encountered during traversal of the
       PATHs, including across devices - as in find -L. The default is
       to traverse the physical directory structure only, stay on the
       same device, and ignore symlinks - as in find -P -xdev.
       Caution: a loop in the symbolic directory tree might lead to
       infinite traversal.

--fdcache-fds=NUM --fdcache-mbs=MB --fdcache-prefetch=NUM2
       Configure limits on a cache that keeps recently extracted files
       from archives. Up to NUM requested files and up to a total of
       MB megabytes will be kept extracted, in order to avoid having to
       decompress their archives over and over again. In addition, up
       to NUM2 other files from an archive may be prefetched into the
       cache before they are even requested. The default NUM, NUM2,
       and MB values depend on the concurrency of the system, and on
       the available disk space on the $TMPDIR or /tmp filesystem,
       because that is where the most recently used extracted files
       are kept. Grooming cleans this cache.

--fdcache-prefetch-fds=NUM --fdcache-prefetch-mbs=MB
       Configure how many file descriptors (fds) and megabytes (mbs)
       are allocated to the prefetch fdcache. If unspecified, the
       values of --fdcache-prefetch-fds and --fdcache-prefetch-mbs
       depend on the concurrency of the system and on the available
       disk space on the $TMPDIR filesystem. Allocating more to the
       prefetch cache will improve performance in environments where
       different parts of several large archives are being accessed.

--fdcache-mintmp=NUM
       Configure a disk space threshold for emergency flushing of the
       cache. The filesystem holding the cache is checked
       periodically. If the available space falls below the given
       percentage, the cache is flushed, and the fdcache will stay
       disabled until the next groom cycle. This mechanism, along with
       a few associated /metrics on the webapi, is intended to give an
       operator notice about storage scarcity - which can translate to
       RAM scarcity if the disk happens to be on a RAM virtual disk.
       The default threshold is 25%.

--forwarded-ttl-limit=NUM
       Configure the limit on X-Forwarded-For hops. If the
       X-Forwarded-For header exceeds NUM hops, debuginfod will not
       delegate a local lookup miss to upstream debuginfods. The
       default limit is 8.

-v     Increase verbosity of logging to the standard error file
       descriptor. May be repeated to increase details. The default
       verbosity is 0.


WEBAPI
debuginfod's webapi resembles ordinary file service, where a GET
request with a path containing a known buildid results in a file.
Unknown buildid / request combinations result in HTTP error codes. This
file service resemblance is intentional, so that an installation can
take advantage of standard HTTP management infrastructure.

Upon finding a file in an archive or simply in the database, some
custom HTTP headers are added to the response. For files in the
database, X-DEBUGINFOD-FILE and X-DEBUGINFOD-SIZE are added.
X-DEBUGINFOD-FILE is simply the unescaped filename and
X-DEBUGINFOD-SIZE is the size of the file. For files found in archives,
X-DEBUGINFOD-ARCHIVE is added in addition to X-DEBUGINFOD-FILE and
X-DEBUGINFOD-SIZE. X-DEBUGINFOD-ARCHIVE is the name of the archive the
file was found in.
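
One way to inspect the headers described above is to filter a saved
HTTP header dump. The sketch below is only a convenience filter; the
helper name debuginfod_headers is an assumption, and header matching is
case-insensitive because HTTP header names are.

```shell
# Print only the debuginfod-specific headers from an HTTP header dump
# read on standard input. Carriage returns from the wire are removed.
debuginfod_headers() {
    tr -d '\r' | grep -i '^x-debuginfod-'
}

# Example, assuming a server on the default port of the local host:
#   curl -s -D - -o /dev/null \
#       http://localhost:8002/buildid/BUILDID/debuginfo | debuginfod_headers
```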

There are three requests. In each case, the buildid is encoded as a
lowercase hexadecimal string. For example, for a program /bin/ls, look
at the ELF note GNU_BUILD_ID:

    % readelf -n /bin/ls | grep -A4 build.id
    Note section [ 4] '.note.gnu.buildid' of 36 bytes at offset 0x340:
    Owner          Data size  Type
    GNU                   20  GNU_BUILD_ID
    Build ID: 8713b9c3fb8a720137a4a08b325905c7aaf8429d

Then the hexadecimal BUILDID is simply:

    8713b9c3fb8a720137a4a08b325905c7aaf8429d

/buildid/BUILDID/debuginfo
If the given buildid is known to the server, this request will result
in a binary object that contains the customary .*debug_* sections.
This may be a split debuginfo file as created by strip, or it may be an
original unstripped executable.

/buildid/BUILDID/executable
If the given buildid is known to the server, this request will result
in a binary object that contains the normal executable segments. This
may be an executable stripped by strip, or it may be an original
unstripped executable. ET_DYN shared libraries are considered to be a
type of executable.
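
The two requests above can be composed mechanically from a build ID.
The following sketch extracts the ID from "readelf -n" output and
builds the request URL; the helper names buildid_from_notes and
debuginfod_url, and the localhost server address in the example, are
assumptions for illustration.

```shell
# Extract the build ID from "readelf -n" output read on standard input.
buildid_from_notes() {
    awk '/Build ID:/ { print $3; exit }'
}

# Compose a webapi request URL.
# Usage: debuginfod_url BASE BUILDID TYPE
# where TYPE is "debuginfo" or "executable".
debuginfod_url() {
    # A trailing slash on BASE is dropped to avoid a doubled "/".
    printf '%s/buildid/%s/%s\n' "${1%/}" "$2" "$3"
}

# Example, assuming a server on the default port of the local host:
#   id=$(readelf -n /bin/ls | buildid_from_notes)
#   curl -f -o ls.debug "$(debuginfod_url http://localhost:8002 "$id" debuginfo)"
```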

/buildid/BUILDID/source/SOURCE/FILE
If the given buildid is known to the server, this request will result
in a binary object that contains the source file mentioned. The path
should be absolute. Relative path names commonly appear in the DWARF
file's source directory, but these paths are relative to individual
compilation unit AT_comp_dir paths, and yet an executable is made up of
multiple CUs. Therefore, to disambiguate, debuginfod expects source
queries to prefix relative path names with the CU
compilation-directory, followed by a mandatory "/".

Note: the caller may or may not elide ../ or /./ or extraneous ///
sorts of path components in the directory names. debuginfod accepts
both forms. Specifically, debuginfod canonicalizes path names according
to RFC3986 section 5.2.4 (Remove Dot Segments), plus reducing any // to
/ in the path.

For example:

    #include <stdio.h>                /buildid/BUILDID/source/usr/include/stdio.h
    /path/to/foo.c                    /buildid/BUILDID/source/path/to/foo.c
    ../bar/foo.c AT_comp_dir=/zoo/    /buildid/BUILDID/source/zoo//../bar/foo.c
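
For callers that prefer to send the already-canonical form, the rule
above can be approximated in a few lines. This is a simplified sketch,
not debuginfod's own routine: it assumes an absolute path, squeezes
repeated slashes first, and then resolves "." and ".." components. The
helper name canon_path is hypothetical.

```shell
# Hypothetical helper: canonicalize an absolute path by reducing
# repeated slashes and removing "." and ".." components.
canon_path() {
    printf '%s\n' "$1" | tr -s '/' | awk -F/ '{
        n = 0
        # Field 1 is the empty string before the leading slash.
        for (i = 2; i <= NF; i++) {
            if ($i == ".") continue
            if ($i == "..") { if (n > 0) n--; continue }
            seg[++n] = $i
        }
        out = ""
        for (i = 1; i <= n; i++) out = out "/" seg[i]
        print (out == "" ? "/" : out)
    }'
}

# Example, using the third row of the table above:
#   canon_path /zoo//../bar/foo.c
```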

Note: the client should %-escape characters in /SOURCE/FILE that are
not shown as "unreserved" in section 2.3 of RFC3986. Some characters
that will be escaped include "+", "\", "$", "!", the 'space' character,
and ";". RFC3986 includes a more comprehensive list of these
characters.
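
A client-side escaping step might look like the sketch below. It keeps
the RFC3986 unreserved characters and the "/" path separator, and
%-escapes every other character. The helper name urlencode_path is an
assumption, and the sketch is only meant for ASCII file names.

```shell
# Hypothetical helper: %-escape a source path for use in a
# /buildid/BUILDID/source/... request. ASCII input is assumed.
urlencode_path() (
    LC_ALL=C
    s=$1
    out=
    while [ -n "$s" ]; do
        rest=${s#?}          # everything after the first character
        c=${s%"$rest"}       # the first character itself
        case $c in
            [A-Za-z0-9._~/-]) out=$out$c ;;
            # "'c" makes printf emit the numeric code of character c.
            *) out=$out$(printf '%%%02X' "'$c") ;;
        esac
        s=$rest
    done
    printf '%s\n' "$out"
)

# Example:
#   urlencode_path '/path/to/my file+1.c'
```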

/metrics
This endpoint returns a Prometheus-formatted text/plain dump of a
variety of statistics about the operation of the debuginfod server. The
exact set of metrics and their meanings may change in future versions.
Caution: configuration information (path names, versions) may be
disclosed.
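
Because the set of metrics may change, scripts are better off selecting
metrics by name than by position. The sketch below pulls the values of
one named metric out of Prometheus text-format input. The helper name
metric_value is an assumption, and example_requests_total in the usage
line is a made-up metric name, not one debuginfod is known to export.

```shell
# Hypothetical helper: print the value(s) of one metric from
# Prometheus text-format input on standard input.
# Usage: metric_value NAME
metric_value() {
    awk -v name="$1" '
        /^#/ { next }                   # skip HELP/TYPE comment lines
        {
            metric = $1
            sub(/\{.*/, "", metric)     # drop any {label="..."} suffix
            # The last field is the value when no timestamp is present.
            if (metric == name) print $NF
        }'
}

# Example, assuming a server on the default port of the local host:
#   curl -s http://localhost:8002/metrics | metric_value example_requests_total
```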

DATA MANAGEMENT
debuginfod stores its index in an sqlite database in a densely packed
set of interlinked tables. While the representation is as efficient as
we have been able to make it, it still takes a considerable amount of
data to record all debuginfo-related data of potentially a great many
files. This section offers some advice about the implications.

As a general explanation for size, consider that when debuginfod
indexes ELF/DWARF files, it stores their names, their referenced source
file names, and their buildids. When indexing archives, it stores
every file name of or in an archive, every buildid, plus every source
file name referenced from a DWARF file. (Indexing archives takes more
space because the source files often reside in separate subpackages
that may not be indexed at the same pass, so extra metadata has to be
kept.)

Getting down to numbers, in the case of Fedora RPMs (essentially,
gzip-compressed cpio files), the sqlite index database tends to be from
0.5% to 3% of their size. It's larger for binaries that are assembled
out of a great many source files, or packages that carry much
debuginfo-unrelated content. It may be even larger during the indexing
phase due to temporary sqlite write-ahead-logging files; these are
checkpointed (cleaned out and removed) at shutdown. It may be helpful
to apply tight -I or -X regular-expression constraints to exclude files
from scanning that you know have no debuginfo-relevant content.
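
For planning purposes, the 0.5% to 3% ratio quoted above translates
into a simple estimate. The sketch below applies it to a dataset size
given in megabytes; the helper name estimate_index_mb is an assumption,
and the result is only as good as the ratio, which was quoted for
Fedora RPMs.

```shell
# Hypothetical helper: estimate the index size range, in MB, for an
# archive dataset of the given size in MB (0.5% to 3% of its size).
estimate_index_mb() {
    dataset_mb=$1
    low=$(( dataset_mb * 5 / 1000 ))    # 0.5%
    high=$(( dataset_mb * 3 / 100 ))    # 3%
    printf '%s-%s MB\n' "$low" "$high"
}

# Example: a 200000 MB (about 200 GB) RPM collection.
#   estimate_index_mb 200000
```

Leave extra headroom during initial indexing, since the write-ahead
logging files come on top of this estimate.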

As debuginfod runs in normal active mode, it periodically rescans its
target directories, and any new content found is added to the database.
Old content, such as data for files that have disappeared or that have
been replaced with newer versions, is removed at a periodic grooming
pass. This means that the sqlite files grow fast during initial
indexing, slowly during index rescans, and periodically shrink during
grooming. An optional one-shot maximal grooming pass is also
available. It removes debuginfo-unrelated data from the archive
content index, such as file names found in archives ("archive sdef"
records) that are not referred to as source files from any binaries
found in archives ("archive sref" records). This can save considerable
disk space. However, it is slow and temporarily requires up to twice
the database size as free space. Worse: it may result in missing
source-code info if the archive traversals were interrupted, so that
not all source file references were known. Use it rarely to polish a
complete index.

You should ensure that ample disk space remains available. (The flood
of error messages on -ENOSPC is ugly and nagging. But, like for most
other errors, debuginfod will resume when resources permit.) If
necessary, debuginfod can be stopped, the database file moved or
removed, and debuginfod restarted.

sqlite offers several performance-related options in the form of
pragmas. Some may be useful to fine-tune the defaults plus the
debuginfod extras. The -D option may be useful to tell debuginfod to
execute the given bits of SQL after the basic schema creation commands.
For example, the "synchronous", "cache_size", "auto_vacuum", "threads",
"journal_mode" pragmas may be fun to tweak via -D, if you're searching
for peak performance. The "optimize", "wal_checkpoint" pragmas may be
useful to run periodically, outside debuginfod. The default settings
are performance- rather than reliability-oriented, so a hardware crash
might corrupt the database. In these cases, it may be necessary to
manually delete the sqlite database and start over.
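
The periodic maintenance mentioned above could be scripted with the
sqlite3(1) command-line shell, as in this sketch. The helper name
index_maintenance is an assumption, and the cache_size value in the
commented -D example is an arbitrary illustration, not a
recommendation.

```shell
# Hypothetical helper: run the "optimize" and "wal_checkpoint" pragmas
# on an index database, outside debuginfod.
# Usage: index_maintenance DATABASE_FILE
index_maintenance() {
    sqlite3 "$1" 'PRAGMA optimize; PRAGMA wal_checkpoint;' >/dev/null
}

# Example, using the default database location:
#   index_maintenance "$HOME/.debuginfod.sqlite"
#
# Tuning at startup goes through -D instead, for instance:
#   debuginfod -D 'PRAGMA cache_size = -65536' -F /path/to/binaries
```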

As debuginfod changes in the future, we may have no choice but to
change the database schema in an incompatible manner. If this happens,
new versions of debuginfod will issue SQL statements to drop all prior
schema & data, and start over. So, disk space will not be wasted for
retaining a no-longer-usable dataset.

In summary, if your system can bear a 0.5%-3% index-to-archive-dataset
size ratio, and slow growth afterwards, you should not need to worry
about disk space. If a system crash corrupts the database, or you want
to force debuginfod to reset and start over, simply erase the sqlite
file before restarting debuginfod.

In contrast, in passive mode, all scanning and grooming is disabled,
and the index database remains read-only. This makes the database more
suitable for sharing between servers or sites with simple one-way
replication, and data management considerations are generally moot.

SECURITY
debuginfod does not include any particular security features. While it
is robust with respect to inputs, some abuse is possible. It forks a
new thread for each incoming HTTP request, which could lead to a
denial-of-service in terms of RAM, CPU, disk I/O, or network I/O. If
this is a problem, users are advised to install debuginfod with an
HTTPS reverse-proxy front-end that enforces site policies for
firewalling, authentication, integrity, authorization, and load
control. The /metrics webapi endpoint is probably not appropriate for
disclosure to the public.

When relaying queries to upstream debuginfods, debuginfod does not
include any particular security features. It trusts that the binaries
returned by the debuginfods are accurate. Therefore, the list of
servers should include only trustworthy ones. If accessed across HTTP
rather than HTTPS, the network should be trustworthy. Authentication
information through the internal libcurl library is not currently
enabled.

FILES
$HOME/.debuginfod.sqlite
       Default database file.

SEE ALSO
debuginfod-find(1) sqlite3(1)
https://prometheus.io/docs/instrumenting/exporters/