DEBUGINFOD(8) System Manager's Manual DEBUGINFOD(8)

NAME
debuginfod - debuginfo-related http file-server daemon

SYNOPSIS
debuginfod [OPTION]... [PATH]...

DESCRIPTION
debuginfod serves debuginfo-related artifacts over HTTP. It
periodically scans a set of directories for ELF/DWARF files and their
associated source code, as well as archive files containing the above,
to build an index by their buildid. This index is used when remote
clients use the HTTP webapi, to fetch these files by the same buildid.

If a debuginfod cannot service a given buildid artifact request itself,
and it is configured with information about upstream debuginfod
servers, it queries them for the same information, just as
debuginfod-find would. If successful, it locally caches then relays the
file content to the original requester.

Indexing the given PATHs proceeds using multiple threads. One thread
periodically traverses all the given PATHs logically or physically (see
the -L option). Duplicate PATHs are ignored. You may use a file name
for a PATH, but source code indexing may be incomplete; prefer using a
directory that contains the binaries. The traversal thread enumerates
all matching files (see the -I and -X options) into a work queue. A
collection of scanner threads (see the -c option) wait at the work
queue to analyze files in parallel.

If the -F option is given, each file is scanned as an ELF/DWARF file.
Source files are matched with DWARF files based on the AT_comp_dir
(compilation directory) attributes inside it. Caution: source files
listed in the DWARF may be a path anywhere in the file system, and
debuginfod will readily serve their content on demand. (Imagine a
doctored DWARF file that lists /etc/passwd as a source file.) If this
is a concern, audit your binaries with tools such as:

    % eu-readelf -wline BINARY | sed -n '/^Directory.table/,/^File.name.table/p'
or
    % eu-readelf -wline BINARY | sed -n '/^Directory.table/,/^Line.number/p'
or even use debuginfod itself:
    % debuginfod -vvv -d :memory: -F BINARY 2>&1 | grep 'recorded.*source'
    ^C
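
One way to act on the caution above is to post-process the list of
source paths that such an audit produces. The following Python sketch
is illustrative only and is not part of debuginfod: the function name
suspicious_sources and the allowed prefix are assumptions, and the
input is taken to be a plain list of absolute path strings collected by
whichever audit command you prefer.

```python
import posixpath


def suspicious_sources(paths, allowed_prefixes):
    """Return the paths that fall outside every allowed directory prefix.

    Hypothetical helper: `paths` is a list of absolute source file names
    (for example, collected from an eu-readelf audit), and
    `allowed_prefixes` lists the directories that are acceptable to serve.
    """
    flagged = []
    for path in paths:
        # Normalize "..", "." and repeated slashes before comparing, so a
        # path cannot escape an allowed directory through dot segments.
        normalized = posixpath.normpath(path)
        inside = any(
            normalized == prefix.rstrip("/")
            or normalized.startswith(prefix.rstrip("/") + "/")
            for prefix in allowed_prefixes
        )
        if not inside:
            flagged.append(path)
    return flagged
```

Any path this reports deserves a closer look before the binary is
placed under a directory scanned with -F.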

If any of the -R, -U, or -Z options is given, each file is scanned as
an archive file that may contain ELF/DWARF/source files. Archive files
are recognized by extension. If -R is given, ".rpm" files are scanned;
if -U is given, ".deb" and ".ddeb" files are scanned; if -Z is given,
the listed extensions are scanned. Because of complications such as
DWZ-compressed debuginfo, debuginfod may require two traversal passes
to identify all source code. Source files for RPMs are only served
from other RPMs, so the caution for -F does not apply. Note that due
to Debian/Ubuntu packaging policies and mechanisms, debuginfod cannot
resolve source files for DEB/DDEB at all.

If no PATH is listed, or none of the scanning options is given, then
debuginfod will simply serve content that it accumulated into its index
in all previous runs, and federate to any upstream debuginfod servers.

OPTIONS
-F     Activate ELF/DWARF file scanning. The default is off.

-Z EXT -Z EXT=CMD
       Activate an additional pattern in archive scanning. Files with
       name extension EXT (include the dot) will be processed. If CMD
       is given, it is invoked with the file name added to its argument
       list, and should produce a common archive on its standard
       output. Otherwise, the file is read as if CMD were "cat". Since
       debuginfod internally uses libarchive to read archive files, it
       can accept a wide range of archive formats and compression
       modes. The default is no additional patterns. This option may
       be repeated.

-R     Activate RPM patterns in archive scanning. The default is off.
       Equivalent to -Z .rpm=cat, since libarchive can natively process
       RPM archives. If your version of libarchive is much older than
       2020, be aware that some distributions have switched to an
       incompatible zstd compression for their payload. You may
       experiment with -Z .rpm='(rpm2cpio|zstdcat)<' instead of -R.

-U     Activate DEB/DDEB patterns in archive scanning. The default is
       off. Equivalent to -Z .deb='dpkg-deb --fsys-tarfile'
       -Z .ddeb='dpkg-deb --fsys-tarfile'.

-d FILE --database=FILE
       Set the path of the sqlite database used to store the index.
       This file is disposable in the sense that a later rescan will
       repopulate data. It will contain absolute file path names, so
       it may not be portable across machines. It may be frequently
       read/written, so it should be on a fast filesystem. It should
       not be shared across machines or users, to maximize sqlite
       locking performance. The default database file is
       $HOME/.debuginfod.sqlite.

-D SQL --ddl=SQL
       Execute given sqlite statement after the database is opened and
       initialized as extra DDL (SQL data definition language). This
       may be useful to tune performance-related pragmas or indexes.
       May be repeated. The default is nothing extra.

-p NUM --port=NUM
       Set the TCP port number (0 < NUM < 65536) on which debuginfod
       should listen, to service HTTP requests. Both IPv4 and IPv6
       sockets are opened, if possible. The webapi is documented
       below. The default port number is 8002.

-I REGEX --include=REGEX -X REGEX --exclude=REGEX
       Govern the inclusion and exclusion of file names under the
       search paths. The regular expressions are interpreted as
       unanchored POSIX extended REs, thus may include alternation.
       They are evaluated against the full path of each file, based on
       its realpath(3) canonicalization. By default, all files are
       included and none are excluded. A file that matches both
       include and exclude REGEX is excluded. (The contents of archive
       files are not subject to inclusion or exclusion filtering: they
       are all processed.) Only the last of each type of regular
       expression given is used.

-t SECONDS --rescan-time=SECONDS
       Set the rescan time for the file and archive directories. This
       is the amount of time the traversal thread will wait after
       finishing a scan, before doing it again. A rescan for unchanged
       files is fast (because the index also stores the file mtimes).
       A time of zero is acceptable, and means that only one initial
       scan should be performed. The default rescan time is 300
       seconds. Receiving a SIGUSR1 signal triggers a new scan,
       independent of the rescan time (including if it was zero).

-g SECONDS --groom-time=SECONDS
       Set the groom time for the index database. This is the amount
       of time the grooming thread will wait after finishing a grooming
       pass before doing it again. A groom operation quickly rescans
       all previously scanned files, only to see if they are still
       present and current, so it can deindex obsolete files. See also
       the DATA MANAGEMENT section. The default groom time is 86400
       seconds (1 day). A time of zero is acceptable, and means that
       only one initial groom should be performed. Receiving a SIGUSR2
       signal triggers a new grooming pass, independent of the groom
       time (including if it was zero).

-G     Run an extraordinary maximal-grooming pass at debuginfod
       startup. This pass can take considerable time, because it tries
       to remove any debuginfo-unrelated content from the
       archive-related parts of the index. It should not be run if any
       recent archive-related indexing operations were aborted early.
       It can take considerable space, because it finishes up with an
       sqlite "vacuum" operation, which repacks the database file by
       triplicating it temporarily. The default is not to do
       maximal-grooming. See also the DATA MANAGEMENT section.

-c NUM --concurrency=NUM
       Set the concurrency limit for the scanning queue threads, which
       work together to process archives & files located by the
       traversal thread. This is important for controlling
       CPU-intensive operations like parsing an ELF file and especially
       decompressing archives. The default is the number of processors
       on the system; the minimum is 1.

-L     Traverse symbolic links encountered during traversal of the
       PATHs, including across devices - as in find -L. The default is
       to traverse the physical directory structure only, stay on the
       same device, and ignore symlinks - as in find -P -xdev.
       Caution: a loop in the symbolic directory tree might lead to
       infinite traversal.

--fdcache-fds=NUM --fdcache-mbs=MB --fdcache-prefetch=NUM2
       Configure limits on a cache that keeps recently extracted files
       from archives. Up to NUM requested files and up to a total of
       MB megabytes will be kept extracted, in order to avoid having to
       decompress their archives over and over again. In addition, up
       to NUM2 other files from an archive may be prefetched into the
       cache before they are even requested. The default NUM, NUM2,
       and MB values depend on the concurrency of the system, and on
       the available disk space on the $TMPDIR or /tmp filesystem.
       This is because that is where the most recently used extracted
       files are kept. Grooming cleans this cache.

-v     Increase verbosity of logging to the standard error file
       descriptor. May be repeated to increase details. The default
       verbosity is 0.
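
The include/exclude rule described under -I and -X can be summarized in
a few lines. The Python sketch below only illustrates the documented
decision order and is not debuginfod's code: the function name
is_scanned is hypothetical, the two arguments stand for the last -I and
-X expressions given, and Python's re syntax is used as a stand-in for
POSIX extended regular expressions, which differ in some details.

```python
import re


def is_scanned(path, include=None, exclude=None):
    """Decide whether a canonical file path would pass -I / -X filtering.

    Assumptions: `include` and `exclude` hold the last -I and -X
    expressions given (None if absent), and Python's `re` syntax stands in
    for POSIX extended regular expressions.
    """
    # Unanchored matching: the expression may match anywhere in the path.
    if include is not None and re.search(include, path) is None:
        return False
    # A file that matches both expressions is excluded.
    if exclude is not None and re.search(exclude, path) is not None:
        return False
    return True
```

With no expressions every file is scanned, which mirrors the stated
default of including everything and excluding nothing.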

WEBAPI
debuginfod's webapi resembles ordinary file service, where a GET
request with a path containing a known buildid results in a file.
Unknown buildid / request combinations result in HTTP error codes. This
file service resemblance is intentional, so that an installation can
take advantage of standard HTTP management infrastructure.

There are three requests. In each case, the buildid is encoded as a
lowercase hexadecimal string. For example, for a program /bin/ls, look
at the ELF note GNU_BUILD_ID:

    % readelf -n /bin/ls | grep -A4 build.id
    Note section [ 4] '.note.gnu.buildid' of 36 bytes at offset 0x340:
      Owner          Data size  Type
      GNU                   20  GNU_BUILD_ID
        Build ID: 8713b9c3fb8a720137a4a08b325905c7aaf8429d

Then the hexadecimal BUILDID is simply:

    8713b9c3fb8a720137a4a08b325905c7aaf8429d
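
The encoding step can be sketched in Python as follows. Both helper
names are hypothetical, and the well-formedness check is only an
assumption about what a client might verify before sending a request,
not a statement about what the server validates.

```python
import re

# Non-empty, even-length, lowercase hexadecimal string.
_BUILDID_RE = re.compile(r"(?:[0-9a-f]{2})+")


def buildid_hex(raw):
    """Encode raw build-id note bytes as a lowercase hexadecimal string."""
    return raw.hex()


def is_wellformed_buildid(text):
    """Check that `text` looks like a lowercase hexadecimal buildid."""
    return _BUILDID_RE.fullmatch(text) is not None
```

For the /bin/ls note shown above, the 20 data bytes encode to the
40-character string used in request paths.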

/buildid/BUILDID/debuginfo
If the given buildid is known to the server, this request will result
in a binary object that contains the customary .*debug_* sections.
This may be a split debuginfo file as created by strip, or it may be an
original unstripped executable.

/buildid/BUILDID/executable
If the given buildid is known to the server, this request will result
in a binary object that contains the normal executable segments. This
may be an executable stripped by strip, or it may be an original
unstripped executable. ET_DYN shared libraries are considered to be a
type of executable.

/buildid/BUILDID/source/SOURCE/FILE
If the given buildid is known to the server, this request will result
in a binary object that contains the source file mentioned. The path
should be absolute. Relative path names commonly appear in the DWARF
file's source directory, but these paths are relative to individual
compilation unit AT_comp_dir paths, and yet an executable is made up of
multiple CUs. Therefore, to disambiguate, debuginfod expects source
queries to prefix relative path names with the CU
compilation-directory, followed by a mandatory "/".

Note: the caller may or may not elide ../ or /./ or extraneous ///
sorts of path components in the directory names. debuginfod accepts
both forms. Specifically, debuginfod canonicalizes path names
according to RFC3986 section 5.2.4 (Remove Dot Segments), plus reducing
any // to / in the path.

For example:

    #include <stdio.h>                /buildid/BUILDID/source/usr/include/stdio.h
    /path/to/foo.c                    /buildid/BUILDID/source/path/to/foo.c
    ../bar/foo.c AT_comp_dir=/zoo/    /buildid/BUILDID/source/zoo//../bar/foo.c
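
The path rules above can be sketched in Python. This is a simplified
illustration, not debuginfod's implementation: both function names are
hypothetical, only absolute paths are handled, percent-encoding of
unusual characters is left out, and collapsing slashes before removing
dot segments is an assumption about ordering that the text does not
spell out.

```python
import re


def canonicalize(path):
    """Simplified canonical form of an absolute file path.

    Assumption: runs of slashes are collapsed first, then "." and ".."
    segments are removed in the spirit of RFC 3986 section 5.2.4.
    """
    if not path.startswith("/"):
        raise ValueError("expected an absolute path")
    path = re.sub(r"/{2,}", "/", path)
    kept = []
    for segment in path.split("/")[1:]:
        if segment == ".":
            continue
        if segment == "..":
            # ".." drops the previous segment; at the root it is ignored.
            if kept:
                kept.pop()
            continue
        kept.append(segment)
    return "/" + "/".join(kept)


def source_request_path(buildid, source, comp_dir=None):
    """Build the webapi path for a source file request.

    `comp_dir` is the CU compilation directory; it is needed only when
    `source` is a relative path.
    """
    if not source.startswith("/"):
        if comp_dir is None:
            raise ValueError("relative source path needs a comp_dir")
        # Prefix with the compilation directory and the mandatory "/".
        source = comp_dir + "/" + source
    return "/buildid/" + buildid + "/source" + source
```

Since both the raw and the canonical forms are accepted, a client may
send the output of source_request_path as is, or canonicalize the
source path first.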

/metrics
This endpoint returns a Prometheus formatted text/plain dump of a
variety of statistics about the operation of the debuginfod server. The
exact set of metrics and their meanings may change in future versions.
Caution: configuration information (path names, versions) may be
disclosed.

DATA MANAGEMENT
debuginfod stores its index in an sqlite database in a densely packed
set of interlinked tables. While the representation is as efficient as
we have been able to make it, it still takes a considerable amount of
data to record all debuginfo-related data of potentially a great many
files. This section offers some advice about the implications.

As a general explanation for size, consider that when debuginfod
indexes ELF/DWARF files, it stores their names, their referenced source
file names, and their buildids. When indexing archives, it stores
every file name of or in an archive, every buildid, plus every source
file name referenced from a DWARF file. (Indexing archives takes more
space because the source files often reside in separate subpackages
that may not be indexed at the same pass, so extra metadata has to be
kept.)

Getting down to numbers, in the case of Fedora RPMs (essentially,
gzip-compressed cpio files), the sqlite index database tends to be from
0.5% to 3% of their size. It's larger for binaries that are assembled
out of a great many source files, or packages that carry much
debuginfo-unrelated content. It may be even larger during the indexing
phase due to temporary sqlite write-ahead-logging files; these are
checkpointed (cleaned out and removed) at shutdown. It may be helpful
to apply tight -I or -X regular-expression constraints to exclude files
from scanning that you know have no debuginfo-relevant content.

As debuginfod runs, it periodically rescans its target directories, and
any new content found is added to the database. Old content, such as
data for files that have disappeared or that have been replaced with
newer versions, is removed at a periodic grooming pass. This means that
the sqlite files grow fast during initial indexing, slowly during index
rescans, and periodically shrink during grooming. An optional one-shot
maximal grooming pass is also available. It removes
debuginfo-unrelated data from the archive content index, such as file
names found in archives ("archive sdef" records) that are not referred
to as source files from any binaries found in archives ("archive sref"
records). This can save considerable disk space. However, it is slow
and temporarily requires up to twice the database size as free space.
Worse: it may result in missing source-code info if the archive
traversals were interrupted, so that not all source file references
were known. Use it rarely to polish a complete index.
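
The figures quoted in this section lend themselves to a quick
back-of-the-envelope calculation. The Python sketch below merely
restates them as arithmetic; the function names are hypothetical, and
the 0.5% to 3% range is the one reported above for Fedora RPMs, so
other archive collections may behave differently.

```python
def index_size_range(archive_bytes):
    """Estimate the (low, high) index size in bytes at 0.5% and 3%."""
    return archive_bytes * 5 // 1000, archive_bytes * 3 // 100


def max_groom_free_space(database_bytes):
    """Upper bound on temporary free space for a maximal grooming pass,
    taken as twice the current database size."""
    return 2 * database_bytes
```

For example, a 1 TB archive collection would suggest an index of
roughly 5 GB to 30 GB, and a 30 GB database could need up to 60 GB of
free space while a maximal grooming pass runs.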

You should ensure that ample disk space remains available. (The flood
of error messages on -ENOSPC is ugly and nagging. But, like for most
other errors, debuginfod will resume when resources permit.) If
necessary, debuginfod can be stopped, the database file moved or
removed, and debuginfod restarted.

sqlite offers several performance-related options in the form of
pragmas. Some may be useful to fine-tune the defaults plus the
debuginfod extras. The -D option may be useful to tell debuginfod to
execute the given bits of SQL after the basic schema creation commands.
For example, the "synchronous", "cache_size", "auto_vacuum", "threads",
"journal_mode" pragmas may be fun to tweak via -D, if you're searching
for peak performance. The "optimize", "wal_checkpoint" pragmas may be
useful to run periodically, outside debuginfod. The default settings
are performance- rather than reliability-oriented, so a hardware crash
might corrupt the database. In these cases, it may be necessary to
manually delete the sqlite database and start over.
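
Before handing pragma statements to -D, it can help to try them against
a scratch database. The following Python sketch uses the standard
sqlite3 module on an in-memory database; it does not touch debuginfod
or its index, and the pragma values are arbitrary illustrations rather
than recommendations.

```python
import sqlite3


def apply_pragmas(conn, statements):
    """Execute each extra SQL statement, as a -D option would request."""
    for sql in statements:
        conn.execute(sql)


# Illustrative values only; suitable settings depend on the workload.
EXTRA_DDL = [
    "PRAGMA synchronous = OFF",
    "PRAGMA cache_size = -20000",  # a negative value is a size in KiB
]

conn = sqlite3.connect(":memory:")  # throwaway database for the experiment
apply_pragmas(conn, EXTRA_DDL)
# Read the settings back to confirm that sqlite accepted them.
synchronous = conn.execute("PRAGMA synchronous").fetchone()[0]
cache_size = conn.execute("PRAGMA cache_size").fetchone()[0]
conn.close()
```

A statement that behaves as intended here could then be given to
debuginfod as the argument of a -D option, one statement per option.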

As debuginfod changes in the future, we may have no choice but to
change the database schema in an incompatible manner. If this happens,
new versions of debuginfod will issue SQL statements to drop all prior
schema & data, and start over. So, disk space will not be wasted for
retaining a no-longer-useable dataset.

In summary, if your system can bear a 0.5%-3% index-to-archive-dataset
size ratio, and slow growth afterwards, you should not need to worry
about disk space. If a system crash corrupts the database, or you want
to force debuginfod to reset and start over, simply erase the sqlite
file before restarting debuginfod.

SECURITY
debuginfod does not include any particular security features. While it
is robust with respect to inputs, some abuse is possible. It forks a
new thread for each incoming HTTP request, which could lead to a
denial-of-service in terms of RAM, CPU, disk I/O, or network I/O. If
this is a problem, users are advised to install debuginfod with an
HTTPS reverse-proxy front-end that enforces site policies for
firewalling, authentication, integrity, authorization, and load
control. The /metrics webapi endpoint is probably not appropriate for
disclosure to the public.

When relaying queries to upstream debuginfods, debuginfod does not
include any particular security features. It trusts that the binaries
returned by the debuginfods are accurate. Therefore, the list of
servers should include only trustworthy ones. If accessed across HTTP
rather than HTTPS, the network should be trustworthy. Authentication
information through the internal libcurl library is not currently
enabled.

ENVIRONMENT VARIABLES
TMPDIR This environment variable points to a file system to be used for
       temporary files. The default is /tmp.

DEBUGINFOD_URLS
       This environment variable contains a list of URL prefixes for
       trusted debuginfod instances. Alternate URL prefixes are
       separated by space. Avoid referential loops that cause a server
       to contact itself, directly or indirectly - the results would be
       hilarious.

DEBUGINFOD_TIMEOUT
       This environment variable governs the timeout for each
       debuginfod HTTP connection. A server that fails to provide at
       least 100K of data within this many seconds is skipped. The
       default is 90 seconds. (Zero or negative means "no timeout".)

DEBUGINFOD_CACHE_PATH
       This environment variable governs the location of the cache
       where downloaded files are kept. It is cleaned periodically as
       this program is reexecuted. If XDG_CACHE_HOME is set then
       $XDG_CACHE_HOME/debuginfod_client is the default location,
       otherwise $HOME/.cache/debuginfod_client is used. For more
       information regarding the client cache see
       debuginfod_find_debuginfo(3).
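
The way these variables combine can be sketched in Python. The helper
names are hypothetical, and the lookup order shown follows only what is
stated above; the real client library may apply further rules that this
page does not describe.

```python
import posixpath


def upstream_urls(environ):
    """Split DEBUGINFOD_URLS into its space-separated URL prefixes."""
    return environ.get("DEBUGINFOD_URLS", "").split()


def cache_path(environ):
    """Resolve the client cache directory as the text above describes."""
    explicit = environ.get("DEBUGINFOD_CACHE_PATH")
    if explicit:
        return explicit
    xdg = environ.get("XDG_CACHE_HOME")
    if xdg:
        return posixpath.join(xdg, "debuginfod_client")
    # Fall back to the per-user default under $HOME.
    return posixpath.join(environ.get("HOME", ""), ".cache", "debuginfod_client")
```

Passing os.environ to either function applies the same logic to the
current process environment.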

FILES
$HOME/.debuginfod.sqlite
       Default database file.

$XDG_CACHE_HOME/debuginfod_client
       Default cache directory for content from upstream debuginfods.
       If XDG_CACHE_HOME is not set then $HOME/.cache/debuginfod_client
       is used.

SEE ALSO
debuginfod-find(1) sqlite3(1)
https://prometheus.io/docs/instrumenting/exporters/