LINKCHECKERRC(5)                  LinkChecker                 LINKCHECKERRC(5)

NAME
       linkcheckerrc - configuration file for LinkChecker

DESCRIPTION
       linkcheckerrc is the configuration file for LinkChecker. The file is
       written in an INI-style format. The default file location is
       ~/.linkchecker/linkcheckerrc on Unix,
       %HOMEPATH%\.linkchecker\linkcheckerrc on Windows systems.

SETTINGS
   checking
       cookiefile=filename
              Read a file with initial cookie data. The cookie data format is
              explained in linkchecker(1). Command line option: --cookiefile

       debugmemory=[0|1]
              Write memory allocation statistics to a file on exit; requires
              meliae. The default is not to write the file. Command line
              option: none

       localwebroot=STRING
              When checking absolute URLs inside local files, the given root
              directory is used as the base URL. The directory must use URL
              syntax: join directories with a slash, not a backslash, and
              end the value with a slash. Command line option: none

       nntpserver=STRING
              Specify an NNTP server for news: links. Default is the
              environment variable NNTP_SERVER. If no host is given, only
              the syntax of the link is checked. Command line option:
              --nntp-server

       recursionlevel=NUMBER
              Check all links recursively up to the given depth. A negative
              depth enables infinite recursion. Default depth is infinite.
              Command line option: --recursion-level

       threads=NUMBER
              Generate no more than the given number of threads. Default
              number of threads is 10. To disable threading specify a
              non-positive number. Command line option: --threads

       timeout=NUMBER
              Set the timeout for connection attempts in seconds. The
              default timeout is 60 seconds. Command line option: --timeout

       aborttimeout=NUMBER
              Time to wait for checks to finish after the user aborts the
              first time (with Ctrl-C or the abort button). The default
              abort timeout is 300 seconds. Command line option: none

       useragent=STRING
              Specify the User-Agent string to send to the HTTP server, for
              example "Mozilla/4.0". The default is "LinkChecker/X.Y" where
              X.Y is the current version of LinkChecker. Command line
              option: --user-agent

       sslverify=[0|1|filename]
              If set to zero, disables SSL certificate checking. If set to
              one (the default), enables SSL certificate checking with the
              provided CA certificate file. If a filename is specified, it
              is used as the certificate file. Command line option: none

       maxrunseconds=NUMBER
              Stop checking new URLs after the given number of seconds. This
              has the same effect as the user stopping the check (by hitting
              Ctrl-C) after that time. The default is not to stop until all
              URLs are checked. Command line option: none

       maxfilesizedownload=NUMBER
              Files larger than NUMBER bytes will be ignored, without
              downloading anything if accessed over http and an accurate
              Content-Length header was returned. No more than this amount
              of a file will be downloaded. The default is 5242880 (5 MB).
              Command line option: none

       maxfilesizeparse=NUMBER
              Files larger than NUMBER bytes will not be parsed for links.
              The default is 1048576 (1 MB). Command line option: none

       maxnumurls=NUMBER
              Maximum number of URLs to check. New URLs will not be queued
              after the given number of URLs is checked. The default is to
              queue and check all URLs. Command line option: none

       maxrequestspersecond=NUMBER
              Limit the maximum number of requests per second to one host.
              The default is 10. Command line option: none

       robotstxt=[0|1]
              When using http, fetch robots.txt and confirm whether each URL
              should be accessed before checking. The default is to use
              robots.txt files. Command line option: --no-robots

       allowedschemes=NAME[,NAME...]
              Allowed URL schemes as a comma-separated list. Command line
              option: none

       resultcachesize=NUMBER
              Set the result cache size. The default is 100 000 URLs.
              Command line option: none
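
       As a minimal sketch of how these options combine, a [checking]
       section might look like the following. Every value is an illustrative
       assumption rather than a recommendation, and /var/www/ is a
       hypothetical local web root.

```ini
[checking]
# hypothetical values, chosen only for illustration
threads=5
timeout=30
recursionlevel=2
maxrequestspersecond=5
# URL syntax: forward slashes and a trailing slash
localwebroot=/var/www/
```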

   filtering
       ignore=REGEX (MULTILINE)
              Only check syntax of URLs matching the given regular
              expressions. Command line option: --ignore-url

       ignorewarnings=NAME[,NAME...]
              Ignore the comma-separated list of warnings. See WARNINGS for
              the list of supported warnings. Command line option: none

       internlinks=REGEX
              Regular expression to add more URLs recognized as internal
              links. Default is that URLs given on the command line are
              internal. Command line option: none

       nofollow=REGEX (MULTILINE)
              Check but do not recurse into URLs matching the given regular
              expressions. Command line option: --no-follow-url

       checkextern=[0|1]
              Check external links. Default is to check internal links only.
              Command line option: --check-extern
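
       The filtering options above could be combined as in this sketch. The
       regular expressions and the example.com host are hypothetical; the
       warning names are taken from the WARNINGS list.

```ini
[filtering]
# only check the syntax of URLs matching these (hypothetical) patterns
ignore=
  ^mailto:
  ^javascript:
# check these URLs but do not recurse into them
nofollow=
  ^https?://www\.example\.com/archive/
checkextern=1
ignorewarnings=url-whitespace,http-empty-content
```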

   authentication
       entry=REGEX USER [PASS] (MULTILINE)
              Provide individual username/password pairs for different
              links. In addition to a single login page specified with
              loginurl, multiple FTP, HTTP (Basic Authentication) and telnet
              links are supported. Entries are a triple (URL regex,
              username, password) or a pair (URL regex, username), with the
              fields separated by whitespace. The password is optional; if
              it is missing, it has to be entered at the command line. If
              the regular expression matches the checked URL, the given
              username/password pair is used for authentication. The
              command line options -u and -p match every link and therefore
              override the entries given here. The first match wins.
              Command line option: -u, -p

       loginurl=URL
              The URL of a login page to be visited before link checking.
              The page is expected to contain an HTML form to collect
              credentials and submit them to the address in its action
              attribute using an HTTP POST request. The name attributes of
              the input elements of the form and the values to be submitted
              need to be available (see entry for an explanation of username
              and password values).

       loginuserfield=STRING
              The name attribute of the username input element. Default:
              login.

       loginpasswordfield=STRING
              The name attribute of the password input element. Default:
              password.

       loginextrafields=NAME:VALUE (MULTILINE)
              Optionally, the name attributes of any additional input
              elements and the values to populate them with. Note that
              these are submitted without checking whether matching input
              elements exist in the HTML form.
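
       A sketch of an [authentication] section follows. The hosts, user
       names, password and the extra form field are all hypothetical. The
       second entry omits the password, so it would have to be entered at
       the command line. The sketch assumes that the credentials for the
       login form come from an entry line, as the loginurl description
       suggests.

```ini
[authentication]
# format: REGEX USER [PASS]; hosts and credentials are hypothetical
entry=
  ^https?://www\.example\.com/ alice s3cret
  ^ftp://ftp\.example\.com/ bob
loginurl=https://www.example.com/login
# field names below are the documented defaults
loginuserfield=login
loginpasswordfield=password
# hypothetical extra form field, submitted as remember=1
loginextrafields=
  remember:1
```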

   output
     URL checking results
       fileoutput=TYPE[,TYPE...]
              Output to a file linkchecker-out.TYPE, or
              $HOME/.linkchecker/failures for the failures output type.
              Valid file output types are text, html, sql, csv, gml, dot,
              xml, none or failures. Default is no file output. The various
              output types are documented below. Note that you can suppress
              all console output with log=none (command line: --output=none).
              Command line option: --file-output

       log=TYPE[/ENCODING]
              Specify the console output type as text, html, sql, csv, gml,
              dot, xml, none or failures. Default type is text. The various
              output types are documented below. The ENCODING specifies the
              output encoding; the default is that of your locale. Valid
              encodings are listed at
              https://docs.python.org/library/codecs.html#standard-encodings.
              Command line option: --output

       verbose=[0|1]
              If set, log all checked URLs once. Default is to log only
              errors and warnings. Command line option: --verbose

       warnings=[0|1]
              If set, log warnings. Default is to log warnings. Command
              line option: --no-warnings

     Progress updates
       status=[0|1]
              Control printing URL checker status messages. Default is 1.
              Command line option: --no-status

     Application
       debug=STRING[,STRING...]
              Print debugging output for the given modules. Available debug
              modules are cmdline, checking, cache, dns, thread, plugins and
              all. Specifying all is an alias for specifying all available
              loggers. Command line option: --debug

     Quiet
       quiet=[0|1]
              If set, operate quietly. An alias for log=none that also
              hides application information messages. This is only useful
              with fileoutput, else no results will be output. Command line
              option: --quiet
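
       For illustration, an [output] section that keeps text output on the
       console and additionally writes CSV and HTML result files might be
       written as below. The values, including the utf-8 encoding, are
       assumptions; with these settings the files would be named
       linkchecker-out.csv and linkchecker-out.html.

```ini
[output]
# console output as text, encoded as UTF-8
log=text/utf-8
# additionally write linkchecker-out.csv and linkchecker-out.html
fileoutput=csv,html
verbose=0
warnings=1
# suppress progress status messages
status=0
```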

   text
       filename=STRING
              Specify output filename for text logging. Default filename is
              linkchecker-out.txt. Command line option: --file-output

       parts=STRING
              Comma-separated list of parts that have to be logged. See
              LOGGER PARTS below. Command line option: none

       encoding=STRING
              Valid encodings are listed in
              https://docs.python.org/library/codecs.html#standard-encodings.
              Default encoding is the system default locale encoding.

       color* Color settings for the various log parts, syntax is color or
              type;color. The type can be bold, light, blink, invert. The
              color can be default, black, red, green, yellow, blue, purple,
              cyan, white, Black, Red, Green, Yellow, Blue, Purple, Cyan or
              White. Command line option: none

       colorparent=STRING
              Set parent color. Default is white.

       colorurl=STRING
              Set URL color. Default is default.

       colorname=STRING
              Set name color. Default is default.

       colorreal=STRING
              Set real URL color. Default is cyan.

       colorbase=STRING
              Set base URL color. Default is purple.

       colorvalid=STRING
              Set valid color. Default is bold;green.

       colorinvalid=STRING
              Set invalid color. Default is bold;red.

       colorinfo=STRING
              Set info color. Default is default.

       colorwarning=STRING
              Set warning color. Default is bold;yellow.

       colordltime=STRING
              Set download time color. Default is default.

       colorreset=STRING
              Set reset color. Default is default.
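
       A sketch of a [text] section is shown below. The filename and the
       chosen colors are hypothetical; the part names come from LOGGER
       PARTS, and the color values follow the color or type;color syntax
       described above.

```ini
[text]
# hypothetical output file name
filename=linkcheck-report.txt
# log only these parts
parts=realurl,result,warning,info
# type;color syntax
colorparent=light;blue
colorinvalid=bold;red
```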

   gml
       filename=STRING
              See [text] section above.

       parts=STRING
              See [text] section above.

       encoding=STRING
              See [text] section above.

   dot
       filename=STRING
              See [text] section above.

       parts=STRING
              See [text] section above.

       encoding=STRING
              See [text] section above.

   csv
       filename=STRING
              See [text] section above.

       parts=STRING
              See [text] section above.

       encoding=STRING
              See [text] section above.

       separator=CHAR
              Set CSV separator. Default is a semicolon (;).

       quotechar=CHAR
              Set CSV quote character. Default is a double quote (").
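
       As an example, a [csv] section that switches to comma-separated
       fields with single-quote quoting could look like this. The filename
       and both characters are illustrative assumptions.

```ini
[csv]
# hypothetical output file name
filename=links.csv
separator=,
quotechar='
```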

   sql
       filename=STRING
              See [text] section above.

       parts=STRING
              See [text] section above.

       encoding=STRING
              See [text] section above.

       dbname=STRING
              Set database name to store into. Default is linksdb.

       separator=CHAR
              Set SQL command separator character. Default is a semicolon
              (;).

   html
       filename=STRING
              See [text] section above.

       parts=STRING
              See [text] section above.

       encoding=STRING
              See [text] section above.

       colorbackground=COLOR
              Set HTML background color. Default is #fff7e5.

       colorurl=COLOR
              Set HTML URL color. Default is #dcd5cf.

       colorborder=COLOR
              Set HTML border color. Default is #000000.

       colorlink=COLOR
              Set HTML link color. Default is #191c83.

       colorwarning=COLOR
              Set HTML warning color. Default is #e0954e.

       colorerror=COLOR
              Set HTML error color. Default is #db4930.

       colorok=COLOR
              Set HTML valid color. Default is #3ba557.

   failures
       filename=STRING
              See [text] section above.

       encoding=STRING
              See [text] section above.

   xml
       filename=STRING
              See [text] section above.

       parts=STRING
              See [text] section above.

       encoding=STRING
              See [text] section above.

   gxml
       filename=STRING
              See [text] section above.

       parts=STRING
              See [text] section above.

       encoding=STRING
              See [text] section above.

   sitemap
       filename=STRING
              See [text] section above.

       parts=STRING
              See [text] section above.

       encoding=STRING
              See [text] section above.

       priority=FLOAT
              A number between 0.0 and 1.0 determining the priority. The
              default priority for the first URL is 1.0, for all child URLs
              0.5.

       frequency=[always|hourly|daily|weekly|monthly|yearly|never]
              How frequently pages are changing.
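
       A [sitemap] section using both options might look like the sketch
       below; the values are illustrative assumptions.

```ini
[sitemap]
# a number between 0.0 and 1.0
priority=0.8
frequency=weekly
```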

LOGGER PARTS
       all    for all parts

       id     a unique ID for each log entry

       realurl
              the full url link

       result valid or invalid, with messages

       extern 1 or 0, only in some logger types reported

       base   base href=...

       name   <a href=...>name</a> and <img alt="name">

       parenturl
              if any

       info   some additional info, e.g. FTP welcome messages

       warning
              warnings

       dltime download time

       checktime
              check time

       url    the original url name, can be relative

       intro  the blurb at the beginning, "starting at ..."

       outro  the blurb at the end, "found x errors ..."

MULTILINE
       Some option values can span multiple lines. Each line has to be
       indented for that to work. Lines starting with a hash (#) will be
       ignored, though they must still be indented.

          ignore=
            lconline
            bookmark
            # a comment
            ^mailto:

EXAMPLE
          [output]
          log=html

          [checking]
          threads=5

          [filtering]
          ignorewarnings=http-moved-permanent

PLUGINS
       All plugins have a separate section. If the section appears in the
       configuration file the plugin is enabled. Some plugins read extra
       options in their section.

   AnchorCheck
       Checks validity of HTML anchors.

       NOTE:
          The AnchorCheck plugin is currently broken and is disabled.

   LocationInfo
       Adds the country and if possible city name of the URL host as info.
       Needs GeoIP or pygeoip and a local country or city lookup DB
       installed.

   RegexCheck
       Define a regular expression which prints a warning if it matches any
       content of the checked link. This applies only to valid pages, so we
       can get their content.

       warningregex=REGEX
              Use this to check for pages that contain some form of error
              message, for example "This page has moved" or "Oracle
              Application error". REGEX should be unquoted.

              Note that multiple values can be combined in the regular
              expression, for example "(This page has moved|Oracle
              Application error)".
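
       Written out as a configuration section, the combined expression from
       the description above would look like this (unquoted, as required):

```ini
[RegexCheck]
warningregex=(This page has moved|Oracle Application error)
```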

   SslCertificateCheck
       Check SSL certificate expiration date. Only internal https: links
       will be checked. A domain will only be checked once to avoid
       duplicate warnings.

       sslcertwarndays=NUMBER
              Configures the expiration warning time in days.
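
       For example, the plugin could be enabled with a 30-day warning
       period as follows; the number is an arbitrary illustration.

```ini
[SslCertificateCheck]
# hypothetical warning period in days
sslcertwarndays=30
```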

   HtmlSyntaxCheck
       Check the syntax of HTML pages with the online W3C HTML validator.
       See https://validator.w3.org/docs/api.html.

       NOTE:
          The HtmlSyntaxCheck plugin is currently broken and is disabled.

   HttpHeaderInfo
       Print HTTP headers in URL info.

       prefixes=prefix1[,prefix2]...
              Comma-separated list of header prefixes, for example "X-" to
              display all HTTP headers that start with "X-".
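
       A sketch of the corresponding section, where the second prefix is a
       hypothetical addition to the "X-" example above:

```ini
[HttpHeaderInfo]
# show headers whose names start with "X-" or "Server"
prefixes=X-,Server
```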

   CssSyntaxCheck
       Check the syntax of HTML pages with the online W3C CSS validator. See
       https://jigsaw.w3.org/css-validator/manual.html#expert.

   VirusCheck
       Checks the page content for virus infections with clamav. A local
       clamav daemon must be installed.

       clamavconf=filename
              Filename of clamd.conf config file.

   PdfParser
       Parse PDF files for URLs to check. Needs the pdfminer Python package
       installed.

   WordParser
       Parse Word files for URLs to check. Needs the pywin32 Python
       extension installed.

   MarkdownCheck
       Parse Markdown files for URLs to check.

       filename_re=REGEX
              Regular expression matching the names of Markdown files.
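
       Because a plugin is enabled simply by the presence of its section, an
       empty section is enough for plugins without options. The sketch
       below enables PdfParser that way and gives MarkdownCheck a
       hypothetical filename pattern.

```ini
# no options needed; the section header alone enables the plugin
[PdfParser]

[MarkdownCheck]
# hypothetical pattern for files ending in .md or .markdown
filename_re=.*\.(md|markdown)$
```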

WARNINGS
       The following warnings are recognized in the 'ignorewarnings' config
       file entry:

       file-missing-slash
              The file: URL is missing a trailing slash.

       file-system-path
              The file: path is not the same as the system specific path.

       ftp-missing-slash
              The ftp: URL is missing a trailing slash.

       http-cookie-store-error
              An error occurred while storing a cookie.

       http-empty-content
              The URL had no content.

       mail-no-mx-host
              The mail MX host could not be found.

       nntp-no-newsgroup
              The NNTP newsgroup could not be found.

       nntp-no-server
              No NNTP server was found.

       url-content-size-zero
              The URL content size is zero.

       url-content-too-large
              The URL content size is too large.

       url-effective-url
              The effective URL is different from the original.

       url-error-getting-content
              Could not get the content of the URL.

       url-obfuscated-ip
              The IP is obfuscated.

       url-whitespace
              The URL contains leading or trailing whitespace.

SEE ALSO
       linkchecker(1)

AUTHOR
       Bastian Kleineidam <bastian.kleineidam@web.de>

COPYRIGHT
       2000-2016 Bastian Kleineidam, 2010-2021 LinkChecker Authors

10.0.1.post124+ga12fcf04       December 21, 2021              LINKCHECKERRC(5)