linkcheckerrc(5) File Formats Manual linkcheckerrc(5)

NAME
       linkcheckerrc - configuration file for LinkChecker

DESCRIPTION
       linkcheckerrc is the default configuration file for LinkChecker. The
       file is written in an INI-style format.

PROPERTIES
   [checking]
       threads=NUMBER
              Generate no more than the given number of threads. Default
              number of threads is 10. To disable threading specify a
              non-positive number.
              Command line option: --threads

       timeout=NUMBER
              Set the timeout for connection attempts in seconds. The
              default timeout is 60 seconds.
              Command line option: --timeout

       anchors=[0|1]
              Check HTTP anchor references. Default is not to check anchors.
              This option enables logging of the warning
              url-anchor-not-found.
              Command line option: --anchors

       recursionlevel=NUMBER
              Check recursively all links up to the given depth. A negative
              depth will enable infinite recursion. Default depth is
              infinite.
              Command line option: --recursion-level

       warningregex=REGEX
              Define a regular expression which prints a warning if it
              matches any content of the checked link. This applies only to
              valid pages, since only their content can be retrieved.
              Use this to check for pages that contain some form of error,
              for example "This page has moved" or "Oracle Application
              Server error".
              Command line option: --warning-regex

       warnsizebytes=NUMBER
              Print a warning if content size info is available and exceeds
              the given number of bytes.
              Command line option: --warning-size-bytes

       nntpserver=STRING
              Specify an NNTP server for news: links. Default is the
              environment variable NNTP_SERVER. If no host is given, only
              the syntax of the link is checked.
              Command line option: --nntp-server

       checkhtml=[0|1]
              Check syntax of HTML URLs with local library (HTML tidy).
              Command line option: --check-html

       checkhtmlw3=[0|1]
              Check syntax of HTML URLs with W3C online validator.
              Command line option: --check-html-w3

       checkcss=[0|1]
              Check syntax of CSS URLs with local library (cssutils).
              Command line option: --check-css

       checkcssw3=[0|1]
              Check syntax of CSS URLs with W3C online validator.
              Command line option: --check-css-w3

       scanvirus=[0|1]
              Scan content of URLs for viruses with ClamAV.
              Command line option: --scan-virus

       clamavconf=filename
              Filename of clamd.conf config file.
              Command line option: none

       cookies=[0|1]
              Accept and send HTTP cookies.
              Command line option: --cookies
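
       As an illustration, several of the [checking] options above could be
       combined as in the following sketch. All values are arbitrary
       examples chosen for this illustration, not recommended settings.

       ```ini
       [checking]
       # use at most 5 threads and a 30 second connection timeout
       threads=5
       timeout=30
       # follow links at most 2 levels deep and check anchors
       recursionlevel=2
       anchors=1
       # warn about pages that contain an error banner or exceed 1 MB
       warningregex=This page has moved|Oracle Application Server error
       warnsizebytes=1048576
       ```

       Options that are left out keep the defaults described above, for
       example 10 threads and a 60 second timeout.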

   [filtering]
       ignore=REGEX (MULTILINE)
              Only check syntax of URLs matching the given regular
              expressions.
              Command line option: --ignore-url

       nofollow=REGEX (MULTILINE)
              Check but do not recurse into URLs matching the given regular
              expressions.
              Command line option: --no-follow-url

       ignorewarnings=NAME[,NAME...]
              Ignore the comma-separated list of warnings. See linkchecker
              -h for the list of recognized warnings.
              Command line option: none

       internlinks=REGEX
              Regular expression to add more URLs recognized as internal
              links. Default is that URLs given on the command line are
              internal.
              Command line option: none
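
       A possible [filtering] section is sketched below. The host
       example.org and the /archive/ path are hypothetical and only serve
       as placeholders; the warning name is the one mentioned for the
       anchors option.

       ```ini
       [filtering]
       # only check the syntax of mailto: and javascript: links
       ignore=
         ^mailto:
         ^javascript:
       # check, but do not recurse into, URLs containing /archive/
       nofollow=
         /archive/
       # do not report missing anchors
       ignorewarnings=url-anchor-not-found
       # treat both host names as internal (hypothetical host)
       internlinks=^https?://(www\.)?example\.org/
       ```

       The values of ignore and nofollow are written one regular
       expression per indented line, as both options are marked
       MULTILINE.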

   [authentication]
       entry=REGEX USER [PASS] (MULTILINE)
              Provide different user/password pairs for different link
              types. Entries are a triple (URL regex, username, password)
              or a tuple (URL regex, username), where the fields are
              separated by whitespace.
              The password is optional and if missing it has to be entered
              at the command line.
              If the regular expression matches the checked URL, the given
              user/password pair is used for authentication. The command
              line options -u and -p match every link and therefore
              override the entries given here. The first match wins. At the
              moment, authentication is used/needed for http[s] and ftp
              links.
              Command line option: -u, -p

       loginurl=URL
              A login URL to be visited before checking. Authentication
              data must also be set for it, and it implies using cookies
              because most logins use cookies nowadays.

       loginuserfield=STRING
              The name of the user CGI field. Default name is login.

       loginpasswordfield=STRING
              The name of the password CGI field. Default name is password.

       loginextrafields=NAME:VALUE (MULTILINE)
              Optionally any additional CGI name/value pairs. Note that the
              default values are submitted automatically.
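
       The following sketch shows how the entries above might fit
       together. The hosts, the user names, the password and the form
       field names are all made up for this example.

       ```ini
       [authentication]
       # URL regex, user name and optional password, separated by whitespace
       entry=
         ^https?://intranet\.example\.org/ alice secret
         ^ftp://ftp\.example\.org/ bob
       # login form to visit before checking; the first entry matches it
       loginurl=https://intranet.example.org/login
       loginuserfield=username
       loginpasswordfield=passwd
       # additional form field submitted with the login (hypothetical)
       loginextrafields=
         remember:1
       ```

       The second entry has no password, so according to the description
       above it would have to be entered at the command line.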

   [output]
       debug=STRING[,STRING...]
              Print debugging output for the given loggers. Available
              loggers are cmdline, checking, cache, gui, dns, thread and
              all. Specifying all is an alias for specifying all available
              loggers.
              Command line option: --debug

       status=[0|1]
              Control printing check status messages. Default is 1.
              Command line option: --no-status

       log=TYPE[/ENCODING]
              Specify output type as text, html, sql, csv, gml, dot, xml,
              none or blacklist. Default type is text. The various output
              types are documented below.
              The ENCODING specifies the output encoding, the default is
              that of your locale. Valid encodings are listed at
              http://docs.python.org/library/codecs.html#standard-encodings.
              Command line option: --output

       verbose=[0|1]
              If set, log all checked URLs once. Default is to log only
              errors and warnings.
              Command line option: --verbose

       complete=[0|1]
              If set, log all checked URLs, even duplicates. Default is to
              log duplicate URLs only once.
              Command line option: --complete

       warnings=[0|1]
              If set, log warnings. Default is to log warnings.
              Command line option: --no-warnings

       quiet=[0|1]
              If set, operate quietly. An alias for log=none. This is only
              useful with fileoutput.
              Command line option: --quiet

       fileoutput=TYPE[,TYPE...]
              Output to a file linkchecker-out.TYPE, or to
              $HOME/.linkchecker/blacklist for blacklist output.
              Valid file output types are text, html, sql, csv, gml, dot,
              xml, none or blacklist. Default is no file output. The
              various output types are documented below. Note that you can
              suppress all console output with log=none.
              Command line option: --file-output
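
       A minimal sketch of an [output] section, assuming a setup that
       wants UTF-8 text on the console plus HTML and CSV report files:

       ```ini
       [output]
       # text output on the console, encoded as UTF-8
       log=text/utf-8
       # log every checked URL once, not only errors and warnings
       verbose=1
       # do not print check status messages
       status=0
       # additionally write linkchecker-out.html and linkchecker-out.csv
       fileoutput=html,csv
       ```

       Replacing the log line with log=none would leave only the file
       output, which is the case the quiet option is described for.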

   [text]
       filename=STRING
              Specify output filename for text logging. Default filename is
              linkchecker-out.txt.
              Command line option: --file-output=

       parts=STRING
              Comma-separated list of parts that have to be logged. See
              LOGGER PARTS below.
              Command line option: none

       encoding=STRING
              Valid encodings are listed at
              http://docs.python.org/library/codecs.html#standard-encodings.
              Default encoding is iso-8859-15.

       color* Color settings for the various log parts, syntax is color or
              type;color. The type can be bold, light, blink, invert. The
              color can be default, black, red, green, yellow, blue,
              purple, cyan, white, Black, Red, Green, Yellow, Blue, Purple,
              Cyan or White.
              Command line option: none

       colorparent=STRING
              Set parent color. Default is white.

       colorurl=STRING
              Set URL color. Default is default.

       colorname=STRING
              Set name color. Default is default.

       colorreal=STRING
              Set real URL color. Default is cyan.

       colorbase=STRING
              Set base URL color. Default is purple.

       colorvalid=STRING
              Set valid color. Default is bold;green.

       colorinvalid=STRING
              Set invalid color. Default is bold;red.

       colorinfo=STRING
              Set info color. Default is default.

       colorwarning=STRING
              Set warning color. Default is bold;yellow.

       colordltime=STRING
              Set download time color. Default is default.

       colorreset=STRING
              Set reset color. Default is default.
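
       For illustration, a [text] section that changes the file name, the
       encoding and a few colors might look like the sketch below. The
       file name and the color choices are arbitrary.

       ```ini
       [text]
       # hypothetical report file name
       filename=linkcheck-report.txt
       encoding=utf-8
       # a plain color, and two values in type;color form
       colorparent=cyan
       colorvalid=light;green
       colorwarning=bold;purple
       ```

       Color options that are not listed keep the defaults given above.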

   [gml]
       filename=STRING
              See [text] section above.

       parts=STRING
              See [text] section above.

       encoding=STRING
              See [text] section above.

   [dot]
       filename=STRING
              See [text] section above.

       parts=STRING
              See [text] section above.

       encoding=STRING
              See [text] section above.

   [csv]
       filename=STRING
              See [text] section above.

       parts=STRING
              See [text] section above.

       encoding=STRING
              See [text] section above.

       separator=CHAR
              Set CSV separator. Default is a comma (,).

       quotechar=CHAR
              Set CSV quote character. Default is a double quote (").
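
       As a sketch, a [csv] section that switches to a different separator
       and quote character could be written as follows. The file name and
       both characters are example values only.

       ```ini
       [csv]
       # hypothetical report file name
       filename=linkcheck-report.csv
       encoding=utf-8
       # separate fields with a vertical bar and quote them with '
       separator=|
       quotechar='
       ```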

   [sql]
       filename=STRING
              See [text] section above.

       parts=STRING
              See [text] section above.

       encoding=STRING
              See [text] section above.

       dbname=STRING
              Set database name to store into. Default is linksdb.

       separator=CHAR
              Set SQL command separator character. Default is a semicolon
              (;).
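
       A short sketch of a [sql] section follows. The file name and the
       database name sitecheck are hypothetical.

       ```ini
       [sql]
       # hypothetical output file and database name
       filename=linkcheck-report.sql
       dbname=sitecheck
       encoding=utf-8
       ```

       The separator option is left out here, so the documented default
       applies.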

   [html]
       filename=STRING
              See [text] section above.

       parts=STRING
              See [text] section above.

       encoding=STRING
              See [text] section above.

       colorbackground=COLOR
              Set HTML background color. Default is #fff7e5.

       colorurl=COLOR
              Set HTML URL color. Default is #dcd5cf.

       colorborder=COLOR
              Set HTML border color. Default is #000000.

       colorlink=COLOR
              Set HTML link color. Default is #191c83.

       colorwarning=COLOR
              Set HTML warning color. Default is #e0954e.

       colorerror=COLOR
              Set HTML error color. Default is #db4930.

       colorok=COLOR
              Set HTML valid color. Default is #3ba557.
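
       The HTML colors are given in the same #rrggbb form as the defaults
       above. A sketch with arbitrarily chosen colors and a hypothetical
       file name:

       ```ini
       [html]
       filename=linkcheck-report.html
       # white background, plain red and green for errors and valid links
       colorbackground=#ffffff
       colorerror=#cc0000
       colorok=#008000
       ```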

   [blacklist]
       filename=STRING
              See [text] section above.

       encoding=STRING
              See [text] section above.

   [xml]
       filename=STRING
              See [text] section above.

       parts=STRING
              See [text] section above.

       encoding=STRING
              See [text] section above.

   [gxml]
       filename=STRING
              See [text] section above.

       parts=STRING
              See [text] section above.

       encoding=STRING
              See [text] section above.

LOGGER PARTS
       all         (for all parts)
       id          (a unique ID for each log entry)
       realurl     (the full url link)
       result      (valid or invalid, with messages)
       extern      (1 or 0, only in some logger types reported)
       base        (base href=...)
       name        (<a href=...>name</a> and <img alt="name">)
       parenturl   (if any)
       info        (some additional info, e.g. FTP welcome messages)
       warning     (warnings)
       dltime      (download time)
       checktime   (check time)
       url         (the original url name, can be relative)
       intro       (the blurb at the beginning, "starting at ...")
       outro       (the blurb at the end, "found x errors ...")
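
       The part names above are the values accepted by the parts option of
       the logger sections. As a sketch, a CSV logger restricted to a
       handful of parts (the selection itself is arbitrary) could be
       configured like this:

       ```ini
       [csv]
       # log only these parts, given as a comma-separated list
       parts=url,parenturl,result,warning,dltime
       ```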

MULTILINE
       Some option values can span multiple lines. Each line has to be
       indented for that to work. Lines starting with a hash (#) will be
       ignored, though they must still be indented.

       ignore=
         lconline
         bookmark
         # a comment
         ^mailto:

EXAMPLE
       [output]
       log=html

       [checking]
       threads=5

       [filtering]
       ignorewarnings=url-anchor-not-found

SEE ALSO
       linkchecker(1)

AUTHOR
       Bastian Kleineidam <calvin@users.sourceforge.net>

COPYRIGHT
       Copyright © 2000-2011 Bastian Kleineidam

LinkChecker 2007-11-30 linkcheckerrc(5)