IGNORE.LIST(5) User Contributed Perl Documentation IGNORE.LIST(5)

NAME

ignore.list - websec URL monitoring configuration

IGNORE KEYWORDS

When determining which parts of a particular web page have changed, you
may want to skip paragraphs that contain certain predefined words. For
example, pages like InfoWorld, PC Magazine and PC Week often contain the
current date/time regardless of whether there is new or changed content.
In such cases, you can use IGNORE KEYWORDS to skip the paragraphs that
contain date/time information.

Ignore keywords are stored in a file called "ignore.list" in the same
directory as websec. Like the URL list, the ignore keywords are
partitioned into different sections. Each section has a user-defined
name. An example is shown below:

    [General]
    all rights reserved
    an error occurred
    click here
    comments
    copyright

    [Date_Time]
    January\s+\d{1,2}
    February\s+\d{1,2}
    March\s+\d{1,2}
    April\s+\d{1,2}
    May\s+\d{1,2}

In the example above, there are two sections: "General" and "Date_Time".
You can use them in the URL list as follows:

    Ignore = General

You can also use multiple sections at once:

    Ignore = General,Date_Time

If you use certain ignore keywords regularly, you might want to add them
to a defaults section in the URL list.

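The section layout described above can be sketched in a few lines of Python. This is only an illustration of the file format, not websec's own parser (websec is written in Perl); the function names parse_ignore_list and select_sections are hypothetical, and the handling of blank lines and unknown section names is an assumption.

```python
def parse_ignore_list(text):
    """Split ignore.list-style text into {section name: [entries]}."""
    sections = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue  # assumption: blank lines only separate sections
        # Caveat of this sketch: an entry that is itself a bracketed
        # character class such as "[0-9]" would be read as a header.
        if line.startswith("[") and line.endswith("]"):
            current = line[1:-1]
            sections.setdefault(current, [])
        elif current is not None:
            sections[current].append(line)
    return sections


def select_sections(sections, ignore_value):
    """Resolve a value such as "General,Date_Time" to a flat entry list."""
    entries = []
    for name in ignore_value.split(","):
        # assumption: unknown section names are silently skipped
        entries.extend(sections.get(name.strip(), []))
    return entries


SAMPLE = r"""[General]
all rights reserved
click here

[Date_Time]
January\s+\d{1,2}
"""

sections = parse_ignore_list(SAMPLE)
keywords = select_sections(sections, "General,Date_Time")
print(keywords)
```

Here select_sections mirrors what a line such as "Ignore = General,Date_Time" asks for: the entries of both sections, in the order the sections are named.
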
Ignore keywords can contain regular expressions. For example, the ignore
keyword "January\s+\d{1,2}" tells websec to look for the string
"January", followed by one or more whitespace characters, followed by
one or two digits.

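To see how such a keyword behaves, the following Python sketch drops every paragraph that matches at least one ignore keyword. It is not websec's implementation: the function name paragraph_is_ignored is hypothetical, and case-insensitive matching is an assumption made for this example. The simple patterns used here behave the same way in Perl and in Python.

```python
import re


def paragraph_is_ignored(paragraph, keywords, ignore_case=True):
    """Return True if any ignore keyword matches somewhere in the paragraph."""
    # assumption: matching ignores case; websec's behaviour is not stated here
    flags = re.IGNORECASE if ignore_case else 0
    return any(re.search(keyword, paragraph, flags) for keyword in keywords)


keywords = [r"January\s+\d{1,2}", r"all rights reserved"]
paragraphs = [
    "Last updated January  7 at noon.",
    "New firmware released for the router.",
    "Copyright 2003. All Rights Reserved.",
]

kept = [p for p in paragraphs if not paragraph_is_ignored(p, keywords)]
print(kept)
```

Because the pattern is not anchored, it also matches inside longer text: "January 2003" is caught because "January 20" satisfies the expression.
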
Two sections of ignore keywords are supplied in this distribution.
"General" contains some general ignore keywords which you may want to
use. "Date_Time" contains date/time detectors coded using regular
expressions. Feel free to add your own!

IGNORE URLS

Most advertisements in web pages are of the following form:

    <A HREF="http://page.url.com/advert/cgi-bin/" ...>
    <IMG SRC="advert.animated.gif" ...>
    Click here for free beer!
    </A>

Such advertisements can be ignored when running webdiff by using ignore
URLs.

Ignore URLs are also stored in "ignore.list". Each one contains all or
part of the URL referred to by the <A HREF> tag which you want to
ignore. An example is shown below:

    [Adverts]
    page.url.com/advert/cgi-bin/

Use the "Adverts" section in the URL list as follows:

    IgnoreURL = Adverts

You can also use multiple sections at once:

    IgnoreURL = Adverts1,Adverts2

If you use certain ignore URLs regularly, you might want to add them to
a defaults section in the URL list.

Like ignore keywords, ignore URLs can contain regular expressions.

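As a rough illustration of ignore URLs treated as regular expressions, the Python sketch below tests link targets against a list of patterns. The function name link_is_ignored is hypothetical, and how websec extracts and compares the <A HREF> value is not specified here, so the plain substring-style search is an assumption.

```python
import re


def link_is_ignored(href, ignore_urls):
    """Return True if any ignore URL pattern matches part of the link target."""
    return any(re.search(pattern, href) for pattern in ignore_urls)


ignore_urls = [r"page.url.com/advert/cgi-bin/"]
links = [
    "http://page.url.com/advert/cgi-bin/",
    "http://page.url.com/news/today.html",
]

kept = [href for href in links if not link_is_ignored(href, ignore_urls)]
print(kept)

# An unescaped "." matches any character, so this entry also matches
# "pageXurl.com"; write "\." when a literal dot is required.
loose = link_is_ignored("http://pageXurl.com/advert/cgi-bin/", ignore_urls)
strict = link_is_ignored(
    "http://pageXurl.com/advert/cgi-bin/",
    [r"page\.url\.com/advert/cgi-bin/"],
)
print(loose, strict)
```

In practice the unescaped form is usually harmless for advertisement hosts, but escaping the dots avoids accidental matches.
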
An "Adverts" section is supplied in this distribution. Feel free to add
your own!

SEE ALSO

url.list(5)

AUTHOR

Baruch Even <websec@ev-en.org> is maintaining this program.

perl v5.10.0 2003-05-31 IGNORE.LIST(5)