IGNORE.LIST(5)        User Contributed Perl Documentation       IGNORE.LIST(5)

NAME

       ignore.list - websec url monitoring configuration

DESCRIPTION

   IGNORE KEYWORDS
       When determining which parts of a particular web page have changed,
       you may want to skip paragraphs that contain certain predefined
       words. For example, pages like InfoWorld, PC Magazine and PC Week often
       contain the current date/time regardless of whether there is new or
       changed content. In such cases, you can use IGNORE KEYWORDS to skip
       those paragraphs that contain date/time information.

       Ignore keywords are stored in a file called "ignore.list" in the same
       directory as websec. Like the URL list, the ignore keywords are
       partitioned into different sections. Each section has a user-defined
       name. An example is shown below:

               [General]
               all rights reserved
               an error occurred
               click here
               comments
               copyright

               [Date_Time]
               January\s+\d{1,2}
               February\s+\d{1,2}
               March\s+\d{1,2}
               April\s+\d{1,2}
               May\s+\d{1,2}

       In the example above, there are two sections: "General" and
       "Date_Time".  You can use a section in the URL list as follows:

           Ignore = General

       You can also use multiple sections at once:

           Ignore = General,Date_Time

       If you use certain ignore keywords regularly, you might want to add
       them to a defaults section in the URL list.
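
       The section layout described above can be sketched in a few lines of
       Python. This is only an illustration, not websec's own (Perl) code: the
       function names parse_ignore_list and entries_for are hypothetical, and
       the parsing rules (a "[Name]" line opens a section, every other
       non-blank line is an entry) are assumptions read off the example.

```python
import re

# Assumption: a section header is a line consisting solely of "[Name]".
SECTION_RE = re.compile(r"^\[(.+)\]$")


def parse_ignore_list(text):
    """Split ignore.list-style text into {section name: [entries]}."""
    sections = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue  # blank lines only separate sections in the example
        header = SECTION_RE.match(line)
        if header:
            current = header.group(1)
            sections.setdefault(current, [])
        elif current is not None:
            sections[current].append(line)
    return sections


def entries_for(sections, ignore_value):
    """Resolve a value such as "General,Date_Time" to one flat entry list."""
    names = [name.strip() for name in ignore_value.split(",") if name.strip()]
    return [entry for name in names for entry in sections[name]]


# A shortened copy of the example sections (raw string keeps the backslashes).
SAMPLE = r"""
[General]
all rights reserved
click here

[Date_Time]
January\s+\d{1,2}
"""

print(entries_for(parse_ignore_list(SAMPLE), "General,Date_Time"))
```

       Note that under these assumed rules an entry that consists only of a
       bracketed character class, such as "[0-9]", would be mistaken for a
       section header; how websec treats such a line is not stated here.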

       Ignore keywords can contain regular expressions. For example, the
       ignore keyword "January\s+\d{1,2}" tells websec to look for the string
       "January", followed by one or more whitespace characters, followed by
       one or two digits.
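
       To see what such a pattern matches, the following sketch applies it to
       a few sample paragraphs using Python's re module, whose syntax agrees
       with Perl's for this particular pattern. The sample paragraphs and the
       helper has_ignore_keyword are made up for illustration, and whether
       websec matches case-sensitively is not stated here, so the sketch
       simply uses the default behaviour.

```python
import re

# The ignore keyword from the example, compiled as a regular expression.
PATTERN = re.compile(r"January\s+\d{1,2}")


def has_ignore_keyword(paragraph, patterns):
    """Return True if any pattern occurs anywhere in the paragraph."""
    return any(p.search(paragraph) for p in patterns)


# Hypothetical paragraphs extracted from a monitored page.
paragraphs = [
    "Last updated January 15 at noon.",
    "A new review has been posted.",
    "January   7 edition",
]

# Keep only the paragraphs that contain no ignore keyword.
kept = [p for p in paragraphs if not has_ignore_keyword(p, [PATTERN])]
print(kept)
```

       Because the pattern is not anchored, it also matches inside longer
       text: "January 2024" matches, since "\d{1,2}" is satisfied by the
       first two digits of the year.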

       Two sections of ignore keywords are supplied in this distribution.
       "General" contains some general ignore keywords which you may want to
       use. "Date_Time" contains date/time detectors coded using regular
       expressions. Feel free to add your own!

   IGNORE URLS
       Most advertisements in web pages are of the following form:

               <A HREF="http://page.url.com/advert/cgi-bin/" ...>
               <IMG SRC="advert.animated.gif" ...>
               Click here for free beer!
               </A>

       Such advertisements can be ignored when running webdiff by using
       ignore URLs.

       Ignore URLs are also stored in "ignore.list". They contain all or part
       of the URL referred to by the <A HREF> tag which you want to ignore. An
       example is shown below:

               [Adverts]
               page.url.com/advert/cgi-bin/

       Use the "Adverts" section in the URL list as follows:

           IgnoreURL = Adverts

       You can also use multiple sections at once:

           IgnoreURL = Adverts1,Adverts2

       If you use certain ignore URLs regularly, you might want to add them to
       a defaults section in the URL list.

       Like ignore keywords, ignore URLs can contain regular expressions.

       An "Adverts" section is supplied in this distribution. Feel free to add
       your own!
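
       As a rough illustration of ignore URLs, the sketch below tests link
       targets against the "Adverts" entry from the example. It is written in
       Python rather than Perl, the helper is_ignored_link is hypothetical,
       and it only decides whether a link target matches; what webdiff does
       with a matching link is not covered by this sketch.

```python
import re

# The entry from the [Adverts] example, treated as a regular expression.
IGNORE_URLS = [re.compile(r"page.url.com/advert/cgi-bin/")]


def is_ignored_link(href, patterns=IGNORE_URLS):
    """Return True if any ignore URL pattern occurs in the link target."""
    return any(p.search(href) for p in patterns)


print(is_ignored_link("http://page.url.com/advert/cgi-bin/"))
print(is_ignored_link("http://page.url.com/news/"))
```

       Since the entry is a regular expression, an unescaped "." matches any
       character, so the pattern above would also match a host such as
       "pageXurl.com". Writing "page\.url\.com" restricts the match to
       literal dots.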

SEE ALSO

       url.list(5)

AUTHOR

       Baruch Even <websec@ev-en.org> is maintaining this program.

perl v5.30.0                      2019-07-27                    IGNORE.LIST(5)