URLWATCH(1) User Commands URLWATCH(1)

NAME
urlwatch - Watch web pages and arbitrary URLs for changes

SYNOPSIS
urlwatch [options]

DESCRIPTION
urlwatch watches a list of URLs for changes and prints unified diffs
of those changes. You can filter out always-changing parts of websites
by providing a "hooks.py" script.

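As a rough illustration of the kind of output described above, the following sketch compares a previously saved copy of a page with a freshly fetched one using Python's standard difflib module. It is not urlwatch's own code; the function name diff_page and the sample strings are made up for this example.

```python
import difflib


def diff_page(old_text, new_text, url):
    """Return a unified diff between two versions of a page as one string."""
    old_lines = old_text.splitlines()
    new_lines = new_text.splitlines()
    diff = difflib.unified_diff(
        old_lines,
        new_lines,
        fromfile=url + " (old)",
        tofile=url + " (new)",
        lineterm="",  # lines were split without newlines, so add none back
    )
    return "\n".join(diff)


if __name__ == "__main__":
    old = "Welcome\nPrice: 10 EUR\n"
    new = "Welcome\nPrice: 12 EUR\n"
    print(diff_page(old, new, "http://example.org/"))
```

Two identical versions produce an empty string, which is a convenient way to decide that nothing needs to be reported.
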
OPTIONS
--version
Show program's version number and exit

-h, --help
Show the help message and exit

-v, --verbose
Show debug/log output

--urls=FILE
Read URLs from the specified file

--hooks=FILE
Use the specified file as hooks.py module

-e, --display-errors
Include HTTP errors (404, etc.) in the output

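When urlwatch is started from another script, for example a scheduled job, the options above can be combined on one command line. The sketch below assembles such an argument list using only the flags listed in this section. The helper name build_command and the file names are hypothetical.

```python
def build_command(urls_file=None, hooks_file=None, verbose=False,
                  display_errors=False):
    """Assemble an urlwatch argument list from the documented options."""
    command = ["urlwatch"]
    if urls_file:
        command.append("--urls=" + urls_file)
    if hooks_file:
        command.append("--hooks=" + hooks_file)
    if verbose:
        command.append("--verbose")
    if display_errors:
        command.append("--display-errors")
    return command


if __name__ == "__main__":
    # Hypothetical file name, shown only to illustrate the result.
    print(build_command(urls_file="my-urls.txt", display_errors=True))
```

The resulting list can be handed to subprocess.call, assuming the urlwatch program is installed and on the PATH.
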
ADVANCED FEATURES
urlwatch includes some advanced features that you activate by creating
a hooks.py file that specifies which URLs should use a specific
feature. You can also use the hooks.py file to filter out trivially
varying elements of a web page.

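A filter for a trivially varying element might look like the following sketch. It assumes a hypothetical page at http://example.org/status that carries a "Generated at ..." line which changes on every request, and it treats the page data as a text string. Neither the URL nor the pattern comes from urlwatch itself.

```python
import re

# Hypothetical pattern: a line such as "Generated at 2010-07-01 12:00:00"
TIMESTAMP_LINE = re.compile(r"^Generated at .*$", re.MULTILINE)


def filter(url, data):
    # Only touch the (hypothetical) page that carries the changing timestamp.
    if url.startswith("http://example.org/status"):
        return TIMESTAMP_LINE.sub("", data)
    # Leave every other page untouched.
    return data
```

With the timestamp line blanked out, two fetches that differ only in that line yield the same filtered text, so no change would be reported for them.
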
ICALENDAR FILE PARSING
The ical2txt module parses .ics files in iCalendar format and produces
a very simplified text-based representation for the diffs. Use it like
this in your hooks.py file:

from urlwatch import ical2txt

def filter(url, data):
    if url.endswith('.ics'):
        # Prepend a simplified text rendering to the original data
        return ical2txt.ical2text(data).encode('utf-8') + data
    # ...you can add more hooks here...
    return data

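To give an idea of what a very simplified text-based format could look like, the sketch below pulls the start time and summary out of each VEVENT block. This is not the ical2txt module. It is a hypothetical standalone function, events_to_text, that ignores line folding, time zones, and most of the iCalendar format.

```python
def events_to_text(ics_text):
    """Return one 'DTSTART: SUMMARY' line per VEVENT found in ics_text."""
    lines = []
    event = None  # dict of properties while inside a VEVENT, else None
    for raw in ics_text.splitlines():
        line = raw.strip()
        if line == "BEGIN:VEVENT":
            event = {}
        elif line == "END:VEVENT" and event is not None:
            lines.append("%s: %s" % (event.get("DTSTART", "?"),
                                     event.get("SUMMARY", "?")))
            event = None
        elif event is not None and ":" in line:
            name, value = line.split(":", 1)
            # Drop parameters such as ";TZID=Europe/Vienna" from the name.
            event[name.split(";", 1)[0]] = value
    return "\n".join(lines)
```

One line per event keeps the diff short: a rescheduled or renamed event shows up as a single changed line instead of a block of raw calendar properties.
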
HTML TO TEXT CONVERSION
There are three methods of converting HTML to text in the current
version of urlwatch: "lynx" (default), "html2text" and "re". The first
two use command-line utilities of the same name to convert HTML to
text, and the last one uses a simple regex-based tag stripping method
(needs no extra tools). Here is an example of using the html2txt module
in your hooks.py file:

from urlwatch import html2txt

def filter(url, data):
    if url.endswith('.html') or url.endswith('.htm'):
        # method can be 'lynx' (default), 'html2text' or 're'
        return html2txt.html2text(data, method='lynx')
    # ...you can add more hooks here...
    return data

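The "re" method is described above as simple regex-based tag stripping. A minimal sketch of that idea, not the html2txt code itself, could look like this. The name strip_tags is hypothetical, and the pattern makes no attempt to handle scripts, comments, or character entities.

```python
import re

# Anything between "<" and the next ">" is treated as a tag.
TAG = re.compile(r"</?[^>]*>")


def strip_tags(html):
    """Remove anything that looks like an HTML tag and drop empty lines."""
    text = TAG.sub("", html)
    stripped = (line.strip() for line in text.splitlines())
    return "\n".join(line for line in stripped if line)


if __name__ == "__main__":
    print(strip_tags("<h1>News</h1>\n<p>Hello <b>world</b></p>"))
```

An approach like this needs no external programs, at the cost of cruder output than a real HTML renderer such as lynx would give.
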
FILES
~/.urlwatch/urls.txt
A list of HTTP/FTP URLs to watch (one URL per line)

~/.urlwatch/lib/hooks.py
A Python module that can be used to filter contents

~/.urlwatch/cache/
The state of web pages is saved in this folder

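The URL list is plain text with one URL per line. The sketch below shows one way such a file might be read. Skipping blank lines and lines starting with "#" is an assumption of this example and is not stated above. The helper name read_urls is hypothetical.

```python
import os


def read_urls(path):
    """Return the URLs listed in the file at path, one per line."""
    urls = []
    with open(os.path.expanduser(path)) as handle:
        for line in handle:
            line = line.strip()
            # Assumption: blank lines and '#' comment lines are ignored.
            if line and not line.startswith("#"):
                urls.append(line)
    return urls
```

Calling read_urls("~/.urlwatch/urls.txt") would then return the list of watched URLs, provided the file exists at that location.
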
AUTHOR
Thomas Perl <thp@thpinfo.com>

WEBSITE
http://thpinfo.com/2008/urlwatch/

urlwatch 1.11 July 2010 URLWATCH(1)