URLWATCH(1)                      User Commands                     URLWATCH(1)

NAME

       urlwatch - Watch web pages and arbitrary URLs for changes

SYNOPSIS

       urlwatch [options]

DESCRIPTION

       urlwatch  watches  a  list  of  URLs for changes and prints out unified
       diffs of the changes. You can filter always-changing parts of  websites
       by providing a "hooks.py" script.

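       To illustrate the output format, here is a minimal sketch that builds
       a unified diff with Python's standard difflib module. It is not taken
       from urlwatch itself; the two page snapshots and the "old"/"new"
       labels are made up for the example.

```python
import difflib

# Two hypothetical snapshots of the same page, split into lines.
old_lines = ['Opening hours: 9-17', 'Phone: 555-0100']
new_lines = ['Opening hours: 9-18', 'Phone: 555-0100']

# lineterm='' keeps difflib from appending newlines to the header lines.
diff_lines = list(difflib.unified_diff(
    old_lines, new_lines, fromfile='old', tofile='new', lineterm=''))

print('\n'.join(diff_lines))
```

       Removed lines are prefixed with "-", added lines with "+", and
       unchanged context lines with a single space.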

OPTIONS

       --version
              Show the program's version number and exit

       -h, --help
              Show the help message and exit

       -v, --verbose
              Show debug/log output

       --urls=FILE
              Read URLs from the specified file

       --hooks=FILE
              Use the specified file as the hooks.py module

       -e, --display-errors
              Include HTTP errors (404, etc.) in the output

ADVANCED FEATURES

       urlwatch  includes  some advanced features that you have to activate by
       creating a hooks.py file that specifies for which URLs to  use  a  spe‐
       cific  feature. You can also use the hooks.py file to filter trivially-
       varying elements of a web page.

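       As a rough sketch of filtering a trivially-varying element, the
       hooks.py below removes a timestamp from one page. It uses the same
       filter(url, data) signature as the examples in this manual page; the
       URL, the "Generated at" pattern, and the choice to return all other
       pages unchanged are assumptions made for illustration, and data is
       assumed to arrive as a text string.

```python
import re

# Hypothetical URL and pattern, chosen only for this example.
STATUS_URL = 'http://example.org/status.html'
TIMESTAMP = re.compile(r'Generated at [^<]*')

def filter(url, data):
    if url == STATUS_URL:
        # Blank out the timestamp so it never shows up as a change.
        return TIMESTAMP.sub('', data)
    # Assumption: every other page is passed through unchanged.
    return data
```
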
   ICALENDAR FILE PARSING
       This module allows you to parse .ics files that are in iCalendar format
       and provides a very simplified text-based format for the diffs. Use it
       like this in your hooks.py file:

         from urlwatch import ical2txt

         def filter(url, data):
             if url.endswith('.ics'):
                 return ical2txt.ical2text(data).encode('utf-8') + data
             # ...you can add more hooks here...

   HTML TO TEXT CONVERSION
       There are three methods of converting HTML to text in the current
       version of urlwatch: "lynx" (default), "html2text" and "re". The first
       two use command-line utilities of the same name to convert HTML to
       text, and the last one uses a simple regex-based tag stripping method
       (needs no extra tools). Here is an example of selecting a method in
       your hooks.py file:

         from urlwatch import html2txt

         def filter(url, data):
             if url.endswith('.html') or url.endswith('.htm'):
                 return html2txt.html2text(data, method='lynx')
             # ...you can add more hooks here...

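       To give an idea of what regex-based tag stripping means, here is a
       minimal standalone sketch. It is not the code behind the "re" method,
       whose details are not described here; the strip_tags name and the
       pattern are hypothetical.

```python
import re

# Matches anything from '<' up to the next '>'.
TAG_PATTERN = re.compile(r'<[^>]*>')

def strip_tags(html):
    # Remove everything that looks like a tag, then trim the
    # whitespace left at both ends of the text.
    return TAG_PATTERN.sub('', html).strip()

print(strip_tags('<p>Hello <b>world</b></p>'))
```

       A pattern this simple does not decode entities and leaves the
       contents of script and style elements in place, which is the
       trade-off for needing no external tools.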

FILES

       ~/.urlwatch/urls.txt
              A list of HTTP/FTP URLs to watch (one URL per line)

       ~/.urlwatch/lib/hooks.py
              A Python module that can be used to filter contents

       ~/.urlwatch/cache/
              The state of web pages is saved in this folder

AUTHOR

       Thomas Perl <thp@thpinfo.com>

WEBSITE

       http://thpinfo.com/2008/urlwatch/


urlwatch 1.11                      July 2010                       URLWATCH(1)