1ARCHIVEMAIL(1) archivemail user manual ARCHIVEMAIL(1)
2
3
4
6 archivemail - archive and compress your old email
7
9 archivemail [options] {MAILBOX...}
10
12 archivemail is a tool for archiving and compressing old email in
13 mailboxes. By default it will read the mailbox MAILBOX, moving messages
14 that are older than the specified number of days (180 by default) to an
15 mbox(5)-format mailbox in the same directory that is compressed with
16 gzip(1). It can also just delete old email rather than archive it.
17
18 By default, archivemail derives the archive filename from the mailbox
19 name by appending an _archive suffix to the mailbox name. For example,
20 if you run archivemail on a mailbox called exsouthrock, the archive
21 will be created with the filename exsouthrock_archive.gz. This default
22 behavior can be overridden with command line options, choosing a custom
23 suffix, a prefix, or a completely custom name for the archive.
24
25 archivemail supports reading IMAP, Maildir, MH and mbox-format
26 mailboxes, but always writes mbox-format archives.
27
28 Messages that are flagged important are not archived or deleted unless
29 explicitly requested with the --include-flagged option. Also,
30 archivemail can be configured not to archive unread mail, or to only
31 archive messages larger than a specified size.
32
33 To archive an IMAP-format mailbox, use the format
34 imap://username:password@server/mailbox to specify the mailbox.
35 archivemail will expand wildcards in IMAP mailbox names according to
36 [RFC 3501], which says: “The character "*" is a wildcard, and matches
37 zero or more characters at this position. The character "%" is similar
38 to "*", but it does not match a hierarchy delimiter.” You can omit the
39 password from the URL; use the --pwfile option to make archivemail read
40 the password from a file, or alternatively just enter it upon request.
41 If the --pwfile option is set, archivemail does not look for a password
42 in the URL, and the colon is not considered a delimiter. Substitute
43 imap with imaps, and archivemail will establish a secure SSL
44 connection. See below for more IMAP peculiarities.
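The wildcard rules quoted above can be illustrated with a small Python sketch. This is not archivemail's own code (the IMAP server performs the real matching); the helper name imap_wildcard_match and its explicit delimiter argument are assumptions made purely for illustration.

```python
import re


def imap_wildcard_match(pattern, mailbox, delimiter="/"):
    """Return True if mailbox matches pattern under the RFC 3501 wildcard rules."""
    parts = []
    for char in pattern:
        if char == "*":
            # "*" matches zero or more characters, delimiters included.
            parts.append(".*")
        elif char == "%":
            # "%" matches zero or more characters, but never the delimiter.
            parts.append("[^" + re.escape(delimiter) + "]*")
        else:
            parts.append(re.escape(char))
    return re.fullmatch("".join(parts), mailbox, re.DOTALL) is not None
```

Under these rules, the pattern foo/* matches both foo/bar and foo/bar/baz, whereas foo/% matches only foo/bar when the hierarchy delimiter is a slash.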
45
47 -d NUM, --days=NUM
48 Archive messages older than NUM days. The default is 180. This
49 option is incompatible with the --date option below.
50
51 -D DATE, --date=DATE
52 Archive messages older than DATE. DATE can be a date string in ISO
53 format (e.g. “2002-04-23”), Internet format (e.g. “23 Apr 2002”) or
54 Internet format with full month names (e.g. “23 April 2002”).
55 Two-digit years are not supported. This option is incompatible with
56 the --days option above.
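As a rough illustration of the three accepted DATE formats, the following Python sketch tries each one in turn. The function name parse_cutoff_date and the format list are assumptions for illustration, not archivemail's actual parser; month names are matched according to the current locale.

```python
import time
from datetime import date

# ISO format, Internet format, Internet format with full month names.
DATE_FORMATS = ("%Y-%m-%d", "%d %b %Y", "%d %B %Y")


def parse_cutoff_date(text):
    """Parse text in one of the three formats; raise ValueError otherwise."""
    for fmt in DATE_FORMATS:
        try:
            parsed = time.strptime(text, fmt)
        except ValueError:
            continue
        return date(parsed.tm_year, parsed.tm_mon, parsed.tm_mday)
    raise ValueError("cannot parse date: %r" % text)
```

Because %Y requires a four-digit year, a string such as “23 Apr 02” is rejected, matching the restriction on two-digit years.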
57
58 -o PATH, --output-dir=PATH
59 Use the directory name PATH to store the mailbox archives. The
60 default is the same directory as the mailbox to be read.
61
62 -P FILE, --pwfile=FILE
63 Read IMAP password from file FILE instead of from the command line.
64 Note that this will probably not work if you are archiving folders
65 from more than one IMAP account.
66
67 -F STRING, --filter-append=STRING
68 Append STRING to the IMAP filter string. For IMAP wizards.
69
70 -p NAME, --prefix=NAME
71 Prefix NAME to the archive name. NAME is expanded by the python(1)
72 function time.strftime(), which means that you can specify special
73 directives in NAME to make an archive named after the archive
74 cut-off date. See the discussion of the --suffix option for a list
75 of valid strftime() directives. The default is not to add a prefix.
76
77 -s NAME, --suffix=NAME
78 Use the suffix NAME to create the filename used for archives. The
79 default is _archive, unless a prefix is specified.
80
81 Like a prefix, the suffix NAME is expanded by the python(1)
82 function time.strftime() with the archive cut-off date.
83 time.strftime() understands the following directives:
84
85 %a Locale's abbreviated weekday name.
86
87 %A Locale's full weekday name.
88
89 %b Locale's abbreviated month name.
90
91 %B Locale's full month name.
92
93 %c Locale's appropriate date and time representation.
94
95 %d Day of the month as a decimal number [01,31].
96
97 %H Hour (24-hour clock) as a decimal number [00,23].
98
99 %I Hour (12-hour clock) as a decimal number [01,12].
100
101 %j Day of the year as a decimal number [001,366].
102
103 %m Month as a decimal number [01,12].
104
105 %M Minute as a decimal number [00,59].
106
107 %p Locale's equivalent of either AM or PM.
108
109 %S Second as a decimal number [00,61] (values above 59 allow for leap seconds).
110
111 %U Week number of the year (Sunday as the first day of the
112 week) as a decimal number [00,53]. All days in a new year
113 preceding the first Sunday are considered to be in week 0.
114
115 %w Weekday as a decimal number [0(Sunday),6].
116
117 %W Week number of the year (Monday as the first day of the
118 week) as a decimal number [00,53]. All days in a new year
119 preceding the first Monday are considered to be in week 0.
120
121 %x Locale's appropriate date representation.
122
123 %X Locale's appropriate time representation.
124
125 %y Year without century as a decimal number [00,99].
126
127 %Y Year with century as a decimal number.
128
129 %Z Time zone name (or no characters if no time zone exists).
130
131 %% A literal “%” character.
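To make the expansion concrete, here is a minimal Python sketch of how a suffix could be combined with the archive cut-off date. The function archive_filename and its parameters are hypothetical; only time.strftime() and the default values (180 days, _archive, .gz) come from this manual.

```python
import time
from datetime import date, timedelta


def archive_filename(mailbox, suffix="_archive", days=180, today=None, compress=True):
    """Build an archive name by expanding suffix with the cut-off date."""
    today = today or date.today()
    # The cut-off date lies the given number of days before today.
    cutoff = today - timedelta(days=days)
    name = mailbox + time.strftime(suffix, cutoff.timetuple())
    return name + ".gz" if compress else name
```

With today set to 23 April 2002, the cut-off date is 25 October 2001, so a suffix of '_%Y-%m' yields debian-user_2001-10.gz for the mailbox debian-user.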
132
133
134 -a NAME, --archive-name=NAME
135 Use NAME as the archive name, ignoring the name of the mailbox that
136 is archived. Like prefixes and suffixes, NAME is expanded by
137 time.strftime() with the archive cut-off date. Because it
138 hard-codes the archive name, this option cannot be used when
139 archiving multiple mailboxes.
140
141 -S NUM, --size=NUM
142 Only archive messages that are NUM bytes or greater.
143
144 -n, --dry-run
145 Don't write to any files -- just show what would have been done.
146 This is useful for testing to see how many messages would have been
147 archived.
148
149 -u, --preserve-unread
150 Do not archive any messages that have not yet been read.
151 archivemail determines if a message in an mbox-format or MH-format
152 mailbox has been read by looking at the Status header (if it
153 exists). If the status header is equal to “RO” or “OR” then
154 archivemail assumes the message has been read. archivemail
155 determines if a maildir message has been read by looking at the
156 filename. If the filename contains an “S” after :2, then it assumes
157 the message has been read.
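The two read-status checks described above might look roughly like this in Python. Both function names are hypothetical, and the sketch assumes that the maildir flags follow the “:2,” marker at the end of the filename.

```python
def mbox_message_is_read(status_header):
    """Return True if a Status header value marks the message as read."""
    return status_header is not None and status_header.strip() in ("RO", "OR")


def maildir_message_is_read(filename):
    """Return True if the maildir filename carries the S (seen) flag."""
    # Everything after the last ":2," is treated as the flag list.
    _, marker, flags = filename.rpartition(":2,")
    return bool(marker) and "S" in flags
```

A message without a Status header, or a maildir file without the “:2,” marker, is treated as unread in this sketch.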
158
159 --dont-mangle
160 Do not mangle lines in message bodies beginning with “From ”. When
161 archiving a message from a mailbox not in mbox format, by default
162 archivemail mangles such lines by prepending a “>” to them, since
163 mail user agents might otherwise interpret these lines as message
164 separators. Messages from mbox folders are never mangled. See
165 mbox(5) for more information.
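The mangling step can be sketched in a few lines of Python. The function name mangle_from_lines is an assumption; the sketch only shows the general technique of prepending “>” to body lines that begin with “From ”.

```python
def mangle_from_lines(body):
    """Prepend ">" to every body line that starts with "From "."""
    lines = body.splitlines(keepends=True)
    return "".join(
        ">" + line if line.startswith("From ") else line
        for line in lines
    )
```

Lines such as “From: someone” are left alone, because only “From” followed by a space can be mistaken for a message separator.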
166
167 --delete
168 Delete rather than archive old mail. Use this option with caution!
169
170 --copy
171 Copy rather than archive old mail. Creates an archive, but the
172 archived messages are not deleted from the originating mailbox,
173 which is left unchanged. This is a complement to the --delete
174 option, and mainly useful for testing purposes. Note that multiple
175 passes will create duplicates, since messages are blindly appended
176 to an existing archive.
177
178 --all
179 Archive all messages, without distinction.
180
181 --include-flagged
182 Normally messages that are flagged important are not archived or
183 deleted. If you specify this option, these messages can be archived
184 or deleted just like any other message.
185
186 --no-compress
187 Do not compress any archives.
188
189 --warn-duplicate
190 Warn about duplicate Message-IDs that appear in the input mailbox.
191
192 -v, --verbose
193 Report lots of extra debugging information about what is going on.
194
195 --debug-imap=NUM
196 Set IMAP debugging level. This makes archivemail dump its
197 conversation with the IMAP server and some internal IMAP processing
198 to stdout. Higher values for NUM give more elaborate output. Set
199 NUM to 4 to see all exchanged IMAP commands. (Actually, NUM is just
200 passed literally to imaplib.Debug.)
201
202 -q, --quiet
203 Turns on quiet mode. Do not print any statistics about how many
204 messages were archived. This should be used if you are running
205 archivemail from cron.
206
207 -V, --version
208 Display the version of archivemail and exit.
209
210 -h, --help
211 Display brief summary information about how to run archivemail.
212
214 archivemail requires python(1) version 2.3 or later. When reading an
215 mbox-format mailbox, archivemail will create a lockfile with the
216 extension .lock so that procmail(1) will not deliver to the mailbox
217 while it is being processed. It will also create an advisory lock on
218 the mailbox using lockf(2). The archive is locked in the same way when
219 it is updated. archivemail will also complain and abort if a third party
220 modifies the mailbox while it is being read.
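The combination of a .lock file and an advisory lock can be sketched as follows. This is an illustration of the general technique on a Unix system, not archivemail's implementation; the context manager locked_mbox is hypothetical, and the demo runs against a throwaway temporary file.

```python
import fcntl
import os
import tempfile
from contextlib import contextmanager


@contextmanager
def locked_mbox(path):
    """Hold a dotlock file and an advisory lock while the mailbox is in use."""
    lock_path = path + ".lock"
    # O_EXCL makes creation fail if another process already holds the dotlock.
    fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY, 0o600)
    os.close(fd)
    try:
        with open(path, "r+b") as mbox:
            # Raises OSError immediately if the file is already locked.
            fcntl.lockf(mbox, fcntl.LOCK_EX | fcntl.LOCK_NB)
            try:
                yield mbox
            finally:
                fcntl.lockf(mbox, fcntl.LOCK_UN)
    finally:
        os.unlink(lock_path)


# Demo on an empty temporary mailbox.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "inbox")
    open(demo_path, "wb").close()
    with locked_mbox(demo_path):
        held = os.path.exists(demo_path + ".lock")
    released = not os.path.exists(demo_path + ".lock")
```

The dotlock is removed even if processing fails, so a crashed run in this sketch does not leave the mailbox blocked for delivery.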
221
222 archivemail will always attempt to preserve the last-access and
223 last-modify times of the input mailbox. Archive mailboxes are always
224 created with a mode of 0600. If archivemail finds a pre-existing
225 archive mailbox it will append rather than overwrite that archive.
226 archivemail will refuse to operate on mailboxes that are symbolic
227 links.
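One way to get append-only archives with a mode of 0600 is shown in the Python sketch below. It assumes that new messages are written as an additional gzip member at the end of an existing archive; whether archivemail does exactly this is not stated here, and the helper append_to_archive is hypothetical.

```python
import gzip
import os
import stat
import tempfile


def append_to_archive(path, data):
    """Append data to a gzip archive, creating it with mode 0600 if needed."""
    fd = os.open(path, os.O_WRONLY | os.O_APPEND | os.O_CREAT, 0o600)
    with os.fdopen(fd, "ab") as raw, gzip.GzipFile(fileobj=raw, mode="ab") as archive:
        archive.write(data)


# Demo: two appends end up in one readable archive.
with tempfile.TemporaryDirectory() as tmp:
    archive_path = os.path.join(tmp, "inbox_archive.gz")
    append_to_archive(archive_path, b"first\n")
    append_to_archive(archive_path, b"second\n")
    with gzip.open(archive_path, "rb") as archive:
        contents = archive.read()
    mode = stat.S_IMODE(os.stat(archive_path).st_mode)
```

Reading the file back returns both pieces in order, since gzip readers treat concatenated members as one stream.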
228
229 archivemail attempts to find the delivery date of a message by looking
230 for valid dates in the following headers, in order of precedence:
231 Delivery-date, Received, Resent-Date and Date. If it cannot find any
232 valid date in these headers, it will use the last-modified file
233 timestamp on MH and Maildir format mailboxes, or the date on the From_
234 line on mbox-format mailboxes.
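The header precedence can be sketched with Python's standard email package. The function delivery_timestamp is hypothetical, the fallback to file timestamps or the From_ line is left out, and the sketch assumes that the date in a Received header follows its last semicolon.

```python
from email import message_from_string
from email.utils import mktime_tz, parsedate_tz

# Headers in order of precedence, as listed above.
DATE_HEADERS = ("Delivery-date", "Received", "Resent-Date", "Date")


def delivery_timestamp(message):
    """Return the delivery time in seconds since the epoch, or None."""
    for name in DATE_HEADERS:
        value = message.get(name)
        if value is None:
            continue
        if name == "Received":
            # Assumption: the date is the part after the last semicolon.
            value = value.rpartition(";")[2]
        parsed = parsedate_tz(value.strip())
        if parsed is not None:
            return mktime_tz(parsed)
    return None


example = message_from_string(
    "Received: from a.example by b.example; Mon, 22 Apr 2002 09:00:00 +0000\n"
    "Date: Tue, 23 Apr 2002 10:00:00 +0000\n"
    "\n"
    "body\n"
)
```

For the example message, the Received date wins over the later Date header because it comes first in the precedence list.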
235
236 When archiving mailboxes with leading dots in the name, archivemail
237 will strip the dots off the archive name, so that the resulting archive
238 file is not hidden. This is not done if the --prefix or --archive-name
239 option is used. Should there really be mailboxes distinguished only by
240 leading dots in the name, they will thus be archived to the same
241 archive file by default.
242
243 A conversion from other formats to mbox(5) will silently overwrite
244 existing Status and X-Status message headers.
245
246 IMAP
247 When archivemail processes an IMAP folder, all messages in that folder
248 will have their \Recent flag unset, and they will probably not show up
249 as “new” in your user agent later on. There is no way around this; it's
250 just how IMAP works. This does not apply, however, if you run
251 archivemail with the options --dry-run or --copy.
252
253 archivemail relies on server-side searches to determine the messages
254 that should be archived. When matching message dates, IMAP servers
255 refer to server internal message dates, and these may differ from both
256 delivery time of a message and its Date header. Also, there exist
257 broken servers which do not implement server side searches.
258
259 IMAP URLs
260 archivemail's IMAP URL parser was written with the RFC 2822
261 (Internet Message Format) rules for the local-part of email
262 addresses in mind. So, rather than enforcing a URL-style encoding
263 of non-ascii and reserved characters, it lets you double-quote the
264 username and password. If your username or password contains the
265 delimiter characters “@” or “:”, just quote it like this:
266 imap://"username@bogus.com":"password"@imap.bogus.com/mailbox. You
267 can use a backslash to escape double-quotes that are part of a
268 quoted username or password. Note that quoting only a substring
269 will not work, and be aware that your shell will probably remove
270 unprotected quotes or backslashes.
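The quoting rules can be illustrated with a small regular-expression parser in Python. This is only a sketch of the described syntax, not archivemail's parser: the name parse_imap_url is hypothetical, and the --pwfile behaviour (where the colon is not a delimiter) is not modelled.

```python
import re

# A double-quoted string in which a backslash escapes the next character.
_QUOTED = r'"(?:[^"\\]|\\.)*"'
_IMAP_URL = re.compile(
    r"^(?P<scheme>imaps?)://"
    r"(?P<user>" + _QUOTED + r'|[^:@"]+)'
    r"(?::(?P<password>" + _QUOTED + r'|[^@"]*))?'
    r"@(?P<server>[^/]+)/(?P<mailbox>.*)$"
)


def _unquote(value):
    """Strip surrounding double quotes and resolve backslash escapes."""
    if value is not None and value.startswith('"'):
        return re.sub(r"\\(.)", r"\1", value[1:-1])
    return value


def parse_imap_url(url):
    """Split an IMAP URL into (scheme, user, password, server, mailbox)."""
    match = _IMAP_URL.match(url)
    if match is None:
        raise ValueError("not a valid IMAP URL: %r" % url)
    return (
        match.group("scheme"),
        _unquote(match.group("user")),
        _unquote(match.group("password")),
        match.group("server"),
        match.group("mailbox"),
    )
```

Because the pattern accepts either a fully quoted or a fully unquoted username, quoting only a substring fails to parse, in line with the note above.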
271
272 Similarly, there is no need to percent-encode non-ascii characters
273 in IMAP mailbox names. As long as your locale is configured
274 properly, archivemail should handle these without problems. Note,
275 however, that due to limitations of the IMAP protocol, non-ascii
276 characters do not mix well with wildcards in mailbox names.
277
278 archivemail tries to be smart when handling mailbox paths. In
279 particular, it will automatically add an IMAP NAMESPACE prefix to
280 the mailbox path if necessary; and if you are archiving a
281 subfolder, you can use the slash as a path separator instead of the
282 IMAP server's internal representation.
283
285 To archive all messages in the mailbox debian-user that are older than
286 180 days to a compressed mailbox called debian-user_archive.gz in the
287 current directory:
288
289 bash$ archivemail debian-user
290
291 To archive all messages in the mailbox debian-user that are older than
292 180 days to a compressed mailbox called debian-user_October_2001.gz
293 (where the current month and year is April, 2002) in the current
294 directory:
295
296 bash$ archivemail --suffix '_%B_%Y' debian-user
297
298 To archive all messages in the mailbox cm-melb that are older than the
299 first of January 2002 to a compressed mailbox called cm-melb_archive.gz
300 in the current directory:
301
302 bash$ archivemail --date='1 Jan 2002' cm-melb
303
304 Exactly the same as the above example, using an ISO date format
305 instead:
306
307 bash$ archivemail --date=2002-01-01 cm-melb
308
309 To delete all messages in the mailbox spam that are older than 30 days:
310
311 bash$ archivemail --delete --days=30 spam
312
313 To archive all read messages in the mailbox incoming that are older
314 than 180 days to a compressed mailbox called incoming_archive.gz in the
315 current directory:
316
317 bash$ archivemail --preserve-unread incoming
318
319 To archive all messages in the mailbox received that are older than 180
320 days to an uncompressed mailbox called received_archive in the current
321 directory:
322
323 bash$ archivemail --no-compress received
324
325 To archive all messages older than 90 days from every mailbox in the
326 directory $HOME/Mail to compressed mailboxes in the $HOME/Mail/Archive
327 directory:
328
329 bash$ archivemail -d90 -o $HOME/Mail/Archive $HOME/Mail/*
330
331 To archive all mails older than 180 days from the given IMAP INBOX to a
332 compressed mailbox INBOX_archive.gz in the $HOME/Mail/Archive
333 directory, quoting the password and reading it from the environment
334 variable PASSWORD:
335
336 bash$ archivemail -o $HOME/Mail/Archive imaps://user:'"'$PASSWORD'"'@example.org/INBOX
337
338 Note the protected quotes.
339
340 To archive all mails older than 180 days in subfolders of foo on the
341 given IMAP server to corresponding archives in the current working
342 directory, reading the password from the file ~/imap-pass.txt:
343
344 bash$ archivemail --pwfile=~/imap-pass.txt imaps://user@example.org/foo/*
345
347 Probably the best way to run archivemail is from your crontab(5) file,
348 using the --quiet option. Don't forget to try the --dry-run and perhaps
349 the --copy option for non-destructive testing.
350
352 Normally the exit status is 0. Nonzero indicates an unexpected error.
353
355 If an IMAP mailbox path contains slashes, the archive filename will be
356 derived from the basename of the mailbox. If the server's folder
357 separator differs from the Unix slash and is used in the IMAP URL,
358 however, the whole path will be considered the basename of the mailbox.
359 E.g. the two URLs imap://user@example.com/folder/subfolder and
360 imap://user@example.com/folder.subfolder will be archived in
361 subfolder_archive.gz and folder.subfolder_archive.gz, respectively,
362 although they might refer to the same IMAP mailbox.
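The naming behaviour described above follows from taking the basename after the last slash, as this Python sketch shows. The function imap_archive_filename is hypothetical and assumes the default _archive suffix with compression enabled.

```python
import posixpath


def imap_archive_filename(mailbox_path, suffix="_archive"):
    """Derive the archive name from the part of the path after the last slash."""
    return posixpath.basename(mailbox_path) + suffix + ".gz"
```

Since only the slash is treated as a separator, folder/subfolder and folder.subfolder produce different archive names even when the server regards them as the same mailbox.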
363
364 archivemail does not support reading MMDF or Babyl-format mailboxes. In
365 fact, it will probably think it is reading an mbox-format mailbox and
366 cause all sorts of problems.
367
368 archivemail is still too slow, but if you are running from crontab(5)
369 you won't care. Archiving maildir-format mailboxes should be a lot
370 quicker than mbox-format mailboxes since it is less painful for the
371 original mailbox to be reconstructed after selective message removal.
372
374 mbox(5), crontab(5), python(1), procmail(1)
375
377 The archivemail home page is currently hosted at sourceforge[1]
378
380 This manual page was written by Paul Rodger <paul at paulrodger dot
381 com>. Updated and supplemented by Nikolaus Schulz <microschulz@web.de>.
382
384 1. sourceforge
385 http://archivemail.sourceforge.net
386
387
388
389archivemail 0.9.0 5 July 2011 ARCHIVEMAIL(1)