Dbx(3) User Contributed Perl Documentation Dbx(3)

NAME
Mail::Transport::Dbx - Parse Outlook Express mailboxes

SYNOPSIS
    use Mail::Transport::Dbx;

    my $dbx = eval { Mail::Transport::Dbx->new("box.dbx") };
    die $@ if $@;

    for my $i (0 .. $dbx->msgcount - 1) {
        my $msg = $dbx->get($i);
        print $msg->subject;
        ...
    }

    # more convenient
    for my $msg ($dbx->emails) {
        print $msg->subject;
        ...
    }

ABSTRACT
Read dbx files (mailbox files of Outlook Express)

DESCRIPTION
Mail::Transport::Dbx gives you platform-independent access to Outlook
Express' dbx files. Extract subfolders, messages etc. from those or
use it to convert dbx archives into a more portable format (such as
standard mbox format).

It relies on LibDBX to do its job. The bad news: LibDBX knows nothing
about the endianness of your machine so it does not work on big-endian
machines such as Macintoshes or Suns. The good news: I made the
appropriate patches so that it does in fact work even on machines with
the 'wrong' byte order. Exception: machines with an even odder byte
order, such as Crays, are not supported. Exception from the exception:
if you buy me a Cray I promise to fix it. :-)

You have to understand the structure of .dbx files to make proper use
of this module. Outlook Express keeps a number of those files on your
hard disk. For instance:

    Folders.dbx
    folder1.dbx
    comp.lang.perl.misc.dbx

The nasty thing is that there are really two different kinds of such
files: one that contains the actual messages and one that merely holds
references to other .dbx files. Folders.dbx can be considered the
top-level file since it lists all other available .dbx files. As for
folder1.dbx and comp.lang.perl.misc.dbx, you can't yet know whether
they contain messages or subfolders (though comp.lang.perl.misc.dbx
probably contains newsgroup messages, which are treated as mere
emails).

Fortunately this module gives you the information you need. A common
approach would be the following:

1) create a new Mail::Transport::Dbx object from "Folders.dbx"

2) iterate over its items using the get() method
    2.1 if it returns a Mail::Transport::Dbx::Email
        => a message
    2.2 if it returns a Mail::Transport::Dbx::Folder
        => a folder

3) if message
    3.1 call whatever method from Mail::Transport::Dbx::Email
        you need

4) if folder
    4.1 call whatever method from Mail::Transport::Dbx::Folder
        you need
    OR
    4.2 call dbx() on it to create a new Mail::Transport::Dbx
        object
        4.2.1 if dbx() returned something defined
              => go back to item 2)

The confusing thing is that .dbx files may contain references to other
folders that don't really exist! If Outlook Express was used as a
newsreader, this is a common scenario since Folders.dbx lists all
newsgroups as separate "Mail::Transport::Dbx::Folder" objects, no
matter whether you are subscribed to any of them or not. So in essence
calling dbx() on a folder will only return a new object if the
corresponding .dbx file exists.

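The approach outlined above can be sketched as a small recursive
helper. This is only a sketch: walk_dbx and its callback signature are
made up for illustration and are not part of the module. It assumes
nothing beyond the emails(), subfolders(), name() and dbx() methods
documented on this page.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Mail::Transport::Dbx): visit every
# message reachable from $dbx and pass it to $on_email together with
# the list of folder names that lead to it.
sub walk_dbx {
    my ($dbx, $on_email, @path) = @_;

    if ($dbx->emails) {
        # Boolean context said "this file holds messages".
        $on_email->($_, @path) for $dbx->emails;
        return;
    }

    # Otherwise the file holds references to other .dbx files.
    for my $folder ($dbx->subfolders) {
        # dbx() returns undef when the referenced file does not exist.
        my $sub = $folder->dbx or next;
        walk_dbx($sub, $on_email, @path, $folder->name);
    }
    return;
}
```

It might be used like this, starting from the top-level file:

    my $root = eval { Mail::Transport::Dbx->new("Folders.dbx") };
    die $@ if $@;
    walk_dbx($root, sub {
        my ($msg, @path) = @_;
        print join(" > ", @path), ": ", $msg->subject, "\n";
    });
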
METHODS
The following are methods for Mail::Transport::Dbx objects:

new(filename)
new(filehandle-ref)
Passed either a string naming the file or a reference to an already
opened, readable filehandle, new() will construct a
Mail::Transport::Dbx object from it.

This works the same whether you open an ordinary dbx file or the
special Folders.dbx file that contains an overview of all available
dbx subfolders.

If opening fails for some reason your program will instantly die(),
so be sure to wrap the constructor in an eval() and check $@:

    my $dbx = eval { Mail::Transport::Dbx->new( "file.dbx" ) };
    die $@ if $@;

Be careful when using a filehandle, though. On Windows, you might
need to call binmode() on your handle; otherwise the stream from
your dbx file might get corrupted.

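As a sketch of the filehandle variant, the helper below opens a file
for reading and switches the handle to binary mode before it is given
to new(). open_raw is a made-up name, and that new() accepts a lexical
handle in the same way as any other filehandle reference is an
assumption here.

```perl
use strict;
use warnings;

# Hypothetical helper: open $path read-only and put the handle into
# binary mode so that no newline translation takes place on Windows.
sub open_raw {
    my ($path) = @_;
    open my $fh, '<', $path or die "Cannot open $path: $!";
    binmode $fh;
    return $fh;
}
```

The handle can then be passed to the constructor:

    my $dbx = eval { Mail::Transport::Dbx->new( open_raw("file.dbx") ) };
    die $@ if $@;
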
msgcount
Returns the number of items stored in the dbx structure. If you
opened Folders.dbx, msgcount() returns the number of subfolders in
it. Otherwise it returns the number of messages. "msgcount() - 1" is
the index of the last item.

emails
In list context this method returns all emails contained in the
file. In boolean (that is, scalar) context it returns a true value
if the file contains emails and false if it contains subfolders.

    if ($dbx->emails) {
        print "I contain emails";
    } else {
        print "I contain subfolders";
    }

This is useful for iterations:

    for my $msg ($dbx->emails) {
        ...
    }

subfolders
In list context this method returns all subfolders of the current
file as "Mail::Transport::Dbx::Folder" objects. In boolean (scalar)
context it returns true if the file contains subfolders and false
if it contains emails.

Remember that you still have to call dbx() on these subfolders if
you want to do something useful with them:

    for my $sub ($dbx->subfolders) {
        if (my $d = $sub->dbx) {
            # $d is now a proper Mail::Transport::Dbx object
            # with content
        } else {
            print "Subfolder referenced but non-existent";
        }
    }

get(n)
Get the item at the n-th position. The first item is at position 0.
get() is actually a factory method, so it returns either a
"Mail::Transport::Dbx::Email" or a "Mail::Transport::Dbx::Folder"
object. Which one depends on the file you call this method on:

    my $dbx = Mail::Transport::Dbx->new( "Folders.dbx" );
    my $item = $dbx->get(0);

$item will now definitely be a "Mail::Transport::Dbx::Folder"
object since Folders.dbx doesn't contain emails but references to
subfolders.

You can use the is_email() and is_folder() methods to check its
type:

    if ($item->is_email) {
        print $item->subject;
    } else {
        # it's a subfolder
        ...
    }

On an error, this method returns an undefined value. Check
"$dbx->errstr" to find out what went wrong.

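A minimal sketch of index-based access with error checking follows.
all_items is a hypothetical helper, not something the module provides;
it only relies on msgcount(), get() and errstr() as described above.

```perl
use strict;
use warnings;

# Hypothetical helper: fetch every item of $dbx by index. Items that
# cannot be read are skipped and reported through warn().
sub all_items {
    my ($dbx) = @_;
    my @items;
    for my $i (0 .. $dbx->msgcount - 1) {
        my $item = $dbx->get($i);
        if (defined $item) {
            push @items, $item;
        } else {
            # Read errstr() right after the failing call.
            warn "item $i: ", $dbx->errstr, "\n";
        }
    }
    return @items;
}
```

The returned items can then be told apart with is_email() and
is_folder().
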
errstr
Whenever an error occurs, errstr() returns a string describing what
went wrong.

WARNING: Internally it relies on a global variable, so all objects
share the same error string! That means it only makes sense to call
it directly after an operation that may have raised an error:

    # example 1: new() dies on failure, so trap that with eval
    my $dbx = eval { Mail::Transport::Dbx->new("box.dbx") }
        or die Mail::Transport::Dbx->errstr;

    # example 2
    my $msg = $dbx->get(5) or print $dbx->errstr;

error
Similar to errstr(), except that it returns an error code. See
"Exportable constants/Error-Codes" under "EXPORT" for the codes that
can be returned.

The following are the methods for Mail::Transport::Dbx::Email objects:

as_string
Returns the whole message (header and body) as one large string.

Note that the string still contains the raw newlines as used by
DOSish systems (\015\012). If you want newlines to be represented
in the native format of your operating system, use the following:

    my $email = $msg->as_string;
    $email =~ s/\015\012/\n/g;

On Windows this is a no-op so you can omit this step.

For news articles in particular, this method may return "undef".
This always happens when the article was only partially downloaded
(that is, only the header was retrieved from the news server). In
that case the literal header cannot be retrieved with "header"
either. Methods like "subject" etc. do work, however.

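Since conversion to mbox is one of the intended uses, here is a rough
sketch of how a single message could be turned into an mbox entry.
mbox_entry is a hypothetical helper; the "From " separator line, the
">From " quoting and the MAILER-DAEMON fallback follow common mbox
practice and are not something this module prescribes.

```perl
use strict;
use warnings;

# Hypothetical helper: build an mbox entry for one message. Returns
# nothing (undef in scalar context) when as_string() gives undef,
# that is, when only the header was downloaded.
sub mbox_entry {
    my ($msg) = @_;
    my $text = $msg->as_string;
    return unless defined $text;

    $text =~ s/\015\012/\n/g;        # DOS newlines to "\n"
    $text =~ s/^(>*From )/>$1/mg;    # quote lines that look like separators
    $text .= "\n" unless $text =~ /\n\z/;

    my $sender = $msg->sender_address;
    $sender = 'MAILER-DAEMON' unless defined $sender && length $sender;
    my $date = $msg->rcvd_gmtime;    # scalar context yields a date string

    return "From $sender $date\n$text\n";
}
```

Appending the entries of all messages in a file to one output file
would then give an mbox-style archive:

    for my $msg ($dbx->emails) {
        my $entry = mbox_entry($msg);
        print {$out} $entry if defined $entry;
    }
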
header
Returns the header portion of the whole email.

With respect to newlines, the same as described under as_string()
applies.

Returns "undef" under the same circumstances as "as_string".

body
Returns the body portion of the whole email.

With respect to newlines, the same as described under as_string()
applies.

Returns "undef" under the same circumstances as "as_string".

subject
Returns the subject of the email as a string.

psubject
Returns the processed subject of the email as a string. 'Processed'
means that prefixes such as "Re:" are cut off.

msgid
Returns the message-id of the message as a string.

parents_ids
Returns the message-ids of the parent messages as a string.

sender_name
Returns the name of the sender of this email as a string.

sender_address
Returns the address of the sender of this email as a string.

recip_name
Returns the name of the recipient of this email as a string. This
might be your name. ;-)

recip_address
Returns the address of the recipient of this email as a string.

oe_account_name
Returns the name of the Outlook Express account this message was
retrieved with as a string.

oe_account_num
Outlook Express accounts also seem to have a numerical
representation. This method returns it as a string (something
like "0000001").

fetched_server
Returns the name of the POP server that this message was retrieved
from as a string.

rcvd_localtime
This is the exact equivalent of Perl's builtin localtime() applied
to the date this message was received. It returns a string in
scalar context and a list with nine elements in list context. See
'perldoc -f localtime' for details.

rcvd_gmtime
Same as rcvd_localtime() but returning a date conforming to GMT.

date_received( [format, [len, [gmtime]]] )
This method returns the date this message was received by you as a
string. By default the date is calculated according to localtime().

Without additional arguments, the string returned looks something
like

    Sun Apr 14 02:27:57 2002

The optional first argument is a string describing the format of
the date line. It is passed unchanged to strftime(3). Please
consult your system's documentation for strftime(3) to see what
such a string has to look like. The default string to render the
date is "%a %b %e %H:%M:%S %Y".

The optional second argument is the maximum string length to be
returned by date_received(). This parameter is also passed
unaltered to strftime(). The default is 25.

The third argument can be set to a true value if you would rather
get the date in GMT. So if you want the date in GMT but with the
default rendering settings, you have to provide them yourself:

    print $msg->date_received("%a %b %e %H:%M:%S %Y", 25, 1);

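Because rcvd_gmtime() returns the usual nine-element list in list
context, it can also be fed to POSIX::strftime directly. The sketch
below does that to produce an ISO 8601 timestamp; received_iso8601 is
a made-up helper name and the format string is just one possible
choice.

```perl
use strict;
use warnings;
use POSIX qw(strftime);

# Hypothetical helper: format the time of receipt as an ISO 8601 UTC
# timestamp, built from the nine-element list of rcvd_gmtime().
sub received_iso8601 {
    my ($msg) = @_;
    my @tm = $msg->rcvd_gmtime;
    return strftime("%Y-%m-%dT%H:%M:%SZ", @tm);
}
```

Timestamps in this form sort chronologically as plain strings, which
can be handy when ordering messages from several files.
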
is_seen
Returns a true value if this message has already been seen. False
otherwise.

is_email
Always returns true for this kind of object.

is_folder
Always returns false for this kind of object.

The following methods exist for Mail::Transport::Dbx::Folder objects:

dbx
This is a convenience method. It creates a "Mail::Transport::Dbx"
object from the folder object. If the folder is only mentioned but
does not physically exist on your hard drive (either because you
deleted the .dbx file or because it was never there, which
happens especially for newsgroup files), "dbx" returns an undefined
value.

Please read "DESCRIPTION" again to learn why dbx() can return an
undefined value.

num
The index number of this folder. This is the number you passed to
"$dbx->get()" to retrieve this folder.

type
According to libdbx.h this returns one of "DBX_TYPE_FOLDER" or
"DBX_TYPE_EMAIL". Use it to check whether the folder contains
emails or other folders.

name
The name of the folder.

file
The filename of the folder. Use this to create a new
"Mail::Transport::Dbx" object:

    # $folder is a Mail::Transport::Dbx::Folder object
    my $new_dbx = Mail::Transport::Dbx->new( $folder->file );

Consider using the dbx() method instead.

This method returns an undefined value if there is no .dbx file
belonging to this folder.

id
Numerical id of the folder. Not sure what this is useful for.

parent_id
Numerical id of the parent folder.

folder_path
Returns the full folder name of this folder as a list of path
elements. It is then your responsibility to join them together
using a delimiter that doesn't show up in any of the elements. ;-)

    print join("/", $_->folder_path), "\n" for $dbx->subfolders;

    # could for instance produce a long list, such as:
    Outlook Express/news.rwth-aachen.de/de.comp.software.announce
    Outlook Express/news.rwth-aachen.de/de.comp.software.misc
    ...
    Outlook Express/Lokale Ordner/test/test1
    Outlook Express/Lokale Ordner/test
    Outlook Express/Lokale Ordner/Entwürfe
    Outlook Express/Lokale Ordner/Gelöschte Objekte
    Outlook Express/Lokale Ordner/Gesendete Objekte
    Outlook Express/Lokale Ordner/Postausgang
    Outlook Express/Lokale Ordner/Posteingang
    Outlook Express/Lokale Ordner
    Outlook Express/Outlook Express

Note that a slash (like any other character) might not be a safe
choice as it could show up in a folder name.

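One way around the delimiter problem is to escape the delimiter inside
each element before joining. The following is a sketch of that idea
only; path_string is a hypothetical helper and the backslash escaping
scheme is an arbitrary choice, not something Outlook Express or this
module defines.

```perl
use strict;
use warnings;

# Hypothetical helper: join path elements with "/" after escaping every
# backslash and slash inside an element, so the result stays unambiguous.
sub path_string {
    my @elements = @_;
    for (@elements) {
        s{\\}{\\\\}g;    # "\" becomes "\\"
        s{/}{\\/}g;      # "/" becomes "\/"
    }
    return join "/", @elements;
}
```

It takes the list returned by folder_path() as is:

    print path_string($_->folder_path), "\n" for $dbx->subfolders;
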
EXPORT
None by default.

Exportable constants
If you intend to use any of the following constants, you have to import
them when use()ing the module. You can import them all in one go
like this:

    use Mail::Transport::Dbx qw(:all);

Or you can import only those you need:

    use Mail::Transport::Dbx qw(DBX_TYPE_EMAIL DBX_TYPE_FOLDER);

Error-Codes
• DBX_NOERROR

    No error occurred.

• DBX_BADFILE

    Dbx file operation failed (open or close)

• DBX_DATA_READ

    Reading of data from dbx file failed

• DBX_INDEXCOUNT

    Index out of range

• DBX_INDEX_OVERREAD

    Request was made for index reference greater than exists

• DBX_INDEX_UNDERREAD

    Number of indexes read from dbx file is less than expected

• DBX_INDEX_READ

    Reading of Index Pointer from dbx file failed

• DBX_ITEMCOUNT

    Reading of Item Count from dbx file failed

• DBX_NEWS_ITEM

    Item is a news item not an email

Dbx types
One of these is returned by "$folder->type" so you can check
whether the folder contains emails or subfolders. Note that only
DBX_TYPE_EMAIL and DBX_TYPE_FOLDER are ever returned, so even
newsgroup postings are of the type DBX_TYPE_EMAIL.

• DBX_TYPE_EMAIL

• DBX_TYPE_FOLDER

• DBX_TYPE_NEWS

    Don't use this one!

• DBX_TYPE_VOID

    I have no idea what this is for.

Miscellaneous constants
• DBX_EMAIL_FLAG_ISSEEN

• DBX_FLAG_BODY

CAVEATS
You can't retrieve the internal state of the objects using
"Data::Dumper" or the like since "Mail::Transport::Dbx" uses a blessed
scalar to hold a reference to the respective C structures. That means
you have to use the provided methods for each object. Call that strong
encapsulation if you need a euphemism for it.

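If a dumpable view of a message is needed anyway, the accessor values
can be copied into an ordinary hash first. The sketch below does this
for a handful of the Email methods listed above; email_snapshot is a
hypothetical helper and the choice of fields is arbitrary.

```perl
use strict;
use warnings;

# Hypothetical helper: copy selected accessor values of an email object
# into a plain hash reference that Data::Dumper can display.
sub email_snapshot {
    my ($msg) = @_;
    my @fields = qw(subject msgid sender_name sender_address
                    recip_name recip_address is_seen);
    # Each accessor is called by name in scalar context.
    my %snapshot = map { my $value = $msg->$_; ($_ => $value) } @fields;
    return \%snapshot;
}
```

The result is an ordinary data structure:

    use Data::Dumper;
    print Dumper( email_snapshot($msg) );
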
There are currently no plans to implement write access to .dbx files. I
leave that up to the authors of libdbx.

KNOWN BUGS
Other than that I don't yet know of any. This, of course, has never
actually been a strong indication of the absence of bugs.

SEE ALSO
http://sourceforge.net/projects/ol2mbox hosts the libdbx package. It
contains the library backing this module along with a description of
the file format for .dbx files.

AUTHOR
Tassilo von Parseval, <tassilo.von.parseval@rwth-aachen.de>

COPYRIGHT AND LICENSE
Copyright 2003-2005 by Tassilo von Parseval

This library is free software; you can redistribute it and/or modify it
under the same terms as Perl itself.

perl v5.38.0 2023-07-20 Dbx(3)