Dbx(3)                User Contributed Perl Documentation               Dbx(3)

NAME
       Mail::Transport::Dbx - Parse Outlook Express mailboxes

SYNOPSIS
           use Mail::Transport::Dbx;

           my $dbx = eval { Mail::Transport::Dbx->new("box.dbx") };
           die $@ if $@;

           for my $i (0 .. $dbx->msgcount - 1) {
               my $msg = $dbx->get($i);
               print $msg->subject;
               ...
           }

           # more convenient
           for my $msg ($dbx->emails) {
               print $msg->subject;
               ...
           }

ABSTRACT
       Read dbx files (mailbox files of Outlook Express)

DESCRIPTION
       Mail::Transport::Dbx gives you platform-independent access to Outlook
       Express' dbx files. Extract subfolders, messages etc. from those or
       use it to convert dbx archives into a more portable format (such as
       standard mbox format).

       It relies on LibDBX to do its job. The bad news: LibDBX knows nothing
       about the endianness of your machine so it does not work on big-endian
       machines such as Macintoshes or Suns. The good news: I made the
       appropriate patches so that it does in fact work even on machines with
       the 'wrong' byte order (exception: machines with an even odder byte
       order such as Crays are not supported; exception from the exception:
       if you buy me a Cray I promise to fix it. :-).

       You have to understand the structure of .dbx files to make proper use
       of this module. Outlook Express keeps a couple of those files on your
       hard disk. For instance:

           Folders.dbx
           folder1.dbx
           comp.lang.perl.misc.dbx

       The nasty thing is that there are really two different kinds of such
       files: one that contains the actual messages and one that merely holds
       references to other .dbx files. Folders.dbx could be considered the
       top-level file since it lists all other available .dbx files. As for
       folder1.dbx and comp.lang.perl.misc.dbx, you can't yet know whether
       they contain messages or subfolders (though comp.lang.perl.misc.dbx
       probably contains newsgroup messages that are treated as mere emails).

       Fortunately this module gives you the information you need. A common
       approach would be the following:

           1) create a new Mail::Transport::Dbx object from "Folders.dbx"

           2) iterate over its items using the get() method
               2.1 if it returns a Mail::Transport::Dbx::Email
                   => a message
               2.2 if it returns a Mail::Transport::Dbx::Folder
                   => a folder

           3) if message
               3.1 call whatever method from Mail::Transport::Dbx::Email
                   you need

           4) if folder
               4.1 call whatever method from Mail::Transport::Dbx::Folder
                   you need
               OR
               4.2 call dbx() on it to create a new Mail::Transport::Dbx
                   object
                   4.2.1 if dbx() returned something defined
                         => go back to step 2) with the new object

       The confusing thing is that .dbx files may contain references to other
       folders that don't really exist! If Outlook Express was used as a
       news client this is a common scenario, since Folders.dbx lists all
       newsgroups as separate "Mail::Transport::Dbx::Folder" objects no matter
       whether you are subscribed to any of those or not. So in essence,
       calling "dbx()" on a folder will only return a new object if the
       corresponding .dbx file exists.

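       The steps above could be sketched as one recursive routine. This is
       only an illustration, not part of the module: walk_dbx() and its two
       callbacks are hypothetical names, and the routine uses nothing but the
       documented methods msgcount(), get(), is_email() and dbx().

```perl
use strict;
use warnings;

# Recursively visit every item of a Mail::Transport::Dbx object.
# $on_email and $on_missing are hypothetical callbacks for this sketch.
sub walk_dbx {
    my ($dbx, $on_email, $on_missing, $depth) = @_;
    $depth //= 0;

    for my $i (0 .. $dbx->msgcount - 1) {
        my $item = $dbx->get($i);
        next unless defined $item;      # get() returns undef on error

        if ($item->is_email) {
            $on_email->($item, $depth);
        }
        else {
            my $child = $item->dbx;     # undef if the .dbx file is missing
            if (defined $child) {
                walk_dbx($child, $on_email, $on_missing, $depth + 1);
            }
            else {
                $on_missing->($item, $depth);
            }
        }
    }
    return;
}

# Usage, assuming the module is installed and Folders.dbx exists:
#   my $root = eval { Mail::Transport::Dbx->new("Folders.dbx") };
#   die $@ if $@;
#   walk_dbx($root,
#            sub { print "  " x $_[1], $_[0]->subject, "\n" },
#            sub { print "  " x $_[1], "(no file for ", $_[0]->name, ")\n" });
```

       Folders without a .dbx file are reported through the second callback
       instead of being skipped silently, which makes the "referenced but
       non-existent" case visible.
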
METHODS
       The following are methods for Mail::Transport::Dbx objects:

       new(filename)
       new(filehandle-ref)
           Passed either a string being the filename or an already opened and
           readable filehandle ref, "new()" will construct a
           Mail::Transport::Dbx object from that.

           This happens regardless of whether you open an ordinary dbx file or
           the special Folders.dbx file that contains an overview of all
           available dbx subfolders.

           If opening fails for some reason your program will instantly
           "die()", so be sure to wrap the constructor in an "eval()" and
           check for $@:

               my $dbx = eval { Mail::Transport::Dbx->new( "file.dbx" ) };
               die $@ if $@;

           Be careful with using a filehandle, though. On Windows, you might
           need to use "binmode()" on your handle; otherwise the stream from
           your dbx file might get corrupted.

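           A minimal sketch of preparing such a handle follows. The helper
           name open_binary() is made up for this example, and passing a
           lexical handle as the "filehandle ref" is an assumption that has
           not been checked against the module.

```perl
use strict;
use warnings;

# Open a file for reading and switch the handle to binary mode.
# open_binary() is a hypothetical helper used only in this sketch.
sub open_binary {
    my ($path) = @_;
    open(my $fh, '<', $path) or die "Cannot open $path: $!";
    binmode($fh);    # no CRLF translation on Windows; harmless elsewhere
    return $fh;
}

# Usage, assuming Mail::Transport::Dbx is installed:
#   my $dbx = eval { Mail::Transport::Dbx->new( open_binary("file.dbx") ) };
#   die $@ if $@;
```
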
       msgcount
           Returns the number of items stored in the dbx structure. If you
           previously opened Folders.dbx, "msgcount()" returns the number of
           subfolders in it. Otherwise it returns the number of messages.
           "msgcount() - 1" is the index of the last item.

       emails
           In list context this method returns all emails contained in the
           file. In boolean (that is, scalar) context it returns a true value
           if the file contains emails and false if it contains subfolders.

               if ($dbx->emails) {
                   print "I contain emails";
               } else {
                   print "I contain subfolders";
               }

           This is useful for iterations:

               for my $msg ($dbx->emails) {
                   ...
               }

       subfolders
           In list context this method returns all subfolders of the current
           file as "Mail::Transport::Dbx::Folder" objects. In boolean (scalar)
           context it returns true if the file contains subfolders and false
           if it contains emails.

           Remember that you still have to call "dbx()" on these subfolders if
           you want to do something useful with them:

               for my $sub ($dbx->subfolders) {
                   if (my $d = $sub->dbx) {
                       # $d is now a proper Mail::Transport::Dbx object
                       # with content
                   } else {
                       print "Subfolder referenced but non-existent";
                   }
               }

       get(n)
           Get the item at the n-th position. The first item is at position
           0. "get()" is actually a factory method so it returns either a
           "Mail::Transport::Dbx::Email" or a "Mail::Transport::Dbx::Folder"
           object. This depends on the file you call this method upon:

               my $dbx  = Mail::Transport::Dbx->new( "Folders.dbx" );
               my $item = $dbx->get(0);

           $item will now definitely be a "Mail::Transport::Dbx::Folder"
           object since Folders.dbx doesn't contain emails but references to
           subfolders.

           You can use the "is_email()" and "is_folder()" methods to check
           its type:

               if ($item->is_email) {
                   print $item->subject;
               } else {
                   # it's a subfolder
                   ...
               }

           On an error, this method returns an undefined value. Check
           "$dbx->errstr" to find out what went wrong.

       errstr
           Whenever an error occurs, "errstr()" will contain a string
           describing what went wrong.

           WARNING: Internally it relies on a global variable so all objects
           will have the same error string! That means it only makes sense to
           use it right after an operation that potentially raises an error:

               # example 1 (new() dies on failure, hence the eval)
               my $dbx = eval { Mail::Transport::Dbx->new("box.dbx") }
                   or die Mail::Transport::Dbx->errstr;

               # example 2
               my $msg = $dbx->get(5) or print $dbx->errstr;

       error
           Similar to "errstr()", only that it will return an error code. See
           "Exportable constants/Error-Codes" under "EXPORT" for codes that
           can be returned.

       The following are the methods for Mail::Transport::Dbx::Email objects:

       as_string
           Returns the whole message (header and body) as one large string.

           Note that the string still contains the raw newlines as used by
           DOSish systems (\015\012). If you want newlines to be represented
           in the native format of your operating system, use the following:

               my $email = $msg->as_string;
               $email =~ s/\015\012/\n/g;

           On Windows this is a no-op so you can omit this step.

           Especially for news articles this method may return "undef". This
           always happens when the particular article was only partially
           downloaded (that is, only the header was retrieved from the news
           server). There is no way to retrieve this header literally with
           "header". Methods like "subject" etc. do work, however.

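           Since converting to mbox is one of the intended uses, here is a
           rough sketch of turning one message into an mbox entry. The helper
           mbox_entry() is hypothetical; its arguments are assumed to come
           from "sender_address", "date_received" and "as_string", and the
           ">From " quoting follows the mboxrd convention, which is a choice
           made for this example rather than anything the module prescribes.

```perl
use strict;
use warnings;

# Build one mbox entry from already-extracted strings.
# mbox_entry() is a hypothetical helper, not part of Mail::Transport::Dbx.
sub mbox_entry {
    my ($sender, $date, $raw) = @_;
    return unless defined $raw;          # partially downloaded article

    $raw =~ s/\015\012/\n/g;             # DOS newlines -> "\n"
    $raw =~ s/^(>*From )/>$1/mg;         # quote lines that look like separators
    $raw .= "\n" unless $raw =~ /\n\z/;  # make sure the message ends in a newline

    return "From $sender $date\n$raw\n";
}

# Usage, assuming $msg is a Mail::Transport::Dbx::Email object and $out is
# an open output handle:
#   my $entry = mbox_entry($msg->sender_address, $msg->date_received,
#                          $msg->as_string);
#   print {$out} $entry if defined $entry;
```
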
       header
           Returns the header portion of the whole email.

           With respect to newlines the same as described under "as_string()"
           applies.

           Returns "undef" under the same circumstances as "as_string".

       body
           Returns the body portion of the whole email.

           With respect to newlines the same as described under "as_string()"
           applies.

           Returns "undef" under the same circumstances as "as_string".

       subject
           Returns the subject of the email as a string.

       psubject
           Returns the processed subject of the email as a string. 'Processed'
           means that additions such as "Re:" etc. are cut off.

       msgid
           Returns the message-id of the message as a string.

       parents_ids
           Returns the message-ids of the parent messages as a string.

       sender_name
           Returns the name of the sender of this email as a string.

       sender_address
           Returns the address of the sender of this email as a string.

       recip_name
           Returns the name of the recipient of this email as a string. This
           might be your name. ;-)

       recip_address
           Returns the address of the recipient of this email as a string.

       oe_account_name
           Returns the Outlook Express account name this message was retrieved
           with as a string.

       oe_account_num
           Outlook Express accounts also seem to have a numerical
           representation. This method will return this as a string (something
           like "0000001").

       fetched_server
           Returns the name of the POP server that this message was retrieved
           from as a string.

       rcvd_localtime
           This is the exact duplicate of Perl's builtin "localtime()" applied
           to the date this message was received. It returns a string in
           scalar context and a list with nine elements in list context. See
           'perldoc -f localtime' for details.

       rcvd_gmtime
           Same as "rcvd_localtime()" but returning a date conforming to GMT.

       date_received( [format, [len, [gmtime]]] )
           This method returns the date this message was received by you as a
           string. The date returned is calculated according to "localtime()".

           Without additional arguments, the string returned looks something
           like

               Sun Apr 14 02:27:57 2002

           The optional first argument is a string describing the format of
           the date line. It is passed unchanged to strftime(3). Please
           consult your system's documentation for strftime(3) to see what
           such a string has to look like. The default string to render the
           date is "%a %b %e %H:%M:%S %Y".

           The optional second argument is the maximum string length to be
           returned by "date_received()". This parameter is also passed
           unaltered to "strftime()". This method uses 25 as the default.

           The third argument can be set to a true value if you would rather
           get a date in GMT. So if you want to get the GMT of the date but
           want to use the default rendering settings, you will have to
           provide them yourself:

               print $msg->date_received("%a %b %e %H:%M:%S %Y", 25, 1);

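           To try out a format string without a dbx file at hand, the same
           rendering can be approximated in pure Perl. render_date() below is
           a hypothetical stand-in, not the module's implementation; it uses
           POSIX::strftime, which takes no length argument, so the "len"
           parameter has no counterpart here.

```perl
use strict;
use warnings;
use POSIX qw(strftime);

# Render an epoch timestamp with the documented default format.
# render_date() is a hypothetical helper, not part of Mail::Transport::Dbx.
sub render_date {
    my ($epoch, $format, $gmt) = @_;
    $format = "%a %b %e %H:%M:%S %Y" unless defined $format;
    my @tm = $gmt ? gmtime($epoch) : localtime($epoch);
    return strftime($format, @tm);
}

print render_date(time), "\n";                   # default format, local time
print render_date(0, "%Y-%m-%d", 1), "\n";       # prints "1970-01-01"
```
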
       is_seen
           Returns a true value if this message has already been seen. False
           otherwise.

       is_email
           Always returns true for this kind of object.

       is_folder
           Always returns false for this kind of object.

       The following methods exist for Mail::Transport::Dbx::Folder objects:

       dbx This is a convenience method. It creates a "Mail::Transport::Dbx"
           object from the folder object. If the folder is only mentioned but
           does not physically exist on your hard drive (either because you
           deleted the .dbx file or because it was never there, which
           especially happens for newsgroup files), "dbx" returns an undefined
           value.

           Please read "DESCRIPTION" again to learn why "dbx()" can return an
           undefined value.

       num The index number of this folder. This is the number you passed to
           "$dbx->get()" to retrieve this folder.

       type
           According to libdbx.h this returns one of "DBX_TYPE_FOLDER" or
           "DBX_TYPE_EMAIL". Use it to check whether the folder contains
           emails or other folders.

       name
           The name of the folder.

       file
           The filename of the folder. Use this to create a new
           "Mail::Transport::Dbx" object:

               # $folder is a Mail::Transport::Dbx::Folder object
               my $new_dbx = Mail::Transport::Dbx->new( $folder->file );

           Consider using the "dbx()" method instead.

           This method returns an undefined value if there is no .dbx file
           belonging to this folder.

       id  Numerical id of the folder. Not sure what this is useful for.

       parent_id
           Numerical id of the parent folder.

       folder_path
           Returns the full folder name of this folder as a list of path
           elements. It is then your responsibility to join them together
           using a delimiter that doesn't show up in any of the elements. ;-)

               print join("/", $_->folder_path), "\n" for $dbx->subfolders;

               # could for instance produce a long list, such as:
               Outlook Express/news.rwth-aachen.de/de.comp.software.announce
               Outlook Express/news.rwth-aachen.de/de.comp.software.misc
               ...
               Outlook Express/Lokale Ordner/test/test1
               Outlook Express/Lokale Ordner/test
               Outlook Express/Lokale Ordner/Entwürfe
               Outlook Express/Lokale Ordner/Gelöschte Objekte
               Outlook Express/Lokale Ordner/Gesendete Objekte
               Outlook Express/Lokale Ordner/Postausgang
               Outlook Express/Lokale Ordner/Posteingang
               Outlook Express/Lokale Ordner
               Outlook Express/Outlook Express

           Note that a slash (like any other character) might not be a safe
           choice as it could show up in a folder name.

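           One way to deal with that is to test a few candidate delimiters
           against the path elements first. This is a sketch only:
           safe_delimiter() and its candidate list are assumptions made for
           illustration.

```perl
use strict;
use warnings;

# Return the first candidate delimiter that occurs in none of the elements.
# safe_delimiter() and its candidate list are hypothetical.
sub safe_delimiter {
    my (@elements) = @_;
    for my $delim ("/", "|", "\t", "\0") {
        return $delim unless grep { index($_, $delim) >= 0 } @elements;
    }
    die "No safe delimiter found";
}

my @path  = ("Outlook Express", "in/out");
my $delim = safe_delimiter(@path);
print join($delim, @path), "\n";    # prints "Outlook Express|in/out"

# For a whole listing, check all elements of all folders at once:
#   my $d = safe_delimiter(map { $_->folder_path } $dbx->subfolders);
```
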
EXPORT
       None by default.

   Exportable constants
       If you intend to use any of the following constants, you have to import
       them when "use()"ing the module. You can import them all in one go
       thusly:

           use Mail::Transport::Dbx qw(:all);

       Or you import only those you need:

           use Mail::Transport::Dbx qw(DBX_TYPE_EMAIL DBX_TYPE_FOLDER);

       Error-Codes
           ·   DBX_NOERROR

               No error occurred.

           ·   DBX_BADFILE

               Dbx file operation failed (open or close)

           ·   DBX_DATA_READ

               Reading of data from dbx file failed

           ·   DBX_INDEXCOUNT

               Index out of range

           ·   DBX_INDEX_OVERREAD

               Request was made for index reference greater than exists

           ·   DBX_INDEX_UNDERREAD

               Number of indexes read from dbx file is less than expected

           ·   DBX_INDEX_READ

               Reading of Index Pointer from dbx file failed

           ·   DBX_ITEMCOUNT

               Reading of Item Count from dbx file failed

           ·   DBX_NEWS_ITEM

               Item is a news item not an email

       Dbx types
           One of these is returned by "$folder->type" so you can check
           whether the folder contains emails or subfolders. Note that only
           DBX_TYPE_EMAIL and DBX_TYPE_FOLDER are ever returned so even
           newsgroup postings are of the type DBX_TYPE_EMAIL.

           ·   DBX_TYPE_EMAIL

           ·   DBX_TYPE_FOLDER

           ·   DBX_TYPE_NEWS

               Don't use this one!

           ·   DBX_TYPE_VOID

               I have no idea what this is for.

       Miscellaneous constants
           ·   DBX_EMAIL_FLAG_ISSEEN

           ·   DBX_FLAG_BODY

CAVEATS
       You can't retrieve the internal state of the objects using
       "Data::Dumper" or the like, since "Mail::Transport::Dbx" uses a blessed
       scalar to hold a reference to the respective C structures. That means
       you have to use the provided methods for each object. Call that strong
       encapsulation if you need a euphemism for that.

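       If you do want something dumpable, the accessor results can be copied
       into a plain hash first. The snapshot() helper and its default method
       list below are assumptions made for this sketch; they are not part of
       the module.

```perl
use strict;
use warnings;

# Copy selected accessor results into a plain hash reference.
# snapshot() is a hypothetical helper; the default method list is arbitrary.
sub snapshot {
    my ($obj, @methods) = @_;
    @methods = qw(subject msgid sender_name sender_address) unless @methods;
    return { map { ($_ => scalar $obj->$_()) } @methods };
}

# Usage, assuming $msg is a Mail::Transport::Dbx::Email object:
#   use Data::Dumper;
#   print Dumper( snapshot($msg) );
#   print Dumper( snapshot($msg, qw(subject recip_name)) );
```
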
       There are currently no plans to implement write access to .dbx files. I
       leave that up to the authors of libdbx.

KNOWN BUGS
       Other than that I don't know of any yet. This, of course, has never
       actually been a strong indication for the absence of bugs.

SEE ALSO
       http://sourceforge.net/projects/ol2mbox hosts the libdbx package. It
       contains the library backing this module along with a description of
       the file format for .dbx files.

AUTHOR
       Tassilo von Parseval, <tassilo.von.parseval@rwth-aachen.de>

COPYRIGHT AND LICENSE
       Copyright 2003-2005 by Tassilo von Parseval

       This library is free software; you can redistribute it and/or modify it
       under the same terms as Perl itself.



perl v5.30.1                      2020-01-30                            Dbx(3)