Dbx(3)                User Contributed Perl Documentation               Dbx(3)

NAME

       Mail::Transport::Dbx - Parse Outlook Express mailboxes

SYNOPSIS

           use Mail::Transport::Dbx;

           my $dbx = eval { Mail::Transport::Dbx->new("box.dbx") };
           die $@ if $@;

           for my $i (0 .. $dbx->msgcount - 1) {
               my $msg = $dbx->get($i);
               print $msg->subject;
               ...
           }

           # more convenient
           for my $msg ($dbx->emails) {
               print $msg->subject;
               ...
           }

ABSTRACT

           Read dbx files (mailbox files of Outlook Express)

DESCRIPTION

       Mail::Transport::Dbx gives you platform-independent access to Outlook
       Express' dbx files.  Extract subfolders, messages etc. from those or
       use it to convert dbx archives into a more portable format (such as
       standard mbox format).

       It relies on LibDBX to do its job. The bad news: LibDBX knows nothing
       about the endianness of your machine so it does not work on big-endian
       machines such as Macintoshes or Suns. The good news: I made the
       appropriate patches so that it does in fact work even on machines with
       the 'wrong' byte order (exception: machines with an even odder byte
       order such as Crays are not supported; exception from the exception: if
       you buy me a Cray I promise to fix it. :-).

       You have to understand the structure of .dbx files to make proper use
       of this module. Outlook Express keeps a couple of those files on your
       hard disk. For instance:

           Folders.dbx
           folder1.dbx
           comp.lang.perl.misc.dbx

       The nasty thing is that there are really two different kinds of such
       files: one that contains the actual messages and one that merely holds
       references to other .dbx files. Folders.dbx can be considered the
       top-level file since it lists all other available .dbx files. As for
       folder1.dbx and comp.lang.perl.misc.dbx, you can't yet know whether
       they contain messages or subfolders (though comp.lang.perl.misc.dbx
       probably contains newsgroup messages, which are treated as mere
       emails).

       Fortunately this module gives you the information you need. A common
       approach would be the following:

           1) create a new Mail::Transport::Dbx object from "Folders.dbx"

           2) iterate over its items using the get() method
               2.1 if it returns a Mail::Transport::Dbx::Email
                   => a message
               2.2 if it returns a Mail::Transport::Dbx::Folder
                   => a folder

           3) if message
               3.1 call whatever method from Mail::Transport::Dbx::Email
                   you need

           4) if folder
               4.1 call whatever method from Mail::Transport::Dbx::Folder
                   you need
               OR
               4.2 call dbx() on it to create a new Mail::Transport::Dbx
                   object
                   4.2.1 if dbx() returned something defined
                         => go back to step 2)

       The confusing thing is that .dbx files may contain references to other
       folders that don't really exist! If Outlook Express was used as a
       newsclient this is a common scenario since Folders.dbx lists all
       newsgroups as separate "Mail::Transport::Dbx::Folder" objects no matter
       whether you are subscribed to any of them or not. So in essence,
       calling "dbx()" on a folder will only return a new object if the
       corresponding .dbx file exists.
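
       The approach outlined above can be sketched as a small recursive
       walker. This is only an illustration: walk_dbx() and its callback
       signature are hypothetical names, not part of the module, and the
       sketch relies solely on the emails(), subfolders(), dbx(), name() and
       subject() methods documented under METHODS. Folders whose .dbx file
       is missing are silently skipped.

```perl
use strict;
use warnings;

# Walk a dbx tree depth-first and call $on_email->($msg, @path) for each
# message. $dbx is a Mail::Transport::Dbx object (typically Folders.dbx).
# walk_dbx() is a hypothetical helper name, not part of the module.
sub walk_dbx {
    my ($dbx, $on_email, @path) = @_;

    # emails() in boolean context is true if this file holds messages.
    if ($dbx->emails) {
        $on_email->($_, @path) for $dbx->emails;
        return;
    }

    for my $folder ($dbx->subfolders) {
        # dbx() returns undef when the referenced .dbx file is missing.
        my $child = $folder->dbx or next;
        walk_dbx($child, $on_email, @path, $folder->name);
    }
    return;
}

# Run only when a file name is given on the command line.
if (my $file = shift @ARGV) {
    require Mail::Transport::Dbx;
    my $top = eval { Mail::Transport::Dbx->new($file) };
    die $@ if $@;
    walk_dbx($top, sub {
        my ($msg, @path) = @_;
        my $subject = $msg->subject;
        $subject = '(no subject)' unless defined $subject;
        print join(" > ", @path), ": $subject\n";
    });
}
```

       Saved as a script, it would be invoked with the path to Folders.dbx
       as its only argument.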

METHODS

       The following are methods for Mail::Transport::Dbx objects:

       new(filename)
       new(filehandle-ref)
           Passed either a string being the filename or an already opened and
           readable filehandle ref, "new()" will construct a
           Mail::Transport::Dbx object from that.

           This happens regardless of whether you open an ordinary dbx file or
           the special Folders.dbx file that contains an overview of all
           available dbx subfolders.

           If opening fails for some reason your program will instantly
           "die()" so be sure to wrap the constructor in an "eval()" and
           check for $@:

               my $dbx = eval { Mail::Transport::Dbx->new( "file.dbx" ) };
               die $@ if $@;

           Be careful when using a filehandle, though. On Windows, you might
           need to use "binmode()" on your handle, otherwise the stream from
           your dbx file might get corrupted.
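
           A minimal sketch of the filehandle variant follows. open_raw() is
           a hypothetical helper name, and passing a lexical handle where a
           "filehandle ref" is expected is an assumption; a reference to a
           bareword handle (\*FH) is the classic spelling.

```perl
use strict;
use warnings;

# Open a file for reading with binmode applied, so that no newline
# translation touches the raw dbx bytes. open_raw() is a hypothetical
# helper name, not part of Mail::Transport::Dbx.
sub open_raw {
    my ($path) = @_;
    open my $fh, '<', $path or die "Cannot open $path: $!";
    binmode $fh;
    return $fh;
}

# Run only when a file name is given on the command line.
if (my $file = shift @ARGV) {
    require Mail::Transport::Dbx;
    my $fh = open_raw($file);
    # Assumption: a lexical handle is accepted as the filehandle ref.
    my $dbx = eval { Mail::Transport::Dbx->new($fh) };
    die $@ if $@;
    print $dbx->msgcount, " item(s)\n";
}
```

           Calling "binmode()" is harmless on systems that do not translate
           newlines, so the helper can be used unconditionally.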

       msgcount
           Returns the number of items stored in the dbx structure. If you
           previously opened Folders.dbx, "msgcount()" returns the number of
           subfolders in it. Otherwise it returns the number of messages.
           "msgcount() - 1" is the index of the last item.

       emails
           In list context this method returns all emails contained in the
           file. In boolean (that is, scalar) context it returns a true value
           if the file contains emails and false if it contains subfolders.

               if ($dbx->emails) {
                   print "I contain emails";
               } else {
                   print "I contain subfolders";
               }

           This is useful for iterations:

               for my $msg ($dbx->emails) {
                   ...
               }

       subfolders
           In list context this method returns all subfolders of the current
           file as "Mail::Transport::Dbx::Folder" objects. In boolean (scalar)
           context it returns true if the file contains subfolders and false
           if it contains emails.

           Remember that you still have to call "dbx()" on these subfolders if
           you want to do something useful with them:

               for my $sub ($dbx->subfolders) {
                   if (my $d = $sub->dbx) {
                       # $d is now a proper Mail::Transport::Dbx object
                       # with content
                   } else {
                       print "Subfolder referenced but non-existent";
                   }
               }

       get(n)
           Get the item at the n-th position. The first item is at position 0.
           "get()" is actually a factory method so it returns either a
           "Mail::Transport::Dbx::Email" or a "Mail::Transport::Dbx::Folder"
           object. This depends on the folder you call this method upon:

               my $dbx  = Mail::Transport::Dbx->new( "Folders.dbx" );
               my $item = $dbx->get(0);

           $item will now definitely be a "Mail::Transport::Dbx::Folder"
           object since Folders.dbx doesn't contain emails but references to
           subfolders.

           You can use the "is_email()" and "is_folder()" methods to check
           its type:

               if ($item->is_email) {
                   print $item->subject;
               } else {
                   # it's a subfolder
                   ...
               }

           On error, this method returns an undefined value. Check
           "$dbx->errstr" to find out what went wrong.
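
           As a sketch of that error check in a loop, the following
           hypothetical helper (fetch_all() is not part of the module)
           collects failures instead of stopping at the first one. It uses
           only "msgcount()", "get()" and "errstr()".

```perl
use strict;
use warnings;

# Fetch every item by index and collect failures instead of stopping.
# fetch_all() is a hypothetical helper name, not part of the module.
sub fetch_all {
    my ($dbx) = @_;
    my (@items, @errors);
    for my $i (0 .. $dbx->msgcount - 1) {
        my $item = $dbx->get($i);
        if (defined $item) {
            push @items, $item;
        } else {
            # Read the error string right away, before any other call.
            push @errors, "item $i: " . $dbx->errstr;
        }
    }
    return (\@items, \@errors);
}

# Run only when a file name is given on the command line.
if (my $file = shift @ARGV) {
    require Mail::Transport::Dbx;
    my $dbx = eval { Mail::Transport::Dbx->new($file) };
    die $@ if $@;
    my ($items, $errors) = fetch_all($dbx);
    print scalar(@$items), " item(s) read\n";
    warn "$_\n" for @$errors;
}
```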

       errstr
           Whenever an error occurs, "errstr()" will contain a string
           describing what went wrong.

           WARNING: Internally it relies on a global variable so all objects
           will have the same error string! That means it only makes sense to
           use it right after an operation that potentially raises an error:

               # example 1 (new() dies on failure, hence the eval)
               my $dbx = eval { Mail::Transport::Dbx->new("box.dbx") }
                   or die Mail::Transport::Dbx->errstr;

               # example 2
               my $msg = $dbx->get(5) or print $dbx->errstr;

       error
           Similar to "errstr()", only that it will return an error code. See
           "Exportable constants/Error-Codes" under "EXPORT" for codes that
           can be returned.

       The following are the methods for Mail::Transport::Dbx::Email objects:

       as_string
           Returns the whole message (header and body) as one large string.

           Note that the string still contains the raw newlines as used by
           DOSish systems (\015\012). If you want newlines to be represented
           in the native format of your operating system, use the following:

               my $email = $msg->as_string;
               $email =~ s/\015\012/\n/g;

           On Windows this is a no-op so you can omit this step.

           Especially for news articles this method may return "undef". This
           always happens when a particular article was only partially
           downloaded (that is, only the header was retrieved from the news
           server). There is no way to retrieve this header literally with
           "header". Methods like "subject" etc. however do work.
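
           Putting the newline conversion and the "undef" case together, a
           rough sketch of turning one message into an mbox entry could look
           like this. mbox_entry() is a hypothetical helper, the fallback
           sender 'MAILER-DAEMON' is an assumption, and the ">From " quoting
           is one common mbox convention rather than anything this module
           prescribes.

```perl
use strict;
use warnings;

# Render one message as an mbox entry. Returns nothing when only the
# header was downloaded, because as_string() is undef in that case.
# mbox_entry() is a hypothetical helper name, not part of the module.
sub mbox_entry {
    my ($msg) = @_;
    my $text = $msg->as_string;
    return unless defined $text;

    $text =~ s/\015\012/\n/g;        # DOS newlines to native ones
    $text =~ s/^(>*From )/>$1/mg;    # quote lines that look like separators
    $text .= "\n" unless $text =~ /\n\z/;

    # Assumption: fall back to a placeholder when no sender is recorded.
    my $sender = $msg->sender_address || 'MAILER-DAEMON';
    my $date   = $msg->date_received;    # default layout, e.g. Sun Apr 14 02:27:57 2002
    return "From $sender $date\n$text\n";
}
```

           In a loop over "$dbx->emails", entries for which the helper
           returns nothing can simply be skipped.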

       header
           Returns the header portion of the whole email.

           With respect to newlines the same as described under "as_string()"
           applies.

           Returns "undef" under the same circumstances as "as_string".

       body
           Returns the body portion of the whole email.

           With respect to newlines the same as described under "as_string()"
           applies.

           Returns "undef" under the same circumstances as "as_string".

       subject
           Returns the subject of the email as a string.

       psubject
           Returns the processed subject of the email as a string. 'Processed'
           means that additions such as "Re:" etc. are cut off.

       msgid
           Returns the message-id of the message as a string.

       parents_ids
           Returns the message-ids of the parent messages as a string.

       sender_name
           Returns the name of the sender of this email as a string.

       sender_address
           Returns the address of the sender of this email as a string.

       recip_name
           Returns the name of the recipient of this email as a string. This
           might be your name. ;-)

       recip_address
           Returns the address of the recipient of this email as a string.

       oe_account_name
           Returns the Outlook Express account name this message was retrieved
           with as a string.

       oe_account_num
           Outlook Express accounts also seem to have a numerical
           representation. This method will return this as a string (something
           like "0000001").

       fetched_server
           Returns the name of the POP server that this message was retrieved
           from as a string.

       rcvd_localtime
           This is the exact duplicate of Perl's builtin "localtime()" applied
           to the date this message was received. It returns a string in
           scalar context and a list with nine elements in list context. See
           'perldoc -f localtime' for details.

       rcvd_gmtime
           Same as "rcvd_localtime()" but returning a date conforming to GMT.

       date_received( [format, [len, [gmtime]]] )
           This method returns the date this message was received by you as a
           string. The date returned is calculated according to "localtime()".

           Without additional arguments, the string returned looks something
           like

               Sun Apr 14 02:27:57 2002

           The optional first argument is a string describing the format of
           the date line. It is passed unchanged to strftime(3). Please
           consult your system's documentation for strftime(3) to see what
           such a string has to look like. The default string to render the
           date is "%a %b %e %H:%M:%S %Y".

           The optional second argument is the maximum string length to be
           returned by "date_received()". This parameter is also passed
           unaltered to "strftime()". This method uses 25 as the default.

           The third argument can be set to a true value if you would rather
           get a date in GMT. So if you want to get the GMT of the date but
           want to use the default rendering settings, you will have to
           provide them yourself:

               print $msg->date_received("%a %b %e %H:%M:%S %Y", 25, 1);
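
           To get a feel for the format strings without a dbx file at hand,
           the same patterns can be tried with the strftime() from Perl's
           core POSIX module. This is only an approximation of what
           "date_received()" does internally; the ISO-style pattern is an
           illustrative choice, not a default of this module.

```perl
use strict;
use warnings;
use POSIX qw(strftime setlocale LC_TIME);

setlocale(LC_TIME, "C");    # keep day and month names in English

# The default layout used by date_received(), applied to the epoch in GMT.
my $default = strftime("%a %b %e %H:%M:%S %Y", gmtime(0));

# An ISO-8601 style alternative.
my $iso = strftime("%Y-%m-%d %H:%M:%S", gmtime(0));

print "$default\n$iso\n";

# With a message object the corresponding call would be as follows. GMT is
# requested, so format and length have to be passed explicitly:
#     print $msg->date_received("%Y-%m-%d %H:%M:%S", 25, 1);
```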

       is_seen
           Returns a true value if this message has already been seen. False
           otherwise.

       is_email
           Always returns true for this kind of object.

       is_folder
           Always returns false for this kind of object.

       The following methods exist for Mail::Transport::Dbx::Folder objects:

       dbx This is a convenience method. It creates a "Mail::Transport::Dbx"
           object from the folder object. If the folder is only mentioned but
           does not physically exist on your hard drive (either because you
           deleted the .dbx file or because it was never there, which
           especially happens for newsgroup files) "dbx" returns an undefined
           value.

           Please read "DESCRIPTION" again to learn why "dbx()" can return an
           undefined value.

       num The index number of this folder. This is the number you passed to
           "$dbx->get()" to retrieve this folder.

       type
           According to libdbx.h this returns one of "DBX_TYPE_FOLDER" or
           "DBX_TYPE_EMAIL". Use it to check whether the folder contains
           emails or other folders.

       name
           The name of the folder.

       file
           The filename of the folder. Use this to create a new
           "Mail::Transport::Dbx" object:

               # $folder is a Mail::Transport::Dbx::Folder object
               my $new_dbx = Mail::Transport::Dbx->new( $folder->file );

           Consider using the "dbx()" method instead.

           This method returns an undefined value if there is no .dbx file
           belonging to this folder.

       id  Numerical id of the folder. Not sure what this is useful for.

       parent_id
           Numerical id of the parent folder.

       folder_path
           Returns the full folder name of this folder as a list of path
           elements. It is then your responsibility to join them together
           using a delimiter that doesn't show up in any of the elements. ;-)

               print join("/", $_->folder_path), "\n" for $dbx->subfolders;

               # could for instance produce a long list, such as:
               Outlook Express/news.rwth-aachen.de/de.comp.software.announce
               Outlook Express/news.rwth-aachen.de/de.comp.software.misc
               ...
               Outlook Express/Lokale Ordner/test/test1
               Outlook Express/Lokale Ordner/test
               Outlook Express/Lokale Ordner/Entwürfe
               Outlook Express/Lokale Ordner/Gelöschte Objekte
               Outlook Express/Lokale Ordner/Gesendete Objekte
               Outlook Express/Lokale Ordner/Postausgang
               Outlook Express/Lokale Ordner/Posteingang
               Outlook Express/Lokale Ordner
               Outlook Express/Outlook Express

           Note that a slash (like any other character) might not be a safe
           choice as it could show up in a folder name.
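
           One way around the delimiter problem is to escape the delimiter
           inside each element before joining. The following is a sketch
           under that assumption; safe_path() is a hypothetical helper name
           and backslash escaping is just one possible convention.

```perl
use strict;
use warnings;

# Join path elements with "/", escaping any backslash or slash that
# occurs inside an element so the result can be split again later.
# safe_path() is a hypothetical helper name, not part of the module.
sub safe_path {
    my @elements = @_;
    s{([\\/])}{\\$1}g for @elements;    # modifies the copy, not the caller's data
    return join "/", @elements;
}

# With a Mail::Transport::Dbx object in $dbx this could be used as:
#     print safe_path($_->folder_path), "\n" for $dbx->subfolders;
```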

EXPORT

       None by default.

   Exportable constants
       If you intend to use any of the following constants, you have to import
       them when "use()"ing the module. You can import them all in one go
       thusly:

           use Mail::Transport::Dbx qw(:all);

       Or you import only those you need:

           use Mail::Transport::Dbx qw(DBX_TYPE_EMAIL DBX_TYPE_FOLDER);

       Error-Codes
           ·       DBX_NOERROR

                   No error occurred.

           ·       DBX_BADFILE

                   Dbx file operation failed (open or close)

           ·       DBX_DATA_READ

                   Reading of data from dbx file failed

           ·       DBX_INDEXCOUNT

                   Index out of range

           ·       DBX_INDEX_OVERREAD

                   Request was made for index reference greater than exists

           ·       DBX_INDEX_UNDERREAD

                   Number of indexes read from dbx file is less than expected

           ·       DBX_INDEX_READ

                   Reading of Index Pointer from dbx file failed

           ·       DBX_ITEMCOUNT

                   Reading of Item Count from dbx file failed

           ·       DBX_NEWS_ITEM

                   Item is a news item not an email

       Dbx types
           One of these is returned by "$folder->type" so you can check
           whether the folder contains emails or subfolders. Note that only
           DBX_TYPE_EMAIL and DBX_TYPE_FOLDER are ever returned so even
           newsgroup postings are of the type DBX_TYPE_EMAIL.

           ·       DBX_TYPE_EMAIL

           ·       DBX_TYPE_FOLDER

           ·       DBX_TYPE_NEWS

                   Don't use this one!

           ·       DBX_TYPE_VOID

                   I have no idea what this is for.

       Miscellaneous constants
           ·       DBX_EMAIL_FLAG_ISSEEN

           ·       DBX_FLAG_BODY

CAVEATS

       You can't retrieve the internal state of the objects using
       "Data::Dumper" or the like since "Mail::Transport::Dbx" uses a blessed
       scalar to hold a reference to the respective C structures. That means
       you have to use the provided methods for each object. Call that strong
       encapsulation if you need a euphemism for it.

       There are currently no plans to implement write access to .dbx files. I
       leave that up to the authors of libdbx.

KNOWN BUGS

       Other than that I don't know of any yet. This, of course, has never
       actually been a strong indication of the absence of bugs.

SEE ALSO

       http://sourceforge.net/projects/ol2mbox hosts the libdbx package. It
       contains the library backing this module along with a description of
       the file format for .dbx files.

AUTHOR

       Tassilo von Parseval, <tassilo.von.parseval@rwth-aachen.de>

       Copyright 2003-2005 by Tassilo von Parseval

       This library is free software; you can redistribute it and/or modify it
       under the same terms as Perl itself.

perl v5.30.0                      2019-07-26                            Dbx(3)