Text::BibTeX(3)       User Contributed Perl Documentation      Text::BibTeX(3)

NAME

       Text::BibTeX - interface to read and parse BibTeX files

SYNOPSIS

          use Text::BibTeX;

          my $bibfile = Text::BibTeX::File->new("foo.bib");
          my $newfile = Text::BibTeX::File->new(">newfoo.bib");

          while (my $entry = Text::BibTeX::Entry->new($bibfile))
          {
             next unless $entry->parse_ok;

                .             # hack on $entry contents, using various
                .             # Text::BibTeX::Entry methods
                .

             $entry->write ($newfile);
          }

DESCRIPTION

       The "Text::BibTeX" module serves mainly as a high-level introduction to
       the "Text::BibTeX" library, for both code and documentation purposes.
       The code loads the two fundamental modules for processing BibTeX files
       ("Text::BibTeX::File" and "Text::BibTeX::Entry"), and this
       documentation gives a broad overview of the whole library that isn't
       available in the documentation for the individual modules that comprise
       it.

       In addition, the "Text::BibTeX" module provides a number of
       miscellaneous functions that are useful in processing BibTeX data
       (especially the kind that comes from bibliographies as defined by
       BibTeX 0.99, rather than generic database files).  These functions
       don't generally fit in the object-oriented class hierarchy centred
       around the "Text::BibTeX::Entry" class, mainly because they are
       specific to bibliographic data and operate on generic strings (rather
       than being tied to a particular BibTeX entry).  These are also
       documented here, in "MISCELLANEOUS FUNCTIONS".

       Note that every module described here begins with the "Text::BibTeX"
       prefix.  For brevity, I have dropped this prefix from most class and
       module names in the rest of this manual page (and in most of the other
       manual pages in the library).

MODULES AND CLASSES

       The "Text::BibTeX" library includes a number of modules, many of which
       provide classes.  Usually, the relationship is simple and obvious: a
       module provides a class of the same name---for instance, the
       "Text::BibTeX::Entry" module provides the "Text::BibTeX::Entry" class.
       There are a few exceptions, though: most obviously, the "Text::BibTeX"
       module doesn't provide any classes itself; it merely loads two modules
       ("Text::BibTeX::Entry" and "Text::BibTeX::File") that do.  The other
       exceptions are mentioned in the descriptions below, and discussed in
       detail in the documentation for the respective modules.

       The modules are presented roughly in order of increasing
       specialization: the first three are essential for any program that
       processes BibTeX data files, regardless of what kind of data they hold.
       The later modules are specialized for use with bibliographic databases,
       and serve both to emulate BibTeX 0.99's standard styles and to provide
       an example of how to define a database structure through such
       specialized modules.  Each module is fully documented in its respective
       manual page.

       "Text::BibTeX"
           Loads the two fundamental modules ("Entry" and "File"), and
           provides a number of miscellaneous functions that don't fit
           anywhere in the class hierarchy.

       "Text::BibTeX::File"
           Provides an object-oriented interface to BibTeX database files.  In
           addition to the obvious attributes of filename and filehandle, the
           "file" abstraction manages properties such as the database
           structure and options for it.

       "Text::BibTeX::Entry"
           Provides an object-oriented interface to BibTeX entries, which can
           be parsed from "File" objects, arbitrary filehandles, or strings.
           Manages all the properties of a single entry: type, key, fields,
           and values.  Also serves as the base class for the structured entry
           classes (described in detail in Text::BibTeX::Structure).

       "Text::BibTeX::Value"
           Provides an object-oriented interface to values and simple values,
           high-level constructs that can be used to represent the strings
           associated with each field in an entry.  Normally, field values are
           returned simply as Perl strings, with macros expanded and multiple
           strings "pasted" together.  If desired, you can instruct
           "Text::BibTeX" to return "Text::BibTeX::Value" objects, which give
           you access to the original form of the data.

       "Text::BibTeX::Structure"
           Provides the "Structure" and "StructuredEntry" classes, which serve
           primarily as base classes for the two kinds of classes that define
           database structures.  Read this man page for a comprehensive
           description of the mechanism for implementing Perl classes
           analogous to BibTeX "style files".

       "Text::BibTeX::Bib"
           Provides the "BibStructure" and "BibEntry" classes, which serve two
           purposes: they fulfill the same role as the standard style files of
           BibTeX 0.99, and they give an example of how to write new database
           structures.  These ultimately derive from, respectively, the
           "Structure" and "StructuredEntry" classes provided by the
           "Structure" module.

       "Text::BibTeX::BibSort"
           One of the "BibEntry" class's base classes: handles the generation
           of sort keys for sorting prior to output formatting.

       "Text::BibTeX::BibFormat"
           One of the "BibEntry" class's base classes: handles the formatting
           of bibliographic data for output in a markup language such as
           LaTeX.

       "Text::BibTeX::Name"
           A class used by the "Bib" structure and specific to bibliographic
           data as defined by BibTeX itself: parses individual author names
           into "first", "von", "last", and "jr" parts.

       "Text::BibTeX::NameFormat"
           Also specific to bibliographic data: puts split-up names (as parsed
           by the "Name" class) back together in a custom way.

       For a first time through the library, you'll probably want to confine
       your reading to Text::BibTeX::File and Text::BibTeX::Entry.  The other
       modules will come in handy eventually, especially if you need to
       emulate BibTeX in a fairly fine-grained way (e.g. parsing names,
       generating sort keys).  But for the simple database hacks that are the
       bread and butter of the "Text::BibTeX" library, the "File" and "Entry"
       classes are the bulk of what you'll need.  You may also find some of
       the material in this manual page useful, namely "CONSTANT VALUES" and
       "UTILITY FUNCTIONS".

EXPORTS

       The "Text::BibTeX" module has a number of optional exports, most of
       them constant values described in "CONSTANT VALUES" below.  The default
       exports are a subset of these constant values that are used
       particularly often, the "entry metatypes" (also accessible via the
       export tag "metatypes").  Thus, the following two lines are equivalent:

          use Text::BibTeX;
          use Text::BibTeX qw(:metatypes);

       Some of the various subroutines provided by the module are also
       exportable.  "bibloop", "split_list", "purify_string", and
       "change_case" are all useful in everyday processing of BibTeX data, but
       don't really fit anywhere in the class hierarchy.  They may be imported
       from "Text::BibTeX" using the "subs" export tag.  "check_class" and
       "display_list" are also exportable, but only by name; they are not
       included in any export tag.  (These two mainly exist for use by other
       modules in the library.)  For instance, to use "Text::BibTeX" and
       import the entry metatype constants and the common subroutines:

          use Text::BibTeX qw(:metatypes :subs);

       Another group of subroutines exists for direct manipulation of the
       macro table maintained by the underlying C library.  These functions
       (see "Macro table functions", below) allow you to define, delete, and
       query the value of BibTeX macros (or "abbreviations").  They may be
       imported en masse using the "macrosubs" export tag:

          use Text::BibTeX qw(:macrosubs);

CONSTANT VALUES

       The "Text::BibTeX" module makes a number of constant values available.
       These correspond to the values of various enumerated types in the
       underlying C library, btparse, and their meanings are more fully
       explained in the btparse documentation.

       Each group of constants is optionally exportable using an export tag
       given in the descriptions below.

       Entry metatypes
           "BTE_UNKNOWN", "BTE_REGULAR", "BTE_COMMENT", "BTE_PREAMBLE",
           "BTE_MACRODEF".  The "metatype" method in the "Entry" class always
           returns one of these values.  The last three describe,
           respectively, "comment", "preamble", and "string" entries;
           "BTE_REGULAR" describes all other entry types.  "BTE_UNKNOWN"
           should never be seen (it's mainly useful for C code that might have
           to detect half-baked data structures).  See also btparse.  Export
           tag: "metatypes".

       AST node types
           "BTAST_STRING", "BTAST_MACRO", "BTAST_NUMBER".  Used to distinguish
           the three kinds of simple values---strings, macros, and numbers.
           The "SimpleValue" class's "type" method always returns one of these
           three values.  See also Text::BibTeX::Value, btparse.  Export tag:
           "nodetypes".

       Name parts
           "BTN_FIRST", "BTN_VON", "BTN_LAST", "BTN_JR", "BTN_NONE".  Used to
           specify the various parts of a name after it has been split up.
           These are mainly useful when using the "NameFormat" class.  See
           also bt_split_names and bt_format_names.  Export tag: "nameparts".

       Join methods
           "BTJ_MAYTIE", "BTJ_SPACE", "BTJ_FORCETIE", "BTJ_NOTHING".  Used to
           tell the "NameFormat" class how to join adjacent tokens together;
           see Text::BibTeX::NameFormat and bt_format_names.  Export tag:
           "joinmethods".

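       As a rough illustration of how the entry metatype constants might be
       used, the sketch below maps the result of the "metatype" method to a
       short label.  The helper name "metatype_label" is made up for this
       example; only "metatype" and the "BTE_*" constants come from the
       library.

```perl
use strict;
use warnings;
use Text::BibTeX;    # the default exports include the BTE_* metatype constants

# Hypothetical helper (not part of the library): turn an entry's metatype
# into a short human-readable label.
sub metatype_label {
    my ($entry) = @_;
    my $metatype = $entry->metatype;

    return 'regular'  if $metatype == BTE_REGULAR;
    return 'comment'  if $metatype == BTE_COMMENT;
    return 'preamble' if $metatype == BTE_PREAMBLE;
    return 'string'   if $metatype == BTE_MACRODEF;
    return 'unknown';    # BTE_UNKNOWN should never be seen in practice
}
```

       A helper like this could be called on each successfully parsed entry,
       for instance from the loop shown in "SYNOPSIS".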

UTILITY FUNCTIONS

       "Text::BibTeX" provides several functions that operate outside of the
       normal class hierarchy.  Of these, only "bibloop" is likely to be of
       much use to you in writing everyday BibTeX-hacking programs; the other
       two ("check_class" and "display_list") are mainly provided for the use
       of other modules in the library.  They are documented here mainly for
       completeness, but also because they might conceivably be useful in
       other circumstances.

       bibloop (ACTION, FILES [, DEST])
           Loops over all entries in a set of BibTeX files, performing some
           caller-supplied action on each entry.  FILES should be a reference
           to the list of filenames to process, and ACTION a reference to a
           subroutine that will be called on each entry.  DEST, if given,
           should be a "Text::BibTeX::File" object (opened for output) to
           which entries might be printed.

           The subroutine referenced by ACTION is called with exactly one
           argument: the "Text::BibTeX::Entry" object representing the entry
           currently being processed.  Information about both the entry itself
           and the file where it originated is available through this object;
           see Text::BibTeX::Entry.  The ACTION subroutine is only called if
           the entry was successfully parsed; any syntax errors will result in
           a warning message being printed, and that entry being skipped.
           Note that all successfully parsed entries are passed to the ACTION
           subroutine, even "preamble", "string", and "comment" entries.  To
           skip these pseudo-entries and only process "regular" entries, your
           action subroutine should look something like this:

              sub action {
                 my $entry = shift;
                 return unless $entry->metatype == BTE_REGULAR;
                 # process $entry ...
              }

           If your action subroutine needs any more arguments, you can just
           create a closure (anonymous subroutine) as a wrapper, and pass it
           to "bibloop":

              sub action {
                 my ($entry, $extra_stuff) = @_;
                 # ...
              }

              my $extra = ...;
              Text::BibTeX::bibloop (sub { action ($_[0], $extra) }, \@files);

           If the ACTION subroutine returns a true value and DEST was given,
           then the processed entry will be written to DEST.

       check_class (PACKAGE, DESCRIPTION, SUPERCLASS, METHODS)
           Ensures that a PACKAGE implements a class meeting certain
           requirements.  First, it inspects Perl's symbol tables to ensure
           that a package named PACKAGE actually exists.  Then, it ensures
           that the class named by PACKAGE derives from SUPERCLASS (using the
           universal method "isa").  This derivation might be through multiple
           inheritance, or through several generations of a class hierarchy;
           the only requirement is that SUPERCLASS is somewhere in PACKAGE's
           tree of base classes.  Finally, it checks that PACKAGE provides
           each method listed in METHODS (a reference to a list of method
           names).  This is done with the universal method "can", so the
           methods might actually come from one of PACKAGE's base classes.

           DESCRIPTION should be a brief string describing the class that was
           expected to be provided by PACKAGE.  It is used for generating
           warning messages if any of the class requirements are not met.

           This is mainly used by the supervisory code in
           "Text::BibTeX::Structure", to ensure that user-supplied structure
           modules meet the rules required of them.

       display_list (LIST, QUOTE)
           Converts a list of strings to the grammatical conventions of a
           human language (currently, only English rules are supported).  LIST
           must be a reference to a list of strings.  If this list is empty,
           the empty string is returned.  If it has one element, then just
           that element is returned.  If it has two elements, then they are
           joined with the string " and " and the resulting string is
           returned.  Otherwise, the list has N elements for N >= 3; elements
           1..N-1 are joined with commas, and the final element is tacked on
           with an intervening ", and ".

           If QUOTE is true, then each string is encased in single quotes
           before anything else is done.

           This is used elsewhere in the library for two very distinct
           purposes: for generating warning messages describing lists of
           fields that should be present or are conflicting in an entry, and
           for generating lists of author names in formatted bibliographies.


MISCELLANEOUS FUNCTIONS

       In addition to loading the "File" and "Entry" modules, "Text::BibTeX"
       loads the XSUB code which bridges the Perl modules to the underlying C
       library, btparse.  This XSUB code provides a number of miscellaneous
       utility functions, most of which are put into other packages in the
       "Text::BibTeX" family for use by the corresponding classes.  (For
       instance, the XSUB code loaded by "Text::BibTeX" provides a function
       "Text::BibTeX::Entry::parse", which is actually documented as the
       "parse" method of the "Text::BibTeX::Entry" class---see
       Text::BibTeX::Entry.)  However, for completeness, all the functions
       that become available when you "use Text::BibTeX", including this one,
       are at least mentioned here.  The only functions from this group that
       you're ever likely to use are described in "Generic string-processing
       functions".

   Startup/shutdown functions
       These just initialize and shut down the underlying C library.  Don't
       call either one of them; the "Text::BibTeX" startup/shutdown code takes
       care of it as appropriate.  They're just mentioned here for
       completeness.

       initialize ()
       cleanup ()

   Generic string-processing functions
       split_list (STRING, DELIM [, FILENAME [, LINE [, DESCRIPTION [,
       OPTS]]]])
           Splits a string on a fixed delimiter according to the BibTeX rules
           for splitting up lists of names.  With BibTeX, the delimiter is
           hard-coded as "and"; here, you can supply any string.  Instances of
           DELIM in STRING are considered delimiters if they are at
           brace-depth zero, surrounded by whitespace, and not at the
           beginning or end of STRING; the comparison is case-insensitive.
           See bt_split_names for full details of how splitting is done (it's
           not the same as Perl's "split" function).  OPTS is a hash
           reference accepting the same binmode and normalization arguments
           as, e.g., Text::BibTeX::File->open().  "split_list" calls
           "isplit_list" internally, and additionally handles UTF-8
           conversion and normalization if requested.

           Returns the list of strings resulting from splitting STRING on
           DELIM.

       isplit_list (STRING, DELIM [, FILENAME [, LINE [, DESCRIPTION]]])
           Splits a string on a fixed delimiter according to the BibTeX rules
           for splitting up lists of names.  With BibTeX, the delimiter is
           hard-coded as "and"; here, you can supply any string.  Instances of
           DELIM in STRING are considered delimiters if they are at
           brace-depth zero, surrounded by whitespace, and not at the
           beginning or end of STRING; the comparison is case-insensitive.
           See bt_split_names for full details of how splitting is done (it's
           not the same as Perl's "split" function).  This function returns
           bytes.  Use "split_list" instead if you need to specify the same
           binmode and normalization arguments as with, e.g.,
           Text::BibTeX::File->open().

           Returns the list of strings resulting from splitting STRING on
           DELIM.

       purify_string (STRING [, OPTIONS])
           "Purifies" STRING in the BibTeX way (usually for generation of sort
           keys).  See bt_misc for details; note that, unlike the C interface,
           "purify_string" does not modify STRING in-place.  A purified copy
           of the input string is returned.

           OPTIONS is currently unused.

       change_case (TRANSFORM, STRING [, OPTIONS])
           Transforms the case of STRING according to TRANSFORM (a single
           character, one of 'u', 'l', or 't').  See bt_misc for details;
           again, "change_case" differs from the C interface in that STRING is
           not modified in-place---the input string is copied, and the
           transformed copy is returned.

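       As a brief sketch of how the three everyday string functions fit
       together, the following splits an author list and derives a purified
       and a lowercased copy of each name.  The author string is invented
       for the example, and the exact purified and case-changed forms depend
       on the rules described in bt_misc, so no output is shown for them.

```perl
use strict;
use warnings;
use Text::BibTeX qw(:subs);

# A made-up author field: names separated by " and ".  The "and" inside the
# braces is at brace-depth one, so it should not count as a delimiter.
my $authors = 'Donald E. Knuth and Leslie Lamport and {Barnes and Noble}';

my @names = split_list($authors, 'and');
print scalar(@names), " names found\n";

for my $name (@names) {
    my $purified = purify_string($name);      # purified copy; $name is untouched
    my $lowered  = change_case('l', $name);   # lowercased copy; $name is untouched
    print "$name | $purified | $lowered\n";
}
```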
   Entry-parsing functions
       Although these functions are provided by the "Text::BibTeX" module,
       they are actually in the "Text::BibTeX::Entry" package.  That's because
       they are implemented in C, and thus loaded with the XSUB code that
       "Text::BibTeX" loads; however, they are actually methods in the
       "Text::BibTeX::Entry" class.  Thus, they are documented as methods in
       Text::BibTeX::Entry.

       parse (ENTRY_STRUCT, FILENAME, FILEHANDLE)
       parse_s (ENTRY_STRUCT, TEXT)

   Macro table functions
       These functions allow direct access to the macro table maintained by
       btparse, the C library underlying "Text::BibTeX".  In the normal course
       of events, macro definitions always accumulate, and are only defined as
       a result of parsing a macro definition (@string) entry.  btparse never
       deletes old macro definitions for you, and doesn't have any built-in
       default macros.  If, for example, you wish to start fresh with new
       macros for every file, use "delete_all_macros".  If you wish to
       pre-define certain macros, use "add_macro_text".  (But note that the
       "Bib" structure, as part of its mission to emulate BibTeX 0.99, defines
       the standard "month name" macros for you.)

       See also bt_macros in the btparse documentation for a description of
       the C interface to these functions.

       add_macro_text (MACRO, TEXT [, FILENAME [, LINE]])
           Defines a new macro, or redefines an old one.  MACRO is the name of
           the macro, and TEXT is the text it should expand to.  FILENAME and
           LINE are just used to generate any warnings about the macro
           definition.  The only such warning occurs when you redefine an old
           macro: its value is overridden, and add_macro_text() issues a
           warning saying so.

       delete_macro (MACRO)
           Deletes a macro from the macro table.  If MACRO isn't defined,
           takes no action.

       delete_all_macros ()
           Deletes all macros from the macro table, even the predefined month
           names.

       macro_length (MACRO)
           Returns the length of a macro's expansion text.  If the macro is
           undefined, returns 0; no warning is issued.

       macro_text (MACRO [, FILENAME [, LINE]])
           Returns the expansion text of a macro.  If the macro is not
           defined, issues a warning and returns "undef".  FILENAME and LINE,
           if supplied, are used for generating this warning; they should be
           supplied if you're looking up the macro as a result of finding it
           in a file.

   Name-parsing functions
       These are both private functions for the use of the "Name" class, and
       therefore are put in the "Text::BibTeX::Name" package.  You should use
       the interface provided by that class for parsing names in the BibTeX
       style.

       _split (NAME_STRUCT, NAME, FILENAME, LINE, NAME_NUM, KEEP_CSTRUCT)
       free (NAME_STRUCT)

   Name-formatting functions
       These are private functions for the use of the "NameFormat" class, and
       therefore are put in the "Text::BibTeX::NameFormat" package.  You
       should use the interface provided by that class for formatting names in
       the BibTeX style.

       create ([PARTS [, ABBREV_FIRST]])
       free (FORMAT_STRUCT)
       _set_text (FORMAT_STRUCT, PART, PRE_PART, POST_PART, PRE_TOKEN,
       POST_TOKEN)
       _set_options (FORMAT_STRUCT, PART, ABBREV, JOIN_TOKENS, JOIN_PART)
       format_name (NAME_STRUCT, FORMAT_STRUCT)

BUGS AND LIMITATIONS

       "Text::BibTeX" inherits several limitations from its base C library,
       btparse; see "BUGS AND LIMITATIONS" in btparse for details.  In
       addition, "Text::BibTeX" will not work with a Perl binary built using
       the "sfio" library.  This is because Perl's I/O abstraction layer does
       not extend to third-party C libraries that use stdio, and btparse most
       certainly does use stdio.

SEE ALSO

       btool_faq, Text::BibTeX::File, Text::BibTeX::Entry, Text::BibTeX::Value

AUTHOR

       Greg Ward <gward@python.net>

COPYRIGHT

       Copyright (c) 1997-2000 by Gregory P. Ward.  All rights reserved.  This
       file is part of the Text::BibTeX library.  This library is free
       software; you may redistribute it and/or modify it under the same terms
       as Perl itself.

perl v5.36.0                      2023-01-29                   Text::BibTeX(3)