XML::LibXML::InputCallback(3)  User Contributed Perl Documentation  XML::LibXML::InputCallback(3)

NAME
XML::LibXML::InputCallback - XML::LibXML Class for Input Callbacks

SYNOPSIS
    use XML::LibXML;

DESCRIPTION
You may get unexpected results if you try to load external documents
during libxml2 parsing when the location of the resource is not an
HTTP, FTP or relative location but, for example, an absolute path. To
get around this limitation, you may add your own input handler to open,
read and close particular types of locations or URI classes. Using
these input callback handlers, you can, for example, handle your own
custom URI schemes.

The input callbacks are used whenever XML::LibXML has to get something
other than externally parsed entities from somewhere. They are
implemented using a callback stack on the Perl layer, analogous to
libxml2's native callback stack.

The XML::LibXML::InputCallback class transparently registers the input
callbacks for libxml2's parser processes.

How does XML::LibXML::InputCallback work?
The libxml2 library offers a callback implementation as global
functions only. To work around the problems that result from having
only global callbacks (for example, when the same global callback stack
is manipulated by different applications running together in a single
Apache web server environment), XML::LibXML::InputCallback comes with
an object-oriented and a function-oriented part.

The function-oriented part manipulates the global callback stack of
libxml2. Its functions serve as the interface to the callbacks on the
C and XS layer. The object-oriented part implements operations for
working with the "pseudo-localized" callback stack. Currently, you can
register and de-register callbacks on the Perl layer and initialize
them on a per-parser basis.

Callback Groups

The libxml2 input callbacks come in groups. One group contains a URI
matcher (match), a data stream constructor (open), a data stream reader
(read), and a data stream destructor (close). The callbacks can be
manipulated on a per group basis only.

The Parser Process

The parser process works on an XML data stream in which links to other
resources can be embedded, for example links to external DTDs or
XIncludes. Those resources are identified by URIs. The callback
implementation of libxml2 assumes that one callback group can handle a
certain number of URIs and a certain URI scheme. By default, callback
handlers for file://*, file://*.gz, http://* and ftp://* are
registered.

Callback groups in the callback stack are processed from top to bottom,
meaning that callback groups registered later will be processed before
the earlier registered ones.
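
The ordering rule can be illustrated with a small, self-contained model.
This is only a sketch in plain Perl and not the module's internal code:
the @stack array and the register_group() and find_group() helpers are
hypothetical names introduced for illustration.

```perl
use strict;
use warnings;

# Hypothetical model of a callback stack (not XML::LibXML's internals).
# Each entry holds a match callback and a label naming the group.
my @stack;

sub register_group {
    my ($match_cb, $label) = @_;
    push @stack, [ $match_cb, $label ];
}

# Walk the stack from top (registered last) to bottom (registered first)
# and return the label of the first group whose match callback accepts
# the URI. Returns nothing if no group matches.
sub find_group {
    my ($uri) = @_;
    for my $group (reverse @stack) {
        my ($match_cb, $label) = @$group;
        return $label if $match_cb->($uri);
    }
    return;
}

# A generic group is registered first, a more specific one later.
register_group(sub { $_[0] =~ /^myscheme:/ },        'generic');
register_group(sub { $_[0] =~ /^myscheme:cache\// }, 'cache');

my $first  = find_group('myscheme:cache/a.xml');
my $second = find_group('myscheme:other.xml');
print "$first\n";    # cache   (the later group is asked first)
print "$second\n";   # generic (only the earlier group matches)
```

Under this model, a specific handler registered after a generic one
takes precedence for the URIs both would accept, which is why the order
of registration matters.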

While parsing the data stream, the libxml2 parser checks whether a
registered callback group will handle a URI; if none will, the URI is
interpreted as file://URI. To handle a URI, the match callback has to
return '1'. If that happens, the handling of the URI is passed to that
callback group. Next, the URI is passed to the open callback, which
should return a reference to the data stream if it successfully opened
the file, and '0' otherwise. If opening the stream was successful, the
read callback is called repeatedly until it returns an empty string.
After the last read, the close callback is called to close the stream.
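
The sequence described above can be sketched without the parser. In the
following plain-Perl model, fetch() is a hypothetical stand-in for the
part libxml2 plays, and %resources and the myscheme: URI are assumptions
made for illustration. The four callbacks follow the conventions from
the text: match returns 1 or 0, open returns a reference or 0, and read
returns an empty string at the end of the data.

```perl
use strict;
use warnings;

# Hypothetical in-memory resources, keyed by URI.
my %resources = ( 'myscheme:greeting' => '<greeting>hello</greeting>' );

my $match_cb = sub {
    my ($uri) = @_;
    return exists $resources{$uri} ? 1 : 0;
};

my $open_cb = sub {
    my ($uri) = @_;
    # Open a read-only in-memory filehandle on the stored string.
    open(my $fh, '<', \$resources{$uri}) or return 0;
    return $fh;
};

my $read_cb = sub {
    my ($fh, $length) = @_;
    my $buffer = '';
    read($fh, $buffer, $length);   # leaves $buffer empty at end of data
    return $buffer;
};

my $close_cb = sub {
    my ($fh) = @_;
    close($fh);
    return 1;
};

# Hypothetical driver that plays the role of the parser:
# match, then open, then read until '' is returned, then close.
sub fetch {
    my ($uri, $chunk_size) = @_;
    return unless $match_cb->($uri);
    my $handle = $open_cb->($uri) or return;
    my $data = '';
    while (1) {
        my $chunk = $read_cb->($handle, $chunk_size);
        last if !defined $chunk || $chunk eq '';
        $data .= $chunk;
    }
    $close_cb->($handle);
    return $data;
}

my $content = fetch('myscheme:greeting', 8);
print "$content\n";   # <greeting>hello</greeting>
```

The chunk size of 8 is arbitrary. The point is that the read callback
may be called any number of times and must signal the end of the data
with an empty string.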

Organisation of callback groups in XML::LibXML::InputCallback

Callback groups are implemented as a stack (an array); each entry holds
a reference to an array of the callbacks. To the libxml2 library, the
XML::LibXML::InputCallback callback implementation appears as one
single callback group. The Perl implementation, however, allows one to
manage different callback stacks on a per-libxml2-parser basis.

Using XML::LibXML::InputCallback
After object instantiation using the parameter-less constructor, you
can register callback groups.

    my $input_callbacks = XML::LibXML::InputCallback->new();
    $input_callbacks->register_callbacks([ $match_cb1, $open_cb1,
                                           $read_cb1, $close_cb1 ] );
    $input_callbacks->register_callbacks([ $match_cb2, $open_cb2,
                                           $read_cb2, $close_cb2 ] );
    $input_callbacks->register_callbacks( [ $match_cb3, $open_cb3,
                                            $read_cb3, $close_cb3 ] );

    $parser->input_callbacks( $input_callbacks );
    $parser->parse_file( $some_xml_file );

What about the old callback system prior to XML::LibXML::InputCallback?
In XML::LibXML versions prior to 1.59, i.e. without the
XML::LibXML::InputCallback module, you could define your callbacks
either globally or locally. You can still do that using
XML::LibXML::InputCallback, and in addition you can define the
callbacks on a per-parser basis!

If you use the old callback interface through global callbacks,
XML::LibXML::InputCallback will treat them with a lower priority than
the ones registered using the new interface. The global callbacks will
not override the callback groups registered using the new interface.
Local callbacks are attached to a specific parser instance and are
therefore treated with the highest priority. If the match callback of
the callback group registered as a local variable is identical to that
of one of the callback groups registered using the new interface, that
callback group will be replaced.

Users of the old callback implementation whose open callback returned a
plain string will have to adapt their code to return a reference to
that string after upgrading to version 1.59 or later. The new callback
system can only deal with an open callback that returns a reference!
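
A minimal sketch of such an adaptation follows. The old_open_cb()
function is a hypothetical stand-in for a pre-1.59 style open callback,
and consuming the referenced string with substr() in the read callback
is one possible approach, not something this page prescribes.

```perl
use strict;
use warnings;

# Hypothetical old-style open callback: returns the data as a plain string.
sub old_open_cb {
    my ($uri) = @_;
    return "<doc source='$uri'/>";
}

# Adapted open callback: returns a reference to the string instead.
sub open_cb {
    my ($uri) = @_;
    my $content = old_open_cb($uri);
    return \$content;
}

# Read callback for a string reference: remove and return up to $length
# characters from the front of the referenced string. Once the string
# is used up, this returns '' and thereby signals the end of the data.
sub read_cb {
    my ($handle, $length) = @_;
    return substr($$handle, 0, $length, '');
}

# Nothing to release for an in-memory string.
sub close_cb {
    return 1;
}

my $handle = open_cb('myscheme:a');
while ((my $chunk = read_cb($handle, 8)) ne '') {
    print "chunk: $chunk\n";
}
close_cb($handle);
```

Wrapping the old callback keeps the existing code unchanged. Only the
value handed back by the open callback and the way the read callback
consumes it differ.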

INTERFACE DESCRIPTION
Global Variables
$_CUR_CB
    Stores the current callback and can be used as shortcut to access
    the callback stack.

@_GLOBAL_CALLBACKS
    Stores all callback groups for the current parser process.

@_CB_STACK
    Stores the currently used callback group. Used to prevent parser
    errors when dealing with nested XML data.

Global Callbacks
_callback_match
    Implements the interface for the match callback at C-level and for
    the selection of the callback group from the callbacks defined at
    the Perl-level.

_callback_open
    Forwards the open callback from libxml2 to the corresponding
    callback function at the Perl-level.

_callback_read
    Forwards the read request to the corresponding callback function at
    the Perl-level and returns the result to libxml2.

_callback_close
    Forwards the close callback from libxml2 to the corresponding
    callback function at the Perl-level.

Class methods
new()
    A simple constructor.

register_callbacks( [ $match_cb, $open_cb, $read_cb, $close_cb ])
    The four callbacks have to be given as an array reference in the
    order shown above: match, open, read, close!

unregister_callbacks( [ $match_cb, $open_cb, $read_cb, $close_cb ])
    With no arguments given, unregister_callbacks() will delete the
    last registered callback group from the stack. If four callbacks
    are passed as an array reference, the callback group to unregister
    will be identified by the match callback and deleted from the
    callback stack. Note that if several identical match callbacks are
    defined in different callback groups, ALL of them will be deleted
    from the stack.

init_callbacks( $parser )
    Initializes the callback system for the provided parser before
    starting a parsing process.

cleanup_callbacks()
    Resets global variables and the libxml2 callback stack.

lib_init_callbacks()
    Used internally for callback registration at the C-level.

lib_cleanup_callbacks()
    Used internally for callback resetting at the C-level.

EXAMPLE
The following purely fictitious example uses a MyScheme::Handler
object that responds to methods similar to those of an IO::Handle.

    # Define the four callback functions
    sub match_uri {
        my $uri = shift;
        return $uri =~ /^myscheme:/; # trigger our callback group for 'myscheme' URIs
    }

    sub open_uri {
        my $uri = shift;
        my $handler = MyScheme::Handler->new($uri);
        return $handler;
    }

    # The returned $buffer will be parsed by the libxml2 parser
    sub read_uri {
        my $handler = shift;
        my $length = shift;
        my $buffer;
        read($handler, $buffer, $length);
        return $buffer; # $buffer will be an empty string '' if read() is done
    }

    # Close the handle associated with the resource.
    sub close_uri {
        my $handler = shift;
        close($handler);
    }

    # Register them with an instance of XML::LibXML::InputCallback
    my $input_callbacks = XML::LibXML::InputCallback->new();
    $input_callbacks->register_callbacks([ \&match_uri, \&open_uri,
                                           \&read_uri, \&close_uri ] );

    # Register the callback group at a parser instance
    $parser->input_callbacks( $input_callbacks );

    # $some_xml_file will be parsed using our callbacks
    $parser->parse_file( $some_xml_file );

AUTHORS
Matt Sergeant, Christian Glahn, Petr Pajas

VERSION
2.0208

COPYRIGHT
2001-2007, AxKit.com Ltd.

2002-2006, Christian Glahn.

2006-2009, Petr Pajas.

LICENSE
This program is free software; you can redistribute it and/or modify it
under the same terms as Perl itself.

perl v5.36.0                      2023-01-20      XML::LibXML::InputCallback(3)