xmerl_sax_parser(3)        Erlang Module Definition        xmerl_sax_parser(3)

NAME

       xmerl_sax_parser - XML SAX parser API

DESCRIPTION

       A SAX parser for XML that delivers parse events through a callback
       interface. SAX is the Simple API for XML, originally a Java-only API.
       SAX was the first widely adopted API for XML in Java and has become a
       de facto standard, with versions available for several programming
       language environments other than Java.

DATA TYPES

         option():
           Options used to customize the behaviour of the parser. Possible
           options are:

           {continuation_fun, ContinuationFun}:
             ContinuationFun is a callback function that decides what to do
             if the parser runs into EOF before the document is complete.

           {continuation_state, term()}:
             State that is accessible in the continuation callback function.

           {event_fun, EventFun}:
             EventFun is the callback function for parser events.

           {event_state, term()}:
             State that is accessible in the event callback function.

           {file_type, FileType}:
             Flag that tells the parser whether it is parsing a DTD or a
             normal XML file (default normal).

             * FileType = normal | dtd

           {encoding, Encoding}:
             Sets the default character set (default UTF-8). This character
             set is used only if the XML document does not specify one
             explicitly.

             * Encoding = utf8 | {utf16,big} | {utf16,little} | latin1 | list

           skip_external_dtd:
             Skips the external DTD during parsing. This option is the same
             as {external_entities, none} and {fail_undeclared_ref, false},
             but applies only to the DTD.

           disallow_entities:
             Parsing fails if an ENTITY declaration is found.

           {entity_recurse_limit, N}:
             Sets how many levels of recursion are allowed for entities.
             Default is 3 levels.

           {external_entities, AllowedType}:
             Sets which types of external entities are allowed. An external
             entity of a type that is not allowed is skipped.

             * AllowedType = all | file | none

           {fail_undeclared_ref, Boolean}:
             Decides how the parser behaves when an undeclared reference is
             found. Can be useful if external entities have been turned off
             so that an external DTD is not parsed. Default is true.

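           As an illustration of how these options combine, the following
           sketch builds an option list for parsing input from an untrusted
           source. The module name sax_options_example and the function
           parse_untrusted/1 are hypothetical; only the option names are
           taken from the list above.

```erlang
-module(sax_options_example).
-export([parse_untrusted/1]).

%% Hypothetical helper: parse a binary with entity handling locked down.
%% The event fun ignores every event and returns its state unchanged.
parse_untrusted(Xml) when is_binary(Xml) ->
    Options = [{event_fun, fun(_Event, _Location, State) -> State end},
               {event_state, no_state},
               {encoding, utf8},           % used only if the document names none
               disallow_entities,          % fail on any ENTITY declaration
               {external_entities, none},  % skip all external entities
               skip_external_dtd],         % do not read the external DTD
    xmerl_sax_parser:stream(Xml, Options).
```

           For a well-formed document without entity declarations, the call
           is expected to return {ok, no_state, Rest}.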
         event():
           The SAX events that are sent to the user via the callback.

           startDocument:
             Receive notification of the beginning of a document. The SAX
             parser will send this event only once before any other event
             callbacks.

           endDocument:
             Receive notification of the end of a document. The SAX parser
             will send this event only once, and it will be the last event
             during the parse.

           {startPrefixMapping, Prefix, Uri}:
             Begin the scope of a prefix-URI Namespace mapping. Note that
             start/endPrefixMapping events are not guaranteed to be properly
             nested relative to each other: all startPrefixMapping events
             will occur immediately before the corresponding startElement
             event, and all endPrefixMapping events will occur immediately
             after the corresponding endElement event, but their order is not
             otherwise guaranteed. There will not be start/endPrefixMapping
             events for the "xml" prefix, since it is predeclared and
             immutable.

             * Prefix = string()

             * Uri = string()

           {endPrefixMapping, Prefix}:
             End the scope of a prefix-URI mapping.

             * Prefix = string()

           {startElement, Uri, LocalName, QualifiedName, Attributes}:
             Receive notification of the beginning of an element. The Parser
             will send this event at the beginning of every element in the
             XML document; there will be a corresponding endElement event for
             every startElement event (even when the element is empty). All
             of the element's content will be reported, in order, before the
             corresponding endElement event.

             * Uri = string()

             * LocalName = string()

             * QualifiedName = {Prefix, LocalName}

             * Prefix = string()

             * Attributes = [{Uri, Prefix, AttributeName, Value}]

             * AttributeName = string()

             * Value = string()

           {endElement, Uri, LocalName, QualifiedName}:
             Receive notification of the end of an element. The SAX parser
             will send this event at the end of every element in the XML
             document; there will be a corresponding startElement event for
             every endElement event (even when the element is empty).

             * Uri = string()

             * LocalName = string()

             * QualifiedName = {Prefix, LocalName}

             * Prefix = string()

           {characters, string()}:
             Receive notification of character data.

           {ignorableWhitespace, string()}:
             Receive notification of ignorable whitespace in element content.

           {processingInstruction, Target, Data}:
             Receive notification of a processing instruction. The Parser
             will send this event once for each processing instruction found:
             note that processing instructions may occur before or after the
             main document element.

             * Target = string()

             * Data = string()

           {comment, string()}:
             Report an XML comment anywhere in the document (both inside and
             outside of the document element).

           startCDATA:
             Report the start of a CDATA section. The contents of the CDATA
             section will be reported through the regular characters event.

           endCDATA:
             Report the end of a CDATA section.

           {startDTD, Name, PublicId, SystemId}:
             Report the start of DTD declarations, that is, the start of the
             DOCTYPE declaration. If the document has no DOCTYPE declaration,
             this event will not be sent.

             * Name = string()

             * PublicId = string()

             * SystemId = string()

           endDTD:
             Report the end of DTD declarations, that is, the end of the
             DOCTYPE declaration.

           {startEntity, SysId}:
             Report the beginning of some internal and external XML entities.
             ???

           {endEntity, SysId}:
             Report the end of an entity. ???

           {elementDecl, Name, Model}:
             Report an element type declaration. The content model will
             consist of the string "EMPTY", the string "ANY", or a
             parenthesised group, optionally followed by an occurrence
             indicator. The model will be normalized so that all parameter
             entities are fully resolved and all whitespace is removed, and
             will include the enclosing parentheses. Other normalization
             (such as removing redundant parentheses or simplifying
             occurrence indicators) is at the discretion of the parser.

             * Name = string()

             * Model = string()

           {attributeDecl, ElementName, AttributeName, Type, Mode, Value}:
             Report an attribute type declaration.

             * ElementName = string()

             * AttributeName = string()

             * Type = string()

             * Mode = string()

             * Value = string()

           {internalEntityDecl, Name, Value}:
             Report an internal entity declaration.

             * Name = string()

             * Value = string()

           {externalEntityDecl, Name, PublicId, SystemId}:
             Report a parsed external entity declaration.

             * Name = string()

             * PublicId = string()

             * SystemId = string()

           {unparsedEntityDecl, Name, PublicId, SystemId, Ndata}:
             Receive notification of an unparsed entity declaration event.

             * Name = string()

             * PublicId = string()

             * SystemId = string()

             * Ndata = string()

           {notationDecl, Name, PublicId, SystemId}:
             Receive notification of a notation declaration event.

             * Name = string()

             * PublicId = string()

             * SystemId = string()

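           To show how the event tuples listed above can be consumed, here is
           a minimal sketch of an event fun that folds events into a map. The
           module sax_events_example and the functions summarize/1 and
           handle_event/3 are hypothetical names; the event shapes are the
           ones documented for event().

```erlang
-module(sax_events_example).
-export([summarize/1]).

%% Hypothetical helper: fold the SAX events of Xml into a map holding the
%% local names of all elements (in document order) and the concatenated
%% character data.
summarize(Xml) when is_binary(Xml) ->
    Init = #{elements => [], text => []},
    {ok, #{elements := Names, text := Text}, _Rest} =
        xmerl_sax_parser:stream(Xml, [{event_fun, fun handle_event/3},
                                      {event_state, Init}]),
    #{elements => lists:reverse(Names),
      text => lists:flatten(lists:reverse(Text))}.

%% startElement carries Uri, LocalName, QualifiedName and Attributes.
handle_event({startElement, _Uri, LocalName, _QName, _Attrs}, _Loc, State) ->
    maps:update_with(elements, fun(Ns) -> [LocalName | Ns] end, State);
%% characters carries a string(), that is, a list of code points.
handle_event({characters, Chars}, _Loc, State) ->
    maps:update_with(text, fun(Cs) -> [Chars | Cs] end, State);
%% Every other event (startDocument, comment, endElement, ...) is ignored.
handle_event(_Event, _Loc, State) ->
    State.
```

           Accumulating in reverse and reversing once at the end keeps each
           callback cheap; character data may arrive in more than one
           characters event, so the sketch concatenates the pieces.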
         unicode_char():
           Integer representing a valid Unicode code point.

         unicode_binary():
           Binary with characters encoded in UTF-8 or UTF-16.

         latin1_binary():
           Binary with characters encoded in ISO Latin-1 (ISO-8859-1).

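         The relation between these input types and the string() values in
         events can be sketched as follows: a latin1_binary() is parsed with
         the {encoding, latin1} option, and the resulting code points are
         re-encoded as UTF-8. The module sax_encoding_example and the function
         text_as_utf8/2 are hypothetical, and the sketch assumes the document
         carries no encoding declaration of its own.

```erlang
-module(sax_encoding_example).
-export([text_as_utf8/2]).

%% Hypothetical helper: parse Xml using Encoding as the default character
%% set and return all character data re-encoded as a UTF-8 binary.
text_as_utf8(Xml, Encoding) when is_binary(Xml) ->
    Collect = fun({characters, Chars}, _Loc, Acc) -> [Chars | Acc];
                 (_Event, _Loc, Acc) -> Acc
              end,
    {ok, Chunks, _Rest} =
        xmerl_sax_parser:stream(Xml, [{encoding, Encoding},
                                      {event_fun, Collect},
                                      {event_state, []}]),
    %% Chunks is a reversed list of code point lists.
    unicode:characters_to_binary(lists:reverse(Chunks)).
```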

EXPORTS

       file(Filename, Options) -> Result

              Types:

                 Filename = string()
                 Options = [option()]
                 Result = {ok, EventState, Rest} |
                  {Tag, Location, Reason, EndTags, EventState}
                 Rest = unicode_binary() | latin1_binary()
                 Tag = atom() (fatal_error, or user defined tag)
                 Location = {CurrentLocation, EntityName, LineNo}
                 CurrentLocation = string()
                 EntityName = string()
                 LineNo = integer()
                 EventState = term()
                 Reason = term()

              Parse a file containing an XML document. This function uses a
              default continuation function to read the file in blocks.

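              A minimal sketch of calling file/2 follows. It counts the
              elements in a file and maps the two documented result shapes to
              simpler return values. The module sax_file_example and the
              function count_elements/1 are hypothetical. The last case clause
              is an assumption: the text above does not say what is returned
              when the file cannot be opened, so any other term is passed
              through.

```erlang
-module(sax_file_example).
-export([count_elements/1]).

%% Hypothetical helper: count the elements in the XML file Filename.
count_elements(Filename) ->
    Count = fun({startElement, _Uri, _LocalName, _QName, _Attrs}, _Loc, N) ->
                    N + 1;
               (_Event, _Loc, N) ->
                    N
            end,
    case xmerl_sax_parser:file(Filename, [{event_fun, Count},
                                          {event_state, 0}]) of
        {ok, N, _Rest} ->
            {ok, N};
        {Tag, Location, Reason, _EndTags, _EventState} ->
            {error, {Tag, Location, Reason}};
        Other ->
            %% Assumption: undocumented result, for example an unreadable file.
            {error, Other}
    end.
```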
       stream(Xml, Options) -> Result

              Types:

                 Xml = unicode_binary() | latin1_binary() | [unicode_char()]
                 Options = [option()]
                 Result = {ok, EventState, Rest} |
                  {Tag, Location, Reason, EndTags, EventState}
                 Rest = unicode_binary() | latin1_binary() | [unicode_char()]
                 Tag = atom() (fatal_error or user defined tag)
                 Location = {CurrentLocation, EntityName, LineNo}
                 CurrentLocation = string()
                 EntityName = string()
                 LineNo = integer()
                 EventState = term()
                 Reason = term()

              Parse a stream containing an XML document.

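              The two result shapes of stream/2 can be handled as in the
              following sketch, which reports whether a document parses
              without a fatal error. The module sax_stream_example and the
              function check/1 are hypothetical. The input may be a binary or
              a list of code points, as given in the Xml type above.

```erlang
-module(sax_stream_example).
-export([check/1]).

%% Hypothetical helper: return ok if Xml parses, otherwise the line number
%% and reason taken from the fatal_error result.
check(Xml) ->
    Ignore = fun(_Event, _Location, State) -> State end,
    case xmerl_sax_parser:stream(Xml, [{event_fun, Ignore}]) of
        {ok, _EventState, _Rest} ->
            ok;
        {fatal_error, {_CurrentLocation, _EntityName, LineNo}, Reason,
         _EndTags, _EventState} ->
            {error, LineNo, Reason}
    end.
```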

CALLBACK FUNCTIONS

       The callback interface works by having the user pass a fun with the
       correct signature to the parser.

EXPORTS

       Module:ContinuationFun(State) -> {NewBytes, NewState}

              Types:

                 State = NewState = term()
                 NewBytes = binary() | list() (should be the same type as the
                 start input in stream/2)

              This function is called whenever the parser runs out of input
              data. If the function cannot get hold of more input, it returns
              an empty list or binary (depending on the start input in
              stream/2). Other types of errors are handled through exceptions.
              Use throw/1 to send the tuple {Tag = atom(), Reason = string()}
              if the continuation function encounters a fatal error. Tag is an
              atom that identifies the functional entity that sends the
              exception and Reason is a string that describes the problem.

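              A continuation fun along these lines might look like the sketch
              below, which feeds the parser from a list of binary chunks. The
              module sax_chunks_example and the function count_elements/1 are
              hypothetical. The sketch assumes the first chunk is non-empty
              and that all chunks are binaries, matching the start input.

```erlang
-module(sax_chunks_example).
-export([count_elements/1]).

%% Hypothetical helper: parse a document supplied as a non-empty list of
%% binary chunks and count its elements. The first chunk starts the parse;
%% the continuation fun hands out the remaining chunks one at a time, and
%% its state is the list of chunks not yet consumed.
count_elements([First | Remaining]) ->
    Continue = fun([Chunk | More]) -> {Chunk, More};
                  ([]) -> {<<>>, []}   % no more input: empty binary
               end,
    Count = fun({startElement, _Uri, _LocalName, _QName, _Attrs}, _Loc, N) ->
                    N + 1;
               (_Event, _Loc, N) ->
                    N
            end,
    {ok, N, _Rest} =
        xmerl_sax_parser:stream(First, [{continuation_fun, Continue},
                                        {continuation_state, Remaining},
                                        {event_fun, Count},
                                        {event_state, 0}]),
    N.
```

              Chunk boundaries do not need to line up with tags; the parser
              asks for more input wherever it runs out.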
       Module:EventFun(Event, Location, State) -> NewState

              Types:

                 Event = event()
                 Location = {CurrentLocation, EntityName, LineNo}
                 CurrentLocation = string()
                 EntityName = string()
                 LineNo = integer()
                 State = NewState = term()

              This function is called for every event sent by the parser.
              Error handling is done through exceptions. Use throw/1 to send
              the tuple {Tag = atom(), Reason = string()} if the application
              encounters a fatal error. Tag is an atom that identifies the
              functional entity that sends the exception and Reason is a
              string that describes the problem.

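              The error convention can be sketched with an event fun that
              rejects documents nested deeper than a given limit. The module
              sax_abort_example, the function check_depth/2 and the tag
              too_deep are hypothetical. The sketch assumes that the thrown
              Tag and Reason come back in the five-element result tuple
              documented for stream/2.

```erlang
-module(sax_abort_example).
-export([check_depth/2]).

%% Hypothetical helper: return ok if no element in Xml is nested deeper
%% than MaxDepth, otherwise an error built from the parser result.
%% The event state is the current nesting depth.
check_depth(Xml, MaxDepth) ->
    EventFun =
        fun({startElement, _, _, _, _}, _Loc, Depth) when Depth >= MaxDepth ->
                %% Abort the parse with a user defined tag and a reason string.
                throw({too_deep, "element nesting exceeds limit"});
           ({startElement, _, _, _, _}, _Loc, Depth) ->
                Depth + 1;
           ({endElement, _, _, _}, _Loc, Depth) ->
                Depth - 1;
           (_Event, _Loc, Depth) ->
                Depth
        end,
    case xmerl_sax_parser:stream(Xml, [{event_fun, EventFun},
                                       {event_state, 0}]) of
        {ok, _Depth, _Rest} ->
            ok;
        {Tag, {_CurrentLocation, _EntityName, LineNo}, Reason,
         _EndTags, _EventState} ->
            {error, Tag, LineNo, Reason}
    end.
```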
Ericsson AB                     xmerl 1.3.31.1             xmerl_sax_parser(3)