xmerl_sax_parser(3)        Erlang Module Definition        xmerl_sax_parser(3)

NAME

       xmerl_sax_parser - XML SAX parser API

DESCRIPTION

       A SAX parser for XML that sends parse events through a callback
       interface. SAX is the Simple API for XML, originally a Java-only API.
       SAX was the first widely adopted API for XML in Java, and is a de facto
       standard with versions for several programming language environments
       other than Java.

DATA TYPES

         option():
           Options used to customize the behaviour of the parser. Possible
           options are:

           {continuation_fun, ContinuationFun}:
             ContinuationFun is a callback function that decides what to do
             if the parser runs into EOF before the document is complete.

           {continuation_state, term()}:
             State that is accessible in the continuation callback function.

           {event_fun, EventFun}:
             EventFun is the callback function for parser events.

           {event_state, term()}:
             State that is accessible in the event callback function.

           {file_type, FileType}:
             Flag that tells the parser whether it is parsing a DTD or a
             normal XML file (default normal).

             * FileType = normal | dtd

           {encoding, Encoding}:
             Sets the default character set (default UTF-8). This character
             set is used only if the XML document does not specify one
             explicitly.

             * Encoding = utf8 | {utf16,big} | {utf16,little} | latin1 | list

           skip_external_dtd:
             Skips the external DTD during parsing.

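           As a rough illustration of how the options above combine, the
           sketch below builds an option list. The module name sax_options
           and the function options/0 are hypothetical; only the option
           names come from the list above.

```erlang
-module(sax_options).
-export([options/0]).

%% Build an option list suitable for xmerl_sax_parser:file/2 or stream/2.
%% (Hypothetical helper, not part of xmerl.)
options() ->
    %% Event callback that ignores every event and keeps the state as is.
    EventFun = fun(_Event, _Location, State) -> State end,
    [{event_fun, EventFun},
     {event_state, []},        % initial state handed to EventFun
     {encoding, latin1},       % used only if the document names no encoding
     skip_external_dtd].       % do not read the external DTD
```

           Options that are left out, such as file_type, keep the defaults
           stated above.
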
         event():
           The SAX events that are sent to the user via the callback.

           startDocument:
             Receive notification of the beginning of a document. The SAX
             parser will send this event only once, before any other event
             callbacks.

           endDocument:
             Receive notification of the end of a document. The SAX parser
             will send this event only once, and it will be the last event
             during the parse.

           {startPrefixMapping, Prefix, Uri}:
             Begin the scope of a prefix-URI Namespace mapping. Note that
             start/endPrefixMapping events are not guaranteed to be properly
             nested relative to each other: all startPrefixMapping events will
             occur immediately before the corresponding startElement event,
             and all endPrefixMapping events will occur immediately after the
             corresponding endElement event, but their order is not otherwise
             guaranteed. There will not be start/endPrefixMapping events for
             the "xml" prefix, since it is predeclared and immutable.

             * Prefix = string()

             * Uri = string()

           {endPrefixMapping, Prefix}:
             End the scope of a prefix-URI mapping.

             * Prefix = string()

           {startElement, Uri, LocalName, QualifiedName, Attributes}:
             Receive notification of the beginning of an element. The parser
             will send this event at the beginning of every element in the XML
             document; there will be a corresponding endElement event for
             every startElement event (even when the element is empty). All of
             the element's content will be reported, in order, before the
             corresponding endElement event.

             * Uri = string()

             * LocalName = string()

             * QualifiedName = {Prefix, LocalName}

             * Prefix = string()

             * Attributes = [{Uri, Prefix, AttributeName, Value}]

             * AttributeName = string()

             * Value = string()

           {endElement, Uri, LocalName, QualifiedName}:
             Receive notification of the end of an element. The SAX parser
             will send this event at the end of every element in the XML
             document; there will be a corresponding startElement event for
             every endElement event (even when the element is empty).

             * Uri = string()

             * LocalName = string()

             * QualifiedName = {Prefix, LocalName}

             * Prefix = string()

           {characters, string()}:
             Receive notification of character data.

           {ignorableWhitespace, string()}:
             Receive notification of ignorable whitespace in element content.

           {processingInstruction, Target, Data}:
             Receive notification of a processing instruction. The parser
             will send this event once for each processing instruction found:
             note that processing instructions may occur before or after the
             main document element.

             * Target = string()

             * Data = string()

           {comment, string()}:
             Report an XML comment anywhere in the document (both inside and
             outside of the document element).

           startCDATA:
             Report the start of a CDATA section. The contents of the CDATA
             section will be reported through the regular characters event.

           endCDATA:
             Report the end of a CDATA section.

           {startDTD, Name, PublicId, SystemId}:
             Report the start of DTD declarations, that is, the start of the
             DOCTYPE declaration. If the document has no DOCTYPE declaration,
             this event will not be sent.

             * Name = string()

             * PublicId = string()

             * SystemId = string()

           endDTD:
             Report the end of DTD declarations, that is, the end of the
             DOCTYPE declaration.

           {startEntity, SysId}:
             Report the beginning of some internal and external XML entities.
             ???

           {endEntity, SysId}:
             Report the end of an entity. ???

           {elementDecl, Name, Model}:
             Report an element type declaration. The content model will
             consist of the string "EMPTY", the string "ANY", or a
             parenthesised group, optionally followed by an occurrence
             indicator. The model will be normalized so that all parameter
             entities are fully resolved and all whitespace is removed, and
             will include the enclosing parentheses. Other normalization (such
             as removing redundant parentheses or simplifying occurrence
             indicators) is at the discretion of the parser.

             * Name = string()

             * Model = string()

           {attributeDecl, ElementName, AttributeName, Type, Mode, Value}:
             Report an attribute type declaration.

             * ElementName = string()

             * AttributeName = string()

             * Type = string()

             * Mode = string()

             * Value = string()

           {internalEntityDecl, Name, Value}:
             Report an internal entity declaration.

             * Name = string()

             * Value = string()

           {externalEntityDecl, Name, PublicId, SystemId}:
             Report a parsed external entity declaration.

             * Name = string()

             * PublicId = string()

             * SystemId = string()

           {unparsedEntityDecl, Name, PublicId, SystemId, Ndata}:
             Receive notification of an unparsed entity declaration event.

             * Name = string()

             * PublicId = string()

             * SystemId = string()

             * Ndata = string()

           {notationDecl, Name, PublicId, SystemId}:
             Receive notification of a notation declaration event.

             * Name = string()

             * PublicId = string()

             * SystemId = string()

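           The shape of the startElement event described above, in
           particular the Attributes list of {Uri, Prefix, AttributeName,
           Value} tuples, can be pattern matched directly in an event
           callback. The following is a minimal sketch; the module name
           sax_attrs, the function hrefs/1 and the choice of the "href"
           attribute are illustrative assumptions.

```erlang
-module(sax_attrs).
-export([hrefs/1]).

%% Collect the value of every "href" attribute, in document order.
%% (Hypothetical helper, not part of xmerl.)
hrefs(Xml) ->
    EventFun =
        fun({startElement, _Uri, _LocalName, _QName, Attributes}, _Loc, Acc) ->
                %% AttributeName is the third element of each attribute tuple.
                case lists:keyfind("href", 3, Attributes) of
                    {_AttrUri, _Prefix, "href", Value} -> [Value | Acc];
                    false -> Acc
                end;
           (_Event, _Loc, Acc) ->
                %% Every other event leaves the accumulator untouched.
                Acc
        end,
    {ok, Acc, _Rest} =
        xmerl_sax_parser:stream(Xml, [{event_fun, EventFun},
                                      {event_state, []}]),
    lists:reverse(Acc).
```

           Because names and values are reported as string(), the attribute
           name is matched as the list "href" and not as a binary.
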
         unicode_char():
           Integer representing a valid Unicode code point.

         unicode_binary():
           Binary with characters encoded in UTF-8 or UTF-16.

         latin1_binary():
           Binary with characters encoded in iso-latin-1.

EXPORTS

       file(Filename, Options) -> Result

              Types:

                 Filename = string()
                 Options = [option()]
                 Result = {ok, EventState, Rest} |
                  {Tag, Location, Reason, EndTags, EventState}
                 Rest = unicode_binary() | latin1_binary()
                 Tag = atom() (fatal_error, or user defined tag)
                 Location = {CurrentLocation, EntityName, LineNo}
                 CurrentLocation = string()
                 EntityName = string()
                 LineNo = integer()
                 EventState = term()
                 Reason = term()

              Parse a file containing an XML document. This function uses a
              default continuation function to read the file in blocks.

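              A call to file/2 might look like the sketch below, which
              collects the local name of every element in a file. The module
              name sax_file and the function element_names/1 are hypothetical,
              and any result other than the {ok, ...} tuple is simply passed
              back wrapped in {error, ...}.

```erlang
-module(sax_file).
-export([element_names/1]).

%% Return the local names of all elements in Filename, in document order.
%% (Hypothetical helper, not part of xmerl.)
element_names(Filename) ->
    EventFun =
        fun({startElement, _Uri, LocalName, _QName, _Attrs}, _Loc, Acc) ->
                [LocalName | Acc];
           (_Event, _Loc, Acc) ->
                Acc
        end,
    Options = [{event_fun, EventFun}, {event_state, []}],
    case xmerl_sax_parser:file(Filename, Options) of
        {ok, Acc, _Rest} -> {ok, lists:reverse(Acc)};
        Other            -> {error, Other}   % parse failure or unreadable file
    end.
```

              The accumulator is built in reverse and flipped once at the
              end, which keeps each event callback call constant time.
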
       stream(Xml, Options) -> Result

              Types:

                 Xml = unicode_binary() | latin1_binary() | [unicode_char()]
                 Options = [option()]
                 Result = {ok, EventState, Rest} |
                  {Tag, Location, Reason, EndTags, EventState}
                 Rest = unicode_binary() | latin1_binary() | [unicode_char()]
                 Tag = atom() (fatal_error or user defined tag)
                 Location = {CurrentLocation, EntityName, LineNo}
                 CurrentLocation = string()
                 EntityName = string()
                 LineNo = integer()
                 EventState = term()
                 Reason = term()

              Parse a stream containing an XML document.

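              As a small sketch of stream/2 together with the two result
              shapes listed under Types, the example below counts the elements
              in an XML binary. The module name sax_count and the function
              count_elements/1 are hypothetical.

```erlang
-module(sax_count).
-export([count_elements/1]).

%% Count the startElement events produced for an XML binary.
%% (Hypothetical helper, not part of xmerl.)
count_elements(Xml) when is_binary(Xml) ->
    EventFun =
        fun({startElement, _Uri, _LocalName, _QName, _Attrs}, _Loc, N) ->
                N + 1;
           (_Event, _Loc, N) ->
                N
        end,
    Options = [{event_fun, EventFun}, {event_state, 0}],
    case xmerl_sax_parser:stream(Xml, Options) of
        {ok, Count, _Rest} ->
            {ok, Count};
        {Tag, _Location, Reason, _EndTags, _EventState} ->
            {error, {Tag, Reason}}
    end.
```

              With this sketch, count_elements(<<"<a><b/><c/></a>">>) returns
              {ok, 3}.
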

CALLBACK FUNCTIONS

       The callback interface is based on funs: the user passes a fun with the
       correct signature to the parser.

EXPORTS

       ContinuationFun(State) -> {NewBytes, NewState}

              Types:

                 State = NewState = term()
                 NewBytes = binary() | list() (should be the same type as the
                 start input in stream/2)

              This function is called whenever the parser runs out of input
              data. If the function cannot get hold of more input, it returns
              an empty list or binary (depending on the start input in
              stream/2). Other types of errors are handled through exceptions.
              Use throw/1 to send the tuple {Tag = atom(), Reason = string()}
              if the continuation function encounters a fatal error. Tag is an
              atom that identifies the functional entity that sends the
              exception, and Reason is a string that describes the problem.

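              One possible continuation function, sketched under the
              assumption that the input arrives as a list of binaries, hands
              the parser one chunk per call and returns an empty binary once
              the list is exhausted. The module name sax_chunks and the
              function parse_chunks/1 are hypothetical.

```erlang
-module(sax_chunks).
-export([parse_chunks/1]).

%% Parse an XML document that is split over a non-empty list of binaries.
%% (Hypothetical helper, not part of xmerl.)
parse_chunks([First | Rest]) ->
    %% The continuation state is the list of chunks not yet consumed.
    ContinuationFun =
        fun([Chunk | More]) -> {Chunk, More};
           ([])             -> {<<>>, []}     % no more input: empty binary
        end,
    %% Collect character data, newest first.
    EventFun =
        fun({characters, Text}, _Loc, Acc) -> [Text | Acc];
           (_Event, _Loc, Acc)             -> Acc
        end,
    xmerl_sax_parser:stream(First,
                            [{continuation_fun, ContinuationFun},
                             {continuation_state, Rest},
                             {event_fun, EventFun},
                             {event_state, []}]).
```

              The start input is a binary here, so the continuation function
              also signals the end of input with a binary, as the description
              above requires.
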
       EventFun(Event, Location, State) -> NewState

              Types:

                 Event = event()
                 Location = {CurrentLocation, EntityName, LineNo}
                 CurrentLocation = string()
                 EntityName = string()
                 LineNo = integer()
                 State = NewState = term()

              This function is called for every event sent by the parser.
              Error handling is done through exceptions. Use throw/1 to send
              the tuple {Tag = atom(), Reason = string()} if the application
              encounters a fatal error. Tag is an atom that identifies the
              functional entity that sends the exception, and Reason is a
              string that describes the problem.

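              The error convention described above can be sketched with an
              event callback that aborts the parse when elements nest too
              deeply. The module name sax_limit, the function check_depth/2
              and the tag too_deep are hypothetical; the thrown {Tag, Reason}
              tuple is expected to come back in the error tuple documented for
              stream/2.

```erlang
-module(sax_limit).
-export([check_depth/2]).

%% Return ok if no element in Xml is nested deeper than MaxDepth levels.
%% (Hypothetical helper, not part of xmerl.)
check_depth(Xml, MaxDepth) ->
    %% The event state is the number of currently open elements.
    EventFun =
        fun({startElement, _, _, _, _}, _Loc, Depth) when Depth >= MaxDepth ->
                %% Fatal application error: {Tag, Reason} as described above.
                throw({too_deep, "element nesting exceeds the limit"});
           ({startElement, _, _, _, _}, _Loc, Depth) ->
                Depth + 1;
           ({endElement, _, _, _}, _Loc, Depth) ->
                Depth - 1;
           (_Event, _Loc, Depth) ->
                Depth
        end,
    Options = [{event_fun, EventFun}, {event_state, 0}],
    case xmerl_sax_parser:stream(Xml, Options) of
        {ok, _Depth, _Rest} ->
            ok;
        {Tag, _Location, Reason, _EndTags, _EventState} ->
            {error, Tag, Reason}
    end.
```

              For example, check_depth(<<"<a><b/></a>">>, 2) returns ok, while
              a document with three nested elements and the same limit yields
              an {error, too_deep, Reason} tuple.
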
Ericsson AB                      xmerl 1.3.24              xmerl_sax_parser(3)