LaTeXML::Core::Document(3)  User Contributed Perl Documentation  LaTeXML::Core::Document(3)

NAME

       "LaTeXML::Core::Document" - represents an XML document under
       construction.

DESCRIPTION

       A "LaTeXML::Core::Document" represents an XML document being
       constructed by LaTeXML, and also provides the methods for constructing
       it.  It extends LaTeXML::Common::Object.

       LaTeXML will have digested the source material, resulting in a
       LaTeXML::Core::List (from a LaTeXML::Core::Stomach) of
       LaTeXML::Core::Box objects, LaTeXML::Core::Whatsit objects and
       sublists.  At this stage, a document is created and it is responsible
       for `absorbing' the digested material.  Generally, LaTeXML::Core::Box
       and LaTeXML::Core::List objects create text nodes, whereas
       LaTeXML::Core::Whatsit objects create "XML" document fragments,
       elements and attributes according to the defining
       LaTeXML::Core::Definition::Constructor.

       Most document construction occurs at a current insertion point where
       material will be added, and which moves along with the inserted
       material.  The LaTeXML::Common::Model, derived from various
       declarations and the document type, is consulted to determine whether
       an insertion is allowed and when elements may need to be automatically
       opened or closed in order to carry out a given insertion.  For example,
       a "subsection" element will typically be closed automatically when an
       attempt is made to open a "section" element.

       In the methods described here, the term $qname is used for XML
       qualified names.  These are tag names with a namespace prefix.  The
       prefix should be one registered with the current Model, for use within
       the code.  This prefix is not necessarily the same as the one used in
       any DTD, but should be mapped to a namespace URI that was registered
       for the DTD.

       The arguments named $node are XML::LibXML nodes.

       The methods here are grouped into three sections covering basic access
       to the document, insertion methods at the current insertion point, and
       less commonly used, lower-level document manipulation methods.

   Accessors
       "$doc = $document->getDocument;"
           Returns the "XML::LibXML::Document" currently being constructed.

       "$model = $document->getModel;"
           Returns the "LaTeXML::Common::Model" that represents the document
           model used for this document.

       "$node = $document->getNode;"
           Returns the node at the current insertion point during
           construction.  This node is considered still to be `open'; any
           insertions will go into it (if possible).  The node will be an
           "XML::LibXML::Element", "XML::LibXML::Text" or, initially,
           "XML::LibXML::Document".

       "$node = $document->getElement;"
           Returns the closest ancestor to the current insertion point that is
           an Element.

       "@nodes = $document->getChildElement($node);"
           Returns a list of the child elements, if any, of the $node.

       "$node = $document->getLastChildElement($node);"
           Returns the last child element of the $node, if it has one, else
           undef.

       "$node = $document->getFirstChildElement($node);"
           Returns the first child element of the $node, if it has one, else
           undef.

       "@nodes = $document->findnodes($xpath,$node);"
           Returns a list of nodes matching the given $xpath expression.  The
           context node for $xpath is $node, if given, otherwise it is the
           document element.

       "$node = $document->findnode($xpath,$node);"
           Returns the first node matching the given $xpath expression.  The
           context node for $xpath is $node, if given, otherwise it is the
           document element.

       "$qname = $document->getNodeQName($node);"
           Returns the qualified name (localname with namespace prefix) of the
           given $node.  The namespace prefix mapping is the code mapping of
           the current document model.

       "$boolean = $document->canContain($tag,$child);"
           Returns whether an element $tag can contain a child $child.  $tag
           and $child can be nodes, qualified names of nodes
           (prefix:localname), or one of a set of special symbols "#PCDATA",
           "#Comment", "#Document" or "#ProcessingInstruction".

       "$boolean = $document->canContainIndirect($tag,$child);"
           Returns whether an element $tag can contain a child $child either
           directly, or after automatically opening one or more autoOpen-able
           elements.

       "$boolean = $document->canContainSomehow($tag,$child);"
           Returns whether an element $tag can contain a child $child either
           directly, or after automatically opening one or more autoOpen-able
           elements.

       "$boolean = $document->canHaveAttribute($tag,$attrib);"
           Returns whether an element $tag can have an attribute named
           $attrib.

       "$boolean = $document->canAutoOpen($tag);"
           Returns whether an element $tag is able to be automatically opened.

       "$boolean = $document->canAutoClose($node);"
           Returns whether the node $node can be automatically closed.

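       The accessors above can be combined into small query helpers.  The
       following is only a sketch: "qnames_matching" and "allowed_children"
       are hypothetical names, not part of LaTeXML, and the code assumes that
       $document is a LaTeXML::Core::Document obtained elsewhere (for example
       inside a constructor or afterClose hook).

```perl
use strict;
use warnings;

# Hypothetical helper (not part of LaTeXML): return the qualified names of
# all nodes matching $xpath, using only the documented accessors.
sub qnames_matching {
    my ($document, $xpath, $context) = @_;
    # With a context node, the search is relative to it; without one,
    # findnodes is documented to start from the document element.
    my @nodes = defined $context
        ? $document->findnodes($xpath, $context)
        : $document->findnodes($xpath);
    return map { $document->getNodeQName($_) } @nodes;
}

# Hypothetical helper: keep only those candidate children (nodes or
# qualified names) that the document model allows directly inside $tag.
sub allowed_children {
    my ($document, $tag, @candidates) = @_;
    return grep { $document->canContain($tag, $_) } @candidates;
}
```

       A call such as "qnames_matching($document, '//ltx:para')" assumes that
       the prefix "ltx" is registered with the current Model.
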
   Construction Methods
       These methods are the most common ones used for construction of
       documents.  They generally operate by creating new material at the
       current insertion point.  That point initially is just the document
       itself, but it moves along to follow any new insertions.  These methods
       also adapt to the document model so as to automatically open or close
       elements, when it is required for the pending insertion and allowed by
       the document model (see Tag).

       "$xmldoc = $document->finalize;"
           This method finalizes the document by cleaning up various temporary
           attributes, and returns the XML::LibXML::Document that was
           constructed.

       "@nodes = $document->absorb($digested);"
           Absorb the $digested object into the document at the current
           insertion point according to its type.  Various other methods are
           invoked as needed, and document nodes may be automatically opened
           or closed according to the document model.

           This method returns the nodes that were constructed.  Note that the
           nodes may include children of other nodes, and nodes that may
           already have been removed from the document (see filterChildren and
           filterDeletions).  Also, text insertions are often merged with
           existing text nodes; in such cases, the whole text node is included
           in the result.

       "$document->insertElement($qname,$content,%attributes);"
           This is a shorthand for creating an element $qname (with given
           attributes), absorbing $content from within that new node, and then
           closing it.  The $content must be digested material, either a
           single box, or an array of boxes, which will be absorbed into the
           element.  This method returns the newly created node, although it
           will no longer be the current insertion point.

       "$document->insertMathToken($string,%attributes);"
           Insert a math token (XMTok) containing the string $string with the
           given attributes.  Useful attributes would be name, role, font.
           Returns the newly inserted node.

       "$document->insertComment($text);"
           Insert, and return, a comment with the given $text into the current
           node.

       "$document->insertPI($op,%attributes);"
           Insert, and return, a ProcessingInstruction into the current node.

       "$document->openText($text,$font);"
           Open a text node in font $font, performing any required automatic
           opening and closing of intermediate nodes (including those needed
           for font changes) and inserting the string $text into it.

       "$document->openElement($qname,%attributes);"
           Open an element, named $qname and with the given attributes.  This
           will be inserted into the current node while performing any
           required automatic opening and closing of intermediate nodes.  The
           new element is returned, and also becomes the current insertion
           point.  An error (fatal if in "Strict" mode) is signalled if there
           is no allowed way to insert such an element into the current node.

       "$document->closeElement($qname);"
           Close the closest open element named $qname, including any
           intermediate nodes that may be automatically closed.  If that is
           not possible, signal an error.  The closed node's parent becomes
           the current node.  This method returns the closed node.

       "$boolean = $document->isOpenable($qname);"
           Check whether it is possible to open a $qname element at the
           current insertion point.

       "$node = $document->isCloseable($qname);"
           Check whether it is possible to close a $qname element, returning
           the node that would be closed if possible, otherwise undef.

       "$document->maybeCloseElement($qname);"
           Close a $qname element, if it is possible to do so; returns the
           closed node if it was found, else undef.

       "$document->addAttribute($key=>$value);"
           Add the given attribute to the node nearest to the current
           insertion point that is allowed to have it.  This does not change
           the current insertion point.

       "$document->closeToNode($node);"
           This method closes all children of $node until $node becomes the
           insertion point.  Note that it closes any open nodes, not only
           autoCloseable ones.

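       Since "insertElement" is described as shorthand for opening an
       element, absorbing content into it and closing it, the long-hand
       sequence can be sketched as follows.  "insert_wrapped" is a
       hypothetical helper, not part of LaTeXML; the "isOpenable" guard is an
       added design choice so that the helper returns quietly instead of
       signalling an error when the model does not allow $qname here.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of LaTeXML): wrap digested $content in a
# new $qname element at the current insertion point.
sub insert_wrapped {
    my ($document, $qname, $content, %attributes) = @_;

    # Do nothing if $qname cannot be opened at the current insertion point.
    return unless $document->isOpenable($qname);

    # The new element becomes the current insertion point ...
    my $node = $document->openElement($qname, %attributes);
    # ... so the absorbed material lands inside it ...
    $document->absorb($content);
    # ... and closing it makes its parent the current node again.
    $document->closeElement($qname);
    return $node;
}
```

       When no guard is needed, the single call
       "$document->insertElement($qname,$content,%attributes)" covers the
       same three steps.
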
       Internal Insertion Methods

       These are described as an aid to understanding the code; they rarely,
       if ever, should be used outside this module.

       "$document->setNode($node);"
           Sets the current insertion point to be $node.  This should be
           rarely used, if at all; the construction methods of document
           generally maintain the notion of insertion point automatically.
           This may be useful to allow insertion into a different part of the
           document, but you probably want to set the insertion point back to
           the previous node afterwards.

       "$string = $document->getInsertionContext($levels);"
           For debugging, return a string showing the context of the current
           insertion point; that is, the string of the nodes leading up to it.
           If $levels is defined, show only that many nodes.

       "$node = $document->find_insertion_point($qname);"
           This internal method is used to find the appropriate point,
           relative to the current insertion point, at which an element with
           the specified $qname can be inserted.  That position may require
           automatic opening or closing of elements, according to what is
           allowed by the document model.

       "@nodes = getInsertionCandidates($node);"
           Returns a list of elements where an arbitrary insertion might take
           place.  Roughly this is a list starting with $node, followed by its
           parent and the parent's siblings (in reverse order), followed by
           the grandparent and siblings (in reverse order).

       "$node = $document->floatToElement($qname);"
           Finds the nearest element at or preceding the current insertion
           point (see "getInsertionCandidates") that can accept an element
           $qname; it moves the insertion point to that point, and returns the
           previous insertion point.  Generally, after doing whatever you need
           at the new insertion point, you should call
           "$document->setNode($node);" to restore the insertion point.  If no
           such point is found, the insertion point is left unchanged, and
           undef is returned.

       "$node = $document->floatToAttribute($key);"
           This method works the same as "floatToElement", but finds the
           nearest element that can accept the attribute $key.

       "$node = $document->openText_internal($text);"
           This is an internal method, used by "openText", that assumes the
           insertion point has been appropriately adjusted.

       "$node = $document->openMathText_internal($text);"
           This internal method appends $text to the current insertion point,
           which is assumed to be a math node.  It checks for math ligatures
           and carries out any combinations called for.

       "$node = $document->closeText_internal();"
           This internal method closes the current node, which should be a
           text node.  It carries out any text ligatures on the content.

       "$node = $document->closeNode_internal($node);"
           This internal method closes any open text or element nodes starting
           at the current insertion point, up to and including $node.
           Afterwards, the parent of $node will be the current insertion
           point.  It condenses the tree to avoid redundant font switching
           elements.

       "$document->afterOpen($node);"
           Carries out any afterOpen operations that have been recorded (using
           "Tag") for the element name of $node.

       "$document->afterClose($node);"
           Carries out any afterClose operations that have been recorded
           (using "Tag") for the element name of $node.

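       The save-and-restore pattern described for "floatToElement" and
       "setNode" can be sketched as below.  "insert_floating" is a
       hypothetical helper, not part of LaTeXML, and it relies only on the
       behaviour stated above: "floatToElement" returns the previous
       insertion point, or undef when no suitable place exists.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of LaTeXML): insert a $qname element at the
# nearest place that accepts it, then put the insertion point back.
sub insert_floating {
    my ($document, $qname, $content, %attributes) = @_;

    # Moves the insertion point and returns the previous one; returns undef
    # (leaving the insertion point unchanged) if nothing can accept $qname.
    my $saved = $document->floatToElement($qname);
    return unless defined $saved;

    my $node = $document->insertElement($qname, $content, %attributes);

    # Restore the original insertion point so later insertions continue
    # where they would have gone without the detour.
    $document->setNode($saved);
    return $node;
}
```

       Restoring with "setNode" only on the success path matches the
       documented behaviour, since a failed "floatToElement" does not move
       the insertion point.
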
   Document Modification
       The following methods are used to perform various sorts of modification
       and rearrangements of the document, after the normal flow of insertion
       has taken place.  These may be needed after an environment (or perhaps
       the whole document) has been completed and one needs to analyze what it
       contains to decide on the appropriate representation.

       "$document->setAttribute($node,$key,$value);"
           Sets the attribute $key to $value on $node.  This method is
           preferred over the direct LibXML one, since it takes care of
           decoding namespaces (if $key is a qname), and also manages
           recording of xml:id's.

       "$document->recordID($id,$node);"
           Records the association of the given $node with the $id, which
           should be the "xml:id" attribute of the $node.  Usually this
           association will be maintained by the methods that create nodes or
           set attributes.

       "$document->unRecordID($id);"
           Removes the node associated with the given $id, if any.  This might
           be needed if a node is deleted.

       "$document->modifyID($id);"
           Adjusts $id, if needed, so that it is unique.  It does this by
           appending a letter and incrementing until it finds an id that is
           not yet associated with a node.

       "$node = $document->lookupID($id);"
           Returns the node, if any, that is associated with the given $id.

       "$document->setNodeBox($node,$box);"
           Records the $box (being a Box, Whatsit or List), that was
           (presumably) responsible for the creation of the element $node.
           This information is useful for determining source locations,
           original TeX strings, and so forth.

       "$box = $document->getNodeBox($node);"
           Returns the $box that was responsible for creating the element
           $node.

       "$document->setNodeFont($node,$font);"
           Records the font object that encodes the font that should be used
           to display any text within the element $node.

       "$font = $document->getNodeFont($node);"
           Returns the font object associated with the element $node.

       "$node = $document->openElementAt($point,$qname,%attributes);"
           Opens a new child element in $point with the qualified name $qname
           and with the given attributes.  This method is not affected by, nor
           does it affect, the current insertion point.  It does manage
           namespaces, xml:id's and associating a box, font and locator with
           the new element, as well as running any "afterOpen" operations.

       "$node = $document->closeElementAt($node);"
           Closes $node.  This method is not affected by, nor does it affect,
           the current insertion point.  However, it does run any "afterClose"
           operations, so any element that was created using the lower-level
           "openElementAt" should be closed using this method.

       "$node = $document->appendClone($node,@newchildren);"
           Appends clones of @newchildren to $node.  This method modifies any
           ids found within @newchildren (using "modifyID"), and fixes up any
           references to those ids within the clones so that they refer to the
           modified id.

       "$node = $document->wrapNodes($qname,@nodes);"
           This method wraps the @nodes in a new element with qualified name
           $qname; that new node replaces the first of @nodes.  The remaining
           nodes in @nodes must be following siblings of the first one.

           NOTE: Does this need multiple nodes?  If so, perhaps some kind of
           movenodes helper?  Otherwise, what about attributes?

       "$node = $document->unwrapNodes($node);"
           Unwrap the children of $node, by replacing $node by its children.

       "$node = $document->replaceNode($node,@nodes);"
           Replace $node by @nodes; presumably they are some sort of
           descendant nodes.

       "$node = $document->renameNode($node,$newname);"
           Rename $node to the tagname $newname; equivalently replace $node by
           a new node with name $newname and copy the attributes and contents.
           It is assumed that $newname can contain those attributes and
           contents.

       "@nodes = $document->filterDeletions(@nodes);"
           This function is useful with "$document->absorb($box)", when you
           want to filter out any nodes that have been deleted and no longer
           appear in the document.

       "@nodes = $document->filterChildren(@nodes);"
           This function is useful with "$document->absorb($box)", when you
           want to filter out any nodes that are children of other nodes in
           @nodes.

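       Two of the patterns described in this section can be sketched briefly:
       pairing "openElementAt" with "closeElementAt", and cleaning up the
       result of "absorb" with the two filter methods.  Both helper names are
       hypothetical and not part of LaTeXML; the order in which the two
       filters are applied is an assumption, not something this page
       prescribes.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of LaTeXML): add an empty $qname child to
# $point without disturbing the current insertion point.
sub append_child_element {
    my ($document, $point, $qname, %attributes) = @_;
    my $node = $document->openElementAt($point, $qname, %attributes);
    # Elements opened with openElementAt should be closed with
    # closeElementAt so that any afterClose operations are run.
    $document->closeElementAt($node);
    return $node;
}

# Hypothetical helper: absorb $digested and return only the nodes that are
# still in the document and are not children of other returned nodes.
sub absorb_top_level {
    my ($document, $digested) = @_;
    my @nodes = $document->absorb($digested);
    @nodes = $document->filterDeletions(@nodes);
    return $document->filterChildren(@nodes);
}
```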

AUTHOR

       Bruce Miller <bruce.miller@nist.gov>

COPYRIGHT

       Public domain software, produced as part of work done by the United
       States Government & not subject to copyright in the US.

perl v5.34.0                      2021-11-24        LaTeXML::Core::Document(3)