Makefile::DOM(3)      User Contributed Perl Documentation     Makefile::DOM(3)

NAME

       Makefile::DOM - Simple DOM parser for Makefiles

VERSION

       This document describes Makefile::DOM 0.008 released on 18 November
       2014.

DESCRIPTION

       This library can serve as an advanced lexer for (GNU) makefiles. It
       parses makefiles as "documents" and the parsing is lossless. The
       results are data structures similar to DOM trees. The DOM trees hold
       every single bit of the information in the original input files,
       including whitespace, blank lines, and makefile comments. That means
       it's possible to reproduce the original makefiles from the DOM trees.
       In addition, each node of the DOM trees is modifiable and so is the
       whole tree, just as with the PPI module used for parsing Perl source
       and the HTML::TreeBuilder module used for parsing HTML source.

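       As a minimal sketch of the lossless property described above, the
       following parses a makefile held in a string and compares the
       regenerated source with the input. It relies only on the "new" and
       "content" methods described later in this document; the sample
       makefile text is made up for illustration.

```perl
use strict;
use warnings;
use MDOM::Document::Gmake;

# Hypothetical makefile source; the recipe line starts with a tab.
my $src = "all: foo.o # build everything\n\t\@echo done\n";

# Passing a scalar reference parses the string instead of a file.
my $dom = MDOM::Document::Gmake->new(\$src);

# content() regenerates the source; lossless parsing implies equality.
print $dom->content eq $src ? "round trip ok\n" : "round trip differs\n";
```

       If the parser is lossless for this input, as the description claims,
       the script should report that the round trip is ok.
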
       If you're looking for a true GNU make parser that generates an AST,
       please see Makefile::Parser::GmakeDB instead.

       The interface of "Makefile::DOM" mimics the API design of PPI. In fact,
       I've directly stolen the source code and POD documentation of
       PPI::Node, PPI::Element, and PPI::Dumper, with the full permission of
       the author of PPI, Adam Kennedy.

       "Makefile::DOM" tries to be independent of any specific makefile
       syntax. The same set of DOM node types is supposed to be shared by
       different makefile DOM generators. For example, MDOM::Document::Gmake
       parses GNU makefiles and returns an instance of MDOM::Document, i.e.,
       the root of the DOM tree, while a future NMAKE makefile lexer,
       "MDOM::Document::Nmake", would also return instances of the
       MDOM::Document class. Later, I'll also consider adding support for
       dmake and bsdmake.

Structure of the DOM

       Makefile DOM (MDOM) is a structured set of data types. They provide a
       flexible document model conforming to the makefile syntax. Below is a
       complete list of the 19 MDOM classes in the current implementation,
       where the indentation indicates the class inheritance relationships.

           MDOM::Element
               MDOM::Node
                   MDOM::Unknown
                   MDOM::Assignment
                   MDOM::Command
                   MDOM::Directive
                   MDOM::Document
                       MDOM::Document::Gmake
                   MDOM::Rule
                       MDOM::Rule::Simple
                       MDOM::Rule::StaticPattern
               MDOM::Token
                   MDOM::Token::Bare
                   MDOM::Token::Comment
                   MDOM::Token::Continuation
                   MDOM::Token::Interpolation
                   MDOM::Token::Modifier
                   MDOM::Token::Separator
                   MDOM::Token::Whitespace

       It's not hard to see that all of the MDOM classes inherit from the
       MDOM::Element class. MDOM::Token and MDOM::Node are its direct
       children. The former represents a string token, which is atomic from
       the perspective of the lexer, while the latter represents a structured
       node, which usually has one or more children and serves as the
       container for other MDOM::Element objects.

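       The split between tokens and nodes suggests a simple recursive
       traversal. The sketch below is a hand-rolled walker, not part of the
       library: the "walk" helper is hypothetical and uses only Perl's "ref"
       and "isa" plus the "elements" and "content" methods described under
       OPERATIONS FOR MDOM TREES. It assumes that "elements" returns the
       child elements of a node in list context.

```perl
use strict;
use warnings;
use MDOM::Document::Gmake;

# Print one line per element, indented by its depth in the tree.
sub walk {
    my ($elem, $depth) = @_;
    my $line = ('  ' x $depth) . ref($elem);
    if ($elem->isa('MDOM::Node')) {
        # A node is a container: print its class, then recurse.
        print "$line\n";
        walk($_, $depth + 1) for $elem->elements;
    }
    else {
        # A token is atomic: show its text with newlines and tabs escaped.
        (my $text = $elem->content) =~ s/\n/\\n/g;
        $text =~ s/\t/\\t/g;
        print "$line  '$text'\n";
    }
}

my $src = qq{all : ; echo "hello, world"\n};
walk(MDOM::Document::Gmake->new(\$src), 0);
```
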
       Next we'll show a few examples to demonstrate how particular makefiles
       map to their DOM trees.

       Case 1
           Consider the following simple "hello, world" makefile:

               all : ; echo "hello, world"

           We can use the MDOM::Dumper class provided by Makefile::DOM to dump
           out the internal structure of its corresponding MDOM tree:

               MDOM::Document::Gmake
                 MDOM::Rule::Simple
                   MDOM::Token::Bare         'all'
                   MDOM::Token::Whitespace   ' '
                   MDOM::Token::Separator    ':'
                   MDOM::Token::Whitespace   ' '
                   MDOM::Command
                     MDOM::Token::Separator    ';'
                     MDOM::Token::Whitespace   ' '
                     MDOM::Token::Bare         'echo "hello, world"'
                     MDOM::Token::Whitespace   '\n'

           In this example, the separators ":" and ";" are both instances of
           the MDOM::Token::Separator class, while spaces and newline
           characters are all represented as MDOM::Token::Whitespace. The
           other two leaf nodes, "all" and "echo "hello, world"", both belong
           to MDOM::Token::Bare.

           It's worth mentioning that the space characters in the rule
           command "echo "hello, world"" are not represented as
           MDOM::Token::Whitespace. That's because the spaces in commands
           have no syntactic meaning to "make"; they are usually passed to
           the shell verbatim. Therefore, the DOM parser does not try to
           recognize those spaces specifically, so as to reduce memory use
           and the number of nodes. However, leading spaces and trailing
           newlines are still recognized as MDOM::Token::Whitespace.

           On a higher level, there is an MDOM::Rule::Simple instance holding
           several tokens and one MDOM::Command. On the highest level is the
           root node of the whole DOM tree, i.e., an instance of
           MDOM::Document::Gmake.

       Case 2
           Below is a relatively complex example:

               a: foo.c  bar.h $(baz) # hello!
                   @echo ...

           Its corresponding DOM structure is

             MDOM::Document::Gmake
               MDOM::Rule::Simple
                 MDOM::Token::Bare         'a'
                 MDOM::Token::Separator    ':'
                 MDOM::Token::Whitespace   ' '
                 MDOM::Token::Bare         'foo.c'
                 MDOM::Token::Whitespace   '  '
                 MDOM::Token::Bare         'bar.h'
                 MDOM::Token::Whitespace   '\t'
                 MDOM::Token::Interpolation   '$(baz)'
                 MDOM::Token::Whitespace      ' '
                 MDOM::Token::Comment         '# hello!'
                 MDOM::Token::Whitespace      '\n'
               MDOM::Command
                 MDOM::Token::Separator    '\t'
                 MDOM::Token::Modifier     '@'
                 MDOM::Token::Bare         'echo ...'
                 MDOM::Token::Whitespace   '\n'

           Compared to the previous example, several new node types appear
           here.

           The variable interpolation "$(baz)" on the first line of the
           original makefile corresponds to an MDOM::Token::Interpolation node
           in its MDOM tree. Similarly, the comment "# hello!" corresponds to
           an MDOM::Token::Comment node.

           On the second line, the rule command indented by a tab character is
           still represented by an MDOM::Command object. Its first child node
           (or its first element) is again an MDOM::Token::Separator instance,
           corresponding to that tab. The command modifier "@", which is of
           type MDOM::Token::Modifier, immediately follows the separator.

       Case 3
           Now let's study a sample makefile with various global structures:

             a: b
             foo = bar
                 # hello!

           Here on the top level, there are three language structures: one
           rule "a: b", one assignment statement "foo = bar", and one
           comment "# hello!".

           Its MDOM tree is shown below:

             MDOM::Document::Gmake
               MDOM::Rule::Simple
                 MDOM::Token::Bare                 'a'
                 MDOM::Token::Separator            ':'
                 MDOM::Token::Whitespace           ' '
                 MDOM::Token::Bare                 'b'
                 MDOM::Token::Whitespace           '\n'
               MDOM::Assignment
                 MDOM::Token::Bare                 'foo'
                 MDOM::Token::Whitespace           ' '
                 MDOM::Token::Separator            '='
                 MDOM::Token::Whitespace           ' '
                 MDOM::Token::Bare                 'bar'
                 MDOM::Token::Whitespace           '\n'
               MDOM::Token::Whitespace             '\t'
               MDOM::Token::Comment                '# hello!'
               MDOM::Token::Whitespace             '\n'

           We can see that below the root node MDOM::Document::Gmake, there
           are three elements, MDOM::Rule::Simple, MDOM::Assignment, and
           MDOM::Token::Comment, as well as two MDOM::Token::Whitespace
           objects.

       It can be observed from the examples above that the MDOM representation
       of the makefile's lexical elements is rather loose. It provides only a
       very limited structural representation rather than making a bad guess.
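
       Dumps like the ones in the cases above might be produced along the
       following lines. This is only a sketch: it assumes that MDOM::Dumper
       keeps the interface of PPI::Dumper, from which it was derived (a "new"
       constructor taking the tree and a "print" method writing to STDOUT).
       Check the MDOM::Dumper documentation before relying on it.

```perl
use strict;
use warnings;
use MDOM::Document::Gmake;
use MDOM::Dumper;

# Hypothetical input: one rule followed by one assignment.
my $src = "a: b\nfoo = bar\n";
my $dom = MDOM::Document::Gmake->new(\$src);

# Assumed to mirror PPI::Dumper: wrap the tree, then print the dump.
my $dumper = MDOM::Dumper->new($dom);
$dumper->print;
```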

OPERATIONS FOR MDOM TREES

       Generating an MDOM tree from a GNU makefile requires only two lines of
       Perl code:

           use MDOM::Document::Gmake;
           my $dom = MDOM::Document::Gmake->new('Makefile');

       If the makefile source code being parsed is already stored in a Perl
       variable, say, $var, then we can construct an MDOM tree from a
       reference to that variable:

           my $dom = MDOM::Document::Gmake->new(\$var);

       Now $dom holds a reference to the root of the MDOM tree. The object is
       of type MDOM::Document::Gmake, so it is also an instance of the
       MDOM::Node class.

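       These class relationships can be checked at run time with Perl's
       built-in "isa" method. The following is a small sketch using a made-up
       one-line makefile; going by the class list in "Structure of the DOM",
       all four checks should report yes.

```perl
use strict;
use warnings;
use MDOM::Document::Gmake;

my $src = "foo = bar\n";
my $dom = MDOM::Document::Gmake->new(\$src);

# Walk up the inheritance chain listed in "Structure of the DOM".
for my $class (qw(MDOM::Document::Gmake MDOM::Document MDOM::Node MDOM::Element)) {
    printf "%-22s %s\n", $class, $dom->isa($class) ? 'yes' : 'no';
}

# A document is a container node, not an atomic token.
print "document is not a token\n" unless $dom->isa('MDOM::Token');
```
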
       As mentioned above, "MDOM::Node" is the container for other
       MDOM::Element instances, so we can retrieve one of its child elements
       via the "child" method:

           $node = $dom->child(3);
           # or $node = $dom->elements(0);

       We may also use the "elements" method to obtain all of the child
       elements:

           @elems = $dom->elements;

       For every MDOM node, the corresponding makefile source can be generated
       by invoking its "content" method.
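
       Putting these methods together, the sketch below lists the top-level
       elements of a small made-up makefile and checks that their sources,
       joined in order, reproduce the input. It assumes that "elements"
       returns every child, including whitespace tokens, in document order.

```perl
use strict;
use warnings;
use MDOM::Document::Gmake;

my $src = "a: b\nfoo = bar\n";
my $dom = MDOM::Document::Gmake->new(\$src);

# Fetch the first top-level element by index.
my $first = $dom->child(0);
print "first element is a ", ref($first), "\n";

# List each top-level element with its class and the source it covers.
for my $elem ($dom->elements) {
    (my $text = $elem->content) =~ s/\n/\\n/g;
    printf "%-20s '%s'\n", ref($elem), $text;
}

# Joining the children's source should give back the whole input.
my $joined = join '', map { $_->content } $dom->elements;
print "children cover the whole input\n" if $joined eq $src;
```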

BUGS AND TODO

       The current implementation of the MDOM::Document::Gmake lexer is based
       on a hand-written state machine. Although the efficiency of the engine
       is not bad, the code is rather complicated and messy, which hurts both
       extensibility and maintainability. The parser is therefore expected to
       be rewritten using a grammar tool such as the Perl 6 regex engine
       Pugs::Compiler::Rule or a yacc-style one like Parse::Yapp.

SOURCE REPOSITORY

       You can always get the latest source code of this module from its
       GitHub repository:

       <http://github.com/agentzh/makefile-dom-pm>

       If you want a commit bit, please let me know.

AUTHOR

       Yichun "agentzh" Zhang (章亦春) <agentzh@gmail.com>

COPYRIGHT AND LICENSE

       Copyright 2006-2014 by Yichun "agentzh" Zhang (章亦春).

       This library is free software; you can redistribute it and/or modify it
       under the same terms as Perl itself.

SEE ALSO

       MDOM::Document, MDOM::Document::Gmake, PPI, Makefile::Parser::GmakeDB,
       makesimple.

perl v5.36.0                      2022-07-22                  Makefile::DOM(3)