Makefile::DOM(3)      User Contributed Perl Documentation      Makefile::DOM(3)

Makefile::DOM - Simple DOM parser for Makefiles

This document describes Makefile::DOM 0.008 released on 18 November
2014.

This library can serve as an advanced lexer for (GNU) makefiles. It
parses makefiles as "documents", and the parsing is lossless. The
results are data structures similar to DOM trees. The DOM trees preserve
every bit of the information in the original input files, including
whitespace, blank lines, and makefile comments, so the original
makefiles can be reproduced from the DOM trees. In addition, each node
of a DOM tree is modifiable, and so is the whole tree, just as with the
PPI module used for parsing Perl source and the HTML::TreeBuilder module
used for parsing HTML source.
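
To make the idea of lossless lexing concrete, here is a minimal sketch.
It is a toy, not the lexer used by Makefile::DOM, and the helper name
toy_tokens is made up for this illustration. It splits the input into
runs of whitespace, comments, and everything else, and keeps every
character, so joining the tokens gives back the input.

```perl
use strict;
use warnings;

# Toy lossless lexer (hypothetical helper, not part of Makefile::DOM):
# every character of the input ends up in exactly one token.
sub toy_tokens {
    my ($source) = @_;
    # Alternatives: a whitespace run, a comment up to the end of the
    # line, or a run of anything else. Call this in list context.
    return $source =~ /\G(\s+|#[^\n]*|[^\s#]+)/g;
}

my $source = "all : ; echo hi   # greet\n\nfoo = bar\n";
my @tokens = toy_tokens($source);

# Concatenating the tokens reproduces the input exactly.
print join('', @tokens) eq $source
    ? "round trip ok\n"
    : "round trip failed\n";
```

The real module keeps far more structure than this flat token list, but
the round-trip property is the same one described above.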

If you're looking for a true GNU make parser that generates an AST,
please see Makefile::Parser::GmakeDB instead.

The interface of "Makefile::DOM" mimics the API design of PPI. In fact,
I've directly stolen the source code and POD documentation of
PPI::Node, PPI::Element, and PPI::Dumper, with the full permission of
the author of PPI, Adam Kennedy.

"Makefile::DOM" tries to be independent of any specific makefile
syntax. The same set of DOM node types is meant to be shared by
different makefile DOM generators. For example, MDOM::Document::Gmake
parses GNU makefiles and returns an instance of MDOM::Document, i.e.,
the root of the DOM tree, while a future NMAKE makefile lexer,
"MDOM::Document::Nmake", would also return instances of the
MDOM::Document class. Later, I'll also consider adding support for
dmake and bsdmake.

Makefile DOM (MDOM) is a structured set of data types. They provide a
flexible document model that conforms to the makefile syntax. Below is
a complete list of the 19 MDOM classes in the current implementation,
where the indentation indicates the class inheritance relationships.

    MDOM::Element
       MDOM::Node
          MDOM::Unknown
          MDOM::Assignment
          MDOM::Command
          MDOM::Directive
          MDOM::Document
             MDOM::Document::Gmake
          MDOM::Rule
             MDOM::Rule::Simple
             MDOM::Rule::StaticPattern
       MDOM::Token
          MDOM::Token::Bare
          MDOM::Token::Comment
          MDOM::Token::Continuation
          MDOM::Token::Interpolation
          MDOM::Token::Modifier
          MDOM::Token::Separator
          MDOM::Token::Whitespace

It's not hard to see that all of the MDOM classes inherit from the
MDOM::Element class. MDOM::Token and MDOM::Node are its direct
children. The former represents a string token that is atomic from the
perspective of the lexer, while the latter represents a structured
node, which usually has one or more children and serves as the
container for other MDOM::Element objects.
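
The split between atomic tokens and container nodes can be sketched
with a few toy classes. The Toy::* packages below are hypothetical and
only mirror the shape of the hierarchy; they are not the MDOM classes,
and their constructors and accessors are assumptions made for this
example.

```perl
use strict;
use warnings;

# Hypothetical Toy::* classes mirroring the Element/Token/Node split.
package Toy::Element;
sub new { my ($class, %args) = @_; return bless {%args}, $class }

package Toy::Token;              # atomic: holds a plain string
our @ISA = ('Toy::Element');
sub content { $_[0]{content} }

package Toy::Node;               # container: holds other elements
our @ISA = ('Toy::Element');
sub children { @{ $_[0]{children} || [] } }
# A node's source text is the concatenation of its children's text.
sub content { join '', map { $_->content } $_[0]->children }

package main;

my $rule = Toy::Node->new(children => [
    Toy::Token->new(content => 'a'),
    Toy::Token->new(content => ':'),
    Toy::Token->new(content => ' '),
    Toy::Token->new(content => 'b'),
    Toy::Token->new(content => "\n"),
]);
my $doc = Toy::Node->new(children => [$rule]);

print $doc->content;             # prints "a: b" followed by a newline
print $doc->isa('Toy::Element') ? "is an element\n" : "not an element\n";
```

Because a node only concatenates what its children hold, nothing is
lost as long as every character of the input lands in some token.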

Next we'll show a few examples to demonstrate how DOM trees map to
particular makefiles.

Case 1
Consider the following simple "hello, world" makefile:

    all : ; echo "hello, world"

We can use the MDOM::Dumper class provided by Makefile::DOM to dump
out the internal structure of its corresponding MDOM tree:

    MDOM::Document::Gmake
      MDOM::Rule::Simple
        MDOM::Token::Bare          'all'
        MDOM::Token::Whitespace    ' '
        MDOM::Token::Separator     ':'
        MDOM::Token::Whitespace    ' '
        MDOM::Command
          MDOM::Token::Separator     ';'
          MDOM::Token::Whitespace    ' '
          MDOM::Token::Bare          'echo "hello, world"'
          MDOM::Token::Whitespace    '\n'

In this example, the separators ":" and ";" are both instances of the
MDOM::Token::Separator class, while spaces and newline characters are
all represented as MDOM::Token::Whitespace. The other two leaf nodes,
"all" and "echo "hello, world"", both belong to MDOM::Token::Bare.

It's worth mentioning that the space characters in the rule command
"echo "hello, world"" are not represented as MDOM::Token::Whitespace.
That's because the spaces in commands carry no syntactic meaning for
"make"; they are usually passed to the shell verbatim. Therefore, the
DOM parser does not try to recognize those spaces specifically, which
reduces memory use and the number of nodes. However, leading spaces
and trailing newlines are still recognized as MDOM::Token::Whitespace.

At a higher level, there is an MDOM::Rule::Simple instance holding
several tokens and one MDOM::Command. At the highest level is the root
node of the whole DOM tree, i.e., an instance of MDOM::Document::Gmake.
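
A dump like the one in Case 1 might be produced along the following
lines. This is only a sketch: it assumes that MDOM::Dumper keeps the
new() and string() methods of PPI::Dumper, from which it was derived,
and dump_makefile is a hypothetical helper name. The modules are loaded
inside eval so the script still runs when Makefile::DOM is not
installed.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the dump as a string, or undef if
# Makefile::DOM is missing or the assumed Dumper interface differs.
sub dump_makefile {
    my ($source) = @_;
    return scalar eval {
        require MDOM::Document::Gmake;
        require MDOM::Dumper;
        # Parse from a scalar reference instead of a file name.
        my $dom = MDOM::Document::Gmake->new(\$source);
        # Assumption: same new()/string() interface as PPI::Dumper.
        MDOM::Dumper->new($dom)->string;
    };
}

my $dump = dump_makefile(qq{all : ; echo "hello, world"\n});
print defined $dump ? $dump : "could not dump (is Makefile::DOM installed?)\n";
```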

Case 2
Below is a relatively complex example:

    a: foo.c bar.h $(baz) # hello!
            @echo ...

Its corresponding DOM structure is

    MDOM::Document::Gmake
      MDOM::Rule::Simple
        MDOM::Token::Bare            'a'
        MDOM::Token::Separator       ':'
        MDOM::Token::Whitespace      ' '
        MDOM::Token::Bare            'foo.c'
        MDOM::Token::Whitespace      ' '
        MDOM::Token::Bare            'bar.h'
        MDOM::Token::Whitespace      '\t'
        MDOM::Token::Interpolation   '$(baz)'
        MDOM::Token::Whitespace      ' '
        MDOM::Token::Comment         '# hello!'
        MDOM::Token::Whitespace      '\n'
        MDOM::Command
          MDOM::Token::Separator       '\t'
          MDOM::Token::Modifier        '@'
          MDOM::Token::Bare            'echo ...'
          MDOM::Token::Whitespace      '\n'

Compared to the previous example, several new node types appear here.

The variable interpolation "$(baz)" on the first line of the original
makefile corresponds to an MDOM::Token::Interpolation node in its MDOM
tree. Similarly, the comment "# hello!" corresponds to an
MDOM::Token::Comment node.

On the second line, the rule command indented by a tab character is
still represented by an MDOM::Command object. Its first child node (or
its first element) is an MDOM::Token::Separator instance corresponding
to that tab. The command modifier "@", which is of type
MDOM::Token::Modifier, immediately follows the separator.
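
The way a command line breaks down into separator, modifier, bare
text, and trailing whitespace can be illustrated with a small toy
tokenizer. It is not the module's own lexer; tokenize_command is a
hypothetical name, and the type labels merely echo the MDOM token
class names.

```perl
use strict;
use warnings;

# Toy tokenizer for one tab-indented rule command line.
# Returns a list of [type, text] pairs.
sub tokenize_command {
    my ($line) = @_;
    my @tokens;
    # The leading tab separates the command from the rule header.
    push @tokens, [ Separator => $1 ] if $line =~ /\G(\t)/gc;
    # GNU make command modifiers: '@', '-', and '+'.
    push @tokens, [ Modifier => $1 ] while $line =~ /\G([-+\@])/gc;
    # The rest of the command stays one bare token, spaces included.
    push @tokens, [ Bare => $1 ] if $line =~ /\G([^\n]+)/gc;
    # The trailing newline is whitespace.
    push @tokens, [ Whitespace => $1 ] if $line =~ /\G(\n)/gc;
    return @tokens;
}

my @tokens = tokenize_command("\t\@echo ...\n");
for my $token (@tokens) {
    my ($type, $text) = @$token;
    $text =~ s/\t/\\t/g;    # show control characters in escaped form
    $text =~ s/\n/\\n/g;
    printf "%-12s '%s'\n", $type, $text;
}
```

Note that the spaces inside "echo ..." stay within the bare token,
which matches the treatment of command text described in Case 1.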

Case 3
Now let's study a sample makefile with various global structures:

    a: b
    foo = bar
        # hello!

Here on the top level, there are three language structures: one rule
"a: b", one assignment statement "foo = bar", and one comment
"# hello!".

Its MDOM tree is shown below:

    MDOM::Document::Gmake
      MDOM::Rule::Simple
        MDOM::Token::Bare          'a'
        MDOM::Token::Separator     ':'
        MDOM::Token::Whitespace    ' '
        MDOM::Token::Bare          'b'
        MDOM::Token::Whitespace    '\n'
      MDOM::Assignment
        MDOM::Token::Bare          'foo'
        MDOM::Token::Whitespace    ' '
        MDOM::Token::Separator     '='
        MDOM::Token::Whitespace    ' '
        MDOM::Token::Bare          'bar'
        MDOM::Token::Whitespace    '\n'
      MDOM::Token::Whitespace    '\t'
      MDOM::Token::Comment       '# hello!'
      MDOM::Token::Whitespace    '\n'

We can see that below the root node MDOM::Document::Gmake there are
three elements, MDOM::Rule::Simple, MDOM::Assignment, and
MDOM::Token::Comment, as well as two MDOM::Token::Whitespace objects.

It can be observed from the examples above that the MDOM representation
of a makefile's lexical elements is rather loose. It provides only a
very limited structural representation rather than making a bad guess.

Generating an MDOM tree from a GNU makefile only requires two lines of
Perl code:

    use MDOM::Document::Gmake;
    my $dom = MDOM::Document::Gmake->new('Makefile');

If the makefile source code being parsed is already stored in a Perl
variable, say, $var, then we can construct an MDOM tree via the
following code:

    my $dom = MDOM::Document::Gmake->new(\$var);

Now $dom is a reference to the root of the MDOM tree. It is of type
MDOM::Document::Gmake and is therefore also an instance of the
MDOM::Node class.

As mentioned above, "MDOM::Node" is the container for other
MDOM::Element instances, so we can retrieve a particular element node
via its "child" method:

    $node = $dom->child(3);
    # or $node = $dom->elements(0);

And we may also use the "elements" method to obtain all of the child
nodes:

    @elems = $dom->elements;

For every MDOM node, the corresponding makefile source can be
regenerated by invoking its "content" method.
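
Putting these calls together, a short script might look like the
following sketch. It uses only the new(), elements(), and content()
methods described above; top_level_elements is a hypothetical helper,
and the eval guard is there only so the script degrades gracefully
when Makefile::DOM is not installed.

```perl
use strict;
use warnings;

# Hypothetical helper: returns an array reference of [class, source]
# pairs for the top-level elements, or undef if the module is missing.
sub top_level_elements {
    my ($source) = @_;
    return scalar eval {
        require MDOM::Document::Gmake;
        # Parse the makefile text held in a scalar.
        my $dom = MDOM::Document::Gmake->new(\$source);
        # Record each child's class and the source text it covers.
        [ map { [ ref($_), $_->content ] } $dom->elements ];
    };
}

my $elems = top_level_elements("a: b\nfoo = bar\n");
if ($elems) {
    for my $pair (@$elems) {
        my ($class, $text) = @$pair;
        $text =~ s/\n/\\n/g;    # keep each element on one output line
        print "$class '$text'\n";
    }
}
else {
    print "Makefile::DOM is not available\n";
}
```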

The current implementation of the MDOM::Document::Gmake lexer is based
on a hand-written state machine. Although the efficiency of the engine
is not bad, the code is rather complicated and messy, which hurts both
extensibility and maintainability. So the plan is to rewrite the parser
using a grammar tool such as the Perl 6 regex engine
Pugs::Compiler::Rule or a yacc-style one like Parse::Yapp.

You can always get the latest source code of this module from its
GitHub repository:

    <http://github.com/agentzh/makefile-dom-pm>

If you want a commit bit, please let me know.

Yichun "agentzh" Zhang (章亦春) <agentzh@gmail.com>

Copyright 2006-2014 by Yichun "agentzh" Zhang (章亦春).

This library is free software; you can redistribute it and/or modify it
under the same terms as Perl itself.

MDOM::Document, MDOM::Document::Gmake, PPI, Makefile::Parser::GmakeDB,
makesimple.

perl v5.32.0                      2020-07-28                  Makefile::DOM(3)