Makefile::DOM(3) User Contributed Perl Documentation Makefile::DOM(3)

Makefile::DOM - Simple DOM parser for Makefiles

This document describes Makefile::DOM 0.004 released on March 10, 2008.

This library can serve as an advanced lexer for (GNU) makefiles. It
parses makefiles as "documents" and the parsing is lossless. The
results are data structures similar to DOM trees. The DOM trees hold
every single bit of the information in the original input files,
including whitespace, blank lines, and makefile comments. That means
it's possible to reproduce the original makefiles from the DOM trees.
In addition, each node of the DOM trees is modifiable and so is the
whole tree, just as with the PPI module used for parsing Perl source
and the HTML::TreeBuilder module used for parsing HTML source.

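The lossless claim above can be checked with a round trip. The following
is a minimal sketch, assuming Makefile::DOM is installed; it uses only
the scalar-reference constructor and the "content" method that this
document describes, and the variable names are illustrative. When the
module cannot be loaded, the check is skipped rather than failing.

```perl
use strict;
use warnings;

# Illustrative makefile source; any makefile text would do.
my $src = "all : ; echo \"hello, world\"\n";

# Parse from a scalar reference, then regenerate the source.
# The eval yields undef when Makefile::DOM is unavailable or parsing fails.
my $round_trip = eval {
    require MDOM::Document::Gmake;
    my $dom = MDOM::Document::Gmake->new( \$src );
    $dom->content;
};

if ( defined $round_trip ) {
    print $round_trip eq $src
        ? "round trip preserved the input\n"
        : "round trip changed the input\n";
}
else {
    print "Makefile::DOM not usable here; skipping the round-trip check\n";
}
```

If the parser is lossless as described, the first message is the one to
expect whenever the module loads.
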
If you're looking for a true GNU make parser that generates an AST,
please see Makefile::Parser::GmakeDB instead.

The interface of "Makefile::DOM" mimics the API design of PPI. In fact,
I've directly stolen the source code and POD documentation of
PPI::Node, PPI::Element, and PPI::Dumper, with the full permission of
the author of PPI, Adam Kennedy.

"Makefile::DOM" tries to be independent of any specific makefile syntax.
The same set of DOM node types is meant to be shared by different
makefile DOM generators. For example, MDOM::Document::Gmake parses GNU
makefiles and returns an instance of MDOM::Document, i.e., the root of
the DOM tree, while a future NMAKE makefile lexer,
"MDOM::Document::Nmake", would also return instances of the
MDOM::Document class. Later, I'll also consider adding support for dmake
and bsdmake.

Makefile DOM (MDOM) is a structured set of data types. They provide a
flexible document model that conforms to the makefile syntax. Below is
a complete list of the 19 MDOM classes in the current implementation,
where the indentation indicates the class inheritance relationships.

    MDOM::Element
      MDOM::Node
        MDOM::Unknown
        MDOM::Assignment
        MDOM::Command
        MDOM::Directive
        MDOM::Document
          MDOM::Document::Gmake
        MDOM::Rule
          MDOM::Rule::Simple
          MDOM::Rule::StaticPattern
      MDOM::Token
        MDOM::Token::Bare
        MDOM::Token::Comment
        MDOM::Token::Continuation
        MDOM::Token::Interpolation
        MDOM::Token::Modifier
        MDOM::Token::Separator
        MDOM::Token::Whitespace

It's not hard to see that all of the MDOM classes inherit from the
MDOM::Element class. MDOM::Token and MDOM::Node are its direct
children. The former represents a string token that is atomic from the
perspective of the lexer, while the latter represents a structured node,
which usually has one or more children and serves as the container for
other MDOM::Element objects.

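The division of labor between tokens and nodes can be sketched with a
few toy classes. The "Toy::" packages below are hypothetical stand-ins
written for this illustration and are not the MDOM implementation; they
only show how a leaf holds one string while a container rebuilds its
source text by concatenating its children.

```perl
use strict;
use warnings;

# Toy stand-ins for the three class roles; NOT the library's own code.
package Toy::Element;
sub new     { my ( $class, %args ) = @_; return bless {%args}, $class }
sub content { die "subclasses must implement content" }

package Toy::Token;    # atomic leaf: holds exactly one string
our @ISA = ('Toy::Element');
sub content { return $_[0]{text} }

package Toy::Node;     # container: holds an ordered list of child elements
our @ISA = ('Toy::Element');
sub elements { return @{ $_[0]{children} || [] } }
sub content  { return join '', map { $_->content } $_[0]->elements }

package main;

# A rule node made of five leaf tokens, wrapped in a document node.
my $rule = Toy::Node->new(
    children => [
        Toy::Token->new( text => 'a' ),
        Toy::Token->new( text => ':' ),
        Toy::Token->new( text => ' ' ),
        Toy::Token->new( text => 'b' ),
        Toy::Token->new( text => "\n" ),
    ],
);
my $doc = Toy::Node->new( children => [$rule] );

my $toy_source = $doc->content;
print $toy_source;    # prints "a: b" followed by a newline
```

Because every character lives in exactly one leaf, concatenating the
leaves in order is enough to reproduce the input.
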
Next we'll show a few examples to demonstrate how particular makefiles
map to DOM trees.

Case 1
Consider the following simple "hello, world" makefile:

    all : ; echo "hello, world"

We can use the MDOM::Dumper class provided by Makefile::DOM to dump
out the internal structure of its corresponding MDOM tree:

    MDOM::Document::Gmake
      MDOM::Rule::Simple
        MDOM::Token::Bare         'all'
        MDOM::Token::Whitespace   ' '
        MDOM::Token::Separator    ':'
        MDOM::Token::Whitespace   ' '
        MDOM::Command
          MDOM::Token::Separator    ';'
          MDOM::Token::Whitespace   ' '
          MDOM::Token::Bare         'echo "hello, world"'
          MDOM::Token::Whitespace   '\n'

In this example, the separators ":" and ";" are both instances of the
MDOM::Token::Separator class, while spaces and newline characters
are all represented as MDOM::Token::Whitespace. The other two leaf
nodes, "all" and "echo "hello, world"", both belong to
MDOM::Token::Bare.

It's worth mentioning that the space characters in the rule
command "echo "hello, world"" are not represented as
MDOM::Token::Whitespace. That's because the spaces inside commands
have no syntactic meaning to "make"; they are usually passed to the
shell verbatim. Therefore, the DOM parser does not try to recognize
those spaces specifically, which reduces memory use and the number of
nodes. However, leading spaces and trailing newlines are still
recognized as MDOM::Token::Whitespace.

On a higher level, an MDOM::Rule::Simple instance holds several
"Token" objects and one MDOM::Command. On the highest level sits
the root node of the whole DOM tree, i.e., an instance of
MDOM::Document::Gmake.

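The token boundaries in Case 1 can be reproduced with a small toy
lexer. This is a hedged sketch in core Perl: the function name
"toy_lex_inline_rule" and the flat list of [type, text] pairs are
assumptions made for illustration, and the real MDOM::Document::Gmake
lexer is a separate state machine that also nests the command tokens
under an MDOM::Command node.

```perl
use strict;
use warnings;

# Toy lexer for a one-line rule with an inline command:  target : ; command
# Hypothetical illustration only, not MDOM::Document::Gmake's lexer.
sub toy_lex_inline_rule {
    my ($line) = @_;
    my @tokens;

    # Split at the first ';': rule head on the left, command on the right.
    my ( $head, $command ) = $line =~ /\A([^;]*)(;.*)\z/s
        or die "not a rule with an inline command: $line";

    # Rule head: whitespace runs, ':' separators and bare words are
    # each emitted as their own token.
    while ( $head =~ /\G(?:(\s+)|(:)|([^\s:]+))/gc ) {
        push @tokens,
              defined $1 ? [ Whitespace => $1 ]
            : defined $2 ? [ Separator  => $2 ]
            :              [ Bare       => $3 ];
    }

    # Command: the ';' separator, leading whitespace, ONE Bare token for
    # the whole command text (inner spaces untouched), trailing whitespace.
    my ( $lead, $text, $trail ) = $command =~ /\A;(\s*)(.*?)(\s*)\z/s;
    push @tokens, [ Separator  => ';' ];
    push @tokens, [ Whitespace => $lead ]  if length $lead;
    push @tokens, [ Bare       => $text ]  if length $text;
    push @tokens, [ Whitespace => $trail ] if length $trail;

    return @tokens;
}

my $line   = qq{all : ; echo "hello, world"\n};
my @tokens = toy_lex_inline_rule($line);

for my $token (@tokens) {
    my ( $type, $text ) = @$token;
    $text =~ s/\n/\\n/g;    # display newlines as \n, like the dump above
    printf "%-12s '%s'\n", $type, $text;
}
```

Joining the token texts in order gives back the original line, which is
the same property the MDOM tree guarantees for whole makefiles.
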
Case 2
Below is a relatively complex example:

    a: foo.c bar.h $(baz) # hello!
        @echo ...

Its corresponding DOM structure is

    MDOM::Document::Gmake
      MDOM::Rule::Simple
        MDOM::Token::Bare            'a'
        MDOM::Token::Separator       ':'
        MDOM::Token::Whitespace      ' '
        MDOM::Token::Bare            'foo.c'
        MDOM::Token::Whitespace      ' '
        MDOM::Token::Bare            'bar.h'
        MDOM::Token::Whitespace      '\t'
        MDOM::Token::Interpolation   '$(baz)'
        MDOM::Token::Whitespace      ' '
        MDOM::Token::Comment         '# hello!'
        MDOM::Token::Whitespace      '\n'
      MDOM::Command
        MDOM::Token::Separator       '\t'
        MDOM::Token::Modifier        '@'
        MDOM::Token::Bare            'echo ...'
        MDOM::Token::Whitespace      '\n'

Compared to the previous example, several new node types appear
here.

The variable interpolation "$(baz)" on the first line of the
original makefile corresponds to an MDOM::Token::Interpolation node
in its MDOM tree. Similarly, the comment "# hello!" corresponds to
an MDOM::Token::Comment node.

On the second line, the rule command indented by a tab character is
still represented by an MDOM::Command object. Its first child node
(or its first element) is again an MDOM::Token::Separator instance,
corresponding to that tab. The command modifier "@", which is of
type MDOM::Token::Modifier, immediately follows the "Separator".

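The first line of Case 2 can be tokenized with one ordered regular
expression. The sketch below is an assumption-laden toy, not the
library's lexer: "toy_lex_rule_head" is a hypothetical name, only the
"$(...)" form of interpolation is handled, and the command line with
its "@" modifier is left out.

```perl
use strict;
use warnings;

# Toy lexer for a rule's first line; hypothetical, for illustration only.
# Alternatives are tried in order: comment, $(...) interpolation,
# ':' separator, whitespace run, bare word.
sub toy_lex_rule_head {
    my ($line) = @_;
    my @tokens;
    pos($line) = 0;
    while ( $line =~ /\G(?:(\#[^\n]*)|(\$\([^)]*\))|(:)|(\s+)|([^\s:\$\#]+))/gc ) {
        push @tokens,
              defined $1 ? [ Comment       => $1 ]
            : defined $2 ? [ Interpolation => $2 ]
            : defined $3 ? [ Separator     => $3 ]
            : defined $4 ? [ Whitespace    => $4 ]
            :              [ Bare          => $5 ];
    }
    # With /c the position survives a failed match, so leftovers are visible.
    die "unrecognized input at offset " . pos($line)
        if pos($line) < length $line;
    return @tokens;
}

# A tab separates bar.h and $(baz), as in the dump above.
my $head        = "a: foo.c bar.h\t\$(baz) # hello!\n";
my @head_tokens = toy_lex_rule_head($head);

for my $token (@head_tokens) {
    my ( $type, $text ) = @$token;
    $text =~ s/\t/\\t/g;
    $text =~ s/\n/\\n/g;
    printf "%-14s '%s'\n", $type, $text;
}
```

The comment alternative comes first so that everything after "#" up to
the end of the line stays in a single token, spaces included.
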
Case 3
Now let's study a sample makefile with various global structures:

    a: b
    foo = bar
        # hello!

Here on the top level, there are three language structures: one
rule "a: b", one assignment statement "foo = bar", and one
comment "# hello!".

Its MDOM tree is shown below:

    MDOM::Document::Gmake
      MDOM::Rule::Simple
        MDOM::Token::Bare         'a'
        MDOM::Token::Separator    ':'
        MDOM::Token::Whitespace   ' '
        MDOM::Token::Bare         'b'
        MDOM::Token::Whitespace   '\n'
      MDOM::Assignment
        MDOM::Token::Bare         'foo'
        MDOM::Token::Whitespace   ' '
        MDOM::Token::Separator    '='
        MDOM::Token::Whitespace   ' '
        MDOM::Token::Bare         'bar'
        MDOM::Token::Whitespace   '\n'
      MDOM::Token::Whitespace   '\t'
      MDOM::Token::Comment      '# hello!'
      MDOM::Token::Whitespace   '\n'

We can see that below the root node MDOM::Document::Gmake there
are three elements, MDOM::Rule::Simple, MDOM::Assignment, and
MDOM::Token::Comment, as well as two MDOM::Token::Whitespace objects.

It can be observed that the MDOM representation of the makefile's
lexical elements is rather loose. It provides only a very limited
structural representation rather than making a bad guess.

Generating an MDOM tree from a GNU makefile only requires two lines of
Perl code:

    use MDOM::Document::Gmake;
    my $dom = MDOM::Document::Gmake->new('Makefile');

If the makefile source code being parsed is already stored in a Perl
variable, say, $var, then we can construct an MDOM tree via the
following code:

    my $dom = MDOM::Document::Gmake->new(\$var);

Now $dom holds a reference to the root of the MDOM tree. Its type is
MDOM::Document::Gmake, which is also an instance of the MDOM::Node
class.

As mentioned above, "MDOM::Node" is the container for other
MDOM::Element instances, so we can retrieve one of its child elements
via its "child" method:

    $node = $dom->child(3);
    # or $node = $dom->elements(0);

And we may also use the "elements" method to obtain all of its child
elements:

    @elems = $dom->elements;

For every MDOM node, its corresponding makefile source can be generated
by invoking its "content" method.

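Putting these operations together, a script can list the top-level
elements of a makefile with their classes and source text. This is a
sketch under stated assumptions: Makefile::DOM must be installed, only
the "new", "elements", and "content" methods named above are used, and
the variable names are illustrative. The exact set of elements returned
depends on the lexer, so no output is shown.

```perl
use strict;
use warnings;

my $var = "a: b\nfoo = bar\n";    # illustrative makefile source

# Collect [class, source] pairs for the document's child elements.
# The eval returns an empty list when Makefile::DOM cannot be loaded.
my @pairs = eval {
    require MDOM::Document::Gmake;
    my $dom = MDOM::Document::Gmake->new( \$var );
    map { [ ref $_, $_->content ] } $dom->elements;
};

if (@pairs) {
    for my $pair (@pairs) {
        my ( $class, $source ) = @$pair;
        $source =~ s/\n/\\n/g;    # keep each element on one output line
        print "$class: '$source'\n";
    }
}
else {
    print "Makefile::DOM not usable here; nothing to list\n";
}
```

Since "content" is available on every node, the same loop can be
applied to the children of any MDOM::Node, not just the document root.
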
The current implementation of the MDOM::Document::Gmake lexer is based
on a hand-written state machine. Although the efficiency of the engine
is not bad, the code is rather complicated and messy, which hurts both
extensibility and maintainability. So the parser is expected to be
rewritten using a grammar tool such as the Perl 6 regex engine
Pugs::Compiler::Rule or a yacc-style one like Parse::Yapp.

Agent Zhang <agentzh@gmail.com>

Copyright 2006-2008 by Agent Zhang.

This library is free software; you can redistribute it and/or modify it
under the same terms as Perl itself.

MDOM::Document, MDOM::Document::Gmake, PPI, Makefile::Parser::GmakeDB,
makesimple.

perl v5.12.0 2008-03-10 Makefile::DOM(3)