Makefile::DOM(3)      User Contributed Perl Documentation     Makefile::DOM(3)

NAME

       Makefile::DOM - Simple DOM parser for Makefiles

VERSION

       This document describes Makefile::DOM 0.004 released on March 10, 2008.

DESCRIPTION

       This library can serve as an advanced lexer for (GNU) makefiles. It
       parses makefiles as "documents" and the parsing is lossless. The
       results are data structures similar to DOM trees. The DOM trees hold
       every single bit of the information in the original input files,
       including whitespace, blank lines and makefile comments. That means
       it's possible to reproduce the original makefiles from the DOM trees.
       In addition, each node of the DOM trees is modifiable and so is the
       whole tree, just as with the PPI module used for Perl source parsing
       and the HTML::TreeBuilder module used for parsing HTML source.

       If you're looking for a true GNU make parser that generates an AST,
       please see Makefile::Parser::GmakeDB instead.

       The interface of "Makefile::DOM" mimics the API design of PPI. In fact,
       I've directly stolen the source code and POD documentation of
       PPI::Node, PPI::Element, and PPI::Dumper, with the full permission of
       the author of PPI, Adam Kennedy.

       "Makefile::DOM" tries to be independent of any specific makefile
       syntax. The same set of DOM node types is supposed to be shared by
       different makefile DOM generators. For example, MDOM::Document::Gmake
       parses GNU makefiles and returns an instance of MDOM::Document, i.e.,
       the root of the DOM tree, while a future NMAKE makefile lexer,
       "MDOM::Document::Nmake", would also return instances of the
       MDOM::Document class. Later, I'll also consider adding support for
       dmake and bsdmake.
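
       As a quick check of the lossless claim above, the following sketch
       parses a makefile held in a string and compares the regenerated text
       with the input. It relies only on the "new" and "content" methods that
       this document describes under OPERATIONS FOR MDOM TREES; the sample
       makefile text is made up for illustration.

```perl
use strict;
use warnings;
use MDOM::Document::Gmake;

# Hypothetical sample input: the irregular spacing and the comment are
# exactly the details a lossless parser has to keep.
my $source = "all :   hello.o   # build everything\n"
           . "\tcc -o hello hello.o\n";

# Parse from a scalar reference rather than from a file name.
my $dom = MDOM::Document::Gmake->new(\$source);

# content() regenerates the makefile text from the tree.
if ($dom->content eq $source) {
    print "round trip ok\n";
}
else {
    print "round trip differs\n";
}
```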

Structure of the DOM

       The Makefile DOM (MDOM) is a structured set of data types that
       provides a flexible document model conforming to the makefile syntax.
       Below is a complete list of the 19 MDOM classes in the current
       implementation, where the indentation indicates the class inheritance
       relationships.

           MDOM::Element
               MDOM::Node
                   MDOM::Unknown
                   MDOM::Assignment
                   MDOM::Command
                   MDOM::Directive
                   MDOM::Document
                       MDOM::Document::Gmake
                   MDOM::Rule
                       MDOM::Rule::Simple
                       MDOM::Rule::StaticPattern
               MDOM::Token
                   MDOM::Token::Bare
                   MDOM::Token::Comment
                   MDOM::Token::Continuation
                   MDOM::Token::Interpolation
                   MDOM::Token::Modifier
                   MDOM::Token::Separator
                   MDOM::Token::Whitespace

       All of the MDOM classes inherit from the MDOM::Element class.
       MDOM::Token and MDOM::Node are its direct children. The former
       represents a string token which is atomic from the perspective of the
       lexer, while the latter represents a structured node, which usually
       has one or more children and serves as the container for other
       MDOM::Element objects.

       Next we'll show a few examples to demonstrate how DOM trees map to
       particular makefiles.

       Case 1
           Consider the following simple "hello, world" makefile:

               all : ; echo "hello, world"

           We can use the MDOM::Dumper class provided by Makefile::DOM to dump
           out the internal structure of its corresponding MDOM tree:

               MDOM::Document::Gmake
                 MDOM::Rule::Simple
                   MDOM::Token::Bare         'all'
                   MDOM::Token::Whitespace   ' '
                   MDOM::Token::Separator    ':'
                   MDOM::Token::Whitespace   ' '
                   MDOM::Command
                     MDOM::Token::Separator    ';'
                     MDOM::Token::Whitespace   ' '
                     MDOM::Token::Bare         'echo "hello, world"'
                     MDOM::Token::Whitespace   '\n'

           In this example, the separators ":" and ";" are both instances of
           the MDOM::Token::Separator class, while spaces and newline
           characters are all represented as MDOM::Token::Whitespace. The
           other two leaf nodes, "all" and "echo "hello, world"", both belong
           to MDOM::Token::Bare.

           It's worth mentioning that the space characters in the rule
           command "echo "hello, world"" are not represented as
           MDOM::Token::Whitespace. That's because the spaces in commands
           have no syntactic meaning to "make"; they are usually passed to
           the shell verbatim. Therefore, the DOM parser does not try to
           recognize those spaces specifically, which reduces memory use and
           the number of nodes. However, leading spaces and trailing newlines
           are still recognized as MDOM::Token::Whitespace.

           One level up, an MDOM::Rule::Simple instance holds several tokens
           and one MDOM::Command. At the top, the root node of the whole DOM
           tree is an instance of MDOM::Document::Gmake.

       Case 2
           Below is a relatively complex example:

               a: foo.c  bar.h $(baz) # hello!
                   @echo ...

           Its corresponding DOM structure is:

             MDOM::Document::Gmake
               MDOM::Rule::Simple
                 MDOM::Token::Bare         'a'
                 MDOM::Token::Separator    ':'
                 MDOM::Token::Whitespace   ' '
                 MDOM::Token::Bare         'foo.c'
                 MDOM::Token::Whitespace   '  '
                 MDOM::Token::Bare         'bar.h'
                 MDOM::Token::Whitespace   '\t'
                 MDOM::Token::Interpolation   '$(baz)'
                 MDOM::Token::Whitespace      ' '
                 MDOM::Token::Comment         '# hello!'
                 MDOM::Token::Whitespace      '\n'
               MDOM::Command
                 MDOM::Token::Separator    '\t'
                 MDOM::Token::Modifier     '@'
                 MDOM::Token::Bare         'echo ...'
                 MDOM::Token::Whitespace   '\n'

           Compared to the previous example, several new node types appear
           here.

           The variable interpolation "$(baz)" on the first line of the
           original makefile corresponds to a MDOM::Token::Interpolation node
           in its MDOM tree. Similarly, the comment "# hello!" corresponds to
           a MDOM::Token::Comment node.

           On the second line, the rule command indented by a tab character is
           still represented by a MDOM::Command object. Its first child node
           (or its first element) is an MDOM::Token::Separator instance
           corresponding to that tab. The command modifier "@", which is of
           type MDOM::Token::Modifier, immediately follows the separator.

       Case 3
           Now let's study a sample makefile with various global structures:

             a: b
             foo = bar
                 # hello!

           Here on the top level, there are three language structures: one
           rule "a: b", one assignment statement "foo = bar", and one
           comment "# hello!".

           Its MDOM tree is shown below:

             MDOM::Document::Gmake
               MDOM::Rule::Simple
                 MDOM::Token::Bare                 'a'
                 MDOM::Token::Separator            ':'
                 MDOM::Token::Whitespace           ' '
                 MDOM::Token::Bare                 'b'
                 MDOM::Token::Whitespace           '\n'
               MDOM::Assignment
                 MDOM::Token::Bare                 'foo'
                 MDOM::Token::Whitespace           ' '
                 MDOM::Token::Separator            '='
                 MDOM::Token::Whitespace           ' '
                 MDOM::Token::Bare                 'bar'
                 MDOM::Token::Whitespace           '\n'
               MDOM::Token::Whitespace             '\t'
               MDOM::Token::Comment                '# hello!'
               MDOM::Token::Whitespace             '\n'

           We can see that below the root node MDOM::Document::Gmake there
           are three elements, MDOM::Rule::Simple, MDOM::Assignment, and
           MDOM::Token::Comment, as well as two MDOM::Token::Whitespace
           objects.

           As this shows, the MDOM representation of the makefile's lexical
           elements is rather loose: it provides only a limited structural
           representation rather than making a bad guess.
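
       The dumps shown in the cases above can be produced with a few lines of
       Perl. This is only a sketch: it assumes that MDOM::Dumper keeps the
       calling convention of PPI::Dumper, from which it was derived, namely a
       "new" constructor that takes the tree and a "print" method that writes
       the dump to standard output. The input is the makefile from Case 3.

```perl
use strict;
use warnings;
use MDOM::Document::Gmake;
use MDOM::Dumper;

# The makefile from Case 3; the comment line starts with a tab.
my $source = "a: b\nfoo = bar\n\t# hello!\n";

# Parse the makefile text from a scalar reference.
my $dom = MDOM::Document::Gmake->new(\$source);

# Assumption: same interface as PPI::Dumper (new + print).
my $dumper = MDOM::Dumper->new($dom);
$dumper->print;
```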

OPERATIONS FOR MDOM TREES

       Generating an MDOM tree from a GNU makefile only requires two lines of
       Perl code:

           use MDOM::Document::Gmake;
           my $dom = MDOM::Document::Gmake->new('Makefile');

       If the makefile source code being parsed is already stored in a Perl
       variable, say, $var, then we can construct an MDOM tree via the
       following code:

           my $dom = MDOM::Document::Gmake->new(\$var);

       Now $dom holds a reference to the root of the MDOM tree. Its type is
       MDOM::Document::Gmake, so it is also an instance of the MDOM::Node
       class.

       As mentioned above, "MDOM::Node" is the container for other
       MDOM::Element instances. So we can retrieve one of its child elements
       via its "child" method:

           $node = $dom->child(3);
           # or $node = $dom->elements(0);

       And we may also use the "elements" method to obtain all of the child
       elements:

           @elems = $dom->elements;

       For every MDOM node, its corresponding makefile source can be generated
       by invoking its "content" method.
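
       Putting these methods together, a tree can be walked recursively. The
       sketch below collects the text of every variable interpolation in the
       Case 2 makefile. It uses only "elements" and "content" plus Perl's
       built-in "isa"; the helper name "interpolations" is invented for this
       example, and the code assumes that "elements" returns the direct
       children of a node in list context. Going by the Case 2 dump, the only
       such token in that makefile is "$(baz)".

```perl
use strict;
use warnings;
use MDOM::Document::Gmake;

# Hypothetical helper: return the source text of every
# MDOM::Token::Interpolation found anywhere below $node.
sub interpolations {
    my ($node) = @_;
    my @found;
    for my $elem ($node->elements) {
        if ($elem->isa('MDOM::Token::Interpolation')) {
            push @found, $elem->content;
        }
        elsif ($elem->isa('MDOM::Node')) {
            # Only nodes have children; tokens are leaves.
            push @found, interpolations($elem);
        }
    }
    return @found;
}

# The makefile from Case 2 ("$" and "@" are escaped for Perl).
my $source = "a: foo.c  bar.h\t\$(baz) # hello!\n\t\@echo ...\n";
my $dom    = MDOM::Document::Gmake->new(\$source);

print "$_\n" for interpolations($dom);
```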

BUGS AND TODO

       The current implementation of the MDOM::Document::Gmake lexer is based
       on a hand-written state machine. Although the efficiency of the engine
       is not bad, the code is rather complicated and messy, which hurts both
       extensibility and maintainability. So the plan is to rewrite the
       parser using a grammar tool such as the Perl 6 regex engine
       Pugs::Compiler::Rule or a yacc-style one like Parse::Yapp.
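
       To give a flavor of the more declarative style such a rewrite aims
       for, here is a toy table-driven tokenizer in plain Perl. It is not the
       module's lexer and uses neither Pugs::Compiler::Rule nor Parse::Yapp;
       it handles only a single rule line, and its token type names are
       merely borrowed from the MDOM::Token classes. The sub name "tokenize"
       and the rule table are invented for this sketch.

```perl
use strict;
use warnings;

# Token table, tried in order at the current position.
my @RULES = (
    [ Whitespace    => qr/\s+/         ],
    [ Interpolation => qr/\$\([^)]*\)/ ],
    [ Comment       => qr/#[^\n]*/     ],
    [ Separator     => qr/:/           ],
    [ Bare          => qr/[^\s:#\$]+/  ],
);

# Split $text into [type, text] pairs without dropping any character.
sub tokenize {
    my ($text) = @_;
    my @tokens;
    pos($text) = 0;
    TOKEN: while (pos($text) < length $text) {
        for my $rule (@RULES) {
            my ($type, $re) = @$rule;
            if ($text =~ /\G($re)/gc) {
                push @tokens, [ $type, $1 ];
                next TOKEN;
            }
        }
        # Fallback: keep any unmatched character so nothing is lost.
        $text =~ /\G(.)/gcs;
        push @tokens, [ Bare => $1 ];
    }
    return @tokens;
}

# The rule line from Case 2.
my $line   = "a: foo.c  bar.h\t\$(baz) # hello!\n";
my @tokens = tokenize($line);

# Lossless check: joining the token texts must give back the input.
my $rebuilt = join '', map { $_->[1] } @tokens;
if ($rebuilt eq $line) {
    print "lossless\n";
}
else {
    print "lost something\n";
}
```

       Adding a token type here means adding one row to the table, which is
       the kind of extensibility a hand-written state machine makes hard.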

AUTHOR

       Agent Zhang <agentzh@gmail.com>

COPYRIGHT

       Copyright 2006-2008 by Agent Zhang.

       This library is free software; you can redistribute it and/or modify it
       under the same terms as Perl itself.

SEE ALSO

       MDOM::Document, MDOM::Document::Gmake, PPI, Makefile::Parser::GmakeDB,
       makesimple.

perl v5.10.1                      2008-03-10                  Makefile::DOM(3)