HCT::Lang::LexerDriver(3)  User Contributed Perl Documentation  HCT::Lang::LexerDriver(3)

NAME
HCT::Lang::LexerDriver - lexer driver for languages under the HCT
system.

DESCRIPTION
Lexical analysis, or scanning, is the process in which the stream of
characters making up the source program is read from left to right and
grouped into tokens. Tokens are sequences of characters with a
collective meaning. There are usually only a small number of tokens for
a programming language: constants (integer, double, char, string,
etc.), operators (arithmetic, relational, logical), punctuation, and
reserved words.

                     error messages
                           |
source language -> [ LEXICAL ANALYZER ] -> token stream

The lexical analyzer takes a source program as input and produces a
stream of tokens as output. The lexical analyzer recognizes particular
instances of tokens, called lexemes. A lexeme is the actual character
sequence forming a token; the token is the general class that a lexeme
belongs to. Some tokens have exactly one lexeme (e.g., the >
character); for others, there are many lexemes (e.g., integer
constants).
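
The distinction between a token class and its lexemes can be sketched
in a few lines of Perl. This is only an illustration, not the code used
by this module: the class names, the patterns, and the "tokenize"
function are all assumptions made for the example.

```perl
use strict;
use warnings;

# Hypothetical token classes, tried in order at the current position.
my @classes = (
    [ INTEGER    => qr/\d+/ ],
    [ IDENTIFIER => qr/[A-Za-z_]\w*/ ],
    [ OPERATOR   => qr{>=|<=|[-+*/<>=]} ],
);

# Split one line into [class, lexeme] pairs.
sub tokenize {
    my ($line) = @_;
    my @tokens;
    pos($line) = 0;
    TOKEN: while (1) {
        $line =~ /\G\s+/gc;                    # skip whitespace
        last if pos($line) >= length $line;    # end of line reached
        for my $class (@classes) {
            my ($name, $re) = @$class;
            if ($line =~ /\G($re)/gc) {
                push @tokens, [ $name, $1 ];
                next TOKEN;
            }
        }
        die sprintf "unexpected character at column %d\n", pos($line);
    }
    return @tokens;
}

# The ">" token has one lexeme; the INTEGER class has many.
# prints: IDENTIFIER(count) OPERATOR(>) INTEGER(42)
print join(' ', map { "$_->[0]($_->[1])" } tokenize('count > 42')), "\n";
```

Each pair keeps both pieces of information: the class is what a parser
would typically act on, and the lexeme is the text that was actually
read.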

new ()
Creates a new "HCT::Lang::LexerDriver" object.

DESTROY ()
Destroys the "HCT::Lang::LexerDriver" object.

input ([FH])
Returns the input object created by "HCT::Std::IO". If FH is set,
creates a new input object.

skip_whitespace ()
Skips whitespace by removing leading whitespace characters.

match (EXPR)
If EXPR matches, returns the matched value and sets the new line
position. Otherwise, goes back to the previous position and returns
"undef".
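
The commit-or-roll-back behaviour described for "match" can be
sketched with Perl's "pos" and the "\G" assertion. The package
"My::LineMatcher" and its fields are hypothetical stand-ins written for
this example; they are not part of this module and may differ from how
the driver is actually implemented.

```perl
use strict;
use warnings;

package My::LineMatcher;    # hypothetical stand-in, not the real driver

sub new {
    my ($class, $line) = @_;
    return bless { line => $line, linepos => 0 }, $class;
}

# Try EXPR at the saved position; commit on success, stay put on failure.
sub match {
    my ($self, $expr) = @_;
    pos($self->{line}) = $self->{linepos};
    if ($self->{line} =~ /\G($expr)/gc) {
        $self->{linepos} = pos($self->{line});    # commit the new position
        return $1;
    }
    return;    # undef in scalar context; linepos is unchanged
}

sub linepos { return $_[0]{linepos} }

package main;

my $m    = My::LineMatcher->new('while (x)');
my $kw   = $m->match(qr/while/);    # matches, position advances past "while"
my $miss = $m->match(qr/\d+/);      # fails, position stays where it was
```

Keeping the committed position in a separate field, rather than relying
on "pos" alone, is what makes the failed attempt harmless in this
sketch.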

linepos ([POS])
If POS is set, changes the current line position. Returns the current
line position.

Important: this method does not work directly with the current line
position ("pos"), but with a variable that stores the previous
position value.

linelen ()
Returns the current line length.

linenum ()
Returns the current line number.

is_eof ()
Returns true at end of file (input), false otherwise.

is_eol ()
Returns true at end of line, false otherwise.

move ()
Reports whether the lexer can move forward. If the current line has
not ended (it ends when "pos" equals "len"), returns "TRUE".
Otherwise, gets new lines for as long as the line is empty. Returns
"FALSE" if the file is finished.

is_emptyline ()
Returns "TRUE" if the current line is empty.

get_next_line ()
Gets a new line. Returns "TRUE" if a new line has been received, or
"FALSE" on "EOF".
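
How "move", "is_eol", "is_emptyline" and "get_next_line" might fit
together is sketched below. "My::LineReader" is a hypothetical class
written for this example, and two details are assumptions rather than
documented behaviour: a whitespace-only line counts as empty, and
"move" returns true once a non-empty line has been read.

```perl
use strict;
use warnings;

package My::LineReader;    # hypothetical stand-in for illustration

sub new {
    my ($class, $fh) = @_;
    return bless { fh => $fh, line => '', pos => 0, num => 0 }, $class;
}

sub is_eol       { return $_[0]{pos} >= length $_[0]{line} }
sub is_emptyline { return $_[0]{line} !~ /\S/ }    # assumption: blank counts as empty

# Read one more line; false at end of input.
sub get_next_line {
    my ($self) = @_;
    my $line = readline($self->{fh});
    return 0 unless defined $line;
    chomp $line;
    @$self{qw(line pos)} = ($line, 0);
    $self->{num}++;
    return 1;
}

# True while there is something left to scan.
sub move {
    my ($self) = @_;
    return 1 unless $self->is_eol;
    while ($self->get_next_line) {
        return 1 unless $self->is_emptyline;
    }
    return 0;
}

package main;

my $source = "\n\nx = 1\n";
open my $fh, '<', \$source or die "cannot open in-memory file: $!";
my $reader = My::LineReader->new($fh);
```

With this input, the first call to "move" skips the two empty lines and
stops on the third; once that line is consumed, the next call reports
that nothing is left.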

scan (PARSER)
The scanner has encoded within it information on the possible
sequences of characters that can be contained within any of the
tokens it handles. Returns a token.

stop ()
Returns "EMPTY_TOKEN" to stop the parser. Can be overridden if
needed; see the CDL lexer.

callback ()
Returns the parser handler. It is updated as soon as "scan" is
called.

tracer ()
Returns the tracing object created by "HCT::Lang::LexerTracer".

next_token ()
Tries to find the corresponding token. Returns true if a token was
found, or false when finished.

match_pattern ()
checkup_available_tokens ()
do_reserve ()
Reserves tokens by calling "reserve_tokens".

reserve_patterns ()
Creates a list of patterns in "patterns". Each element has the
following fields: token name, type, and one pattern. That is, a
single element is created for each new pattern. After creation, the
list is sorted by type and pattern.
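
The "one element per pattern" idea can be sketched as follows. The
token table, the field names, and "build_patterns" are hypothetical,
and plain string comparison is assumed for the sort; the actual
ordering rules used by this module are not documented here.

```perl
use strict;
use warnings;

# Hypothetical token table: each token has a type and one or more patterns.
my %tokens = (
    IF   => { type => 'keyword',  patterns => ['if'] },
    NE   => { type => 'operator', patterns => ['!=', '<>'] },
    PLUS => { type => 'operator', patterns => ['+'] },
);

# Flatten the table into one element per pattern, then sort the result.
sub build_patterns {
    my ($tokens) = @_;
    my @patterns;
    for my $name (keys %$tokens) {
        my $token = $tokens->{$name};
        push @patterns,
            map { +{ name => $name, type => $token->{type}, pattern => $_ } }
            @{ $token->{patterns} };
    }
    # Assumed ordering: by type first, then by pattern (string comparison).
    my @sorted = sort {
        $a->{type} cmp $b->{type} or $a->{pattern} cmp $b->{pattern}
    } @patterns;
    return @sorted;
}

my @patterns = build_patterns(\%tokens);
```

The NE token contributes two elements because it has two patterns, so
the resulting list has four entries for three tokens.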

reserve (TYPE, TOKENS)
Creates new token objects using "HCT::Lang::Token" and stores them
in "tokens".

get_token (NAME)
Takes a token name and returns the token object from "tokens".

shift_token ()
Shifts a token off the token stack and returns it.

push_token ()
Pushes a new token onto the stack and returns true.

skip_comment ()
Virtual method for skipping comments. Returns true if the current
position should be skipped.

skip_this ()
Virtual method for skipping other content at the current position.
Returns true or false.

reserve_tokens ()
Virtual method for reserving tokens.
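
A language-specific lexer would typically supply these virtual methods
in a subclass. The sketch below shows the general shape for
"skip_comment" only. Both packages are hypothetical stand-ins written
for this example (the base class here is not this module), and the "#"
comment syntax is an assumption.

```perl
use strict;
use warnings;

package My::BaseLexer;    # hypothetical stand-in for the driver

sub new {
    my ($class, $line) = @_;
    return bless { line => $line, pos => 0 }, $class;
}

# Default behaviour: the language has no comments.
sub skip_comment { return 0 }

# Text from the current position to the end of the line.
sub rest {
    my ($self) = @_;
    return substr($self->{line}, $self->{pos});
}

package My::HashCommentLexer;    # hypothetical language-specific lexer
our @ISA = ('My::BaseLexer');

# Treat everything from "#" to the end of the line as a comment.
sub skip_comment {
    my ($self) = @_;
    return 0 unless $self->rest =~ /^#/;
    $self->{pos} = length $self->{line};
    return 1;
}

package main;

my $plain  = My::BaseLexer->new('# note');
my $hashed = My::HashCommentLexer->new('# note');
```

Only the subclass recognizes the comment; the base class leaves the
position untouched.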

Identifiers and keywords
The lexer state solution involves coordination between the lexer and
the parser. In particular, the parser must tell the lexer whether, in
"this context", a keyword or an identifier is expected. There are
several problems with coordinating the lexer's state with the parser's
context. One of the most frequently noted is that it makes
(multiple-token) lookahead more difficult. A related problem occurs if
the context is miscommunicated between the parser and the lexer: the
lexer may return a keyword when only identifiers are expected, or
return a keyword as an identifier when it was supposed to be treated as
a keyword.
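
The coordination problem can be made concrete with a small sketch. The
keyword set and the "classify_word" function are hypothetical; the
second argument stands for the context that the parser would have to
pass to the lexer.

```perl
use strict;
use warnings;

my %keywords = map { $_ => 1 } qw(if else while);    # hypothetical keyword set

# $expect is supplied by the parser: 'keyword' or 'identifier'.
sub classify_word {
    my ($word, $expect) = @_;
    return 'KEYWORD' if $expect eq 'keyword' && $keywords{$word};
    return 'IDENTIFIER';
}

my $as_keyword    = classify_word('while', 'keyword');       # KEYWORD
my $as_identifier = classify_word('while', 'identifier');    # IDENTIFIER
```

The same lexeme is classified differently depending on the context
argument, so a wrong context from the parser yields a wrong token.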

perl v5.34.0                      2021-07-22        HCT::Lang::LexerDriver(3)