HCT::Lang::LexerDriver(3)   User Contributed Perl Documentation   HCT::Lang::LexerDriver(3)

NAME

       HCT::Lang::LexerDriver - Lexer driver for languages under the HCT
       system.

DESCRIPTION

       Lexical analysis, or scanning, is the process in which the stream of
       characters making up the source program is read from left to right and
       grouped into tokens. Tokens are sequences of characters with a
       collective meaning. A programming language usually has only a small
       number of token classes: constants (integer, double, char, string,
       etc.), operators (arithmetic, relational, logical), punctuation, and
       reserved words.

                              error messages
                                    |
       source language -> [ LEXICAL ANALYZER ] -> token stream

       The lexical analyzer takes a source program as input and produces a
       stream of tokens as output. It recognizes particular instances of
       tokens, called lexemes. A lexeme is the actual character sequence
       forming a token; the token is the general class that the lexeme
       belongs to. Some tokens have exactly one lexeme (e.g., the >
       character); for others, there are many lexemes (e.g., integer
       constants).
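
       The token/lexeme distinction can be illustrated with a minimal,
       self-contained sketch. It does not use this module: the token classes,
       their patterns, and the "tokenize" helper are assumptions made up for
       the example.

```perl
use strict;
use warnings;

# Hypothetical token classes for illustration; not the HCT token set.
my @token_classes = (
    [ INTEGER    => qr/\d+/          ],
    [ IDENTIFIER => qr/[A-Za-z_]\w*/ ],
    [ GT         => qr/>/            ],
);

# Split a source string into [token class, lexeme] pairs.
sub tokenize {
    my ($source) = @_;
    my @tokens;
    pos($source) = 0;
    while (pos($source) < length($source)) {
        next if $source =~ /\G\s+/gc;    # skip whitespace between lexemes
        my $matched = 0;
        for my $class (@token_classes) {
            my ($name, $pattern) = @$class;
            if ($source =~ /\G($pattern)/gc) {
                push @tokens, [ $name, $1 ];    # class plus actual lexeme
                $matched = 1;
                last;
            }
        }
        die sprintf("unexpected character at offset %d\n", pos($source))
            unless $matched;
    }
    return @tokens;
}

my @tokens = tokenize('count > 10');
print join(' ', map { "$_->[0]($_->[1])" } @tokens), "\n";
# prints: IDENTIFIER(count) GT(>) INTEGER(10)
```

       Here GT has exactly one lexeme, while INTEGER and IDENTIFIER each
       cover many.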

CONSTRUCTOR

       new ()
           Creates a new "HCT::Lang::LexerDriver" object.

DESTRUCTOR

       DESTROY ()
           Destroys the "HCT::Lang::LexerDriver" object.

METHODS

       input ([FH])
           Returns the input object created by "HCT::Std::IO". If FH is set,
           creates a new input object.

       skip_whitespace ()
           Skips whitespace by removing leading white space.

       match (EXPR)
           If EXPR matches, returns the matched value and sets the new line
           position. Otherwise, goes back to the previous position and
           returns "undef".

       linepos ([POS])
           If POS is set, changes the current line position. Returns the
           current line position.

           Important: this method does not work directly with the current
           line position ("pos"), but with a variable that stores the
           previous position value.

       linelen ()
           Returns the current line length.

       linenum ()
           Returns the current line number.

       is_eof ()
           Returns true at the end of the file (input), false otherwise.

       is_eol ()
           Returns true at the end of the line, false otherwise.

       move ()
           Reports whether the lexer can move forward. If the current line
           has not ended (the line ends when "pos" equals "len"), returns
           "TRUE"; otherwise fetches new lines while the line is empty.
           Returns "FALSE" if the file is finished.

       is_emptyline ()
           Returns "TRUE" if the current line is empty.

       get_next_line ()
           Gets a new line and returns "TRUE" if a new line has been
           received, or "FALSE" on "EOF".

       scan (PARSER)
           The scanner encodes information about the possible sequences of
           characters that can make up any of the tokens it handles. Returns
           a token.

       stop ()
           Returns "EMPTY_TOKEN" to stop the parser. Can be overridden if
           needed; see the CDL lexer.

       callback ()
           Returns the parser handler. It is updated as soon as "scan" is
           called.

       tracer ()
           Returns the tracing object created by "HCT::Lang::LexerTracer".

       next_token ()
           Tries to find the corresponding token. Returns true if a token
           was found, or false when finished.

       match_pattern ()
       checkup_available_tokens ()
       do_reserve ()
           Reserves tokens by calling "reserve_tokens".

       reserve_patterns ()
           Creates a list of patterns in "patterns". Each element has the
           following fields: token name, type, and one pattern. That is, a
           single element is created for each new pattern. After creation,
           the list is sorted by type and pattern.

       reserve (TYPE, TOKENS)
           Creates new token objects using "HCT::Lang::Token" and stores
           them in "tokens".

       get_token (NAME)
           Takes a token name and returns the token object from "tokens".

       shift_token ()
           Shifts a token off the token stack and returns it.

       push_token ()
           Pushes a new token onto the stack and returns true.
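
       The behavior described for "match" and "linepos" can be sketched as
       follows. This is only an illustration under stated assumptions, not
       the module's implementation: the class "Sketch::LineMatcher" and its
       fields are hypothetical, and it assumes the position is tracked with
       Perl's "\G" anchor and a saved copy of "pos".

```perl
use strict;
use warnings;

package Sketch::LineMatcher;    # hypothetical class, not part of HCT

sub new {
    my ($class, $line) = @_;
    return bless { line => $line, linepos => 0 }, $class;
}

# Get or set the saved position, which is kept separately from pos().
sub linepos {
    my ($self, $pos) = @_;
    $self->{linepos} = $pos if defined $pos;
    return $self->{linepos};
}

# Try EXPR at the saved position. On success, return the matched text and
# advance the saved position; on failure, leave it unchanged and return undef.
sub match {
    my ($self, $expr) = @_;
    pos($self->{line}) = $self->{linepos};
    if ($self->{line} =~ /\G($expr)/gc) {
        $self->{linepos} = pos($self->{line});
        return $1;
    }
    pos($self->{line}) = $self->{linepos};    # go back to the previous position
    return undef;
}

package main;

my $matcher = Sketch::LineMatcher->new('foo = 42');
my $word    = $matcher->match(qr/[a-z]+/);    # succeeds
my $miss    = $matcher->match(qr/\d+/);       # fails, position is kept
printf "word=%s miss=%s linepos=%d\n",
    $word, defined($miss) ? $miss : 'undef', $matcher->linepos;
# prints: word=foo miss=undef linepos=3
```

       Keeping the saved position separate from "pos" means a failed match
       cannot disturb where the next attempt starts.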

VIRTUAL METHODS

       skip_comment ()
           Virtual method for skipping comments. Returns true if the current
           position should be skipped.

       skip_this ()
           Virtual method for skipping other content at the current position.
           Returns true or false.

       reserve_tokens ()
           Virtual method for reserving tokens.
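
       How a language-specific lexer might override such a hook can be
       sketched as follows. The base class "Sketch::Driver" is a stand-in
       written for this example, not "HCT::Lang::LexerDriver" itself. As a
       simplification, the hook receives the line as an argument, whereas
       the real "skip_comment" takes none.

```perl
use strict;
use warnings;

# Stand-in base class for illustration only; not HCT::Lang::LexerDriver.
package Sketch::Driver;

sub new {
    my ($class, @lines) = @_;
    return bless { lines => [@lines] }, $class;
}

# Hook with a default that skips nothing; subclasses override it.
sub skip_comment { return 0 }

# Return the lines that the hook did not ask to skip.
sub significant_lines {
    my ($self) = @_;
    return grep { !$self->skip_comment($_) } @{ $self->{lines} };
}

# Hypothetical language whose comments start with '#'.
package Sketch::HashCommentDriver;
our @ISA = ('Sketch::Driver');

sub skip_comment {
    my ($self, $line) = @_;
    return $line =~ /^\s*#/ ? 1 : 0;
}

package main;

my $driver = Sketch::HashCommentDriver->new(
    '# header', 'x = 1', '  # note', 'y = 2',
);
print join('|', $driver->significant_lines), "\n";
# prints: x = 1|y = 2
```

       The driver loop stays generic, and only the hook knows the comment
       syntax of the language.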

PROBLEMS

   Identifiers and keywords
       The lexer state solution involves coordination between the lexer and
       the parser. In particular, the parser must tell the lexer whether, in
       the current context, a keyword or an identifier is expected. There
       are several problems with coordinating the lexer's state with the
       parser's context. One of the most frequently noted is that it makes
       (multiple-token) lookahead more difficult. A related problem occurs
       if the context is miscommunicated between the parser and the lexer:
       the lexer may return a keyword when only identifiers are expected, or
       return a keyword as an identifier when it was supposed to be treated
       as a keyword.
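
       The dependence on parser context can be shown with a small sketch.
       The keyword set and the "classify_word" helper are hypothetical, and
       the context is reduced to a single flag passed in by the caller.

```perl
use strict;
use warnings;

# Hypothetical reserved words for illustration.
my %keywords = map { $_ => 1 } qw(if else while);

# Classify a word. $expect_keyword is the context reported by the parser:
# true where a keyword may appear, false where only identifiers are allowed.
sub classify_word {
    my ($word, $expect_keyword) = @_;
    return 'KEYWORD' if $expect_keyword && $keywords{$word};
    return 'IDENTIFIER';
}

print classify_word('while', 1), "\n";    # prints: KEYWORD
print classify_word('while', 0), "\n";    # prints: IDENTIFIER
print classify_word('total', 1), "\n";    # prints: IDENTIFIER
```

       The same lexeme yields different tokens depending on the flag, so a
       wrong flag, or a token scanned ahead before the parser has updated
       it, produces the misclassification described above.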

perl v5.30.1                      2020-01-29         HCT::Lang::LexerDriver(3)