1Spreadsheet::WriteExcelU:s:eFrorCmounltar(i3b)uted PerlSDporceuamdesnhteaetti:o:nWriteExcel::Formula(3)
2
3
4
6 Formula - A class for generating Excel formulas
7
9 See the documentation for Spreadsheet::WriteExcel
10
12 This module is used by Spreadsheet::WriteExcel. You do not need to use
13 it directly.
14
16 The following notes are to help developers and maintainers understand
17 the sequence of operation. They are also intended as a pro-memoria for
18 the author. ;-)
19
20 Spreadsheet::WriteExcel::Formula converts a textual representation of a
21 formula into the pre-parsed binary format that Excel uses to store
22 formulas. For example "1+2*3" is stored as follows: "1E 01 00 1E 02 00
23 1E 03 00 05 03".
24
25 This string is comprised of operators and operands arranged in a
26 reverse-Polish format. The meaning of the tokens in the above example
27 is shown in the following table:
28
29 Token Name Value
30 1E ptgInt 0001 (stored as 01 00)
31 1E ptgInt 0002 (stored as 02 00)
32 1E ptgInt 0003 (stored as 03 00)
33 05 ptgMul
34 03 ptgAdd
35
36 The tokens and token names are defined in the "Excel Developer's Kit"
37 from Microsoft Press. "ptg" stands for Parse ThinG (as in "That lexer
38 can't grok it, it's a parse thang.")
39
40 In general the tokens fall into two categories: operators such as
41 "ptgMul" and operands such as "ptgInt". When the formula is evaluated
42 by Excel the operand tokens push values onto a stack. The operator
43 tokens then pop the required number of operands off of the stack,
44 perform an operation and push the resulting value back onto the stack.
45 This methodology is similar to the basic operation of a reverse-Polish
46 (RPN) calculator.
47
48 Spreadsheet::WriteExcel::Formula parses a formula using a
49 "Parse::RecDescent" parser (at a later stage it may use a "Parse::Yapp"
50 parser or "Parse::FastDescent").
51
52 The parser converts the textual representation of a formula into a
53 parse tree. Thus, "1+2*3" is converted into something like the
54 following, "e" stands for expression:
55
56 e
57 / | \
58 1 + e
59 / | \
60 2 * 3
61
62 The function "_reverse_tree()" recurses down through this structure
63 swapping the order of operators followed by operands to produce a
64 reverse-Polish tree. In other words the formula is converted from in-
65 fix notation to post-fix. Following the above example the resulting
66 tree would look like this:
67
68 e
69 / | \
70 1 e +
71 / | \
72 2 3 *
73
74 The result of the recursion is a single array of tokens. In our example
75 the simplified form would look like the following:
76
77 (1, 2, 3, *, +)
78
79 The actual return value contains some additional information to help in
80 the secondary parsing stage:
81
82 (_num, 1, _num, 2, _num, 3, ptgMul, ptgAdd, _arg, 1)
83
84 The additional tokens are:
85
86 Token Meaning
87 _num The next token is a number
88 _str The next token is a string
89 _ref2d The next token is a 2d cell reference
90 _ref3d The next token is a 3d cell reference
91 _range2d The next token is a 2d range
92 _range3d The next token is a 3d range
93 _funcV The next token is a function
94 _arg The next token is the number of args for a function
95 _class The next token is a function name
96 _vol The formula contains a voltile function
97
98 The "_arg" token is generated for all lists but is only used for
99 functions that take a variable number of arguments.
100
101 The "_class" token indicates the start of the arguments to a function.
102 This allows the post-processor to decide the "class" of the ref and
103 range arguments that the function takes. The class can be reference,
104 value or array. Since function calls can be nested, the class variable
105 is stored on a stack in the @class array. The class of the ref or range
106 is then read as the top element of the stack $class[-1]. When a
107 "_funcV" is read it pops the class value.
108
109 Certain Excel functions such as RAND() and NOW() are designated as
110 volatile and must be recalculated by Excel every time that a cell is
111 updated. Any formulas that contain one of these functions has a
112 specially formatted "ptgAttr" tag prepended to it to indicate that it
113 is volatile.
114
115 A secondary parsing stage is carried out by "parse_tokens()" which
116 converts these tokens into a binary string. For the "1+2*3" example
117 this would give:
118
119 1E 01 00 1E 02 00 1E 03 00 05 03
120
121 This two-pass method could probably have been reduced to a single pass
122 through the "Parse::RecDescent" parser. However, it was easier to
123 develop and debug this way.
124
125 The token values and formula values are stored in the %ptg and
126 %functions hashes. These hashes and the parser object $parser are
127 exposed as global data. This breaks the OO encapsulation, but means
128 that they can be shared by several instances of Spreadsheet::WriteExcel
129 called from the same program.
130
131 Non-English function names can be added to the %functions hash using
132 the "function_locale.pl" program in the "examples" directory of the
133 distro. The supported languages are: German, French, Spanish,
134 Portuguese, Dutch, Finnish, Italian and Swedish. These languages are
135 not added by default because there are conflicts between functions
136 names in different languages.
137
138 The parser is initialised by "_init_parser()". The initialisation is
139 delayed until the first formula is parsed. This eliminates the overhead
140 of generating the parser in programs that are not processing formulas.
141 (The parser should really be pre-compiled, this is to-do when the
142 grammar stabilises).
143
145 John McNamara jmcnamara@cpan.org
146
148 Copyright MM-MMX, John McNamara.
149
150 All Rights Reserved. This module is free software. It may be used,
151 redistributed and/or modified under the same terms as Perl itself.
152
153
154
155perl v5.32.0 2020-07-28Spreadsheet::WriteExcel::Formula(3)