PERLCLASSGUTS(1) Perl Programmers Reference Guide PERLCLASSGUTS(1)

NAME
perlclassguts - Internals of how "feature 'class'" and class syntax
works

DESCRIPTION
This document provides in-depth information about the way in which the
perl interpreter implements the "feature 'class'" syntax and overall
behaviour. It is not intended as an end-user guide on how to use the
feature. For that, see perlclass.

The reader is assumed to be generally familiar with the perl
interpreter internals overall. For a more general overview of these
details, see also perlguts.

DATA STORAGE
Classes
A class is fundamentally a package, and exists in the symbol table as
an HV with an aux structure in exactly the same way as a non-class
package. It is distinguished from a non-class package by the fact that
the HvSTASH_IS_CLASS() macro will return true on it.

Extra information relating to it being a class is stored in the "struct
xpvhv_aux" structure attached to the stash, in the following fields:

    HV *xhv_class_superclass;
    CV *xhv_class_initfields_cv;
    AV *xhv_class_adjust_blocks;
    PADNAMELIST *xhv_class_fields;
    PADOFFSET xhv_class_next_fieldix;
    HV *xhv_class_param_map;

• "xhv_class_superclass" will be "NULL" for a class with no
  superclass. It will point directly to the stash of the parent class
  if one has been set with the :isa() class attribute.

• "xhv_class_initfields_cv" will contain a "CV *" pointing to a
  function to be invoked as part of the constructor of this class or
  any subclass thereof. This CV is responsible for initializing all
  the fields defined by this class for a new instance. This CV will
  be an anonymous real function - i.e. while it has no name and no
  GV, it is not a protosub and may be directly invoked.

• "xhv_class_adjust_blocks" may point to an AV containing CV pointers
  to each of the "ADJUST" blocks defined on the class. If the class
  has a superclass, this array will additionally contain duplicate
  pointers to the CVs of its parent class. The AV is created lazily
  the first time an element is pushed to it; it is valid for there
  not to be one, and this pointer will be "NULL" in that case.

  The CVs are stored directly, not via RVs. Each CV will be an
  anonymous real function.

• "xhv_class_fields" will point to a "PADNAMELIST" containing
  "PADNAME"s, each being one defined field of the class. They are
  stored in order of declaration. Note however, that the index into
  this array will not necessarily be equal to the "fieldix" of each
  field, because in the case of a subclass, the array will begin at
  zero but the index of the first field in it will be non-zero if its
  parent class contains any fields at all.

  For more information on how individual fields are represented, see
  "Fields".

• "xhv_class_next_fieldix" gives the field index that will be
  assigned to the next field to be added to the class. It is only
  useful at compile-time.

• "xhv_class_param_map" may point to an HV which maps field ":param"
  attribute names to the field index of the field with that name.
  This mapping is copied from parent classes; the map of each class
  will contain the entries of all its parent classes in addition to
  its own.

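The difference between a position in "xhv_class_fields" and a field's
"fieldix" can be illustrated with a small stand-alone model. This is
only a sketch: "struct model_class" and its helper functions are
hypothetical names invented here, not perl's real structures, and the
sketch assumes (as the text above implies) that a subclass continues
numbering from where its parent stopped.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for the class data described above. These are
 * NOT the real perl structures; the names merely echo the real fields. */
#define MODEL_MAX_FIELDS 8

struct model_class {
    const struct model_class *superclass; /* echoes xhv_class_superclass */
    size_t fieldix[MODEL_MAX_FIELDS];     /* fieldix of each declared field */
    size_t nfields;                       /* fields declared by this class only */
    size_t next_fieldix;                  /* echoes xhv_class_next_fieldix */
};

/* Start a class. Assumption: a subclass continues the numbering of
 * its parent, so its first field does not get index 0. */
static void model_class_init(struct model_class *c,
                             const struct model_class *super)
{
    c->superclass = super;
    c->nfields = 0;
    c->next_fieldix = super ? super->next_fieldix : 0;
}

/* Declare one field and return the fieldix it was given. */
static size_t model_class_add_field(struct model_class *c)
{
    size_t ix;

    assert(c->nfields < MODEL_MAX_FIELDS);
    ix = c->next_fieldix++;
    c->fieldix[c->nfields++] = ix;
    return ix;
}
```

In this model a base class with two fields hands out indexes 0 and 1,
and the first field of a subclass sits at position 0 of the subclass's
own list while carrying the field index 2.
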
Fields
A field is still fundamentally a lexical variable declared in a scope,
and exists in the "PADNAMELIST" of its corresponding CV. Methods and
other method-like CVs can still capture them exactly as they can with
regular lexicals. A field is distinguished from other kinds of pad
entry in that the PadnameIsFIELD() macro will return true on it.

Extra information relating to it being a field is stored in an
additional structure accessible via the PadnameFIELDINFO() macro on the
padname. This structure has the following fields:

    PADOFFSET fieldix;
    HV *fieldstash;
    OP *defop;
    SV *paramname;
    bool def_if_undef;
    bool def_if_false;

• "fieldix" stores the "field index" of the field; that is, the index
  into the instance field array where this field's value will be
  stored. Note that the first index in the array is not specially
  reserved. The first field in a class will start from field index 0.

• "fieldstash" stores a pointer to the stash of the class that
  defined this field. This is necessary in case there are multiple
  classes defined within the same scope; it is used to disambiguate
  the fields of each.

      {
          class C1; field $x;
          class C2; field $x;
      }

• "defop" may store a pointer to a defaulting expression optree for
  this field. Defaulting expressions are optional; this field may be
  "NULL".

• "paramname" may point to a regular string SV containing the
  ":param" name attribute given to the field. If none, it will be
  "NULL".

• One of "def_if_undef" and "def_if_false" will be true if the
  defaulting expression was set using the "//=" or "||=" operators
  respectively.

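How the two mode flags might steer the choice between a constructor
argument and the defaulting expression can be sketched as below. This
is a hedged illustration, not the interpreter's code: "struct
model_param", "struct model_fieldinfo" and "model_use_default" are
hypothetical names, the field is assumed to have both a ":param" name
and a defaulting expression, and the rules follow the user-level
behaviour described in perlclass.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical summary of one constructor argument. */
struct model_param {
    bool present;   /* the named parameter was passed at all */
    bool defined;   /* its value is not undef */
    bool truthy;    /* its value is true in the Perl sense */
};

/* Mirrors only the two mode flags of the field info structure. */
struct model_fieldinfo {
    bool def_if_undef;  /* default was written with //= */
    bool def_if_false;  /* default was written with ||= */
};

/* Return true when the defaulting expression should supply the value. */
static bool model_use_default(const struct model_fieldinfo *fi,
                              const struct model_param *p)
{
    if (!p->present)
        return true;            /* plain "=" default: argument missing */
    if (fi->def_if_undef && !p->defined)
        return true;            /* "//=": argument passed but undef */
    if (fi->def_if_false && !p->truthy)
        return true;            /* "||=": argument passed but false */
    return false;
}
```

With neither flag set, a passed-in undef is kept as the field's value;
with "def_if_undef" set it is replaced by the default.
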
Methods
A method is still fundamentally a CV, and has the same basic
representation as one. It has an optree and a pad, and is stored via a
GV in the stash of its containing package. It is distinguished from a
non-method CV by the fact that the CvIsMETHOD() macro will return true
on it.

(Note: This macro should not be confused with the one that was
previously called CvMETHOD(). That one does not relate to the class
system, and was renamed to CvNOWARN_AMBIGUOUS() to avoid this
confusion.)

There is currently no extra information that needs to be stored about a
method CV, so the structure does not add any new fields.

Instances
Object instances are represented by an entirely new SV type, whose base
type is "SVt_PVOBJ". This should still be blessed into its class stash
and wrapped in an RV in the usual manner for classical objects.

As these are their own unique container type, distinct from hashes or
arrays, the core "builtin::reftype" function returns a new value when
asked about these. That value is "OBJECT".

Internally, such an object is an array of SV pointers whose size is
fixed at creation time (because the number of fields in a class is
known after compilation). An object instance stores the max field index
within it (for basic error-checking on access), and a fixed-size array
of SV pointers storing the individual field values.

Fields of array and hash type directly store AV or HV pointers into the
array; they are not stored via an intervening RV.

API
The data structures described above are supported by the following API
functions.

Class Manipulation
class_setup_stash

    void class_setup_stash(HV *stash);

Called by the parser on encountering the "class" keyword. It upgrades
the stash into being a class and prepares it for receiving class-
specific items like methods and fields.

class_seal_stash

    void class_seal_stash(HV *stash);

Called by the parser at the end of a "class" block, or for unit classes
at the end of the containing scope. This function performs various
finalisation activities that are required before instances of the class
can be constructed, but could not have been done until all the
information about the members of the class is known.

Any additions to or modifications of the class under compilation must
be performed between these two function calls. Classes cannot be
modified once they have been sealed.

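The rule that all changes must fall between the setup and seal calls
amounts to a small state machine. The following sketch models only
that ordering rule; "enum model_state", "struct model_stash" and the
three functions are hypothetical, and returning false here stands in
for whatever error the real interpreter raises.

```c
#include <assert.h>
#include <stdbool.h>

/* Lifecycle of a stash in this sketch. The enum is an invention of
 * the sketch, not how perl records the state. */
enum model_state { MODEL_PACKAGE, MODEL_CLASS_OPEN, MODEL_CLASS_SEALED };

struct model_stash {
    enum model_state state;
    int nmembers;   /* fields, methods and ADJUST blocks added so far */
};

/* Stands in for class_setup_stash(): a plain package becomes an open class. */
static bool model_setup(struct model_stash *s)
{
    if (s->state != MODEL_PACKAGE)
        return false;
    s->state = MODEL_CLASS_OPEN;
    s->nmembers = 0;
    return true;
}

/* Stands in for any class_add_*() call: only legal while the class is open. */
static bool model_add_member(struct model_stash *s)
{
    if (s->state != MODEL_CLASS_OPEN)
        return false;
    s->nmembers++;
    return true;
}

/* Stands in for class_seal_stash(): no further changes are accepted. */
static bool model_seal(struct model_stash *s)
{
    if (s->state != MODEL_CLASS_OPEN)
        return false;
    s->state = MODEL_CLASS_SEALED;
    return true;
}
```

An addition attempted before setup or after sealing is rejected, while
any number of additions in between succeed.
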
class_add_field

    void class_add_field(HV *stash, PADNAME *pn);

Called by pad.c as part of defining a new field name in the current
pad. Note that this function does not create the padname; that must
already be done by pad.c. This API function simply informs the class
that the new field name has been created and is now available for it.

class_add_ADJUST

    void class_add_ADJUST(HV *stash, CV *cv);

Called by the parser once it has parsed and constructed a CV for a new
"ADJUST" block. This gets added to the list stored by the class.

Field Manipulation
class_prepare_initfield_parse

    void class_prepare_initfield_parse();

Called by the parser just before parsing an initializing expression for
a field variable. This makes use of a suspended compcv to combine all
the field initializing expressions into the same CV.

class_set_field_defop

    void class_set_field_defop(PADNAME *pn, OPCODE defmode, OP *defop);

Called by the parser after it has parsed an initializing expression for
the field. Sets the defaulting expression and mode of application.
"defmode" should either be zero, or one of "OP_ORASSIGN" or
"OP_DORASSIGN" depending on the defaulting mode.

padadd_FIELD

    #define padadd_FIELD

This flag constant tells the "pad_add_name_*" family of functions that
the new name should be added as a field. There is no need to call
class_add_field(); this will be done automatically.

Method Manipulation
class_prepare_method_parse

    void class_prepare_method_parse(CV *cv);

Called by the parser after start_subparse() but immediately before
doing anything else. This prepares the "PL_compcv" for parsing a
method by arranging for the "CvIsMETHOD" test to be true, adding the
$self lexical, and performing any other activities that may be
required.

class_wrap_method_body

    OP *class_wrap_method_body(OP *o);

Called by the parser at the end of parsing a method body into an optree
but just before wrapping it in the eventual CV. This function inserts
extra ops into the optree to make the method work correctly.

Object Instances
SVt_PVOBJ

    #define SVt_PVOBJ

An SV type constant used for comparison with the SvTYPE() macro.

ObjectMAXFIELD

    SSize_t ObjectMAXFIELD(sv);

A function-like macro that obtains the maximum valid field index that
can be accessed from the "ObjectFIELDS" array.

ObjectFIELDS

    SV **ObjectFIELDS(sv);

A function-like macro that obtains the fields array directly out of an
object instance. Fields can be accessed by their field index, from 0 up
to the maximum valid index given by "ObjectMAXFIELD".

OPCODES
OP_METHSTART

    newUNOP_AUX(OP_METHSTART, ...);

An "OP_METHSTART" is an "UNOP_AUX" which must be present at the start
of a method CV in order to make it work properly. This is inserted by
class_wrap_method_body(), and even appears before any optree fragment
associated with signature argument checking or extraction.

This op is responsible for shifting the value of $self out of the
arguments list and binding any field variables that the method requires
access to into the pad. The AUX vector will contain details of the
field/pad index pairings required.

This op also performs sanity checking on the invocant value. It checks
that it is definitely an object reference of a compatible class type.
If not, an exception is thrown.

If the "op_private" field includes the "OPpINITFIELDS" flag, this
indicates that the op begins the special "xhv_class_initfields_cv" CV.
In this case it should additionally take the second value from the
arguments list, which should be a plain HV pointer (directly, not via
RV), and bind it to the second pad slot, where the generated optree
will expect to find it.

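The binding step driven by the field/pad index pairings can be pictured
with a stand-alone model. This is a sketch only: "struct model_binding"
and "model_methstart" are hypothetical, the real layout of the AUX
vector is not reproduced, and the class-compatibility check on the
invocant is left out. The use of pad slot 1 for $self follows the
description under "$self Access During Methods".

```c
#include <assert.h>
#include <stddef.h>

/* One field/pad pairing, as a hypothetical stand-in for the
 * information carried in the AUX vector. */
struct model_binding {
    size_t padix;   /* pad slot the method body reads from */
    size_t fieldix; /* index into the instance's field array */
};

/* Store self in pad slot 1, then make each requested pad slot point
 * at the same value as the matching instance field.
 * Returns 0 on success, or -1 (touching nothing) if any index is
 * out of range. */
static int model_methstart(void **pad, size_t padlen, void *self,
                           void **fields, size_t nfields,
                           const struct model_binding *aux, size_t naux)
{
    size_t i;

    if (padlen < 2)
        return -1;
    for (i = 0; i < naux; i++) {
        if (aux[i].padix >= padlen || aux[i].fieldix >= nfields)
            return -1;
    }

    pad[1] = self;
    for (i = 0; i < naux; i++)
        pad[aux[i].padix] = fields[aux[i].fieldix];
    return 0;
}
```

Because the pad slot receives the same pointer that the instance
holds, a method body that modifies the lexical modifies the field of
that instance.
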
OP_INITFIELD
An "OP_INITFIELD" is only invoked as part of the
"xhv_class_initfields_cv" CV during the construction phase of an
instance. This is the time that the individual SVs that make up the
mutable fields of the instance (including AVs and HVs) are actually
assigned into the "ObjectFIELDS" array. The "OPpINITFIELD_AV" and
"OPpINITFIELD_HV" private flags indicate whether it is creating an AV
or HV; if neither is set then an SV is created.

If the op has the "OPf_STACKED" flag it expects to find an initializing
value on the stack. For SVs this is the topmost SV on the data stack.
For AVs and HVs it expects a marked list.

COMPILE-TIME BEHAVIOUR
"ADJUST" Phasers
At compile time, parsing of an "ADJUST" phaser is handled in a
fundamentally different way to the existing perl phasers ("BEGIN",
etc.).

Rather than taking the usual route, the tokenizer recognises that the
"ADJUST" keyword introduces a phaser block. The parser then parses the
body of this block similarly to how it would parse an (anonymous)
method body, creating a CV that has no name GV. This is then inserted
directly into the class information by calling "class_add_ADJUST",
entirely bypassing the symbol table.

Attributes
During compilation, attributes of both classes and fields are handled
in a different way to existing perl attributes on subroutines and
lexical variables.

The parser still forms an "OP_LIST" optree of "OP_CONST" nodes, but
these are passed to the "class_apply_attributes" or
"class_apply_field_attributes" functions. Rather than using a class
lookup for a method in the class being parsed, a fixed internal list of
known attributes is used to find functions to apply the attribute to
the class or field. In future this may support user-supplied extension
attributes, though at present it only recognises ones defined by the
core itself.

Field Initializing Expressions
During compilation, the parser makes use of a suspended compcv when
parsing the defaulting expression for a field. All the expressions for
all the fields in the class share the same suspended compcv, which is
then compiled up into the same internal CV called by the constructor to
initialize all the fields provided by that class.

RUNTIME BEHAVIOUR
Constructor
The generated constructor for a class itself is an XSUB which performs
three tasks in order: it creates the instance SV itself, invokes the
field initializers, then invokes the ADJUST block CVs. The constructor
for any class is always the same basic shape, regardless of whether the
class has a superclass or not.

The field initializers are collected into a generated optree-based CV
called the field initializer CV. This is the CV which contains all the
optree fragments for the field initializing expressions. When invoked,
the field initializer CV might make a chained call to the superclass
initializer if one exists, before invoking all of the individual field
initialization ops. The field initializer CV is invoked with two items
on the stack: the instance SV and a direct HV containing the
constructor parameters. Note carefully: this HV is passed directly, not
via an RV reference. This is permitted because both the caller and the
callee are directly generated code and not arbitrary pure-perl
subroutines.

The ADJUST block CVs are all collected into a single flat list, merging
all of the ones defined by the superclass as well. They are all invoked
in order, after the field initializer CV.

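The three constructor steps and their ordering can be traced with a
small model. It is a sketch under assumptions, not the XSUB itself:
"struct ctor_class", "struct ctor_trace" and the functions are
hypothetical, text tags stand in for CVs, and the model assumes that a
parent's ADJUST blocks sit ahead of the subclass's own in the merged
list.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define CTOR_MAX_ADJUST 8

/* Records the order in which the steps run. */
struct ctor_trace {
    char buf[128];
    size_t len;
};

static void ctor_trace_add(struct ctor_trace *t, const char *s)
{
    size_t n = strlen(s);

    if (t->len + n < sizeof t->buf) {
        memcpy(t->buf + t->len, s, n + 1);
        t->len += n;
    }
}

/* Hypothetical class record; tags stand in for the real CV pointers. */
struct ctor_class {
    const struct ctor_class *superclass;
    const char *init_tag;                 /* the field initializer CV */
    const char *adjust[CTOR_MAX_ADJUST];  /* flat list of ADJUST blocks */
    size_t nadjust;
};

/* A new class starts with duplicates of its parent's ADJUST entries. */
static void ctor_class_init(struct ctor_class *c,
                            const struct ctor_class *super,
                            const char *init_tag)
{
    size_t i;

    c->superclass = super;
    c->init_tag = init_tag;
    c->nadjust = 0;
    if (super) {
        for (i = 0; i < super->nadjust; i++)
            c->adjust[c->nadjust++] = super->adjust[i];
    }
}

static void ctor_add_adjust(struct ctor_class *c, const char *tag)
{
    assert(c->nadjust < CTOR_MAX_ADJUST);
    c->adjust[c->nadjust++] = tag;
}

/* Field initializer: chain to the superclass first, then this class. */
static void ctor_initfields(const struct ctor_class *c, struct ctor_trace *t)
{
    if (c->superclass)
        ctor_initfields(c->superclass, t);
    ctor_trace_add(t, c->init_tag);
}

/* Constructor: create, initialize fields, then run the ADJUST list. */
static void ctor_construct(const struct ctor_class *c, struct ctor_trace *t)
{
    size_t i;

    ctor_trace_add(t, "create;");
    ctor_initfields(c, t);
    for (i = 0; i < c->nadjust; i++)
        ctor_trace_add(t, c->adjust[i]);
}
```

Note the asymmetry in the model, which mirrors the text: field
initialization reaches the superclass through a chained call, whereas
the ADJUST blocks need no chaining because the list was already merged
when the class was set up.
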
$self Access During Methods
When class_prepare_method_parse() is called, it arranges that the pad
of the new CV body will begin with a lexical called $self. Because the
pad should be freshly-created at this point, this will have the pad
index of 1. The function checks this and aborts if that is not true.

Because of this fact, code within the body of a method or method-like
CV can reliably use pad index 1 to obtain the invocant reference. The
"OP_INITFIELD" opcode also relies on this fact.

In similar fashion, during the "xhv_class_initfields_cv" the next pad
slot is relied on to store the constructor parameters HV, at pad index
2.

AUTHORS
Paul Evans

perl v5.38.2 2023-11-30 PERLCLASSGUTS(1)