PERLCLASSGUTS(1)       Perl Programmers Reference Guide       PERLCLASSGUTS(1)

NAME

       perlclassguts - Internals of how "feature 'class'" and class syntax
       work

DESCRIPTION

       This document provides in-depth information about the way in which the
       perl interpreter implements the "feature 'class'" syntax and overall
       behaviour.  It is not intended as an end-user guide on how to use the
       feature. For that, see perlclass.

       The reader is assumed to be generally familiar with the perl
       interpreter internals overall. For a more general overview of these
       details, see also perlguts.

DATA STORAGE

   Classes
       A class is fundamentally a package, and exists in the symbol table as
       an HV with an aux structure in exactly the same way as a non-class
       package. It is distinguished from a non-class package by the fact that
       the HvSTASH_IS_CLASS() macro will return true on it.

       Extra information relating to it being a class is stored in the "struct
       xpvhv_aux" structure attached to the stash, in the following fields:

           HV          *xhv_class_superclass;
           CV          *xhv_class_initfields_cv;
           AV          *xhv_class_adjust_blocks;
           PADNAMELIST *xhv_class_fields;
           PADOFFSET    xhv_class_next_fieldix;
           HV          *xhv_class_param_map;

       •   "xhv_class_superclass" will be "NULL" for a class with no
           superclass. It will point directly to the stash of the parent class
           if one has been set with the :isa() class attribute.

       •   "xhv_class_initfields_cv" will contain a "CV *" pointing to a
           function to be invoked as part of the constructor of this class or
           any subclass thereof. This CV is responsible for initializing all
           the fields defined by this class for a new instance. This CV will
           be an anonymous real function - i.e. while it has no name and no
           GV, it is not a protosub and may be directly invoked.

       •   "xhv_class_adjust_blocks" may point to an AV containing CV pointers
           to each of the "ADJUST" blocks defined on the class. If the class
           has a superclass, this array will additionally contain duplicate
           pointers to the CVs of its parent class. The AV is created lazily
           the first time an element is pushed to it; it is valid for there
           not to be one, and this pointer will be "NULL" in that case.

           The CVs are stored directly, not via RVs. Each CV will be an
           anonymous real function.

       •   "xhv_class_fields" will point to a "PADNAMELIST" containing
           "PADNAME"s, each being one defined field of the class. They are
           stored in order of declaration. Note however, that the index into
           this array will not necessarily be equal to the "fieldix" of each
           field, because in the case of a subclass, the array will begin at
           zero but the index of the first field in it will be non-zero if its
           parent class contains any fields at all.

           For more information on how individual fields are represented, see
           "Fields".

       •   "xhv_class_next_fieldix" gives the field index that will be
           assigned to the next field to be added to the class. It is only
           useful at compile-time.

       •   "xhv_class_param_map" may point to an HV which maps field ":param"
           attribute names to the field index of the field with that name.
           This mapping is copied from parent classes; each class will contain
           the sum total of all its parents in addition to its own.

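       The relationship between a position in "xhv_class_fields" and a
       field's "fieldix" can be sketched with a small stand-alone model. The
       "ModelClass" type and the "model_class_*" functions below are
       hypothetical stand-ins invented for illustration; they are not perl's
       real structures or API. The sketch assumes that the parent class is
       already complete when the subclass is started.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for the class part of a stash.
 * This is NOT perl's real struct xpvhv_aux. */
typedef struct ModelClass {
    const struct ModelClass *superclass;  /* models xhv_class_superclass   */
    size_t nfields;                       /* entries in this class's own
                                             field list                    */
    size_t next_fieldix;                  /* models xhv_class_next_fieldix */
} ModelClass;

/* Start a class. A subclass continues numbering where its parent stopped. */
void model_class_init(ModelClass *c, const ModelClass *super)
{
    c->superclass   = super;
    c->nfields      = 0;
    c->next_fieldix = super ? super->next_fieldix : 0;
}

/* Declare one field and return the fieldix assigned to it. */
size_t model_class_add_field(ModelClass *c)
{
    c->nfields++;
    return c->next_fieldix++;
}

/* Convert a position in the class's own field list into a fieldix. */
size_t model_class_fieldix_at(const ModelClass *c, size_t pos)
{
    assert(pos < c->nfields);
    /* The first own field sits just past everything that was inherited. */
    return (c->next_fieldix - c->nfields) + pos;
}
```

       In this model, a parent with two fields leaves the first field of a
       subclass at position 0 of its own list but with a fieldix of 2.
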
   Fields
       A field is still fundamentally a lexical variable declared in a scope,
       and exists in the "PADNAMELIST" of its corresponding CV. Methods and
       other method-like CVs can still capture them exactly as they can with
       regular lexicals. A field is distinguished from other kinds of pad
       entry in that the PadnameIsFIELD() macro will return true on it.

       Extra information relating to it being a field is stored in an
       additional structure accessible via the PadnameFIELDINFO() macro on the
       padname. This structure has the following fields:

           PADOFFSET  fieldix;
           HV        *fieldstash;
           OP        *defop;
           SV        *paramname;
           bool       def_if_undef;
           bool       def_if_false;

       •   "fieldix" stores the "field index" of the field; that is, the index
           into the instance field array where this field's value will be
           stored. Note that the first index in the array is not specially
           reserved. The first field in a class will start from field index 0.

       •   "fieldstash" stores a pointer to the stash of the class that
           defined this field. This is necessary in case there are multiple
           classes defined within the same scope; it is used to disambiguate
           the fields of each.

               {
                   class C1; field $x;
                   class C2; field $x;
               }

       •   "defop" may store a pointer to a defaulting expression optree for
           this field.  Defaulting expressions are optional; this field may be
           "NULL".

       •   "paramname" may point to a regular string SV containing the
           ":param" name attribute given to the field. If none, it will be
           "NULL".

       •   One of "def_if_undef" and "def_if_false" will be true if the
           defaulting expression was set using the "//=" or "||=" operators
           respectively.

   Methods
       A method is still fundamentally a CV, and has the same basic
       representation as one. It has an optree and a pad, and is stored via a
       GV in the stash of its containing package. It is distinguished from a
       non-method CV by the fact that the CvIsMETHOD() macro will return true
       on it.

       (Note: This macro should not be confused with the one that was
       previously called CvMETHOD(). That one does not relate to the class
       system, and was renamed to CvNOWARN_AMBIGUOUS() to avoid this
       confusion.)

       There is currently no extra information that needs to be stored about a
       method CV, so the structure does not add any new fields.

   Instances
       Object instances are represented by an entirely new SV type, whose base
       type is "SVt_PVOBJ". This should still be blessed into its class stash
       and wrapped in an RV in the usual manner for classical objects.

       As these are their own unique container type, distinct from hashes or
       arrays, the core "builtin::reftype" function returns a new value when
       asked about these. That value is "OBJECT".

       Internally, such an object is an array of SV pointers whose size is
       fixed at creation time (because the number of fields in a class is
       known after compilation). An object instance stores the max field index
       within it (for basic error-checking on access), and a fixed-size array
       of SV pointers storing the individual field values.

       Fields of array and hash type directly store AV or HV pointers into the
       array; they are not stored via an intervening RV.

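       The storage layout described above can be approximated by the
       following stand-alone sketch. "ModelObject" and the "model_object_*"
       functions are hypothetical names used only for illustration; they are
       not the real "SVt_PVOBJ" body. Using -1 as the maximum index of an
       object with no fields is an assumption of this model.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical model of an instance body: a maximum valid field index
 * plus a fixed-size array of value pointers. */
typedef struct ModelObject {
    long   maxfield;  /* highest valid index; -1 when there are no fields */
    void **fields;    /* one slot per field, sized once at creation       */
} ModelObject;

/* Allocate an instance with room for exactly nfields slots. */
ModelObject *model_object_new(size_t nfields)
{
    ModelObject *obj = malloc(sizeof *obj);
    if (obj == NULL)
        return NULL;

    /* Allocate at least one slot so that calloc never sees a zero size. */
    obj->fields = calloc(nfields ? nfields : 1, sizeof *obj->fields);
    if (obj->fields == NULL) {
        free(obj);
        return NULL;
    }
    obj->maxfield = (long)nfields - 1;
    return obj;
}

/* Return the address of a slot, or NULL if the index is out of range.
 * This mirrors the basic error-checking role of the stored maximum. */
void **model_object_slot(ModelObject *obj, long fieldix)
{
    assert(obj != NULL);
    if (fieldix < 0 || fieldix > obj->maxfield)
        return NULL;
    return &obj->fields[fieldix];
}

void model_object_free(ModelObject *obj)
{
    if (obj != NULL) {
        free(obj->fields);
        free(obj);
    }
}
```

       A slot holds the value pointer itself, which matches the point that
       array and hash fields are stored directly rather than through an
       intervening reference.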

API

       The data structures described above are supported by the following API
       functions.

   Class Manipulation
       class_setup_stash

           void class_setup_stash(HV *stash);

       Called by the parser on encountering the "class" keyword. It upgrades
       the stash into being a class and prepares it for receiving
       class-specific items like methods and fields.

       class_seal_stash

           void class_seal_stash(HV *stash);

       Called by the parser at the end of a "class" block or, for unit
       classes, at the end of the containing scope. This function performs
       various finalisation activities that are required before instances of
       the class can be constructed, but could not have been done until all
       the information about the members of the class is known.

       Any additions to or modifications of the class under compilation must
       be performed between these two function calls. Classes cannot be
       modified once they have been sealed.

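       The setup/seal window can be pictured as a small state machine. The
       sketch below is a hypothetical model, not perl's implementation: the
       "ModelStash" type and the "model_stash_*" functions are invented
       names, and returning false is only this model's way of signalling a
       rejected operation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical lifecycle states for a stash in this model. */
typedef enum {
    MODEL_PLAIN_PACKAGE,  /* not yet a class                  */
    MODEL_CLASS_OPEN,     /* between setup and seal           */
    MODEL_CLASS_SEALED    /* finalised; no more modifications */
} ModelStashState;

typedef struct {
    ModelStashState state;
    size_t          nfields;
    size_t          nadjust;
} ModelStash;

/* Models class_setup_stash(): upgrade a plain package into an open class. */
bool model_stash_setup(ModelStash *s)
{
    assert(s != NULL);
    if (s->state != MODEL_PLAIN_PACKAGE)
        return false;
    s->state   = MODEL_CLASS_OPEN;
    s->nfields = 0;
    s->nadjust = 0;
    return true;
}

/* Models class_add_field(): only permitted while the class is open. */
bool model_stash_add_field(ModelStash *s)
{
    assert(s != NULL);
    if (s->state != MODEL_CLASS_OPEN)
        return false;
    s->nfields++;
    return true;
}

/* Models class_add_ADJUST(): only permitted while the class is open. */
bool model_stash_add_adjust(ModelStash *s)
{
    assert(s != NULL);
    if (s->state != MODEL_CLASS_OPEN)
        return false;
    s->nadjust++;
    return true;
}

/* Models class_seal_stash(): close the class to further changes. */
bool model_stash_seal(ModelStash *s)
{
    assert(s != NULL);
    if (s->state != MODEL_CLASS_OPEN)
        return false;
    s->state = MODEL_CLASS_SEALED;
    return true;
}
```

       Every mutating call succeeds only in the open state, which is the
       model's reading of the rule that all changes must happen between the
       two calls.
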
       class_add_field

           void class_add_field(HV *stash, PADNAME *pn);

       Called by pad.c as part of defining a new field name in the current
       pad.  Note that this function does not create the padname; that must
       already be done by pad.c. This API function simply informs the class
       that the new field name has been created and is now available for it.

       class_add_ADJUST

           void class_add_ADJUST(HV *stash, CV *cv);

       Called by the parser once it has parsed and constructed a CV for a new
       "ADJUST" block. This gets added to the list stored by the class.

   Field Manipulation
       class_prepare_initfield_parse

           void class_prepare_initfield_parse();

       Called by the parser just before parsing an initializing expression for
       a field variable. This makes use of a suspended compcv to combine all
       the field initializing expressions into the same CV.

       class_set_field_defop

           void class_set_field_defop(PADNAME *pn, OPCODE defmode, OP *defop);

       Called by the parser after it has parsed an initializing expression for
       the field. Sets the defaulting expression and mode of application.
       "defmode" should either be zero, or one of "OP_ORASSIGN" or
       "OP_DORASSIGN" depending on the defaulting mode.

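       One way to picture how "defmode" relates to the "def_if_undef" and
       "def_if_false" flags is the stand-alone sketch below. All names in it
       are hypothetical: the "MODEL_OP_*" constants merely stand in for the
       real opcode numbers, and "ModelFieldInfo" and "ModelParam" are
       simplified illustrations. The decision function assumes the behaviour
       described in perlclass, where "//=" also applies the default to an
       undefined parameter and "||=" also applies it to a false one.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the opcode numbers; the real values are
 * defined by perl itself and are not reproduced here. */
enum {
    MODEL_OP_NONE      = 0,  /* plain "=" default                  */
    MODEL_OP_ORASSIGN  = 1,  /* stands in for OP_ORASSIGN,  "||="  */
    MODEL_OP_DORASSIGN = 2   /* stands in for OP_DORASSIGN, "//="  */
};

typedef struct {
    bool has_defop;     /* models defop != NULL */
    bool def_if_undef;
    bool def_if_false;
} ModelFieldInfo;

/* Models the flag-setting part of class_set_field_defop(). */
void model_set_field_defop(ModelFieldInfo *info, int defmode)
{
    assert(info != NULL);
    info->has_defop    = true;
    info->def_if_undef = (defmode == MODEL_OP_DORASSIGN);
    info->def_if_false = (defmode == MODEL_OP_ORASSIGN);
}

/* Description of a constructor parameter as seen at construction time. */
typedef struct {
    bool supplied;  /* the caller passed this parameter */
    bool defined;   /* the passed value is defined      */
    bool truthy;    /* the passed value is true         */
} ModelParam;

/* Decide whether the defaulting expression should supply the value. */
bool model_use_default(const ModelFieldInfo *info, ModelParam p)
{
    assert(info != NULL);
    if (!info->has_defop)
        return false;
    if (!p.supplied)
        return true;
    if (info->def_if_undef && !p.defined)
        return true;
    if (info->def_if_false && !p.truthy)
        return true;
    return false;
}
```

       With a "//=" style default, a supplied but undefined parameter falls
       back to the default, while a supplied zero does not; with "||=" both
       do.
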
       padadd_FIELD

           #define padadd_FIELD

       This flag constant tells the "pad_add_name_*" family of functions that
       the new name should be added as a field. There is no need to call
       class_add_field(); this will be done automatically.

   Method Manipulation
       class_prepare_method_parse

           void class_prepare_method_parse(CV *cv);

       Called by the parser after start_subparse() but immediately before
       doing anything else. This prepares the "PL_compcv" for parsing a
       method: arranging for the "CvIsMETHOD" test to be true, adding the
       $self lexical, and any other activities that may be required.

       class_wrap_method_body

           OP *class_wrap_method_body(OP *o);

       Called by the parser at the end of parsing a method body into an optree
       but just before wrapping it in the eventual CV. This function inserts
       extra ops into the optree to make the method work correctly.

   Object Instances
       SVt_PVOBJ

           #define SVt_PVOBJ

       An SV type constant used for comparison with the SvTYPE() macro.

       ObjectMAXFIELD

           SSize_t ObjectMAXFIELD(sv);

       A function-like macro that obtains the maximum valid field index that
       can be accessed from the "ObjectFIELDS" array.

       ObjectFIELDS

           SV **ObjectFIELDS(sv);

       A function-like macro that obtains the fields array directly out of an
       object instance. Fields can be accessed by their field index, from 0 up
       to the maximum valid index given by "ObjectMAXFIELD".

OPCODES

   OP_METHSTART
           newUNOP_AUX(OP_METHSTART, ...);

       An "OP_METHSTART" is an "UNOP_AUX" which must be present at the start
       of a method CV in order to make it work properly. This is inserted by
       class_wrap_method_body(), and even appears before any optree fragment
       associated with signature argument checking or extraction.

       This op is responsible for shifting the value of $self out of the
       arguments list and binding any field variables that the method requires
       access to into the pad. The AUX vector will contain details of the
       field/pad index pairings required.

       This op also performs sanity checking on the invocant value. It checks
       that it is definitely an object reference of a compatible class type.
       If not, an exception is thrown.

       If the "op_private" field includes the "OPpINITFIELDS" flag, this
       indicates that the op begins the special "xhv_class_initfields_cv" CV.
       In this case it should additionally take the second value from the
       arguments list, which should be a plain HV pointer (directly, not via
       RV), and bind it to the second pad slot, where the generated optree
       will expect to find it.

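       The binding step can be sketched as a loop over field/pad index pairs.
       The code below is a hypothetical stand-alone model: "ModelBinding" and
       "model_methstart_bind" are invented names, the pair layout is an
       assumption rather than the real AUX vector format, and the invocant
       check is reduced to a non-NULL test in place of the real class
       compatibility check.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical pairing of a pad slot with the field it should alias. */
typedef struct {
    size_t padix;
    size_t fieldix;
} ModelBinding;

/* Bind the invocant and the requested fields into the pad.
 * Returns 0 on success, or -1 if the invocant or any index is invalid.
 * Nothing is written to the pad unless every pairing is valid. */
int model_methstart_bind(void **pad, size_t padsize, void *self,
                         void **fields, long maxfield,
                         const ModelBinding *binds, size_t nbinds)
{
    assert(pad != NULL && binds != NULL);

    /* Simplified invocant check; slot 1 must exist to hold $self. */
    if (self == NULL || padsize < 2)
        return -1;

    /* Validate every pairing before touching the pad. */
    for (size_t i = 0; i < nbinds; i++) {
        if (maxfield < 0 || binds[i].fieldix > (size_t)maxfield)
            return -1;
        if (binds[i].padix >= padsize)
            return -1;
    }

    pad[1] = self;  /* the invocant lives at pad index 1 */
    for (size_t i = 0; i < nbinds; i++)
        pad[binds[i].padix] = fields[binds[i].fieldix];
    return 0;
}
```

       The pad slots end up pointing at the same values as the instance's
       field array, so the method body sees the fields as ordinary lexicals.
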
   OP_INITFIELD
       An "OP_INITFIELD" is only invoked as part of the
       "xhv_class_initfields_cv" CV during the construction phase of an
       instance. This is the time that the individual SVs that make up the
       mutable fields of the instance (including AVs and HVs) are actually
       assigned into the "ObjectFIELDS" array. The "OPpINITFIELD_AV" and
       "OPpINITFIELD_HV" private flags indicate whether it is creating an AV
       or HV; if neither is set then an SV is created.

       If the op has the "OPf_STACKED" flag it expects to find an initializing
       value on the stack. For SVs this is the topmost SV on the data stack.
       For AVs and HVs it expects a marked list.

COMPILE-TIME BEHAVIOUR

   "ADJUST" Phasers
       During compile time, parsing of an "ADJUST" phaser is handled in a
       fundamentally different way to the existing perl phasers ("BEGIN",
       etc.).

       Rather than taking the usual route, the tokenizer recognises that the
       "ADJUST" keyword introduces a phaser block. The parser then parses the
       body of this block similarly to how it would parse an (anonymous)
       method body, creating a CV that has no name GV. This is then inserted
       directly into the class information by calling "class_add_ADJUST",
       entirely bypassing the symbol table.

   Attributes
       During compilation, attributes of both classes and fields are handled
       in a different way to existing perl attributes on subroutines and
       lexical variables.

       The parser still forms an "OP_LIST" optree of "OP_CONST" nodes, but
       these are passed to the "class_apply_attributes" or
       "class_apply_field_attributes" functions. Rather than using a class
       lookup for a method in the class being parsed, a fixed internal list of
       known attributes is used to find functions to apply the attribute to
       the class or field. In future this may support user-supplied extension
       attributes, though at present it only recognises ones defined by the
       core itself.

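       The idea of a fixed internal list of known attributes can be sketched
       as a static lookup table. The code below is a hypothetical model and
       not the real implementation: "ModelAttrClass" and the "model_*"
       functions are invented names, the table holds only the "isa"
       attribute mentioned earlier, and a return value of -1 is only this
       model's way of signalling an unrecognised attribute.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified target for class attributes. */
typedef struct {
    const char *superclass_name;  /* set by the "isa" attribute */
} ModelAttrClass;

typedef void (*ModelClassAttrFn)(ModelAttrClass *cls, const char *value);

/* Handler for the "isa" attribute in this model. */
static void model_apply_isa(ModelAttrClass *cls, const char *value)
{
    cls->superclass_name = value;
}

/* The fixed list of known attributes; there is no user extension hook. */
static const struct {
    const char      *name;
    ModelClassAttrFn apply;
} model_class_attrs[] = {
    { "isa", model_apply_isa },
};

/* Look the attribute up by name and apply it.
 * Returns 0 on success, or -1 if the name is not in the table. */
int model_apply_class_attribute(ModelAttrClass *cls, const char *name,
                                const char *value)
{
    assert(cls != NULL && name != NULL);
    size_t count = sizeof model_class_attrs / sizeof model_class_attrs[0];

    for (size_t i = 0; i < count; i++) {
        if (strcmp(model_class_attrs[i].name, name) == 0) {
            model_class_attrs[i].apply(cls, value);
            return 0;
        }
    }
    return -1;
}
```

       Because the table is static, a name that is not listed can never be
       dispatched, which matches the statement that only core-defined
       attributes are currently recognised.
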
   Field Initializing Expressions
       During compilation, the parser makes use of a suspended compcv when
       parsing the defaulting expression for a field. All the expressions for
       all the fields in the class share the same suspended compcv, which is
       then compiled up into the same internal CV called by the constructor to
       initialize all the fields provided by that class.

RUNTIME BEHAVIOUR

   Constructor
       The generated constructor for a class itself is an XSUB which performs
       three tasks in order: it creates the instance SV itself, invokes the
       field initializers, then invokes the ADJUST block CVs. The constructor
       for any class is always the same basic shape, regardless of whether the
       class has a superclass or not.

       The field initializers are collected into a generated optree-based CV
       called the field initializer CV. This is the CV which contains all the
       optree fragments for the field initializing expressions. When invoked,
       the field initializer CV might make a chained call to the superclass
       initializer if one exists, before invoking all of the individual field
       initialization ops. The field initializer CV is invoked with two items
       on the stack: the instance SV and a direct HV containing the
       constructor parameters. Note carefully: this HV is passed directly, not
       via an RV reference. This is permitted because both the caller and the
       callee are directly generated code and not arbitrary pure-perl
       subroutines.

       The ADJUST block CVs are all collected into a single flat list, merging
       all of the ones defined by the superclass as well. They are all invoked
       in order, after the field initializer CV.

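       The order of the three constructor tasks can be sketched with a model
       that records a trace of steps. Everything here is hypothetical:
       "ModelCtorClass", the "model_ctor_*" functions and the integer step
       codes are invented for illustration. Placing inherited ADJUST blocks
       ahead of a class's own blocks in the flat list is an assumption of
       this model.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_MAX_ADJUST 8

/* Step codes written to the trace by this model. */
enum { STEP_CREATE = 0, STEP_INITFIELDS = 100 };

typedef struct ModelCtorClass {
    const struct ModelCtorClass *superclass;
    int    id;                        /* identifies the class in the trace */
    int    adjust[MODEL_MAX_ADJUST];  /* flat list of ADJUST block ids     */
    size_t nadjust;
} ModelCtorClass;

/* Append one ADJUST block id; returns -1 if the list is full. */
int model_ctor_add_adjust(ModelCtorClass *c, int block_id)
{
    if (c->nadjust >= MODEL_MAX_ADJUST)
        return -1;
    c->adjust[c->nadjust++] = block_id;
    return 0;
}

/* Start a class, copying the parent's ADJUST list so it stays flat. */
void model_ctor_class_init(ModelCtorClass *c, int id,
                           const ModelCtorClass *super)
{
    c->superclass = super;
    c->id         = id;
    c->nadjust    = 0;
    if (super != NULL)
        for (size_t i = 0; i < super->nadjust; i++)
            c->adjust[c->nadjust++] = super->adjust[i];
}

/* Field initializer: chain to the superclass first, then this class. */
size_t model_ctor_initfields(const ModelCtorClass *c, int *trace,
                             size_t len, size_t cap)
{
    if (c->superclass != NULL)
        len = model_ctor_initfields(c->superclass, trace, len, cap);
    if (len < cap)
        trace[len++] = STEP_INITFIELDS + c->id;
    return len;
}

/* Constructor: create, run field initializers, then ADJUST blocks.
 * Returns the number of steps written to the trace. */
size_t model_construct(const ModelCtorClass *c, int *trace, size_t cap)
{
    assert(c != NULL && trace != NULL);
    size_t len = 0;

    if (len < cap)
        trace[len++] = STEP_CREATE;
    len = model_ctor_initfields(c, trace, len, cap);
    for (size_t i = 0; i < c->nadjust && len < cap; i++)
        trace[len++] = c->adjust[i];
    return len;
}
```

       The ADJUST loop walks only the class's own flat list and never
       consults the superclass at construction time, because the inherited
       blocks were already merged in when the class was set up.
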
   $self Access During Methods
       When class_prepare_method_parse() is called, it arranges that the pad
       of the new CV body will begin with a lexical called $self. Because the
       pad should be freshly-created at this point, this will have the pad
       index of 1.  The function checks this and aborts if that is not true.

       Because of this fact, code within the body of a method or method-like
       CV can reliably use pad index 1 to obtain the invocant reference. The
       "OP_INITFIELD" opcode also relies on this fact.

       In similar fashion, during the "xhv_class_initfields_cv" the next pad
       slot is relied on to store the constructor parameters HV, at pad index
       2.

AUTHORS

       Paul Evans



perl v5.38.2                      2023-11-30                  PERLCLASSGUTS(1)