PERLAPI(1) Perl Programmers Reference Guide PERLAPI(1)

perlapi - autogenerated documentation for the perl public API
7
9 This file contains the documentation of the perl public API generated
10 by embed.pl, specifically a listing of functions, macros, flags, and
11 variables that may be used by extension writers. At the end is a list
12 of functions which have yet to be documented. The interfaces of those
13 are subject to change without notice. Anything not listed here is not
14 part of the public API, and should not be used by extension writers at
15 all. For these reasons, blindly using functions listed in proto.h is
16 to be avoided when writing extensions.
17
18 In Perl, unlike C, a string of characters may generally contain
19 embedded "NUL" characters. Sometimes in the documentation a Perl
20 string is referred to as a "buffer" to distinguish it from a C string,
21 but sometimes they are both just referred to as strings.
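
The difference between a C string and a length-carrying buffer can be illustrated in plain C, without any Perl headers. The following is only a sketch: "counted_buf", "c_string_length" and "count_byte" are hypothetical names invented for this illustration and are not part of the Perl API.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical length-carrying buffer; NOT a Perl API type. */
typedef struct {
    const char *ptr;  /* first byte of the data */
    size_t      len;  /* number of bytes, embedded NULs included */
} counted_buf;

/* Length as a C string sees it: scanning stops at the first NUL byte. */
size_t c_string_length(const char *s) {
    return strlen(s);
}

/* Count occurrences of one byte value across the whole buffer,
 * walking past any embedded NUL bytes. */
size_t count_byte(counted_buf b, char byte) {
    size_t n = 0;
    for (size_t i = 0; i < b.len; i++)
        if (b.ptr[i] == byte)
            n++;
    return n;
}
```

For the five bytes 'a', 'b', NUL, 'c', 'd', the C-string view ends after "ab", while the counted view still reaches the trailing "cd".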
22
23 Note that all Perl API global variables must be referenced with the
24 "PL_" prefix. Again, those not listed here are not to be used by
25 extension writers, and can be changed or removed without notice; same
26 with macros. Some macros are provided for compatibility with the
27 older, unadorned names, but this support may be disabled in a future
28 release.
29
Perl was originally written to handle US-ASCII only (that is, characters
whose ordinal numbers are in the range 0 - 127). Documentation and
comments may still use the term ASCII when in fact the entire range
from 0 - 255 is meant.
34
35 The non-ASCII characters below 256 can have various meanings, depending
36 on various things. (See, most notably, perllocale.) But usually the
37 whole range can be referred to as ISO-8859-1. Often, the term
38 "Latin-1" (or "Latin1") is used as an equivalent for ISO-8859-1. But
39 some people treat "Latin1" as referring just to the characters in the
range 128 through 255, or sometimes from 160 through 255. This
41 documentation uses "Latin1" and "Latin-1" to refer to all 256
42 characters.
43
Note that Perl can be compiled and run under either ASCII or EBCDIC
(see perlebcdic). Most of the documentation (and even comments in the
code) ignores the EBCDIC possibility. For almost all purposes the
47 differences are transparent. As an example, under EBCDIC, instead of
48 UTF-8, UTF-EBCDIC is used to encode Unicode strings, and so whenever
49 this documentation refers to "utf8" (and variants of that name,
50 including in function names), it also (essentially transparently) means
51 "UTF-EBCDIC". But the ordinals of characters differ between ASCII,
52 EBCDIC, and the UTF- encodings, and a string encoded in UTF-EBCDIC may
53 occupy a different number of bytes than in UTF-8.
54
55 The listing below is alphabetical, case insensitive.
56
58 av_clear
Frees all the elements of an array, leaving it empty. The
60 XS equivalent of "@array = ()". See also "av_undef".
61
62 Note that it is possible that the actions of a destructor
63 called directly or indirectly by freeing an element of the
64 array could cause the reference count of the array itself to be
65 reduced (e.g. by deleting an entry in the symbol table). So it
66 is a possibility that the AV could have been freed (or even
67 reallocated) on return from the call unless you hold a
68 reference to it.
69
70 void av_clear(AV *av)
71
72 av_create_and_push
73 NOTE: this function is experimental and may change or be
74 removed without notice.
75
76 Push an SV onto the end of the array, creating the array if
77 necessary. A small internal helper function to remove a
78 commonly duplicated idiom.
79
80 void av_create_and_push(AV **const avp,
81 SV *const val)
82
83 av_create_and_unshift_one
84 NOTE: this function is experimental and may change or be
85 removed without notice.
86
87 Unshifts an SV onto the beginning of the array, creating the
88 array if necessary. A small internal helper function to remove
89 a commonly duplicated idiom.
90
91 SV** av_create_and_unshift_one(AV **const avp,
92 SV *const val)
93
94 av_delete
95 Deletes the element indexed by "key" from the array, makes the
96 element mortal, and returns it. If "flags" equals "G_DISCARD",
97 the element is freed and NULL is returned. NULL is also
98 returned if "key" is out of range.
99
100 Perl equivalent: "splice(@myarray, $key, 1, undef)" (with the
101 "splice" in void context if "G_DISCARD" is present).
102
103 SV* av_delete(AV *av, SSize_t key, I32 flags)
104
105 av_exists
106 Returns true if the element indexed by "key" has been
107 initialized.
108
109 This relies on the fact that uninitialized array elements are
110 set to "NULL".
111
112 Perl equivalent: "exists($myarray[$key])".
113
114 bool av_exists(AV *av, SSize_t key)
115
116 av_extend
117 Pre-extend an array so that it is capable of storing values at
118 indexes "0..key". Thus "av_extend(av,99)" guarantees that the
119 array can store 100 elements, i.e. that "av_store(av, 0, sv)"
120 through "av_store(av, 99, sv)" on a plain array will work
121 without any further memory allocation.
122
If the av argument is a tied array, then this will call the "EXTEND"
tied array method with an argument of "(key+1)".
125
126 void av_extend(AV *av, SSize_t key)
127
128 av_fetch
129 Returns the SV at the specified index in the array. The "key"
130 is the index. If lval is true, you are guaranteed to get a
131 real SV back (in case it wasn't real before), which you can
132 then modify. Check that the return value is non-null before
133 dereferencing it to a "SV*".
134
135 See "Understanding the Magic of Tied Hashes and Arrays" in
136 perlguts for more information on how to use this function on
137 tied arrays.
138
139 The rough perl equivalent is $myarray[$key].
140
141 SV** av_fetch(AV *av, SSize_t key, I32 lval)
142
143 AvFILL Same as "av_top_index()" or "av_tindex()".
144
145 int AvFILL(AV* av)
146
147 av_fill Set the highest index in the array to the given number,
148 equivalent to Perl's "$#array = $fill;".
149
150 The number of elements in the array will be "fill + 1" after
151 "av_fill()" returns. If the array was previously shorter, then
152 the additional elements appended are set to NULL. If the array
153 was longer, then the excess elements are freed.
154 "av_fill(av, -1)" is the same as "av_clear(av)".
155
156 void av_fill(AV *av, SSize_t fill)
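
The relationship between the highest index, the element count, and NULL-filled slots can be modelled with a toy array in plain C. This is a sketch under loud assumptions: "toy_av", "toy_count", "toy_exists" and "toy_fill" are hypothetical names, the structure is not Perl's AV, and unlike "av_fill" the toy does not own or free the elements it drops when shrinking.

```c
#include <stdlib.h>

/* Toy model only: a growable array of pointers. NOT Perl's AV. */
typedef struct {
    void  **slots;  /* element pointers; NULL means "never stored" */
    long    top;    /* highest index, -1 when the array is empty */
    size_t  cap;    /* number of allocated slots */
} toy_av;

/* Element count is always the highest index plus one. */
long toy_count(const toy_av *a) {
    return a->top + 1;
}

/* An element "exists" only if it is in range and its slot is non-NULL. */
int toy_exists(const toy_av *a, long key) {
    return key >= 0 && key <= a->top && a->slots[key] != NULL;
}

/* Set the highest index to fill. Newly exposed slots become NULL.
 * Returns 0 on success, -1 if memory could not be allocated. */
int toy_fill(toy_av *a, long fill) {
    if (fill < -1)
        fill = -1;
    if (fill + 1 > (long)a->cap) {
        size_t ncap = (size_t)(fill + 1);
        void **n = realloc(a->slots, ncap * sizeof *n);
        if (!n)
            return -1;
        a->slots = n;
        a->cap = ncap;
    }
    for (long i = a->top + 1; i <= fill; i++)
        a->slots[i] = NULL;
    a->top = fill;
    return 0;
}
```

In this model, filling an empty array to index 3 yields a count of 4 with no existing elements, and filling to -1 empties it again.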
157
158 av_len Same as "av_top_index". Note that, unlike what the name
159 implies, it returns the highest index in the array, so to get
160 the size of the array you need to use "av_len(av) + 1". This
161 is unlike "sv_len", which returns what you would expect.
162
163 SSize_t av_len(AV *av)
164
165 av_make Creates a new AV and populates it with a list of SVs. The SVs
166 are copied into the array, so they may be freed after the call
167 to "av_make". The new AV will have a reference count of 1.
168
169 Perl equivalent: "my @new_array = ($scalar1, $scalar2,
170 $scalar3...);"
171
172 AV* av_make(SSize_t size, SV **strp)
173
174 av_pop Removes one SV from the end of the array, reducing its size by
175 one and returning the SV (transferring control of one reference
176 count) to the caller. Returns &PL_sv_undef if the array is
177 empty.
178
179 Perl equivalent: "pop(@myarray);"
180
181 SV* av_pop(AV *av)
182
183 av_push Pushes an SV (transferring control of one reference count) onto
184 the end of the array. The array will grow automatically to
185 accommodate the addition.
186
187 Perl equivalent: "push @myarray, $val;".
188
189 void av_push(AV *av, SV *val)
190
191 av_shift
192 Removes one SV from the start of the array, reducing its size
193 by one and returning the SV (transferring control of one
194 reference count) to the caller. Returns &PL_sv_undef if the
195 array is empty.
196
197 Perl equivalent: "shift(@myarray);"
198
199 SV* av_shift(AV *av)
200
201 av_store
202 Stores an SV in an array. The array index is specified as
203 "key". The return value will be "NULL" if the operation failed
204 or if the value did not need to be actually stored within the
205 array (as in the case of tied arrays). Otherwise, it can be
dereferenced to get the "SV*" that was stored there (= "val").
207
208 Note that the caller is responsible for suitably incrementing
209 the reference count of "val" before the call, and decrementing
210 it if the function returned "NULL".
211
212 Approximate Perl equivalent: "splice(@myarray, $key, 1, $val)".
213
214 See "Understanding the Magic of Tied Hashes and Arrays" in
215 perlguts for more information on how to use this function on
216 tied arrays.
217
218 SV** av_store(AV *av, SSize_t key, SV *val)
219
220 av_tindex
221 Same as "av_top_index()".
222
223 int av_tindex(AV* av)
224
225 av_top_index
226 Returns the highest index in the array. The number of elements
227 in the array is "av_top_index(av) + 1". Returns -1 if the
228 array is empty.
229
230 The Perl equivalent for this is $#myarray.
231
232 (A slightly shorter form is "av_tindex".)
233
234 SSize_t av_top_index(AV *av)
235
236 av_undef
237 Undefines the array. The XS equivalent of "undef(@array)".
238
239 As well as freeing all the elements of the array (like
240 "av_clear()"), this also frees the memory used by the av to
241 store its list of scalars.
242
243 See "av_clear" for a note about the array possibly being
244 invalid on return.
245
246 void av_undef(AV *av)
247
248 av_unshift
249 Unshift the given number of "undef" values onto the beginning
250 of the array. The array will grow automatically to accommodate
251 the addition.
252
253 Perl equivalent: "unshift @myarray, ((undef) x $num);"
254
255 void av_unshift(AV *av, SSize_t num)
256
257 get_av Returns the AV of the specified Perl global or package array
258 with the given name (so it won't work on lexical variables).
259 "flags" are passed to "gv_fetchpv". If "GV_ADD" is set and the
260 Perl variable does not exist then it will be created. If
261 "flags" is zero and the variable does not exist then NULL is
262 returned.
263
264 Perl equivalent: "@{"$name"}".
265
266 NOTE: the perl_ form of this function is deprecated.
267
268 AV* get_av(const char *name, I32 flags)
269
270 newAV Creates a new AV. The reference count is set to 1.
271
272 Perl equivalent: "my @array;".
273
274 AV* newAV()
275
276 sortsv In-place sort an array of SV pointers with the given comparison
277 routine.
278
279 Currently this always uses mergesort. See "sortsv_flags" for a
280 more flexible routine.
281
282 void sortsv(SV** array, size_t num_elts,
283 SVCOMPARE_t cmp)
284
286 call_argv
287 Performs a callback to the specified named and package-scoped
288 Perl subroutine with "argv" (a "NULL"-terminated array of
289 strings) as arguments. See perlcall.
290
291 Approximate Perl equivalent: "&{"$sub_name"}(@$argv)".
292
293 NOTE: the perl_ form of this function is deprecated.
294
295 I32 call_argv(const char* sub_name, I32 flags,
296 char** argv)
297
298 call_method
299 Performs a callback to the specified Perl method. The blessed
300 object must be on the stack. See perlcall.
301
302 NOTE: the perl_ form of this function is deprecated.
303
304 I32 call_method(const char* methname, I32 flags)
305
306 call_pv Performs a callback to the specified Perl sub. See perlcall.
307
308 NOTE: the perl_ form of this function is deprecated.
309
310 I32 call_pv(const char* sub_name, I32 flags)
311
312 call_sv Performs a callback to the Perl sub specified by the SV.
313
314 If neither the "G_METHOD" nor "G_METHOD_NAMED" flag is
315 supplied, the SV may be any of a CV, a GV, a reference to a CV,
316 a reference to a GV or "SvPV(sv)" will be used as the name of
317 the sub to call.
318
319 If the "G_METHOD" flag is supplied, the SV may be a reference
320 to a CV or "SvPV(sv)" will be used as the name of the method to
321 call.
322
323 If the "G_METHOD_NAMED" flag is supplied, "SvPV(sv)" will be
324 used as the name of the method to call.
325
326 Some other values are treated specially for internal use and
327 should not be depended on.
328
329 See perlcall.
330
331 NOTE: the perl_ form of this function is deprecated.
332
333 I32 call_sv(SV* sv, volatile I32 flags)
334
335 ENTER Opening bracket on a callback. See "LEAVE" and perlcall.
336
337 ENTER;
338
339 ENTER_with_name(name)
340 Same as "ENTER", but when debugging is enabled it also
341 associates the given literal string with the new scope.
342
343 ENTER_with_name(name);
344
345 eval_pv Tells Perl to "eval" the given string in scalar context and
346 return an SV* result.
347
348 NOTE: the perl_ form of this function is deprecated.
349
350 SV* eval_pv(const char* p, I32 croak_on_error)
351
352 eval_sv Tells Perl to "eval" the string in the SV. It supports the
353 same flags as "call_sv", with the obvious exception of
354 "G_EVAL". See perlcall.
355
356 NOTE: the perl_ form of this function is deprecated.
357
358 I32 eval_sv(SV* sv, I32 flags)
359
360 FREETMPS
361 Closing bracket for temporaries on a callback. See "SAVETMPS"
362 and perlcall.
363
364 FREETMPS;
365
366 LEAVE Closing bracket on a callback. See "ENTER" and perlcall.
367
368 LEAVE;
369
370 LEAVE_with_name(name)
371 Same as "LEAVE", but when debugging is enabled it first checks
372 that the scope has the given name. "name" must be a literal
373 string.
374
375 LEAVE_with_name(name);
376
377 SAVETMPS
378 Opening bracket for temporaries on a callback. See "FREETMPS"
379 and perlcall.
380
381 SAVETMPS;
382
Perl uses "full" Unicode case mappings. This means that converting a
single character to another case may result in a sequence of more than
one character. For example, the uppercase of "ß" (LATIN SMALL LETTER
SHARP S) is the two character sequence "SS". This presents some
complications. The lowercase of all characters in the range 0..255 is
a single character, and thus "toLOWER_L1" is furnished. But
"toUPPER_L1" can't exist, as it couldn't return a valid result for all
legal inputs. Instead, "toUPPER_uvchr" has an API that does allow every
possible legal result to be returned. Likewise, no other function that
is crippled by not being able to give the correct results for the full
range of possible inputs has been implemented here.
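
To make the problem with a byte-in, byte-out uppercase function concrete, here is a small standalone sketch in plain C. It is not how Perl implements "toUPPER_uvchr"; "upper_latin1_sketch" is a hypothetical helper that assumes an ASCII platform and returns code points rather than UTF-8 bytes.

```c
#include <stddef.h>

/* Hypothetical helper, NOT part of the Perl API. Writes the full uppercase
 * mapping of a Latin-1 code point (0..255, ASCII platform assumed) into
 * out[] as code points and returns how many were written (1 or 2). */
size_t upper_latin1_sketch(unsigned cp, unsigned out[2]) {
    if (cp == 0xDF) {                 /* LATIN SMALL LETTER SHARP S -> "SS" */
        out[0] = 'S';
        out[1] = 'S';
        return 2;
    }
    if (cp == 0xFF)
        out[0] = 0x178;               /* result lies above 255 */
    else if (cp == 0xB5)
        out[0] = 0x39C;               /* MICRO SIGN -> GREEK CAPITAL LETTER MU */
    else if (cp >= 'a' && cp <= 'z')
        out[0] = cp - 0x20;
    else if (cp >= 0xE0 && cp <= 0xFE && cp != 0xF7)
        out[0] = cp - 0x20;           /* accented letters; 0xF7 is DIVISION SIGN */
    else
        out[0] = cp;                  /* no case change */
    return 1;
}
```

Three inputs (0xB5, 0xDF and 0xFF) produce results that do not fit in a single byte, which is why a single "U8" return value cannot cover the whole Latin-1 range.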
395
396 toFOLD Converts the specified character to foldcase. If the input is
397 anything but an ASCII uppercase character, that input character
itself is returned. Variant "toFOLD_A" is equivalent. (There
is no equivalent "toFOLD_L1" for the full Latin1 range, as the
full generality of "toFOLD_uvchr" is needed there.)
401
402 U8 toFOLD(U8 ch)
403
404 toFOLD_utf8
This is like "toFOLD_utf8_safe", but doesn't have the "e"
parameter. The function therefore can't check if it is reading
407 beyond the end of the string. Starting in Perl v5.32, it will
408 take the "e" parameter, becoming a synonym for
409 "toFOLD_utf8_safe". At that time every program that uses it
410 will have to be changed to successfully compile. In the
411 meantime, the first runtime call to "toFOLD_utf8" from each
412 call point in the program will raise a deprecation warning,
413 enabled by default. You can convert your program now to use
414 "toFOLD_utf8_safe", and avoid the warnings, and get an extra
415 measure of protection, or you can wait until v5.32, when you'll
416 be forced to add the "e" parameter.
417
418 UV toFOLD_utf8(U8* p, U8* s, STRLEN* lenp)
419
420 toFOLD_utf8_safe
421 Converts the first UTF-8 encoded character in the sequence
422 starting at "p" and extending no further than "e - 1" to its
423 foldcase version, and stores that in UTF-8 in "s", and its
424 length in bytes in "lenp". Note that the buffer pointed to by
425 "s" needs to be at least "UTF8_MAXBYTES_CASE+1" bytes since the
426 foldcase version may be longer than the original character.
427
428 The first code point of the foldcased version is returned (but
429 note, as explained at the top of this section, that there may
430 be more).
431
432 The suffix "_safe" in the function's name indicates that it
433 will not attempt to read beyond "e - 1", provided that the
constraint "p < e" is true (this is asserted in
435 "-DDEBUGGING" builds). If the UTF-8 for the input character is
436 malformed in some way, the program may croak, or the function
437 may return the REPLACEMENT CHARACTER, at the discretion of the
438 implementation, and subject to change in future releases.
439
440 UV toFOLD_utf8_safe(U8* p, U8* e, U8* s,
441 STRLEN* lenp)
442
443 toFOLD_uvchr
444 Converts the code point "cp" to its foldcase version, and
445 stores that in UTF-8 in "s", and its length in bytes in "lenp".
446 The code point is interpreted as native if less than 256;
447 otherwise as Unicode. Note that the buffer pointed to by "s"
448 needs to be at least "UTF8_MAXBYTES_CASE+1" bytes since the
449 foldcase version may be longer than the original character.
450
451 The first code point of the foldcased version is returned (but
452 note, as explained at the top of this section, that there may
453 be more).
454
455 UV toFOLD_uvchr(UV cp, U8* s, STRLEN* lenp)
456
457 toLOWER Converts the specified character to lowercase. If the input is
458 anything but an ASCII uppercase character, that input character
459 itself is returned. Variant "toLOWER_A" is equivalent.
460
461 U8 toLOWER(U8 ch)
462
463 toLOWER_L1
464 Converts the specified Latin1 character to lowercase. The
465 results are undefined if the input doesn't fit in a byte.
466
467 U8 toLOWER_L1(U8 ch)
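
The difference between the ASCII-only rule of "toLOWER" and the wider rule of "toLOWER_L1" can be sketched in standalone C. These are illustrative stand-ins, not Perl's macros: the names "lower_ascii_sketch" and "lower_latin1_sketch" are hypothetical, and the code assumes an ASCII platform where Latin-1 code points equal their byte values.

```c
/* Illustrative sketch only, NOT Perl's implementation. */

/* ASCII-only rule: anything that is not 'A'..'Z' is returned unchanged. */
unsigned char lower_ascii_sketch(unsigned char ch) {
    return (ch >= 'A' && ch <= 'Z') ? (unsigned char)(ch + 0x20) : ch;
}

/* Latin-1 rule: additionally maps 0xC0..0xDE to 0xE0..0xFE,
 * except 0xD7 (MULTIPLICATION SIGN), which has no case. */
unsigned char lower_latin1_sketch(unsigned char ch) {
    if (ch >= 0xC0 && ch <= 0xDE && ch != 0xD7)
        return (unsigned char)(ch + 0x20);
    return lower_ascii_sketch(ch);
}
```

Under these assumptions 0xC9 (LATIN CAPITAL LETTER E WITH ACUTE) is left alone by the ASCII rule but becomes 0xE9 under the Latin-1 rule, and every result still fits in one byte.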
468
469 toLOWER_LC
470 Converts the specified character to lowercase using the current
471 locale's rules, if possible; otherwise returns the input
472 character itself.
473
474 U8 toLOWER_LC(U8 ch)
475
476 toLOWER_utf8
This is like "toLOWER_utf8_safe", but doesn't have the "e"
parameter. The function therefore can't check if it is reading
479 beyond the end of the string. Starting in Perl v5.32, it will
480 take the "e" parameter, becoming a synonym for
481 "toLOWER_utf8_safe". At that time every program that uses it
482 will have to be changed to successfully compile. In the
483 meantime, the first runtime call to "toLOWER_utf8" from each
484 call point in the program will raise a deprecation warning,
485 enabled by default. You can convert your program now to use
486 "toLOWER_utf8_safe", and avoid the warnings, and get an extra
487 measure of protection, or you can wait until v5.32, when you'll
488 be forced to add the "e" parameter.
489
490 UV toLOWER_utf8(U8* p, U8* s, STRLEN* lenp)
491
492 toLOWER_utf8_safe
493 Converts the first UTF-8 encoded character in the sequence
494 starting at "p" and extending no further than "e - 1" to its
495 lowercase version, and stores that in UTF-8 in "s", and its
496 length in bytes in "lenp". Note that the buffer pointed to by
497 "s" needs to be at least "UTF8_MAXBYTES_CASE+1" bytes since the
498 lowercase version may be longer than the original character.
499
500 The first code point of the lowercased version is returned (but
501 note, as explained at the top of this section, that there may
502 be more).
503
504 The suffix "_safe" in the function's name indicates that it
505 will not attempt to read beyond "e - 1", provided that the
constraint "p < e" is true (this is asserted in
507 "-DDEBUGGING" builds). If the UTF-8 for the input character is
508 malformed in some way, the program may croak, or the function
509 may return the REPLACEMENT CHARACTER, at the discretion of the
510 implementation, and subject to change in future releases.
511
512 UV toLOWER_utf8_safe(U8* p, U8* e, U8* s,
513 STRLEN* lenp)
514
515 toLOWER_uvchr
516 Converts the code point "cp" to its lowercase version, and
517 stores that in UTF-8 in "s", and its length in bytes in "lenp".
518 The code point is interpreted as native if less than 256;
519 otherwise as Unicode. Note that the buffer pointed to by "s"
520 needs to be at least "UTF8_MAXBYTES_CASE+1" bytes since the
521 lowercase version may be longer than the original character.
522
523 The first code point of the lowercased version is returned (but
524 note, as explained at the top of this section, that there may
525 be more).
526
527 UV toLOWER_uvchr(UV cp, U8* s, STRLEN* lenp)
528
529 toTITLE Converts the specified character to titlecase. If the input is
530 anything but an ASCII lowercase character, that input character
531 itself is returned. Variant "toTITLE_A" is equivalent. (There
532 is no "toTITLE_L1" for the full Latin1 range, as the full
533 generality of "toTITLE_uvchr" is needed there. Titlecase is
534 not a concept used in locale handling, so there is no
535 functionality for that.)
536
537 U8 toTITLE(U8 ch)
538
539 toTITLE_utf8
This is like "toTITLE_utf8_safe", but doesn't have the "e"
parameter. The function therefore can't check if it is reading
542 beyond the end of the string. Starting in Perl v5.32, it will
543 take the "e" parameter, becoming a synonym for
544 "toTITLE_utf8_safe". At that time every program that uses it
545 will have to be changed to successfully compile. In the
546 meantime, the first runtime call to "toTITLE_utf8" from each
547 call point in the program will raise a deprecation warning,
548 enabled by default. You can convert your program now to use
549 "toTITLE_utf8_safe", and avoid the warnings, and get an extra
550 measure of protection, or you can wait until v5.32, when you'll
551 be forced to add the "e" parameter.
552
553 UV toTITLE_utf8(U8* p, U8* s, STRLEN* lenp)
554
555 toTITLE_utf8_safe
556 Converts the first UTF-8 encoded character in the sequence
557 starting at "p" and extending no further than "e - 1" to its
558 titlecase version, and stores that in UTF-8 in "s", and its
559 length in bytes in "lenp". Note that the buffer pointed to by
560 "s" needs to be at least "UTF8_MAXBYTES_CASE+1" bytes since the
561 titlecase version may be longer than the original character.
562
563 The first code point of the titlecased version is returned (but
564 note, as explained at the top of this section, that there may
565 be more).
566
567 The suffix "_safe" in the function's name indicates that it
568 will not attempt to read beyond "e - 1", provided that the
constraint "p < e" is true (this is asserted in
570 "-DDEBUGGING" builds). If the UTF-8 for the input character is
571 malformed in some way, the program may croak, or the function
572 may return the REPLACEMENT CHARACTER, at the discretion of the
573 implementation, and subject to change in future releases.
574
575 UV toTITLE_utf8_safe(U8* p, U8* e, U8* s,
576 STRLEN* lenp)
577
578 toTITLE_uvchr
579 Converts the code point "cp" to its titlecase version, and
580 stores that in UTF-8 in "s", and its length in bytes in "lenp".
581 The code point is interpreted as native if less than 256;
582 otherwise as Unicode. Note that the buffer pointed to by "s"
583 needs to be at least "UTF8_MAXBYTES_CASE+1" bytes since the
584 titlecase version may be longer than the original character.
585
586 The first code point of the titlecased version is returned (but
587 note, as explained at the top of this section, that there may
588 be more).
589
590 UV toTITLE_uvchr(UV cp, U8* s, STRLEN* lenp)
591
592 toUPPER Converts the specified character to uppercase. If the input is
593 anything but an ASCII lowercase character, that input character
594 itself is returned. Variant "toUPPER_A" is equivalent.
595
596 U8 toUPPER(U8 ch)
597
598 toUPPER_utf8
This is like "toUPPER_utf8_safe", but doesn't have the "e"
parameter. The function therefore can't check if it is reading
601 beyond the end of the string. Starting in Perl v5.32, it will
602 take the "e" parameter, becoming a synonym for
603 "toUPPER_utf8_safe". At that time every program that uses it
604 will have to be changed to successfully compile. In the
605 meantime, the first runtime call to "toUPPER_utf8" from each
606 call point in the program will raise a deprecation warning,
607 enabled by default. You can convert your program now to use
608 "toUPPER_utf8_safe", and avoid the warnings, and get an extra
609 measure of protection, or you can wait until v5.32, when you'll
610 be forced to add the "e" parameter.
611
612 UV toUPPER_utf8(U8* p, U8* s, STRLEN* lenp)
613
614 toUPPER_utf8_safe
615 Converts the first UTF-8 encoded character in the sequence
616 starting at "p" and extending no further than "e - 1" to its
617 uppercase version, and stores that in UTF-8 in "s", and its
618 length in bytes in "lenp". Note that the buffer pointed to by
619 "s" needs to be at least "UTF8_MAXBYTES_CASE+1" bytes since the
620 uppercase version may be longer than the original character.
621
622 The first code point of the uppercased version is returned (but
623 note, as explained at the top of this section, that there may
624 be more).
625
626 The suffix "_safe" in the function's name indicates that it
627 will not attempt to read beyond "e - 1", provided that the
constraint "p < e" is true (this is asserted in
629 "-DDEBUGGING" builds). If the UTF-8 for the input character is
630 malformed in some way, the program may croak, or the function
631 may return the REPLACEMENT CHARACTER, at the discretion of the
632 implementation, and subject to change in future releases.
633
634 UV toUPPER_utf8_safe(U8* p, U8* e, U8* s,
635 STRLEN* lenp)
636
637 toUPPER_uvchr
638 Converts the code point "cp" to its uppercase version, and
639 stores that in UTF-8 in "s", and its length in bytes in "lenp".
640 The code point is interpreted as native if less than 256;
641 otherwise as Unicode. Note that the buffer pointed to by "s"
642 needs to be at least "UTF8_MAXBYTES_CASE+1" bytes since the
643 uppercase version may be longer than the original character.
644
645 The first code point of the uppercased version is returned (but
646 note, as explained at the top of this section, that there may
be more).
648
649 UV toUPPER_uvchr(UV cp, U8* s, STRLEN* lenp)
650
652 This section is about functions (really macros) that classify
653 characters into types, such as punctuation versus alphabetic, etc.
654 Most of these are analogous to regular expression character classes.
655 (See "POSIX Character Classes" in perlrecharclass.) There are several
656 variants for each class. (Not all macros have all variants; each item
657 below lists the ones valid for it.) None are affected by "use bytes",
658 and only the ones with "LC" in the name are affected by the current
659 locale.
660
661 The base function, e.g., "isALPHA()", takes an octet (either a "char"
662 or a "U8") as input and returns a boolean as to whether or not the
663 character represented by that octet is (or on non-ASCII platforms,
664 corresponds to) an ASCII character in the named class based on
665 platform, Unicode, and Perl rules. If the input is a number that
666 doesn't fit in an octet, FALSE is returned.
667
668 Variant "isFOO_A" (e.g., "isALPHA_A()") is identical to the base
669 function with no suffix "_A". This variant is used to emphasize by its
670 name that only ASCII-range characters can return TRUE.
671
672 Variant "isFOO_L1" imposes the Latin-1 (or EBCDIC equivalent) character
673 set onto the platform. That is, the code points that are ASCII are
674 unaffected, since ASCII is a subset of Latin-1. But the non-ASCII code
675 points are treated as if they are Latin-1 characters. For example,
676 "isWORDCHAR_L1()" will return true when called with the code point
677 0xDF, which is a word character in both ASCII and EBCDIC (though it
678 represents different characters in each).
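
The contrast between the base (or "_A") variant and the "_L1" variant can be sketched in standalone C for the alphabetic class. The functions below are hypothetical stand-ins, not Perl's macros; they assume an ASCII platform and hard-code the Latin-1 alphabetic code points.

```c
/* Illustrative sketch only, NOT Perl's implementation. */

/* ASCII-only rule: non-ASCII input, including anything that does not
 * fit in an octet, yields 0 (false). */
int is_alpha_a_sketch(unsigned long c) {
    return (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z');
}

/* Latin-1 rule: also accepts the alphabetic code points in 0x80..0xFF.
 * 0xD7 (MULTIPLICATION SIGN) and 0xF7 (DIVISION SIGN) are excluded. */
int is_alpha_l1_sketch(unsigned long c) {
    if (c > 0xFF)
        return 0;                     /* does not fit in an octet */
    if (is_alpha_a_sketch(c))
        return 1;
    return c == 0xAA || c == 0xB5 || c == 0xBA
        || (c >= 0xC0 && c <= 0xD6)
        || (c >= 0xD8 && c <= 0xF6)
        || c >= 0xF8;                 /* at most 0xFF because of the check above */
}
```

With these definitions, code point 0xDF is rejected by the ASCII rule and accepted by the Latin-1 rule.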
679
680 Variant "isFOO_uvchr" is like the "isFOO_L1" variant, but accepts any
681 UV code point as input. If the code point is larger than 255, Unicode
682 rules are used to determine if it is in the character class. For
683 example, "isWORDCHAR_uvchr(0x100)" returns TRUE, since 0x100 is LATIN
684 CAPITAL LETTER A WITH MACRON in Unicode, and is a word character.
685
686 Variant "isFOO_utf8_safe" is like "isFOO_uvchr", but is used for UTF-8
687 encoded strings. Each call classifies one character, even if the
688 string contains many. This variant takes two parameters. The first,
689 "p", is a pointer to the first byte of the character to be classified.
690 (Recall that it may take more than one byte to represent a character in
691 UTF-8 strings.) The second parameter, "e", points to anywhere in the
692 string beyond the first character, up to one byte past the end of the
693 entire string. The suffix "_safe" in the function's name indicates
694 that it will not attempt to read beyond "e - 1", provided that the
constraint "p < e" is true (this is asserted in "-DDEBUGGING"
696 builds). If the UTF-8 for the input character is malformed in some
697 way, the program may croak, or the function may return FALSE, at the
698 discretion of the implementation, and subject to change in future
699 releases.
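
The role of the "e" parameter can be illustrated with a bounds-checked length probe in standalone C. This is a sketch, not Perl's code: "utf8_seq_len_safe_sketch" is a hypothetical name, it handles only standard UTF-8 sequences of one to four bytes (Perl's own encoding allows longer ones), and it does not reject overlong encodings.

```c
#include <stddef.h>

/* Hypothetical helper, NOT part of the Perl API. Returns the number of
 * bytes in the UTF-8 sequence starting at p, or 0 if the lead byte is
 * invalid, a continuation byte is malformed, or the sequence would
 * extend beyond e - 1. Never reads at or past e. */
size_t utf8_seq_len_safe_sketch(const unsigned char *p,
                                const unsigned char *e) {
    size_t need;
    if (p >= e)
        return 0;                          /* nothing readable */
    if (*p < 0x80)
        need = 1;
    else if ((*p & 0xE0) == 0xC0)
        need = 2;
    else if ((*p & 0xF0) == 0xE0)
        need = 3;
    else if ((*p & 0xF8) == 0xF0)
        need = 4;
    else
        return 0;                          /* stray continuation or bad lead byte */
    if ((size_t)(e - p) < need)
        return 0;                          /* sequence is cut off by e */
    for (size_t i = 1; i < need; i++)
        if ((p[i] & 0xC0) != 0x80)
            return 0;                      /* not a continuation byte */
    return need;
}
```

Without "e", a truncated two-byte sequence at the very end of a buffer would force the function to read a byte it has no right to touch; with "e", the probe can report failure instead.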
700
701 Variant "isFOO_utf8" is like "isFOO_utf8_safe", but takes just a single
702 parameter, "p", which has the same meaning as the corresponding
703 parameter does in "isFOO_utf8_safe". The function therefore can't
704 check if it is reading beyond the end of the string. Starting in Perl
705 v5.32, it will take a second parameter, becoming a synonym for
706 "isFOO_utf8_safe". At that time every program that uses it will have
707 to be changed to successfully compile. In the meantime, the first
708 runtime call to "isFOO_utf8" from each call point in the program will
709 raise a deprecation warning, enabled by default. You can convert your
710 program now to use "isFOO_utf8_safe", and avoid the warnings, and get
711 an extra measure of protection, or you can wait until v5.32, when
712 you'll be forced to add the "e" parameter.
713
714 Variant "isFOO_LC" is like the "isFOO_A" and "isFOO_L1" variants, but
715 the result is based on the current locale, which is what "LC" in the
716 name stands for. If Perl can determine that the current locale is a
717 UTF-8 locale, it uses the published Unicode rules; otherwise, it uses
718 the C library function that gives the named classification. For
719 example, "isDIGIT_LC()" when not in a UTF-8 locale returns the result
720 of calling "isdigit()". FALSE is always returned if the input won't
721 fit into an octet. On some platforms where the C library function is
722 known to be defective, Perl changes its result to follow the POSIX
723 standard's rules.
724
725 Variant "isFOO_LC_uvchr" is like "isFOO_LC", but is defined on any UV.
726 It returns the same as "isFOO_LC" for input code points less than 256,
727 and returns the hard-coded, not-affected-by-locale, Unicode results for
728 larger ones.
729
730 Variant "isFOO_LC_utf8_safe" is like "isFOO_LC_uvchr", but is used for
731 UTF-8 encoded strings. Each call classifies one character, even if the
732 string contains many. This variant takes two parameters. The first,
733 "p", is a pointer to the first byte of the character to be classified.
734 (Recall that it may take more than one byte to represent a character in
735 UTF-8 strings.) The second parameter, "e", points to anywhere in the
736 string beyond the first character, up to one byte past the end of the
737 entire string. The suffix "_safe" in the function's name indicates
738 that it will not attempt to read beyond "e - 1", provided that the
constraint "p < e" is true (this is asserted in "-DDEBUGGING"
740 builds). If the UTF-8 for the input character is malformed in some
741 way, the program may croak, or the function may return FALSE, at the
742 discretion of the implementation, and subject to change in future
743 releases.
744
745 Variant "isFOO_LC_utf8" is like "isFOO_LC_utf8_safe", but takes just a
746 single parameter, "p", which has the same meaning as the corresponding
747 parameter does in "isFOO_LC_utf8_safe". The function therefore can't
748 check if it is reading beyond the end of the string. Starting in Perl
749 v5.32, it will take a second parameter, becoming a synonym for
750 "isFOO_LC_utf8_safe". At that time every program that uses it will
751 have to be changed to successfully compile. In the meantime, the first
752 runtime call to "isFOO_LC_utf8" from each call point in the program
753 will raise a deprecation warning, enabled by default. You can convert
754 your program now to use "isFOO_LC_utf8_safe", and avoid the warnings,
755 and get an extra measure of protection, or you can wait until v5.32,
756 when you'll be forced to add the "e" parameter.
757
758 isALPHA Returns a boolean indicating whether the specified character is
759 an alphabetic character, analogous to "m/[[:alpha:]]/". See
760 the top of this section for an explanation of variants
761 "isALPHA_A", "isALPHA_L1", "isALPHA_uvchr",
762 "isALPHA_utf8_safe", "isALPHA_LC", "isALPHA_LC_uvchr", and
763 "isALPHA_LC_utf8_safe".
764
765 bool isALPHA(char ch)
766
767 isALPHANUMERIC
768 Returns a boolean indicating whether the specified character is
769 either an alphabetic character or a decimal digit, analogous to
770 "m/[[:alnum:]]/". See the top of this section for an
771 explanation of variants "isALPHANUMERIC_A",
772 "isALPHANUMERIC_L1", "isALPHANUMERIC_uvchr",
773 "isALPHANUMERIC_utf8_safe", "isALPHANUMERIC_LC",
774 "isALPHANUMERIC_LC_uvchr", and "isALPHANUMERIC_LC_utf8_safe".
775
776 bool isALPHANUMERIC(char ch)
777
778 isASCII Returns a boolean indicating whether the specified character is
779 one of the 128 characters in the ASCII character set, analogous
780 to "m/[[:ascii:]]/". On non-ASCII platforms, it returns TRUE
781 iff this character corresponds to an ASCII character. Variants
782 "isASCII_A()" and "isASCII_L1()" are identical to "isASCII()".
783 See the top of this section for an explanation of variants
784 "isASCII_uvchr", "isASCII_utf8_safe", "isASCII_LC",
785 "isASCII_LC_uvchr", and "isASCII_LC_utf8_safe". Note, however,
786 that some platforms do not have the C library routine
787 "isascii()". In these cases, the variants whose names contain
788 "LC" are the same as the corresponding ones without.
789
790 Also note that, because all ASCII characters are UTF-8
791 invariant (meaning they have the exact same representation
792 (always a single byte) whether encoded in UTF-8 or not),
793 "isASCII" will give the correct results when called with any
794 byte in any string encoded or not in UTF-8. And similarly
795 "isASCII_utf8_safe" will work properly on any string encoded or
796 not in UTF-8.
797
798 bool isASCII(char ch)
799
800 isBLANK Returns a boolean indicating whether the specified character is
801 a character considered to be a blank, analogous to
802 "m/[[:blank:]]/". See the top of this section for an
803 explanation of variants "isBLANK_A", "isBLANK_L1",
804 "isBLANK_uvchr", "isBLANK_utf8_safe", "isBLANK_LC",
805 "isBLANK_LC_uvchr", and "isBLANK_LC_utf8_safe". Note, however,
806 that some platforms do not have the C library routine
807 "isblank()". In these cases, the variants whose names contain
808 "LC" are the same as the corresponding ones without.
809
810 bool isBLANK(char ch)
811
812 isCNTRL Returns a boolean indicating whether the specified character is
813 a control character, analogous to "m/[[:cntrl:]]/". See the
814 top of this section for an explanation of variants "isCNTRL_A",
815 "isCNTRL_L1", "isCNTRL_uvchr", "isCNTRL_utf8_safe",
816 "isCNTRL_LC", "isCNTRL_LC_uvchr", and "isCNTRL_LC_utf8_safe". On
817 EBCDIC platforms, you almost always want to use the
818 "isCNTRL_L1" variant.
819
820 bool isCNTRL(char ch)
821
822 isDIGIT Returns a boolean indicating whether the specified character is
823 a digit, analogous to "m/[[:digit:]]/". Variants "isDIGIT_A"
824 and "isDIGIT_L1" are identical to "isDIGIT". See the top of
825 this section for an explanation of variants "isDIGIT_uvchr",
826 "isDIGIT_utf8_safe", "isDIGIT_LC", "isDIGIT_LC_uvchr", and
827 "isDIGIT_LC_utf8_safe".
828
829 bool isDIGIT(char ch)
830
831 isGRAPH Returns a boolean indicating whether the specified character is
832 a graphic character, analogous to "m/[[:graph:]]/". See the
833 top of this section for an explanation of variants "isGRAPH_A",
834 "isGRAPH_L1", "isGRAPH_uvchr", "isGRAPH_utf8_safe",
835 "isGRAPH_LC", "isGRAPH_LC_uvchr", and "isGRAPH_LC_utf8_safe".
836
837 bool isGRAPH(char ch)
838
839 isIDCONT
840 Returns a boolean indicating whether the specified character
841 can be the second or succeeding character of an identifier.
842 This is very close to, but not quite the same as the official
843 Unicode property "XID_Continue". The difference is that this
844 returns true only if the input character also matches
845 "isWORDCHAR". See the top of this section for an explanation
846 of variants "isIDCONT_A", "isIDCONT_L1", "isIDCONT_uvchr",
847 "isIDCONT_utf8_safe", "isIDCONT_LC", "isIDCONT_LC_uvchr", and
848 "isIDCONT_LC_utf8_safe".
849
850 bool isIDCONT(char ch)
851
852 isIDFIRST
853 Returns a boolean indicating whether the specified character
854 can be the first character of an identifier. This is very
855 close to, but not quite the same as the official Unicode
856 property "XID_Start". The difference is that this returns true
857 only if the input character also matches "isWORDCHAR". See the
858 top of this section for an explanation of variants
859 "isIDFIRST_A", "isIDFIRST_L1", "isIDFIRST_uvchr",
860 "isIDFIRST_utf8_safe", "isIDFIRST_LC", "isIDFIRST_LC_uvchr",
861 and "isIDFIRST_LC_utf8_safe".
862
863 bool isIDFIRST(char ch)
864
865 isLOWER Returns a boolean indicating whether the specified character is
866 a lowercase character, analogous to "m/[[:lower:]]/". See the
867 top of this section for an explanation of variants "isLOWER_A",
868 "isLOWER_L1", "isLOWER_uvchr", "isLOWER_utf8_safe",
869 "isLOWER_LC", "isLOWER_LC_uvchr", and "isLOWER_LC_utf8_safe".
870
871 bool isLOWER(char ch)
872
873 isOCTAL Returns a boolean indicating whether the specified character is
874 an octal digit, [0-7]. The only two variants are "isOCTAL_A"
875 and "isOCTAL_L1"; each is identical to "isOCTAL".
876
877 bool isOCTAL(char ch)
878
879 isPRINT Returns a boolean indicating whether the specified character is
880 a printable character, analogous to "m/[[:print:]]/". See the
881 top of this section for an explanation of variants "isPRINT_A",
882 "isPRINT_L1", "isPRINT_uvchr", "isPRINT_utf8_safe",
883 "isPRINT_LC", "isPRINT_LC_uvchr", and "isPRINT_LC_utf8_safe".
884
885 bool isPRINT(char ch)
886
887 isPSXSPC
888 (short for Posix Space) Starting in 5.18, this is identical in
889 all its forms to the corresponding "isSPACE()" macros. The
890 locale forms of this macro are identical to their corresponding
891 "isSPACE()" forms in all Perl releases. In releases prior to
892 5.18, the non-locale forms differ from their "isSPACE()" forms
893 only in that the "isSPACE()" forms don't match a Vertical Tab,
894 and the "isPSXSPC()" forms do. Otherwise they are identical.
895 Thus this macro is analogous to what "m/[[:space:]]/" matches
896 in a regular expression. See the top of this section for an
897 explanation of variants "isPSXSPC_A", "isPSXSPC_L1",
898 "isPSXSPC_uvchr", "isPSXSPC_utf8_safe", "isPSXSPC_LC",
899 "isPSXSPC_LC_uvchr", and "isPSXSPC_LC_utf8_safe".
900
901 bool isPSXSPC(char ch)
902
903 isPUNCT Returns a boolean indicating whether the specified character is
904 a punctuation character, analogous to "m/[[:punct:]]/". Note
905 that the definition of what is punctuation isn't as
906 straightforward as one might desire. See "POSIX Character
907 Classes" in perlrecharclass for details. See the top of this
908 section for an explanation of variants "isPUNCT_A",
909 "isPUNCT_L1", "isPUNCT_uvchr", "isPUNCT_utf8_safe",
910 "isPUNCT_LC", "isPUNCT_LC_uvchr", and "isPUNCT_LC_utf8_safe".
911
912 bool isPUNCT(char ch)
913
914 isSPACE Returns a boolean indicating whether the specified character is
915 a whitespace character. This is analogous to what "m/\s/"
916 matches in a regular expression. Starting in Perl 5.18 this
917 also matches what "m/[[:space:]]/" does. Prior to 5.18, only
918 the locale forms of this macro (the ones with "LC" in their
919 names) matched precisely what "m/[[:space:]]/" does. In those
920 releases, the only difference, in the non-locale variants, was
921 that "isSPACE()" did not match a vertical tab. (See "isPSXSPC"
922 for a macro that matches a vertical tab in all releases.) See
923 the top of this section for an explanation of variants
924 "isSPACE_A", "isSPACE_L1", "isSPACE_uvchr",
925 "isSPACE_utf8_safe", "isSPACE_LC", "isSPACE_LC_uvchr", and
926 "isSPACE_LC_utf8_safe".
927
928 bool isSPACE(char ch)
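The version difference described above comes down to one character. The following C sketch models it for the ASCII range only; the function names are hypothetical, an ASCII platform is assumed, and the real macros are not implemented this way.

```c
#include <stdbool.h>

/* ASCII-range whitespace as matched from Perl 5.18 on (and by the
 * isPSXSPC forms in all releases): space, \t, \n, \v, \f, \r. */
static bool sketch_isSPACE_A_518(unsigned char c)
{
    return c == ' '  || c == '\t' || c == '\n'
        || c == '\v' || c == '\f' || c == '\r';
}

/* Non-locale isSPACE before 5.18: the same set minus vertical tab. */
static bool sketch_isSPACE_A_pre518(unsigned char c)
{
    return c != '\v' && sketch_isSPACE_A_518(c);
}
```

Code that must behave identically on old and new releases can therefore use the "isPSXSPC" forms when a vertical tab should count as whitespace.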
929
930 isUPPER Returns a boolean indicating whether the specified character is
931 an uppercase character, analogous to "m/[[:upper:]]/". See the
932 top of this section for an explanation of variants "isUPPER_A",
933 "isUPPER_L1", "isUPPER_uvchr", "isUPPER_utf8_safe",
934 "isUPPER_LC", "isUPPER_LC_uvchr", and "isUPPER_LC_utf8_safe".
935
936 bool isUPPER(char ch)
937
938 isWORDCHAR
939 Returns a boolean indicating whether the specified character is
940 a word character, analogous to what "m/\w/"
941 and "m/[[:word:]]/" match in a regular expression. A word
942 character is an alphabetic character, a decimal digit, a
943 connecting punctuation character (such as an underscore), or a
944 "mark" character that attaches to one of those (like some sort
945 of accent). "isALNUM()" is a synonym provided for backward
946 compatibility, even though a word character includes more than
947 the standard C language meaning of alphanumeric. See the top
948 of this section for an explanation of variants "isWORDCHAR_A",
949 "isWORDCHAR_L1", "isWORDCHAR_uvchr", and
950 "isWORDCHAR_utf8_safe". "isWORDCHAR_LC",
951 "isWORDCHAR_LC_uvchr", and "isWORDCHAR_LC_utf8_safe" are also
952 as described there, but additionally include the platform's
953 native underscore.
954
955 bool isWORDCHAR(char ch)
956
957 isXDIGIT
958 Returns a boolean indicating whether the specified character is
959 a hexadecimal digit. In the ASCII range these are
960 "[0-9A-Fa-f]". Variants "isXDIGIT_A()" and "isXDIGIT_L1()" are
961 identical to "isXDIGIT()". See the top of this section for an
962 explanation of variants "isXDIGIT_uvchr", "isXDIGIT_utf8_safe",
963 "isXDIGIT_LC", "isXDIGIT_LC_uvchr", and
964 "isXDIGIT_LC_utf8_safe".
965
966 bool isXDIGIT(char ch)
967
969 perl_clone
970 Create and return a new interpreter by cloning the current one.
971
972 "perl_clone" takes these flags as parameters:
973
974 "CLONEf_COPY_STACKS" - copies the stacks as well. Without it,
975 only the data is cloned and the stacks are zeroed; with it, the
976 stacks are copied and the new perl interpreter is ready to run
977 at the exact same point as the previous one. The pseudo-fork
978 code uses "COPY_STACKS" while threads->create doesn't.
979
980 "CLONEf_KEEP_PTR_TABLE" - "perl_clone" keeps a ptr_table with
981 the pointer of the old variable as a key and the new variable
982 as a value. This allows it to check whether something has
983 already been cloned and, if so, to reuse the value and increase
984 its refcount rather than clone it again. If "KEEP_PTR_TABLE" is
985 not set then "perl_clone" will free the ptr_table using
986 "ptr_table_free(PL_ptr_table); PL_ptr_table = NULL;". A reason
987 to keep it around is if you want to dup some of your own
988 variables which are outside the graph perl scans; an example of
989 this code is in threads.xs create.
990
991 "CLONEf_CLONE_HOST" - This is a win32 thing; it is ignored on
992 unix. It tells perl's win32host code (which is C++) to clone
993 itself. This is needed on win32 if you want to run two threads
994 at the same time. If you just want to do some stuff in a
995 separate perl interpreter and then throw it away and return to
996 the original one, you don't need to do anything.
997
998 PerlInterpreter* perl_clone(
999 PerlInterpreter *proto_perl,
1000 UV flags
1001 )
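The role of the pointer table kept under "CLONEf_KEEP_PTR_TABLE" can be pictured with a small stand-alone C model. Everything here is hypothetical ("sketch_val", "sketch_ptr_table", "sketch_dup"); it uses none of Perl's data structures and only shows the idea of mapping old pointers to their clones so that a shared value is cloned once and then reference-counted.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical reference-counted value. */
typedef struct sketch_val {
    int refcnt;
    int payload;
} sketch_val;

/* Fixed-size map from old pointer to new pointer. */
#define SKETCH_TABLE_MAX 64
typedef struct sketch_ptr_table {
    const sketch_val *old_ptr[SKETCH_TABLE_MAX];
    sketch_val       *new_ptr[SKETCH_TABLE_MAX];
    size_t            used;
} sketch_ptr_table;

/* Return the clone already recorded for old, or NULL if none. */
static sketch_val *sketch_table_fetch(const sketch_ptr_table *t,
                                      const sketch_val *old)
{
    size_t i;
    for (i = 0; i < t->used; i++)
        if (t->old_ptr[i] == old)
            return t->new_ptr[i];
    return NULL;
}

/* Clone old the first time it is seen; on later requests reuse the
 * clone and bump its reference count.  Returns NULL if the table is
 * full or allocation fails. */
static sketch_val *sketch_dup(sketch_ptr_table *t, const sketch_val *old)
{
    sketch_val *copy = sketch_table_fetch(t, old);
    if (copy) {
        copy->refcnt++;
        return copy;
    }
    if (t->used == SKETCH_TABLE_MAX)
        return NULL;
    copy = malloc(sizeof *copy);
    if (!copy)
        return NULL;
    copy->refcnt  = 1;
    copy->payload = old->payload;
    t->old_ptr[t->used] = old;
    t->new_ptr[t->used] = copy;
    t->used++;
    return copy;
}
```

Keeping the table alive after the clone lets later code dup further values of its own while still resolving anything already cloned to the same copy.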
1002
1004 BhkDISABLE
1005 NOTE: this function is experimental and may change or be
1006 removed without notice.
1007
1008 Temporarily disable an entry in this BHK structure, by clearing
1009 the appropriate flag. "which" is a preprocessor token
1010 indicating which entry to disable.
1011
1012 void BhkDISABLE(BHK *hk, which)
1013
1014 BhkENABLE
1015 NOTE: this function is experimental and may change or be
1016 removed without notice.
1017
1018 Re-enable an entry in this BHK structure, by setting the
1019 appropriate flag. "which" is a preprocessor token indicating
1020 which entry to enable. This will assert (under -DDEBUGGING) if
1021 the entry doesn't contain a valid pointer.
1022
1023 void BhkENABLE(BHK *hk, which)
1024
1025 BhkENTRY_set
1026 NOTE: this function is experimental and may change or be
1027 removed without notice.
1028
1029 Set an entry in the BHK structure, and set the flags to
1030 indicate it is valid. "which" is a preprocessing token
1031 indicating which entry to set. The type of "ptr" depends on
1032 the entry.
1033
1034 void BhkENTRY_set(BHK *hk, which, void *ptr)
1035
1036 blockhook_register
1037 NOTE: this function is experimental and may change or be
1038 removed without notice.
1039
1040 Register a set of hooks to be called when the Perl lexical
1041 scope changes at compile time. See "Compile-time scope hooks"
1042 in perlguts.
1043
1044 NOTE: this function must be explicitly called as
1045 Perl_blockhook_register with an aTHX_ parameter.
1046
1047 void Perl_blockhook_register(pTHX_ BHK *hk)
1048
1050 cophh_2hv
1051 NOTE: this function is experimental and may change or be
1052 removed without notice.
1053
1054 Generates and returns a standard Perl hash representing the
1055 full set of key/value pairs in the cop hints hash "cophh".
1056 "flags" is currently unused and must be zero.
1057
1058 HV * cophh_2hv(const COPHH *cophh, U32 flags)
1059
1060 cophh_copy
1061 NOTE: this function is experimental and may change or be
1062 removed without notice.
1063
1064 Make and return a complete copy of the cop hints hash "cophh".
1065
1066 COPHH * cophh_copy(COPHH *cophh)
1067
1068 cophh_delete_pv
1069 NOTE: this function is experimental and may change or be
1070 removed without notice.
1071
1072 Like "cophh_delete_pvn", but takes a nul-terminated string
1073 instead of a string/length pair.
1074
1075 COPHH * cophh_delete_pv(const COPHH *cophh,
1076 const char *key, U32 hash,
1077 U32 flags)
1078
1079 cophh_delete_pvn
1080 NOTE: this function is experimental and may change or be
1081 removed without notice.
1082
1083 Deletes a key and its associated value from the cop hints hash
1084 "cophh", and returns the modified hash. The returned hash
1085 pointer is in general not the same as the hash pointer that was
1086 passed in. The input hash is consumed by the function, and the
1087 pointer to it must not be subsequently used. Use "cophh_copy"
1088 if you need both hashes.
1089
1090 The key is specified by "keypv" and "keylen". If "flags" has
1091 the "COPHH_KEY_UTF8" bit set, the key octets are interpreted as
1092 UTF-8, otherwise they are interpreted as Latin-1. "hash" is a
1093 precomputed hash of the key string, or zero if it has not been
1094 precomputed.
1095
1096 COPHH * cophh_delete_pvn(COPHH *cophh,
1097 const char *keypv,
1098 STRLEN keylen, U32 hash,
1099 U32 flags)
1100
1101 cophh_delete_pvs
1102 NOTE: this function is experimental and may change or be
1103 removed without notice.
1104
1105 Like "cophh_delete_pvn", but takes a literal string instead of
1106 a string/length pair, and no precomputed hash.
1107
1108 COPHH * cophh_delete_pvs(const COPHH *cophh,
1109 "literal string" key,
1110 U32 flags)
1111
1112 cophh_delete_sv
1113 NOTE: this function is experimental and may change or be
1114 removed without notice.
1115
1116 Like "cophh_delete_pvn", but takes a Perl scalar instead of a
1117 string/length pair.
1118
1119 COPHH * cophh_delete_sv(const COPHH *cophh, SV *key,
1120 U32 hash, U32 flags)
1121
1122 cophh_fetch_pv
1123 NOTE: this function is experimental and may change or be
1124 removed without notice.
1125
1126 Like "cophh_fetch_pvn", but takes a nul-terminated string
1127 instead of a string/length pair.
1128
1129 SV * cophh_fetch_pv(const COPHH *cophh,
1130 const char *key, U32 hash,
1131 U32 flags)
1132
1133 cophh_fetch_pvn
1134 NOTE: this function is experimental and may change or be
1135 removed without notice.
1136
1137 Look up the entry in the cop hints hash "cophh" with the key
1138 specified by "keypv" and "keylen". If "flags" has the
1139 "COPHH_KEY_UTF8" bit set, the key octets are interpreted as
1140 UTF-8, otherwise they are interpreted as Latin-1. "hash" is a
1141 precomputed hash of the key string, or zero if it has not been
1142 precomputed. Returns a mortal scalar copy of the value
1143 associated with the key, or &PL_sv_placeholder if there is no
1144 value associated with the key.
1145
1146 SV * cophh_fetch_pvn(const COPHH *cophh,
1147 const char *keypv,
1148 STRLEN keylen, U32 hash,
1149 U32 flags)
1150
1151 cophh_fetch_pvs
1152 NOTE: this function is experimental and may change or be
1153 removed without notice.
1154
1155 Like "cophh_fetch_pvn", but takes a literal string instead of a
1156 string/length pair, and no precomputed hash.
1157
1158 SV * cophh_fetch_pvs(const COPHH *cophh,
1159 "literal string" key, U32 flags)
1160
1161 cophh_fetch_sv
1162 NOTE: this function is experimental and may change or be
1163 removed without notice.
1164
1165 Like "cophh_fetch_pvn", but takes a Perl scalar instead of a
1166 string/length pair.
1167
1168 SV * cophh_fetch_sv(const COPHH *cophh, SV *key,
1169 U32 hash, U32 flags)
1170
1171 cophh_free
1172 NOTE: this function is experimental and may change or be
1173 removed without notice.
1174
1175 Discard the cop hints hash "cophh", freeing all resources
1176 associated with it.
1177
1178 void cophh_free(COPHH *cophh)
1179
1180 cophh_new_empty
1181 NOTE: this function is experimental and may change or be
1182 removed without notice.
1183
1184 Generate and return a fresh cop hints hash containing no
1185 entries.
1186
1187 COPHH * cophh_new_empty()
1188
1189 cophh_store_pv
1190 NOTE: this function is experimental and may change or be
1191 removed without notice.
1192
1193 Like "cophh_store_pvn", but takes a nul-terminated string
1194 instead of a string/length pair.
1195
1196 COPHH * cophh_store_pv(const COPHH *cophh,
1197 const char *key, U32 hash,
1198 SV *value, U32 flags)
1199
1200 cophh_store_pvn
1201 NOTE: this function is experimental and may change or be
1202 removed without notice.
1203
1204 Stores a value, associated with a key, in the cop hints hash
1205 "cophh", and returns the modified hash. The returned hash
1206 pointer is in general not the same as the hash pointer that was
1207 passed in. The input hash is consumed by the function, and the
1208 pointer to it must not be subsequently used. Use "cophh_copy"
1209 if you need both hashes.
1210
1211 The key is specified by "keypv" and "keylen". If "flags" has
1212 the "COPHH_KEY_UTF8" bit set, the key octets are interpreted as
1213 UTF-8, otherwise they are interpreted as Latin-1. "hash" is a
1214 precomputed hash of the key string, or zero if it has not been
1215 precomputed.
1216
1217 "value" is the scalar value to store for this key. "value" is
1218 copied by this function, which thus does not take ownership of
1219 any reference to it, and later changes to the scalar will not
1220 be reflected in the value visible in the cop hints hash.
1221 Complex types of scalar will not be stored with referential
1222 integrity, but will be coerced to strings.
1223
1224 COPHH * cophh_store_pvn(COPHH *cophh, const char *keypv,
1225 STRLEN keylen, U32 hash,
1226 SV *value, U32 flags)
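Two points in the description above are easy to get wrong: the input hash is consumed, so the caller must continue with the returned pointer, and the value is copied at store time. The following plain C model shows the same calling convention. It is only an analogy under stated assumptions: "sketch_hints" and its functions are hypothetical, values are plain C strings, and nothing here reflects how a COPHH is actually represented.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical hints container: a chain of key/value string pairs. */
typedef struct sketch_hints {
    char *key;
    char *value;
    struct sketch_hints *next;
} sketch_hints;

static char *sketch_copy(const char *s)
{
    size_t n = strlen(s) + 1;
    char *c = malloc(n);
    if (c)
        memcpy(c, s, n);
    return c;
}

/* Consumes h and returns the container to use from now on.  The value
 * is copied, so later changes to the caller's buffer are not seen.
 * On allocation failure h is returned unchanged. */
static sketch_hints *sketch_store(sketch_hints *h, const char *key,
                                  const char *value)
{
    sketch_hints *n = malloc(sizeof *n);
    if (!n)
        return h;
    n->key   = sketch_copy(key);
    n->value = sketch_copy(value);
    n->next  = h;
    return n;
}

/* Most recent value stored for key, or NULL if there is none. */
static const char *sketch_fetch(const sketch_hints *h, const char *key)
{
    for (; h; h = h->next)
        if (h->key && strcmp(h->key, key) == 0)
            return h->value;
    return NULL;
}

static void sketch_free(sketch_hints *h)
{
    while (h) {
        sketch_hints *next = h->next;
        free(h->key);
        free(h->value);
        free(h);
        h = next;
    }
}
```

The idiom to take away is "h = sketch_store(h, key, value);": the old pointer is overwritten in the same statement, so it cannot be used by accident afterwards.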
1227
1228 cophh_store_pvs
1229 NOTE: this function is experimental and may change or be
1230 removed without notice.
1231
1232 Like "cophh_store_pvn", but takes a literal string instead of a
1233 string/length pair, and no precomputed hash.
1234
1235 COPHH * cophh_store_pvs(const COPHH *cophh,
1236 "literal string" key, SV *value,
1237 U32 flags)
1238
1239 cophh_store_sv
1240 NOTE: this function is experimental and may change or be
1241 removed without notice.
1242
1243 Like "cophh_store_pvn", but takes a Perl scalar instead of a
1244 string/length pair.
1245
1246 COPHH * cophh_store_sv(const COPHH *cophh, SV *key,
1247 U32 hash, SV *value, U32 flags)
1248
1250 cop_hints_2hv
1251 Generates and returns a standard Perl hash representing the
1252 full set of hint entries in the cop "cop". "flags" is
1253 currently unused and must be zero.
1254
1255 HV * cop_hints_2hv(const COP *cop, U32 flags)
1256
1257 cop_hints_fetch_pv
1258 Like "cop_hints_fetch_pvn", but takes a nul-terminated string
1259 instead of a string/length pair.
1260
1261 SV * cop_hints_fetch_pv(const COP *cop,
1262 const char *key, U32 hash,
1263 U32 flags)
1264
1265 cop_hints_fetch_pvn
1266 Look up the hint entry in the cop "cop" with the key specified
1267 by "keypv" and "keylen". If "flags" has the "COPHH_KEY_UTF8"
1268 bit set, the key octets are interpreted as UTF-8, otherwise
1269 they are interpreted as Latin-1. "hash" is a precomputed hash
1270 of the key string, or zero if it has not been precomputed.
1271 Returns a mortal scalar copy of the value associated with the
1272 key, or &PL_sv_placeholder if there is no value associated with
1273 the key.
1274
1275 SV * cop_hints_fetch_pvn(const COP *cop,
1276 const char *keypv,
1277 STRLEN keylen, U32 hash,
1278 U32 flags)
1279
1280 cop_hints_fetch_pvs
1281 Like "cop_hints_fetch_pvn", but takes a literal string instead
1282 of a string/length pair, and no precomputed hash.
1283
1284 SV * cop_hints_fetch_pvs(const COP *cop,
1285 "literal string" key,
1286 U32 flags)
1287
1288 cop_hints_fetch_sv
1289 Like "cop_hints_fetch_pvn", but takes a Perl scalar instead of
1290 a string/length pair.
1291
1292 SV * cop_hints_fetch_sv(const COP *cop, SV *key,
1293 U32 hash, U32 flags)
1294
1296 custom_op_register
1297 Register a custom op. See "Custom Operators" in perlguts.
1298
1299 NOTE: this function must be explicitly called as
1300 Perl_custom_op_register with an aTHX_ parameter.
1301
1302 void Perl_custom_op_register(pTHX_
1303 Perl_ppaddr_t ppaddr,
1304 const XOP *xop)
1305
1306 custom_op_xop
1307 Return the XOP structure for a given custom op. This macro
1308 should be considered internal to "OP_NAME" and the other access
1309 macros: use them instead. This macro does call a function.
1310 Prior to 5.19.6, this was implemented as a function.
1311
1312 NOTE: this function must be explicitly called as
1313 Perl_custom_op_xop with an aTHX_ parameter.
1314
1315 const XOP * Perl_custom_op_xop(pTHX_ const OP *o)
1316
1317 XopDISABLE
1318 Temporarily disable a member of the XOP, by clearing the
1319 appropriate flag.
1320
1321 void XopDISABLE(XOP *xop, which)
1322
1323 XopENABLE
1324 Reenable a member of the XOP which has been disabled.
1325
1326 void XopENABLE(XOP *xop, which)
1327
1328 XopENTRY
1329 Return a member of the XOP structure. "which" is a cpp token
1330 indicating which entry to return. If the member is not set
1331 this will return a default value. The return type depends on
1332 "which". This macro evaluates its arguments more than once.
1333 If you are using "Perl_custom_op_xop" to retrieve a "XOP *"
1334 from a "OP *", use the more efficient "XopENTRYCUSTOM" instead.
1335
1336 XopENTRY(XOP *xop, which)
1337
1338 XopENTRYCUSTOM
1339 Exactly like "XopENTRY(Perl_custom_op_xop(aTHX_ o),
1340 which)" but more efficient. The "which" parameter is identical
1341 to "XopENTRY".
1342
1343 XopENTRYCUSTOM(const OP *o, which)
1344
1345 XopENTRY_set
1346 Set a member of the XOP structure. "which" is a cpp token
1347 indicating which entry to set. See "Custom Operators" in
1348 perlguts for details about the available members and how they
1349 are used. This macro evaluates its argument more than once.
1350
1351 void XopENTRY_set(XOP *xop, which, value)
1352
1353 XopFLAGS
1354 Return the XOP's flags.
1355
1356 U32 XopFLAGS(XOP *xop)
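The way a "which" token selects both a structure member and its validity flag can be sketched with token pasting. The structure, flag values and defaults below are hypothetical and deliberately much smaller than the real XOP; the sketch only shows the general technique of guarding each member with a flag bit and falling back to a default when the bit is clear.

```c
#include <stddef.h>

/* Hypothetical two-member structure with one validity bit per member. */
typedef struct sketch_xop {
    unsigned    flags;
    const char *name;
    int         klass;
} sketch_xop;

#define SKETCHf_name   0x01u
#define SKETCHf_klass  0x02u

/* Defaults returned while a member is unset or disabled. */
#define SKETCHd_name   "(unknown)"
#define SKETCHd_klass  0

#define SketchFLAGS(x)  ((x)->flags)

/* Note that x appears twice: the argument is evaluated more than once. */
#define SketchENTRY(x, which) \
    ((SketchFLAGS(x) & SKETCHf_##which) ? (x)->which : SKETCHd_##which)

/* Store the value, then mark the member as valid. */
#define SketchENTRY_set(x, which, value) \
    ((x)->which = (value), (x)->flags |= SKETCHf_##which)

/* Disabling clears only the flag; the stored value stays in place, so
 * enabling later makes it visible again. */
#define SketchDISABLE(x, which)  ((x)->flags &= ~SKETCHf_##which)
#define SketchENABLE(x, which)   ((x)->flags |= SKETCHf_##which)
```

Because "which" is pasted into identifiers at preprocessing time, it has to be a bare token such as "klass", not a string or a variable.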
1357
1359 This section documents functions to manipulate CVs which are code-
1360 values, or subroutines. For more information, see perlguts.
1361
1362 caller_cx
1363 The XSUB-writer's equivalent of caller(). The returned
1364 "PERL_CONTEXT" structure can be interrogated to find all the
1365 information returned to Perl by "caller". Note that XSUBs
1366 don't get a stack frame, so "caller_cx(0, NULL)" will return
1367 information for the immediately-surrounding Perl code.
1368
1369 This function skips over the automatic calls to &DB::sub made
1370 on behalf of the debugger. If the stack frame requested
1371 was a sub called by "DB::sub", the return value will be the
1372 frame for the call to "DB::sub", since that has the correct
1373 line number/etc. for the call site. If dbcxp is non-"NULL", it
1374 will be set to a pointer to the frame for the sub call itself.
1375
1376 const PERL_CONTEXT * caller_cx(
1377 I32 level,
1378 const PERL_CONTEXT **dbcxp
1379 )
1380
1381 CvSTASH Returns the stash of the CV. A stash is the symbol table hash,
1382 containing the package-scoped variables in the package where
1383 the subroutine was defined. For more information, see
1384 perlguts.
1385
1386 This also has a special use with XS AUTOLOAD subs. See
1387 "Autoloading with XSUBs" in perlguts.
1388
1389 HV* CvSTASH(CV* cv)
1390
1391 find_runcv
1392 Locate the CV corresponding to the currently executing sub or
1393 eval. If "db_seqp" is non-null, skip CVs that are in the DB
1394 package and populate *db_seqp with the cop sequence number at
1395 the point that the DB:: code was entered. (This allows
1396 debuggers to eval in the scope of the breakpoint rather than in
1397 the scope of the debugger itself.)
1398
1399 CV* find_runcv(U32 *db_seqp)
1400
1401 get_cv Uses "strlen" to get the length of "name", then calls
1402 "get_cvn_flags".
1403
1404 NOTE: the perl_ form of this function is deprecated.
1405
1406 CV* get_cv(const char* name, I32 flags)
1407
1408 get_cvn_flags
1409 Returns the CV of the specified Perl subroutine. "flags" are
1410 passed to "gv_fetchpvn_flags". If "GV_ADD" is set and the Perl
1411 subroutine does not exist then it will be declared (which has
1412 the same effect as saying "sub name;"). If "GV_ADD" is not set
1413 and the subroutine does not exist then NULL is returned.
1414
1415 NOTE: the perl_ form of this function is deprecated.
1416
1417 CV* get_cvn_flags(const char* name, STRLEN len,
1418 I32 flags)
1419
1421 ax Variable which is set up by "xsubpp" to indicate the stack base
1422 offset, used by the "ST", "XSprePUSH" and "XSRETURN" macros.
1423 The "dMARK" macro must be called beforehand to set up the "MARK"
1424 variable.
1425
1426 I32 ax
1427
1428 CLASS Variable which is set up by "xsubpp" to indicate the class name
1429 for a C++ XS constructor. This is always a "char*". See
1430 "THIS".
1431
1432 char* CLASS
1433
1434 dAX Sets up the "ax" variable. This is usually handled
1435 automatically by "xsubpp" by calling "dXSARGS".
1436
1437 dAX;
1438
1439 dAXMARK Sets up the "ax" variable and stack marker variable "mark".
1440 This is usually handled automatically by "xsubpp" by calling
1441 "dXSARGS".
1442
1443 dAXMARK;
1444
1445 dITEMS Sets up the "items" variable. This is usually handled
1446 automatically by "xsubpp" by calling "dXSARGS".
1447
1448 dITEMS;
1449
1450 dUNDERBAR
1451 Sets up any variable needed by the "UNDERBAR" macro. It used
1452 to define "padoff_du", but it is currently a noop. However, it
1453 is strongly advised to still use it for ensuring past and
1454 future compatibility.
1455
1456 dUNDERBAR;
1457
1458 dXSARGS Sets up stack and mark pointers for an XSUB, calling "dSP" and
1459 "dMARK". Sets up the "ax" and "items" variables by calling
1460 "dAX" and "dITEMS". This is usually handled automatically by
1461 "xsubpp".
1462
1463 dXSARGS;
1464
1465 dXSI32 Sets up the "ix" variable for an XSUB which has aliases. This
1466 is usually handled automatically by "xsubpp".
1467
1468 dXSI32;
1469
1470 items Variable which is set up by "xsubpp" to indicate the number of
1471 items on the stack. See "Variable-length Parameter Lists" in
1472 perlxs.
1473
1474 I32 items
1475
1476 ix Variable which is set up by "xsubpp" to indicate which of an
1477 XSUB's aliases was used to invoke it. See "The ALIAS: Keyword"
1478 in perlxs.
1479
1480 I32 ix
1481
1482 RETVAL Variable which is set up by "xsubpp" to hold the return value
1483 for an XSUB. This is always the proper type for the XSUB. See
1484 "The RETVAL Variable" in perlxs.
1485
1486 (whatever) RETVAL
1487
1488 ST Used to access elements on the XSUB's stack.
1489
1490 SV* ST(int ix)
1491
1492 THIS Variable which is set up by "xsubpp" to designate the object in
1493 a C++ XSUB. This is always the proper type for the C++ object.
1494 See "CLASS" and "Using XS With C++" in perlxs.
1495
1496 (whatever) THIS
1497
1498 UNDERBAR
1499 The SV* corresponding to the $_ variable. Works even if there
1500 is a lexical $_ in scope.
1501
1502 XS Macro to declare an XSUB and its C parameter list. This is
1503 handled by "xsubpp". It is the same as using the more explicit
1504 "XS_EXTERNAL" macro.
1505
1506 XS_EXTERNAL
1507 Macro to declare an XSUB and its C parameter list explicitly
1508 exporting the symbols.
1509
1510 XS_INTERNAL
1511 Macro to declare an XSUB and its C parameter list without
1512 exporting the symbols. This is handled by "xsubpp" and
1513 generally preferable over exporting the XSUB symbols
1514 unnecessarily.
1515
1517 dump_all
1518 Dumps the entire optree of the current program starting at
1519 "PL_main_root" to "STDERR". Also dumps the optrees for all
1520 visible subroutines in "PL_defstash".
1521
1522 void dump_all()
1523
1524 dump_packsubs
1525 Dumps the optrees for all visible subroutines in "stash".
1526
1527 void dump_packsubs(const HV* stash)
1528
1529 op_class
1530 Given an op, determine what type of struct it has been
1531 allocated as. Returns one of the OPclass enums, such as
1532 OPclass_LISTOP.
1533
1534 OPclass op_class(const OP *o)
1535
1536 op_dump Dumps the optree starting at OP "o" to "STDERR".
1537
1538 void op_dump(const OP *o)
1539
1540 sv_dump Dumps the contents of an SV to the "STDERR" filehandle.
1541
1542 For an example of its output, see Devel::Peek.
1543
1544 void sv_dump(SV* sv)
1545
1547 pv_display
1548 Similar to
1549
1550 pv_escape(dsv,pv,cur,pvlim,PERL_PV_ESCAPE_QUOTE);
1551
1552 except that an additional "\0" will be appended to the string
1553 when len > cur and pv[cur] is "\0".
1554
1555 Note that the final string may be up to 7 chars longer than
1556 pvlim.
1557
1558 char* pv_display(SV *dsv, const char *pv, STRLEN cur,
1559 STRLEN len, STRLEN pvlim)
1560
1561 pv_escape
1562 Escapes at most the first "count" chars of "pv" and puts the
1563 results into "dsv" such that the size of the escaped string
1564 will not exceed "max" chars and will not contain any incomplete
1565 escape sequences. The number of bytes escaped will be returned
1566 in the "STRLEN *escaped" parameter if it is not null. When the
1567 "dsv" parameter is null no escaping actually occurs, but the
1568 number of bytes that would be escaped were it not null will be
1569 calculated.
1570
1571 If flags contains "PERL_PV_ESCAPE_QUOTE" then any double quotes
1572 in the string will also be escaped.
1573
1574 Normally the SV will be cleared before the escaped string is
1575 prepared, but when "PERL_PV_ESCAPE_NOCLEAR" is set this will
1576 not occur.
1577
1578 If "PERL_PV_ESCAPE_UNI" is set then the input string is treated
1579 as UTF-8. If "PERL_PV_ESCAPE_UNI_DETECT" is set then the input
1580 string is scanned using "is_utf8_string()" to determine if it
1581 is UTF-8.
1582
1583 If "PERL_PV_ESCAPE_ALL" is set then all input chars will be
1584 output using "\x01F1" style escapes, otherwise if
1585 "PERL_PV_ESCAPE_NONASCII" is set, only non-ASCII chars will be
1586 escaped using this style; otherwise, only chars above 255 will
1587 be so escaped; other non printable chars will use octal or
1588 common escaped patterns like "\n". Otherwise, if
1589 "PERL_PV_ESCAPE_NOBACKSLASH" is set then all chars below 255 will
1590 treated as printable and will be output as literals.
1591
1592 If "PERL_PV_ESCAPE_FIRSTCHAR" is set then only the first char
1593 of the string will be escaped, regardless of max. If the
1594 output is to be in hex, then it will be returned as a plain hex
1595 sequence. Thus the output will either be a single char, an
1596 octal escape sequence, a special escape like "\n" or a hex
1597 value.
1598
1599 If "PERL_PV_ESCAPE_RE" is set then the escape char used will be
1600 a "%" and not a "\\". This is because regexes very often
1601 contain backslashed sequences, whereas "%" is not a
1602 particularly common character in patterns.
1603
1604 Returns a pointer to the escaped text as held by "dsv".
1605
1606 char* pv_escape(SV *dsv, char const * const str,
1607 const STRLEN count, const STRLEN max,
1608 STRLEN * const escaped,
1609 const U32 flags)
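
The two limits described above ("count" input chars, "max" output chars, never a partial escape) can be sketched in plain C. This is only a standalone model, not Perl's implementation: "demo_escape" and "DEMO_ESCAPE_QUOTE" are hypothetical names, the output goes to a caller-supplied buffer instead of an SV, and for brevity every non-printable byte other than newline is written as a two-digit hex escape.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

#define DEMO_ESCAPE_QUOTE 0x1u  /* hypothetical stand-in for PERL_PV_ESCAPE_QUOTE */

/* Escape at most `count` bytes of `str`, emitting at most `max` chars and
 * never a partial escape sequence.  If `out` is NULL nothing is written,
 * but *escaped still reports how many input bytes would have been used. */
static char *demo_escape(char *out, size_t outsz,
                         const char *str, size_t count, size_t max,
                         size_t *escaped, unsigned flags)
{
    size_t used = 0;    /* output chars produced so far */
    size_t i;

    assert(out == NULL || outsz > 0);

    for (i = 0; i < count; i++) {
        unsigned char c = (unsigned char)str[i];
        char piece[8];
        size_t n;

        if (c == '\\' || (c == '"' && (flags & DEMO_ESCAPE_QUOTE)))
            n = (size_t)snprintf(piece, sizeof piece, "\\%c", c);
        else if (c == '\n')
            n = (size_t)snprintf(piece, sizeof piece, "\\n");
        else if (c >= 0x20 && c < 0x7f)
            n = (size_t)snprintf(piece, sizeof piece, "%c", c);
        else
            n = (size_t)snprintf(piece, sizeof piece, "\\x%02X", c);

        /* Stop before a piece that would not fit as a whole. */
        if (used + n > max)
            break;
        if (out != NULL) {
            if (used + n + 1 > outsz)
                break;
            memcpy(out + used, piece, n);
        }
        used += n;
    }
    if (out != NULL)
        out[used] = '\0';
    if (escaped != NULL)
        *escaped = i;
    return out;
}
```

Because the fit test is applied to a whole piece at a time, a two-char escape such as a backslashed quote is either emitted completely or not at all, which is the property the description above relies on.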
1610
1611 pv_pretty
1612 Converts a string into something presentable, handling escaping
1613 via "pv_escape()" and supporting quoting and ellipses.
1614
1615 If the "PERL_PV_PRETTY_QUOTE" flag is set then the result will
1616 be double quoted with any double quotes in the string escaped.
1617 Otherwise if the "PERL_PV_PRETTY_LTGT" flag is set then the
1618 result will be wrapped in angle brackets.
1619
1620 If the "PERL_PV_PRETTY_ELLIPSES" flag is set and not all
1621 characters in string were output then an ellipsis "..." will be
1622 appended to the string. Note that this happens AFTER it has
1623 been quoted.
1624
1625 If "start_color" is non-null then it will be inserted after the
1626 opening quote (if there is one) but before the escaped text.
1627 If "end_color" is non-null then it will be inserted after the
1628 escaped text but before any quotes or ellipses.
1629
1630 Returns a pointer to the prettified text as held by "dsv".
1631
1632 char* pv_pretty(SV *dsv, char const * const str,
1633 const STRLEN count, const STRLEN max,
1634 char const * const start_color,
1635 char const * const end_color,
1636 const U32 flags)
1637
1639 cv_clone
1640 Clone a CV, making a lexical closure. "proto" supplies the
1641 prototype of the function: its code, pad structure, and other
1642 attributes. The prototype is combined with a capture of outer
1643 lexicals to which the code refers, which are taken from the
1644 currently-executing instance of the immediately surrounding
1645 code.
1646
1647 CV * cv_clone(CV *proto)
1648
1649 cv_name Returns an SV containing the name of the CV, mainly for use in
1650 error reporting. The CV may actually be a GV instead, in which
1651 case the returned SV holds the GV's name. Anything other than
1652 a GV or CV is treated as a string already holding the sub name,
1653 but this could change in the future.
1654
1655 An SV may be passed as a second argument. If so, the name will
1656 be assigned to it and it will be returned. Otherwise the
1657 returned SV will be a new mortal.
1658
1659 If "flags" has the "CV_NAME_NOTQUAL" bit set, then the package
1660 name will not be included. If the first argument is neither a
1661 CV nor a GV, this flag is ignored (subject to change).
1662
1663 SV * cv_name(CV *cv, SV *sv, U32 flags)
1664
1665 cv_undef
1666 Clear out all the active components of a CV. This can happen
1667 either by an explicit "undef &foo", or by the reference count
1668 going to zero. In the former case, we keep the "CvOUTSIDE"
1669 pointer, so that any anonymous children can still follow the
1670 full lexical scope chain.
1671
1672 void cv_undef(CV* cv)
1673
1674 find_rundefsv
1675 Returns the global variable $_.
1676
1677 SV * find_rundefsv()
1678
1679 find_rundefsvoffset
1680 DEPRECATED! It is planned to remove this function from a
1681 future release of Perl. Do not use it for new code; remove it
1682 from existing code.
1683
1684 Until the lexical $_ feature was removed, this function would
1685 find the position of the lexical $_ in the pad of the
1686 currently-executing function and return the offset in the
1687 current pad, or "NOT_IN_PAD".
1688
1689 Now it always returns "NOT_IN_PAD".
1690
1691 NOTE: the perl_ form of this function is deprecated.
1692
1693 PADOFFSET find_rundefsvoffset()
1694
1695 intro_my
1696 "Introduce" "my" variables to visible status. This is called
1697 during parsing at the end of each statement to make lexical
1698 variables visible to subsequent statements.
1699
1700 U32 intro_my()
1701
1702 load_module
1703 Loads the module whose name is pointed to by the string part of
1704 "name". Note that the actual module name, not its filename,
1705 should be given, e.g. "Foo::Bar" instead of "Foo/Bar.pm". ver,
1706 if specified and not NULL, provides version semantics similar
1707 to "use Foo::Bar VERSION". The optional trailing arguments can
1708 be used to specify arguments to the module's "import()" method,
1709 similar to "use Foo::Bar VERSION LIST"; their precise handling
1710 depends on the flags. The flags argument is a bitwise-ORed
1711 collection of any of "PERL_LOADMOD_DENY",
1712 "PERL_LOADMOD_NOIMPORT", or "PERL_LOADMOD_IMPORT_OPS" (or 0 for
1713 no flags).
1714
1715 If "PERL_LOADMOD_NOIMPORT" is set, the module is loaded as if
1716 with an empty import list, as in "use Foo::Bar ()"; this is the
1717 only circumstance in which the trailing optional arguments may
1718 be omitted entirely. Otherwise, if "PERL_LOADMOD_IMPORT_OPS" is
1719 set, the trailing arguments must consist of exactly one "OP*",
1720 containing the op tree that produces the relevant import
1721 arguments. Otherwise, the trailing arguments must all be "SV*"
1722 values that will be used as import arguments; and the list must
1723 be terminated with "(SV*) NULL". If neither
1724 "PERL_LOADMOD_NOIMPORT" nor "PERL_LOADMOD_IMPORT_OPS" is set,
1725 the trailing "NULL" pointer is needed even if no import
1726 arguments are desired. The reference count for each specified
1727 "SV*" argument is decremented. In addition, the "name" argument
1728 is modified.
1729
1730 If "PERL_LOADMOD_DENY" is set, the module is loaded as if with
1731 "no" rather than "use".
1732
1733 void load_module(U32 flags, SV* name, SV* ver, ...)
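
The requirement that the import list end with "(SV*) NULL" follows from how C variadic functions work: the callee has no other way to find the end of the list. The following standalone sketch models that with plain strings; "demo_count_imports" is a hypothetical function, not part of the Perl API, and it takes "const char *" arguments where load_module takes "SV*".

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>

/* Hypothetical stand-in for load_module's trailing import list: counts
 * the `const char *` arguments that precede a null-pointer sentinel. */
static size_t demo_count_imports(const char *module, ...)
{
    va_list ap;
    size_t n = 0;

    assert(module != NULL);
    va_start(ap, module);
    /* Keep reading until the sentinel; omitting it is undefined behaviour. */
    while (va_arg(ap, const char *) != NULL)
        n++;
    va_end(ap);
    return n;
}
```

A call with no import arguments still needs the sentinel, as in demo_count_imports("Foo::Bar", (const char *)NULL). The cast matters because a bare "NULL" may be an integer constant, which is not guaranteed to be passed through "..." in the same way as a pointer.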
1734
1735 newPADNAMELIST
1736 NOTE: this function is experimental and may change or be
1737 removed without notice.
1738
1739 Creates a new pad name list. "max" is the highest index for
1740 which space is allocated.
1741
1742 PADNAMELIST * newPADNAMELIST(size_t max)
1743
1744 newPADNAMEouter
1745 NOTE: this function is experimental and may change or be
1746 removed without notice.
1747
1748 Constructs and returns a new pad name. Only use this function
1749 for names that refer to outer lexicals. (See also
1750 "newPADNAMEpvn".) "outer" is the outer pad name that this one
1751 mirrors. The returned pad name has the "PADNAMEt_OUTER" flag
1752 already set.
1753
1754 PADNAME * newPADNAMEouter(PADNAME *outer)
1755
1756 newPADNAMEpvn
1757 NOTE: this function is experimental and may change or be
1758 removed without notice.
1759
1760 Constructs and returns a new pad name. "s" must be a UTF-8
1761 string. Do not use this for pad names that point to outer
1762 lexicals. See "newPADNAMEouter".
1763
1764 PADNAME * newPADNAMEpvn(const char *s, STRLEN len)
1765
1766 nothreadhook
1767 Stub that provides thread hook for perl_destruct when there are
1768 no threads.
1769
1770 int nothreadhook()
1771
1772 pad_add_anon
1773 Allocates a place in the currently-compiling pad (via
1774 "pad_alloc") for an anonymous function that is lexically scoped
1775 inside the currently-compiling function. The function "func"
1776 is linked into the pad, and its "CvOUTSIDE" link to the outer
1777 scope is weakened to avoid a reference loop.
1778
1779 One reference count is stolen, so you may need to do
1780 "SvREFCNT_inc(func)".
1781
1782 "optype" should be an opcode indicating the type of operation
1783 that the pad entry is to support. This doesn't affect
1784 operational semantics, but is used for debugging.
1785
1786 PADOFFSET pad_add_anon(CV *func, I32 optype)
1787
1788 pad_add_name_pv
1789 Exactly like "pad_add_name_pvn", but takes a nul-terminated
1790 string instead of a string/length pair.
1791
1792 PADOFFSET pad_add_name_pv(const char *name, U32 flags,
1793 HV *typestash, HV *ourstash)
1794
1795 pad_add_name_pvn
1796 Allocates a place in the currently-compiling pad for a named
1797 lexical variable. Stores the name and other metadata in the
1798 name part of the pad, and makes preparations to manage the
1799 variable's lexical scoping. Returns the offset of the
1800 allocated pad slot.
1801
1802 "namepv"/"namelen" specify the variable's name, including
1803 leading sigil. If "typestash" is non-null, the name is for a
1804 typed lexical, and this identifies the type. If "ourstash" is
1805 non-null, it's a lexical reference to a package variable, and
1806 this identifies the package. The following flags can be OR'ed
1807 together:
1808
1809 padadd_OUR redundantly specifies if it's a package var
1810 padadd_STATE variable will retain value persistently
1811 padadd_NO_DUP_CHECK skip check for lexical shadowing
1812
1813 PADOFFSET pad_add_name_pvn(const char *namepv,
1814 STRLEN namelen, U32 flags,
1815 HV *typestash, HV *ourstash)
1816
1817 pad_add_name_sv
1818 Exactly like "pad_add_name_pvn", but takes the name string in
1819 the form of an SV instead of a string/length pair.
1820
1821 PADOFFSET pad_add_name_sv(SV *name, U32 flags,
1822 HV *typestash, HV *ourstash)
1823
1824 pad_alloc
1825 NOTE: this function is experimental and may change or be
1826 removed without notice.
1827
1828 Allocates a place in the currently-compiling pad, returning the
1829 offset of the allocated pad slot. No name is initially
1830 attached to the pad slot. "tmptype" is a set of flags
1831 indicating the kind of pad entry required, which will be set in
1832 the value SV for the allocated pad entry:
1833
1834 SVs_PADMY named lexical variable ("my", "our", "state")
1835 SVs_PADTMP unnamed temporary store
1836 SVf_READONLY constant shared between recursion levels
1837
1838 "SVf_READONLY" has been supported here only since perl 5.20.
1839 To work with earlier versions as well, use
1840 "SVf_READONLY|SVs_PADTMP". "SVf_READONLY" does not cause the
1841 SV in the pad slot to be marked read-only, but simply tells
1842 "pad_alloc" that it will be made read-only (by the caller), or
1843 at least should be treated as such.
1844
1845 "optype" should be an opcode indicating the type of operation
1846 that the pad entry is to support. This doesn't affect
1847 operational semantics, but is used for debugging.
1848
1849 PADOFFSET pad_alloc(I32 optype, U32 tmptype)
1850
1851 pad_findmy_pv
1852 Exactly like "pad_findmy_pvn", but takes a nul-terminated
1853 string instead of a string/length pair.
1854
1855 PADOFFSET pad_findmy_pv(const char *name, U32 flags)
1856
1857 pad_findmy_pvn
1858 Given the name of a lexical variable, find its position in the
1859 currently-compiling pad. "namepv"/"namelen" specify the
1860 variable's name, including leading sigil. "flags" is reserved
1861 and must be zero. If it is not in the current pad but appears
1862 in the pad of any lexically enclosing scope, then a pseudo-
1863 entry for it is added in the current pad. Returns the offset
1864 in the current pad, or "NOT_IN_PAD" if no such lexical is in
1865 scope.
1866
1867 PADOFFSET pad_findmy_pvn(const char *namepv,
1868 STRLEN namelen, U32 flags)
1869
1870 pad_findmy_sv
1871 Exactly like "pad_findmy_pvn", but takes the name string in the
1872 form of an SV instead of a string/length pair.
1873
1874 PADOFFSET pad_findmy_sv(SV *name, U32 flags)
1875
1876 padnamelist_fetch
1877 NOTE: this function is experimental and may change or be
1878 removed without notice.
1879
1880 Fetches the pad name from the given index.
1881
1882 PADNAME * padnamelist_fetch(PADNAMELIST *pnl,
1883 SSize_t key)
1884
1885 padnamelist_store
1886 NOTE: this function is experimental and may change or be
1887 removed without notice.
1888
1889 Stores the pad name (which may be null) at the given index,
1890 freeing any existing pad name in that slot.
1891
1892 PADNAME ** padnamelist_store(PADNAMELIST *pnl,
1893 SSize_t key, PADNAME *val)
1894
1895 pad_setsv
1896 Set the value at offset "po" in the current (compiling or
1897 executing) pad. Use the macro "PAD_SETSV()" rather than
1898 calling this function directly.
1899
1900 void pad_setsv(PADOFFSET po, SV *sv)
1901
1902 pad_sv Get the value at offset "po" in the current (compiling or
1903 executing) pad. Use macro "PAD_SV" instead of calling this
1904 function directly.
1905
1906 SV * pad_sv(PADOFFSET po)
1907
1908 pad_tidy
1909 NOTE: this function is experimental and may change or be
1910 removed without notice.
1911
1912 Tidy up a pad at the end of compilation of the code to which it
1913 belongs. Jobs performed here are: remove most stuff from the
1914 pads of anonsub prototypes; give it a @_; mark temporaries as
1915 such. "type" indicates the kind of subroutine:
1916
1917 padtidy_SUB ordinary subroutine
1918 padtidy_SUBCLONE prototype for lexical closure
1919 padtidy_FORMAT format
1920
1921 void pad_tidy(padtidy_type type)
1922
1923 perl_alloc
1924 Allocates a new Perl interpreter. See perlembed.
1925
1926 PerlInterpreter* perl_alloc()
1927
1928 perl_construct
1929 Initializes a new Perl interpreter. See perlembed.
1930
1931 void perl_construct(PerlInterpreter *my_perl)
1932
1933 perl_destruct
1934 Shuts down a Perl interpreter. See perlembed for a tutorial.
1935
1936 "my_perl" points to the Perl interpreter. It must have been
1937 previously created through the use of "perl_alloc" and
1938 "perl_construct". It may have been initialised through
1939 "perl_parse", and may have been used through "perl_run" and
1940 other means. This function should be called for any Perl
1941 interpreter that has been constructed with "perl_construct",
1942 even if subsequent operations on it failed, for example if
1943 "perl_parse" returned a non-zero value.
1944
1945 If the interpreter's "PL_exit_flags" word has the
1946 "PERL_EXIT_DESTRUCT_END" flag set, then this function will
1947 execute code in "END" blocks before performing the rest of
1948 destruction. If it is desired to make any use of the
1949 interpreter between "perl_parse" and "perl_destruct" other than
1950 just calling "perl_run", then this flag should be set early on.
1951 This matters if "perl_run" will not be called, or if anything
1952 else will be done in addition to calling "perl_run".
1953
1954 Returns a value suitable to pass to the C library
1955 function "exit" (or to return from "main"), to serve as an exit
1956 code indicating the nature of the way the interpreter
1957 terminated. This takes into account any failure of
1958 "perl_parse" and any early exit from "perl_run". The exit code
1959 is of the type required by the host operating system, so
1960 because of differing exit code conventions it is not portable
1961 to interpret specific numeric values as having specific
1962 meanings.
1963
1964 int perl_destruct(PerlInterpreter *my_perl)
1965
1966 perl_free
1967 Releases a Perl interpreter. See perlembed.
1968
1969 void perl_free(PerlInterpreter *my_perl)
1970
1971 perl_parse
1972 Tells a Perl interpreter to parse a Perl script. This performs
1973 most of the initialisation of a Perl interpreter. See
1974 perlembed for a tutorial.
1975
1976 "my_perl" points to the Perl interpreter that is to parse the
1977 script. It must have been previously created through the use
1978 of "perl_alloc" and "perl_construct". "xsinit" points to a
1979 callback function that will be called to set up the ability for
1980 this Perl interpreter to load XS extensions, or may be null to
1981 perform no such setup.
1982
1983 "argc" and "argv" supply a set of command-line arguments to the
1984 Perl interpreter, as would normally be passed to the "main"
1985 function of a C program. "argv[argc]" must be null. These
1986 arguments are where the script to parse is specified, either by
1987 naming a script file or by providing a script in a "-e" option.
1988 If $0 will be written to in the Perl interpreter, then the
1989 argument strings must be in writable memory, and so mustn't
1990 just be string constants.
1991
1992 "env" specifies a set of environment variables that will be
1993 used by this Perl interpreter. If non-null, it must point to a
1994 null-terminated array of environment strings. If null, the
1995 Perl interpreter will use the environment supplied by the
1996 "environ" global variable.
1997
1998 This function initialises the interpreter, and parses and
1999 compiles the script specified by the command-line arguments.
2000 This includes executing code in "BEGIN", "UNITCHECK", and
2001 "CHECK" blocks. It does not execute "INIT" blocks or the main
2002 program.
2003
2004 Returns an integer of slightly tricky interpretation. The
2005 correct use of the return value is as a truth value indicating
2006 whether there was a failure in initialisation. If zero is
2007 returned, this indicates that initialisation was successful,
2008 and it is safe to proceed to call "perl_run" and make other use
2009 of it. If a non-zero value is returned, this indicates some
2010 problem that means the interpreter wants to terminate. The
2011 interpreter should not be just abandoned upon such failure; the
2012 caller should proceed to shut the interpreter down cleanly with
2013 "perl_destruct" and free it with "perl_free".
2014
2015 For historical reasons, the non-zero return value also attempts
2016 to be a suitable value to pass to the C library function "exit"
2017 (or to return from "main"), to serve as an exit code indicating
2018 the nature of the way initialisation terminated. However, this
2019 isn't portable, due to differing exit code conventions. A
2020 historical bug is preserved for the time being: if the Perl
2021 built-in "exit" is called during this function's execution,
2022 with a type of exit entailing a zero exit code under the host
2023 operating system's conventions, then this function returns zero
2024 rather than a non-zero value. This bug, [perl #2754], leads to
2025 "perl_run" being called (and therefore "INIT" blocks and the
2026 main program running) despite a call to "exit". It has been
2027 preserved because a popular module-installing module has come
2028 to rely on it and needs time to be fixed. This issue is [perl
2029 #132577], and the original bug is due to be fixed in Perl 5.30.
2030
2031 int perl_parse(PerlInterpreter *my_perl,
2032 XSINIT_t xsinit, int argc,
2033 char **argv, char **env)
2034
2035 perl_run
2036 Tells a Perl interpreter to run its main program. See
2037 perlembed for a tutorial.
2038
2039 "my_perl" points to the Perl interpreter. It must have been
2040 previously created through the use of "perl_alloc" and
2041 "perl_construct", and initialised through "perl_parse". This
2042 function should not be called if "perl_parse" returned a non-
2043 zero value, indicating a failure in initialisation or
2044 compilation.
2045
2046 This function executes code in "INIT" blocks, and then executes
2047 the main program. The code to be executed is that established
2048 by the prior call to "perl_parse". If the interpreter's
2049 "PL_exit_flags" word does not have the "PERL_EXIT_DESTRUCT_END"
2050 flag set, then this function will also execute code in "END"
2051 blocks. If it is desired to make any further use of the
2052 interpreter after calling this function, then "END" blocks
2053 should be postponed to "perl_destruct" time by setting that
2054 flag.
2055
2056 Returns an integer of slightly tricky interpretation. The
2057 correct use of the return value is as a truth value indicating
2058 whether the program terminated non-locally. If zero is
2059 returned, this indicates that the program ran to completion,
2060 and it is safe to make other use of the interpreter (provided
2061 that the "PERL_EXIT_DESTRUCT_END" flag was set as described
2062 above). If a non-zero value is returned, this indicates that
2063 the interpreter wants to terminate early. The interpreter
2064 should not be just abandoned because of this desire to
2065 terminate; the caller should proceed to shut the interpreter
2066 down cleanly with "perl_destruct" and free it with "perl_free".
2067
2068 For historical reasons, the non-zero return value also attempts
2069 to be a suitable value to pass to the C library function "exit"
2070 (or to return from "main"), to serve as an exit code indicating
2071 the nature of the way the program terminated. However, this
2072 isn't portable, due to differing exit code conventions. An
2073 attempt is made to return an exit code of the type required by
2074 the host operating system, but because it is constrained to be
2075 non-zero, it is not necessarily possible to indicate every type
2076 of exit. It is only reliable on Unix, where a zero exit code
2077 can be augmented with a set bit that will be ignored. In any
2078 case, this function is not the correct place to acquire an exit
2079 code: one should get that from "perl_destruct".
2080
2081 int perl_run(PerlInterpreter *my_perl)
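
The calling rules for "perl_parse", "perl_run", "perl_destruct" and "perl_free" given above amount to a small piece of control flow. The sketch below models only that control flow with stub functions so that it can be compiled on its own; every "demo_" name is hypothetical and nothing here calls into Perl. See perlembed for a real embedding program.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an interpreter: canned return values plus a
 * trace of which calls were made ('P'arse, 'R'un, 'D'estruct, 'F'ree). */
typedef struct {
    int    parse_rc, run_rc, destruct_rc;
    char   trace[8];
    size_t n;
} demo_interp;

static void demo_note(demo_interp *d, char c)
{
    assert(d->n + 1 < sizeof d->trace);
    d->trace[d->n++] = c;
}

static int  demo_parse(demo_interp *d)    { demo_note(d, 'P'); return d->parse_rc; }
static int  demo_run(demo_interp *d)      { demo_note(d, 'R'); return d->run_rc; }
static int  demo_destruct(demo_interp *d) { demo_note(d, 'D'); return d->destruct_rc; }
static void demo_free(demo_interp *d)     { demo_note(d, 'F'); }

/* Drive one interpreter through its life and return the exit code. */
static int demo_drive(demo_interp *d)
{
    int exit_code;

    /* The parse result is used only as a truth value: non-zero means
     * the main program must not be run. */
    if (demo_parse(d) == 0)
        (void)demo_run(d);      /* its result is not the exit code */

    /* Shut down and free in every case, even after a failed parse;
     * the exit code comes from the destruct step. */
    exit_code = demo_destruct(d);
    demo_free(d);
    return exit_code;
}
```

In a real embedding the same shape applies with the actual functions, and, as described under "perl_destruct", the "PERL_EXIT_DESTRUCT_END" flag would be set in "PL_exit_flags" before parsing if the interpreter is to be used for anything beyond a single "perl_run".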
2082
2083 require_pv
2084 Tells Perl to "require" the file named by the string argument.
2085 It is analogous to the Perl code "eval "require '$file'"".
2086 It's even implemented that way; consider using load_module
2087 instead.
2088
2089 NOTE: the perl_ form of this function is deprecated.
2090
2091 void require_pv(const char* pv)
2092
2094 dXCPT Set up necessary local variables for exception handling. See
2095 "Exception Handling" in perlguts.
2096
2097 dXCPT;
2098
2099 XCPT_CATCH
2100 Introduces a catch block. See "Exception Handling" in
2101 perlguts.
2102
2103 XCPT_RETHROW
2104 Rethrows a previously caught exception. See "Exception
2105 Handling" in perlguts.
2106
2107 XCPT_RETHROW;
2108
2109 XCPT_TRY_END
2110 Ends a try block. See "Exception Handling" in perlguts.
2111
2112 XCPT_TRY_START
2113 Starts a try block. See "Exception Handling" in perlguts.
2114
2116 sortsv_flags
2117 In-place sort an array of SV pointers with the given comparison
2118 routine, with various SORTf_* flag options.
2119
2120 void sortsv_flags(SV** array, size_t num_elts,
2121 SVCOMPARE_t cmp, U32 flags)
2122
2124 save_gp Saves the current GP of gv on the save stack to be restored on
2125 scope exit.
2126
2127 If empty is true, replace the GP with a new GP.
2128
2129 If empty is false, mark gv with GVf_INTRO so the next reference
2130 assigned is localized, which is how " local *foo = $someref; "
2131 works.
2132
2133 void save_gp(GV* gv, I32 empty)
2134
2136 new_version
2137 Returns a new version object based on the passed in SV:
2138
2139 SV *sv = new_version(SV *ver);
2140
2141 Does not alter the passed in ver SV. See "upg_version" if you
2142 want to upgrade the SV.
2143
2144 SV* new_version(SV *ver)
2145
2146 prescan_version
2147 Validates that a given string can be parsed as a version object,
2148 but does not actually perform the parsing. Can use either
2149 strict or lax validation rules. Can optionally set a number of
2150 hint variables to save the parsing code some time when
2151 tokenizing.
2152
2153 const char* prescan_version(const char *s, bool strict,
2154 const char** errstr,
2155 bool *sqv,
2156 int *ssaw_decimal,
2157 int *swidth, bool *salpha)
2158
2159 scan_version
2160 Returns a pointer to the next character after the parsed
2161 version string, as well as upgrading the passed in SV to an RV.
2162
2163 Function must be called with an already existing SV like
2164
2165 sv = newSV(0);
2166 s = scan_version(s, SV *sv, bool qv);
2167
2168 Performs some preprocessing to the string to ensure that it has
2169 the correct characteristics of a version. Flags the object if
2170 it contains an underscore (which denotes this is an alpha
2171 version). The boolean qv denotes that the version should be
2172 interpreted as if it had multiple decimals, even if it doesn't.
2173
2174 const char* scan_version(const char *s, SV *rv, bool qv)
2175
2176 upg_version
2177 In-place upgrade of the supplied SV to a version object.
2178
2179 SV *sv = upg_version(SV *sv, bool qv);
2180
2181 Returns a pointer to the upgraded SV. Set the boolean qv if
2182 you want to force this SV to be interpreted as an "extended"
2183 version.
2184
2185 SV* upg_version(SV *ver, bool qv)
2186
2187 vcmp Version object aware cmp. Both operands must already have been
2188 converted into version objects.
2189
2190 int vcmp(SV *lhv, SV *rhv)
2191
2192 vnormal Accepts a version object and returns the normalized string
2193 representation. Call like:
2194
2195 sv = vnormal(rv);
2196
2197 NOTE: you can pass either the object directly or the SV
2198 contained within the RV.
2199
2200 The SV returned has a refcount of 1.
2201
2202 SV* vnormal(SV *vs)
2203
2204 vnumify Accepts a version object and returns the normalized floating
2205 point representation. Call like:
2206
2207 sv = vnumify(rv);
2208
2209 NOTE: you can pass either the object directly or the SV
2210 contained within the RV.
2211
2212 The SV returned has a refcount of 1.
2213
2214 SV* vnumify(SV *vs)
2215
2216 vstringify
2217 In order to maintain maximum compatibility with earlier
2218 versions of Perl, this function will return either the floating
2219 point notation or the multiple dotted notation, depending on
2220 whether the original version contained 1 or more dots,
2221 respectively.
2222
2223 The SV returned has a refcount of 1.
2224
2225 SV* vstringify(SV *vs)
2226
2227 vverify Validates that the SV contains valid internal structure for a
2228 version object. It may be passed either the version object
2229 (RV) or the hash itself (HV). If the structure is valid, it
2230 returns the HV. If the structure is invalid, it returns NULL.
2231
2232 SV *hv = vverify(sv);
2233
2234 Note that it only confirms the bare minimum structure (so as
2235 not to get confused by derived classes which may contain
2236 additional hash entries):
2237
2238 · The SV is an HV or a reference to an HV
2239
2240 · The hash contains a "version" key
2241
2242 · The "version" key has a reference to an AV as its value
2243
2244 SV* vverify(SV *vs)
2245
2247 G_ARRAY Used to indicate list context. See "GIMME_V", "GIMME" and
2248 perlcall.
2249
2250 G_DISCARD
2251 Indicates that arguments returned from a callback should be
2252 discarded. See perlcall.
2253
2254 G_EVAL Used to force a Perl "eval" wrapper around a callback. See
2255 perlcall.
2256
2257 GIMME A backward-compatible version of "GIMME_V" which can only
2258 return "G_SCALAR" or "G_ARRAY"; in a void context, it returns
2259 "G_SCALAR". Deprecated. Use "GIMME_V" instead.
2260
2261 U32 GIMME
2262
2263 GIMME_V The XSUB-writer's equivalent to Perl's "wantarray". Returns
2264 "G_VOID", "G_SCALAR" or "G_ARRAY" for void, scalar or list
2265 context, respectively. See perlcall for a usage example.
2266
2267 U32 GIMME_V
2268
2269 G_NOARGS
2270 Indicates that no arguments are being sent to a callback. See
2271 perlcall.
2272
2273 G_SCALAR
2274 Used to indicate scalar context. See "GIMME_V", "GIMME", and
2275 perlcall.
2276
2277 G_VOID Used to indicate void context. See "GIMME_V" and perlcall.
2278
2280 These variables are global to an entire process. They are shared
2281 between all interpreters and all threads in a process. Any variables
2282 not documented here may be changed or removed without notice, so don't
2283 use them! If you feel you really do need to use an unlisted variable,
2284 first send email to perl5-porters@perl.org
2285 <mailto:perl5-porters@perl.org>. It may be that someone there will
2286 point out a way to accomplish what you need without using an internal
2287 variable. But if not, you should get a go-ahead to document and then
2288 use the variable.
2289
2290 PL_check
2291 Array, indexed by opcode, of functions that will be called for
2292 the "check" phase of optree building during compilation of Perl
2293 code. For most (but not all) types of op, once the op has been
2294 initially built and populated with child ops it will be
2295 filtered through the check function referenced by the
2296 appropriate element of this array. The new op is passed in as
2297 the sole argument to the check function, and the check function
2298 returns the completed op. The check function may (as the name
2299 suggests) check the op for validity and signal errors. It may
2300 also initialise or modify parts of the ops, or perform more
2301 radical surgery such as adding or removing child ops, or even
2302 throw the op away and return a different op in its place.
2303
2304 This array of function pointers is a convenient place to hook
2305 into the compilation process. An XS module can put its own
2306 custom check function in place of any of the standard ones, to
2307 influence the compilation of a particular type of op. However,
2308 a custom check function must never fully replace a standard
2309 check function (or even a custom check function from another
2310 module). A module modifying checking must instead wrap the
2311 preexisting check function. A custom check function must be
2312 selective about when to apply its custom behaviour. In the
2313 usual case where it decides not to do anything special with an
2314 op, it must chain the preexisting check function. Check functions
2315 are thus linked in a chain, with the core's base checker at the
2316 end.
2317
2318 For thread safety, modules should not write directly to this
2319 array. Instead, use the function "wrap_op_checker".
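
The wrapping discipline described above can be sketched as a standalone model. Everything named "demo_" or "DEMO_" is hypothetical: the model uses a plain array and a toy op structure, and it leaves out the thread-safety concerns that are the reason real modules should call "wrap_op_checker" instead of writing to "PL_check".

```c
#include <assert.h>
#include <stddef.h>

enum { DEMO_OP_CONST, DEMO_OP_ADD, DEMO_OP_MAX };   /* hypothetical opcodes */

typedef struct demo_op { int type; int value; int touched; } demo_op;
typedef demo_op *(*demo_check_t)(demo_op *);

/* Stand-in for the core's base checker at the end of the chain. */
static demo_op *demo_core_check(demo_op *o) { return o; }

/* Stand-in for the per-opcode table of check functions. */
static demo_check_t demo_check[DEMO_OP_MAX] = { demo_core_check, demo_core_check };

/* Save the current checker for `opcode` in *old_p, then install ours. */
static void demo_wrap_checker(int opcode, demo_check_t new_checker,
                              demo_check_t *old_p)
{
    assert(opcode >= 0 && opcode < DEMO_OP_MAX);
    assert(new_checker != NULL && old_p != NULL);
    *old_p = demo_check[opcode];
    demo_check[opcode] = new_checker;
}

static demo_check_t demo_next_checker;  /* whatever was installed before us */

/* Selective custom checker: acts only on ops it cares about and always
 * hands the op on to the previously installed checker. */
static demo_op *demo_my_check(demo_op *o)
{
    if (o->value == 42)
        o->touched = 1;
    return demo_next_checker(o);
}
```

Installing the checker is then a single call, demo_wrap_checker(DEMO_OP_CONST, demo_my_check, &demo_next_checker), after which ops of that type pass through "demo_my_check" first and the base checker last.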
2320
2321 PL_keyword_plugin
2322 NOTE: this variable is experimental and may change or be
2323 removed without notice.
2324
2325 Function pointer, pointing at a function used to handle
2326 extended keywords. The function should be declared as
2327
2328 int keyword_plugin_function(pTHX_
2329 char *keyword_ptr, STRLEN keyword_len,
2330 OP **op_ptr)
2331
2332 The function is called from the tokeniser, whenever a possible
2333 keyword is seen. "keyword_ptr" points at the word in the
2334 parser's input buffer, and "keyword_len" gives its length; it
2335 is not null-terminated. The function is expected to examine
2336 the word, and possibly other state such as %^H, to decide
2337 whether it wants to handle it as an extended keyword. If it
2338 does not, the function should return "KEYWORD_PLUGIN_DECLINE",
2339 and the normal parser process will continue.
2340
2341 If the function wants to handle the keyword, it first must
2342 parse anything following the keyword that is part of the syntax
2343 introduced by the keyword. See "Lexer interface" for details.
2344
2345 When a keyword is being handled, the plugin function must build
2346 a tree of "OP" structures, representing the code that was
2347 parsed. The root of the tree must be stored in *op_ptr. The
2348 function then returns a constant indicating the syntactic role
2349 of the construct that it has parsed: "KEYWORD_PLUGIN_STMT" if
2350 it is a complete statement, or "KEYWORD_PLUGIN_EXPR" if it is
2351 an expression. Note that a statement construct cannot be used
2352 inside an expression (except via "do BLOCK" and similar), and
2353 an expression is not a complete statement (it requires at least
2354 a terminating semicolon).
2355
2356 When a keyword is handled, the plugin function may also have
2357 (compile-time) side effects. It may modify "%^H", define
2358 functions, and so on. Typically, if side effects are the main
2359 purpose of a handler, it does not wish to generate any ops to
2360 be included in the normal compilation. In this case it is
2361 still required to supply an op tree, but it suffices to
2362 generate a single null op.
2363
2364 That's how the *PL_keyword_plugin function needs to behave
2365 overall. Conventionally, however, one does not completely
2366 replace the existing handler function. Instead, take a copy of
2367 "PL_keyword_plugin" before assigning your own function pointer
2368 to it. Your handler function should look for keywords that it
2369 is interested in and handle those. Where it is not interested,
2370 it should call the saved plugin function, passing on the
2371 arguments it received. Thus "PL_keyword_plugin" actually
2372 points at a chain of handler functions, all of which have an
2373 opportunity to handle keywords, and only the last function in
2374 the chain (built into the Perl core) will normally return
2375 "KEYWORD_PLUGIN_DECLINE".
2376
2377 For thread safety, modules should not set this variable
2378 directly. Instead, use the function "wrap_keyword_plugin".
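
The chaining convention described above can be illustrated with a small standalone C model. This is only a sketch and does not include Perl's headers: the types "op_t" and "handler_fn", the constants "DECLINE", "STMT" and "EXPR", the variable "current_handler" and the keyword "frob" are all hypothetical stand-ins for the real "OP", plugin function type, "KEYWORD_PLUGIN_*" constants, "PL_keyword_plugin" and a real plugin keyword. A real extension would install its handler through "wrap_keyword_plugin" rather than by assigning a pointer as the model does.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins; none of these names come from perl.h. */
enum { DECLINE = 0, STMT = 1, EXPR = 2 };
typedef struct { int kind; } op_t;
typedef int (*handler_fn)(const char *kw, size_t kw_len, op_t **op_ptr);

/* End of the chain: plays the role of the core's built-in handler. */
static int core_handler(const char *kw, size_t kw_len, op_t **op_ptr) {
    (void)kw; (void)kw_len; (void)op_ptr;
    return DECLINE;
}

static handler_fn current_handler = core_handler; /* models PL_keyword_plugin */
static handler_fn next_handler;                   /* saved previous handler */
static op_t null_op = { 0 };                      /* models a single null op */

/* Handles only "frob"; everything else is passed down the chain. */
static int my_handler(const char *kw, size_t kw_len, op_t **op_ptr) {
    /* kw is not NUL-terminated, so compare length and bytes explicitly. */
    if (kw_len == 4 && memcmp(kw, "frob", 4) == 0) {
        *op_ptr = &null_op;
        return STMT;
    }
    return next_handler(kw, kw_len, op_ptr);
}

/* Save the old handler, then install ours in front of it. */
static void install_my_handler(void) {
    next_handler = current_handler;
    current_handler = my_handler;
}
```

Because the keyword pointer refers into the parser's buffer, the model compares an explicit length and uses "memcmp" instead of "strcmp". Passing the same buffer with a different length therefore selects a different keyword.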
2379
2381 A GV is a structure which corresponds to a Perl typeglob, i.e. *foo.
2382 It is a structure that holds a pointer to a scalar, an array, a hash
2383 etc, corresponding to $foo, @foo, %foo.
2384
2385 GVs are usually found as values in stashes (symbol table hashes) where
2386 Perl stores its global variables.
2387
2388 GvAV Return the AV from the GV.
2389
2390 AV* GvAV(GV* gv)
2391
2392 gv_const_sv
2393 If "gv" is a typeglob whose subroutine entry is a constant sub
2394 eligible for inlining, or "gv" is a placeholder reference that
2395 would be promoted to such a typeglob, then returns the value
2396 returned by the sub. Otherwise, returns "NULL".
2397
2398 SV* gv_const_sv(GV* gv)
2399
2400 GvCV Return the CV from the GV.
2401
2402 CV* GvCV(GV* gv)
2403
2404 gv_fetchmeth
2405 Like "gv_fetchmeth_pvn", but lacks a flags parameter.
2406
2407 GV* gv_fetchmeth(HV* stash, const char* name,
2408 STRLEN len, I32 level)
2409
2410 gv_fetchmethod_autoload
2411 Returns the glob which contains the subroutine to call to
2412 invoke the method on the "stash". In fact in the presence of
2413 autoloading this may be the glob for "AUTOLOAD". In this case
2414 the corresponding variable $AUTOLOAD is already set up.
2415
2416 The third parameter of "gv_fetchmethod_autoload" determines
2417 whether AUTOLOAD lookup is performed if the given method is not
2418 present: non-zero means yes, look for AUTOLOAD; zero means no,
2419 don't look for AUTOLOAD. Calling "gv_fetchmethod" is
2420 equivalent to calling "gv_fetchmethod_autoload" with a non-zero
2421 "autoload" parameter.
2422
2423 These functions accept the "SUPER" token as a prefix of the method
2424 name. Note that if you want to keep the returned glob for a
2425 long time, you need to check for it being "AUTOLOAD", since at
2426 the later time the call may load a different subroutine due to
2427 $AUTOLOAD changing its value. Use the glob created as a side
2428 effect to do this.
2429
2430 These functions have the same side-effects as "gv_fetchmeth"
2431 with "level==0". The warning against passing the GV returned
2432 by "gv_fetchmeth" to "call_sv" applies equally to these
2433 functions.
2434
2435 GV* gv_fetchmethod_autoload(HV* stash,
2436 const char* name,
2437 I32 autoload)
2438
2439 gv_fetchmeth_autoload
2440 This is the old form of "gv_fetchmeth_pvn_autoload", which has
2441 no flags parameter.
2442
2443 GV* gv_fetchmeth_autoload(HV* stash,
2444 const char* name,
2445 STRLEN len, I32 level)
2446
2447 gv_fetchmeth_pv
2448 Exactly like "gv_fetchmeth_pvn", but takes a nul-terminated
2449 string instead of a string/length pair.
2450
2451 GV* gv_fetchmeth_pv(HV* stash, const char* name,
2452 I32 level, U32 flags)
2453
2454 gv_fetchmeth_pvn
2455 Returns the glob with the given "name" and a defined subroutine
2456 or "NULL". The glob lives in the given "stash", or in the
2457 stashes accessible via @ISA and "UNIVERSAL::".
2458
2459 The argument "level" should be either 0 or -1. If "level==0",
2460 as a side-effect creates a glob with the given "name" in the
2461 given "stash" which in the case of success contains an alias
2462 for the subroutine, and sets up caching info for this glob.
2463
2464 The only significant values for "flags" are "GV_SUPER" and
2465 "SVf_UTF8".
2466
2467 "GV_SUPER" indicates that we want to look up the method in the
2468 superclasses of the "stash".
2469
2470 The GV returned from "gv_fetchmeth" may be a method cache
2471 entry, which is not visible to Perl code. So when calling
2472 "call_sv", you should not use the GV directly; instead, you
2473 should use the method's CV, which can be obtained from the GV
2474 with the "GvCV" macro.
2475
2476 GV* gv_fetchmeth_pvn(HV* stash, const char* name,
2477 STRLEN len, I32 level,
2478 U32 flags)
2479
2480 gv_fetchmeth_pvn_autoload
2481 Same as "gv_fetchmeth_pvn()", but looks for autoloaded
2482 subroutines too. Returns a glob for the subroutine.
2483
2484 For an autoloaded subroutine without a GV, will create a GV
2485 even if "level < 0". For an autoloaded subroutine without a
2486 stub, "GvCV()" of the result may be zero.
2487
2488 Currently, the only significant value for "flags" is
2489 "SVf_UTF8".
2490
2491 GV* gv_fetchmeth_pvn_autoload(HV* stash,
2492 const char* name,
2493 STRLEN len, I32 level,
2494 U32 flags)
2495
2496 gv_fetchmeth_pv_autoload
2497 Exactly like "gv_fetchmeth_pvn_autoload", but takes a nul-
2498 terminated string instead of a string/length pair.
2499
2500 GV* gv_fetchmeth_pv_autoload(HV* stash,
2501 const char* name,
2502 I32 level, U32 flags)
2503
2504 gv_fetchmeth_sv
2505 Exactly like "gv_fetchmeth_pvn", but takes the name string in
2506 the form of an SV instead of a string/length pair.
2507
2508 GV* gv_fetchmeth_sv(HV* stash, SV* namesv,
2509 I32 level, U32 flags)
2510
2511 gv_fetchmeth_sv_autoload
2512 Exactly like "gv_fetchmeth_pvn_autoload", but takes the name
2513 string in the form of an SV instead of a string/length pair.
2514
2515 GV* gv_fetchmeth_sv_autoload(HV* stash, SV* namesv,
2516 I32 level, U32 flags)
2517
2518 GvHV Return the HV from the GV.
2519
2520 HV* GvHV(GV* gv)
2521
2522 gv_init The old form of "gv_init_pvn()". It does not work with UTF-8
2523 strings, as it has no flags parameter. If the "multi"
2524 parameter is set, the "GV_ADDMULTI" flag will be passed to
2525 "gv_init_pvn()".
2526
2527 void gv_init(GV* gv, HV* stash, const char* name,
2528 STRLEN len, int multi)
2529
2530 gv_init_pv
2531 Same as "gv_init_pvn()", but takes a nul-terminated string for
2532 the name instead of separate char * and length parameters.
2533
2534 void gv_init_pv(GV* gv, HV* stash, const char* name,
2535 U32 flags)
2536
2537 gv_init_pvn
2538 Converts a scalar into a typeglob. This is an incoercible
2539 typeglob; assigning a reference to it will assign to one of its
2540 slots, instead of overwriting it as happens with typeglobs
2541 created by "SvSetSV". Converting any scalar that is "SvOK()"
2542 may produce unpredictable results and is reserved for perl's
2543 internal use.
2544
2545 "gv" is the scalar to be converted.
2546
2547 "stash" is the parent stash/package, if any.
2548
2549 "name" and "len" give the name. The name must be unqualified;
2550 that is, it must not include the package name. If "gv" is a
2551 stash element, it is the caller's responsibility to ensure that
2552 the name passed to this function matches the name of the
2553 element. If it does not match, perl's internal bookkeeping
2554 will get out of sync.
2555
2556 "flags" can be set to "SVf_UTF8" if "name" is a UTF-8 string,
2557 or the return value of SvUTF8(sv). It can also take the
2558 "GV_ADDMULTI" flag, which means to pretend that the GV has been
2559 seen before (i.e., suppress "Used once" warnings).
2560
2561 void gv_init_pvn(GV* gv, HV* stash, const char* name,
2562 STRLEN len, U32 flags)
2563
2564 gv_init_sv
2565 Same as "gv_init_pvn()", but takes an SV * for the name instead
2566 of separate char * and length parameters. "flags" is currently
2567 unused.
2568
2569 void gv_init_sv(GV* gv, HV* stash, SV* namesv,
2570 U32 flags)
2571
2572 gv_stashpv
2573 Returns a pointer to the stash for a specified package. Uses
2574 "strlen" to determine the length of "name", then calls
2575 "gv_stashpvn()".
2576
2577 HV* gv_stashpv(const char* name, I32 flags)
2578
2579 gv_stashpvn
2580 Returns a pointer to the stash for a specified package. The
2581 "namelen" parameter indicates the length of the "name", in
2582 bytes. "flags" is passed to "gv_fetchpvn_flags()", so if set
2583 to "GV_ADD" then the package will be created if it does not
2584 already exist. If the package does not exist and "flags" is 0
2585 (or any other setting that does not create packages) then
2586 "NULL" is returned.
2587
2588 Flags may be one of:
2589
2590 GV_ADD
2591 SVf_UTF8
2592 GV_NOADD_NOINIT
2593 GV_NOINIT
2594 GV_NOEXPAND
2595 GV_ADDMG
2596
2597 The most important of which are probably "GV_ADD" and
2598 "SVf_UTF8".
2599
2600 Note, use of "gv_stashsv" instead of "gv_stashpvn" where
2601 possible is strongly recommended for performance reasons.
2602
2603 HV* gv_stashpvn(const char* name, U32 namelen,
2604 I32 flags)
2605
2606 gv_stashpvs
2607 Like "gv_stashpvn", but takes a literal string instead of a
2608 string/length pair.
2609
2610 HV* gv_stashpvs("literal string" name, I32 create)
2611
2612 gv_stashsv
2613 Returns a pointer to the stash for a specified package. See
2614 "gv_stashpvn".
2615
2616 Note this interface is strongly preferred over "gv_stashpvn"
2617 for performance reasons.
2618
2619 HV* gv_stashsv(SV* sv, I32 flags)
2620
2621 GvSV Return the SV from the GV.
2622
2623 SV* GvSV(GV* gv)
2624
2625 setdefout
2626 Sets "PL_defoutgv", the default file handle for output, to the
2627 passed in typeglob. As "PL_defoutgv" "owns" a reference on its
2628 typeglob, the reference count of the passed in typeglob is
2629 increased by one, and the reference count of the typeglob that
2630 "PL_defoutgv" points to is decreased by one.
2631
2632 void setdefout(GV* gv)
2633
2635 Nullav Null AV pointer.
2636
2637 (deprecated - use "(AV *)NULL" instead)
2638
2639 Nullch Null character pointer. (No longer available when "PERL_CORE"
2640 is defined.)
2641
2642 Nullcv Null CV pointer.
2643
2644 (deprecated - use "(CV *)NULL" instead)
2645
2646 Nullhv Null HV pointer.
2647
2648 (deprecated - use "(HV *)NULL" instead)
2649
2650 Nullsv Null SV pointer. (No longer available when "PERL_CORE" is
2651 defined.)
2652
2654 A HV structure represents a Perl hash. It consists mainly of an array
2655 of pointers, each of which points to a linked list of HE structures.
2656 The array is indexed by the hash function of the key, so each linked
2657 list represents all the hash entries with the same hash value. Each HE
2658 contains a pointer to the actual value, plus a pointer to a HEK
2659 structure which holds the key and hash value.
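
The layout described above can be pictured with a simplified standalone model. This is a sketch only: the struct names "hek_t", "he_t" and "hv_t", their fields, and the helper functions are hypothetical and much simpler than the real "HV", "HE" and "HEK". The model also assumes that the bucket is chosen by masking the hash with the (power of two) bucket count.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical simplified layouts; the real HV, HE and HEK differ. */
typedef struct { unsigned hash; size_t len; const char *key; } hek_t;
typedef struct he { struct he *next; hek_t *hek; void *val; } he_t;
typedef struct { he_t **buckets; size_t nbuckets; } hv_t; /* nbuckets: power of two */

/* Walk the one chain selected by the hash, comparing hash, length, bytes. */
static he_t *model_fetch(const hv_t *hv, const char *key, size_t len,
                         unsigned hash) {
    he_t *he = hv->buckets[hash & (hv->nbuckets - 1)];
    for (; he != NULL; he = he->next) {
        if (he->hek->hash == hash && he->hek->len == len
            && memcmp(he->hek->key, key, len) == 0)
            return he;
    }
    return NULL;
}

/* Push an already-built entry onto the front of its bucket's chain. */
static void model_insert(hv_t *hv, he_t *he) {
    size_t i = he->hek->hash & (hv->nbuckets - 1);
    he->next = hv->buckets[i];
    hv->buckets[i] = he;
}
```

Storing the hash value next to the key lets a lookup reject most non-matching entries in a chain with a single integer comparison before it compares any key bytes.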
2660
2661 cop_fetch_label
2662 NOTE: this function is experimental and may change or be
2663 removed without notice.
2664
2665 Returns the label attached to a cop. The flags pointer may be
2666 set to "SVf_UTF8" or 0.
2667
2668 const char * cop_fetch_label(COP *const cop,
2669 STRLEN *len, U32 *flags)
2670
2671 cop_store_label
2672 NOTE: this function is experimental and may change or be
2673 removed without notice.
2674
2675 Save a label into a "cop_hints_hash". You need to set flags to
2676 "SVf_UTF8" for a UTF-8 label.
2677
2678 void cop_store_label(COP *const cop,
2679 const char *label, STRLEN len,
2680 U32 flags)
2681
2682 get_hv Returns the HV of the specified Perl hash. "flags" are passed
2683 to "gv_fetchpv". If "GV_ADD" is set and the Perl variable does
2684 not exist then it will be created. If "flags" is zero and the
2685 variable does not exist then "NULL" is returned.
2686
2687 NOTE: the perl_ form of this function is deprecated.
2688
2689 HV* get_hv(const char *name, I32 flags)
2690
2691 HEf_SVKEY
2692 This flag, used in the length slot of hash entries and magic
2693 structures, specifies the structure contains an "SV*" pointer
2694 where a "char*" pointer is to be expected. (For information
2695 only--not to be used).
2696
2697 HeHASH Returns the computed hash stored in the hash entry.
2698
2699 U32 HeHASH(HE* he)
2700
2701 HeKEY Returns the actual pointer stored in the key slot of the hash
2702 entry. The pointer may be either "char*" or "SV*", depending
2703 on the value of "HeKLEN()". Can be assigned to. The "HePV()"
2704 or "HeSVKEY()" macros are usually preferable for finding the
2705 value of a key.
2706
2707 void* HeKEY(HE* he)
2708
2709 HeKLEN If this is negative, and amounts to "HEf_SVKEY", it indicates
2710 the entry holds an "SV*" key. Otherwise, holds the actual
2711 length of the key. Can be assigned to. The "HePV()" macro is
2712 usually preferable for finding key lengths.
2713
2714 STRLEN HeKLEN(HE* he)
2715
2716 HePV Returns the key slot of the hash entry as a "char*" value,
2717 doing any necessary dereferencing of possibly "SV*" keys. The
2718 length of the string is placed in "len" (this is a macro, so do
2719 not use &len). If you do not care about what the length of the
2720 key is, you may use the global variable "PL_na", though this is
2721 rather less efficient than using a local variable. Remember
2722 though, that hash keys in perl are free to contain embedded
2723 nulls, so using "strlen()" or similar is not a good way to find
2724 the length of hash keys. This is very similar to the "SvPV()"
2725 macro described elsewhere in this document. See also "HeUTF8".
2726
2727 If you are using "HePV" to get values to pass to "newSVpvn()"
2728 to create a new SV, you should consider using
2729 "newSVhek(HeKEY_hek(he))" as it is more efficient.
2730
2731 char* HePV(HE* he, STRLEN len)
2732
2733 HeSVKEY Returns the key as an "SV*", or "NULL" if the hash entry does
2734 not contain an "SV*" key.
2735
2736 SV* HeSVKEY(HE* he)
2737
2738 HeSVKEY_force
2739 Returns the key as an "SV*". Will create and return a
2740 temporary mortal "SV*" if the hash entry contains only a
2741 "char*" key.
2742
2743 SV* HeSVKEY_force(HE* he)
2744
2745 HeSVKEY_set
2746 Sets the key to a given "SV*", taking care to set the
2747 appropriate flags to indicate the presence of an "SV*" key, and
2748 returns the same "SV*".
2749
2750 SV* HeSVKEY_set(HE* he, SV* sv)
2751
2752 HeUTF8 Returns whether the "char *" value returned by "HePV" is
2753 encoded in UTF-8, doing any necessary dereferencing of possibly
2754 "SV*" keys. The value returned will be 0 or non-0, not
2755 necessarily 1 (or even a value with any low bits set), so do
2756 not blindly assign this to a "bool" variable, as "bool" may be
2757 a typedef for "char".
2758
2759 U32 HeUTF8(HE* he)
2760
2761 HeVAL Returns the value slot (type "SV*") stored in the hash entry.
2762 Can be assigned to.
2763
2764 SV *foo= HeVAL(he);
2765 HeVAL(he)= sv;
2766
2767
2768 SV* HeVAL(HE* he)
2769
2770 hv_assert
2771 Check that a hash is in an internally consistent state.
2772
2773 void hv_assert(HV *hv)
2774
2775 hv_bucket_ratio
2776 NOTE: this function is experimental and may change or be
2777 removed without notice.
2778
2779 If the hash is tied, dispatches through to the SCALAR tied
2780 method; otherwise, if the hash contains no keys, returns 0;
2781 otherwise returns a mortal SV containing a string specifying
2782 the number of used buckets, followed by a slash, followed by
2783 the number of available buckets.
2784
2785 This function is expensive: it must scan all of the buckets to
2786 determine which are used, and the count is NOT cached. In a
2787 large hash this could be a lot of buckets.
2788
2789 SV* hv_bucket_ratio(HV *hv)
2790
2791 hv_clear
2792 Frees all the elements of a hash, leaving it empty. The XS
2793 equivalent of "%hash = ()". See also "hv_undef".
2794
2795 See "av_clear" for a note about the hash possibly being invalid
2796 on return.
2797
2798 void hv_clear(HV *hv)
2799
2800 hv_clear_placeholders
2801 Clears any placeholders from a hash. If a restricted hash has
2802 any of its keys marked as readonly and the key is subsequently
2803 deleted, the key is not actually deleted but is marked by
2804 assigning it a value of &PL_sv_placeholder. This tags it so it
2805 will be ignored by future operations such as iterating over the
2806 hash, but will still allow the hash to have a value reassigned
2807 to the key at some future point. This function clears any such
2808 placeholder keys from the hash. See "Hash::Util::lock_keys()"
2809 for an example of its use.
2810
2811 void hv_clear_placeholders(HV *hv)
2812
2813 hv_copy_hints_hv
2814 A specialised version of "newHVhv" for copying "%^H". "ohv"
2815 must be a pointer to a hash (which may have "%^H" magic, but
2816 should be generally non-magical), or "NULL" (interpreted as an
2817 empty hash). The content of "ohv" is copied to a new hash,
2818 which has the "%^H"-specific magic added to it. A pointer to
2819 the new hash is returned.
2820
2821 HV * hv_copy_hints_hv(HV *ohv)
2822
2823 hv_delete
2824 Deletes a key/value pair in the hash. The value's SV is
2825 removed from the hash, made mortal, and returned to the caller.
2826 The absolute value of "klen" is the length of the key. If
2827 "klen" is negative the key is assumed to be in UTF-8-encoded
2828 Unicode. The "flags" value will normally be zero; if set to
2829 "G_DISCARD" then "NULL" will be returned. "NULL" will also be
2830 returned if the key is not found.
2831
2832 SV* hv_delete(HV *hv, const char *key, I32 klen,
2833 I32 flags)
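
The signed "klen" convention used by "hv_delete" (and by "hv_exists", "hv_fetch" and "hv_store") can be spelled out with a small standalone sketch. The type "key_info" and the helpers "decode_klen" and "encode_klen" are hypothetical and are not part of the Perl API; "int32_t" stands in for "I32".

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper type, not part of the Perl API. */
typedef struct { size_t len; int is_utf8; } key_info;

/* Split a signed klen into a byte length and a UTF-8 flag. */
static key_info decode_klen(int32_t klen) {
    key_info k;
    k.is_utf8 = klen < 0;
    /* Widen before negating so INT32_MIN cannot overflow. */
    k.len = (size_t)(klen < 0 ? -(int64_t)klen : (int64_t)klen);
    return k;
}

/* Inverse: build the klen argument from a length and an encoding flag.
 * Assumes len fits in 31 bits. */
static int32_t encode_klen(size_t len, int is_utf8) {
    return is_utf8 ? -(int32_t)len : (int32_t)len;
}
```

For example, a six-byte key that is UTF-8 encoded would be passed with a "klen" of -6, while the same six bytes treated as plain octets would be passed with a "klen" of 6.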
2834
2835 hv_delete_ent
2836 Deletes a key/value pair in the hash. The value SV is removed
2837 from the hash, made mortal, and returned to the caller. The
2838 "flags" value will normally be zero; if set to "G_DISCARD" then
2839 "NULL" will be returned. "NULL" will also be returned if the
2840 key is not found. "hash" can be a valid precomputed hash
2841 value, or 0 to ask for it to be computed.
2842
2843 SV* hv_delete_ent(HV *hv, SV *keysv, I32 flags,
2844 U32 hash)
2845
2846 HvENAME Returns the effective name of a stash, or NULL if there is
2847 none. The effective name represents a location in the symbol
2848 table where this stash resides. It is updated automatically
2849 when packages are aliased or deleted. A stash that is no
2850 longer in the symbol table has no effective name. This name is
2851 preferable to "HvNAME" for use in MRO linearisations and isa
2852 caches.
2853
2854 char* HvENAME(HV* stash)
2855
2856 HvENAMELEN
2857 Returns the length of the stash's effective name.
2858
2859 STRLEN HvENAMELEN(HV *stash)
2860
2861 HvENAMEUTF8
2862 Returns true if the effective name is in UTF-8 encoding.
2863
2864 unsigned char HvENAMEUTF8(HV *stash)
2865
2866 hv_exists
2867 Returns a boolean indicating whether the specified hash key
2868 exists. The absolute value of "klen" is the length of the key.
2869 If "klen" is negative the key is assumed to be in UTF-8-encoded
2870 Unicode.
2871
2872 bool hv_exists(HV *hv, const char *key, I32 klen)
2873
2874 hv_exists_ent
2875 Returns a boolean indicating whether the specified hash key
2876 exists. "hash" can be a valid precomputed hash value, or 0 to
2877 ask for it to be computed.
2878
2879 bool hv_exists_ent(HV *hv, SV *keysv, U32 hash)
2880
2881 hv_fetch
2882 Returns the SV which corresponds to the specified key in the
2883 hash. The absolute value of "klen" is the length of the key.
2884 If "klen" is negative the key is assumed to be in UTF-8-encoded
2885 Unicode. If "lval" is set then the fetch will be part of a
2886 store. This means that if there is no value in the hash
2887 associated with the given key, then one is created and a
2888 pointer to it is returned. The "SV*" it points to can be
2889 assigned to. But always check that the return value is non-
2890 null before dereferencing it to an "SV*".
2891
2892 See "Understanding the Magic of Tied Hashes and Arrays" in
2893 perlguts for more information on how to use this function on
2894 tied hashes.
2895
2896 SV** hv_fetch(HV *hv, const char *key, I32 klen,
2897 I32 lval)
2898
2899 hv_fetchs
2900 Like "hv_fetch", but takes a literal string instead of a
2901 string/length pair.
2902
2903 SV** hv_fetchs(HV* tb, "literal string" key,
2904 I32 lval)
2905
2906 hv_fetch_ent
2907 Returns the hash entry which corresponds to the specified key
2908 in the hash. "hash" must be a valid precomputed hash number
2909 for the given "key", or 0 if you want the function to compute
2910 it. If "lval" is set then the fetch will be part of a store.
2911 Make sure the return value is non-null before accessing it.
2912 The return value when "hv" is a tied hash is a pointer to a
2913 static location, so be sure to make a copy of the structure if
2914 you need to store it somewhere.
2915
2916 See "Understanding the Magic of Tied Hashes and Arrays" in
2917 perlguts for more information on how to use this function on
2918 tied hashes.
2919
2920 HE* hv_fetch_ent(HV *hv, SV *keysv, I32 lval,
2921 U32 hash)
2922
2923 hv_fill Returns the number of hash buckets that happen to be in use.
2924
2925 This function is wrapped by the macro "HvFILL".
2926
2927 As of perl 5.25 this function is used only for debugging
2928 purposes, and the number of used hash buckets is not in any way
2929 cached, thus this function can be costly to execute as it must
2930 iterate over all the buckets in the hash.
2931
2932 STRLEN hv_fill(HV *const hv)
2933
2934 hv_iterinit
2935 Prepares a starting point to traverse a hash table. Returns
2936 the number of keys in the hash, including placeholders (i.e.
2937 the same as "HvTOTALKEYS(hv)"). The return value is currently
2938 only meaningful for hashes without tie magic.
2939
2940 NOTE: Before version 5.004_65, "hv_iterinit" used to return the
2941 number of hash buckets that happen to be in use. If you still
2942 need that esoteric value, you can get it through the macro
2943 "HvFILL(hv)".
2944
2945 I32 hv_iterinit(HV *hv)
2946
2947 hv_iterkey
2948 Returns the key from the current position of the hash iterator.
2949 See "hv_iterinit".
2950
2951 char* hv_iterkey(HE* entry, I32* retlen)
2952
2953 hv_iterkeysv
2954 Returns the key as an "SV*" from the current position of the
2955 hash iterator. The return value will always be a mortal copy
2956 of the key. Also see "hv_iterinit".
2957
2958 SV* hv_iterkeysv(HE* entry)
2959
2960 hv_iternext
2961 Returns entries from a hash iterator. See "hv_iterinit".
2962
2963 You may call "hv_delete" or "hv_delete_ent" on the hash entry
2964 that the iterator currently points to, without losing your
2965 place or invalidating your iterator. Note that in this case
2966 the current entry is deleted from the hash with your iterator
2967 holding the last reference to it. Your iterator is flagged to
2968 free the entry on the next call to "hv_iternext", so you must
2969 not discard your iterator immediately else the entry will leak
2970 - call "hv_iternext" to trigger the resource deallocation.
2971
2972 HE* hv_iternext(HV *hv)
2973
2974 hv_iternextsv
2975 Performs an "hv_iternext", "hv_iterkey", and "hv_iterval" in
2976 one operation.
2977
2978 SV* hv_iternextsv(HV *hv, char **key, I32 *retlen)
2979
2980 hv_iternext_flags
2981 NOTE: this function is experimental and may change or be
2982 removed without notice.
2983
2984 Returns entries from a hash iterator. See "hv_iterinit" and
2985 "hv_iternext". The "flags" value will normally be zero; if
2986 "HV_ITERNEXT_WANTPLACEHOLDERS" is set the placeholder keys
2987 (for restricted hashes) will be returned in addition to normal
2988 keys. By default placeholders are automatically skipped over.
2989 Currently a placeholder is implemented with a value that is
2990 &PL_sv_placeholder. Note that the implementation of
2991 placeholders and restricted hashes may change, and the
2992 implementation currently is insufficiently abstracted for any
2993 change to be tidy.
2994
2995 HE* hv_iternext_flags(HV *hv, I32 flags)
2996
2997 hv_iterval
2998 Returns the value from the current position of the hash
2999 iterator. See "hv_iterkey".
3000
3001 SV* hv_iterval(HV *hv, HE *entry)
3002
3003 hv_magic
3004 Adds magic to a hash. See "sv_magic".
3005
3006 void hv_magic(HV *hv, GV *gv, int how)
3007
3008 HvNAME Returns the package name of a stash, or "NULL" if "stash" isn't
3009 a stash. See "SvSTASH", "CvSTASH".
3010
3011 char* HvNAME(HV* stash)
3012
3013 HvNAMELEN
3014 Returns the length of the stash's name.
3015
3016 STRLEN HvNAMELEN(HV *stash)
3017
3018 HvNAMEUTF8
3019 Returns true if the name is in UTF-8 encoding.
3020
3021 unsigned char HvNAMEUTF8(HV *stash)
3022
3023 hv_scalar
3024 Evaluates the hash in scalar context and returns the result.
3025
3026 When the hash is tied, dispatches through to the SCALAR method;
3027 otherwise returns a mortal SV containing the number of keys in
3028 the hash.
3029
3030 Note, prior to 5.25 this function returned what is now returned
3031 by the hv_bucket_ratio() function.
3032
3033 SV* hv_scalar(HV *hv)
3034
3035 hv_store
3036 Stores an SV in a hash. The hash key is specified as "key" and
3037 the absolute value of "klen" is the length of the key. If
3038 "klen" is negative the key is assumed to be in UTF-8-encoded
3039 Unicode. The "hash" parameter is the precomputed hash value;
3040 if it is zero then Perl will compute it.
3041
3042 The return value will be "NULL" if the operation failed or if
3043 the value did not need to be actually stored within the hash
3044 (as in the case of tied hashes). Otherwise it can be
3045 dereferenced to get the original "SV*". Note that the caller
3046 is responsible for suitably incrementing the reference count of
3047 "val" before the call, and decrementing it if the function
3048 returned "NULL". Effectively a successful "hv_store" takes
3049 ownership of one reference to "val". This is usually what you
3050 want; a newly created SV has a reference count of one, so if
3051 all your code does is create SVs then store them in a hash,
3052 "hv_store" will own the only reference to the new SV, and your
3053 code doesn't need to do anything further to tidy up.
3054 "hv_store" is not implemented as a call to "hv_store_ent", and
3055 does not create a temporary SV for the key, so if your key data
3056 is not already in SV form then use "hv_store" in preference to
3057 "hv_store_ent".
3058
3059 See "Understanding the Magic of Tied Hashes and Arrays" in
3060 perlguts for more information on how to use this function on
3061 tied hashes.
3062
3063 SV** hv_store(HV *hv, const char *key, I32 klen,
3064 SV *val, U32 hash)
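
The ownership rule described above (a successful store takes over one reference; on a "NULL" return the caller still owns it and must release it) can be modelled in a few lines of standalone C. Everything here is a hypothetical stand-in: "val_t" plays the role of an SV with a reference count, "box_t" is a one-slot container that can be marked as tied, and "model_store" imitates only the return-value behaviour of "hv_store".

```c
#include <stddef.h>

typedef struct { int refcnt; } val_t;            /* hypothetical stand-in for SV */
typedef struct { val_t *slot; int tied; } box_t; /* hypothetical one-slot "hash" */

/* Drop one reference; a null pointer is ignored. */
static void val_dec(val_t *v) {
    if (v != NULL)
        v->refcnt--;
}

/* Returns NULL without taking ownership when the store does not land. */
static val_t **model_store(box_t *b, val_t *v) {
    if (b->tied)
        return NULL;
    val_dec(b->slot);  /* release the value being replaced */
    b->slot = v;       /* the box now owns one reference to v */
    return &b->slot;
}

/* Caller-side idiom: hand over one reference, take it back on failure. */
static int store_or_release(box_t *b, val_t *v) {
    if (model_store(b, v) == NULL) {
        val_dec(v);
        return 0;
    }
    return 1;
}
```

In real XS code the same shape is usually written as a call to "hv_store" whose "NULL" result is followed by "SvREFCNT_dec" on the value.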
3065
3066 hv_stores
3067 Like "hv_store", but takes a literal string instead of a
3068 string/length pair and omits the hash parameter.
3069
3070 SV** hv_stores(HV* tb, "literal string" key, SV* val)
3071
3072 hv_store_ent
3073 Stores "val" in a hash. The hash key is specified as "key".
3074 The "hash" parameter is the precomputed hash value; if it is
3075 zero then Perl will compute it. The return value is the new
3076 hash entry so created. It will be "NULL" if the operation
3077 failed or if the value did not need to be actually stored
3078 within the hash (as in the case of tied hashes). Otherwise the
3079 contents of the return value can be accessed using the "He?"
3080 macros described here. Note that the caller is responsible for
3081 suitably incrementing the reference count of "val" before the
3082 call, and decrementing it if the function returned NULL.
3083 Effectively a successful "hv_store_ent" takes ownership of one
3084 reference to "val". This is usually what you want; a newly
3085 created SV has a reference count of one, so if all your code
3086 does is create SVs then store them in a hash, "hv_store_ent" will
3087 own the only reference to the new SV, and your code doesn't
3088 need to do anything further to tidy up. Note that
3089 "hv_store_ent" only reads the "key"; unlike "val" it does not
3090 take ownership of it, so maintaining the correct reference
3091 count on "key" is entirely the caller's responsibility. The
3092 reason it does not take ownership is that "key" is not used
3093 after this function returns, and so can be freed immediately.
3094 "hv_store" is not implemented as a call to "hv_store_ent", and
3095 does not create a temporary SV for the key, so if your key data
3096 is not already in SV form then use "hv_store" in preference to
3097 "hv_store_ent".
3098
3099 See "Understanding the Magic of Tied Hashes and Arrays" in
3100 perlguts for more information on how to use this function on
3101 tied hashes.
3102
3103 HE* hv_store_ent(HV *hv, SV *key, SV *val, U32 hash)
3104
3105 hv_undef
3106 Undefines the hash. The XS equivalent of "undef(%hash)".
3107
3108 As well as freeing all the elements of the hash (like
3109 "hv_clear()"), this also frees any auxiliary data and storage
3110 associated with the hash.
3111
3112 See "av_clear" for a note about the hash possibly being invalid
3113 on return.
3114
3115 void hv_undef(HV *hv)
3116
3117 newHV Creates a new HV. The reference count is set to 1.
3118
3119 HV* newHV()
3120
3122 These functions provide convenient and thread-safe means of
3123 manipulating hook variables.
3124
3125 wrap_op_checker
3126 Puts a C function into the chain of check functions for a
3127 specified op type. This is the preferred way to manipulate the
3128 "PL_check" array. "opcode" specifies which type of op is to be
3129 affected. "new_checker" is a pointer to the C function that is
3130 to be added to that opcode's check chain, and "old_checker_p"
3131 points to the storage location where a pointer to the next
3132 function in the chain will be stored. The value of
3133 "new_checker" is written into the "PL_check" array, while the
3134 value previously stored there is written to *old_checker_p.
3135
3136 "PL_check" is global to an entire process, and a module wishing
3137 to hook op checking may find itself invoked more than once per
3138 process, typically in different threads. To handle that
3139 situation, this function is idempotent. The location
3140 *old_checker_p must initially (once per process) contain a null
3141 pointer. A C variable of static duration (declared at file
3142 scope, typically also marked "static" to give it internal
3143 linkage) will be implicitly initialised appropriately, if it
3144 does not have an explicit initialiser. This function will only
3145 actually modify the check chain if it finds *old_checker_p to
3146 be null. This function is also thread safe on the small scale.
3147 It uses appropriate locking to avoid race conditions in
3148 accessing "PL_check".
3149
3150 When this function is called, the function referenced by
3151 "new_checker" must be ready to be called, except for
3152 *old_checker_p being unfilled. In a threading situation,
3153 "new_checker" may be called immediately, even before this
3154 function has returned. *old_checker_p will always be
3155 appropriately set before "new_checker" is called. If
3156 "new_checker" decides not to do anything special with an op
3157 that it is given (which is the usual case for most uses of op
3158 check hooking), it must chain the check function referenced by
3159 *old_checker_p.
3160
3161 Taken all together, XS code to hook an op checker should
3162 typically look something like this:
3163
3164 static Perl_check_t nxck_frob;
3165 static OP *myck_frob(pTHX_ OP *op) {
3166 ...
3167 op = nxck_frob(aTHX_ op);
3168 ...
3169 return op;
3170 }
3171 BOOT:
3172 wrap_op_checker(OP_FROB, myck_frob, &nxck_frob);
3173
3174 If you want to influence compilation of calls to a specific
3175 subroutine, then use "cv_set_call_checker_flags" rather than
3176 hooking checking of all "entersub" ops.
3177
3178 void wrap_op_checker(Optype opcode,
3179 Perl_check_t new_checker,
3180 Perl_check_t *old_checker_p)
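
The idempotence rule described above (the chain is modified only while *old_checker_p is still null) can be seen in a small standalone model. This is a sketch, not the real implementation: "check_fn", "check_slot", "core_check" and "model_wrap" are hypothetical stand-ins for "Perl_check_t", one element of "PL_check", the core's checker and "wrap_op_checker", and the locking that the real function performs is omitted because the model is single-threaded.

```c
#include <stddef.h>

/* Hypothetical stand-ins for Perl_check_t and one PL_check[] slot. */
typedef int (*check_fn)(int op);

static int core_check(int op) { return op; }
static check_fn check_slot = core_check;

/* Installs new_fn only if *old_p is still null, so repeat calls are no-ops.
 * The real function also locks around this; the model is single-threaded. */
static void model_wrap(check_fn new_fn, check_fn *old_p) {
    if (*old_p != NULL)
        return;
    *old_p = check_slot;   /* fill in the next link first ... */
    check_slot = new_fn;   /* ... then publish the new head of the chain */
}

static check_fn next_check;          /* static storage: starts out null */

/* Chains to the previous checker, then adds its own effect. */
static int my_check(int op) {
    return next_check(op) + 1;
}
```

If the second installation were not ignored, "next_check" would end up pointing at "my_check" itself and the chain would recurse without end, which is why the saved pointer must start out null and be written only once per process.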
3181
3183 This is the lower layer of the Perl parser, managing characters and
3184 tokens.
3185
3186 lex_bufutf8
3187 NOTE: this function is experimental and may change or be
3188 removed without notice.
3189
3190 Indicates whether the octets in the lexer buffer
3191 ("PL_parser->linestr") should be interpreted as the UTF-8
3192 encoding of Unicode characters. If not, they should be
3193 interpreted as Latin-1 characters. This is analogous to the
3194 "SvUTF8" flag for scalars.
3195
3196 In UTF-8 mode, it is not guaranteed that the lexer buffer
3197 actually contains valid UTF-8. Lexing code must be robust in
3198 the face of invalid encoding.
3199
3200 The actual "SvUTF8" flag of the "PL_parser->linestr" scalar is
3201 significant, but not the whole story regarding the input
3202 character encoding. Normally, when a file is being read, the
3203 scalar contains octets and its "SvUTF8" flag is off, but the
3204 octets should be interpreted as UTF-8 if the "use utf8" pragma
3205 is in effect. During a string eval, however, the scalar may
3206 have the "SvUTF8" flag on, and in this case its octets should
3207 be interpreted as UTF-8 unless the "use bytes" pragma is in
3208 effect. This logic may change in the future; use this function
3209 instead of implementing the logic yourself.
3210
3211 bool lex_bufutf8()
3212
3213 lex_discard_to
3214 NOTE: this function is experimental and may change or be
3215 removed without notice.
3216
3217 Discards the first part of the "PL_parser->linestr" buffer, up
3218 to "ptr". The remaining content of the buffer will be moved,
3219 and all pointers into the buffer updated appropriately. "ptr"
3220 must not be later in the buffer than the position of
3221 "PL_parser->bufptr": it is not permitted to discard text that
3222 has yet to be lexed.
3223
3224 Normally it is not necessary to do this directly, because it
3225 suffices to use the implicit discarding behaviour of
3226 "lex_next_chunk" and things based on it. However, if a token
3227 stretches across multiple lines, and the lexing code has kept
3228 multiple lines of text in the buffer for that purpose, then
3229 after completion of the token it would be wise to explicitly
3230 discard the now-unneeded earlier lines, to avoid future multi-
3231 line tokens growing the buffer without bound.
3232
3233 void lex_discard_to(char *ptr)
3234
3235 lex_grow_linestr
3236 NOTE: this function is experimental and may change or be
3237 removed without notice.
3238
3239 Reallocates the lexer buffer ("PL_parser->linestr") to
3240 accommodate at least "len" octets (including terminating
3241 "NUL"). Returns a pointer to the reallocated buffer. This is
3242 necessary before making any direct modification of the buffer
3243 that would increase its length. "lex_stuff_pvn" provides a
3244 more convenient way to insert text into the buffer.
3245
3246 Do not use "SvGROW" or "sv_grow" directly on
3247 "PL_parser->linestr"; this function updates all of the lexer's
3248 variables that point directly into the buffer.
3249
3250 char * lex_grow_linestr(STRLEN len)
3251
3252 lex_next_chunk
3253 NOTE: this function is experimental and may change or be
3254 removed without notice.
3255
3256 Reads in the next chunk of text to be lexed, appending it to
3257 "PL_parser->linestr". This should be called when lexing code
3258 has looked to the end of the current chunk and wants to know
3259 more. It is usual, but not necessary, for lexing to have
3260 consumed the entirety of the current chunk at this time.
3261
3262 If "PL_parser->bufptr" is pointing to the very end of the
3263 current chunk (i.e., the current chunk has been entirely
3264 consumed), normally the current chunk will be discarded at the
3265 same time that the new chunk is read in. If "flags" has the
3266 "LEX_KEEP_PREVIOUS" bit set, the current chunk will not be
3267 discarded. If the current chunk has not been entirely
3268 consumed, then it will not be discarded regardless of the flag.
3269
3270 Returns true if some new text was added to the buffer, or false
3271 if the buffer has reached the end of the input text.
3272
3273 bool lex_next_chunk(U32 flags)
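
The discard rule described above reduces to a small truth table. The following is a minimal sketch in plain C, not Perl's implementation: "toy_discards_current_chunk" and "TOY_KEEP_PREVIOUS" are hypothetical names, and the flag value is a placeholder rather than the real value of "LEX_KEEP_PREVIOUS".

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for LEX_KEEP_PREVIOUS; the value is a placeholder. */
#define TOY_KEEP_PREVIOUS 0x1u

/* Returns true if the current chunk would be discarded when the next
 * chunk is read: only when the chunk has been entirely consumed AND
 * the keep-previous bit is not set in flags. */
bool toy_discards_current_chunk(size_t bufptr_offset, size_t chunk_len,
                                unsigned flags)
{
    bool fully_consumed = (bufptr_offset == chunk_len);
    return fully_consumed && !(flags & TOY_KEEP_PREVIOUS);
}
```

In this model, a partially consumed chunk is kept whatever the flag says, which matches the last sentence of the rule.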
3274
3275 lex_peek_unichar
3276 NOTE: this function is experimental and may change or be
3277 removed without notice.
3278
3279 Looks ahead one (Unicode) character in the text currently being
3280 lexed. Returns the codepoint (unsigned integer value) of the
3281 next character, or -1 if lexing has reached the end of the
3282 input text. To consume the peeked character, use
3283 "lex_read_unichar".
3284
3285 If the next character is in (or extends into) the next chunk of
3286 input text, the next chunk will be read in. Normally the
3287 current chunk will be discarded at the same time, but if
3288 "flags" has the "LEX_KEEP_PREVIOUS" bit set, then the current
3289 chunk will not be discarded.
3290
3291 If the input is being interpreted as UTF-8 and a UTF-8 encoding
3292 error is encountered, an exception is generated.
3293
3294 I32 lex_peek_unichar(U32 flags)
3295
3296 lex_read_space
3297 NOTE: this function is experimental and may change or be
3298 removed without notice.
3299
3300 Reads optional spaces, in Perl style, in the text currently
3301 being lexed. The spaces may include ordinary whitespace
3302 characters and Perl-style comments. "#line" directives are
3303 processed if encountered. "PL_parser->bufptr" is moved past
3304 the spaces, so that it points at a non-space character (or the
3305 end of the input text).
3306
3307 If spaces extend into the next chunk of input text, the next
3308 chunk will be read in. Normally the current chunk will be
3309 discarded at the same time, but if "flags" has the
3310 "LEX_KEEP_PREVIOUS" bit set, then the current chunk will not be
3311 discarded.
3312
3313 void lex_read_space(U32 flags)
3314
3315 lex_read_to
3316 NOTE: this function is experimental and may change or be
3317 removed without notice.
3318
3319 Consume text in the lexer buffer, from "PL_parser->bufptr" up
3320 to "ptr". This advances "PL_parser->bufptr" to match "ptr",
3321 performing the correct bookkeeping whenever a newline character
3322 is passed. This is the normal way to consume lexed text.
3323
3324 Interpretation of the buffer's octets can be abstracted out by
3325 using the slightly higher-level functions "lex_peek_unichar"
3326 and "lex_read_unichar".
3327
3328 void lex_read_to(char *ptr)
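
To illustrate the kind of bookkeeping meant here, the following is a simplified model in plain C. It is a sketch under assumptions, not the Perl implementation: "struct toy_lexer", its "line" counter, and "toy_read_to" are hypothetical names, and the real function may do more than track the line start.

```c
#include <stddef.h>

/* Hypothetical miniature of the lexer state touched when text is consumed. */
struct toy_lexer {
    const char *bufptr;     /* current lexing position          */
    const char *linestart;  /* start of the current line        */
    unsigned    line;       /* assumed line counter (1-based)   */
};

/* Consume text from lx->bufptr up to (not including) ptr, updating the
 * line bookkeeping each time a newline is passed. */
void toy_read_to(struct toy_lexer *lx, const char *ptr)
{
    const char *p = lx->bufptr;
    while (p < ptr) {
        char c = *p;
        p++;
        if (c == '\n') {
            lx->line++;
            lx->linestart = p;  /* next line starts after the newline */
        }
    }
    lx->bufptr = ptr;
}
```

With this state, the column of the current position is simply "bufptr - linestart", which is why the line start has to be kept accurate.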
3329
3330 lex_read_unichar
3331 NOTE: this function is experimental and may change or be
3332 removed without notice.
3333
3334 Reads the next (Unicode) character in the text currently being
3335 lexed. Returns the codepoint (unsigned integer value) of the
3336 character read, and moves "PL_parser->bufptr" past the
3337 character, or returns -1 if lexing has reached the end of the
3338 input text. To non-destructively examine the next character,
3339 use "lex_peek_unichar" instead.
3340
3341 If the next character is in (or extends into) the next chunk of
3342 input text, the next chunk will be read in. Normally the
3343 current chunk will be discarded at the same time, but if
3344 "flags" has the "LEX_KEEP_PREVIOUS" bit set, then the current
3345 chunk will not be discarded.
3346
3347 If the input is being interpreted as UTF-8 and a UTF-8 encoding
3348 error is encountered, an exception is generated.
3349
3350 I32 lex_read_unichar(U32 flags)
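
The relationship between peeking and reading can be modelled in a few lines of plain C. This is only a sketch: "struct toy_input", "toy_peek" and "toy_read" are hypothetical names, each octet is treated as one character (Latin-1 style), and neither chunk refilling nor UTF-8 decoding is modelled.

```c
#include <stddef.h>

/* Hypothetical miniature of a lexer buffer: current position and end. */
struct toy_input {
    const unsigned char *bufptr;
    const unsigned char *bufend;
};

/* Look at the next character without consuming it; -1 at end of input. */
int toy_peek(const struct toy_input *in)
{
    return in->bufptr < in->bufend ? (int)*in->bufptr : -1;
}

/* Return the next character and move past it; -1 at end of input. */
int toy_read(struct toy_input *in)
{
    int c = toy_peek(in);
    if (c != -1)
        in->bufptr++;
    return c;
}
```

Repeated peeks return the same character, whereas each read advances the position until -1 signals the end of the input.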
3351
3352 lex_start
3353 NOTE: this function is experimental and may change or be
3354 removed without notice.
3355
3356 Creates and initialises a new lexer/parser state object,
3357 supplying a context in which to lex and parse from a new source
3358 of Perl code. A pointer to the new state object is placed in
3359 "PL_parser". An entry is made on the save stack so that upon
3360 unwinding, the new state object will be destroyed and the
3361 former value of "PL_parser" will be restored. Nothing else
3362 need be done to clean up the parsing context.
3363
3364 The code to be parsed comes from "line" and "rsfp". "line", if
3365 non-null, provides a string (in SV form) containing code to be
3366 parsed. A copy of the string is made, so subsequent
3367 modification of "line" does not affect parsing. "rsfp", if
3368 non-null, provides an input stream from which code will be read
3369 to be parsed. If both are non-null, the code in "line" comes
3370 first and must consist of complete lines of input, and "rsfp"
3371 supplies the remainder of the source.
3372
3373 The "flags" parameter is reserved for future use. Currently it
3374 is only used by perl internally, so extensions should always
3375 pass zero.
3376
3377 void lex_start(SV *line, PerlIO *rsfp, U32 flags)
3378
3379 lex_stuff_pv
3380 NOTE: this function is experimental and may change or be
3381 removed without notice.
3382
3383 Insert characters into the lexer buffer ("PL_parser->linestr"),
3384 immediately after the current lexing point
3385 ("PL_parser->bufptr"), reallocating the buffer if necessary.
3386 This means that lexing code that runs later will see the
3387 characters as if they had appeared in the input. It is not
3388 recommended to do this as part of normal parsing, and most uses
3389 of this facility run the risk of the inserted characters being
3390 interpreted in an unintended manner.
3391
3392 The string to be inserted is represented by octets starting at
3393 "pv" and continuing to the first nul. These octets are
3394 interpreted as either UTF-8 or Latin-1, according to whether
3395 the "LEX_STUFF_UTF8" flag is set in "flags". The characters
3396 are recoded for the lexer buffer, according to how the buffer
3397 is currently being interpreted ("lex_bufutf8"). If it is not
3398 convenient to nul-terminate a string to be inserted, the
3399 "lex_stuff_pvn" function is more appropriate.
3400
3401 void lex_stuff_pv(const char *pv, U32 flags)
3402
3403 lex_stuff_pvn
3404 NOTE: this function is experimental and may change or be
3405 removed without notice.
3406
3407 Insert characters into the lexer buffer ("PL_parser->linestr"),
3408 immediately after the current lexing point
3409 ("PL_parser->bufptr"), reallocating the buffer if necessary.
3410 This means that lexing code that runs later will see the
3411 characters as if they had appeared in the input. It is not
3412 recommended to do this as part of normal parsing, and most uses
3413 of this facility run the risk of the inserted characters being
3414 interpreted in an unintended manner.
3415
3416 The string to be inserted is represented by "len" octets
3417 starting at "pv". These octets are interpreted as either UTF-8
3418 or Latin-1, according to whether the "LEX_STUFF_UTF8" flag is
3419 set in "flags". The characters are recoded for the lexer
3420 buffer, according to how the buffer is currently being
3421 interpreted ("lex_bufutf8"). If a string to be inserted is
3422 available as a Perl scalar, the "lex_stuff_sv" function is more
3423 convenient.
3424
3425 void lex_stuff_pvn(const char *pv, STRLEN len,
3426 U32 flags)
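
As an illustration of what recoding involves, the following plain C sketch converts octets given as Latin-1 into UTF-8, the case that arises when "LEX_STUFF_UTF8" is not set but the buffer is being interpreted as UTF-8. "latin1_to_utf8" is a hypothetical helper written for this example; it is not part of the Perl API and is not how Perl performs the conversion internally.

```c
#include <stddef.h>

/* Recode len Latin-1 octets from src into UTF-8 in dst.
 * dst must have room for at least 2 * len octets.
 * Returns the number of octets written. */
size_t latin1_to_utf8(const unsigned char *src, size_t len,
                      unsigned char *dst)
{
    size_t out = 0;
    for (size_t i = 0; i < len; i++) {
        unsigned char c = src[i];
        if (c < 0x80) {
            /* ASCII is encoded identically in both forms. */
            dst[out++] = c;
        } else {
            /* Code points 0x80..0xFF take two octets in UTF-8. */
            dst[out++] = (unsigned char)(0xC0 | (c >> 6));
            dst[out++] = (unsigned char)(0x80 | (c & 0x3F));
        }
    }
    return out;
}
```

The inserted text can therefore occupy more octets in the buffer than "len" suggests, which is one reason the buffer may need to be reallocated.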
3427
3428 lex_stuff_pvs
3429 NOTE: this function is experimental and may change or be
3430 removed without notice.
3431
3432 Like "lex_stuff_pvn", but takes a literal string instead of a
3433 string/length pair.
3434
3435 void lex_stuff_pvs("literal string" pv, U32 flags)
3436
3437 lex_stuff_sv
3438 NOTE: this function is experimental and may change or be
3439 removed without notice.
3440
3441 Insert characters into the lexer buffer ("PL_parser->linestr"),
3442 immediately after the current lexing point
3443 ("PL_parser->bufptr"), reallocating the buffer if necessary.
3444 This means that lexing code that runs later will see the
3445 characters as if they had appeared in the input. It is not
3446 recommended to do this as part of normal parsing, and most uses
3447 of this facility run the risk of the inserted characters being
3448 interpreted in an unintended manner.
3449
3450 The string to be inserted is the string value of "sv". The
3451 characters are recoded for the lexer buffer, according to how
3452 the buffer is currently being interpreted ("lex_bufutf8"). If
3453 a string to be inserted is not already a Perl scalar, the
3454 "lex_stuff_pvn" function avoids the need to construct a scalar.
3455
3456 void lex_stuff_sv(SV *sv, U32 flags)
3457
3458 lex_unstuff
3459 NOTE: this function is experimental and may change or be
3460 removed without notice.
3461
3462 Discards text about to be lexed, from "PL_parser->bufptr" up to
3463 "ptr". Text following "ptr" will be moved, and the buffer
3464 shortened. This hides the discarded text from any lexing code
3465 that runs later, as if the text had never appeared.
3466
3467 This is not the normal way to consume lexed text. For that,
3468 use "lex_read_to".
3469
3470 void lex_unstuff(char *ptr)
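
The effect on the buffer can be pictured with a small plain C model. This is a sketch only: "struct toy_buf" and "toy_unstuff" are hypothetical names, offsets stand in for the real "char *" pointers, and none of the lexer's other pointer adjustments are modelled.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical miniature of the lexer buffer. */
struct toy_buf {
    char   data[64];  /* NUL-terminated contents             */
    size_t len;       /* length, not counting the NUL        */
    size_t pos;       /* current lexing position (offset)    */
};

/* Discard the text between b->pos and upto, moving the following
 * text (including the terminating NUL) down and shortening the buffer. */
void toy_unstuff(struct toy_buf *b, size_t upto)
{
    size_t removed = upto - b->pos;
    memmove(b->data + b->pos, b->data + upto, b->len - upto + 1);
    b->len -= removed;
}
```

After the call, later code starting at "pos" sees the text that followed the discarded span, as if the span had never been there.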
3471
3472 parse_arithexpr
3473 NOTE: this function is experimental and may change or be
3474 removed without notice.
3475
3476 Parse a Perl arithmetic expression. This may contain operators
3477 of precedence down to the bit shift operators. The expression
3478 must be followed (and thus terminated) either by a comparison
3479 or lower-precedence operator or by something that would
3480 normally terminate an expression such as semicolon. If "flags"
3481 has the "PARSE_OPTIONAL" bit set, then the expression is
3482 optional, otherwise it is mandatory. It is up to the caller to
3483 ensure that the dynamic parser state ("PL_parser" et al) is
3484 correctly set to reflect the source of the code to be parsed
3485 and the lexical context for the expression.
3486
3487 The op tree representing the expression is returned. If an
3488 optional expression is absent, a null pointer is returned,
3489 otherwise the pointer will be non-null.
3490
3491 If an error occurs in parsing or compilation, in most cases a
3492 valid op tree is returned anyway. The error is reflected in
3493 the parser state, normally resulting in a single exception at
3494 the top level of parsing which covers all the compilation
3495 errors that occurred. Some compilation errors, however, will
3496 throw an exception immediately.
3497
3498 OP * parse_arithexpr(U32 flags)
3499
3500 parse_barestmt
3501 NOTE: this function is experimental and may change or be
3502 removed without notice.
3503
3504 Parse a single unadorned Perl statement. This may be a normal
3505 imperative statement or a declaration that has compile-time
3506 effect. It does not include any label or other affixture. It
3507 is up to the caller to ensure that the dynamic parser state
3508 ("PL_parser" et al) is correctly set to reflect the source of
3509 the code to be parsed and the lexical context for the
3510 statement.
3511
3512 The op tree representing the statement is returned. This may
3513 be a null pointer if the statement is null, for example if it
3514 was actually a subroutine definition (which has compile-time
3515 side effects). If not null, it will be ops directly
3516 implementing the statement, suitable to pass to "newSTATEOP".
3517 It will not normally include a "nextstate" or equivalent op
3518 (except for those embedded in a scope contained entirely within
3519 the statement).
3520
3521 If an error occurs in parsing or compilation, in most cases a
3522 valid op tree (most likely null) is returned anyway. The error
3523 is reflected in the parser state, normally resulting in a
3524 single exception at the top level of parsing which covers all
3525 the compilation errors that occurred. Some compilation errors,
3526 however, will throw an exception immediately.
3527
3528 The "flags" parameter is reserved for future use, and must
3529 always be zero.
3530
3531 OP * parse_barestmt(U32 flags)
3532
3533 parse_block
3534 NOTE: this function is experimental and may change or be
3535 removed without notice.
3536
3537 Parse a single complete Perl code block. This consists of an
3538 opening brace, a sequence of statements, and a closing brace.
3539 The block constitutes a lexical scope, so "my" variables and
3540 various compile-time effects can be contained within it. It is
3541 up to the caller to ensure that the dynamic parser state
3542 ("PL_parser" et al) is correctly set to reflect the source of
3543 the code to be parsed and the lexical context for the
3544 statement.
3545
3546 The op tree representing the code block is returned. This is
3547 always a real op, never a null pointer. It will normally be a
3548 "lineseq" list, including "nextstate" or equivalent ops. No
3549 ops to construct any kind of runtime scope are included by
3550 virtue of it being a block.
3551
3552 If an error occurs in parsing or compilation, in most cases a
3553 valid op tree (most likely null) is returned anyway. The error
3554 is reflected in the parser state, normally resulting in a
3555 single exception at the top level of parsing which covers all
3556 the compilation errors that occurred. Some compilation errors,
3557 however, will throw an exception immediately.
3558
3559 The "flags" parameter is reserved for future use, and must
3560 always be zero.
3561
3562 OP * parse_block(U32 flags)
3563
3564 parse_fullexpr
3565 NOTE: this function is experimental and may change or be
3566 removed without notice.
3567
3568 Parse a single complete Perl expression. This allows the full
3569 expression grammar, including the lowest-precedence operators
3570 such as "or". The expression must be followed (and thus
3571 terminated) by a token that an expression would normally be
3572 terminated by: end-of-file, closing bracketing punctuation,
3573 semicolon, or one of the keywords that signals a postfix
3574 expression-statement modifier. If "flags" has the
3575 "PARSE_OPTIONAL" bit set, then the expression is optional,
3576 otherwise it is mandatory. It is up to the caller to ensure
3577 that the dynamic parser state ("PL_parser" et al) is correctly
3578 set to reflect the source of the code to be parsed and the
3579 lexical context for the expression.
3580
3581 The op tree representing the expression is returned. If an
3582 optional expression is absent, a null pointer is returned,
3583 otherwise the pointer will be non-null.
3584
3585 If an error occurs in parsing or compilation, in most cases a
3586 valid op tree is returned anyway. The error is reflected in
3587 the parser state, normally resulting in a single exception at
3588 the top level of parsing which covers all the compilation
3589 errors that occurred. Some compilation errors, however, will
3590 throw an exception immediately.
3591
3592 OP * parse_fullexpr(U32 flags)
3593
3594 parse_fullstmt
3595 NOTE: this function is experimental and may change or be
3596 removed without notice.
3597
3598 Parse a single complete Perl statement. This may be a normal
3599 imperative statement or a declaration that has compile-time
3600 effect, and may include optional labels. It is up to the
3601 caller to ensure that the dynamic parser state ("PL_parser" et
3602 al) is correctly set to reflect the source of the code to be
3603 parsed and the lexical context for the statement.
3604
3605 The op tree representing the statement is returned. This may
3606 be a null pointer if the statement is null, for example if it
3607 was actually a subroutine definition (which has compile-time
3608 side effects). If not null, it will be the result of a
3609 "newSTATEOP" call, normally including a "nextstate" or
3610 equivalent op.
3611
3612 If an error occurs in parsing or compilation, in most cases a
3613 valid op tree (most likely null) is returned anyway. The error
3614 is reflected in the parser state, normally resulting in a
3615 single exception at the top level of parsing which covers all
3616 the compilation errors that occurred. Some compilation errors,
3617 however, will throw an exception immediately.
3618
3619 The "flags" parameter is reserved for future use, and must
3620 always be zero.
3621
3622 OP * parse_fullstmt(U32 flags)
3623
3624 parse_label
3625 NOTE: this function is experimental and may change or be
3626 removed without notice.
3627
3628 Parse a single label, possibly optional, of the type that may
3629 prefix a Perl statement. It is up to the caller to ensure that
3630 the dynamic parser state ("PL_parser" et al) is correctly set
3631 to reflect the source of the code to be parsed. If "flags" has
3632 the "PARSE_OPTIONAL" bit set, then the label is optional,
3633 otherwise it is mandatory.
3634
3635 The name of the label is returned in the form of a fresh
3636 scalar. If an optional label is absent, a null pointer is
3637 returned.
3638
3639 If an error occurs in parsing, which can only occur if the
3640 label is mandatory, a valid label is returned anyway. The
3641 error is reflected in the parser state, normally resulting in a
3642 single exception at the top level of parsing which covers all
3643 the compilation errors that occurred.
3644
3645 SV * parse_label(U32 flags)
3646
3647 parse_listexpr
3648 NOTE: this function is experimental and may change or be
3649 removed without notice.
3650
3651 Parse a Perl list expression. This may contain operators of
3652 precedence down to the comma operator. The expression must be
3653 followed (and thus terminated) either by a low-precedence logic
3654 operator such as "or" or by something that would normally
3655 terminate an expression such as semicolon. If "flags" has the
3656 "PARSE_OPTIONAL" bit set, then the expression is optional,
3657 otherwise it is mandatory. It is up to the caller to ensure
3658 that the dynamic parser state ("PL_parser" et al) is correctly
3659 set to reflect the source of the code to be parsed and the
3660 lexical context for the expression.
3661
3662 The op tree representing the expression is returned. If an
3663 optional expression is absent, a null pointer is returned,
3664 otherwise the pointer will be non-null.
3665
3666 If an error occurs in parsing or compilation, in most cases a
3667 valid op tree is returned anyway. The error is reflected in
3668 the parser state, normally resulting in a single exception at
3669 the top level of parsing which covers all the compilation
3670 errors that occurred. Some compilation errors, however, will
3671 throw an exception immediately.
3672
3673 OP * parse_listexpr(U32 flags)
3674
3675 parse_stmtseq
3676 NOTE: this function is experimental and may change or be
3677 removed without notice.
3678
3679 Parse a sequence of zero or more Perl statements. These may be
3680 normal imperative statements, including optional labels, or
3681 declarations that have compile-time effect, or any mixture
3682 thereof. The statement sequence ends when a closing brace or
3683 end-of-file is encountered in a place where a new statement
3684 could have validly started. It is up to the caller to ensure
3685 that the dynamic parser state ("PL_parser" et al) is correctly
3686 set to reflect the source of the code to be parsed and the
3687 lexical context for the statements.
3688
3689 The op tree representing the statement sequence is returned.
3690 This may be a null pointer if the statements were all null, for
3691 example if there were no statements or if there were only
3692 subroutine definitions (which have compile-time side effects).
3693 If not null, it will be a "lineseq" list, normally including
3694 "nextstate" or equivalent ops.
3695
3696 If an error occurs in parsing or compilation, in most cases a
3697 valid op tree is returned anyway. The error is reflected in
3698 the parser state, normally resulting in a single exception at
3699 the top level of parsing which covers all the compilation
3700 errors that occurred. Some compilation errors, however, will
3701 throw an exception immediately.
3702
3703 The "flags" parameter is reserved for future use, and must
3704 always be zero.
3705
3706 OP * parse_stmtseq(U32 flags)
3707
3708 parse_termexpr
3709 NOTE: this function is experimental and may change or be
3710 removed without notice.
3711
3712 Parse a Perl term expression. This may contain operators of
3713 precedence down to the assignment operators. The expression
3714 must be followed (and thus terminated) either by a comma or
3715 lower-precedence operator or by something that would normally
3716 terminate an expression such as semicolon. If "flags" has the
3717 "PARSE_OPTIONAL" bit set, then the expression is optional,
3718 otherwise it is mandatory. It is up to the caller to ensure
3719 that the dynamic parser state ("PL_parser" et al) is correctly
3720 set to reflect the source of the code to be parsed and the
3721 lexical context for the expression.
3722
3723 The op tree representing the expression is returned. If an
3724 optional expression is absent, a null pointer is returned,
3725 otherwise the pointer will be non-null.
3726
3727 If an error occurs in parsing or compilation, in most cases a
3728 valid op tree is returned anyway. The error is reflected in
3729 the parser state, normally resulting in a single exception at
3730 the top level of parsing which covers all the compilation
3731 errors that occurred. Some compilation errors, however, will
3732 throw an exception immediately.
3733
3734 OP * parse_termexpr(U32 flags)
3735
3736 PL_parser
3737 Pointer to a structure encapsulating the state of the parsing
3738 operation currently in progress. The pointer can be locally
3739 changed to perform a nested parse without interfering with the
3740 state of an outer parse. Individual members of "PL_parser"
3741 have their own documentation.
3742
3743 PL_parser->bufend
3744 NOTE: this variable is experimental and may change or be
3745 removed without notice.
3746
3747 Direct pointer to the end of the chunk of text currently being
3748 lexed, the end of the lexer buffer. This is equal to
3749 "SvPVX(PL_parser->linestr) + SvCUR(PL_parser->linestr)". A
3750 "NUL" character (zero octet) is always located at the end of
3751 the buffer, and does not count as part of the buffer's
3752 contents.
3753
3754 PL_parser->bufptr
3755 NOTE: this variable is experimental and may change or be
3756 removed without notice.
3757
3758 Points to the current position of lexing inside the lexer
3759 buffer. Characters around this point may be freely examined,
3760 within the range delimited by "SvPVX(PL_parser->linestr)" and
3761 "PL_parser->bufend". The octets of the buffer may be intended
3762 to be interpreted as either UTF-8 or Latin-1, as indicated by
3763 "lex_bufutf8".
3764
3765 Lexing code (whether in the Perl core or not) moves this
3766 pointer past the characters that it consumes. It is also
3767 expected to perform some bookkeeping whenever a newline
3768 character is consumed. This movement can be more conveniently
3769 performed by the function "lex_read_to", which handles newlines
3770 appropriately.
3771
3772 Interpretation of the buffer's octets can be abstracted out by
3773 using the slightly higher-level functions "lex_peek_unichar"
3774 and "lex_read_unichar".
3775
3776 PL_parser->linestart
3777 NOTE: this variable is experimental and may change or be
3778 removed without notice.
3779
3780 Points to the start of the current line inside the lexer
3781 buffer. This is useful for indicating at which column an error
3782 occurred, and not much else. This must be updated by any
3783 lexing code that consumes a newline; the function "lex_read_to"
3784 handles this detail.
3785
3786 PL_parser->linestr
3787 NOTE: this variable is experimental and may change or be
3788 removed without notice.
3789
3790 Buffer scalar containing the chunk currently under
3791 consideration of the text currently being lexed. This is
3792 always a plain string scalar (for which "SvPOK" is true). It
3793 is not intended to be used as a scalar by normal scalar means;
3794 instead refer to the buffer directly by the pointer variables
3795 described below.
3796
3797 The lexer maintains various "char*" pointers to things in the
3798 "PL_parser->linestr" buffer. If "PL_parser->linestr" is ever
3799 reallocated, all of these pointers must be updated. Don't
3800 attempt to do this manually, but rather use "lex_grow_linestr"
3801 if you need to reallocate the buffer.
3802
3803 The content of the text chunk in the buffer is commonly exactly
3804 one complete line of input, up to and including a newline
3805 terminator, but there are situations where it is otherwise.
3806 The octets of the buffer may be intended to be interpreted as
3807 either UTF-8 or Latin-1. The function "lex_bufutf8" tells you
3808 which. Do not use the "SvUTF8" flag on this scalar, which may
3809 disagree with it.
3810
3811 For direct examination of the buffer, the variable
3812 "PL_parser->bufend" points to the end of the buffer. The
3813 current lexing position is pointed to by "PL_parser->bufptr".
3814 Direct use of these pointers is usually preferable to
3815 examination of the scalar through normal scalar means.
3816
3817 wrap_keyword_plugin
3818 NOTE: this function is experimental and may change or be
3819 removed without notice.
3820
3821 Puts a C function into the chain of keyword plugins. This is
3822 the preferred way to manipulate the "PL_keyword_plugin"
3823 variable. "new_plugin" is a pointer to the C function that is
3824 to be added to the keyword plugin chain, and "old_plugin_p"
3825 points to the storage location where a pointer to the next
3826 function in the chain will be stored. The value of
3827 "new_plugin" is written into the "PL_keyword_plugin" variable,
3828 while the value previously stored there is written to
3829 *old_plugin_p.
3830
3831 "PL_keyword_plugin" is global to an entire process, and a
3832 module wishing to hook keyword parsing may find itself invoked
3833 more than once per process, typically in different threads. To
3834 handle that situation, this function is idempotent. The
3835 location *old_plugin_p must initially (once per process)
3836 contain a null pointer. A C variable of static duration
3837 (declared at file scope, typically also marked "static" to give
3838 it internal linkage) will be implicitly initialised
3839 appropriately, if it does not have an explicit initialiser.
3840 This function will only actually modify the plugin chain if it
3841 finds *old_plugin_p to be null. This function is also thread
3842 safe on the small scale. It uses appropriate locking to avoid
3843 race conditions in accessing "PL_keyword_plugin".
3844
3845 When this function is called, the function referenced by
3846 "new_plugin" must be ready to be called, except for
3847 *old_plugin_p being unfilled. In a threading situation,
3848 "new_plugin" may be called immediately, even before this
3849 function has returned. *old_plugin_p will always be
3850 appropriately set before "new_plugin" is called. If
3851 "new_plugin" decides not to do anything special with the
3852 identifier that it is given (which is the usual case for most
3853 calls to a keyword plugin), it must chain the plugin function
3854 referenced by *old_plugin_p.
3855
3856 Taken all together, XS code to install a keyword plugin should
3857 typically look something like this:
3858
    static Perl_keyword_plugin_t next_keyword_plugin;
    static int my_keyword_plugin(pTHX_
        char *keyword_ptr, STRLEN keyword_len, OP **op_ptr)
    {
        if (memEQs(keyword_ptr, keyword_len,
                   "my_new_keyword")) {
            ...
        } else {
            return next_keyword_plugin(aTHX_
                keyword_ptr, keyword_len, op_ptr);
        }
    }
    BOOT:
        wrap_keyword_plugin(my_keyword_plugin,
                            &next_keyword_plugin);
3874
3875 Direct access to "PL_keyword_plugin" should be avoided.
3876
3877 void wrap_keyword_plugin(
3878 Perl_keyword_plugin_t new_plugin,
3879 Perl_keyword_plugin_t *old_plugin_p
3880 )
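
The idempotent wrapping and chaining behaviour described above can be modelled without any Perl headers. The following is a minimal sketch in plain C, not the Perl implementation: "hook_t", "chain_head", "default_hook", "wrap_hook", "next_hook" and "my_hook" are all hypothetical names, and the locking that the real function performs is omitted.

```c
#include <stddef.h>

/* Hypothetical plugin type: returns nonzero if it handled the input. */
typedef int (*hook_t)(int input);

/* Default end of the chain: declines everything. */
static int default_hook(int input) { (void)input; return 0; }

/* Process-wide chain head, playing the role of the global plugin variable. */
static hook_t chain_head = default_hook;

/* Put new_hook at the head of the chain, saving the previous head in
 * *old_hook_p. Does nothing if *old_hook_p is already non-null, so
 * calling it repeatedly is harmless. (No locking in this model.) */
static void wrap_hook(hook_t new_hook, hook_t *old_hook_p)
{
    if (*old_hook_p != NULL)
        return;
    *old_hook_p = chain_head;
    chain_head = new_hook;
}

/* Static duration, no initialiser: implicitly a null pointer. */
static hook_t next_hook;

/* Handles one input itself and chains to the saved hook otherwise. */
static int my_hook(int input)
{
    if (input == 42)
        return 1;
    return next_hook(input);
}
```

If "wrap_hook" were not idempotent, a second call would store "my_hook" in "next_hook" and the chain would loop back on itself; the null check is what prevents that.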
3881
3883 DECLARATION_FOR_LC_NUMERIC_MANIPULATION
3884 This macro should be used as a statement. It declares a
3885 private variable (whose name begins with an underscore) that is
3886 needed by the other macros in this section. Failing to include
3887 this correctly should lead to a syntax error. For
3888 compatibility with C89 C compilers it should be placed in a
3889 block before any executable statements.
3890
3891 void DECLARATION_FOR_LC_NUMERIC_MANIPULATION
3892
3893 Perl_langinfo
3894 This is an (almost) drop-in replacement for the system
3895 nl_langinfo(3), taking the same "item" parameter values, and
3896 returning the same information. But it is more thread-safe
3897 than regular "nl_langinfo()", and hides the quirks of Perl's
3898 locale handling from your code, and can be used on systems that
3899 lack a native "nl_langinfo".
3900
3901 Expanding on these:
3902
3903 · The reason it isn't quite a drop-in replacement is actually
3904 an advantage. The only difference is that it returns
3905 "const char *", whereas plain "nl_langinfo()" returns
3906 "char *", but you are (only by documentation) forbidden to
3907 write into the buffer. By declaring this "const", the
3908 compiler enforces this restriction, so if it is violated,
3909 you know at compilation time, rather than getting segfaults
3910 at runtime.
3911
3912 · It delivers the correct results for the "RADIXCHAR" and
3913 "THOUSEP" items, without you having to write extra code.
3914 The reason for the extra code would be because these are
3915 from the "LC_NUMERIC" locale category, which is normally
3916 kept set by Perl so that the radix is a dot, and the
3917 separator is the empty string, no matter what the
3918 underlying locale is supposed to be, and so to get the
3919 expected results, you have to temporarily toggle into the
3920 underlying locale, and later toggle back. (You could use
3921 plain "nl_langinfo" and
3922 "STORE_LC_NUMERIC_FORCE_TO_UNDERLYING" for this but then
3923 you wouldn't get the other advantages of "Perl_langinfo()";
3924 not keeping "LC_NUMERIC" in the C (or equivalent) locale
3925 would break a lot of CPAN, which is expecting the radix
3926 (decimal point) character to be a dot.)
3927
· The system function it replaces can have its static return
buffer trashed, not only by a subsequent call to that
function, but by a "freelocale", "setlocale", or other
locale change. The returned buffer of this function is not
changed until the next call to it, so the buffer is never
in a trashed state.
3934
3935 · Its return buffer is per-thread, so it also is never
3936 overwritten by a call to this function from another thread;
3937 unlike the function it replaces.
3938
· But most importantly, it works on systems that don't have
"nl_langinfo", such as Windows, hence makes your code more
portable. Of the fifty-some possible items specified by
the POSIX 2008 standard,
<http://pubs.opengroup.org/onlinepubs/9699919799/basedefs/langinfo.h.html>,
only one is completely unimplemented (though on non-Windows
platforms, another significant one is also not
implemented). It uses various techniques to recover the
other items, including calling localeconv(3), and
strftime(3), both of which are specified in C89, so should
always be available. Later "strftime()" versions have
additional capabilities; "" is returned for those not
available on your system.
3952
3953 It is important to note that when called with an item that
3954 is recovered by using "localeconv", the buffer from any
3955 previous explicit call to "localeconv" will be overwritten.
3956 This means you must save that buffer's contents if you need
3957 to access them after a call to this function. (But note
3958 that you might not want to be using "localeconv()" directly
3959 anyway, because of issues like the ones listed in the
3960 second item of this list (above) for "RADIXCHAR" and
3961 "THOUSEP". You can use the methods given in perlcall to
3962 call "localeconv" in POSIX and avoid all the issues, but
3963 then you have a hash to unpack).
3964
The details of those items for which this emulation may
return something different from what a native
"nl_langinfo()" would return are specified in
I18N::Langinfo.
3968
3969 When using "Perl_langinfo" on systems that don't have a native
3970 "nl_langinfo()", you must
3971
3972 #include "perl_langinfo.h"
3973
3974 before the "perl.h" "#include". You can replace your
3975 "langinfo.h" "#include" with this one. (Doing it this way
3976 keeps out the symbols that plain "langinfo.h" would try to
3977 import into the namespace for code that doesn't need it.)
3978
3979 The original impetus for "Perl_langinfo()" was so that code
3980 that needs to find out the current currency symbol, floating
3981 point radix character, or digit grouping separator can use, on
3982 all systems, the simpler and more thread-friendly "nl_langinfo"
3983 API instead of localeconv(3) which is a pain to make thread-
3984 friendly. For other fields returned by "localeconv", it is
3985 better to use the methods given in perlcall to call
3986 "POSIX::localeconv()", which is thread-friendly.
3987
3988 const char* Perl_langinfo(const nl_item item)
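
The buffer-lifetime point above is the reason callers often copy the
result straight away. The following is a minimal sketch using the
plain POSIX "nl_langinfo()", so that it compiles without perl headers;
"copy_radix" is a hypothetical helper name. In XS code the call would
presumably be "Perl_langinfo(RADIXCHAR)" instead, which takes the same
"item" values.

```c
#include <assert.h>
#include <langinfo.h>
#include <locale.h>
#include <stddef.h>
#include <string.h>

/* Copy the current locale's radix string into buf (buflen bytes),
 * truncating if necessary and always NUL-terminating.  Returns the
 * untruncated length of the item.  The copy stays valid even if a
 * later locale change invalidates nl_langinfo()'s own buffer. */
static size_t copy_radix(char *buf, size_t buflen)
{
    const char *item;
    size_t len, n;

    assert(buf != NULL && buflen > 0);
    item = nl_langinfo(RADIXCHAR);
    len = strlen(item);
    n = (len < buflen) ? len : buflen - 1;
    memcpy(buf, item, n);
    buf[n] = '\0';
    return len;
}
```

A return value of "buflen" or more tells the caller that the copy was
truncated.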
3989
3990 Perl_setlocale
3991 This is an (almost) drop-in replacement for the system
3992 setlocale(3), taking the same parameters, and returning the
3993 same information, except that it returns the correct underlying
3994 "LC_NUMERIC" locale. Regular "setlocale" will instead return
3995 "C" if the underlying locale has a non-dot decimal point
3996 character, or a non-empty thousands separator for displaying
3997 floating point numbers. This is because perl keeps that locale
3998 category such that it has a dot and empty separator, changing
3999 the locale briefly during the operations where the underlying
4000 one is required. "Perl_setlocale" knows about this, and
4001 compensates; regular "setlocale" doesn't.
4002
4003 Another reason it isn't completely a drop-in replacement is
4004 that it is declared to return "const char *", whereas the
4005 system setlocale omits the "const" (presumably because its API
4006 was specified long ago, and can't be updated; it is illegal to
4007 change the information "setlocale" returns; doing so leads to
4008 segfaults.)
4009
4010 Finally, "Perl_setlocale" works under all circumstances,
4011 whereas plain "setlocale" can be completely ineffective on some
4012 platforms under some configurations.
4013
"Perl_setlocale" should not be used to change the locale except
on systems where the predefined variable "${^SAFE_LOCALES}" is
1. On some such systems, the system "setlocale()" is
ineffective, returning the wrong information, and failing to
actually change the locale. "Perl_setlocale", however, works
properly in all circumstances.
4020
4021 The return points to a per-thread static buffer, which is
4022 overwritten the next time "Perl_setlocale" is called from the
4023 same thread.
4024
4025 const char* Perl_setlocale(const int category,
4026 const char* locale)
4027
4028 RESTORE_LC_NUMERIC
4029 This is used in conjunction with one of the macros
4030 "STORE_LC_NUMERIC_SET_TO_NEEDED" and
4031 "STORE_LC_NUMERIC_FORCE_TO_UNDERLYING" to properly restore the
4032 "LC_NUMERIC" state.
4033
4034 A call to "DECLARATION_FOR_LC_NUMERIC_MANIPULATION" must have
4035 been made to declare at compile time a private variable used by
4036 this macro and the two "STORE" ones. This macro should be
4037 called as a single statement, not an expression, but with an
4038 empty argument list, like this:
4039
4040 {
4041 DECLARATION_FOR_LC_NUMERIC_MANIPULATION;
4042 ...
4043 RESTORE_LC_NUMERIC();
4044 ...
4045 }
4046
4047 void RESTORE_LC_NUMERIC()
4048
4049 STORE_LC_NUMERIC_FORCE_TO_UNDERLYING
This is used by XS code that is "LC_NUMERIC" locale-aware
to force the locale for category "LC_NUMERIC" to be what perl
thinks is the current underlying locale. (The perl interpreter
could be wrong about what the underlying locale actually is if
some C or XS code has called the C library function
setlocale(3) behind its back; calling "sync_locale" before
calling this macro will update perl's records.)
4057
4058 A call to "DECLARATION_FOR_LC_NUMERIC_MANIPULATION" must have
4059 been made to declare at compile time a private variable used by
4060 this macro. This macro should be called as a single statement,
4061 not an expression, but with an empty argument list, like this:
4062
4063 {
4064 DECLARATION_FOR_LC_NUMERIC_MANIPULATION;
4065 ...
4066 STORE_LC_NUMERIC_FORCE_TO_UNDERLYING();
4067 ...
4068 RESTORE_LC_NUMERIC();
4069 ...
4070 }
4071
4072 The private variable is used to save the current locale state,
4073 so that the requisite matching call to "RESTORE_LC_NUMERIC" can
4074 restore it.
4075
4076 On threaded perls not operating with thread-safe functionality,
4077 this macro uses a mutex to force a critical section. Therefore
4078 the matching RESTORE should be close by, and guaranteed to be
4079 called.
4080
4081 void STORE_LC_NUMERIC_FORCE_TO_UNDERLYING()
4082
4083 STORE_LC_NUMERIC_SET_TO_NEEDED
4084 This is used to help wrap XS or C code that is "LC_NUMERIC"
4085 locale-aware. This locale category is generally kept set to a
4086 locale where the decimal radix character is a dot, and the
4087 separator between groups of digits is empty. This is because
4088 most XS code that reads floating point numbers is expecting
4089 them to have this syntax.
4090
4091 This macro makes sure the current "LC_NUMERIC" state is set
4092 properly, to be aware of locale if the call to the XS or C code
4093 from the Perl program is from within the scope of a
4094 "use locale"; or to ignore locale if the call is instead from
4095 outside such scope.
4096
This macro is the start of wrapping the C or XS code; the wrap
ending is done by calling the "RESTORE_LC_NUMERIC" macro after
the operation. Otherwise the state can be left changed in a
way that will adversely affect other XS code.
4101
4102 A call to "DECLARATION_FOR_LC_NUMERIC_MANIPULATION" must have
4103 been made to declare at compile time a private variable used by
4104 this macro. This macro should be called as a single statement,
4105 not an expression, but with an empty argument list, like this:
4106
4107 {
4108 DECLARATION_FOR_LC_NUMERIC_MANIPULATION;
4109 ...
4110 STORE_LC_NUMERIC_SET_TO_NEEDED();
4111 ...
4112 RESTORE_LC_NUMERIC();
4113 ...
4114 }
4115
4116 On threaded perls not operating with thread-safe functionality,
4117 this macro uses a mutex to force a critical section. Therefore
4118 the matching RESTORE should be close by, and guaranteed to be
4119 called.
4120
4121 void STORE_LC_NUMERIC_SET_TO_NEEDED()
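
The store / operate / restore shape that these macros impose can be
illustrated in plain C. This is a sketch only: it calls the system
"setlocale()" directly, which XS code should not do, and
"format_in_underlying" is a hypothetical name. It is meant to show why
the restore step must always follow the store step, not how perl
implements the macros.

```c
#include <assert.h>
#include <locale.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Format value with the environment's LC_NUMERIC conventions, then put
 * the previous LC_NUMERIC locale back.  Returns 0 on success and -1 if
 * the current locale could not be saved. */
static int format_in_underlying(char *buf, size_t buflen, double value)
{
    const char *cur;
    char *saved;

    assert(buf != NULL && buflen > 0);
    cur = setlocale(LC_NUMERIC, NULL);
    if (cur == NULL)
        return -1;
    saved = malloc(strlen(cur) + 1);  /* copy: the buffer may be reused */
    if (saved == NULL)
        return -1;
    strcpy(saved, cur);

    setlocale(LC_NUMERIC, "");        /* "store": switch to underlying */
    snprintf(buf, buflen, "%.2f", value);
    setlocale(LC_NUMERIC, saved);     /* "restore": always undo */

    free(saved);
    return 0;
}
```

Keeping the switch and the restore in the same short function mirrors
the advice above that the matching RESTORE should be close by.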
4122
4123 switch_to_global_locale
4124 On systems without locale support, or on typical single-
4125 threaded builds, or on platforms that do not support per-thread
4126 locale operations, this function does nothing. On such systems
4127 that do have locale support, only a locale global to the whole
4128 program is available.
4129
4130 On multi-threaded builds on systems that do have per-thread
4131 locale operations, this function converts the thread it is
4132 running in to use the global locale. This is for code that has
4133 not yet or cannot be updated to handle multi-threaded locale
4134 operation. As long as only a single thread is so-converted,
4135 everything works fine, as all the other threads continue to
4136 ignore the global one, so only this thread looks at it.
4137
4138 However, on Windows systems this isn't quite true prior to
4139 Visual Studio 15, at which point Microsoft fixed a bug. A race
4140 can occur if you use the following operations on earlier
4141 Windows platforms:
4142
4143 POSIX::localeconv
4144 I18N::Langinfo, items "CRNCYSTR" and "THOUSEP"
4145 "Perl_langinfo" in perlapi, items "CRNCYSTR" and "THOUSEP"
4146
4147 The first item is not fixable (except by upgrading to a later
4148 Visual Studio release), but it would be possible to work around
4149 the latter two items by using the Windows API functions
4150 "GetNumberFormat" and "GetCurrencyFormat"; patches welcome.
4151
4152 Without this function call, threads that use the setlocale(3)
4153 system function will not work properly, as all the locale-
4154 sensitive functions will look at the per-thread locale, and
4155 "setlocale" will have no effect on this thread.
4156
4157 Perl code should convert to either call "Perl_setlocale" (which
4158 is a drop-in for the system "setlocale") or use the methods
4159 given in perlcall to call "POSIX::setlocale". Either one will
4160 transparently properly handle all cases of single- vs multi-
4161 thread, POSIX 2008-supported or not.
4162
4163 Non-Perl libraries, such as "gtk", that call the system
4164 "setlocale" can continue to work if this function is called
4165 before transferring control to the library.
4166
4167 Upon return from the code that needs to use the global locale,
4168 "sync_locale()" should be called to restore the safe multi-
4169 thread operation.
4170
4171 void switch_to_global_locale()
4172
4173 sync_locale
"Perl_setlocale" can be used at any time to query or change the
locale (though changing the locale is antisocial and dangerous
on multi-threaded systems that don't have multi-thread safe
locale operations; see "Multi-threaded operation" in
perllocale). Using the system setlocale(3) should be avoided.
Nevertheless, certain non-Perl libraries called from XS, such
as "Gtk", do so, and this can't be changed. When the locale is
changed by XS code that didn't use "Perl_setlocale", Perl needs
to be told that the locale has changed. Use this function to
do so, before returning to Perl.
4184
4185 The return value is a boolean: TRUE if the global locale at the
4186 time of call was in effect; and FALSE if a per-thread locale
4187 was in effect. This can be used by the caller that needs to
4188 restore things as-they-were to decide whether or not to call
4189 "Perl_switch_to_global_locale".
4190
4191 bool sync_locale()
4192
4194 mg_clear
4195 Clear something magical that the SV represents. See
4196 "sv_magic".
4197
4198 int mg_clear(SV* sv)
4199
4200 mg_copy Copies the magic from one SV to another. See "sv_magic".
4201
4202 int mg_copy(SV *sv, SV *nsv, const char *key,
4203 I32 klen)
4204
4205 mg_find Finds the magic pointer for "type" matching the SV. See
4206 "sv_magic".
4207
4208 MAGIC* mg_find(const SV* sv, int type)
4209
4210 mg_findext
4211 Finds the magic pointer of "type" with the given "vtbl" for the
4212 "SV". See "sv_magicext".
4213
4214 MAGIC* mg_findext(const SV* sv, int type,
4215 const MGVTBL *vtbl)
4216
4217 mg_free Free any magic storage used by the SV. See "sv_magic".
4218
4219 int mg_free(SV* sv)
4220
4221 mg_freeext
4222 Remove any magic of type "how" using virtual table "vtbl" from
4223 the SV "sv". See "sv_magic".
4224
4225 "mg_freeext(sv, how, NULL)" is equivalent to "mg_free_type(sv,
4226 how)".
4227
4228 void mg_freeext(SV* sv, int how, const MGVTBL *vtbl)
4229
4230 mg_free_type
4231 Remove any magic of type "how" from the SV "sv". See
4232 "sv_magic".
4233
4234 void mg_free_type(SV *sv, int how)
4235
4236 mg_get Do magic before a value is retrieved from the SV. The type of
4237 SV must be >= "SVt_PVMG". See "sv_magic".
4238
4239 int mg_get(SV* sv)
4240
4241 mg_length
4242 DEPRECATED! It is planned to remove this function from a
4243 future release of Perl. Do not use it for new code; remove it
4244 from existing code.
4245
4246 Reports on the SV's length in bytes, calling length magic if
4247 available, but does not set the UTF8 flag on "sv". It will
4248 fall back to 'get' magic if there is no 'length' magic, but
4249 with no indication as to whether it called 'get' magic. It
4250 assumes "sv" is a "PVMG" or higher. Use "sv_len()" instead.
4251
4252 U32 mg_length(SV* sv)
4253
4254 mg_magical
4255 Turns on the magical status of an SV. See "sv_magic".
4256
4257 void mg_magical(SV* sv)
4258
4259 mg_set Do magic after a value is assigned to the SV. See "sv_magic".
4260
4261 int mg_set(SV* sv)
4262
4263 SvGETMAGIC
4264 Invokes "mg_get" on an SV if it has 'get' magic. For example,
4265 this will call "FETCH" on a tied variable. This macro
4266 evaluates its argument more than once.
4267
4268 void SvGETMAGIC(SV* sv)
4269
4270 SvLOCK Arranges for a mutual exclusion lock to be obtained on "sv" if
4271 a suitable module has been loaded.
4272
4273 void SvLOCK(SV* sv)
4274
4275 SvSETMAGIC
4276 Invokes "mg_set" on an SV if it has 'set' magic. This is
4277 necessary after modifying a scalar, in case it is a magical
4278 variable like $| or a tied variable (it calls "STORE"). This
4279 macro evaluates its argument more than once.
4280
4281 void SvSETMAGIC(SV* sv)
4282
4283 SvSetMagicSV
4284 Like "SvSetSV", but does any set magic required afterwards.
4285
4286 void SvSetMagicSV(SV* dsv, SV* ssv)
4287
4288 SvSetMagicSV_nosteal
4289 Like "SvSetSV_nosteal", but does any set magic required
4290 afterwards.
4291
4292 void SvSetMagicSV_nosteal(SV* dsv, SV* ssv)
4293
4294 SvSetSV Calls "sv_setsv" if "dsv" is not the same as "ssv". May
4295 evaluate arguments more than once. Does not handle 'set' magic
4296 on the destination SV.
4297
4298 void SvSetSV(SV* dsv, SV* ssv)
4299
4300 SvSetSV_nosteal
4301 Calls a non-destructive version of "sv_setsv" if "dsv" is not
4302 the same as "ssv". May evaluate arguments more than once.
4303
4304 void SvSetSV_nosteal(SV* dsv, SV* ssv)
4305
4306 SvSHARE Arranges for "sv" to be shared between threads if a suitable
4307 module has been loaded.
4308
4309 void SvSHARE(SV* sv)
4310
4311 sv_string_from_errnum
4312 Generates the message string describing an OS error and returns
4313 it as an SV. "errnum" must be a value that "errno" could take,
4314 identifying the type of error.
4315
4316 If "tgtsv" is non-null then the string will be written into
4317 that SV (overwriting existing content) and it will be returned.
4318 If "tgtsv" is a null pointer then the string will be written
4319 into a new mortal SV which will be returned.
4320
4321 The message will be taken from whatever locale would be used by
4322 $!, and will be encoded in the SV in whatever manner would be
4323 used by $!. The details of this process are subject to future
4324 change. Currently, the message is taken from the C locale by
4325 default (usually producing an English message), and from the
4326 currently selected locale when in the scope of the "use locale"
4327 pragma. A heuristic attempt is made to decode the message from
4328 the locale's character encoding, but it will only be decoded as
4329 either UTF-8 or ISO-8859-1. It is always correctly decoded in
4330 a UTF-8 locale, usually in an ISO-8859-1 locale, and never in
4331 any other locale.
4332
4333 The SV is always returned containing an actual string, and with
4334 no other OK bits set. Unlike $!, a message is even yielded for
4335 "errnum" zero (meaning success), and if no useful message is
4336 available then a useless string (currently empty) is returned.
4337
4338 SV * sv_string_from_errnum(int errnum, SV *tgtsv)
4339
4340 SvUNLOCK
4341 Releases a mutual exclusion lock on "sv" if a suitable module
4342 has been loaded.
4343
4344 void SvUNLOCK(SV* sv)
4345
4347 Copy The XSUB-writer's interface to the C "memcpy" function. The
4348 "src" is the source, "dest" is the destination, "nitems" is the
4349 number of items, and "type" is the type. May fail on
4350 overlapping copies. See also "Move".
4351
4352 void Copy(void* src, void* dest, int nitems, type)
4353
4354 CopyD Like "Copy" but returns "dest". Useful for encouraging
4355 compilers to tail-call optimise.
4356
4357 void * CopyD(void* src, void* dest, int nitems, type)
4358
4359 Move The XSUB-writer's interface to the C "memmove" function. The
4360 "src" is the source, "dest" is the destination, "nitems" is the
4361 number of items, and "type" is the type. Can do overlapping
4362 moves. See also "Copy".
4363
4364 void Move(void* src, void* dest, int nitems, type)
4365
4366 MoveD Like "Move" but returns "dest". Useful for encouraging
4367 compilers to tail-call optimise.
4368
4369 void * MoveD(void* src, void* dest, int nitems, type)
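
The difference between "Copy" and "Move" is the same as that between
"memcpy" and "memmove". The sketch below uses hypothetical look-alike
macros, "MY_COPY" and "MY_MOVE", so that it compiles without perl
headers; they are not perl's definitions. It shows how passing the
element type lets the macro compute the byte count.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical look-alikes: the element type is a macro argument, so
 * the caller counts items rather than bytes. */
#define MY_COPY(src, dest, nitems, type) \
    ((void)memcpy((dest), (src), (size_t)(nitems) * sizeof(type)))
#define MY_MOVE(src, dest, nitems, type) \
    ((void)memmove((dest), (src), (size_t)(nitems) * sizeof(type)))

/* Shift the first n-1 elements of a[] one slot to the right.  Source
 * and destination overlap, so the memmove flavour is required. */
static void shift_right(int *a, int n)
{
    assert(a != NULL && n >= 1);
    MY_MOVE(a, a + 1, n - 1, int);
}
```

Using the "MY_COPY" flavour inside "shift_right" would be undefined
behaviour, because "memcpy" does not permit overlapping regions.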
4370
4371 Newx The XSUB-writer's interface to the C "malloc" function.
4372
4373 Memory obtained by this should ONLY be freed with "Safefree".
4374
In 5.9.3, Newx() and friends replace the older New() API, and
drop the first parameter, x, a debug aid which allowed callers
to identify themselves. This aid has been superseded by a new
build option, PERL_MEM_LOG (see "PERL_MEM_LOG" in
perlhacktips). The older API is still there for use in XS
modules supporting older perls.
4381
4382 void Newx(void* ptr, int nitems, type)
4383
4384 Newxc The XSUB-writer's interface to the C "malloc" function, with
4385 cast. See also "Newx".
4386
4387 Memory obtained by this should ONLY be freed with "Safefree".
4388
4389 void Newxc(void* ptr, int nitems, type, cast)
4390
4391 Newxz The XSUB-writer's interface to the C "malloc" function. The
4392 allocated memory is zeroed with "memzero". See also "Newx".
4393
4394 Memory obtained by this should ONLY be freed with "Safefree".
4395
4396 void Newxz(void* ptr, int nitems, type)
4397
4398 Poison PoisonWith(0xEF) for catching access to freed memory.
4399
4400 void Poison(void* dest, int nitems, type)
4401
4402 PoisonFree
4403 PoisonWith(0xEF) for catching access to freed memory.
4404
4405 void PoisonFree(void* dest, int nitems, type)
4406
4407 PoisonNew
4408 PoisonWith(0xAB) for catching access to allocated but
4409 uninitialized memory.
4410
4411 void PoisonNew(void* dest, int nitems, type)
4412
4413 PoisonWith
4414 Fill up memory with a byte pattern (a byte repeated over and
4415 over again) that hopefully catches attempts to access
4416 uninitialized memory.
4417
4418 void PoisonWith(void* dest, int nitems, type,
4419 U8 byte)
4420
4421 Renew The XSUB-writer's interface to the C "realloc" function.
4422
4423 Memory obtained by this should ONLY be freed with "Safefree".
4424
4425 void Renew(void* ptr, int nitems, type)
4426
4427 Renewc The XSUB-writer's interface to the C "realloc" function, with
4428 cast.
4429
4430 Memory obtained by this should ONLY be freed with "Safefree".
4431
4432 void Renewc(void* ptr, int nitems, type, cast)
4433
4434 Safefree
4435 The XSUB-writer's interface to the C "free" function.
4436
4437 This should ONLY be used on memory obtained using "Newx" and
4438 friends.
4439
4440 void Safefree(void* ptr)
4441
4442 savepv Perl's version of "strdup()". Returns a pointer to a newly
4443 allocated string which is a duplicate of "pv". The size of the
4444 string is determined by "strlen()", which means it may not
4445 contain embedded "NUL" characters and must have a trailing
4446 "NUL". The memory allocated for the new string can be freed
4447 with the "Safefree()" function.
4448
4449 On some platforms, Windows for example, all allocated memory
4450 owned by a thread is deallocated when that thread ends. So if
4451 you need that not to happen, you need to use the shared memory
4452 functions, such as "savesharedpv".
4453
4454 char* savepv(const char* pv)
4455
4456 savepvn Perl's version of what "strndup()" would be if it existed.
4457 Returns a pointer to a newly allocated string which is a
4458 duplicate of the first "len" bytes from "pv", plus a trailing
4459 "NUL" byte. The memory allocated for the new string can be
4460 freed with the "Safefree()" function.
4461
4462 On some platforms, Windows for example, all allocated memory
4463 owned by a thread is deallocated when that thread ends. So if
4464 you need that not to happen, you need to use the shared memory
4465 functions, such as "savesharedpvn".
4466
4467 char* savepvn(const char* pv, I32 len)
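
What "savepvn" is described as doing can be mimicked with the C
library alone. The following sketch assumes nothing about perl's
allocator: "dup_bytes" is a hypothetical name, and it uses "malloc"
and "free" where the real function's result would be released with
"Safefree()".

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Duplicate the first len bytes at pv and append a NUL byte.  The
 * bytes copied may themselves include NULs.  Returns NULL if the
 * allocation fails; the caller releases the result with free(). */
static char *dup_bytes(const char *pv, size_t len)
{
    char *copy;

    assert(pv != NULL || len == 0);
    copy = malloc(len + 1);
    if (copy == NULL)
        return NULL;
    if (len > 0)
        memcpy(copy, pv, len);
    copy[len] = '\0';
    return copy;
}
```

Because the length is explicit, "strlen()" on the result may report
fewer bytes than were copied when the data contains a "NUL".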
4468
4469 savepvs Like "savepvn", but takes a literal string instead of a
4470 string/length pair.
4471
4472 char* savepvs("literal string" s)
4473
4474 savesharedpv
4475 A version of "savepv()" which allocates the duplicate string in
4476 memory which is shared between threads.
4477
4478 char* savesharedpv(const char* pv)
4479
4480 savesharedpvn
A version of "savepvn()" which allocates the duplicate string
in memory which is shared between threads (with the specific
difference that a "NULL" pointer is not acceptable).
4484
4485 char* savesharedpvn(const char *const pv,
4486 const STRLEN len)
4487
4488 savesharedpvs
4489 A version of "savepvs()" which allocates the duplicate string
4490 in memory which is shared between threads.
4491
4492 char* savesharedpvs("literal string" s)
4493
4494 savesharedsvpv
A version of "savesvpv()" which allocates the duplicate
string in memory which is shared between threads.
4497
4498 char* savesharedsvpv(SV *sv)
4499
4500 savesvpv
A version of "savepv()"/"savepvn()" which gets the string to
duplicate from the passed in SV using "SvPV()".
4503
4504 On some platforms, Windows for example, all allocated memory
4505 owned by a thread is deallocated when that thread ends. So if
4506 you need that not to happen, you need to use the shared memory
4507 functions, such as "savesharedsvpv".
4508
4509 char* savesvpv(SV* sv)
4510
4511 StructCopy
4512 This is an architecture-independent macro to copy one structure
4513 to another.
4514
4515 void StructCopy(type *src, type *dest, type)
4516
4517 Zero The XSUB-writer's interface to the C "memzero" function. The
4518 "dest" is the destination, "nitems" is the number of items, and
4519 "type" is the type.
4520
4521 void Zero(void* dest, int nitems, type)
4522
4523 ZeroD Like "Zero" but returns dest. Useful for encouraging compilers
4524 to tail-call optimise.
4525
4526 void * ZeroD(void* dest, int nitems, type)
4527
4529 dump_c_backtrace
4530 Dumps the C backtrace to the given "fp".
4531
4532 Returns true if a backtrace could be retrieved, false if not.
4533
4534 bool dump_c_backtrace(PerlIO* fp, int max_depth,
4535 int skip)
4536
4537 fbm_compile
4538 Analyzes the string in order to make fast searches on it using
4539 "fbm_instr()" -- the Boyer-Moore algorithm.
4540
4541 void fbm_compile(SV* sv, U32 flags)
4542
4543 fbm_instr
Returns the location of the SV in the string delimited by "big"
and "bigend" ("bigend" is the char following the last char).
It returns "NULL" if the string can't be found. The "sv" does
not have to be "fbm_compiled", but the search will not be as
fast then.
4549
4550 char* fbm_instr(unsigned char* big,
4551 unsigned char* bigend, SV* littlestr,
4552 U32 flags)
4553
foldEQ  Returns true if the leading "len" bytes of the strings "a" and
"b" are the same case-insensitively; false otherwise.
Uppercase and lowercase ASCII range bytes match themselves and
their opposite case counterparts. Non-cased and non-ASCII
range bytes match only themselves.
4559
4560 I32 foldEQ(const char* a, const char* b, I32 len)
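
The matching rule described for "foldEQ" can be written out in a few
lines. This is a sketch, not perl's implementation: "ascii_fold" and
"ascii_fold_eq" are hypothetical names, and the code assumes an ASCII
platform.

```c
#include <assert.h>
#include <stddef.h>

/* Map ASCII 'A'..'Z' to 'a'..'z'; leave every other byte unchanged. */
static unsigned char ascii_fold(unsigned char c)
{
    return (c >= 'A' && c <= 'Z') ? (unsigned char)(c + ('a' - 'A')) : c;
}

/* Return 1 if the first len bytes of a and b are equal when ASCII
 * letters are compared without regard to case, else 0. */
static int ascii_fold_eq(const char *a, const char *b, size_t len)
{
    size_t i;

    assert(a != NULL && b != NULL);
    for (i = 0; i < len; i++) {
        if (ascii_fold((unsigned char)a[i])
            != ascii_fold((unsigned char)b[i]))
            return 0;
    }
    return 1;
}
```

With this rule the bytes 0xC9 and 0xE9 (upper and lower case E with
acute accent in ISO-8859-1) do not match each other, since they lie
outside the ASCII range.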
4561
4562 foldEQ_locale
Returns true if the leading "len" bytes of the strings "a" and
"b" are the same case-insensitively in the current locale;
false otherwise.
4566
4567 I32 foldEQ_locale(const char* a, const char* b,
4568 I32 len)
4569
4570 form Takes a sprintf-style format pattern and conventional (non-SV)
4571 arguments and returns the formatted string.
4572
4573 (char *) Perl_form(pTHX_ const char* pat, ...)
4574
4575 can be used any place a string (char *) is required:
4576
    char * s = Perl_form(aTHX_ "%d.%d", major, minor);
4578
4579 Uses a single private buffer so if you want to format several
4580 strings you must explicitly copy the earlier strings away (and
4581 free the copies when you are done).
4582
4583 char* form(const char* pat, ...)
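
The single-buffer caveat is easy to trip over, so here is a sketch of
it in plain C. "fmt_shared" is a hypothetical stand-in that behaves
like the description above (one static buffer, overwritten on every
call); it is not perl's "form". "keep_copy" is likewise hypothetical;
in XS code "savepv()" and "Safefree()" would presumably play the roles
of "keep_copy" and "free".

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Format into one static buffer and return a pointer to it.  Every
 * call overwrites the result of the previous call. */
static char *fmt_shared(const char *pat, ...)
{
    static char buf[128];
    va_list ap;

    assert(pat != NULL);
    va_start(ap, pat);
    vsnprintf(buf, sizeof buf, pat, ap);
    va_end(ap);
    return buf;
}

/* Heap copy of s; returns NULL on allocation failure.  The caller
 * frees the copy. */
static char *keep_copy(const char *s)
{
    char *copy = malloc(strlen(s) + 1);

    if (copy != NULL)
        strcpy(copy, s);
    return copy;
}
```

To hold two formatted strings at once, copy the first before
formatting the second:

```c
char *first  = keep_copy(fmt_shared("%d.%d", 5, 28));
char *second = fmt_shared("%d.%d", 5, 30);
/* ... use first and second ... */
free(first);
```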
4584
4585 getcwd_sv
Fill "sv" with current working directory.
4587
4588 int getcwd_sv(SV* sv)
4589
4590 get_c_backtrace_dump
Returns an SV containing a dump of "max_depth" frames of the
call stack, skipping the "skip" innermost ones. A "max_depth"
of 20 is usually enough.
4594
4595 The appended output looks like:
4596
    ...
    1   10e004812:0082   Perl_croak   util.c:1716    /usr/bin/perl
    2   10df8d6d2:1d72   perl_parse   perl.c:3975    /usr/bin/perl
    ...
4600
4601 The fields are tab-separated. The first column is the depth
4602 (zero being the innermost non-skipped frame). In the
4603 hex:offset, the hex is where the program counter was in
4604 "S_parse_body", and the :offset (might be missing) tells how
4605 much inside the "S_parse_body" the program counter was.
4606
4607 The "util.c:1716" is the source code file and line number.
4608
4609 The /usr/bin/perl is obvious (hopefully).
4610
Unknowns are "-". Unfortunately, unknowns can occur quite
easily: if the platform doesn't support retrieving the
information; if the binary is missing the debug information; or
if the optimizer has transformed the code, for example by
inlining.
4615
4616 SV* get_c_backtrace_dump(int max_depth, int skip)
4617
4618 ibcmp This is a synonym for "(! foldEQ())"
4619
4620 I32 ibcmp(const char* a, const char* b, I32 len)
4621
4622 ibcmp_locale
4623 This is a synonym for "(! foldEQ_locale())"
4624
4625 I32 ibcmp_locale(const char* a, const char* b,
4626 I32 len)
4627
4628 is_safe_syscall
4629 Test that the given "pv" doesn't contain any internal "NUL"
4630 characters. If it does, set "errno" to "ENOENT", optionally
4631 warn, and return FALSE.
4632
4633 Return TRUE if the name is safe.
4634
4635 Used by the "IS_SAFE_SYSCALL()" macro.
4636
4637 bool is_safe_syscall(const char *pv, STRLEN len,
4638 const char *what,
4639 const char *op_name)
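
The core of the check is a search for a "NUL" byte within the given
length. The sketch below is a simplification under stated assumptions:
"name_is_safe" is a hypothetical name, it rejects a "NUL" anywhere in
the "len" bytes, and it omits the optional warning and the "what" and
"op_name" parameters.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

/* Return 1 if the len bytes at pv contain no NUL byte.  Otherwise set
 * errno to ENOENT and return 0. */
static int name_is_safe(const char *pv, size_t len)
{
    assert(pv != NULL);
    if (memchr(pv, '\0', len) != NULL) {
        errno = ENOENT;
        return 0;
    }
    return 1;
}
```

The point of such a check is that a name like "pass\0wd.txt" would be
silently cut short at the "NUL" once handed to a C system call.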
4640
memEQ   Test two buffers (which may contain embedded "NUL" characters)
to see if they are equal. The "len" parameter indicates the
number of bytes to compare. Returns true if the buffers are
equal, false otherwise.
4645
4646 bool memEQ(char* s1, char* s2, STRLEN len)
4647
memNE   Test two buffers (which may contain embedded "NUL" characters)
to see if they are not equal. The "len" parameter indicates
the number of bytes to compare. Returns true if the buffers
differ, false otherwise.
4652
4653 bool memNE(char* s1, char* s2, STRLEN len)
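
Because both macros take an explicit length, bytes after an embedded
"NUL" take part in the comparison, unlike with "strcmp()". The sketch
below shows that difference with the C library's "memcmp"; the helper
name "bytes_equal" is hypothetical and is not one of perl's macros.

```c
#include <assert.h>
#include <string.h>

/* Nonzero if the first len bytes of s1 and s2 are identical, zero
 * otherwise.  NUL bytes are compared like any other byte. */
static int bytes_equal(const char *s1, const char *s2, size_t len)
{
    assert(s1 != NULL && s2 != NULL);
    return memcmp(s1, s2, len) == 0;
}
```

For the five-byte buffers "ab\0cd" and "ab\0xy", "strcmp()" stops at
the "NUL" and reports them equal, while "bytes_equal" with a length of
5 reports them different.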
4654
4655 mess Take a sprintf-style format pattern and argument list. These
4656 are used to generate a string message. If the message does not
4657 end with a newline, then it will be extended with some
4658 indication of the current location in the code, as described
4659 for "mess_sv".
4660
4661 Normally, the resulting message is returned in a new mortal SV.
4662 During global destruction a single SV may be shared between
4663 uses of this function.
4664
4665 SV * mess(const char *pat, ...)
4666
4667 mess_sv Expands a message, intended for the user, to include an
4668 indication of the current location in the code, if the message
4669 does not already appear to be complete.
4670
4671 "basemsg" is the initial message or object. If it is a
4672 reference, it will be used as-is and will be the result of this
4673 function. Otherwise it is used as a string, and if it already
4674 ends with a newline, it is taken to be complete, and the result
4675 of this function will be the same string. If the message does
4676 not end with a newline, then a segment such as "at foo.pl line
4677 37" will be appended, and possibly other clauses indicating the
4678 current state of execution. The resulting message will end
4679 with a dot and a newline.
4680
4681 Normally, the resulting message is returned in a new mortal SV.
4682 During global destruction a single SV may be shared between
4683 uses of this function. If "consume" is true, then the function
4684 is permitted (but not required) to modify and return "basemsg"
4685 instead of allocating a new SV.
4686
4687 SV * mess_sv(SV *basemsg, bool consume)
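
The newline rule can be sketched on ordinary C strings. This
illustration leaves out SVs, references and the "other clauses"
mentioned above; "complete_message" and its "file" and "line"
parameters are hypothetical, standing in for the location that perl
would work out for itself.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Write msg to out.  If msg does not already end in a newline, append
 * " at FILE line N." and a newline.  Returns what snprintf() returns:
 * the length the full result needs, excluding the NUL. */
static int complete_message(char *out, size_t outlen, const char *msg,
                            const char *file, int line)
{
    size_t n;

    assert(out != NULL && msg != NULL && file != NULL);
    n = strlen(msg);
    if (n > 0 && msg[n - 1] == '\n')
        return snprintf(out, outlen, "%s", msg);   /* already complete */
    return snprintf(out, outlen, "%s at %s line %d.\n", msg, file, line);
}
```

This is why ending a message with "\n" is the usual way to suppress
the location suffix.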
4688
4689 my_snprintf
4690 The C library "snprintf" functionality, if available and
4691 standards-compliant (uses "vsnprintf", actually). However, if
4692 the "vsnprintf" is not available, will unfortunately use the
4693 unsafe "vsprintf" which can overrun the buffer (there is an
4694 overrun check, but that may be too late). Consider using
4695 "sv_vcatpvf" instead, or getting "vsnprintf".
4696
4697 int my_snprintf(char *buffer, const Size_t len,
4698 const char *format, ...)
4699
4700 my_strlcat
4701 The C library "strlcat" if available, or a Perl implementation
4702 of it. This operates on C "NUL"-terminated strings.
4703
4704 "my_strlcat()" appends string "src" to the end of "dst". It
4705 will append at most "size - strlen(dst) - 1" characters. It
4706 will then "NUL"-terminate, unless "size" is 0 or the original
4707 "dst" string was longer than "size" (in practice this should
4708 not happen as it means that either "size" is incorrect or that
4709 "dst" is not a proper "NUL"-terminated string).
4710
4711 Note that "size" is the full size of the destination buffer and
4712 the result is guaranteed to be "NUL"-terminated if there is
4713 room. Note that room for the "NUL" should be included in
4714 "size".
4715
4716 The return value is the total length that "dst" would have if
4717 "size" is sufficiently large. Thus it is the initial length of
4718 "dst" plus the length of "src". If the return value is greater
4719 than or equal to "size", the excess was not appended.
4720
4721 Size_t my_strlcat(char *dst, const char *src,
4722 Size_t size)
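
The truncation rule described above can be sketched in portable C. The helper "sketch_strlcat" below is a hypothetical stand-in written for this example (it is not Perl's implementation); it only follows the semantics described in this entry so that the return-value check can be seen in isolation.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in that follows the strlcat semantics described
 * above; it is NOT Perl's my_strlcat, only an illustration. */
static size_t sketch_strlcat(char *dst, const char *src, size_t size)
{
    size_t used = 0;
    size_t src_len = strlen(src);

    /* Measure dst, but never look past the first size bytes. */
    while (used < size && dst[used] != '\0')
        used++;

    /* No NUL within size bytes: nothing can be appended safely. */
    if (used == size)
        return size + src_len;

    size_t room = size - used - 1;          /* bytes left before the NUL */
    size_t copy = src_len < room ? src_len : room;
    memcpy(dst + used, src, copy);
    dst[used + copy] = '\0';
    return used + src_len;                  /* length it tried to create */
}

/* Returns 1 if the append was truncated, 0 if everything fit. */
static int append_was_truncated(char *dst, const char *src, size_t size)
{
    return sketch_strlcat(dst, src, size) >= size;
}
```

With an 8-byte buffer holding "abcde", appending "fghij" leaves "abcdefg" in the buffer and reports truncation, because the attempted length (10) is not smaller than the buffer size.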
4723
4724 my_strlcpy
4725 The C library "strlcpy" if available, or a Perl implementation
4726 of it. This operates on C "NUL"-terminated strings.
4727
4728 "my_strlcpy()" copies up to "size - 1" characters from the
4729 string "src" to "dst", "NUL"-terminating the result if "size"
4730 is not 0.
4731
4732 The return value is the total length "src" would be if the copy
4733 completely succeeded. If it is greater than or equal to "size",
4734 the result was truncated.
4735
4736 Size_t my_strlcpy(char *dst, const char *src,
4737 Size_t size)
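
A minimal sketch of how the return value is typically checked, assuming only the semantics described in this entry. "sketch_strlcpy" and "copy_checked" are hypothetical names invented for the example; real extension code would call "my_strlcpy" itself.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in with the strlcpy semantics described above;
 * it is NOT Perl's my_strlcpy. */
static size_t sketch_strlcpy(char *dst, const char *src, size_t size)
{
    size_t src_len = strlen(src);

    if (size != 0) {
        size_t copy = src_len < size - 1 ? src_len : size - 1;
        memcpy(dst, src, copy);
        dst[copy] = '\0';
    }
    return src_len;     /* length of src, whether or not it all fit */
}

/* Copies src into a fixed buffer; returns 0 on success, -1 if truncated. */
static int copy_checked(char *dst, size_t size, const char *src)
{
    return sketch_strlcpy(dst, src, size) >= size ? -1 : 0;
}
```

Note the boundary case: a 4-byte source copied into a 4-byte buffer is truncated, because one byte is always reserved for the terminating "NUL".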
4738
4739 my_strnlen
4740 The C library "strnlen" if available, or a Perl implementation
4741 of it.
4742
4743 "my_strnlen()" computes the length of the string, up to
4744 "maxlen" characters. It will never attempt to address
4745 more than "maxlen" characters, making it suitable for use with
4746 strings that are not guaranteed to be NUL-terminated.
4747
4748 Size_t my_strnlen(const char *str, Size_t maxlen)
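
The bounded scan can be sketched as follows. "sketch_strnlen" is a hypothetical stand-in for illustration, not Perl's implementation; it shows why the function is safe on a buffer that has no terminating "NUL".

```c
#include <stddef.h>

/* Hypothetical stand-in with the strnlen semantics described above:
 * stops at the first NUL or after maxlen bytes, whichever comes first. */
static size_t sketch_strnlen(const char *str, size_t maxlen)
{
    size_t n = 0;

    while (n < maxlen && str[n] != '\0')
        n++;
    return n;
}
```

A result equal to "maxlen" means that no "NUL" was found within the bytes examined.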
4749
4750 my_vsnprintf
4751 The C library "vsnprintf" if available and standards-compliant.
4752 However, if "vsnprintf" is not available, this will
4753 unfortunately use the unsafe "vsprintf", which can overrun the
4754 buffer (there is an overrun check, but that may be too late).
4755 Consider using "sv_vcatpvf" instead, or getting "vsnprintf".
4756
4757 int my_vsnprintf(char *buffer, const Size_t len,
4758 const char *format, va_list ap)
4759
4760 ninstr Find the first (leftmost) occurrence of a sequence of bytes
4761 within another sequence. This is the Perl version of
4762 "strstr()", extended to handle arbitrary sequences, potentially
4763 containing embedded "NUL" characters ("NUL" is what the initial
4764 "n" in the function name stands for; some systems have an
4765 equivalent, "memmem()", but with a somewhat different API).
4766
4767 Another way of thinking about this function is finding a needle
4768 in a haystack. "big" points to the first byte in the haystack.
4769 "bigend" points to one byte beyond the final byte in the
4770 haystack. "little" points to the first byte in the needle.
4771 "little_end" points to one byte beyond the final byte in the
4772 needle. All the parameters must be non-"NULL".
4773
4774 The function returns "NULL" if there is no occurrence of
4775 "little" within "big". If "little" is the empty string, "big"
4776 is returned.
4777
4778 Because this function operates at the byte level, and because
4779 of the inherent characteristics of UTF-8 (or UTF-EBCDIC), it
4780 will work properly if both the needle and the haystack are
4781 strings with the same UTF-8ness, but not if the UTF-8ness
4782 differs.
4783
4784 char * ninstr(char * big, char * bigend, char * little,
4785 char * little_end)
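
The needle-in-a-haystack contract above can be sketched in portable C. "sketch_ninstr" is a hypothetical stand-in written for this example and is not Perl's implementation; it uses a plain byte comparison so that embedded "NUL" bytes are treated like any other byte.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in with the behaviour described above: find the
 * leftmost occurrence of [little, little_end) in [big, bigend). */
static const char *sketch_ninstr(const char *big, const char *bigend,
                                 const char *little, const char *little_end)
{
    size_t hay_len = (size_t)(bigend - big);
    size_t needle_len = (size_t)(little_end - little);

    if (needle_len == 0)
        return big;                 /* empty needle matches at the start */
    if (needle_len > hay_len)
        return NULL;

    for (const char *p = big; p + needle_len <= bigend; p++) {
        if (memcmp(p, little, needle_len) == 0)
            return p;
    }
    return NULL;
}
```

Because both ranges are given as begin/end pointer pairs, neither the haystack nor the needle needs to be "NUL"-terminated.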
4786
4787 PERL_SYS_INIT
4788 Provides system-specific tune up of the C runtime environment
4789 necessary to run Perl interpreters. This should be called only
4790 once, before creating any Perl interpreters.
4791
4792 void PERL_SYS_INIT(int *argc, char*** argv)
4793
4794 PERL_SYS_INIT3
4795 Provides system-specific tune up of the C runtime environment
4796 necessary to run Perl interpreters. This should be called only
4797 once, before creating any Perl interpreters.
4798
4799 void PERL_SYS_INIT3(int *argc, char*** argv,
4800 char*** env)
4801
4802 PERL_SYS_TERM
4803 Provides system-specific clean up of the C runtime environment
4804 after running Perl interpreters. This should be called only
4805 once, after freeing any remaining Perl interpreters.
4806
4807 void PERL_SYS_TERM()
4808
4809 quadmath_format_needed
4810 "quadmath_format_needed()" returns true if the "format" string
4811 seems to contain at least one non-Q-prefixed "%[efgaEFGA]"
4812 format specifier, or returns false otherwise.
4813
4814 The format specifier detection is not complete printf-syntax
4815 detection, but it should catch most common cases.
4816
4817 If true is returned, those arguments should in theory be
4818 processed with "quadmath_snprintf()". However, if there is more
4819 than one such format specifier (see "quadmath_format_single"),
4820 or if there is anything else beyond that one specifier (even
4821 just a single byte), they cannot be processed, because
4822 "quadmath_snprintf()" is very strict, accepting only one format
4823 spec, and nothing else. In this case, the code should probably
4824 fail.
4825
4826 bool quadmath_format_needed(const char* format)
4827
4828 quadmath_format_single
4829 "quadmath_snprintf()" is very strict about its "format" string
4830 and will fail, returning -1, if the format is invalid. It
4831 accepts exactly one format spec.
4832
4833 "quadmath_format_single()" checks that the intended single spec
4834 looks sane: begins with "%", has only one "%", ends with
4835 "[efgaEFGA]", and has "Q" before it. This is not a full
4836 "printf syntax check", just the basics.
4837
4838 Returns the format if it is valid, NULL if not.
4839
4840 "quadmath_format_single()" can and will actually patch in the
4841 missing "Q", if necessary. In this case it will return the
4842 modified copy of the format, which the caller will need to
4843 free.
4844
4845 See also "quadmath_format_needed".
4846
4847 const char* quadmath_format_single(const char* format)
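
The four basic checks listed above can be sketched as a standalone predicate. "sketch_single_quad_spec" is a hypothetical name for this example; unlike the real function it only inspects the format and never patches in a missing "Q" or allocates a copy.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical check mirroring the rules described above: begins with
 * '%', contains only that one '%', ends with one of [efgaEFGA], and has
 * 'Q' immediately before that final letter. Returns 1 or 0. */
static int sketch_single_quad_spec(const char *fmt)
{
    size_t len = strlen(fmt);

    if (len < 3 || fmt[0] != '%')
        return 0;                       /* shortest valid form is "%Qg" */
    if (strchr(fmt + 1, '%') != NULL)
        return 0;                       /* a second '%' is not allowed */
    if (strchr("efgaEFGA", fmt[len - 1]) == NULL)
        return 0;                       /* must end in a float conversion */
    return fmt[len - 2] == 'Q';
}
```

Like the function it illustrates, this is not a full printf syntax check; it would, for example, accept stray characters between the "%" and the "Q".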
4848
4849 READ_XDIGIT
4850 Returns the value of an ASCII-range hex digit and advances the
4851 string pointer. Behaviour is only well defined when
4852 isXDIGIT(*str) is true.
4853
4854 U8 READ_XDIGIT(char* str)
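
A minimal sketch of the read-and-advance pattern, assuming an ASCII platform. "SKETCH_READ_XDIGIT" and "parse_hex_run" are hypothetical names for this example; the real macro comes from Perl's headers and may be implemented differently.

```c
#include <ctype.h>

/* Hypothetical stand-in for the behaviour described above: yield the value
 * of the hex digit at *s and advance s. It is only meaningful when
 * isxdigit((unsigned char)*s) is true. ASCII is assumed. */
#define SKETCH_READ_XDIGIT(s) \
    ((unsigned char)(isdigit((unsigned char)*(s)) \
        ? (*(s)++ - '0') \
        : ((*(s)++ | 0x20) - 'a' + 10)))

/* Parses a run of hex digits, stopping at the first non-hex byte.
 * Overflow of unsigned long is not checked in this sketch. */
static unsigned long parse_hex_run(const char **sp)
{
    const char *s = *sp;
    unsigned long value = 0;

    while (isxdigit((unsigned char)*s))
        value = (value << 4) | SKETCH_READ_XDIGIT(s);
    *sp = s;
    return value;
}
```

The caller tests each byte with a hex-digit predicate before reading it, which is the precondition the entry above requires.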
4855
4856 rninstr Like "ninstr", but instead finds the final (rightmost)
4857 occurrence of a sequence of bytes within another sequence,
4858 returning "NULL" if there is no such occurrence.
4859
4860 char * rninstr(char * big, char * bigend,
4861 char * little, char * little_end)
4862
4863 strEQ Test two "NUL"-terminated strings to see if they are equal.
4864 Returns true or false.
4865
4866 bool strEQ(char* s1, char* s2)
4867
4868 strGE Test two "NUL"-terminated strings to see if the first, "s1", is
4869 greater than or equal to the second, "s2". Returns true or
4870 false.
4871
4872 bool strGE(char* s1, char* s2)
4873
4874 strGT Test two "NUL"-terminated strings to see if the first, "s1", is
4875 greater than the second, "s2". Returns true or false.
4876
4877 bool strGT(char* s1, char* s2)
4878
4879 strLE Test two "NUL"-terminated strings to see if the first, "s1", is
4880 less than or equal to the second, "s2". Returns true or false.
4881
4882 bool strLE(char* s1, char* s2)
4883
4884 strLT Test two "NUL"-terminated strings to see if the first, "s1", is
4885 less than the second, "s2". Returns true or false.
4886
4887 bool strLT(char* s1, char* s2)
4888
4889 strNE Test two "NUL"-terminated strings to see if they are different.
4890 Returns true or false.
4891
4892 bool strNE(char* s1, char* s2)
4893
4894 strnEQ Test two "NUL"-terminated strings to see if they are equal.
4895 The "len" parameter indicates the number of bytes to compare.
4896 Returns true or false. (A wrapper for "strncmp").
4897
4898 bool strnEQ(char* s1, char* s2, STRLEN len)
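
A common use of a length-limited comparison is a prefix test. The sketch below assumes only that the macro wraps "strncmp" as stated above; "SKETCH_strnEQ" and "has_prefix" are hypothetical names defined locally so that the example stands alone.

```c
#include <string.h>

/* Hypothetical local equivalent of the documented behaviour; the real
 * macro comes from Perl's headers. */
#define SKETCH_strnEQ(s1, s2, len) (strncmp((s1), (s2), (len)) == 0)

/* Returns 1 if str begins with prefix, 0 otherwise. */
static int has_prefix(const char *str, const char *prefix)
{
    return SKETCH_strnEQ(str, prefix, strlen(prefix));
}
```

Because "strncmp" stops at a "NUL" in either string, a string shorter than the prefix simply compares unequal rather than being read past its end.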
4899
4900 strnNE Test two "NUL"-terminated strings to see if they are different.
4901 The "len" parameter indicates the number of bytes to compare.
4902 Returns true or false. (A wrapper for "strncmp").
4903
4904 bool strnNE(char* s1, char* s2, STRLEN len)
4905
4906 sv_destroyable
4907 Dummy routine which reports that an object can be destroyed when
4908 there is no sharing module present. It ignores its single SV
4909 argument, and returns 'true'. Exists to avoid a test for a
4910 "NULL" function pointer and because it could potentially warn
4911 under some level of strict-ness.
4912
4913 bool sv_destroyable(SV *sv)
4914
4915 sv_nosharing
4916 Dummy routine which "shares" an SV when there is no sharing
4917 module present. Or "locks" it. Or "unlocks" it. In other
4918 words, ignores its single SV argument. Exists to avoid a test
4919 for a "NULL" function pointer and because it could potentially
4920 warn under some level of strict-ness.
4921
4922 void sv_nosharing(SV *sv)
4923
4924 vmess "pat" and "args" are a sprintf-style format pattern and
4925 encapsulated argument list, respectively. These are used to
4926 generate a string message. If the message does not end with a
4927 newline, then it will be extended with some indication of the
4928 current location in the code, as described for "mess_sv".
4929
4930 Normally, the resulting message is returned in a new mortal SV.
4931 During global destruction a single SV may be shared between
4932 uses of this function.
4933
4934 SV * vmess(const char *pat, va_list *args)
4935
4937 These functions are related to the method resolution order of perl
4938 classes.
4939
4940 mro_get_linear_isa
4941 Returns the mro linearisation for the given stash. By default,
4942 this will be whatever "mro_get_linear_isa_dfs" returns unless
4943 some other MRO is in effect for the stash. The return value is
4944 a read-only AV*.
4945
4946 You are responsible for "SvREFCNT_inc()" on the return value if
4947 you plan to store it anywhere semi-permanently (otherwise it
4948 might be deleted out from under you the next time the cache is
4949 invalidated).
4950
4951 AV* mro_get_linear_isa(HV* stash)
4952
4953 mro_method_changed_in
4954 Invalidates method caching on any child classes of the given
4955 stash, so that they might notice the changes in this one.
4956
4957 Ideally, all instances of "PL_sub_generation++" in perl source
4958 outside of mro.c should be replaced by calls to this.
4959
4960 Perl automatically handles most of the common ways a method
4961 might be redefined. However, there are a few ways you could
4962 change a method in a stash without the cache code noticing, in
4963 which case you need to call this method afterwards:
4964
4965 1) Directly manipulating the stash HV entries from XS code.
4966
4967 2) Assigning a reference to a readonly scalar constant into a
4968 stash entry in order to create a constant subroutine (like
4969 constant.pm does).
4970
4971 This same method is available from pure perl via
4972 "mro::method_changed_in(classname)".
4973
4974 void mro_method_changed_in(HV* stash)
4975
4976 mro_register
4977 Registers a custom mro plugin. See perlmroapi for details.
4978
4979 void mro_register(const struct mro_alg *mro)
4980
4982 dMULTICALL
4983 Declare local variables for a multicall. See "LIGHTWEIGHT
4984 CALLBACKS" in perlcall.
4985
4986 dMULTICALL;
4987
4988 MULTICALL
4989 Make a lightweight callback. See "LIGHTWEIGHT CALLBACKS" in
4990 perlcall.
4991
4992 MULTICALL;
4993
4994 POP_MULTICALL
4995 Closing bracket for a lightweight callback. See "LIGHTWEIGHT
4996 CALLBACKS" in perlcall.
4997
4998 POP_MULTICALL;
4999
5000 PUSH_MULTICALL
5001 Opening bracket for a lightweight callback. See "LIGHTWEIGHT
5002 CALLBACKS" in perlcall.
5003
5004 PUSH_MULTICALL;
5005
5007 grok_bin
5008 Converts a string representing a binary number to numeric form.
5009
5010 On entry "start" and *len_p give the string to scan, *flags
5011 gives conversion flags, and "result" should be "NULL" or a
5012 pointer to an NV. The scan stops at the end of the string, or
5013 the first invalid character. Unless "PERL_SCAN_SILENT_ILLDIGIT"
5014 is set in *flags, encountering an invalid character will also
5015 trigger a warning. On return *len_p is set to the length of the
5016 scanned string, and *flags gives output flags.
5017
5018 If the value is <= "UV_MAX" it is returned as a UV, the output
5019 flags are clear, and nothing is written to *result. If the
5020 value is > "UV_MAX", "grok_bin" returns "UV_MAX", sets
5021 "PERL_SCAN_GREATER_THAN_UV_MAX" in the output flags, and writes
5022 the value to *result (or the value is discarded if "result" is
5023 "NULL").
5024
5025 The binary number may optionally be prefixed with "0b" or "b"
5026 unless "PERL_SCAN_DISALLOW_PREFIX" is set in *flags on entry.
5027 If "PERL_SCAN_ALLOW_UNDERSCORES" is set in *flags then the
5028 binary number may use "_" characters to separate digits.
5029
5030 UV grok_bin(const char* start, STRLEN* len_p,
5031 I32* flags, NV *result)
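
The in/out parameter contract and the overflow path described above can be sketched in standalone C. Everything named here is an assumption made for the example: "sketch_grok_bin" is a simplified stand-in (no prefix handling, no underscores, no warnings, input flags ignored), "uint64_t" and "double" stand in for UV and NV, and "SKETCH_GREATER_THAN_MAX" is a placeholder for the real output flag.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_GREATER_THAN_MAX 0x02   /* hypothetical output flag */

/* Simplified stand-in for the contract described above. *len_p is the
 * input length on entry and the scanned length on return; *flags receives
 * the output flags; *result is written only on overflow. */
static uint64_t sketch_grok_bin(const char *start, size_t *len_p,
                                int *flags, double *result)
{
    uint64_t value = 0;
    double big = 0.0;
    int overflowed = 0;
    size_t i;

    for (i = 0; i < *len_p; i++) {
        char c = start[i];
        if (c != '0' && c != '1')
            break;                          /* stop at first invalid byte */
        if (!overflowed && value > (UINT64_MAX >> 1)) {
            overflowed = 1;                 /* next shift would lose a bit */
            big = (double)value;
        }
        if (overflowed)
            big = big * 2.0 + (c - '0');    /* continue approximately */
        else
            value = (value << 1) | (uint64_t)(c - '0');
    }

    *len_p = i;
    *flags = 0;
    if (overflowed) {
        *flags = SKETCH_GREATER_THAN_MAX;
        if (result)
            *result = big;                  /* discarded when result is NULL */
        return UINT64_MAX;
    }
    return value;
}
```

A caller compares the returned length with the length it passed in to learn whether the whole string was consumed, and tests the output flags before trusting the integer return value.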
5032
5033 grok_hex
5034 Converts a string representing a hex number to numeric form.
5035
5036 On entry "start" and *len_p give the string to scan, *flags
5037 gives conversion flags, and "result" should be "NULL" or a
5038 pointer to an NV. The scan stops at the end of the string, or
5039 the first invalid character. Unless
5040 "PERL_SCAN_SILENT_ILLDIGIT" is set in *flags, encountering an
5041 invalid character will also trigger a warning. On return *len_p
5042 is set to the length of the scanned string, and *flags gives
5043 output flags.
5044
5045 If the value is <= "UV_MAX" it is returned as a UV, the output
5046 flags are clear, and nothing is written to *result. If the
5047 value is > "UV_MAX", "grok_hex" returns "UV_MAX", sets
5048 "PERL_SCAN_GREATER_THAN_UV_MAX" in the output flags, and writes
5049 the value to *result (or the value is discarded if "result" is
5050 "NULL").
5051
5052 The hex number may optionally be prefixed with "0x" or "x"
5053 unless "PERL_SCAN_DISALLOW_PREFIX" is set in *flags on entry.
5054 If "PERL_SCAN_ALLOW_UNDERSCORES" is set in *flags then the hex
5055 number may use "_" characters to separate digits.
5056
5057 UV grok_hex(const char* start, STRLEN* len_p,
5058 I32* flags, NV *result)
5059
5060 grok_infnan
5061 Helper for "grok_number()", accepts various ways of spelling
5062 "infinity" or "not a number", and returns one of the following
5063 flag combinations:
5064
5065 IS_NUMBER_INFINITY
5066 IS_NUMBER_NAN
5067 IS_NUMBER_INFINITY | IS_NUMBER_NEG
5068 IS_NUMBER_NAN | IS_NUMBER_NEG
5069 0
5070
5071 possibly |-ed with "IS_NUMBER_TRAILING".
5072
5073 If an infinity or a not-a-number is recognized, *sp will point
5074 to one byte past the end of the recognized string. If the
5075 recognition fails, zero is returned, and *sp will not move.
5076
5077 int grok_infnan(const char** sp, const char *send)
5078
5079 grok_number
5080 Identical to "grok_number_flags()" with "flags" set to zero.
5081
5082 int grok_number(const char *pv, STRLEN len,
5083 UV *valuep)
5084
5085 grok_number_flags
5086 Recognise (or not) a number. The type of the number is
5087 returned (0 if unrecognised), otherwise it is a bit-ORed
5088 combination of "IS_NUMBER_IN_UV",
5089 "IS_NUMBER_GREATER_THAN_UV_MAX", "IS_NUMBER_NOT_INT",
5090 "IS_NUMBER_NEG", "IS_NUMBER_INFINITY", "IS_NUMBER_NAN" (defined
5091 in perl.h).
5092
5093 If the value of the number can fit in a UV, it is returned in
5094 *valuep. "IS_NUMBER_IN_UV" will be set to indicate that
5095 *valuep is valid. "IS_NUMBER_IN_UV" will never be set unless
5096 *valuep is valid, but *valuep may have been assigned to during
5097 processing even though "IS_NUMBER_IN_UV" is not set on return.
5098 If "valuep" is "NULL", "IS_NUMBER_IN_UV" will be set for the
5099 same cases as when "valuep" is non-"NULL", but no actual
5100 assignment (or SEGV) will occur.
5101
5102 "IS_NUMBER_NOT_INT" will be set with "IS_NUMBER_IN_UV" if
5103 trailing decimals were seen (in which case *valuep gives the
5104 true value truncated to an integer), and "IS_NUMBER_NEG" if the
5105 number is negative (in which case *valuep holds the absolute
5106 value). "IS_NUMBER_IN_UV" is not set if e notation was used or
5107 the number is larger than a UV.
5108
5109 "flags" allows only "PERL_SCAN_TRAILING", which allows for
5110 trailing non-numeric text on an otherwise successful grok,
5111 setting "IS_NUMBER_TRAILING" on the result.
5112
5113 int grok_number_flags(const char *pv, STRLEN len,
5114 UV *valuep, U32 flags)
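
The way the returned bits and *valuep combine can be sketched as a small decoder. This is an illustration under stated assumptions: the "SKETCH_" constants are placeholder bit values chosen for the example (real code would use the "IS_NUMBER_*" definitions from perl.h), "uint64_t" stands in for UV, and "sketch_exact_int" is a hypothetical helper.

```c
#include <stdint.h>

/* Placeholder bit values chosen for this example only. */
enum {
    SKETCH_IN_UV   = 0x01,
    SKETCH_NOT_INT = 0x04,
    SKETCH_NEG     = 0x08
};

/* Returns 1 and stores the signed value when the result describes an
 * exact integer that fits in int64_t; returns 0 otherwise. */
static int sketch_exact_int(int type, uint64_t value, int64_t *out)
{
    if (!(type & SKETCH_IN_UV) || (type & SKETCH_NOT_INT))
        return 0;                   /* no valid value, or decimals truncated */

    if (type & SKETCH_NEG) {
        /* value holds the absolute value; INT64_MIN needs special care. */
        if (value > (uint64_t)INT64_MAX + 1u)
            return 0;
        *out = (value == (uint64_t)INT64_MAX + 1u) ? INT64_MIN
                                                   : -(int64_t)value;
    } else {
        if (value > (uint64_t)INT64_MAX)
            return 0;
        *out = (int64_t)value;
    }
    return 1;
}
```

The decoder checks the in-UV bit first because, as noted above, *valuep may have been written to even when that bit is not set on return.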
5115
5116 grok_numeric_radix
5117 Scan and skip for a numeric decimal separator (radix).
5118
5119 bool grok_numeric_radix(const char **sp,
5120 const char *send)
5121
5122 grok_oct
5123 Converts a string representing an octal number to numeric form.
5124
5125 On entry "start" and *len_p give the string to scan, *flags
5126 gives conversion flags, and "result" should be "NULL" or a
5127 pointer to an NV. The scan stops at the end of the string, or
5128 the first invalid character. Unless "PERL_SCAN_SILENT_ILLDIGIT"
5129 is set in *flags, encountering an 8 or 9 will also trigger a
5130 warning. On return *len_p is set to the length of the scanned
5131 string, and *flags gives output flags.
5132
5133 If the value is <= "UV_MAX" it is returned as a UV, the output
5134 flags are clear, and nothing is written to *result. If the
5135 value is > "UV_MAX", "grok_oct" returns "UV_MAX", sets
5136 "PERL_SCAN_GREATER_THAN_UV_MAX" in the output flags, and writes
5137 the value to *result (or the value is discarded if "result" is
5138 "NULL").
5139
5140 If "PERL_SCAN_ALLOW_UNDERSCORES" is set in *flags then the
5141 octal number may use "_" characters to separate digits.
5142
5143 UV grok_oct(const char* start, STRLEN* len_p,
5144 I32* flags, NV *result)
5145
5146 isinfnan
5147 "Perl_isinfnan()" is a utility function that returns true if the
5148 NV argument is either an infinity or a "NaN", false otherwise.
5149 To test in more detail, use "Perl_isinf()" and "Perl_isnan()".
5150
5151 This is also the logical inverse of Perl_isfinite().
5152
5153 bool isinfnan(NV nv)
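
The relationship stated above can be sketched with the C99 classification macros. "sketch_isinfnan" is a hypothetical stand-in that uses "double" in place of NV; it is not Perl's implementation.

```c
#include <math.h>

/* Hypothetical stand-in: true (1) for an infinity or a NaN, false (0)
 * for any finite value, using the C99 isinf()/isnan() macros. */
static int sketch_isinfnan(double nv)
{
    return isinf(nv) || isnan(nv);
}
```

For every input, the result is the logical inverse of the C99 "isfinite()" macro, which mirrors the statement above about "Perl_isfinite()".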
5154
5155 my_strtod
5156 This function is equivalent to the libc strtod() function, and
5157 is available even on platforms that lack plain strtod(). Its
5158 return value is the best available precision depending on
5159 platform capabilities and Configure options.
5160
5161 It properly handles the locale radix character, meaning it
5162 expects a dot except when called from within the scope of
5163 "use locale", in which case the radix character should be that
5164 specified by the current locale.
5165
5166 The synonym Strtod() may be used instead.
5167
5168 NV my_strtod(const char * const s, char ** e)
5169
5170 Perl_signbit
5171 NOTE: this function is experimental and may change or be
5172 removed without notice.
5173
5174 Return a non-zero integer if the sign bit on an NV is set, and
5175 0 if it is not.
5176
5177 If Configure detects this system has a "signbit()" that will
5178 work with our NVs, then we just use it via the "#define" in
5179 perl.h. Otherwise, fall back on this implementation. The main
5180 use of this function is catching "-0.0".
5181
5182 "Configure" notes: This function is called 'Perl_signbit'
5183 instead of a plain 'signbit' because it is easy to imagine a
5184 system having a "signbit()" function or macro that doesn't
5185 happen to work with our particular choice of NVs. We shouldn't
5186 just re-"#define" "signbit" as "Perl_signbit" and expect the
5187 standard system headers to be happy. Also, this is a no-
5188 context function (no "pTHX_") because "Perl_signbit()" is
5189 usually re-"#defined" in perl.h as a simple macro call to the
5190 system's "signbit()". Users should just always call
5191 "Perl_signbit()".
5192
5193 int Perl_signbit(NV f)
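
The "-0.0" case mentioned above can be sketched with the C99 "signbit()" macro, which is what "Perl_signbit()" maps to on systems where Configure finds it usable. "is_negative_zero" is a hypothetical helper, and "double" stands in for NV.

```c
#include <math.h>

/* Returns 1 for negative zero only. An ordinary comparison cannot tell
 * -0.0 from 0.0 (they compare equal), so the sign bit is inspected. */
static int is_negative_zero(double x)
{
    return x == 0.0 && signbit(x) != 0;
}
```

Extension code would call "Perl_signbit()" in place of "signbit()" so that the fallback implementation is used where the system one is unsuitable.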
5194
5195 scan_bin
5196 For backwards compatibility. Use "grok_bin" instead.
5197
5198 NV scan_bin(const char* start, STRLEN len,
5199 STRLEN* retlen)
5200
5201 scan_hex
5202 For backwards compatibility. Use "grok_hex" instead.
5203
5204 NV scan_hex(const char* start, STRLEN len,
5205 STRLEN* retlen)
5206
5207 scan_oct
5208 For backwards compatibility. Use "grok_oct" instead.
5209
5210 NV scan_oct(const char* start, STRLEN len,
5211 STRLEN* retlen)
5212
5214 Some of these are also deprecated. You can exclude these from your
5215 compiled Perl by adding this option to Configure:
5216 "-Accflags='-DNO_MATHOMS'"
5217
5218 custom_op_desc
5219 Return the description of a given custom op. This was once
5220 used by the "OP_DESC" macro, but is no longer: it has only been
5221 kept for compatibility, and should not be used.
5222
5223 const char * custom_op_desc(const OP *o)
5224
5225 custom_op_name
5226 Return the name for a given custom op. This was once used by
5227 the "OP_NAME" macro, but is no longer: it has only been kept
5228 for compatibility, and should not be used.
5229
5230 const char * custom_op_name(const OP *o)
5231
5232 gv_fetchmethod
5233 See "gv_fetchmethod_autoload".
5234
5235 GV* gv_fetchmethod(HV* stash, const char* name)
5236
5237 is_utf8_char
5238 DEPRECATED! It is planned to remove this function from a
5239 future release of Perl. Do not use it for new code; remove it
5240 from existing code.
5241
5242 Tests if some arbitrary number of bytes begins in a valid UTF-8
5243 character. Note that an INVARIANT (i.e. ASCII on non-EBCDIC
5244 machines) character is a valid UTF-8 character. The actual
5245 number of bytes in the UTF-8 character will be returned if it
5246 is valid, otherwise 0.
5247
5248 This function is deprecated due to the possibility that
5249 malformed input could cause reading beyond the end of the input
5250 buffer. Use "isUTF8_CHAR" instead.
5251
5252 STRLEN is_utf8_char(const U8 *s)
5253
5254 is_utf8_char_buf
5255 This is identical to the macro "isUTF8_CHAR".
5256
5257 STRLEN is_utf8_char_buf(const U8 *buf,
5258 const U8 *buf_end)
5259
5260 pack_cat
5261 The engine implementing the "pack()" Perl function. Note:
5262 parameters "next_in_list" and "flags" are not used. This call
5263 should not be used; use "packlist" instead.
5264
5265 void pack_cat(SV *cat, const char *pat,
5266 const char *patend, SV **beglist,
5267 SV **endlist, SV ***next_in_list,
5268 U32 flags)
5269
5270 pad_compname_type
5271 Looks up the type of the lexical variable at position "po" in
5272 the currently-compiling pad. If the variable is typed, the
5273 stash of the class to which it is typed is returned. If not,
5274 "NULL" is returned.
5275
5276 HV * pad_compname_type(PADOFFSET po)
5277
5278 sv_2pvbyte_nolen
5279 Return a pointer to the byte-encoded representation of the SV.
5280 May cause the SV to be downgraded from UTF-8 as a side-effect.
5281
5282 Usually accessed via the "SvPVbyte_nolen" macro.
5283
5284 char* sv_2pvbyte_nolen(SV* sv)
5285
5286 sv_2pvutf8_nolen
5287 Return a pointer to the UTF-8-encoded representation of the SV.
5288 May cause the SV to be upgraded to UTF-8 as a side-effect.
5289
5290 Usually accessed via the "SvPVutf8_nolen" macro.
5291
5292 char* sv_2pvutf8_nolen(SV* sv)
5293
5294 sv_2pv_nolen
5295 Like "sv_2pv()", but doesn't return the length too. You should
5296 usually use the macro wrapper "SvPV_nolen(sv)" instead.
5297
5298 char* sv_2pv_nolen(SV* sv)
5299
5300 sv_catpvn_mg
5301 Like "sv_catpvn", but also handles 'set' magic.
5302
5303 void sv_catpvn_mg(SV *sv, const char *ptr,
5304 STRLEN len)
5305
5306 sv_catsv_mg
5307 Like "sv_catsv", but also handles 'set' magic.
5308
5309 void sv_catsv_mg(SV *dsv, SV *ssv)
5310
5311 sv_force_normal
5312 Undo various types of fakery on an SV: if the PV is a shared
5313 string, make a private copy; if we're a ref, stop refing; if
5314 we're a glob, downgrade to an "xpvmg". See also
5315 "sv_force_normal_flags".
5316
5317 void sv_force_normal(SV *sv)
5318
5319 sv_iv A private implementation of the "SvIVx" macro for compilers
5320 which can't cope with complex macro expressions. Always use
5321 the macro instead.
5322
5323 IV sv_iv(SV* sv)
5324
5325 sv_nolocking
5326 Dummy routine which "locks" an SV when there is no locking
5327 module present. Exists to avoid a test for a "NULL" function
5328 pointer and because it could potentially warn under some level
5329 of strict-ness.
5330
5331 "Superseded" by "sv_nosharing()".
5332
5333 void sv_nolocking(SV *sv)
5334
5335 sv_nounlocking
5336 Dummy routine which "unlocks" an SV when there is no locking
5337 module present. Exists to avoid a test for a "NULL" function
5338 pointer and because it could potentially warn under some level
5339 of strict-ness.
5340
5341 "Superseded" by "sv_nosharing()".
5342
5343 void sv_nounlocking(SV *sv)
5344
5345 sv_nv A private implementation of the "SvNVx" macro for compilers
5346 which can't cope with complex macro expressions. Always use
5347 the macro instead.
5348
5349 NV sv_nv(SV* sv)
5350
5351 sv_pv Use the "SvPV_nolen" macro instead
5352
5353 char* sv_pv(SV *sv)
5354
5355 sv_pvbyte
5356 Use "SvPVbyte_nolen" instead.
5357
5358 char* sv_pvbyte(SV *sv)
5359
5360 sv_pvbyten
5361 A private implementation of the "SvPVbyte" macro for compilers
5362 which can't cope with complex macro expressions. Always use
5363 the macro instead.
5364
5365 char* sv_pvbyten(SV *sv, STRLEN *lp)
5366
5367 sv_pvn A private implementation of the "SvPV" macro for compilers
5368 which can't cope with complex macro expressions. Always use
5369 the macro instead.
5370
5371 char* sv_pvn(SV *sv, STRLEN *lp)
5372
5373 sv_pvutf8
5374 Use the "SvPVutf8_nolen" macro instead.
5375
5376 char* sv_pvutf8(SV *sv)
5377
5378 sv_pvutf8n
5379 A private implementation of the "SvPVutf8" macro for compilers
5380 which can't cope with complex macro expressions. Always use
5381 the macro instead.
5382
5383 char* sv_pvutf8n(SV *sv, STRLEN *lp)
5384
5385 sv_taint
5386 Taint an SV. Use "SvTAINTED_on" instead.
5387
5388 void sv_taint(SV* sv)
5389
5390 sv_unref
5391 Unsets the RV status of the SV, and decrements the reference
5392 count of whatever was being referenced by the RV. This can
5393 almost be thought of as a reversal of "newSVrv". This is
5394 "sv_unref_flags" with the "flag" being zero. See "SvROK_off".
5395
5396 void sv_unref(SV* sv)
5397
5398 sv_usepvn
5399 Tells an SV to use "ptr" to find its string value. Implemented
5400 by calling "sv_usepvn_flags" with "flags" of 0, hence does not
5401 handle 'set' magic. See "sv_usepvn_flags".
5402
5403 void sv_usepvn(SV* sv, char* ptr, STRLEN len)
5404
5405 sv_usepvn_mg
5406 Like "sv_usepvn", but also handles 'set' magic.
5407
5408 void sv_usepvn_mg(SV *sv, char *ptr, STRLEN len)
5409
5410 sv_uv A private implementation of the "SvUVx" macro for compilers
5411 which can't cope with complex macro expressions. Always use
5412 the macro instead.
5413
5414 UV sv_uv(SV* sv)
5415
5416 unpack_str
5417 The engine implementing the "unpack()" Perl function. Note:
5418 parameters "strbeg", "new_s" and "ocnt" are not used. This
5419 call should not be used; use "unpackstring" instead.
5420
5421 SSize_t unpack_str(const char *pat, const char *patend,
5422 const char *s, const char *strbeg,
5423 const char *strend, char **new_s,
5424 I32 ocnt, U32 flags)
5425
5426 utf8_to_uvuni
5427 DEPRECATED! It is planned to remove this function from a
5428 future release of Perl. Do not use it for new code; remove it
5429 from existing code.
5430
5431 Returns the Unicode code point of the first character in the
5432 string "s" which is assumed to be in UTF-8 encoding; "retlen"
5433 will be set to the length, in bytes, of that character.
5434
5435 Some, but not all, UTF-8 malformations are detected, and in
5436 fact, some malformed input could cause reading beyond the end
5437 of the input buffer, which is one reason why this function is
5438 deprecated. The other is that only in extremely limited
5439 circumstances should the Unicode versus native code point be of
5440 any interest to you. See "utf8_to_uvuni_buf" for alternatives.
5441
5442 If "s" points to one of the detected malformations, and UTF8
5443 warnings are enabled, zero is returned and *retlen is set (if
5444 "retlen" doesn't point to NULL) to -1. If those warnings are
5445 off, the computed value, if well-defined (or the Unicode
5446 REPLACEMENT CHARACTER, if not) is silently returned, and
5447 *retlen is set (if "retlen" isn't NULL) so that ("s" + *retlen)
5448 is the next possible position in "s" that could begin a non-
5449 malformed character. See "utf8n_to_uvchr" for details on when
5450 the REPLACEMENT CHARACTER is returned.
5451
5452 UV utf8_to_uvuni(const U8 *s, STRLEN *retlen)
5453
5455 newASSIGNOP
5456 Constructs, checks, and returns an assignment op. "left" and
5457 "right" supply the parameters of the assignment; they are
5458 consumed by this function and become part of the constructed op
5459 tree.
5460
5461 If "optype" is "OP_ANDASSIGN", "OP_ORASSIGN", or
5462 "OP_DORASSIGN", then a suitable conditional optree is
5463 constructed. If "optype" is the opcode of a binary operator,
5464 such as "OP_BIT_OR", then an op is constructed that performs
5465 the binary operation and assigns the result to the left
5466 argument. Either way, if "optype" is non-zero then "flags" has
5467 no effect.
5468
5469 If "optype" is zero, then a plain scalar or list assignment is
5470 constructed. The type of assignment is determined
5471 automatically. "flags" gives the eight bits of "op_flags", except
5472 that "OPf_KIDS" will be set automatically, and, shifted up
5473 eight bits, the eight bits of "op_private", except that the bit
5474 with value 1 or 2 is automatically set as required.
5475
5476 OP * newASSIGNOP(I32 flags, OP *left, I32 optype,
5477 OP *right)
5478
5479 newBINOP
5480 Constructs, checks, and returns an op of any binary type.
5481 "type" is the opcode. "flags" gives the eight bits of
5482 "op_flags", except that "OPf_KIDS" will be set automatically,
5483 and, shifted up eight bits, the eight bits of "op_private",
5484 except that the bit with value 1 or 2 is automatically set as
5485 required. "first" and "last" supply up to two ops to be the
5486 direct children of the binary op; they are consumed by this
5487 function and become part of the constructed op tree.
5488
5489 OP * newBINOP(I32 type, I32 flags, OP *first,
5490 OP *last)
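
The layout of the combined "flags" argument described above can be sketched as a small packing helper. "sketch_pack_op_flags" is a hypothetical name invented for this example and is not part of the Perl API; it only shows the bit arithmetic.

```c
/* Packs the two 8-bit fields into the single "flags" argument layout
 * described above: op_flags in bits 0-7, op_private in bits 8-15.
 * Bits outside each 8-bit field are masked off. */
static int sketch_pack_op_flags(unsigned op_flags, unsigned op_private)
{
    return (int)((op_flags & 0xFFu) | ((op_private & 0xFFu) << 8));
}
```

The bits that the entry above says are set automatically need not be included by the caller when building this value.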
5491
5492 newCONDOP
5493 Constructs, checks, and returns a conditional-expression
5494 ("cond_expr") op. "flags" gives the eight bits of "op_flags",
5495 except that "OPf_KIDS" will be set automatically, and, shifted
5496 up eight bits, the eight bits of "op_private", except that the
5497 bit with value 1 is automatically set. "first" supplies the
5498 expression selecting between the two branches, and "trueop" and
5499 "falseop" supply the branches; they are consumed by this
5500 function and become part of the constructed op tree.
5501
5502 OP * newCONDOP(I32 flags, OP *first, OP *trueop,
5503 OP *falseop)
5504
5505 newDEFSVOP
5506 Constructs and returns an op to access $_.
5507
5508 OP * newDEFSVOP()
5509
5510 newFOROP
5511 Constructs, checks, and returns an op tree expressing a
5512 "foreach" loop (iteration through a list of values). This is a
5513 heavyweight loop, with structure that allows exiting the loop
5514 by "last" and suchlike.
5515
5516 "sv" optionally supplies the variable that will be aliased to
5517 each item in turn; if null, it defaults to $_. "expr" supplies
5518 the list of values to iterate over. "block" supplies the main
5519 body of the loop, and "cont" optionally supplies a "continue"
5520 block that operates as a second half of the body. All of these
5521 optree inputs are consumed by this function and become part of
5522 the constructed op tree.
5523
5524 "flags" gives the eight bits of "op_flags" for the "leaveloop"
5525 op and, shifted up eight bits, the eight bits of "op_private"
5526 for the "leaveloop" op, except that (in both cases) some bits
5527 will be set automatically.
5528
5529 OP * newFOROP(I32 flags, OP *sv, OP *expr, OP *block,
5530 OP *cont)
5531
5532 newGIVENOP
5533 Constructs, checks, and returns an op tree expressing a "given"
5534 block. "cond" supplies the expression to whose value $_ will
5535 be locally aliased, and "block" supplies the body of the
5536 "given" construct; they are consumed by this function and
5537 become part of the constructed op tree. "defsv_off" must be
5538 zero (it used to identify the pad slot of lexical $_).
5539
5540 OP * newGIVENOP(OP *cond, OP *block,
5541 PADOFFSET defsv_off)
5542
5543 newGVOP Constructs, checks, and returns an op of any type that involves
5544 an embedded reference to a GV. "type" is the opcode. "flags"
5545 gives the eight bits of "op_flags". "gv" identifies the GV
5546 that the op should reference; calling this function does not
5547 transfer ownership of any reference to it.
5548
5549 OP * newGVOP(I32 type, I32 flags, GV *gv)
5550
5551 newLISTOP
5552 Constructs, checks, and returns an op of any list type. "type"
5553 is the opcode. "flags" gives the eight bits of "op_flags",
5554 except that "OPf_KIDS" will be set automatically if required.
5555 "first" and "last" supply up to two ops to be direct children
5556 of the list op; they are consumed by this function and become
5557 part of the constructed op tree.
5558
5559 For most list operators, the check function expects all the kid
5560 ops to be present already, so calling "newLISTOP(OP_JOIN, ...)"
5561 (e.g.) is not appropriate. What you want to do in that case is
5562 create an op of type "OP_LIST", append more children to it, and
5563 then call "op_convert_list". See "op_convert_list" for more
5564 information.
5565
5566 OP * newLISTOP(I32 type, I32 flags, OP *first,
5567 OP *last)
5568
5569 newLOGOP
5570 Constructs, checks, and returns a logical (flow control) op.
5571 "type" is the opcode. "flags" gives the eight bits of
5572 "op_flags", except that "OPf_KIDS" will be set automatically,
5573 and, shifted up eight bits, the eight bits of "op_private",
5574 except that the bit with value 1 is automatically set. "first"
5575 supplies the expression controlling the flow, and "other"
5576 supplies the side (alternate) chain of ops; they are consumed
5577 by this function and become part of the constructed op tree.
5578
5579 OP * newLOGOP(I32 type, I32 flags, OP *first,
5580 OP *other)
5581
5582 newLOOPEX
5583 Constructs, checks, and returns a loop-exiting op (such as
5584 "goto" or "last"). "type" is the opcode. "label" supplies the
5585 parameter determining the target of the op; it is consumed by
5586 this function and becomes part of the constructed op tree.
5587
5588 OP * newLOOPEX(I32 type, OP *label)
5589
5590 newLOOPOP
5591 Constructs, checks, and returns an op tree expressing a loop.
5592 This is only a loop in the control flow through the op tree; it
5593 does not have the heavyweight loop structure that allows
5594 exiting the loop by "last" and suchlike. "flags" gives the
5595 eight bits of "op_flags" for the top-level op, except that some
5596 bits will be set automatically as required. "expr" supplies
5597 the expression controlling loop iteration, and "block" supplies
5598 the body of the loop; they are consumed by this function and
5599 become part of the constructed op tree. "debuggable" is
5600 currently unused and should always be 1.
5601
5602 OP * newLOOPOP(I32 flags, I32 debuggable, OP *expr,
5603 OP *block)
5604
5605 newMETHOP
5606 Constructs, checks, and returns an op of method type with a
5607 method name evaluated at runtime. "type" is the opcode.
5608 "flags" gives the eight bits of "op_flags", except that
5609 "OPf_KIDS" will be set automatically, and, shifted up eight
5610 bits, the eight bits of "op_private", except that the bit with
5611 value 1 is automatically set. "dynamic_meth" supplies an op
5612 which evaluates the method name; it is consumed by this function
5613 and becomes part of the constructed op tree. Supported optypes:
5614 "OP_METHOD".
5615
5616 OP * newMETHOP(I32 type, I32 flags, OP *first)
5617
5618 newMETHOP_named
5619 Constructs, checks, and returns an op of method type with a
5620 constant method name. "type" is the opcode. "flags" gives the
5621 eight bits of "op_flags", and, shifted up eight bits, the eight
5622 bits of "op_private". "const_meth" supplies a constant method
5623 name; it must be a shared COW string. Supported optypes:
5624 "OP_METHOD_NAMED".
5625
5626 OP * newMETHOP_named(I32 type, I32 flags,
5627 SV *const_meth)
5628
5629 newNULLLIST
5630 Constructs, checks, and returns a new "stub" op, which
5631 represents an empty list expression.
5632
5633 OP * newNULLLIST()
5634
5635 newOP Constructs, checks, and returns an op of any base type (any
5636 type that has no extra fields). "type" is the opcode. "flags"
5637 gives the eight bits of "op_flags", and, shifted up eight bits,
5638 the eight bits of "op_private".
5639
5640 OP * newOP(I32 type, I32 flags)
5641
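Several constructors in this section take a single "flags" argument that carries "op_flags" in its low eight bits and "op_private" in the next eight. The arithmetic can be sketched in standalone C as follows. This is only an illustration: it does not include the perl headers, and the helper names "pack_op_flags", "unpack_op_flags" and "unpack_op_private" are hypothetical, not part of the Perl API.

```c
#include <stdint.h>

/* Hypothetical helper (not a Perl API function): combine the eight bits
 * of op_flags with the eight bits of op_private, shifted up eight bits,
 * into the single "flags" value described for constructors like newOP. */
static int32_t pack_op_flags(uint8_t op_flags, uint8_t op_private)
{
    return (int32_t)op_flags | ((int32_t)op_private << 8);
}

/* Recover the low eight bits (the op_flags part). */
static uint8_t unpack_op_flags(int32_t flags)
{
    return (uint8_t)(flags & 0xff);
}

/* Recover the next eight bits (the op_private part). */
static uint8_t unpack_op_private(int32_t flags)
{
    return (uint8_t)((flags >> 8) & 0xff);
}
```

In real extension code the packed value would be passed directly as the "flags" argument; bits that a constructor sets automatically (such as "OPf_KIDS") do not need to be supplied.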
5642 newPADOP
5643 Constructs, checks, and returns an op of any type that involves
5644 a reference to a pad element. "type" is the opcode. "flags"
5645 gives the eight bits of "op_flags". A pad slot is
5646 automatically allocated, and is populated with "sv"; this
5647 function takes ownership of one reference to it.
5648
5649 This function only exists if Perl has been compiled to use
5650 ithreads.
5651
5652 OP * newPADOP(I32 type, I32 flags, SV *sv)
5653
5654 newPMOP Constructs, checks, and returns an op of any pattern matching
5655 type. "type" is the opcode. "flags" gives the eight bits of
5656 "op_flags" and, shifted up eight bits, the eight bits of
5657 "op_private".
5658
5659 OP * newPMOP(I32 type, I32 flags)
5660
5661 newPVOP Constructs, checks, and returns an op of any type that involves
5662 an embedded C-level pointer (PV). "type" is the opcode.
5663 "flags" gives the eight bits of "op_flags". "pv" supplies the
5664 C-level pointer. Depending on the op type, the memory
5665 referenced by "pv" may be freed when the op is destroyed. If
5666 the op is of a freeing type, "pv" must have been allocated
5667 using "PerlMemShared_malloc".
5668
5669 OP * newPVOP(I32 type, I32 flags, char *pv)
5670
5671 newRANGE
5672 Constructs and returns a "range" op, with subordinate "flip"
5673 and "flop" ops. "flags" gives the eight bits of "op_flags" for
5674 the "flip" op and, shifted up eight bits, the eight bits of
5675 "op_private" for both the "flip" and "range" ops, except that
5676 the bit with value 1 is automatically set. "left" and "right"
5677 supply the expressions controlling the endpoints of the range;
5678 they are consumed by this function and become part of the
5679 constructed op tree.
5680
5681 OP * newRANGE(I32 flags, OP *left, OP *right)
5682
5683 newSLICEOP
5684 Constructs, checks, and returns an "lslice" (list slice) op.
5685 "flags" gives the eight bits of "op_flags", except that
5686 "OPf_KIDS" will be set automatically, and, shifted up eight
5687 bits, the eight bits of "op_private", except that the bit with
5688 value 1 or 2 is automatically set as required. "listval" and
5689 "subscript" supply the parameters of the slice; they are
5690 consumed by this function and become part of the constructed op
5691 tree.
5692
5693 OP * newSLICEOP(I32 flags, OP *subscript,
5694 OP *listval)
5695
5696 newSTATEOP
5697 Constructs a state op (COP). The state op is normally a
5698 "nextstate" op, but will be a "dbstate" op if debugging is
5699 enabled for currently-compiled code. The state op is populated
5700 from "PL_curcop" (or "PL_compiling"). If "label" is non-null,
5701 it supplies the name of a label to attach to the state op; this
5702 function takes ownership of the memory pointed at by "label",
5703 and will free it. "flags" gives the eight bits of "op_flags"
5704 for the state op.
5705
5706 If "o" is null, the state op is returned. Otherwise the state
5707 op is combined with "o" into a "lineseq" list op, which is
5708 returned. "o" is consumed by this function and becomes part of
5709 the returned op tree.
5710
5711 OP * newSTATEOP(I32 flags, char *label, OP *o)
5712
5713 newSVOP Constructs, checks, and returns an op of any type that involves
5714 an embedded SV. "type" is the opcode. "flags" gives the eight
5715 bits of "op_flags". "sv" gives the SV to embed in the op; this
5716 function takes ownership of one reference to it.
5717
5718 OP * newSVOP(I32 type, I32 flags, SV *sv)
5719
5720 newUNOP Constructs, checks, and returns an op of any unary type.
5721 "type" is the opcode. "flags" gives the eight bits of
5722 "op_flags", except that "OPf_KIDS" will be set automatically if
5723 required, and, shifted up eight bits, the eight bits of
5724 "op_private", except that the bit with value 1 is automatically
5725 set. "first" supplies an optional op to be the direct child of
5726 the unary op; it is consumed by this function and becomes part
5727 of the constructed op tree.
5728
5729 OP * newUNOP(I32 type, I32 flags, OP *first)
5730
5731 newUNOP_AUX
5732 Similar to "newUNOP", but creates an "UNOP_AUX" struct instead,
5733 with "op_aux" initialised to "aux".
5734
5735 OP* newUNOP_AUX(I32 type, I32 flags, OP* first,
5736 UNOP_AUX_item *aux)
5737
5738 newWHENOP
5739 Constructs, checks, and returns an op tree expressing a "when"
5740 block. "cond" supplies the test expression, and "block"
5741 supplies the block that will be executed if the test evaluates
5742 to true; they are consumed by this function and become part of
5743 the constructed op tree. "cond" will be interpreted
5744 DWIMically, often as a comparison against $_, and may be null
5745 to generate a "default" block.
5746
5747 OP * newWHENOP(OP *cond, OP *block)
5748
5749 newWHILEOP
5750 Constructs, checks, and returns an op tree expressing a "while"
5751 loop. This is a heavyweight loop, with structure that allows
5752 exiting the loop by "last" and suchlike.
5753
5754 "loop" is an optional preconstructed "enterloop" op to use in
5755 the loop; if it is null then a suitable op will be constructed
5756 automatically. "expr" supplies the loop's controlling
5757 expression. "block" supplies the main body of the loop, and
5758 "cont" optionally supplies a "continue" block that operates as
5759 a second half of the body. All of these optree inputs are
5760 consumed by this function and become part of the constructed op
5761 tree.
5762
5763 "flags" gives the eight bits of "op_flags" for the "leaveloop"
5764 op and, shifted up eight bits, the eight bits of "op_private"
5765 for the "leaveloop" op, except that (in both cases) some bits
5766 will be set automatically. "debuggable" is currently unused
5767 and should always be 1. "has_my" can be supplied as true to
5768 force the loop body to be enclosed in its own scope.
5769
5770 OP * newWHILEOP(I32 flags, I32 debuggable,
5771 LOOP *loop, OP *expr, OP *block,
5772 OP *cont, I32 has_my)
5773
5775 alloccopstash
5776 NOTE: this function is experimental and may change or be
5777 removed without notice.
5778
5779 Available only under threaded builds, this function allocates
5780 an entry in "PL_stashpad" for the stash passed to it.
5781
5782 PADOFFSET alloccopstash(HV *hv)
5783
5784 block_end
5785 Handles compile-time scope exit. "floor" is the savestack
5786 index returned by "block_start", and "seq" is the body of the
5787 block. Returns the block, possibly modified.
5788
5789 OP * block_end(I32 floor, OP *seq)
5790
5791 block_start
5792 Handles compile-time scope entry. Arranges for hints to be
5793 restored on block exit and also handles pad sequence numbers to
5794 make lexical variables scope right. Returns a savestack index
5795 for use with "block_end".
5796
5797 int block_start(int full)
5798
5799 ck_entersub_args_list
5800 Performs the default fixup of the arguments part of an
5801 "entersub" op tree. This consists of applying list context to
5802 each of the argument ops. This is the standard treatment used
5803 on a call marked with "&", or a method call, or a call through
5804 a subroutine reference, or any other call where the callee
5805 can't be identified at compile time, or a call where the callee
5806 has no prototype.
5807
5808 OP * ck_entersub_args_list(OP *entersubop)
5809
5810 ck_entersub_args_proto
5811 Performs the fixup of the arguments part of an "entersub" op
5812 tree based on a subroutine prototype. This makes various
5813 modifications to the argument ops, from applying context up to
5814 inserting "refgen" ops, and checking the number and syntactic
5815 types of arguments, as directed by the prototype. This is the
5816 standard treatment used on a subroutine call, not marked with
5817 "&", where the callee can be identified at compile time and has
5818 a prototype.
5819
5820 "protosv" supplies the subroutine prototype to be applied to
5821 the call. It may be a normal defined scalar, of which the
5822 string value will be used. Alternatively, for convenience, it
5823 may be a subroutine object (a "CV*" that has been cast to
5824 "SV*") which has a prototype. The prototype supplied, in
5825 whichever form, does not need to match the actual callee
5826 referenced by the op tree.
5827
5828 If the argument ops disagree with the prototype, for example by
5829 having an unacceptable number of arguments, a valid op tree is
5830 returned anyway. The error is reflected in the parser state,
5831 normally resulting in a single exception at the top level of
5832 parsing which covers all the compilation errors that occurred.
5833 In the error message, the callee is referred to by the name
5834 defined by the "namegv" parameter.
5835
5836 OP * ck_entersub_args_proto(OP *entersubop,
5837 GV *namegv, SV *protosv)
5838
5839 ck_entersub_args_proto_or_list
5840 Performs the fixup of the arguments part of an "entersub" op
5841 tree either based on a subroutine prototype or using default
5842 list-context processing. This is the standard treatment used
5843 on a subroutine call, not marked with "&", where the callee can
5844 be identified at compile time.
5845
5846 "protosv" supplies the subroutine prototype to be applied to
5847 the call, or indicates that there is no prototype. It may be a
5848 normal scalar, in which case if it is defined then the string
5849 value will be used as a prototype, and if it is undefined then
5850 there is no prototype. Alternatively, for convenience, it may
5851 be a subroutine object (a "CV*" that has been cast to "SV*"),
5852 of which the prototype will be used if it has one. The
5853 prototype (or lack thereof) supplied, in whichever form, does
5854 not need to match the actual callee referenced by the op tree.
5855
5856 If the argument ops disagree with the prototype, for example by
5857 having an unacceptable number of arguments, a valid op tree is
5858 returned anyway. The error is reflected in the parser state,
5859 normally resulting in a single exception at the top level of
5860 parsing which covers all the compilation errors that occurred.
5861 In the error message, the callee is referred to by the name
5862 defined by the "namegv" parameter.
5863
5864 OP * ck_entersub_args_proto_or_list(OP *entersubop,
5865 GV *namegv,
5866 SV *protosv)
5867
5868 cv_const_sv
5869 If "cv" is a constant sub eligible for inlining, returns the
5870 constant value returned by the sub. Otherwise, returns "NULL".
5871
5872 Constant subs can be created with "newCONSTSUB" or as described
5873 in "Constant Functions" in perlsub.
5874
5875 SV* cv_const_sv(const CV *const cv)
5876
5877 cv_get_call_checker
5878 The original form of "cv_get_call_checker_flags", which does
5879 not return checker flags. When using a checker function
5880 returned by this function, it is only safe to call it with a
5881 genuine GV as its "namegv" argument.
5882
5883 void cv_get_call_checker(CV *cv,
5884 Perl_call_checker *ckfun_p,
5885 SV **ckobj_p)
5886
5887 cv_get_call_checker_flags
5888 Retrieves the function that will be used to fix up a call to
5889 "cv". Specifically, the function is applied to an "entersub"
5890 op tree for a subroutine call, not marked with "&", where the
5891 callee can be identified at compile time as "cv".
5892
5893 The C-level function pointer is returned in *ckfun_p, an SV
5894 argument for it is returned in *ckobj_p, and control flags are
5895 returned in *ckflags_p. The function is intended to be called
5896 in this manner:
5897
5898 entersubop = (*ckfun_p)(aTHX_ entersubop, namegv, (*ckobj_p));
5899
5900 In this call, "entersubop" is a pointer to the "entersub" op,
5901 which may be replaced by the check function, and "namegv"
5902 supplies the name that should be used by the check function to
5903 refer to the callee of the "entersub" op if it needs to emit
5904 any diagnostics. It is permitted to apply the check function
5905 in non-standard situations, such as to a call to a different
5906 subroutine or to a method call.
5907
5908 "namegv" may not actually be a GV. If the
5909 "CALL_CHECKER_REQUIRE_GV" bit is clear in *ckflags_p, it is
5910 permitted to pass a CV or other SV instead, anything that can
5911 be used as the first argument to "cv_name". If the
5912 "CALL_CHECKER_REQUIRE_GV" bit is set in *ckflags_p then the
5913 check function requires "namegv" to be a genuine GV.
5914
5915 By default, the check function is
5916 Perl_ck_entersub_args_proto_or_list, the SV parameter is "cv"
5917 itself, and the "CALL_CHECKER_REQUIRE_GV" flag is clear. This
5918 implements standard prototype processing. It can be changed,
5919 for a particular subroutine, by "cv_set_call_checker_flags".
5920
5921 If the "CALL_CHECKER_REQUIRE_GV" bit is set in "gflags" then it
5922 indicates that the caller only knows about the genuine GV
5923 version of "namegv", and accordingly the corresponding bit will
5924 always be set in *ckflags_p, regardless of the check function's
5925 recorded requirements. If the "CALL_CHECKER_REQUIRE_GV" bit is
5926 clear in "gflags" then it indicates the caller knows about the
5927 possibility of passing something other than a GV as "namegv",
5928 and accordingly the corresponding bit may be either set or
5929 clear in *ckflags_p, indicating the check function's recorded
5930 requirements.
5931
5932 "gflags" is a bitset passed into "cv_get_call_checker_flags",
5933 in which only the "CALL_CHECKER_REQUIRE_GV" bit currently has a
5934 defined meaning (for which see above). All other bits should
5935 be clear.
5936
5937 void cv_get_call_checker_flags(
5938 CV *cv, U32 gflags,
5939 Perl_call_checker *ckfun_p, SV **ckobj_p,
5940 U32 *ckflags_p
5941 )
5942
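The interaction between "gflags" and the value returned in *ckflags_p can be modelled in a few lines of standalone C. This is a sketch of the documented behaviour only, not perl's implementation: "SKETCH_REQUIRE_GV" is a placeholder constant standing in for "CALL_CHECKER_REQUIRE_GV", and both function names are hypothetical.

```c
#include <stdint.h>

/* Placeholder value for this sketch only; real code takes
 * CALL_CHECKER_REQUIRE_GV from the perl headers. */
#define SKETCH_REQUIRE_GV 0x1u

/* Model of the documented result: the bit is always set when the caller
 * sets it in gflags; otherwise it reflects the requirement recorded for
 * the check function. */
static uint32_t sketch_ckflags(uint32_t recorded, uint32_t gflags)
{
    return (recorded | gflags) & SKETCH_REQUIRE_GV;
}

/* Something other than a genuine GV may be passed as namegv only when
 * the bit is clear in the returned flags. */
static int sketch_may_pass_non_gv(uint32_t ckflags)
{
    return (ckflags & SKETCH_REQUIRE_GV) == 0;
}
```

A caller that can only supply a genuine GV therefore sets the bit in "gflags" and need not inspect the result; a caller that clears it should check the returned flags before passing a CV or other SV.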
5943 cv_set_call_checker
5944 The original form of "cv_set_call_checker_flags", which passes
5945 it the "CALL_CHECKER_REQUIRE_GV" flag for backward-
5946 compatibility. The effect of that flag setting is that the
5947 check function is guaranteed to get a genuine GV as its
5948 "namegv" argument.
5949
5950 void cv_set_call_checker(CV *cv,
5951 Perl_call_checker ckfun,
5952 SV *ckobj)
5953
5954 cv_set_call_checker_flags
5955 Sets the function that will be used to fix up a call to "cv".
5956 Specifically, the function is applied to an "entersub" op tree
5957 for a subroutine call, not marked with "&", where the callee
5958 can be identified at compile time as "cv".
5959
5960 The C-level function pointer is supplied in "ckfun", an SV
5961 argument for it is supplied in "ckobj", and control flags are
5962 supplied in "ckflags". The function should be defined like
5963 this:
5964
5965 STATIC OP * ckfun(pTHX_ OP *op, GV *namegv, SV *ckobj)
5966
5967 It is intended to be called in this manner:
5968
5969 entersubop = ckfun(aTHX_ entersubop, namegv, ckobj);
5970
5971 In this call, "entersubop" is a pointer to the "entersub" op,
5972 which may be replaced by the check function, and "namegv"
5973 supplies the name that should be used by the check function to
5974 refer to the callee of the "entersub" op if it needs to emit
5975 any diagnostics. It is permitted to apply the check function
5976 in non-standard situations, such as to a call to a different
5977 subroutine or to a method call.
5978
5979 "namegv" may not actually be a GV. For efficiency, perl may
5980 pass a CV or other SV instead. Whatever is passed can be used
5981 as the first argument to "cv_name". You can force perl to pass
5982 a GV by including "CALL_CHECKER_REQUIRE_GV" in the "ckflags".
5983
5984 "ckflags" is a bitset, in which only the
5985 "CALL_CHECKER_REQUIRE_GV" bit currently has a defined meaning
5986 (for which see above). All other bits should be clear.
5987
5988 The current setting for a particular CV can be retrieved by
5989 "cv_get_call_checker_flags".
5990
5991 void cv_set_call_checker_flags(
5992 CV *cv, Perl_call_checker ckfun, SV *ckobj,
5993 U32 ckflags
5994 )
5995
5996 LINKLIST
5997 Given the root of an optree, link the tree in execution order
5998 using the "op_next" pointers and return the first op executed.
5999 If this has already been done, it will not be redone, and
6000 "o->op_next" will be returned. If "o->op_next" is not already
6001 set, "o" should be at least an "UNOP".
6002
6003 OP* LINKLIST(OP *o)
6004
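The linking described for "LINKLIST" can be illustrated with a toy tree in standalone C. This is a model under stated assumptions, not perl's code: "ToyOp" and "toy_linklist" are hypothetical names, children are assumed to execute before their parent, and the treatment of leaves and the temporary use of the root's "next" field are choices made for this sketch.

```c
#include <stddef.h>

/* Toy node for this sketch; real ops are OP structures from op.h. */
typedef struct ToyOp {
    const char   *name;
    struct ToyOp *next;     /* plays the role of op_next */
    struct ToyOp *first;    /* first child, or NULL for a leaf */
    struct ToyOp *sibling;  /* next sibling, or NULL */
} ToyOp;

/* Link a subtree so that children run before their parent, and return
 * the first op that would run.  If o->next is already set, nothing is
 * redone and o->next is returned. */
static ToyOp *toy_linklist(ToyOp *o)
{
    ToyOp *kid;

    if (o->next)
        return o->next;
    if (!o->first) {
        o->next = o;        /* a leaf is the first op of its own subtree */
        return o->next;
    }
    /* Record the first executed op of this subtree in o->next. */
    o->next = toy_linklist(o->first);
    /* Each child hands control to the start of the next sibling. */
    for (kid = o->first; kid->sibling; kid = kid->sibling)
        kid->next = toy_linklist(kid->sibling);
    kid->next = o;          /* the last child hands control to the parent */
    return o->next;
}
```

For a parent "add" with leaf children "a" and "b", the model yields the order a, b, add and returns "a"; a second call returns "a" again without relinking.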
6005 newCONSTSUB
6006 Behaves like "newCONSTSUB_flags", except that "name" is nul-
6007 terminated rather than of counted length, and no flags are set.
6008 (This means that "name" is always interpreted as Latin-1.)
6009
6010 CV * newCONSTSUB(HV *stash, const char *name, SV *sv)
6011
6012 newCONSTSUB_flags
6013 Construct a constant subroutine, also performing some
6014 surrounding jobs. A scalar constant-valued subroutine is
6015 eligible for inlining at compile-time, and in Perl code can be
6016 created by "sub FOO () { 123 }". Other kinds of constant
6017 subroutine have other treatment.
6018
6019 The subroutine will have an empty prototype and will ignore any
6020 arguments when called. Its constant behaviour is determined by
6021 "sv". If "sv" is null, the subroutine will yield an empty
6022 list. If "sv" points to a scalar, the subroutine will always
6023 yield that scalar. If "sv" points to an array, the subroutine
6024 will always yield a list of the elements of that array in list
6025 context, or the number of elements in the array in scalar
6026 context. This function takes ownership of one counted
6027 reference to the scalar or array, and will arrange for the
6028 object to live as long as the subroutine does. If "sv" points
6029 to a scalar then the inlining assumes that the value of the
6030 scalar will never change, so the caller must ensure that the
6031 scalar is not subsequently written to. If "sv" points to an
6032 array then no such assumption is made, so it is ostensibly safe
6033 to mutate the array or its elements, but whether this is really
6034 supported has not been determined.
6035
6036 The subroutine will have "CvFILE" set according to "PL_curcop".
6037 Other aspects of the subroutine will be left in their default
6038 state. The caller is free to mutate the subroutine beyond its
6039 initial state after this function has returned.
6040
6041 If "name" is null then the subroutine will be anonymous, with
6042 its "CvGV" referring to an "__ANON__" glob. If "name" is non-
6043 null then the subroutine will be named accordingly, referenced
6044 by the appropriate glob. "name" is a string of length "len"
6045 bytes giving a sigilless symbol name, in UTF-8 if "flags" has
6046 the "SVf_UTF8" bit set and in Latin-1 otherwise. The name may
6047 be either qualified or unqualified. If the name is unqualified
6048 then it defaults to being in the stash specified by "stash" if
6049 that is non-null, or to "PL_curstash" if "stash" is null. The
6050 symbol is always added to the stash if necessary, with
6051 "GV_ADDMULTI" semantics.
6052
6053 "flags" should not have bits set other than "SVf_UTF8".
6054
6055 If there is already a subroutine of the specified name, then
6056 the new sub will replace the existing one in the glob. A
6057 warning may be generated about the redefinition.
6058
6059 If the subroutine has one of a few special names, such as
6060 "BEGIN" or "END", then it will be claimed by the appropriate
6061 queue for automatic running of phase-related subroutines. In
6062 this case the relevant glob will be left not containing any
6063 subroutine, even if it did contain one before. Execution of
6064 the subroutine will likely be a no-op, unless "sv" was a tied
6065 array or the caller modified the subroutine in some interesting
6066 way before it was executed. In the case of "BEGIN", the
6067 treatment is buggy: the sub will be executed when only half
6068 built, and may be deleted prematurely, possibly causing a
6069 crash.
6070
6071 The function returns a pointer to the constructed subroutine.
6072 If the sub is anonymous then ownership of one counted reference
6073 to the subroutine is transferred to the caller. If the sub is
6074 named then the caller does not get ownership of a reference.
6075 In most such cases, where the sub has a non-phase name, the sub
6076 will be alive at the point it is returned by virtue of being
6077 contained in the glob that names it. A phase-named subroutine
6078 will usually be alive by virtue of the reference owned by the
6079 phase's automatic run queue. A "BEGIN" subroutine may have
6080 been destroyed already by the time this function returns, but
6081 currently bugs occur in that case before the caller gets
6082 control. It is the caller's responsibility to ensure that it
6083 knows which of these situations applies.
6084
6085 CV * newCONSTSUB_flags(HV *stash, const char *name,
6086 STRLEN len, U32 flags, SV *sv)
6087
6088 newXS Used by "xsubpp" to hook up XSUBs as Perl subs. "filename"
6089 needs to be static storage, as it is used directly as CvFILE(),
6090 without a copy being made.
6091
6092 op_append_elem
6093 Append an item to the list of ops contained directly within a
6094 list-type op, returning the lengthened list. "first" is the
6095 list-type op, and "last" is the op to append to the list.
6096 "optype" specifies the intended opcode for the list. If
6097 "first" is not already a list of the right type, it will be
6098 upgraded into one. If either "first" or "last" is null, the
6099 other is returned unchanged.
6100
6101 OP * op_append_elem(I32 optype, OP *first, OP *last)
6102
6103 op_append_list
6104 Concatenate the lists of ops contained directly within two
6105 list-type ops, returning the combined list. "first" and "last"
6106 are the list-type ops to concatenate. "optype" specifies the
6107 intended opcode for the list. If either "first" or "last" is
6108 not already a list of the right type, it will be upgraded into
6109 one. If either "first" or "last" is null, the other is
6110 returned unchanged.
6111
6112 OP * op_append_list(I32 optype, OP *first, OP *last)
6113
6114 OP_CLASS
6115 Return the class of the provided OP: that is, which of the *OP
6116 structures it uses. For core ops this currently gets the
6117 information out of "PL_opargs", which does not always
6118 accurately reflect the type used; in v5.26 onwards, see also
6119 the function "op_class" which can do a better job of
6120 determining the used type.
6121
6122 For custom ops the type is returned from the registration, and
6123 it is up to the registrant to ensure it is accurate. The value
6124 returned will be one of the "OA_"* constants from op.h.
6125
6126 U32 OP_CLASS(OP *o)
6127
6128 op_contextualize
6129 Applies a syntactic context to an op tree representing an
6130 expression. "o" is the op tree, and "context" must be
6131 "G_SCALAR", "G_ARRAY", or "G_VOID" to specify the context to
6132 apply. The modified op tree is returned.
6133
6134 OP * op_contextualize(OP *o, I32 context)
6135
6136 op_convert_list
6137 Converts "o" into a list op if it is not one already, and then
6138 converts it into the specified "type", calling its check
6139 function, allocating a target if it needs one, and folding
6140 constants.
6141
6142 A list-type op is usually constructed one kid at a time via
6143 "newLISTOP", "op_prepend_elem" and "op_append_elem". Then
6144 finally it is passed to "op_convert_list" to make it the right
6145 type.
6146
6147 OP * op_convert_list(I32 type, I32 flags, OP *o)
6148
6149 OP_DESC Return a short description of the provided OP.
6150
6151 const char * OP_DESC(OP *o)
6152
6153 op_free Free an op. Only use this when an op is no longer linked to
6154 from any optree.
6155
6156 void op_free(OP *o)
6157
6158 OpHAS_SIBLING
6159 Returns true if "o" has a sibling.
6160
6161 bool OpHAS_SIBLING(OP *o)
6162
6163 OpLASTSIB_set
6164 Marks "o" as having no further siblings and marks "o" as having
6165 the specified parent. See also "OpMORESIB_set" and
6166 "OpMAYBESIB_set". For a higher-level interface, see
6167 "op_sibling_splice".
6168
6169 void OpLASTSIB_set(OP *o, OP *parent)
6170
6171 op_linklist
6172 This function is the implementation of the "LINKLIST" macro.
6173 It should not be called directly.
6174
6175 OP* op_linklist(OP *o)
6176
6177 op_lvalue
6178 NOTE: this function is experimental and may change or be
6179 removed without notice.
6180
6181 Propagate lvalue ("modifiable") context to an op and its
6182 children. "type" represents the context type, roughly based on
6183 the type of op that would do the modifying, although "local()"
6184 is represented by "OP_NULL", because it has no op type of its
6185 own (it is signalled by a flag on the lvalue op).
6186
6187 This function detects things that can't be modified, such as
6188 "$x+1", and generates errors for them. For example, "$x+1 = 2"
6189 would cause it to be called with an op of type "OP_ADD" and a
6190 "type" argument of "OP_SASSIGN".
6191
6192 It also flags things that need to behave specially in an lvalue
6193 context, such as "$$x = 5" which might have to vivify a
6194 reference in $x.
6195
6196 OP * op_lvalue(OP *o, I32 type)
6197
6198 OpMAYBESIB_set
6199 Conditionally does "OpMORESIB_set" or "OpLASTSIB_set" depending
6200 on whether "sib" is non-null. For a higher-level interface, see
6201 "op_sibling_splice".
6202
6203 void OpMAYBESIB_set(OP *o, OP *sib, OP *parent)
6204
6205 OpMORESIB_set
6206 Sets the sibling of "o" to the non-zero value "sib". See also
6207 "OpLASTSIB_set" and "OpMAYBESIB_set". For a higher-level
6208 interface, see "op_sibling_splice".
6209
6210 void OpMORESIB_set(OP *o, OP *sib)
6211
6212 OP_NAME Return the name of the provided OP. For core ops this looks up
6213 the name from the op_type; for custom ops from the op_ppaddr.
6214
6215 const char * OP_NAME(OP *o)
6216
6217 op_null Neutralizes an op when it is no longer needed, but is still
6218 linked to from other ops.
6219
6220 void op_null(OP *o)
6221
6222 op_parent
6223 Returns the parent OP of "o", if it has a parent. Returns
6224 "NULL" otherwise.
6225
6226 OP* op_parent(OP *o)
6227
6228 op_prepend_elem
6229 Prepend an item to the list of ops contained directly within a
6230 list-type op, returning the lengthened list. "first" is the op
6231 to prepend to the list, and "last" is the list-type op.
6232 "optype" specifies the intended opcode for the list. If "last"
6233 is not already a list of the right type, it will be upgraded
6234 into one. If either "first" or "last" is null, the other is
6235 returned unchanged.
6236
6237 OP * op_prepend_elem(I32 optype, OP *first, OP *last)
6238
6239 op_scope
6240 NOTE: this function is experimental and may change or be
6241 removed without notice.
6242
6243 Wraps up an op tree with some additional ops so that at runtime
6244 a dynamic scope will be created. The original ops run in the
6245 new dynamic scope, and then, provided that they exit normally,
6246 the scope will be unwound. The additional ops used to create
6247 and unwind the dynamic scope will normally be an
6248 "enter"/"leave" pair, but a "scope" op may be used instead if
6249 the ops are simple enough to not need the full dynamic scope
6250 structure.
6251
6252 OP * op_scope(OP *o)
6253
6254 OpSIBLING
6255 Returns the sibling of "o", or "NULL" if there is no sibling.
6256
6257 OP* OpSIBLING(OP *o)
6258
6259 op_sibling_splice
6260 A general function for editing the structure of an existing
6261 chain of op_sibling nodes. By analogy with the perl-level
6262 "splice()" function, allows you to delete zero or more
6263 sequential nodes, replacing them with zero or more different
6264 nodes. Performs the necessary op_first/op_last housekeeping on
6265 the parent node and op_sibling manipulation on the children.
6266 The last deleted node will be marked as the last node by
6267 updating the op_sibling/op_sibparent or op_moresib field as
6268 appropriate.
6269
6270 Note that op_next is not manipulated, and nodes are not freed;
6271 that is the responsibility of the caller. It also won't create
6272 a new list op for an empty list etc; use higher-level functions
6273 like op_append_elem() for that.
6274
6275 "parent" is the parent node of the sibling chain. It may be passed
6276 as "NULL" if the splicing doesn't affect the first or last op
6277 in the chain.
6278
6279 "start" is the node preceding the first node to be spliced.
6280 Node(s) following it will be deleted, and ops will be inserted
6281 after it. If it is "NULL", the first node onwards is deleted,
6282 and nodes are inserted at the beginning.
6283
6284 "del_count" is the number of nodes to delete. If zero, no
6285 nodes are deleted. If -1 or greater than or equal to the
6286 number of remaining kids, all remaining kids are deleted.
6287
6288 "insert" is the first of a chain of nodes to be inserted in
6289 place of the nodes. If "NULL", no nodes are inserted.
6290
6291 The head of the chain of deleted ops is returned, or "NULL" if
6292 no ops were deleted.
6293
6294 For example:
6295
6296     action                    before      after         returns
6297     ------                    -----       -----         -------
6298
6299                               P           P
6300     splice(P, A, 2, X-Y-Z)    |           |             B-C
6301                               A-B-C-D     A-X-Y-Z-D
6302
6303                               P           P
6304     splice(P, NULL, 1, X-Y)   |           |             A
6305                               A-B-C-D     X-Y-B-C-D
6306
6307                               P           P
6308     splice(P, NULL, 3, NULL)  |           |             A-B-C
6309                               A-B-C-D     D
6310
6311                               P           P
6312     splice(P, B, 0, X-Y)      |           |             NULL
6313                               A-B-C-D     A-B-X-Y-C-D
6314
6315 For lower-level direct manipulation of "op_sibparent" and
6316 "op_moresib", see "OpMORESIB_set", "OpLASTSIB_set",
6317 "OpMAYBESIB_set".
6318
6319 OP* op_sibling_splice(OP *parent, OP *start,
6320 int del_count, OP* insert)
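
The splice semantics described above can be tried out on a toy sibling chain. The sketch below is only a model written for illustration: "ToyNode", "ToyParent", "toy_splice" and "toy_render" are hypothetical names, not part of the Perl API. It assumes a parent that records only its first child, so it does none of the op_last, op_moresib or op_sibparent housekeeping that the real function performs, and unlike the real function it needs a non-NULL parent whenever "start" is NULL.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for an op: a one-letter name plus a sibling link. */
typedef struct ToyNode {
    char name;
    struct ToyNode *sibling;    /* next sibling, or NULL at the end */
} ToyNode;

/* Hypothetical stand-in for the parent op; only "first child" is modelled. */
typedef struct ToyParent {
    ToyNode *first;
} ToyParent;

/* Delete del_count nodes after start (from the head if start is NULL;
 * -1 means all remaining), put the insert chain in their place, and
 * return the head of the deleted chain, or NULL if nothing was deleted. */
static ToyNode *toy_splice(ToyParent *parent, ToyNode *start,
                           int del_count, ToyNode *insert)
{
    ToyNode **link = start ? &start->sibling : &parent->first;
    ToyNode *first_del = *link;
    ToyNode *last_del = NULL;
    ToyNode *rest = first_del;

    /* Step over the nodes to delete; a negative count never reaches 0. */
    while (rest && del_count != 0) {
        last_del = rest;
        rest = rest->sibling;
        if (del_count > 0)
            del_count--;
    }
    if (last_del)
        last_del->sibling = NULL;   /* terminate the deleted chain */

    if (insert) {
        ToyNode *tail = insert;
        while (tail->sibling)       /* find the end of the inserted chain */
            tail = tail->sibling;
        tail->sibling = rest;
        *link = insert;
    } else {
        *link = rest;
    }
    return last_del ? first_del : NULL;
}

/* Write the names in a chain into buf, e.g. "ABCD". */
static void toy_render(const ToyNode *n, char *buf)
{
    for (; n; n = n->sibling)
        *buf++ = n->name;
    *buf = '\0';
}
```

Building the chain A-B-C-D under a parent and calling toy_splice(&parent, &a, 2, &x), where x heads the chain X-Y-Z, mirrors the first row of the table: the parent's chain becomes A-X-Y-Z-D and the returned chain is B-C.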
6321
6322 OP_TYPE_IS
6323 Returns true if the given OP is not a "NULL" pointer and if it
6324 is of the given type.
6325
6326 The negation of this macro, "OP_TYPE_ISNT", is also available,
6327 as well as "OP_TYPE_IS_NN" and "OP_TYPE_ISNT_NN", which elide the
6328 "NULL" pointer check.
6329
6330 bool OP_TYPE_IS(OP *o, Optype type)
6331
6332 OP_TYPE_IS_OR_WAS
6333 Returns true if the given OP is not a NULL pointer and if it is
6334 of the given type or used to be before being replaced by an OP
6335 of type OP_NULL.
6336
6337 The negation of this macro, "OP_TYPE_ISNT_AND_WASNT", is also
6338 available, as well as "OP_TYPE_IS_OR_WAS_NN" and
6339 "OP_TYPE_ISNT_AND_WASNT_NN", which elide the "NULL" pointer
6340 check.
6341
6342 bool OP_TYPE_IS_OR_WAS(OP *o, Optype type)
6343
6344 rv2cv_op_cv
6345 Examines an op, which is expected to identify a subroutine at
6346 runtime, and attempts to determine at compile time which
6347 subroutine it identifies. This is normally used during Perl
6348 compilation to determine whether a prototype can be applied to
6349 a function call. "cvop" is the op being considered, normally
6350 an "rv2cv" op. A pointer to the identified subroutine is
6351 returned, if it could be determined statically, and a null
6352 pointer is returned if it was not possible to determine
6353 statically.
6354
6355 Currently, the subroutine can be identified statically if the
6356 RV that the "rv2cv" is to operate on is provided by a suitable
6357 "gv" or "const" op. A "gv" op is suitable if the GV's CV slot
6358 is populated. A "const" op is suitable if the constant value
6359 must be an RV pointing to a CV. Details of this process may
6360 change in future versions of Perl. If the "rv2cv" op has the
6361 "OPpENTERSUB_AMPER" flag set then no attempt is made to
6362 identify the subroutine statically: this flag is used to
6363 suppress compile-time magic on a subroutine call, forcing it to
6364 use default runtime behaviour.
6365
6366 If "flags" has the bit "RV2CVOPCV_MARK_EARLY" set, then the
6367 handling of a GV reference is modified. If a GV was examined
6368 and its CV slot was found to be empty, then the "gv" op has the
6369 "OPpEARLY_CV" flag set. If the op is not optimised away, and
6370 the CV slot is later populated with a subroutine having a
6371 prototype, that flag eventually triggers the warning "called
6372 too early to check prototype".
6373
6374 If "flags" has the bit "RV2CVOPCV_RETURN_NAME_GV" set, then
6375 instead of returning a pointer to the subroutine it returns a
6376 pointer to the GV giving the most appropriate name for the
6377 subroutine in this context. Normally this is just the "CvGV"
6378 of the subroutine, but for an anonymous ("CvANON") subroutine
6379 that is referenced through a GV it will be the referencing GV.
6380 The resulting "GV*" is cast to "CV*" to be returned. A null
6381 pointer is returned as usual if there is no statically-
6382 determinable subroutine.
6383
6384 CV * rv2cv_op_cv(OP *cvop, U32 flags)
6385
6387 packlist
6388 The engine implementing the "pack()" Perl function.
6389
6390 void packlist(SV *cat, const char *pat,
6391 const char *patend, SV **beglist,
6392 SV **endlist)
6393
6394 unpackstring
6395 The engine implementing the "unpack()" Perl function.
6396
6397 Using the template "pat..patend", this function unpacks the
6398 string "s..strend" into a number of mortal SVs, which it pushes
6399 onto the perl argument (@_) stack (so you will need to issue a
6400 "PUTBACK" before and "SPAGAIN" after the call to this
6401 function). It returns the number of pushed elements.
6402
6403 The "strend" and "patend" pointers should point to the byte
6404 following the last character of each string.
6405
6406 Although this function returns its values on the perl argument
6407 stack, it doesn't take any parameters from that stack (and thus
6408 in particular there's no need to do a "PUSHMARK" before calling
6409 it, unlike "call_pv" for example).
6410
6411 SSize_t unpackstring(const char *pat,
6412 const char *patend, const char *s,
6413 const char *strend, U32 flags)
6414
6416 CvPADLIST
6417 NOTE: this function is experimental and may change or be
6418 removed without notice.
6419
6420 CVs can have CvPADLIST(cv) set to point to a PADLIST. This is
6421 the CV's scratchpad, which stores lexical variables and opcode
6422 temporary and per-thread values.
6423
6424 For these purposes "formats" are a kind-of CV; eval""s are too
6425 (except they're not callable at will and are always thrown away
6426 after the eval"" is done executing). Require'd files are
6427 simply evals without any outer lexical scope.
6428
6429 XSUBs do not have a "CvPADLIST". "dXSTARG" fetches values from
6430 "PL_curpad", but that is really the caller's pad (a slot of
6431 which is allocated by every entersub). Do not get or set
6432 "CvPADLIST" if a CV is an XSUB (as determined by "CvISXSUB()"):
6433 the "CvPADLIST" slot is reused for a different internal purpose in
6434 XSUBs.
6435
6436 The PADLIST has a C array where pads are stored.
6437
6438 The 0th entry of the PADLIST is a PADNAMELIST which represents
6439 the "names" or rather the "static type information" for
6440 lexicals. The individual elements of a PADNAMELIST are
6441 PADNAMEs. Future refactorings might stop the PADNAMELIST from
6442 being stored in the PADLIST's array, so don't rely on it. See
6443 "PadlistNAMES".
6444
6445 The CvDEPTH'th entry of a PADLIST is a PAD (an AV) which is the
6446 stack frame at that depth of recursion into the CV. The 0th
6447 slot of a frame AV is an AV which is @_. Other entries are
6448 storage for variables and op targets.
6449
6450 Iterating over the PADNAMELIST iterates over all possible pad
6451 items. Pad slots for targets ("SVs_PADTMP") and GVs end up
6452 having &PL_padname_undef "names", while slots for constants
6453 have &PL_padname_const "names" (see "pad_alloc"). That
6454 &PL_padname_undef and &PL_padname_const are used is an
6455 implementation detail subject to change. To test for them, use
6456 "!PadnamePV(name)" and "PadnamePV(name) && !PadnameLEN(name)",
6457 respectively.
6458
6459 Only "my"/"our" variable slots get valid names. The rest are
6460 op targets/GVs/constants which are statically allocated or
6461 resolved at compile time. These don't have names by which they
6462 can be looked up from Perl code at run time through eval"" the
6463 way "my"/"our" variables can be. Since they can't be looked up
6464 by "name" but only by their index allocated at compile time
6465 (which is usually in "PL_op->op_targ"), wasting a name SV for
6466 them doesn't make sense.
6467
6468 The pad names in the PADNAMELIST have their PV holding the name
6469 of the variable. The "COP_SEQ_RANGE_LOW" and "_HIGH" fields
6470 form a range (low+1..high inclusive) of cop_seq numbers for
6471 which the name is valid. During compilation, these fields may
6472 hold the special value PERL_PADSEQ_INTRO to indicate various
6473 stages:
6474
6475     COP_SEQ_RANGE_LOW   _HIGH
6476     -----------------   -----
6477     PERL_PADSEQ_INTRO   0                   variable not yet introduced:
6478                                             { my ($x
6479     valid-seq#          PERL_PADSEQ_INTRO   variable in scope:
6480                                             { my ($x);
6481     valid-seq#          valid-seq#          compilation of scope complete:
6482                                             { my ($x); .... }
6483
6484 When a lexical var hasn't yet been introduced, it already
6485 exists from the perspective of duplicate declarations, but not
6486 for variable lookups, e.g.
6487
6488 my ($x, $x); # '"my" variable $x masks earlier declaration'
6489 my $x = $x; # equal to my $x = $::x;
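
The validity range (low+1..high inclusive) and the "not yet introduced" state can be modelled with a small predicate. This is a simplified sketch rather than perl's lookup code: "ToySeqRange", "toy_name_visible" and "TOY_PADSEQ_INTRO" are hypothetical names, and the sentinel is assumed here to be the largest 32-bit value so that it compares above every real sequence number. Sequence-number wraparound is ignored.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: stand-in for PERL_PADSEQ_INTRO, modelled as the largest
 * 32-bit value so that it sorts above every real cop_seq number. */
#define TOY_PADSEQ_INTRO UINT32_MAX

/* Hypothetical pair mirroring COP_SEQ_RANGE_LOW and COP_SEQ_RANGE_HIGH. */
typedef struct {
    uint32_t low;
    uint32_t high;
} ToySeqRange;

/* A name is visible to a lookup at seq when low < seq <= high.
 * A name whose low field is still the sentinel has not been introduced,
 * so lookups never see it. */
static int toy_name_visible(ToySeqRange r, uint32_t seq)
{
    if (r.low == TOY_PADSEQ_INTRO)
        return 0;
    return seq > r.low && seq <= r.high;
}
```

With low = 10 and high = 20 the name is visible for sequence numbers 11 through 20. With high still set to the sentinel it is visible for every sequence number above low, which corresponds to the "variable in scope" row of the table.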
6490
6491 For typed lexicals "PadnameTYPE" points at the type stash. For
6492 "our" lexicals, "PadnameOURSTASH" points at the stash of the
6493 associated global (so that duplicate "our" declarations in the
6494 same package can be detected). "PadnameGEN" is sometimes used
6495 to store the generation number during compilation.
6496
6497 If "PadnameOUTER" is set on the pad name, then that slot in the
6498 frame AV is a REFCNT'ed reference to a lexical from "outside".
6499 Such entries are sometimes referred to as 'fake'. In this
6500 case, the name does not use 'low' and 'high' to store a cop_seq
6501 range, since it is in scope throughout. Instead 'high' stores
6502 some flags containing info about the real lexical (is it
6503 declared in an anon, and is it capable of being instantiated
6504 multiple times?), and for fake ANONs, 'low' contains the index
6505 within the parent's pad where the lexical's value is stored, to
6506 make cloning quicker.
6507
6508 If the 'name' is "&" the corresponding entry in the PAD is a CV
6509 representing a possible closure.
6510
6511 Note that formats are treated as anon subs, and are cloned each
6512 time write is called (if necessary).
6513
6514 The flag "SVs_PADSTALE" is cleared on lexicals each time the
6515 "my()" is executed, and set on scope exit. This allows the
6516 "Variable $x is not available" warning to be generated in
6517 evals, such as
6518
6519 { my $x = 1; sub f { eval '$x'} } f();
6520
6521 For state vars, "SVs_PADSTALE" is overloaded to mean 'not yet
6522 initialised', but this internal state is stored in a separate
6523 pad entry.
6524
6525 PADLIST * CvPADLIST(CV *cv)
6526
6527 pad_add_name_pvs
6528 Exactly like "pad_add_name_pvn", but takes a literal string
6529 instead of a string/length pair.
6530
6531 PADOFFSET pad_add_name_pvs("literal string" name,
6532 U32 flags, HV *typestash,
6533 HV *ourstash)
6534
6535 PadARRAY
6536 NOTE: this function is experimental and may change or be
6537 removed without notice.
6538
6539 The C array of pad entries.
6540
6541 SV ** PadARRAY(PAD pad)
6542
6543 pad_findmy_pvs
6544 Exactly like "pad_findmy_pvn", but takes a literal string
6545 instead of a string/length pair.
6546
6547 PADOFFSET pad_findmy_pvs("literal string" name,
6548 U32 flags)
6549
6550 PadlistARRAY
6551 NOTE: this function is experimental and may change or be
6552 removed without notice.
6553
6554 The C array of a padlist, containing the pads. Only subscript
6555 it with numbers >= 1, as the 0th entry is not guaranteed to
6556 remain usable.
6557
6558 PAD ** PadlistARRAY(PADLIST padlist)
6559
6560 PadlistMAX
6561 NOTE: this function is experimental and may change or be
6562 removed without notice.
6563
6564 The index of the last allocated space in the padlist. Note
6565 that the last pad may be in an earlier slot. Any entries
6566 following it will be "NULL" in that case.
6567
6568 SSize_t PadlistMAX(PADLIST padlist)
6569
6570 PadlistNAMES
6571 NOTE: this function is experimental and may change or be
6572 removed without notice.
6573
6574 The names associated with pad entries.
6575
6576 PADNAMELIST * PadlistNAMES(PADLIST padlist)
6577
6578 PadlistNAMESARRAY
6579 NOTE: this function is experimental and may change or be
6580 removed without notice.
6581
6582 The C array of pad names.
6583
6584 PADNAME ** PadlistNAMESARRAY(PADLIST padlist)
6585
6586 PadlistNAMESMAX
6587 NOTE: this function is experimental and may change or be
6588 removed without notice.
6589
6590 The index of the last pad name.
6591
6592 SSize_t PadlistNAMESMAX(PADLIST padlist)
6593
6594 PadlistREFCNT
6595 NOTE: this function is experimental and may change or be
6596 removed without notice.
6597
6598 The reference count of the padlist. Currently this is always
6599 1.
6600
6601 U32 PadlistREFCNT(PADLIST padlist)
6602
6603 PadMAX NOTE: this function is experimental and may change or be
6604 removed without notice.
6605
6606 The index of the last pad entry.
6607
6608 SSize_t PadMAX(PAD pad)
6609
6610 PadnameLEN
6611 NOTE: this function is experimental and may change or be
6612 removed without notice.
6613
6614 The length of the name.
6615
6616 STRLEN PadnameLEN(PADNAME pn)
6617
6618 PadnamelistARRAY
6619 NOTE: this function is experimental and may change or be
6620 removed without notice.
6621
6622 The C array of pad names.
6623
6624 PADNAME ** PadnamelistARRAY(PADNAMELIST pnl)
6625
6626 PadnamelistMAX
6627 NOTE: this function is experimental and may change or be
6628 removed without notice.
6629
6630 The index of the last pad name.
6631
6632 SSize_t PadnamelistMAX(PADNAMELIST pnl)
6633
6634 PadnamelistREFCNT
6635 NOTE: this function is experimental and may change or be
6636 removed without notice.
6637
6638 The reference count of the pad name list.
6639
6640 SSize_t PadnamelistREFCNT(PADNAMELIST pnl)
6641
6642 PadnamelistREFCNT_dec
6643 NOTE: this function is experimental and may change or be
6644 removed without notice.
6645
6646 Lowers the reference count of the pad name list.
6647
6648 void PadnamelistREFCNT_dec(PADNAMELIST pnl)
6649
6650 PadnamePV
6651 NOTE: this function is experimental and may change or be
6652 removed without notice.
6653
6654 The name stored in the pad name struct. This returns "NULL"
6655 for a target slot.
6656
6657 char * PadnamePV(PADNAME pn)
6658
6659 PadnameREFCNT
6660 NOTE: this function is experimental and may change or be
6661 removed without notice.
6662
6663 The reference count of the pad name.
6664
6665 SSize_t PadnameREFCNT(PADNAME pn)
6666
6667 PadnameREFCNT_dec
6668 NOTE: this function is experimental and may change or be
6669 removed without notice.
6670
6671 Lowers the reference count of the pad name.
6672
6673 void PadnameREFCNT_dec(PADNAME pn)
6674
6675 PadnameSV
6676 NOTE: this function is experimental and may change or be
6677 removed without notice.
6678
6679 Returns the pad name as a mortal SV.
6680
6681 SV * PadnameSV(PADNAME pn)
6682
6683 PadnameUTF8
6684 NOTE: this function is experimental and may change or be
6685 removed without notice.
6686
6687 Whether PadnamePV is in UTF-8. Currently, this is always true.
6688
6689 bool PadnameUTF8(PADNAME pn)
6690
6691 pad_new Create a new padlist, updating the global variables for the
6692 currently-compiling padlist to point to the new padlist. The
6693 following flags can be OR'ed together:
6694
6695     padnew_CLONE     this pad is for a cloned CV
6696     padnew_SAVE      save old globals on the save stack
6697     padnew_SAVESUB   also save extra stuff for start of sub
6698
6699 PADLIST * pad_new(int flags)
6700
6701 PL_comppad
6702 NOTE: this function is experimental and may change or be
6703 removed without notice.
6704
6705 During compilation, this points to the array containing the
6706 values part of the pad for the currently-compiling code. (At
6707 runtime a CV may have many such value arrays; at compile time
6708 just one is constructed.) At runtime, this points to the array
6709 containing the currently-relevant values for the pad for the
6710 currently-executing code.
6711
6712 PL_comppad_name
6713 NOTE: this function is experimental and may change or be
6714 removed without notice.
6715
6716 During compilation, this points to the array containing the
6717 names part of the pad for the currently-compiling code.
6718
6719 PL_curpad
6720 NOTE: this function is experimental and may change or be
6721 removed without notice.
6722
6723 Points directly to the body of the "PL_comppad" array. (I.e.,
6724 this is "PadARRAY(PL_comppad)".)
6725
6727 PL_modglobal
6728 "PL_modglobal" is a general purpose, interpreter global HV for
6729 use by extensions that need to keep information on a per-
6730 interpreter basis. In a pinch, it can also be used as a symbol
6731 table for extensions to share data among each other. It is a
6732 good idea to use keys prefixed by the package name of the
6733 extension that owns the data.
6734
6735 HV* PL_modglobal
6736
6737 PL_na A convenience variable which is typically used with "SvPV" when
6738 one doesn't care about the length of the string. It is usually
6739 more efficient to either declare a local variable and use that
6740 instead or to use the "SvPV_nolen" macro.
6741
6742 STRLEN PL_na
6743
6744 PL_opfreehook
6745 When non-"NULL", the function pointed to by this variable will be
6746 called each time an OP is freed with the corresponding OP as
6747 the argument. This allows extensions to free any extra
6748 attribute they have locally attached to an OP. It is also
6749 assured to first fire for the parent OP and then for its kids.
6750
6751 When you replace this variable, it is considered good practice
6752 to store the previously installed hook, if any, and to call it
6753 from inside your own.
6754
6755 Perl_ophook_t PL_opfreehook
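
The save-and-chain practice described above can be sketched without the perl headers. Everything here is a stand-in chosen for illustration: "ToyOp", "toy_ophook_t" and "toy_opfreehook" model OP, "Perl_ophook_t" and "PL_opfreehook", and the other names are hypothetical. A real hook also takes the interpreter context argument, which this model leaves out. Calling the earlier hook after the extension's own work is one possible ordering, not a documented requirement.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for OP and Perl_ophook_t. */
typedef struct ToyOp { int id; } ToyOp;
typedef void (*toy_ophook_t)(ToyOp *o);

static toy_ophook_t toy_opfreehook = NULL;  /* models PL_opfreehook */
static toy_ophook_t prev_hook = NULL;       /* hook installed before ours */
static int ours_called = 0;
static int earlier_called = 0;

/* A hook that some other extension installed earlier. */
static void earlier_hook(ToyOp *o)
{
    (void)o;
    earlier_called++;
}

/* Our hook: do our own cleanup, then hand over to the earlier hook. */
static void my_hook(ToyOp *o)
{
    ours_called++;      /* here an extension would free data tied to o */
    if (prev_hook)
        prev_hook(o);
}

/* Install our hook while remembering whatever was there (maybe NULL). */
static void install_my_hook(void)
{
    prev_hook = toy_opfreehook;
    toy_opfreehook = my_hook;
}

/* Models the point where an op is freed and the hook fires. */
static void toy_free_op(ToyOp *o)
{
    if (toy_opfreehook)
        toy_opfreehook(o);
}
```

If "earlier_hook" is already installed when install_my_hook() runs, a single call to toy_free_op() triggers both hooks once each, so neither extension loses its cleanup.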
6756
6757 PL_peepp
6758 Pointer to the per-subroutine peephole optimiser. This is a
6759 function that gets called at the end of compilation of a Perl
6760 subroutine (or equivalently independent piece of Perl code) to
6761 perform fixups of some ops and to perform small-scale
6762 optimisations. The function is called once for each subroutine
6763 that is compiled, and is passed, as sole parameter, a pointer
6764 to the op that is the entry point to the subroutine. It
6765 modifies the op tree in place.
6766
6767 The peephole optimiser should never be completely replaced.
6768 Rather, add code to it by wrapping the existing optimiser. The
6769 basic way to do this can be seen in "Compile pass 3: peephole
6770 optimization" in perlguts. If the new code wishes to operate
6771 on ops throughout the subroutine's structure, rather than just
6772 at the top level, it is likely to be more convenient to wrap
6773 the "PL_rpeepp" hook.
6774
6775 peep_t PL_peepp
6776
6777 PL_rpeepp
6778 Pointer to the recursive peephole optimiser. This is a
6779 function that gets called at the end of compilation of a Perl
6780 subroutine (or equivalently independent piece of Perl code) to
6781 perform fixups of some ops and to perform small-scale
6782 optimisations. The function is called once for each chain of
6783 ops linked through their "op_next" fields; it is recursively
6784 called to handle each side chain. It is passed, as sole
6785 parameter, a pointer to the op that is at the head of the
6786 chain. It modifies the op tree in place.
6787
6788 The peephole optimiser should never be completely replaced.
6789 Rather, add code to it by wrapping the existing optimiser. The
6790 basic way to do this can be seen in "Compile pass 3: peephole
6791 optimization" in perlguts. If the new code wishes to operate
6792 only on ops at a subroutine's top level, rather than throughout
6793 the structure, it is likely to be more convenient to wrap the
6794 "PL_peepp" hook.
6795
6796 peep_t PL_rpeepp
6797
6798 PL_sv_no
6799 This is the "false" SV. See "PL_sv_yes". Always refer to this
6800 as &PL_sv_no.
6801
6802 SV PL_sv_no
6803
6804 PL_sv_undef
6805 This is the "undef" SV. Always refer to this as &PL_sv_undef.
6806
6807 SV PL_sv_undef
6808
6809 PL_sv_yes
6810 This is the "true" SV. See "PL_sv_no". Always refer to this
6811 as &PL_sv_yes.
6812
6813 SV PL_sv_yes
6814
6815 PL_sv_zero
6816 This readonly SV has a zero numeric value and a "0" string
6817 value. It's similar to "PL_sv_no" except for its string value.
6818 Can be used as a cheap alternative to mXPUSHi(0) for example.
6819 Always refer to this as &PL_sv_zero. Introduced in 5.28.
6820
6821 SV PL_sv_zero
6822
6824 SvRX Convenience macro to get the REGEXP from a SV. This is
6825 approximately equivalent to the following snippet:
6826
6827 if (SvMAGICAL(sv))
6828 mg_get(sv);
6829 if (SvROK(sv))
6830 sv = MUTABLE_SV(SvRV(sv));
6831 if (SvTYPE(sv) == SVt_REGEXP)
6832 return (REGEXP*) sv;
6833
6834 "NULL" will be returned if a REGEXP* is not found.
6835
6836 REGEXP * SvRX(SV *sv)
6837
6838 SvRXOK Returns a boolean indicating whether the SV (or the one it
6839 references) is a REGEXP.
6840
6841 If you want to do something with the REGEXP* later, use "SvRX"
6842 instead and check for "NULL".
6843
6844 bool SvRXOK(SV* sv)
6845
6847 dMARK Declare a stack marker variable, "mark", for the XSUB. See
6848 "MARK" and "dORIGMARK".
6849
6850 dMARK;
6851
6852 dORIGMARK
6853 Saves the original stack mark for the XSUB. See "ORIGMARK".
6854
6855 dORIGMARK;
6856
6857 dSP Declares a local copy of perl's stack pointer for the XSUB,
6858 available via the "SP" macro. See "SP".
6859
6860 dSP;
6861
6862 EXTEND Used to extend the argument stack for an XSUB's return values.
6863 Once used, guarantees that there is room for at least "nitems"
6864 to be pushed onto the stack.
6865
6866 void EXTEND(SP, SSize_t nitems)
6867
6868 MARK Stack marker variable for the XSUB. See "dMARK".
6869
6870 mPUSHi Push an integer onto the stack. The stack must have room for
6871 this element. Does not use "TARG". See also "PUSHi",
6872 "mXPUSHi" and "XPUSHi".
6873
6874 void mPUSHi(IV iv)
6875
6876 mPUSHn Push a double onto the stack. The stack must have room for
6877 this element. Does not use "TARG". See also "PUSHn",
6878 "mXPUSHn" and "XPUSHn".
6879
6880 void mPUSHn(NV nv)
6881
6882 mPUSHp Push a string onto the stack. The stack must have room for
6883 this element. The "len" indicates the length of the string.
6884 Does not use "TARG". See also "PUSHp", "mXPUSHp" and "XPUSHp".
6885
6886 void mPUSHp(char* str, STRLEN len)
6887
6888 mPUSHs Push an SV onto the stack and mortalize the SV. The stack
6889 must have room for this element. Does not use "TARG". See
6890 also "PUSHs" and "mXPUSHs".
6891
6892 void mPUSHs(SV* sv)
6893
6894 mPUSHu Push an unsigned integer onto the stack. The stack must have
6895 room for this element. Does not use "TARG". See also "PUSHu",
6896 "mXPUSHu" and "XPUSHu".
6897
6898 void mPUSHu(UV uv)
6899
6900 mXPUSHi Push an integer onto the stack, extending the stack if
6901 necessary. Does not use "TARG". See also "XPUSHi", "mPUSHi"
6902 and "PUSHi".
6903
6904 void mXPUSHi(IV iv)
6905
6906 mXPUSHn Push a double onto the stack, extending the stack if necessary.
6907 Does not use "TARG". See also "XPUSHn", "mPUSHn" and "PUSHn".
6908
6909 void mXPUSHn(NV nv)
6910
6911 mXPUSHp Push a string onto the stack, extending the stack if necessary.
6912 The "len" indicates the length of the string. Does not use
6913 "TARG". See also "XPUSHp", "mPUSHp" and "PUSHp".
6914
6915 void mXPUSHp(char* str, STRLEN len)
6916
6917 mXPUSHs Push an SV onto the stack, extending the stack if necessary, and
6918 mortalize the SV. Does not use "TARG". See also "XPUSHs" and
6919 "mPUSHs".
6920
6921 void mXPUSHs(SV* sv)
6922
6923 mXPUSHu Push an unsigned integer onto the stack, extending the stack if
6924 necessary. Does not use "TARG". See also "XPUSHu", "mPUSHu"
6925 and "PUSHu".
6926
6927 void mXPUSHu(UV uv)
6928
6929 ORIGMARK
6930 The original stack mark for the XSUB. See "dORIGMARK".
6931
6932 POPi Pops an integer off the stack.
6933
6934 IV POPi
6935
6936 POPl Pops a long off the stack.
6937
6938 long POPl
6939
6940 POPn Pops a double off the stack.
6941
6942 NV POPn
6943
6944 POPp Pops a string off the stack.
6945
6946 char* POPp
6947
6948 POPpbytex
6949 Pops a string off the stack which must consist of bytes, i.e.
6950 characters < 256.
6951
6952 char* POPpbytex
6953
6954 POPpx Pops a string off the stack. Identical to POPp. There are two
6955 names for historical reasons.
6956
6957 char* POPpx
6958
6959 POPs Pops an SV off the stack.
6960
6961 SV* POPs
6962
6963 POPu Pops an unsigned integer off the stack.
6964
6965 UV POPu
6966
6967 POPul Pops an unsigned long off the stack.
6968
6969 unsigned long POPul
6970
6971 PUSHi Push an integer onto the stack. The stack must have room for
6972 this element. Handles 'set' magic. Uses "TARG", so "dTARGET"
6973 or "dXSTARG" should be called to declare it. Do not call
6974 multiple "TARG"-oriented macros to return lists from XSUB's -
6975 see "mPUSHi" instead. See also "XPUSHi" and "mXPUSHi".
6976
6977 void PUSHi(IV iv)
6978
6979 PUSHMARK
6980 Opening bracket for arguments on a callback. See "PUTBACK" and
6981 perlcall.
6982
6983 void PUSHMARK(SP)
6984
6985 PUSHmortal
6986 Push a new mortal SV onto the stack. The stack must have room
6987 for this element. Does not use "TARG". See also "PUSHs",
6988 "XPUSHmortal" and "XPUSHs".
6989
6990 void PUSHmortal()
6991
6992 PUSHn Push a double onto the stack. The stack must have room for
6993 this element. Handles 'set' magic. Uses "TARG", so "dTARGET"
6994 or "dXSTARG" should be called to declare it. Do not call
6995 multiple "TARG"-oriented macros to return lists from XSUB's -
6996 see "mPUSHn" instead. See also "XPUSHn" and "mXPUSHn".
6997
6998 void PUSHn(NV nv)
6999
7000 PUSHp Push a string onto the stack. The stack must have room for
7001 this element. The "len" indicates the length of the string.
7002 Handles 'set' magic. Uses "TARG", so "dTARGET" or "dXSTARG"
7003 should be called to declare it. Do not call multiple
7004 "TARG"-oriented macros to return lists from XSUB's - see
7005 "mPUSHp" instead. See also "XPUSHp" and "mXPUSHp".
7006
7007 void PUSHp(char* str, STRLEN len)
7008
7009 PUSHs Push an SV onto the stack. The stack must have room for this
7010 element. Does not handle 'set' magic. Does not use "TARG".
7011 See also "PUSHmortal", "XPUSHs", and "XPUSHmortal".
7012
7013 void PUSHs(SV* sv)
7014
7015 PUSHu Push an unsigned integer onto the stack. The stack must have
7016 room for this element. Handles 'set' magic. Uses "TARG", so
7017 "dTARGET" or "dXSTARG" should be called to declare it. Do not
7018 call multiple "TARG"-oriented macros to return lists from
7019 XSUB's - see "mPUSHu" instead. See also "XPUSHu" and
7020 "mXPUSHu".
7021
7022 void PUSHu(UV uv)
7023
7024 PUTBACK Closing bracket for XSUB arguments. This is usually handled by
7025 "xsubpp". See "PUSHMARK" and perlcall for other uses.
7026
7027 PUTBACK;
7028
7029 SP Stack pointer. This is usually handled by "xsubpp". See "dSP"
7030 and "SPAGAIN".
7031
7032 SPAGAIN Refetch the stack pointer. Used after a callback. See
7033 perlcall.
7034
7035 SPAGAIN;
7036
7037 XPUSHi Push an integer onto the stack, extending the stack if
7038 necessary. Handles 'set' magic. Uses "TARG", so "dTARGET" or
7039 "dXSTARG" should be called to declare it. Do not call multiple
7040 "TARG"-oriented macros to return lists from XSUB's - see
7041 "mXPUSHi" instead. See also "PUSHi" and "mPUSHi".
7042
7043 void XPUSHi(IV iv)
7044
7045 XPUSHmortal
7046 Push a new mortal SV onto the stack, extending the stack if
7047 necessary. Does not use "TARG". See also "XPUSHs",
7048 "PUSHmortal" and "PUSHs".
7049
7050 void XPUSHmortal()
7051
7052 XPUSHn Push a double onto the stack, extending the stack if necessary.
7053 Handles 'set' magic. Uses "TARG", so "dTARGET" or "dXSTARG"
7054 should be called to declare it. Do not call multiple
7055 "TARG"-oriented macros to return lists from XSUB's - see
7056 "mXPUSHn" instead. See also "PUSHn" and "mPUSHn".
7057
7058 void XPUSHn(NV nv)
7059
7060 XPUSHp Push a string onto the stack, extending the stack if necessary.
7061 The "len" indicates the length of the string. Handles 'set'
7062 magic. Uses "TARG", so "dTARGET" or "dXSTARG" should be called
7063 to declare it. Do not call multiple "TARG"-oriented macros to
7064 return lists from XSUB's - see "mXPUSHp" instead. See also
7065 "PUSHp" and "mPUSHp".
7066
7067 void XPUSHp(char* str, STRLEN len)
7068
7069 XPUSHs Push an SV onto the stack, extending the stack if necessary.
7070 Does not handle 'set' magic. Does not use "TARG". See also
7071 "XPUSHmortal", "PUSHs" and "PUSHmortal".
7072
7073 void XPUSHs(SV* sv)
7074
7075 XPUSHu Push an unsigned integer onto the stack, extending the stack if
7076 necessary. Handles 'set' magic. Uses "TARG", so "dTARGET" or
7077 "dXSTARG" should be called to declare it. Do not call multiple
7078 "TARG"-oriented macros to return lists from XSUB's - see
7079 "mXPUSHu" instead. See also "PUSHu" and "mPUSHu".
7080
7081 void XPUSHu(UV uv)
7082
7083 XSRETURN
7084 Return from XSUB, indicating number of items on the stack.
7085 This is usually handled by "xsubpp".
7086
7087 void XSRETURN(int nitems)
7088
7089 XSRETURN_EMPTY
7090 Return an empty list from an XSUB immediately.
7091
7092 XSRETURN_EMPTY;
7093
7094 XSRETURN_IV
7095 Return an integer from an XSUB immediately. Uses "XST_mIV".
7096
7097 void XSRETURN_IV(IV iv)
7098
7099 XSRETURN_NO
7100 Return &PL_sv_no from an XSUB immediately. Uses "XST_mNO".
7101
7102 XSRETURN_NO;
7103
7104 XSRETURN_NV
7105 Return a double from an XSUB immediately. Uses "XST_mNV".
7106
7107 void XSRETURN_NV(NV nv)
7108
7109 XSRETURN_PV
7110 Return a copy of a string from an XSUB immediately. Uses
7111 "XST_mPV".
7112
7113 void XSRETURN_PV(char* str)
7114
7115 XSRETURN_UNDEF
7116 Return &PL_sv_undef from an XSUB immediately. Uses
7117 "XST_mUNDEF".
7118
7119 XSRETURN_UNDEF;
7120
7121 XSRETURN_UV
7122 Return an unsigned integer from an XSUB immediately. Uses "XST_mUV".
7123
7124 void XSRETURN_UV(UV uv)
7125
7126 XSRETURN_YES
7127 Return &PL_sv_yes from an XSUB immediately. Uses "XST_mYES".
7128
7129 XSRETURN_YES;
7130
7131 XST_mIV Place an integer into the specified position "pos" on the
7132 stack. The value is stored in a new mortal SV.
7133
7134 void XST_mIV(int pos, IV iv)
7135
7136 XST_mNO Place &PL_sv_no into the specified position "pos" on the stack.
7137
7138 void XST_mNO(int pos)
7139
7140 XST_mNV Place a double into the specified position "pos" on the stack.
7141 The value is stored in a new mortal SV.
7142
7143 void XST_mNV(int pos, NV nv)
7144
7145 XST_mPV Place a copy of a string into the specified position "pos" on
7146 the stack. The value is stored in a new mortal SV.
7147
7148 void XST_mPV(int pos, char* str)
7149
7150 XST_mUNDEF
7151 Place &PL_sv_undef into the specified position "pos" on the
7152 stack.
7153
7154 void XST_mUNDEF(int pos)
7155
7156 XST_mYES
7157 Place &PL_sv_yes into the specified position "pos" on the
7158 stack.
7159
7160 void XST_mYES(int pos)
7161
7163 SVt_INVLIST
7164 Type flag for scalars. See "svtype".
7165
7166 SVt_IV Type flag for scalars. See "svtype".
7167
7168 SVt_NULL
7169 Type flag for scalars. See "svtype".
7170
7171 SVt_NV Type flag for scalars. See "svtype".
7172
7173 SVt_PV Type flag for scalars. See "svtype".
7174
7175 SVt_PVAV
7176 Type flag for arrays. See "svtype".
7177
7178 SVt_PVCV
7179 Type flag for subroutines. See "svtype".
7180
7181 SVt_PVFM
7182 Type flag for formats. See "svtype".
7183
7184 SVt_PVGV
7185 Type flag for typeglobs. See "svtype".
7186
7187 SVt_PVHV
7188 Type flag for hashes. See "svtype".
7189
7190 SVt_PVIO
7191 Type flag for I/O objects. See "svtype".
7192
7193 SVt_PVIV
7194 Type flag for scalars. See "svtype".
7195
7196 SVt_PVLV
7197 Type flag for scalars. See "svtype".
7198
7199 SVt_PVMG
7200 Type flag for scalars. See "svtype".
7201
7202 SVt_PVNV
7203 Type flag for scalars. See "svtype".
7204
7205 SVt_REGEXP
7206 Type flag for regular expressions. See "svtype".
7207
7208 svtype An enum of flags for Perl types. These are found in the file
7209 sv.h in the "svtype" enum. Test these flags with the "SvTYPE"
7210 macro.
7211
7212 The types are:
7213
7214 SVt_NULL
7215 SVt_IV
7216 SVt_NV
7217 SVt_RV
7218 SVt_PV
7219 SVt_PVIV
7220 SVt_PVNV
7221 SVt_PVMG
7222 SVt_INVLIST
7223 SVt_REGEXP
7224 SVt_PVGV
7225 SVt_PVLV
7226 SVt_PVAV
7227 SVt_PVHV
7228 SVt_PVCV
7229 SVt_PVFM
7230 SVt_PVIO
7231
7232 These are most easily explained from the bottom up.
7233
7234 "SVt_PVIO" is for I/O objects, "SVt_PVFM" for formats,
7235 "SVt_PVCV" for subroutines, "SVt_PVHV" for hashes and
7236 "SVt_PVAV" for arrays.
7237
7238 All the others are scalar types, that is, things that can be
7239 bound to a "$" variable. For these, the internal types are
7240 mostly orthogonal to types in the Perl language.
7241
7242 Hence, checking "SvTYPE(sv) < SVt_PVAV" is the best way to see
7243 whether something is a scalar.
7244
7245 "SVt_PVGV" represents a typeglob. If "!SvFAKE(sv)", then it is
7246 a real, incoercible typeglob. If "SvFAKE(sv)", then it is a
7247 scalar to which a typeglob has been assigned. Assigning to it
7248 again will stop it from being a typeglob. "SVt_PVLV"
7249 represents a scalar that delegates to another scalar behind the
7250 scenes. It is used, e.g., for the return value of "substr" and
7251 for tied hash and array elements. It can hold any scalar
7252 value, including a typeglob. "SVt_REGEXP" is for regular
7253 expressions. "SVt_INVLIST" is for Perl core internal use only.
7254
7255 "SVt_PVMG" represents a "normal" scalar (not a typeglob,
7256 regular expression, or delegate). Since most scalars do not
7257 need all the internal fields of a PVMG, we save memory by
7258 allocating smaller structs when possible. All the other types
7259 are just simpler forms of "SVt_PVMG", with fewer internal
7260 fields. "SVt_NULL" can only hold undef. "SVt_IV" can hold
7261 undef, an integer, or a reference. ("SVt_RV" is an alias for
7262 "SVt_IV", which exists for backward compatibility.) "SVt_NV"
7263 can hold any of those or a double. "SVt_PV" can only hold
7264 "undef" or a string. "SVt_PVIV" is a superset of "SVt_PV" and
7265 "SVt_IV". "SVt_PVNV" is similar. "SVt_PVMG" can hold anything
7266 "SVt_PVNV" can hold, and it may additionally be blessed or
7267 magical, although it need not be.
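
The ordering test described above ("SvTYPE(sv) < SVt_PVAV") can be modelled outside of perl. The following is a standalone sketch, not the contents of sv.h: the enum "toy_svtype" and the helper "toy_is_scalar" are hypothetical names, and the numeric values simply follow the order of the list above, which is an assumption and may not match any particular perl version.

```c
#include <stdbool.h>

/* Hypothetical mirror of the documented ordering; not the real sv.h enum. */
enum toy_svtype {
    TOY_SVt_NULL,
    TOY_SVt_IV,
    TOY_SVt_NV,
    TOY_SVt_PV,
    TOY_SVt_PVIV,
    TOY_SVt_PVNV,
    TOY_SVt_PVMG,
    TOY_SVt_INVLIST,
    TOY_SVt_REGEXP,
    TOY_SVt_PVGV,
    TOY_SVt_PVLV,
    TOY_SVt_PVAV,
    TOY_SVt_PVHV,
    TOY_SVt_PVCV,
    TOY_SVt_PVFM,
    TOY_SVt_PVIO,
    TOY_SVt_RV = TOY_SVt_IV   /* alias, as described for SVt_RV */
};

/* Everything ordered before the array type is a scalar type. */
bool toy_is_scalar(enum toy_svtype t)
{
    return t < TOY_SVt_PVAV;
}
```

In this model a typeglob or an LV counts as a scalar, while arrays, hashes, subroutines, formats and I/O objects do not.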
7268
7270 boolSV Returns a true SV if "b" is a true value, or a false SV if "b"
7271 is 0.
7272
7273 See also "PL_sv_yes" and "PL_sv_no".
7274
7275 SV * boolSV(bool b)
7276
7277 croak_xs_usage
7278 A specialised variant of "croak()" for emitting the usage
7279 message for xsubs.
7280
7281 croak_xs_usage(cv, "eee_yow");
7282
7283 works out the package name and subroutine name from "cv", and
7284 then calls "croak()". Hence if "cv" is &ouch::awk, it would
7285 call "croak" as:
7286
7287 Perl_croak(aTHX_ "Usage: %" SVf "::%" SVf "(%s)", "ouch", "awk",
7288 "eee_yow");
7289
7290 void croak_xs_usage(const CV *const cv,
7291 const char *const params)
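
The shape of the resulting message can be illustrated with plain C. This is only a sketch of the text layout: "toy_usage_message" is a hypothetical helper that formats into a caller-supplied buffer, and unlike the real function it neither inspects a CV nor croaks.

```c
#include <stdio.h>

/* Hypothetical helper that only formats the text; it does not croak. */
int toy_usage_message(char *out, size_t outsz,
                      const char *pkg, const char *sub, const char *params)
{
    /* Package and subroutine are joined with "::", parameters go in (). */
    return snprintf(out, outsz, "Usage: %s::%s(%s)", pkg, sub, params);
}
```

With the package "ouch", the subroutine "awk" and the parameter string "eee_yow", the buffer would hold "Usage: ouch::awk(eee_yow)".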
7292
7293 get_sv Returns the SV of the specified Perl scalar. "flags" are
7294 passed to "gv_fetchpv". If "GV_ADD" is set and the Perl
7295 variable does not exist then it will be created. If "flags" is
7296 zero and the variable does not exist then NULL is returned.
7297
7298 NOTE: the perl_ form of this function is deprecated.
7299
7300 SV* get_sv(const char *name, I32 flags)
7301
7302 looks_like_number
7303 Test if the content of an SV looks like a number (or is a
7304 number). "Inf" and "Infinity" are treated as numbers (so will
7305 not issue a non-numeric warning), even if your "atof()" doesn't
7306 grok them. Get-magic is ignored.
7307
7308 I32 looks_like_number(SV *const sv)
7309
7310 newRV_inc
7311 Creates an RV wrapper for an SV. The reference count for the
7312 original SV is incremented.
7313
7314 SV* newRV_inc(SV* sv)
7315
7316 newRV_noinc
7317 Creates an RV wrapper for an SV. The reference count for the
7318 original SV is not incremented.
7319
7320 SV* newRV_noinc(SV *const tmpRef)
7321
7322 newSV Creates a new SV. A non-zero "len" parameter indicates the
7323 number of bytes of preallocated string space the SV should
7324 have. An extra byte for a trailing "NUL" is also reserved.
7325 ("SvPOK" is not set for the SV even if string space is
7326 allocated.) The reference count for the new SV is set to 1.
7327
7328 In 5.9.3, "newSV()" replaces the older "NEWSV()" API, and drops
7329 the first parameter, x, a debug aid which allowed callers to
7330 identify themselves. This aid has been superseded by a new
7331 build option, "PERL_MEM_LOG" (see "PERL_MEM_LOG" in
7332 perlhacktips). The older API is still there for use in XS
7333 modules supporting older perls.
7334
7335 SV* newSV(const STRLEN len)
7336
7337 newSVhek
7338 Creates a new SV from the hash key structure. It will generate
7339 scalars that point to the shared string table where possible.
7340 Returns a new (undefined) SV if "hek" is NULL.
7341
7342 SV* newSVhek(const HEK *const hek)
7343
7344 newSViv Creates a new SV and copies an integer into it. The reference
7345 count for the SV is set to 1.
7346
7347 SV* newSViv(const IV i)
7348
7349 newSVnv Creates a new SV and copies a floating point value into it.
7350 The reference count for the SV is set to 1.
7351
7352 SV* newSVnv(const NV n)
7353
7354 newSVpadname
7355 NOTE: this function is experimental and may change or be
7356 removed without notice.
7357
7358 Creates a new SV containing the pad name.
7359
7360 SV* newSVpadname(PADNAME *pn)
7361
7362 newSVpv Creates a new SV and copies a string (which may contain "NUL"
7363 ("\0") characters) into it. The reference count for the SV is
7364 set to 1. If "len" is zero, Perl will compute the length using
7365 "strlen()", (which means if you use this option, that "s" can't
7366 have embedded "NUL" characters and has to have a terminating
7367 "NUL" byte).
7368
7369 This function can cause reliability issues if you are likely to
7370 pass in empty strings that are not null terminated, because it
7371 will run strlen on the string and potentially run past valid
7372 memory.
7373
7374 Using "newSVpvn" is a safer alternative for strings that are
7375 not "NUL"-terminated. For string literals use "newSVpvs"
7376 instead. This function will work fine for "NUL" terminated
7377 strings, but if you want to avoid the if statement on whether
7378 to call "strlen" use "newSVpvn" instead (calling "strlen"
7379 yourself).
7380
7381 SV* newSVpv(const char *const s, const STRLEN len)
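
The effect of passing a zero "len" can be modelled in a few lines of plain C. This is a sketch of the length rule only, assuming nothing about how perl implements it; "toy_copy_length" is a hypothetical name, not part of the API.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: how many bytes a newSVpv-style call would copy. */
size_t toy_copy_length(const char *s, size_t len)
{
    /* A zero length means "measure it with strlen()", which stops at
     * the first NUL byte and requires the buffer to be NUL-terminated. */
    return len ? len : strlen(s);
}
```

For the five bytes "ab\0cd", an explicit length of 5 keeps the embedded "NUL", whereas a length of 0 would silently truncate the data to "ab".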
7382
7383 newSVpvf
7384 Creates a new SV and initializes it with the string formatted
7385 like "sv_catpvf".
7386
7387 SV* newSVpvf(const char *const pat, ...)
7388
7389 newSVpvn
7390 Creates a new SV and copies a string into it, which may contain
7391 "NUL" characters ("\0") and other binary data. The reference
7392 count for the SV is set to 1. Note that if "len" is zero, Perl
7393 will create a zero length (Perl) string. You are responsible
7394 for ensuring that the source buffer is at least "len" bytes
7395 long. If the "buffer" argument is NULL the new SV will be
7396 undefined.
7397
7398 SV* newSVpvn(const char *const buffer,
7399 const STRLEN len)
7400
7401 newSVpvn_flags
7402 Creates a new SV and copies a string (which may contain "NUL"
7403 ("\0") characters) into it. The reference count for the SV is
7404 set to 1. Note that if "len" is zero, Perl will create a zero
7405 length string. You are responsible for ensuring that the
7406 source string is at least "len" bytes long. If the "s"
7407 argument is NULL the new SV will be undefined. Currently the
7408 only flag bits accepted are "SVf_UTF8" and "SVs_TEMP". If
7409 "SVs_TEMP" is set, then "sv_2mortal()" is called on the result
7410 before returning. If "SVf_UTF8" is set, "s" is considered to
7411 be in UTF-8 and the "SVf_UTF8" flag will be set on the new SV.
7412 "newSVpvn_utf8()" is a convenience wrapper for this function,
7413 defined as
7414
7415 #define newSVpvn_utf8(s, len, u) \
7416 newSVpvn_flags((s), (len), (u) ? SVf_UTF8 : 0)
7417
7418 SV* newSVpvn_flags(const char *const s,
7419 const STRLEN len,
7420 const U32 flags)
7421
7422 newSVpvn_share
7423 Creates a new SV with its "SvPVX_const" pointing to a shared
7424 string in the string table. If the string does not already
7425 exist in the table, it is created first. Turns on the
7426 "SvIsCOW" flag (or "READONLY" and "FAKE" in 5.16 and earlier).
7427 If the "hash" parameter is non-zero, that value is used;
7428 otherwise the hash is computed. The string's hash can later be
7429 retrieved from the SV with the "SvSHARED_HASH()" macro. The
7430 idea here is that as the string table is used for shared hash
7431 keys these strings will have "SvPVX_const == HeKEY" and hash
7432 lookup will avoid string compare.
7433
7434 SV* newSVpvn_share(const char* s, I32 len, U32 hash)
7435
7436 newSVpvn_utf8
7437 Creates a new SV and copies a string (which may contain "NUL"
7438 ("\0") characters) into it. If "utf8" is true, calls
7439 "SvUTF8_on" on the new SV. Implemented as a wrapper around
7440 "newSVpvn_flags".
7441
7442 SV* newSVpvn_utf8(const char* s, STRLEN len,
7443 U32 utf8)
7444
7445 newSVpvs
7446 Like "newSVpvn", but takes a literal string instead of a
7447 string/length pair.
7448
7449 SV* newSVpvs("literal string" s)
7450
7451 newSVpvs_flags
7452 Like "newSVpvn_flags", but takes a literal string instead of a
7453 string/length pair.
7454
7455 SV* newSVpvs_flags("literal string" s, U32 flags)
7456
7457 newSVpv_share
7458 Like "newSVpvn_share", but takes a "NUL"-terminated string
7459 instead of a string/length pair.
7460
7461 SV* newSVpv_share(const char* s, U32 hash)
7462
7463 newSVpvs_share
7464 Like "newSVpvn_share", but takes a literal string instead of a
7465 string/length pair and omits the hash parameter.
7466
7467 SV* newSVpvs_share("literal string" s)
7468
7469 newSVrv Creates a new SV for the existing RV, "rv", to point to. If
7470 "rv" is not an RV then it will be upgraded to one. If
7471 "classname" is non-null then the new SV will be blessed in the
7472 specified package. The new SV is returned and its reference
7473 count is 1. The reference count 1 is owned by "rv". See also
7474 newRV_inc() and newRV_noinc() for creating a new RV properly.
7475
7476 SV* newSVrv(SV *const rv,
7477 const char *const classname)
7478
7479 newSVsv Creates a new SV which is an exact duplicate of the original
7480 SV. (Uses "sv_setsv".)
7481
7482 SV* newSVsv(SV *const old)
7483
7484 newSVsv_nomg
7485 Like "newSVsv" but does not process get magic.
7486
7487 SV* newSVsv_nomg(SV *const old)
7488
7489 newSV_type
7490 Creates a new SV, of the type specified. The reference count
7491 for the new SV is set to 1.
7492
7493 SV* newSV_type(const svtype type)
7494
7495 newSVuv Creates a new SV and copies an unsigned integer into it. The
7496 reference count for the SV is set to 1.
7497
7498 SV* newSVuv(const UV u)
7499
7500 sv_2bool
7501 This macro is only used by "sv_true()" or its macro equivalent,
7502 and only if the latter's argument is neither "SvPOK", "SvIOK"
7503 nor "SvNOK". It calls "sv_2bool_flags" with the "SV_GMAGIC"
7504 flag.
7505
7506 bool sv_2bool(SV *const sv)
7507
7508 sv_2bool_flags
7509 This function is only used by "sv_true()" and friends, and
7510 only if the latter's argument is neither "SvPOK", "SvIOK" nor
7511 "SvNOK". If the flags contain "SV_GMAGIC", then it does an
7512 "mg_get()" first.
7513
7514 bool sv_2bool_flags(SV *sv, I32 flags)
7515
7516 sv_2cv Using various gambits, try to get a CV from an SV; in addition,
7517 try if possible to set *st and *gvp to the stash and GV
7518 associated with it. The flags in "lref" are passed to
7519 "gv_fetchsv".
7520
7521 CV* sv_2cv(SV* sv, HV **const st, GV **const gvp,
7522 const I32 lref)
7523
7524 sv_2io Using various gambits, try to get an IO from an SV: the IO slot
7525 if it's a GV; or the recursive result if we're an RV; or the IO
7526 slot of the symbol named after the PV if we're a string.
7527
7528 'Get' magic is ignored on the "sv" passed in, but will be
7529 called on "SvRV(sv)" if "sv" is an RV.
7530
7531 IO* sv_2io(SV *const sv)
7532
7533 sv_2iv_flags
7534 Return the integer value of an SV, doing any necessary string
7535 conversion. If "flags" has the "SV_GMAGIC" bit set, does an
7536 "mg_get()" first. Normally used via the "SvIV(sv)" and
7537 "SvIVx(sv)" macros.
7538
7539 IV sv_2iv_flags(SV *const sv, const I32 flags)
7540
7541 sv_2mortal
7542 Marks an existing SV as mortal. The SV will be destroyed
7543 "soon", either by an explicit call to "FREETMPS", or by an
7544 implicit call at places such as statement boundaries.
7545 "SvTEMP()" is turned on which means that the SV's string buffer
7546 can be "stolen" if this SV is copied. See also "sv_newmortal"
7547 and "sv_mortalcopy".
7548
7549 SV* sv_2mortal(SV *const sv)
7550
7551 sv_2nv_flags
7552 Return the numeric value of an SV, doing any necessary string or
7553 integer conversion. If "flags" has the "SV_GMAGIC" bit set,
7554 does an "mg_get()" first. Normally used via the "SvNV(sv)" and
7555 "SvNVx(sv)" macros.
7556
7557 NV sv_2nv_flags(SV *const sv, const I32 flags)
7558
7559 sv_2pvbyte
7560 Return a pointer to the byte-encoded representation of the SV,
7561 and set *lp to its length. May cause the SV to be downgraded
7562 from UTF-8 as a side-effect.
7563
7564 Usually accessed via the "SvPVbyte" macro.
7565
7566 char* sv_2pvbyte(SV *sv, STRLEN *const lp)
7567
7568 sv_2pvutf8
7569 Return a pointer to the UTF-8-encoded representation of the SV,
7570 and set *lp to its length. May cause the SV to be upgraded to
7571 UTF-8 as a side-effect.
7572
7573 Usually accessed via the "SvPVutf8" macro.
7574
7575 char* sv_2pvutf8(SV *sv, STRLEN *const lp)
7576
7577 sv_2pv_flags
7578 Returns a pointer to the string value of an SV, and sets *lp to
7579 its length. If flags has the "SV_GMAGIC" bit set, does an
7580 "mg_get()" first. Coerces "sv" to a string if necessary.
7581 Normally invoked via the "SvPV_flags" macro. "sv_2pv()" and
7582 "sv_2pv_nomg" usually end up here too.
7583
7584 char* sv_2pv_flags(SV *const sv, STRLEN *const lp,
7585 const I32 flags)
7586
7587 sv_2uv_flags
7588 Return the unsigned integer value of an SV, doing any necessary
7589 string conversion. If "flags" has the "SV_GMAGIC" bit set,
7590 does an "mg_get()" first. Normally used via the "SvUV(sv)" and
7591 "SvUVx(sv)" macros.
7592
7593 UV sv_2uv_flags(SV *const sv, const I32 flags)
7594
7595 sv_backoff
7596 Remove any string offset. You should normally use the
7597 "SvOOK_off" macro wrapper instead.
7598
7599 void sv_backoff(SV *const sv)
7600
7601 sv_bless
7602 Blesses an SV into a specified package. The SV must be an RV.
7603 The package must be designated by its stash (see "gv_stashpv").
7604 The reference count of the SV is unaffected.
7605
7606 SV* sv_bless(SV *const sv, HV *const stash)
7607
7608 sv_catpv
7609 Concatenates the "NUL"-terminated string onto the end of the
7610 string which is in the SV. If the SV has the UTF-8 status set,
7611 then the bytes appended should be valid UTF-8. Handles 'get'
7612 magic, but not 'set' magic. See "sv_catpv_mg".
7613
7614 void sv_catpv(SV *const sv, const char* ptr)
7615
7616 sv_catpvf
7617 Processes its arguments like "sprintf", and appends the
7618 formatted output to an SV. As with "sv_vcatpvfn" called with a
7619 non-null C-style variable argument list, argument reordering is
7620 not supported. If the appended data contains "wide" characters
7621 (including, but not limited to, SVs with a UTF-8 PV formatted
7622 with %s, and characters >255 formatted with %c), the original
7623 SV might get upgraded to UTF-8. Handles 'get' magic, but not
7624 'set' magic. See "sv_catpvf_mg". If the original SV was
7625 UTF-8, the pattern should be valid UTF-8; if the original SV
7626 was bytes, the pattern should be too.
7627
7628 void sv_catpvf(SV *const sv, const char *const pat,
7629 ...)
7630
7631 sv_catpvf_mg
7632 Like "sv_catpvf", but also handles 'set' magic.
7633
7634 void sv_catpvf_mg(SV *const sv,
7635 const char *const pat, ...)
7636
7637 sv_catpvn
7638 Concatenates the string onto the end of the string which is in
7639 the SV. "len" indicates number of bytes to copy. If the SV
7640 has the UTF-8 status set, then the bytes appended should be
7641 valid UTF-8. Handles 'get' magic, but not 'set' magic. See
7642 "sv_catpvn_mg".
7643
7644 void sv_catpvn(SV *dsv, const char *sstr, STRLEN len)
7645
7646 sv_catpvn_flags
7647 Concatenates the string onto the end of the string which is in
7648 the SV. The "len" indicates number of bytes to copy.
7649
7650 By default, the string appended is assumed to be valid UTF-8 if
7651 the SV has the UTF-8 status set, and a string of bytes
7652 otherwise. One can force the appended string to be interpreted
7653 as UTF-8 by supplying the "SV_CATUTF8" flag, and as bytes by
7654 supplying the "SV_CATBYTES" flag; the SV or the string appended
7655 will be upgraded to UTF-8 if necessary.
7656
7657 If "flags" has the "SV_SMAGIC" bit set, will "mg_set" on "dsv"
7658 afterwards if appropriate. "sv_catpvn" and "sv_catpvn_nomg"
7659 are implemented in terms of this function.
7660
7661 void sv_catpvn_flags(SV *const dstr,
7662 const char *sstr,
7663 const STRLEN len,
7664 const I32 flags)
7665
7666 sv_catpvn_nomg
7667 Like "sv_catpvn" but doesn't process magic.
7668
7669 void sv_catpvn_nomg(SV* sv, const char* ptr,
7670 STRLEN len)
7671
7672 sv_catpvs
7673 Like "sv_catpvn", but takes a literal string instead of a
7674 string/length pair.
7675
7676 void sv_catpvs(SV* sv, "literal string" s)
7677
7678 sv_catpvs_flags
7679 Like "sv_catpvn_flags", but takes a literal string instead of a
7680 string/length pair.
7681
7682 void sv_catpvs_flags(SV* sv, "literal string" s,
7683 I32 flags)
7684
7685 sv_catpvs_mg
7686 Like "sv_catpvn_mg", but takes a literal string instead of a
7687 string/length pair.
7688
7689 void sv_catpvs_mg(SV* sv, "literal string" s)
7690
7691 sv_catpvs_nomg
7692 Like "sv_catpvn_nomg", but takes a literal string instead of a
7693 string/length pair.
7694
7695 void sv_catpvs_nomg(SV* sv, "literal string" s)
7696
7697 sv_catpv_flags
7698 Concatenates the "NUL"-terminated string onto the end of the
7699 string which is in the SV. If the SV has the UTF-8 status set,
7700 then the bytes appended should be valid UTF-8. If "flags" has
7701 the "SV_SMAGIC" bit set, will "mg_set" on the modified SV if
7702 appropriate.
7703
7704 void sv_catpv_flags(SV *dstr, const char *sstr,
7705 const I32 flags)
7706
7707 sv_catpv_mg
7708 Like "sv_catpv", but also handles 'set' magic.
7709
7710 void sv_catpv_mg(SV *const sv, const char *const ptr)
7711
7712 sv_catpv_nomg
7713 Like "sv_catpv" but doesn't process magic.
7714
7715 void sv_catpv_nomg(SV* sv, const char* ptr)
7716
7717 sv_catsv
7718 Concatenates the string from SV "ssv" onto the end of the
7719 string in SV "dsv". If "ssv" is null, does nothing; otherwise
7720 modifies only "dsv". Handles 'get' magic on both SVs, but no
7721 'set' magic. See "sv_catsv_mg" and "sv_catsv_nomg".
7722
7723 void sv_catsv(SV *dstr, SV *sstr)
7724
7725 sv_catsv_flags
7726 Concatenates the string from SV "ssv" onto the end of the
7727 string in SV "dsv". If "ssv" is null, does nothing; otherwise
7728 modifies only "dsv". If "flags" has the "SV_GMAGIC" bit set,
7729 will call "mg_get" on both SVs if appropriate. If "flags" has
7730 the "SV_SMAGIC" bit set, "mg_set" will be called on the
7731 modified SV afterward, if appropriate. "sv_catsv",
7732 "sv_catsv_nomg", and "sv_catsv_mg" are implemented in terms of
7733 this function.
7734
7735 void sv_catsv_flags(SV *const dsv, SV *const ssv,
7736 const I32 flags)
7737
7738 sv_catsv_nomg
7739 Like "sv_catsv" but doesn't process magic.
7740
7741 void sv_catsv_nomg(SV* dsv, SV* ssv)
7742
7743 sv_chop Efficient removal of characters from the beginning of the
7744 string buffer. "SvPOK(sv)", or at least "SvPOKp(sv)", must be
7745 true and "ptr" must be a pointer to somewhere inside the string
7746 buffer. "ptr" becomes the first character of the adjusted
7747 string. Uses the "OOK" hack. On return, only "SvPOK(sv)" and
7748 "SvPOKp(sv)" among the "OK" flags will be true.
7749
7750 Beware: after this function returns, "ptr" and SvPVX_const(sv)
7751 may no longer refer to the same chunk of data.
7752
7753 The unfortunate similarity of this function's name to that of
7754 Perl's "chop" operator is strictly coincidental. This function
7755 works from the left; "chop" works from the right.
7756
7757 void sv_chop(SV *const sv, const char *const ptr)
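
The "works from the left" behaviour can be sketched with a plain pointer-and-length pair. This is a toy model, not the "OOK" mechanism itself: "struct toy_str" and "toy_chop_front" are hypothetical names, and the model does not reproduce the caveat that "ptr" may stop matching the buffer afterwards.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a string SV: a view into a larger buffer. */
struct toy_str {
    const char *pv;   /* first visible byte */
    size_t      cur;  /* number of visible bytes */
};

/* Drop everything before ptr; ptr must point inside the visible bytes. */
void toy_chop_front(struct toy_str *s, const char *ptr)
{
    assert(ptr >= s->pv && ptr <= s->pv + s->cur);
    s->cur -= (size_t)(ptr - s->pv);   /* shrink by the bytes removed */
    s->pv = ptr;                       /* ptr becomes the first byte */
}
```

Chopping "hello world" at the address of the "w" leaves a five-byte string starting with "w"; no bytes are moved.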
7758
7759 sv_clear
7760 Clear an SV: call any destructors, free up any memory used by
7761 the body, and free the body itself. The SV's head is not
7762 freed, although its type is set to all 1's so that it won't
7763 inadvertently be assumed to be live during global destruction
7764 etc. This function should only be called when "REFCNT" is
7765 zero. Most of the time you'll want to call "sv_free()" (or its
7766 macro wrapper "SvREFCNT_dec") instead.
7767
7768 void sv_clear(SV *const orig_sv)
7769
7770 sv_cmp Compares the strings in two SVs. Returns -1, 0, or 1
7771 indicating whether the string in "sv1" is less than, equal to,
7772 or greater than the string in "sv2". Is UTF-8 and 'use bytes'
7773 aware, handles get magic, and will coerce its args to strings
7774 if necessary. See also "sv_cmp_locale".
7775
7776 I32 sv_cmp(SV *const sv1, SV *const sv2)
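
The three-way return convention can be illustrated with a byte-string comparison in plain C. This sketch assumes raw bytes only: it ignores UTF-8, 'use bytes', magic and coercion, all of which the real function handles, and "toy_cmp" is a hypothetical name.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical byte-string comparison returning exactly -1, 0 or 1. */
int toy_cmp(const char *a, size_t alen, const char *b, size_t blen)
{
    size_t n = alen < blen ? alen : blen;
    int r = n ? memcmp(a, b, n) : 0;

    if (r != 0)
        return r < 0 ? -1 : 1;
    /* Equal prefixes: the shorter string sorts first. */
    return (alen > blen) - (alen < blen);
}
```

Because lengths are passed explicitly, embedded "NUL" bytes take part in the comparison like any other byte.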
7777
7778 sv_cmp_flags
7779 Compares the strings in two SVs. Returns -1, 0, or 1
7780 indicating whether the string in "sv1" is less than, equal to,
7781 or greater than the string in "sv2". Is UTF-8 and 'use bytes'
7782 aware and will coerce its args to strings if necessary. If the
7783 "flags" argument has the "SV_GMAGIC" bit set, it handles get magic. See
7784 also "sv_cmp_locale_flags".
7785
7786 I32 sv_cmp_flags(SV *const sv1, SV *const sv2,
7787 const U32 flags)
7788
7789 sv_cmp_locale
7790 Compares the strings in two SVs in a locale-aware manner. Is
7791 UTF-8 and 'use bytes' aware, handles get magic, and will coerce
7792 its args to strings if necessary. See also "sv_cmp".
7793
7794 I32 sv_cmp_locale(SV *const sv1, SV *const sv2)
7795
7796 sv_cmp_locale_flags
7797 Compares the strings in two SVs in a locale-aware manner. Is
7798 UTF-8 and 'use bytes' aware and will coerce its args to strings
7799 if necessary. If the flags contain "SV_GMAGIC", it handles get
7800 magic. See also "sv_cmp_flags".
7801
7802 I32 sv_cmp_locale_flags(SV *const sv1,
7803 SV *const sv2,
7804 const U32 flags)
7805
7806 sv_collxfrm
7807 This calls "sv_collxfrm_flags" with the SV_GMAGIC flag. See
7808 "sv_collxfrm_flags".
7809
7810 char* sv_collxfrm(SV *const sv, STRLEN *const nxp)
7811
7812 sv_collxfrm_flags
7813 Add Collate Transform magic to an SV if it doesn't already have
7814 it. If the flags contain "SV_GMAGIC", it handles get-magic.
7815
7816 Any scalar variable may carry "PERL_MAGIC_collxfrm" magic that
7817 contains the scalar data of the variable, but transformed to
7818 such a format that a normal memory comparison can be used to
7819 compare the data according to the locale settings.
7820
7821 char* sv_collxfrm_flags(SV *const sv,
7822 STRLEN *const nxp,
7823 I32 const flags)
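
The idea of transforming a string so that an ordinary comparison respects the locale is the same one behind the standard C function "strxfrm()". The sketch below uses only the C library, not perl; "toy_collation_key" and "toy_sign" are hypothetical helpers, and it compares keys with "strcmp()" as the C standard describes, which is an assumption about how best to illustrate the point rather than a description of perl's internals.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: return a malloc'ed collation key for s, or NULL. */
char *toy_collation_key(const char *s)
{
    size_t need = strxfrm(NULL, s, 0) + 1;  /* size of the key plus NUL */
    char *key = malloc(need);

    if (key != NULL)
        strxfrm(key, s, need);
    return key;
}

/* Sign of an int, so results can be compared as -1, 0 or 1. */
int toy_sign(int v)
{
    return (v > 0) - (v < 0);
}
```

Comparing two keys with "strcmp()" gives a result with the same sign as calling "strcoll()" on the original strings, so the cost of the locale-aware transformation is paid once per string rather than once per comparison.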
7824
7825 sv_copypv
7826 Copies a stringified representation of the source SV into the
7827 destination SV. Automatically performs any necessary "mg_get"
7828 and coercion of numeric values into strings. Guaranteed to
7829 preserve "UTF8" flag even from overloaded objects. Similar in
7830 nature to "sv_2pv[_flags]" but operates directly on an SV
7831 instead of just the string. Mostly uses "sv_2pv_flags" to do
7832 its work, except when that would lose the UTF-8'ness of the PV.
7833
7834 void sv_copypv(SV *const dsv, SV *const ssv)
7835
7836 sv_copypv_flags
7837 Implementation of "sv_copypv" and "sv_copypv_nomg". Calls get
7838 magic iff flags has the "SV_GMAGIC" bit set.
7839
7840 void sv_copypv_flags(SV *const dsv, SV *const ssv,
7841 const I32 flags)
7842
7843 sv_copypv_nomg
7844 Like "sv_copypv", but doesn't invoke get magic first.
7845
7846 void sv_copypv_nomg(SV *const dsv, SV *const ssv)
7847
7848 SvCUR Returns the length of the string which is in the SV. See
7849 "SvLEN".
7850
7851 STRLEN SvCUR(SV* sv)
7852
7853 SvCUR_set
7854 Set the current length of the string which is in the SV. See
7855 "SvCUR" and "SvIV_set".
7856
7857 void SvCUR_set(SV* sv, STRLEN len)
7858
7859 sv_dec Auto-decrement of the value in the SV, doing string to numeric
7860 conversion if necessary. Handles 'get' magic and operator
7861 overloading.
7862
7863 void sv_dec(SV *const sv)
7864
7865 sv_dec_nomg
7866 Auto-decrement of the value in the SV, doing string to numeric
7867 conversion if necessary. Handles operator overloading. Skips
7868 handling 'get' magic.
7869
7870 void sv_dec_nomg(SV *const sv)
7871
7872 sv_derived_from
7873 Exactly like "sv_derived_from_pv", but doesn't take a "flags"
7874 parameter.
7875
7876 bool sv_derived_from(SV* sv, const char *const name)
7877
7878 sv_derived_from_pv
7879 Exactly like "sv_derived_from_pvn", but takes a nul-terminated
7880 string instead of a string/length pair.
7881
7882 bool sv_derived_from_pv(SV* sv,
7883 const char *const name,
7884 U32 flags)
7885
7886 sv_derived_from_pvn
7887 Returns a boolean indicating whether the SV is derived from the
7888 specified class at the C level. To check derivation at the
7889 Perl level, call "isa()" as a normal Perl method.
7890
7891 Currently, the only significant value for "flags" is SVf_UTF8.
7892
7893 bool sv_derived_from_pvn(SV* sv,
7894 const char *const name,
7895 const STRLEN len, U32 flags)
7896
7897 sv_derived_from_sv
7898 Exactly like "sv_derived_from_pvn", but takes the name string
7899 in the form of an SV instead of a string/length pair.
7900
7901 bool sv_derived_from_sv(SV* sv, SV *namesv,
7902 U32 flags)
7903
7904 sv_does Like "sv_does_pv", but doesn't take a "flags" parameter.
7905
7906 bool sv_does(SV* sv, const char *const name)
7907
7908 sv_does_pv
7909 Like "sv_does_sv", but takes a nul-terminated string instead of
7910 an SV.
7911
7912 bool sv_does_pv(SV* sv, const char *const name,
7913 U32 flags)
7914
7915 sv_does_pvn
7916 Like "sv_does_sv", but takes a string/length pair instead of an
7917 SV.
7918
7919 bool sv_does_pvn(SV* sv, const char *const name,
7920 const STRLEN len, U32 flags)
7921
7922 sv_does_sv
7923 Returns a boolean indicating whether the SV performs a
7924 specific, named role. The SV can be a Perl object or the name
7925 of a Perl class.
7926
7927 bool sv_does_sv(SV* sv, SV* namesv, U32 flags)
7928
7929 SvEND Returns a pointer to the spot just after the last character in
7930 the string which is in the SV, where there is usually a
7931 trailing "NUL" character (even though Perl scalars do not
7932 strictly require it). See "SvCUR". Access the character as
7933 "*(SvEND(sv))".
7934
7935 Warning: If "SvCUR" is equal to "SvLEN", then "SvEND" points to
7936 unallocated memory.
7937
7938 char* SvEND(SV* sv)
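
The warning about "SvCUR" being equal to "SvLEN" can be made concrete with a small model. This is a sketch under stated assumptions: "struct toy_pv", "toy_end" and "toy_end_is_readable" are hypothetical names standing in for an SV's string pointer, current length and allocated length.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the string part of an SV. */
struct toy_pv {
    char  *pv;   /* start of the buffer */
    size_t cur;  /* bytes in use */
    size_t len;  /* bytes allocated */
};

/* Pointer to the spot just after the last byte in use. */
char *toy_end(const struct toy_pv *s)
{
    return s->pv + s->cur;
}

/* The end pointer may be dereferenced only if it is inside the allocation. */
bool toy_end_is_readable(const struct toy_pv *s)
{
    return s->cur < s->len;
}
```

When every allocated byte is in use, the end pointer is one past the allocation and must not be read or written.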
7939
7940 sv_eq Returns a boolean indicating whether the strings in the two SVs
7941 are identical. Is UTF-8 and 'use bytes' aware, handles get
7942 magic, and will coerce its args to strings if necessary.
7943
7944 I32 sv_eq(SV* sv1, SV* sv2)
7945
7946 sv_eq_flags
7947 Returns a boolean indicating whether the strings in the two SVs
7948 are identical. Is UTF-8 and 'use bytes' aware and coerces its
7949 args to strings if necessary. If "flags" has the "SV_GMAGIC"
7950 bit set, it handles get-magic, too.
7951
7952 I32 sv_eq_flags(SV* sv1, SV* sv2, const U32 flags)
7953
7954 sv_force_normal_flags
7955 Undo various types of fakery on an SV, where fakery means "more
7956 than" a string: if the PV is a shared string, make a private
7957 copy; if we're a ref, stop refing; if we're a glob, downgrade
7958 to an "xpvmg"; if we're a copy-on-write scalar, this is the on-
7959 write time when we do the copy, and is also used locally; if
7960 this is a vstring, drop the vstring magic. If "SV_COW_DROP_PV"
7961 is set then a copy-on-write scalar drops its PV buffer (if any)
7962 and becomes "SvPOK_off" rather than making a copy. (Used where
7963 this scalar is about to be set to some other value.) In
7964 addition, the "flags" parameter gets passed to
7965 "sv_unref_flags()" when unreffing. "sv_force_normal" calls
7966 this function with flags set to 0.
7967
7968 This function is expected to be used to signal to perl that
7969 this SV is about to be written to, and any extra book-keeping
7970 needs to be taken care of. Hence, it croaks on read-only
7971 values.
7972
7973 void sv_force_normal_flags(SV *const sv,
7974 const U32 flags)
7975
7976 sv_free Decrement an SV's reference count, and if it drops to zero,
7977 call "sv_clear" to invoke destructors and free up any memory
7978 used by the body; finally, deallocating the SV's head itself.
7979 Normally called via a wrapper macro "SvREFCNT_dec".
7980
7981 void sv_free(SV *const sv)
7982
7983 SvGAMAGIC
7984 Returns true if the SV has get magic or overloading. If either
7985 is true then the scalar is active data, and has the potential
7986 to return a new value every time it is accessed. Hence you
7987 must be careful to only read it once per user logical operation
7988 and work with that returned value. If neither is true then the
7989 scalar's value cannot change unless written to.
7990
7991 U32 SvGAMAGIC(SV* sv)
7992
7993 sv_gets Get a line from the filehandle and store it into the SV,
7994 optionally appending to the currently-stored string. If
7995 "append" is not 0, the line is appended to the SV instead of
7996 overwriting it. "append" should be set to the byte offset that
7997 the appended string should start at in the SV (typically,
7998 "SvCUR(sv)" is a suitable choice).
7999
8000 char* sv_gets(SV *const sv, PerlIO *const fp,
8001 I32 append)
8002
8003 sv_get_backrefs
8004 NOTE: this function is experimental and may change or be
8005 removed without notice.
8006
8007 If "sv" is the target of a weak reference then it returns the
8008 back references structure associated with the sv; otherwise
8009 it returns "NULL".
8010
8011 When returning a non-null result the type of the return is
8012 relevant. If it is an AV then the elements of the AV are the
8013 weak reference RVs which point at this item. If it is any other
8014 type then the item itself is the weak reference.
8015
8016 See also "Perl_sv_add_backref()", "Perl_sv_del_backref()",
8017 "Perl_sv_kill_backrefs()".
8018
8019 SV* sv_get_backrefs(SV *const sv)
8020
8021 SvGROW Expands the character buffer in the SV so that it has room for
8022 the indicated number of bytes (remember to reserve space for an
8023 extra trailing "NUL" character). Calls "sv_grow" to perform
8024 the expansion if necessary. Returns a pointer to the character
8025 buffer. SV must be of type >= "SVt_PV". One alternative is to
8026 call "sv_grow" if you are not sure of the type of SV.
8027
8028 You might mistakenly think that "len" is the number of bytes to
8029 add to the existing size, but instead it is the total size "sv"
8030 should be.
8031
8032 char * SvGROW(SV* sv, STRLEN len)
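
The "total size, not increment" rule can be modelled with "realloc()". This is a standalone sketch, not perl's allocator: "struct toy_buf" and "toy_grow" are hypothetical names, and the real function may allocate more than was asked for.

```c
#include <stdlib.h>

/* Hypothetical stand-in for an SV's string buffer. */
struct toy_buf {
    char  *pv;   /* allocated storage, or NULL */
    size_t len;  /* bytes allocated */
};

/* Ensure at least `newlen` bytes in total; returns the buffer or NULL. */
char *toy_grow(struct toy_buf *b, size_t newlen)
{
    if (newlen > b->len) {
        char *p = realloc(b->pv, newlen);

        if (p == NULL)
            return NULL;   /* the old buffer is left untouched */
        b->pv = p;
        b->len = newlen;
    }
    return b->pv;          /* already big enough: nothing to do */
}
```

Under this model, appending "n" bytes to a string whose current length is "cur" means requesting "cur + n + 1" bytes, the extra byte being the trailing "NUL"; requesting just "n" would leave the buffer unchanged whenever "n" is not larger than the existing allocation.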
8033
8034 sv_grow Expands the character buffer in the SV. If necessary, uses
8035 "sv_unref" and upgrades the SV to "SVt_PV". Returns a pointer
8036 to the character buffer. Use the "SvGROW" wrapper instead.
8037
8038 char* sv_grow(SV *const sv, STRLEN newlen)
8039
8040 sv_inc Auto-increment of the value in the SV, doing string to numeric
8041 conversion if necessary. Handles 'get' magic and operator
8042 overloading.
8043
8044 void sv_inc(SV *const sv)
8045
8046 sv_inc_nomg
8047 Auto-increment of the value in the SV, doing string to numeric
8048 conversion if necessary. Handles operator overloading. Skips
8049 handling 'get' magic.
8050
8051 void sv_inc_nomg(SV *const sv)
8052
8053 sv_insert
8054 Inserts and/or replaces a string at the specified offset/length
8055 within the SV. Similar to the Perl "substr()" function, with
8056 "littlelen" bytes starting at "little" replacing "len" bytes of
8057 the string in "bigstr" starting at "offset". Handles get
8058 magic.
8059
8060 void sv_insert(SV *const bigstr, const STRLEN offset,
8061 const STRLEN len,
8062 const char *const little,
8063 const STRLEN littlelen)
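
The roles of "offset", "len", "little" and "littlelen" can be sketched with ordinary byte buffers. Unlike the real function, which modifies "bigstr" in place, the hypothetical helper "toy_splice" below returns a newly allocated result; it is an illustration of the replacement arithmetic only.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: return a new malloc'ed, NUL-terminated buffer in
 * which `len` bytes of `big` starting at `offset` are replaced by `little`.
 * Returns NULL if the range is out of bounds or allocation fails.
 */
char *toy_splice(const char *big, size_t biglen,
                 size_t offset, size_t len,
                 const char *little, size_t littlelen,
                 size_t *outlen)
{
    if (offset > biglen || len > biglen - offset)
        return NULL;

    size_t newlen = biglen - len + littlelen;
    char *out = malloc(newlen + 1);
    if (out == NULL)
        return NULL;

    memcpy(out, big, offset);                        /* head */
    memcpy(out + offset, little, littlelen);         /* replacement */
    memcpy(out + offset + littlelen,                 /* tail */
           big + offset + len, biglen - offset - len);
    out[newlen] = '\0';
    *outlen = newlen;
    return out;
}
```

A "len" of zero gives a pure insertion and a "littlelen" of zero gives a pure deletion, which matches the "inserts and/or replaces" wording above.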
8064
8065 sv_insert_flags
8066 Same as "sv_insert", but the extra "flags" are passed to the
8067 "SvPV_force_flags" that applies to "bigstr".
8068
8069 void sv_insert_flags(SV *const bigstr,
8070 const STRLEN offset,
8071 const STRLEN len,
8072 const char *little,
8073 const STRLEN littlelen,
8074 const U32 flags)
8075
8076 SvIOK Returns a U32 value indicating whether the SV contains an
8077 integer.
8078
8079 U32 SvIOK(SV* sv)
8080
8081 SvIOK_notUV
8082 Returns a boolean indicating whether the SV contains a signed
8083 integer.
8084
8085 bool SvIOK_notUV(SV* sv)
8086
8087 SvIOK_off
8088 Unsets the IV status of an SV.
8089
8090 void SvIOK_off(SV* sv)
8091
8092 SvIOK_on
8093 Tells an SV that it is an integer.
8094
8095 void SvIOK_on(SV* sv)
8096
8097 SvIOK_only
8098 Tells an SV that it is an integer and disables all other "OK"
8099 bits.
8100
8101 void SvIOK_only(SV* sv)
8102
8103 SvIOK_only_UV
8104 Tells an SV that it is an unsigned integer and disables all
8105 other "OK" bits.
8106
8107 void SvIOK_only_UV(SV* sv)
8108
8109 SvIOKp Returns a U32 value indicating whether the SV contains an
8110 integer. Checks the private setting. Use "SvIOK" instead.
8111
8112 U32 SvIOKp(SV* sv)
8113
8114 SvIOK_UV
8115 Returns a boolean indicating whether the SV contains an integer
8116 that must be interpreted as unsigned. A non-negative integer
8117 whose value is within the range of both an IV and a UV may be
8118 flagged as either "SvUOK" or "SvIOK".
8119
8120 bool SvIOK_UV(SV* sv)
8121
8122 sv_isa Returns a boolean indicating whether the SV is blessed into the
8123 specified class. This does not check for subtypes; use
8124 "sv_derived_from" to verify an inheritance relationship.
8125
8126 int sv_isa(SV* sv, const char *const name)
8127
8128 SvIsCOW Returns a U32 value indicating whether the SV is Copy-On-Write
8129 (either shared hash key scalars, or full Copy On Write scalars
8130 if 5.9.0 is configured for COW).
8131
8132 U32 SvIsCOW(SV* sv)
8133
8134 SvIsCOW_shared_hash
8135 Returns a boolean indicating whether the SV is Copy-On-Write
8136 shared hash key scalar.
8137
8138 bool SvIsCOW_shared_hash(SV* sv)
8139
8140 sv_isobject
8141 Returns a boolean indicating whether the SV is an RV pointing
8142 to a blessed object. If the SV is not an RV, or if the object
8143 is not blessed, then this will return false.
8144
8145 int sv_isobject(SV* sv)
8146
8147 SvIV Coerces the given SV to IV and returns it. The returned value
8148 in many circumstances will get stored in "sv"'s IV slot, but
8149 not in all cases. (Use "sv_setiv" to make sure it does).
8150
8151 See "SvIVx" for a version which guarantees to evaluate "sv"
8152 only once.
8153
8154 IV SvIV(SV* sv)
8155
8156 SvIV_nomg
8157 Like "SvIV" but doesn't process magic.
8158
8159 IV SvIV_nomg(SV* sv)
8160
8161 SvIV_set
8162 Set the value of the IV pointer in sv to val. It is possible
8163 to perform the same function of this macro with an lvalue
8164 assignment to "SvIVX". With future Perls, however, it will be
8165 more efficient to use "SvIV_set" instead of the lvalue
8166 assignment to "SvIVX".
8167
8168 void SvIV_set(SV* sv, IV val)
8169
8170 SvIVX Returns the raw value in the SV's IV slot, without checks or
8171 conversions. Only use when you are sure "SvIOK" is true. See
8172 also "SvIV".
8173
8174 IV SvIVX(SV* sv)
8175
8176 SvIVx Coerces the given SV to IV and returns it. The returned value
8177 in many circumstances will get stored in "sv"'s IV slot, but
8178 not in all cases. (Use "sv_setiv" to make sure it does).
8179
8180 This form guarantees to evaluate "sv" only once. Only use this
8181 if "sv" is an expression with side effects, otherwise use the
8182 more efficient "SvIV".
8183
8184 IV SvIVx(SV* sv)
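
Why the "evaluate only once" guarantee matters can be shown with an ordinary C macro. This is an illustrative sketch, not Perl's definition of either macro: "ToyIV", "TOY_IV", "toy_ivx", "toy_2iv" and "fetch" are hypothetical names. The macro mentions its argument more than once, so an argument with side effects runs more than once; the function form evaluates it exactly once.

```c
#include <assert.h>
#include <stddef.h>

typedef struct { int iok; long iv; } ToyIV;   /* hypothetical stand-in */

/* Slow path: pretend to convert, then cache the "integer OK" state. */
long toy_2iv(ToyIV *sv)
{
    assert(sv != NULL);
    sv->iok = 1;
    return sv->iv;
}

/* Macro form: the argument appears twice in the expansion. */
#define TOY_IV(sv)  ((sv)->iok ? (sv)->iv : toy_2iv(sv))

/* Function form: the argument is evaluated once, at the call site. */
long toy_ivx(ToyIV *sv) { return TOY_IV(sv); }

/* An argument expression with a side effect: it counts its own calls. */
int fetches = 0;
ToyIV *fetch(ToyIV *sv) { fetches++; return sv; }
```

Here "TOY_IV(fetch(&a))" calls "fetch" twice, once for the test and once for the selected branch, whereas "toy_ivx(fetch(&a))" calls it once. The same reasoning applies to arguments such as "POPs" or "*sp++".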
8185
8186 SvLEN Returns the size of the string buffer in the SV, not including
8187 any part attributable to "SvOOK". See "SvCUR".
8188
8189 STRLEN SvLEN(SV* sv)
8190
8191 sv_len Returns the length of the string in the SV. Handles magic and
8192 type coercion and sets the UTF8 flag appropriately. See also
8193 "SvCUR", which gives raw access to the "xpv_cur" slot.
8194
8195 STRLEN sv_len(SV *const sv)
8196
8197 SvLEN_set
8198 Set the size of the string buffer for the SV. See "SvLEN".
8199
8200 void SvLEN_set(SV* sv, STRLEN len)
8201
8202 sv_len_utf8
8203 Returns the number of characters in the string in an SV,
8204 counting wide UTF-8 bytes as a single character. Handles magic
8205 and type coercion.
8206
8207 STRLEN sv_len_utf8(SV *const sv)
8208
8209 sv_magic
8210 Adds magic to an SV. First upgrades "sv" to type "SVt_PVMG" if
8211 necessary, then adds a new magic item of type "how" to the head
8212 of the magic list.
8213
8214 See "sv_magicext" (which "sv_magic" now calls) for a
8215 description of the handling of the "name" and "namlen"
8216 arguments.
8217
8218 You need to use "sv_magicext" to add magic to "SvREADONLY" SVs
8219 and also to add more than one instance of the same "how".
8220
8221 void sv_magic(SV *const sv, SV *const obj,
8222 const int how, const char *const name,
8223 const I32 namlen)
8224
8225 sv_magicext
8226 Adds magic to an SV, upgrading it if necessary. Applies the
8227 supplied "vtable" and returns a pointer to the magic added.
8228
8229 Note that "sv_magicext" will allow things that "sv_magic" will
8230 not. In particular, you can add magic to "SvREADONLY" SVs, and
8231 add more than one instance of the same "how".
8232
8233 If "namlen" is greater than zero, a "savepvn" copy of "name" is
8234 stored. If "namlen" is zero, "name" is stored as-is. As another
8235 special case, if "(name && namlen == HEf_SVKEY)", then "name" is
8236 assumed to contain an SV* and is stored as-is with its "REFCNT"
8237 incremented.
8238
8239 (This is now used as a subroutine by "sv_magic".)
8240
8241 MAGIC * sv_magicext(SV *const sv, SV *const obj,
8242 const int how,
8243 const MGVTBL *const vtbl,
8244 const char *const name,
8245 const I32 namlen)
8246
8247 SvMAGIC_set
8248 Set the value of the MAGIC pointer in "sv" to val. See
8249 "SvIV_set".
8250
8251 void SvMAGIC_set(SV* sv, MAGIC* val)
8252
8253 sv_mortalcopy
8254 Creates a new SV which is a copy of the original SV (using
8255 "sv_setsv"). The new SV is marked as mortal. It will be
8256 destroyed "soon", either by an explicit call to "FREETMPS", or
8257 by an implicit call at places such as statement boundaries.
8258 See also "sv_newmortal" and "sv_2mortal".
8259
8260 SV* sv_mortalcopy(SV *const oldsv)
8261
8262 sv_newmortal
8263 Creates a new null SV which is mortal. The reference count of
8264 the SV is set to 1. It will be destroyed "soon", either by an
8265 explicit call to "FREETMPS", or by an implicit call at places
8266 such as statement boundaries. See also "sv_mortalcopy" and
8267 "sv_2mortal".
8268
8269 SV* sv_newmortal()
8270
8271 sv_newref
8272 Increment an SV's reference count. Use the "SvREFCNT_inc()"
8273 wrapper instead.
8274
8275 SV* sv_newref(SV *const sv)
8276
8277 SvNIOK Returns a U32 value indicating whether the SV contains a
8278 number, integer or double.
8279
8280 U32 SvNIOK(SV* sv)
8281
8282 SvNIOK_off
8283 Unsets the NV/IV status of an SV.
8284
8285 void SvNIOK_off(SV* sv)
8286
8287 SvNIOKp Returns a U32 value indicating whether the SV contains a
8288 number, integer or double. Checks the private setting. Use
8289 "SvNIOK" instead.
8290
8291 U32 SvNIOKp(SV* sv)
8292
8293 SvNOK Returns a U32 value indicating whether the SV contains a
8294 double.
8295
8296 U32 SvNOK(SV* sv)
8297
8298 SvNOK_off
8299 Unsets the NV status of an SV.
8300
8301 void SvNOK_off(SV* sv)
8302
8303 SvNOK_on
8304 Tells an SV that it is a double.
8305
8306 void SvNOK_on(SV* sv)
8307
8308 SvNOK_only
8309 Tells an SV that it is a double and disables all other OK bits.
8310
8311 void SvNOK_only(SV* sv)
8312
8313 SvNOKp Returns a U32 value indicating whether the SV contains a
8314 double. Checks the private setting. Use "SvNOK" instead.
8315
8316 U32 SvNOKp(SV* sv)
8317
8318 SvNV Coerces the given SV to NV and returns it. The returned value
8319 in many circumstances will get stored in "sv"'s NV slot, but
8320 not in all cases. (Use "sv_setnv" to make sure it does).
8321
8322 See "SvNVx" for a version which guarantees to evaluate "sv"
8323 only once.
8324
8325 NV SvNV(SV* sv)
8326
8327 SvNV_nomg
8328 Like "SvNV" but doesn't process magic.
8329
8330 NV SvNV_nomg(SV* sv)
8331
8332 SvNV_set
8333 Set the value of the NV pointer in "sv" to val. See
8334 "SvIV_set".
8335
8336 void SvNV_set(SV* sv, NV val)
8337
8338 SvNVX Returns the raw value in the SV's NV slot, without checks or
8339 conversions. Only use when you are sure "SvNOK" is true. See
8340 also "SvNV".
8341
8342 NV SvNVX(SV* sv)
8343
8344 SvNVx Coerces the given SV to NV and returns it. The returned value
8345 in many circumstances will get stored in "sv"'s NV slot, but
8346 not in all cases. (Use "sv_setnv" to make sure it does).
8347
8348 This form guarantees to evaluate "sv" only once. Only use this
8349 if "sv" is an expression with side effects, otherwise use the
8350 more efficient "SvNV".
8351
8352 NV SvNVx(SV* sv)
8353
8354 SvOK Returns a U32 value indicating whether the value is defined.
8355 This is only meaningful for scalars.
8356
8357 U32 SvOK(SV* sv)
8358
8359 SvOOK Returns a U32 indicating whether the pointer to the string
8360 buffer is offset. This hack is used internally to speed up
8361 removal of characters from the beginning of a "SvPV". When
8362 "SvOOK" is true, then the start of the allocated string buffer
8363 is actually "SvOOK_offset()" bytes before "SvPVX". This offset
8364 used to be stored in "SvIVX", but is now stored within the
8365 spare part of the buffer.
8366
8367 U32 SvOOK(SV* sv)
8368
8369 SvOOK_offset
8370 Reads into "len" the offset from "SvPVX" back to the true start
8371 of the allocated buffer, which will be non-zero if "sv_chop"
8372 has been used to efficiently remove characters from the start of
8373 the buffer. Implemented as a macro, which takes the address of
8374 "len", which must be of type "STRLEN". Evaluates "sv" more
8375 than once. Sets "len" to 0 if "SvOOK(sv)" is false.
8376
8377 void SvOOK_offset(SV* sv, STRLEN len)
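
The offset trick described under "SvOOK" and "SvOOK_offset" can be pictured with a small model. This is a simplification and not Perl's layout: "ToyStr", "toy_chop" and "toy_alloc_start" are hypothetical, and the offset is kept in a struct field here, whereas the text above says Perl stores it in the spare part of the buffer.

```c
#include <assert.h>
#include <stddef.h>

typedef struct {
    char  *pv;      /* visible start of the string (cf. SvPVX)        */
    size_t cur;     /* visible length              (cf. SvCUR)        */
    size_t len;     /* usable size from pv onward  (cf. SvLEN)        */
    size_t offset;  /* bytes hidden before pv      (cf. SvOOK_offset) */
} ToyStr;

/* Drop `n` leading bytes without moving any data (cf. sv_chop). */
void toy_chop(ToyStr *s, size_t n)
{
    assert(n <= s->cur);
    s->pv     += n;
    s->cur    -= n;
    s->len    -= n;   /* the hidden part no longer counts as usable */
    s->offset += n;
}

/* Recover the address that was originally allocated. */
char *toy_alloc_start(const ToyStr *s)
{
    return s->pv - s->offset;
}
```

Removing a prefix is then a matter of pointer arithmetic rather than a "memmove" of the remaining bytes, at the cost of having to undo the offset before the buffer is freed or reallocated.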
8378
8379 SvPOK Returns a U32 value indicating whether the SV contains a
8380 character string.
8381
8382 U32 SvPOK(SV* sv)
8383
8384 SvPOK_off
8385 Unsets the PV status of an SV.
8386
8387 void SvPOK_off(SV* sv)
8388
8389 SvPOK_on
8390 Tells an SV that it is a string.
8391
8392 void SvPOK_on(SV* sv)
8393
8394 SvPOK_only
8395 Tells an SV that it is a string and disables all other "OK"
8396 bits. Will also turn off the UTF-8 status.
8397
8398 void SvPOK_only(SV* sv)
8399
8400 SvPOK_only_UTF8
8401 Tells an SV that it is a string and disables all other "OK"
8402 bits, and leaves the UTF-8 status as it was.
8403
8404 void SvPOK_only_UTF8(SV* sv)
8405
8406 SvPOKp Returns a U32 value indicating whether the SV contains a
8407 character string. Checks the private setting. Use "SvPOK"
8408 instead.
8409
8410 U32 SvPOKp(SV* sv)
8411
8412 sv_pos_b2u
8413 Converts the value pointed to by "offsetp" from a count of
8414 bytes from the start of the string, to a count of the
8415 equivalent number of UTF-8 chars. Handles magic and type
8416 coercion.
8417
8418 Use "sv_pos_b2u_flags" in preference, which correctly handles
8419 strings longer than 2GB.
8420
8421 void sv_pos_b2u(SV *const sv, I32 *const offsetp)
8422
8423 sv_pos_b2u_flags
8424 Converts "offset" from a count of bytes from the start of the
8425 string, to a count of the equivalent number of UTF-8 chars.
8426 Handles type coercion. "flags" is passed to "SvPV_flags", and
8427 usually should be "SV_GMAGIC|SV_CONST_RETURN" to handle magic.
8428
8429 STRLEN sv_pos_b2u_flags(SV *const sv,
8430 STRLEN const offset, U32 flags)
8431
8432 sv_pos_u2b
8433 Converts the value pointed to by "offsetp" from a count of
8434 UTF-8 chars from the start of the string, to a count of the
8435 equivalent number of bytes; if "lenp" is non-zero, it does the
8436 same to "lenp", but this time starting from the offset, rather
8437 than from the start of the string. Handles magic and type
8438 coercion.
8439
8440 Use "sv_pos_u2b_flags" in preference, which correctly handles
8441 strings longer than 2GB.
8442
8443 void sv_pos_u2b(SV *const sv, I32 *const offsetp,
8444 I32 *const lenp)
8445
8446 sv_pos_u2b_flags
8447 Converts the offset from a count of UTF-8 chars from the start
8448 of the string, to a count of the equivalent number of bytes; if
8449 "lenp" is non-zero, it does the same to "lenp", but this time
8450 starting from "offset", rather than from the start of the
8451 string. Handles type coercion. "flags" is passed to
8452 "SvPV_flags", and usually should be "SV_GMAGIC|SV_CONST_RETURN"
8453 to handle magic.
8454
8455 STRLEN sv_pos_u2b_flags(SV *const sv, STRLEN uoffset,
8456 STRLEN *const lenp, U32 flags)
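
The byte-to-character and character-to-byte conversions performed by the "sv_pos_*" functions rest on one property of UTF-8: continuation bytes have the bit pattern 10xxxxxx, so characters can be counted by counting the bytes that are not continuations. The sketch below shows that idea on a plain C string. "toy_b2u", "toy_u2b" and "is_continuation" are hypothetical helpers; they assume well-formed, "NUL"-terminated input and model neither magic, nor "flags", nor any position caching the real functions may do.

```c
#include <assert.h>
#include <stddef.h>

/* A UTF-8 continuation byte has the bit pattern 10xxxxxx. */
int is_continuation(unsigned char c) { return (c & 0xC0) == 0x80; }

/* Byte offset -> character offset: count the characters that start
 * before `boff`. Assumes `boff` lies on a character boundary. */
size_t toy_b2u(const unsigned char *s, size_t boff)
{
    assert(s != NULL);
    size_t chars = 0;
    for (size_t i = 0; i < boff; i++)
        if (!is_continuation(s[i]))
            chars++;
    return chars;
}

/* Character offset -> byte offset: skip `uoff` whole characters.
 * Stops early at the terminating NUL if the string is shorter. */
size_t toy_u2b(const unsigned char *s, size_t uoff)
{
    assert(s != NULL);
    size_t i = 0;
    while (uoff > 0 && s[i] != '\0') {
        i++;                              /* lead byte      */
        while (is_continuation(s[i]))
            i++;                          /* trailing bytes */
        uoff--;
    }
    return i;
}
```

For the six-byte string "h\xC3\xA9llo" (five characters), byte offset 3 corresponds to character offset 2, and the reverse conversion gives 3 again.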
8457
8458 SvPV Returns a pointer to the string in the SV, or a stringified
8459 form of the SV if the SV does not contain a string. The SV may
8460 cache the stringified version becoming "SvPOK". Handles 'get'
8461 magic. The "len" variable will be set to the length of the
8462 string (this is a macro, so don't use &len). See also "SvPVx"
8463 for a version which guarantees to evaluate "sv" only once.
8464
8465 Note that there is no guarantee that the return value of
8466 "SvPV()" is equal to "SvPVX(sv)", or that "SvPVX(sv)" contains
8467 valid data, or that successive calls to "SvPV(sv)" will return
8468 the same pointer value each time. This is due to the way that
8469 things like overloading and Copy-On-Write are handled. In
8470 these cases, the return value may point to a temporary buffer
8471 or similar. If you absolutely need the "SvPVX" field to be
8472 valid (for example, if you intend to write to it), then see
8473 "SvPV_force".
8474
8475 char* SvPV(SV* sv, STRLEN len)
8476
8477 SvPVbyte
8478 Like "SvPV", but converts "sv" to byte representation first if
8479 necessary.
8480
8481 char* SvPVbyte(SV* sv, STRLEN len)
8482
8483 SvPVbyte_force
8484 Like "SvPV_force", but converts "sv" to byte representation
8485 first if necessary.
8486
8487 char* SvPVbyte_force(SV* sv, STRLEN len)
8488
8489 SvPVbyte_nolen
8490 Like "SvPV_nolen", but converts "sv" to byte representation
8491 first if necessary.
8492
8493 char* SvPVbyte_nolen(SV* sv)
8494
8495 sv_pvbyten_force
8496 The backend for the "SvPVbytex_force" macro. Always use the
8497 macro instead.
8498
8499 char* sv_pvbyten_force(SV *const sv, STRLEN *const lp)
8500
8501 SvPVbytex
8502 Like "SvPV", but converts "sv" to byte representation first if
8503 necessary. Guarantees to evaluate "sv" only once; use the more
8504 efficient "SvPVbyte" otherwise.
8505
8506 char* SvPVbytex(SV* sv, STRLEN len)
8507
8508 SvPVbytex_force
8509 Like "SvPV_force", but converts "sv" to byte representation
8510 first if necessary. Guarantees to evaluate "sv" only once; use
8511 the more efficient "SvPVbyte_force" otherwise.
8512
8513 char* SvPVbytex_force(SV* sv, STRLEN len)
8514
8515 SvPVCLEAR
8516 Ensures that sv is an SVt_PV and that its SvCUR is 0, and that
8517 it is properly null terminated. Equivalent to sv_setpvs(sv, ""),
8518 but more efficient.
8519
8520 char * SvPVCLEAR(SV* sv)
8521
8522 SvPV_force
8523 Like "SvPV" but will force the SV into containing a string
8524 ("SvPOK"), and only a string ("SvPOK_only"), by hook or by
8525 crook. You need force if you are going to update the "SvPVX"
8526 directly. Processes get magic.
8527
8528 Note that coercing an arbitrary scalar into a plain PV will
8529 potentially strip useful data from it. For example if the SV
8530 was "SvROK", then the referent will have its reference count
8531 decremented, and the SV itself may be converted to an "SvPOK"
8532 scalar with a string buffer containing a value such as
8533 "ARRAY(0x1234)".
8534
8535 char* SvPV_force(SV* sv, STRLEN len)
8536
8537 SvPV_force_nomg
8538 Like "SvPV_force", but doesn't process get magic.
8539
8540 char* SvPV_force_nomg(SV* sv, STRLEN len)
8541
8542 SvPV_nolen
8543 Like "SvPV" but doesn't set a length variable.
8544
8545 char* SvPV_nolen(SV* sv)
8546
8547 SvPV_nomg
8548 Like "SvPV" but doesn't process magic.
8549
8550 char* SvPV_nomg(SV* sv, STRLEN len)
8551
8552 SvPV_nomg_nolen
8553 Like "SvPV_nolen" but doesn't process magic.
8554
8555 char* SvPV_nomg_nolen(SV* sv)
8556
8557 sv_pvn_force
8558 Get a sensible string out of the SV somehow. A private
8559 implementation of the "SvPV_force" macro for compilers which
8560 can't cope with complex macro expressions. Always use the
8561 macro instead.
8562
8563 char* sv_pvn_force(SV* sv, STRLEN* lp)
8564
8565 sv_pvn_force_flags
8566 Get a sensible string out of the SV somehow. If "flags" has
8567 the "SV_GMAGIC" bit set, will "mg_get" on "sv" if appropriate,
8568 else not. "sv_pvn_force" and "sv_pvn_force_nomg" are
8569 implemented in terms of this function. You normally want to
8570 use the various wrapper macros instead: see "SvPV_force" and
8571 "SvPV_force_nomg".
8572
8573 char* sv_pvn_force_flags(SV *const sv,
8574 STRLEN *const lp,
8575 const I32 flags)
8576
8577 SvPV_set
8578 This is probably not what you want to use; you probably want
8579 "sv_usepvn_flags" or "sv_setpvn" or "sv_setpvs" instead.
8580
8581 Set the value of the PV pointer in "sv" to the Perl allocated
8582 "NUL"-terminated string "val". See also "SvIV_set".
8583
8584 Remember to free the previous PV buffer. There are many things
8585 to check. Beware that the existing pointer may be involved in
8586 copy-on-write or other mischief, so do "SvOOK_off(sv)" and use
8587 "sv_force_normal" or "SvPV_force" (or check the "SvIsCOW" flag)
8588 first to make sure this modification is safe. Then finally, if
8589 it is not a COW, call "SvPV_free" to free the previous PV
8590 buffer.
8591
8592 void SvPV_set(SV* sv, char* val)
8593
8594 SvPVutf8
8595 Like "SvPV", but converts "sv" to UTF-8 first if necessary.
8596
8597 char* SvPVutf8(SV* sv, STRLEN len)
8598
8599 sv_pvutf8n_force
8600 The backend for the "SvPVutf8x_force" macro. Always use the
8601 macro instead.
8602
8603 char* sv_pvutf8n_force(SV *const sv, STRLEN *const lp)
8604
8605 SvPVutf8x
8606 Like "SvPV", but converts "sv" to UTF-8 first if necessary.
8607 Guarantees to evaluate "sv" only once; use the more efficient
8608 "SvPVutf8" otherwise.
8609
8610 char* SvPVutf8x(SV* sv, STRLEN len)
8611
8612 SvPVutf8x_force
8613 Like "SvPV_force", but converts "sv" to UTF-8 first if
8614 necessary. Guarantees to evaluate "sv" only once; use the more
8615 efficient "SvPVutf8_force" otherwise.
8616
8617 char* SvPVutf8x_force(SV* sv, STRLEN len)
8618
8619 SvPVutf8_force
8620 Like "SvPV_force", but converts "sv" to UTF-8 first if
8621 necessary.
8622
8623 char* SvPVutf8_force(SV* sv, STRLEN len)
8624
8625 SvPVutf8_nolen
8626 Like "SvPV_nolen", but converts "sv" to UTF-8 first if
8627 necessary.
8628
8629 char* SvPVutf8_nolen(SV* sv)
8630
8631 SvPVX Returns a pointer to the physical string in the SV. The SV
8632 must contain a string. Prior to 5.9.3 it is not safe to
8633 execute this macro unless the SV's type >= "SVt_PV".
8634
8635 This is also used to store the name of an autoloaded subroutine
8636 in an XS AUTOLOAD routine. See "Autoloading with XSUBs" in
8637 perlguts.
8638
8639 char* SvPVX(SV* sv)
8640
8641 SvPVx A version of "SvPV" which guarantees to evaluate "sv" only
8642 once. Only use this if "sv" is an expression with side
8643 effects, otherwise use the more efficient "SvPV".
8644
8645 char* SvPVx(SV* sv, STRLEN len)
8646
8647 SvREADONLY
8648 Returns true if the argument is readonly, otherwise returns
8649 false. Exposed to perl code via Internals::SvREADONLY().
8650
8651 U32 SvREADONLY(SV* sv)
8652
8653 SvREADONLY_off
8654 Mark an object as not-readonly. Exactly what this means depends
8655 on the object type. Exposed to perl code via
8656 Internals::SvREADONLY().
8657
8658 U32 SvREADONLY_off(SV* sv)
8659
8660 SvREADONLY_on
8661 Mark an object as readonly. Exactly what this means depends on
8662 the object type. Exposed to perl code via
8663 Internals::SvREADONLY().
8664
8665 U32 SvREADONLY_on(SV* sv)
8666
8667 sv_ref Returns an SV describing what the SV passed in is a reference
8668 to.
8669
8670 dst can be an SV to be set to the description or NULL, in which
8671 case a mortal SV is returned.
8672
8673 If ob is true and the SV is blessed, the description is the
8674 class name, otherwise it is the type of the SV, "SCALAR",
8675 "ARRAY" etc.
8676
8677 SV* sv_ref(SV *dst, const SV *const sv,
8678 const int ob)
8679
8680 SvREFCNT
8681 Returns the value of the object's reference count. Exposed to
8682 perl code via Internals::SvREFCNT().
8683
8684 U32 SvREFCNT(SV* sv)
8685
8686 SvREFCNT_dec
8687 Decrements the reference count of the given SV. "sv" may be
8688 "NULL".
8689
8690 void SvREFCNT_dec(SV* sv)
8691
8692 SvREFCNT_dec_NN
8693 Same as "SvREFCNT_dec", but can only be used if you know "sv"
8694 is not "NULL". Since we don't have to check the NULLness, it's
8695 faster and smaller.
8696
8697 void SvREFCNT_dec_NN(SV* sv)
8698
8699 SvREFCNT_inc
8700 Increments the reference count of the given SV, returning the
8701 SV.
8702
8703 All of the following "SvREFCNT_inc"* macros are optimized
8704 versions of "SvREFCNT_inc", and can be replaced with
8705 "SvREFCNT_inc".
8706
8707 SV* SvREFCNT_inc(SV* sv)
8708
8709 SvREFCNT_inc_NN
8710 Same as "SvREFCNT_inc", but can only be used if you know "sv"
8711 is not "NULL". Since we don't have to check the NULLness, it's
8712 faster and smaller.
8713
8714 SV* SvREFCNT_inc_NN(SV* sv)
8715
8716 SvREFCNT_inc_simple
8717 Same as "SvREFCNT_inc", but can only be used with expressions
8718 without side effects. Since we don't have to store a temporary
8719 value, it's faster.
8720
8721 SV* SvREFCNT_inc_simple(SV* sv)
8722
8723 SvREFCNT_inc_simple_NN
8724 Same as "SvREFCNT_inc_simple", but can only be used if you know
8725 "sv" is not "NULL". Since we don't have to check the NULLness,
8726 it's faster and smaller.
8727
8728 SV* SvREFCNT_inc_simple_NN(SV* sv)
8729
8730 SvREFCNT_inc_simple_void
8731 Same as "SvREFCNT_inc_simple", but can only be used if you
8732 don't need the return value. The macro doesn't need to return
8733 a meaningful value.
8734
8735 void SvREFCNT_inc_simple_void(SV* sv)
8736
8737 SvREFCNT_inc_simple_void_NN
8738 Same as "SvREFCNT_inc", but can only be used if you don't need
8739 the return value, and you know that "sv" is not "NULL". The
8740 macro doesn't need to return a meaningful value, or check for
8741 NULLness, so it's smaller and faster.
8742
8743 void SvREFCNT_inc_simple_void_NN(SV* sv)
8744
8745 SvREFCNT_inc_void
8746 Same as "SvREFCNT_inc", but can only be used if you don't need
8747 the return value. The macro doesn't need to return a
8748 meaningful value.
8749
8750 void SvREFCNT_inc_void(SV* sv)
8751
8752 SvREFCNT_inc_void_NN
8753 Same as "SvREFCNT_inc", but can only be used if you don't need
8754 the return value, and you know that "sv" is not "NULL". The
8755 macro doesn't need to return a meaningful value, or check for
8756 NULLness, so it's smaller and faster.
8757
8758 void SvREFCNT_inc_void_NN(SV* sv)
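
The differences between the reference-count macros above are mostly about which checks can be skipped. The following toy model is a sketch, not Perl's code: "ToyObj", "toy_new", "toy_inc", "toy_inc_nn", "toy_dec" and "toy_freed" are hypothetical names. It shows a NULL-tolerant increment that returns its argument, a variant that omits the NULL check, and a NULL-tolerant decrement that frees the object on the last release.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct { unsigned refcnt; int payload; } ToyObj;  /* hypothetical */

int toy_freed = 0;            /* counts destructions, for demonstration */

ToyObj *toy_new(int payload)
{
    ToyObj *o = malloc(sizeof *o);
    assert(o != NULL);
    o->refcnt  = 1;           /* the creator holds the first reference */
    o->payload = payload;
    return o;
}

/* NULL-tolerant increment that returns its argument (cf. SvREFCNT_inc). */
ToyObj *toy_inc(ToyObj *o)
{
    if (o)
        o->refcnt++;
    return o;
}

/* For callers that know `o` is not NULL (cf. SvREFCNT_inc_NN). */
ToyObj *toy_inc_nn(ToyObj *o)
{
    o->refcnt++;
    return o;
}

/* NULL-tolerant decrement; frees on the last release (cf. SvREFCNT_dec). */
void toy_dec(ToyObj *o)
{
    if (o && --o->refcnt == 0) {
        toy_freed++;
        free(o);
    }
}
```

Because the increment returns its argument, it can be used inline where a pointer is stored, for example "slot = toy_inc(obj);". The "_void" variants described above give up that return value in exchange for slightly less work.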
8759
8760 sv_reftype
8761 Returns a string describing what the SV is a reference to.
8762
8763 If ob is true and the SV is blessed, the string is the class
8764 name, otherwise it is the type of the SV, "SCALAR", "ARRAY"
8765 etc.
8766
8767 const char* sv_reftype(const SV *const sv, const int ob)
8768
8769 sv_replace
8770 Make the first argument a copy of the second, then delete the
8771 original. The target SV physically takes over ownership of the
8772 body of the source SV and inherits its flags; however, the
8773 target keeps any magic it owns, and any magic in the source is
8774 discarded. Note that this is a rather specialist SV copying
8775 operation; most of the time you'll want to use "sv_setsv" or
8776 one of its many macro front-ends.
8777
8778 void sv_replace(SV *const sv, SV *const nsv)
8779
8780 sv_report_used
8781 Dump the contents of all SVs not yet freed (debugging aid).
8782
8783 void sv_report_used()
8784
8785 sv_reset
8786 Underlying implementation for the "reset" Perl function. Note
8787 that the perl-level function is vaguely deprecated.
8788
8789 void sv_reset(const char* s, HV *const stash)
8790
8791 SvROK Tests if the SV is an RV.
8792
8793 U32 SvROK(SV* sv)
8794
8795 SvROK_off
8796 Unsets the RV status of an SV.
8797
8798 void SvROK_off(SV* sv)
8799
8800 SvROK_on
8801 Tells an SV that it is an RV.
8802
8803 void SvROK_on(SV* sv)
8804
8805 SvRV Dereferences an RV to return the SV.
8806
8807 SV* SvRV(SV* sv)
8808
8809 SvRV_set
8810 Set the value of the RV pointer in "sv" to val. See
8811 "SvIV_set".
8812
8813 void SvRV_set(SV* sv, SV* val)
8814
8815 sv_rvunweaken
8816 Unweaken a reference: Clear the "SvWEAKREF" flag on this RV;
8817 remove the backreference to this RV from the array of
8818 backreferences associated with the target SV, increment the
8819 refcount of the target. Silently ignores "undef" and warns on
8820 non-weak references.
8821
8822 SV* sv_rvunweaken(SV *const sv)
8823
8824 sv_rvweaken
8825 Weaken a reference: set the "SvWEAKREF" flag on this RV; give
8826 the referred-to SV "PERL_MAGIC_backref" magic if it hasn't
8827 already; and push a back-reference to this RV onto the array of
8828 backreferences associated with that magic. If the RV is
8829 magical, set magic will be called after the RV is cleared.
8830 Silently ignores "undef" and warns on already-weak references.
8831
8832 SV* sv_rvweaken(SV *const sv)
8833
8834 sv_setiv
8835 Copies an integer into the given SV, upgrading first if
8836 necessary. Does not handle 'set' magic. See also
8837 "sv_setiv_mg".
8838
8839 void sv_setiv(SV *const sv, const IV num)
8840
8841 sv_setiv_mg
8842 Like "sv_setiv", but also handles 'set' magic.
8843
8844 void sv_setiv_mg(SV *const sv, const IV i)
8845
8846 sv_setnv
8847 Copies a double into the given SV, upgrading first if
8848 necessary. Does not handle 'set' magic. See also
8849 "sv_setnv_mg".
8850
8851 void sv_setnv(SV *const sv, const NV num)
8852
8853 sv_setnv_mg
8854 Like "sv_setnv", but also handles 'set' magic.
8855
8856 void sv_setnv_mg(SV *const sv, const NV num)
8857
8858 sv_setpv
8859 Copies a string into an SV. The string must be terminated with
8860 a "NUL" character, and must not contain embedded "NUL"s. Does not
8861 handle 'set' magic. See "sv_setpv_mg".
8862
8863 void sv_setpv(SV *const sv, const char *const ptr)
8864
8865 sv_setpvf
8866 Works like "sv_catpvf" but copies the text into the SV instead
8867 of appending it. Does not handle 'set' magic. See
8868 "sv_setpvf_mg".
8869
8870 void sv_setpvf(SV *const sv, const char *const pat,
8871 ...)
8872
8873 sv_setpvf_mg
8874 Like "sv_setpvf", but also handles 'set' magic.
8875
8876 void sv_setpvf_mg(SV *const sv,
8877 const char *const pat, ...)
8878
8879 sv_setpviv
8880 Copies an integer into the given SV, also updating its string
8881 value. Does not handle 'set' magic. See "sv_setpviv_mg".
8882
8883 void sv_setpviv(SV *const sv, const IV num)
8884
8885 sv_setpviv_mg
8886 Like "sv_setpviv", but also handles 'set' magic.
8887
8888 void sv_setpviv_mg(SV *const sv, const IV iv)
8889
8890 sv_setpvn
8891 Copies a string (possibly containing embedded "NUL" characters)
8892 into an SV. The "len" parameter indicates the number of bytes
8893 to be copied. If the "ptr" argument is NULL the SV will become
8894 undefined. Does not handle 'set' magic. See "sv_setpvn_mg".
8895
8896 void sv_setpvn(SV *const sv, const char *const ptr,
8897 const STRLEN len)
8898
8899 sv_setpvn_mg
8900 Like "sv_setpvn", but also handles 'set' magic.
8901
8902 void sv_setpvn_mg(SV *const sv,
8903 const char *const ptr,
8904 const STRLEN len)
8905
8906 sv_setpvs
8907 Like "sv_setpvn", but takes a literal string instead of a
8908 string/length pair.
8909
8910 void sv_setpvs(SV* sv, "literal string" s)
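
The practical difference between "sv_setpv", "sv_setpvn" and "sv_setpvs" is where the length comes from. The sketch below models that on a plain heap buffer; "ToyPV", "toy_setpv", "toy_setpvn" and "toy_setpvs" are hypothetical stand-ins, and details such as magic, SV upgrading and the NULL-pointer case of "sv_setpvn" are not modelled.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef struct { char *pv; size_t cur; } ToyPV;   /* hypothetical */

/* Copy exactly `len` bytes, embedded NULs included (cf. sv_setpvn). */
void toy_setpvn(ToyPV *d, const char *ptr, size_t len)
{
    char *p = realloc(d->pv, len + 1);   /* +1 for the trailing NUL */
    assert(p != NULL);
    memcpy(p, ptr, len);
    p[len] = '\0';
    d->pv  = p;
    d->cur = len;
}

/* The length comes from strlen(), so the copy stops at the first NUL
 * (cf. sv_setpv). */
void toy_setpv(ToyPV *d, const char *ptr)
{
    toy_setpvn(d, ptr, strlen(ptr));
}

/* Literal form: sizeof yields the length at compile time, and the ""
 * concatenation rejects non-literal arguments (cf. sv_setpvs). */
#define toy_setpvs(d, lit)  toy_setpvn((d), "" lit "", sizeof(lit) - 1)
```

Given the five-byte literal "ab\0cd", the "strlen"-based form copies two bytes, while the explicit-length and literal forms copy all five.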
8911
8912 sv_setpvs_mg
8913 Like "sv_setpvn_mg", but takes a literal string instead of a
8914 string/length pair.
8915
8916 void sv_setpvs_mg(SV* sv, "literal string" s)
8917
8918 sv_setpv_bufsize
8919 Sets the SV to be a string of cur bytes length, with at least
8920 len bytes available. Ensures that there is a null byte at
8921 SvEND. Returns a char * pointer to the SvPV buffer.
8922
8923 char * sv_setpv_bufsize(SV *const sv, const STRLEN cur,
8924 const STRLEN len)
8925
8926 sv_setpv_mg
8927 Like "sv_setpv", but also handles 'set' magic.
8928
8929 void sv_setpv_mg(SV *const sv, const char *const ptr)
8930
8931 sv_setref_iv
8932 Copies an integer into a new SV, optionally blessing the SV.
8933 The "rv" argument will be upgraded to an RV. That RV will be
8934 modified to point to the new SV. The "classname" argument
8935 indicates the package for the blessing. Set "classname" to
8936 "NULL" to avoid the blessing. The new SV will have a reference
8937 count of 1, and the RV will be returned.
8938
8939 SV* sv_setref_iv(SV *const rv,
8940 const char *const classname,
8941 const IV iv)
8942
8943 sv_setref_nv
8944 Copies a double into a new SV, optionally blessing the SV. The
8945 "rv" argument will be upgraded to an RV. That RV will be
8946 modified to point to the new SV. The "classname" argument
8947 indicates the package for the blessing. Set "classname" to
8948 "NULL" to avoid the blessing. The new SV will have a reference
8949 count of 1, and the RV will be returned.
8950
8951 SV* sv_setref_nv(SV *const rv,
8952 const char *const classname,
8953 const NV nv)
8954
8955 sv_setref_pv
8956 Copies a pointer into a new SV, optionally blessing the SV.
8957 The "rv" argument will be upgraded to an RV. That RV will be
8958 modified to point to the new SV. If the "pv" argument is
8959 "NULL", then "PL_sv_undef" will be placed into the SV. The
8960 "classname" argument indicates the package for the blessing.
8961 Set "classname" to "NULL" to avoid the blessing. The new SV
8962 will have a reference count of 1, and the RV will be returned.
8963
8964 Do not use with other Perl types such as HV, AV, SV, CV,
8965 because those objects will become corrupted by the pointer copy
8966 process.
8967
8968 Note that "sv_setref_pvn" copies the string while this copies
8969 the pointer.
8970
8971 SV* sv_setref_pv(SV *const rv,
8972 const char *const classname,
8973 void *const pv)
8974
8975 sv_setref_pvn
8976 Copies a string into a new SV, optionally blessing the SV. The
8977 length of the string must be specified with "n". The "rv"
8978 argument will be upgraded to an RV. That RV will be modified
8979 to point to the new SV. The "classname" argument indicates the
8980 package for the blessing. Set "classname" to "NULL" to avoid
8981 the blessing. The new SV will have a reference count of 1, and
8982 the RV will be returned.
8983
8984 Note that "sv_setref_pv" copies the pointer while this copies
8985 the string.
8986
8987 SV* sv_setref_pvn(SV *const rv,
8988 const char *const classname,
8989 const char *const pv,
8990 const STRLEN n)
8991
8992 sv_setref_pvs
8993 Like "sv_setref_pvn", but takes a literal string instead of a
8994 string/length pair.
8995
8996 SV * sv_setref_pvs(SV *const rv, const char *const classname, "literal string" s)
8997
8998 sv_setref_uv
8999 Copies an unsigned integer into a new SV, optionally blessing
9000 the SV. The "rv" argument will be upgraded to an RV. That RV
9001 will be modified to point to the new SV. The "classname"
9002 argument indicates the package for the blessing. Set
9003 "classname" to "NULL" to avoid the blessing. The new SV will
9004 have a reference count of 1, and the RV will be returned.
9005
9006 SV* sv_setref_uv(SV *const rv,
9007 const char *const classname,
9008 const UV uv)
9009
9010 sv_setsv
9011 Copies the contents of the source SV "sstr" into the destination
9012 SV "dstr". The source SV may be destroyed if it is mortal, so
9013 don't use this function if the source SV needs to be reused.
9014 Does not handle 'set' magic on destination SV. Calls 'get'
9015 magic on source SV. Loosely speaking, it performs a copy-by-
9016 value, obliterating any previous content of the destination.
9017
9018 You probably want to use one of the assortment of wrappers,
9019 such as "SvSetSV", "SvSetSV_nosteal", "SvSetMagicSV" and
9020 "SvSetMagicSV_nosteal".
9021
9022 void sv_setsv(SV *dstr, SV *sstr)
9023
9024 sv_setsv_flags
9025 Copies the contents of the source SV "sstr" into the destination
9026 SV "dstr". The source SV may be destroyed if it is mortal, so
9027 don't use this function if the source SV needs to be reused.
9028 Does not handle 'set' magic. Loosely speaking, it performs a
9029 copy-by-value, obliterating any previous content of the
9030 destination. If the "flags" parameter has the "SV_GMAGIC" bit
9031 set, will "mg_get" on "sstr" if appropriate, else not. If the
9032 "flags" parameter has the "SV_NOSTEAL" bit set then the buffers
9033 of temps will not be stolen. "sv_setsv" and "sv_setsv_nomg"
9034 are implemented in terms of this function.
9035
9036 You probably want to use one of the assortment of wrappers,
9037 such as "SvSetSV", "SvSetSV_nosteal", "SvSetMagicSV" and
9038 "SvSetMagicSV_nosteal".
9039
9040 This is the primary function for copying scalars, and most
9041 other copy-ish functions and macros use this underneath.
9042
9043 void sv_setsv_flags(SV *dsv, SV *ssv,
9044 const I32 flags)
9045
9046 sv_setsv_mg
9047 Like "sv_setsv", but also handles 'set' magic.
9048
9049 void sv_setsv_mg(SV *const dstr, SV *const sstr)
9050
9051 sv_setsv_nomg
9052 Like "sv_setsv" but doesn't process magic.
9053
9054 void sv_setsv_nomg(SV* dsv, SV* ssv)
9055
9056 sv_setuv
9057 Copies an unsigned integer into the given SV, upgrading first
9058 if necessary. Does not handle 'set' magic. See also
9059 "sv_setuv_mg".
9060
9061 void sv_setuv(SV *const sv, const UV num)
9062
9063 sv_setuv_mg
9064 Like "sv_setuv", but also handles 'set' magic.
9065
9066 void sv_setuv_mg(SV *const sv, const UV u)
9067
9068 sv_set_undef
9069 Equivalent to "sv_setsv(sv, &PL_sv_undef)", but more efficient.
9070 Doesn't handle set magic.
9071
9072 The perl equivalent is "$sv = undef;". Note that it doesn't
9073 free any string buffer, unlike "undef $sv".
9074
9075 Introduced in perl 5.25.12.
9076
9077 void sv_set_undef(SV *sv)
9078
9079 SvSTASH Returns the stash of the SV.
9080
9081 HV* SvSTASH(SV* sv)
9082
9083 SvSTASH_set
9084 Set the value of the STASH pointer in "sv" to val. See
9085 "SvIV_set".
9086
9087 void SvSTASH_set(SV* sv, HV* val)
9088
9089 SvTAINT Taints an SV if tainting is enabled, and if some input to the
9090 current expression is tainted--usually a variable, but possibly
9091 also implicit inputs such as locale settings. "SvTAINT"
9092 propagates that taintedness to the outputs of an expression in
9093 a pessimistic fashion; i.e., without paying attention to
9094 precisely which outputs are influenced by which inputs.
9095
9096 void SvTAINT(SV* sv)
9097
9098 SvTAINTED
9099 Checks to see if an SV is tainted. Returns TRUE if it is,
9100 FALSE if not.
9101
9102 bool SvTAINTED(SV* sv)
9103
9104 sv_tainted
9105 Test an SV for taintedness. Use "SvTAINTED" instead.
9106
9107 bool sv_tainted(SV *const sv)
9108
9109 SvTAINTED_off
9110 Untaints an SV. Be very careful with this routine, as it
9111 short-circuits some of Perl's fundamental security features.
9112 XS module authors should not use this function unless they
9113 fully understand all the implications of unconditionally
9114 untainting the value. Untainting should be done in the
9115 standard perl fashion, via a carefully crafted regexp, rather
9116 than directly untainting variables.
9117
9118 void SvTAINTED_off(SV* sv)
9119
9120 SvTAINTED_on
9121 Marks an SV as tainted if tainting is enabled.
9122
9123 void SvTAINTED_on(SV* sv)
9124
9125 SvTRUE Returns a boolean indicating whether Perl would evaluate the SV
9126 as true or false. See "SvOK" for a defined/undefined test.
9127 Handles 'get' magic unless the scalar is already "SvPOK",
9128 "SvIOK" or "SvNOK" (the public, not the private flags).
9129
9130 bool SvTRUE(SV* sv)
9131
9132 sv_true Returns true if the SV has a true value by Perl's rules. Use
9133 the "SvTRUE" macro instead, which may call "sv_true()" or may
9134 instead use an in-line version.
9135
9136 I32 sv_true(SV *const sv)
9137
9138 SvTRUE_nomg
9139 Returns a boolean indicating whether Perl would evaluate the SV
9140 as true or false. See "SvOK" for a defined/undefined test.
9141 Does not handle 'get' magic.
9142
9143 bool SvTRUE_nomg(SV* sv)
9144
9145 SvTYPE Returns the type of the SV. See "svtype".
9146
9147 svtype SvTYPE(SV* sv)
9148
9149 sv_unmagic
9150 Removes all magic of type "type" from an SV.
9151
9152 int sv_unmagic(SV *const sv, const int type)
9153
9154 sv_unmagicext
9155 Removes all magic of type "type" with the specified "vtbl" from
9156 an SV.
9157
9158 int sv_unmagicext(SV *const sv, const int type,
9159 MGVTBL *vtbl)
9160
9161 sv_unref_flags
9162 Unsets the RV status of the SV, and decrements the reference
9163 count of whatever was being referenced by the RV. This can
9164 almost be thought of as a reversal of "newSVrv". The "cflags"
9165 argument can contain "SV_IMMEDIATE_UNREF" to force the
9166 reference count to be decremented (otherwise the decrementing
9167 is conditional on the reference count being different from one
9168 or the reference being a readonly SV). See "SvROK_off".
9169
9170 void sv_unref_flags(SV *const ref, const U32 flags)
9171
9172 sv_untaint
9173 Untaint an SV. Use "SvTAINTED_off" instead.
9174
9175 void sv_untaint(SV *const sv)
9176
9177 SvUOK Returns a boolean indicating whether the SV contains an integer
9178 that must be interpreted as unsigned. A non-negative integer
9179 whose value is within the range of both an IV and a UV may be
9180 flagged as either "SvUOK" or "SvIOK".
9181
9182 bool SvUOK(SV* sv)
9183
9184 SvUPGRADE
9185 Used to upgrade an SV to a more complex form. Uses
9186 "sv_upgrade" to perform the upgrade if necessary. See
9187 "svtype".
9188
9189 void SvUPGRADE(SV* sv, svtype type)
9190
9191 sv_upgrade
9192 Upgrade an SV to a more complex form. Generally adds a new
9193 body type to the SV, then copies across as much information as
9194 possible from the old body. It croaks if the SV is already in
9195 a more complex form than requested. You generally want to use
9196 the "SvUPGRADE" macro wrapper, which checks the type before
9197 calling "sv_upgrade", and hence does not croak. See also
9198 "svtype".
9199
9200 void sv_upgrade(SV *const sv, svtype new_type)
9201
9202 sv_usepvn_flags
9203 Tells an SV to use "ptr" to find its string value. Normally
9204 the string is stored inside the SV, but sv_usepvn allows the SV
9205 to use an outside string. "ptr" should point to memory that
9206 was allocated by "Newx". It must be the start of a "Newx"-ed
9207 block of memory, and not a pointer to the middle of it (beware
9208 of "OOK" and copy-on-write), and not be from a non-"Newx"
9209 memory allocator like "malloc". The string length, "len", must
9210 be supplied. By default this function will "Renew" (i.e.
9211 realloc, move) the memory pointed to by "ptr", so that pointer
9212 should not be freed or used by the programmer after giving it
9213 to "sv_usepvn", and neither should any pointers from "behind"
9214 that pointer (e.g. ptr + 1) be used.
9215
9216 If "flags & SV_SMAGIC" is true, will call "SvSETMAGIC". If
9217 "flags & SV_HAS_TRAILING_NUL" is true, then "ptr[len]" must be
9218 "NUL", and the realloc will be skipped (i.e. the buffer is
9219 actually at least 1 byte longer than "len", and already meets
9220 the requirements for storing in "SvPVX").
9221
9222 void sv_usepvn_flags(SV *const sv, char* ptr,
9223 const STRLEN len,
9224 const U32 flags)
9225
9226 SvUTF8 Returns a U32 value indicating the UTF-8 status of an SV. If
9227 things are set up properly, this indicates whether or not the
9228 SV contains UTF-8 encoded data. You should use this after a
9229 call to "SvPV()" or one of its variants, in case any call to
9230 string overloading updates the internal flag.
9231
9232 If you want to take into account the bytes pragma, use
9233 "DO_UTF8" instead.
9234
9235 U32 SvUTF8(SV* sv)
9236
9237 sv_utf8_decode
9238 If the PV of the SV is an octet sequence in Perl's extended
9239 UTF-8 and contains a multiple-byte character, the "SvUTF8" flag
9240 is turned on so that it looks like a character. If the PV
9241 contains only single-byte characters, the "SvUTF8" flag stays
9242 off. Scans PV for validity and returns FALSE if the PV is
9243 invalid UTF-8.
9244
9245 bool sv_utf8_decode(SV *const sv)
9246
9247 sv_utf8_downgrade
9248 Attempts to convert the PV of an SV from characters to bytes.
9249 If the PV contains a character that cannot fit in a byte, this
9250 conversion will fail; in this case, either returns false or, if
9251 "fail_ok" is not true, croaks.
9252
9253 This is not a general purpose Unicode to byte encoding
9254 interface: use the "Encode" extension for that.
9255
9256 bool sv_utf8_downgrade(SV *const sv,
9257 const bool fail_ok)
9258
9259 sv_utf8_encode
9260 Converts the PV of an SV to UTF-8, but then turns the "SvUTF8"
9261 flag off so that it looks like octets again.
9262
9263 void sv_utf8_encode(SV *const sv)
9264
9265 sv_utf8_upgrade
9266 Converts the PV of an SV to its UTF-8-encoded form. Forces the
9267 SV to string form if it is not already. Will "mg_get" on "sv"
9268 if appropriate. Always sets the "SvUTF8" flag to avoid future
9269 validity checks even if the whole string is the same in UTF-8
9270 as not. Returns the number of bytes in the converted string.
9271
9272 This is not a general purpose byte encoding to Unicode
9273 interface: use the Encode extension for that.
9274
9275 STRLEN sv_utf8_upgrade(SV *sv)
9276
9277 sv_utf8_upgrade_flags
9278 Converts the PV of an SV to its UTF-8-encoded form. Forces the
9279 SV to string form if it is not already. Always sets the SvUTF8
9280 flag to avoid future validity checks even if all the bytes are
9281 invariant in UTF-8. If "flags" has "SV_GMAGIC" bit set, will
9282 "mg_get" on "sv" if appropriate, else not.
9283
9284 The "SV_FORCE_UTF8_UPGRADE" flag is now ignored.
9285
9286 Returns the number of bytes in the converted string.
9287
9288 This is not a general purpose byte encoding to Unicode
9289 interface: use the Encode extension for that.
9290
9291 STRLEN sv_utf8_upgrade_flags(SV *const sv,
9292 const I32 flags)
9293
9294 sv_utf8_upgrade_flags_grow
9295 Like "sv_utf8_upgrade_flags", but has an additional parameter
9296 "extra", which is the number of unused bytes the string of "sv"
9297 is guaranteed to have free after it upon return. This allows
9298 the caller to reserve extra space that it intends to fill, to
9299 avoid extra grows.
9300
9301 "sv_utf8_upgrade", "sv_utf8_upgrade_nomg", and
9302 "sv_utf8_upgrade_flags" are implemented in terms of this
9303 function.
9304
9305 Returns the number of bytes in the converted string (not
9306 including the spares).
9307
9308 STRLEN sv_utf8_upgrade_flags_grow(SV *const sv,
9309 const I32 flags,
9310 STRLEN extra)
9311
9312 sv_utf8_upgrade_nomg
9313 Like "sv_utf8_upgrade", but doesn't do magic on "sv".
9314
9315 STRLEN sv_utf8_upgrade_nomg(SV *sv)
9316
9317 SvUTF8_off
9318 Unsets the UTF-8 status of an SV (the data is not changed, just
9319 the flag). Do not use frivolously.
9320
9321 void SvUTF8_off(SV *sv)
9322
9323 SvUTF8_on
9324 Turn on the UTF-8 status of an SV (the data is not changed,
9325 just the flag). Do not use frivolously.
9326
9327 void SvUTF8_on(SV *sv)
9328
9329 SvUV Coerces the given SV to UV and returns it. The returned value
9330 in many circumstances will get stored in "sv"'s UV slot, but
9331 not in all cases. (Use "sv_setuv" to make sure it does).
9332
9333 See "SvUVx" for a version which guarantees to evaluate "sv"
9334 only once.
9335
9336 UV SvUV(SV* sv)
9337
9338 SvUV_nomg
9339 Like "SvUV" but doesn't process magic.
9340
9341 UV SvUV_nomg(SV* sv)
9342
9343 SvUV_set
9344 Set the value of the UV pointer in "sv" to val. See
9345 "SvIV_set".
9346
9347 void SvUV_set(SV* sv, UV val)
9348
9349 SvUVX Returns the raw value in the SV's UV slot, without checks or
9350 conversions. Only use when you are sure "SvIOK" is true. See
9351 also "SvUV".
9352
9353 UV SvUVX(SV* sv)
9354
9355 SvUVx Coerces the given SV to UV and returns it. The returned value
9356 in many circumstances will get stored in "sv"'s UV slot, but
9357 not in all cases. (Use "sv_setuv" to make sure it does).
9358
9359 This form guarantees to evaluate "sv" only once. Only use this
9360 if "sv" is an expression with side effects, otherwise use the
9361 more efficient "SvUV".
9362
9363 UV SvUVx(SV* sv)
9364
9365 sv_vcatpvf
9366 Processes its arguments like "sv_vcatpvfn" called with a non-
9367 null C-style variable argument list, and appends the formatted
9368 output to an SV. Does not handle 'set' magic. See
9369 "sv_vcatpvf_mg".
9370
9371 Usually used via its frontend "sv_catpvf".
9372
9373 void sv_vcatpvf(SV *const sv, const char *const pat,
9374 va_list *const args)
9375
9376 sv_vcatpvfn
9377 void sv_vcatpvfn(SV *const sv, const char *const pat,
9378 const STRLEN patlen,
9379 va_list *const args,
9380 SV **const svargs,
9381 const Size_t sv_count,
9382 bool *const maybe_tainted)
9383
9384 sv_vcatpvfn_flags
9385 Processes its arguments like "vsprintf" and appends the
9386 formatted output to an SV. Uses an array of SVs if the C-style
9387 variable argument list is missing ("NULL"). Argument reordering
9388 (using format specifiers like "%2$d" or "%*2$d") is supported
9389 only when using an array of SVs; using a C-style "va_list"
9390 argument list with a format string that uses argument
9391 reordering will yield an exception.
9392
9393 When running with taint checks enabled, indicates via
9394 "maybe_tainted" if results are untrustworthy (often due to the
9395 use of locales).
9396
9397 If called as "sv_vcatpvfn" or flags has the "SV_GMAGIC" bit
9398 set, calls get magic.
9399
9400 It assumes that pat has the same utf8-ness as sv. It's the
9401 caller's responsibility to ensure that this is so.
9402
9403 Usually used via one of its frontends "sv_vcatpvf" and
9404 "sv_vcatpvf_mg".
9405
9406 void sv_vcatpvfn_flags(SV *const sv,
9407 const char *const pat,
9408 const STRLEN patlen,
9409 va_list *const args,
9410 SV **const svargs,
9411 const Size_t sv_count,
9412 bool *const maybe_tainted,
9413 const U32 flags)
9414
9415 sv_vcatpvf_mg
9416 Like "sv_vcatpvf", but also handles 'set' magic.
9417
9418 Usually used via its frontend "sv_catpvf_mg".
9419
9420 void sv_vcatpvf_mg(SV *const sv,
9421 const char *const pat,
9422 va_list *const args)
9423
9424 SvVOK Returns a boolean indicating whether the SV contains a
9425 v-string.
9426
9427 bool SvVOK(SV* sv)
9428
9429 sv_vsetpvf
9430 Works like "sv_vcatpvf" but copies the text into the SV instead
9431 of appending it. Does not handle 'set' magic. See
9432 "sv_vsetpvf_mg".
9433
9434 Usually used via its frontend "sv_setpvf".
9435
9436 void sv_vsetpvf(SV *const sv, const char *const pat,
9437 va_list *const args)
9438
9439 sv_vsetpvfn
9440 Works like "sv_vcatpvfn" but copies the text into the SV
9441 instead of appending it.
9442
9443 Usually used via one of its frontends "sv_vsetpvf" and
9444 "sv_vsetpvf_mg".
9445
9446 void sv_vsetpvfn(SV *const sv, const char *const pat,
9447 const STRLEN patlen,
9448 va_list *const args,
9449 SV **const svargs,
9450 const Size_t sv_count,
9451 bool *const maybe_tainted)
9452
9453 sv_vsetpvf_mg
9454 Like "sv_vsetpvf", but also handles 'set' magic.
9455
9456 Usually used via its frontend "sv_setpvf_mg".
9457
9458 void sv_vsetpvf_mg(SV *const sv,
9459 const char *const pat,
9460 va_list *const args)
9461
9463 "Unicode Support" in perlguts has an introduction to this API.
9464
9465 See also "Character classification", and "Character case changing".
9466 Various functions outside this section also work specially with
9467 Unicode. Search for the string "utf8" in this document.
9468
9469 BOM_UTF8
9470 This is a macro that evaluates to a string constant of the
9471 UTF-8 bytes that define the Unicode BYTE ORDER MARK (U+FEFF)
9472 for the platform that perl is compiled on. This allows code to
9473 use a mnemonic for this character that works on both ASCII and
9474 EBCDIC platforms. "sizeof(BOM_UTF8) - 1" can be used to get
9475 its length in bytes.
9476
9477 bytes_cmp_utf8
9478 Compares the sequence of characters (stored as octets) in "b",
9479 "blen" with the sequence of characters (stored as UTF-8) in
9480 "u", "ulen". Returns 0 if they are equal, -1 or -2 if the
9481 first string is less than the second string, +1 or +2 if the
9482 first string is greater than the second string.
9483
9484 -1 or +1 is returned if the shorter string was identical to the
9485 start of the longer string. -2 or +2 is returned if there was
9486 a difference between characters within the strings.
9487
9488 int bytes_cmp_utf8(const U8 *b, STRLEN blen,
9489 const U8 *u, STRLEN ulen)
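
The return-value convention above can be illustrated with a small stand-alone model. This is only a sketch, not perl's implementation: the name "model_bytes_cmp_utf8" is hypothetical, it assumes an ASCII platform where the octets are Latin-1, and it assumes "u" is well-formed UTF-8 (checked with assert rather than reported as an error).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model: compare Latin-1 octets b[0..blen) with the
 * characters encoded as well-formed UTF-8 in u[0..ulen). */
int model_bytes_cmp_utf8(const unsigned char *b, size_t blen,
                         const unsigned char *u, size_t ulen)
{
    const unsigned char *bend = b + blen;
    const unsigned char *uend = u + ulen;

    while (b < bend && u < uend) {
        unsigned int c = *u++;
        if (c >= 0x80) {
            if (c == 0xC2 || c == 0xC3) {
                /* Two-byte sequence encoding U+0080..U+00FF. */
                assert(u < uend && (*u & 0xC0) == 0x80);
                c = ((c & 0x03) << 6) | (*u++ & 0x3F);
            } else {
                /* Code point above 255: greater than any octet. */
                return -2;
            }
        }
        if (*b != c)
            return *b < c ? -2 : +2;    /* difference inside the strings */
        ++b;
    }
    if (b == bend && u == uend)
        return 0;                       /* same characters, same length */
    return b < bend ? +1 : -1;          /* one string is a prefix */
}
```

In this model a result of magnitude 1 means one string ran out first, and a result of magnitude 2 means a character actually differed.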
9490
9491 bytes_from_utf8
9492 NOTE: this function is experimental and may change or be
9493 removed without notice.
9494
9495 Converts a potentially UTF-8 encoded string "s" of length *lenp
9496 into native byte encoding. On input, the boolean *is_utf8p
9497 gives whether or not "s" is actually encoded in UTF-8.
9498
9499 Unlike "utf8_to_bytes" but like "bytes_to_utf8", this is non-
9500 destructive of the input string.
9501
9502 Do nothing if *is_utf8p is 0, or if there are code points in
9503 the string not expressible in native byte encoding. In these
9504 cases, *is_utf8p and *lenp are unchanged, and the return value
9505 is the original "s".
9506
9507 Otherwise, *is_utf8p is set to 0, and the return value is a
9508 pointer to a newly created string containing a downgraded copy
9509 of "s", and whose length is returned in *lenp, updated. The
9510 new string is "NUL"-terminated. The caller is responsible for
9511 arranging for the memory used by this string to get freed.
9512
9513 Upon successful return, the number of variants in the string
9514 can be computed by having saved the value of *lenp before the
9515 call, and subtracting the after-call value of *lenp from it.
9516
9517 U8* bytes_from_utf8(const U8 *s, STRLEN *lenp,
9518 bool *is_utf8p)
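
The "do nothing" and "downgrade" cases above can be sketched as a stand-alone model. It is not perl's code: "model_bytes_from_utf8" is a hypothetical name, an ASCII platform with Latin-1 as the native byte encoding is assumed, and returning NULL on allocation failure is a choice of this sketch.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical model. On an ASCII platform U+0080..U+00FF are encoded
 * in UTF-8 as a 0xC2 or 0xC3 lead byte plus one continuation byte. */
unsigned char *model_bytes_from_utf8(const unsigned char *s, size_t *lenp,
                                     bool *is_utf8p)
{
    size_t i, count = 0;
    unsigned char *d, *out;

    if (!*is_utf8p)
        return (unsigned char *)s;          /* nothing to do */

    /* Pass 1: verify that every character fits in one byte. */
    for (i = 0; i < *lenp; count++) {
        unsigned char c = s[i];
        if (c < 0x80)
            i += 1;
        else if ((c == 0xC2 || c == 0xC3) && i + 1 < *lenp
                 && (s[i + 1] & 0xC0) == 0x80)
            i += 2;
        else
            return (unsigned char *)s;      /* not expressible: unchanged */
    }

    /* Pass 2: write the downgraded, NUL-terminated copy. */
    out = d = malloc(count + 1);
    if (d == NULL)
        return NULL;                        /* assumption of this sketch */
    for (i = 0; i < *lenp; ) {
        unsigned char c = s[i++];
        if (c >= 0x80)
            c = (unsigned char)(((c & 0x03) << 6) | (s[i++] & 0x3F));
        *d++ = c;
    }
    *d = '\0';
    *lenp = count;                          /* updated length in bytes */
    *is_utf8p = false;
    return out;
}
```

A caller can detect the "unchanged" case by comparing the returned pointer with "s", and only then owns (and must free) the result. Saving *lenp before the call and subtracting the new value gives the number of variants, as described above.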
9519
9520 bytes_to_utf8
9521 NOTE: this function is experimental and may change or be
9522 removed without notice.
9523
9524 Converts a string "s" of length *lenp bytes from the native
9525 encoding into UTF-8. Returns a pointer to the newly-created
9526 string, and sets *lenp to reflect the new length in bytes. The
9527 caller is responsible for arranging for the memory used by this
9528 string to get freed.
9529
9530 Upon successful return, the number of variants in the string
9531 can be computed by having saved the value of *lenp before the
9532 call, and subtracting it from the after-call value of *lenp.
9533
9534 A "NUL" character will be written after the end of the string.
9535
9536 If you want to convert to UTF-8 from encodings other than the
9537 native (Latin1 or EBCDIC), see "sv_recode_to_utf8"().
9538
9539 U8* bytes_to_utf8(const U8 *s, STRLEN *lenp)
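
A stand-alone sketch of the conversion and of the variant-count arithmetic follows. It is a model only: "model_bytes_to_utf8" is a hypothetical name, Latin-1 is assumed as the native encoding (an ASCII platform), and NULL is returned if malloc fails, which is a choice of this sketch.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical model: convert Latin-1 bytes s[0..*lenp) to a newly
 * allocated, NUL-terminated UTF-8 string and update *lenp. */
unsigned char *model_bytes_to_utf8(const unsigned char *s, size_t *lenp)
{
    /* Worst case: every byte becomes two, plus the trailing NUL. */
    unsigned char *out = malloc(*lenp * 2 + 1);
    unsigned char *d = out;
    size_t i;

    if (out == NULL)
        return NULL;
    for (i = 0; i < *lenp; i++) {
        unsigned char c = s[i];
        if (c < 0x80) {
            *d++ = c;                                   /* invariant */
        } else {
            *d++ = (unsigned char)(0xC0 | (c >> 6));    /* lead byte */
            *d++ = (unsigned char)(0x80 | (c & 0x3F));  /* continuation */
        }
    }
    *d = '\0';
    *lenp = (size_t)(d - out);
    return out;
}
```

Each variant byte grows by exactly one byte in this model, so the after-call length minus the before-call length is the number of variants.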
9540
9541 DO_UTF8 Returns a bool giving whether or not the PV in "sv" is to be
9542 treated as being encoded in UTF-8.
9543
9544 You should use this after a call to "SvPV()" or one of its
9545 variants, in case any call to string overloading updates the
9546 internal UTF-8 encoding flag.
9547
9548 bool DO_UTF8(SV* sv)
9549
9550 foldEQ_utf8
9551 Returns true if the leading portions of the strings "s1" and
9552 "s2" (either or both of which may be in UTF-8) are the same
9553 case-insensitively; false otherwise. How far into the strings
9554 to compare is determined by other input parameters.
9555
9556 If "u1" is true, the string "s1" is assumed to be in
9557 UTF-8-encoded Unicode; otherwise it is assumed to be in native
9558 8-bit encoding. Correspondingly for "u2" with respect to "s2".
9559
9560 If the byte length "l1" is non-zero, it says how far into "s1"
9561 to check for fold equality. In other words, "s1"+"l1" will be
9562 used as a goal to reach. The scan will not be considered to be
9563 a match unless the goal is reached, and scanning won't continue
9564 past that goal. Correspondingly for "l2" with respect to "s2".
9565
9566 If "pe1" is non-"NULL" and the pointer it points to is not
9567 "NULL", that pointer is considered an end pointer to the
9568 position 1 byte past the maximum point in "s1" beyond which
9569 scanning will not continue under any circumstances. (This
9570 routine assumes that UTF-8 encoded input strings are not
9571 malformed; malformed input can cause it to read past "pe1").
9572 This means that if both "l1" and "pe1" are specified, and "pe1"
9573 is less than "s1"+"l1", the match will never be successful
9574 because it can never get as far as its goal (and in fact is
9575 asserted against). Correspondingly for "pe2" with respect to
9576 "s2".
9577
9578 At least one of "s1" and "s2" must have a goal (at least one of
9579 "l1" and "l2" must be non-zero), and if both do, both have to
9580 be reached for a successful match. Also, if the fold of a
9581 character is multiple characters, all of them must be matched
9582 (see tr21 reference below for 'folding').
9583
9584 Upon a successful match, if "pe1" is non-"NULL", it will be set
9585 to point to the beginning of the next character of "s1" beyond
9586 what was matched. Correspondingly for "pe2" and "s2".
9587
9588 For case-insensitiveness, the "casefolding" of Unicode is used
9589 instead of upper/lowercasing both the characters, see
9590 <http://www.unicode.org/unicode/reports/tr21/> (Case Mappings).
9591
9592 I32 foldEQ_utf8(const char *s1, char **pe1, UV l1,
9593 bool u1, const char *s2, char **pe2,
9594 UV l2, bool u2)
9595
9596 is_ascii_string
9597 This is a misleadingly-named synonym for
9598 "is_utf8_invariant_string". On ASCII-ish platforms, the name
9599 isn't misleading: the ASCII-range characters are exactly the
9600 UTF-8 invariants. But EBCDIC machines have more invariants
9601 than just the ASCII characters, so "is_utf8_invariant_string"
9602 is preferred.
9603
9604 bool is_ascii_string(const U8* const s, STRLEN len)
9605
9606 is_c9strict_utf8_string
9607 Returns TRUE if the first "len" bytes of string "s" form a
9608 valid UTF-8-encoded string that conforms to Unicode Corrigendum
9609 #9 <http://www.unicode.org/versions/corrigendum9.html>;
9610 otherwise it returns FALSE. If "len" is 0, it will be
9611 calculated using strlen(s) (which means if you use this option,
9612 that "s" can't have embedded "NUL" characters and has to have a
9613 terminating "NUL" byte). Note that all characters being ASCII
9614 constitute 'a valid UTF-8 string'.
9615
9616 This function returns FALSE for strings containing any code
9617 points above the Unicode max of 0x10FFFF or surrogate code
9618 points, but accepts non-character code points per Corrigendum
9619 #9 <http://www.unicode.org/versions/corrigendum9.html>.
9620
9621 See also "is_utf8_invariant_string",
9622 "is_utf8_invariant_string_loc", "is_utf8_string",
9623 "is_utf8_string_flags", "is_utf8_string_loc",
9624 "is_utf8_string_loc_flags", "is_utf8_string_loclen",
9625 "is_utf8_string_loclen_flags", "is_utf8_fixed_width_buf_flags",
9626 "is_utf8_fixed_width_buf_loc_flags",
9627 "is_utf8_fixed_width_buf_loclen_flags",
9628 "is_strict_utf8_string", "is_strict_utf8_string_loc",
9629 "is_strict_utf8_string_loclen", "is_c9strict_utf8_string_loc",
9630 and "is_c9strict_utf8_string_loclen".
9631
9632 bool is_c9strict_utf8_string(const U8 *s, STRLEN len)
9633
9634 is_c9strict_utf8_string_loc
9635 Like "is_c9strict_utf8_string" but stores the location of the
9636 failure (in the case of "utf8ness failure") or the location
9637 "s"+"len" (in the case of "utf8ness success") in the "ep"
9638 pointer.
9639
9640 See also "is_c9strict_utf8_string_loclen".
9641
9642 bool is_c9strict_utf8_string_loc(const U8 *s,
9643 STRLEN len,
9644 const U8 **ep)
9645
9646 is_c9strict_utf8_string_loclen
9647 Like "is_c9strict_utf8_string" but stores the location of the
9648 failure (in the case of "utf8ness failure") or the location
9649 "s"+"len" (in the case of "utf8ness success") in the "ep"
9650 pointer, and the number of UTF-8 encoded characters in the "el"
9651 pointer.
9652
9653 See also "is_c9strict_utf8_string_loc".
9654
9655 bool is_c9strict_utf8_string_loclen(const U8 *s,
9656 STRLEN len,
9657 const U8 **ep,
9658 STRLEN *el)
9659
9660 isC9_STRICT_UTF8_CHAR
9661 Evaluates to non-zero if the first few bytes of the string
9662 starting at "s" and looking no further than "e - 1" are well-
9663 formed UTF-8 that represents some Unicode non-surrogate code
9664 point; otherwise it evaluates to 0. If non-zero, the value
9665 gives how many bytes starting at "s" comprise the code point's
9666 representation. Any bytes remaining before "e", but beyond the
9667 ones needed to form the first code point in "s", are not
9668 examined.
9669
9670 The largest acceptable code point is the Unicode maximum
9671 0x10FFFF. This differs from "isSTRICT_UTF8_CHAR" only in that
9672 it accepts non-character code points. This corresponds to
9673 Unicode Corrigendum #9
9674 <http://www.unicode.org/versions/corrigendum9.html>, which
9675 said that non-character code points are merely discouraged
9676 rather than completely forbidden in open interchange. See
9677 "Noncharacter code points" in perlunicode.
9678
9679 Use "isUTF8_CHAR" to check for Perl's extended UTF-8; and
9680 "isUTF8_CHAR_flags" for a more customized definition.
9681
9682 Use "is_c9strict_utf8_string", "is_c9strict_utf8_string_loc",
9683 and "is_c9strict_utf8_string_loclen" to check entire strings.
9684
9685 STRLEN isC9_STRICT_UTF8_CHAR(const U8 *s, const U8 *e)
9686
9687 is_invariant_string
9688 This is a somewhat misleadingly-named synonym for
9689 "is_utf8_invariant_string". "is_utf8_invariant_string" is
9690 preferred, as it indicates under what conditions the string is
9691 invariant.
9692
9693 bool is_invariant_string(const U8* const s,
9694 STRLEN len)
9695
9696 isSTRICT_UTF8_CHAR
9697 Evaluates to non-zero if the first few bytes of the string
9698 starting at "s" and looking no further than "e - 1" are well-
9699 formed UTF-8 that represents some Unicode code point completely
9700 acceptable for open interchange between all applications;
9701 otherwise it evaluates to 0. If non-zero, the value gives how
9702 many bytes starting at "s" comprise the code point's
9703 representation. Any bytes remaining before "e", but beyond the
9704 ones needed to form the first code point in "s", are not
9705 examined.
9706
9707 The largest acceptable code point is the Unicode maximum
9708 0x10FFFF, and must not be a surrogate nor a non-character code
9709 point. Thus this excludes any code point from Perl's extended
9710 UTF-8.
9711
9712 This is used to efficiently decide if the next few bytes in "s"
9713 are legal Unicode-acceptable UTF-8 for a single character.
9714
9715 Use "isC9_STRICT_UTF8_CHAR" to use the Unicode Corrigendum #9
9716 <http://www.unicode.org/versions/corrigendum9.html> definition
9717 of allowable code points; "isUTF8_CHAR" to check for Perl's
9718 extended UTF-8; and "isUTF8_CHAR_flags" for a more customized
9719 definition.
9720
9721 Use "is_strict_utf8_string", "is_strict_utf8_string_loc", and
9722 "is_strict_utf8_string_loclen" to check entire strings.
9723
9724 Size_t isSTRICT_UTF8_CHAR(const U8 * const s0,
9725 const U8 * const e)
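
The difference between the two definitions can be shown on decoded code points. The sketch below is illustrative only: the function names are hypothetical, and it classifies code point values rather than examining UTF-8 bytes as the macros do.

```c
#include <stdbool.h>
#include <stdint.h>

/* Surrogates occupy U+D800..U+DFFF. */
bool is_surrogate(uint32_t cp)
{
    return cp >= 0xD800 && cp <= 0xDFFF;
}

/* Non-characters are U+FDD0..U+FDEF plus the last two code points
 * of every plane (U+xxFFFE and U+xxFFFF). */
bool is_noncharacter(uint32_t cp)
{
    return (cp >= 0xFDD0 && cp <= 0xFDEF)
        || (cp <= 0x10FFFF && (cp & 0xFFFE) == 0xFFFE);
}

/* Corrigendum #9 view: within Unicode and not a surrogate. */
bool c9strict_ok(uint32_t cp)
{
    return cp <= 0x10FFFF && !is_surrogate(cp);
}

/* Strict view: additionally rejects non-characters. */
bool strict_ok(uint32_t cp)
{
    return c9strict_ok(cp) && !is_noncharacter(cp);
}
```

For example, U+FFFF passes "c9strict_ok" but not "strict_ok", while U+D800 and 0x110000 fail both.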
9726
9727 is_strict_utf8_string
9728 Returns TRUE if the first "len" bytes of string "s" form a
9729 valid UTF-8-encoded string that is fully interchangeable by any
9730 application using Unicode rules; otherwise it returns FALSE.
9731 If "len" is 0, it will be calculated using strlen(s) (which
9732 means if you use this option, that "s" can't have embedded
9733 "NUL" characters and has to have a terminating "NUL" byte).
9734 Note that all characters being ASCII constitute 'a valid UTF-8
9735 string'.
9736
9737 This function returns FALSE for strings containing any code
9738 points above the Unicode max of 0x10FFFF, surrogate code
9739 points, or non-character code points.
9740
9741 See also "is_utf8_invariant_string",
9742 "is_utf8_invariant_string_loc", "is_utf8_string",
9743 "is_utf8_string_flags", "is_utf8_string_loc",
9744 "is_utf8_string_loc_flags", "is_utf8_string_loclen",
9745 "is_utf8_string_loclen_flags", "is_utf8_fixed_width_buf_flags",
9746 "is_utf8_fixed_width_buf_loc_flags",
9747 "is_utf8_fixed_width_buf_loclen_flags",
9748 "is_strict_utf8_string_loc", "is_strict_utf8_string_loclen",
9749 "is_c9strict_utf8_string", "is_c9strict_utf8_string_loc", and
9750 "is_c9strict_utf8_string_loclen".
9751
9752 bool is_strict_utf8_string(const U8 *s, STRLEN len)
9753
9754 is_strict_utf8_string_loc
9755 Like "is_strict_utf8_string" but stores the location of the
9756 failure (in the case of "utf8ness failure") or the location
9757 "s"+"len" (in the case of "utf8ness success") in the "ep"
9758 pointer.
9759
9760 See also "is_strict_utf8_string_loclen".
9761
9762 bool is_strict_utf8_string_loc(const U8 *s,
9763 STRLEN len,
9764 const U8 **ep)
9765
9766 is_strict_utf8_string_loclen
9767 Like "is_strict_utf8_string" but stores the location of the
9768 failure (in the case of "utf8ness failure") or the location
9769 "s"+"len" (in the case of "utf8ness success") in the "ep"
9770 pointer, and the number of UTF-8 encoded characters in the "el"
9771 pointer.
9772
9773 See also "is_strict_utf8_string_loc".
9774
9775 bool is_strict_utf8_string_loclen(const U8 *s,
9776 STRLEN len,
9777 const U8 **ep,
9778 STRLEN *el)
9779
9780 is_utf8_fixed_width_buf_flags
9781 Returns TRUE if the fixed-width buffer starting at "s" with
9782 length "len" is entirely valid UTF-8, subject to the
9783 restrictions given by "flags"; otherwise it returns FALSE.
9784
9785 If "flags" is 0, any well-formed UTF-8, as extended by Perl, is
9786 accepted without restriction. If the final few bytes of the
9787 buffer do not form a complete code point, this will return TRUE
9788 anyway, provided that "is_utf8_valid_partial_char_flags"
9789 returns TRUE for them.
9790
9791 If "flags" is non-zero, it can be any combination of the
9792 "UTF8_DISALLOW_foo" flags accepted by "utf8n_to_uvchr", and
9793 with the same meanings.
9794
9795 This function differs from "is_utf8_string_flags" only in that
9796 the latter returns FALSE if the final few bytes of the string
9797 don't form a complete code point.
9798
9799 bool is_utf8_fixed_width_buf_flags(
9800 const U8 * const s, STRLEN len,
9801 const U32 flags
9802 )
9803
9804 is_utf8_fixed_width_buf_loclen_flags
9805 Like "is_utf8_fixed_width_buf_loc_flags" but stores the number
9806 of complete, valid characters found in the "el" pointer.
9807
9808 bool is_utf8_fixed_width_buf_loclen_flags(
9809 const U8 * const s, STRLEN len,
9810 const U8 **ep, STRLEN *el, const U32 flags
9811 )
9812
9813 is_utf8_fixed_width_buf_loc_flags
9814 Like "is_utf8_fixed_width_buf_flags" but stores the location of
9815 the failure in the "ep" pointer. If the function returns TRUE,
9816 *ep will point to the beginning of any partial character at the
9817 end of the buffer; if there is no partial character *ep will
9818 contain "s"+"len".
9819
9820 See also "is_utf8_fixed_width_buf_loclen_flags".
9821
9822 bool is_utf8_fixed_width_buf_loc_flags(
9823 const U8 * const s, STRLEN len,
9824 const U8 **ep, const U32 flags
9825 )
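
How a truncated final character is treated, and where *ep ends up, can be sketched with a simplified stand-alone checker. Everything here is an assumption for illustration: the names are hypothetical, only the structure of standard one- to four-byte sequences is checked (no overlong, surrogate or flag handling), and the "allow_partial" parameter stands in for the choice between the fixed-width-buffer behavior and the whole-string behavior.

```c
#include <stdbool.h>
#include <stddef.h>

/* Sequence length implied by a lead byte; 0 if it cannot start one. */
size_t model_seq_len(unsigned char c)
{
    if (c < 0x80) return 1;
    if (c >= 0xC2 && c <= 0xDF) return 2;
    if (c >= 0xE0 && c <= 0xEF) return 3;
    if (c >= 0xF0 && c <= 0xF4) return 4;
    return 0;
}

/* Hypothetical model: structural check of s[0..len). On failure *ep is
 * the failing position. On success *ep is the start of a trailing
 * partial character, or s + len if there is none. */
bool model_utf8_buf_loc(const unsigned char *s, size_t len,
                        const unsigned char **ep, bool allow_partial)
{
    const unsigned char *p = s, *end = s + len;

    while (p < end) {
        size_t need = model_seq_len(*p);
        size_t avail = (size_t)(end - p);
        size_t n = need < avail ? need : avail;
        size_t i;

        if (need == 0) { *ep = p; return false; }
        for (i = 1; i < n; i++) {
            if ((p[i] & 0xC0) != 0x80) { *ep = p; return false; }
        }
        if (need > avail) {
            /* Valid so far, but cut off by the end of the buffer. */
            *ep = p;
            return allow_partial;
        }
        p += need;
    }
    *ep = end;
    return true;
}
```

With "allow_partial" true the model accepts a buffer that ends in the first two bytes of a three-byte character and leaves *ep at the start of that character, so a caller reading fixed-size chunks knows where to resume.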
9826
9827 is_utf8_invariant_string
9828 Returns TRUE if the first "len" bytes of the string "s" are the
9829 same regardless of the UTF-8 encoding of the string (or UTF-
9830 EBCDIC encoding on EBCDIC machines); otherwise it returns
9831 FALSE. That is, it returns TRUE if they are UTF-8 invariant.
9832 On ASCII-ish machines, all the ASCII characters and only the
9833 ASCII characters fit this definition. On EBCDIC machines, the
9834 ASCII-range characters are invariant, but so also are the C1
9835 controls.
9836
9837 If "len" is 0, it will be calculated using strlen(s), (which
9838 means if you use this option, that "s" can't have embedded
9839 "NUL" characters and has to have a terminating "NUL" byte).
9840
9841 See also "is_utf8_string", "is_utf8_string_flags",
9842 "is_utf8_string_loc", "is_utf8_string_loc_flags",
9843 "is_utf8_string_loclen", "is_utf8_string_loclen_flags",
9844 "is_utf8_fixed_width_buf_flags",
9845 "is_utf8_fixed_width_buf_loc_flags",
9846 "is_utf8_fixed_width_buf_loclen_flags",
9847 "is_strict_utf8_string", "is_strict_utf8_string_loc",
9848 "is_strict_utf8_string_loclen", "is_c9strict_utf8_string",
9849 "is_c9strict_utf8_string_loc", and
9850 "is_c9strict_utf8_string_loclen".
9851
9852 bool is_utf8_invariant_string(const U8* const s,
9853 STRLEN len)
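
The invariance rule described above can be modelled in a few lines of standalone C. This is only a sketch for ASCII platforms, where a byte is UTF-8 invariant exactly when it is below 0x80; it does not call the Perl API, it ignores EBCDIC, and the name "sketch_is_invariant_string" is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the documented behaviour on an ASCII
 * platform; this is not Perl's implementation. */
static int sketch_is_invariant_string(const unsigned char *s, size_t len)
{
    size_t i;

    assert(s != NULL);
    if (len == 0)                 /* documented: length comes from strlen(s) */
        len = strlen((const char *)s);

    for (i = 0; i < len; i++) {
        if (s[i] >= 0x80)         /* non-ASCII byte: a UTF-8 variant */
            return 0;
    }
    return 1;
}
```

Because a "len" of 0 falls back to strlen(), a caller that really has an empty buffer still needs a terminating "NUL" byte at "s".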
9854
9855 is_utf8_invariant_string_loc
9856 Like "is_utf8_invariant_string" but upon failure, stores the
9857 location of the first UTF-8 variant character in the "ep"
9858 pointer; if all characters are UTF-8 invariant, this function
9859 does not change the contents of *ep.
9860
9861 bool is_utf8_invariant_string_loc(const U8* const s,
9862 STRLEN len,
9863 const U8 ** ep)
9864
9865 is_utf8_string
9866 Returns TRUE if the first "len" bytes of string "s" form a
9867 valid Perl-extended-UTF-8 string; returns FALSE otherwise. If
9868 "len" is 0, it will be calculated using strlen(s) (which means
9869 if you use this option, that "s" can't have embedded "NUL"
9870 characters and has to have a terminating "NUL" byte). Note
9871 that all characters being ASCII constitute 'a valid UTF-8
9872 string'.
9873
9874 This function considers Perl's extended UTF-8 to be valid.
9875 That means that code points above Unicode, surrogates, and non-
9876 character code points are considered valid by this function.
9877 Use "is_strict_utf8_string", "is_c9strict_utf8_string", or
9878 "is_utf8_string_flags" to restrict what code points are
9879 considered valid.
9880
9881 See also "is_utf8_invariant_string",
9882 "is_utf8_invariant_string_loc", "is_utf8_string_loc",
9883 "is_utf8_string_loclen", "is_utf8_fixed_width_buf_flags",
9884 "is_utf8_fixed_width_buf_loc_flags", and
9885 "is_utf8_fixed_width_buf_loclen_flags".
9886
9887 bool is_utf8_string(const U8 *s, STRLEN len)
9888
9889 is_utf8_string_flags
9890 Returns TRUE if the first "len" bytes of string "s" form a
9891 valid UTF-8 string, subject to the restrictions imposed by
9892 "flags"; returns FALSE otherwise. If "len" is 0, it will be
9893 calculated using strlen(s) (which means if you use this option,
9894 that "s" can't have embedded "NUL" characters and has to have a
9895 terminating "NUL" byte). Note that all characters being ASCII
9896 constitute 'a valid UTF-8 string'.
9897
9898 If "flags" is 0, this gives the same results as
9899 "is_utf8_string"; if "flags" is
9900 "UTF8_DISALLOW_ILLEGAL_INTERCHANGE", this gives the same
9901 results as "is_strict_utf8_string"; and if "flags" is
9902 "UTF8_DISALLOW_ILLEGAL_C9_INTERCHANGE", this gives the same
9903 results as "is_c9strict_utf8_string". Otherwise "flags" may be
9904 any combination of the "UTF8_DISALLOW_foo" flags understood by
9905 "utf8n_to_uvchr", with the same meanings.
9906
9907 See also "is_utf8_invariant_string",
9908 "is_utf8_invariant_string_loc", "is_utf8_string",
9909 "is_utf8_string_loc", "is_utf8_string_loc_flags",
9910 "is_utf8_string_loclen", "is_utf8_string_loclen_flags",
9911 "is_utf8_fixed_width_buf_flags",
9912 "is_utf8_fixed_width_buf_loc_flags",
9913 "is_utf8_fixed_width_buf_loclen_flags",
9914 "is_strict_utf8_string", "is_strict_utf8_string_loc",
9915 "is_strict_utf8_string_loclen", "is_c9strict_utf8_string",
9916 "is_c9strict_utf8_string_loc", and
9917 "is_c9strict_utf8_string_loclen".
9918
9919 bool is_utf8_string_flags(const U8 *s, STRLEN len,
9920 const U32 flags)
9921
9922 is_utf8_string_loc
9923 Like "is_utf8_string" but stores the location of the failure
9924 (in the case of "utf8ness failure") or the location "s"+"len"
9925 (in the case of "utf8ness success") in the "ep" pointer.
9926
9927 See also "is_utf8_string_loclen".
9928
9929 bool is_utf8_string_loc(const U8 *s,
9930 const STRLEN len,
9931 const U8 **ep)
9932
9933 is_utf8_string_loclen
9934 Like "is_utf8_string" but stores the location of the failure
9935 (in the case of "utf8ness failure") or the location "s"+"len"
9936 (in the case of "utf8ness success") in the "ep" pointer, and
9937 the number of UTF-8 encoded characters in the "el" pointer.
9938
9939 See also "is_utf8_string_loc".
9940
9941 bool is_utf8_string_loclen(const U8 *s, STRLEN len,
9942 const U8 **ep, STRLEN *el)
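
To make the roles of "ep" and "el" concrete, here is a simplified standalone model of the documented contract. It is a sketch under stated assumptions, not Perl's code: it runs on ASCII platforms only, it handles sequences of at most 4 bytes (Perl's extended UTF-8 allows longer ones), it requires non-NULL "ep" and "el", and the "sketch_" names are hypothetical. Like the documented function, it accepts surrogates and non-characters.

```c
#include <assert.h>
#include <stddef.h>

/* Length in bytes of the well-formed character starting at s (looking
 * no further than e - 1), or 0 if there is none.  Simplified model:
 * sequences longer than 4 bytes are treated as invalid. */
static size_t sketch_char_len(const unsigned char *s, const unsigned char *e)
{
    size_t need, i;

    if (s >= e)
        return 0;
    if (s[0] < 0x80)
        return 1;
    if (s[0] >= 0xC2 && s[0] <= 0xDF)
        need = 2;
    else if (s[0] >= 0xE0 && s[0] <= 0xEF)
        need = 3;
    else if (s[0] >= 0xF0 && s[0] <= 0xF7)
        need = 4;
    else
        return 0;   /* continuation byte, overlong C0/C1, or 5+ byte start */

    if ((size_t)(e - s) < need)
        return 0;   /* truncated character */
    for (i = 1; i < need; i++)
        if ((s[i] & 0xC0) != 0x80)
            return 0;   /* expected a continuation byte */
    if (s[0] == 0xE0 && s[1] < 0xA0)
        return 0;   /* overlong 3-byte form */
    if (s[0] == 0xF0 && s[1] < 0x90)
        return 0;   /* overlong 4-byte form */
    return need;
}

/* Model of the "loclen" contract: *ep is s + len on success, or the
 * start of the first bad character on failure; *el is the number of
 * well-formed characters seen before stopping. */
static int sketch_string_loclen(const unsigned char *s, size_t len,
                                const unsigned char **ep, size_t *el)
{
    const unsigned char *p = s, *end = s + len;
    size_t chars = 0, n;

    assert(s != NULL && ep != NULL && el != NULL);
    while (p < end && (n = sketch_char_len(p, end)) != 0) {
        p += n;
        chars++;
    }
    *ep = p;
    *el = chars;
    return p == end;
}
```

On failure, the difference "*ep - s" gives the byte offset of the first bad character, which is usually what an error message needs.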
9943
9944 is_utf8_string_loclen_flags
9945 Like "is_utf8_string_flags" but stores the location of the
9946 failure (in the case of "utf8ness failure") or the location
9947 "s"+"len" (in the case of "utf8ness success") in the "ep"
9948 pointer, and the number of UTF-8 encoded characters in the "el"
9949 pointer.
9950
9951 See also "is_utf8_string_loc_flags".
9952
9953 bool is_utf8_string_loclen_flags(const U8 *s,
9954 STRLEN len,
9955 const U8 **ep,
9956 STRLEN *el,
9957 const U32 flags)
9958
9959 is_utf8_string_loc_flags
9960 Like "is_utf8_string_flags" but stores the location of the
9961 failure (in the case of "utf8ness failure") or the location
9962 "s"+"len" (in the case of "utf8ness success") in the "ep"
9963 pointer.
9964
9965 See also "is_utf8_string_loclen_flags".
9966
9967 bool is_utf8_string_loc_flags(const U8 *s,
9968 STRLEN len,
9969 const U8 **ep,
9970 const U32 flags)
9971
9972 is_utf8_valid_partial_char
9973 Returns 0 if the sequence of bytes starting at "s" and looking
9974 no further than "e - 1" is the UTF-8 encoding, as extended by
9975 Perl, for one or more code points. Otherwise, it returns 1 if
9976 there exists at least one non-empty sequence of bytes that,
9977 when appended to sequence "s" starting at position "e", makes
9978 the entire sequence the well-formed UTF-8 of some code point.
9979 In all other cases it returns 0.
9980
9981 In other words this returns TRUE if "s" points to a partial
9982 UTF-8-encoded code point.
9983
9984 This is useful when a fixed-length buffer is being tested for
9985 being well-formed UTF-8, but the final few bytes in it don't
9986 comprise a full character; that is, it is split somewhere in
9987 the middle of the final code point's UTF-8 representation.
9988 (Presumably when the buffer is refreshed with the next chunk of
9989 data, the new first bytes will complete the partial code
9990 point.) This function is used to verify that the final bytes
9991 in the current buffer are in fact the legal beginning of some
9992 code point, so that if they aren't, the failure can be
9993 signalled without having to wait for the next read.
9994
9995 bool is_utf8_valid_partial_char(const U8 * const s,
9996 const U8 * const e)
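
The three possible situations (complete character, valid partial character, invalid bytes) can be illustrated with a small standalone model. This is a sketch for ASCII platforms that only knows sequences of up to 4 bytes and returns 0 for empty input; it does not call the Perl API, and "sketch_valid_partial_char" is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>

/* Returns 1 only when the bytes in [s, e) are a proper, still
 * completable prefix of a well-formed character; 0 otherwise.
 * Simplified model, not Perl's implementation. */
static int sketch_valid_partial_char(const unsigned char *s,
                                     const unsigned char *e)
{
    size_t need, have, i;

    assert(s != NULL && e != NULL);
    if (s >= e)
        return 0;                 /* nothing to examine */
    have = (size_t)(e - s);

    if (s[0] >= 0xC2 && s[0] <= 0xDF)
        need = 2;
    else if (s[0] >= 0xE0 && s[0] <= 0xEF)
        need = 3;
    else if (s[0] >= 0xF0 && s[0] <= 0xF7)
        need = 4;
    else
        return 0;   /* ASCII is already complete; others cannot start a char */

    if (have >= need)
        return 0;                 /* a complete character (or more) */
    for (i = 1; i < have; i++)
        if ((s[i] & 0xC0) != 0x80)
            return 0;             /* can never become well-formed */
    if (have >= 2) {
        if (s[0] == 0xE0 && s[1] < 0xA0)
            return 0;             /* would be overlong */
        if (s[0] == 0xF0 && s[1] < 0x90)
            return 0;             /* would be overlong */
    }
    return 1;
}
```

A reader loop would typically call such a check on the tail of each buffer and carry the partial bytes over to the front of the next read.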
9997
9998 is_utf8_valid_partial_char_flags
9999 Like "is_utf8_valid_partial_char", it returns a boolean giving
10000 whether or not the input is a valid UTF-8 encoded partial
10001 character, but it takes an extra parameter, "flags", which can
10002 further restrict which code points are considered valid.
10003
10004 If "flags" is 0, this behaves identically to
10005 "is_utf8_valid_partial_char". Otherwise "flags" can be any
10006 combination of the "UTF8_DISALLOW_foo" flags accepted by
10007 "utf8n_to_uvchr". If there is any sequence of bytes that can
10008 complete the input partial character in such a way that a non-
10009 prohibited character is formed, the function returns TRUE;
10010 otherwise FALSE. Non character code points cannot be
10011 determined based on partial character input. But many of the
10012 other possible excluded types can be determined from just the
10013 first one or two bytes.
10014
10015 bool is_utf8_valid_partial_char_flags(
10016 const U8 * const s, const U8 * const e,
10017 const U32 flags
10018 )
10019
10020 isUTF8_CHAR
10021 Evaluates to non-zero if the first few bytes of the string
10022 starting at "s" and looking no further than "e - 1" are well-
10023 formed UTF-8, as extended by Perl, that represents some code
10024 point; otherwise it evaluates to 0. If non-zero, the value
10025 gives how many bytes starting at "s" comprise the code point's
10026 representation. Any bytes remaining before "e", but beyond the
10027 ones needed to form the first code point in "s", are not
10028 examined.
10029
10030 The code point can be any that will fit in an IV on this
10031 machine, using Perl's extension to official UTF-8 to represent
10032 those higher than the Unicode maximum of 0x10FFFF. That means
10033 that this macro is used to efficiently decide if the next few
10034 bytes in "s" are legal UTF-8 for a single character.
10035
10036 Use "isSTRICT_UTF8_CHAR" to restrict the acceptable code points
10037 to those defined by Unicode to be fully interchangeable across
10038 applications; "isC9_STRICT_UTF8_CHAR" to use the Unicode
10039 Corrigendum #9
10040 <http://www.unicode.org/versions/corrigendum9.html> definition
10041 of allowable code points; and "isUTF8_CHAR_flags" for a more
10042 customized definition.
10043
10044 Use "is_utf8_string", "is_utf8_string_loc", and
10045 "is_utf8_string_loclen" to check entire strings.
10046
10047 Note also that a UTF-8 "invariant" character (i.e. ASCII on
10048 non-EBCDIC machines) is a valid UTF-8 character.
10049
10050 STRLEN isUTF8_CHAR(const U8 *s, const U8 *e)
10051
10052 isUTF8_CHAR_flags
10053 Evaluates to non-zero if the first few bytes of the string
10054 starting at "s" and looking no further than "e - 1" are well-
10055 formed UTF-8, as extended by Perl, that represents some code
10056 point, subject to the restrictions given by "flags"; otherwise
10057 it evaluates to 0. If non-zero, the value gives how many bytes
10058 starting at "s" comprise the code point's representation. Any
10059 bytes remaining before "e", but beyond the ones needed to form
10060 the first code point in "s", are not examined.
10061
10062 If "flags" is 0, this gives the same results as "isUTF8_CHAR";
10063 if "flags" is "UTF8_DISALLOW_ILLEGAL_INTERCHANGE", this gives
10064 the same results as "isSTRICT_UTF8_CHAR"; and if "flags" is
10065 "UTF8_DISALLOW_ILLEGAL_C9_INTERCHANGE", this gives the same
10066 results as "isC9_STRICT_UTF8_CHAR". Otherwise "flags" may be
10067 any combination of the "UTF8_DISALLOW_foo" flags understood by
10068 "utf8n_to_uvchr", with the same meanings.
10069
10070 The three alternative macros are for the most commonly needed
10071 validations; they are likely to run somewhat faster than this
10072 more general one, as they can be inlined into your code.
10073
10074 Use "is_utf8_string_flags", "is_utf8_string_loc_flags", and
10075 "is_utf8_string_loclen_flags" to check entire strings.
10076
10077 STRLEN isUTF8_CHAR_flags(const U8 *s, const U8 *e,
10078 const U32 flags)
10079
10080 pv_uni_display
10081 Build to the scalar "dsv" a displayable version of the string
10082 "spv", length "len", the displayable version being at most
10083 "pvlim" bytes long (if longer, the rest is truncated and "..."
10084 will be appended).
10085
10086 The "flags" argument can have "UNI_DISPLAY_ISPRINT" set to
10087 display "isPRINT()"able characters as themselves,
10088 "UNI_DISPLAY_BACKSLASH" to display the "\\[nrfta\\]" as the
10089 backslashed versions (like "\n") ("UNI_DISPLAY_BACKSLASH" is
10090 preferred over "UNI_DISPLAY_ISPRINT" for "\\").
10091 "UNI_DISPLAY_QQ" (and its alias "UNI_DISPLAY_REGEX") have both
10092 "UNI_DISPLAY_BACKSLASH" and "UNI_DISPLAY_ISPRINT" turned on.
10093
10094 The pointer to the PV of the "dsv" is returned.
10095
10096 See also "sv_uni_display".
10097
10098 char* pv_uni_display(SV *dsv, const U8 *spv,
10099 STRLEN len, STRLEN pvlim,
10100 UV flags)
10101
10102 REPLACEMENT_CHARACTER_UTF8
10103 This is a macro that evaluates to a string constant of the
10104 UTF-8 bytes that define the Unicode REPLACEMENT CHARACTER
10105 (U+FFFD) for the platform that perl is compiled on. This
10106 allows code to use a mnemonic for this character that works on
10107 both ASCII and EBCDIC platforms.
10108 "sizeof(REPLACEMENT_CHARACTER_UTF8) - 1" can be used to get its
10109 length in bytes.
10110
10111 sv_cat_decode
10112 "encoding" is assumed to be an "Encode" object, the PV of "ssv"
10113 is assumed to be octets in that encoding and decoding the input
10114 starts from the position which "(PV + *offset)" pointed to.
10115 "dsv" will be concatenated with the decoded UTF-8 string from
10116 "ssv". Decoding will terminate when the string "tstr" appears
10117 in decoding output or the input ends on the PV of "ssv". The
10118 value that "offset" points to will be modified to the last input
10119 position on "ssv".
10120
10121 Returns TRUE if the terminator was found, else returns FALSE.
10122
10123 bool sv_cat_decode(SV* dsv, SV *encoding, SV *ssv,
10124 int *offset, char* tstr, int tlen)
10125
10126 sv_recode_to_utf8
10127 "encoding" is assumed to be an "Encode" object, on entry the PV
10128 of "sv" is assumed to be octets in that encoding, and "sv" will
10129 be converted into Unicode (and UTF-8).
10130
10131 If "sv" already is UTF-8 (or if it is not "POK"), or if
10132 "encoding" is not a reference, nothing is done to "sv". If
10133 "encoding" is not an "Encode::XS" Encoding object, bad things
10134 will happen. (See cpan/Encode/encoding.pm and Encode.)
10135
10136 The PV of "sv" is returned.
10137
10138 char* sv_recode_to_utf8(SV* sv, SV *encoding)
10139
10140 sv_uni_display
10141 Build to the scalar "dsv" a displayable version of the scalar
10142 "sv", the displayable version being at most "pvlim" bytes long
10143 (if longer, the rest is truncated and "..." will be appended).
10144
10145 The "flags" argument is as in "pv_uni_display"().
10146
10147 The pointer to the PV of the "dsv" is returned.
10148
10149 char* sv_uni_display(SV *dsv, SV *ssv, STRLEN pvlim,
10150 UV flags)
10151
10152 to_utf8_fold
10153 DEPRECATED! It is planned to remove this function from a
10154 future release of Perl. Do not use it for new code; remove it
10155 from existing code.
10156
10157 Instead use "toFOLD_utf8_safe".
10158
10159 UV to_utf8_fold(const U8 *p, U8* ustrp,
10160 STRLEN *lenp)
10161
10162 to_utf8_lower
10163 DEPRECATED! It is planned to remove this function from a
10164 future release of Perl. Do not use it for new code; remove it
10165 from existing code.
10166
10167 Instead use "toLOWER_utf8_safe".
10168
10169 UV to_utf8_lower(const U8 *p, U8* ustrp,
10170 STRLEN *lenp)
10171
10172 to_utf8_title
10173 DEPRECATED! It is planned to remove this function from a
10174 future release of Perl. Do not use it for new code; remove it
10175 from existing code.
10176
10177 Instead use "toTITLE_utf8_safe".
10178
10179 UV to_utf8_title(const U8 *p, U8* ustrp,
10180 STRLEN *lenp)
10181
10182 to_utf8_upper
10183 DEPRECATED! It is planned to remove this function from a
10184 future release of Perl. Do not use it for new code; remove it
10185 from existing code.
10186
10187 Instead use "toUPPER_utf8_safe".
10188
10189 UV to_utf8_upper(const U8 *p, U8* ustrp,
10190 STRLEN *lenp)
10191
10192 utf8n_to_uvchr
10193 THIS FUNCTION SHOULD BE USED IN ONLY VERY SPECIALIZED
10194 CIRCUMSTANCES. Most code should use "utf8_to_uvchr_buf"()
10195 rather than call this directly.
10196
10197 Bottom level UTF-8 decode routine. Returns the native code
10198 point value of the first character in the string "s", which is
10199 assumed to be in UTF-8 (or UTF-EBCDIC) encoding, and no longer
10200 than "curlen" bytes; *retlen (if "retlen" isn't NULL) will be
10201 set to the length, in bytes, of that character.
10202
10203 The value of "flags" determines the behavior when "s" does not
10204 point to a well-formed UTF-8 character. If "flags" is 0,
10205 encountering a malformation causes zero to be returned and
10206 *retlen is set so that ("s" + *retlen) is the next possible
10207 position in "s" that could begin a non-malformed character.
10208 Also, if UTF-8 warnings haven't been lexically disabled, a
10209 warning is raised. Some UTF-8 input sequences may contain
10210 multiple malformations. This function tries to find every
10211 possible one in each call, so multiple warnings can be raised
10212 for the same sequence.
10213
10214 Various ALLOW flags can be set in "flags" to allow (and not
10215 warn on) individual types of malformations, such as the
10216 sequence being overlong (that is, when there is a shorter
10217 sequence that can express the same code point; overlong
10218 sequences are expressly forbidden in the UTF-8 standard due to
10219 potential security issues). Another malformation example is
10220 the first byte of a character not being a legal first byte.
10221 See utf8.h for the list of such flags. Even if allowed, this
10222 function generally returns the Unicode REPLACEMENT CHARACTER
10223 when it encounters a malformation. There are flags in utf8.h
10224 to override this behavior for the overlong malformations, but
10225 don't do that except for very specialized purposes.
10226
10227 The "UTF8_CHECK_ONLY" flag overrides the behavior when a non-
10228 allowed (by other flags) malformation is found. If this flag
10229 is set, the routine assumes that the caller will raise a
10230 warning, and this function will silently just set *retlen to
10231 "-1" (cast to "STRLEN") and return zero.
10232
10233 Note that this API requires disambiguation between successfully
10234 decoding a "NUL" character and an error return (unless the
10235 "UTF8_CHECK_ONLY" flag is set), as in both cases, 0 is
10236 returned, and, depending on the malformation, *retlen may be
10237 set to 1. To disambiguate, upon a zero return, see if the
10238 first byte of "s" is 0 as well. If so, the input was a "NUL";
10239 if not, the input had an error. Or you can use
10240 "utf8n_to_uvchr_error".
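
The disambiguation rule just described can be written as a small helper. This standalone sketch does not call the Perl API: the caller is assumed to pass in the value returned by the decoder together with the original "s" and "curlen", and the names "sketch_outcome" and "sketch_classify" are hypothetical. It applies only when "UTF8_CHECK_ONLY" was not set.

```c
#include <assert.h>
#include <stddef.h>

enum sketch_outcome { SKETCH_CHAR, SKETCH_NUL, SKETCH_ERROR };

/* uv is the value the decoder returned for the input at s. */
static enum sketch_outcome
sketch_classify(unsigned long uv, const unsigned char *s, size_t curlen)
{
    assert(s != NULL);
    if (uv != 0)
        return SKETCH_CHAR;       /* an ordinary code point was decoded */
    if (curlen > 0 && s[0] == 0)
        return SKETCH_NUL;        /* zero return, and the input really was NUL */
    return SKETCH_ERROR;          /* zero return from a malformation */
}
```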
10241
10242 Certain code points are considered problematic. These are
10243 Unicode surrogates, Unicode non-characters, and code points
10244 above the Unicode maximum of 0x10FFFF. By default these are
10245 considered regular code points, but certain situations warrant
10246 special handling for them, which can be specified using the
10247 "flags" parameter. If "flags" contains
10248 "UTF8_DISALLOW_ILLEGAL_INTERCHANGE", all three classes are
10249 treated as malformations and handled as such. The flags
10250 "UTF8_DISALLOW_SURROGATE", "UTF8_DISALLOW_NONCHAR", and
10251 "UTF8_DISALLOW_SUPER" (meaning above the legal Unicode maximum)
10252 can be set to disallow these categories individually.
10253 "UTF8_DISALLOW_ILLEGAL_INTERCHANGE" restricts the allowed
10254 inputs to the strict UTF-8 traditionally defined by Unicode.
10255 Use "UTF8_DISALLOW_ILLEGAL_C9_INTERCHANGE" to use the
10256 strictness definition given by Unicode Corrigendum #9
10257 <http://www.unicode.org/versions/corrigendum9.html>. The
10258 difference between traditional strictness and C9 strictness is
10259 that the latter does not forbid non-character code points.
10260 (They are still discouraged, however.) For more discussion see
10261 "Noncharacter code points" in perlunicode.
10262
10263 The flags "UTF8_WARN_ILLEGAL_INTERCHANGE",
10264 "UTF8_WARN_ILLEGAL_C9_INTERCHANGE", "UTF8_WARN_SURROGATE",
10265 "UTF8_WARN_NONCHAR", and "UTF8_WARN_SUPER" will cause warning
10266 messages to be raised for their respective categories, but
10267 otherwise the code points are considered valid (not
10268 malformations). To get a category to both be treated as a
10269 malformation and raise a warning, specify both the WARN and
10270 DISALLOW flags. (But note that warnings are not raised if
10271 lexically disabled nor if "UTF8_CHECK_ONLY" is also specified.)
10272
10273 Extremely high code points were never specified in any
10274 standard, and require an extension to UTF-8 to express, which
10275 Perl does. It is likely that programs written in something
10276 other than Perl would not be able to read files that contain
10277 these; nor would Perl understand files written by something
10278 that uses a different extension. For these reasons, there is a
10279 separate set of flags that can warn and/or disallow these
10280 extremely high code points, even if other above-Unicode ones
10281 are accepted. They are the "UTF8_WARN_PERL_EXTENDED" and
10282 "UTF8_DISALLOW_PERL_EXTENDED" flags. For more information see
10283 ""UTF8_GOT_PERL_EXTENDED"". Of course "UTF8_DISALLOW_SUPER"
10284 will treat all above-Unicode code points, including these, as
10285 malformations. (Note that the Unicode standard considers
10286 anything above 0x10FFFF to be illegal, but there are standards
10287 predating it that allow up to 0x7FFF_FFFF (2**31 - 1).)
10288
10289 A somewhat misleadingly named synonym for
10290 "UTF8_WARN_PERL_EXTENDED" is retained for backward
10291 compatibility: "UTF8_WARN_ABOVE_31_BIT". Similarly,
10292 "UTF8_DISALLOW_ABOVE_31_BIT" is usable instead of the more
10293 accurately named "UTF8_DISALLOW_PERL_EXTENDED". The names are
10294 misleading because these flags can apply to code points that
10295 actually do fit in 31 bits. This happens on EBCDIC platforms,
10296 and sometimes when the overlong malformation is also present.
10297 The new names accurately describe the situation in all cases.
10298
10299 All other code points corresponding to Unicode characters,
10300 including private use and those yet to be assigned, are never
10301 considered malformed and never warn.
10302
10303 UV utf8n_to_uvchr(const U8 *s, STRLEN curlen,
10304 STRLEN *retlen, const U32 flags)
10305
10306 utf8n_to_uvchr_error
10307 THIS FUNCTION SHOULD BE USED IN ONLY VERY SPECIALIZED
10308 CIRCUMSTANCES. Most code should use "utf8_to_uvchr_buf"()
10309 rather than call this directly.
10310
10311 This function is for code that needs to know what the precise
10312 malformation(s) are when an error is found. If you also need
10313 to know the generated warning messages, use
10314 "utf8n_to_uvchr_msgs"() instead.
10315
10316 It is like "utf8n_to_uvchr" but it takes an extra parameter
10317 placed after all the others, "errors". If this parameter is 0,
10318 this function behaves identically to "utf8n_to_uvchr".
10319 Otherwise, "errors" should be a pointer to a "U32" variable,
10320 which this function sets to indicate any errors found. Upon
10321 return, if *errors is 0, there were no errors found.
10322 Otherwise, *errors is the bit-wise "OR" of the bits described
10323 in the list below. Some of these bits will be set if a
10324 malformation is found, even if the input "flags" parameter
10325 indicates that the given malformation is allowed; those
10326 exceptions are noted:
10327
10328 "UTF8_GOT_PERL_EXTENDED"
10329 The input sequence is not standard UTF-8, but a Perl
10330 extension. This bit is set only if the input "flags"
10331 parameter contains either the "UTF8_DISALLOW_PERL_EXTENDED"
10332 or the "UTF8_WARN_PERL_EXTENDED" flags.
10333
10334 Code points above 0x7FFF_FFFF (2**31 - 1) were never
10335 specified in any standard, and so some extension must be
10336 used to express them. Perl uses a natural extension to
10337 UTF-8 to represent the ones up to 2**36-1, and invented a
10338 further extension to represent even higher ones, so that
10339 any code point that fits in a 64-bit word can be
10340 represented. Text using these extensions is not likely to
10341 be portable to non-Perl code. We lump both of these
10342 extensions together and refer to them as Perl extended
10343 UTF-8. There exist other extensions that people have
10344 invented, incompatible with Perl's.
10345
10346 On EBCDIC platforms starting in Perl v5.24, the Perl
10347 extension for representing extremely high code points kicks
10348 in at 0x3FFF_FFFF (2**30 - 1), which is lower than on ASCII.
10349 Prior to that, code points 2**31 and higher were simply
10350 unrepresentable, and a different, incompatible method was
10351 used to represent code points between 2**30 and 2**31 - 1.
10352
10353 On both platforms, ASCII and EBCDIC,
10354 "UTF8_GOT_PERL_EXTENDED" is set if Perl extended UTF-8 is
10355 used.
10356
10357 In earlier Perls, this bit was named
10358 "UTF8_GOT_ABOVE_31_BIT", which you still may use for
10359 backward compatibility. That name is misleading, as this
10360 flag may be set when the code point actually does fit in 31
10361 bits. This happens on EBCDIC platforms, and sometimes when
10362 the overlong malformation is also present. The new name
10363 accurately describes the situation in all cases.
10364
10365 "UTF8_GOT_CONTINUATION"
10366 The input sequence was malformed in that the first byte was
10367 a UTF-8 continuation byte.
10368
10369 "UTF8_GOT_EMPTY"
10370 The input "curlen" parameter was 0.
10371
10372 "UTF8_GOT_LONG"
10373 The input sequence was malformed in that there is some
10374 other sequence that evaluates to the same code point, but
10375 that sequence is shorter than this one.
10376
10377 Until Unicode 3.1, it was legal for programs to accept this
10378 malformation, but it was discovered that this created
10379 security issues.
10380
10381 "UTF8_GOT_NONCHAR"
10382 The code point represented by the input UTF-8 sequence is
10383 for a Unicode non-character code point. This bit is set
10384 only if the input "flags" parameter contains either the
10385 "UTF8_DISALLOW_NONCHAR" or the "UTF8_WARN_NONCHAR" flags.
10386
10387 "UTF8_GOT_NON_CONTINUATION"
10388 The input sequence was malformed in that a non-continuation
10389 type byte was found in a position where only a continuation
10390 type one should be. See also ""UTF8_GOT_SHORT"".
10391
10392 "UTF8_GOT_OVERFLOW"
10393 The input sequence was malformed in that it is for a code
10394 point that is not representable in the number of bits
10395 available in an IV on the current platform.
10396
10397 "UTF8_GOT_SHORT"
10398 The input sequence was malformed in that "curlen" is
10399 smaller than required for a complete sequence. In other
10400 words, the input is for a partial character sequence.
10401
10402 "UTF8_GOT_SHORT" and "UTF8_GOT_NON_CONTINUATION" both
10403 indicate a too short sequence. The difference is that
10404 "UTF8_GOT_NON_CONTINUATION" always indicates that there is
10405 an error, while "UTF8_GOT_SHORT" means that an incomplete
10406 sequence was looked at. If no other flags are present, it
10407 means that the sequence was valid as far as it went.
10408 Depending on the application, this could mean one of three
10409 things:
10410
10411 · The "curlen" length parameter passed in was too small,
10412 and the function was prevented from examining all the
10413 necessary bytes.
10414
10415 · The buffer being looked at is based on reading data,
10416 and the data received so far stopped in the middle of a
10417 character, so that the next read will read the
10418 remainder of this character. (It is up to the caller
10419 to deal with the split bytes somehow.)
10420
10421 · This is a real error, and the partial sequence is all
10422 we're going to get.
10423
10424 "UTF8_GOT_SUPER"
10425 The input sequence was malformed in that it is for a non-
10426 Unicode code point; that is, one above the legal Unicode
10427 maximum. This bit is set only if the input "flags"
10428 parameter contains either the "UTF8_DISALLOW_SUPER" or the
10429 "UTF8_WARN_SUPER" flags.
10430
10431 "UTF8_GOT_SURROGATE"
10432 The input sequence was malformed in that it is for a
10433 Unicode UTF-16 surrogate code point. This bit is set only
10434 if the input "flags" parameter contains either the
10435 "UTF8_DISALLOW_SURROGATE" or the "UTF8_WARN_SURROGATE"
10436 flags.
10437
10438 To do your own error handling, call this function with the
10439 "UTF8_CHECK_ONLY" flag to suppress any warnings, and then
10440 examine the *errors return.
10441
10442 UV utf8n_to_uvchr_error(const U8 *s, STRLEN curlen,
10443 STRLEN *retlen,
10444 const U32 flags,
10445 U32 * errors)
10446
10447 utf8n_to_uvchr_msgs
10448 NOTE: this function is experimental and may change or be
10449 removed without notice.
10450
10451 THIS FUNCTION SHOULD BE USED IN ONLY VERY SPECIALIZED
10452 CIRCUMSTANCES. Most code should use "utf8_to_uvchr_buf"()
10453 rather than call this directly.
10454
10455 This function is for code that needs to know what the precise
10456 malformation(s) are when an error is found, and wants the
10457 corresponding warning and/or error messages to be returned to
10458 the caller rather than be displayed. All messages that would
10459 have been displayed if all lexical warnings are enabled will be
10460 returned.
10461
10462 It is just like "utf8n_to_uvchr_error" but it takes an extra
10463 parameter placed after all the others, "msgs". If this
10464 parameter is 0, this function behaves identically to
10465 "utf8n_to_uvchr_error". Otherwise, "msgs" should be a pointer
10466 to an "AV *" variable, in which this function creates a new AV
10467 to contain any appropriate messages. The elements of the array
10468 are ordered so that the first message that would have been
10469 displayed is in the 0th element, and so on. Each element is a
10470 hash with three key-value pairs, as follows:
10471
10472 "text"
10473 The text of the message as a "SVpv".
10474
10475 "warn_categories"
10476 The warning category (or categories) packed into a "SVuv".
10477
10478 "flag"
10479 A single flag bit associated with this message, in a
10480 "SVuv". The bit corresponds to some bit in the *errors
10481 return value, such as "UTF8_GOT_LONG".
10482
10483 It's important to note that specifying this parameter as non-
10484 null will cause any warnings this function would otherwise
10485 generate to be suppressed, and instead be placed in *msgs. The
10486 caller can check the lexical warnings state (or not) when
10487 choosing what to do with the returned messages.
10488
10489 If the flag "UTF8_CHECK_ONLY" is passed, no warnings are
10490 generated, and hence no AV is created.
10491
10492 The caller, of course, is responsible for freeing any returned
10493 AV.
10494
10495 UV utf8n_to_uvchr_msgs(const U8 *s, STRLEN curlen,
10496 STRLEN *retlen,
10497 const U32 flags,
10498 U32 * errors, AV ** msgs)
10499
10500 utf8n_to_uvuni
10501 Instead use "utf8_to_uvchr_buf", or rarely, "utf8n_to_uvchr".
10502
10503 This function was useful for code that wanted to handle both
10504 EBCDIC and ASCII platforms with Unicode properties, but
10505 starting in Perl v5.20, the distinctions between the platforms
10506 have mostly been made invisible to most code, so this function
10507 is quite unlikely to be what you want. If you do need this
10508 precise functionality, use instead
10509 "NATIVE_TO_UNI(utf8_to_uvchr_buf(...))" or
10510 "NATIVE_TO_UNI(utf8n_to_uvchr(...))".
10511
10512 UV utf8n_to_uvuni(const U8 *s, STRLEN curlen,
10513 STRLEN *retlen, U32 flags)
10514
10515 UTF8SKIP
10516 Returns the number of bytes in the UTF-8 encoded character
10517 whose first (perhaps only) byte is pointed to by "s".
10518
10519 STRLEN UTF8SKIP(char* s)
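
As an illustration of how a byte count can be derived from the first byte alone, the following standalone sketch models the idea for standard UTF-8 on ASCII platforms. It is not the Perl macro: it covers only sequences of up to 4 bytes, returns 0 for start bytes that would begin a longer (Perl-extended) sequence, treats a stray continuation byte as length 1, and uses the hypothetical names "sketch_utf8skip" and "sketch_count_chars". It performs no validation of the bytes that follow.

```c
#include <assert.h>
#include <stddef.h>

/* Number of bytes in the character whose first byte is s[0], judged
 * from that byte only.  Simplified model, not Perl's table. */
static size_t sketch_utf8skip(const unsigned char *s)
{
    assert(s != NULL);
    if (s[0] < 0xC0)
        return 1;   /* ASCII; a stray continuation byte also counts as 1 */
    if (s[0] < 0xE0)
        return 2;
    if (s[0] < 0xF0)
        return 3;
    if (s[0] < 0xF8)
        return 4;
    return 0;       /* outside the range this sketch models */
}

/* Count characters by hopping from first byte to first byte. */
static size_t sketch_count_chars(const unsigned char *s, size_t len)
{
    const unsigned char *end = s + len;
    size_t chars = 0;

    while (s < end) {
        size_t skip = sketch_utf8skip(s);
        if (skip == 0 || skip > (size_t)(end - s))
            break;              /* unmodelled or truncated: stop */
        s += skip;
        chars++;
    }
    return chars;
}
```

Hopping this way is only safe on input already known to be well-formed, which is why the bounds check against "end" is included in the loop.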
10520
10521 utf8_distance
10522 Returns the number of UTF-8 characters between the UTF-8
10523 pointers "a" and "b".
10524
10525 WARNING: use only if you *know* that the pointers point inside
10526 the same UTF-8 buffer.
10527
10528 IV utf8_distance(const U8 *a, const U8 *b)
10529
10530 utf8_hop
10531 Return the UTF-8 pointer "s" displaced by "off" characters,
10532 either forward or backward.
10533
10534 WARNING: do not use the following unless you *know* "off" is
10535 within the UTF-8 data pointed to by "s" *and* that on entry "s"
10536 is aligned on the first byte of a character or just after the
10537 last byte of a character.
10538
10539 U8* utf8_hop(const U8 *s, SSize_t off)
10540
10541 utf8_hop_back
10542 Return the UTF-8 pointer "s" displaced by up to "off"
10543 characters, backward.
10544
10545 "off" must be non-positive.
10546
10547 "s" must be after or equal to "start".
10548
10549 When moving backward it will not move before "start".
10550
10551 Will not exceed this limit even if the string is not valid
10552 "UTF-8".
10553
10554 U8* utf8_hop_back(const U8 *s, SSize_t off,
10555 const U8 *start)
10556
10557 utf8_hop_forward
10558 Return the UTF-8 pointer "s" displaced by up to "off"
10559 characters, forward.
10560
10561 "off" must be non-negative.
10562
10563 "s" must be before or equal to "end".
10564
10565 When moving forward it will not move beyond "end".
10566
10567 Will not exceed this limit even if the string is not valid
10568 "UTF-8".
10569
10570 U8* utf8_hop_forward(const U8 *s, SSize_t off,
10571 const U8 *end)
10572
10573 utf8_hop_safe
10574 Return the UTF-8 pointer "s" displaced by up to "off"
10575 characters, either forward or backward.
10576
10577 When moving backward it will not move before "start".
10578
10579 When moving forward it will not move beyond "end".
10580
10581 Will not exceed those limits even if the string is not valid
10582 "UTF-8".
10583
10584 U8* utf8_hop_safe(const U8 *s, SSize_t off,
10585 const U8 *start, const U8 *end)
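
The bounded behaviour described above can be pictured with the following standalone sketch. It assumes an ASCII platform and identifies continuation bytes by their top two bits. "sketch_hop_safe" and "demo_hop" are hypothetical names, and this is not Perl's implementation.

```c
#include <assert.h>
#include <stddef.h>

/* True if b is a UTF-8 continuation byte (10xxxxxx), ASCII platforms. */
static int is_continuation(unsigned char b)
{
    return (b & 0xC0) == 0x80;
}

/* Hypothetical bounded hop: move s by up to off characters while
 * staying inside [start, end], even if the bytes are malformed. */
static const unsigned char *
sketch_hop_safe(const unsigned char *s, ptrdiff_t off,
                const unsigned char *start, const unsigned char *end)
{
    assert(start <= s && s <= end);
    while (off > 0 && s < end) {
        s++;                                    /* step over the lead byte */
        while (s < end && is_continuation(*s))  /* then its continuations */
            s++;
        off--;
    }
    while (off < 0 && s > start) {
        s--;                                    /* step back at least one byte */
        while (s > start && is_continuation(*s))
            s--;                                /* until a lead byte is found */
        off++;
    }
    return s;
}

/* Usage: byte offset reached when hopping inside the 4-byte buffer
 * "a", U+00E9 (two bytes), "z". */
static ptrdiff_t demo_hop(ptrdiff_t from, ptrdiff_t off)
{
    static const unsigned char buf[] = "a\xC3\xA9z";
    return sketch_hop_safe(buf + from, off, buf, buf + 4) - buf;
}
```

An oversized displacement simply stops at the corresponding boundary rather than running off the buffer.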
10586
10587 UTF8_IS_INVARIANT
10588 Evaluates to 1 if the byte "c" represents the same character
10589 when encoded in UTF-8 as when not; otherwise evaluates to 0.
10590 UTF-8 invariant characters can be copied as-is when converting
10591 to/from UTF-8, saving time.
10592
10593 In spite of the name, this macro gives the correct result if
10594 the input string from which "c" comes is not encoded in UTF-8.
10595
10596 See "UVCHR_IS_INVARIANT" for checking if a UV is invariant.
10597
10598 bool UTF8_IS_INVARIANT(char c)
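
A minimal sketch of the idea, assuming an ASCII platform where the invariant bytes are exactly those below 0x80 (EBCDIC platforms use a different set). The names "sketch_is_invariant" and "sketch_count_variants" are hypothetical and not part of the Perl API.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical ASCII-platform stand-in: a byte is invariant when it is
 * below 0x80, because it is then identical with or without UTF-8. */
static int sketch_is_invariant(unsigned char c)
{
    return c < 0x80;
}

/* Count the bytes in [s, s + len) whose representation would change
 * if the native byte string were converted to UTF-8. */
static size_t sketch_count_variants(const unsigned char *s, size_t len)
{
    size_t n = 0;
    size_t i;

    assert(s != NULL);
    for (i = 0; i < len; i++)
        if (!sketch_is_invariant(s[i]))
            n++;
    return n;
}
```

Under the same assumption, each variant byte takes two bytes in UTF-8, so a converted string would need "len" plus the variant count bytes.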
10599
10600 UTF8_IS_NONCHAR
10601 Evaluates to non-zero if the first few bytes of the string
10602 starting at "s" and looking no further than "e - 1" are well-
10603 formed UTF-8 that represents one of the Unicode non-character
10604 code points; otherwise it evaluates to 0. If non-zero, the
10605 value gives how many bytes starting at "s" comprise the code
10606 point's representation.
10607
10608 bool UTF8_IS_NONCHAR(const U8 *s, const U8 *e)
10609
10610 UTF8_IS_SUPER
10611 Recall that Perl recognizes an extension to UTF-8 that can
10612 encode code points larger than the ones defined by Unicode,
10613 which are 0..0x10FFFF.
10614
10615 This macro evaluates to non-zero if the first few bytes of the
10616 string starting at "s" and looking no further than "e - 1" are
10617 from this UTF-8 extension; otherwise it evaluates to 0. If
10618 non-zero, the value gives how many bytes starting at "s"
10619 comprise the code point's representation.
10620
10621 0 is returned if the bytes are not well-formed extended UTF-8,
10622 or if they represent a code point that cannot fit in a UV on
10623 the current platform. Hence this macro can give different
10624 results when run on a 64-bit word machine than on one with a
10625 32-bit word size.
10626
10627 Note that it is illegal to have code points that are larger
10628 than what can fit in an IV on the current machine.
10629
10630 bool UTF8_IS_SUPER(const U8 *s, const U8 *e)
10631
10632 UTF8_IS_SURROGATE
10633 Evaluates to non-zero if the first few bytes of the string
10634 starting at "s" and looking no further than "e - 1" are well-
10635 formed UTF-8 that represents one of the Unicode surrogate code
10636 points; otherwise it evaluates to 0. If non-zero, the value
10637 gives how many bytes starting at "s" comprise the code point's
10638 representation.
10639
10640 bool UTF8_IS_SURROGATE(const U8 *s, const U8 *e)
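
The return convention (a byte count rather than a plain boolean) can be illustrated with this standalone sketch for ASCII platforms. It relies on the fact that U+D800..U+DFFF encode as 0xED, then a byte in 0xA0..0xBF, then one continuation byte. "sketch_is_surrogate" and "demo_surrogate" are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical check: returns 3 (the encoded length) if [s, e) starts
 * with the UTF-8 form of a surrogate, otherwise 0. It never reads at
 * or beyond e. */
static size_t sketch_is_surrogate(const unsigned char *s,
                                  const unsigned char *e)
{
    assert(s != NULL && s <= e);
    if (e - s < 3)                    return 0;  /* not enough bytes */
    if (s[0] != 0xED)                 return 0;  /* lead byte of U+D000..U+DFFF */
    if (s[1] < 0xA0 || s[1] > 0xBF)   return 0;  /* selects U+D800..U+DFFF */
    if ((s[2] & 0xC0) != 0x80)        return 0;  /* must be a continuation */
    return 3;
}

/* Usage: test the first len bytes of a string literal. */
static size_t demo_surrogate(const char *bytes, size_t len)
{
    const unsigned char *s = (const unsigned char *)bytes;
    return sketch_is_surrogate(s, s + len);
}
```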
10641
10642 utf8_length
10643 Returns the number of characters in the sequence of
10644 UTF-8-encoded bytes starting at "s" and ending at the byte just
10645 before "e". If "s" and "e" point to the same place, it returns
10646 0 with no warning raised.
10647
10648 If "e < s" or if the scan would end up past "e", it raises a
10649 UTF8 warning and returns the number of valid characters.
10650
10651 STRLEN utf8_length(const U8* s, const U8 *e)
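
The counting rule can be sketched in standalone C as follows, assuming standard UTF-8 on an ASCII platform. "lead_len", "sketch_utf8_length" and "demo_length" are hypothetical names; the real function additionally raises the UTF8 warning described above, which this sketch does not model.

```c
#include <assert.h>
#include <stddef.h>

/* Sequence length implied by a lead byte (standard UTF-8 only).
 * Stray continuation bytes and invalid lead bytes count as length 1. */
static size_t lead_len(unsigned char c)
{
    if (c < 0x80)           return 1;
    if ((c & 0xE0) == 0xC0) return 2;
    if ((c & 0xF0) == 0xE0) return 3;
    if ((c & 0xF8) == 0xF0) return 4;
    return 1;
}

/* Hypothetical counterpart of utf8_length: counts whole characters in
 * [s, e). A final character that would extend past e is not counted. */
static size_t sketch_utf8_length(const unsigned char *s,
                                 const unsigned char *e)
{
    size_t n = 0;

    assert(s != NULL && e != NULL);
    while (s < e) {
        size_t skip = lead_len(*s);
        if ((size_t)(e - s) < skip)
            break;              /* truncated final character */
        s += skip;
        n++;
    }
    return n;
}

/* Usage: count the characters in the first len bytes of a literal. */
static size_t demo_length(const char *bytes, size_t len)
{
    const unsigned char *s = (const unsigned char *)bytes;
    return sketch_utf8_length(s, s + len);
}
```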
10652
10653 UTF8_SAFE_SKIP
10654 Returns 0 if "s >= e"; otherwise returns the number of bytes in
10655 the UTF-8 encoded character whose first byte is pointed to by
10656 "s", but never a count that would reach beyond "e". On DEBUGGING
10657 builds, it asserts that "s <= e".
10658
10659 STRLEN UTF8_SAFE_SKIP(char* s, char* e)
10660
10661 utf8_to_bytes
10662 NOTE: this function is experimental and may change or be
10663 removed without notice.
10664
10665 Converts a string "s" of length *lenp from UTF-8 into native
10666 byte encoding. Unlike "bytes_to_utf8", this over-writes the
10667 original string, and updates *lenp to contain the new length.
10668 Returns zero on failure, leaving "s" unchanged and setting *lenp
10669 to -1.
10670
10671 Upon successful return, the number of variants in the string
10672 can be computed by having saved the value of *lenp before the
10673 call, and subtracting the after-call value of *lenp from it.
10674
10675 If you need a copy of the string, see "bytes_from_utf8".
10676
10677 U8* utf8_to_bytes(U8 *s, STRLEN *lenp)
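
The in-place conversion and the length bookkeeping can be pictured with the standalone sketch below. It assumes an ASCII platform, where the only downgradeable multi-byte sequences are the two-byte ones led by 0xC2 or 0xC3. "sketch_utf8_to_bytes" and "demo_downgrade" are hypothetical names, and this is not Perl's implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical in-place downgrade. On success returns s and stores the
 * new length in *lenp; on failure returns NULL, leaves s untouched and
 * sets *lenp to (size_t)-1. */
static unsigned char *sketch_utf8_to_bytes(unsigned char *s, size_t *lenp)
{
    size_t len, i, out = 0;

    assert(s != NULL && lenp != NULL);
    len = *lenp;

    /* Pass 1: check that every character fits in one byte, so that a
     * failure is detected before anything is modified. */
    for (i = 0; i < len; i++) {
        if (s[i] < 0x80)
            continue;
        if ((s[i] == 0xC2 || s[i] == 0xC3)
            && i + 1 < len && (s[i + 1] & 0xC0) == 0x80) {
            i++;
            continue;
        }
        *lenp = (size_t)-1;
        return NULL;
    }

    /* Pass 2: rewrite in place; the output index never passes the input. */
    for (i = 0; i < len; i++) {
        if (s[i] < 0x80) {
            s[out++] = s[i];
        } else {
            s[out++] = (unsigned char)(((s[i] & 0x03) << 6)
                                       | (s[i + 1] & 0x3F));
            i++;
        }
    }
    if (out < len)
        s[out] = '\0';          /* terminate only when there is room */
    *lenp = out;
    return s;
}

/* Usage: returns the length after conversion, or (size_t)-1 on failure. */
static size_t demo_downgrade(const char *utf8, size_t len)
{
    unsigned char buf[16];

    assert(len < sizeof buf);
    memcpy(buf, utf8, len);
    sketch_utf8_to_bytes(buf, &len);
    return len;
}
```

Saving the length before the call and subtracting the length afterwards gives the number of variant characters, as noted above.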
10678
10679 utf8_to_uvchr
10680 DEPRECATED! It is planned to remove this function from a
10681 future release of Perl. Do not use it for new code; remove it
10682 from existing code.
10683
10684 Returns the native code point of the first character in the
10685 string "s" which is assumed to be in UTF-8 encoding; "retlen"
10686 will be set to the length, in bytes, of that character.
10687
10688 Some, but not all, UTF-8 malformations are detected, and in
10689 fact, some malformed input could cause reading beyond the end
10690 of the input buffer, which is why this function is deprecated.
10691 Use "utf8_to_uvchr_buf" instead.
10692
10693 If "s" points to one of the detected malformations, and UTF8
10694 warnings are enabled, zero is returned and *retlen is set (if
10695 "retlen" isn't "NULL") to -1. If those warnings are off, the
10696 computed value if well-defined (or the Unicode REPLACEMENT
10697 CHARACTER, if not) is silently returned, and *retlen is set (if
10698 "retlen" isn't NULL) so that ("s" + *retlen) is the next
10699 possible position in "s" that could begin a non-malformed
10700 character. See "utf8n_to_uvchr" for details on when the
10701 REPLACEMENT CHARACTER is returned.
10702
10703 UV utf8_to_uvchr(const U8 *s, STRLEN *retlen)
10704
10705 utf8_to_uvchr_buf
10706 Returns the native code point of the first character in the
10707 string "s" which is assumed to be in UTF-8 encoding; "send"
10708 points to 1 beyond the end of "s". *retlen will be set to the
10709 length, in bytes, of that character.
10710
10711 If "s" does not point to a well-formed UTF-8 character and UTF8
10712 warnings are enabled, zero is returned and *retlen is set (if
10713 "retlen" isn't "NULL") to -1. If those warnings are off, the
10714 computed value, if well-defined (or the Unicode REPLACEMENT
10715 CHARACTER if not), is silently returned, and *retlen is set (if
10716 "retlen" isn't "NULL") so that ("s" + *retlen) is the next
10717 possible position in "s" that could begin a non-malformed
10718 character. See "utf8n_to_uvchr" for details on when the
10719 REPLACEMENT CHARACTER is returned.
10720
10721 UV utf8_to_uvchr_buf(const U8 *s, const U8 *send,
10722 STRLEN *retlen)
10723
10724 utf8_to_uvuni_buf
10725 DEPRECATED! It is planned to remove this function from a
10726 future release of Perl. Do not use it for new code; remove it
10727 from existing code.
10728
10729 Only in very rare circumstances should code need to be dealing
10730 in Unicode (as opposed to native) code points. In those few
10731 cases, use "NATIVE_TO_UNI(utf8_to_uvchr_buf(...))" instead. If
10732 you are not absolutely sure this is one of those cases, then
10733 assume it isn't and use plain "utf8_to_uvchr_buf" instead.
10734
10735 Returns the Unicode (not-native) code point of the first
10736 character in the string "s" which is assumed to be in UTF-8
10737 encoding; "send" points to 1 beyond the end of "s". "retlen"
10738 will be set to the length, in bytes, of that character.
10739
10740 If "s" does not point to a well-formed UTF-8 character and UTF8
10741 warnings are enabled, zero is returned and *retlen is set (if
10742 "retlen" isn't NULL) to -1. If those warnings are off, the
10743 computed value if well-defined (or the Unicode REPLACEMENT
10744 CHARACTER, if not) is silently returned, and *retlen is set (if
10745 "retlen" isn't NULL) so that ("s" + *retlen) is the next
10746 possible position in "s" that could begin a non-malformed
10747 character. See "utf8n_to_uvchr" for details on when the
10748 REPLACEMENT CHARACTER is returned.
10749
10750 UV utf8_to_uvuni_buf(const U8 *s, const U8 *send,
10751 STRLEN *retlen)
10752
10753 UVCHR_IS_INVARIANT
10754 Evaluates to 1 if the representation of code point "cp" is the
10755 same whether or not it is encoded in UTF-8; otherwise evaluates
10756 to 0. UTF-8 invariant characters can be copied as-is when
10757 converting to/from UTF-8, saving time. "cp" is Unicode if
10758 above 255; otherwise is platform-native.
10759
10760 bool UVCHR_IS_INVARIANT(UV cp)
10761
10762 UVCHR_SKIP
10763 Returns the number of bytes required to represent the code
10764 point "cp" when encoded as UTF-8. "cp" is a native (ASCII or
10765 EBCDIC) code point if less than 256; a Unicode code point
10766 otherwise.
10767
10768 STRLEN UVCHR_SKIP(UV cp)
10769
10770 uvchr_to_utf8
10771 Adds the UTF-8 representation of the native code point "uv" to
10772 the end of the string "d"; "d" should have at least
10773 "UVCHR_SKIP(uv)+1" (up to "UTF8_MAXBYTES+1") free bytes
10774 available. The return value is the pointer to the byte after
10775 the end of the new character. In other words,
10776
10777 d = uvchr_to_utf8(d, uv);
10778
10779 is the recommended wide native character-aware way of saying
10780
10781 *(d++) = uv;
10782
10783 This function accepts any code point from 0.."IV_MAX" as input.
10784 "IV_MAX" is typically 0x7FFF_FFFF in a 32-bit word.
10785
10786 It is possible to forbid or warn on non-Unicode code points, or
10787 those that may be problematic by using "uvchr_to_utf8_flags".
10788
10789 U8* uvchr_to_utf8(U8 *d, UV uv)
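
The "append and return the new end pointer" convention can be sketched in standalone C as follows. The sketch assumes an ASCII platform and handles only code points up to 0x10FFFF, whereas the real function accepts up to "IV_MAX". "sketch_uv_skip", "sketch_uv_to_utf8" and "demo_encodes_to" are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for UVCHR_SKIP, standard UTF-8 only. */
static size_t sketch_uv_skip(uint32_t cp)
{
    if (cp < 0x80)    return 1;
    if (cp < 0x800)   return 2;
    if (cp < 0x10000) return 3;
    return 4;
}

/* Hypothetical encoder: writes cp at d and returns the byte after it. */
static unsigned char *sketch_uv_to_utf8(unsigned char *d, uint32_t cp)
{
    assert(d != NULL && cp <= 0x10FFFF);
    if (cp < 0x80) {
        *d++ = (unsigned char)cp;
    } else if (cp < 0x800) {
        *d++ = (unsigned char)(0xC0 | (cp >> 6));
        *d++ = (unsigned char)(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {
        *d++ = (unsigned char)(0xE0 | (cp >> 12));
        *d++ = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        *d++ = (unsigned char)(0x80 | (cp & 0x3F));
    } else {
        *d++ = (unsigned char)(0xF0 | (cp >> 18));
        *d++ = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
        *d++ = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        *d++ = (unsigned char)(0x80 | (cp & 0x3F));
    }
    return d;
}

/* Usage: true if cp encodes to exactly the bytes in expected. */
static int demo_encodes_to(uint32_t cp, const char *expected)
{
    unsigned char buf[5];                       /* 4 bytes plus spare */
    unsigned char *end = sketch_uv_to_utf8(buf, cp);
    size_t n = (size_t)(end - buf);

    return n == sketch_uv_skip(cp)
        && n == strlen(expected)
        && memcmp(buf, expected, n) == 0;
}
```

Because the returned pointer is the new end of the data, successive calls of the form "d = f(d, cp)" append characters one after another.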
10790
10791 uvchr_to_utf8_flags
10792 Adds the UTF-8 representation of the native code point "uv" to
10793 the end of the string "d"; "d" should have at least
10794 "UVCHR_SKIP(uv)+1" (up to "UTF8_MAXBYTES+1") free bytes
10795 available. The return value is the pointer to the byte after
10796 the end of the new character. In other words,
10797
10798 d = uvchr_to_utf8_flags(d, uv, flags);
10799
10800 or, in most cases,
10801
10802 d = uvchr_to_utf8_flags(d, uv, 0);
10803
10804 This is the Unicode-aware way of saying
10805
10806 *(d++) = uv;
10807
10808 If "flags" is 0, this function accepts any code point from
10809 0.."IV_MAX" as input. "IV_MAX" is typically 0x7FFF_FFFF in a
10810 32-bit word.
10811
10812 Specifying "flags" can further restrict what is allowed and not
10813 warned on, as follows:
10814
10815 If "uv" is a Unicode surrogate code point and
10816 "UNICODE_WARN_SURROGATE" is set, the function will raise a
10817 warning, provided UTF8 warnings are enabled. If instead
10818 "UNICODE_DISALLOW_SURROGATE" is set, the function will fail and
10819 return NULL. If both flags are set, the function will both
10820 warn and return NULL.
10821
10822 Similarly, the "UNICODE_WARN_NONCHAR" and
10823 "UNICODE_DISALLOW_NONCHAR" flags affect how the function
10824 handles a Unicode non-character.
10825
10826 And likewise, the "UNICODE_WARN_SUPER" and
10827 "UNICODE_DISALLOW_SUPER" flags affect the handling of code
10828 points that are above the Unicode maximum of 0x10FFFF.
10829 Languages other than Perl may not be able to accept files that
10830 contain these.
10831
10832 The flag "UNICODE_WARN_ILLEGAL_INTERCHANGE" selects all three
10833 of the above WARN flags; and
10834 "UNICODE_DISALLOW_ILLEGAL_INTERCHANGE" selects all three
10835 DISALLOW flags. "UNICODE_DISALLOW_ILLEGAL_INTERCHANGE"
10836 restricts the allowed inputs to the strict UTF-8 traditionally
10837 defined by Unicode. Similarly,
10838 "UNICODE_WARN_ILLEGAL_C9_INTERCHANGE" and
10839 "UNICODE_DISALLOW_ILLEGAL_C9_INTERCHANGE" are shortcuts to
10840 select the above-Unicode and surrogate flags, but not the non-
10841 character ones, as defined in Unicode Corrigendum #9
10842 <http://www.unicode.org/versions/corrigendum9.html>. See
10843 "Noncharacter code points" in perlunicode.
10844
10845 Extremely high code points were never specified in any
10846 standard, and require an extension to UTF-8 to express, which
10847 Perl does. It is likely that programs written in something
10848 other than Perl would not be able to read files that contain
10849 these; nor would Perl understand files written by something
10850 that uses a different extension. For these reasons, there is a
10851 separate set of flags that can warn and/or disallow these
10852 extremely high code points, even if other above-Unicode ones
10853 are accepted. They are the "UNICODE_WARN_PERL_EXTENDED" and
10854 "UNICODE_DISALLOW_PERL_EXTENDED" flags. For more information
10855 see "UTF8_GOT_PERL_EXTENDED". Of course
10856 "UNICODE_DISALLOW_SUPER" will treat all above-Unicode code
10857 points, including these, as malformations. (Note that the
10858 Unicode standard considers anything above 0x10FFFF to be
10859 illegal, but there are standards predating it that allow up to
10860 0x7FFF_FFFF (2**31 - 1).)
10861
10862 A somewhat misleadingly named synonym for
10863 "UNICODE_WARN_PERL_EXTENDED" is retained for backward
10864 compatibility: "UNICODE_WARN_ABOVE_31_BIT". Similarly,
10865 "UNICODE_DISALLOW_ABOVE_31_BIT" is usable instead of the more
10866 accurately named "UNICODE_DISALLOW_PERL_EXTENDED". The names
10867 are misleading because on EBCDIC platforms, these flags can
10868 apply to code points that actually do fit in 31 bits. The new
10869 names accurately describe the situation in all cases.
10870
10871 U8* uvchr_to_utf8_flags(U8 *d, UV uv, UV flags)
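
How the DISALLOW flags partition the code point space can be sketched as below. The "SK_" constants and "sketch_cp_allowed" are hypothetical stand-ins invented for this illustration; they are not the values or the logic of the real "UNICODE_DISALLOW_*" flags, and the WARN and PERL_EXTENDED variants are not modelled.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical flag bits, for illustration only. */
enum {
    SK_DISALLOW_SURROGATE = 1 << 0,
    SK_DISALLOW_NONCHAR   = 1 << 1,
    SK_DISALLOW_SUPER     = 1 << 2,
    /* All three classes: strict Unicode interchange. */
    SK_DISALLOW_ILLEGAL_INTERCHANGE =
        SK_DISALLOW_SURROGATE | SK_DISALLOW_NONCHAR | SK_DISALLOW_SUPER,
    /* Corrigendum #9 style: non-characters remain acceptable. */
    SK_DISALLOW_ILLEGAL_C9_INTERCHANGE =
        SK_DISALLOW_SURROGATE | SK_DISALLOW_SUPER
};

/* Returns 1 if cp may be encoded under flags, 0 if it is rejected
 * (the case in which the real function returns NULL). */
static int sketch_cp_allowed(uint32_t cp, unsigned flags)
{
    assert((flags & ~(unsigned)SK_DISALLOW_ILLEGAL_INTERCHANGE) == 0);
    if (cp > 0x10FFFF)                          /* above Unicode */
        return !(flags & SK_DISALLOW_SUPER);
    if (cp >= 0xD800 && cp <= 0xDFFF)           /* surrogates */
        return !(flags & SK_DISALLOW_SURROGATE);
    if ((cp >= 0xFDD0 && cp <= 0xFDEF)          /* non-characters */
        || (cp & 0xFFFE) == 0xFFFE)
        return !(flags & SK_DISALLOW_NONCHAR);
    return 1;
}
```

With no flags set, every code point in the sketch's range is accepted, which mirrors the permissive default described above.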
10872
10873 uvchr_to_utf8_flags_msgs
10874 NOTE: this function is experimental and may change or be
10875 removed without notice.
10876
10877 THIS FUNCTION SHOULD BE USED IN ONLY VERY SPECIALIZED
10878 CIRCUMSTANCES.
10879
10880 Most code should use "uvchr_to_utf8_flags()" rather than call
10881 this directly.
10882
10883 This function is for code that wants any warning and/or error
10884 messages to be returned to the caller rather than be displayed.
10885 All messages that would have been displayed if all lexical
10886 warnings were enabled will be returned.
10887
10888 It is just like "uvchr_to_utf8_flags" but it takes an extra
10889 parameter placed after all the others, "msgs". If this
10890 parameter is 0, this function behaves identically to
10891 "uvchr_to_utf8_flags". Otherwise, "msgs" should be a pointer
10892 to an "HV *" variable, in which this function creates a new HV
10893 to contain any appropriate messages. The hash has three key-
10894 value pairs, as follows:
10895
10896 "text"
10897 The text of the message as a "SVpv".
10898
10899 "warn_categories"
10900 The warning category (or categories) packed into a "SVuv".
10901
10902 "flag"
10903 A single flag bit associated with this message, in a
10904 "SVuv". The bit corresponds to some bit in the *errors
10905 return value, such as "UNICODE_GOT_SURROGATE".
10906
10907 It's important to note that specifying this parameter as non-
10908 null will cause any warnings this function would otherwise
10909 generate to be suppressed, and instead be placed in *msgs. The
10910 caller can check the lexical warnings state (or not) when
10911 choosing what to do with the returned messages.
10912
10913 The caller, of course, is responsible for freeing any returned
10914 HV.
10915
10916 U8* uvchr_to_utf8_flags_msgs(U8 *d, UV uv, UV flags,
10917 HV ** msgs)
10918
10919 uvoffuni_to_utf8_flags
10920 THIS FUNCTION SHOULD BE USED IN ONLY VERY SPECIALIZED
10921 CIRCUMSTANCES. Instead, almost all code should use
10922 "uvchr_to_utf8" or "uvchr_to_utf8_flags".
10923
10924 This function is like them, but the input is a strict Unicode
10925 (as opposed to native) code point. Only in very rare
10926 circumstances should code not be using the native code point.
10927
10928 For details, see the description for "uvchr_to_utf8_flags".
10929
10930 U8* uvoffuni_to_utf8_flags(U8 *d, UV uv,
10931 const UV flags)
10932
10933 uvuni_to_utf8_flags
10934 You almost certainly want to use "uvchr_to_utf8" or
10935 "uvchr_to_utf8_flags" instead of this function.
10936
10937 This function is a deprecated synonym for
10938 "uvoffuni_to_utf8_flags", which itself, while not deprecated,
10939 should be used only in isolated circumstances. These functions
10940 were useful for code that wanted to handle both EBCDIC and
10941 ASCII platforms with Unicode properties, but starting in Perl
10942 v5.20, the distinctions between the platforms have mostly been
10943 made invisible to most code, so this function is quite unlikely
10944 to be what you want.
10945
10946 U8* uvuni_to_utf8_flags(U8 *d, UV uv, UV flags)
10947
10948 valid_utf8_to_uvchr
10949 Like "utf8_to_uvchr_buf", but should only be called when it is
10950 known that the next character in the input UTF-8 string "s" is
10951 well-formed (e.g., it passes "isUTF8_CHAR"). Surrogates,
10952 non-character code points, and non-Unicode code points are allowed.
10953
10954 UV valid_utf8_to_uvchr(const U8 *s, STRLEN *retlen)
10955
10957 newXSproto
10958 Used by "xsubpp" to hook up XSUBs as Perl subs. Adds Perl
10959 prototypes to the subs.
10960
10961 XS_APIVERSION_BOOTCHECK
10962 Macro to verify that the perl api version an XS module has been
10963 compiled against matches the api version of the perl
10964 interpreter it's being loaded into.
10965
10966 XS_APIVERSION_BOOTCHECK;
10967
10968 XS_VERSION
10969 The version identifier for an XS module. This is usually
10970 handled automatically by "ExtUtils::MakeMaker". See
10971 "XS_VERSION_BOOTCHECK".
10972
10973 XS_VERSION_BOOTCHECK
10974 Macro to verify that a PM module's $VERSION variable matches
10975 the XS module's "XS_VERSION" variable. This is usually handled
10976 automatically by "xsubpp". See "The VERSIONCHECK: Keyword" in
10977 perlxs.
10978
10979 XS_VERSION_BOOTCHECK;
10980
10982 ckWARN Returns a boolean as to whether or not warnings are enabled for
10983 the warning category "w". If the category is by default
10984 enabled even if not within the scope of "use warnings", instead
10985 use the "ckWARN_d" macro.
10986
10987 bool ckWARN(U32 w)
10988
10989 ckWARN2 Like "ckWARN", but takes two warnings categories as input, and
10990 returns TRUE if either is enabled. If either category is by
10991 default enabled even if not within the scope of "use warnings",
10992 instead use the "ckWARN2_d" macro. The categories must be
10993 completely independent; one may not be subclassed from the
10994 other.
10995
10996 bool ckWARN2(U32 w1, U32 w2)
10997
10998 ckWARN3 Like "ckWARN2", but takes three warnings categories as input,
10999 and returns TRUE if any is enabled. If any of the categories
11000 is by default enabled even if not within the scope of
11001 "use warnings", instead use the "ckWARN3_d" macro. The
11002 categories must be completely independent; one may not be
11003 subclassed from any other.
11004
11005 bool ckWARN3(U32 w1, U32 w2, U32 w3)
11006
11007 ckWARN4 Like "ckWARN3", but takes four warnings categories as input,
11008 and returns TRUE if any is enabled. If any of the categories
11009 is by default enabled even if not within the scope of
11010 "use warnings", instead use the "ckWARN4_d" macro. The
11011 categories must be completely independent; one may not be
11012 subclassed from any other.
11013
11014 bool ckWARN4(U32 w1, U32 w2, U32 w3, U32 w4)
11015
11016 ckWARN_d
11017 Like "ckWARN", but for use if and only if the warning category
11018 is by default enabled even if not within the scope of
11019 "use warnings".
11020
11021 bool ckWARN_d(U32 w)
11022
11023 ckWARN2_d
11024 Like "ckWARN2", but for use if and only if either warning
11025 category is by default enabled even if not within the scope of
11026 "use warnings".
11027
11028 bool ckWARN2_d(U32 w1, U32 w2)
11029
11030 ckWARN3_d
11031 Like "ckWARN3", but for use if and only if any of the warning
11032 categories is by default enabled even if not within the scope
11033 of "use warnings".
11034
11035 bool ckWARN3_d(U32 w1, U32 w2, U32 w3)
11036
11037 ckWARN4_d
11038 Like "ckWARN4", but for use if and only if any of the warning
11039 categories is by default enabled even if not within the scope
11040 of "use warnings".
11041
11042 bool ckWARN4_d(U32 w1, U32 w2, U32 w3, U32 w4)
11043
11044 croak This is an XS interface to Perl's "die" function.
11045
11046 Take a sprintf-style format pattern and argument list. These
11047 are used to generate a string message. If the message does not
11048 end with a newline, then it will be extended with some
11049 indication of the current location in the code, as described
11050 for "mess_sv".
11051
11052 The error message will be used as an exception, by default
11053 returning control to the nearest enclosing "eval", but subject
11054 to modification by a $SIG{__DIE__} handler. In any case, the
11055 "croak" function never returns normally.
11056
11057 For historical reasons, if "pat" is null then the contents of
11058 "ERRSV" ($@) will be used as an error message or object instead
11059 of building an error message from arguments. If you want to
11060 throw a non-string object, or build an error message in an SV
11061 yourself, it is preferable to use the "croak_sv" function,
11062 which does not involve clobbering "ERRSV".
11063
11064 void croak(const char *pat, ...)
11065
11066 croak_no_modify
11067 Exactly equivalent to "Perl_croak(aTHX_ "%s", PL_no_modify)",
11068 but generates terser object code than using "Perl_croak". Less
11069 code used on exception code paths reduces CPU cache pressure.
11070
11071 void croak_no_modify()
11072
11073 croak_sv
11074 This is an XS interface to Perl's "die" function.
11075
11076 "baseex" is the error message or object. If it is a reference,
11077 it will be used as-is. Otherwise it is used as a string, and
11078 if it does not end with a newline then it will be extended with
11079 some indication of the current location in the code, as
11080 described for "mess_sv".
11081
11082 The error message or object will be used as an exception, by
11083 default returning control to the nearest enclosing "eval", but
11084 subject to modification by a $SIG{__DIE__} handler. In any
11085 case, the "croak_sv" function never returns normally.
11086
11087 To die with a simple string message, the "croak" function may
11088 be more convenient.
11089
11090 void croak_sv(SV *baseex)
11091
11092 die Behaves the same as "croak", except for the return type. It
11093 should be used only where the "OP *" return type is required.
11094 The function never actually returns.
11095
11096 OP * die(const char *pat, ...)
11097
11098 die_sv Behaves the same as "croak_sv", except for the return type. It
11099 should be used only where the "OP *" return type is required.
11100 The function never actually returns.
11101
11102 OP * die_sv(SV *baseex)
11103
11104 vcroak This is an XS interface to Perl's "die" function.
11105
11106 "pat" and "args" are a sprintf-style format pattern and
11107 encapsulated argument list. These are used to generate a
11108 string message. If the message does not end with a newline,
11109 then it will be extended with some indication of the current
11110 location in the code, as described for "mess_sv".
11111
11112 The error message will be used as an exception, by default
11113 returning control to the nearest enclosing "eval", but subject
11114 to modification by a $SIG{__DIE__} handler. In any case, the
11115 "vcroak" function never returns normally.
11116
11117 For historical reasons, if "pat" is null then the contents of
11118 "ERRSV" ($@) will be used as an error message or object instead
11119 of building an error message from arguments. If you want to
11120 throw a non-string object, or build an error message in an SV
11121 yourself, it is preferable to use the "croak_sv" function,
11122 which does not involve clobbering "ERRSV".
11123
11124 void vcroak(const char *pat, va_list *args)
11125
11126 vwarn This is an XS interface to Perl's "warn" function.
11127
11128 "pat" and "args" are a sprintf-style format pattern and
11129 encapsulated argument list. These are used to generate a
11130 string message. If the message does not end with a newline,
11131 then it will be extended with some indication of the current
11132 location in the code, as described for "mess_sv".
11133
11134 The error message or object will by default be written to
11135 standard error, but this is subject to modification by a
11136 $SIG{__WARN__} handler.
11137
11138 Unlike with "vcroak", "pat" is not permitted to be null.
11139
11140 void vwarn(const char *pat, va_list *args)
11141
11142 warn This is an XS interface to Perl's "warn" function.
11143
11144 Take a sprintf-style format pattern and argument list. These
11145 are used to generate a string message. If the message does not
11146 end with a newline, then it will be extended with some
11147 indication of the current location in the code, as described
11148 for "mess_sv".
11149
11150 The error message or object will by default be written to
11151 standard error, but this is subject to modification by a
11152 $SIG{__WARN__} handler.
11153
11154 Unlike with "croak", "pat" is not permitted to be null.
11155
11156 void warn(const char *pat, ...)
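
The newline rule shared by "warn" and "croak" can be illustrated with a standalone sketch that formats a message and appends a location only when the text does not already end in a newline. "sketch_mess" and "demo_mess_equals" are hypothetical names, and the file name and line number are passed in explicitly here, whereas Perl derives them from the interpreter's current state. Special handling of empty messages is not modelled.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical formatter: writes the message into buf and returns its
 * length, or -1 if it does not fit. A message that lacks a trailing
 * newline is extended with " at FILE line N.\n". */
static int sketch_mess(char *buf, size_t size, const char *file, int line,
                       const char *pat, ...)
{
    va_list args;
    int n;

    assert(buf != NULL && size > 0 && file != NULL && pat != NULL);
    va_start(args, pat);
    n = vsnprintf(buf, size, pat, args);
    va_end(args);
    if (n < 0 || (size_t)n >= size)
        return -1;                      /* error or truncated */

    if (n == 0 || buf[n - 1] != '\n') {
        size_t room = size - (size_t)n;
        int m = snprintf(buf + n, room, " at %s line %d.\n", file, line);
        if (m < 0 || (size_t)m >= room)
            return -1;
        n += m;
    }
    return n;
}

/* Usage: true if msg, reported from demo.xs line 42, yields expected. */
static int demo_mess_equals(const char *msg, const char *expected)
{
    char buf[128];
    int n = sketch_mess(buf, sizeof buf, "demo.xs", 42, "%s", msg);

    return n >= 0 && strcmp(buf, expected) == 0;
}
```

Ending a message with a newline is therefore the way to suppress the location suffix.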
11157
11158 warn_sv This is an XS interface to Perl's "warn" function.
11159
11160 "baseex" is the error message or object. If it is a reference,
11161 it will be used as-is. Otherwise it is used as a string, and
11162 if it does not end with a newline then it will be extended with
11163 some indication of the current location in the code, as
11164 described for "mess_sv".
11165
11166 The error message or object will by default be written to
11167 standard error, but this is subject to modification by a
11168 $SIG{__WARN__} handler.
11169
11170 To warn with a simple string message, the "warn" function may
11171 be more convenient.
11172
11173 void warn_sv(SV *baseex)
11174
11176 The following functions have been flagged as part of the public API,
11177 but are currently undocumented. Use them at your own risk, as the
11178 interfaces are subject to change. Functions that are not listed in
11179 this document are not intended for public use, and should NOT be used
11180 under any circumstances.
11181
11182 If you feel you need to use one of these functions, first send email to
11183 perl5-porters@perl.org <mailto:perl5-porters@perl.org>. It may be that
11184 there is a good reason for the function not being documented, and it
11185 should be removed from this list; or it may just be that no one has
11186 gotten around to documenting it. In the latter case, you will be asked
11187 to submit a patch to document the function. Once your patch is
11188 accepted, it will indicate that the interface is stable (unless it is
11189 explicitly marked otherwise) and usable by you.
11190
11191 GetVars
11192 Gv_AMupdate
11193 PerlIO_clearerr
11194 PerlIO_close
11195 PerlIO_context_layers
11196 PerlIO_eof
11197 PerlIO_error
11198 PerlIO_fileno
11199 PerlIO_fill
11200 PerlIO_flush
11201 PerlIO_get_base
11202 PerlIO_get_bufsiz
11203 PerlIO_get_cnt
11204 PerlIO_get_ptr
11205 PerlIO_read
11206 PerlIO_seek
11207 PerlIO_set_cnt
11208 PerlIO_set_ptrcnt
11209 PerlIO_setlinebuf
11210 PerlIO_stderr
11211 PerlIO_stdin
11212 PerlIO_stdout
11213 PerlIO_tell
11214 PerlIO_unread
11215 PerlIO_write
11216 _variant_byte_number
11217 amagic_call
11218 amagic_deref_call
11219 any_dup
11220 atfork_lock
11221 atfork_unlock
11222 av_arylen_p
11223 av_iter_p
11224 block_gimme
11225 call_atexit
11226 call_list
11227 calloc
11228 cast_i32
11229 cast_iv
11230 cast_ulong
11231 cast_uv
11232 ck_warner
11233 ck_warner_d
11234 ckwarn
11235 ckwarn_d
11236 clear_defarray
11237 clone_params_del
11238 clone_params_new
11239 croak_memory_wrap
11240 croak_nocontext
11241 csighandler
11242 cx_dump
11243 cx_dup
11244 cxinc
11245 deb
11246 deb_nocontext
11247 debop
11248 debprofdump
11249 debstack
11250 debstackptrs
11251 delimcpy
11252 despatch_signals
11253 die_nocontext
11254 dirp_dup
11255 do_aspawn
11256 do_binmode
11257 do_close
11258 do_gv_dump
11259 do_gvgv_dump
11260 do_hv_dump
11261 do_join
11262 do_magic_dump
11263 do_op_dump
11264 do_open
11265 do_open9
11266 do_openn
11267 do_pmop_dump
11268 do_spawn
11269 do_spawn_nowait
11270 do_sprintf
11271 do_sv_dump
11272 doing_taint
11273 doref
11274 dounwind
11275 dowantarray
11276 dump_eval
11277 dump_form
11278 dump_indent
11279 dump_mstats
11280 dump_sub
11281 dump_vindent
11282 filter_add
11283 filter_del
11284 filter_read
11285 foldEQ_latin1
11286 form_nocontext
11287 fp_dup
11288 fprintf_nocontext
11289 free_global_struct
11290 free_tmps
11291 get_context
11292 get_mstats
11293 get_op_descs
11294 get_op_names
11295 get_ppaddr
11296 get_vtbl
11297 gp_dup
11298 gp_free
11299 gp_ref
11300 gv_AVadd
11301 gv_HVadd
11302 gv_IOadd
11303 gv_SVadd
11304 gv_add_by_type
11305 gv_autoload4
11306 gv_autoload_pv
11307 gv_autoload_pvn
11308 gv_autoload_sv
11309 gv_check
11310 gv_dump
11311 gv_efullname
11312 gv_efullname3
11313 gv_efullname4
11314 gv_fetchfile
11315 gv_fetchfile_flags
11316 gv_fetchpv
11317 gv_fetchpvn_flags
11318 gv_fetchsv
11319 gv_fullname
11320 gv_fullname3
11321 gv_fullname4
11322 gv_handler
11323 gv_name_set
11324 he_dup
11325 hek_dup
11326 hv_common
11327 hv_common_key_len
11328 hv_delayfree_ent
11329 hv_eiter_p
11330 hv_eiter_set
11331 hv_free_ent
11332 hv_ksplit
11333 hv_name_set
11334 hv_placeholders_get
11335 hv_placeholders_set
11336 hv_rand_set
11337 hv_riter_p
11338 hv_riter_set
11339 ibcmp_utf8
11340 init_global_struct
11341 init_stacks
11342 init_tm
11343 instr
11344 is_lvalue_sub
11345 leave_scope
11346 load_module_nocontext
11347 magic_dump
11348 malloc
11349 markstack_grow
11350 mess_nocontext
11351 mfree
11352 mg_dup
11353 mg_size
11354 mini_mktime
11355 moreswitches
11356 mro_get_from_name
11357 mro_get_private_data
11358 mro_set_mro
11359 mro_set_private_data
11360 my_atof
11361 my_atof2
11362 my_atof3
11363 my_chsize
11364 my_cxt_index
11365 my_cxt_init
11366 my_dirfd
11367 my_exit
11368 my_failure_exit
11369 my_fflush_all
11370 my_fork
11371 my_lstat
11372 my_pclose
11373 my_popen
11374 my_popen_list
11375 my_setenv
11376 my_socketpair
11377 my_stat
11378 my_strftime
11379 newANONATTRSUB
11380 newANONHASH
11381 newANONLIST
11382 newANONSUB
11383 newATTRSUB
11384 newAVREF
11385 newCVREF
11386 newFORM
11387 newGVREF
11388 newGVgen
11389 newGVgen_flags
11390 newHVREF
11391 newHVhv
11392 newIO
11393 newMYSUB
11394 newPROG
11395 newRV
11396 newSUB
11397 newSVREF
11398 newSVpvf_nocontext
11399 newSVsv_flags
11400 new_stackinfo
11401 op_refcnt_lock
11402 op_refcnt_unlock
11403 parser_dup
11404 perl_alloc_using
11405 perl_clone_using
11406 pmop_dump
11407 pop_scope
11408 pregcomp
11409 pregexec
11410 pregfree
11411 pregfree2
11412 printf_nocontext
11413 ptr_table_fetch
11414 ptr_table_free
11415 ptr_table_new
11416 ptr_table_split
11417 ptr_table_store
11418 push_scope
11419 re_compile
11420 re_dup_guts
11421 re_intuit_start
11422 re_intuit_string
11423 realloc
11424 reentrant_free
11425 reentrant_init
11426 reentrant_retry
11427 reentrant_size
11428 ref
11429 reg_named_buff_all
11430 reg_named_buff_exists
11431 reg_named_buff_fetch
11432 reg_named_buff_firstkey
11433 reg_named_buff_nextkey
11434 reg_named_buff_scalar
11435 regdump
11436 regdupe_internal
11437 regexec_flags
11438 regfree_internal
11439 reginitcolors
11440 regnext
11441 repeatcpy
11442 rsignal
11443 rsignal_state
11444 runops_debug
11445 runops_standard
11446 rvpv_dup
11447 safesyscalloc
11448 safesysfree
11449 safesysmalloc
11450 safesysrealloc
11451 save_I16
11452 save_I32
11453 save_I8
11454 save_adelete
11455 save_aelem
11456 save_aelem_flags
11457 save_alloc
11458 save_aptr
11459 save_ary
11460 save_bool
11461 save_clearsv
11462 save_delete
11463 save_destructor
11464 save_destructor_x
11465 save_freeop
11466 save_freepv
11467 save_freesv
11468 save_generic_pvref
11469 save_generic_svref
11470 save_hash
11471 save_hdelete
11472 save_helem
11473 save_helem_flags
11474 save_hints
11475 save_hptr
11476 save_int
11477 save_item
11478 save_iv
11479 save_list
11480 save_long
11481 save_mortalizesv
11482 save_nogv
11483 save_op
11484 save_padsv_and_mortalize
11485 save_pptr
11486 save_pushi32ptr
11487 save_pushptr
11488 save_pushptrptr
11489 save_re_context
11490 save_scalar
11491 save_set_svflags
11492 save_shared_pvref
11493 save_sptr
11494 save_svref
11495 save_vptr
11496 savestack_grow
11497 savestack_grow_cnt
11498 scan_num
11499 scan_vstring
11500 seed
11501 set_context
11502 share_hek
11503 si_dup
11504 ss_dup
11505 stack_grow
11506 start_subparse
11507 str_to_version
11508 sv_2iv
11509 sv_2pv
11510 sv_2uv
11511 sv_catpvf_mg_nocontext
11512 sv_catpvf_nocontext
11513 sv_dup
11514 sv_dup_inc
11515 sv_peek
11516 sv_pvn_nomg
11517 sv_setpvf_mg_nocontext
11518 sv_setpvf_nocontext
11519 sys_init
11520 sys_init3
11521 sys_intern_clear
11522 sys_intern_dup
11523 sys_intern_init
11524 sys_term
11525 taint_env
11526 taint_proper
11527 unlnk
11528 unsharepvn
11529 uvuni_to_utf8
11530 vdeb
11531 vform
11532 vload_module
11533 vnewSVpvf
11534 vwarner
11535 warn_nocontext
11536 warner
11537 warner_nocontext
11538 whichsig
11539 whichsig_pv
11540 whichsig_pvn
11541 whichsig_sv
11542
11544 Until May 1997, this document was maintained by Jeff Okamoto
11545 <okamoto@corp.hp.com>. It is now maintained as part of Perl itself.
11546
11547 With lots of help and suggestions from Dean Roehrich, Malcolm Beattie,
11548 Andreas Koenig, Paul Hudson, Ilya Zakharevich, Paul Marquess, Neil
11549 Bowers, Matthew Green, Tim Bunce, Spider Boardman, Ulrich Pfeifer,
11550 Stephen McCamant, and Gurusamy Sarathy.
11551
11552 API Listing originally by Dean Roehrich <roehrich@cray.com>.
11553
11554 Updated to be autogenerated from comments in the source by Benjamin
11555 Stuhl.
11556
11558 perlguts, perlxs, perlxstut, perlintern
11559
11560
11561
11562perl v5.30.2 2020-03-27 PERLAPI(1)