1PERLAPI(1) Perl Programmers Reference Guide PERLAPI(1)
2
3
4
6 perlapi - autogenerated documentation for the perl public API
7
9 This file contains the documentation of the perl public API generated
10 by embed.pl, specifically a listing of functions, macros, flags, and
11 variables that may be used by extension writers. At the end is a list
12 of functions which have yet to be documented. The interfaces of those
13 are subject to change without notice. Anything not listed here is not
14 part of the public API, and should not be used by extension writers at
15 all. For these reasons, blindly using functions listed in proto.h is
16 to be avoided when writing extensions.
17
18 In Perl, unlike C, a string of characters may generally contain
19 embedded "NUL" characters. Sometimes in the documentation a Perl
20 string is referred to as a "buffer" to distinguish it from a C string,
21 but sometimes they are both just referred to as strings.
22
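The difference between a C string and a length-counted "buffer" can be illustrated with a small standalone C sketch. It does not use the Perl API; the counted_buf type and the helper names are hypothetical and exist only for this illustration.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a Perl string: bytes plus an explicit length. */
typedef struct {
    const char *bytes;
    size_t len;
} counted_buf;

/* Length as a C string function sees it: stops at the first NUL. */
static size_t c_string_length(const counted_buf *b) {
    return strlen(b->bytes);
}

/* Length as a counted buffer sees it: embedded NULs are ordinary data. */
static size_t buffer_length(const counted_buf *b) {
    return b->len;
}

/* Build the 7-byte buffer "abc\0def"; sizeof also counts the literal's
 * own terminating NUL, so one is subtracted. */
static counted_buf make_example(void) {
    static const char data[] = "abc\0def";
    counted_buf b = { data, sizeof data - 1 };
    return b;
}
```

Passing such a buffer to a function that expects a C string silently drops everything after the first NUL, which is why a length normally has to travel with the pointer.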
23 Note that all Perl API global variables must be referenced with the
24 "PL_" prefix. Again, those not listed here are not to be used by
25 extension writers, and can be changed or removed without notice; same
26 with macros. Some macros are provided for compatibility with the
27 older, unadorned names, but this support may be disabled in a future
28 release.
29
30 Perl was originally written to handle US-ASCII only (that is, characters
31 whose ordinal numbers are in the range 0 - 127). And documentation and
32 comments may still use the term ASCII, when sometimes in fact the
33 entire range from 0 - 255 is meant.
34
35 The non-ASCII characters below 256 can have various meanings, depending
36 on various things. (See, most notably, perllocale.) But usually the
37 whole range can be referred to as ISO-8859-1. Often, the term
38 "Latin-1" (or "Latin1") is used as an equivalent for ISO-8859-1. But
39 some people treat "Latin1" as referring just to the characters in the
40 range 128 through 255, or sometimes from 160 through 255. This
41 documentation uses "Latin1" and "Latin-1" to refer to all 256
42 characters.
43
44 Note that Perl can be compiled and run under either ASCII or EBCDIC
45 (see perlebcdic). Most of the documentation (and even comments in the
46 code) ignores the EBCDIC possibility. For almost all purposes the
47 differences are transparent. As an example, under EBCDIC, instead of
48 UTF-8, UTF-EBCDIC is used to encode Unicode strings, and so whenever
49 this documentation refers to "utf8" (and variants of that name,
50 including in function names), it also (essentially transparently) means
51 "UTF-EBCDIC". But the ordinals of characters differ between ASCII,
52 EBCDIC, and the UTF- encodings, and a string encoded in UTF-EBCDIC may
53 occupy a different number of bytes than in UTF-8.
54
55 The listing below is alphabetical, case insensitive.
56
58 av_clear
59 Frees all the elements of an array, leaving it empty. The
60 XS equivalent of "@array = ()". See also "av_undef".
61
62 Note that it is possible that the actions of a destructor
63 called directly or indirectly by freeing an element of the
64 array could cause the reference count of the array itself to be
65 reduced (e.g. by deleting an entry in the symbol table). So it
66 is a possibility that the AV could have been freed (or even
67 reallocated) on return from the call unless you hold a
68 reference to it.
69
70 void av_clear(AV *av)
71
72 av_create_and_push
73 NOTE: this function is experimental and may change or be
74 removed without notice.
75
76 Push an SV onto the end of the array, creating the array if
77 necessary. A small internal helper function to remove a
78 commonly duplicated idiom.
79
80 void av_create_and_push(AV **const avp,
81 SV *const val)
82
83 av_create_and_unshift_one
84 NOTE: this function is experimental and may change or be
85 removed without notice.
86
87 Unshifts an SV onto the beginning of the array, creating the
88 array if necessary. A small internal helper function to remove
89 a commonly duplicated idiom.
90
91 SV** av_create_and_unshift_one(AV **const avp,
92 SV *const val)
93
94 av_delete
95 Deletes the element indexed by "key" from the array, makes the
96 element mortal, and returns it. If "flags" equals "G_DISCARD",
97 the element is freed and NULL is returned. NULL is also
98 returned if "key" is out of range.
99
100 Perl equivalent: "splice(@myarray, $key, 1, undef)" (with the
101 "splice" in void context if "G_DISCARD" is present).
102
103 SV* av_delete(AV *av, SSize_t key, I32 flags)
104
105 av_exists
106 Returns true if the element indexed by "key" has been
107 initialized.
108
109 This relies on the fact that uninitialized array elements are
110 set to "NULL".
111
112 Perl equivalent: "exists($myarray[$key])".
113
114 bool av_exists(AV *av, SSize_t key)
115
116 av_extend
117 Pre-extend an array. The "key" is the index to which the array
118 should be extended.
119
120 void av_extend(AV *av, SSize_t key)
121
122 av_fetch
123 Returns the SV at the specified index in the array. The "key"
124 is the index. If lval is true, you are guaranteed to get a
125 real SV back (in case it wasn't real before), which you can
126 then modify. Check that the return value is non-null before
127 dereferencing it to a "SV*".
128
129 See "Understanding the Magic of Tied Hashes and Arrays" in
130 perlguts for more information on how to use this function on
131 tied arrays.
132
133 The rough perl equivalent is $myarray[$key].
134
135 SV** av_fetch(AV *av, SSize_t key, I32 lval)
136
137 AvFILL Same as "av_top_index()" or "av_tindex()".
138
139 int AvFILL(AV* av)
140
141 av_fill Set the highest index in the array to the given number,
142 equivalent to Perl's "$#array = $fill;".
143
144 The number of elements in the array will be "fill + 1" after
145 "av_fill()" returns. If the array was previously shorter, then
146 the additional elements appended are set to NULL. If the array
147 was longer, then the excess elements are freed.
148 "av_fill(av, -1)" is the same as "av_clear(av)".
149
150 void av_fill(AV *av, SSize_t fill)
151
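The relationship between the highest index, the element count, and NULL slots described for "av_fill" can be sketched in standalone C. This is a simplified model, not Perl's implementation: the fill_model type, the fixed capacity, and all function names are assumptions made for illustration.

```c
#include <stddef.h>

#define FILL_CAP 8  /* fixed capacity, an assumption of this sketch */

/* Hypothetical stand-in for an AV: element slots plus the highest index. */
typedef struct {
    const char *slots[FILL_CAP];
    long top;                       /* -1 means the array is empty */
} fill_model;

/* Model of "$#array = fill": grow with NULL slots or drop the excess. */
static void model_fill(fill_model *av, long fill) {
    long i;
    if (fill < -1) fill = -1;
    if (fill >= FILL_CAP) fill = FILL_CAP - 1;
    for (i = av->top + 1; i <= fill; i++)
        av->slots[i] = NULL;        /* appended elements start out NULL */
    for (i = fill + 1; i <= av->top; i++)
        av->slots[i] = NULL;        /* excess elements are released */
    av->top = fill;
}

/* The number of elements is always the highest index plus one. */
static long model_count(const fill_model *av) {
    return av->top + 1;
}

/* An element "exists" only if its slot is in range and not NULL. */
static int model_exists(const fill_model *av, long key) {
    return key >= 0 && key <= av->top && av->slots[key] != NULL;
}
```

In this model, filling to -1 leaves a count of zero, matching the statement that "av_fill(av, -1)" behaves like "av_clear(av)".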
152 av_len Same as "av_top_index". Note that, unlike what the name
153 implies, it returns the highest index in the array, so to get
154 the size of the array you need to use "av_len(av) + 1". This
155 is unlike "sv_len", which returns what you would expect.
156
157 SSize_t av_len(AV *av)
158
159 av_make Creates a new AV and populates it with a list of SVs. The SVs
160 are copied into the array, so they may be freed after the call
161 to "av_make". The new AV will have a reference count of 1.
162
163 Perl equivalent: "my @new_array = ($scalar1, $scalar2,
164 $scalar3...);"
165
166 AV* av_make(SSize_t size, SV **strp)
167
168 av_pop Removes one SV from the end of the array, reducing its size by
169 one and returning the SV (transferring control of one reference
170 count) to the caller. Returns &PL_sv_undef if the array is
171 empty.
172
173 Perl equivalent: "pop(@myarray);"
174
175 SV* av_pop(AV *av)
176
177 av_push Pushes an SV (transferring control of one reference count) onto
178 the end of the array. The array will grow automatically to
179 accommodate the addition.
180
181 Perl equivalent: "push @myarray, $val;".
182
183 void av_push(AV *av, SV *val)
184
185 av_shift
186 Removes one SV from the start of the array, reducing its size
187 by one and returning the SV (transferring control of one
188 reference count) to the caller. Returns &PL_sv_undef if the
189 array is empty.
190
191 Perl equivalent: "shift(@myarray);"
192
193 SV* av_shift(AV *av)
194
195 av_store
196 Stores an SV in an array. The array index is specified as
197 "key". The return value will be "NULL" if the operation failed
198 or if the value did not need to be actually stored within the
199 array (as in the case of tied arrays). Otherwise, it can be
200 dereferenced to get the "SV*" that was stored there (= "val").
201
202 Note that the caller is responsible for suitably incrementing
203 the reference count of "val" before the call, and decrementing
204 it if the function returned "NULL".
205
206 Approximate Perl equivalent: "splice(@myarray, $key, 1, $val)".
207
208 See "Understanding the Magic of Tied Hashes and Arrays" in
209 perlguts for more information on how to use this function on
210 tied arrays.
211
212 SV** av_store(AV *av, SSize_t key, SV *val)
213
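The reference-count rule for "av_store" (increment before the call, decrement again if NULL comes back) can be sketched as a standalone C model. Nothing here is the Perl API: rc_sv, rc_av, model_store, and the out-of-range failure condition are hypothetical and only mimic the ownership contract described above.

```c
#include <stddef.h>

#define RC_CAP 4  /* fixed capacity, an assumption of this sketch */

/* Hypothetical stand-ins for an SV and an AV. */
typedef struct { int refcnt; } rc_sv;
typedef struct { rc_sv *slots[RC_CAP]; } rc_av;

/* Model of a store that takes over one reference on success and
 * returns NULL (taking nothing) on failure. */
static rc_sv **model_store(rc_av *av, long key, rc_sv *val) {
    if (key < 0 || key >= RC_CAP)
        return NULL;                     /* failure: reference not consumed */
    if (av->slots[key] != NULL)
        av->slots[key]->refcnt--;        /* release the element being replaced */
    av->slots[key] = val;
    return &av->slots[key];
}

/* Caller-side pattern: bump the count first, undo it if the store failed. */
static int store_keeping_own_ref(rc_av *av, long key, rc_sv *val) {
    val->refcnt++;
    if (model_store(av, key, val) == NULL) {
        val->refcnt--;
        return 0;
    }
    return 1;
}
```

Skipping the decrement on the failure path would leave the value with one reference too many, so it would never be freed.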
214 av_tindex
215 Same as "av_top_index()".
216
217 int av_tindex(AV* av)
218
219 av_top_index
220 Returns the highest index in the array. The number of elements
221 in the array is "av_top_index(av) + 1". Returns -1 if the
222 array is empty.
223
224 The Perl equivalent for this is $#myarray.
225
226 (A slightly shorter form is "av_tindex".)
227
228 SSize_t av_top_index(AV *av)
229
230 av_undef
231 Undefines the array. The XS equivalent of "undef(@array)".
232
233 As well as freeing all the elements of the array (like
234 "av_clear()"), this also frees the memory used by the av to
235 store its list of scalars.
236
237 See "av_clear" for a note about the array possibly being
238 invalid on return.
239
240 void av_undef(AV *av)
241
242 av_unshift
243 Unshift the given number of "undef" values onto the beginning
244 of the array. The array will grow automatically to accommodate
245 the addition.
246
247 Perl equivalent: "unshift @myarray, ((undef) x $num);"
248
249 void av_unshift(AV *av, SSize_t num)
250
251 get_av Returns the AV of the specified Perl global or package array
252 with the given name (so it won't work on lexical variables).
253 "flags" are passed to "gv_fetchpv". If "GV_ADD" is set and the
254 Perl variable does not exist then it will be created. If
255 "flags" is zero and the variable does not exist then NULL is
256 returned.
257
258 Perl equivalent: "@{"$name"}".
259
260 NOTE: the perl_ form of this function is deprecated.
261
262 AV* get_av(const char *name, I32 flags)
263
264 newAV Creates a new AV. The reference count is set to 1.
265
266 Perl equivalent: "my @array;".
267
268 AV* newAV()
269
270 sortsv In-place sort an array of SV pointers with the given comparison
271 routine.
272
273 Currently this always uses mergesort. See "sortsv_flags" for a
274 more flexible routine.
275
276 void sortsv(SV** array, size_t num_elts,
277 SVCOMPARE_t cmp)
278
280 call_argv
281 Performs a callback to the specified named and package-scoped
282 Perl subroutine with "argv" (a "NULL"-terminated array of
283 strings) as arguments. See perlcall.
284
285 Approximate Perl equivalent: "&{"$sub_name"}(@$argv)".
286
287 NOTE: the perl_ form of this function is deprecated.
288
289 I32 call_argv(const char* sub_name, I32 flags,
290 char** argv)
291
292 call_method
293 Performs a callback to the specified Perl method. The blessed
294 object must be on the stack. See perlcall.
295
296 NOTE: the perl_ form of this function is deprecated.
297
298 I32 call_method(const char* methname, I32 flags)
299
300 call_pv Performs a callback to the specified Perl sub. See perlcall.
301
302 NOTE: the perl_ form of this function is deprecated.
303
304 I32 call_pv(const char* sub_name, I32 flags)
305
306 call_sv Performs a callback to the Perl sub specified by the SV.
307
308 If neither the "G_METHOD" nor "G_METHOD_NAMED" flag is
309 supplied, the SV may be any of a CV, a GV, a reference to a CV,
310 a reference to a GV or "SvPV(sv)" will be used as the name of
311 the sub to call.
312
313 If the "G_METHOD" flag is supplied, the SV may be a reference
314 to a CV or "SvPV(sv)" will be used as the name of the method to
315 call.
316
317 If the "G_METHOD_NAMED" flag is supplied, "SvPV(sv)" will be
318 used as the name of the method to call.
319
320 Some other values are treated specially for internal use and
321 should not be depended on.
322
323 See perlcall.
324
325 NOTE: the perl_ form of this function is deprecated.
326
327 I32 call_sv(SV* sv, volatile I32 flags)
328
329 ENTER Opening bracket on a callback. See "LEAVE" and perlcall.
330
331 ENTER;
332
333 ENTER_with_name(name)
334 Same as "ENTER", but when debugging is enabled it also
335 associates the given literal string with the new scope.
336
337 ENTER_with_name(name);
338
339 eval_pv Tells Perl to "eval" the given string in scalar context and
340 return an SV* result.
341
342 NOTE: the perl_ form of this function is deprecated.
343
344 SV* eval_pv(const char* p, I32 croak_on_error)
345
346 eval_sv Tells Perl to "eval" the string in the SV. It supports the
347 same flags as "call_sv", with the obvious exception of
348 "G_EVAL". See perlcall.
349
350 NOTE: the perl_ form of this function is deprecated.
351
352 I32 eval_sv(SV* sv, I32 flags)
353
354 FREETMPS
355 Closing bracket for temporaries on a callback. See "SAVETMPS"
356 and perlcall.
357
358 FREETMPS;
359
360 LEAVE Closing bracket on a callback. See "ENTER" and perlcall.
361
362 LEAVE;
363
364 LEAVE_with_name(name)
365 Same as "LEAVE", but when debugging is enabled it first checks
366 that the scope has the given name. "name" must be a literal
367 string.
368
369 LEAVE_with_name(name);
370
371 SAVETMPS
372 Opening bracket for temporaries on a callback. See "FREETMPS"
373 and perlcall.
374
375 SAVETMPS;
376
378 Perl uses "full" Unicode case mappings. This means that converting a
379 single character to another case may result in a sequence of more than
380 one character. For example, the uppercase of "ß" (LATIN SMALL LETTER
381 SHARP S) is the two character sequence "SS". This presents some
382 complications. The lowercase of all characters in the range 0..255 is
383 a single character, and thus "toLOWER_L1" is furnished. But
384 "toUPPER_L1" can't exist, as it couldn't return a valid result for all
385 legal inputs. Instead "toUPPER_uvchr" has an API that does allow every
386 possible legal result to be returned. Likewise no other function that
387 is crippled by not being able to give the correct results for the full
388 range of possible inputs has been implemented here.
389
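The reason a byte-in, byte-out "toUPPER_L1" cannot exist can be seen in a standalone C sketch of uppercasing the Latin-1 range. It is a simplified model that assumes an ASCII platform and does not use the Perl API; model_upper_latin1 is a hypothetical name. The three special mappings shown (0xDF to "SS", 0xFF to U+0178, 0xB5 to U+039C) are the standard Unicode ones.

```c
#include <stddef.h>

/* Hypothetical model: uppercase one Latin-1 code point (0..255) into out[],
 * returning how many code points were written (1 or 2). */
static size_t model_upper_latin1(unsigned cp, unsigned out[2]) {
    if (cp == 0xDF) {                    /* SHARP S becomes two characters */
        out[0] = 'S';
        out[1] = 'S';
        return 2;
    }
    if (cp >= 'a' && cp <= 'z') {        /* ASCII lowercase letters */
        out[0] = cp - ('a' - 'A');
        return 1;
    }
    if (cp >= 0xE0 && cp <= 0xFE && cp != 0xF7) {
        out[0] = cp - 0x20;              /* accented letters; 0xF7 is DIVISION SIGN */
        return 1;
    }
    if (cp == 0xFF) {                    /* result lies above 255 */
        out[0] = 0x178;
        return 1;
    }
    if (cp == 0xB5) {                    /* MICRO SIGN: result lies above 255 */
        out[0] = 0x39C;
        return 1;
    }
    out[0] = cp;                         /* everything else is unchanged */
    return 1;
}
```

A function returning a single U8 could represent neither the two-character result nor the results above 255, whereas every lowercase result for inputs 0..255 fits in one byte.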
390 toFOLD Converts the specified character to foldcase. If the input is
391 anything but an ASCII uppercase character, that input character
392 itself is returned. Variant "toFOLD_A" is equivalent. (There
393 is no equivalent "toFOLD_L1" for the full Latin1 range, as the
394 full generality of "toFOLD_uvchr" is needed there.)
395
396 U8 toFOLD(U8 ch)
397
398 toFOLD_utf8
399 This is like "toFOLD_utf8_safe", but doesn't have the "e"
400 parameter. The function therefore can't check if it is reading
401 beyond the end of the string. Starting in Perl v5.32, it will
402 take the "e" parameter, becoming a synonym for
403 "toFOLD_utf8_safe". At that time every program that uses it
404 will have to be changed to successfully compile. In the
405 meantime, the first runtime call to "toFOLD_utf8" from each
406 call point in the program will raise a deprecation warning,
407 enabled by default. You can convert your program now to use
408 "toFOLD_utf8_safe", and avoid the warnings, and get an extra
409 measure of protection, or you can wait until v5.32, when you'll
410 be forced to add the "e" parameter.
411
412 UV toFOLD_utf8(U8* p, U8* s, STRLEN* lenp)
413
414 toFOLD_utf8_safe
415 Converts the first UTF-8 encoded character in the sequence
416 starting at "p" and extending no further than "e - 1" to its
417 foldcase version, and stores that in UTF-8 in "s", and its
418 length in bytes in "lenp". Note that the buffer pointed to by
419 "s" needs to be at least "UTF8_MAXBYTES_CASE+1" bytes since the
420 foldcase version may be longer than the original character.
421
422 The first code point of the foldcased version is returned (but
423 note, as explained at the top of this section, that there may
424 be more).
425
426 The suffix "_safe" in the function's name indicates that it
427 will not attempt to read beyond "e - 1", provided that the
428 constraint "p < e" is true (this is asserted in
429 "-DDEBUGGING" builds). If the UTF-8 for the input character is
430 malformed in some way, the program may croak, or the function
431 may return the REPLACEMENT CHARACTER, at the discretion of the
432 implementation, and subject to change in future releases.
433
434 UV toFOLD_utf8_safe(U8* p, U8* e, U8* s,
435 STRLEN* lenp)
436
437 toFOLD_uvchr
438 Converts the code point "cp" to its foldcase version, and
439 stores that in UTF-8 in "s", and its length in bytes in "lenp".
440 The code point is interpreted as native if less than 256;
441 otherwise as Unicode. Note that the buffer pointed to by "s"
442 needs to be at least "UTF8_MAXBYTES_CASE+1" bytes since the
443 foldcase version may be longer than the original character.
444
445 The first code point of the foldcased version is returned (but
446 note, as explained at the top of this section, that there may
447 be more).
448
449 UV toFOLD_uvchr(UV cp, U8* s, STRLEN* lenp)
450
451 toLOWER Converts the specified character to lowercase. If the input is
452 anything but an ASCII uppercase character, that input character
453 itself is returned. Variant "toLOWER_A" is equivalent.
454
455 U8 toLOWER(U8 ch)
456
457 toLOWER_L1
458 Converts the specified Latin1 character to lowercase. The
459 results are undefined if the input doesn't fit in a byte.
460
461 U8 toLOWER_L1(U8 ch)
462
463 toLOWER_LC
464 Converts the specified character to lowercase using the current
465 locale's rules, if possible; otherwise returns the input
466 character itself.
467
468 U8 toLOWER_LC(U8 ch)
469
470 toLOWER_utf8
471 This is like "toLOWER_utf8_safe", but doesn't have the "e"
472 parameter. The function therefore can't check if it is reading
473 beyond the end of the string. Starting in Perl v5.32, it will
474 take the "e" parameter, becoming a synonym for
475 "toLOWER_utf8_safe". At that time every program that uses it
476 will have to be changed to successfully compile. In the
477 meantime, the first runtime call to "toLOWER_utf8" from each
478 call point in the program will raise a deprecation warning,
479 enabled by default. You can convert your program now to use
480 "toLOWER_utf8_safe", and avoid the warnings, and get an extra
481 measure of protection, or you can wait until v5.32, when you'll
482 be forced to add the "e" parameter.
483
484 UV toLOWER_utf8(U8* p, U8* s, STRLEN* lenp)
485
486 toLOWER_utf8_safe
487 Converts the first UTF-8 encoded character in the sequence
488 starting at "p" and extending no further than "e - 1" to its
489 lowercase version, and stores that in UTF-8 in "s", and its
490 length in bytes in "lenp". Note that the buffer pointed to by
491 "s" needs to be at least "UTF8_MAXBYTES_CASE+1" bytes since the
492 lowercase version may be longer than the original character.
493
494 The first code point of the lowercased version is returned (but
495 note, as explained at the top of this section, that there may
496 be more).
497
498 The suffix "_safe" in the function's name indicates that it
499 will not attempt to read beyond "e - 1", provided that the
500 constraint "p < e" is true (this is asserted in
501 "-DDEBUGGING" builds). If the UTF-8 for the input character is
502 malformed in some way, the program may croak, or the function
503 may return the REPLACEMENT CHARACTER, at the discretion of the
504 implementation, and subject to change in future releases.
505
506 UV toLOWER_utf8_safe(U8* p, U8* e, U8* s,
507 STRLEN* lenp)
508
509 toLOWER_uvchr
510 Converts the code point "cp" to its lowercase version, and
511 stores that in UTF-8 in "s", and its length in bytes in "lenp".
512 The code point is interpreted as native if less than 256;
513 otherwise as Unicode. Note that the buffer pointed to by "s"
514 needs to be at least "UTF8_MAXBYTES_CASE+1" bytes since the
515 lowercase version may be longer than the original character.
516
517 The first code point of the lowercased version is returned (but
518 note, as explained at the top of this section, that there may
519 be more).
520
521 UV toLOWER_uvchr(UV cp, U8* s, STRLEN* lenp)
522
523 toTITLE Converts the specified character to titlecase. If the input is
524 anything but an ASCII lowercase character, that input character
525 itself is returned. Variant "toTITLE_A" is equivalent. (There
526 is no "toTITLE_L1" for the full Latin1 range, as the full
527 generality of "toTITLE_uvchr" is needed there. Titlecase is
528 not a concept used in locale handling, so there is no
529 functionality for that.)
530
531 U8 toTITLE(U8 ch)
532
533 toTITLE_utf8
534 This is like "toTITLE_utf8_safe", but doesn't have the "e"
535 parameter. The function therefore can't check if it is reading
536 beyond the end of the string. Starting in Perl v5.32, it will
537 take the "e" parameter, becoming a synonym for
538 "toTITLE_utf8_safe". At that time every program that uses it
539 will have to be changed to successfully compile. In the
540 meantime, the first runtime call to "toTITLE_utf8" from each
541 call point in the program will raise a deprecation warning,
542 enabled by default. You can convert your program now to use
543 "toTITLE_utf8_safe", and avoid the warnings, and get an extra
544 measure of protection, or you can wait until v5.32, when you'll
545 be forced to add the "e" parameter.
546
547 UV toTITLE_utf8(U8* p, U8* s, STRLEN* lenp)
548
549 toTITLE_utf8_safe
550 Converts the first UTF-8 encoded character in the sequence
551 starting at "p" and extending no further than "e - 1" to its
552 titlecase version, and stores that in UTF-8 in "s", and its
553 length in bytes in "lenp". Note that the buffer pointed to by
554 "s" needs to be at least "UTF8_MAXBYTES_CASE+1" bytes since the
555 titlecase version may be longer than the original character.
556
557 The first code point of the titlecased version is returned (but
558 note, as explained at the top of this section, that there may
559 be more).
560
561 The suffix "_safe" in the function's name indicates that it
562 will not attempt to read beyond "e - 1", provided that the
563 constraint "p < e" is true (this is asserted in
564 "-DDEBUGGING" builds). If the UTF-8 for the input character is
565 malformed in some way, the program may croak, or the function
566 may return the REPLACEMENT CHARACTER, at the discretion of the
567 implementation, and subject to change in future releases.
568
569 UV toTITLE_utf8_safe(U8* p, U8* e, U8* s,
570 STRLEN* lenp)
571
572 toTITLE_uvchr
573 Converts the code point "cp" to its titlecase version, and
574 stores that in UTF-8 in "s", and its length in bytes in "lenp".
575 The code point is interpreted as native if less than 256;
576 otherwise as Unicode. Note that the buffer pointed to by "s"
577 needs to be at least "UTF8_MAXBYTES_CASE+1" bytes since the
578 titlecase version may be longer than the original character.
579
580 The first code point of the titlecased version is returned (but
581 note, as explained at the top of this section, that there may
582 be more).
583
584 UV toTITLE_uvchr(UV cp, U8* s, STRLEN* lenp)
585
586 toUPPER Converts the specified character to uppercase. If the input is
587 anything but an ASCII lowercase character, that input character
588 itself is returned. Variant "toUPPER_A" is equivalent.
589
590 U8 toUPPER(U8 ch)
591
592 toUPPER_utf8
593 This is like "toUPPER_utf8_safe", but doesn't have the "e"
594 parameter. The function therefore can't check if it is reading
595 beyond the end of the string. Starting in Perl v5.32, it will
596 take the "e" parameter, becoming a synonym for
597 "toUPPER_utf8_safe". At that time every program that uses it
598 will have to be changed to successfully compile. In the
599 meantime, the first runtime call to "toUPPER_utf8" from each
600 call point in the program will raise a deprecation warning,
601 enabled by default. You can convert your program now to use
602 "toUPPER_utf8_safe", and avoid the warnings, and get an extra
603 measure of protection, or you can wait until v5.32, when you'll
604 be forced to add the "e" parameter.
605
606 UV toUPPER_utf8(U8* p, U8* s, STRLEN* lenp)
607
608 toUPPER_utf8_safe
609 Converts the first UTF-8 encoded character in the sequence
610 starting at "p" and extending no further than "e - 1" to its
611 uppercase version, and stores that in UTF-8 in "s", and its
612 length in bytes in "lenp". Note that the buffer pointed to by
613 "s" needs to be at least "UTF8_MAXBYTES_CASE+1" bytes since the
614 uppercase version may be longer than the original character.
615
616 The first code point of the uppercased version is returned (but
617 note, as explained at the top of this section, that there may
618 be more).
619
620 The suffix "_safe" in the function's name indicates that it
621 will not attempt to read beyond "e - 1", provided that the
622 constraint "p < e" is true (this is asserted in
623 "-DDEBUGGING" builds). If the UTF-8 for the input character is
624 malformed in some way, the program may croak, or the function
625 may return the REPLACEMENT CHARACTER, at the discretion of the
626 implementation, and subject to change in future releases.
627
628 UV toUPPER_utf8_safe(U8* p, U8* e, U8* s,
629 STRLEN* lenp)
630
631 toUPPER_uvchr
632 Converts the code point "cp" to its uppercase version, and
633 stores that in UTF-8 in "s", and its length in bytes in "lenp".
634 The code point is interpreted as native if less than 256;
635 otherwise as Unicode. Note that the buffer pointed to by "s"
636 needs to be at least "UTF8_MAXBYTES_CASE+1" bytes since the
637 uppercase version may be longer than the original character.
638
639 The first code point of the uppercased version is returned (but
640 note, as explained at the top of this section, that there may
641 be more).
642
643 UV toUPPER_uvchr(UV cp, U8* s, STRLEN* lenp)
644
646 This section is about functions (really macros) that classify
647 characters into types, such as punctuation versus alphabetic, etc.
648 Most of these are analogous to regular expression character classes.
649 (See "POSIX Character Classes" in perlrecharclass.) There are several
650 variants for each class. (Not all macros have all variants; each item
651 below lists the ones valid for it.) None are affected by "use bytes",
652 and only the ones with "LC" in the name are affected by the current
653 locale.
654
655 The base function, e.g., "isALPHA()", takes an octet (either a "char"
656 or a "U8") as input and returns a boolean as to whether or not the
657 character represented by that octet is (or on non-ASCII platforms,
658 corresponds to) an ASCII character in the named class based on
659 platform, Unicode, and Perl rules. If the input is a number that
660 doesn't fit in an octet, FALSE is returned.
661
662 Variant "isFOO_A" (e.g., "isALPHA_A()") is identical to the base
663 function with no suffix "_A". This variant is used to emphasize by its
664 name that only ASCII-range characters can return TRUE.
665
666 Variant "isFOO_L1" imposes the Latin-1 (or EBCDIC equivalent) character
667 set onto the platform. That is, the code points that are ASCII are
668 unaffected, since ASCII is a subset of Latin-1. But the non-ASCII code
669 points are treated as if they are Latin-1 characters. For example,
670 "isWORDCHAR_L1()" will return true when called with the code point
671 0xDF, which is a word character in both ASCII and EBCDIC (though it
672 represents different characters in each).
673
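The difference between the "_A" and "_L1" variants can be sketched in standalone C for the alphabetic class. This is a model under the assumption of an ASCII platform, not the macros themselves; the function names are hypothetical.

```c
/* Model of an "_A" style test: only ASCII-range letters can match. */
static int model_isalpha_a(unsigned long cp) {
    if (cp > 127)
        return 0;
    return (cp >= 'A' && cp <= 'Z') || (cp >= 'a' && cp <= 'z');
}

/* Model of an "_L1" style test: code points 128..255 are read as Latin-1. */
static int model_isalpha_l1(unsigned long cp) {
    if (cp > 255)
        return 0;                         /* doesn't fit in an octet */
    if (cp <= 127)
        return model_isalpha_a(cp);       /* ASCII answers are unchanged */
    if (cp == 0xAA || cp == 0xB5 || cp == 0xBA)
        return 1;                         /* ordinal indicators, MICRO SIGN */
    /* 0xC0..0xFF are letters except MULTIPLICATION SIGN and DIVISION SIGN */
    return cp >= 0xC0 && cp != 0xD7 && cp != 0xF7;
}
```

Both models agree on every ASCII input; they differ only for code points 128 through 255, for example 0xE9 (LATIN SMALL LETTER E WITH ACUTE).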
674 Variant "isFOO_uvchr" is like the "isFOO_L1" variant, but accepts any
675 UV code point as input. If the code point is larger than 255, Unicode
676 rules are used to determine if it is in the character class. For
677 example, "isWORDCHAR_uvchr(0x100)" returns TRUE, since 0x100 is LATIN
678 CAPITAL LETTER A WITH MACRON in Unicode, and is a word character.
679
680 Variant "isFOO_utf8_safe" is like "isFOO_uvchr", but is used for UTF-8
681 encoded strings. Each call classifies one character, even if the
682 string contains many. This variant takes two parameters. The first,
683 "p", is a pointer to the first byte of the character to be classified.
684 (Recall that it may take more than one byte to represent a character in
685 UTF-8 strings.) The second parameter, "e", points to anywhere in the
686 string beyond the first character, up to one byte past the end of the
687 entire string. The suffix "_safe" in the function's name indicates
688 that it will not attempt to read beyond "e - 1", provided that the
689 constraint "p < e" is true (this is asserted in "-DDEBUGGING"
690 builds). If the UTF-8 for the input character is malformed in some
691 way, the program may croak, or the function may return FALSE, at the
692 discretion of the implementation, and subject to change in future
693 releases.
694
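What the end pointer "e" buys a "_safe" variant can be sketched in standalone C: before any continuation byte is read, the code checks that the whole character lies before "e". This is a structural model of standard UTF-8 only (it does not detect overlong forms and ignores Perl's extended UTF-8 and UTF-EBCDIC), and the function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Expected length of a UTF-8 sequence from its first byte;
 * 0 if the byte cannot start a character. */
static size_t model_utf8_len(unsigned char first) {
    if (first < 0x80) return 1;
    if (first >= 0xC2 && first <= 0xDF) return 2;
    if (first >= 0xE0 && first <= 0xEF) return 3;
    if (first >= 0xF0 && first <= 0xF4) return 4;
    return 0;
}

/* Returns 1 if the character starting at p is complete before e, else 0.
 * Never reads a byte at or beyond e. */
static int model_char_fits(const unsigned char *p, const unsigned char *e) {
    size_t need, i;
    assert(p < e);                        /* the start must lie before e */
    need = model_utf8_len(*p);
    if (need == 0 || (size_t)(e - p) < need)
        return 0;                         /* malformed start or truncated */
    for (i = 1; i < need; i++)
        if ((p[i] & 0xC0) != 0x80)
            return 0;                     /* not a continuation byte */
    return 1;
}
```

Without "e", the loop over continuation bytes would have no way to know whether p[i] is still inside the string, which is the risk the single-parameter variants carry.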
695 Variant "isFOO_utf8" is like "isFOO_utf8_safe", but takes just a single
696 parameter, "p", which has the same meaning as the corresponding
697 parameter does in "isFOO_utf8_safe". The function therefore can't
698 check if it is reading beyond the end of the string. Starting in Perl
699 v5.32, it will take a second parameter, becoming a synonym for
700 "isFOO_utf8_safe". At that time every program that uses it will have
701 to be changed to successfully compile. In the meantime, the first
702 runtime call to "isFOO_utf8" from each call point in the program will
703 raise a deprecation warning, enabled by default. You can convert your
704 program now to use "isFOO_utf8_safe", and avoid the warnings, and get
705 an extra measure of protection, or you can wait until v5.32, when
706 you'll be forced to add the "e" parameter.
707
708 Variant "isFOO_LC" is like the "isFOO_A" and "isFOO_L1" variants, but
709 the result is based on the current locale, which is what "LC" in the
710 name stands for. If Perl can determine that the current locale is a
711 UTF-8 locale, it uses the published Unicode rules; otherwise, it uses
712 the C library function that gives the named classification. For
713 example, "isDIGIT_LC()" when not in a UTF-8 locale returns the result
714 of calling "isdigit()". FALSE is always returned if the input won't
715 fit into an octet. On some platforms where the C library function is
716 known to be defective, Perl changes its result to follow the POSIX
717 standard's rules.
718
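The non-UTF-8-locale behaviour described for "isDIGIT_LC()" can be approximated in standalone C with the standard library's isdigit(). This is a sketch of the described rule, not the macro's definition; model_isdigit_lc is a hypothetical name, and the UTF-8-locale path and platform workarounds are not modelled.

```c
#include <ctype.h>

/* Hypothetical model: FALSE if the input won't fit in an octet,
 * otherwise defer to the C library under the current locale. */
static int model_isdigit_lc(unsigned long cp) {
    if (cp > 0xFF)
        return 0;
    /* isdigit() requires a value representable as unsigned char (or EOF) */
    return isdigit((int)(unsigned char)cp) != 0;
}
```

The cast through unsigned char matters: passing a negative plain char to isdigit() is undefined behaviour in C.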
719 Variant "isFOO_LC_uvchr" is like "isFOO_LC", but is defined on any UV.
720 It returns the same as "isFOO_LC" for input code points less than 256,
721 and returns the hard-coded, not-affected-by-locale, Unicode results for
722 larger ones.
723
724 Variant "isFOO_LC_utf8_safe" is like "isFOO_LC_uvchr", but is used for
725 UTF-8 encoded strings. Each call classifies one character, even if the
726 string contains many. This variant takes two parameters. The first,
727 "p", is a pointer to the first byte of the character to be classified.
728 (Recall that it may take more than one byte to represent a character in
729 UTF-8 strings.) The second parameter, "e", points to anywhere in the
730 string beyond the first character, up to one byte past the end of the
731 entire string. The suffix "_safe" in the function's name indicates
732 that it will not attempt to read beyond "e - 1", provided that the
733 constraint "p < e" is true (this is asserted in "-DDEBUGGING"
734 builds). If the UTF-8 for the input character is malformed in some
735 way, the program may croak, or the function may return FALSE, at the
736 discretion of the implementation, and subject to change in future
737 releases.
738
739 Variant "isFOO_LC_utf8" is like "isFOO_LC_utf8_safe", but takes just a
740 single parameter, "p", which has the same meaning as the corresponding
741 parameter does in "isFOO_LC_utf8_safe". The function therefore can't
742 check if it is reading beyond the end of the string. Starting in Perl
743 v5.32, it will take a second parameter, becoming a synonym for
744 "isFOO_LC_utf8_safe". At that time every program that uses it will
745 have to be changed to successfully compile. In the meantime, the first
746 runtime call to "isFOO_LC_utf8" from each call point in the program
747 will raise a deprecation warning, enabled by default. You can convert
748 your program now to use "isFOO_LC_utf8_safe", and avoid the warnings,
749 and get an extra measure of protection, or you can wait until v5.32,
750 when you'll be forced to add the "e" parameter.
751
752 isALPHA Returns a boolean indicating whether the specified character is
753 an alphabetic character, analogous to "m/[[:alpha:]]/". See
754 the top of this section for an explanation of variants
755 "isALPHA_A", "isALPHA_L1", "isALPHA_uvchr",
756 "isALPHA_utf8_safe", "isALPHA_LC", "isALPHA_LC_uvchr", and
757 "isALPHA_LC_utf8_safe".
758
759 bool isALPHA(char ch)
760
761 isALPHANUMERIC
762 Returns a boolean indicating whether the specified character is
763 either an alphabetic character or a decimal digit, analogous to
764 "m/[[:alnum:]]/". See the top of this section for an
765 explanation of variants "isALPHANUMERIC_A",
766 "isALPHANUMERIC_L1", "isALPHANUMERIC_uvchr",
767 "isALPHANUMERIC_utf8_safe", "isALPHANUMERIC_LC",
768 "isALPHANUMERIC_LC_uvchr", and "isALPHANUMERIC_LC_utf8_safe".
769
770 bool isALPHANUMERIC(char ch)
771
772 isASCII Returns a boolean indicating whether the specified character is
773 one of the 128 characters in the ASCII character set, analogous
774 to "m/[[:ascii:]]/". On non-ASCII platforms, it returns TRUE
775 iff this character corresponds to an ASCII character. Variants
776 "isASCII_A()" and "isASCII_L1()" are identical to "isASCII()".
777 See the top of this section for an explanation of variants
778 "isASCII_uvchr", "isASCII_utf8_safe", "isASCII_LC",
779 "isASCII_LC_uvchr", and "isASCII_LC_utf8_safe". Note, however,
780 that some platforms do not have the C library routine
781 "isascii()". In these cases, the variants whose names contain
782 "LC" are the same as the corresponding ones without.
783
784 Also note that, because all ASCII characters are UTF-8
785 invariant (meaning they have the exact same representation
786 (always a single byte) whether encoded in UTF-8 or not),
787 "isASCII" will give the correct results when called with any
788 byte in any string encoded or not in UTF-8. And similarly
789 "isASCII_utf8_safe" will work properly on any string encoded or
790 not in UTF-8.
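
The invariance argument above can be illustrated without the Perl API. In this standalone sketch, "count_ascii_bytes" is a hypothetical helper and an ASCII (not EBCDIC) platform is assumed; the same byte-wise loop gives the right answer for a UTF-8 buffer and for a single-byte (Latin-1) buffer.

```c
#include <stddef.h>

/* Counts ASCII characters in a byte buffer. Every byte below 0x80 is a
 * complete ASCII character whether or not the buffer is UTF-8, because
 * all bytes of a multi-byte UTF-8 character are 0x80 or above. */
static size_t count_ascii_bytes(const unsigned char *s, size_t len)
{
    size_t n = 0;
    size_t i;

    for (i = 0; i < len; i++)
        if (s[i] < 0x80)
            n++;
    return n;
}
```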
791
792 bool isASCII(char ch)
793
794 isBLANK Returns a boolean indicating whether the specified character is
795 a character considered to be a blank, analogous to
796 "m/[[:blank:]]/". See the top of this section for an
797 explanation of variants "isBLANK_A", "isBLANK_L1",
798 "isBLANK_uvchr", "isBLANK_utf8_safe", "isBLANK_LC",
799 "isBLANK_LC_uvchr", and "isBLANK_LC_utf8_safe". Note, however,
800 that some platforms do not have the C library routine
801 "isblank()". In these cases, the variants whose names contain
802 "LC" are the same as the corresponding ones without.
803
804 bool isBLANK(char ch)
805
806 isCNTRL Returns a boolean indicating whether the specified character is
807 a control character, analogous to "m/[[:cntrl:]]/". See the
808 top of this section for an explanation of variants "isCNTRL_A",
809 "isCNTRL_L1", "isCNTRL_uvchr", "isCNTRL_utf8_safe",
810 "isCNTRL_LC", "isCNTRL_LC_uvchr", and "isCNTRL_LC_utf8_safe". On
811 EBCDIC platforms, you almost always want to use the
812 "isCNTRL_L1" variant.
813
814 bool isCNTRL(char ch)
815
816 isDIGIT Returns a boolean indicating whether the specified character is
817 a digit, analogous to "m/[[:digit:]]/". Variants "isDIGIT_A"
818 and "isDIGIT_L1" are identical to "isDIGIT". See the top of
819 this section for an explanation of variants "isDIGIT_uvchr",
820 "isDIGIT_utf8_safe", "isDIGIT_LC", "isDIGIT_LC_uvchr", and
821 "isDIGIT_LC_utf8_safe".
822
823 bool isDIGIT(char ch)
824
825 isGRAPH Returns a boolean indicating whether the specified character is
826 a graphic character, analogous to "m/[[:graph:]]/". See the
827 top of this section for an explanation of variants "isGRAPH_A",
828 "isGRAPH_L1", "isGRAPH_uvchr", "isGRAPH_utf8_safe",
829 "isGRAPH_LC", "isGRAPH_LC_uvchr", and "isGRAPH_LC_utf8_safe".
830
831 bool isGRAPH(char ch)
832
833 isIDCONT
834 Returns a boolean indicating whether the specified character
835 can be the second or succeeding character of an identifier.
836 This is very close to, but not quite the same as the official
837 Unicode property "XID_Continue". The difference is that this
838 returns true only if the input character also matches
839 "isWORDCHAR". See the top of this section for an explanation
840 of variants "isIDCONT_A", "isIDCONT_L1", "isIDCONT_uvchr",
841 "isIDCONT_utf8_safe", "isIDCONT_LC", "isIDCONT_LC_uvchr", and
842 "isIDCONT_LC_utf8_safe".
843
844 bool isIDCONT(char ch)
845
846 isIDFIRST
847 Returns a boolean indicating whether the specified character
848 can be the first character of an identifier. This is very
849 close to, but not quite the same as the official Unicode
850 property "XID_Start". The difference is that this returns true
851 only if the input character also matches "isWORDCHAR". See the
852 top of this section for an explanation of variants
853 "isIDFIRST_A", "isIDFIRST_L1", "isIDFIRST_uvchr",
854 "isIDFIRST_utf8_safe", "isIDFIRST_LC", "isIDFIRST_LC_uvchr",
855 and "isIDFIRST_LC_utf8_safe".
856
857 bool isIDFIRST(char ch)
858
859 isLOWER Returns a boolean indicating whether the specified character is
860 a lowercase character, analogous to "m/[[:lower:]]/". See the
861 top of this section for an explanation of variants "isLOWER_A",
862 "isLOWER_L1", "isLOWER_uvchr", "isLOWER_utf8_safe",
863 "isLOWER_LC", "isLOWER_LC_uvchr", and "isLOWER_LC_utf8_safe".
864
865 bool isLOWER(char ch)
866
867 isOCTAL Returns a boolean indicating whether the specified character is
868 an octal digit, [0-7]. The only two variants are "isOCTAL_A"
869 and "isOCTAL_L1"; each is identical to "isOCTAL".
870
871 bool isOCTAL(char ch)
872
873 isPRINT Returns a boolean indicating whether the specified character is
874 a printable character, analogous to "m/[[:print:]]/". See the
875 top of this section for an explanation of variants "isPRINT_A",
876 "isPRINT_L1", "isPRINT_uvchr", "isPRINT_utf8_safe",
877 "isPRINT_LC", "isPRINT_LC_uvchr", and "isPRINT_LC_utf8_safe".
878
879 bool isPRINT(char ch)
880
881 isPSXSPC
882 (short for Posix Space) Starting in 5.18, this is identical in
883 all its forms to the corresponding "isSPACE()" macros. The
884 locale forms of this macro are identical to their corresponding
885 "isSPACE()" forms in all Perl releases. In releases prior to
886 5.18, the non-locale forms differ from their "isSPACE()" forms
887 only in that the "isSPACE()" forms don't match a Vertical Tab,
888 and the "isPSXSPC()" forms do. Otherwise they are identical.
889 Thus this macro is analogous to what "m/[[:space:]]/" matches
890 in a regular expression. See the top of this section for an
891 explanation of variants "isPSXSPC_A", "isPSXSPC_L1",
892 "isPSXSPC_uvchr", "isPSXSPC_utf8_safe", "isPSXSPC_LC",
893 "isPSXSPC_LC_uvchr", and "isPSXSPC_LC_utf8_safe".
894
895 bool isPSXSPC(char ch)
896
897 isPUNCT Returns a boolean indicating whether the specified character is
898 a punctuation character, analogous to "m/[[:punct:]]/". Note
899 that the definition of what is punctuation isn't as
900 straightforward as one might desire. See "POSIX Character
901 Classes" in perlrecharclass for details. See the top of this
902 section for an explanation of variants "isPUNCT_A",
903 "isPUNCT_L1", "isPUNCT_uvchr", "isPUNCT_utf8_safe",
904 "isPUNCT_LC", "isPUNCT_LC_uvchr", and "isPUNCT_LC_utf8_safe".
905
906 bool isPUNCT(char ch)
907
908 isSPACE Returns a boolean indicating whether the specified character is
909 a whitespace character. This is analogous to what "m/\s/"
910 matches in a regular expression. Starting in Perl 5.18 this
911 also matches what "m/[[:space:]]/" does. Prior to 5.18, only
912 the locale forms of this macro (the ones with "LC" in their
913 names) matched precisely what "m/[[:space:]]/" does. In those
914 releases, the only difference, in the non-locale variants, was
915 that "isSPACE()" did not match a vertical tab. (See "isPSXSPC"
916 for a macro that matches a vertical tab in all releases.) See
917 the top of this section for an explanation of variants
918 "isSPACE_A", "isSPACE_L1", "isSPACE_uvchr",
919 "isSPACE_utf8_safe", "isSPACE_LC", "isSPACE_LC_uvchr", and
920 "isSPACE_LC_utf8_safe".
921
922 bool isSPACE(char ch)
923
924 isUPPER Returns a boolean indicating whether the specified character is
925 an uppercase character, analogous to "m/[[:upper:]]/". See the
926 top of this section for an explanation of variants "isUPPER_A",
927 "isUPPER_L1", "isUPPER_uvchr", "isUPPER_utf8_safe",
928 "isUPPER_LC", "isUPPER_LC_uvchr", and "isUPPER_LC_utf8_safe".
929
930 bool isUPPER(char ch)
931
932 isWORDCHAR
933 Returns a boolean indicating whether the specified character is
934 a character that is a word character, analogous to what "m/\w/"
935 and "m/[[:word:]]/" match in a regular expression. A word
936 character is an alphabetic character, a decimal digit, a
937 connecting punctuation character (such as an underscore), or a
938 "mark" character that attaches to one of those (like some sort
939 of accent). "isALNUM()" is a synonym provided for backward
940 compatibility, even though a word character includes more than
941 the standard C language meaning of alphanumeric. See the top
942 of this section for an explanation of variants "isWORDCHAR_A",
943 "isWORDCHAR_L1", "isWORDCHAR_uvchr", and
944 "isWORDCHAR_utf8_safe". "isWORDCHAR_LC",
945 "isWORDCHAR_LC_uvchr", and "isWORDCHAR_LC_utf8_safe" are also
946 as described there, but additionally include the platform's
947 native underscore.
948
949 bool isWORDCHAR(char ch)
950
951 isXDIGIT
952 Returns a boolean indicating whether the specified character is
953 a hexadecimal digit. In the ASCII range these are
954 "[0-9A-Fa-f]". Variants "isXDIGIT_A()" and "isXDIGIT_L1()" are
955 identical to "isXDIGIT()". See the top of this section for an
956 explanation of variants "isXDIGIT_uvchr", "isXDIGIT_utf8_safe",
957 "isXDIGIT_LC", "isXDIGIT_LC_uvchr", and
958 "isXDIGIT_LC_utf8_safe".
959
960 bool isXDIGIT(char ch)
961
963 perl_clone
964 Create and return a new interpreter by cloning the current one.
965
966 "perl_clone" takes these flags as parameters:
967
968 "CLONEf_COPY_STACKS" - copies the stacks as well. Without it,
969 only the data is cloned and the stacks are zeroed; with it,
970 the stacks are copied and the new perl interpreter is ready to run
971 at the exact same point as the previous one. The pseudo-fork
972 code uses "COPY_STACKS" while threads->create doesn't.
973
974 "CLONEf_KEEP_PTR_TABLE" - "perl_clone" keeps a ptr_table with
975 the pointer of the old variable as a key and the new variable
976 as a value. This allows it to check whether something has
977 already been cloned and, if so, to reuse the value and increase
978 its refcount rather than clone it again. If "KEEP_PTR_TABLE" is
979 not set then "perl_clone" will free the ptr_table using
980 "ptr_table_free(PL_ptr_table); PL_ptr_table = NULL;". A reason to
981 keep it around is if you want to dup some of your own variables
982 that are outside the graph perl scans; an example of this code
983 is in threads.xs create.
984
985 "CLONEf_CLONE_HOST" - This is a win32 thing; it is ignored on
986 unix. It tells perl's win32host code (which is C++) to clone
987 itself. This is needed on win32 if you want to run two threads
988 at the same time. If you just want to do some stuff in a
989 separate perl interpreter and then throw it away and return to
990 the original one, you don't need to do anything.
991
992 PerlInterpreter* perl_clone(
993 PerlInterpreter *proto_perl,
994 UV flags
995 )
996
998 BhkDISABLE
999 NOTE: this function is experimental and may change or be
1000 removed without notice.
1001
1002 Temporarily disable an entry in this BHK structure, by clearing
1003 the appropriate flag. "which" is a preprocessor token
1004 indicating which entry to disable.
1005
1006 void BhkDISABLE(BHK *hk, which)
1007
1008 BhkENABLE
1009 NOTE: this function is experimental and may change or be
1010 removed without notice.
1011
1012 Re-enable an entry in this BHK structure, by setting the
1013 appropriate flag. "which" is a preprocessor token indicating
1014 which entry to enable. This will assert (under -DDEBUGGING) if
1015 the entry doesn't contain a valid pointer.
1016
1017 void BhkENABLE(BHK *hk, which)
1018
1019 BhkENTRY_set
1020 NOTE: this function is experimental and may change or be
1021 removed without notice.
1022
1023 Set an entry in the BHK structure, and set the flags to
1024 indicate it is valid. "which" is a preprocessing token
1025 indicating which entry to set. The type of "ptr" depends on
1026 the entry.
1027
1028 void BhkENTRY_set(BHK *hk, which, void *ptr)
1029
1030 blockhook_register
1031 NOTE: this function is experimental and may change or be
1032 removed without notice.
1033
1034 Register a set of hooks to be called when the Perl lexical
1035 scope changes at compile time. See "Compile-time scope hooks"
1036 in perlguts.
1037
1038 NOTE: this function must be explicitly called as
1039 Perl_blockhook_register with an aTHX_ parameter.
1040
1041 void Perl_blockhook_register(pTHX_ BHK *hk)
1042
1044 cophh_2hv
1045 NOTE: this function is experimental and may change or be
1046 removed without notice.
1047
1048 Generates and returns a standard Perl hash representing the
1049 full set of key/value pairs in the cop hints hash "cophh".
1050 "flags" is currently unused and must be zero.
1051
1052 HV * cophh_2hv(const COPHH *cophh, U32 flags)
1053
1054 cophh_copy
1055 NOTE: this function is experimental and may change or be
1056 removed without notice.
1057
1058 Make and return a complete copy of the cop hints hash "cophh".
1059
1060 COPHH * cophh_copy(COPHH *cophh)
1061
1062 cophh_delete_pv
1063 NOTE: this function is experimental and may change or be
1064 removed without notice.
1065
1066 Like "cophh_delete_pvn", but takes a nul-terminated string
1067 instead of a string/length pair.
1068
1069 COPHH * cophh_delete_pv(const COPHH *cophh,
1070 const char *key, U32 hash,
1071 U32 flags)
1072
1073 cophh_delete_pvn
1074 NOTE: this function is experimental and may change or be
1075 removed without notice.
1076
1077 Deletes a key and its associated value from the cop hints hash
1078 "cophh", and returns the modified hash. The returned hash
1079 pointer is in general not the same as the hash pointer that was
1080 passed in. The input hash is consumed by the function, and the
1081 pointer to it must not be subsequently used. Use "cophh_copy"
1082 if you need both hashes.
1083
1084 The key is specified by "keypv" and "keylen". If "flags" has
1085 the "COPHH_KEY_UTF8" bit set, the key octets are interpreted as
1086 UTF-8, otherwise they are interpreted as Latin-1. "hash" is a
1087 precomputed hash of the key string, or zero if it has not been
1088 precomputed.
1089
1090 COPHH * cophh_delete_pvn(COPHH *cophh,
1091 const char *keypv,
1092 STRLEN keylen, U32 hash,
1093 U32 flags)
1094
1095 cophh_delete_pvs
1096 NOTE: this function is experimental and may change or be
1097 removed without notice.
1098
1099 Like "cophh_delete_pvn", but takes a literal string instead of
1100 a string/length pair, and no precomputed hash.
1101
1102 COPHH * cophh_delete_pvs(const COPHH *cophh,
1103 "literal string" key,
1104 U32 flags)
1105
1106 cophh_delete_sv
1107 NOTE: this function is experimental and may change or be
1108 removed without notice.
1109
1110 Like "cophh_delete_pvn", but takes a Perl scalar instead of a
1111 string/length pair.
1112
1113 COPHH * cophh_delete_sv(const COPHH *cophh, SV *key,
1114 U32 hash, U32 flags)
1115
1116 cophh_fetch_pv
1117 NOTE: this function is experimental and may change or be
1118 removed without notice.
1119
1120 Like "cophh_fetch_pvn", but takes a nul-terminated string
1121 instead of a string/length pair.
1122
1123 SV * cophh_fetch_pv(const COPHH *cophh,
1124 const char *key, U32 hash,
1125 U32 flags)
1126
1127 cophh_fetch_pvn
1128 NOTE: this function is experimental and may change or be
1129 removed without notice.
1130
1131 Look up the entry in the cop hints hash "cophh" with the key
1132 specified by "keypv" and "keylen". If "flags" has the
1133 "COPHH_KEY_UTF8" bit set, the key octets are interpreted as
1134 UTF-8, otherwise they are interpreted as Latin-1. "hash" is a
1135 precomputed hash of the key string, or zero if it has not been
1136 precomputed. Returns a mortal scalar copy of the value
1137 associated with the key, or &PL_sv_placeholder if there is no
1138 value associated with the key.
1139
1140 SV * cophh_fetch_pvn(const COPHH *cophh,
1141 const char *keypv,
1142 STRLEN keylen, U32 hash,
1143 U32 flags)
1144
1145 cophh_fetch_pvs
1146 NOTE: this function is experimental and may change or be
1147 removed without notice.
1148
1149 Like "cophh_fetch_pvn", but takes a literal string instead of a
1150 string/length pair, and no precomputed hash.
1151
1152 SV * cophh_fetch_pvs(const COPHH *cophh,
1153 "literal string" key, U32 flags)
1154
1155 cophh_fetch_sv
1156 NOTE: this function is experimental and may change or be
1157 removed without notice.
1158
1159 Like "cophh_fetch_pvn", but takes a Perl scalar instead of a
1160 string/length pair.
1161
1162 SV * cophh_fetch_sv(const COPHH *cophh, SV *key,
1163 U32 hash, U32 flags)
1164
1165 cophh_free
1166 NOTE: this function is experimental and may change or be
1167 removed without notice.
1168
1169 Discard the cop hints hash "cophh", freeing all resources
1170 associated with it.
1171
1172 void cophh_free(COPHH *cophh)
1173
1174 cophh_new_empty
1175 NOTE: this function is experimental and may change or be
1176 removed without notice.
1177
1178 Generate and return a fresh cop hints hash containing no
1179 entries.
1180
1181 COPHH * cophh_new_empty()
1182
1183 cophh_store_pv
1184 NOTE: this function is experimental and may change or be
1185 removed without notice.
1186
1187 Like "cophh_store_pvn", but takes a nul-terminated string
1188 instead of a string/length pair.
1189
1190 COPHH * cophh_store_pv(const COPHH *cophh,
1191 const char *key, U32 hash,
1192 SV *value, U32 flags)
1193
1194 cophh_store_pvn
1195 NOTE: this function is experimental and may change or be
1196 removed without notice.
1197
1198 Stores a value, associated with a key, in the cop hints hash
1199 "cophh", and returns the modified hash. The returned hash
1200 pointer is in general not the same as the hash pointer that was
1201 passed in. The input hash is consumed by the function, and the
1202 pointer to it must not be subsequently used. Use "cophh_copy"
1203 if you need both hashes.
1204
1205 The key is specified by "keypv" and "keylen". If "flags" has
1206 the "COPHH_KEY_UTF8" bit set, the key octets are interpreted as
1207 UTF-8, otherwise they are interpreted as Latin-1. "hash" is a
1208 precomputed hash of the key string, or zero if it has not been
1209 precomputed.
1210
1211 "value" is the scalar value to store for this key. "value" is
1212 copied by this function, which thus does not take ownership of
1213 any reference to it, and later changes to the scalar will not
1214 be reflected in the value visible in the cop hints hash.
1215 Complex types of scalar will not be stored with referential
1216 integrity, but will be coerced to strings.
1217
1218 COPHH * cophh_store_pvn(COPHH *cophh, const char *keypv,
1219 STRLEN keylen, U32 hash,
1220 SV *value, U32 flags)
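
The "input is consumed, use the returned pointer" rule shared by the "cophh_store_*" and "cophh_delete_*" functions can be modelled in standalone C. Everything in the sketch below is hypothetical ("model_hh", "model_store", "model_copy", and the flat integer array standing in for key/value pairs); it is not the real COPHH representation and only illustrates the ownership convention.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a hints hash: a flat array of integer hints. */
typedef struct {
    size_t n;
    int   *vals;
} model_hh;

static model_hh *model_new_empty(void)
{
    return calloc(1, sizeof(model_hh));     /* n = 0, vals = NULL */
}

static void model_free(model_hh *hh)
{
    if (hh) {
        free(hh->vals);
        free(hh);
    }
}

/* Returns an independent copy; the input stays valid. */
static model_hh *model_copy(const model_hh *hh)
{
    model_hh *out = model_new_empty();

    if (!out) return NULL;
    if (hh->n) {
        out->vals = malloc(hh->n * sizeof(int));
        if (!out->vals) { free(out); return NULL; }
        memcpy(out->vals, hh->vals, hh->n * sizeof(int));
        out->n = hh->n;
    }
    return out;
}

/* Consumes hh and returns a new object with v appended. The old pointer
 * is dead on every path, so callers must use the return value. */
static model_hh *model_store(model_hh *hh, int v)
{
    model_hh *out = model_new_empty();

    if (out) {
        out->vals = malloc((hh->n + 1) * sizeof(int));
        if (out->vals) {
            if (hh->n) memcpy(out->vals, hh->vals, hh->n * sizeof(int));
            out->vals[hh->n] = v;
            out->n = hh->n + 1;
        } else {
            free(out);
            out = NULL;
        }
    }
    model_free(hh);
    return out;
}
```

The calling pattern is "hh = model_store(hh, v);", with "model_copy" taken beforehand whenever both the old and the new state are needed, mirroring the advice to use "cophh_copy".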
1221
1222 cophh_store_pvs
1223 NOTE: this function is experimental and may change or be
1224 removed without notice.
1225
1226 Like "cophh_store_pvn", but takes a literal string instead of a
1227 string/length pair, and no precomputed hash.
1228
1229 COPHH * cophh_store_pvs(const COPHH *cophh,
1230 "literal string" key, SV *value,
1231 U32 flags)
1232
1233 cophh_store_sv
1234 NOTE: this function is experimental and may change or be
1235 removed without notice.
1236
1237 Like "cophh_store_pvn", but takes a Perl scalar instead of a
1238 string/length pair.
1239
1240 COPHH * cophh_store_sv(const COPHH *cophh, SV *key,
1241 U32 hash, SV *value, U32 flags)
1242
1244 cop_hints_2hv
1245 Generates and returns a standard Perl hash representing the
1246 full set of hint entries in the cop "cop". "flags" is
1247 currently unused and must be zero.
1248
1249 HV * cop_hints_2hv(const COP *cop, U32 flags)
1250
1251 cop_hints_fetch_pv
1252 Like "cop_hints_fetch_pvn", but takes a nul-terminated string
1253 instead of a string/length pair.
1254
1255 SV * cop_hints_fetch_pv(const COP *cop,
1256 const char *key, U32 hash,
1257 U32 flags)
1258
1259 cop_hints_fetch_pvn
1260 Look up the hint entry in the cop "cop" with the key specified
1261 by "keypv" and "keylen". If "flags" has the "COPHH_KEY_UTF8"
1262 bit set, the key octets are interpreted as UTF-8, otherwise
1263 they are interpreted as Latin-1. "hash" is a precomputed hash
1264 of the key string, or zero if it has not been precomputed.
1265 Returns a mortal scalar copy of the value associated with the
1266 key, or &PL_sv_placeholder if there is no value associated with
1267 the key.
1268
1269 SV * cop_hints_fetch_pvn(const COP *cop,
1270 const char *keypv,
1271 STRLEN keylen, U32 hash,
1272 U32 flags)
1273
1274 cop_hints_fetch_pvs
1275 Like "cop_hints_fetch_pvn", but takes a literal string instead
1276 of a string/length pair, and no precomputed hash.
1277
1278 SV * cop_hints_fetch_pvs(const COP *cop,
1279 "literal string" key,
1280 U32 flags)
1281
1282 cop_hints_fetch_sv
1283 Like "cop_hints_fetch_pvn", but takes a Perl scalar instead of
1284 a string/length pair.
1285
1286 SV * cop_hints_fetch_sv(const COP *cop, SV *key,
1287 U32 hash, U32 flags)
1288
1290 custom_op_register
1291 Register a custom op. See "Custom Operators" in perlguts.
1292
1293 NOTE: this function must be explicitly called as
1294 Perl_custom_op_register with an aTHX_ parameter.
1295
1296 void Perl_custom_op_register(pTHX_
1297 Perl_ppaddr_t ppaddr,
1298 const XOP *xop)
1299
1300 custom_op_xop
1301 Return the XOP structure for a given custom op. This macro
1302 should be considered internal to "OP_NAME" and the other access
1303 macros: use them instead. This macro does call a function.
1304 Prior to 5.19.6, this was implemented as a function.
1305
1306 NOTE: this function must be explicitly called as
1307 Perl_custom_op_xop with an aTHX_ parameter.
1308
1309 const XOP * Perl_custom_op_xop(pTHX_ const OP *o)
1310
1311 XopDISABLE
1312 Temporarily disable a member of the XOP, by clearing the
1313 appropriate flag.
1314
1315 void XopDISABLE(XOP *xop, which)
1316
1317 XopENABLE
1318 Reenable a member of the XOP which has been disabled.
1319
1320 void XopENABLE(XOP *xop, which)
1321
1322 XopENTRY
1323 Return a member of the XOP structure. "which" is a cpp token
1324 indicating which entry to return. If the member is not set
1325 this will return a default value. The return type depends on
1326 "which". This macro evaluates its arguments more than once.
1327 If you are using "Perl_custom_op_xop" to retrieve a "XOP *"
1328 from a "OP *", use the more efficient "XopENTRYCUSTOM" instead.
1329
1330 XopENTRY(XOP *xop, which)
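
Two properties mentioned above, that "which" is a preprocessor token and that the arguments are evaluated more than once, can be seen in a standalone model. The names below ("model_xop", "MODEL_ENTRY", the "MODELf_"/"MODELd_" prefixes, "get_xop") are hypothetical and are not the real XOP definitions; the sketch only assumes a macro that pastes "which" into a flag name, a member name, and a default name.

```c
#include <stdint.h>

/* Hypothetical stand-ins; these are not the real XOP definitions. */
typedef struct {
    uint32_t    flags;
    const char *name;
    const char *desc;
} model_xop;

#define MODELf_name 0x01u
#define MODELf_desc 0x02u
#define MODELd_name "(unnamed)"             /* default when "name" is not set */
#define MODELd_desc "(no description)"      /* default when "desc" is not set */

/* "which" is pasted into the flag, member, and default names, so it must
 * be a bare token such as name or desc. Note that x appears twice. */
#define MODEL_ENTRY(x, which) \
    (((x)->flags & MODELf_ ## which) ? (x)->which : MODELd_ ## which)

static int lookups = 0;

/* Accessor with a visible side effect, used to count evaluations. */
static model_xop *get_xop(model_xop *x)
{
    lookups++;
    return x;
}
```

With this model, "MODEL_ENTRY(get_xop(&x), name)" calls "get_xop" twice when the "name" flag is set, which is why an argument with side effects or a costly lookup should be stored in a local variable first.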
1331
1332 XopENTRYCUSTOM
1333 Exactly like "XopENTRY(Perl_custom_op_xop(aTHX_ o),
1334 which)" but more efficient. The "which" parameter is identical
1335 to "XopENTRY".
1336
1337 XopENTRYCUSTOM(const OP *o, which)
1338
1339 XopENTRY_set
1340 Set a member of the XOP structure. "which" is a cpp token
1341 indicating which entry to set. See "Custom Operators" in
1342 perlguts for details about the available members and how they
1343 are used. This macro evaluates its argument more than once.
1344
1345 void XopENTRY_set(XOP *xop, which, value)
1346
1347 XopFLAGS
1348 Return the XOP's flags.
1349
1350 U32 XopFLAGS(XOP *xop)
1351
1353 This section documents functions to manipulate CVs which are code-
1354 values, or subroutines. For more information, see perlguts.
1355
1356 caller_cx
1357 The XSUB-writer's equivalent of caller(). The returned
1358 "PERL_CONTEXT" structure can be interrogated to find all the
1359 information returned to Perl by "caller". Note that XSUBs
1360 don't get a stack frame, so "caller_cx(0, NULL)" will return
1361 information for the immediately-surrounding Perl code.
1362
1363 This function skips over the automatic calls to &DB::sub made
1364 on the behalf of the debugger. If the stack frame requested
1365 was a sub called by "DB::sub", the return value will be the
1366 frame for the call to "DB::sub", since that has the correct
1367 line number/etc. for the call site. If dbcxp is non-"NULL", it
1368 will be set to a pointer to the frame for the sub call itself.
1369
1370 const PERL_CONTEXT * caller_cx(
1371 I32 level,
1372 const PERL_CONTEXT **dbcxp
1373 )
1374
1375 CvSTASH Returns the stash of the CV. A stash is the symbol table hash,
1376 containing the package-scoped variables in the package where
1377 the subroutine was defined. For more information, see
1378 perlguts.
1379
1380 This also has a special use with XS AUTOLOAD subs. See
1381 "Autoloading with XSUBs" in perlguts.
1382
1383 HV* CvSTASH(CV* cv)
1384
1385 find_runcv
1386 Locate the CV corresponding to the currently executing sub or
1387 eval. If "db_seqp" is non-null, skip CVs that are in the DB
1388 package and populate *db_seqp with the cop sequence number at
1389 the point that the DB:: code was entered. (This allows
1390 debuggers to eval in the scope of the breakpoint rather than in
1391 the scope of the debugger itself.)
1392
1393 CV* find_runcv(U32 *db_seqp)
1394
1395 get_cv Uses "strlen" to get the length of "name", then calls
1396 "get_cvn_flags".
1397
1398 NOTE: the perl_ form of this function is deprecated.
1399
1400 CV* get_cv(const char* name, I32 flags)
1401
1402 get_cvn_flags
1403 Returns the CV of the specified Perl subroutine. "flags" are
1404 passed to "gv_fetchpvn_flags". If "GV_ADD" is set and the Perl
1405 subroutine does not exist then it will be declared (which has
1406 the same effect as saying "sub name;"). If "GV_ADD" is not set
1407 and the subroutine does not exist then NULL is returned.
1408
1409 NOTE: the perl_ form of this function is deprecated.
1410
1411 CV* get_cvn_flags(const char* name, STRLEN len,
1412 I32 flags)
1413
1415 ax Variable which is set up by "xsubpp" to indicate the stack base
1416 offset, used by the "ST", "XSprePUSH" and "XSRETURN" macros.
1417 The "dMARK" macro must be called beforehand to set up the "MARK"
1418 variable.
1419
1420 I32 ax
1421
1422 CLASS Variable which is set up by "xsubpp" to indicate the class name
1423 for a C++ XS constructor. This is always a "char*". See
1424 "THIS".
1425
1426 char* CLASS
1427
1428 dAX Sets up the "ax" variable. This is usually handled
1429 automatically by "xsubpp" by calling "dXSARGS".
1430
1431 dAX;
1432
1433 dAXMARK Sets up the "ax" variable and stack marker variable "mark".
1434 This is usually handled automatically by "xsubpp" by calling
1435 "dXSARGS".
1436
1437 dAXMARK;
1438
1439 dITEMS Sets up the "items" variable. This is usually handled
1440 automatically by "xsubpp" by calling "dXSARGS".
1441
1442 dITEMS;
1443
1444 dUNDERBAR
1445 Sets up any variable needed by the "UNDERBAR" macro. It used
1446 to define "padoff_du", but it is currently a no-op. However, it
1447 is strongly advised to still use it for ensuring past and
1448 future compatibility.
1449
1450 dUNDERBAR;
1451
1452 dXSARGS Sets up stack and mark pointers for an XSUB, calling "dSP" and
1453 "dMARK". Sets up the "ax" and "items" variables by calling
1454 "dAX" and "dITEMS". This is usually handled automatically by
1455 "xsubpp".
1456
1457 dXSARGS;
1458
1459 dXSI32 Sets up the "ix" variable for an XSUB which has aliases. This
1460 is usually handled automatically by "xsubpp".
1461
1462 dXSI32;
1463
1464 items Variable which is set up by "xsubpp" to indicate the number of
1465 items on the stack. See "Variable-length Parameter Lists" in
1466 perlxs.
1467
1468 I32 items
1469
1470 ix Variable which is set up by "xsubpp" to indicate which of an
1471 XSUB's aliases was used to invoke it. See "The ALIAS: Keyword"
1472 in perlxs.
1473
1474 I32 ix
1475
1476 RETVAL Variable which is set up by "xsubpp" to hold the return value
1477 for an XSUB. This is always the proper type for the XSUB. See
1478 "The RETVAL Variable" in perlxs.
1479
1480 (whatever) RETVAL
1481
1482 ST Used to access elements on the XSUB's stack.
1483
1484 SV* ST(int ix)
1485
1486 THIS Variable which is set up by "xsubpp" to designate the object in
1487 a C++ XSUB. This is always the proper type for the C++ object.
1488 See "CLASS" and "Using XS With C++" in perlxs.
1489
1490 (whatever) THIS
1491
1492 UNDERBAR
1493 The SV* corresponding to the $_ variable. Works even if there
1494 is a lexical $_ in scope.
1495
1496 XS Macro to declare an XSUB and its C parameter list. This is
1497 handled by "xsubpp". It is the same as using the more explicit
1498 "XS_EXTERNAL" macro.
1499
1500 XS_EXTERNAL
1501 Macro to declare an XSUB and its C parameter list explicitly
1502 exporting the symbols.
1503
1504 XS_INTERNAL
1505 Macro to declare an XSUB and its C parameter list without
1506 exporting the symbols. This is handled by "xsubpp" and
1507 generally preferable over exporting the XSUB symbols
1508 unnecessarily.
1509
1511 dump_all
1512 Dumps the entire optree of the current program starting at
1513 "PL_main_root" to "STDERR". Also dumps the optrees for all
1514 visible subroutines in "PL_defstash".
1515
1516 void dump_all()
1517
1518 dump_packsubs
1519 Dumps the optrees for all visible subroutines in "stash".
1520
1521 void dump_packsubs(const HV* stash)
1522
1523 op_class
1524 Given an op, determine what type of struct it has been
1525 allocated as. Returns one of the OPclass enums, such as
1526 OPclass_LISTOP.
1527
1528 OPclass op_class(const OP *o)
1529
1530 op_dump Dumps the optree starting at OP "o" to "STDERR".
1531
1532 void op_dump(const OP *o)
1533
1534 sv_dump Dumps the contents of an SV to the "STDERR" filehandle.
1535
1536 For an example of its output, see Devel::Peek.
1537
1538 void sv_dump(SV* sv)
1539
1541 pv_display
1542 Similar to
1543
1544 pv_escape(dsv,pv,cur,pvlim,PERL_PV_ESCAPE_QUOTE);
1545
1546 except that an additional "\0" will be appended to the string
1547 when len > cur and pv[cur] is "\0".
1548
1549 Note that the final string may be up to 7 chars longer than
1550 pvlim.
1551
1552 char* pv_display(SV *dsv, const char *pv, STRLEN cur,
1553 STRLEN len, STRLEN pvlim)
1554
1555 pv_escape
1556 Escapes at most the first "count" chars of "pv" and puts the
1557 results into "dsv" such that the size of the escaped string
1558 will not exceed "max" chars and will not contain any incomplete
1559 escape sequences. The number of bytes escaped will be returned
1560 in the "STRLEN *escaped" parameter if it is not null. When the
1561 "dsv" parameter is null no escaping actually occurs, but the
1562 number of bytes that would be escaped were it not null will be
1563 calculated.
1564
1565 If flags contains "PERL_PV_ESCAPE_QUOTE" then any double quotes
1566 in the string will also be escaped.
1567
1568 Normally the SV will be cleared before the escaped string is
1569 prepared, but when "PERL_PV_ESCAPE_NOCLEAR" is set this will
1570 not occur.
1571
1572 If "PERL_PV_ESCAPE_UNI" is set then the input string is treated
1573 as UTF-8. If "PERL_PV_ESCAPE_UNI_DETECT" is set then the input
1574 string is scanned using "is_utf8_string()" to determine if it
1575 is UTF-8.
1576
1577 If "PERL_PV_ESCAPE_ALL" is set then all input chars will be
1578 output using "\x01F1" style escapes, otherwise if
1579 "PERL_PV_ESCAPE_NONASCII" is set, only non-ASCII chars will be
1580 escaped using this style; otherwise, only chars above 255 will
1581 be so escaped; other non-printable chars will use octal or
1582 common escaped patterns like "\n". Otherwise, if
1583 "PERL_PV_ESCAPE_NOBACKSLASH" is set then all chars below 255 will be
1584 treated as printable and will be output as literals.
1585
1586 If "PERL_PV_ESCAPE_FIRSTCHAR" is set then only the first char
1587 of the string will be escaped, regardless of max. If the
1588 output is to be in hex, then it will be returned as a plain hex
1589 sequence. Thus the output will either be a single char, an
1590 octal escape sequence, a special escape like "\n" or a hex
1591 value.
1592
1593 If "PERL_PV_ESCAPE_RE" is set then the escape char used will be
1594 a "%" and not a "\\". This is because regexes very often
1595 contain backslashed sequences, whereas "%" is not a
1596 particularly common character in patterns.
1597
1598 Returns a pointer to the escaped text as held by "dsv".
1599
1600 char* pv_escape(SV *dsv, char const * const str,
1601 const STRLEN count, const STRLEN max,
1602 STRLEN * const escaped,
1603 const U32 flags)
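The "max" limit and the "no incomplete escape sequences" rule described above can be sketched with a standalone model. This is not the Perl implementation and does not call the Perl API: "escape_model" is a hypothetical helper, it handles only bytes (no UTF-8, no flags), and it always uses three-digit octal for non-printable bytes, which is a simplification.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical model, not part of the Perl API.  Escapes at most
 * "count" bytes of "src" into "dst", never writing more than "max"
 * chars and never splitting an escape sequence.  Stores the number
 * of input bytes consumed in *escaped (if non-null) and returns the
 * length of the output. */
size_t escape_model(char *dst, size_t dstsize,
                    const char *src, size_t count,
                    size_t max, size_t *escaped)
{
    size_t out = 0;   /* chars written so far */
    size_t in;        /* input bytes consumed so far */

    if (dstsize == 0)
        return 0;
    if (max > dstsize - 1)
        max = dstsize - 1;          /* leave room for the NUL */

    for (in = 0; in < count; in++) {
        unsigned char c = (unsigned char)src[in];
        char chunk[5];              /* longest chunk is "\ooo" + NUL */
        size_t chunklen;

        if (c == '\n') {
            strcpy(chunk, "\\n");
        } else if (c == '\t') {
            strcpy(chunk, "\\t");
        } else if (c == '\\') {
            strcpy(chunk, "\\\\");
        } else if (c >= 0x20 && c < 0x7f) {
            chunk[0] = (char)c;     /* printable ASCII: literal */
            chunk[1] = '\0';
        } else {
            snprintf(chunk, sizeof chunk, "\\%03o", (unsigned)c);
        }

        chunklen = strlen(chunk);
        if (out + chunklen > max)
            break;                  /* whole chunk must fit, or stop */
        memcpy(dst + out, chunk, chunklen);
        out += chunklen;
    }
    dst[out] = '\0';
    if (escaped != NULL)
        *escaped = in;
    return out;
}
```

With a limit of 2, the input "a" followed by a newline yields just "a": the two-character escape for the newline would exceed the limit, so it is dropped whole rather than truncated.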
1604
1605 pv_pretty
1606 Converts a string into something presentable, handling escaping
1607 via "pv_escape()" and supporting quoting and ellipses.
1608
1609 If the "PERL_PV_PRETTY_QUOTE" flag is set then the result will
1610 be double quoted with any double quotes in the string escaped.
1611 Otherwise if the "PERL_PV_PRETTY_LTGT" flag is set then the
1612 result will be wrapped in angle brackets.
1613
1614 If the "PERL_PV_PRETTY_ELLIPSES" flag is set and not all
1615 characters in string were output then an ellipsis "..." will be
1616 appended to the string. Note that this happens AFTER it has
1617 been quoted.
1618
1619 If "start_color" is non-null then it will be inserted after the
1620 opening quote (if there is one) but before the escaped text.
1621 If "end_color" is non-null then it will be inserted after the
1622 escaped text but before any quotes or ellipses.
1623
1624 Returns a pointer to the prettified text as held by "dsv".
1625
1626 char* pv_pretty(SV *dsv, char const * const str,
1627 const STRLEN count, const STRLEN max,
1628 char const * const start_color,
1629 char const * const end_color,
1630 const U32 flags)
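The ordering of quotes, colour strings and the ellipsis described above is easy to get wrong, so here is a standalone model of it. It does not call the Perl API; "pretty_model" and the MODEL_* flags are hypothetical names, and their values are not those of the real PERL_PV_PRETTY_* constants.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical flags for this model only. */
enum {
    MODEL_QUOTE    = 1 << 0,
    MODEL_LTGT     = 1 << 1,
    MODEL_ELLIPSES = 1 << 2
};

/* Wraps already-escaped text.  "truncated" is nonzero when not all
 * input characters were output.  The colour strings may be NULL.
 * Returns the snprintf() result. */
int pretty_model(char *dst, size_t dstsize, const char *escaped,
                 int truncated,
                 const char *start_color, const char *end_color,
                 unsigned flags)
{
    const char *lq = "", *rq = "";

    if (flags & MODEL_QUOTE) {
        lq = rq = "\"";
    } else if (flags & MODEL_LTGT) {   /* only if QUOTE is not set */
        lq = "<";
        rq = ">";
    }

    /* Order: opening quote, start colour, text, end colour,
     * closing quote, and only then the ellipsis. */
    return snprintf(dst, dstsize, "%s%s%s%s%s%s",
                    lq,
                    start_color != NULL ? start_color : "",
                    escaped,
                    end_color != NULL ? end_color : "",
                    rq,
                    (truncated && (flags & MODEL_ELLIPSES)) ? "..." : "");
}
```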
1631
1633 cv_clone
1634 Clone a CV, making a lexical closure. "proto" supplies the
1635 prototype of the function: its code, pad structure, and other
1636 attributes. The prototype is combined with a capture of outer
1637 lexicals to which the code refers, which are taken from the
1638 currently-executing instance of the immediately surrounding
1639 code.
1640
1641 CV * cv_clone(CV *proto)
1642
1643 cv_name Returns an SV containing the name of the CV, mainly for use in
1644 error reporting. The CV may actually be a GV instead, in which
1645 case the returned SV holds the GV's name. Anything other than
1646 a GV or CV is treated as a string already holding the sub name,
1647 but this could change in the future.
1648
1649 An SV may be passed as a second argument. If so, the name will
1650 be assigned to it and it will be returned. Otherwise the
1651 returned SV will be a new mortal.
1652
1653 If "flags" has the "CV_NAME_NOTQUAL" bit set, then the package
1654 name will not be included. If the first argument is neither a
1655 CV nor a GV, this flag is ignored (subject to change).
1656
1657 SV * cv_name(CV *cv, SV *sv, U32 flags)
1658
1659 cv_undef
1660 Clear out all the active components of a CV. This can happen
1661 either by an explicit "undef &foo", or by the reference count
1662 going to zero. In the former case, we keep the "CvOUTSIDE"
1663 pointer, so that any anonymous children can still follow the
1664 full lexical scope chain.
1665
1666 void cv_undef(CV* cv)
1667
1668 find_rundefsv
1669 Returns the global variable $_.
1670
1671 SV * find_rundefsv()
1672
1673 find_rundefsvoffset
1674 DEPRECATED! It is planned to remove this function from a
1675 future release of Perl. Do not use it for new code; remove it
1676 from existing code.
1677
1678 Until the lexical $_ feature was removed, this function would
1679 find the position of the lexical $_ in the pad of the
1680 currently-executing function and return the offset in the
1681 current pad, or "NOT_IN_PAD".
1682
1683 Now it always returns "NOT_IN_PAD".
1684
1685 NOTE: the perl_ form of this function is deprecated.
1686
1687 PADOFFSET find_rundefsvoffset()
1688
1689 intro_my
1690 "Introduce" "my" variables to visible status. This is called
1691 during parsing at the end of each statement to make lexical
1692 variables visible to subsequent statements.
1693
1694 U32 intro_my()
1695
1696 load_module
1697 Loads the module whose name is pointed to by the string part of
1698 "name". Note that the actual module name, not its filename,
1699 should be given, e.g. "Foo::Bar" instead of "Foo/Bar.pm". "ver",
1700 if specified and not NULL, provides version semantics similar
1701 to "use Foo::Bar VERSION". The optional trailing arguments can
1702 be used to specify arguments to the module's "import()" method,
1703 similar to "use Foo::Bar VERSION LIST"; their precise handling
1704 depends on the flags. The flags argument is a bitwise-ORed
1705 collection of any of "PERL_LOADMOD_DENY",
1706 "PERL_LOADMOD_NOIMPORT", or "PERL_LOADMOD_IMPORT_OPS" (or 0 for
1707 no flags).
1708
1709 If "PERL_LOADMOD_NOIMPORT" is set, the module is loaded as if
1710 with an empty import list, as in "use Foo::Bar ()"; this is the
1711 only circumstance in which the trailing optional arguments may
1712 be omitted entirely. Otherwise, if "PERL_LOADMOD_IMPORT_OPS" is
1713 set, the trailing arguments must consist of exactly one "OP*",
1714 containing the op tree that produces the relevant import
1715 arguments. Otherwise, the trailing arguments must all be "SV*"
1716 values that will be used as import arguments; and the list must
1717 be terminated with "(SV*) NULL". If neither
1718 "PERL_LOADMOD_NOIMPORT" nor "PERL_LOADMOD_IMPORT_OPS" is set,
1719 the trailing "NULL" pointer is needed even if no import
1720 arguments are desired. The reference count for each specified
1721 "SV*" argument is decremented. In addition, the "name" argument
1722 is modified.
1723
1724 If "PERL_LOADMOD_DENY" is set, the module is loaded as if with
1725 "no" rather than "use".
1726
1727 void load_module(U32 flags, SV* name, SV* ver, ...)
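The requirement to terminate the import list with "(SV*) NULL" follows from how C variadic functions work: the callee has no argument count and must read pointers until it sees a null one, and a bare NULL may be an integer constant rather than a pointer, hence the cast. The sketch below shows the same convention in plain C. "count_import_args" is a hypothetical stand-in that takes strings instead of "SV*" values; it is not load_module and does not call the Perl API.

```c
#include <stdarg.h>
#include <stddef.h>

/* Hypothetical stand-in for a function that takes a null-terminated
 * list of trailing pointer arguments.  Returns how many arguments
 * precede the terminator. */
size_t count_import_args(const char *module, ...)
{
    va_list ap;
    size_t n = 0;

    (void)module;                 /* unused in this model */
    va_start(ap, module);
    /* Read pointers until the terminating null pointer. */
    while (va_arg(ap, char *) != NULL)
        n++;
    va_end(ap);
    return n;
}
```

A call with no import arguments still needs the terminator, for example count_import_args("Foo::Bar", (char *)NULL).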
1728
1729 newPADNAMELIST
1730 NOTE: this function is experimental and may change or be
1731 removed without notice.
1732
1733 Creates a new pad name list. "max" is the highest index for
1734 which space is allocated.
1735
1736 PADNAMELIST * newPADNAMELIST(size_t max)
1737
1738 newPADNAMEouter
1739 NOTE: this function is experimental and may change or be
1740 removed without notice.
1741
1742 Constructs and returns a new pad name. Only use this function
1743 for names that refer to outer lexicals. (See also
1744 "newPADNAMEpvn".) "outer" is the outer pad name that this one
1745 mirrors. The returned pad name has the "PADNAMEt_OUTER" flag
1746 already set.
1747
1748 PADNAME * newPADNAMEouter(PADNAME *outer)
1749
1750 newPADNAMEpvn
1751 NOTE: this function is experimental and may change or be
1752 removed without notice.
1753
1754 Constructs and returns a new pad name. "s" must be a UTF-8
1755 string. Do not use this for pad names that point to outer
1756 lexicals. See "newPADNAMEouter".
1757
1758 PADNAME * newPADNAMEpvn(const char *s, STRLEN len)
1759
1760 nothreadhook
1761 Stub that provides thread hook for perl_destruct when there are
1762 no threads.
1763
1764 int nothreadhook()
1765
1766 pad_add_anon
1767 Allocates a place in the currently-compiling pad (via
1768 "pad_alloc") for an anonymous function that is lexically scoped
1769 inside the currently-compiling function. The function "func"
1770 is linked into the pad, and its "CvOUTSIDE" link to the outer
1771 scope is weakened to avoid a reference loop.
1772
1773 One reference count is stolen, so you may need to do
1774 "SvREFCNT_inc(func)".
1775
1776 "optype" should be an opcode indicating the type of operation
1777 that the pad entry is to support. This doesn't affect
1778 operational semantics, but is used for debugging.
1779
1780 PADOFFSET pad_add_anon(CV *func, I32 optype)
1781
1782 pad_add_name_pv
1783 Exactly like "pad_add_name_pvn", but takes a nul-terminated
1784 string instead of a string/length pair.
1785
1786 PADOFFSET pad_add_name_pv(const char *name, U32 flags,
1787 HV *typestash, HV *ourstash)
1788
1789 pad_add_name_pvn
1790 Allocates a place in the currently-compiling pad for a named
1791 lexical variable. Stores the name and other metadata in the
1792 name part of the pad, and makes preparations to manage the
1793 variable's lexical scoping. Returns the offset of the
1794 allocated pad slot.
1795
1796 "namepv"/"namelen" specify the variable's name, including
1797 leading sigil. If "typestash" is non-null, the name is for a
1798 typed lexical, and this identifies the type. If "ourstash" is
1799 non-null, it's a lexical reference to a package variable, and
1800 this identifies the package. The following flags can be OR'ed
1801 together:
1802
1803 padadd_OUR redundantly specifies if it's a package var
1804 padadd_STATE variable will retain value persistently
1805 padadd_NO_DUP_CHECK skip check for lexical shadowing
1806
1807 PADOFFSET pad_add_name_pvn(const char *namepv,
1808 STRLEN namelen, U32 flags,
1809 HV *typestash, HV *ourstash)
1810
1811 pad_add_name_sv
1812 Exactly like "pad_add_name_pvn", but takes the name string in
1813 the form of an SV instead of a string/length pair.
1814
1815 PADOFFSET pad_add_name_sv(SV *name, U32 flags,
1816 HV *typestash, HV *ourstash)
1817
1818 pad_alloc
1819 NOTE: this function is experimental and may change or be
1820 removed without notice.
1821
1822 Allocates a place in the currently-compiling pad, returning the
1823 offset of the allocated pad slot. No name is initially
1824 attached to the pad slot. "tmptype" is a set of flags
1825 indicating the kind of pad entry required, which will be set in
1826 the value SV for the allocated pad entry:
1827
1828 SVs_PADMY named lexical variable ("my", "our", "state")
1829 SVs_PADTMP unnamed temporary store
1830 SVf_READONLY constant shared between recursion levels
1831
1832 "SVf_READONLY" has been supported here only since perl 5.20.
1833 To work with earlier versions as well, use
1834 "SVf_READONLY|SVs_PADTMP". "SVf_READONLY" does not cause the
1835 SV in the pad slot to be marked read-only, but simply tells
1836 "pad_alloc" that it will be made read-only (by the caller), or
1837 at least should be treated as such.
1838
1839 "optype" should be an opcode indicating the type of operation
1840 that the pad entry is to support. This doesn't affect
1841 operational semantics, but is used for debugging.
1842
1843 PADOFFSET pad_alloc(I32 optype, U32 tmptype)
1844
1845 pad_findmy_pv
1846 Exactly like "pad_findmy_pvn", but takes a nul-terminated
1847 string instead of a string/length pair.
1848
1849 PADOFFSET pad_findmy_pv(const char *name, U32 flags)
1850
1851 pad_findmy_pvn
1852 Given the name of a lexical variable, find its position in the
1853 currently-compiling pad. "namepv"/"namelen" specify the
1854 variable's name, including leading sigil. "flags" is reserved
1855 and must be zero. If it is not in the current pad but appears
1856 in the pad of any lexically enclosing scope, then a pseudo-
1857 entry for it is added in the current pad. Returns the offset
1858 in the current pad, or "NOT_IN_PAD" if no such lexical is in
1859 scope.
1860
1861 PADOFFSET pad_findmy_pvn(const char *namepv,
1862 STRLEN namelen, U32 flags)
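The lookup rule above (search the current pad, then the enclosing scopes, and add a pseudo-entry on an outer hit) can be illustrated with a much-simplified standalone model. The struct, the function names and MODEL_NOT_IN_PAD are all hypothetical; real pads carry far more state, and this sketch does not call the Perl API.

```c
#include <stddef.h>
#include <string.h>

#define MODEL_NOT_IN_PAD (-1)   /* stand-in for NOT_IN_PAD */
#define MODEL_MAX_NAMES  8

/* Hypothetical pad: a few names plus a link to the lexically
 * enclosing pad. */
struct model_pad {
    const char *names[MODEL_MAX_NAMES];
    int count;
    struct model_pad *outer;
};

/* Offset of "name" in this pad only, or MODEL_NOT_IN_PAD. */
int model_find_local(const struct model_pad *pad, const char *name)
{
    int i;

    for (i = 0; i < pad->count; i++)
        if (strcmp(pad->names[i], name) == 0)
            return i;
    return MODEL_NOT_IN_PAD;
}

/* Offset of "name" in "pad", adding a pseudo-entry if the name is
 * only visible in an enclosing pad. */
int model_findmy(struct model_pad *pad, const char *name)
{
    struct model_pad *p;
    int off = model_find_local(pad, name);

    if (off != MODEL_NOT_IN_PAD)
        return off;
    for (p = pad->outer; p != NULL; p = p->outer) {
        if (model_find_local(p, name) != MODEL_NOT_IN_PAD) {
            if (pad->count == MODEL_MAX_NAMES)
                return MODEL_NOT_IN_PAD;    /* model pad is full */
            pad->names[pad->count] = name;  /* pseudo-entry */
            return pad->count++;
        }
    }
    return MODEL_NOT_IN_PAD;                /* not in scope at all */
}
```

A second lookup of the same outer name finds the pseudo-entry directly and returns the same offset.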
1863
1864 pad_findmy_sv
1865 Exactly like "pad_findmy_pvn", but takes the name string in the
1866 form of an SV instead of a string/length pair.
1867
1868 PADOFFSET pad_findmy_sv(SV *name, U32 flags)
1869
1870 padnamelist_fetch
1871 NOTE: this function is experimental and may change or be
1872 removed without notice.
1873
1874 Fetches the pad name from the given index.
1875
1876 PADNAME * padnamelist_fetch(PADNAMELIST *pnl,
1877 SSize_t key)
1878
1879 padnamelist_store
1880 NOTE: this function is experimental and may change or be
1881 removed without notice.
1882
1883 Stores the pad name (which may be null) at the given index,
1884 freeing any existing pad name in that slot.
1885
1886 PADNAME ** padnamelist_store(PADNAMELIST *pnl,
1887 SSize_t key, PADNAME *val)
1888
1889 pad_setsv
1890 Set the value at offset "po" in the current (compiling or
1891 executing) pad. Use the macro "PAD_SETSV()" rather than
1892 calling this function directly.
1893
1894 void pad_setsv(PADOFFSET po, SV *sv)
1895
1896 pad_sv Get the value at offset "po" in the current (compiling or
1897 executing) pad. Use macro "PAD_SV" instead of calling this
1898 function directly.
1899
1900 SV * pad_sv(PADOFFSET po)
1901
1902 pad_tidy
1903 NOTE: this function is experimental and may change or be
1904 removed without notice.
1905
1906 Tidy up a pad at the end of compilation of the code to which it
1907 belongs. Jobs performed here are: remove most stuff from the
1908 pads of anonsub prototypes; give it a @_; mark temporaries as
1909 such. "type" indicates the kind of subroutine:
1910
1911 padtidy_SUB ordinary subroutine
1912 padtidy_SUBCLONE prototype for lexical closure
1913 padtidy_FORMAT format
1914
1915 void pad_tidy(padtidy_type type)
1916
1917 perl_alloc
1918 Allocates a new Perl interpreter. See perlembed.
1919
1920 PerlInterpreter* perl_alloc()
1921
1922 perl_construct
1923 Initializes a new Perl interpreter. See perlembed.
1924
1925 void perl_construct(PerlInterpreter *my_perl)
1926
1927 perl_destruct
1928 Shuts down a Perl interpreter. See perlembed for a tutorial.
1929
1930 "my_perl" points to the Perl interpreter. It must have been
1931 previously created through the use of "perl_alloc" and
1932 "perl_construct". It may have been initialised through
1933 "perl_parse", and may have been used through "perl_run" and
1934 other means. This function should be called for any Perl
1935 interpreter that has been constructed with "perl_construct",
1936 even if subsequent operations on it failed, for example if
1937 "perl_parse" returned a non-zero value.
1938
1939 If the interpreter's "PL_exit_flags" word has the
1940 "PERL_EXIT_DESTRUCT_END" flag set, then this function will
1941 execute code in "END" blocks before performing the rest of
1942 destruction. If it is desired to make any use of the
1943 interpreter between "perl_parse" and "perl_destruct" other than
1944 just calling "perl_run", then this flag should be set early on.
1945 This matters if "perl_run" will not be called, or if anything
1946 else will be done in addition to calling "perl_run".
1947
1948 Returns a value suitable to pass to the C library
1949 function "exit" (or to return from "main"), to serve as an exit
1950 code indicating the nature of the way the interpreter
1951 terminated. This takes into account any failure of
1952 "perl_parse" and any early exit from "perl_run". The exit code
1953 is of the type required by the host operating system, so
1954 because of differing exit code conventions it is not portable
1955 to interpret specific numeric values as having specific
1956 meanings.
1957
1958 int perl_destruct(PerlInterpreter *my_perl)
1959
1960 perl_free
1961 Releases a Perl interpreter. See perlembed.
1962
1963 void perl_free(PerlInterpreter *my_perl)
1964
1965 perl_parse
1966 Tells a Perl interpreter to parse a Perl script. This performs
1967 most of the initialisation of a Perl interpreter. See
1968 perlembed for a tutorial.
1969
1970 "my_perl" points to the Perl interpreter that is to parse the
1971 script. It must have been previously created through the use
1972 of "perl_alloc" and "perl_construct". "xsinit" points to a
1973 callback function that will be called to set up the ability for
1974 this Perl interpreter to load XS extensions, or may be null to
1975 perform no such setup.
1976
1977 "argc" and "argv" supply a set of command-line arguments to the
1978 Perl interpreter, as would normally be passed to the "main"
1979 function of a C program. "argv[argc]" must be null. These
1980 arguments are where the script to parse is specified, either by
1981 naming a script file or by providing a script in a "-e" option.
1982 If $0 will be written to in the Perl interpreter, then the
1983 argument strings must be in writable memory, and so mustn't
1984 just be string constants.
1985
1986 "env" specifies a set of environment variables that will be
1987 used by this Perl interpreter. If non-null, it must point to a
1988 null-terminated array of environment strings. If null, the
1989 Perl interpreter will use the environment supplied by the
1990 "environ" global variable.
1991
1992 This function initialises the interpreter, and parses and
1993 compiles the script specified by the command-line arguments.
1994 This includes executing code in "BEGIN", "UNITCHECK", and
1995 "CHECK" blocks. It does not execute "INIT" blocks or the main
1996 program.
1997
1998 Returns an integer of slightly tricky interpretation. The
1999 correct use of the return value is as a truth value indicating
2000 whether there was a failure in initialisation. If zero is
2001 returned, this indicates that initialisation was successful,
2002 and it is safe to proceed to call "perl_run" and make other use
2003 of it. If a non-zero value is returned, this indicates some
2004 problem that means the interpreter wants to terminate. The
2005 interpreter should not be just abandoned upon such failure; the
2006 caller should proceed to shut the interpreter down cleanly with
2007 "perl_destruct" and free it with "perl_free".
2008
2009 For historical reasons, the non-zero return value also attempts
2010 to be a suitable value to pass to the C library function "exit"
2011 (or to return from "main"), to serve as an exit code indicating
2012 the nature of the way initialisation terminated. However, this
2013 isn't portable, due to differing exit code conventions. A
2014 historical bug is preserved for the time being: if the Perl
2015 built-in "exit" is called during this function's execution,
2016 with a type of exit entailing a zero exit code under the host
2017 operating system's conventions, then this function returns zero
2018 rather than a non-zero value. This bug, [perl #2754], leads to
2019 "perl_run" being called (and therefore "INIT" blocks and the
2020 main program running) despite a call to "exit". It has been
2021 preserved because a popular module-installing module has come
2022 to rely on it and needs time to be fixed. This issue is [perl
2023 #132577], and the original bug is due to be fixed in Perl 5.30.
2024
2025 int perl_parse(PerlInterpreter *my_perl,
2026 XSINIT_t xsinit, int argc,
2027 char **argv, char **env)
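The two constraints on "argv" stated above (a null pointer at "argv[argc]", and writable strings if $0 may be assigned to) can be met by building the vector on the heap. The following is a minimal sketch in plain C; "make_embed_argv" and "free_embed_argv" are hypothetical helpers, not Perl API functions, and the empty program name with a "-e" option is just one common choice of arguments.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: builds {"", "-e", code, NULL} with every
 * string in writable heap memory.  Returns NULL on allocation
 * failure; otherwise stores the argument count in *argc_out. */
char **make_embed_argv(const char *code, int *argc_out)
{
    const char *src[3];
    char **argv;
    int i;

    src[0] = "";
    src[1] = "-e";
    src[2] = code;

    argv = malloc(4 * sizeof *argv);
    if (argv == NULL)
        return NULL;
    for (i = 0; i < 3; i++) {
        size_t len = strlen(src[i]) + 1;

        argv[i] = malloc(len);
        if (argv[i] == NULL) {
            while (i-- > 0)
                free(argv[i]);
            free(argv);
            return NULL;
        }
        memcpy(argv[i], src[i], len);   /* writable copy */
    }
    argv[3] = NULL;                     /* argv[argc] must be null */
    *argc_out = 3;
    return argv;
}

/* Frees a vector built by make_embed_argv(). */
void free_embed_argv(char **argv)
{
    char **p;

    if (argv == NULL)
        return;
    for (p = argv; *p != NULL; p++)
        free(*p);
    free(argv);
}
```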
2028
2029 perl_run
2030 Tells a Perl interpreter to run its main program. See
2031 perlembed for a tutorial.
2032
2033 "my_perl" points to the Perl interpreter. It must have been
2034 previously created through the use of "perl_alloc" and
2035 "perl_construct", and initialised through "perl_parse". This
2036 function should not be called if "perl_parse" returned a non-
2037 zero value, indicating a failure in initialisation or
2038 compilation.
2039
2040 This function executes code in "INIT" blocks, and then executes
2041 the main program. The code to be executed is that established
2042 by the prior call to "perl_parse". If the interpreter's
2043 "PL_exit_flags" word does not have the "PERL_EXIT_DESTRUCT_END"
2044 flag set, then this function will also execute code in "END"
2045 blocks. If it is desired to make any further use of the
2046 interpreter after calling this function, then "END" blocks
2047 should be postponed to "perl_destruct" time by setting that
2048 flag.
2049
2050 Returns an integer of slightly tricky interpretation. The
2051 correct use of the return value is as a truth value indicating
2052 whether the program terminated non-locally. If zero is
2053 returned, this indicates that the program ran to completion,
2054 and it is safe to make other use of the interpreter (provided
2055 that the "PERL_EXIT_DESTRUCT_END" flag was set as described
2056 above). If a non-zero value is returned, this indicates that
2057 the interpreter wants to terminate early. The interpreter
2058 should not be just abandoned because of this desire to
2059 terminate; the caller should proceed to shut the interpreter
2060 down cleanly with "perl_destruct" and free it with "perl_free".
2061
2062 For historical reasons, the non-zero return value also attempts
2063 to be a suitable value to pass to the C library function "exit"
2064 (or to return from "main"), to serve as an exit code indicating
2065 the nature of the way the program terminated. However, this
2066 isn't portable, due to differing exit code conventions. An
2067 attempt is made to return an exit code of the type required by
2068 the host operating system, but because it is constrained to be
2069 non-zero, it is not necessarily possible to indicate every type
2070 of exit. It is only reliable on Unix, where a zero exit code
2071 can be augmented with a set bit that will be ignored. In any
2072 case, this function is not the correct place to acquire an exit
2073 code: one should get that from "perl_destruct".
2074
2075 int perl_run(PerlInterpreter *my_perl)
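The control flow recommended above for "perl_parse", "perl_run", "perl_destruct" and "perl_free" can be summarised as: treat the parse result as a truth value, run only if parsing succeeded, always destruct and free, and take the exit code from the destruct step. The sketch below models just that flow with stand-in functions so that it compiles without Perl headers. Every model_* name is hypothetical; see perlembed for the real call sequence.

```c
#include <stddef.h>

/* Hypothetical stand-in for an interpreter.  The first three fields
 * script what the stand-in steps report; "trace" records which
 * steps ran, in order. */
struct model_interp {
    int parse_result;
    int run_result;
    int exit_code;
    char trace[8];
    size_t ntrace;
};

static void model_note(struct model_interp *in, char step)
{
    if (in->ntrace + 1 < sizeof in->trace) {
        in->trace[in->ntrace++] = step;
        in->trace[in->ntrace] = '\0';
    }
}

int model_parse(struct model_interp *in)
{
    model_note(in, 'p');
    return in->parse_result;
}

int model_run(struct model_interp *in)
{
    model_note(in, 'r');
    return in->run_result;
}

int model_destruct(struct model_interp *in)
{
    model_note(in, 'd');
    return in->exit_code;
}

void model_free(struct model_interp *in)
{
    model_note(in, 'f');
}

/* The recommended flow; returns the value to pass to exit(). */
int model_lifecycle(struct model_interp *in)
{
    int code;

    /* The parse result is only a truth value: zero means success. */
    if (model_parse(in) == 0)
        (void)model_run(in);    /* not the place to get an exit code */

    /* Always shut down cleanly, even after a failed parse. */
    code = model_destruct(in);
    model_free(in);
    return code;
}
```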
2076
2077 require_pv
2078 Tells Perl to "require" the file named by the string argument.
2079 It is analogous to the Perl code "eval "require '$file'"".
2080 It's even implemented that way; consider using load_module
2081 instead.
2082
2083 NOTE: the perl_ form of this function is deprecated.
2084
2085 void require_pv(const char* pv)
2086
2088 dXCPT Set up necessary local variables for exception handling. See
2089 "Exception Handling" in perlguts.
2090
2091 dXCPT;
2092
2093 XCPT_CATCH
2094 Introduces a catch block. See "Exception Handling" in
2095 perlguts.
2096
2097 XCPT_RETHROW
2098 Rethrows a previously caught exception. See "Exception
2099 Handling" in perlguts.
2100
2101 XCPT_RETHROW;
2102
2103 XCPT_TRY_END
2104 Ends a try block. See "Exception Handling" in perlguts.
2105
2106 XCPT_TRY_START
2107 Starts a try block. See "Exception Handling" in perlguts.
2108
2110 sortsv_flags
2111 In-place sort an array of SV pointers with the given comparison
2112 routine, with various SORTf_* flag options.
2113
2114 void sortsv_flags(SV** array, size_t num_elts,
2115 SVCOMPARE_t cmp, U32 flags)
2116
2118 save_gp Saves the current GP of gv on the save stack to be restored on
2119 scope exit.
2120
2121 If empty is true, replace the GP with a new GP.
2122
2123 If empty is false, mark gv with GVf_INTRO so the next reference
2124 assigned is localized, which is how " local *foo = $someref; "
2125 works.
2126
2127 void save_gp(GV* gv, I32 empty)
2128
2130 new_version
2131 Returns a new version object based on the passed in SV:
2132
2133 SV *sv = new_version(SV *ver);
2134
2135 Does not alter the passed in ver SV. See "upg_version" if you
2136 want to upgrade the SV.
2137
2138 SV* new_version(SV *ver)
2139
2140 prescan_version
2141 Validates that a given string can be parsed as a version object,
2142 but does not actually perform the parsing. Can use either
2143 strict or lax validation rules. Can optionally set a number of
2144 hint variables to save the parsing code some time when
2145 tokenizing.
2146
2147 const char* prescan_version(const char *s, bool strict,
2148 const char** errstr,
2149 bool *sqv,
2150 int *ssaw_decimal,
2151 int *swidth, bool *salpha)
2152
2153 scan_version
2154 Returns a pointer to the next character after the parsed
2155 version string, as well as upgrading the passed in SV to an RV.
2156
2157 Function must be called with an already existing SV like
2158
2159 sv = newSV(0);
2160 s = scan_version(s, SV *sv, bool qv);
2161
2162 Performs some preprocessing to the string to ensure that it has
2163 the correct characteristics of a version. Flags the object if
2164 it contains an underscore (which denotes this is an alpha
2165 version). The boolean qv denotes that the version should be
2166 interpreted as if it had multiple decimals, even if it doesn't.
2167
2168 const char* scan_version(const char *s, SV *rv, bool qv)
2169
2170 upg_version
2171 In-place upgrade of the supplied SV to a version object.
2172
2173 SV *sv = upg_version(SV *sv, bool qv);
2174
2175 Returns a pointer to the upgraded SV. Set the boolean qv if
2176 you want to force this SV to be interpreted as an "extended"
2177 version.
2178
2179 SV* upg_version(SV *ver, bool qv)
2180
2181 vcmp Version object aware cmp. Both operands must already have been
2182 converted into version objects.
2183
2184 int vcmp(SV *lhv, SV *rhv)
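To give a feel for what a version-aware comparison means, here is a standalone model that compares two versions held as arrays of integer components, rather than as strings. It is not the Perl implementation and does not call the Perl API. "vcmp_model" is a hypothetical name, and the rule that missing trailing components count as zero is an assumption of this sketch.

```c
#include <stddef.h>

/* Hypothetical model: compares component arrays left to right.
 * Returns -1, 0 or 1.  Missing trailing components are assumed to
 * count as zero. */
int vcmp_model(const int *l, size_t ln, const int *r, size_t rn)
{
    size_t i;
    size_t n = ln < rn ? ln : rn;

    for (i = 0; i < n; i++) {
        if (l[i] != r[i])
            return l[i] < r[i] ? -1 : 1;
    }
    /* Extra components only matter if one of them is nonzero. */
    for (i = n; i < ln; i++)
        if (l[i] != 0)
            return 1;
    for (i = n; i < rn; i++)
        if (r[i] != 0)
            return -1;
    return 0;
}
```

Comparing components numerically is what makes {1, 10} sort after {1, 9}, which a plain string comparison of "1.10" and "1.9" would get wrong.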
2185
2186 vnormal Accepts a version object and returns the normalized string
2187 representation. Call like:
2188
2189 sv = vnormal(rv);
2190
2191 NOTE: you can pass either the object directly or the SV
2192 contained within the RV.
2193
2194 The SV returned has a refcount of 1.
2195
2196 SV* vnormal(SV *vs)
2197
2198 vnumify Accepts a version object and returns the normalized floating
2199 point representation. Call like:
2200
2201 sv = vnumify(rv);
2202
2203 NOTE: you can pass either the object directly or the SV
2204 contained within the RV.
2205
2206 The SV returned has a refcount of 1.
2207
2208 SV* vnumify(SV *vs)
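The two normalized forms returned by "vnormal" and "vnumify" differ only in layout. As an illustration, the standalone model below formats a three-component version both ways: dotted with a leading "v", and as a decimal with three digits per later component. The function names are hypothetical, the model is restricted to exactly three components in the range 0 to 999, and it does not call the Perl API.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical model of the dotted form, e.g. "v1.2.3". */
int vnormal_model(char *dst, size_t dstsize, const int c[3])
{
    return snprintf(dst, dstsize, "v%d.%d.%d", c[0], c[1], c[2]);
}

/* Hypothetical model of the decimal form, e.g. "1.002003": each
 * component after the first is written as three digits. */
int vnumify_model(char *dst, size_t dstsize, const int c[3])
{
    return snprintf(dst, dstsize, "%d.%03d%03d", c[0], c[1], c[2]);
}
```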
2209
2210 vstringify
2211 In order to maintain maximum compatibility with earlier
2212 versions of Perl, this function will return either the floating
2213 point notation or the multiple dotted notation, depending on
2214 whether the original version contained one dot or more than one dot,
2215 respectively.
2216
2217 The SV returned has a refcount of 1.
2218
2219 SV* vstringify(SV *vs)
2220
2221 vverify Validates that the SV contains valid internal structure for a
2222 version object. It may be passed either the version object
2223 (RV) or the hash itself (HV). If the structure is valid, it
2224 returns the HV. If the structure is invalid, it returns NULL.
2225
2226 SV *hv = vverify(sv);
2227
2228 Note that it only confirms the bare minimum structure (so as
2229 not to get confused by derived classes which may contain
2230 additional hash entries):
2231
2232 · The SV is an HV or a reference to an HV
2233
2234 · The hash contains a "version" key
2235
2236 · The "version" key has a reference to an AV as its value
2237
2238 SV* vverify(SV *vs)
2239
2241 G_ARRAY Used to indicate list context. See "GIMME_V", "GIMME" and
2242 perlcall.
2243
2244 G_DISCARD
2245 Indicates that arguments returned from a callback should be
2246 discarded. See perlcall.
2247
2248 G_EVAL Used to force a Perl "eval" wrapper around a callback. See
2249 perlcall.
2250
2251 GIMME A backward-compatible version of "GIMME_V" which can only
2252 return "G_SCALAR" or "G_ARRAY"; in a void context, it returns
2253 "G_SCALAR". Deprecated. Use "GIMME_V" instead.
2254
2255 U32 GIMME
2256
2257 GIMME_V The XSUB-writer's equivalent to Perl's "wantarray". Returns
2258 "G_VOID", "G_SCALAR" or "G_ARRAY" for void, scalar or list
2259 context, respectively. See perlcall for a usage example.
2260
2261 U32 GIMME_V
2262
2263 G_NOARGS
2264 Indicates that no arguments are being sent to a callback. See
2265 perlcall.
2266
2267 G_SCALAR
2268 Used to indicate scalar context. See "GIMME_V", "GIMME", and
2269 perlcall.
2270
2271 G_VOID Used to indicate void context. See "GIMME_V" and perlcall.
2272
2274 These variables are global to an entire process. They are shared
2275 between all interpreters and all threads in a process. Any variables
2276 not documented here may be changed or removed without notice, so don't
2277 use them! If you feel you really do need to use an unlisted variable,
2278 first send email to perl5-porters@perl.org
2279 <mailto:perl5-porters@perl.org>. It may be that someone there will
2280 point out a way to accomplish what you need without using an internal
2281 variable. But if not, you should get a go-ahead to document and then
2282 use the variable.
2283
2284 PL_check
2285 Array, indexed by opcode, of functions that will be called for
2286 the "check" phase of optree building during compilation of Perl
2287 code. For most (but not all) types of op, once the op has been
2288 initially built and populated with child ops it will be
2289 filtered through the check function referenced by the
2290 appropriate element of this array. The new op is passed in as
2291 the sole argument to the check function, and the check function
2292 returns the completed op. The check function may (as the name
2293 suggests) check the op for validity and signal errors. It may
2294 also initialise or modify parts of the ops, or perform more
2295 radical surgery such as adding or removing child ops, or even
2296 throw the op away and return a different op in its place.
2297
2298 This array of function pointers is a convenient place to hook
2299 into the compilation process. An XS module can put its own
2300 custom check function in place of any of the standard ones, to
2301 influence the compilation of a particular type of op. However,
2302 a custom check function must never fully replace a standard
2303 check function (or even a custom check function from another
2304 module). A module modifying checking must instead wrap the
2305 preexisting check function. A custom check function must be
2306 selective about when to apply its custom behaviour. In the
2307 usual case where it decides not to do anything special with an
2308 op, it must chain to the preexisting check function. Check functions
2309 are thus linked in a chain, with the core's base checker at the
2310 end.
2311
2312 For thread safety, modules should not write directly to this
2313 array. Instead, use the function "wrap_op_checker".
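The chaining discipline described above (save the preexisting checker, act selectively, then pass the op on) can be sketched with plain function pointers. Everything here is a hypothetical model: "model_op", "model_check" and the other names are not Perl types or variables. The model writes to its array directly only because it is single-threaded; with the real "PL_check", use "wrap_op_checker" as noted above.

```c
#include <stddef.h>

#define MODEL_NOPS 4    /* hypothetical number of opcodes */

/* Hypothetical op with just enough state to observe the chain. */
struct model_op {
    int type;
    int value;
    int seen_by_core;
    int seen_by_custom;
};

typedef struct model_op *(*model_checker)(struct model_op *);

/* Stand-in for the core's base checker at the end of the chain. */
struct model_op *model_core_check(struct model_op *o)
{
    o->seen_by_core = 1;
    return o;
}

/* Stand-in for the per-opcode array of check functions. */
model_checker model_check[MODEL_NOPS] = {
    model_core_check, model_core_check,
    model_core_check, model_core_check
};

/* The checker that was installed before ours. */
static model_checker saved_checker;

/* Custom checker: acts only on ops it cares about (here, negative
 * values), then always chains to the preexisting checker. */
struct model_op *my_check(struct model_op *o)
{
    if (o->value < 0)
        o->seen_by_custom = 1;
    return saved_checker(o);
}

/* Wraps, rather than replaces, the checker for one op type. */
void install_my_check(int type)
{
    saved_checker = model_check[type];
    model_check[type] = my_check;
}
```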
2314
2315 PL_keyword_plugin
2316 NOTE: this variable is experimental and may change or be
2317 removed without notice.
2318
2319 Function pointer, pointing at a function used to handle
2320 extended keywords. The function should be declared as
2321
2322 int keyword_plugin_function(pTHX_
2323 char *keyword_ptr, STRLEN keyword_len,
2324 OP **op_ptr)
2325
2326 The function is called from the tokeniser, whenever a possible
2327 keyword is seen. "keyword_ptr" points at the word in the
2328 parser's input buffer, and "keyword_len" gives its length; it
2329 is not null-terminated. The function is expected to examine
2330 the word, and possibly other state such as %^H, to decide
2331 whether it wants to handle it as an extended keyword. If it
2332 does not, the function should return "KEYWORD_PLUGIN_DECLINE",
2333 and the normal parser process will continue.
2334
2335 If the function wants to handle the keyword, it first must
2336 parse anything following the keyword that is part of the syntax
2337 introduced by the keyword. See "Lexer interface" for details.
2338
2339 When a keyword is being handled, the plugin function must build
2340 a tree of "OP" structures, representing the code that was
2341 parsed. The root of the tree must be stored in *op_ptr. The
2342 function then returns a constant indicating the syntactic role
2343 of the construct that it has parsed: "KEYWORD_PLUGIN_STMT" if
2344 it is a complete statement, or "KEYWORD_PLUGIN_EXPR" if it is
2345 an expression. Note that a statement construct cannot be used
2346 inside an expression (except via "do BLOCK" and similar), and
2347 an expression is not a complete statement (it requires at least
2348 a terminating semicolon).
2349
2350 When a keyword is handled, the plugin function may also have
2351 (compile-time) side effects. It may modify "%^H", define
2352 functions, and so on. Typically, if side effects are the main
2353 purpose of a handler, it does not wish to generate any ops to
2354 be included in the normal compilation. In this case it is
2355 still required to supply an op tree, but it suffices to
2356 generate a single null op.
2357
2358 That's how the *PL_keyword_plugin function needs to behave
2359 overall. Conventionally, however, one does not completely
2360 replace the existing handler function. Instead, take a copy of
2361 "PL_keyword_plugin" before assigning your own function pointer
2362 to it. Your handler function should look for keywords that it
2363 is interested in and handle those. Where it is not interested,
2364 it should call the saved plugin function, passing on the
2365 arguments it received. Thus "PL_keyword_plugin" actually
2366 points at a chain of handler functions, all of which have an
2367 opportunity to handle keywords, and only the last function in
2368 the chain (built into the Perl core) will normally return
2369 "KEYWORD_PLUGIN_DECLINE".
2370
2371 For thread safety, modules should not set this variable
2372 directly. Instead, use the function "wrap_keyword_plugin".
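
The chaining convention described above can be modelled in plain C without any Perl headers. The following is only a sketch under stated assumptions: every "model_" name, the simplified handler signature, the keyword "frob" and the integer standing in for an op tree are hypothetical, and none of them are part of the Perl API.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for KEYWORD_PLUGIN_DECLINE / _STMT / _EXPR. */
enum model_result { MODEL_DECLINE, MODEL_STMT, MODEL_EXPR };

/* Simplified handler type; the real one has a different signature. */
typedef enum model_result (*model_plugin_t)(const char *kw, size_t len,
                                            int *op_out);

/* End of the chain, standing in for the handler built into the core. */
static enum model_result model_core_plugin(const char *kw, size_t len,
                                           int *op_out)
{
    (void)kw; (void)len; (void)op_out;
    return MODEL_DECLINE;
}

/* Models PL_keyword_plugin: always points at the head of the chain. */
static model_plugin_t model_keyword_plugin = model_core_plugin;

/* Saved copy of the previous head; null until our handler is installed. */
static model_plugin_t next_plugin;

static enum model_result my_plugin(const char *kw, size_t len, int *op_out)
{
    if (len == 4 && memcmp(kw, "frob", 4) == 0) {
        *op_out = 42;              /* stands in for building an op tree */
        return MODEL_STMT;
    }
    /* Not our keyword: pass the arguments on to the saved handler. */
    return next_plugin(kw, len, op_out);
}

/* Save the old head, then put our handler in front of it (once only). */
static void install_my_plugin(void)
{
    if (next_plugin == NULL) {
        next_plugin = model_keyword_plugin;
        model_keyword_plugin = my_plugin;
    }
}
```

After install_my_plugin() has run, a call through model_keyword_plugin reaches my_plugin first and falls through to the core handler for any keyword other than "frob". In real XS code the save-and-replace step is what "wrap_keyword_plugin" is meant to perform.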
2373
2375 A GV is a structure which corresponds to a Perl typeglob, i.e. *foo.
2376 It is a structure that holds a pointer to a scalar, an array, a hash
2377 etc, corresponding to $foo, @foo, %foo.
2378
2379 GVs are usually found as values in stashes (symbol table hashes) where
2380 Perl stores its global variables.
2381
2382 GvAV Return the AV from the GV.
2383
2384 AV* GvAV(GV* gv)
2385
2386 gv_const_sv
2387 If "gv" is a typeglob whose subroutine entry is a constant sub
2388 eligible for inlining, or "gv" is a placeholder reference that
2389 would be promoted to such a typeglob, then returns the value
2390 returned by the sub. Otherwise, returns "NULL".
2391
2392 SV* gv_const_sv(GV* gv)
2393
2394 GvCV Return the CV from the GV.
2395
2396 CV* GvCV(GV* gv)
2397
2398 gv_fetchmeth
2399 Like "gv_fetchmeth_pvn", but lacks a flags parameter.
2400
2401 GV* gv_fetchmeth(HV* stash, const char* name,
2402 STRLEN len, I32 level)
2403
2404 gv_fetchmethod_autoload
2405 Returns the glob which contains the subroutine to call to
2406 invoke the method on the "stash". In fact in the presence of
2407 autoloading this may be the glob for "AUTOLOAD". In this case
2408 the corresponding variable $AUTOLOAD is already set up.
2409
2410 The third parameter of "gv_fetchmethod_autoload" determines
2411 whether AUTOLOAD lookup is performed if the given method is not
2412 present: non-zero means yes, look for AUTOLOAD; zero means no,
2413 don't look for AUTOLOAD. Calling "gv_fetchmethod" is
2414 equivalent to calling "gv_fetchmethod_autoload" with a non-zero
2415 "autoload" parameter.
2416
2417 These functions accept the "SUPER" token as a prefix of the method
2418 name. Note that if you want to keep the returned glob for a
2419 long time, you need to check for it being "AUTOLOAD", since at
2420 the later time the call may load a different subroutine due to
2421 $AUTOLOAD changing its value. Use the glob created as a side
2422 effect to do this.
2423
2424 These functions have the same side-effects as "gv_fetchmeth"
2425 with "level==0". The warning against passing the GV returned
2426 by "gv_fetchmeth" to "call_sv" applies equally to these
2427 functions.
2428
2429 GV* gv_fetchmethod_autoload(HV* stash,
2430 const char* name,
2431 I32 autoload)
2432
2433 gv_fetchmeth_autoload
2434 This is the old form of "gv_fetchmeth_pvn_autoload", which has
2435 no flags parameter.
2436
2437 GV* gv_fetchmeth_autoload(HV* stash,
2438 const char* name,
2439 STRLEN len, I32 level)
2440
2441 gv_fetchmeth_pv
2442 Exactly like "gv_fetchmeth_pvn", but takes a nul-terminated
2443 string instead of a string/length pair.
2444
2445 GV* gv_fetchmeth_pv(HV* stash, const char* name,
2446 I32 level, U32 flags)
2447
2448 gv_fetchmeth_pvn
2449 Returns the glob with the given "name" and a defined subroutine
2450 or "NULL". The glob lives in the given "stash", or in the
2451 stashes accessible via @ISA and "UNIVERSAL::".
2452
2453 The argument "level" should be either 0 or -1. If "level==0",
2454 as a side-effect creates a glob with the given "name" in the
2455 given "stash" which in the case of success contains an alias
2456 for the subroutine, and sets up caching info for this glob.
2457
2458 The only significant values for "flags" are "GV_SUPER" and
2459 "SVf_UTF8".
2460
2461 "GV_SUPER" indicates that we want to look up the method in the
2462 superclasses of the "stash".
2463
2464 The GV returned from "gv_fetchmeth" may be a method cache
2465 entry, which is not visible to Perl code. So when calling
2466 "call_sv", you should not use the GV directly; instead, you
2467 should use the method's CV, which can be obtained from the GV
2468 with the "GvCV" macro.
2469
2470 GV* gv_fetchmeth_pvn(HV* stash, const char* name,
2471 STRLEN len, I32 level,
2472 U32 flags)
2473
2474 gv_fetchmeth_pvn_autoload
2475 Same as "gv_fetchmeth_pvn()", but looks for autoloaded
2476 subroutines too. Returns a glob for the subroutine.
2477
2478 For an autoloaded subroutine without a GV, will create a GV
2479 even if "level < 0". For an autoloaded subroutine without a
2480 stub, "GvCV()" of the result may be zero.
2481
2482 Currently, the only significant value for "flags" is
2483 "SVf_UTF8".
2484
2485 GV* gv_fetchmeth_pvn_autoload(HV* stash,
2486 const char* name,
2487 STRLEN len, I32 level,
2488 U32 flags)
2489
2490 gv_fetchmeth_pv_autoload
2491 Exactly like "gv_fetchmeth_pvn_autoload", but takes a nul-
2492 terminated string instead of a string/length pair.
2493
2494 GV* gv_fetchmeth_pv_autoload(HV* stash,
2495 const char* name,
2496 I32 level, U32 flags)
2497
2498 gv_fetchmeth_sv
2499 Exactly like "gv_fetchmeth_pvn", but takes the name string in
2500 the form of an SV instead of a string/length pair.
2501
2502 GV* gv_fetchmeth_sv(HV* stash, SV* namesv,
2503 I32 level, U32 flags)
2504
2505 gv_fetchmeth_sv_autoload
2506 Exactly like "gv_fetchmeth_pvn_autoload", but takes the name
2507 string in the form of an SV instead of a string/length pair.
2508
2509 GV* gv_fetchmeth_sv_autoload(HV* stash, SV* namesv,
2510 I32 level, U32 flags)
2511
2512 GvHV Return the HV from the GV.
2513
2514 HV* GvHV(GV* gv)
2515
2516 gv_init The old form of "gv_init_pvn()". It does not work with UTF-8
2517 strings, as it has no flags parameter. If the "multi"
2518 parameter is set, the "GV_ADDMULTI" flag will be passed to
2519 "gv_init_pvn()".
2520
2521 void gv_init(GV* gv, HV* stash, const char* name,
2522 STRLEN len, int multi)
2523
2524 gv_init_pv
2525 Same as "gv_init_pvn()", but takes a nul-terminated string for
2526 the name instead of separate char * and length parameters.
2527
2528 void gv_init_pv(GV* gv, HV* stash, const char* name,
2529 U32 flags)
2530
2531 gv_init_pvn
2532 Converts a scalar into a typeglob. This is an incoercible
2533 typeglob; assigning a reference to it will assign to one of its
2534 slots, instead of overwriting it as happens with typeglobs
2535 created by "SvSetSV". Converting any scalar that is "SvOK()"
2536 may produce unpredictable results and is reserved for perl's
2537 internal use.
2538
2539 "gv" is the scalar to be converted.
2540
2541 "stash" is the parent stash/package, if any.
2542
2543 "name" and "len" give the name. The name must be unqualified;
2544 that is, it must not include the package name. If "gv" is a
2545 stash element, it is the caller's responsibility to ensure that
2546 the name passed to this function matches the name of the
2547 element. If it does not match, perl's internal bookkeeping
2548 will get out of sync.
2549
2550 "flags" can be set to "SVf_UTF8" if "name" is a UTF-8 string,
2551 or the return value of SvUTF8(sv). It can also take the
2552 "GV_ADDMULTI" flag, which means to pretend that the GV has been
2553 seen before (i.e., suppress "Used once" warnings).
2554
2555 void gv_init_pvn(GV* gv, HV* stash, const char* name,
2556 STRLEN len, U32 flags)
2557
2558 gv_init_sv
2559 Same as "gv_init_pvn()", but takes an SV * for the name instead
2560 of separate char * and length parameters. "flags" is currently
2561 unused.
2562
2563 void gv_init_sv(GV* gv, HV* stash, SV* namesv,
2564 U32 flags)
2565
2566 gv_stashpv
2567 Returns a pointer to the stash for a specified package. Uses
2568 "strlen" to determine the length of "name", then calls
2569 "gv_stashpvn()".
2570
2571 HV* gv_stashpv(const char* name, I32 flags)
2572
2573 gv_stashpvn
2574 Returns a pointer to the stash for a specified package. The
2575 "namelen" parameter indicates the length of the "name", in
2576 bytes. "flags" is passed to "gv_fetchpvn_flags()", so if set
2577 to "GV_ADD" then the package will be created if it does not
2578 already exist. If the package does not exist and "flags" is 0
2579 (or any other setting that does not create packages) then
2580 "NULL" is returned.
2581
2582 Flags may be one of:
2583
2584 GV_ADD
2585 SVf_UTF8
2586 GV_NOADD_NOINIT
2587 GV_NOINIT
2588 GV_NOEXPAND
2589 GV_ADDMG
2590
2591 The most important of which are probably "GV_ADD" and
2592 "SVf_UTF8".
2593
2594 Note, use of "gv_stashsv" instead of "gv_stashpvn" where
2595 possible is strongly recommended for performance reasons.
2596
2597 HV* gv_stashpvn(const char* name, U32 namelen,
2598 I32 flags)
2599
2600 gv_stashpvs
2601 Like "gv_stashpvn", but takes a literal string instead of a
2602 string/length pair.
2603
2604 HV* gv_stashpvs("literal string" name, I32 create)
2605
2606 gv_stashsv
2607 Returns a pointer to the stash for a specified package. See
2608 "gv_stashpvn".
2609
2610 Note this interface is strongly preferred over "gv_stashpvn"
2611 for performance reasons.
2612
2613 HV* gv_stashsv(SV* sv, I32 flags)
2614
2615 GvSV Return the SV from the GV.
2616
2617 SV* GvSV(GV* gv)
2618
2619 setdefout
2620 Sets "PL_defoutgv", the default file handle for output, to the
2621 passed in typeglob. As "PL_defoutgv" "owns" a reference on its
2622 typeglob, the reference count of the passed in typeglob is
2623 increased by one, and the reference count of the typeglob that
2624 "PL_defoutgv" points to is decreased by one.
2625
2626 void setdefout(GV* gv)
2627
2629 Nullav Null AV pointer.
2630
2631 (deprecated - use "(AV *)NULL" instead)
2632
2633 Nullch Null character pointer. (No longer available when "PERL_CORE"
2634 is defined.)
2635
2636 Nullcv Null CV pointer.
2637
2638 (deprecated - use "(CV *)NULL" instead)
2639
2640 Nullhv Null HV pointer.
2641
2642 (deprecated - use "(HV *)NULL" instead)
2643
2644 Nullsv Null SV pointer. (No longer available when "PERL_CORE" is
2645 defined.)
2646
2648 A HV structure represents a Perl hash. It consists mainly of an array
2649 of pointers, each of which points to a linked list of HE structures.
2650 The array is indexed by the hash function of the key, so each linked
2651 list represents all the hash entries with the same hash value. Each HE
2652 contains a pointer to the actual value, plus a pointer to a HEK
2653 structure which holds the key and hash value.
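
The layout just described can be pictured with a small stand-alone model. This is a sketch only: the "model_" structures are hypothetical and far simpler than the real HV, HE and HEK, and the hash function (FNV-1a) is merely a placeholder, not the one Perl uses.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, much simplified stand-ins for HEK, HE and HV. */
struct model_hek {
    uint32_t hash;            /* hash value of the key */
    size_t   len;             /* key length in bytes */
    char     key[];           /* key bytes; may contain NULs */
};
struct model_he {
    struct model_he  *next;   /* next entry in the same bucket */
    struct model_hek *hek;    /* key and hash value */
    void             *val;    /* the stored value */
};
struct model_hv {
    struct model_he **array;  /* one list head per bucket */
    size_t            max;    /* number of buckets minus one */
};

/* Placeholder hash function (32-bit FNV-1a). */
static uint32_t model_hash(const char *key, size_t len)
{
    uint32_t h = 2166136261u;
    for (size_t i = 0; i < len; i++) {
        h ^= (unsigned char)key[i];
        h *= 16777619u;
    }
    return h;
}

/* "buckets" must be a non-zero power of two. */
static struct model_hv *model_new_hv(size_t buckets)
{
    struct model_hv *hv = malloc(sizeof *hv);
    if (!hv) return NULL;
    hv->array = calloc(buckets, sizeof *hv->array);
    if (!hv->array) { free(hv); return NULL; }
    hv->max = buckets - 1;
    return hv;
}

/* Walk the one list selected by the hash; compare by length, not strlen. */
static struct model_he *model_find(struct model_hv *hv,
                                   const char *key, size_t len)
{
    uint32_t hash = model_hash(key, len);
    struct model_he *he = hv->array[hash & hv->max];
    for (; he; he = he->next) {
        if (he->hek->hash == hash && he->hek->len == len
            && memcmp(he->hek->key, key, len) == 0)
            return he;
    }
    return NULL;
}

/* Replace the value of an existing entry, or link a new one in. */
static struct model_he *model_store(struct model_hv *hv, const char *key,
                                    size_t len, void *val)
{
    struct model_he *he = model_find(hv, key, len);
    if (he) { he->val = val; return he; }
    he = malloc(sizeof *he);
    if (!he) return NULL;
    he->hek = malloc(sizeof *he->hek + len + 1);
    if (!he->hek) { free(he); return NULL; }
    he->hek->hash = model_hash(key, len);
    he->hek->len = len;
    memcpy(he->hek->key, key, len);
    he->hek->key[len] = '\0';
    he->val = val;
    size_t i = he->hek->hash & hv->max;
    he->next = hv->array[i];
    hv->array[i] = he;
    return he;
}
```

Because keys are compared by stored length, a key such as "a\0b" (three bytes) is handled like any other, which mirrors the point that Perl hash keys may contain embedded NULs.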
2654
2655 cop_fetch_label
2656 NOTE: this function is experimental and may change or be
2657 removed without notice.
2658
2659 Returns the label attached to a cop. The flags pointer may be
2660 set to "SVf_UTF8" or 0.
2661
2662 const char * cop_fetch_label(COP *const cop,
2663 STRLEN *len, U32 *flags)
2664
2665 cop_store_label
2666 NOTE: this function is experimental and may change or be
2667 removed without notice.
2668
2669 Save a label into a "cop_hints_hash". You need to set flags to
2670 "SVf_UTF8" for a UTF-8 label.
2671
2672 void cop_store_label(COP *const cop,
2673 const char *label, STRLEN len,
2674 U32 flags)
2675
2676 get_hv Returns the HV of the specified Perl hash. "flags" are passed
2677 to "gv_fetchpv". If "GV_ADD" is set and the Perl variable does
2678 not exist then it will be created. If "flags" is zero and the
2679 variable does not exist then "NULL" is returned.
2680
2681 NOTE: the perl_ form of this function is deprecated.
2682
2683 HV* get_hv(const char *name, I32 flags)
2684
2685 HEf_SVKEY
2686 This flag, used in the length slot of hash entries and magic
2687 structures, specifies the structure contains an "SV*" pointer
2688 where a "char*" pointer is to be expected. (For information
2689 only--not to be used).
2690
2691 HeHASH Returns the computed hash stored in the hash entry.
2692
2693 U32 HeHASH(HE* he)
2694
2695 HeKEY Returns the actual pointer stored in the key slot of the hash
2696 entry. The pointer may be either "char*" or "SV*", depending
2697 on the value of "HeKLEN()". Can be assigned to. The "HePV()"
2698 or "HeSVKEY()" macros are usually preferable for finding the
2699 value of a key.
2700
2701 void* HeKEY(HE* he)
2702
2703 HeKLEN If this is negative, and amounts to "HEf_SVKEY", it indicates
2704 the entry holds an "SV*" key. Otherwise, holds the actual
2705 length of the key. Can be assigned to. The "HePV()" macro is
2706 usually preferable for finding key lengths.
2707
2708 STRLEN HeKLEN(HE* he)
2709
2710 HePV Returns the key slot of the hash entry as a "char*" value,
2711 doing any necessary dereferencing of possibly "SV*" keys. The
2712 length of the string is placed in "len" (this is a macro, so do
2713 not use &len). If you do not care about what the length of the
2714 key is, you may use the global variable "PL_na", though this is
2715 rather less efficient than using a local variable. Remember
2716 though, that hash keys in perl are free to contain embedded
2717 nulls, so using "strlen()" or similar is not a good way to find
2718 the length of hash keys. This is very similar to the "SvPV()"
2719 macro described elsewhere in this document. See also "HeUTF8".
2720
2721 If you are using "HePV" to get values to pass to "newSVpvn()"
2722 to create a new SV, you should consider using
2723 "newSVhek(HeKEY_hek(he))" as it is more efficient.
2724
2725 char* HePV(HE* he, STRLEN len)
2726
2727 HeSVKEY Returns the key as an "SV*", or "NULL" if the hash entry does
2728 not contain an "SV*" key.
2729
2730 SV* HeSVKEY(HE* he)
2731
2732 HeSVKEY_force
2733 Returns the key as an "SV*". Will create and return a
2734 temporary mortal "SV*" if the hash entry contains only a
2735 "char*" key.
2736
2737 SV* HeSVKEY_force(HE* he)
2738
2739 HeSVKEY_set
2740 Sets the key to a given "SV*", taking care to set the
2741 appropriate flags to indicate the presence of an "SV*" key, and
2742 returns the same "SV*".
2743
2744 SV* HeSVKEY_set(HE* he, SV* sv)
2745
2746 HeUTF8 Returns whether the "char *" value returned by "HePV" is
2747 encoded in UTF-8, doing any necessary dereferencing of possibly
2748 "SV*" keys. The value returned will be 0 or non-0, not
2749 necessarily 1 (or even a value with any low bits set), so do
2750 not blindly assign this to a "bool" variable, as "bool" may be
2751 a typedef for "char".
2752
2753 U32 HeUTF8(HE* he)
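
The warning about "bool" can be demonstrated without Perl. The sketch below assumes an 8-bit "unsigned char" and uses a hypothetical flag value that has no low bits set; "MODEL_UTF8_FLAG" and "model_bool" are illustrative names, not Perl definitions.

```c
#include <stdint.h>

/* Hypothetical flag value with no low bits set, chosen for illustration. */
#define MODEL_UTF8_FLAG UINT32_C(0x20000000)

/* A "bool" that is really a character type, as the text warns may happen. */
typedef unsigned char model_bool;

/* Unsafe: converting to an 8-bit type keeps only the low 8 bits. */
static model_bool flag_truncated(uint32_t flags)
{
    return (model_bool)(flags & MODEL_UTF8_FLAG);
}

/* Safe: compare against zero first, which yields exactly 0 or 1. */
static model_bool flag_normalised(uint32_t flags)
{
    return (model_bool)((flags & MODEL_UTF8_FLAG) != 0);
}
```

With the flag set, flag_truncated() yields 0 while flag_normalised() yields 1. The same precaution applies to the real macro: test the result with "HeUTF8(he) != 0" before storing it in a narrow type.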
2754
2755 HeVAL Returns the value slot (type "SV*") stored in the hash entry.
2756 Can be assigned to.
2757
2758 SV *foo= HeVAL(he);
2759 HeVAL(he)= sv;
2760
2761
2762 SV* HeVAL(HE* he)
2763
2764 hv_assert
2765 Check that a hash is in an internally consistent state.
2766
2767 void hv_assert(HV *hv)
2768
2769 hv_bucket_ratio
2770 NOTE: this function is experimental and may change or be
2771 removed without notice.
2772
2773 If the hash is tied, dispatches through to the SCALAR tied
2774 method; otherwise, if the hash contains no keys, returns 0;
2775 otherwise returns a mortal SV containing a string specifying
2776 the number of used buckets, followed by a slash, followed by
2777 the number of available buckets.
2778
2779 This function is expensive: it must scan all of the buckets to
2780 determine which are used, and the count is NOT cached. In a
2781 large hash this could be a lot of buckets.
2782
2783 SV* hv_bucket_ratio(HV *hv)
2784
2785 hv_clear
2786 Frees all the elements of a hash, leaving it empty. The XS
2787 equivalent of "%hash = ()". See also "hv_undef".
2788
2789 See "av_clear" for a note about the hash possibly being invalid
2790 on return.
2791
2792 void hv_clear(HV *hv)
2793
2794 hv_clear_placeholders
2795 Clears any placeholders from a hash. If a restricted hash has
2796 any of its keys marked as readonly and the key is subsequently
2797 deleted, the key is not actually deleted but is marked by
2798 assigning it a value of &PL_sv_placeholder. This tags it so it
2799 will be ignored by future operations such as iterating over the
2800 hash, but will still allow the hash to have a value reassigned
2801 to the key at some future point. This function clears any such
2802 placeholder keys from the hash. See "Hash::Util::lock_keys()"
2803 for an example of its use.
2804
2805 void hv_clear_placeholders(HV *hv)
2806
2807 hv_copy_hints_hv
2808 A specialised version of "newHVhv" for copying "%^H". "ohv"
2809 must be a pointer to a hash (which may have "%^H" magic, but
2810 should be generally non-magical), or "NULL" (interpreted as an
2811 empty hash). The content of "ohv" is copied to a new hash,
2812 which has the "%^H"-specific magic added to it. A pointer to
2813 the new hash is returned.
2814
2815 HV * hv_copy_hints_hv(HV *ohv)
2816
2817 hv_delete
2818 Deletes a key/value pair in the hash. The value's SV is
2819 removed from the hash, made mortal, and returned to the caller.
2820 The absolute value of "klen" is the length of the key. If
2821 "klen" is negative the key is assumed to be in UTF-8-encoded
2822 Unicode. The "flags" value will normally be zero; if set to
2823 "G_DISCARD" then "NULL" will be returned. "NULL" will also be
2824 returned if the key is not found.
2825
2826 SV* hv_delete(HV *hv, const char *key, I32 klen,
2827 I32 flags)
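
The signed "klen" convention, shared by "hv_delete", "hv_exists", "hv_fetch" and "hv_store", packs a byte length and a UTF-8 indicator into one integer. A minimal sketch of that packing follows; the helper names are hypothetical and "int32_t" merely stands in for "I32".

```c
#include <stddef.h>
#include <stdint.h>

/* Encode a byte length and a UTF-8 indicator as a signed "klen".
 * Returns 0 on success, or -1 if the length does not fit. */
static int model_encode_klen(size_t len, int is_utf8, int32_t *klen_out)
{
    if (len > (size_t)INT32_MAX)
        return -1;
    *klen_out = is_utf8 ? -(int32_t)len : (int32_t)len;
    return 0;
}

/* The key length is the absolute value of klen. */
static size_t model_klen_length(int32_t klen)
{
    return klen < 0 ? (size_t)(-(int64_t)klen) : (size_t)klen;
}

/* A negative klen marks the key as UTF-8 encoded. */
static int model_klen_is_utf8(int32_t klen)
{
    return klen < 0;
}
```

One consequence of the encoding is that a zero-length key cannot carry the UTF-8 indicator. That is harmless, since the empty key is the same in either encoding.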
2828
2829 hv_delete_ent
2830 Deletes a key/value pair in the hash. The value SV is removed
2831 from the hash, made mortal, and returned to the caller. The
2832 "flags" value will normally be zero; if set to "G_DISCARD" then
2833 "NULL" will be returned. "NULL" will also be returned if the
2834 key is not found. "hash" can be a valid precomputed hash
2835 value, or 0 to ask for it to be computed.
2836
2837 SV* hv_delete_ent(HV *hv, SV *keysv, I32 flags,
2838 U32 hash)
2839
2840 HvENAME Returns the effective name of a stash, or NULL if there is
2841 none. The effective name represents a location in the symbol
2842 table where this stash resides. It is updated automatically
2843 when packages are aliased or deleted. A stash that is no
2844 longer in the symbol table has no effective name. This name is
2845 preferable to "HvNAME" for use in MRO linearisations and isa
2846 caches.
2847
2848 char* HvENAME(HV* stash)
2849
2850 HvENAMELEN
2851 Returns the length of the stash's effective name.
2852
2853 STRLEN HvENAMELEN(HV *stash)
2854
2855 HvENAMEUTF8
2856 Returns true if the effective name is in UTF-8 encoding.
2857
2858 unsigned char HvENAMEUTF8(HV *stash)
2859
2860 hv_exists
2861 Returns a boolean indicating whether the specified hash key
2862 exists. The absolute value of "klen" is the length of the key.
2863 If "klen" is negative the key is assumed to be in UTF-8-encoded
2864 Unicode.
2865
2866 bool hv_exists(HV *hv, const char *key, I32 klen)
2867
2868 hv_exists_ent
2869 Returns a boolean indicating whether the specified hash key
2870 exists. "hash" can be a valid precomputed hash value, or 0 to
2871 ask for it to be computed.
2872
2873 bool hv_exists_ent(HV *hv, SV *keysv, U32 hash)
2874
2875 hv_fetch
2876 Returns the SV which corresponds to the specified key in the
2877 hash. The absolute value of "klen" is the length of the key.
2878 If "klen" is negative the key is assumed to be in UTF-8-encoded
2879 Unicode. If "lval" is set then the fetch will be part of a
2880 store. This means that if there is no value in the hash
2881 associated with the given key, then one is created and a
2882 pointer to it is returned. The "SV*" it points to can be
2883 assigned to. But always check that the return value is non-
2884 null before dereferencing it to an "SV*".
2885
2886 See "Understanding the Magic of Tied Hashes and Arrays" in
2887 perlguts for more information on how to use this function on
2888 tied hashes.
2889
2890 SV** hv_fetch(HV *hv, const char *key, I32 klen,
2891 I32 lval)
2892
2893 hv_fetchs
2894 Like "hv_fetch", but takes a literal string instead of a
2895 string/length pair.
2896
2897 SV** hv_fetchs(HV* tb, "literal string" key,
2898 I32 lval)
2899
2900 hv_fetch_ent
2901 Returns the hash entry which corresponds to the specified key
2902 in the hash. "hash" must be a valid precomputed hash number
2903 for the given "key", or 0 if you want the function to compute
2904 it. If "lval" is set then the fetch will be part of a store.
2905 Make sure the return value is non-null before accessing it.
2906 The return value when "hv" is a tied hash is a pointer to a
2907 static location, so be sure to make a copy of the structure if
2908 you need to store it somewhere.
2909
2910 See "Understanding the Magic of Tied Hashes and Arrays" in
2911 perlguts for more information on how to use this function on
2912 tied hashes.
2913
2914 HE* hv_fetch_ent(HV *hv, SV *keysv, I32 lval,
2915 U32 hash)
2916
2917 hv_fill Returns the number of hash buckets that happen to be in use.
2918
2919 This function is wrapped by the macro "HvFILL".
2920
2921 As of perl 5.25 this function is used only for debugging
2922 purposes, and the number of used hash buckets is not in any way
2923 cached, thus this function can be costly to execute as it must
2924 iterate over all the buckets in the hash.
2925
2926 STRLEN hv_fill(HV *const hv)
2927
2928 hv_iterinit
2929 Prepares a starting point to traverse a hash table. Returns
2930 the number of keys in the hash, including placeholders (i.e.
2931 the same as "HvTOTALKEYS(hv)"). The return value is currently
2932 only meaningful for hashes without tie magic.
2933
2934 NOTE: Before version 5.004_65, "hv_iterinit" used to return the
2935 number of hash buckets that happen to be in use. If you still
2936 need that esoteric value, you can get it through the macro
2937 "HvFILL(hv)".
2938
2939 I32 hv_iterinit(HV *hv)
2940
2941 hv_iterkey
2942 Returns the key from the current position of the hash iterator.
2943 See "hv_iterinit".
2944
2945 char* hv_iterkey(HE* entry, I32* retlen)
2946
2947 hv_iterkeysv
2948 Returns the key as an "SV*" from the current position of the
2949 hash iterator. The return value will always be a mortal copy
2950 of the key. Also see "hv_iterinit".
2951
2952 SV* hv_iterkeysv(HE* entry)
2953
2954 hv_iternext
2955 Returns entries from a hash iterator. See "hv_iterinit".
2956
2957 You may call "hv_delete" or "hv_delete_ent" on the hash entry
2958 that the iterator currently points to, without losing your
2959 place or invalidating your iterator. Note that in this case
2960 the current entry is deleted from the hash with your iterator
2961 holding the last reference to it. Your iterator is flagged to
2962 free the entry on the next call to "hv_iternext", so you must
2963 not discard your iterator immediately, or else the entry will
2964 leak; call "hv_iternext" to trigger the resource deallocation.
2965
2966 HE* hv_iternext(HV *hv)
2967
2968 hv_iternextsv
2969 Performs an "hv_iternext", "hv_iterkey", and "hv_iterval" in
2970 one operation.
2971
2972 SV* hv_iternextsv(HV *hv, char **key, I32 *retlen)
2973
2974 hv_iternext_flags
2975 NOTE: this function is experimental and may change or be
2976 removed without notice.
2977
2978 Returns entries from a hash iterator. See "hv_iterinit" and
2979 "hv_iternext". The "flags" value will normally be zero; if
2980 "HV_ITERNEXT_WANTPLACEHOLDERS" is set the placeholder keys
2981 (for restricted hashes) will be returned in addition to normal
2982 keys. By default placeholders are automatically skipped over.
2983 Currently a placeholder is implemented with a value that is
2984 &PL_sv_placeholder. Note that the implementation of
2985 placeholders and restricted hashes may change, and the
2986 implementation currently is insufficiently abstracted for any
2987 change to be tidy.
2988
2989 HE* hv_iternext_flags(HV *hv, I32 flags)
2990
2991 hv_iterval
2992 Returns the value from the current position of the hash
2993 iterator. See "hv_iterkey".
2994
2995 SV* hv_iterval(HV *hv, HE *entry)
2996
2997 hv_magic
2998 Adds magic to a hash. See "sv_magic".
2999
3000 void hv_magic(HV *hv, GV *gv, int how)
3001
3002 HvNAME Returns the package name of a stash, or "NULL" if "stash" isn't
3003 a stash. See "SvSTASH", "CvSTASH".
3004
3005 char* HvNAME(HV* stash)
3006
3007 HvNAMELEN
3008 Returns the length of the stash's name.
3009
3010 STRLEN HvNAMELEN(HV *stash)
3011
3012 HvNAMEUTF8
3013 Returns true if the name is in UTF-8 encoding.
3014
3015 unsigned char HvNAMEUTF8(HV *stash)
3016
3017 hv_scalar
3018 Evaluates the hash in scalar context and returns the result.
3019
3020 When the hash is tied, dispatches through to the SCALAR method,
3021 otherwise returns a mortal SV containing the number of keys in
3022 the hash.
3023
3024 Note, prior to 5.25 this function returned what is now returned
3025 by the hv_bucket_ratio() function.
3026
3027 SV* hv_scalar(HV *hv)
3028
3029 hv_store
3030 Stores an SV in a hash. The hash key is specified as "key" and
3031 the absolute value of "klen" is the length of the key. If
3032 "klen" is negative the key is assumed to be in UTF-8-encoded
3033 Unicode. The "hash" parameter is the precomputed hash value;
3034 if it is zero then Perl will compute it.
3035
3036 The return value will be "NULL" if the operation failed or if
3037 the value did not need to be actually stored within the hash
3038 (as in the case of tied hashes). Otherwise it can be
3039 dereferenced to get the original "SV*". Note that the caller
3040 is responsible for suitably incrementing the reference count of
3041 "val" before the call, and decrementing it if the function
3042 returned "NULL". Effectively a successful "hv_store" takes
3043 ownership of one reference to "val". This is usually what you
3044 want; a newly created SV has a reference count of one, so if
3045 all your code does is create SVs then store them in a hash,
3046 "hv_store" will own the only reference to the new SV, and your
3047 code doesn't need to do anything further to tidy up.
3048 "hv_store" is not implemented as a call to "hv_store_ent", and
3049 does not create a temporary SV for the key, so if your key data
3050 is not already in SV form then use "hv_store" in preference to
3051 "hv_store_ent".
3052
3053 See "Understanding the Magic of Tied Hashes and Arrays" in
3054 perlguts for more information on how to use this function on
3055 tied hashes.
3056
3057 SV** hv_store(HV *hv, const char *key, I32 klen,
3058 SV *val, U32 hash)
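
The ownership rule above can be illustrated with a stand-alone model. This is a sketch under stated assumptions: "model_sv", "model_slot" and the functions below are hypothetical, a single slot stands in for the hash, and the "refuse" field simply forces the failure path.

```c
#include <stdlib.h>

/* Hypothetical reference-counted value standing in for an SV. */
struct model_sv { int refcnt; int value; };

static int model_live;                 /* values allocated and not yet freed */

static struct model_sv *model_new_sv(int value)
{
    struct model_sv *sv = malloc(sizeof *sv);
    if (!sv) return NULL;
    sv->refcnt = 1;                    /* a new value starts with one reference */
    sv->value = value;
    model_live++;
    return sv;
}

static void model_dec(struct model_sv *sv)
{
    if (--sv->refcnt == 0) {
        free(sv);
        model_live--;
    }
}

/* One-slot container standing in for the hash. */
struct model_slot { struct model_sv *val; int refuse; };

/* On success, takes over one reference to val and returns the slot address.
 * On failure, returns NULL and leaves val's reference count untouched. */
static struct model_sv **model_store(struct model_slot *slot,
                                     struct model_sv *val)
{
    if (slot->refuse)
        return NULL;
    if (slot->val)
        model_dec(slot->val);          /* release the value being replaced */
    slot->val = val;
    return &slot->val;
}

/* The calling pattern: create, store, and clean up only on failure. */
static int store_new_value(struct model_slot *slot, int value)
{
    struct model_sv *sv = model_new_sv(value);
    if (!sv) return -1;
    if (model_store(slot, sv) == NULL) {
        model_dec(sv);                 /* store failed: we still own it */
        return -1;
    }
    return 0;                          /* the container owns the reference */
}
```

In XS code the same pattern would typically read as follows, where "hv" is an existing hash and the key is chosen purely for illustration:

    SV *sv = newSViv(7);
    if (!hv_store(hv, "answer", 6, sv, 0))
        SvREFCNT_dec(sv);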
3059
3060 hv_stores
3061 Like "hv_store", but takes a literal string instead of a
3062 string/length pair and omits the hash parameter.
3063
3064 SV** hv_stores(HV* tb, "literal string" key, SV* val)
3065
3066 hv_store_ent
3067 Stores "val" in a hash. The hash key is specified as "key".
3068 The "hash" parameter is the precomputed hash value; if it is
3069 zero then Perl will compute it. The return value is the new
3070 hash entry so created. It will be "NULL" if the operation
3071 failed or if the value did not need to be actually stored
3072 within the hash (as in the case of tied hashes). Otherwise the
3073 contents of the return value can be accessed using the "He?"
3074 macros described here. Note that the caller is responsible for
3075 suitably incrementing the reference count of "val" before the
3076 call, and decrementing it if the function returned NULL.
3077 Effectively a successful "hv_store_ent" takes ownership of one
3078 reference to "val". This is usually what you want; a newly
3079 created SV has a reference count of one, so if all your code
3080 does is create SVs then store them in a hash, "hv_store_ent"
3081 will own the only reference to the new SV, and your code
3082 doesn't need to do anything further to tidy up. Note that
3083 "hv_store_ent" only reads the "key"; unlike "val" it does not
3084 take ownership of it, so maintaining the correct reference
3085 count on "key" is entirely the caller's responsibility. The
3086 reason it does not take ownership, is that "key" is not used
3087 after this function returns, and so can be freed immediately.
3088 "hv_store" is not implemented as a call to "hv_store_ent", and
3089 does not create a temporary SV for the key, so if your key data
3090 is not already in SV form then use "hv_store" in preference to
3091 "hv_store_ent".
3092
3093 See "Understanding the Magic of Tied Hashes and Arrays" in
3094 perlguts for more information on how to use this function on
3095 tied hashes.
3096
3097 HE* hv_store_ent(HV *hv, SV *key, SV *val, U32 hash)
3098
3099 hv_undef
3100 Undefines the hash. The XS equivalent of "undef(%hash)".
3101
3102 As well as freeing all the elements of the hash (like
3103 "hv_clear()"), this also frees any auxiliary data and storage
3104 associated with the hash.
3105
3106 See "av_clear" for a note about the hash possibly being invalid
3107 on return.
3108
3109 void hv_undef(HV *hv)
3110
3111 newHV Creates a new HV. The reference count is set to 1.
3112
3113 HV* newHV()
3114
3116 These functions provide convenient and thread-safe means of
3117 manipulating hook variables.
3118
3119 wrap_op_checker
3120 Puts a C function into the chain of check functions for a
3121 specified op type. This is the preferred way to manipulate the
3122 "PL_check" array. "opcode" specifies which type of op is to be
3123 affected. "new_checker" is a pointer to the C function that is
3124 to be added to that opcode's check chain, and "old_checker_p"
3125 points to the storage location where a pointer to the next
3126 function in the chain will be stored. The value of
3127 "new_checker" is written into the "PL_check" array, while the
3128 value previously stored there is written to *old_checker_p.
3129
3130 "PL_check" is global to an entire process, and a module wishing
3131 to hook op checking may find itself invoked more than once per
3132 process, typically in different threads. To handle that
3133 situation, this function is idempotent. The location
3134 *old_checker_p must initially (once per process) contain a null
3135 pointer. A C variable of static duration (declared at file
3136 scope, typically also marked "static" to give it internal
3137 linkage) will be implicitly initialised appropriately, if it
3138 does not have an explicit initialiser. This function will only
3139 actually modify the check chain if it finds *old_checker_p to
3140 be null. This function is also thread safe on the small scale.
3141 It uses appropriate locking to avoid race conditions in
3142 accessing "PL_check".
3143
3144 When this function is called, the function referenced by
3145 "new_checker" must be ready to be called, except for
3146 *old_checker_p being unfilled. In a threading situation,
3147 "new_checker" may be called immediately, even before this
3148 function has returned. *old_checker_p will always be
3149 appropriately set before "new_checker" is called. If
3150 "new_checker" decides not to do anything special with an op
3151 that it is given (which is the usual case for most uses of op
3152 check hooking), it must chain the check function referenced by
3153 *old_checker_p.
3154
3155 Taken all together, XS code to hook an op checker should
3156 typically look something like this:
3157
3158 static Perl_check_t nxck_frob;
3159 static OP *myck_frob(pTHX_ OP *op) {
3160 ...
3161 op = nxck_frob(aTHX_ op);
3162 ...
3163 return op;
3164 }
3165 BOOT:
3166 wrap_op_checker(OP_FROB, myck_frob, &nxck_frob);
3167
3168 If you want to influence compilation of calls to a specific
3169 subroutine, then use "cv_set_call_checker_flags" rather than
3170 hooking checking of all "entersub" ops.
3171
3172 void wrap_op_checker(Optype opcode,
3173 Perl_check_t new_checker,
3174 Perl_check_t *old_checker_p)
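
The idempotence described above, where the chain is only modified while *old_checker_p is null, can be modelled in a few lines. This is a sketch, not the real implementation: the "model_" names and the integer "op" are hypothetical, and the locking that the real function performs is omitted.

```c
#include <stddef.h>

/* Simplified stand-in for Perl_check_t: takes and returns an int "op". */
typedef int (*model_check_t)(int op);

enum { MODEL_OP_FROB = 0, MODEL_MAXO = 1 };

/* Default checker: returns the op unchanged. */
static int model_default_ck(int op) { return op; }

/* Models the PL_check array, one slot per op type. */
static model_check_t model_check[MODEL_MAXO] = { model_default_ck };

/* Only modifies the chain if *old_checker_p is still null.
 * The real function also takes a lock; that is omitted here. */
static void model_wrap_op_checker(int opcode, model_check_t new_checker,
                                  model_check_t *old_checker_p)
{
    if (*old_checker_p)
        return;
    *old_checker_p = model_check[opcode];
    model_check[opcode] = new_checker;
}

/* Static storage duration, so implicitly initialised to null. */
static model_check_t nxck_frob;

/* Chains to the previous checker, then adds its own effect. */
static int myck_frob(int op)
{
    return nxck_frob(op) + 1;
}
```

Calling model_wrap_op_checker(MODEL_OP_FROB, myck_frob, &nxck_frob) any number of times installs myck_frob exactly once. Without the null check, a second call would store myck_frob in nxck_frob and the checker would then call itself without end.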
3175
3177 This is the lower layer of the Perl parser, managing characters and
3178 tokens.
3179
3180 lex_bufutf8
3181 NOTE: this function is experimental and may change or be
3182 removed without notice.
3183
3184 Indicates whether the octets in the lexer buffer
3185 ("PL_parser->linestr") should be interpreted as the UTF-8
3186 encoding of Unicode characters. If not, they should be
3187 interpreted as Latin-1 characters. This is analogous to the
3188 "SvUTF8" flag for scalars.
3189
3190 In UTF-8 mode, it is not guaranteed that the lexer buffer
3191 actually contains valid UTF-8. Lexing code must be robust in
3192 the face of invalid encoding.
3193
3194 The actual "SvUTF8" flag of the "PL_parser->linestr" scalar is
3195 significant, but not the whole story regarding the input
3196 character encoding. Normally, when a file is being read, the
3197 scalar contains octets and its "SvUTF8" flag is off, but the
3198 octets should be interpreted as UTF-8 if the "use utf8" pragma
3199 is in effect. During a string eval, however, the scalar may
3200 have the "SvUTF8" flag on, and in this case its octets should
3201 be interpreted as UTF-8 unless the "use bytes" pragma is in
3202 effect. This logic may change in the future; use this function
3203 instead of implementing the logic yourself.
3204
3205 bool lex_bufutf8()
3206
3207 lex_discard_to
3208 NOTE: this function is experimental and may change or be
3209 removed without notice.
3210
3211 Discards the first part of the "PL_parser->linestr" buffer, up
3212 to "ptr". The remaining content of the buffer will be moved,
3213 and all pointers into the buffer updated appropriately. "ptr"
3214 must not be later in the buffer than the position of
3215 "PL_parser->bufptr": it is not permitted to discard text that
3216 has yet to be lexed.
3217
3218 Normally it is not necessary to do this directly, because it
3219 suffices to use the implicit discarding behaviour of
3220 "lex_next_chunk" and things based on it. However, if a token
3221 stretches across multiple lines, and the lexing code has kept
3222 multiple lines of text in the buffer for that purpose, then
3223 after completion of the token it would be wise to explicitly
3224 discard the now-unneeded earlier lines, to avoid future multi-
3225 line tokens growing the buffer without bound.
3226
3227 void lex_discard_to(char *ptr)
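The effect on the buffer and its pointers can be pictured with a small standalone model. The sketch below is plain C, not Perl API code: "DiscardBuf" and "toy_discard_to" are hypothetical names, and only a single position pointer is tracked, whereas the real lexer maintains several.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the lexer buffer; not the real PL_parser. */
typedef struct {
    char  *buf;     /* start of the NUL-terminated buffer */
    size_t len;     /* octets in use, excluding the NUL */
    char  *bufptr;  /* current lexing position */
} DiscardBuf;

/* Drop everything before ptr, sliding the remainder to the front. */
void toy_discard_to(DiscardBuf *b, char *ptr)
{
    size_t n;

    /* Only text that has already been lexed may be discarded. */
    assert(ptr >= b->buf && ptr <= b->bufptr);

    n = (size_t)(ptr - b->buf);
    memmove(b->buf, ptr, b->len - n + 1);  /* +1 moves the NUL too */
    b->len    -= n;
    b->bufptr -= n;                        /* still points at the same octet */
}
```

After the call, any pointer into the buffer that was saved by the caller is stale unless it is adjusted by the same amount.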
3228
3229 lex_grow_linestr
3230 NOTE: this function is experimental and may change or be
3231 removed without notice.
3232
3233 Reallocates the lexer buffer ("PL_parser->linestr") to
3234 accommodate at least "len" octets (including terminating
3235 "NUL"). Returns a pointer to the reallocated buffer. This is
3236 necessary before making any direct modification of the buffer
3237 that would increase its length. "lex_stuff_pvn" provides a
3238 more convenient way to insert text into the buffer.
3239
3240 Do not use "SvGROW" or "sv_grow" directly on
3241 "PL_parser->linestr"; this function updates all of the lexer's
3242 variables that point directly into the buffer.
3243
3244 char * lex_grow_linestr(STRLEN len)
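Why a dedicated function is needed can be seen in a standalone model of the pointer fix-up. This is a sketch in plain C with hypothetical names ("GrowBuf", "toy_init", "toy_grow"); it assumes just two interior pointers, and it is not how the Perl core is implemented.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical buffer with two interior pointers, as a lexer might keep. */
typedef struct {
    char  *buf;     /* heap allocation */
    size_t cap;     /* allocated size in octets */
    char  *bufptr;  /* current position inside buf */
    char  *bufend;  /* points at the terminating NUL */
} GrowBuf;

/* Set up a heap buffer holding text, with the position at offset pos. */
int toy_init(GrowBuf *b, const char *text, size_t pos)
{
    size_t len = strlen(text);

    assert(pos <= len);
    b->buf = malloc(len + 1);
    if (!b->buf) return -1;
    memcpy(b->buf, text, len + 1);
    b->cap    = len + 1;
    b->bufptr = b->buf + pos;
    b->bufend = b->buf + len;
    return 0;
}

/* Ensure room for at least len octets, re-deriving the interior pointers. */
char *toy_grow(GrowBuf *b, size_t len)
{
    size_t ptr_off, end_off;
    char *nbuf;

    if (len <= b->cap) return b->buf;

    /* Remember positions as offsets: realloc may move the block. */
    ptr_off = (size_t)(b->bufptr - b->buf);
    end_off = (size_t)(b->bufend - b->buf);

    nbuf = realloc(b->buf, len);
    if (!nbuf) return NULL;  /* old block and pointers remain valid */

    b->buf    = nbuf;
    b->cap    = len;
    b->bufptr = nbuf + ptr_off;
    b->bufend = nbuf + end_off;
    return nbuf;
}
```

Growing the allocation without the offset bookkeeping would leave "bufptr" and "bufend" pointing into freed memory whenever the block moves.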
3245
3246 lex_next_chunk
3247 NOTE: this function is experimental and may change or be
3248 removed without notice.
3249
3250 Reads in the next chunk of text to be lexed, appending it to
3251 "PL_parser->linestr". This should be called when lexing code
3252 has looked to the end of the current chunk and wants to know
3253 more. It is usual, but not necessary, for lexing to have
3254 consumed the entirety of the current chunk at this time.
3255
3256 If "PL_parser->bufptr" is pointing to the very end of the
3257 current chunk (i.e., the current chunk has been entirely
3258 consumed), normally the current chunk will be discarded at the
3259 same time that the new chunk is read in. If "flags" has the
3260 "LEX_KEEP_PREVIOUS" bit set, the current chunk will not be
3261 discarded. If the current chunk has not been entirely
3262 consumed, then it will not be discarded regardless of the flag.
3263
3264 Returns true if some new text was added to the buffer, or false
3265 if the buffer has reached the end of the input text.
3266
3267 bool lex_next_chunk(U32 flags)
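The discard rule described above can be summarised in a standalone model. The following is a sketch in plain C, not Perl API code: "ChunkSource", "toy_next_chunk" and "TOY_KEEP_PREVIOUS" are hypothetical names, the input is assumed to be a NULL-terminated array of lines, and the buffer has a fixed size for brevity.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_KEEP_PREVIOUS 1u  /* stands in for LEX_KEEP_PREVIOUS */

/* Hypothetical input source feeding a fixed-size lexer buffer. */
typedef struct {
    const char **lines;  /* NULL-terminated list of remaining input lines */
    char   buf[128];     /* NUL-terminated buffer contents */
    size_t len;          /* octets in use, excluding the NUL */
    size_t pos;          /* offset of the current lexing position */
} ChunkSource;

/* Append the next line; returns 1 if text was added, 0 at end of input. */
int toy_next_chunk(ChunkSource *s, unsigned flags)
{
    size_t n;

    if (*s->lines == NULL) return 0;

    /* Discard only a fully consumed chunk, and only if not asked to keep it. */
    if (s->pos == s->len && !(flags & TOY_KEEP_PREVIOUS))
        s->len = s->pos = 0;

    n = strlen(*s->lines);
    assert(s->len + n < sizeof s->buf);
    memcpy(s->buf + s->len, *s->lines, n + 1);
    s->len += n;
    s->lines++;
    return 1;
}
```

A partially consumed chunk is never discarded in this model, whatever the flag, which mirrors the behaviour documented above.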
3268
3269 lex_peek_unichar
3270 NOTE: this function is experimental and may change or be
3271 removed without notice.
3272
3273 Looks ahead one (Unicode) character in the text currently being
3274 lexed. Returns the codepoint (unsigned integer value) of the
3275 next character, or -1 if lexing has reached the end of the
3276 input text. To consume the peeked character, use
3277 "lex_read_unichar".
3278
3279 If the next character is in (or extends into) the next chunk of
3280 input text, the next chunk will be read in. Normally the
3281 current chunk will be discarded at the same time, but if
3282 "flags" has the "LEX_KEEP_PREVIOUS" bit set, then the current
3283 chunk will not be discarded.
3284
3285 If the input is being interpreted as UTF-8 and a UTF-8 encoding
3286 error is encountered, an exception is generated.
3287
3288 I32 lex_peek_unichar(U32 flags)
3289
3290 lex_read_space
3291 NOTE: this function is experimental and may change or be
3292 removed without notice.
3293
3294 Reads optional spaces, in Perl style, in the text currently
3295 being lexed. The spaces may include ordinary whitespace
3296 characters and Perl-style comments. "#line" directives are
3297 processed if encountered. "PL_parser->bufptr" is moved past
3298 the spaces, so that it points at a non-space character (or the
3299 end of the input text).
3300
3301 If spaces extend into the next chunk of input text, the next
3302 chunk will be read in. Normally the current chunk will be
3303 discarded at the same time, but if "flags" has the
3304 "LEX_KEEP_PREVIOUS" bit set, then the current chunk will not be
3305 discarded.
3306
3307 void lex_read_space(U32 flags)
3308
3309 lex_read_to
3310 NOTE: this function is experimental and may change or be
3311 removed without notice.
3312
3313 Consume text in the lexer buffer, from "PL_parser->bufptr" up
3314 to "ptr". This advances "PL_parser->bufptr" to match "ptr",
3315 performing the correct bookkeeping whenever a newline character
3316 is passed. This is the normal way to consume lexed text.
3317
3318 Interpretation of the buffer's octets can be abstracted out by
3319 using the slightly higher-level functions "lex_peek_unichar"
3320 and "lex_read_unichar".
3321
3322 void lex_read_to(char *ptr)
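The newline bookkeeping can be illustrated with a standalone model. This sketch is plain C, not Perl API code; "LineTracker", "toy_read_to" and "toy_column" are hypothetical names, and it assumes the bookkeeping consists of a line counter plus a start-of-line pointer.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical lexer position state; not the real PL_parser. */
typedef struct {
    const char   *bufptr;     /* current lexing position */
    const char   *bufend;     /* end of the buffer */
    const char   *linestart;  /* start of the current line */
    unsigned long line;       /* current line number */
} LineTracker;

/* Consume text up to ptr, updating line state for each newline passed. */
void toy_read_to(LineTracker *lx, const char *ptr)
{
    assert(ptr >= lx->bufptr && ptr <= lx->bufend);

    while (lx->bufptr < ptr) {
        if (*lx->bufptr++ == '\n') {
            lx->line++;
            lx->linestart = lx->bufptr;  /* next line begins after the newline */
        }
    }
}

/* Zero-based column of the current position, e.g. for error messages. */
size_t toy_column(const LineTracker *lx)
{
    return (size_t)(lx->bufptr - lx->linestart);
}
```

Moving "bufptr" by hand without this loop would leave the line number and column wrong in any later diagnostic.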
3323
3324 lex_read_unichar
3325 NOTE: this function is experimental and may change or be
3326 removed without notice.
3327
3328 Reads the next (Unicode) character in the text currently being
3329 lexed. Returns the codepoint (unsigned integer value) of the
3330 character read, and moves "PL_parser->bufptr" past the
3331 character, or returns -1 if lexing has reached the end of the
3332 input text. To non-destructively examine the next character,
3333 use "lex_peek_unichar" instead.
3334
3335 If the next character is in (or extends into) the next chunk of
3336 input text, the next chunk will be read in. Normally the
3337 current chunk will be discarded at the same time, but if
3338 "flags" has the "LEX_KEEP_PREVIOUS" bit set, then the current
3339 chunk will not be discarded.
3340
3341 If the input is being interpreted as UTF-8 and a UTF-8 encoding
3342 error is encountered, an exception is generated.
3343
3344 I32 lex_read_unichar(U32 flags)
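The relationship between peeking, reading, and the two buffer interpretations can be shown with a standalone model. This is a sketch in plain C with hypothetical names ("UniLexer", "toy_peek_unichar", "toy_read_unichar"); it reports an encoding error with a return code where the real functions raise an exception, it does not read further chunks, and its decoder checks only sequence structure, not overlong forms or surrogates.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the lexer buffer; not the real PL_parser. */
typedef struct {
    const unsigned char *bufptr;  /* current lexing position */
    const unsigned char *bufend;  /* one past the last octet */
    int utf8;                     /* nonzero: UTF-8, zero: Latin-1 */
} UniLexer;

#define TOY_EOF      (-1)  /* end of input, as in the documented API */
#define TOY_BAD_UTF8 (-2)  /* toy convention for an encoding error */

/* Decode the character at bufptr; *len receives its length in octets. */
static long toy_decode(const UniLexer *lx, size_t *len)
{
    const unsigned char *p = lx->bufptr;
    size_t avail = (size_t)(lx->bufend - p);
    size_t need, i;
    long cp;

    *len = 0;
    if (avail == 0) return TOY_EOF;
    if (!lx->utf8 || p[0] < 0x80) { *len = 1; return p[0]; }

    if      ((p[0] & 0xE0) == 0xC0) { need = 2; cp = p[0] & 0x1F; }
    else if ((p[0] & 0xF0) == 0xE0) { need = 3; cp = p[0] & 0x0F; }
    else if ((p[0] & 0xF8) == 0xF0) { need = 4; cp = p[0] & 0x07; }
    else return TOY_BAD_UTF8;               /* invalid lead octet */

    if (avail < need) return TOY_BAD_UTF8;  /* truncated sequence */
    for (i = 1; i < need; i++) {
        if ((p[i] & 0xC0) != 0x80) return TOY_BAD_UTF8;
        cp = (cp << 6) | (p[i] & 0x3F);
    }
    *len = need;
    return cp;
}

/* Look at the next character without consuming it. */
long toy_peek_unichar(const UniLexer *lx)
{
    size_t len;
    return toy_decode(lx, &len);
}

/* Return the next character and move bufptr past it. */
long toy_read_unichar(UniLexer *lx)
{
    size_t len;
    long cp = toy_decode(lx, &len);
    lx->bufptr += len;  /* len is 0 at end of input or on error */
    return cp;
}
```

The same three octets 0x61 0xC3 0xA9 yield two characters in UTF-8 mode and three in Latin-1 mode, which is why callers are better served by these functions than by inspecting octets directly.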
3345
3346 lex_start
3347 NOTE: this function is experimental and may change or be
3348 removed without notice.
3349
3350 Creates and initialises a new lexer/parser state object,
3351 supplying a context in which to lex and parse from a new source
3352 of Perl code. A pointer to the new state object is placed in
3353 "PL_parser". An entry is made on the save stack so that upon
3354 unwinding, the new state object will be destroyed and the
3355 former value of "PL_parser" will be restored. Nothing else
3356 need be done to clean up the parsing context.
3357
3358 The code to be parsed comes from "line" and "rsfp". "line", if
3359 non-null, provides a string (in SV form) containing code to be
3360 parsed. A copy of the string is made, so subsequent
3361 modification of "line" does not affect parsing. "rsfp", if
3362 non-null, provides an input stream from which code will be read
3363 to be parsed. If both are non-null, the code in "line" comes
3364 first and must consist of complete lines of input, and "rsfp"
3365 supplies the remainder of the source.
3366
3367 The "flags" parameter is reserved for future use. Currently it
3368 is only used by perl internally, so extensions should always
3369 pass zero.
3370
3371 void lex_start(SV *line, PerlIO *rsfp, U32 flags)
3372
3373 lex_stuff_pv
3374 NOTE: this function is experimental and may change or be
3375 removed without notice.
3376
3377 Insert characters into the lexer buffer ("PL_parser->linestr"),
3378 immediately after the current lexing point
3379 ("PL_parser->bufptr"), reallocating the buffer if necessary.
3380 This means that lexing code that runs later will see the
3381 characters as if they had appeared in the input. It is not
3382 recommended to do this as part of normal parsing, and most uses
3383 of this facility run the risk of the inserted characters being
3384 interpreted in an unintended manner.
3385
3386 The string to be inserted is represented by octets starting at
3387 "pv" and continuing to the first nul. These octets are
3388 interpreted as either UTF-8 or Latin-1, according to whether
3389 the "LEX_STUFF_UTF8" flag is set in "flags". The characters
3390 are recoded for the lexer buffer, according to how the buffer
3391 is currently being interpreted ("lex_bufutf8"). If it is not
3392 convenient to nul-terminate a string to be inserted, the
3393 "lex_stuff_pvn" function is more appropriate.
3394
3395 void lex_stuff_pv(const char *pv, U32 flags)
3396
3397 lex_stuff_pvn
3398 NOTE: this function is experimental and may change or be
3399 removed without notice.
3400
3401 Insert characters into the lexer buffer ("PL_parser->linestr"),
3402 immediately after the current lexing point
3403 ("PL_parser->bufptr"), reallocating the buffer if necessary.
3404 This means that lexing code that runs later will see the
3405 characters as if they had appeared in the input. It is not
3406 recommended to do this as part of normal parsing, and most uses
3407 of this facility run the risk of the inserted characters being
3408 interpreted in an unintended manner.
3409
3410 The string to be inserted is represented by "len" octets
3411 starting at "pv". These octets are interpreted as either UTF-8
3412 or Latin-1, according to whether the "LEX_STUFF_UTF8" flag is
3413 set in "flags". The characters are recoded for the lexer
3414 buffer, according to how the buffer is currently being
3415 interpreted ("lex_bufutf8"). If a string to be inserted is
3416 available as a Perl scalar, the "lex_stuff_sv" function is more
3417 convenient.
3418
3419 void lex_stuff_pvn(const char *pv, STRLEN len,
3420 U32 flags)
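The recoding step can be pictured with a standalone model. The sketch below is plain C, not Perl API code: "StuffBuf" and "toy_stuff_latin1" are hypothetical names, the buffer has a fixed capacity instead of being reallocated, and only Latin-1 input is handled (the case where "LEX_STUFF_UTF8" is not set).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum { TOY_CAP = 64 };

/* Hypothetical fixed-capacity lexer buffer; not the real PL_parser. */
typedef struct {
    char   buf[TOY_CAP];  /* NUL-terminated text */
    size_t len;           /* octets in use, excluding the NUL */
    size_t pos;           /* offset of the current lexing position */
    int    utf8;          /* nonzero: buffer is interpreted as UTF-8 */
} StuffBuf;

/* Insert len Latin-1 octets at the lexing position.
 * Returns 0 on success, -1 if the text does not fit. */
int toy_stuff_latin1(StuffBuf *b, const unsigned char *pv, size_t len)
{
    size_t i, out = 0, extra = 0;

    /* In a UTF-8 buffer each octet >= 0x80 becomes two octets. */
    if (b->utf8)
        for (i = 0; i < len; i++)
            if (pv[i] >= 0x80) extra++;

    if (b->len + len + extra >= TOY_CAP) return -1;

    /* Open a gap at the lexing position, moving the tail and its NUL. */
    memmove(b->buf + b->pos + len + extra, b->buf + b->pos,
            b->len - b->pos + 1);

    for (i = 0; i < len; i++) {
        if (b->utf8 && pv[i] >= 0x80) {
            b->buf[b->pos + out++] = (char)(0xC0 | (pv[i] >> 6));
            b->buf[b->pos + out++] = (char)(0x80 | (pv[i] & 0x3F));
        } else {
            b->buf[b->pos + out++] = (char)pv[i];
        }
    }
    b->len += len + extra;
    return 0;  /* pos is unchanged, so the new text is lexed next */
}
```

Stuffing the single Latin-1 octet 0xE9 thus adds one octet to a Latin-1 buffer but two (0xC3 0xA9) to a UTF-8 buffer.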
3421
3422 lex_stuff_pvs
3423 NOTE: this function is experimental and may change or be
3424 removed without notice.
3425
3426 Like "lex_stuff_pvn", but takes a literal string instead of a
3427 string/length pair.
3428
3429 void lex_stuff_pvs("literal string" pv, U32 flags)
3430
3431 lex_stuff_sv
3432 NOTE: this function is experimental and may change or be
3433 removed without notice.
3434
3435 Insert characters into the lexer buffer ("PL_parser->linestr"),
3436 immediately after the current lexing point
3437 ("PL_parser->bufptr"), reallocating the buffer if necessary.
3438 This means that lexing code that runs later will see the
3439 characters as if they had appeared in the input. It is not
3440 recommended to do this as part of normal parsing, and most uses
3441 of this facility run the risk of the inserted characters being
3442 interpreted in an unintended manner.
3443
3444 The string to be inserted is the string value of "sv". The
3445 characters are recoded for the lexer buffer, according to how
3446 the buffer is currently being interpreted ("lex_bufutf8"). If
3447 a string to be inserted is not already a Perl scalar, the
3448 "lex_stuff_pvn" function avoids the need to construct a scalar.
3449
3450 void lex_stuff_sv(SV *sv, U32 flags)
3451
3452 lex_unstuff
3453 NOTE: this function is experimental and may change or be
3454 removed without notice.
3455
3456 Discards text about to be lexed, from "PL_parser->bufptr" up to
3457 "ptr". Text following "ptr" will be moved, and the buffer
3458 shortened. This hides the discarded text from any lexing code
3459 that runs later, as if the text had never appeared.
3460
3461 This is not the normal way to consume lexed text. For that,
3462 use "lex_read_to".
3463
3464 void lex_unstuff(char *ptr)
3465
3466 parse_arithexpr
3467 NOTE: this function is experimental and may change or be
3468 removed without notice.
3469
3470 Parse a Perl arithmetic expression. This may contain operators
3471 of precedence down to the bit shift operators. The expression
3472 must be followed (and thus terminated) either by a comparison
3473 or lower-precedence operator or by something that would
3474 normally terminate an expression such as semicolon. If "flags"
3475 has the "PARSE_OPTIONAL" bit set, then the expression is
3476 optional, otherwise it is mandatory. It is up to the caller to
3477 ensure that the dynamic parser state ("PL_parser" et al) is
3478 correctly set to reflect the source of the code to be parsed
3479 and the lexical context for the expression.
3480
3481 The op tree representing the expression is returned. If an
3482 optional expression is absent, a null pointer is returned,
3483 otherwise the pointer will be non-null.
3484
3485 If an error occurs in parsing or compilation, in most cases a
3486 valid op tree is returned anyway. The error is reflected in
3487 the parser state, normally resulting in a single exception at
3488 the top level of parsing which covers all the compilation
3489 errors that occurred. Some compilation errors, however, will
3490 throw an exception immediately.
3491
3492 OP * parse_arithexpr(U32 flags)
3493
3494 parse_barestmt
3495 NOTE: this function is experimental and may change or be
3496 removed without notice.
3497
3498 Parse a single unadorned Perl statement. This may be a normal
3499 imperative statement or a declaration that has compile-time
3500 effect. It does not include any label or other affixture. It
3501 is up to the caller to ensure that the dynamic parser state
3502 ("PL_parser" et al) is correctly set to reflect the source of
3503 the code to be parsed and the lexical context for the
3504 statement.
3505
3506 The op tree representing the statement is returned. This may
3507 be a null pointer if the statement is null, for example if it
3508 was actually a subroutine definition (which has compile-time
3509 side effects). If not null, it will be ops directly
3510 implementing the statement, suitable to pass to "newSTATEOP".
3511 It will not normally include a "nextstate" or equivalent op
3512 (except for those embedded in a scope contained entirely within
3513 the statement).
3514
3515 If an error occurs in parsing or compilation, in most cases a
3516 valid op tree (most likely null) is returned anyway. The error
3517 is reflected in the parser state, normally resulting in a
3518 single exception at the top level of parsing which covers all
3519 the compilation errors that occurred. Some compilation errors,
3520 however, will throw an exception immediately.
3521
3522 The "flags" parameter is reserved for future use, and must
3523 always be zero.
3524
3525 OP * parse_barestmt(U32 flags)
3526
3527 parse_block
3528 NOTE: this function is experimental and may change or be
3529 removed without notice.
3530
3531 Parse a single complete Perl code block. This consists of an
3532 opening brace, a sequence of statements, and a closing brace.
3533 The block constitutes a lexical scope, so "my" variables and
3534 various compile-time effects can be contained within it. It is
3535 up to the caller to ensure that the dynamic parser state
3536 ("PL_parser" et al) is correctly set to reflect the source of
3537 the code to be parsed and the lexical context for the
3538 statement.
3539
3540 The op tree representing the code block is returned. This is
3541 always a real op, never a null pointer. It will normally be a
3542 "lineseq" list, including "nextstate" or equivalent ops. No
3543 ops to construct any kind of runtime scope are included by
3544 virtue of it being a block.
3545
3546 If an error occurs in parsing or compilation, in most cases a
3547 valid op tree (most likely null) is returned anyway. The error
3548 is reflected in the parser state, normally resulting in a
3549 single exception at the top level of parsing which covers all
3550 the compilation errors that occurred. Some compilation errors,
3551 however, will throw an exception immediately.
3552
3553 The "flags" parameter is reserved for future use, and must
3554 always be zero.
3555
3556 OP * parse_block(U32 flags)
3557
3558 parse_fullexpr
3559 NOTE: this function is experimental and may change or be
3560 removed without notice.
3561
3562 Parse a single complete Perl expression. This allows the full
3563 expression grammar, including the lowest-precedence operators
3564 such as "or". The expression must be followed (and thus
3565 terminated) by a token that an expression would normally be
3566 terminated by: end-of-file, closing bracketing punctuation,
3567 semicolon, or one of the keywords that signals a postfix
3568 expression-statement modifier. If "flags" has the
3569 "PARSE_OPTIONAL" bit set, then the expression is optional,
3570 otherwise it is mandatory. It is up to the caller to ensure
3571 that the dynamic parser state ("PL_parser" et al) is correctly
3572 set to reflect the source of the code to be parsed and the
3573 lexical context for the expression.
3574
3575 The op tree representing the expression is returned. If an
3576 optional expression is absent, a null pointer is returned,
3577 otherwise the pointer will be non-null.
3578
3579 If an error occurs in parsing or compilation, in most cases a
3580 valid op tree is returned anyway. The error is reflected in
3581 the parser state, normally resulting in a single exception at
3582 the top level of parsing which covers all the compilation
3583 errors that occurred. Some compilation errors, however, will
3584 throw an exception immediately.
3585
3586 OP * parse_fullexpr(U32 flags)
3587
3588 parse_fullstmt
3589 NOTE: this function is experimental and may change or be
3590 removed without notice.
3591
3592 Parse a single complete Perl statement. This may be a normal
3593 imperative statement or a declaration that has compile-time
3594 effect, and may include optional labels. It is up to the
3595 caller to ensure that the dynamic parser state ("PL_parser" et
3596 al) is correctly set to reflect the source of the code to be
3597 parsed and the lexical context for the statement.
3598
3599 The op tree representing the statement is returned. This may
3600 be a null pointer if the statement is null, for example if it
3601 was actually a subroutine definition (which has compile-time
3602 side effects). If not null, it will be the result of a
3603 "newSTATEOP" call, normally including a "nextstate" or
3604 equivalent op.
3605
3606 If an error occurs in parsing or compilation, in most cases a
3607 valid op tree (most likely null) is returned anyway. The error
3608 is reflected in the parser state, normally resulting in a
3609 single exception at the top level of parsing which covers all
3610 the compilation errors that occurred. Some compilation errors,
3611 however, will throw an exception immediately.
3612
3613 The "flags" parameter is reserved for future use, and must
3614 always be zero.
3615
3616 OP * parse_fullstmt(U32 flags)
3617
3618 parse_label
3619 NOTE: this function is experimental and may change or be
3620 removed without notice.
3621
3622 Parse a single label, possibly optional, of the type that may
3623 prefix a Perl statement. It is up to the caller to ensure that
3624 the dynamic parser state ("PL_parser" et al) is correctly set
3625 to reflect the source of the code to be parsed. If "flags" has
3626 the "PARSE_OPTIONAL" bit set, then the label is optional,
3627 otherwise it is mandatory.
3628
3629 The name of the label is returned in the form of a fresh
3630 scalar. If an optional label is absent, a null pointer is
3631 returned.
3632
3633 If an error occurs in parsing, which can only occur if the
3634 label is mandatory, a valid label is returned anyway. The
3635 error is reflected in the parser state, normally resulting in a
3636 single exception at the top level of parsing which covers all
3637 the compilation errors that occurred.
3638
3639 SV * parse_label(U32 flags)
3640
3641 parse_listexpr
3642 NOTE: this function is experimental and may change or be
3643 removed without notice.
3644
3645 Parse a Perl list expression. This may contain operators of
3646 precedence down to the comma operator. The expression must be
3647 followed (and thus terminated) either by a low-precedence logic
3648 operator such as "or" or by something that would normally
3649 terminate an expression such as semicolon. If "flags" has the
3650 "PARSE_OPTIONAL" bit set, then the expression is optional,
3651 otherwise it is mandatory. It is up to the caller to ensure
3652 that the dynamic parser state ("PL_parser" et al) is correctly
3653 set to reflect the source of the code to be parsed and the
3654 lexical context for the expression.
3655
3656 The op tree representing the expression is returned. If an
3657 optional expression is absent, a null pointer is returned,
3658 otherwise the pointer will be non-null.
3659
3660 If an error occurs in parsing or compilation, in most cases a
3661 valid op tree is returned anyway. The error is reflected in
3662 the parser state, normally resulting in a single exception at
3663 the top level of parsing which covers all the compilation
3664 errors that occurred. Some compilation errors, however, will
3665 throw an exception immediately.
3666
3667 OP * parse_listexpr(U32 flags)
3668
3669 parse_stmtseq
3670 NOTE: this function is experimental and may change or be
3671 removed without notice.
3672
3673 Parse a sequence of zero or more Perl statements. These may be
3674 normal imperative statements, including optional labels, or
3675 declarations that have compile-time effect, or any mixture
3676 thereof. The statement sequence ends when a closing brace or
3677 end-of-file is encountered in a place where a new statement
3678 could have validly started. It is up to the caller to ensure
3679 that the dynamic parser state ("PL_parser" et al) is correctly
3680 set to reflect the source of the code to be parsed and the
3681 lexical context for the statements.
3682
3683 The op tree representing the statement sequence is returned.
3684 This may be a null pointer if the statements were all null, for
3685 example if there were no statements or if there were only
3686 subroutine definitions (which have compile-time side effects).
3687 If not null, it will be a "lineseq" list, normally including
3688 "nextstate" or equivalent ops.
3689
3690 If an error occurs in parsing or compilation, in most cases a
3691 valid op tree is returned anyway. The error is reflected in
3692 the parser state, normally resulting in a single exception at
3693 the top level of parsing which covers all the compilation
3694 errors that occurred. Some compilation errors, however, will
3695 throw an exception immediately.
3696
3697 The "flags" parameter is reserved for future use, and must
3698 always be zero.
3699
3700 OP * parse_stmtseq(U32 flags)
3701
3702 parse_termexpr
3703 NOTE: this function is experimental and may change or be
3704 removed without notice.
3705
3706 Parse a Perl term expression. This may contain operators of
3707 precedence down to the assignment operators. The expression
3708 must be followed (and thus terminated) either by a comma or
3709 lower-precedence operator or by something that would normally
3710 terminate an expression such as semicolon. If "flags" has the
3711 "PARSE_OPTIONAL" bit set, then the expression is optional,
3712 otherwise it is mandatory. It is up to the caller to ensure
3713 that the dynamic parser state ("PL_parser" et al) is correctly
3714 set to reflect the source of the code to be parsed and the
3715 lexical context for the expression.
3716
3717 The op tree representing the expression is returned. If an
3718 optional expression is absent, a null pointer is returned,
3719 otherwise the pointer will be non-null.
3720
3721 If an error occurs in parsing or compilation, in most cases a
3722 valid op tree is returned anyway. The error is reflected in
3723 the parser state, normally resulting in a single exception at
3724 the top level of parsing which covers all the compilation
3725 errors that occurred. Some compilation errors, however, will
3726 throw an exception immediately.
3727
3728 OP * parse_termexpr(U32 flags)
3729
3730 PL_parser
3731 Pointer to a structure encapsulating the state of the parsing
3732 operation currently in progress. The pointer can be locally
3733 changed to perform a nested parse without interfering with the
3734 state of an outer parse. Individual members of "PL_parser"
3735 have their own documentation.
3736
3737 PL_parser->bufend
3738 NOTE: this variable is experimental and may change or be
3739 removed without notice.
3740
3741 Direct pointer to the end of the chunk of text currently being
3742 lexed, the end of the lexer buffer. This is equal to
3743 "SvPVX(PL_parser->linestr) + SvCUR(PL_parser->linestr)". A
3744 "NUL" character (zero octet) is always located at the end of
3745 the buffer, and does not count as part of the buffer's
3746 contents.
3747
3748 PL_parser->bufptr
3749 NOTE: this variable is experimental and may change or be
3750 removed without notice.
3751
3752 Points to the current position of lexing inside the lexer
3753 buffer. Characters around this point may be freely examined,
3754 within the range delimited by "SvPVX(PL_parser->linestr)" and
3755 "PL_parser->bufend". The octets of the buffer may be intended
3756 to be interpreted as either UTF-8 or Latin-1, as indicated by
3757 "lex_bufutf8".
3758
3759 Lexing code (whether in the Perl core or not) moves this
3760 pointer past the characters that it consumes. It is also
3761 expected to perform some bookkeeping whenever a newline
3762 character is consumed. This movement can be more conveniently
3763 performed by the function "lex_read_to", which handles newlines
3764 appropriately.
3765
3766 Interpretation of the buffer's octets can be abstracted out by
3767 using the slightly higher-level functions "lex_peek_unichar"
3768 and "lex_read_unichar".
3769
3770 PL_parser->linestart
3771 NOTE: this variable is experimental and may change or be
3772 removed without notice.
3773
3774 Points to the start of the current line inside the lexer
3775 buffer. This is useful for indicating at which column an error
3776 occurred, and not much else. This must be updated by any
3777 lexing code that consumes a newline; the function "lex_read_to"
3778 handles this detail.
3779
3780 PL_parser->linestr
3781 NOTE: this variable is experimental and may change or be
3782 removed without notice.
3783
3784 Buffer scalar containing the chunk currently under
3785 consideration of the text currently being lexed. This is
3786 always a plain string scalar (for which "SvPOK" is true). It
3787 is not intended to be used as a scalar by normal scalar means;
3788 instead refer to the buffer directly by the pointer variables
3789 described below.
3790
3791 The lexer maintains various "char*" pointers to things in the
3792 "PL_parser->linestr" buffer. If "PL_parser->linestr" is ever
3793 reallocated, all of these pointers must be updated. Don't
3794 attempt to do this manually, but rather use "lex_grow_linestr"
3795 if you need to reallocate the buffer.
3796
3797 The content of the text chunk in the buffer is commonly exactly
3798 one complete line of input, up to and including a newline
3799 terminator, but there are situations where it is otherwise.
3800 The octets of the buffer may be intended to be interpreted as
3801 either UTF-8 or Latin-1. The function "lex_bufutf8" tells you
3802 which. Do not use the "SvUTF8" flag on this scalar, which may
3803 disagree with it.
3804
3805 For direct examination of the buffer, the variable
3806 "PL_parser->bufend" points to the end of the buffer. The
3807 current lexing position is pointed to by "PL_parser->bufptr".
3808 Direct use of these pointers is usually preferable to
3809 examination of the scalar through normal scalar means.
3810
3811 wrap_keyword_plugin
3812 NOTE: this function is experimental and may change or be
3813 removed without notice.
3814
3815 Puts a C function into the chain of keyword plugins. This is
3816 the preferred way to manipulate the "PL_keyword_plugin"
3817 variable. "new_plugin" is a pointer to the C function that is
3818 to be added to the keyword plugin chain, and "old_plugin_p"
3819 points to the storage location where a pointer to the next
3820 function in the chain will be stored. The value of
3821 "new_plugin" is written into the "PL_keyword_plugin" variable,
3822 while the value previously stored there is written to
3823 *old_plugin_p.
3824
3825 "PL_keyword_plugin" is global to an entire process, and a
3826 module wishing to hook keyword parsing may find itself invoked
3827 more than once per process, typically in different threads. To
3828 handle that situation, this function is idempotent. The
3829 location *old_plugin_p must initially (once per process)
3830 contain a null pointer. A C variable of static duration
3831 (declared at file scope, typically also marked "static" to give
3832 it internal linkage) will be implicitly initialised
3833 appropriately, if it does not have an explicit initialiser.
3834 This function will only actually modify the plugin chain if it
3835 finds *old_plugin_p to be null. This function is also thread
3836 safe on the small scale. It uses appropriate locking to avoid
3837 race conditions in accessing "PL_keyword_plugin".
3838
3839 When this function is called, the function referenced by
3840 "new_plugin" must be ready to be called, except for
3841 *old_plugin_p being unfilled. In a threading situation,
3842 "new_plugin" may be called immediately, even before this
3843 function has returned. *old_plugin_p will always be
3844 appropriately set before "new_plugin" is called. If
3845 "new_plugin" decides not to do anything special with the
3846 identifier that it is given (which is the usual case for most
3847 calls to a keyword plugin), it must chain the plugin function
3848 referenced by *old_plugin_p.
3849
3850 Taken all together, XS code to install a keyword plugin should
3851 typically look something like this:
3852
3853 static Perl_keyword_plugin_t next_keyword_plugin;
3854 static int my_keyword_plugin(pTHX_
3855 char *keyword_ptr, STRLEN keyword_len, OP **op_ptr)
3856 {
3857 if (memEQs(keyword_ptr, keyword_len,
3858 "my_new_keyword")) {
3859 ...
3860 } else {
3861 return next_keyword_plugin(aTHX_
3862 keyword_ptr, keyword_len, op_ptr);
3863 }
3864 }
3865 BOOT:
3866 wrap_keyword_plugin(my_keyword_plugin,
3867 &next_keyword_plugin);
3868
3869 Direct access to "PL_keyword_plugin" should be avoided.
3870
3871 void wrap_keyword_plugin(
3872 Perl_keyword_plugin_t new_plugin,
3873 Perl_keyword_plugin_t *old_plugin_p
3874 )
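The idempotent wrapping and the chaining rule can be tried out in a standalone model. This is a sketch in plain C, not Perl API code: every "toy_" name, the "frob" keyword and the two return codes are hypothetical, and the locking that the real function performs is omitted.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_DECLINE 0  /* hypothetical: plugin did not handle the keyword */
#define TOY_HANDLED 1  /* hypothetical: plugin handled the keyword */

typedef int (*toy_plugin_t)(const char *kw, size_t len);

/* End of the chain: declines everything. */
static int toy_default_plugin(const char *kw, size_t len)
{
    (void)kw; (void)len;
    return TOY_DECLINE;
}

/* Process-wide head of the chain, standing in for PL_keyword_plugin. */
static toy_plugin_t toy_chain_head = toy_default_plugin;

/* Install new_plugin once; later calls with the same slot do nothing. */
void toy_wrap_plugin(toy_plugin_t new_plugin, toy_plugin_t *old_plugin_p)
{
    if (*old_plugin_p) return;       /* already installed */
    *old_plugin_p  = toy_chain_head;
    toy_chain_head = new_plugin;
}

/* Static duration, so it starts out as a null pointer. */
static toy_plugin_t next_plugin;

static int frob_plugin(const char *kw, size_t len)
{
    if (len == 4 && memcmp(kw, "frob", 4) == 0)
        return TOY_HANDLED;
    return next_plugin(kw, len);     /* not ours: chain to the next plugin */
}

/* Offer a keyword to the chain, as the parser would. */
int toy_dispatch(const char *kw)
{
    return toy_chain_head(kw, strlen(kw));
}
```

If the wrapper were not idempotent, a second installation would store "frob_plugin" in its own "next_plugin" slot and any unrecognised keyword would recurse without end.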
3875
3877 DECLARATION_FOR_LC_NUMERIC_MANIPULATION
3878 This macro should be used as a statement. It declares a
3879 private variable (whose name begins with an underscore) that is
3880 needed by the other macros in this section. Failing to include
3881 this correctly should lead to a syntax error. For
3882 compatibility with C89 C compilers it should be placed in a
3883 block before any executable statements.
3884
3885 void DECLARATION_FOR_LC_NUMERIC_MANIPULATION
3886
3887 Perl_langinfo
3888 This is an (almost) drop-in replacement for the system
3889 nl_langinfo(3), taking the same "item" parameter values, and
3890 returning the same information. But it is more thread-safe
3891 than regular "nl_langinfo()", and hides the quirks of Perl's
3892 locale handling from your code, and can be used on systems that
3893 lack a native "nl_langinfo".
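For comparison, the following sketch shows the kind of work that plain "nl_langinfo()" leaves to the caller. It uses only standard POSIX calls, not "Perl_langinfo" itself; the helper name "radix_for_locale" is hypothetical, and the code is not thread-safe, because "setlocale" changes process-wide state.

```c
#include <assert.h>
#include <langinfo.h>
#include <locale.h>
#include <stddef.h>
#include <string.h>

/* Copy the radix string of the named LC_NUMERIC locale into out.
 * Returns 0 on success, -1 on failure. */
int radix_for_locale(const char *name, char *out, size_t outlen)
{
    char saved[128];
    const char *cur, *radix;
    size_t n;
    int rc = -1;

    /* Remember the current locale; copy it, as the string is not ours. */
    cur = setlocale(LC_NUMERIC, NULL);
    if (!cur || strlen(cur) >= sizeof saved) return -1;
    strcpy(saved, cur);

    /* Toggle into the requested locale (unchanged on failure). */
    if (!setlocale(LC_NUMERIC, name)) return -1;

    /* Copy the result at once: the static buffer may be overwritten. */
    radix = nl_langinfo(RADIXCHAR);
    n = strlen(radix);
    if (n < outlen) {
        memcpy(out, radix, n + 1);
        rc = 0;
    }

    /* Toggle back. */
    setlocale(LC_NUMERIC, saved);
    return rc;
}
```

The save, switch, copy and restore steps here are what the documentation means by extra code; they are not needed when calling "Perl_langinfo".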
3894
3895 Expanding on these:
3896
3897 · The reason it isn't quite a drop-in replacement is actually
3898 an advantage. The only difference is that it returns
3899 "const char *", whereas plain "nl_langinfo()" returns
3900 "char *", but you are (only by documentation) forbidden to
3901 write into the buffer. By declaring this "const", the
3902 compiler enforces this restriction, so if it is violated,
3903 you know at compilation time, rather than getting segfaults
3904 at runtime.
3905
3906 · It delivers the correct results for the "RADIXCHAR" and
3907 "THOUSEP" items, without you having to write extra code.
3908 Extra code would otherwise be needed because these are
3909 from the "LC_NUMERIC" locale category, which is normally
3910 kept set by Perl so that the radix is a dot, and the
3911 separator is the empty string, no matter what the
3912 underlying locale is supposed to be, and so to get the
3913 expected results, you have to temporarily toggle into the
3914 underlying locale, and later toggle back. (You could use
3915 plain "nl_langinfo" and
3916 "STORE_LC_NUMERIC_FORCE_TO_UNDERLYING" for this but then
3917 you wouldn't get the other advantages of "Perl_langinfo()";
3918 not keeping "LC_NUMERIC" in the C (or equivalent) locale
3919 would break a lot of CPAN, which is expecting the radix
3920 (decimal point) character to be a dot.)
3921
3922 · The system function it replaces can have its static return
3923 buffer trashed, not only by a subsequent call to that
3924 function, but by a "freelocale", "setlocale", or other
3925 locale change. The returned buffer of this function is not
3926 changed until the next call to it, so the buffer is never
3927 in a trashed state.
3928
3929 · Its return buffer is per-thread, so it also is never
3930 overwritten by a call to this function from another thread;
3931 unlike the function it replaces.
3932
3933 · But most importantly, it works on systems that don't have
3934 "nl_langinfo", such as Windows, hence makes your code more
3935 portable. Of the fifty-some possible items specified by
3936 the POSIX 2008 standard,
3937 <http://pubs.opengroup.org/onlinepubs/9699919799/basedefs/langinfo.h.html>,
3938 only one is completely unimplemented (though on non-Windows
3939 platforms, another significant one is also not
3940 implemented). It uses various techniques to recover the
3941 other items, including calling localeconv(3), and
3942 strftime(3), both of which are specified in C89, so should
3943 always be available. Later "strftime()" versions have
3944 additional capabilities; "" is returned for those not
3945 available on your system.
3946
3947 It is important to note that when called with an item that
3948 is recovered by using "localeconv", the buffer from any
3949 previous explicit call to "localeconv" will be overwritten.
3950 This means you must save that buffer's contents if you need
3951 to access them after a call to this function. (But note
3952 that you might not want to be using "localeconv()" directly
3953 anyway, because of issues like the ones listed in the
3954 second item of this list (above) for "RADIXCHAR" and
3955 "THOUSEP". You can use the methods given in perlcall to
3956 call "localeconv" in POSIX and avoid all the issues, but
3957 then you have a hash to unpack).
3958
3959 The details for those items which may deviate from what
3960 this emulation returns and what a native "nl_langinfo()"
3961 would return are specified in I18N::Langinfo.
3962
3963 When using "Perl_langinfo" on systems that don't have a native
3964 "nl_langinfo()", you must
3965
3966 #include "perl_langinfo.h"
3967
3968 before the "perl.h" "#include". You can replace your
3969 "langinfo.h" "#include" with this one. (Doing it this way
3970 keeps out the symbols that plain "langinfo.h" would try to
3971 import into the namespace for code that doesn't need it.)
3972
3973 The original impetus for "Perl_langinfo()" was so that code
3974 that needs to find out the current currency symbol, floating
3975 point radix character, or digit grouping separator can use, on
3976 all systems, the simpler and more thread-friendly "nl_langinfo"
3977 API instead of localeconv(3) which is a pain to make thread-
3978 friendly. For other fields returned by "localeconv", it is
3979 better to use the methods given in perlcall to call
3980 "POSIX::localeconv()", which is thread-friendly.
3981
3982 const char* Perl_langinfo(const nl_item item)
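
The static-buffer hazard described above for the system "nl_langinfo()" can be sketched in plain C. This is only an illustration using the C library directly, not "Perl_langinfo" itself; the helper name "copy_radix" is hypothetical. It copies the radix string into caller-owned storage immediately, so a later locale call cannot trash it.

```c
#include <assert.h>
#include <langinfo.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy the current radix (decimal point) string into
 * a caller-owned buffer, so that a later locale call cannot trash it.
 * Returns the full length of the radix string; a return value >= size
 * means the copy was truncated. */
size_t copy_radix(char *dst, size_t size)
{
    const char *radix = nl_langinfo(RADIXCHAR);  /* points at static storage */
    size_t len = strlen(radix);

    assert(dst != NULL);
    if (size > 0) {
        size_t n = (len < size) ? len : size - 1;
        memcpy(dst, radix, n);
        dst[n] = '\0';
    }
    return len;
}
```

With "Perl_langinfo" the copy would only be needed if the value must survive the next call to "Perl_langinfo" from the same thread.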
3983
3984 Perl_setlocale
3985 This is an (almost) drop-in replacement for the system
3986 setlocale(3), taking the same parameters, and returning the
3987 same information, except that it returns the correct underlying
3988 "LC_NUMERIC" locale. Regular "setlocale" will instead return
3989 "C" if the underlying locale has a non-dot decimal point
3990 character, or a non-empty thousands separator for displaying
3991 floating point numbers. This is because perl keeps that locale
3992 category such that it has a dot and empty separator, changing
3993 the locale briefly during the operations where the underlying
3994 one is required. "Perl_setlocale" knows about this, and
3995 compensates; regular "setlocale" doesn't.
3996
3997 Another reason it isn't completely a drop-in replacement is
3998 that it is declared to return "const char *", whereas the
3999 system setlocale omits the "const" (presumably because its API
4000 was specified long ago, and can't be updated; it is illegal to
4001 change the information "setlocale" returns; doing so leads to
4002 segfaults.)
4003
4004 Finally, "Perl_setlocale" works under all circumstances,
4005 whereas plain "setlocale" can be completely ineffective on some
4006 platforms under some configurations.
4007
4008 "Perl_setlocale" should not be used to change the locale except
4009 on systems where the predefined variable "${^SAFE_LOCALES}" is
4010 1. On some such systems, the system "setlocale()" is
4011 ineffective, returning the wrong information, and failing to
4012 actually change the locale. "Perl_setlocale", however, works
4013 properly in all circumstances.
4014
4015 The return points to a per-thread static buffer, which is
4016 overwritten the next time "Perl_setlocale" is called from the
4017 same thread.
4018
4019 const char* Perl_setlocale(const int category,
4020 const char* locale)
4021
4022 RESTORE_LC_NUMERIC
4023 This is used in conjunction with one of the macros
4024 "STORE_LC_NUMERIC_SET_TO_NEEDED" and
4025 "STORE_LC_NUMERIC_FORCE_TO_UNDERLYING" to properly restore the
4026 "LC_NUMERIC" state.
4027
4028 A call to "DECLARATION_FOR_LC_NUMERIC_MANIPULATION" must have
4029 been made to declare at compile time a private variable used by
4030 this macro and the two "STORE" ones. This macro should be
4031 called as a single statement, not an expression, but with an
4032 empty argument list, like this:
4033
4034 {
4035 DECLARATION_FOR_LC_NUMERIC_MANIPULATION;
4036 ...
4037 RESTORE_LC_NUMERIC();
4038 ...
4039 }
4040
4041 void RESTORE_LC_NUMERIC()
4042
4043 STORE_LC_NUMERIC_FORCE_TO_UNDERLYING
4044 This is used by XS code that is "LC_NUMERIC" locale-aware
4045 to force the locale for category "LC_NUMERIC" to be what perl
4046 thinks is the current underlying locale. (The perl interpreter
4047 could be wrong about what the underlying locale actually is if
4048 some C or XS code has called the C library function
4049 setlocale(3) behind its back; calling "sync_locale" before
4050 calling this macro will update perl's records.)
4051
4052 A call to "DECLARATION_FOR_LC_NUMERIC_MANIPULATION" must have
4053 been made to declare at compile time a private variable used by
4054 this macro. This macro should be called as a single statement,
4055 not an expression, but with an empty argument list, like this:
4056
4057 {
4058 DECLARATION_FOR_LC_NUMERIC_MANIPULATION;
4059 ...
4060 STORE_LC_NUMERIC_FORCE_TO_UNDERLYING();
4061 ...
4062 RESTORE_LC_NUMERIC();
4063 ...
4064 }
4065
4066 The private variable is used to save the current locale state,
4067 so that the requisite matching call to "RESTORE_LC_NUMERIC" can
4068 restore it.
4069
4070 On threaded perls not operating with thread-safe functionality,
4071 this macro uses a mutex to force a critical section. Therefore
4072 the matching RESTORE should be close by, and guaranteed to be
4073 called.
4074
4075 void STORE_LC_NUMERIC_FORCE_TO_UNDERLYING()
4076
4077 STORE_LC_NUMERIC_SET_TO_NEEDED
4078 This is used to help wrap XS or C code that is "LC_NUMERIC"
4079 locale-aware. This locale category is generally kept set to a
4080 locale where the decimal radix character is a dot, and the
4081 separator between groups of digits is empty. This is because
4082 most XS code that reads floating point numbers is expecting
4083 them to have this syntax.
4084
4085 This macro makes sure the current "LC_NUMERIC" state is set
4086 properly, to be aware of locale if the call to the XS or C code
4087 from the Perl program is from within the scope of a
4088 "use locale"; or to ignore locale if the call is instead from
4089 outside such scope.
4090
4091 This macro is the start of wrapping the C or XS code; the wrap
4092 ending is done by calling the "RESTORE_LC_NUMERIC" macro after
4093 the operation. Otherwise the state can be left changed in a
4094 way that adversely affects other XS code.
4095
4096 A call to "DECLARATION_FOR_LC_NUMERIC_MANIPULATION" must have
4097 been made to declare at compile time a private variable used by
4098 this macro. This macro should be called as a single statement,
4099 not an expression, but with an empty argument list, like this:
4100
4101 {
4102 DECLARATION_FOR_LC_NUMERIC_MANIPULATION;
4103 ...
4104 STORE_LC_NUMERIC_SET_TO_NEEDED();
4105 ...
4106 RESTORE_LC_NUMERIC();
4107 ...
4108 }
4109
4110 On threaded perls not operating with thread-safe functionality,
4111 this macro uses a mutex to force a critical section. Therefore
4112 the matching RESTORE should be close by, and guaranteed to be
4113 called.
4114
4115 void STORE_LC_NUMERIC_SET_TO_NEEDED()
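
The store, switch, restore bracket that these macros provide can be sketched with the plain C library. This is not how the macros are implemented, and it lacks their mutex protection, so it is not thread-safe. The helper name "format_with_dot_radix" is hypothetical, and the sketch switches to the "C" locale only because that makes the result predictable.

```c
#include <assert.h>
#include <locale.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: format a double with a dot radix regardless of the
 * current LC_NUMERIC setting, then put the previous setting back.
 * Returns 0 on success, -1 on failure. */
int format_with_dot_radix(double value, char *buf, size_t size)
{
    const char *current = setlocale(LC_NUMERIC, NULL);  /* query only */
    char *saved;
    int written;

    assert(buf != NULL && size > 0);
    if (current == NULL)
        return -1;

    /* The returned string may be overwritten by the next setlocale()
     * call, so keep a private copy (the "STORE" step). */
    saved = malloc(strlen(current) + 1);
    if (saved == NULL)
        return -1;
    strcpy(saved, current);

    setlocale(LC_NUMERIC, "C");                    /* switch */
    written = snprintf(buf, size, "%.2f", value);  /* locale-sensitive work */
    setlocale(LC_NUMERIC, saved);                  /* the "RESTORE" step */
    free(saved);

    return (written >= 0 && (size_t)written < size) ? 0 : -1;
}
```

As with the macros, the restore step sits as close as possible to the store step, and every path that reaches the switch also reaches the restore.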
4116
4117 switch_to_global_locale
4118 On systems without locale support, or on typical single-
4119 threaded builds, or on platforms that do not support per-thread
4120 locale operations, this function does nothing. On such systems
4121 that do have locale support, only a locale global to the whole
4122 program is available.
4123
4124 On multi-threaded builds on systems that do have per-thread
4125 locale operations, this function converts the thread it is
4126 running in to use the global locale. This is for code that has
4127 not yet been, or cannot be, updated to handle multi-threaded locale
4128 operation. As long as only a single thread is so-converted,
4129 everything works fine, as all the other threads continue to
4130 ignore the global one, so only this thread looks at it.
4131
4132 However, on Windows systems this isn't quite true prior to
4133 Visual Studio 15, at which point Microsoft fixed a bug. A race
4134 can occur if you use the following operations on earlier
4135 Windows platforms:
4136
4137 POSIX::localeconv
4138 I18N::Langinfo, items "CRNCYSTR" and "THOUSEP"
4139 "Perl_langinfo" in perlapi, items "CRNCYSTR" and "THOUSEP"
4140
4141 The first item is not fixable (except by upgrading to a later
4142 Visual Studio release), but it would be possible to work around
4143 the latter two items by using the Windows API functions
4144 "GetNumberFormat" and "GetCurrencyFormat"; patches welcome.
4145
4146 Without this function call, threads that use the setlocale(3)
4147 system function will not work properly, as all the locale-
4148 sensitive functions will look at the per-thread locale, and
4149 "setlocale" will have no effect on this thread.
4150
4151 Perl code should convert to either call "Perl_setlocale" (which
4152 is a drop-in for the system "setlocale") or use the methods
4153 given in perlcall to call "POSIX::setlocale". Either one will
4154 transparently properly handle all cases of single- vs multi-
4155 thread, POSIX 2008-supported or not.
4156
4157 Non-Perl libraries, such as "gtk", that call the system
4158 "setlocale" can continue to work if this function is called
4159 before transferring control to the library.
4160
4161 Upon return from the code that needs to use the global locale,
4162 "sync_locale()" should be called to restore the safe multi-
4163 thread operation.
4164
4165 void switch_to_global_locale()
4166
4167 sync_locale
4168 "Perl_setlocale" can be used at any time to query or change the
4169 locale (though changing the locale is antisocial and dangerous
4170 on multi-threaded systems that don't have multi-thread safe
4171 locale operations; see "Multi-threaded operation" in
4172 perllocale). Using the system setlocale(3) should be avoided.
4173 Nevertheless, certain non-Perl libraries called from XS, such
4174 as "Gtk", do so, and this can't be changed. When the locale is
4175 changed by XS code that didn't use "Perl_setlocale", Perl needs
4176 to be told that the locale has changed. Use this function to
4177 do so, before returning to Perl.
4178
4179 The return value is a boolean: TRUE if the global locale at the
4180 time of call was in effect; and FALSE if a per-thread locale
4181 was in effect. This can be used by the caller that needs to
4182 restore things as-they-were to decide whether or not to call
4183 "Perl_switch_to_global_locale".
4184
4185 bool sync_locale()
4186
4188 mg_clear
4189 Clear something magical that the SV represents. See
4190 "sv_magic".
4191
4192 int mg_clear(SV* sv)
4193
4194 mg_copy Copies the magic from one SV to another. See "sv_magic".
4195
4196 int mg_copy(SV *sv, SV *nsv, const char *key,
4197 I32 klen)
4198
4199 mg_find Finds the magic pointer for "type" matching the SV. See
4200 "sv_magic".
4201
4202 MAGIC* mg_find(const SV* sv, int type)
4203
4204 mg_findext
4205 Finds the magic pointer of "type" with the given "vtbl" for the
4206 "SV". See "sv_magicext".
4207
4208 MAGIC* mg_findext(const SV* sv, int type,
4209 const MGVTBL *vtbl)
4210
4211 mg_free Free any magic storage used by the SV. See "sv_magic".
4212
4213 int mg_free(SV* sv)
4214
4215 mg_freeext
4216 Remove any magic of type "how" using virtual table "vtbl" from
4217 the SV "sv". See "sv_magic".
4218
4219 "mg_freeext(sv, how, NULL)" is equivalent to "mg_free_type(sv,
4220 how)".
4221
4222 void mg_freeext(SV* sv, int how, const MGVTBL *vtbl)
4223
4224 mg_free_type
4225 Remove any magic of type "how" from the SV "sv". See
4226 "sv_magic".
4227
4228 void mg_free_type(SV *sv, int how)
4229
4230 mg_get Do magic before a value is retrieved from the SV. The type of
4231 SV must be >= "SVt_PVMG". See "sv_magic".
4232
4233 int mg_get(SV* sv)
4234
4235 mg_length
4236 DEPRECATED! It is planned to remove this function from a
4237 future release of Perl. Do not use it for new code; remove it
4238 from existing code.
4239
4240 Reports on the SV's length in bytes, calling length magic if
4241 available, but does not set the UTF8 flag on "sv". It will
4242 fall back to 'get' magic if there is no 'length' magic, but
4243 with no indication as to whether it called 'get' magic. It
4244 assumes "sv" is a "PVMG" or higher. Use "sv_len()" instead.
4245
4246 U32 mg_length(SV* sv)
4247
4248 mg_magical
4249 Turns on the magical status of an SV. See "sv_magic".
4250
4251 void mg_magical(SV* sv)
4252
4253 mg_set Do magic after a value is assigned to the SV. See "sv_magic".
4254
4255 int mg_set(SV* sv)
4256
4257 SvGETMAGIC
4258 Invokes "mg_get" on an SV if it has 'get' magic. For example,
4259 this will call "FETCH" on a tied variable. This macro
4260 evaluates its argument more than once.
4261
4262 void SvGETMAGIC(SV* sv)
4263
4264 SvLOCK Arranges for a mutual exclusion lock to be obtained on "sv" if
4265 a suitable module has been loaded.
4266
4267 void SvLOCK(SV* sv)
4268
4269 SvSETMAGIC
4270 Invokes "mg_set" on an SV if it has 'set' magic. This is
4271 necessary after modifying a scalar, in case it is a magical
4272 variable like $| or a tied variable (it calls "STORE"). This
4273 macro evaluates its argument more than once.
4274
4275 void SvSETMAGIC(SV* sv)
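
Because macros such as "SvGETMAGIC" and "SvSETMAGIC" evaluate their argument more than once, an argument with side effects (an increment, a stack pop) can misbehave. The sketch below reproduces the hazard with a hypothetical macro and structure that have nothing to do with Perl's real definitions, and shows the usual remedy: evaluate the expression once into a local variable and pass the local.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical value type with an optional hook. */
struct sketch_value { int has_hook; int hook_calls; };

static void sketch_run_hook(struct sketch_value *v) { v->hook_calls++; }

/* Hypothetical macro with the same shape of hazard: the argument appears
 * twice in the expansion, so it is evaluated twice when has_hook is set. */
#define SKETCH_RUN_HOOK_IF_ANY(v) \
    do { if ((v)->has_hook) sketch_run_hook(v); } while (0)

/* Safe pattern: evaluate the side-effecting expression once, into a
 * local, and hand the local to the macro. Returns the advanced index. */
size_t sketch_process_next(struct sketch_value *items, size_t index)
{
    struct sketch_value *current;

    assert(items != NULL);
    current = &items[index++];        /* the side effect happens once */
    SKETCH_RUN_HOOK_IF_ANY(current);
    return index;
}
```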
4276
4277 SvSetMagicSV
4278 Like "SvSetSV", but does any set magic required afterwards.
4279
4280 void SvSetMagicSV(SV* dsv, SV* ssv)
4281
4282 SvSetMagicSV_nosteal
4283 Like "SvSetSV_nosteal", but does any set magic required
4284 afterwards.
4285
4286 void SvSetMagicSV_nosteal(SV* dsv, SV* ssv)
4287
4288 SvSetSV Calls "sv_setsv" if "dsv" is not the same as "ssv". May
4289 evaluate arguments more than once. Does not handle 'set' magic
4290 on the destination SV.
4291
4292 void SvSetSV(SV* dsv, SV* ssv)
4293
4294 SvSetSV_nosteal
4295 Calls a non-destructive version of "sv_setsv" if "dsv" is not
4296 the same as "ssv". May evaluate arguments more than once.
4297
4298 void SvSetSV_nosteal(SV* dsv, SV* ssv)
4299
4300 SvSHARE Arranges for "sv" to be shared between threads if a suitable
4301 module has been loaded.
4302
4303 void SvSHARE(SV* sv)
4304
4305 sv_string_from_errnum
4306 Generates the message string describing an OS error and returns
4307 it as an SV. "errnum" must be a value that "errno" could take,
4308 identifying the type of error.
4309
4310 If "tgtsv" is non-null then the string will be written into
4311 that SV (overwriting existing content) and it will be returned.
4312 If "tgtsv" is a null pointer then the string will be written
4313 into a new mortal SV which will be returned.
4314
4315 The message will be taken from whatever locale would be used by
4316 $!, and will be encoded in the SV in whatever manner would be
4317 used by $!. The details of this process are subject to future
4318 change. Currently, the message is taken from the C locale by
4319 default (usually producing an English message), and from the
4320 currently selected locale when in the scope of the "use locale"
4321 pragma. A heuristic attempt is made to decode the message from
4322 the locale's character encoding, but it will only be decoded as
4323 either UTF-8 or ISO-8859-1. It is always correctly decoded in
4324 a UTF-8 locale, usually in an ISO-8859-1 locale, and never in
4325 any other locale.
4326
4327 The SV is always returned containing an actual string, and with
4328 no other OK bits set. Unlike $!, a message is even yielded for
4329 "errnum" zero (meaning success), and if no useful message is
4330 available then a useless string (currently empty) is returned.
4331
4332 SV * sv_string_from_errnum(int errnum, SV *tgtsv)
4333
4334 SvUNLOCK
4335 Releases a mutual exclusion lock on "sv" if a suitable module
4336 has been loaded.
4337
4338 void SvUNLOCK(SV* sv)
4339
4341 Copy The XSUB-writer's interface to the C "memcpy" function. The
4342 "src" is the source, "dest" is the destination, "nitems" is the
4343 number of items, and "type" is the type. May fail on
4344 overlapping copies. See also "Move".
4345
4346 void Copy(void* src, void* dest, int nitems, type)
4347
4348 CopyD Like "Copy" but returns "dest". Useful for encouraging
4349 compilers to tail-call optimise.
4350
4351 void * CopyD(void* src, void* dest, int nitems, type)
4352
4353 Move The XSUB-writer's interface to the C "memmove" function. The
4354 "src" is the source, "dest" is the destination, "nitems" is the
4355 number of items, and "type" is the type. Can do overlapping
4356 moves. See also "Copy".
4357
4358 void Move(void* src, void* dest, int nitems, type)
4359
4360 MoveD Like "Move" but returns "dest". Useful for encouraging
4361 compilers to tail-call optimise.
4362
4363 void * MoveD(void* src, void* dest, int nitems, type)
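
The difference between "Copy" and "Move" is the difference between "memcpy" and "memmove". The sketch below uses hypothetical stand-in macros ("SKETCH_COPY", "SKETCH_MOVE") that only mirror the argument order described above; they are not Perl's definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins that mirror the argument order (src, dest,
 * nitems, type) described above; they are not Perl's definitions. */
#define SKETCH_COPY(src, dest, nitems, type) \
    memcpy((dest), (src), (size_t)(nitems) * sizeof(type))
#define SKETCH_MOVE(src, dest, nitems, type) \
    memmove((dest), (src), (size_t)(nitems) * sizeof(type))

/* Shift the first n-1 elements one slot to the right, in place.
 * Source and destination overlap, so the memmove-style macro is needed. */
void shift_right_one(int *a, int n)
{
    assert(a != NULL);
    if (n > 1)
        SKETCH_MOVE(a, a + 1, n - 1, int);
}

/* Duplicate n ints into a separate, non-overlapping array. */
void duplicate_ints(const int *src, int *dest, int n)
{
    assert(src != NULL && dest != NULL && n >= 0);
    SKETCH_COPY(src, dest, n, int);
}
```

Note that the count is a number of items, not bytes; the macro multiplies by the size of the type.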
4364
4365 Newx The XSUB-writer's interface to the C "malloc" function.
4366
4367 Memory obtained by this should ONLY be freed with "Safefree".
4368
4369 In 5.9.3, Newx() and friends replace the older New() API, and
4370 drop the first parameter, x, a debug aid which allowed callers
4371 to identify themselves. This aid has been superseded by a new
4372 build option, PERL_MEM_LOG (see "PERL_MEM_LOG" in
4373 perlhacktips). The older API is still there for use in XS
4374 modules supporting older perls.
4375
4376 void Newx(void* ptr, int nitems, type)
4377
4378 Newxc The XSUB-writer's interface to the C "malloc" function, with
4379 cast. See also "Newx".
4380
4381 Memory obtained by this should ONLY be freed with "Safefree".
4382
4383 void Newxc(void* ptr, int nitems, type, cast)
4384
4385 Newxz The XSUB-writer's interface to the C "malloc" function. The
4386 allocated memory is zeroed with "memzero". See also "Newx".
4387
4388 Memory obtained by this should ONLY be freed with "Safefree".
4389
4390 void Newxz(void* ptr, int nitems, type)
4391
4392 Poison PoisonWith(0xEF) for catching access to freed memory.
4393
4394 void Poison(void* dest, int nitems, type)
4395
4396 PoisonFree
4397 PoisonWith(0xEF) for catching access to freed memory.
4398
4399 void PoisonFree(void* dest, int nitems, type)
4400
4401 PoisonNew
4402 PoisonWith(0xAB) for catching access to allocated but
4403 uninitialized memory.
4404
4405 void PoisonNew(void* dest, int nitems, type)
4406
4407 PoisonWith
4408 Fill up memory with a byte pattern (a byte repeated over and
4409 over again) that hopefully catches attempts to access
4410 uninitialized memory.
4411
4412 void PoisonWith(void* dest, int nitems, type,
4413 U8 byte)
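
The idea behind the poisoning macros can be sketched as a byte fill over a typed region. The function below is a hypothetical stand-in, not Perl's macro; it takes the item size explicitly because a C function cannot accept a type name as an argument.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the idea behind PoisonWith: overwrite
 * nitems objects of the given size with one repeated byte. */
void sketch_poison_with(void *dest, size_t nitems, size_t item_size,
                        unsigned char byte)
{
    assert(dest != NULL);
    memset(dest, byte, nitems * item_size);
}
```

A recognizable pattern such as 0xEF or 0xAB makes a stray read of poisoned memory stand out in a debugger.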
4414
4415 Renew The XSUB-writer's interface to the C "realloc" function.
4416
4417 Memory obtained by this should ONLY be freed with "Safefree".
4418
4419 void Renew(void* ptr, int nitems, type)
4420
4421 Renewc The XSUB-writer's interface to the C "realloc" function, with
4422 cast.
4423
4424 Memory obtained by this should ONLY be freed with "Safefree".
4425
4426 void Renewc(void* ptr, int nitems, type, cast)
4427
4428 Safefree
4429 The XSUB-writer's interface to the C "free" function.
4430
4431 This should ONLY be used on memory obtained using "Newx" and
4432 friends.
4433
4434 void Safefree(void* ptr)
4435
4436 savepv Perl's version of "strdup()". Returns a pointer to a newly
4437 allocated string which is a duplicate of "pv". The size of the
4438 string is determined by "strlen()", which means it may not
4439 contain embedded "NUL" characters and must have a trailing
4440 "NUL". The memory allocated for the new string can be freed
4441 with the "Safefree()" function.
4442
4443 On some platforms, Windows for example, all allocated memory
4444 owned by a thread is deallocated when that thread ends. So if
4445 you need that not to happen, you need to use the shared memory
4446 functions, such as "savesharedpv".
4447
4448 char* savepv(const char* pv)
4449
4450 savepvn Perl's version of what "strndup()" would be if it existed.
4451 Returns a pointer to a newly allocated string which is a
4452 duplicate of the first "len" bytes from "pv", plus a trailing
4453 "NUL" byte. The memory allocated for the new string can be
4454 freed with the "Safefree()" function.
4455
4456 On some platforms, Windows for example, all allocated memory
4457 owned by a thread is deallocated when that thread ends. So if
4458 you need that not to happen, you need to use the shared memory
4459 functions, such as "savesharedpvn".
4460
4461 char* savepvn(const char* pv, I32 len)
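
The behaviour described for "savepvn" (duplicate a given number of bytes, then append a trailing "NUL") can be sketched with the C allocator. The name "sketch_dup_n" is hypothetical, and unlike the real function the result is released with "free()", not "Safefree()".

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the behaviour described for savepvn:
 * duplicate the first len bytes of pv and append a trailing NUL.
 * Returns NULL if allocation fails; release the result with free(). */
char *sketch_dup_n(const char *pv, size_t len)
{
    char *copy;

    assert(pv != NULL);
    copy = malloc(len + 1);
    if (copy == NULL)
        return NULL;
    memcpy(copy, pv, len);   /* embedded NULs are copied like any byte */
    copy[len] = '\0';
    return copy;
}
```

Because the length is passed explicitly, the source may contain embedded "NUL" bytes and need not be "NUL"-terminated.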
4462
4463 savepvs Like "savepvn", but takes a literal string instead of a
4464 string/length pair.
4465
4466 char* savepvs("literal string" s)
4467
4468 savesharedpv
4469 A version of "savepv()" which allocates the duplicate string in
4470 memory which is shared between threads.
4471
4472 char* savesharedpv(const char* pv)
4473
4474 savesharedpvn
4475 A version of "savepvn()" which allocates the duplicate string
4476 in memory which is shared between threads. (With the specific
4477 difference that a "NULL" pointer is not acceptable.)
4478
4479 char* savesharedpvn(const char *const pv,
4480 const STRLEN len)
4481
4482 savesharedpvs
4483 A version of "savepvs()" which allocates the duplicate string
4484 in memory which is shared between threads.
4485
4486 char* savesharedpvs("literal string" s)
4487
4488 savesharedsvpv
4489 A version of "savesvpv()" which allocates the duplicate
4490 string in memory which is shared between threads.
4491
4492 char* savesharedsvpv(SV *sv)
4493
4494 savesvpv
4495 A version of "savepv()"/"savepvn()" which gets the string to
4496 duplicate from the passed-in SV using "SvPV()".
4497
4498 On some platforms, Windows for example, all allocated memory
4499 owned by a thread is deallocated when that thread ends. So if
4500 you need that not to happen, you need to use the shared memory
4501 functions, such as "savesharedsvpv".
4502
4503 char* savesvpv(SV* sv)
4504
4505 StructCopy
4506 This is an architecture-independent macro to copy one structure
4507 to another.
4508
4509 void StructCopy(type *src, type *dest, type)
4510
4511 Zero The XSUB-writer's interface to the C "memzero" function. The
4512 "dest" is the destination, "nitems" is the number of items, and
4513 "type" is the type.
4514
4515 void Zero(void* dest, int nitems, type)
4516
4517 ZeroD Like "Zero" but returns dest. Useful for encouraging compilers
4518 to tail-call optimise.
4519
4520 void * ZeroD(void* dest, int nitems, type)
4521
4523 dump_c_backtrace
4524 Dumps the C backtrace to the given "fp".
4525
4526 Returns true if a backtrace could be retrieved, false if not.
4527
4528 bool dump_c_backtrace(PerlIO* fp, int max_depth,
4529 int skip)
4530
4531 fbm_compile
4532 Analyzes the string in order to make fast searches on it using
4533 "fbm_instr()" -- the Boyer-Moore algorithm.
4534
4535 void fbm_compile(SV* sv, U32 flags)
4536
4537 fbm_instr
4538 Returns the location of the SV in the string delimited by "big"
4539 and "bigend" ("bigend" is the char following the last char).
4540 It returns "NULL" if the string can't be found. The "sv" does
4541 not have to be "fbm_compiled", but the search will not be as
4542 fast then.
4543
4544 char* fbm_instr(unsigned char* big,
4545 unsigned char* bigend, SV* littlestr,
4546 U32 flags)
4547
4548 foldEQ Returns true if the leading "len" bytes of the strings "a" and
4549 "b" are the same case-insensitively; false otherwise.
4550 Uppercase and lowercase ASCII range bytes match themselves and
4551 their opposite case counterparts. Non-cased and non-ASCII
4552 range bytes match only themselves.
4553
4554 I32 foldEQ(const char* a, const char* b, I32 len)
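
The matching rule described above (ASCII letters match either case, every other byte matches only itself) can be sketched as follows. The names are hypothetical, this is not Perl's implementation, and the arithmetic assumes an ASCII-based character set.

```c
#include <assert.h>
#include <stddef.h>

/* Fold one byte: ASCII 'A'..'Z' map to 'a'..'z'; everything else,
 * including bytes above 127, is returned unchanged. */
static unsigned char ascii_fold(unsigned char c)
{
    return (c >= 'A' && c <= 'Z') ? (unsigned char)(c - 'A' + 'a') : c;
}

/* Hypothetical stand-in: returns 1 if the first len bytes of both
 * strings are equal under ASCII-only case folding, 0 otherwise. */
int sketch_fold_eq(const char *a, const char *b, size_t len)
{
    size_t i;

    assert(a != NULL && b != NULL);
    for (i = 0; i < len; i++) {
        if (ascii_fold((unsigned char)a[i]) != ascii_fold((unsigned char)b[i]))
            return 0;
    }
    return 1;
}
```

The sketch avoids "tolower()" on purpose, since that function consults the current locale.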
4555
4556 foldEQ_locale
4557 Returns true if the leading "len" bytes of the strings "a" and
4558 "b" are the same case-insensitively in the current locale;
4559 false otherwise.
4560
4561 I32 foldEQ_locale(const char* a, const char* b,
4562 I32 len)
4563
4564 form Takes a sprintf-style format pattern and conventional (non-SV)
4565 arguments and returns the formatted string.
4566
4567 (char *) Perl_form(pTHX_ const char* pat, ...)
4568
4569 can be used any place a string (char *) is required:
4570
4571 char * s = Perl_form("%d.%d",major,minor);
4572
4573 Uses a single private buffer so if you want to format several
4574 strings you must explicitly copy the earlier strings away (and
4575 free the copies when you are done).
4576
4577 char* form(const char* pat, ...)
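
The single-private-buffer caveat can be demonstrated with a hypothetical formatter that behaves the same way. Nothing below is Perl API; "sketch_version_string" and "sketch_keep" are illustrative names.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical formatter that, like the function described above, returns
 * a pointer into one private buffer that every call reuses. */
const char *sketch_version_string(int major, int minor)
{
    static char buffer[32];

    snprintf(buffer, sizeof buffer, "%d.%d", major, minor);
    return buffer;
}

/* Copy a result away so it survives the next formatting call.
 * Returns NULL on allocation failure; release with free(). */
char *sketch_keep(const char *s)
{
    char *copy;

    assert(s != NULL);
    copy = malloc(strlen(s) + 1);
    if (copy != NULL)
        strcpy(copy, s);
    return copy;
}
```

Two results obtained from such a formatter point at the same storage, so only a copy taken before the second call still holds the first string.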
4578
4579 getcwd_sv
4580 Fill "sv" with the current working directory.
4581
4582 int getcwd_sv(SV* sv)
4583
4584 get_c_backtrace_dump
4585 Returns an SV containing a dump of "max_depth" frames of the call
4586 stack, skipping the "skip" innermost ones. A "max_depth" of 20 is
4587 usually enough.
4588
4589 The appended output looks like:
4590
4591 ... 1 10e004812:0082 Perl_croak util.c:1716
4592 /usr/bin/perl 2 10df8d6d2:1d72 perl_parse perl.c:3975
4593 /usr/bin/perl ...
4594
4595 The fields are tab-separated. The first column is the depth
4596 (zero being the innermost non-skipped frame). In the
4597 hex:offset, the hex is where the program counter was in
4598 "S_parse_body", and the :offset (might be missing) tells how
4599 much inside the "S_parse_body" the program counter was.
4600
4601 The "util.c:1716" is the source code file and line number.
4602
4603 The /usr/bin/perl is obvious (hopefully).
4604
4605 Unknowns are "-". Unknowns can happen unfortunately quite
4606 easily: if the platform doesn't support retrieving the
4607 information; if the binary is missing the debug information; if
4608 the optimizer has transformed the code by for example inlining.
4609
4610 SV* get_c_backtrace_dump(int max_depth, int skip)
4611
4612 ibcmp This is a synonym for "(! foldEQ())"
4613
4614 I32 ibcmp(const char* a, const char* b, I32 len)
4615
4616 ibcmp_locale
4617 This is a synonym for "(! foldEQ_locale())"
4618
4619 I32 ibcmp_locale(const char* a, const char* b,
4620 I32 len)
4621
4622 is_safe_syscall
4623 Test that the given "pv" doesn't contain any internal "NUL"
4624 characters. If it does, set "errno" to "ENOENT", optionally
4625 warn, and return FALSE.
4626
4627 Return TRUE if the name is safe.
4628
4629 Used by the "IS_SAFE_SYSCALL()" macro.
4630
4631 bool is_safe_syscall(const char *pv, STRLEN len,
4632 const char *what,
4633 const char *op_name)
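
The check described above can be sketched with "memchr". The function name is hypothetical, the optional warning is omitted, and "len" is assumed to count the bytes of the name without any terminating "NUL".

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in: return 1 if the first len bytes of pv contain
 * no NUL byte; otherwise set errno to ENOENT and return 0. */
int sketch_name_is_safe(const char *pv, size_t len)
{
    assert(pv != NULL);
    if (len > 0 && memchr(pv, '\0', len) != NULL) {
        errno = ENOENT;
        return 0;
    }
    return 1;
}
```

The point of the check is that a C system call would silently stop at the first "NUL", and so could act on a different name than the Perl string suggests.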
4634
4635 memEQ Test two buffers (which may contain embedded "NUL" characters)
4636 to see if they are equal. The "len" parameter indicates the
4637 number of bytes to compare. Returns true if the buffers are
4638 equal, false otherwise.
4639
4640 bool memEQ(char* s1, char* s2, STRLEN len)
4641
4642 memNE Test two buffers (which may contain embedded "NUL" characters)
4643 to see if they are not equal. The "len" parameter indicates
4644 the number of bytes to compare. Returns true if the buffers
4645 differ, false if they are equal.
4646
4647 bool memNE(char* s1, char* s2, STRLEN len)
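
Length-based comparison matters because "strcmp" stops at the first "NUL". The sketch below uses hypothetical stand-in macros built on "memcmp"; they yield true when the condition in their name holds.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins built on memcmp; true when the named
 * condition holds for the first len bytes. */
#define SKETCH_MEM_EQ(s1, s2, len) (memcmp((s1), (s2), (len)) == 0)
#define SKETCH_MEM_NE(s1, s2, len) (!SKETCH_MEM_EQ((s1), (s2), (len)))

/* Returns 1 if two length-tagged buffers hold the same bytes. */
int buffers_match(const char *s1, size_t len1, const char *s2, size_t len2)
{
    assert(s1 != NULL && s2 != NULL);
    return len1 == len2 && SKETCH_MEM_EQ(s1, s2, len1);
}
```

Two buffers that differ only after an embedded "NUL" compare as equal with "strcmp" but as different with the length-based macros.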
4648
4649 mess Take a sprintf-style format pattern and argument list. These
4650 are used to generate a string message. If the message does not
4651 end with a newline, then it will be extended with some
4652 indication of the current location in the code, as described
4653 for "mess_sv".
4654
4655 Normally, the resulting message is returned in a new mortal SV.
4656 During global destruction a single SV may be shared between
4657 uses of this function.
4658
4659 SV * mess(const char *pat, ...)
4660
4661 mess_sv Expands a message, intended for the user, to include an
4662 indication of the current location in the code, if the message
4663 does not already appear to be complete.
4664
4665 "basemsg" is the initial message or object. If it is a
4666 reference, it will be used as-is and will be the result of this
4667 function. Otherwise it is used as a string, and if it already
4668 ends with a newline, it is taken to be complete, and the result
4669 of this function will be the same string. If the message does
4670 not end with a newline, then a segment such as "at foo.pl line
4671 37" will be appended, and possibly other clauses indicating the
4672 current state of execution. The resulting message will end
4673 with a dot and a newline.
4674
4675 Normally, the resulting message is returned in a new mortal SV.
4676 During global destruction a single SV may be shared between
4677 uses of this function. If "consume" is true, then the function
4678 is permitted (but not required) to modify and return "basemsg"
4679 instead of allocating a new SV.
4680
4681 SV * mess_sv(SV *basemsg, bool consume)
4682
4683 my_snprintf
4684 The C library "snprintf" functionality, if available and
4685 standards-compliant (uses "vsnprintf", actually). However, if
4686 the "vsnprintf" is not available, will unfortunately use the
4687 unsafe "vsprintf" which can overrun the buffer (there is an
4688 overrun check, but that may be too late). Consider using
4689 "sv_vcatpvf" instead, or getting "vsnprintf".
4690
4691 int my_snprintf(char *buffer, const Size_t len,
4692 const char *format, ...)
4693
4694 my_strlcat
4695 The C library "strlcat" if available, or a Perl implementation
4696 of it. This operates on C "NUL"-terminated strings.
4697
4698 "my_strlcat()" appends string "src" to the end of "dst". It
4699 will append at most "size - strlen(dst) - 1" characters. It
4700 will then "NUL"-terminate, unless "size" is 0 or the original
4701 "dst" string was longer than "size" (in practice this should
4702 not happen as it means that either "size" is incorrect or that
4703 "dst" is not a proper "NUL"-terminated string).
4704
4705 Note that "size" is the full size of the destination buffer and
4706 the result is guaranteed to be "NUL"-terminated if there is
4707 room. Note that room for the "NUL" should be included in
4708 "size".
4709
4710 The return value is the total length that "dst" would have if
4711 "size" is sufficiently large. Thus it is the initial length of
4712 "dst" plus the length of "src". If "size" is not larger than the
4713 return, the excess was not appended.
4714
4715 Size_t my_strlcat(char *dst, const char *src,
4716 Size_t size)
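
The append rules above can be sketched in portable C. This is an illustrative reimplementation under the hypothetical name "sketch_strlcat", not Perl's code. Truncation is detected by comparing the return value with "size": a return value greater than or equal to "size" means the result did not fit.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical reimplementation of the strlcat semantics described
 * above. size is the full size of the dst buffer, NUL included. */
size_t sketch_strlcat(char *dst, const char *src, size_t size)
{
    size_t dst_len = 0;
    size_t src_len;
    size_t room, n;

    assert(dst != NULL && src != NULL);
    src_len = strlen(src);

    /* Measure dst without reading past the buffer. */
    while (dst_len < size && dst[dst_len] != '\0')
        dst_len++;

    /* No NUL within size bytes: dst is not a proper string, append nothing. */
    if (dst_len == size)
        return size + src_len;

    room = size - dst_len - 1;               /* bytes free before the NUL */
    n = (src_len < room) ? src_len : room;
    memcpy(dst + dst_len, src, n);
    dst[dst_len + n] = '\0';
    return dst_len + src_len;                /* length the full result needs */
}
```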
4717
4718 my_strlcpy
4719 The C library "strlcpy" if available, or a Perl implementation
4720 of it. This operates on C "NUL"-terminated strings.
4721
4722 "my_strlcpy()" copies up to "size - 1" characters from the
4723 string "src" to "dst", "NUL"-terminating the result if "size"
4724 is not 0.
4725
4726 The return value is the total length "src" would be if the copy
4727 completely succeeded. If it is not less than "size", the
4728 excess was not copied.
4729
4730 Size_t my_strlcpy(char *dst, const char *src,
4731 Size_t size)
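A minimal sketch of the copy-and-report behaviour described above, written in portable C. The name "sketch_strlcpy" is hypothetical and the code is an illustration of the documented contract, not Perl's own fallback.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in that follows the documented my_strlcpy() contract. */
static size_t sketch_strlcpy(char *dst, const char *src, size_t size)
{
    size_t src_len, copy;

    assert(dst != NULL && src != NULL);

    src_len = strlen(src);
    if (size > 0) {
        /* Copy at most size - 1 bytes, then always NUL-terminate. */
        copy = src_len < size - 1 ? src_len : size - 1;
        memcpy(dst, src, copy);
        dst[copy] = '\0';
    }
    return src_len;             /* length of src, whether or not it all fit */
}
```

With a four-byte destination and the source "hello", the sketch stores "hel" and returns 5, which tells the caller that two bytes were dropped.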
4732
4733 my_strnlen
4734 The C library "strnlen" if available, or a Perl implementation
4735 of it.
4736
4737 "my_strnlen()" computes the length of the string, up to
4738 "maxlen" characters. It will never attempt to address
4739 more than "maxlen" characters, making it suitable for use with
4740 strings that are not guaranteed to be NUL-terminated.
4741
4742 Size_t my_strnlen(const char *str, Size_t maxlen)
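The bounded scan described above can be sketched with the standard "memchr". The name "sketch_strnlen" is hypothetical; the point of the example is that the scan never reads past "maxlen" bytes, so it is safe on buffers without a terminating NUL.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in that follows the documented my_strnlen() contract. */
static size_t sketch_strnlen(const char *str, size_t maxlen)
{
    const char *end;

    assert(str != NULL);

    /* memchr examines at most maxlen bytes. */
    end = memchr(str, '\0', maxlen);
    return end ? (size_t)(end - str) : maxlen;
}
```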
4743
4744 my_vsnprintf
4745 The C library "vsnprintf" if available and standards-compliant.
4746 However, if the "vsnprintf" is not available, will
4747 unfortunately use the unsafe "vsprintf" which can overrun the
4748 buffer (there is an overrun check, but that may be too late).
4749 Consider using "sv_vcatpvf" instead, or getting "vsnprintf".
4750
4751 int my_vsnprintf(char *buffer, const Size_t len,
4752 const char *format, va_list ap)
4753
4754 ninstr Find the first (leftmost) occurrence of a sequence of bytes
4755 within another sequence. This is the Perl version of
4756 "strstr()", extended to handle arbitrary sequences, potentially
4757 containing embedded "NUL" characters ("NUL" is what the initial
4758 "n" in the function name stands for; some systems have an
4759 equivalent, "memmem()", but with a somewhat different API).
4760
4761 Another way of thinking about this function is finding a needle
4762 in a haystack. "big" points to the first byte in the haystack.
4763 "bigend" points to one byte beyond the final byte in the
4764 haystack. "little" points to the first byte in the needle.
4765 "little_end" points to one byte beyond the final byte in the
4766 needle. All the parameters must be non-"NULL".
4767
4768 The function returns "NULL" if there is no occurrence of
4769 "little" within "big". If "little" is the empty string, "big"
4770 is returned.
4771
4772 Because this function operates at the byte level, and because
4773 of the inherent characteristics of UTF-8 (or UTF-EBCDIC), it
4774 will work properly if both the needle and the haystack are
4775 strings with the same UTF-8ness, but not if the UTF-8ness
4776 differs.
4777
4778 char * ninstr(char * big, char * bigend, char * little,
4779 char * little_end)
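The needle-in-a-haystack description above can be sketched as a plain byte search over two pointer pairs. The name "sketch_ninstr" is hypothetical, and the simple loop is only an illustration of the documented behaviour (including embedded NULs and the empty-needle case), not Perl's implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in: leftmost occurrence of [little, little_end)
 * within [big, bigend), or NULL if there is none. */
static const char *sketch_ninstr(const char *big, const char *bigend,
                                 const char *little, const char *little_end)
{
    size_t hay_len, needle_len;
    const char *p;

    assert(big && bigend && little && little_end);
    assert(big <= bigend && little <= little_end);

    hay_len = (size_t)(bigend - big);
    needle_len = (size_t)(little_end - little);

    if (needle_len == 0)
        return big;             /* empty needle matches at the start */
    if (needle_len > hay_len)
        return NULL;

    for (p = big; p <= bigend - needle_len; p++) {
        if (memcmp(p, little, needle_len) == 0)
            return p;
    }
    return NULL;
}
```

Because the lengths come from the end pointers rather than from "strlen", a needle such as "\0cd" is searched for byte by byte like any other sequence.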
4780
4781 PERL_SYS_INIT
4782 Provides system-specific tune up of the C runtime environment
4783 necessary to run Perl interpreters. This should be called only
4784 once, before creating any Perl interpreters.
4785
4786 void PERL_SYS_INIT(int *argc, char*** argv)
4787
4788 PERL_SYS_INIT3
4789 Provides system-specific tune up of the C runtime environment
4790 necessary to run Perl interpreters. This should be called only
4791 once, before creating any Perl interpreters.
4792
4793 void PERL_SYS_INIT3(int *argc, char*** argv,
4794 char*** env)
4795
4796 PERL_SYS_TERM
4797 Provides system-specific clean up of the C runtime environment
4798 after running Perl interpreters. This should be called only
4799 once, after freeing any remaining Perl interpreters.
4800
4801 void PERL_SYS_TERM()
4802
4803 quadmath_format_needed
4804 "quadmath_format_needed()" returns true if the "format" string
4805 seems to contain at least one non-Q-prefixed "%[efgaEFGA]"
4806 format specifier, or returns false otherwise.
4807
4808 The format specifier detection is not complete printf-syntax
4809 detection, but it should catch most common cases.
4810
4811 If true is returned, those arguments should in theory be
4812 processed with "quadmath_snprintf()", but in case there is more
4813 than one such format specifier (see "quadmath_format_single"),
4814 and if there is anything else beyond that one (even just a
4815 single byte), they cannot be processed because
4816 "quadmath_snprintf()" is very strict, accepting only one format
4817 spec, and nothing else. In this case, the code should probably
4818 fail.
4819
4820 bool quadmath_format_needed(const char* format)
4821
4822 quadmath_format_single
4823 "quadmath_snprintf()" is very strict about its "format" string
4824 and will fail, returning -1, if the format is invalid. It
4825 accepts exactly one format spec.
4826
4827 "quadmath_format_single()" checks that the intended single spec
4828 looks sane: begins with "%", has only one "%", ends with
4829 "[efgaEFGA]", and has "Q" before it. This is not a full
4830 "printf syntax check", just the basics.
4831
4832 Returns the format if it is valid, NULL if not.
4833
4834 "quadmath_format_single()" can and will actually patch in the
4835 missing "Q", if necessary. In this case it will return the
4836 modified copy of the format, which the caller will need to
4837 free.
4838
4839 See also "quadmath_format_needed".
4840
4841 const char* quadmath_format_single(const char* format)
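The sanity rules listed above (begins with "%", has only one "%", ends with one of "efgaEFGA", and has "Q" before it) can be sketched as a small predicate. The name "looks_like_single_q_spec" is hypothetical. Unlike "quadmath_format_single()", this sketch only checks the format; it does not patch in a missing "Q" or return a copy.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical check mirroring the rules listed in the text.
 * Returns 1 if format looks like exactly one Q-prefixed spec, else 0. */
static int looks_like_single_q_spec(const char *format)
{
    size_t len;

    assert(format != NULL);

    len = strlen(format);
    if (len < 3 || format[0] != '%')        /* shortest case is "%Qg" */
        return 0;
    if (strchr(format + 1, '%') != NULL)    /* a second '%' is not allowed */
        return 0;
    if (strchr("efgaEFGA", format[len - 1]) == NULL)
        return 0;                           /* must end in a float conversion */
    return format[len - 2] == 'Q';          /* Q directly before the conversion */
}
```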
4842
4843 READ_XDIGIT
4844 Returns the value of an ASCII-range hex digit and advances the
4845 string pointer. Behaviour is only well defined when
4846 isXDIGIT(*str) is true.
4847
4848 U8 READ_XDIGIT(char* str)
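The "return the digit value and advance the pointer" behaviour can be sketched as a function taking the address of the string pointer. The name "sketch_read_xdigit" is hypothetical; the real "READ_XDIGIT" is a macro that modifies its argument directly. The sketch assumes an ASCII platform.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical stand-in: read one ASCII hex digit from *str and
 * advance *str past it.  Only meaningful when the byte is a hex digit. */
static unsigned char sketch_read_xdigit(const char **str)
{
    unsigned char c;

    assert(str != NULL && *str != NULL);

    c = (unsigned char)*(*str)++;
    assert(isxdigit(c));

    if (c >= '0' && c <= '9')
        return (unsigned char)(c - '0');
    /* Setting bit 0x20 folds ASCII 'A'-'F' onto 'a'-'f'. */
    return (unsigned char)((c | 0x20) - 'a' + 10);
}
```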
4849
4850 rninstr Like "ninstr", but instead finds the final (rightmost)
4851 occurrence of a sequence of bytes within another sequence,
4852 returning "NULL" if there is no such occurrence.
4853
4854 char * rninstr(char * big, char * bigend,
4855 char * little, char * little_end)
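A rightmost search can be sketched by scanning candidate positions from the end of the haystack towards the start. The name "sketch_rninstr" is hypothetical, and the result for an empty needle (returning "bigend") is an assumption of this sketch, since the text above does not specify it.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in: rightmost occurrence of [little, little_end)
 * within [big, bigend), or NULL if there is none. */
static const char *sketch_rninstr(const char *big, const char *bigend,
                                  const char *little, const char *little_end)
{
    size_t hay_len, needle_len;
    const char *p;

    assert(big && bigend && little && little_end);
    assert(big <= bigend && little <= little_end);

    hay_len = (size_t)(bigend - big);
    needle_len = (size_t)(little_end - little);

    if (needle_len == 0)
        return bigend;          /* assumption of this sketch */
    if (needle_len > hay_len)
        return NULL;

    /* Start at the last position where the needle could still fit. */
    p = bigend - needle_len;
    for (;;) {
        if (memcmp(p, little, needle_len) == 0)
            return p;
        if (p == big)
            break;
        p--;
    }
    return NULL;
}
```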
4856
4857 strEQ Test two "NUL"-terminated strings to see if they are equal.
4858 Returns true or false.
4859
4860 bool strEQ(char* s1, char* s2)
4861
4862 strGE Test two "NUL"-terminated strings to see if the first, "s1", is
4863 greater than or equal to the second, "s2". Returns true or
4864 false.
4865
4866 bool strGE(char* s1, char* s2)
4867
4868 strGT Test two "NUL"-terminated strings to see if the first, "s1", is
4869 greater than the second, "s2". Returns true or false.
4870
4871 bool strGT(char* s1, char* s2)
4872
4873 strLE Test two "NUL"-terminated strings to see if the first, "s1", is
4874 less than or equal to the second, "s2". Returns true or false.
4875
4876 bool strLE(char* s1, char* s2)
4877
4878 strLT Test two "NUL"-terminated strings to see if the first, "s1", is
4879 less than the second, "s2". Returns true or false.
4880
4881 bool strLT(char* s1, char* s2)
4882
4883 strNE Test two "NUL"-terminated strings to see if they are different.
4884 Returns true or false.
4885
4886 bool strNE(char* s1, char* s2)
4887
4888 strnEQ Test two "NUL"-terminated strings to see if they are equal.
4889 The "len" parameter indicates the number of bytes to compare.
4890 Returns true or false. (A wrapper for "strncmp").
4891
4892 bool strnEQ(char* s1, char* s2, STRLEN len)
4893
4894 strnNE Test two "NUL"-terminated strings to see if they are different.
4895 The "len" parameter indicates the number of bytes to compare.
4896 Returns true or false. (A wrapper for "strncmp").
4897
4898 bool strnNE(char* s1, char* s2, STRLEN len)
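The comparison macros above behave like thin wrappers around "strcmp" and "strncmp". The "SKETCH_" names and the "has_prefix" helper below are hypothetical; they are a sketch of how such wrappers read in calling code, and only the "strncmp" relationship is stated by the text itself.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical equivalents of the comparison macros described above. */
#define SKETCH_strEQ(s1, s2)       (strcmp((s1), (s2)) == 0)
#define SKETCH_strLT(s1, s2)       (strcmp((s1), (s2)) < 0)
#define SKETCH_strnEQ(s1, s2, len) (strncmp((s1), (s2), (len)) == 0)

/* A common use of the length-limited form: testing for a prefix. */
static int has_prefix(const char *s, const char *prefix)
{
    assert(s != NULL && prefix != NULL);
    return SKETCH_strnEQ(s, prefix, strlen(prefix));
}
```

Because the comparison stops at the first NUL in either string, a string shorter than the prefix simply compares as different.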
4899
4900 sv_destroyable
4901 Dummy routine which reports that object can be destroyed when
4902 there is no sharing module present. It ignores its single SV
4903 argument, and returns 'true'. Exists to avoid test for a
4904 "NULL" function pointer and because it could potentially warn
4905 under some level of strict-ness.
4906
4907 bool sv_destroyable(SV *sv)
4908
4909 sv_nosharing
4910 Dummy routine which "shares" an SV when there is no sharing
4911 module present. Or "locks" it. Or "unlocks" it. In other
4912 words, ignores its single SV argument. Exists to avoid test
4913 for a "NULL" function pointer and because it could potentially
4914 warn under some level of strict-ness.
4915
4916 void sv_nosharing(SV *sv)
4917
4918 vmess "pat" and "args" are a sprintf-style format pattern and
4919 encapsulated argument list, respectively. These are used to
4920 generate a string message. If the message does not end with a
4921 newline, then it will be extended with some indication of the
4922 current location in the code, as described for "mess_sv".
4923
4924 Normally, the resulting message is returned in a new mortal SV.
4925 During global destruction a single SV may be shared between
4926 uses of this function.
4927
4928 SV * vmess(const char *pat, va_list *args)
4929
4931 These functions are related to the method resolution order of perl
4932 classes.
4933
4934 mro_get_linear_isa
4935 Returns the mro linearisation for the given stash. By default,
4936 this will be whatever "mro_get_linear_isa_dfs" returns unless
4937 some other MRO is in effect for the stash. The return value is
4938 a read-only AV*.
4939
4940 You are responsible for "SvREFCNT_inc()" on the return value if
4941 you plan to store it anywhere semi-permanently (otherwise it
4942 might be deleted out from under you the next time the cache is
4943 invalidated).
4944
4945 AV* mro_get_linear_isa(HV* stash)
4946
4947 mro_method_changed_in
4948 Invalidates method caching on any child classes of the given
4949 stash, so that they might notice the changes in this one.
4950
4951 Ideally, all instances of "PL_sub_generation++" in perl source
4952 outside of mro.c should be replaced by calls to this.
4953
4954 Perl automatically handles most of the common ways a method
4955 might be redefined. However, there are a few ways you could
4956 change a method in a stash without the cache code noticing, in
4957 which case you need to call this method afterwards:
4958
4959 1) Directly manipulating the stash HV entries from XS code.
4960
4961 2) Assigning a reference to a readonly scalar constant into a
4962 stash entry in order to create a constant subroutine (like
4963 constant.pm does).
4964
4965 This same method is available from pure perl via
4966 "mro::method_changed_in(classname)".
4967
4968 void mro_method_changed_in(HV* stash)
4969
4970 mro_register
4971 Registers a custom mro plugin. See perlmroapi for details.
4972
4973 void mro_register(const struct mro_alg *mro)
4974
4976 dMULTICALL
4977 Declare local variables for a multicall. See "LIGHTWEIGHT
4978 CALLBACKS" in perlcall.
4979
4980 dMULTICALL;
4981
4982 MULTICALL
4983 Make a lightweight callback. See "LIGHTWEIGHT CALLBACKS" in
4984 perlcall.
4985
4986 MULTICALL;
4987
4988 POP_MULTICALL
4989 Closing bracket for a lightweight callback. See "LIGHTWEIGHT
4990 CALLBACKS" in perlcall.
4991
4992 POP_MULTICALL;
4993
4994 PUSH_MULTICALL
4995 Opening bracket for a lightweight callback. See "LIGHTWEIGHT
4996 CALLBACKS" in perlcall.
4997
4998 PUSH_MULTICALL;
4999
5001 grok_bin
5002 Converts a string representing a binary number to numeric form.
5003
5004 On entry "start" and *len_p give the string to scan, *flags gives
5005 conversion flags, and "result" should be "NULL" or a pointer to
5006 an NV. The scan stops at the end of the string, or the first
5007 invalid character. Unless "PERL_SCAN_SILENT_ILLDIGIT" is set
5008 in *flags, encountering an invalid character will also trigger
5009 a warning. On return *len_p is set to the length of the scanned
5010 string, and *flags gives output flags.
5011
5012 If the value is <= "UV_MAX" it is returned as a UV, the output
5013 flags are clear, and nothing is written to *result. If the
5014 value is > "UV_MAX", "grok_bin" returns "UV_MAX", sets
5015 "PERL_SCAN_GREATER_THAN_UV_MAX" in the output flags, and writes
5016 the value to *result (or the value is discarded if "result" is
5017 "NULL").
5018
5019 The binary number may optionally be prefixed with "0b" or "b"
5020 unless "PERL_SCAN_DISALLOW_PREFIX" is set in *flags on entry.
5021 If "PERL_SCAN_ALLOW_UNDERSCORES" is set in *flags then the
5022 binary number may use "_" characters to separate digits.
5023
5024 UV grok_bin(const char* start, STRLEN* len_p,
5025 I32* flags, NV *result)
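The in/out protocol described above (length and flags passed by pointer, saturation at the maximum with an overflow flag and a floating-point result) can be sketched in portable C. Everything here is hypothetical: the function name, the two "SKETCH_" flag values, and the use of "unsigned long long" and "double" in place of UV and NV. Prefix handling and warnings are omitted.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#define SKETCH_ALLOW_UNDERSCORES 0x01   /* hypothetical input flag  */
#define SKETCH_GREATER_THAN_MAX  0x02   /* hypothetical output flag */

/* Hypothetical stand-in for the grok_bin() calling convention. */
static unsigned long long sketch_grok_bin(const char *start, size_t *len_p,
                                          int *flags, double *result)
{
    const char *s, *end;
    unsigned long long value = 0;
    double big = 0.0;
    int overflowed = 0, allow_us;

    assert(start != NULL && len_p != NULL && flags != NULL);

    end = start + *len_p;
    allow_us = *flags & SKETCH_ALLOW_UNDERSCORES;

    for (s = start; s < end; s++) {
        /* An underscore is skipped only when a digit follows it. */
        if (*s == '_' && allow_us && s + 1 < end
            && (s[1] == '0' || s[1] == '1'))
            continue;
        if (*s != '0' && *s != '1')
            break;                          /* first invalid character */
        if (!overflowed) {
            if (value <= ULLONG_MAX / 2) {  /* doubling cannot overflow */
                value = value * 2 + (unsigned)(*s - '0');
                continue;
            }
            overflowed = 1;                 /* switch to floating point */
            big = (double)value;
        }
        big = big * 2.0 + (*s - '0');
    }

    *len_p = (size_t)(s - start);           /* number of bytes scanned */
    if (!overflowed) {
        *flags = 0;
        return value;
    }
    *flags = SKETCH_GREATER_THAN_MAX;
    if (result != NULL)
        *result = big;
    return ULLONG_MAX;
}
```

A caller sets the length and flags before the call, then inspects both afterwards: the updated length shows where scanning stopped, and the output flag shows whether the return value or the floating-point result is the one to trust.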
5026
5027 grok_hex
5028 Converts a string representing a hex number to numeric form.
5029
5030 On entry "start" and *len_p give the string to scan, *flags
5031 gives conversion flags, and "result" should be "NULL" or a
5032 pointer to an NV. The scan stops at the end of the string, or
5033 the first invalid character. Unless
5034 "PERL_SCAN_SILENT_ILLDIGIT" is set in *flags, encountering an
5035 invalid character will also trigger a warning. On return *len_p
5036 is set to the length of the scanned string, and *flags gives
5037 output flags.
5038
5039 If the value is <= "UV_MAX" it is returned as a UV, the output
5040 flags are clear, and nothing is written to *result. If the
5041 value is > "UV_MAX", "grok_hex" returns "UV_MAX", sets
5042 "PERL_SCAN_GREATER_THAN_UV_MAX" in the output flags, and writes
5043 the value to *result (or the value is discarded if "result" is
5044 "NULL").
5045
5046 The hex number may optionally be prefixed with "0x" or "x"
5047 unless "PERL_SCAN_DISALLOW_PREFIX" is set in *flags on entry.
5048 If "PERL_SCAN_ALLOW_UNDERSCORES" is set in *flags then the hex
5049 number may use "_" characters to separate digits.
5050
5051 UV grok_hex(const char* start, STRLEN* len_p,
5052 I32* flags, NV *result)
5053
5054 grok_infnan
5055 Helper for "grok_number()", accepts various ways of spelling
5056 "infinity" or "not a number", and returns one of the following
5057 flag combinations:
5058
5059 IS_NUMBER_INFINITY
5060 IS_NUMBER_NAN
5061 IS_NUMBER_INFINITY | IS_NUMBER_NEG
5062 IS_NUMBER_NAN | IS_NUMBER_NEG
5063 0
5064
5065 possibly |-ed with "IS_NUMBER_TRAILING".
5066
5067 If an infinity or a not-a-number is recognized, *sp will point
5068 to one byte past the end of the recognized string. If the
5069 recognition fails, zero is returned, and *sp will not move.
5070
5071 int grok_infnan(const char** sp, const char *send)
5072
5073 grok_number
5074 Identical to "grok_number_flags()" with "flags" set to zero.
5075
5076 int grok_number(const char *pv, STRLEN len,
5077 UV *valuep)
5078
5079 grok_number_flags
5080 Recognise (or not) a number. Returns 0 if the number is not
5081 recognised; otherwise the return value is a bit-ORed
5082 combination of "IS_NUMBER_IN_UV",
5083 "IS_NUMBER_GREATER_THAN_UV_MAX", "IS_NUMBER_NOT_INT",
5084 "IS_NUMBER_NEG", "IS_NUMBER_INFINITY", "IS_NUMBER_NAN" (defined
5085 in perl.h).
5086
5087 If the value of the number can fit in a UV, it is returned in
5088 *valuep. "IS_NUMBER_IN_UV" will be set to indicate that
5089 *valuep is valid; "IS_NUMBER_IN_UV" will never be set unless
5090 *valuep is valid, but *valuep may have been assigned to during
5091 processing even though "IS_NUMBER_IN_UV" is not set on return.
5092 If "valuep" is "NULL", "IS_NUMBER_IN_UV" will be set for the
5093 same cases as when "valuep" is non-"NULL", but no actual
5094 assignment (or SEGV) will occur.
5095
5096 "IS_NUMBER_NOT_INT" will be set with "IS_NUMBER_IN_UV" if
5097 trailing decimals were seen (in which case *valuep gives the
5098 true value truncated to an integer), and "IS_NUMBER_NEG" if the
5099 number is negative (in which case *valuep holds the absolute
5100 value). "IS_NUMBER_IN_UV" is not set if e notation was used or
5101 the number is larger than a UV.
5102
5103 "flags" allows only "PERL_SCAN_TRAILING", which allows for
5104 trailing non-numeric text on an otherwise successful grok,
5105 setting "IS_NUMBER_TRAILING" on the result.
5106
5107 int grok_number_flags(const char *pv, STRLEN len,
5108 UV *valuep, U32 flags)
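The flag rules above imply a decoding step on the caller's side: the value is usable as an exact integer only when the "in UV" bit is set and the "not an integer" bit is clear, and a negative number arrives as an absolute value plus a sign bit. A minimal sketch follows. The "SK_" constants are hypothetical placeholders for the "IS_NUMBER_*" flags (their numeric values here are assumptions), and "unsigned long long" stands in for UV.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical placeholders for the IS_NUMBER_* result bits. */
#define SK_IN_UV    0x01
#define SK_NOT_INT  0x04
#define SK_NEG      0x08

/* Returns 1 and stores the number in *out when the result describes an
 * exact integer that fits in a long long; returns 0 otherwise. */
static int result_to_integer(int type, unsigned long long value,
                             long long *out)
{
    const unsigned long long min_magnitude =
        (unsigned long long)LLONG_MAX + 1ULL;

    assert(out != NULL);

    if (!(type & SK_IN_UV) || (type & SK_NOT_INT))
        return 0;               /* value is not valid, or was truncated */

    if (type & SK_NEG) {        /* value holds the absolute value */
        if (value > min_magnitude)
            return 0;
        *out = (value == min_magnitude) ? LLONG_MIN : -(long long)value;
        return 1;
    }
    if (value > (unsigned long long)LLONG_MAX)
        return 0;
    *out = (long long)value;
    return 1;
}
```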
5109
5110 grok_numeric_radix
5111 Scan and skip for a numeric decimal separator (radix).
5112
5113 bool grok_numeric_radix(const char **sp,
5114 const char *send)
5115
5116 grok_oct
5117 Converts a string representing an octal number to numeric form.
5118
5119 On entry "start" and *len_p give the string to scan, *flags gives
5120 conversion flags, and "result" should be "NULL" or a pointer to
5121 an NV. The scan stops at the end of the string, or the first
5122 invalid character. Unless "PERL_SCAN_SILENT_ILLDIGIT" is set
5123 in *flags, encountering an 8 or 9 will also trigger a warning.
5124 On return *len_p is set to the length of the scanned string, and
5125 *flags gives output flags.
5126
5127 If the value is <= "UV_MAX" it is returned as a UV, the output
5128 flags are clear, and nothing is written to *result. If the
5129 value is > "UV_MAX", "grok_oct" returns "UV_MAX", sets
5130 "PERL_SCAN_GREATER_THAN_UV_MAX" in the output flags, and writes
5131 the value to *result (or the value is discarded if "result" is
5132 "NULL").
5133
5134 If "PERL_SCAN_ALLOW_UNDERSCORES" is set in *flags then the
5135 octal number may use "_" characters to separate digits.
5136
5137 UV grok_oct(const char* start, STRLEN* len_p,
5138 I32* flags, NV *result)
5139
5140 isinfnan
5141 "Perl_isinfnan()" is a utility function that returns true if the
5142 NV argument is either an infinity or a "NaN", false otherwise.
5143 To test in more detail, use "Perl_isinf()" and "Perl_isnan()".
5144
5145 This is also the logical inverse of Perl_isfinite().
5146
5147 bool isinfnan(NV nv)
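The relationship stated above (true for an infinity or a NaN, and the logical inverse of the finiteness test) can be checked with the C99 classification macros. The name "sketch_isinfnan" is hypothetical, and "double" stands in for NV.

```c
#include <assert.h>
#include <math.h>

/* Hypothetical stand-in using the C99 classification macros. */
static int sketch_isinfnan(double nv)
{
    return isinf(nv) || isnan(nv);
}
```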
5148
5149 my_strtod
5150 This function is equivalent to the libc strtod() function, and
5151 is available even on platforms that lack plain strtod(). Its
5152 return value is the best available precision depending on
5153 platform capabilities and Configure options.
5154
5155 It properly handles the locale radix character, meaning it
5156 expects a dot except when called from within the scope of
5157 "use locale", in which case the radix character should be that
5158 specified by the current locale.
5159
5160 The synonym Strtod() may be used instead.
5161
5162 NV my_strtod(const char * const s, char ** e)
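The second parameter works like the end pointer of the libc function: after the call it points just past the last byte that was consumed. The sketch below uses the standard "strtod" as a stand-in and runs in the default "C" locale, where the radix character is a dot. The helper name "parse_whole_number" is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper: accept s only if the entire string is a number.
 * Plain strtod stands in for my_strtod. */
static int parse_whole_number(const char *s, double *out)
{
    char *end;
    double v;

    assert(s != NULL && out != NULL);

    v = strtod(s, &end);
    if (end == s || *end != '\0')   /* nothing parsed, or trailing junk */
        return 0;
    *out = v;
    return 1;
}
```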
5163
5164 Perl_signbit
5165 NOTE: this function is experimental and may change or be
5166 removed without notice.
5167
5168 Return a non-zero integer if the sign bit on an NV is set, and
5169 0 if it is not.
5170
5171 If Configure detects this system has a "signbit()" that will
5172 work with our NVs, then we just use it via the "#define" in
5173 perl.h. Otherwise, fall back on this implementation. The main
5174 use of this function is catching "-0.0".
5175
5176 "Configure" notes: This function is called 'Perl_signbit'
5177 instead of a plain 'signbit' because it is easy to imagine a
5178 system having a "signbit()" function or macro that doesn't
5179 happen to work with our particular choice of NVs. We shouldn't
5180 just re-"#define" "signbit" as "Perl_signbit" and expect the
5181 standard system headers to be happy. Also, this is a no-
5182 context function (no "pTHX_") because "Perl_signbit()" is
5183 usually re-"#defined" in perl.h as a simple macro call to the
5184 system's "signbit()". Users should just always call
5185 "Perl_signbit()".
5186
5187 int Perl_signbit(NV f)
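Catching "-0.0" is the case where an ordinary comparison is not enough, because negative zero compares equal to positive zero. The sketch below uses the C99 "signbit" macro as a stand-in for "Perl_signbit()"; the helper name "is_negative_zero" is hypothetical and "double" stands in for NV.

```c
#include <assert.h>
#include <math.h>

/* Hypothetical helper: true only for negative zero. */
static int is_negative_zero(double f)
{
    /* f == 0.0 holds for both zeros; the sign bit tells them apart. */
    return f == 0.0 && signbit(f) != 0;
}
```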
5188
5189 scan_bin
5190 For backwards compatibility. Use "grok_bin" instead.
5191
5192 NV scan_bin(const char* start, STRLEN len,
5193 STRLEN* retlen)
5194
5195 scan_hex
5196 For backwards compatibility. Use "grok_hex" instead.
5197
5198 NV scan_hex(const char* start, STRLEN len,
5199 STRLEN* retlen)
5200
5201 scan_oct
5202 For backwards compatibility. Use "grok_oct" instead.
5203
5204 NV scan_oct(const char* start, STRLEN len,
5205 STRLEN* retlen)
5206
5208 Some of these are also deprecated. You can exclude these from your
5209 compiled Perl by adding this option to Configure:
5210 "-Accflags='-DNO_MATHOMS'"
5211
5212 custom_op_desc
5213 Return the description of a given custom op. This was once
5214 used by the "OP_DESC" macro, but is no longer: it has only been
5215 kept for compatibility, and should not be used.
5216
5217 const char * custom_op_desc(const OP *o)
5218
5219 custom_op_name
5220 Return the name for a given custom op. This was once used by
5221 the "OP_NAME" macro, but is no longer: it has only been kept
5222 for compatibility, and should not be used.
5223
5224 const char * custom_op_name(const OP *o)
5225
5226 gv_fetchmethod
5227 See "gv_fetchmethod_autoload".
5228
5229 GV* gv_fetchmethod(HV* stash, const char* name)
5230
5231 is_utf8_char
5232 DEPRECATED! It is planned to remove this function from a
5233 future release of Perl. Do not use it for new code; remove it
5234 from existing code.
5235
5236 Tests if some arbitrary number of bytes begins in a valid UTF-8
5237 character. Note that an INVARIANT (i.e. ASCII on non-EBCDIC
5238 machines) character is a valid UTF-8 character. The actual
5239 number of bytes in the UTF-8 character will be returned if it
5240 is valid, otherwise 0.
5241
5242 This function is deprecated due to the possibility that
5243 malformed input could cause reading beyond the end of the input
5244 buffer. Use "isUTF8_CHAR" instead.
5245
5246 STRLEN is_utf8_char(const U8 *s)
5247
5248 is_utf8_char_buf
5249 This is identical to the macro "isUTF8_CHAR".
5250
5251 STRLEN is_utf8_char_buf(const U8 *buf,
5252 const U8 *buf_end)
5253
5254 pack_cat
5255 The engine implementing the "pack()" Perl function. Note:
5256 parameters "next_in_list" and "flags" are not used. This call
5257 should not be used; use "packlist" instead.
5258
5259 void pack_cat(SV *cat, const char *pat,
5260 const char *patend, SV **beglist,
5261 SV **endlist, SV ***next_in_list,
5262 U32 flags)
5263
5264 pad_compname_type
5265 Looks up the type of the lexical variable at position "po" in
5266 the currently-compiling pad. If the variable is typed, the
5267 stash of the class to which it is typed is returned. If not,
5268 "NULL" is returned.
5269
5270 HV * pad_compname_type(PADOFFSET po)
5271
5272 sv_2pvbyte_nolen
5273 Return a pointer to the byte-encoded representation of the SV.
5274 May cause the SV to be downgraded from UTF-8 as a side-effect.
5275
5276 Usually accessed via the "SvPVbyte_nolen" macro.
5277
5278 char* sv_2pvbyte_nolen(SV* sv)
5279
5280 sv_2pvutf8_nolen
5281 Return a pointer to the UTF-8-encoded representation of the SV.
5282 May cause the SV to be upgraded to UTF-8 as a side-effect.
5283
5284 Usually accessed via the "SvPVutf8_nolen" macro.
5285
5286 char* sv_2pvutf8_nolen(SV* sv)
5287
5288 sv_2pv_nolen
5289 Like "sv_2pv()", but doesn't return the length too. You should
5290 usually use the macro wrapper "SvPV_nolen(sv)" instead.
5291
5292 char* sv_2pv_nolen(SV* sv)
5293
5294 sv_catpvn_mg
5295 Like "sv_catpvn", but also handles 'set' magic.
5296
5297 void sv_catpvn_mg(SV *sv, const char *ptr,
5298 STRLEN len)
5299
5300 sv_catsv_mg
5301 Like "sv_catsv", but also handles 'set' magic.
5302
5303 void sv_catsv_mg(SV *dsv, SV *ssv)
5304
5305 sv_force_normal
5306 Undo various types of fakery on an SV: if the PV is a shared
5307 string, make a private copy; if we're a ref, stop refing; if
5308 we're a glob, downgrade to an "xpvmg". See also
5309 "sv_force_normal_flags".
5310
5311 void sv_force_normal(SV *sv)
5312
5313 sv_iv A private implementation of the "SvIVx" macro for compilers
5314 which can't cope with complex macro expressions. Always use
5315 the macro instead.
5316
5317 IV sv_iv(SV* sv)
5318
5319 sv_nolocking
5320 Dummy routine which "locks" an SV when there is no locking
5321 module present. Exists to avoid test for a "NULL" function
5322 pointer and because it could potentially warn under some level
5323 of strict-ness.
5324
5325 "Superseded" by "sv_nosharing()".
5326
5327 void sv_nolocking(SV *sv)
5328
5329 sv_nounlocking
5330 Dummy routine which "unlocks" an SV when there is no locking
5331 module present. Exists to avoid test for a "NULL" function
5332 pointer and because it could potentially warn under some level
5333 of strict-ness.
5334
5335 "Superseded" by "sv_nosharing()".
5336
5337 void sv_nounlocking(SV *sv)
5338
5339 sv_nv A private implementation of the "SvNVx" macro for compilers
5340 which can't cope with complex macro expressions. Always use
5341 the macro instead.
5342
5343 NV sv_nv(SV* sv)
5344
5345 sv_pv Use the "SvPV_nolen" macro instead.
5346
5347 char* sv_pv(SV *sv)
5348
5349 sv_pvbyte
5350 Use "SvPVbyte_nolen" instead.
5351
5352 char* sv_pvbyte(SV *sv)
5353
5354 sv_pvbyten
5355 A private implementation of the "SvPVbyte" macro for compilers
5356 which can't cope with complex macro expressions. Always use
5357 the macro instead.
5358
5359 char* sv_pvbyten(SV *sv, STRLEN *lp)
5360
5361 sv_pvn A private implementation of the "SvPV" macro for compilers
5362 which can't cope with complex macro expressions. Always use
5363 the macro instead.
5364
5365 char* sv_pvn(SV *sv, STRLEN *lp)
5366
5367 sv_pvutf8
5368 Use the "SvPVutf8_nolen" macro instead.
5369
5370 char* sv_pvutf8(SV *sv)
5371
5372 sv_pvutf8n
5373 A private implementation of the "SvPVutf8" macro for compilers
5374 which can't cope with complex macro expressions. Always use
5375 the macro instead.
5376
5377 char* sv_pvutf8n(SV *sv, STRLEN *lp)
5378
5379 sv_taint
5380 Taint an SV. Use "SvTAINTED_on" instead.
5381
5382 void sv_taint(SV* sv)
5383
5384 sv_unref
5385 Unsets the RV status of the SV, and decrements the reference
5386 count of whatever was being referenced by the RV. This can
5387 almost be thought of as a reversal of "newSVrv". This is
5388 "sv_unref_flags" with the "flag" being zero. See "SvROK_off".
5389
5390 void sv_unref(SV* sv)
5391
5392 sv_usepvn
5393 Tells an SV to use "ptr" to find its string value. Implemented
5394 by calling "sv_usepvn_flags" with "flags" of 0, hence does not
5395 handle 'set' magic. See "sv_usepvn_flags".
5396
5397 void sv_usepvn(SV* sv, char* ptr, STRLEN len)
5398
5399 sv_usepvn_mg
5400 Like "sv_usepvn", but also handles 'set' magic.
5401
5402 void sv_usepvn_mg(SV *sv, char *ptr, STRLEN len)
5403
5404 sv_uv A private implementation of the "SvUVx" macro for compilers
5405 which can't cope with complex macro expressions. Always use
5406 the macro instead.
5407
5408 UV sv_uv(SV* sv)
5409
5410 unpack_str
5411 The engine implementing the "unpack()" Perl function. Note:
5412 parameters "strbeg", "new_s" and "ocnt" are not used. This
5413 call should not be used; use "unpackstring" instead.
5414
5415 SSize_t unpack_str(const char *pat, const char *patend,
5416 const char *s, const char *strbeg,
5417 const char *strend, char **new_s,
5418 I32 ocnt, U32 flags)
5419
5420 utf8_to_uvuni
5421 DEPRECATED! It is planned to remove this function from a
5422 future release of Perl. Do not use it for new code; remove it
5423 from existing code.
5424
5425 Returns the Unicode code point of the first character in the
5426 string "s" which is assumed to be in UTF-8 encoding; "retlen"
5427 will be set to the length, in bytes, of that character.
5428
5429 Some, but not all, UTF-8 malformations are detected, and in
5430 fact, some malformed input could cause reading beyond the end
5431 of the input buffer, which is one reason why this function is
5432 deprecated. The other is that only in extremely limited
5433 circumstances should the Unicode versus native code point be of
5434 any interest to you. See "utf8_to_uvuni_buf" for alternatives.
5435
5436 If "s" points to one of the detected malformations, and UTF8
5437 warnings are enabled, zero is returned and *retlen is set (if
5438 "retlen" doesn't point to NULL) to -1. If those warnings are
5439 off, the computed value, if well-defined (or the Unicode
5440 REPLACEMENT CHARACTER, if not) is silently returned, and
5441 *retlen is set (if "retlen" isn't NULL) so that ("s" + *retlen)
5442 is the next possible position in "s" that could begin a non-
5443 malformed character. See "utf8n_to_uvchr" for details on when
5444 the REPLACEMENT CHARACTER is returned.
5445
5446 UV utf8_to_uvuni(const U8 *s, STRLEN *retlen)
5447
5449 newASSIGNOP
5450 Constructs, checks, and returns an assignment op. "left" and
5451 "right" supply the parameters of the assignment; they are
5452 consumed by this function and become part of the constructed op
5453 tree.
5454
5455 If "optype" is "OP_ANDASSIGN", "OP_ORASSIGN", or
5456 "OP_DORASSIGN", then a suitable conditional optree is
5457 constructed. If "optype" is the opcode of a binary operator,
5458 such as "OP_BIT_OR", then an op is constructed that performs
5459 the binary operation and assigns the result to the left
5460 argument. Either way, if "optype" is non-zero then "flags" has
5461 no effect.
5462
5463 If "optype" is zero, then a plain scalar or list assignment is
5464 constructed. Which type of assignment it is is automatically
5465 determined. "flags" gives the eight bits of "op_flags", except
5466 that "OPf_KIDS" will be set automatically, and, shifted up
5467 eight bits, the eight bits of "op_private", except that the bit
5468 with value 1 or 2 is automatically set as required.
5469
5470 OP * newASSIGNOP(I32 flags, OP *left, I32 optype,
5471 OP *right)
5472
5473 newBINOP
5474 Constructs, checks, and returns an op of any binary type.
5475 "type" is the opcode. "flags" gives the eight bits of
5476 "op_flags", except that "OPf_KIDS" will be set automatically,
5477 and, shifted up eight bits, the eight bits of "op_private",
5478 except that the bit with value 1 or 2 is automatically set as
5479 required. "first" and "last" supply up to two ops to be the
5480 direct children of the binary op; they are consumed by this
5481 function and become part of the constructed op tree.
5482
5483 OP * newBINOP(I32 type, I32 flags, OP *first,
5484 OP *last)
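The "flags" layout used by this and several neighbouring constructors packs two eight-bit fields into one integer: the low eight bits carry "op_flags" and the next eight bits carry "op_private". A minimal sketch of that packing follows. The helper names are hypothetical and the bit values in any call are arbitrary examples, not real "OPf_" or "OPp" constants.

```c
#include <assert.h>

/* Hypothetical helpers showing the layout of the combined flags word. */
static int pack_op_flags(unsigned op_flags, unsigned op_private)
{
    assert(op_flags <= 0xFFu && op_private <= 0xFFu);
    return (int)(op_flags | (op_private << 8));
}

static unsigned unpack_op_flags(int flags)
{
    return (unsigned)flags & 0xFFu;          /* low eight bits */
}

static unsigned unpack_op_private(int flags)
{
    return ((unsigned)flags >> 8) & 0xFFu;   /* next eight bits */
}
```

A caller that wants only "op_flags" bits passes them unshifted; private bits are shifted left by eight before being ORed in.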
5485
5486 newCONDOP
5487 Constructs, checks, and returns a conditional-expression
5488 ("cond_expr") op. "flags" gives the eight bits of "op_flags",
5489 except that "OPf_KIDS" will be set automatically, and, shifted
5490 up eight bits, the eight bits of "op_private", except that the
5491 bit with value 1 is automatically set. "first" supplies the
5492 expression selecting between the two branches, and "trueop" and
5493 "falseop" supply the branches; they are consumed by this
5494 function and become part of the constructed op tree.
5495
5496 OP * newCONDOP(I32 flags, OP *first, OP *trueop,
5497 OP *falseop)
5498
5499 newDEFSVOP
5500 Constructs and returns an op to access $_.
5501
5502 OP * newDEFSVOP()
5503
5504 newFOROP
5505 Constructs, checks, and returns an op tree expressing a
5506 "foreach" loop (iteration through a list of values). This is a
5507 heavyweight loop, with structure that allows exiting the loop
5508 by "last" and suchlike.
5509
5510 "sv" optionally supplies the variable that will be aliased to
5511 each item in turn; if null, it defaults to $_. "expr" supplies
5512 the list of values to iterate over. "block" supplies the main
5513 body of the loop, and "cont" optionally supplies a "continue"
5514 block that operates as a second half of the body. All of these
5515 optree inputs are consumed by this function and become part of
5516 the constructed op tree.
5517
5518 "flags" gives the eight bits of "op_flags" for the "leaveloop"
5519 op and, shifted up eight bits, the eight bits of "op_private"
5520 for the "leaveloop" op, except that (in both cases) some bits
5521 will be set automatically.
5522
5523 OP * newFOROP(I32 flags, OP *sv, OP *expr, OP *block,
5524 OP *cont)
5525
5526 newGIVENOP
5527 Constructs, checks, and returns an op tree expressing a "given"
5528 block. "cond" supplies the expression to whose value $_ will
5529 be locally aliased, and "block" supplies the body of the
5530 "given" construct; they are consumed by this function and
5531 become part of the constructed op tree. "defsv_off" must be
5532 zero (it used to identify the pad slot of lexical $_).
5533
5534 OP * newGIVENOP(OP *cond, OP *block,
5535 PADOFFSET defsv_off)
5536
5537 newGVOP Constructs, checks, and returns an op of any type that involves
5538 an embedded reference to a GV. "type" is the opcode. "flags"
5539 gives the eight bits of "op_flags". "gv" identifies the GV
5540 that the op should reference; calling this function does not
5541 transfer ownership of any reference to it.
5542
5543 OP * newGVOP(I32 type, I32 flags, GV *gv)
5544
5545 newLISTOP
5546 Constructs, checks, and returns an op of any list type. "type"
5547 is the opcode. "flags" gives the eight bits of "op_flags",
5548 except that "OPf_KIDS" will be set automatically if required.
5549 "first" and "last" supply up to two ops to be direct children
5550 of the list op; they are consumed by this function and become
5551 part of the constructed op tree.
5552
5553 For most list operators, the check function expects all the kid
5554 ops to be present already, so calling "newLISTOP(OP_JOIN, ...)"
5555 (e.g.) is not appropriate. What you want to do in that case is
5556 create an op of type "OP_LIST", append more children to it, and
5557 then call "op_convert_list". See "op_convert_list" for more
5558 information.
5559
5560 OP * newLISTOP(I32 type, I32 flags, OP *first,
5561 OP *last)
5562
5563 newLOGOP
5564 Constructs, checks, and returns a logical (flow control) op.
5565 "type" is the opcode. "flags" gives the eight bits of
5566 "op_flags", except that "OPf_KIDS" will be set automatically,
5567 and, shifted up eight bits, the eight bits of "op_private",
5568 except that the bit with value 1 is automatically set. "first"
5569 supplies the expression controlling the flow, and "other"
5570 supplies the side (alternate) chain of ops; they are consumed
5571 by this function and become part of the constructed op tree.
5572
5573 OP * newLOGOP(I32 type, I32 flags, OP *first,
5574 OP *other)
5575
5576 newLOOPEX
5577 Constructs, checks, and returns a loop-exiting op (such as
5578 "goto" or "last"). "type" is the opcode. "label" supplies the
5579 parameter determining the target of the op; it is consumed by
5580 this function and becomes part of the constructed op tree.
5581
5582 OP * newLOOPEX(I32 type, OP *label)
5583
5584 newLOOPOP
5585 Constructs, checks, and returns an op tree expressing a loop.
5586 This is only a loop in the control flow through the op tree; it
5587 does not have the heavyweight loop structure that allows
5588 exiting the loop by "last" and suchlike. "flags" gives the
5589 eight bits of "op_flags" for the top-level op, except that some
5590 bits will be set automatically as required. "expr" supplies
5591 the expression controlling loop iteration, and "block" supplies
5592 the body of the loop; they are consumed by this function and
5593 become part of the constructed op tree. "debuggable" is
5594 currently unused and should always be 1.
5595
5596 OP * newLOOPOP(I32 flags, I32 debuggable, OP *expr,
5597 OP *block)
5598
5599 newMETHOP
5600 Constructs, checks, and returns an op of method type with a
5601 method name evaluated at runtime. "type" is the opcode.
5602 "flags" gives the eight bits of "op_flags", except that
5603 "OPf_KIDS" will be set automatically, and, shifted up eight
5604 bits, the eight bits of "op_private", except that the bit with
5605 value 1 is automatically set. "dynamic_meth" supplies an op
5606 which evaluates the method name; it is consumed by this function
5607 and becomes part of the constructed op tree. Supported optypes:
5608 "OP_METHOD".
5609
5610 OP * newMETHOP(I32 type, I32 flags, OP *first)
5611
5612 newMETHOP_named
5613 Constructs, checks, and returns an op of method type with a
5614 constant method name. "type" is the opcode. "flags" gives the
5615 eight bits of "op_flags", and, shifted up eight bits, the eight
5616 bits of "op_private". "const_meth" supplies a constant method
5617 name; it must be a shared COW string. Supported optypes:
5618 "OP_METHOD_NAMED".
5619
5620 OP * newMETHOP_named(I32 type, I32 flags,
5621 SV *const_meth)
5622
5623 newNULLLIST
5624 Constructs, checks, and returns a new "stub" op, which
5625 represents an empty list expression.
5626
5627 OP * newNULLLIST()
5628
5629 newOP Constructs, checks, and returns an op of any base type (any
5630 type that has no extra fields). "type" is the opcode. "flags"
5631 gives the eight bits of "op_flags", and, shifted up eight bits,
5632 the eight bits of "op_private".
5633
5634 OP * newOP(I32 type, I32 flags)
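
The "flags" convention used by "newOP" and the other constructors in this section can be illustrated with a minimal, self-contained sketch. The helper names below are hypothetical and are not part of the Perl API; they only model how the low eight bits carry "op_flags" and the next eight bits carry "op_private".

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (NOT part of the Perl API): combine the two
 * 8-bit fields into the single "flags" argument described above. */
static int32_t toy_pack_flags(uint8_t op_flags, uint8_t op_private)
{
    int32_t packed = (int32_t)op_flags | ((int32_t)op_private << 8);
    assert(packed >= 0 && packed <= 0xffff);  /* only sixteen bits are used */
    return packed;
}

/* Hypothetical inverse helpers: the low eight bits are "op_flags",
 * the next eight bits are "op_private". */
static uint8_t toy_flags_part(int32_t flags)
{
    return (uint8_t)(flags & 0xff);
}

static uint8_t toy_private_part(int32_t flags)
{
    return (uint8_t)((flags >> 8) & 0xff);
}
```

In real extension code the packed value would be passed directly as the "flags" argument; bits that a constructor sets automatically (as noted in the individual entries) do not need to be supplied.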
5635
5636 newPADOP
5637 Constructs, checks, and returns an op of any type that involves
5638 a reference to a pad element. "type" is the opcode. "flags"
5639 gives the eight bits of "op_flags". A pad slot is
5640 automatically allocated, and is populated with "sv"; this
5641 function takes ownership of one reference to it.
5642
5643 This function only exists if Perl has been compiled to use
5644 ithreads.
5645
5646 OP * newPADOP(I32 type, I32 flags, SV *sv)
5647
5648 newPMOP Constructs, checks, and returns an op of any pattern matching
5649 type. "type" is the opcode. "flags" gives the eight bits of
5650 "op_flags" and, shifted up eight bits, the eight bits of
5651 "op_private".
5652
5653 OP * newPMOP(I32 type, I32 flags)
5654
5655 newPVOP Constructs, checks, and returns an op of any type that involves
5656 an embedded C-level pointer (PV). "type" is the opcode.
5657 "flags" gives the eight bits of "op_flags". "pv" supplies the
5658 C-level pointer. Depending on the op type, the memory
5659 referenced by "pv" may be freed when the op is destroyed. If
5660 the op is of a freeing type, "pv" must have been allocated
5661 using "PerlMemShared_malloc".
5662
5663 OP * newPVOP(I32 type, I32 flags, char *pv)
5664
5665 newRANGE
5666 Constructs and returns a "range" op, with subordinate "flip"
5667 and "flop" ops. "flags" gives the eight bits of "op_flags" for
5668 the "flip" op and, shifted up eight bits, the eight bits of
5669 "op_private" for both the "flip" and "range" ops, except that
5670 the bit with value 1 is automatically set. "left" and "right"
5671 supply the expressions controlling the endpoints of the range;
5672 they are consumed by this function and become part of the
5673 constructed op tree.
5674
5675 OP * newRANGE(I32 flags, OP *left, OP *right)
5676
5677 newSLICEOP
5678 Constructs, checks, and returns an "lslice" (list slice) op.
5679 "flags" gives the eight bits of "op_flags", except that
5680 "OPf_KIDS" will be set automatically, and, shifted up eight
5681 bits, the eight bits of "op_private", except that the bit with
5682 value 1 or 2 is automatically set as required. "listval" and
5683 "subscript" supply the parameters of the slice; they are
5684 consumed by this function and become part of the constructed op
5685 tree.
5686
5687 OP * newSLICEOP(I32 flags, OP *subscript,
5688 OP *listval)
5689
5690 newSTATEOP
5691 Constructs a state op (COP). The state op is normally a
5692 "nextstate" op, but will be a "dbstate" op if debugging is
5693 enabled for currently-compiled code. The state op is populated
5694 from "PL_curcop" (or "PL_compiling"). If "label" is non-null,
5695 it supplies the name of a label to attach to the state op; this
5696 function takes ownership of the memory pointed at by "label",
5697 and will free it. "flags" gives the eight bits of "op_flags"
5698 for the state op.
5699
5700 If "o" is null, the state op is returned. Otherwise the state
5701 op is combined with "o" into a "lineseq" list op, which is
5702 returned. "o" is consumed by this function and becomes part of
5703 the returned op tree.
5704
5705 OP * newSTATEOP(I32 flags, char *label, OP *o)
5706
5707 newSVOP Constructs, checks, and returns an op of any type that involves
5708 an embedded SV. "type" is the opcode. "flags" gives the eight
5709 bits of "op_flags". "sv" gives the SV to embed in the op; this
5710 function takes ownership of one reference to it.
5711
5712 OP * newSVOP(I32 type, I32 flags, SV *sv)
5713
5714 newUNOP Constructs, checks, and returns an op of any unary type.
5715 "type" is the opcode. "flags" gives the eight bits of
5716 "op_flags", except that "OPf_KIDS" will be set automatically if
5717 required, and, shifted up eight bits, the eight bits of
5718 "op_private", except that the bit with value 1 is automatically
5719 set. "first" supplies an optional op to be the direct child of
5720 the unary op; it is consumed by this function and becomes part
5721 of the constructed op tree.
5722
5723 OP * newUNOP(I32 type, I32 flags, OP *first)
5724
5725 newUNOP_AUX
5726 Similar to "newUNOP", but creates an "UNOP_AUX" struct instead,
5727 with "op_aux" initialised to "aux".
5728
5729 OP* newUNOP_AUX(I32 type, I32 flags, OP* first,
5730 UNOP_AUX_item *aux)
5731
5732 newWHENOP
5733 Constructs, checks, and returns an op tree expressing a "when"
5734 block. "cond" supplies the test expression, and "block"
5735 supplies the block that will be executed if the test evaluates
5736 to true; they are consumed by this function and become part of
5737 the constructed op tree. "cond" will be interpreted
5738 DWIMically, often as a comparison against $_, and may be null
5739 to generate a "default" block.
5740
5741 OP * newWHENOP(OP *cond, OP *block)
5742
5743 newWHILEOP
5744 Constructs, checks, and returns an op tree expressing a "while"
5745 loop. This is a heavyweight loop, with structure that allows
5746 exiting the loop by "last" and suchlike.
5747
5748 "loop" is an optional preconstructed "enterloop" op to use in
5749 the loop; if it is null then a suitable op will be constructed
5750 automatically. "expr" supplies the loop's controlling
5751 expression. "block" supplies the main body of the loop, and
5752 "cont" optionally supplies a "continue" block that operates as
5753 a second half of the body. All of these optree inputs are
5754 consumed by this function and become part of the constructed op
5755 tree.
5756
5757 "flags" gives the eight bits of "op_flags" for the "leaveloop"
5758 op and, shifted up eight bits, the eight bits of "op_private"
5759 for the "leaveloop" op, except that (in both cases) some bits
5760 will be set automatically. "debuggable" is currently unused
5761 and should always be 1. "has_my" can be supplied as true to
5762 force the loop body to be enclosed in its own scope.
5763
5764 OP * newWHILEOP(I32 flags, I32 debuggable,
5765 LOOP *loop, OP *expr, OP *block,
5766 OP *cont, I32 has_my)
5767
5769 alloccopstash
5770 NOTE: this function is experimental and may change or be
5771 removed without notice.
5772
5773 Available only under threaded builds, this function allocates
5774 an entry in "PL_stashpad" for the stash passed to it.
5775
5776 PADOFFSET alloccopstash(HV *hv)
5777
5778 block_end
5779 Handles compile-time scope exit. "floor" is the savestack
5780 index returned by "block_start", and "seq" is the body of the
5781 block. Returns the block, possibly modified.
5782
5783 OP * block_end(I32 floor, OP *seq)
5784
5785 block_start
5786 Handles compile-time scope entry. Arranges for hints to be
5787 restored on block exit and also handles pad sequence numbers to
5788 make lexical variables scope right. Returns a savestack index
5789 for use with "block_end".
5790
5791 int block_start(int full)
5792
5793 ck_entersub_args_list
5794 Performs the default fixup of the arguments part of an
5795 "entersub" op tree. This consists of applying list context to
5796 each of the argument ops. This is the standard treatment used
5797 on a call marked with "&", or a method call, or a call through
5798 a subroutine reference, or any other call where the callee
5799 can't be identified at compile time, or a call where the callee
5800 has no prototype.
5801
5802 OP * ck_entersub_args_list(OP *entersubop)
5803
5804 ck_entersub_args_proto
5805 Performs the fixup of the arguments part of an "entersub" op
5806 tree based on a subroutine prototype. This makes various
5807 modifications to the argument ops, from applying context up to
5808 inserting "refgen" ops, and checking the number and syntactic
5809 types of arguments, as directed by the prototype. This is the
5810 standard treatment used on a subroutine call, not marked with
5811 "&", where the callee can be identified at compile time and has
5812 a prototype.
5813
5814 "protosv" supplies the subroutine prototype to be applied to
5815 the call. It may be a normal defined scalar, of which the
5816 string value will be used. Alternatively, for convenience, it
5817 may be a subroutine object (a "CV*" that has been cast to
5818 "SV*") which has a prototype. The prototype supplied, in
5819 whichever form, does not need to match the actual callee
5820 referenced by the op tree.
5821
5822 If the argument ops disagree with the prototype, for example by
5823 having an unacceptable number of arguments, a valid op tree is
5824 returned anyway. The error is reflected in the parser state,
5825 normally resulting in a single exception at the top level of
5826 parsing which covers all the compilation errors that occurred.
5827 In the error message, the callee is referred to by the name
5828 defined by the "namegv" parameter.
5829
5830 OP * ck_entersub_args_proto(OP *entersubop,
5831 GV *namegv, SV *protosv)
5832
5833 ck_entersub_args_proto_or_list
5834 Performs the fixup of the arguments part of an "entersub" op
5835 tree either based on a subroutine prototype or using default
5836 list-context processing. This is the standard treatment used
5837 on a subroutine call, not marked with "&", where the callee can
5838 be identified at compile time.
5839
5840 "protosv" supplies the subroutine prototype to be applied to
5841 the call, or indicates that there is no prototype. It may be a
5842 normal scalar, in which case if it is defined then the string
5843 value will be used as a prototype, and if it is undefined then
5844 there is no prototype. Alternatively, for convenience, it may
5845 be a subroutine object (a "CV*" that has been cast to "SV*"),
5846 of which the prototype will be used if it has one. The
5847 prototype (or lack thereof) supplied, in whichever form, does
5848 not need to match the actual callee referenced by the op tree.
5849
5850 If the argument ops disagree with the prototype, for example by
5851 having an unacceptable number of arguments, a valid op tree is
5852 returned anyway. The error is reflected in the parser state,
5853 normally resulting in a single exception at the top level of
5854 parsing which covers all the compilation errors that occurred.
5855 In the error message, the callee is referred to by the name
5856 defined by the "namegv" parameter.
5857
5858 OP * ck_entersub_args_proto_or_list(OP *entersubop,
5859 GV *namegv,
5860 SV *protosv)
5861
5862 cv_const_sv
5863 If "cv" is a constant sub eligible for inlining, returns the
5864 constant value returned by the sub. Otherwise, returns "NULL".
5865
5866 Constant subs can be created with "newCONSTSUB" or as described
5867 in "Constant Functions" in perlsub.
5868
5869 SV* cv_const_sv(const CV *const cv)
5870
5871 cv_get_call_checker
5872 The original form of "cv_get_call_checker_flags", which does
5873 not return checker flags. When using a checker function
5874 returned by this function, it is only safe to call it with a
5875 genuine GV as its "namegv" argument.
5876
5877 void cv_get_call_checker(CV *cv,
5878 Perl_call_checker *ckfun_p,
5879 SV **ckobj_p)
5880
5881 cv_get_call_checker_flags
5882 Retrieves the function that will be used to fix up a call to
5883 "cv". Specifically, the function is applied to an "entersub"
5884 op tree for a subroutine call, not marked with "&", where the
5885 callee can be identified at compile time as "cv".
5886
5887 The C-level function pointer is returned in *ckfun_p, an SV
5888 argument for it is returned in *ckobj_p, and control flags are
5889 returned in *ckflags_p. The function is intended to be called
5890 in this manner:
5891
5892 entersubop = (*ckfun_p)(aTHX_ entersubop, namegv, (*ckobj_p));
5893
5894 In this call, "entersubop" is a pointer to the "entersub" op,
5895 which may be replaced by the check function, and "namegv"
5896 supplies the name that should be used by the check function to
5897 refer to the callee of the "entersub" op if it needs to emit
5898 any diagnostics. It is permitted to apply the check function
5899 in non-standard situations, such as to a call to a different
5900 subroutine or to a method call.
5901
5902 "namegv" may not actually be a GV. If the
5903 "CALL_CHECKER_REQUIRE_GV" bit is clear in *ckflags_p, it is
5904 permitted to pass a CV or other SV instead, anything that can
5905 be used as the first argument to "cv_name". If the
5906 "CALL_CHECKER_REQUIRE_GV" bit is set in *ckflags_p then the
5907 check function requires "namegv" to be a genuine GV.
5908
5909 By default, the check function is
5910 Perl_ck_entersub_args_proto_or_list, the SV parameter is "cv"
5911 itself, and the "CALL_CHECKER_REQUIRE_GV" flag is clear. This
5912 implements standard prototype processing. It can be changed,
5913 for a particular subroutine, by "cv_set_call_checker_flags".
5914
5915 If the "CALL_CHECKER_REQUIRE_GV" bit is set in "gflags" then it
5916 indicates that the caller only knows about the genuine GV
5917 version of "namegv", and accordingly the corresponding bit will
5918 always be set in *ckflags_p, regardless of the check function's
5919 recorded requirements. If the "CALL_CHECKER_REQUIRE_GV" bit is
5920 clear in "gflags" then it indicates the caller knows about the
5921 possibility of passing something other than a GV as "namegv",
5922 and accordingly the corresponding bit may be either set or
5923 clear in *ckflags_p, indicating the check function's recorded
5924 requirements.
5925
5926 "gflags" is a bitset passed into "cv_get_call_checker_flags",
5927 in which only the "CALL_CHECKER_REQUIRE_GV" bit currently has a
5928 defined meaning (for which see above). All other bits should
5929 be clear.
5930
5931 void cv_get_call_checker_flags(
5932 CV *cv, U32 gflags,
5933 Perl_call_checker *ckfun_p, SV **ckobj_p,
5934 U32 *ckflags_p
5935 )
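
The interaction between "gflags" and *ckflags_p described above can be summarised by a small model. This is only a sketch of the documented rule, not the implementation: "TOY_REQUIRE_GV" is a placeholder bit (the real value of "CALL_CHECKER_REQUIRE_GV" comes from the Perl headers), and "toy_ckflags" is a hypothetical name.

```c
#include <assert.h>
#include <stdint.h>

/* Placeholder bit for illustration only; NOT the real
 * CALL_CHECKER_REQUIRE_GV constant. */
#define TOY_REQUIRE_GV 0x1u

/* Model of the documented rule: the bit is reported as set if the
 * caller requested it in gflags, or if the check function was
 * recorded as requiring a genuine GV. */
static uint32_t toy_ckflags(uint32_t gflags, uint32_t recorded)
{
    /* All other bits of gflags should be clear. */
    assert((gflags & ~TOY_REQUIRE_GV) == 0);
    return (gflags | recorded) & TOY_REQUIRE_GV;
}
```

A caller that passes the bit in "gflags" therefore always sees it set and must supply a genuine GV; a caller that leaves it clear learns the check function's own requirement.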
5936
5937 cv_set_call_checker
5938 The original form of "cv_set_call_checker_flags", which passes
5939 it the "CALL_CHECKER_REQUIRE_GV" flag for backward-
5940 compatibility. The effect of that flag setting is that the
5941 check function is guaranteed to get a genuine GV as its
5942 "namegv" argument.
5943
5944 void cv_set_call_checker(CV *cv,
5945 Perl_call_checker ckfun,
5946 SV *ckobj)
5947
5948 cv_set_call_checker_flags
5949 Sets the function that will be used to fix up a call to "cv".
5950 Specifically, the function is applied to an "entersub" op tree
5951 for a subroutine call, not marked with "&", where the callee
5952 can be identified at compile time as "cv".
5953
5954 The C-level function pointer is supplied in "ckfun", an SV
5955 argument for it is supplied in "ckobj", and control flags are
5956 supplied in "ckflags". The function should be defined like
5957 this:
5958
5959 STATIC OP * ckfun(pTHX_ OP *op, GV *namegv, SV *ckobj)
5960
5961 It is intended to be called in this manner:
5962
5963 entersubop = ckfun(aTHX_ entersubop, namegv, ckobj);
5964
5965 In this call, "entersubop" is a pointer to the "entersub" op,
5966 which may be replaced by the check function, and "namegv"
5967 supplies the name that should be used by the check function to
5968 refer to the callee of the "entersub" op if it needs to emit
5969 any diagnostics. It is permitted to apply the check function
5970 in non-standard situations, such as to a call to a different
5971 subroutine or to a method call.
5972
5973 "namegv" may not actually be a GV. For efficiency, perl may
5974 pass a CV or other SV instead. Whatever is passed can be used
5975 as the first argument to "cv_name". You can force perl to pass
5976 a GV by including "CALL_CHECKER_REQUIRE_GV" in the "ckflags".
5977
5978 "ckflags" is a bitset, in which only the
5979 "CALL_CHECKER_REQUIRE_GV" bit currently has a defined meaning
5980 (for which see above). All other bits should be clear.
5981
5982 The current setting for a particular CV can be retrieved by
5983 "cv_get_call_checker_flags".
5984
5985 void cv_set_call_checker_flags(
5986 CV *cv, Perl_call_checker ckfun, SV *ckobj,
5987 U32 ckflags
5988 )
5989
5990 LINKLIST
5991 Given the root of an optree, link the tree in execution order
5992 using the "op_next" pointers and return the first op executed.
5993 If this has already been done, it will not be redone, and
5994 "o->op_next" will be returned. If "o->op_next" is not already
5995 set, "o" should be at least an "UNOP".
5996
5997 OP* LINKLIST(OP *o)
5998
5999 newCONSTSUB
6000 Behaves like "newCONSTSUB_flags", except that "name" is nul-
6001 terminated rather than of counted length, and no flags are set.
6002 (This means that "name" is always interpreted as Latin-1.)
6003
6004 CV * newCONSTSUB(HV *stash, const char *name, SV *sv)
6005
6006 newCONSTSUB_flags
6007 Construct a constant subroutine, also performing some
6008 surrounding jobs. A scalar constant-valued subroutine is
6009 eligible for inlining at compile-time, and in Perl code can be
6010 created by "sub FOO () { 123 }". Other kinds of constant
6011 subroutine have other treatment.
6012
6013 The subroutine will have an empty prototype and will ignore any
6014 arguments when called. Its constant behaviour is determined by
6015 "sv". If "sv" is null, the subroutine will yield an empty
6016 list. If "sv" points to a scalar, the subroutine will always
6017 yield that scalar. If "sv" points to an array, the subroutine
6018 will always yield a list of the elements of that array in list
6019 context, or the number of elements in the array in scalar
6020 context. This function takes ownership of one counted
6021 reference to the scalar or array, and will arrange for the
6022 object to live as long as the subroutine does. If "sv" points
6023 to a scalar then the inlining assumes that the value of the
6024 scalar will never change, so the caller must ensure that the
6025 scalar is not subsequently written to. If "sv" points to an
6026 array then no such assumption is made, so it is ostensibly safe
6027 to mutate the array or its elements, but whether this is really
6028 supported has not been determined.
6029
6030 The subroutine will have "CvFILE" set according to "PL_curcop".
6031 Other aspects of the subroutine will be left in their default
6032 state. The caller is free to mutate the subroutine beyond its
6033 initial state after this function has returned.
6034
6035 If "name" is null then the subroutine will be anonymous, with
6036 its "CvGV" referring to an "__ANON__" glob. If "name" is non-
6037 null then the subroutine will be named accordingly, referenced
6038 by the appropriate glob. "name" is a string of length "len"
6039 bytes giving a sigilless symbol name, in UTF-8 if "flags" has
6040 the "SVf_UTF8" bit set and in Latin-1 otherwise. The name may
6041 be either qualified or unqualified. If the name is unqualified
6042 then it defaults to being in the stash specified by "stash" if
6043 that is non-null, or to "PL_curstash" if "stash" is null. The
6044 symbol is always added to the stash if necessary, with
6045 "GV_ADDMULTI" semantics.
6046
6047 "flags" should not have bits set other than "SVf_UTF8".
6048
6049 If there is already a subroutine of the specified name, then
6050 the new sub will replace the existing one in the glob. A
6051 warning may be generated about the redefinition.
6052
6053 If the subroutine has one of a few special names, such as
6054 "BEGIN" or "END", then it will be claimed by the appropriate
6055 queue for automatic running of phase-related subroutines. In
6056 this case the relevant glob will be left not containing any
6057 subroutine, even if it did contain one before. Execution of
6058 the subroutine will likely be a no-op, unless "sv" was a tied
6059 array or the caller modified the subroutine in some interesting
6060 way before it was executed. In the case of "BEGIN", the
6061 treatment is buggy: the sub will be executed when only half
6062 built, and may be deleted prematurely, possibly causing a
6063 crash.
6064
6065 The function returns a pointer to the constructed subroutine.
6066 If the sub is anonymous then ownership of one counted reference
6067 to the subroutine is transferred to the caller. If the sub is
6068 named then the caller does not get ownership of a reference.
6069 In most such cases, where the sub has a non-phase name, the sub
6070 will be alive at the point it is returned by virtue of being
6071 contained in the glob that names it. A phase-named subroutine
6072 will usually be alive by virtue of the reference owned by the
6073 phase's automatic run queue. A "BEGIN" subroutine may have
6074 been destroyed already by the time this function returns, but
6075 currently bugs occur in that case before the caller gets
6076 control. It is the caller's responsibility to ensure that it
6077 knows which of these situations applies.
6078
6079 CV * newCONSTSUB_flags(HV *stash, const char *name,
6080 STRLEN len, U32 flags, SV *sv)
6081
6082 newXS Used by "xsubpp" to hook up XSUBs as Perl subs. "filename"
6083 needs to be static storage, as it is used directly as CvFILE(),
6084 without a copy being made.
6085
6086 op_append_elem
6087 Append an item to the list of ops contained directly within a
6088 list-type op, returning the lengthened list. "first" is the
6089 list-type op, and "last" is the op to append to the list.
6090 "optype" specifies the intended opcode for the list. If
6091 "first" is not already a list of the right type, it will be
6092 upgraded into one. If either "first" or "last" is null, the
6093 other is returned unchanged.
6094
6095 OP * op_append_elem(I32 optype, OP *first, OP *last)
6096
6097 op_append_list
6098 Concatenate the lists of ops contained directly within two
6099 list-type ops, returning the combined list. "first" and "last"
6100 are the list-type ops to concatenate. "optype" specifies the
6101 intended opcode for the list. If either "first" or "last" is
6102 not already a list of the right type, it will be upgraded into
6103 one. If either "first" or "last" is null, the other is
6104 returned unchanged.
6105
6106 OP * op_append_list(I32 optype, OP *first, OP *last)
6107
6108 OP_CLASS
6109 Return the class of the provided OP: that is, which of the *OP
6110 structures it uses. For core ops this currently gets the
6111 information out of "PL_opargs", which does not always
6112 accurately reflect the type used; in v5.26 onwards, see also
6113 the function "op_class" which can do a better job of
6114 determining the used type.
6115
6116 For custom ops the type is returned from the registration, and
6117 it is up to the registree to ensure it is accurate. The value
6118 returned will be one of the "OA_"* constants from op.h.
6119
6120 U32 OP_CLASS(OP *o)
6121
6122 op_contextualize
6123 Applies a syntactic context to an op tree representing an
6124 expression. "o" is the op tree, and "context" must be
6125 "G_SCALAR", "G_ARRAY", or "G_VOID" to specify the context to
6126 apply. The modified op tree is returned.
6127
6128 OP * op_contextualize(OP *o, I32 context)
6129
6130 op_convert_list
6131 Converts "o" into a list op if it is not one already, and then
6132 converts it into the specified "type", calling its check
6133 function, allocating a target if it needs one, and folding
6134 constants.
6135
6136 A list-type op is usually constructed one kid at a time via
6137 "newLISTOP", "op_prepend_elem" and "op_append_elem". Then
6138 finally it is passed to "op_convert_list" to make it the right
6139 type.
6140
6141 OP * op_convert_list(I32 type, I32 flags, OP *o)
6142
6143 OP_DESC Return a short description of the provided OP.
6144
6145 const char * OP_DESC(OP *o)
6146
6147 op_free Free an op. Only use this when an op is no longer linked to
6148 from any optree.
6149
6150 void op_free(OP *o)
6151
6152 OpHAS_SIBLING
6153 Returns true if "o" has a sibling.
6154
6155 bool OpHAS_SIBLING(OP *o)
6156
6157 OpLASTSIB_set
6158 Marks "o" as having no further siblings and marks o as having
6159 the specified parent. See also "OpMORESIB_set" and
6160 "OpMAYBESIB_set". For a higher-level interface, see
6161 "op_sibling_splice".
6162
6163 void OpLASTSIB_set(OP *o, OP *parent)
6164
6165 op_linklist
6166 This function is the implementation of the "LINKLIST" macro.
6167 It should not be called directly.
6168
6169 OP* op_linklist(OP *o)
6170
6171 op_lvalue
6172 NOTE: this function is experimental and may change or be
6173 removed without notice.
6174
6175 Propagate lvalue ("modifiable") context to an op and its
6176 children. "type" represents the context type, roughly based on
6177 the type of op that would do the modifying, although "local()"
6178 is represented by "OP_NULL", because it has no op type of its
6179 own (it is signalled by a flag on the lvalue op).
6180
6181 This function detects things that can't be modified, such as
6182 "$x+1", and generates errors for them. For example, "$x+1 = 2"
6183 would cause it to be called with an op of type "OP_ADD" and a
6184 "type" argument of "OP_SASSIGN".
6185
6186 It also flags things that need to behave specially in an lvalue
6187 context, such as "$$x = 5" which might have to vivify a
6188 reference in $x.
6189
6190 OP * op_lvalue(OP *o, I32 type)
6191
6192 OpMAYBESIB_set
6193 Conditionally does "OpMORESIB_set" or "OpLASTSIB_set" depending
6194 on whether "sib" is non-null. For a higher-level interface, see
6195 "op_sibling_splice".
6196
6197 void OpMAYBESIB_set(OP *o, OP *sib, OP *parent)
6198
6199 OpMORESIB_set
6200 Sets the sibling of "o" to the non-zero value "sib". See also
6201 "OpLASTSIB_set" and "OpMAYBESIB_set". For a higher-level
6202 interface, see "op_sibling_splice".
6203
6204 void OpMORESIB_set(OP *o, OP *sib)
6205
6206 OP_NAME Return the name of the provided OP. For core ops this looks up
6207 the name from the op_type; for custom ops from the op_ppaddr.
6208
6209 const char * OP_NAME(OP *o)
6210
6211 op_null Neutralizes an op when it is no longer needed, but is still
6212 linked to from other ops.
6213
6214 void op_null(OP *o)
6215
6216 op_parent
6217 Returns the parent OP of "o", if it has a parent. Returns
6218 "NULL" otherwise.
6219
6220 OP* op_parent(OP *o)
6221
6222 op_prepend_elem
6223 Prepend an item to the list of ops contained directly within a
6224 list-type op, returning the lengthened list. "first" is the op
6225 to prepend to the list, and "last" is the list-type op.
6226 "optype" specifies the intended opcode for the list. If "last"
6227 is not already a list of the right type, it will be upgraded
6228 into one. If either "first" or "last" is null, the other is
6229 returned unchanged.
6230
6231 OP * op_prepend_elem(I32 optype, OP *first, OP *last)
6232
6233 op_scope
6234 NOTE: this function is experimental and may change or be
6235 removed without notice.
6236
6237 Wraps up an op tree with some additional ops so that at runtime
6238 a dynamic scope will be created. The original ops run in the
6239 new dynamic scope, and then, provided that they exit normally,
6240 the scope will be unwound. The additional ops used to create
6241 and unwind the dynamic scope will normally be an
6242 "enter"/"leave" pair, but a "scope" op may be used instead if
6243 the ops are simple enough to not need the full dynamic scope
6244 structure.
6245
6246 OP * op_scope(OP *o)
6247
6248 OpSIBLING
6249 Returns the sibling of "o", or "NULL" if there is no sibling.
6250
6251 OP* OpSIBLING(OP *o)
6252
6253 op_sibling_splice
6254 A general function for editing the structure of an existing
6255 chain of op_sibling nodes. By analogy with the perl-level
6256 "splice()" function, allows you to delete zero or more
6257 sequential nodes, replacing them with zero or more different
6258 nodes. Performs the necessary op_first/op_last housekeeping on
6259 the parent node and op_sibling manipulation on the children.
6260 The last deleted node will be marked as the last node by
6261 updating the op_sibling/op_sibparent or op_moresib field as
6262 appropriate.
6263
6264 Note that op_next is not manipulated, and nodes are not freed;
6265 that is the responsibility of the caller. It also won't create
6266 a new list op for an empty list etc; use higher-level functions
6267 like op_append_elem() for that.
6268
6269 "parent" is the parent node of the sibling chain. It may be passed
6270 as "NULL" if the splicing doesn't affect the first or last op
6271 in the chain.
6272
6273 "start" is the node preceding the first node to be spliced.
6274 Node(s) following it will be deleted, and ops will be inserted
6275 after it. If it is "NULL", the first node onwards is deleted,
6276 and nodes are inserted at the beginning.
6277
6278 "del_count" is the number of nodes to delete. If zero, no
6279 nodes are deleted. If -1 or greater than or equal to the
6280 number of remaining kids, all remaining kids are deleted.
6281
6282 "insert" is the first of a chain of nodes to be inserted in
6283 place of the nodes. If "NULL", no nodes are inserted.
6284
6285 The head of the chain of deleted ops is returned, or "NULL" if
6286 no ops were deleted.
6287
6288 For example:
6289
6290 action                    before      after         returns
6291 ------                    -----       -----         -------
6292
6293                           P           P
6294 splice(P, A, 2, X-Y-Z)    |           |             B-C
6295                           A-B-C-D     A-X-Y-Z-D
6296
6297                           P           P
6298 splice(P, NULL, 1, X-Y)   |           |             A
6299                           A-B-C-D     X-Y-B-C-D
6300
6301                           P           P
6302 splice(P, NULL, 3, NULL)  |           |             A-B-C
6303                           A-B-C-D     D
6304
6305                           P           P
6306 splice(P, B, 0, X-Y)      |           |             NULL
6307                           A-B-C-D     A-B-X-Y-C-D
6308
6309 For lower-level direct manipulation of "op_sibparent" and
6310 "op_moresib", see "OpMORESIB_set", "OpLASTSIB_set",
6311 "OpMAYBESIB_set".
6312
6313 OP* op_sibling_splice(OP *parent, OP *start,
6314 int del_count, OP* insert)
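
The splice semantics described above can be modelled on a plain singly linked list. The following is only a standalone sketch, not Perl's implementation: the types "Node" and "Parent" and the function "toy_sibling_splice" are hypothetical names invented for illustration, and the real function works on OP structures and their op_sibparent/op_moresib fields rather than a simple "next" pointer.

```c
#include <stddef.h>

/* Toy node: only a name and a next-sibling link (hypothetical type). */
typedef struct Node {
    char name;
    struct Node *next;
} Node;

/* Toy parent: tracks the first and last child, like op_first/op_last. */
typedef struct Parent {
    Node *first;
    Node *last;
} Parent;

/* Splice the sibling chain under `parent`.
 * start:     node preceding the splice point, or NULL for the beginning
 * del_count: nodes to delete; -1 (or too large) deletes all that remain
 * insert:    head of the chain to insert, or NULL
 * Returns the head of the detached chain, or NULL if nothing was deleted. */
static Node *toy_sibling_splice(Parent *parent, Node *start,
                                int del_count, Node *insert)
{
    Node *first_del = start ? start->next : parent->first;
    Node *last_del = NULL;
    Node *rest = first_del;       /* first node kept after the deletion */

    /* Walk past the nodes to delete. */
    while (rest && del_count != 0) {
        last_del = rest;
        rest = rest->next;
        if (del_count > 0)
            del_count--;
    }
    if (last_del)
        last_del->next = NULL;    /* terminate the detached chain */
    else
        first_del = NULL;         /* nothing was deleted */

    /* Find the tail of the inserted chain, if any. */
    Node *last_ins = insert;
    while (last_ins && last_ins->next)
        last_ins = last_ins->next;

    /* Relink: start -> inserted chain -> rest. */
    if (insert)
        last_ins->next = rest;
    if (start)
        start->next = insert ? insert : rest;
    else
        parent->first = insert ? insert : rest;

    /* Parent housekeeping: the last child changes only if the tail did. */
    if (!rest)
        parent->last = insert ? last_ins : start;

    return first_del;
}
```

With a chain A-B-C-D under P, calling toy_sibling_splice(&P, &A, 2, &X) where X heads X-Y-Z leaves A-X-Y-Z-D and returns B, whose chain is now B-C, matching the first row of the table. As in the real function, the detached nodes are not freed.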
6315
6316 OP_TYPE_IS
6317 Returns true if the given OP is not a "NULL" pointer and if it
6318 is of the given type.
6319
6320 The negation of this macro, "OP_TYPE_ISNT", is also available,
6321 as are "OP_TYPE_IS_NN" and "OP_TYPE_ISNT_NN", which elide the
6322 "NULL" pointer check.
6323
6324 bool OP_TYPE_IS(OP *o, Optype type)
6325
6326 OP_TYPE_IS_OR_WAS
6327 Returns true if the given OP is not a NULL pointer and if it is
6328 of the given type or used to be before being replaced by an OP
6329 of type OP_NULL.
6330
6331 The negation of this macro, "OP_TYPE_ISNT_AND_WASNT", is also
6332 available, as are "OP_TYPE_IS_OR_WAS_NN" and
6333 "OP_TYPE_ISNT_AND_WASNT_NN", which elide the "NULL" pointer
6334 check.
6335
6336 bool OP_TYPE_IS_OR_WAS(OP *o, Optype type)
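
The difference between the "is" and "is or was" tests can be illustrated with a small standalone model. This is a sketch under stated assumptions, not the macros' actual definitions: "ToyOp", "TOY_NULL" and the two functions are hypothetical, and the model assumes that a nulled op remembers its former type in a field here called "targ".

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical op types; TOY_NULL plays the role of OP_NULL. */
enum { TOY_NULL = 0, TOY_CONST = 5, TOY_PADSV = 9 };

/* Assumption: a nulled op keeps its former type in `targ`. */
typedef struct ToyOp {
    unsigned type;
    unsigned targ;
} ToyOp;

/* True if `o` is non-NULL and currently of the given type. */
static bool toy_type_is(const ToyOp *o, unsigned type)
{
    return o != NULL && o->type == type;
}

/* True if `o` is non-NULL and is, or was before nulling, of the type. */
static bool toy_type_is_or_was(const ToyOp *o, unsigned type)
{
    return o != NULL
        && (o->type == type
            || (o->type == TOY_NULL && o->targ == type));
}
```

For an op that started as TOY_CONST and was later nulled, toy_type_is() reports false while toy_type_is_or_was() still reports true. Both return false for a NULL pointer, which is the check the "_NN" variants skip.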
6337
6338 rv2cv_op_cv
6339 Examines an op, which is expected to identify a subroutine at
6340 runtime, and attempts to determine at compile time which
6341 subroutine it identifies. This is normally used during Perl
6342 compilation to determine whether a prototype can be applied to
6343 a function call. "cvop" is the op being considered, normally
6344 an "rv2cv" op. A pointer to the identified subroutine is
6345 returned, if it could be determined statically, and a null
6346 pointer is returned if it was not possible to determine
6347 statically.
6348
6349 Currently, the subroutine can be identified statically if the
6350 RV that the "rv2cv" is to operate on is provided by a suitable
6351 "gv" or "const" op. A "gv" op is suitable if the GV's CV slot
6352 is populated. A "const" op is suitable if the constant value
6353 must be an RV pointing to a CV. Details of this process may
6354 change in future versions of Perl. If the "rv2cv" op has the
6355 "OPpENTERSUB_AMPER" flag set then no attempt is made to
6356 identify the subroutine statically: this flag is used to
6357 suppress compile-time magic on a subroutine call, forcing it to
6358 use default runtime behaviour.
6359
6360 If "flags" has the bit "RV2CVOPCV_MARK_EARLY" set, then the
6361 handling of a GV reference is modified. If a GV was examined
6362 and its CV slot was found to be empty, then the "gv" op has the
6363 "OPpEARLY_CV" flag set. If the op is not optimised away, and
6364 the CV slot is later populated with a subroutine having a
6365 prototype, that flag eventually triggers the warning "called
6366 too early to check prototype".
6367
6368 If "flags" has the bit "RV2CVOPCV_RETURN_NAME_GV" set, then
6369 instead of returning a pointer to the subroutine it returns a
6370 pointer to the GV giving the most appropriate name for the
6371 subroutine in this context. Normally this is just the "CvGV"
6372 of the subroutine, but for an anonymous ("CvANON") subroutine
6373 that is referenced through a GV it will be the referencing GV.
6374 The resulting "GV*" is cast to "CV*" to be returned. A null
6375 pointer is returned as usual if there is no statically-
6376 determinable subroutine.
6377
6378 CV * rv2cv_op_cv(OP *cvop, U32 flags)
6379
6381 packlist
6382 The engine implementing the "pack()" Perl function.
6383
6384 void packlist(SV *cat, const char *pat,
6385 const char *patend, SV **beglist,
6386 SV **endlist)
6387
6388 unpackstring
6389 The engine implementing the "unpack()" Perl function.
6390
6391 Using the template "pat..patend", this function unpacks the
6392 string "s..strend" into a number of mortal SVs, which it pushes
6393 onto the perl argument (@_) stack (so you will need to issue a
6394 "PUTBACK" before and "SPAGAIN" after the call to this
6395 function). It returns the number of pushed elements.
6396
6397 The "strend" and "patend" pointers should point to the byte
6398 following the last character of each string.
6399
6400 Although this function returns its values on the perl argument
6401 stack, it doesn't take any parameters from that stack (and thus
6402 in particular there's no need to do a "PUSHMARK" before calling
6403 it, unlike "call_pv" for example).
6404
6405 SSize_t unpackstring(const char *pat,
6406 const char *patend, const char *s,
6407 const char *strend, U32 flags)
6408
6410 CvPADLIST
6411 NOTE: this function is experimental and may change or be
6412 removed without notice.
6413
6414 CV's can have CvPADLIST(cv) set to point to a PADLIST. This is
6415 the CV's scratchpad, which stores lexical variables and opcode
6416 temporary and per-thread values.
6417
6418 For these purposes "formats" are a kind-of CV; eval""s are too
6419 (except they're not callable at will and are always thrown away
6420 after the eval"" is done executing). Require'd files are
6421 simply evals without any outer lexical scope.
6422
6423 XSUBs do not have a "CvPADLIST". "dXSTARG" fetches values from
6424 "PL_curpad", but that is really the caller's pad (a slot of
6425 which is allocated by every entersub). Do not get or set
6426 "CvPADLIST" if a CV is an XSUB (as determined by "CvISXSUB()"):
6427 the "CvPADLIST" slot is reused for a different internal purpose
6428 in XSUBs.
6429
6430 The PADLIST has a C array where pads are stored.
6431
6432 The 0th entry of the PADLIST is a PADNAMELIST which represents
6433 the "names" or rather the "static type information" for
6434 lexicals. The individual elements of a PADNAMELIST are
6435 PADNAMEs. Future refactorings might stop the PADNAMELIST from
6436 being stored in the PADLIST's array, so don't rely on it. See
6437 "PadlistNAMES".
6438
6439 The CvDEPTH'th entry of a PADLIST is a PAD (an AV) which is the
6440 stack frame at that depth of recursion into the CV. The 0th
6441 slot of a frame AV is an AV which is @_. Other entries are
6442 storage for variables and op targets.
6443
6444 Iterating over the PADNAMELIST iterates over all possible pad
6445 items. Pad slots for targets ("SVs_PADTMP") and GVs end up
6446 having &PL_padname_undef "names", while slots for constants
6447 have &PL_padname_const "names" (see "pad_alloc"). That
6448 &PL_padname_undef and &PL_padname_const are used is an
6449 implementation detail subject to change. To test for them, use
6450 "!PadnamePV(name)" and "PadnamePV(name) && !PadnameLEN(name)",
6451 respectively.
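
The two tests just named can be read as a three-way classification of pad slots. The following standalone sketch shows that logic on a stand-in structure; "ToyPadname" and "toy_slot_kind" are hypothetical names, and real code would apply "PadnamePV" and "PadnameLEN" to a PADNAME instead.

```c
#include <stddef.h>

/* Hypothetical stand-in for a pad name: a string pointer and a length. */
typedef struct ToyPadname {
    const char *pv;   /* NULL for a target or GV slot */
    size_t len;       /* 0 for a constant slot */
} ToyPadname;

typedef enum {
    TOY_SLOT_TARGET,  /* op target or GV: no name at all */
    TOY_SLOT_CONST,   /* constant: a name pointer, but zero length */
    TOY_SLOT_NAMED    /* "my"/"our" variable with a real name */
} ToySlotKind;

static ToySlotKind toy_slot_kind(const ToyPadname *pn)
{
    if (!pn->pv)          /* mirrors !PadnamePV(name) */
        return TOY_SLOT_TARGET;
    if (pn->len == 0)     /* mirrors PadnamePV(name) && !PadnameLEN(name) */
        return TOY_SLOT_CONST;
    return TOY_SLOT_NAMED;
}
```

Testing through the accessors in this order avoids comparing against the addresses of the placeholder names, which the text above describes as an implementation detail.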
6452
6453 Only "my"/"our" variable slots get valid names. The rest are
6454 op targets/GVs/constants which are statically allocated or
6455 resolved at compile time. These don't have names by which they
6456 can be looked up from Perl code at run time through eval"" the
6457 way "my"/"our" variables can be. Since they can't be looked up
6458 by "name" but only by their index allocated at compile time
6459 (which is usually in "PL_op->op_targ"), wasting a name SV for
6460 them doesn't make sense.
6461
6462 The pad names in the PADNAMELIST have their PV holding the name
6463 of the variable. The "COP_SEQ_RANGE_LOW" and "_HIGH" fields
6464 form a range (low+1..high inclusive) of cop_seq numbers for
6465 which the name is valid. During compilation, these fields may
6466 hold the special value PERL_PADSEQ_INTRO to indicate various
6467 stages:
6468
6469 COP_SEQ_RANGE_LOW        _HIGH
6470 -----------------        -----
6471 PERL_PADSEQ_INTRO            0   variable not yet introduced:
6472                                  { my ($x
6473 valid-seq#   PERL_PADSEQ_INTRO   variable in scope:
6474                                  { my ($x);
6475 valid-seq#          valid-seq#   compilation of scope complete:
6476                                  { my ($x); .... }
6477
6478 When a lexical var hasn't yet been introduced, it already
6479 exists from the perspective of duplicate declarations, but not
6480 for variable lookups, e.g.
6481
6482 my ($x, $x); # '"my" variable $x masks earlier declaration'
6483 my $x = $x; # equal to my $x = $::x;
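
The three stages and the (low+1..high inclusive) range can be expressed as a small visibility test. This is a standalone sketch, not the lookup code Perl uses: "ToyRange", "TOY_PADSEQ_INTRO" and "toy_name_in_scope" are hypothetical, the sentinel value is assumed to be the largest 32-bit number, and treating an open-ended range as visible for every sequence number above "low" is likewise an assumption.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: the "intro" sentinel is the largest 32-bit value. */
#define TOY_PADSEQ_INTRO UINT32_MAX

typedef struct ToyRange {
    uint32_t low;
    uint32_t high;
} ToyRange;

/* Is the name visible to a lookup made at sequence number `seq`? */
static bool toy_name_in_scope(const ToyRange *r, uint32_t seq)
{
    if (r->low == TOY_PADSEQ_INTRO)
        return false;            /* declared but not yet introduced */
    if (r->high == TOY_PADSEQ_INTRO)
        return seq > r->low;     /* in scope, end not yet known */
    return seq > r->low && seq <= r->high;  /* low+1..high inclusive */
}
```

The first branch corresponds to the "my $x = $x" case: while the declaration is still being compiled, a lookup does not find the new lexical.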
6484
6485 For typed lexicals "PadnameTYPE" points at the type stash. For
6486 "our" lexicals, "PadnameOURSTASH" points at the stash of the
6487 associated global (so that duplicate "our" declarations in the
6488 same package can be detected). "PadnameGEN" is sometimes used
6489 to store the generation number during compilation.
6490
6491 If "PadnameOUTER" is set on the pad name, then that slot in the
6492 frame AV is a REFCNT'ed reference to a lexical from "outside".
6493 Such entries are sometimes referred to as 'fake'. In this
6494 case, the name does not use 'low' and 'high' to store a cop_seq
6495 range, since it is in scope throughout. Instead 'high' stores
6496 some flags containing info about the real lexical (is it
6497 declared in an anon, and is it capable of being instantiated
6498 multiple times?), and for fake ANONs, 'low' contains the index
6499 within the parent's pad where the lexical's value is stored, to
6500 make cloning quicker.
6501
6502 If the 'name' is "&" the corresponding entry in the PAD is a CV
6503 representing a possible closure.
6504
6505 Note that formats are treated as anon subs, and are cloned each
6506 time write is called (if necessary).
6507
6508 The flag "SVs_PADSTALE" is cleared on lexicals each time the
6509 "my()" is executed, and set on scope exit. This allows the
6510 "Variable $x is not available" warning to be generated in
6511 evals, such as
6512
6513 { my $x = 1; sub f { eval '$x'} } f();
6514
6515 For state vars, "SVs_PADSTALE" is overloaded to mean 'not yet
6516 initialised', but this internal state is stored in a separate
6517 pad entry.
6518
6519 PADLIST * CvPADLIST(CV *cv)
6520
6521 pad_add_name_pvs
6522 Exactly like "pad_add_name_pvn", but takes a literal string
6523 instead of a string/length pair.
6524
6525 PADOFFSET pad_add_name_pvs("literal string" name,
6526 U32 flags, HV *typestash,
6527 HV *ourstash)
6528
6529 PadARRAY
6530 NOTE: this function is experimental and may change or be
6531 removed without notice.
6532
6533 The C array of pad entries.
6534
6535 SV ** PadARRAY(PAD pad)
6536
6537 pad_findmy_pvs
6538 Exactly like "pad_findmy_pvn", but takes a literal string
6539 instead of a string/length pair.
6540
6541 PADOFFSET pad_findmy_pvs("literal string" name,
6542 U32 flags)
6543
6544 PadlistARRAY
6545 NOTE: this function is experimental and may change or be
6546 removed without notice.
6547
6548 The C array of a padlist, containing the pads. Only subscript
6549 it with numbers >= 1, as the 0th entry is not guaranteed to
6550 remain usable.
6551
6552 PAD ** PadlistARRAY(PADLIST padlist)
6553
6554 PadlistMAX
6555 NOTE: this function is experimental and may change or be
6556 removed without notice.
6557
6558 The index of the last allocated space in the padlist. Note
6559 that the last pad may be in an earlier slot. Any entries
6560 following it will be "NULL" in that case.
6561
6562 SSize_t PadlistMAX(PADLIST padlist)
6563
6564 PadlistNAMES
6565 NOTE: this function is experimental and may change or be
6566 removed without notice.
6567
6568 The names associated with pad entries.
6569
6570 PADNAMELIST * PadlistNAMES(PADLIST padlist)
6571
6572 PadlistNAMESARRAY
6573 NOTE: this function is experimental and may change or be
6574 removed without notice.
6575
6576 The C array of pad names.
6577
6578 PADNAME ** PadlistNAMESARRAY(PADLIST padlist)
6579
6580 PadlistNAMESMAX
6581 NOTE: this function is experimental and may change or be
6582 removed without notice.
6583
6584 The index of the last pad name.
6585
6586 SSize_t PadlistNAMESMAX(PADLIST padlist)
6587
6588 PadlistREFCNT
6589 NOTE: this function is experimental and may change or be
6590 removed without notice.
6591
6592 The reference count of the padlist. Currently this is always
6593 1.
6594
6595 U32 PadlistREFCNT(PADLIST padlist)
6596
6597 PadMAX NOTE: this function is experimental and may change or be
6598 removed without notice.
6599
6600 The index of the last pad entry.
6601
6602 SSize_t PadMAX(PAD pad)
6603
6604 PadnameLEN
6605 NOTE: this function is experimental and may change or be
6606 removed without notice.
6607
6608 The length of the name.
6609
6610 STRLEN PadnameLEN(PADNAME pn)
6611
6612 PadnamelistARRAY
6613 NOTE: this function is experimental and may change or be
6614 removed without notice.
6615
6616 The C array of pad names.
6617
6618 PADNAME ** PadnamelistARRAY(PADNAMELIST pnl)
6619
6620 PadnamelistMAX
6621 NOTE: this function is experimental and may change or be
6622 removed without notice.
6623
6624 The index of the last pad name.
6625
6626 SSize_t PadnamelistMAX(PADNAMELIST pnl)
6627
6628 PadnamelistREFCNT
6629 NOTE: this function is experimental and may change or be
6630 removed without notice.
6631
6632 The reference count of the pad name list.
6633
6634 SSize_t PadnamelistREFCNT(PADNAMELIST pnl)
6635
6636 PadnamelistREFCNT_dec
6637 NOTE: this function is experimental and may change or be
6638 removed without notice.
6639
6640 Lowers the reference count of the pad name list.
6641
6642 void PadnamelistREFCNT_dec(PADNAMELIST pnl)
6643
6644 PadnamePV
6645 NOTE: this function is experimental and may change or be
6646 removed without notice.
6647
6648 The name stored in the pad name struct. This returns "NULL"
6649 for a target slot.
6650
6651 char * PadnamePV(PADNAME pn)
6652
6653 PadnameREFCNT
6654 NOTE: this function is experimental and may change or be
6655 removed without notice.
6656
6657 The reference count of the pad name.
6658
6659 SSize_t PadnameREFCNT(PADNAME pn)
6660
6661 PadnameREFCNT_dec
6662 NOTE: this function is experimental and may change or be
6663 removed without notice.
6664
6665 Lowers the reference count of the pad name.
6666
6667 void PadnameREFCNT_dec(PADNAME pn)
6668
6669 PadnameSV
6670 NOTE: this function is experimental and may change or be
6671 removed without notice.
6672
6673 Returns the pad name as a mortal SV.
6674
6675 SV * PadnameSV(PADNAME pn)
6676
6677 PadnameUTF8
6678 NOTE: this function is experimental and may change or be
6679 removed without notice.
6680
6681 Whether PadnamePV is in UTF-8. Currently, this is always true.
6682
6683 bool PadnameUTF8(PADNAME pn)
6684
6685 pad_new Create a new padlist, updating the global variables for the
6686 currently-compiling padlist to point to the new padlist. The
6687 following flags can be OR'ed together:
6688
6689 padnew_CLONE this pad is for a cloned CV
6690 padnew_SAVE save old globals on the save stack
6691 padnew_SAVESUB also save extra stuff for start of sub
6692
6693 PADLIST * pad_new(int flags)
6694
6695 PL_comppad
6696 NOTE: this function is experimental and may change or be
6697 removed without notice.
6698
6699 During compilation, this points to the array containing the
6700 values part of the pad for the currently-compiling code. (At
6701 runtime a CV may have many such value arrays; at compile time
6702 just one is constructed.) At runtime, this points to the array
6703 containing the currently-relevant values for the pad for the
6704 currently-executing code.
6705
6706 PL_comppad_name
6707 NOTE: this function is experimental and may change or be
6708 removed without notice.
6709
6710 During compilation, this points to the array containing the
6711 names part of the pad for the currently-compiling code.
6712
6713 PL_curpad
6714 NOTE: this function is experimental and may change or be
6715 removed without notice.
6716
6717 Points directly to the body of the "PL_comppad" array. (I.e.,
6718 this is "PadARRAY(PL_comppad)".)
6719
6721 PL_modglobal
6722 "PL_modglobal" is a general purpose, interpreter global HV for
6723 use by extensions that need to keep information on a per-
6724 interpreter basis. In a pinch, it can also be used as a symbol
6725 table for extensions to share data among each other. It is a
6726 good idea to use keys prefixed by the package name of the
6727 extension that owns the data.
6728
6729 HV* PL_modglobal
6730
6731 PL_na A convenience variable which is typically used with "SvPV" when
6732 one doesn't care about the length of the string. It is usually
6733 more efficient to either declare a local variable and use that
6734 instead or to use the "SvPV_nolen" macro.
6735
6736 STRLEN PL_na
6737
6738 PL_opfreehook
6739 When non-"NULL", the function pointed to by this variable will
6740 be called each time an OP is freed, with the corresponding OP as
6741 the argument. This allows extensions to free any extra
6742 attribute they have locally attached to an OP. The hook is
6743 guaranteed to fire first for the parent OP and then for its kids.
6744
6745 When you replace this variable, it is considered good practice
6746 to save the previously installed hook, if any, and to call it
6747 from inside your own.
6748
6749 Perl_ophook_t PL_opfreehook
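
The save-and-chain practice described above can be sketched with ordinary function pointers. The following is a standalone model, not XS code: "toy_opfreehook" stands in for "PL_opfreehook", and "ToyOp", "toy_ophook_t", "earlier_hook", "my_free_hook" and the other names are hypothetical.

```c
#include <stddef.h>

typedef struct ToyOp { int id; } ToyOp;      /* hypothetical op type */
typedef void (*toy_ophook_t)(ToyOp *o);

static toy_ophook_t toy_opfreehook = NULL;   /* stands in for PL_opfreehook */
static toy_ophook_t prev_hook = NULL;        /* hook installed before ours */
static int freed_count = 0;
static int earlier_calls = 0;

/* A hook that some other extension might have installed earlier. */
static void earlier_hook(ToyOp *o)
{
    (void)o;
    earlier_calls++;
}

/* Our hook: do our own cleanup, then chain to the previous hook. */
static void my_free_hook(ToyOp *o)
{
    freed_count++;        /* release our own per-op data here */
    if (prev_hook)
        prev_hook(o);     /* give the earlier hook its turn */
}

/* Install our hook while remembering whatever was there (maybe NULL). */
static void install_my_hook(void)
{
    prev_hook = toy_opfreehook;
    toy_opfreehook = my_free_hook;
}

/* What the op-freeing code would do for each op in this model. */
static void toy_free_op(ToyOp *o)
{
    if (toy_opfreehook)
        toy_opfreehook(o);
}
```

If earlier_hook was installed first, a call to toy_free_op() after install_my_hook() runs both hooks once. Dropping the call to prev_hook would silently disable the other extension's cleanup.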
6750
6751 PL_peepp
6752 Pointer to the per-subroutine peephole optimiser. This is a
6753 function that gets called at the end of compilation of a Perl
6754 subroutine (or equivalently independent piece of Perl code) to
6755 perform fixups of some ops and to perform small-scale
6756 optimisations. The function is called once for each subroutine
6757 that is compiled, and is passed, as sole parameter, a pointer
6758 to the op that is the entry point to the subroutine. It
6759 modifies the op tree in place.
6760
6761 The peephole optimiser should never be completely replaced.
6762 Rather, add code to it by wrapping the existing optimiser. The
6763 basic way to do this can be seen in "Compile pass 3: peephole
6764 optimization" in perlguts. If the new code wishes to operate
6765 on ops throughout the subroutine's structure, rather than just
6766 at the top level, it is likely to be more convenient to wrap
6767 the "PL_rpeepp" hook.
6768
6769 peep_t PL_peepp
6770
6771 PL_rpeepp
6772 Pointer to the recursive peephole optimiser. This is a
6773 function that gets called at the end of compilation of a Perl
6774 subroutine (or equivalently independent piece of Perl code) to
6775 perform fixups of some ops and to perform small-scale
6776 optimisations. The function is called once for each chain of
6777 ops linked through their "op_next" fields; it is recursively
6778 called to handle each side chain. It is passed, as sole
6779 parameter, a pointer to the op that is at the head of the
6780 chain. It modifies the op tree in place.
6781
6782 The peephole optimiser should never be completely replaced.
6783 Rather, add code to it by wrapping the existing optimiser. The
6784 basic way to do this can be seen in "Compile pass 3: peephole
6785 optimization" in perlguts. If the new code wishes to operate
6786 only on ops at a subroutine's top level, rather than throughout
6787 the structure, it is likely to be more convenient to wrap the
6788 "PL_peepp" hook.
6789
6790 peep_t PL_rpeepp
6791
6792 PL_sv_no
6793 This is the "false" SV. See "PL_sv_yes". Always refer to this
6794 as &PL_sv_no.
6795
6796 SV PL_sv_no
6797
6798 PL_sv_undef
6799 This is the "undef" SV. Always refer to this as &PL_sv_undef.
6800
6801 SV PL_sv_undef
6802
6803 PL_sv_yes
6804 This is the "true" SV. See "PL_sv_no". Always refer to this
6805 as &PL_sv_yes.
6806
6807 SV PL_sv_yes
6808
6809 PL_sv_zero
6810 This readonly SV has a zero numeric value and a "0" string
6811 value. It's similar to "PL_sv_no" except for its string value.
6812 Can be used as a cheap alternative to mXPUSHi(0) for example.
6813 Always refer to this as &PL_sv_zero. Introduced in 5.28.
6814
6815 SV PL_sv_zero
6816
6818 SvRX Convenience macro to get the REGEXP from a SV. This is
6819 approximately equivalent to the following snippet:
6820
6821 if (SvMAGICAL(sv))
6822     mg_get(sv);
6823 if (SvROK(sv))
6824     sv = MUTABLE_SV(SvRV(sv));
6825 if (SvTYPE(sv) == SVt_REGEXP)
6826     return (REGEXP*) sv;
6827
6828 "NULL" will be returned if a REGEXP* is not found.
6829
6830 REGEXP * SvRX(SV *sv)
6831
6832 SvRXOK Returns a boolean indicating whether the SV (or the one it
6833 references) is a REGEXP.
6834
6835 If you want to do something with the REGEXP* later use SvRX
6836 instead and check for NULL.
6837
6838 bool SvRXOK(SV* sv)
6839
6841 dMARK Declare a stack marker variable, "mark", for the XSUB. See
6842 "MARK" and "dORIGMARK".
6843
6844 dMARK;
6845
6846 dORIGMARK
6847 Saves the original stack mark for the XSUB. See "ORIGMARK".
6848
6849 dORIGMARK;
6850
6851 dSP Declares a local copy of perl's stack pointer for the XSUB,
6852 available via the "SP" macro. See "SP".
6853
6854 dSP;
6855
6856 EXTEND Used to extend the argument stack for an XSUB's return values.
6857 Once used, guarantees that there is room for at least "nitems"
6858 to be pushed onto the stack.
6859
6860 void EXTEND(SP, SSize_t nitems)
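
The contract between "EXTEND" and the push macros (the plain PUSH family requires room to exist already, the XPUSH family makes room itself) can be modelled with a growable array. This is a standalone sketch, not Perl's stack code: "ToyStack", "toy_extend", "toy_push" and "toy_xpush" are hypothetical names, and the model stores plain ints rather than SV pointers.

```c
#include <stdlib.h>

typedef struct ToyStack {
    int *base;    /* start of the allocation */
    size_t len;   /* number of items currently on the stack */
    size_t cap;   /* number of items the allocation can hold */
} ToyStack;

/* Like EXTEND: make sure at least n more items fit. Returns 0 on success. */
static int toy_extend(ToyStack *s, size_t n)
{
    if (s->cap - s->len >= n)
        return 0;                       /* already enough room */
    size_t new_cap = s->len + n;
    int *p = realloc(s->base, new_cap * sizeof *p);
    if (!p)
        return -1;
    s->base = p;                        /* the storage may have moved */
    s->cap = new_cap;
    return 0;
}

/* Like the PUSH family: the caller must already have ensured room. */
static void toy_push(ToyStack *s, int v)
{
    s->base[s->len++] = v;
}

/* Like the XPUSH family: extend by one if necessary, then push. */
static int toy_xpush(ToyStack *s, int v)
{
    if (toy_extend(s, 1) != 0)
        return -1;
    toy_push(s, v);
    return 0;
}
```

When the number of return values is known up front, one toy_extend(&s, n) followed by n calls to toy_push() does a single capacity check, whereas toy_xpush() checks on every call.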
6861
6862 MARK Stack marker variable for the XSUB. See "dMARK".
6863
6864 mPUSHi Push an integer onto the stack. The stack must have room for
6865 this element. Does not use "TARG". See also "PUSHi",
6866 "mXPUSHi" and "XPUSHi".
6867
6868 void mPUSHi(IV iv)
6869
6870 mPUSHn Push a double onto the stack. The stack must have room for
6871 this element. Does not use "TARG". See also "PUSHn",
6872 "mXPUSHn" and "XPUSHn".
6873
6874 void mPUSHn(NV nv)
6875
6876 mPUSHp Push a string onto the stack. The stack must have room for
6877 this element. The "len" indicates the length of the string.
6878 Does not use "TARG". See also "PUSHp", "mXPUSHp" and "XPUSHp".
6879
6880 void mPUSHp(char* str, STRLEN len)
6881
6882 mPUSHs Push an SV onto the stack and mortalize the SV. The stack
6883 must have room for this element. Does not use "TARG". See
6884 also "PUSHs" and "mXPUSHs".
6885
6886 void mPUSHs(SV* sv)
6887
6888 mPUSHu Push an unsigned integer onto the stack. The stack must have
6889 room for this element. Does not use "TARG". See also "PUSHu",
6890 "mXPUSHu" and "XPUSHu".
6891
6892 void mPUSHu(UV uv)
6893
6894 mXPUSHi Push an integer onto the stack, extending the stack if
6895 necessary. Does not use "TARG". See also "XPUSHi", "mPUSHi"
6896 and "PUSHi".
6897
6898 void mXPUSHi(IV iv)
6899
6900 mXPUSHn Push a double onto the stack, extending the stack if necessary.
6901 Does not use "TARG". See also "XPUSHn", "mPUSHn" and "PUSHn".
6902
6903 void mXPUSHn(NV nv)
6904
6905 mXPUSHp Push a string onto the stack, extending the stack if necessary.
6906 The "len" indicates the length of the string. Does not use
6907 "TARG". See also "XPUSHp", "mPUSHp" and "PUSHp".
6908
6909 void mXPUSHp(char* str, STRLEN len)
6910
6911 mXPUSHs Push an SV onto the stack, extending the stack if necessary,
6912 and mortalize the SV. Does not use "TARG". See also "XPUSHs" and
6913 "mPUSHs".
6914
6915 void mXPUSHs(SV* sv)
6916
6917 mXPUSHu Push an unsigned integer onto the stack, extending the stack if
6918 necessary. Does not use "TARG". See also "XPUSHu", "mPUSHu"
6919 and "PUSHu".
6920
6921 void mXPUSHu(UV uv)
6922
6923 ORIGMARK
6924 The original stack mark for the XSUB. See "dORIGMARK".
6925
6926 POPi Pops an integer off the stack.
6927
6928 IV POPi
6929
6930 POPl Pops a long off the stack.
6931
6932 long POPl
6933
6934 POPn Pops a double off the stack.
6935
6936 NV POPn
6937
6938 POPp Pops a string off the stack.
6939
6940 char* POPp
6941
6942 POPpbytex
6943 Pops a string off the stack, which must consist of bytes, i.e.
6944 characters < 256.
6945
6946 char* POPpbytex
6947
6948 POPpx Pops a string off the stack. Identical to POPp. There are two
6949 names for historical reasons.
6950
6951 char* POPpx
6952
6953 POPs Pops an SV off the stack.
6954
6955 SV* POPs
6956
6957 POPu Pops an unsigned integer off the stack.
6958
6959 UV POPu
6960
6961 POPul Pops an unsigned long off the stack.
6962
6963 unsigned long POPul
6964
6965 PUSHi Push an integer onto the stack. The stack must have room for
6966 this element. Handles 'set' magic. Uses "TARG", so "dTARGET"
6967 or "dXSTARG" should be called to declare it. Do not call
6968 multiple "TARG"-oriented macros to return lists from XSUB's -
6969 see "mPUSHi" instead. See also "XPUSHi" and "mXPUSHi".
6970
6971 void PUSHi(IV iv)
6972
6973 PUSHMARK
6974 Opening bracket for arguments on a callback. See "PUTBACK" and
6975 perlcall.
6976
6977 void PUSHMARK(SP)
6978
6979 PUSHmortal
6980 Push a new mortal SV onto the stack. The stack must have room
6981 for this element. Does not use "TARG". See also "PUSHs",
6982 "XPUSHmortal" and "XPUSHs".
6983
6984 void PUSHmortal()
6985
6986 PUSHn Push a double onto the stack. The stack must have room for
6987 this element. Handles 'set' magic. Uses "TARG", so "dTARGET"
6988 or "dXSTARG" should be called to declare it. Do not call
6989 multiple "TARG"-oriented macros to return lists from XSUB's -
6990 see "mPUSHn" instead. See also "XPUSHn" and "mXPUSHn".
6991
6992 void PUSHn(NV nv)
6993
6994 PUSHp Push a string onto the stack. The stack must have room for
6995 this element. The "len" indicates the length of the string.
6996 Handles 'set' magic. Uses "TARG", so "dTARGET" or "dXSTARG"
6997 should be called to declare it. Do not call multiple
6998 "TARG"-oriented macros to return lists from XSUB's - see
6999 "mPUSHp" instead. See also "XPUSHp" and "mXPUSHp".
7000
7001 void PUSHp(char* str, STRLEN len)
7002
7003 PUSHs Push an SV onto the stack. The stack must have room for this
7004 element. Does not handle 'set' magic. Does not use "TARG".
7005 See also "PUSHmortal", "XPUSHs", and "XPUSHmortal".
7006
7007 void PUSHs(SV* sv)
7008
7009 PUSHu Push an unsigned integer onto the stack. The stack must have
7010 room for this element. Handles 'set' magic. Uses "TARG", so
7011 "dTARGET" or "dXSTARG" should be called to declare it. Do not
7012 call multiple "TARG"-oriented macros to return lists from
7013 XSUB's - see "mPUSHu" instead. See also "XPUSHu" and
7014 "mXPUSHu".
7015
7016 void PUSHu(UV uv)
7017
7018 PUTBACK Closing bracket for XSUB arguments. This is usually handled by
7019 "xsubpp". See "PUSHMARK" and perlcall for other uses.
7020
7021 PUTBACK;
7022
7023 SP Stack pointer. This is usually handled by "xsubpp". See "dSP"
7024 and "SPAGAIN".
7025
7026 SPAGAIN Refetch the stack pointer. Used after a callback. See
7027 perlcall.
7028
7029 SPAGAIN;
7030
7031 XPUSHi Push an integer onto the stack, extending the stack if
7032 necessary. Handles 'set' magic. Uses "TARG", so "dTARGET" or
7033 "dXSTARG" should be called to declare it. Do not call multiple
7034 "TARG"-oriented macros to return lists from XSUB's - see
7035 "mXPUSHi" instead. See also "PUSHi" and "mPUSHi".
7036
7037 void XPUSHi(IV iv)
7038
7039 XPUSHmortal
7040 Push a new mortal SV onto the stack, extending the stack if
7041 necessary. Does not use "TARG". See also "XPUSHs",
7042 "PUSHmortal" and "PUSHs".
7043
7044 void XPUSHmortal()
7045
7046 XPUSHn Push a double onto the stack, extending the stack if necessary.
7047 Handles 'set' magic. Uses "TARG", so "dTARGET" or "dXSTARG"
7048 should be called to declare it. Do not call multiple
7049 "TARG"-oriented macros to return lists from XSUB's - see
7050 "mXPUSHn" instead. See also "PUSHn" and "mPUSHn".
7051
7052 void XPUSHn(NV nv)
7053
7054 XPUSHp Push a string onto the stack, extending the stack if necessary.
7055 The "len" indicates the length of the string. Handles 'set'
7056 magic. Uses "TARG", so "dTARGET" or "dXSTARG" should be called
7057 to declare it. Do not call multiple "TARG"-oriented macros to
7058 return lists from XSUB's - see "mXPUSHp" instead. See also
7059 "PUSHp" and "mPUSHp".
7060
7061 void XPUSHp(char* str, STRLEN len)
7062
7063 XPUSHs Push an SV onto the stack, extending the stack if necessary.
7064 Does not handle 'set' magic. Does not use "TARG". See also
7065 "XPUSHmortal", "PUSHs" and "PUSHmortal".
7066
7067 void XPUSHs(SV* sv)
7068
7069 XPUSHu Push an unsigned integer onto the stack, extending the stack if
7070 necessary. Handles 'set' magic. Uses "TARG", so "dTARGET" or
7071 "dXSTARG" should be called to declare it. Do not call multiple
7072 "TARG"-oriented macros to return lists from XSUB's - see
7073 "mXPUSHu" instead. See also "PUSHu" and "mPUSHu".
7074
7075 void XPUSHu(UV uv)
7076
7077 XSRETURN
7078 Return from XSUB, indicating number of items on the stack.
7079 This is usually handled by "xsubpp".
7080
7081 void XSRETURN(int nitems)
7082
7083 XSRETURN_EMPTY
7084 Return an empty list from an XSUB immediately.
7085
7086 XSRETURN_EMPTY;
7087
7088 XSRETURN_IV
7089 Return an integer from an XSUB immediately. Uses "XST_mIV".
7090
7091 void XSRETURN_IV(IV iv)
7092
7093 XSRETURN_NO
7094 Return &PL_sv_no from an XSUB immediately. Uses "XST_mNO".
7095
7096 XSRETURN_NO;
7097
7098 XSRETURN_NV
7099 Return a double from an XSUB immediately. Uses "XST_mNV".
7100
7101 void XSRETURN_NV(NV nv)
7102
7103 XSRETURN_PV
7104 Return a copy of a string from an XSUB immediately. Uses
7105 "XST_mPV".
7106
7107 void XSRETURN_PV(char* str)
7108
7109 XSRETURN_UNDEF
7110 Return &PL_sv_undef from an XSUB immediately. Uses
7111 "XST_mUNDEF".
7112
7113 XSRETURN_UNDEF;
7114
7115 XSRETURN_UV
7116 Return an unsigned integer from an XSUB immediately. Uses "XST_mUV".
7117
7118 void XSRETURN_UV(UV uv)
7119
7120 XSRETURN_YES
7121 Return &PL_sv_yes from an XSUB immediately. Uses "XST_mYES".
7122
7123 XSRETURN_YES;
7124
7125 XST_mIV Place an integer into the specified position "pos" on the
7126 stack. The value is stored in a new mortal SV.
7127
7128 void XST_mIV(int pos, IV iv)
7129
7130 XST_mNO Place &PL_sv_no into the specified position "pos" on the stack.
7131
7132 void XST_mNO(int pos)
7133
7134 XST_mNV Place a double into the specified position "pos" on the stack.
7135 The value is stored in a new mortal SV.
7136
7137 void XST_mNV(int pos, NV nv)
7138
7139 XST_mPV Place a copy of a string into the specified position "pos" on
7140 the stack. The value is stored in a new mortal SV.
7141
7142 void XST_mPV(int pos, char* str)
7143
7144 XST_mUNDEF
7145 Place &PL_sv_undef into the specified position "pos" on the
7146 stack.
7147
7148 void XST_mUNDEF(int pos)
7149
7150 XST_mYES
7151 Place &PL_sv_yes into the specified position "pos" on the
7152 stack.
7153
7154 void XST_mYES(int pos)
7155
7157 SVt_INVLIST
7158 Type flag for scalars. See "svtype".
7159
7160 SVt_IV Type flag for scalars. See "svtype".
7161
7162 SVt_NULL
7163 Type flag for scalars. See "svtype".
7164
7165 SVt_NV Type flag for scalars. See "svtype".
7166
7167 SVt_PV Type flag for scalars. See "svtype".
7168
7169 SVt_PVAV
7170 Type flag for arrays. See "svtype".
7171
7172 SVt_PVCV
7173 Type flag for subroutines. See "svtype".
7174
7175 SVt_PVFM
7176 Type flag for formats. See "svtype".
7177
7178 SVt_PVGV
7179 Type flag for typeglobs. See "svtype".
7180
7181 SVt_PVHV
7182 Type flag for hashes. See "svtype".
7183
7184 SVt_PVIO
7185 Type flag for I/O objects. See "svtype".
7186
7187 SVt_PVIV
7188 Type flag for scalars. See "svtype".
7189
7190 SVt_PVLV
7191 Type flag for scalars. See "svtype".
7192
7193 SVt_PVMG
7194 Type flag for scalars. See "svtype".
7195
7196 SVt_PVNV
7197 Type flag for scalars. See "svtype".
7198
7199 SVt_REGEXP
7200 Type flag for regular expressions. See "svtype".
7201
7202 svtype An enum of flags for Perl types. These are found in the file
7203 sv.h in the "svtype" enum. Test these flags with the "SvTYPE"
7204 macro.
7205
7206 The types are:
7207
7208 SVt_NULL
7209 SVt_IV
7210 SVt_NV
7211 SVt_RV
7212 SVt_PV
7213 SVt_PVIV
7214 SVt_PVNV
7215 SVt_PVMG
7216 SVt_INVLIST
7217 SVt_REGEXP
7218 SVt_PVGV
7219 SVt_PVLV
7220 SVt_PVAV
7221 SVt_PVHV
7222 SVt_PVCV
7223 SVt_PVFM
7224 SVt_PVIO
7225
7226 These are most easily explained from the bottom up.
7227
7228 "SVt_PVIO" is for I/O objects, "SVt_PVFM" for formats,
7229 "SVt_PVCV" for subroutines, "SVt_PVHV" for hashes and
7230 "SVt_PVAV" for arrays.
7231
7232 All the others are scalar types, that is, things that can be
7233 bound to a "$" variable. For these, the internal types are
7234 mostly orthogonal to types in the Perl language.
7235
7236 Hence, checking "SvTYPE(sv) < SVt_PVAV" is the best way to see
7237 whether something is a scalar.
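
The ordering test works because every scalar type precedes the array type in the enum. A minimal standalone sketch of the idea follows; "ToySvtype" and "toy_is_scalar" are hypothetical names, the enumerators simply follow the order of the list above (with the "SVt_RV" alias left out), and the numeric values are illustrative rather than those in sv.h. Real code would compare "SvTYPE(sv)" with "SVt_PVAV".

```c
#include <stdbool.h>

/* Toy enum in the order the list above gives; values are illustrative. */
typedef enum {
    TOY_NULL, TOY_IV, TOY_NV, TOY_PV, TOY_PVIV, TOY_PVNV, TOY_PVMG,
    TOY_INVLIST, TOY_REGEXP, TOY_PVGV, TOY_PVLV,
    TOY_PVAV, TOY_PVHV, TOY_PVCV, TOY_PVFM, TOY_PVIO
} ToySvtype;

/* Scalar types are exactly those ordered before the array type. */
static bool toy_is_scalar(ToySvtype t)
{
    return t < TOY_PVAV;
}
```

A single comparison therefore replaces a chain of equality tests against each scalar type, and keeps working if further scalar types are added before the array type.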
7238
7239 "SVt_PVGV" represents a typeglob. If "!SvFAKE(sv)", then it is
7240 a real, incoercible typeglob. If "SvFAKE(sv)", then it is a
7241 scalar to which a typeglob has been assigned. Assigning to it
7242 again will stop it from being a typeglob. "SVt_PVLV"
7243 represents a scalar that delegates to another scalar behind the
7244 scenes. It is used, e.g., for the return value of "substr" and
7245 for tied hash and array elements. It can hold any scalar
7246 value, including a typeglob. "SVt_REGEXP" is for regular
7247 expressions. "SVt_INVLIST" is for Perl core internal use only.
7248
7249 "SVt_PVMG" represents a "normal" scalar (not a typeglob,
7250 regular expression, or delegate). Since most scalars do not
7251 need all the internal fields of a PVMG, we save memory by
7252 allocating smaller structs when possible. All the other types
7253 are just simpler forms of "SVt_PVMG", with fewer internal
7254 fields. "SVt_NULL" can only hold undef. "SVt_IV" can hold
7255 undef, an integer, or a reference. ("SVt_RV" is an alias for
7256 "SVt_IV", which exists for backward compatibility.) "SVt_NV"
7257 can hold any of those or a double. "SVt_PV" can only hold
7258 "undef" or a string. "SVt_PVIV" is a superset of "SVt_PV" and
7259 "SVt_IV". "SVt_PVNV" is similar. "SVt_PVMG" can hold anything
7260 "SVt_PVNV" can hold, but it can, but does not have to, be
7261 blessed or magical.
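
The ordering argument above can be illustrated with a small stand-alone model. This is only a sketch: the "TOY_" names are hypothetical, the enum simply mirrors the order of the listing above (with the "SVt_RV" alias left out), and its numeric values are not claimed to match sv.h. It shows why one comparison against the array type is enough to separate scalars from aggregates.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical mirror of the listing order above. The real enum lives
 * in sv.h and its numeric values may differ between perl versions. */
typedef enum {
    TOY_NULL, TOY_IV, TOY_NV, TOY_PV, TOY_PVIV, TOY_PVNV, TOY_PVMG,
    TOY_INVLIST, TOY_REGEXP, TOY_PVGV, TOY_PVLV,
    TOY_PVAV, TOY_PVHV, TOY_PVCV, TOY_PVFM, TOY_PVIO
} toy_svtype;

/* Models the documented test "SvTYPE(sv) < SVt_PVAV". */
static bool toy_is_scalar(toy_svtype t)
{
    assert(t >= TOY_NULL && t <= TOY_PVIO);
    return t < TOY_PVAV;
}
```

In this model typeglobs and delegating scalars count as scalars, while arrays, hashes, subroutines, formats and I/O objects do not.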
7262
7264 boolSV Returns a true SV if "b" is a true value, or a false SV if "b"
7265 is 0.
7266
7267 See also "PL_sv_yes" and "PL_sv_no".
7268
7269 SV * boolSV(bool b)
7270
7271 croak_xs_usage
7272 A specialised variant of "croak()" for emitting the usage
7273 message for xsubs.
7274
7275 croak_xs_usage(cv, "eee_yow");
7276
7277 works out the package name and subroutine name from "cv", and
7278 then calls "croak()". Hence if "cv" is &ouch::awk, it would
7279 call "croak" as:
7280
7281 Perl_croak(aTHX_ "Usage: %" SVf "::%" SVf "(%s)", "ouch", "awk",
7282 "eee_yow");
7283
7284 void croak_xs_usage(const CV *const cv,
7285 const char *const params)
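
As a rough illustration of the message shape described above, the following plain C sketch builds the same text with "snprintf". The helper "toy_usage_message" is hypothetical and is not part of the Perl API; it only formats a string and does not raise a Perl exception.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: formats the text that the documented call would
 * hand to croak(). Returns what snprintf() returns. */
static int toy_usage_message(char *out, size_t outlen,
                             const char *package, const char *subname,
                             const char *params)
{
    assert(out != NULL && outlen > 0);
    return snprintf(out, outlen, "Usage: %s::%s(%s)",
                    package, subname, params);
}
```

Called with "ouch", "awk" and "eee_yow", the helper produces the string "Usage: ouch::awk(eee_yow)".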
7286
7287 get_sv Returns the SV of the specified Perl scalar. "flags" are
7288 passed to "gv_fetchpv". If "GV_ADD" is set and the Perl
7289 variable does not exist then it will be created. If "flags" is
7290 zero and the variable does not exist then NULL is returned.
7291
7292 NOTE: the perl_ form of this function is deprecated.
7293
7294 SV* get_sv(const char *name, I32 flags)
7295
7296 looks_like_number
7297 Test if the content of an SV looks like a number (or is a
7298 number). "Inf" and "Infinity" are treated as numbers (so will
7299 not issue a non-numeric warning), even if your "atof()" doesn't
7300 grok them. Get-magic is ignored.
7301
7302 I32 looks_like_number(SV *const sv)
7303
7304 newRV_inc
7305 Creates an RV wrapper for an SV. The reference count for the
7306 original SV is incremented.
7307
7308 SV* newRV_inc(SV* sv)
7309
7310 newRV_noinc
7311 Creates an RV wrapper for an SV. The reference count for the
7312 original SV is not incremented.
7313
7314 SV* newRV_noinc(SV *const tmpRef)
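
The difference between the two constructors is purely one of reference count ownership. The sketch below models it in plain C; "toy_sv", "toy_rv" and both functions are hypothetical stand-ins, not real Perl structures or calls.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins: a referent with a reference count, and a reference
 * that points at it. */
typedef struct { int refcnt; } toy_sv;
typedef struct { toy_sv *target; } toy_rv;

/* Models newRV_inc: the new reference takes a count of its own. */
static toy_rv toy_newrv_inc(toy_sv *sv)
{
    assert(sv != NULL);
    sv->refcnt++;
    toy_rv rv = { sv };
    return rv;
}

/* Models newRV_noinc: the new reference adopts a count the caller
 * already holds, so the total is unchanged. */
static toy_rv toy_newrv_noinc(toy_sv *sv)
{
    assert(sv != NULL);
    toy_rv rv = { sv };
    return rv;
}
```

Under this model the "noinc" form suits a freshly created value whose single count is being handed over to the reference, while the "inc" form suits a value the caller intends to keep using.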
7315
7316 newSV Creates a new SV. A non-zero "len" parameter indicates the
7317 number of bytes of preallocated string space the SV should
7318 have. An extra byte for a trailing "NUL" is also reserved.
7319 ("SvPOK" is not set for the SV even if string space is
7320 allocated.) The reference count for the new SV is set to 1.
7321
7322 In 5.9.3, "newSV()" replaces the older "NEWSV()" API, and drops
7323 the first parameter, x, a debug aid which allowed callers to
7324 identify themselves. This aid has been superseded by a new
7325 build option, "PERL_MEM_LOG" (see "PERL_MEM_LOG" in
7326 perlhacktips). The older API is still there for use in XS
7327 modules supporting older perls.
7328
7329 SV* newSV(const STRLEN len)
7330
7331 newSVhek
7332 Creates a new SV from the hash key structure. It will generate
7333 scalars that point to the shared string table where possible.
7334 Returns a new (undefined) SV if "hek" is NULL.
7335
7336 SV* newSVhek(const HEK *const hek)
7337
7338 newSViv Creates a new SV and copies an integer into it. The reference
7339 count for the SV is set to 1.
7340
7341 SV* newSViv(const IV i)
7342
7343 newSVnv Creates a new SV and copies a floating point value into it.
7344 The reference count for the SV is set to 1.
7345
7346 SV* newSVnv(const NV n)
7347
7348 newSVpadname
7349 NOTE: this function is experimental and may change or be
7350 removed without notice.
7351
7352 Creates a new SV containing the pad name.
7353
7354 SV* newSVpadname(PADNAME *pn)
7355
7356 newSVpv Creates a new SV and copies a string (which may contain "NUL"
7357 ("\0") characters) into it. The reference count for the SV is
7358 set to 1. If "len" is zero, Perl will compute the length using
7359 "strlen()" (which means that, if you use this option, "s" can't
7360 have embedded "NUL" characters and has to have a terminating
7361 "NUL" byte).
7362
7363 This function can cause reliability issues if you are likely to
7364 pass in empty strings that are not null terminated, because it
7365 will run strlen on the string and potentially run past valid
7366 memory.
7367
7368 Using "newSVpvn" is a safer alternative for non "NUL"
7369 terminated strings. For string literals use "newSVpvs"
7370 instead. This function will work fine for "NUL" terminated
7371 strings, but if you want to avoid the if statement on whether
7372 to call "strlen" use "newSVpvn" instead (calling "strlen"
7373 yourself).
7374
7375 SV* newSVpv(const char *const s, const STRLEN len)
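
The length rule above is easy to get wrong with binary data. The following sketch models only that rule in plain C; both functions are hypothetical and return the number of bytes that would be copied rather than creating anything.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Models the rule documented for newSVpv: a zero "len" means "measure
 * with strlen()", so the copy stops at the first NUL byte. */
static size_t toy_newsvpv_length(const char *s, size_t len)
{
    assert(s != NULL);
    return len ? len : strlen(s);
}

/* Models newSVpvn: "len" is always taken literally, including zero. */
static size_t toy_newsvpvn_length(const char *s, size_t len)
{
    (void)s;
    return len;
}
```

For the five data bytes 'a', 'b', NUL, 'c', 'd', passing a zero length to the first function yields 2, whereas passing the explicit length 5 to either function keeps all five bytes.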
7376
7377 newSVpvf
7378 Creates a new SV and initializes it with the string formatted
7379 like "sv_catpvf".
7380
7381 SV* newSVpvf(const char *const pat, ...)
7382
7383 newSVpvn
7384 Creates a new SV and copies a string into it, which may contain
7385 "NUL" characters ("\0") and other binary data. The reference
7386 count for the SV is set to 1. Note that if "len" is zero, Perl
7387 will create a zero length (Perl) string. You are responsible
7388 for ensuring that the source buffer is at least "len" bytes
7389 long. If the "buffer" argument is NULL the new SV will be
7390 undefined.
7391
7392 SV* newSVpvn(const char *const buffer,
7393 const STRLEN len)
7394
7395 newSVpvn_flags
7396 Creates a new SV and copies a string (which may contain "NUL"
7397 ("\0") characters) into it. The reference count for the SV is
7398 set to 1. Note that if "len" is zero, Perl will create a zero
7399 length string. You are responsible for ensuring that the
7400 source string is at least "len" bytes long. If the "s"
7401 argument is NULL the new SV will be undefined. Currently the
7402 only flag bits accepted are "SVf_UTF8" and "SVs_TEMP". If
7403 "SVs_TEMP" is set, then "sv_2mortal()" is called on the result
7404 before returning. If "SVf_UTF8" is set, "s" is considered to
7405 be in UTF-8 and the "SVf_UTF8" flag will be set on the new SV.
7406 "newSVpvn_utf8()" is a convenience wrapper for this function,
7407 defined as
7408
7409 #define newSVpvn_utf8(s, len, u) \
7410 newSVpvn_flags((s), (len), (u) ? SVf_UTF8 : 0)
7411
7412 SV* newSVpvn_flags(const char *const s,
7413 const STRLEN len,
7414 const U32 flags)
7415
7416 newSVpvn_share
7417 Creates a new SV with its "SvPVX_const" pointing to a shared
7418 string in the string table. If the string does not already
7419 exist in the table, it is created first. Turns on the
7420 "SvIsCOW" flag (or "READONLY" and "FAKE" in 5.16 and earlier).
7421 If the "hash" parameter is non-zero, that value is used;
7422 otherwise the hash is computed. The string's hash can later be
7423 retrieved from the SV with the "SvSHARED_HASH()" macro. The
7424 idea here is that as the string table is used for shared hash
7425 keys these strings will have "SvPVX_const == HeKEY" and hash
7426 lookup will avoid string compare.
7427
7428 SV* newSVpvn_share(const char* s, I32 len, U32 hash)
7429
7430 newSVpvn_utf8
7431 Creates a new SV and copies a string (which may contain "NUL"
7432 ("\0") characters) into it. If "utf8" is true, calls
7433 "SvUTF8_on" on the new SV. Implemented as a wrapper around
7434 "newSVpvn_flags".
7435
7436 SV* newSVpvn_utf8(const char* s, STRLEN len,
7437 U32 utf8)
7438
7439 newSVpvs
7440 Like "newSVpvn", but takes a literal string instead of a
7441 string/length pair.
7442
7443 SV* newSVpvs("literal string" s)
7444
7445 newSVpvs_flags
7446 Like "newSVpvn_flags", but takes a literal string instead of a
7447 string/length pair.
7448
7449 SV* newSVpvs_flags("literal string" s, U32 flags)
7450
7451 newSVpv_share
7452 Like "newSVpvn_share", but takes a "NUL"-terminated string
7453 instead of a string/length pair.
7454
7455 SV* newSVpv_share(const char* s, U32 hash)
7456
7457 newSVpvs_share
7458 Like "newSVpvn_share", but takes a literal string instead of a
7459 string/length pair and omits the hash parameter.
7460
7461 SV* newSVpvs_share("literal string" s)
7462
7463 newSVrv Creates a new SV for the existing RV, "rv", to point to. If
7464 "rv" is not an RV then it will be upgraded to one. If
7465 "classname" is non-null then the new SV will be blessed in the
7466 specified package. The new SV is returned and its reference
7467 count is 1. The reference count 1 is owned by "rv". See also
7468 newRV_inc() and newRV_noinc() for creating a new RV properly.
7469
7470 SV* newSVrv(SV *const rv,
7471 const char *const classname)
7472
7473 newSVsv Creates a new SV which is an exact duplicate of the original
7474 SV. (Uses "sv_setsv".)
7475
7476 SV* newSVsv(SV *const old)
7477
7478 newSVsv_nomg
7479 Like "newSVsv" but does not process get magic.
7480
7481 SV* newSVsv_nomg(SV *const old)
7482
7483 newSV_type
7484 Creates a new SV, of the type specified. The reference count
7485 for the new SV is set to 1.
7486
7487 SV* newSV_type(const svtype type)
7488
7489 newSVuv Creates a new SV and copies an unsigned integer into it. The
7490 reference count for the SV is set to 1.
7491
7492 SV* newSVuv(const UV u)
7493
7494 sv_2bool
7495 This macro is only used by "sv_true()" or its macro equivalent,
7496 and only if the latter's argument is neither "SvPOK", "SvIOK"
7497 nor "SvNOK". It calls "sv_2bool_flags" with the "SV_GMAGIC"
7498 flag.
7499
7500 bool sv_2bool(SV *const sv)
7501
7502 sv_2bool_flags
7503 This function is only used by "sv_true()" and friends, and
7504 only if the latter's argument is neither "SvPOK", "SvIOK" nor
7505 "SvNOK". If the flags contain "SV_GMAGIC", then it does an
7506 "mg_get()" first.
7507
7508 bool sv_2bool_flags(SV *sv, I32 flags)
7509
7510 sv_2cv Using various gambits, try to get a CV from an SV; in addition,
7511 try if possible to set *st and *gvp to the stash and GV
7512 associated with it. The flags in "lref" are passed to
7513 "gv_fetchsv".
7514
7515 CV* sv_2cv(SV* sv, HV **const st, GV **const gvp,
7516 const I32 lref)
7517
7518 sv_2io Using various gambits, try to get an IO from an SV: the IO slot
7519 if it's a GV; or the recursive result if we're an RV; or the IO
7520 slot of the symbol named after the PV if we're a string.
7521
7522 'Get' magic is ignored on the "sv" passed in, but will be
7523 called on "SvRV(sv)" if "sv" is an RV.
7524
7525 IO* sv_2io(SV *const sv)
7526
7527 sv_2iv_flags
7528 Return the integer value of an SV, doing any necessary string
7529 conversion. If "flags" has the "SV_GMAGIC" bit set, does an
7530 "mg_get()" first. Normally used via the "SvIV(sv)" and
7531 "SvIVx(sv)" macros.
7532
7533 IV sv_2iv_flags(SV *const sv, const I32 flags)
7534
7535 sv_2mortal
7536 Marks an existing SV as mortal. The SV will be destroyed
7537 "soon", either by an explicit call to "FREETMPS", or by an
7538 implicit call at places such as statement boundaries.
7539 "SvTEMP()" is turned on which means that the SV's string buffer
7540 can be "stolen" if this SV is copied. See also "sv_newmortal"
7541 and "sv_mortalcopy".
7542
7543 SV* sv_2mortal(SV *const sv)
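
One way to picture mortality is as a deferred decrement of the reference count. The plain C sketch below models that idea with a small fixed-size stack; every name in it is hypothetical, and it deliberately ignores the "SvTEMP" buffer-stealing behaviour mentioned above.

```c
#include <assert.h>
#include <stddef.h>

typedef struct { int refcnt; } toy_sv;

#define TOY_TMPS_MAX 16
static toy_sv *toy_tmps[TOY_TMPS_MAX];
static size_t  toy_tmps_top = 0;

/* Models sv_2mortal: remember the value for a later decrement and
 * hand the same pointer back to the caller. */
static toy_sv *toy_2mortal(toy_sv *sv)
{
    assert(sv != NULL && toy_tmps_top < TOY_TMPS_MAX);
    toy_tmps[toy_tmps_top++] = sv;
    return sv;
}

/* Models FREETMPS: release one count for each remembered value. */
static void toy_freetmps(void)
{
    while (toy_tmps_top > 0)
        toy_tmps[--toy_tmps_top]->refcnt--;
}
```

In this model a value whose only count belongs to the temporaries stack reaches zero when the stack is flushed, while a value that something else also holds survives with one count fewer.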
7544
7545 sv_2nv_flags
7546 Return the numeric (NV) value of an SV, doing any necessary string or
7547 integer conversion. If "flags" has the "SV_GMAGIC" bit set,
7548 does an "mg_get()" first. Normally used via the "SvNV(sv)" and
7549 "SvNVx(sv)" macros.
7550
7551 NV sv_2nv_flags(SV *const sv, const I32 flags)
7552
7553 sv_2pvbyte
7554 Return a pointer to the byte-encoded representation of the SV,
7555 and set *lp to its length. May cause the SV to be downgraded
7556 from UTF-8 as a side-effect.
7557
7558 Usually accessed via the "SvPVbyte" macro.
7559
7560 char* sv_2pvbyte(SV *sv, STRLEN *const lp)
7561
7562 sv_2pvutf8
7563 Return a pointer to the UTF-8-encoded representation of the SV,
7564 and set *lp to its length. May cause the SV to be upgraded to
7565 UTF-8 as a side-effect.
7566
7567 Usually accessed via the "SvPVutf8" macro.
7568
7569 char* sv_2pvutf8(SV *sv, STRLEN *const lp)
7570
7571 sv_2pv_flags
7572 Returns a pointer to the string value of an SV, and sets *lp to
7573 its length. If flags has the "SV_GMAGIC" bit set, does an
7574 "mg_get()" first. Coerces "sv" to a string if necessary.
7575 Normally invoked via the "SvPV_flags" macro. "sv_2pv()" and
7576 "sv_2pv_nomg" usually end up here too.
7577
7578 char* sv_2pv_flags(SV *const sv, STRLEN *const lp,
7579 const I32 flags)
7580
7581 sv_2uv_flags
7582 Return the unsigned integer value of an SV, doing any necessary
7583 string conversion. If "flags" has the "SV_GMAGIC" bit set,
7584 does an "mg_get()" first. Normally used via the "SvUV(sv)" and
7585 "SvUVx(sv)" macros.
7586
7587 UV sv_2uv_flags(SV *const sv, const I32 flags)
7588
7589 sv_backoff
7590 Remove any string offset. You should normally use the
7591 "SvOOK_off" macro wrapper instead.
7592
7593 void sv_backoff(SV *const sv)
7594
7595 sv_bless
7596 Blesses an SV into a specified package. The SV must be an RV.
7597 The package must be designated by its stash (see "gv_stashpv").
7598 The reference count of the SV is unaffected.
7599
7600 SV* sv_bless(SV *const sv, HV *const stash)
7601
7602 sv_catpv
7603 Concatenates the "NUL"-terminated string onto the end of the
7604 string which is in the SV. If the SV has the UTF-8 status set,
7605 then the bytes appended should be valid UTF-8. Handles 'get'
7606 magic, but not 'set' magic. See "sv_catpv_mg".
7607
7608 void sv_catpv(SV *const sv, const char* ptr)
7609
7610 sv_catpvf
7611 Processes its arguments like "sprintf", and appends the
7612 formatted output to an SV. As with "sv_vcatpvfn" called with a
7613 non-null C-style variable argument list, argument reordering is
7614 not supported. If the appended data contains "wide" characters
7615 (including, but not limited to, SVs with a UTF-8 PV formatted
7616 with %s, and characters >255 formatted with %c), the original
7617 SV might get upgraded to UTF-8. Handles 'get' magic, but not
7618 'set' magic. See "sv_catpvf_mg". If the original SV was
7619 UTF-8, the pattern should be valid UTF-8; if the original SV
7620 was bytes, the pattern should be too.
7621
7622 void sv_catpvf(SV *const sv, const char *const pat,
7623 ...)
7624
7625 sv_catpvf_mg
7626 Like "sv_catpvf", but also handles 'set' magic.
7627
7628 void sv_catpvf_mg(SV *const sv,
7629 const char *const pat, ...)
7630
7631 sv_catpvn
7632 Concatenates the string onto the end of the string which is in
7633 the SV. "len" indicates the number of bytes to copy. If the SV
7634 has the UTF-8 status set, then the bytes appended should be
7635 valid UTF-8. Handles 'get' magic, but not 'set' magic. See
7636 "sv_catpvn_mg".
7637
7638 void sv_catpvn(SV *dsv, const char *sstr, STRLEN len)
7639
7640 sv_catpvn_flags
7641 Concatenates the string onto the end of the string which is in
7642 the SV. "len" indicates the number of bytes to copy.
7643
7644 By default, the string appended is assumed to be valid UTF-8 if
7645 the SV has the UTF-8 status set, and a string of bytes
7646 otherwise. One can force the appended string to be interpreted
7647 as UTF-8 by supplying the "SV_CATUTF8" flag, and as bytes by
7648 supplying the "SV_CATBYTES" flag; the SV or the string appended
7649 will be upgraded to UTF-8 if necessary.
7650
7651 If "flags" has the "SV_SMAGIC" bit set, calls "mg_set" on "dstr"
7652 afterwards if appropriate. "sv_catpvn" and "sv_catpvn_nomg"
7653 are implemented in terms of this function.
7654
7655 void sv_catpvn_flags(SV *const dstr,
7656 const char *sstr,
7657 const STRLEN len,
7658 const I32 flags)
7659
7660 sv_catpvn_nomg
7661 Like "sv_catpvn" but doesn't process magic.
7662
7663 void sv_catpvn_nomg(SV* sv, const char* ptr,
7664 STRLEN len)
7665
7666 sv_catpvs
7667 Like "sv_catpvn", but takes a literal string instead of a
7668 string/length pair.
7669
7670 void sv_catpvs(SV* sv, "literal string" s)
7671
7672 sv_catpvs_flags
7673 Like "sv_catpvn_flags", but takes a literal string instead of a
7674 string/length pair.
7675
7676 void sv_catpvs_flags(SV* sv, "literal string" s,
7677 I32 flags)
7678
7679 sv_catpvs_mg
7680 Like "sv_catpvn_mg", but takes a literal string instead of a
7681 string/length pair.
7682
7683 void sv_catpvs_mg(SV* sv, "literal string" s)
7684
7685 sv_catpvs_nomg
7686 Like "sv_catpvn_nomg", but takes a literal string instead of a
7687 string/length pair.
7688
7689 void sv_catpvs_nomg(SV* sv, "literal string" s)
7690
7691 sv_catpv_flags
7692 Concatenates the "NUL"-terminated string onto the end of the
7693 string which is in the SV. If the SV has the UTF-8 status set,
7694 then the bytes appended should be valid UTF-8. If "flags" has
7695 the "SV_SMAGIC" bit set, will "mg_set" on the modified SV if
7696 appropriate.
7697
7698 void sv_catpv_flags(SV *dstr, const char *sstr,
7699 const I32 flags)
7700
7701 sv_catpv_mg
7702 Like "sv_catpv", but also handles 'set' magic.
7703
7704 void sv_catpv_mg(SV *const sv, const char *const ptr)
7705
7706 sv_catpv_nomg
7707 Like "sv_catpv" but doesn't process magic.
7708
7709 void sv_catpv_nomg(SV* sv, const char* ptr)
7710
7711 sv_catsv
7712 Concatenates the string from SV "ssv" onto the end of the
7713 string in SV "dsv". If "ssv" is null, does nothing; otherwise
7714 modifies only "dsv". Handles 'get' magic on both SVs, but no
7715 'set' magic. See "sv_catsv_mg" and "sv_catsv_nomg".
7716
7717 void sv_catsv(SV *dstr, SV *sstr)
7718
7719 sv_catsv_flags
7720 Concatenates the string from SV "ssv" onto the end of the
7721 string in SV "dsv". If "ssv" is null, does nothing; otherwise
7722 modifies only "dsv". If "flags" has the "SV_GMAGIC" bit set,
7723 will call "mg_get" on both SVs if appropriate. If "flags" has
7724 the "SV_SMAGIC" bit set, "mg_set" will be called on the
7725 modified SV afterward, if appropriate. "sv_catsv",
7726 "sv_catsv_nomg", and "sv_catsv_mg" are implemented in terms of
7727 this function.
7728
7729 void sv_catsv_flags(SV *const dsv, SV *const ssv,
7730 const I32 flags)
7731
7732 sv_catsv_nomg
7733 Like "sv_catsv" but doesn't process magic.
7734
7735 void sv_catsv_nomg(SV* dsv, SV* ssv)
7736
7737 sv_chop Efficient removal of characters from the beginning of the
7738 string buffer. "SvPOK(sv)", or at least "SvPOKp(sv)", must be
7739 true and "ptr" must be a pointer to somewhere inside the string
7740 buffer. "ptr" becomes the first character of the adjusted
7741 string. Uses the "OOK" hack. On return, only "SvPOK(sv)" and
7742 "SvPOKp(sv)" among the "OK" flags will be true.
7743
7744 Beware: after this function returns, "ptr" and SvPVX_const(sv)
7745 may no longer refer to the same chunk of data.
7746
7747 The unfortunate similarity of this function's name to that of
7748 Perl's "chop" operator is strictly coincidental. This function
7749 works from the left; "chop" works from the right.
7750
7751 void sv_chop(SV *const sv, const char *const ptr)
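
The idea of removing a prefix without moving any bytes can be modelled in plain C as follows. This is a rough sketch of the offset technique only: "toy_str" and "toy_chop" are hypothetical, and the field layout is an assumption rather than a description of how perl stores the offset.

```c
#include <assert.h>
#include <stddef.h>

/* Toy string body: "pv" is the visible start, "cur" the used length,
 * "len" the allocated size counted from "pv", and "offset" the number
 * of bytes hidden in front of "pv". */
typedef struct {
    char  *pv;
    size_t cur;
    size_t len;
    size_t offset;
} toy_str;

/* Drop everything before "ptr" by advancing the start pointer. */
static void toy_chop(toy_str *s, const char *ptr)
{
    assert(s != NULL && ptr >= s->pv && ptr <= s->pv + s->cur);
    size_t delta = (size_t)(ptr - s->pv);
    s->pv     += delta;
    s->cur    -= delta;
    s->len    -= delta;
    s->offset += delta;
}
```

The original allocation can still be recovered in this model as "pv - offset", which is what a later "back off" step would need.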
7752
7753 sv_clear
7754 Clear an SV: call any destructors, free up any memory used by
7755 the body, and free the body itself. The SV's head is not
7756 freed, although its type is set to all 1's so that it won't
7757 inadvertently be assumed to be live during global destruction
7758 etc. This function should only be called when "REFCNT" is
7759 zero. Most of the time you'll want to call "sv_free()" (or its
7760 macro wrapper "SvREFCNT_dec") instead.
7761
7762 void sv_clear(SV *const orig_sv)
7763
7764 sv_cmp Compares the strings in two SVs. Returns -1, 0, or 1
7765 indicating whether the string in "sv1" is less than, equal to,
7766 or greater than the string in "sv2". Is UTF-8 and 'use bytes'
7767 aware, handles get magic, and will coerce its args to strings
7768 if necessary. See also "sv_cmp_locale".
7769
7770 I32 sv_cmp(SV *const sv1, SV *const sv2)
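
The return convention can be sketched with a three-way comparison of length-counted byte strings. The function below is a hypothetical plain C model: it ignores UTF-8, 'use bytes', magic and coercion, all of which the real function takes into account.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Three-way comparison of two length-counted byte strings, normalised
 * to -1, 0 or 1. Embedded NUL bytes take part like any other byte. */
static int toy_cmp(const char *a, size_t alen, const char *b, size_t blen)
{
    assert(a != NULL && b != NULL);
    size_t n = alen < blen ? alen : blen;
    int r = memcmp(a, b, n);
    if (r != 0)
        return r < 0 ? -1 : 1;
    if (alen == blen)
        return 0;
    return alen < blen ? -1 : 1;   /* the shorter string sorts first */
}
```

Because the lengths are passed explicitly, two buffers that differ only after an embedded NUL still compare as different.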
7771
7772 sv_cmp_flags
7773 Compares the strings in two SVs. Returns -1, 0, or 1
7774 indicating whether the string in "sv1" is less than, equal to,
7775 or greater than the string in "sv2". Is UTF-8 and 'use bytes'
7776 aware and will coerce its args to strings if necessary. If
7777 "flags" has the "SV_GMAGIC" bit set, it handles get magic. See
7778 also "sv_cmp_locale_flags".
7779
7780 I32 sv_cmp_flags(SV *const sv1, SV *const sv2,
7781 const U32 flags)
7782
7783 sv_cmp_locale
7784 Compares the strings in two SVs in a locale-aware manner. Is
7785 UTF-8 and 'use bytes' aware, handles get magic, and will coerce
7786 its args to strings if necessary. See also "sv_cmp".
7787
7788 I32 sv_cmp_locale(SV *const sv1, SV *const sv2)
7789
7790 sv_cmp_locale_flags
7791 Compares the strings in two SVs in a locale-aware manner. Is
7792 UTF-8 and 'use bytes' aware and will coerce its args to strings
7793 if necessary. If the flags contain "SV_GMAGIC", it handles get
7794 magic. See also "sv_cmp_flags".
7795
7796 I32 sv_cmp_locale_flags(SV *const sv1,
7797 SV *const sv2,
7798 const U32 flags)
7799
7800 sv_collxfrm
7801 This calls "sv_collxfrm_flags" with the SV_GMAGIC flag. See
7802 "sv_collxfrm_flags".
7803
7804 char* sv_collxfrm(SV *const sv, STRLEN *const nxp)
7805
7806 sv_collxfrm_flags
7807 Add Collate Transform magic to an SV if it doesn't already have
7808 it. If the flags contain "SV_GMAGIC", it handles get-magic.
7809
7810 Any scalar variable may carry "PERL_MAGIC_collxfrm" magic that
7811 contains the scalar data of the variable, but transformed to
7812 such a format that a normal memory comparison can be used to
7813 compare the data according to the locale settings.
7814
7815 char* sv_collxfrm_flags(SV *const sv,
7816 STRLEN *const nxp,
7817 I32 const flags)
7818
7819 sv_copypv
7820 Copies a stringified representation of the source SV into the
7821 destination SV. Automatically performs any necessary "mg_get"
7822 and coercion of numeric values into strings. Guaranteed to
7823 preserve "UTF8" flag even from overloaded objects. Similar in
7824 nature to "sv_2pv[_flags]" but operates directly on an SV
7825 instead of just the string. Mostly uses "sv_2pv_flags" to do
7826 its work, except when that would lose the UTF-8'ness of the PV.
7827
7828 void sv_copypv(SV *const dsv, SV *const ssv)
7829
7830 sv_copypv_flags
7831 Implementation of "sv_copypv" and "sv_copypv_nomg". Calls get
7832 magic iff flags has the "SV_GMAGIC" bit set.
7833
7834 void sv_copypv_flags(SV *const dsv, SV *const ssv,
7835 const I32 flags)
7836
7837 sv_copypv_nomg
7838 Like "sv_copypv", but doesn't invoke get magic first.
7839
7840 void sv_copypv_nomg(SV *const dsv, SV *const ssv)
7841
7842 SvCUR Returns the length of the string which is in the SV. See
7843 "SvLEN".
7844
7845 STRLEN SvCUR(SV* sv)
7846
7847 SvCUR_set
7848 Set the current length of the string which is in the SV. See
7849 "SvCUR" and "SvIV_set".
7850
7851 void SvCUR_set(SV* sv, STRLEN len)
7852
7853 sv_dec Auto-decrement of the value in the SV, doing string to numeric
7854 conversion if necessary. Handles 'get' magic and operator
7855 overloading.
7856
7857 void sv_dec(SV *const sv)
7858
7859 sv_dec_nomg
7860 Auto-decrement of the value in the SV, doing string to numeric
7861 conversion if necessary. Handles operator overloading. Skips
7862 handling 'get' magic.
7863
7864 void sv_dec_nomg(SV *const sv)
7865
7866 sv_derived_from
7867 Exactly like "sv_derived_from_pv", but doesn't take a "flags"
7868 parameter.
7869
7870 bool sv_derived_from(SV* sv, const char *const name)
7871
7872 sv_derived_from_pv
7873 Exactly like "sv_derived_from_pvn", but takes a "NUL"-terminated
7874 string instead of a string/length pair.
7875
7876 bool sv_derived_from_pv(SV* sv,
7877 const char *const name,
7878 U32 flags)
7879
7880 sv_derived_from_pvn
7881 Returns a boolean indicating whether the SV is derived from the
7882 specified class at the C level. To check derivation at the
7883 Perl level, call "isa()" as a normal Perl method.
7884
7885 Currently, the only significant value for "flags" is SVf_UTF8.
7886
7887 bool sv_derived_from_pvn(SV* sv,
7888 const char *const name,
7889 const STRLEN len, U32 flags)
7890
7891 sv_derived_from_sv
7892 Exactly like "sv_derived_from_pvn", but takes the name string
7893 in the form of an SV instead of a string/length pair.
7894
7895 bool sv_derived_from_sv(SV* sv, SV *namesv,
7896 U32 flags)
7897
7898 sv_does Like "sv_does_pv", but doesn't take a "flags" parameter.
7899
7900 bool sv_does(SV* sv, const char *const name)
7901
7902 sv_does_pv
7903 Like "sv_does_sv", but takes a "NUL"-terminated string instead of
7904 an SV.
7905
7906 bool sv_does_pv(SV* sv, const char *const name,
7907 U32 flags)
7908
7909 sv_does_pvn
7910 Like "sv_does_sv", but takes a string/length pair instead of an
7911 SV.
7912
7913 bool sv_does_pvn(SV* sv, const char *const name,
7914 const STRLEN len, U32 flags)
7915
7916 sv_does_sv
7917 Returns a boolean indicating whether the SV performs a
7918 specific, named role. The SV can be a Perl object or the name
7919 of a Perl class.
7920
7921 bool sv_does_sv(SV* sv, SV* namesv, U32 flags)
7922
7923 SvEND Returns a pointer to the spot just after the last character in
7924 the string which is in the SV, where there is usually a
7925 trailing "NUL" character (even though Perl scalars do not
7926 strictly require it). See "SvCUR". Access the character as
7927 "*(SvEND(sv))".
7928
7929 Warning: If "SvCUR" is equal to "SvLEN", then "SvEND" points to
7930 unallocated memory.
7931
7932 char* SvEND(SV* sv)
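
The warning above follows directly from the pointer arithmetic. The plain C sketch below models it with a hypothetical "toy_buf" structure whose fields play the roles of the string pointer, "SvCUR" and "SvLEN"; none of these names belong to the Perl API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct {
    char  *pv;   /* start of the buffer */
    size_t cur;  /* bytes in use        */
    size_t len;  /* bytes allocated     */
} toy_buf;

/* Models SvEND: one past the last used byte. */
static char *toy_end(const toy_buf *b)
{
    assert(b != NULL && b->cur <= b->len);
    return b->pv + b->cur;
}

/* True only when the byte at toy_end() lies inside the allocation. */
static bool toy_end_is_allocated(const toy_buf *b)
{
    return b->cur < b->len;
}
```

When the used length equals the allocated length, the end pointer may still be computed but must not be dereferenced.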
7933
7934 sv_eq Returns a boolean indicating whether the strings in the two SVs
7935 are identical. Is UTF-8 and 'use bytes' aware, handles get
7936 magic, and will coerce its args to strings if necessary.
7937
7938 I32 sv_eq(SV* sv1, SV* sv2)
7939
7940 sv_eq_flags
7941 Returns a boolean indicating whether the strings in the two SVs
7942 are identical. Is UTF-8 and 'use bytes' aware and coerces its
7943 args to strings if necessary. If "flags" has the "SV_GMAGIC"
7944 bit set, it handles get-magic, too.
7945
7946 I32 sv_eq_flags(SV* sv1, SV* sv2, const U32 flags)
7947
7948 sv_force_normal_flags
7949 Undo various types of fakery on an SV, where fakery means "more
7950 than" a string: if the PV is a shared string, make a private
7951 copy; if we're a ref, stop refing; if we're a glob, downgrade
7952 to an "xpvmg"; if we're a copy-on-write scalar, this is the on-
7953 write time when we do the copy, and is also used locally; if
7954 this is a vstring, drop the vstring magic. If "SV_COW_DROP_PV"
7955 is set then a copy-on-write scalar drops its PV buffer (if any)
7956 and becomes "SvPOK_off" rather than making a copy. (Used where
7957 this scalar is about to be set to some other value.) In
7958 addition, the "flags" parameter gets passed to
7959 "sv_unref_flags()" when unreffing. "sv_force_normal" calls
7960 this function with flags set to 0.
7961
7962 This function is expected to be used to signal to perl that
7963 this SV is about to be written to, and any extra book-keeping
7964 needs to be taken care of. Hence, it croaks on read-only
7965 values.
7966
7967 void sv_force_normal_flags(SV *const sv,
7968 const U32 flags)
7969
7970 sv_free Decrement an SV's reference count, and if it drops to zero,
7971 call "sv_clear" to invoke destructors and free up any memory
7972 used by the body; finally, deallocating the SV's head itself.
7973 Normally called via a wrapper macro "SvREFCNT_dec".
7974
7975 void sv_free(SV *const sv)
7976
7977 SvGAMAGIC
7978 Returns true if the SV has get magic or overloading. If either
7979 is true then the scalar is active data, and has the potential
7980 to return a new value every time it is accessed. Hence you
7981 must be careful to only read it once per user logical operation
7982 and work with that returned value. If neither is true then the
7983 scalar's value cannot change unless written to.
7984
7985 U32 SvGAMAGIC(SV* sv)
7986
7987 sv_gets Get a line from the filehandle and store it into the SV,
7988 optionally appending to the currently-stored string. If
7989 "append" is not 0, the line is appended to the SV instead of
7990 overwriting it. "append" should be set to the byte offset that
7991 the appended string should start at in the SV (typically,
7992 "SvCUR(sv)" is a suitable choice).
7993
7994 char* sv_gets(SV *const sv, PerlIO *const fp,
7995 I32 append)
7996
7997 sv_get_backrefs
7998 NOTE: this function is experimental and may change or be
7999 removed without notice.
8000
8001 If "sv" is the target of a weak reference then it returns the
8002 back references structure associated with the sv; otherwise
8003 return "NULL".
8004
8005 When returning a non-null result the type of the return is
8006 relevant. If it is an AV then the elements of the AV are the
8007 weak reference RVs which point at this item. If it is any other
8008 type then the item itself is the weak reference.
8009
8010 See also "Perl_sv_add_backref()", "Perl_sv_del_backref()",
8011 "Perl_sv_kill_backrefs()".
8012
8013 SV* sv_get_backrefs(SV *const sv)
8014
8015 SvGROW Expands the character buffer in the SV so that it has room for
8016 the indicated number of bytes (remember to reserve space for an
8017 extra trailing "NUL" character). Calls "sv_grow" to perform
8018 the expansion if necessary. Returns a pointer to the character
8019 buffer. SV must be of type >= "SVt_PV". One alternative is to
8020 call "sv_grow" if you are not sure of the type of SV.
8021
8022 You might mistakenly think that "len" is the number of bytes to
8023 add to the existing size, but instead it is the total size "sv"
8024 should be.
8025
8026 char * SvGROW(SV* sv, STRLEN len)
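
The "total size, not increment" rule can be sketched with a small growable buffer in plain C. Everything here is hypothetical ("toy_buf", "toy_grow", "toy_append") and uses "realloc" rather than perl's allocator; the point is only the arithmetic in the append helper.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef struct {
    char  *pv;   /* start of the buffer */
    size_t cur;  /* bytes in use        */
    size_t len;  /* bytes allocated     */
} toy_buf;

/* Ensure at least "total" bytes are allocated; never shrinks.
 * Returns the (possibly moved) buffer, or NULL on allocation failure. */
static char *toy_grow(toy_buf *b, size_t total)
{
    assert(b != NULL);
    if (total > b->len) {
        char *p = realloc(b->pv, total);
        if (p == NULL)
            return NULL;
        b->pv  = p;
        b->len = total;
    }
    return b->pv;
}

/* Append "n" bytes: the request is cur + n + 1 (room for the trailing
 * NUL), not just n. */
static int toy_append(toy_buf *b, const char *src, size_t n)
{
    assert(src != NULL);
    if (toy_grow(b, b->cur + n + 1) == NULL)
        return -1;
    memcpy(b->pv + b->cur, src, n);
    b->cur += n;
    b->pv[b->cur] = '\0';
    return 0;
}
```

Asking for a total smaller than the current allocation is a no-op in this model, which matches the "if necessary" wording above.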
8027
8028 sv_grow Expands the character buffer in the SV. If necessary, uses
8029 "sv_unref" and upgrades the SV to "SVt_PV". Returns a pointer
8030 to the character buffer. Use the "SvGROW" wrapper instead.
8031
8032 char* sv_grow(SV *const sv, STRLEN newlen)
8033
8034 sv_inc Auto-increment of the value in the SV, doing string to numeric
8035 conversion if necessary. Handles 'get' magic and operator
8036 overloading.
8037
8038 void sv_inc(SV *const sv)
8039
8040 sv_inc_nomg
8041 Auto-increment of the value in the SV, doing string to numeric
8042 conversion if necessary. Handles operator overloading. Skips
8043 handling 'get' magic.
8044
8045 void sv_inc_nomg(SV *const sv)
8046
8047 sv_insert
8048 Inserts and/or replaces a string at the specified offset/length
8049 within the SV. Similar to the Perl "substr()" function, with
8050 "littlelen" bytes starting at "little" replacing "len" bytes of
8051 the string in "bigstr" starting at "offset". Handles get
8052 magic.
8053
8054 void sv_insert(SV *const bigstr, const STRLEN offset,
8055 const STRLEN len,
8056 const char *const little,
8057 const STRLEN littlelen)
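
The replace-a-range behaviour described above can be sketched on an ordinary heap string. The function below is a hypothetical plain C model with the same parameter roles ("offset", "len", "little", "littlelen"); it reports an invalid range with a return code, which is an assumption of this sketch and not a statement about how the real function reacts.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Replace "len" bytes at "offset" in the NUL-terminated heap string
 * *big (current length *biglen) with "littlelen" bytes from "little".
 * Returns 0 on success, -1 on an invalid range or allocation failure. */
static int toy_insert(char **big, size_t *biglen,
                      size_t offset, size_t len,
                      const char *little, size_t littlelen)
{
    assert(big != NULL && *big != NULL && biglen != NULL);
    assert(little != NULL || littlelen == 0);
    if (offset > *biglen || len > *biglen - offset)
        return -1;

    size_t tail   = *biglen - offset - len;   /* bytes after the hole */
    size_t newlen = *biglen - len + littlelen;

    if (littlelen > len) {                    /* growing: enlarge first */
        char *p = realloc(*big, newlen + 1);
        if (p == NULL)
            return -1;
        *big = p;
    }
    /* Slide the tail to its new position, then drop in the replacement. */
    memmove(*big + offset + littlelen, *big + offset + len, tail);
    if (littlelen > 0)
        memcpy(*big + offset, little, littlelen);
    (*big)[newlen] = '\0';
    *biglen = newlen;
    return 0;
}
```

A "len" of zero gives a pure insertion and a "littlelen" of zero gives a pure deletion, which is the same pair of special cases that four-argument "substr()" offers.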
8058
8059 sv_insert_flags
8060 Same as "sv_insert", but the extra "flags" are passed to the
8061 "SvPV_force_flags" that applies to "bigstr".
8062
8063 void sv_insert_flags(SV *const bigstr,
8064 const STRLEN offset,
8065 const STRLEN len,
8066 const char *little,
8067 const STRLEN littlelen,
8068 const U32 flags)
8069
8070 SvIOK Returns a U32 value indicating whether the SV contains an
8071 integer.
8072
8073 U32 SvIOK(SV* sv)
8074
8075 SvIOK_notUV
8076 Returns a boolean indicating whether the SV contains a signed
8077 integer.
8078
8079 bool SvIOK_notUV(SV* sv)
8080
8081 SvIOK_off
8082 Unsets the IV status of an SV.
8083
8084 void SvIOK_off(SV* sv)
8085
8086 SvIOK_on
8087 Tells an SV that it is an integer.
8088
8089 void SvIOK_on(SV* sv)
8090
8091 SvIOK_only
8092 Tells an SV that it is an integer and disables all other "OK"
8093 bits.
8094
8095 void SvIOK_only(SV* sv)
8096
8097 SvIOK_only_UV
8098 Tells an SV that it is an unsigned integer and disables all
8099 other "OK" bits.
8100
8101 void SvIOK_only_UV(SV* sv)
8102
8103 SvIOKp Returns a U32 value indicating whether the SV contains an
8104 integer. Checks the private setting. Use "SvIOK" instead.
8105
8106 U32 SvIOKp(SV* sv)
8107
8108 SvIOK_UV
8109 Returns a boolean indicating whether the SV contains an integer
8110 that must be interpreted as unsigned. A non-negative integer
8111 whose value is within the range of both an IV and a UV may be
8112 flagged as either "SvUOK" or "SvIOK".
8113
8114 bool SvIOK_UV(SV* sv)
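
Why a small non-negative integer can legitimately carry either flag, while a large one must be read as unsigned, can be shown with a tagged integer in plain C. This is a toy model assuming 64-bit integers; "struct intval", "store_unsigned" and "store_signed" are hypothetical names and do not correspond to Perl's internal layout.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical tagged integer, loosely modelled on an IV/UV slot. */
struct intval {
    bool     is_uv;  /* true: `bits` must be read as unsigned */
    uint64_t bits;   /* raw 64-bit payload */
};

/* Store an unsigned value. Values that also fit a signed 64-bit
 * integer are stored without the unsigned flag; only larger values
 * need it. */
struct intval store_unsigned(uint64_t u)
{
    struct intval v;
    v.is_uv = (u > (uint64_t)INT64_MAX);
    v.bits  = u;
    return v;
}

/* Store a signed value; it is never flagged as unsigned. */
struct intval store_signed(int64_t i)
{
    struct intval v;
    v.is_uv = false;
    v.bits  = (uint64_t)i;
    return v;
}
```

In this model the payloads of "store_signed(-1)" and "store_unsigned(UINT64_MAX)" are bit-for-bit identical; only the flag tells a reader which interpretation is meant.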
8115
8116 sv_isa Returns a boolean indicating whether the SV is blessed into the
8117 specified class. This does not check for subtypes; use
8118 "sv_derived_from" to verify an inheritance relationship.
8119
8120 int sv_isa(SV* sv, const char *const name)
8121
8122 SvIsCOW Returns a U32 value indicating whether the SV is Copy-On-Write
8123 (either shared hash key scalars, or full Copy On Write scalars
8124 if 5.9.0 is configured for COW).
8125
8126 U32 SvIsCOW(SV* sv)
8127
8128 SvIsCOW_shared_hash
8129 Returns a boolean indicating whether the SV is Copy-On-Write
8130 shared hash key scalar.
8131
8132 bool SvIsCOW_shared_hash(SV* sv)
8133
8134 sv_isobject
8135 Returns a boolean indicating whether the SV is an RV pointing
8136 to a blessed object. If the SV is not an RV, or if the object
8137 is not blessed, then this will return false.
8138
8139 int sv_isobject(SV* sv)
8140
8141 SvIV Coerces the given SV to IV and returns it. The returned value
8142 in many circumstances will get stored in "sv"'s IV slot, but
8143 not in all cases. (Use "sv_setiv" to make sure it does).
8144
8145 See "SvIVx" for a version which guarantees to evaluate "sv"
8146 only once.
8147
8148 IV SvIV(SV* sv)
8149
8150 SvIV_nomg
8151 Like "SvIV" but doesn't process magic.
8152
8153 IV SvIV_nomg(SV* sv)
8154
8155 SvIV_set
8156 Set the value of the IV pointer in sv to val. It is possible
8157 to perform the same function of this macro with an lvalue
8158 assignment to "SvIVX". With future Perls, however, it will be
8159 more efficient to use "SvIV_set" instead of the lvalue
8160 assignment to "SvIVX".
8161
8162 void SvIV_set(SV* sv, IV val)
8163
8164 SvIVX Returns the raw value in the SV's IV slot, without checks or
8165 conversions. Only use when you are sure "SvIOK" is true. See
8166 also "SvIV".
8167
8168 IV SvIVX(SV* sv)
8169
8170 SvIVx Coerces the given SV to IV and returns it. The returned value
8171 in many circumstances will get stored in "sv"'s IV slot, but
8172 not in all cases. (Use "sv_setiv" to make sure it does).
8173
8174 This form guarantees to evaluate "sv" only once. Only use this
8175 if "sv" is an expression with side effects, otherwise use the
8176 more efficient "SvIV".
8177
8178 IV SvIVx(SV* sv)
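
The difference between a macro that may evaluate its argument more than once and a form that evaluates it exactly once is a general C issue, sketched below without any Perl headers. All names here ("CLAMP_NONNEG_MACRO", "clamp_nonneg_once", "next_value") are hypothetical and chosen only for illustration.

```c
/* A macro that mentions its argument twice evaluates it twice. */
#define CLAMP_NONNEG_MACRO(x) ((x) < 0 ? 0 : (x))

/* A function evaluates its argument exactly once, at the call site. */
static long clamp_nonneg_once(long x)
{
    return x < 0 ? 0 : x;
}

/* Hypothetical helper with a side effect: counts its own calls. */
static int calls = 0;
static long next_value(void)
{
    calls++;
    return 7;
}

/* Number of times next_value() runs when passed through the macro. */
int calls_with_macro(void)
{
    calls = 0;
    (void)CLAMP_NONNEG_MACRO(next_value());
    return calls;
}

/* Number of times next_value() runs when passed to the function. */
int calls_with_function(void)
{
    calls = 0;
    (void)clamp_nonneg_once(next_value());
    return calls;
}
```

An argument such as a stack pop or an incrementing pointer is the typical case where the evaluate-once form is needed; for a plain variable the cheaper form is fine.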
8179
8180 SvLEN Returns the size of the string buffer in the SV, not including
8181 any part attributable to "SvOOK". See "SvCUR".
8182
8183 STRLEN SvLEN(SV* sv)
8184
8185 sv_len Returns the length of the string in the SV. Handles magic and
8186 type coercion and sets the UTF8 flag appropriately. See also
8187 "SvCUR", which gives raw access to the "xpv_cur" slot.
8188
8189 STRLEN sv_len(SV *const sv)
8190
8191 SvLEN_set
8192 Set the size of the string buffer for the SV. See "SvLEN".
8193
8194 void SvLEN_set(SV* sv, STRLEN len)
8195
8196 sv_len_utf8
8197 Returns the number of characters in the string in an SV,
8198 counting wide UTF-8 bytes as a single character. Handles magic
8199 and type coercion.
8200
8201 STRLEN sv_len_utf8(SV *const sv)
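
Counting characters rather than bytes in UTF-8 data can be sketched in plain C by counting every byte that is not a continuation byte. This is an illustration of the idea only, not perl's implementation: "utf8_char_count" is a hypothetical function, and it assumes the input is well-formed UTF-8.

```c
#include <stddef.h>

/* Count characters in the UTF-8 byte string s[0..len) by counting
 * every byte that is not a continuation byte (10xxxxxx).
 * Assumes well-formed UTF-8; no validation is done. */
size_t utf8_char_count(const char *s, size_t len)
{
    size_t chars = 0;
    for (size_t i = 0; i < len; i++) {
        if (((unsigned char)s[i] & 0xC0) != 0x80)
            chars++;
    }
    return chars;
}
```

For example, the six bytes of "h\xC3\xA9llo" form five characters, because the two bytes C3 A9 encode a single character.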
8202
8203 sv_magic
8204 Adds magic to an SV. First upgrades "sv" to type "SVt_PVMG" if
8205 necessary, then adds a new magic item of type "how" to the head
8206 of the magic list.
8207
8208 See "sv_magicext" (which "sv_magic" now calls) for a
8209 description of the handling of the "name" and "namlen"
8210 arguments.
8211
8212 You need to use "sv_magicext" to add magic to "SvREADONLY" SVs
8213 and also to add more than one instance of the same "how".
8214
8215 void sv_magic(SV *const sv, SV *const obj,
8216 const int how, const char *const name,
8217 const I32 namlen)
8218
8219 sv_magicext
8220 Adds magic to an SV, upgrading it if necessary. Applies the
8221 supplied "vtable" and returns a pointer to the magic added.
8222
8223 Note that "sv_magicext" will allow things that "sv_magic" will
8224 not. In particular, you can add magic to "SvREADONLY" SVs, and
8225 add more than one instance of the same "how".
8226
8227 If "namlen" is greater than zero, a "savepvn" copy of "name" is
8228 stored. If "namlen" is zero, "name" is stored as-is. As a
8229 special case, if "(name && namlen == HEf_SVKEY)", "name" is
8230 assumed to contain an SV* and is stored as-is with its "REFCNT"
8231 incremented.
8232
8233 (This is now used as a subroutine by "sv_magic".)
8234
8235 MAGIC * sv_magicext(SV *const sv, SV *const obj,
8236 const int how,
8237 const MGVTBL *const vtbl,
8238 const char *const name,
8239 const I32 namlen)
8240
8241 SvMAGIC_set
8242 Set the value of the MAGIC pointer in "sv" to val. See
8243 "SvIV_set".
8244
8245 void SvMAGIC_set(SV* sv, MAGIC* val)
8246
8247 sv_mortalcopy
8248 Creates a new SV which is a copy of the original SV (using
8249 "sv_setsv"). The new SV is marked as mortal. It will be
8250 destroyed "soon", either by an explicit call to "FREETMPS", or
8251 by an implicit call at places such as statement boundaries.
8252 See also "sv_newmortal" and "sv_2mortal".
8253
8254 SV* sv_mortalcopy(SV *const oldsv)
8255
8256 sv_newmortal
8257 Creates a new null SV which is mortal. The reference count of
8258 the SV is set to 1. It will be destroyed "soon", either by an
8259 explicit call to "FREETMPS", or by an implicit call at places
8260 such as statement boundaries. See also "sv_mortalcopy" and
8261 "sv_2mortal".
8262
8263 SV* sv_newmortal()
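
The idea behind mortal SVs, a reference-count decrement that is deferred until a later cleanup point, can be modelled in a few lines of plain C. This is a toy model: "struct obj", "make_mortal" and "free_tmps" are hypothetical names, the fixed-size stack is a simplification, and nothing here uses the real Perl temporaries stack.

```c
#include <stddef.h>

#define MAX_TMPS 16

/* Hypothetical reference-counted object (not Perl's SV). */
struct obj {
    int refcnt;
    int freed;   /* set to 1 when the count reaches zero */
};

static struct obj *tmps[MAX_TMPS];  /* pending deferred decrements */
static size_t tmps_n = 0;

/* Mark `o` as mortal: its reference count will be dropped by one at
 * the next cleanup. Returns `o`, or NULL if the stack is full. */
struct obj *make_mortal(struct obj *o)
{
    if (tmps_n == MAX_TMPS)
        return NULL;
    tmps[tmps_n++] = o;
    return o;
}

/* Analogue of a cleanup point: apply all deferred decrements. */
void free_tmps(void)
{
    while (tmps_n > 0) {
        struct obj *o = tmps[--tmps_n];
        if (--o->refcnt == 0)
            o->freed = 1;  /* a real implementation would release memory */
    }
}
```

An object whose count was raised by another owner before the cleanup survives it, which is why a mortal value can be safely handed to a caller who chooses to keep it.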
8264
8265 sv_newref
8266 Increment an SV's reference count. Use the "SvREFCNT_inc()"
8267 wrapper instead.
8268
8269 SV* sv_newref(SV *const sv)
8270
8271 SvNIOK Returns a U32 value indicating whether the SV contains a
8272 number, integer or double.
8273
8274 U32 SvNIOK(SV* sv)
8275
8276 SvNIOK_off
8277 Unsets the NV/IV status of an SV.
8278
8279 void SvNIOK_off(SV* sv)
8280
8281 SvNIOKp Returns a U32 value indicating whether the SV contains a
8282 number, integer or double. Checks the private setting. Use
8283 "SvNIOK" instead.
8284
8285 U32 SvNIOKp(SV* sv)
8286
8287 SvNOK Returns a U32 value indicating whether the SV contains a
8288 double.
8289
8290 U32 SvNOK(SV* sv)
8291
8292 SvNOK_off
8293 Unsets the NV status of an SV.
8294
8295 void SvNOK_off(SV* sv)
8296
8297 SvNOK_on
8298 Tells an SV that it is a double.
8299
8300 void SvNOK_on(SV* sv)
8301
8302 SvNOK_only
8303 Tells an SV that it is a double and disables all other OK bits.
8304
8305 void SvNOK_only(SV* sv)
8306
8307 SvNOKp Returns a U32 value indicating whether the SV contains a
8308 double. Checks the private setting. Use "SvNOK" instead.
8309
8310 U32 SvNOKp(SV* sv)
8311
8312 SvNV Coerces the given SV to NV and returns it. The returned value
8313 in many circumstances will get stored in "sv"'s NV slot, but
8314 not in all cases. (Use "sv_setnv" to make sure it does).
8315
8316 See "SvNVx" for a version which guarantees to evaluate "sv"
8317 only once.
8318
8319 NV SvNV(SV* sv)
8320
8321 SvNV_nomg
8322 Like "SvNV" but doesn't process magic.
8323
8324 NV SvNV_nomg(SV* sv)
8325
8326 SvNV_set
8327 Set the value of the NV pointer in "sv" to val. See
8328 "SvIV_set".
8329
8330 void SvNV_set(SV* sv, NV val)
8331
8332 SvNVX Returns the raw value in the SV's NV slot, without checks or
8333 conversions. Only use when you are sure "SvNOK" is true. See
8334 also "SvNV".
8335
8336 NV SvNVX(SV* sv)
8337
8338 SvNVx Coerces the given SV to NV and returns it. The returned value
8339 in many circumstances will get stored in "sv"'s NV slot, but
8340 not in all cases. (Use "sv_setnv" to make sure it does).
8341
8342 This form guarantees to evaluate "sv" only once. Only use this
8343 if "sv" is an expression with side effects, otherwise use the
8344 more efficient "SvNV".
8345
8346 NV SvNVx(SV* sv)
8347
8348 SvOK Returns a U32 value indicating whether the value is defined.
8349 This is only meaningful for scalars.
8350
8351 U32 SvOK(SV* sv)
8352
8353 SvOOK Returns a U32 indicating whether the pointer to the string
8354 buffer is offset. This hack is used internally to speed up
8355 removal of characters from the beginning of a "SvPV". When
8356 "SvOOK" is true, then the start of the allocated string buffer
8357 is actually "SvOOK_offset()" bytes before "SvPVX". This offset
8358 used to be stored in "SvIVX", but is now stored within the
8359 spare part of the buffer.
8360
8361 U32 SvOOK(SV* sv)
8362
8363 SvOOK_offset
8364 Reads into "len" the offset from "SvPVX" back to the true start
8365 of the allocated buffer, which will be non-zero if "sv_chop"
8366 has been used to efficiently remove characters from start of
8367 the buffer. Implemented as a macro, which takes the address of
8368 "len", which must be of type "STRLEN". Evaluates "sv" more
8369 than once. Sets "len" to 0 if "SvOOK(sv)" is false.
8370
8371 void SvOOK_offset(SV*sv, STRLEN len)
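
The offset trick described for "SvOOK" (removing leading bytes by moving the start pointer instead of copying the rest of the string down) can be sketched in plain C. This is a toy model: "struct strbuf", "strbuf_chop_front" and "strbuf_offset" are hypothetical, and the struct keeps the true start in a separate field, which is not where perl stores the offset.

```c
#include <stddef.h>

/* Hypothetical string buffer with a movable start. */
struct strbuf {
    char  *alloc;  /* true start of the allocation */
    char  *pv;     /* current start of the string, comparable to SvPVX */
    size_t cur;    /* bytes in use from pv, comparable to SvCUR */
};

/* Drop `n` bytes from the front by advancing the start pointer rather
 * than moving the remaining bytes. Returns 0, or -1 if n exceeds the
 * current length. */
int strbuf_chop_front(struct strbuf *b, size_t n)
{
    if (n > b->cur)
        return -1;
    b->pv  += n;
    b->cur -= n;
    return 0;
}

/* Distance from the current start back to the true start of the
 * allocation; 0 if nothing has been chopped. */
size_t strbuf_offset(const struct strbuf *b)
{
    return (size_t)(b->pv - b->alloc);
}
```

Code that frees or reallocates such a buffer has to use the true start, not the advanced pointer, which is the reason the offset must be recoverable.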
8372
8373 SvPOK Returns a U32 value indicating whether the SV contains a
8374 character string.
8375
8376 U32 SvPOK(SV* sv)
8377
8378 SvPOK_off
8379 Unsets the PV status of an SV.
8380
8381 void SvPOK_off(SV* sv)
8382
8383 SvPOK_on
8384 Tells an SV that it is a string.
8385
8386 void SvPOK_on(SV* sv)
8387
8388 SvPOK_only
8389 Tells an SV that it is a string and disables all other "OK"
8390 bits. Will also turn off the UTF-8 status.
8391
8392 void SvPOK_only(SV* sv)
8393
8394 SvPOK_only_UTF8
8395 Tells an SV that it is a string and disables all other "OK"
8396 bits, and leaves the UTF-8 status as it was.
8397
8398 void SvPOK_only_UTF8(SV* sv)
8399
8400 SvPOKp Returns a U32 value indicating whether the SV contains a
8401 character string. Checks the private setting. Use "SvPOK"
8402 instead.
8403
8404 U32 SvPOKp(SV* sv)
8405
8406 sv_pos_b2u
8407 Converts the value pointed to by "offsetp" from a count of
8408 bytes from the start of the string, to a count of the
8409 equivalent number of UTF-8 chars. Handles magic and type
8410 coercion.
8411
8412 Use "sv_pos_b2u_flags" in preference, which correctly handles
8413 strings longer than 2Gb.
8414
8415 void sv_pos_b2u(SV *const sv, I32 *const offsetp)
8416
8417 sv_pos_b2u_flags
8418 Converts "offset" from a count of bytes from the start of the
8419 string, to a count of the equivalent number of UTF-8 chars.
8420 Handles type coercion. "flags" is passed to "SvPV_flags", and
8421 usually should be "SV_GMAGIC|SV_CONST_RETURN" to handle magic.
8422
8423 STRLEN sv_pos_b2u_flags(SV *const sv,
8424 STRLEN const offset, U32 flags)
8425
8426 sv_pos_u2b
8427 Converts the value pointed to by "offsetp" from a count of
8428 UTF-8 chars from the start of the string, to a count of the
8429 equivalent number of bytes; if "lenp" is non-zero, it does the
8430 same to "lenp", but this time starting from the offset, rather
8431 than from the start of the string. Handles magic and type
8432 coercion.
8433
8434 Use "sv_pos_u2b_flags" in preference, which correctly handles
8435 strings longer than 2Gb.
8436
8437 void sv_pos_u2b(SV *const sv, I32 *const offsetp,
8438 I32 *const lenp)
8439
8440 sv_pos_u2b_flags
8441 Converts the offset from a count of UTF-8 chars from the start
8442 of the string, to a count of the equivalent number of bytes; if
8443 "lenp" is non-zero, it does the same to "lenp", but this time
8444 starting from "offset", rather than from the start of the
8445 string. Handles type coercion. "flags" is passed to
8446 "SvPV_flags", and usually should be "SV_GMAGIC|SV_CONST_RETURN"
8447 to handle magic.
8448
8449 STRLEN sv_pos_u2b_flags(SV *const sv, STRLEN uoffset,
8450 STRLEN *const lenp, U32 flags)
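
The two conversions between character offsets and byte offsets can be sketched in plain C on a raw UTF-8 buffer. This is an illustration only: "chars_to_bytes" and "bytes_to_chars" are hypothetical functions, they assume well-formed UTF-8, they clamp out-of-range offsets to the end of the string, and they do no caching of positions.

```c
#include <stddef.h>

/* True if `c` starts a UTF-8 character (is not a 10xxxxxx byte). */
static int is_char_start(unsigned char c)
{
    return (c & 0xC0) != 0x80;
}

/* Convert a character offset into a byte offset within s[0..len).
 * Offsets past the end are clamped to len. */
size_t chars_to_bytes(const char *s, size_t len, size_t uoffset)
{
    size_t i = 0;
    while (i < len && uoffset > 0) {
        i++;  /* step over the lead byte */
        while (i < len && !is_char_start((unsigned char)s[i]))
            i++;  /* and over its continuation bytes */
        uoffset--;
    }
    return i;
}

/* Convert a byte offset into a character offset by counting the
 * character starts that lie before `boffset`. */
size_t bytes_to_chars(const char *s, size_t len, size_t boffset)
{
    size_t chars = 0;
    if (boffset > len)
        boffset = len;
    for (size_t i = 0; i < boffset; i++) {
        if (is_char_start((unsigned char)s[i]))
            chars++;
    }
    return chars;
}
```

A length measured in characters from a given offset can be converted the same way by calling "chars_to_bytes" on the tail of the string that begins at the already-converted byte offset.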
8451
8452 SvPV Returns a pointer to the string in the SV, or a stringified
8453 form of the SV if the SV does not contain a string. The SV may
8454 cache the stringified version becoming "SvPOK". Handles 'get'
8455 magic. The "len" variable will be set to the length of the
8456 string (this is a macro, so don't use &len). See also "SvPVx"
8457 for a version which guarantees to evaluate "sv" only once.
8458
8459 Note that there is no guarantee that the return value of
8460 "SvPV()" is equal to "SvPVX(sv)", or that "SvPVX(sv)" contains
8461 valid data, or that successive calls to "SvPV(sv)" will return
8462 the same pointer value each time. This is due to the way that
8463 things like overloading and Copy-On-Write are handled. In
8464 these cases, the return value may point to a temporary buffer
8465 or similar. If you absolutely need the "SvPVX" field to be
8466 valid (for example, if you intend to write to it), then see
8467 "SvPV_force".
8468
8469 char* SvPV(SV* sv, STRLEN len)
8470
8471 SvPVbyte
8472 Like "SvPV", but converts "sv" to byte representation first if
8473 necessary.
8474
8475 char* SvPVbyte(SV* sv, STRLEN len)
8476
8477 SvPVbyte_force
8478 Like "SvPV_force", but converts "sv" to byte representation
8479 first if necessary.
8480
8481 char* SvPVbyte_force(SV* sv, STRLEN len)
8482
8483 SvPVbyte_nolen
8484 Like "SvPV_nolen", but converts "sv" to byte representation
8485 first if necessary.
8486
8487 char* SvPVbyte_nolen(SV* sv)
8488
8489 sv_pvbyten_force
8490 The backend for the "SvPVbytex_force" macro. Always use the
8491 macro instead.
8492
8493 char* sv_pvbyten_force(SV *const sv, STRLEN *const lp)
8494
8495 SvPVbytex
8496 Like "SvPV", but converts "sv" to byte representation first if
8497 necessary. Guarantees to evaluate "sv" only once; use the more
8498 efficient "SvPVbyte" otherwise.
8499
8500 char* SvPVbytex(SV* sv, STRLEN len)
8501
8502 SvPVbytex_force
8503 Like "SvPV_force", but converts "sv" to byte representation
8504 first if necessary. Guarantees to evaluate "sv" only once; use
8505 the more efficient "SvPVbyte_force" otherwise.
8506
8507 char* SvPVbytex_force(SV* sv, STRLEN len)
8508
8509 SvPVCLEAR
8510 Ensures that sv is an SVt_PV and that its SvCUR is 0, and that
8511 it is properly null terminated. Equivalent to sv_setpvs(sv, ""),
8512 but more efficient.
8513
8514 char * SvPVCLEAR(SV* sv)
8515
8516 SvPV_force
8517 Like "SvPV" but will force the SV into containing a string
8518 ("SvPOK"), and only a string ("SvPOK_only"), by hook or by
8519 crook. You need force if you are going to update the "SvPVX"
8520 directly. Processes get magic.
8521
8522 Note that coercing an arbitrary scalar into a plain PV will
8523 potentially strip useful data from it. For example if the SV
8524 was "SvROK", then the referent will have its reference count
8525 decremented, and the SV itself may be converted to an "SvPOK"
8526 scalar with a string buffer containing a value such as
8527 "ARRAY(0x1234)".
8528
8529 char* SvPV_force(SV* sv, STRLEN len)
8530
8531 SvPV_force_nomg
8532 Like "SvPV_force", but doesn't process get magic.
8533
8534 char* SvPV_force_nomg(SV* sv, STRLEN len)
8535
8536 SvPV_nolen
8537 Like "SvPV" but doesn't set a length variable.
8538
8539 char* SvPV_nolen(SV* sv)
8540
8541 SvPV_nomg
8542 Like "SvPV" but doesn't process magic.
8543
8544 char* SvPV_nomg(SV* sv, STRLEN len)
8545
8546 SvPV_nomg_nolen
8547 Like "SvPV_nolen" but doesn't process magic.
8548
8549 char* SvPV_nomg_nolen(SV* sv)
8550
8551 sv_pvn_force
8552 Get a sensible string out of the SV somehow. A private
8553 implementation of the "SvPV_force" macro for compilers which
8554 can't cope with complex macro expressions. Always use the
8555 macro instead.
8556
8557 char* sv_pvn_force(SV* sv, STRLEN* lp)
8558
8559 sv_pvn_force_flags
8560 Get a sensible string out of the SV somehow. If "flags" has
8561 the "SV_GMAGIC" bit set, will "mg_get" on "sv" if appropriate,
8562 else not. "sv_pvn_force" and "sv_pvn_force_nomg" are
8563 implemented in terms of this function. You normally want to
8564 use the various wrapper macros instead: see "SvPV_force" and
8565 "SvPV_force_nomg".
8566
8567 char* sv_pvn_force_flags(SV *const sv,
8568 STRLEN *const lp,
8569 const I32 flags)
8570
8571 SvPV_set
8572 This is probably not what you want to use; you probably want
8573 "sv_usepvn_flags" or "sv_setpvn" or "sv_setpvs".
8574
8575 Set the value of the PV pointer in "sv" to the Perl allocated
8576 "NUL"-terminated string "val". See also "SvIV_set".
8577
8578 Remember to free the previous PV buffer. There are many things
8579 to check. Beware that the existing pointer may be involved in
8580 copy-on-write or other mischief, so do "SvOOK_off(sv)" and use
8581 "sv_force_normal" or "SvPV_force" (or check the "SvIsCOW" flag)
8582 first to make sure this modification is safe. Then finally, if
8583 it is not a COW, call "SvPV_free" to free the previous PV
8584 buffer.
8585
8586 void SvPV_set(SV* sv, char* val)
8587
8588 SvPVutf8
8589 Like "SvPV", but converts "sv" to UTF-8 first if necessary.
8590
8591 char* SvPVutf8(SV* sv, STRLEN len)
8592
8593 sv_pvutf8n_force
8594 The backend for the "SvPVutf8x_force" macro. Always use the
8595 macro instead.
8596
8597 char* sv_pvutf8n_force(SV *const sv, STRLEN *const lp)
8598
8599 SvPVutf8x
8600 Like "SvPV", but converts "sv" to UTF-8 first if necessary.
8601 Guarantees to evaluate "sv" only once; use the more efficient
8602 "SvPVutf8" otherwise.
8603
8604 char* SvPVutf8x(SV* sv, STRLEN len)
8605
8606 SvPVutf8x_force
8607 Like "SvPV_force", but converts "sv" to UTF-8 first if
8608 necessary. Guarantees to evaluate "sv" only once; use the more
8609 efficient "SvPVutf8_force" otherwise.
8610
8611 char* SvPVutf8x_force(SV* sv, STRLEN len)
8612
8613 SvPVutf8_force
8614 Like "SvPV_force", but converts "sv" to UTF-8 first if
8615 necessary.
8616
8617 char* SvPVutf8_force(SV* sv, STRLEN len)
8618
8619 SvPVutf8_nolen
8620 Like "SvPV_nolen", but converts "sv" to UTF-8 first if
8621 necessary.
8622
8623 char* SvPVutf8_nolen(SV* sv)
8624
8625 SvPVX Returns a pointer to the physical string in the SV. The SV
8626 must contain a string. Prior to 5.9.3 it is not safe to
8627 execute this macro unless the SV's type >= "SVt_PV".
8628
8629 This is also used to store the name of an autoloaded subroutine
8630 in an XS AUTOLOAD routine. See "Autoloading with XSUBs" in
8631 perlguts.
8632
8633 char* SvPVX(SV* sv)
8634
8635 SvPVx A version of "SvPV" which guarantees to evaluate "sv" only
8636 once. Only use this if "sv" is an expression with side
8637 effects, otherwise use the more efficient "SvPV".
8638
8639 char* SvPVx(SV* sv, STRLEN len)
8640
8641 SvREADONLY
8642 Returns true if the argument is readonly, otherwise returns
8643 false. Exposed to perl code via Internals::SvREADONLY().
8644
8645 U32 SvREADONLY(SV* sv)
8646
8647 SvREADONLY_off
8648 Mark an object as not-readonly. Exactly what this means depends
8649 on the object type. Exposed to perl code via
8650 Internals::SvREADONLY().
8651
8652 U32 SvREADONLY_off(SV* sv)
8653
8654 SvREADONLY_on
8655 Mark an object as readonly. Exactly what this means depends on
8656 the object type. Exposed to perl code via
8657 Internals::SvREADONLY().
8658
8659 U32 SvREADONLY_on(SV* sv)
8660
8661 sv_ref Returns a SV describing what the SV passed in is a reference
8662 to.
8663
8664 dst can be a SV to be set to the description or NULL, in which
8665 case a mortal SV is returned.
8666
8667 If ob is true and the SV is blessed, the description is the
8668 class name, otherwise it is the type of the SV, "SCALAR",
8669 "ARRAY" etc.
8670
8671 SV* sv_ref(SV *dst, const SV *const sv,
8672 const int ob)
8673
8674 SvREFCNT
8675 Returns the value of the object's reference count. Exposed to
8676 perl code via Internals::SvREFCNT().
8677
8678 U32 SvREFCNT(SV* sv)
8679
8680 SvREFCNT_dec
8681 Decrements the reference count of the given SV. "sv" may be
8682 "NULL".
8683
8684 void SvREFCNT_dec(SV* sv)
8685
8686 SvREFCNT_dec_NN
8687 Same as "SvREFCNT_dec", but can only be used if you know "sv"
8688 is not "NULL". Since we don't have to check the NULLness, it's
8689 faster and smaller.
8690
8691 void SvREFCNT_dec_NN(SV* sv)
8692
8693 SvREFCNT_inc
8694 Increments the reference count of the given SV, returning the
8695 SV.
8696
8697 All of the following "SvREFCNT_inc"* macros are optimized
8698 versions of "SvREFCNT_inc", and can be replaced with
8699 "SvREFCNT_inc".
8700
8701 SV* SvREFCNT_inc(SV* sv)
8702
8703 SvREFCNT_inc_NN
8704 Same as "SvREFCNT_inc", but can only be used if you know "sv"
8705 is not "NULL". Since we don't have to check the NULLness, it's
8706 faster and smaller.
8707
8708 SV* SvREFCNT_inc_NN(SV* sv)
8709
8710 SvREFCNT_inc_simple
8711 Same as "SvREFCNT_inc", but can only be used with expressions
8712 without side effects. Since we don't have to store a temporary
8713 value, it's faster.
8714
8715 SV* SvREFCNT_inc_simple(SV* sv)
8716
8717 SvREFCNT_inc_simple_NN
8718 Same as "SvREFCNT_inc_simple", but can only be used if you know
8719 "sv" is not "NULL". Since we don't have to check the NULLness,
8720 it's faster and smaller.
8721
8722 SV* SvREFCNT_inc_simple_NN(SV* sv)
8723
8724 SvREFCNT_inc_simple_void
8725 Same as "SvREFCNT_inc_simple", but can only be used if you
8726 don't need the return value. The macro doesn't need to return
8727 a meaningful value.
8728
8729 void SvREFCNT_inc_simple_void(SV* sv)
8730
8731 SvREFCNT_inc_simple_void_NN
8732 Same as "SvREFCNT_inc", but can only be used if you don't need
8733 the return value, and you know that "sv" is not "NULL". The
8734 macro doesn't need to return a meaningful value, or check for
8735 NULLness, so it's smaller and faster.
8736
8737 void SvREFCNT_inc_simple_void_NN(SV* sv)
8738
8739 SvREFCNT_inc_void
8740 Same as "SvREFCNT_inc", but can only be used if you don't need
8741 the return value. The macro doesn't need to return a
8742 meaningful value.
8743
8744 void SvREFCNT_inc_void(SV* sv)
8745
8746 SvREFCNT_inc_void_NN
8747 Same as "SvREFCNT_inc", but can only be used if you don't need
8748 the return value, and you know that "sv" is not "NULL". The
8749 macro doesn't need to return a meaningful value, or check for
8750 NULLness, so it's smaller and faster.
8751
8752 void SvREFCNT_inc_void_NN(SV* sv)
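
The trade-offs among the increment variants above (NULL check or not, return value or not) can be modelled in plain C. This is a sketch with hypothetical names ("struct node", "node_inc", "node_inc_nn", "node_inc_void"); the real macros operate on SVs and are not implemented as these functions.

```c
#include <stddef.h>

/* Hypothetical reference-counted object (not Perl's SV). */
struct node {
    unsigned refcnt;
};

/* General form: tolerates NULL and returns the pointer it was given. */
struct node *node_inc(struct node *n)
{
    if (n != NULL)
        n->refcnt++;
    return n;
}

/* "NN" form: the caller promises `n` is not NULL, so the check is
 * skipped. Passing NULL here is a bug in the caller. */
struct node *node_inc_nn(struct node *n)
{
    n->refcnt++;
    return n;
}

/* "void" form: the caller does not need the pointer back. */
void node_inc_void(struct node *n)
{
    if (n != NULL)
        n->refcnt++;
}
```

The general form is always a safe substitute for the others; the specialised forms only remove work that the caller has shown to be unnecessary.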
8753
8754 sv_reftype
8755 Returns a string describing what the SV is a reference to.
8756
8757 If ob is true and the SV is blessed, the string is the class
8758 name, otherwise it is the type of the SV, "SCALAR", "ARRAY"
8759 etc.
8760
8761 const char* sv_reftype(const SV *const sv, const int ob)
8762
8763 sv_replace
8764 Make the first argument a copy of the second, then delete the
8765 original. The target SV physically takes over ownership of the
8766 body of the source SV and inherits its flags; however, the
8767 target keeps any magic it owns, and any magic in the source is
8768 discarded. Note that this is a rather specialist SV copying
8769 operation; most of the time you'll want to use "sv_setsv" or
8770 one of its many macro front-ends.
8771
8772 void sv_replace(SV *const sv, SV *const nsv)
8773
8774 sv_report_used
8775 Dump the contents of all SVs not yet freed (debugging aid).
8776
8777 void sv_report_used()
8778
8779 sv_reset
8780 Underlying implementation for the "reset" Perl function. Note
8781 that the perl-level function is vaguely deprecated.
8782
8783 void sv_reset(const char* s, HV *const stash)
8784
8785 SvROK Tests if the SV is an RV.
8786
8787 U32 SvROK(SV* sv)
8788
8789 SvROK_off
8790 Unsets the RV status of an SV.
8791
8792 void SvROK_off(SV* sv)
8793
8794 SvROK_on
8795 Tells an SV that it is an RV.
8796
8797 void SvROK_on(SV* sv)
8798
8799 SvRV Dereferences an RV to return the SV.
8800
8801 SV* SvRV(SV* sv)
8802
8803 SvRV_set
8804 Set the value of the RV pointer in "sv" to val. See
8805 "SvIV_set".
8806
8807 void SvRV_set(SV* sv, SV* val)
8808
8809 sv_rvunweaken
8810 Unweaken a reference: Clear the "SvWEAKREF" flag on this RV;
8811 remove the backreference to this RV from the array of
8812 backreferences associated with the target SV, increment the
8813 refcount of the target. Silently ignores "undef" and warns on
8814 non-weak references.
8815
8816 SV* sv_rvunweaken(SV *const sv)
8817
8818 sv_rvweaken
8819 Weaken a reference: set the "SvWEAKREF" flag on this RV; give
8820 the referred-to SV "PERL_MAGIC_backref" magic if it hasn't
8821 already; and push a back-reference to this RV onto the array of
8822 backreferences associated with that magic. If the RV is
8823 magical, set magic will be called after the RV is cleared.
8824 Silently ignores "undef" and warns on already-weak references.
8825
8826 SV* sv_rvweaken(SV *const sv)
8827
8828 sv_setiv
8829 Copies an integer into the given SV, upgrading first if
8830 necessary. Does not handle 'set' magic. See also
8831 "sv_setiv_mg".
8832
8833 void sv_setiv(SV *const sv, const IV num)
8834
8835 sv_setiv_mg
8836 Like "sv_setiv", but also handles 'set' magic.
8837
8838 void sv_setiv_mg(SV *const sv, const IV i)
8839
8840 sv_setnv
8841 Copies a double into the given SV, upgrading first if
8842 necessary. Does not handle 'set' magic. See also
8843 "sv_setnv_mg".
8844
8845 void sv_setnv(SV *const sv, const NV num)
8846
8847 sv_setnv_mg
8848 Like "sv_setnv", but also handles 'set' magic.
8849
8850 void sv_setnv_mg(SV *const sv, const NV num)
8851
8852 sv_setpv
8853 Copies a string into an SV. The string must be terminated with
8854 a "NUL" character and must not contain embedded "NUL"s. Does not
8855 handle 'set' magic. See "sv_setpv_mg".
8856
8857 void sv_setpv(SV *const sv, const char *const ptr)
8858
8859 sv_setpvf
8860 Works like "sv_catpvf" but copies the text into the SV instead
8861 of appending it. Does not handle 'set' magic. See
8862 "sv_setpvf_mg".
8863
8864 void sv_setpvf(SV *const sv, const char *const pat,
8865 ...)
8866
8867 sv_setpvf_mg
8868 Like "sv_setpvf", but also handles 'set' magic.
8869
8870 void sv_setpvf_mg(SV *const sv,
8871 const char *const pat, ...)
8872
8873 sv_setpviv
8874 Copies an integer into the given SV, also updating its string
8875 value. Does not handle 'set' magic. See "sv_setpviv_mg".
8876
8877 void sv_setpviv(SV *const sv, const IV num)
8878
8879 sv_setpviv_mg
8880 Like "sv_setpviv", but also handles 'set' magic.
8881
8882 void sv_setpviv_mg(SV *const sv, const IV iv)
8883
8884 sv_setpvn
8885 Copies a string (possibly containing embedded "NUL" characters)
8886 into an SV. The "len" parameter indicates the number of bytes
8887 to be copied. If the "ptr" argument is NULL the SV will become
8888 undefined. Does not handle 'set' magic. See "sv_setpvn_mg".
8889
8890 void sv_setpvn(SV *const sv, const char *const ptr,
8891 const STRLEN len)
8892
8893 sv_setpvn_mg
8894 Like "sv_setpvn", but also handles 'set' magic.
8895
8896 void sv_setpvn_mg(SV *const sv,
8897 const char *const ptr,
8898 const STRLEN len)
8899
8900 sv_setpvs
8901 Like "sv_setpvn", but takes a literal string instead of a
8902 string/length pair.
8903
8904 void sv_setpvs(SV* sv, "literal string" s)
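
How a literal string can stand in for a string/length pair, and why the length-based form preserves embedded "NUL" bytes while the "NUL"-terminated form does not, can be sketched in plain C. The names "LIT_WITH_LEN", "copy_n" and "copy_z" are hypothetical and exist only for this illustration; they are not part of the Perl API.

```c
#include <stddef.h>
#include <string.h>

/* Expand a string literal into a pointer/length pair. The length
 * comes from sizeof, so embedded NULs are counted and the trailing
 * NUL is not. The "" s "" concatenation makes a non-literal argument
 * a compile-time error. */
#define LIT_WITH_LEN(s) ("" s ""), (sizeof(s) - 1)

/* Length-based copy: copies exactly `len` bytes and returns len.
 * `dst` must have room for `len` bytes. */
size_t copy_n(char *dst, const char *src, size_t len)
{
    memcpy(dst, src, len);
    return len;
}

/* NUL-terminated copy: stops at the first NUL byte and returns the
 * number of bytes before it. `dst` must have room for that many
 * bytes plus the terminator. */
size_t copy_z(char *dst, const char *src)
{
    size_t len = strlen(src);
    memcpy(dst, src, len + 1);
    return len;
}
```

Given the literal "ab\0cd", the length-based copy sees five bytes while the "NUL"-terminated copy sees two.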
8905
8906 sv_setpvs_mg
8907 Like "sv_setpvn_mg", but takes a literal string instead of a
8908 string/length pair.
8909
8910 void sv_setpvs_mg(SV* sv, "literal string" s)
8911
8912 sv_setpv_bufsize
8913 Sets the SV to be a string of cur bytes length, with at least
8914 len bytes available. Ensures that there is a null byte at
8915 SvEND. Returns a char * pointer to the SvPV buffer.
8916
8917 char * sv_setpv_bufsize(SV *const sv, const STRLEN cur,
8918 const STRLEN len)
8919
8920 sv_setpv_mg
8921 Like "sv_setpv", but also handles 'set' magic.
8922
8923 void sv_setpv_mg(SV *const sv, const char *const ptr)
8924
8925 sv_setref_iv
8926 Copies an integer into a new SV, optionally blessing the SV.
8927 The "rv" argument will be upgraded to an RV. That RV will be
8928 modified to point to the new SV. The "classname" argument
8929 indicates the package for the blessing. Set "classname" to
8930 "NULL" to avoid the blessing. The new SV will have a reference
8931 count of 1, and the RV will be returned.
8932
8933 SV* sv_setref_iv(SV *const rv,
8934 const char *const classname,
8935 const IV iv)
8936
8937 sv_setref_nv
8938 Copies a double into a new SV, optionally blessing the SV. The
8939 "rv" argument will be upgraded to an RV. That RV will be
8940 modified to point to the new SV. The "classname" argument
8941 indicates the package for the blessing. Set "classname" to
8942 "NULL" to avoid the blessing. The new SV will have a reference
8943 count of 1, and the RV will be returned.
8944
8945 SV* sv_setref_nv(SV *const rv,
8946 const char *const classname,
8947 const NV nv)
8948
8949 sv_setref_pv
8950 Copies a pointer into a new SV, optionally blessing the SV.
8951 The "rv" argument will be upgraded to an RV. That RV will be
8952 modified to point to the new SV. If the "pv" argument is
8953 "NULL", then "PL_sv_undef" will be placed into the SV. The
8954 "classname" argument indicates the package for the blessing.
8955 Set "classname" to "NULL" to avoid the blessing. The new SV
8956 will have a reference count of 1, and the RV will be returned.
8957
8958 Do not use with other Perl types such as HV, AV, SV, CV,
8959 because those objects will become corrupted by the pointer copy
8960 process.
8961
8962 Note that "sv_setref_pvn" copies the string while this copies
8963 the pointer.
8964
8965 SV* sv_setref_pv(SV *const rv,
8966 const char *const classname,
8967 void *const pv)
8968
8969 sv_setref_pvn
8970 Copies a string into a new SV, optionally blessing the SV. The
8971 length of the string must be specified with "n". The "rv"
8972 argument will be upgraded to an RV. That RV will be modified
8973 to point to the new SV. The "classname" argument indicates the
8974 package for the blessing. Set "classname" to "NULL" to avoid
8975 the blessing. The new SV will have a reference count of 1, and
8976 the RV will be returned.
8977
8978 Note that "sv_setref_pv" copies the pointer while this copies
8979 the string.
8980
8981 SV* sv_setref_pvn(SV *const rv,
8982 const char *const classname,
8983 const char *const pv,
8984 const STRLEN n)
8985
8986 sv_setref_pvs
8987 Like "sv_setref_pvn", but takes a literal string instead of a
8988 string/length pair.
8989
8990 SV * sv_setref_pvs(SV *const rv, const char *const classname, "literal string" s)
8991
8992 sv_setref_uv
8993 Copies an unsigned integer into a new SV, optionally blessing
8994 the SV. The "rv" argument will be upgraded to an RV. That RV
8995 will be modified to point to the new SV. The "classname"
8996 argument indicates the package for the blessing. Set
8997 "classname" to "NULL" to avoid the blessing. The new SV will
8998 have a reference count of 1, and the RV will be returned.
8999
9000 SV* sv_setref_uv(SV *const rv,
9001 const char *const classname,
9002 const UV uv)
9003
9004 sv_setsv
9005 Copies the contents of the source SV "sstr" into the destination
9006 SV "dstr". The source SV may be destroyed if it is mortal, so
9007 don't use this function if the source SV needs to be reused.
9008 Does not handle 'set' magic on destination SV. Calls 'get'
9009 magic on source SV. Loosely speaking, it performs a copy-by-
9010 value, obliterating any previous content of the destination.
9011
9012 You probably want to use one of the assortment of wrappers,
9013 such as "SvSetSV", "SvSetSV_nosteal", "SvSetMagicSV" and
9014 "SvSetMagicSV_nosteal".
9015
9016 void sv_setsv(SV *dstr, SV *sstr)
9017
9018 sv_setsv_flags
9019 Copies the contents of the source SV "sstr" into the destination
9020 SV "dstr". The source SV may be destroyed if it is mortal, so
9021 don't use this function if the source SV needs to be reused.
9022 Does not handle 'set' magic. Loosely speaking, it performs a
9023 copy-by-value, obliterating any previous content of the
9024 destination. If the "flags" parameter has the "SV_GMAGIC" bit
9025 set, "mg_get" is called on "sstr" if appropriate. If the
9026 "flags" parameter has the "SV_NOSTEAL" bit set then the buffers
9027 of temps will not be stolen. "sv_setsv" and "sv_setsv_nomg"
9028 are implemented in terms of this function.
9029
9030 You probably want to use one of the assortment of wrappers,
9031 such as "SvSetSV", "SvSetSV_nosteal", "SvSetMagicSV" and
9032 "SvSetMagicSV_nosteal".
9033
9034 This is the primary function for copying scalars, and most
9035 other copy-ish functions and macros use this underneath.
9036
9037 void sv_setsv_flags(SV *dstr, SV *sstr,
9038 const I32 flags)
9039
9040 sv_setsv_mg
9041 Like "sv_setsv", but also handles 'set' magic.
9042
9043 void sv_setsv_mg(SV *const dstr, SV *const sstr)
9044
9045 sv_setsv_nomg
9046 Like "sv_setsv" but doesn't process magic.
9047
9048 void sv_setsv_nomg(SV* dsv, SV* ssv)
9049
9050 sv_setuv
9051 Copies an unsigned integer into the given SV, upgrading first
9052 if necessary. Does not handle 'set' magic. See also
9053 "sv_setuv_mg".
9054
9055 void sv_setuv(SV *const sv, const UV num)
9056
9057 sv_setuv_mg
9058 Like "sv_setuv", but also handles 'set' magic.
9059
9060 void sv_setuv_mg(SV *const sv, const UV u)
9061
9062 sv_set_undef
9063 Equivalent to "sv_setsv(sv, &PL_sv_undef)", but more efficient.
9064 Doesn't handle set magic.
9065
9066 The perl equivalent is "$sv = undef;". Note that it doesn't
9067 free any string buffer, unlike "undef $sv".
9068
9069 Introduced in perl 5.25.12.
9070
9071 void sv_set_undef(SV *sv)
9072
9073 SvSTASH Returns the stash of the SV.
9074
9075 HV* SvSTASH(SV* sv)
9076
9077 SvSTASH_set
9078 Set the value of the STASH pointer in "sv" to val. See
9079 "SvIV_set".
9080
9081 void SvSTASH_set(SV* sv, HV* val)
9082
9083 SvTAINT Taints an SV if tainting is enabled, and if some input to the
9084 current expression is tainted--usually a variable, but possibly
9085 also implicit inputs such as locale settings. "SvTAINT"
9086 propagates that taintedness to the outputs of an expression in
9087 a pessimistic fashion; i.e., without paying attention to
9088 precisely which outputs are influenced by which inputs.
9089
9090 void SvTAINT(SV* sv)
9091
9092 SvTAINTED
9093 Checks to see if an SV is tainted. Returns TRUE if it is,
9094 FALSE if not.
9095
9096 bool SvTAINTED(SV* sv)
9097
9098 sv_tainted
9099 Test an SV for taintedness. Use "SvTAINTED" instead.
9100
9101 bool sv_tainted(SV *const sv)
9102
9103 SvTAINTED_off
9104 Untaints an SV. Be very careful with this routine, as it
9105 short-circuits some of Perl's fundamental security features.
9106 XS module authors should not use this function unless they
9107 fully understand all the implications of unconditionally
9108 untainting the value. Untainting should be done in the
9109 standard perl fashion, via a carefully crafted regexp, rather
9110 than directly untainting variables.
9111
9112 void SvTAINTED_off(SV* sv)
9113
9114 SvTAINTED_on
9115 Marks an SV as tainted if tainting is enabled.
9116
9117 void SvTAINTED_on(SV* sv)
9118
9119 SvTRUE Returns a boolean indicating whether Perl would evaluate the SV
9120 as true or false. See "SvOK" for a defined/undefined test.
9121 Handles 'get' magic unless the scalar is already "SvPOK",
9122 "SvIOK" or "SvNOK" (the public, not the private flags).
9123
9124 bool SvTRUE(SV* sv)
9125
9126 sv_true Returns true if the SV has a true value by Perl's rules. Use
9127 the "SvTRUE" macro instead, which may call "sv_true()" or may
9128 instead use an in-line version.
9129
9130 I32 sv_true(SV *const sv)
9131
9132 SvTRUE_nomg
9133 Returns a boolean indicating whether Perl would evaluate the SV
9134 as true or false. See "SvOK" for a defined/undefined test.
9135 Does not handle 'get' magic.
9136
9137 bool SvTRUE_nomg(SV* sv)
9138
9139 SvTYPE Returns the type of the SV. See "svtype".
9140
9141 svtype SvTYPE(SV* sv)
9142
9143 sv_unmagic
9144 Removes all magic of type "type" from an SV.
9145
9146 int sv_unmagic(SV *const sv, const int type)
9147
9148 sv_unmagicext
9149 Removes all magic of type "type" with the specified "vtbl" from
9150 an SV.
9151
9152 int sv_unmagicext(SV *const sv, const int type,
9153 MGVTBL *vtbl)
9154
9155 sv_unref_flags
9156 Unsets the RV status of the SV, and decrements the reference
9157 count of whatever was being referenced by the RV. This can
9158 almost be thought of as a reversal of "newSVrv". The "cflags"
9159 argument can contain "SV_IMMEDIATE_UNREF" to force the
9160 reference count to be decremented (otherwise the decrementing
9161 is conditional on the reference count being different from one
9162 or the reference being a readonly SV). See "SvROK_off".
9163
9164 void sv_unref_flags(SV *const ref, const U32 flags)
9165
9166 sv_untaint
9167 Untaint an SV. Use "SvTAINTED_off" instead.
9168
9169 void sv_untaint(SV *const sv)
9170
9171 SvUOK Returns a boolean indicating whether the SV contains an integer
9172 that must be interpreted as unsigned. A non-negative integer
9173 whose value is within the range of both an IV and a UV may be
9174 flagged as either "SvUOK" or "SvIOK".
9175
9176 bool SvUOK(SV* sv)
9177
9178 SvUPGRADE
9179 Used to upgrade an SV to a more complex form. Uses
9180 "sv_upgrade" to perform the upgrade if necessary. See
9181 "svtype".
9182
9183 void SvUPGRADE(SV* sv, svtype type)
9184
9185 sv_upgrade
9186 Upgrade an SV to a more complex form. Generally adds a new
9187 body type to the SV, then copies across as much information as
9188 possible from the old body. It croaks if the SV is already in
9189 a more complex form than requested. You generally want to use
9190 the "SvUPGRADE" macro wrapper, which checks the type before
9191 calling "sv_upgrade", and hence does not croak. See also
9192 "svtype".
9193
9194 void sv_upgrade(SV *const sv, svtype new_type)
9195
9196 sv_usepvn_flags
9197 Tells an SV to use "ptr" to find its string value. Normally
9198 the string is stored inside the SV, but sv_usepvn allows the SV
9199 to use an outside string. "ptr" should point to memory that
9200 was allocated by "Newx". It must be the start of a "Newx"-ed
9201 block of memory, and not a pointer to the middle of it (beware
9202 of "OOK" and copy-on-write), and not be from a non-"Newx"
9203 memory allocator like "malloc". The string length, "len", must
9204 be supplied. By default this function will "Renew" (i.e.
9205 realloc, move) the memory pointed to by "ptr", so that pointer
9206 should not be freed or used by the programmer after giving it
9207 to "sv_usepvn", and neither should any pointers from "behind"
9208 that pointer (e.g. ptr + 1) be used.
9209
9210 If "flags & SV_SMAGIC" is true, will call "SvSETMAGIC". If
9211 "flags & SV_HAS_TRAILING_NUL" is true, then "ptr[len]" must be
9212 "NUL", and the realloc will be skipped (i.e. the buffer is
9213 actually at least 1 byte longer than "len", and already meets
9214 the requirements for storing in "SvPVX").
9215
9216 void sv_usepvn_flags(SV *const sv, char* ptr,
9217 const STRLEN len,
9218 const U32 flags)
9219
9220 SvUTF8 Returns a U32 value indicating the UTF-8 status of an SV. If
9221 things are set-up properly, this indicates whether or not the
9222 SV contains UTF-8 encoded data. You should use this after a
9223 call to "SvPV()" or one of its variants, in case any call to
9224 string overloading updates the internal flag.
9225
9226 If you want to take into account the bytes pragma, use
9227 "DO_UTF8" instead.
9228
9229 U32 SvUTF8(SV* sv)
9230
9231 sv_utf8_decode
9232 If the PV of the SV is an octet sequence in Perl's extended
9233 UTF-8 and contains a multiple-byte character, the "SvUTF8" flag
9234 is turned on so that it looks like a character. If the PV
9235 contains only single-byte characters, the "SvUTF8" flag stays
9236 off. Scans PV for validity and returns FALSE if the PV is
9237 invalid UTF-8.
9238
9239 bool sv_utf8_decode(SV *const sv)
9240
9241 sv_utf8_downgrade
9242 Attempts to convert the PV of an SV from characters to bytes.
9243 If the PV contains a character that cannot fit in a byte, this
9244 conversion will fail; in this case, either returns false or, if
9245 "fail_ok" is not true, croaks.
9246
9247 This is not a general purpose Unicode to byte encoding
9248 interface: use the "Encode" extension for that.
9249
9250 bool sv_utf8_downgrade(SV *const sv,
9251 const bool fail_ok)
9252
9253 sv_utf8_encode
9254 Converts the PV of an SV to UTF-8, but then turns the "SvUTF8"
9255 flag off so that it looks like octets again.
9256
9257 void sv_utf8_encode(SV *const sv)
9258
9259 sv_utf8_upgrade
9260 Converts the PV of an SV to its UTF-8-encoded form. Forces the
9261 SV to string form if it is not already. Will "mg_get" on "sv"
9262 if appropriate. Always sets the "SvUTF8" flag to avoid future
9263 validity checks even if the whole string is the same in UTF-8
9264 as not. Returns the number of bytes in the converted string.
9265
9266 This is not a general purpose byte encoding to Unicode
9267 interface: use the Encode extension for that.
9268
9269 STRLEN sv_utf8_upgrade(SV *sv)
9270
9271 sv_utf8_upgrade_flags
9272 Converts the PV of an SV to its UTF-8-encoded form. Forces the
9273 SV to string form if it is not already. Always sets the SvUTF8
9274 flag to avoid future validity checks even if all the bytes are
9275 invariant in UTF-8. If "flags" has "SV_GMAGIC" bit set, will
9276 "mg_get" on "sv" if appropriate, else not.
9277
9278 The "SV_FORCE_UTF8_UPGRADE" flag is now ignored.
9279
9280 Returns the number of bytes in the converted string.
9281
9282 This is not a general purpose byte encoding to Unicode
9283 interface: use the Encode extension for that.
9284
9285 STRLEN sv_utf8_upgrade_flags(SV *const sv,
9286 const I32 flags)
9287
9288 sv_utf8_upgrade_flags_grow
9289 Like "sv_utf8_upgrade_flags", but has an additional parameter
9290 "extra", which is the number of unused bytes the string of "sv"
9291 is guaranteed to have free after it upon return. This allows
9292 the caller to reserve extra space that it intends to fill, to
9293 avoid extra grows.
9294
9295 "sv_utf8_upgrade", "sv_utf8_upgrade_nomg", and
9296 "sv_utf8_upgrade_flags" are implemented in terms of this
9297 function.
9298
9299 Returns the number of bytes in the converted string (not
9300 including the spares).
9301
9302 STRLEN sv_utf8_upgrade_flags_grow(SV *const sv,
9303 const I32 flags,
9304 STRLEN extra)
9305
9306 sv_utf8_upgrade_nomg
9307 Like "sv_utf8_upgrade", but doesn't do magic on "sv".
9308
9309 STRLEN sv_utf8_upgrade_nomg(SV *sv)
9310
9311 SvUTF8_off
9312 Unsets the UTF-8 status of an SV (the data is not changed, just
9313 the flag). Do not use frivolously.
9314
9315 void SvUTF8_off(SV *sv)
9316
9317 SvUTF8_on
9318 Turn on the UTF-8 status of an SV (the data is not changed,
9319 just the flag). Do not use frivolously.
9320
9321 void SvUTF8_on(SV *sv)
9322
9323 SvUV Coerces the given SV to UV and returns it. The returned value
9324 in many circumstances will get stored in "sv"'s UV slot, but
9325 not in all cases. (Use "sv_setuv" to make sure it does).
9326
9327 See "SvUVx" for a version which guarantees to evaluate "sv"
9328 only once.
9329
9330 UV SvUV(SV* sv)
9331
9332 SvUV_nomg
9333 Like "SvUV" but doesn't process magic.
9334
9335 UV SvUV_nomg(SV* sv)
9336
9337 SvUV_set
9338 Set the value of the UV pointer in "sv" to val. See
9339 "SvIV_set".
9340
9341 void SvUV_set(SV* sv, UV val)
9342
9343 SvUVX Returns the raw value in the SV's UV slot, without checks or
9344 conversions. Only use when you are sure "SvIOK" is true. See
9345 also "SvUV".
9346
9347 UV SvUVX(SV* sv)
9348
9349 SvUVx Coerces the given SV to UV and returns it. The returned value
9350 in many circumstances will get stored in "sv"'s UV slot, but
9351 not in all cases. (Use "sv_setuv" to make sure it does).
9352
9353 This form guarantees to evaluate "sv" only once. Only use this
9354 if "sv" is an expression with side effects, otherwise use the
9355 more efficient "SvUV".
9356
9357 UV SvUVx(SV* sv)
9358
9359 sv_vcatpvf
9360 Processes its arguments like "sv_vcatpvfn" called with a non-
9361 null C-style variable argument list, and appends the formatted
9362 output to an SV. Does not handle 'set' magic. See
9363 "sv_vcatpvf_mg".
9364
9365 Usually used via its frontend "sv_catpvf".
9366
9367 void sv_vcatpvf(SV *const sv, const char *const pat,
9368 va_list *const args)
9369
9370 sv_vcatpvfn
9371 void sv_vcatpvfn(SV *const sv, const char *const pat,
9372 const STRLEN patlen,
9373 va_list *const args,
9374 SV **const svargs,
9375 const Size_t sv_count,
9376 bool *const maybe_tainted)
9377
9378 sv_vcatpvfn_flags
9379 Processes its arguments like "vsprintf" and appends the
9380 formatted output to an SV. Uses an array of SVs if the C-style
9381 variable argument list is missing ("NULL"). Argument reordering
9382 (using format specifiers like "%2$d" or "%*2$d") is supported
9383 only when using an array of SVs; using a C-style "va_list"
9384 argument list with a format string that uses argument
9385 reordering will yield an exception.
9386
9387 When running with taint checks enabled, indicates via
9388 "maybe_tainted" if results are untrustworthy (often due to the
9389 use of locales).
9390
9391 If called as "sv_vcatpvfn" or flags has the "SV_GMAGIC" bit
9392 set, calls get magic.
9393
9394 It assumes that pat has the same utf8-ness as sv. It's the
9395 caller's responsibility to ensure that this is so.
9396
9397 Usually used via one of its frontends "sv_vcatpvf" and
9398 "sv_vcatpvf_mg".
9399
9400 void sv_vcatpvfn_flags(SV *const sv,
9401 const char *const pat,
9402 const STRLEN patlen,
9403 va_list *const args,
9404 SV **const svargs,
9405 const Size_t sv_count,
9406 bool *const maybe_tainted,
9407 const U32 flags)
9408
9409 sv_vcatpvf_mg
9410 Like "sv_vcatpvf", but also handles 'set' magic.
9411
9412 Usually used via its frontend "sv_catpvf_mg".
9413
9414 void sv_vcatpvf_mg(SV *const sv,
9415 const char *const pat,
9416 va_list *const args)
9417
9418 SvVOK Returns a boolean indicating whether the SV contains a
9419 v-string.
9420
9421 bool SvVOK(SV* sv)
9422
9423 sv_vsetpvf
9424 Works like "sv_vcatpvf" but copies the text into the SV instead
9425 of appending it. Does not handle 'set' magic. See
9426 "sv_vsetpvf_mg".
9427
9428 Usually used via its frontend "sv_setpvf".
9429
9430 void sv_vsetpvf(SV *const sv, const char *const pat,
9431 va_list *const args)
9432
9433 sv_vsetpvfn
9434 Works like "sv_vcatpvfn" but copies the text into the SV
9435 instead of appending it.
9436
9437 Usually used via one of its frontends "sv_vsetpvf" and
9438 "sv_vsetpvf_mg".
9439
9440 void sv_vsetpvfn(SV *const sv, const char *const pat,
9441 const STRLEN patlen,
9442 va_list *const args,
9443 SV **const svargs,
9444 const Size_t sv_count,
9445 bool *const maybe_tainted)
9446
9447 sv_vsetpvf_mg
9448 Like "sv_vsetpvf", but also handles 'set' magic.
9449
9450 Usually used via its frontend "sv_setpvf_mg".
9451
9452 void sv_vsetpvf_mg(SV *const sv,
9453 const char *const pat,
9454 va_list *const args)
9455
9457 "Unicode Support" in perlguts has an introduction to this API.
9458
9459 See also "Character classification", and "Character case changing".
9460 Various functions outside this section also work specially with
9461 Unicode. Search for the string "utf8" in this document.
9462
9463 BOM_UTF8
9464 This is a macro that evaluates to a string constant of the
9465 UTF-8 bytes that define the Unicode BYTE ORDER MARK (U+FEFF)
9466 for the platform that perl is compiled on. This allows code to
9467 use a mnemonic for this character that works on both ASCII and
9468 EBCDIC platforms. "sizeof(BOM_UTF8) - 1" can be used to get
9469 its length in bytes.
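
The "sizeof(BOM_UTF8) - 1" idiom can be illustrated with a standalone C sketch that skips a leading byte order mark. This sketch does not include perl.h: "SKETCH_BOM_UTF8" and "sketch_skip_bom" are hypothetical names, and the byte values assume an ASCII platform. An EBCDIC build would use different bytes, which is why the real macro exists.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for BOM_UTF8 on an ASCII platform: U+FEFF in UTF-8. */
#define SKETCH_BOM_UTF8 "\xEF\xBB\xBF"

/* Returns the number of leading bytes to skip: the BOM length if the buffer
 * starts with a BOM, otherwise 0. */
size_t sketch_skip_bom(const char *buf, size_t len)
{
    /* sizeof counts the terminating NUL of the literal, hence the "- 1". */
    const size_t bom_len = sizeof(SKETCH_BOM_UTF8) - 1;

    assert(buf != NULL);
    if (len >= bom_len && memcmp(buf, SKETCH_BOM_UTF8, bom_len) == 0)
        return bom_len;
    return 0;
}
```

In an extension the same pattern would presumably be written with the real "BOM_UTF8" macro in place of the stand-in.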
9470
9471 bytes_cmp_utf8
9472 Compares the sequence of characters (stored as octets) in "b",
9473 "blen" with the sequence of characters (stored as UTF-8) in
9474 "u", "ulen". Returns 0 if they are equal, -1 or -2 if the
9475 first string is less than the second string, +1 or +2 if the
9476 first string is greater than the second string.
9477
9478 -1 or +1 is returned if the shorter string was identical to the
9479 start of the longer string. -2 or +2 is returned if there was
9480 a difference between characters within the strings.
9481
9482 int bytes_cmp_utf8(const U8 *b, STRLEN blen,
9483 const U8 *u, STRLEN ulen)
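
To make the five-valued return convention concrete, here is a minimal standalone sketch, not the perl implementation. It assumes an ASCII platform, native bytes that are Latin-1, and well-formed UTF-8 in "u"; "sketch_bytes_cmp_utf8" is a hypothetical name. Any UTF-8 sequence that does not encode a code point below 256 is treated as sorting after every byte.

```c
#include <assert.h>
#include <stddef.h>

/* Compares Latin-1 bytes b[0..blen) with UTF-8 u[0..ulen).
 * Returns 0, -1/+1 (one string is a prefix of the other),
 * or -2/+2 (a character differs). */
int sketch_bytes_cmp_utf8(const unsigned char *b, size_t blen,
                          const unsigned char *u, size_t ulen)
{
    const unsigned char *bend, *uend;

    assert(b != NULL && u != NULL);
    bend = b + blen;
    uend = u + ulen;

    while (b < bend && u < uend) {
        unsigned int c = *u++;

        if (c >= 0x80) {
            /* Only the two-byte forms led by C2 or C3 encode 0x80..0xFF. */
            if ((c == 0xC2 || c == 0xC3) && u < uend && (*u & 0xC0) == 0x80)
                c = ((c & 0x1F) << 6) | (unsigned int)(*u++ & 0x3F);
            else
                return -2;  /* assumed: code point above 255, sorts later */
        }
        if (*b != c)
            return *b < c ? -2 : +2;
        b++;
    }
    if (b == bend && u == uend)
        return 0;
    return b < bend ? +1 : -1;  /* the longer string is the greater one */
}
```

A caller that only needs equality can test for 0; one that needs ordering can test the sign and ignore the magnitude.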
9484
9485 bytes_from_utf8
9486 NOTE: this function is experimental and may change or be
9487 removed without notice.
9488
9489 Converts a potentially UTF-8 encoded string "s" of length *lenp
9490 into native byte encoding. On input, the boolean *is_utf8p
9491 gives whether or not "s" is actually encoded in UTF-8.
9492
9493 Unlike "utf8_to_bytes" but like "bytes_to_utf8", this is non-
9494 destructive of the input string.
9495
9496 Do nothing if *is_utf8p is 0, or if there are code points in
9497 the string not expressible in native byte encoding. In these
9498 cases, *is_utf8p and *lenp are unchanged, and the return value
9499 is the original "s".
9500
9501 Otherwise, *is_utf8p is set to 0, and the return value is a
9502 pointer to a newly created string containing a downgraded copy
9503 of "s", and whose length is returned in *lenp, updated. The
9504 new string is "NUL"-terminated. The caller is responsible for
9505 arranging for the memory used by this string to get freed.
9506
9507 Upon successful return, the number of variants in the string
9508 can be computed by having saved the value of *lenp before the
9509 call, and subtracting the after-call value of *lenp from it.
9510
9511 U8* bytes_from_utf8(const U8 *s, STRLEN *lenp,
9512 bool *is_utf8p)
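
The contract described above (return the original pointer when nothing can be done, otherwise return a new NUL-terminated copy and update *lenp and *is_utf8p) can be sketched in standalone C as follows. This is an illustration under stated assumptions, not the perl function: it assumes an ASCII platform with Latin-1 as the native encoding, uses malloc/free rather than perl's allocator, and returns NULL on allocation failure. Both function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

unsigned char *sketch_bytes_from_utf8(const unsigned char *s, size_t *lenp,
                                      bool *is_utf8p)
{
    size_t i, n;
    unsigned char *out, *d;

    assert(s != NULL && lenp != NULL && is_utf8p != NULL);
    n = *lenp;
    if (!*is_utf8p)
        return (unsigned char *)s;          /* nothing to do */

    /* First pass: give up if any character does not fit in one byte. */
    for (i = 0; i < n; i++) {
        if (s[i] < 0x80)
            continue;
        if ((s[i] != 0xC2 && s[i] != 0xC3) || i + 1 >= n
            || (s[i + 1] & 0xC0) != 0x80)
            return (unsigned char *)s;      /* inputs left unchanged */
        i++;                                /* skip the continuation byte */
    }

    out = malloc(n + 1);
    if (out == NULL)
        return NULL;                        /* sketch-only failure mode */
    d = out;
    for (i = 0; i < n; i++) {
        if (s[i] < 0x80) {
            *d++ = s[i];
        } else {
            *d++ = (unsigned char)(((s[i] & 0x1F) << 6) | (s[i + 1] & 0x3F));
            i++;
        }
    }
    *d = '\0';
    *lenp = (size_t)(d - out);
    *is_utf8p = false;
    return out;
}

/* Number of variants removed (length before minus length after),
 * or -1 if the string was returned unconverted. */
long sketch_downgrade_variants(const unsigned char *s, size_t len)
{
    size_t after = len;
    bool is_utf8 = true;
    unsigned char *out = sketch_bytes_from_utf8(s, &after, &is_utf8);

    if (out == NULL || out == s)
        return -1;
    free(out);
    return (long)(len - after);
}
```

Comparing the returned pointer with "s" is how this sketch tells the two outcomes apart; checking *is_utf8p after the call would serve equally well.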
9513
9514 bytes_to_utf8
9515 NOTE: this function is experimental and may change or be
9516 removed without notice.
9517
9518 Converts a string "s" of length *lenp bytes from the native
9519 encoding into UTF-8. Returns a pointer to the newly-created
9520 string, and sets *lenp to reflect the new length in bytes. The
9521 caller is responsible for arranging for the memory used by this
9522 string to get freed.
9523
9524 Upon successful return, the number of variants in the string
9525 can be computed by having saved the value of *lenp before the
9526 call, and subtracting it from the after-call value of *lenp.
9527
9528 A "NUL" character will be written after the end of the string.
9529
9530 If you want to convert to UTF-8 from encodings other than the
9531 native (Latin1 or EBCDIC), see "sv_recode_to_utf8"().
9532
9533 U8* bytes_to_utf8(const U8 *s, STRLEN *lenp)
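
The variant-counting arithmetic described above can be sketched in standalone C. This is not the perl function: it assumes an ASCII platform with Latin-1 as the native encoding, uses malloc/free in place of perl's allocator, and both function names are hypothetical. Each byte at or above 0x80 is a variant and grows from one byte to two, so the length difference equals the variant count.

```c
#include <assert.h>
#include <stdlib.h>

/* Returns a newly allocated, NUL-terminated UTF-8 copy of the Latin-1 bytes
 * s[0..*lenp) and updates *lenp to the new byte length. */
unsigned char *sketch_bytes_to_utf8(const unsigned char *s, size_t *lenp)
{
    size_t i, n;
    unsigned char *out, *d;

    assert(s != NULL && lenp != NULL);
    n = *lenp;
    out = malloc(2 * n + 1);        /* worst case: every byte doubles */
    if (out == NULL)
        return NULL;
    d = out;
    for (i = 0; i < n; i++) {
        if (s[i] < 0x80) {
            *d++ = s[i];            /* invariant: copied as-is */
        } else {
            *d++ = (unsigned char)(0xC0 | (s[i] >> 6));
            *d++ = (unsigned char)(0x80 | (s[i] & 0x3F));
        }
    }
    *d = '\0';
    *lenp = (size_t)(d - out);
    return out;
}

/* Number of variants: length after the call minus length before it.
 * Returns -1 if allocation failed. */
long sketch_upgrade_variants(const unsigned char *s, size_t len)
{
    size_t after = len;
    unsigned char *out = sketch_bytes_to_utf8(s, &after);

    if (out == NULL)
        return -1;
    free(out);
    return (long)(after - len);
}
```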
9534
9535 DO_UTF8 Returns a bool giving whether or not the PV in "sv" is to be
9536 treated as being encoded in UTF-8.
9537
9538 You should use this after a call to "SvPV()" or one of its
9539 variants, in case any call to string overloading updates the
9540 internal UTF-8 encoding flag.
9541
9542 bool DO_UTF8(SV* sv)
9543
9544 foldEQ_utf8
9545 Returns true if the leading portions of the strings "s1" and
9546 "s2" (either or both of which may be in UTF-8) are the same
9547 case-insensitively; false otherwise. How far into the strings
9548 to compare is determined by other input parameters.
9549
9550 If "u1" is true, the string "s1" is assumed to be in
9551 UTF-8-encoded Unicode; otherwise it is assumed to be in native
9552 8-bit encoding. Correspondingly for "u2" with respect to "s2".
9553
9554 If the byte length "l1" is non-zero, it says how far into "s1"
9555 to check for fold equality. In other words, "s1"+"l1" will be
9556 used as a goal to reach. The scan will not be considered to be
9557 a match unless the goal is reached, and scanning won't continue
9558 past that goal. Correspondingly for "l2" with respect to "s2".
9559
9560 If "pe1" is non-"NULL" and the pointer it points to is not
9561 "NULL", that pointer is considered an end pointer to the
9562 position 1 byte past the maximum point in "s1" beyond which
9563 scanning will not continue under any circumstances. (This
9564 routine assumes that UTF-8 encoded input strings are not
9565 malformed; malformed input can cause it to read past "pe1").
9566 This means that if both "l1" and "pe1" are specified, and "pe1"
9567 is less than "s1"+"l1", the match will never be successful
9568 because it can never get as far as its goal (and in fact is
9569 asserted against). Correspondingly for "pe2" with respect to
9570 "s2".
9571
9572 At least one of "s1" and "s2" must have a goal (at least one of
9573 "l1" and "l2" must be non-zero), and if both do, both have to
9574 be reached for a successful match. Also, if the fold of a
9575 character is multiple characters, all of them must be matched
9576 (see tr21 reference below for 'folding').
9577
9578 Upon a successful match, if "pe1" is non-"NULL", it will be set
9579 to point to the beginning of the next character of "s1" beyond
9580 what was matched. Correspondingly for "pe2" and "s2".
9581
9582 For case-insensitiveness, the "casefolding" of Unicode is used
9583 instead of upper/lowercasing both the characters; see
9584 <http://www.unicode.org/unicode/reports/tr21/> (Case Mappings).
9585
9586 I32 foldEQ_utf8(const char *s1, char **pe1, UV l1,
9587 bool u1, const char *s2, char **pe2,
9588 UV l2, bool u2)
9589
9590 is_ascii_string
9591 This is a misleadingly-named synonym for
9592 "is_utf8_invariant_string". On ASCII-ish platforms, the name
9593 isn't misleading: the ASCII-range characters are exactly the
9594 UTF-8 invariants. But EBCDIC machines have more invariants
9595 than just the ASCII characters, so "is_utf8_invariant_string"
9596 is preferred.
9597
9598 bool is_ascii_string(const U8* const s, STRLEN len)
9599
9600 is_c9strict_utf8_string
9601 Returns TRUE if the first "len" bytes of string "s" form a
9602 valid UTF-8-encoded string that conforms to Unicode Corrigendum
9603 #9 <http://www.unicode.org/versions/corrigendum9.html>;
9604 otherwise it returns FALSE. If "len" is 0, it will be
9605 calculated using strlen(s) (which means if you use this option,
9606 that "s" can't have embedded "NUL" characters and has to have a
9607 terminating "NUL" byte). Note that all characters being ASCII
9608 constitute 'a valid UTF-8 string'.
9609
9610 This function returns FALSE for strings containing any code
9611 points above the Unicode max of 0x10FFFF or surrogate code
9612 points, but accepts non-character code points per Corrigendum
9613 #9 <http://www.unicode.org/versions/corrigendum9.html>.
9614
9615 See also "is_utf8_invariant_string",
9616 "is_utf8_invariant_string_loc", "is_utf8_string",
9617 "is_utf8_string_flags", "is_utf8_string_loc",
9618 "is_utf8_string_loc_flags", "is_utf8_string_loclen",
9619 "is_utf8_string_loclen_flags", "is_utf8_fixed_width_buf_flags",
9620 "is_utf8_fixed_width_buf_loc_flags",
9621 "is_utf8_fixed_width_buf_loclen_flags",
9622 "is_strict_utf8_string", "is_strict_utf8_string_loc",
9623 "is_strict_utf8_string_loclen", "is_c9strict_utf8_string_loc",
9624 and "is_c9strict_utf8_string_loclen".
9625
9626 bool is_c9strict_utf8_string(const U8 *s, STRLEN len)
9627
9628 is_c9strict_utf8_string_loc
9629 Like "is_c9strict_utf8_string" but stores the location of the
9630 failure (in the case of "utf8ness failure") or the location
9631 "s"+"len" (in the case of "utf8ness success") in the "ep"
9632 pointer.
9633
9634 See also "is_c9strict_utf8_string_loclen".
9635
9636 bool is_c9strict_utf8_string_loc(const U8 *s,
9637 STRLEN len,
9638 const U8 **ep)
9639
9640 is_c9strict_utf8_string_loclen
9641 Like "is_c9strict_utf8_string" but stores the location of the
9642 failure (in the case of "utf8ness failure") or the location
9643 "s"+"len" (in the case of "utf8ness success") in the "ep"
9644 pointer, and the number of UTF-8 encoded characters in the "el"
9645 pointer.
9646
9647 See also "is_c9strict_utf8_string_loc".
9648
9649 bool is_c9strict_utf8_string_loclen(const U8 *s,
9650 STRLEN len,
9651 const U8 **ep,
9652 STRLEN *el)
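
The roles of "ep" and "el" can be illustrated with a small standalone validator. It is a sketch only: it handles standard one- to four-byte UTF-8 on an ASCII platform, rejects overlong forms, surrogates, and code points above 0x10FFFF, and accepts non-characters, which is one reading of the Corrigendum #9 rules described above. It does not emulate the "len" of 0 convention, and every name in it is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Byte length of the character at s (looking no further than e - 1),
 * or 0 if it is not acceptable. */
size_t sketch_c9strict_char(const unsigned char *s, const unsigned char *e)
{
    unsigned long cp, min;
    size_t need, i;

    if (s >= e) return 0;
    if (s[0] < 0x80) return 1;
    if      ((s[0] & 0xE0) == 0xC0) { need = 2; cp = s[0] & 0x1F; min = 0x80; }
    else if ((s[0] & 0xF0) == 0xE0) { need = 3; cp = s[0] & 0x0F; min = 0x800; }
    else if ((s[0] & 0xF8) == 0xF0) { need = 4; cp = s[0] & 0x07; min = 0x10000; }
    else return 0;                          /* stray continuation or bad start */

    if ((size_t)(e - s) < need) return 0;   /* truncated sequence */
    for (i = 1; i < need; i++) {
        if ((s[i] & 0xC0) != 0x80) return 0;
        cp = (cp << 6) | (unsigned long)(s[i] & 0x3F);
    }
    if (cp < min || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
        return 0;                           /* overlong, too large, surrogate */
    return need;
}

/* On return *ep is s + len on success or the failure location otherwise;
 * *el is the number of characters accepted before that point. */
bool sketch_c9strict_string_loclen(const unsigned char *s, size_t len,
                                   const unsigned char **ep, size_t *el)
{
    const unsigned char *e;
    size_t chars = 0;

    assert(s != NULL);
    e = s + len;
    while (s < e) {
        size_t n = sketch_c9strict_char(s, e);
        if (n == 0) break;
        s += n;
        chars++;
    }
    if (ep) *ep = s;
    if (el) *el = chars;
    return s == e;
}

/* Convenience wrappers exposing the two out-parameters as return values. */
size_t sketch_c9strict_failure_offset(const unsigned char *s, size_t len)
{
    const unsigned char *ep = s;
    sketch_c9strict_string_loclen(s, len, &ep, NULL);
    return (size_t)(ep - s);
}

size_t sketch_c9strict_char_count(const unsigned char *s, size_t len)
{
    size_t el = 0;
    sketch_c9strict_string_loclen(s, len, NULL, &el);
    return el;
}
```

For the five bytes of "caf" followed by C3 A9, the failure offset equals the length (success) and the character count is 4; for "ab" followed by the encoded surrogate ED A0 80, the offset is 2.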
9653
9654 isC9_STRICT_UTF8_CHAR
9655 Evaluates to non-zero if the first few bytes of the string
9656 starting at "s" and looking no further than "e - 1" are well-
9657 formed UTF-8 that represents some Unicode non-surrogate code
9658 point; otherwise it evaluates to 0. If non-zero, the value
9659 gives how many bytes starting at "s" comprise the code point's
9660 representation. Any bytes remaining before "e", but beyond the
9661 ones needed to form the first code point in "s", are not
9662 examined.
9663
9664 The largest acceptable code point is the Unicode maximum
9665 0x10FFFF. This differs from "isSTRICT_UTF8_CHAR" only in that
9666 it accepts non-character code points. This corresponds to
9667 Unicode Corrigendum #9
9668 <http://www.unicode.org/versions/corrigendum9.html>, which
9669 said that non-character code points are merely discouraged
9670 rather than completely forbidden in open interchange. See
9671 "Noncharacter code points" in perlunicode.
9672
9673 Use "isUTF8_CHAR" to check for Perl's extended UTF-8; and
9674 "isUTF8_CHAR_flags" for a more customized definition.
9675
9676 Use "is_c9strict_utf8_string", "is_c9strict_utf8_string_loc",
9677 and "is_c9strict_utf8_string_loclen" to check entire strings.
9678
9679 STRLEN isC9_STRICT_UTF8_CHAR(const U8 *s, const U8 *e)
9680
9681 is_invariant_string
9682 This is a somewhat misleadingly-named synonym for
9683 "is_utf8_invariant_string". "is_utf8_invariant_string" is
9684 preferred, as it indicates under what conditions the string is
9685 invariant.
9686
9687 bool is_invariant_string(const U8* const s,
9688 STRLEN len)
9689
9690 isSTRICT_UTF8_CHAR
9691 Evaluates to non-zero if the first few bytes of the string
9692 starting at "s" and looking no further than "e - 1" are well-
9693 formed UTF-8 that represents some Unicode code point completely
9694 acceptable for open interchange between all applications;
9695 otherwise it evaluates to 0. If non-zero, the value gives how
9696 many bytes starting at "s" comprise the code point's
9697 representation. Any bytes remaining before "e", but beyond the
9698 ones needed to form the first code point in "s", are not
9699 examined.
9700
9701 The largest acceptable code point is the Unicode maximum
9702 0x10FFFF, and must not be a surrogate nor a non-character code
9703 point. Thus this excludes any code point from Perl's extended
9704 UTF-8.
9705
9706 This is used to efficiently decide if the next few bytes in "s"
9707 are legal Unicode-acceptable UTF-8 for a single character.
9708
9709 Use "isC9_STRICT_UTF8_CHAR" to use the Unicode Corrigendum #9
9710 <http://www.unicode.org/versions/corrigendum9.html> definition
9711 of allowable code points; "isUTF8_CHAR" to check for Perl's
9712 extended UTF-8; and "isUTF8_CHAR_flags" for a more customized
9713 definition.
9714
9715 Use "is_strict_utf8_string", "is_strict_utf8_string_loc", and
9716 "is_strict_utf8_string_loclen" to check entire strings.
9717
9718 Size_t isSTRICT_UTF8_CHAR(const U8 * const s0,
9719 const U8 * const e)
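
The difference between the strict and the Corrigendum #9 levels comes down to which code points are excluded. The standalone sketch below expresses that as predicates on an already-decoded code point. The real macros work on encoded bytes, so this only illustrates the classification, and all names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

bool sketch_is_surrogate(unsigned long cp)
{
    return cp >= 0xD800 && cp <= 0xDFFF;
}

bool sketch_is_noncharacter(unsigned long cp)
{
    /* U+FDD0..U+FDEF, plus the last two code points of each plane. */
    return (cp >= 0xFDD0 && cp <= 0xFDEF)
        || (cp <= 0x10FFFF && (cp & 0xFFFE) == 0xFFFE);
}

/* Corrigendum #9 level: within Unicode, no surrogates, non-characters allowed. */
bool sketch_c9strict_cp_ok(unsigned long cp)
{
    return cp <= 0x10FFFF && !sketch_is_surrogate(cp);
}

/* Strict level: additionally rejects non-characters. */
bool sketch_strict_cp_ok(unsigned long cp)
{
    return sketch_c9strict_cp_ok(cp) && !sketch_is_noncharacter(cp);
}

/* A few spot checks of how the two levels differ. */
void sketch_strictness_examples(void)
{
    assert(sketch_strict_cp_ok(0x41));         /* 'A' passes both levels */
    assert(sketch_c9strict_cp_ok(0xFFFE));     /* non-character: C9 accepts */
    assert(!sketch_strict_cp_ok(0xFFFE));      /* ... strict rejects */
    assert(!sketch_c9strict_cp_ok(0xD800));    /* surrogate: both reject */
    assert(!sketch_c9strict_cp_ok(0x110000));  /* above Unicode: both reject */
}
```

Perl's extended UTF-8, as checked by "isUTF8_CHAR", would accept all five of the code points used in the spot checks.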
9720
9721 is_strict_utf8_string
9722 Returns TRUE if the first "len" bytes of string "s" form a
9723 valid UTF-8-encoded string that is fully interchangeable by any
9724 application using Unicode rules; otherwise it returns FALSE.
9725 If "len" is 0, it will be calculated using strlen(s) (which
9726 means if you use this option, that "s" can't have embedded
9727 "NUL" characters and has to have a terminating "NUL" byte).
9728 Note that all characters being ASCII constitute 'a valid UTF-8
9729 string'.
9730
9731 This function returns FALSE for strings containing any code
9732 points above the Unicode max of 0x10FFFF, surrogate code
9733 points, or non-character code points.
9734
9735 See also "is_utf8_invariant_string",
9736 "is_utf8_invariant_string_loc", "is_utf8_string",
9737 "is_utf8_string_flags", "is_utf8_string_loc",
9738 "is_utf8_string_loc_flags", "is_utf8_string_loclen",
9739 "is_utf8_string_loclen_flags", "is_utf8_fixed_width_buf_flags",
9740 "is_utf8_fixed_width_buf_loc_flags",
9741 "is_utf8_fixed_width_buf_loclen_flags",
9742 "is_strict_utf8_string_loc", "is_strict_utf8_string_loclen",
9743 "is_c9strict_utf8_string", "is_c9strict_utf8_string_loc", and
9744 "is_c9strict_utf8_string_loclen".
9745
9746 bool is_strict_utf8_string(const U8 *s, STRLEN len)
9747
9748 is_strict_utf8_string_loc
9749 Like "is_strict_utf8_string" but stores the location of the
9750 failure (in the case of "utf8ness failure") or the location
9751 "s"+"len" (in the case of "utf8ness success") in the "ep"
9752 pointer.
9753
9754 See also "is_strict_utf8_string_loclen".
9755
9756 bool is_strict_utf8_string_loc(const U8 *s,
9757 STRLEN len,
9758 const U8 **ep)
9759
9760 is_strict_utf8_string_loclen
9761 Like "is_strict_utf8_string" but stores the location of the
9762 failure (in the case of "utf8ness failure") or the location
9763 "s"+"len" (in the case of "utf8ness success") in the "ep"
9764 pointer, and the number of UTF-8 encoded characters in the "el"
9765 pointer.
9766
9767 See also "is_strict_utf8_string_loc".
9768
9769 bool is_strict_utf8_string_loclen(const U8 *s,
9770 STRLEN len,
9771 const U8 **ep,
9772 STRLEN *el)
9773
9774 is_utf8_fixed_width_buf_flags
9775 Returns TRUE if the fixed-width buffer starting at "s" with
9776 length "len" is entirely valid UTF-8, subject to the
9777 restrictions given by "flags"; otherwise it returns FALSE.
9778
9779 If "flags" is 0, any well-formed UTF-8, as extended by Perl, is
9780 accepted without restriction. If the final few bytes of the
9781 buffer do not form a complete code point, this will return TRUE
9782 anyway, provided that "is_utf8_valid_partial_char_flags"
9783 returns TRUE for them.
9784
9785 If "flags" is non-zero, it can be any combination of the
9786 "UTF8_DISALLOW_foo" flags accepted by "utf8n_to_uvchr", and
9787 with the same meanings.
9788
9789 This function differs from "is_utf8_string_flags" only in that
9790 the latter returns FALSE if the final few bytes of the string
9791 don't form a complete code point.
9792
9793 bool is_utf8_fixed_width_buf_flags(
9794 const U8 * const s, STRLEN len,
9795 const U32 flags
9796 )
9797
9798 is_utf8_fixed_width_buf_loclen_flags
9799 Like "is_utf8_fixed_width_buf_loc_flags" but stores the number
9800 of complete, valid characters found in the "el" pointer.
9801
9802 bool is_utf8_fixed_width_buf_loclen_flags(
9803 const U8 * const s, STRLEN len,
9804 const U8 **ep, STRLEN *el, const U32 flags
9805 )
9806
9807 is_utf8_fixed_width_buf_loc_flags
9808 Like "is_utf8_fixed_width_buf_flags" but stores the location of
9809 the failure in the "ep" pointer. If the function returns TRUE,
9810 *ep will point to the beginning of any partial character at the
9811 end of the buffer; if there is no partial character *ep will
9812 contain "s"+"len".
9813
9814 See also "is_utf8_fixed_width_buf_loclen_flags".
9815
9816 bool is_utf8_fixed_width_buf_loc_flags(
9817 const U8 * const s, STRLEN len,
9818 const U8 **ep, const U32 flags
9819 )
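
A typical use of the *ep result is reading input in fixed-size chunks: the bytes before *ep are processed, and any partial character after it is carried over to the front of the next chunk. The standalone sketch below computes only that split point. It assumes standard UTF-8 with sequences of at most four bytes (Perl's extended UTF-8 allows longer ones), does not validate the rest of the buffer, and "sketch_complete_prefix_len" is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>

/* Returns the offset at which a trailing partial character starts,
 * or len if the buffer does not end in the middle of a character. */
size_t sketch_complete_prefix_len(const unsigned char *buf, size_t len)
{
    size_t i = len;
    size_t back = 0;
    unsigned char start;
    size_t need, have;

    assert(buf != NULL || len == 0);

    /* Step back over at most three continuation bytes (10xxxxxx). */
    while (i > 0 && back < 3 && (buf[i - 1] & 0xC0) == 0x80) {
        i--;
        back++;
    }
    if (i == 0)
        return len;     /* no start byte found: leave it to full validation */

    start = buf[i - 1];
    need = start >= 0xF0 ? 4 : start >= 0xE0 ? 3 : start >= 0xC0 ? 2 : 1;
    have = len - (i - 1);
    return have < need ? i - 1 : len;
}
```

With the real function, the caller would pass the buffer and read the split point from *ep instead of computing it by hand.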
9820
9821 is_utf8_invariant_string
9822 Returns TRUE if the first "len" bytes of the string "s" are the
9823 same regardless of the UTF-8 encoding of the string (or UTF-
9824 EBCDIC encoding on EBCDIC machines); otherwise it returns
9825 FALSE. That is, it returns TRUE if they are UTF-8 invariant.
9826 On ASCII-ish machines, all the ASCII characters and only the
9827 ASCII characters fit this definition. On EBCDIC machines, the
9828 ASCII-range characters are invariant, but so also are the C1
9829 controls.
9830
9831 If "len" is 0, it will be calculated using strlen(s) (which
9832 means if you use this option, that "s" can't have embedded
9833 "NUL" characters and has to have a terminating "NUL" byte).
9834
9835 See also "is_utf8_string", "is_utf8_string_flags",
9836 "is_utf8_string_loc", "is_utf8_string_loc_flags",
9837 "is_utf8_string_loclen", "is_utf8_string_loclen_flags",
9838 "is_utf8_fixed_width_buf_flags",
9839 "is_utf8_fixed_width_buf_loc_flags",
9840 "is_utf8_fixed_width_buf_loclen_flags",
9841 "is_strict_utf8_string", "is_strict_utf8_string_loc",
9842 "is_strict_utf8_string_loclen", "is_c9strict_utf8_string",
9843 "is_c9strict_utf8_string_loc", and
9844 "is_c9strict_utf8_string_loclen".
9845
9846 bool is_utf8_invariant_string(const U8* const s,
9847 STRLEN len)
9848
9849 is_utf8_invariant_string_loc
9850 Like "is_utf8_invariant_string" but upon failure, stores the
9851 location of the first UTF-8 variant character in the "ep"
9852 pointer; if all characters are UTF-8 invariant, this function
9853 does not change the contents of *ep.
9854
9855 bool is_utf8_invariant_string_loc(const U8* const s,
9856 STRLEN len,
9857 const U8 ** ep)
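
As a rough illustration of the invariant check described above, the following standalone C sketch does not call the Perl API. The name demo_is_invariant_loc is hypothetical, and the sketch assumes an ASCII platform, where "invariant" simply means a byte below 0x80; that assumption does not hold on EBCDIC machines.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for is_utf8_invariant_string_loc on an ASCII
 * platform: a byte is treated as invariant when it is below 0x80.
 * Returns 1 if every byte is invariant; otherwise stores the position
 * of the first variant byte in *ep (when ep is non-NULL) and returns 0. */
static int demo_is_invariant_loc(const unsigned char *s, size_t len,
                                 const unsigned char **ep)
{
    size_t i;

    assert(s != NULL);
    if (len == 0)                   /* mirror the documented len == 0 rule */
        len = strlen((const char *)s);

    for (i = 0; i < len; i++) {
        if (s[i] >= 0x80) {
            if (ep)
                *ep = s + i;
            return 0;
        }
    }
    return 1;                       /* *ep is deliberately left untouched */
}
```

A caller that gets 0 back can start any slower, UTF-8-aware processing at *ep rather than at the beginning of the string.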
9858
9859 is_utf8_string
9860 Returns TRUE if the first "len" bytes of string "s" form a
9861 valid Perl-extended-UTF-8 string; returns FALSE otherwise. If
9862 "len" is 0, it will be calculated using strlen(s) (which means
9863 if you use this option, that "s" can't have embedded "NUL"
9864 characters and has to have a terminating "NUL" byte). Note
9865 that all characters being ASCII constitute 'a valid UTF-8
9866 string'.
9867
9868 This function considers Perl's extended UTF-8 to be valid.
9869 That means that code points above Unicode, surrogates, and non-
9870 character code points are considered valid by this function.
9871 Use "is_strict_utf8_string", "is_c9strict_utf8_string", or
9872 "is_utf8_string_flags" to restrict what code points are
9873 considered valid.
9874
9875 See also "is_utf8_invariant_string",
9876 "is_utf8_invariant_string_loc", "is_utf8_string_loc",
9877 "is_utf8_string_loclen", "is_utf8_fixed_width_buf_flags",
9878 "is_utf8_fixed_width_buf_loc_flags", and
9879 "is_utf8_fixed_width_buf_loclen_flags".
9880
9881 bool is_utf8_string(const U8 *s, STRLEN len)
9882
9883 is_utf8_string_flags
9884 Returns TRUE if the first "len" bytes of string "s" form a
9885 valid UTF-8 string, subject to the restrictions imposed by
9886 "flags"; returns FALSE otherwise. If "len" is 0, it will be
9887 calculated using strlen(s) (which means if you use this option,
9888 that "s" can't have embedded "NUL" characters and has to have a
9889 terminating "NUL" byte). Note that all characters being ASCII
9890 constitute 'a valid UTF-8 string'.
9891
9892 If "flags" is 0, this gives the same results as
9893 "is_utf8_string"; if "flags" is
9894 "UTF8_DISALLOW_ILLEGAL_INTERCHANGE", this gives the same
9895 results as "is_strict_utf8_string"; and if "flags" is
9896 "UTF8_DISALLOW_ILLEGAL_C9_INTERCHANGE", this gives the same
9897 results as "is_c9strict_utf8_string". Otherwise "flags" may be
9898 any combination of the "UTF8_DISALLOW_foo" flags understood by
9899 "utf8n_to_uvchr", with the same meanings.
9900
9901 See also "is_utf8_invariant_string",
9902 "is_utf8_invariant_string_loc", "is_utf8_string",
9903 "is_utf8_string_loc", "is_utf8_string_loc_flags",
9904 "is_utf8_string_loclen", "is_utf8_string_loclen_flags",
9905 "is_utf8_fixed_width_buf_flags",
9906 "is_utf8_fixed_width_buf_loc_flags",
9907 "is_utf8_fixed_width_buf_loclen_flags",
9908 "is_strict_utf8_string", "is_strict_utf8_string_loc",
9909 "is_strict_utf8_string_loclen", "is_c9strict_utf8_string",
9910 "is_c9strict_utf8_string_loc", and
9911 "is_c9strict_utf8_string_loclen".
9912
9913 bool is_utf8_string_flags(const U8 *s, STRLEN len,
9914 const U32 flags)
9915
9916 is_utf8_string_loc
9917 Like "is_utf8_string" but stores the location of the failure
9918 (in the case of "utf8ness failure") or the location "s"+"len"
9919 (in the case of "utf8ness success") in the "ep" pointer.
9920
9921 See also "is_utf8_string_loclen".
9922
9923 bool is_utf8_string_loc(const U8 *s,
9924 const STRLEN len,
9925 const U8 **ep)
9926
9927 is_utf8_string_loclen
9928 Like "is_utf8_string" but stores the location of the failure
9929 (in the case of "utf8ness failure") or the location "s"+"len"
9930 (in the case of "utf8ness success") in the "ep" pointer, and
9931 the number of UTF-8 encoded characters in the "el" pointer.
9932
9933 See also "is_utf8_string_loc".
9934
9935 bool is_utf8_string_loclen(const U8 *s, STRLEN len,
9936 const U8 **ep, STRLEN *el)
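
The "ep" and "el" outputs can be pictured with a standalone C sketch. It is not Perl's implementation: the names demo_char_len and demo_is_utf8_loclen are hypothetical, only 1- to 4-byte sequences are recognised (Perl's extended UTF-8 allows longer ones), and reporting in *el the number of characters seen before a failure is an assumption of the sketch.

```c
#include <assert.h>
#include <stddef.h>

/* Length in bytes of the well-formed sequence that starts at s and ends
 * before e, or 0 if there is none.  Simplified: only 1- to 4-byte forms
 * are recognised and overlongs are rejected; surrogates, non-characters
 * and above-Unicode values that fit in four bytes are accepted. */
static size_t demo_char_len(const unsigned char *s, const unsigned char *e)
{
    size_t need, i;

    if (s >= e)
        return 0;
    if (s[0] < 0x80)
        return 1;
    if (s[0] >= 0xC2 && s[0] <= 0xDF)
        need = 2;
    else if (s[0] >= 0xE0 && s[0] <= 0xEF)
        need = 3;
    else if (s[0] >= 0xF0 && s[0] <= 0xF7)
        need = 4;
    else
        return 0;       /* continuation byte, overlong lead, or longer form */

    if ((size_t)(e - s) < need)
        return 0;                       /* truncated */
    for (i = 1; i < need; i++)
        if ((s[i] & 0xC0) != 0x80)
            return 0;                   /* not a continuation byte */
    if (s[0] == 0xE0 && s[1] < 0xA0)
        return 0;                       /* overlong 3-byte form */
    if (s[0] == 0xF0 && s[1] < 0x90)
        return 0;                       /* overlong 4-byte form */
    return need;
}

/* Validates s[0 .. len-1]; *ep receives the failure position, or s + len
 * on success, and *el the number of characters validated before stopping. */
static int demo_is_utf8_loclen(const unsigned char *s, size_t len,
                               const unsigned char **ep, size_t *el)
{
    const unsigned char *p = s;
    const unsigned char *e = s + len;
    size_t count = 0;
    int ok = 1;

    assert(s != NULL);
    while (p < e) {
        size_t n = demo_char_len(p, e);
        if (n == 0) {
            ok = 0;
            break;
        }
        p += n;
        count++;
    }
    if (ep)
        *ep = p;
    if (el)
        *el = count;
    return ok;
}
```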
9937
9938 is_utf8_string_loclen_flags
9939 Like "is_utf8_string_flags" but stores the location of the
9940 failure (in the case of "utf8ness failure") or the location
9941 "s"+"len" (in the case of "utf8ness success") in the "ep"
9942 pointer, and the number of UTF-8 encoded characters in the "el"
9943 pointer.
9944
9945 See also "is_utf8_string_loc_flags".
9946
9947 bool is_utf8_string_loclen_flags(const U8 *s,
9948 STRLEN len,
9949 const U8 **ep,
9950 STRLEN *el,
9951 const U32 flags)
9952
9953 is_utf8_string_loc_flags
9954 Like "is_utf8_string_flags" but stores the location of the
9955 failure (in the case of "utf8ness failure") or the location
9956 "s"+"len" (in the case of "utf8ness success") in the "ep"
9957 pointer.
9958
9959 See also "is_utf8_string_loclen_flags".
9960
9961 bool is_utf8_string_loc_flags(const U8 *s,
9962 STRLEN len,
9963 const U8 **ep,
9964 const U32 flags)
9965
9966 is_utf8_valid_partial_char
9967 Returns 0 if the sequence of bytes starting at "s" and looking
9968 no further than "e - 1" is the UTF-8 encoding, as extended by
9969 Perl, for one or more code points. Otherwise, it returns 1 if
9970 there exists at least one non-empty sequence of bytes that, when
9971 appended to sequence "s" starting at position "e", causes the
9972 entire sequence to be the well-formed UTF-8 of some code point;
9973 otherwise it returns 0.
9974
9975 In other words this returns TRUE if "s" points to a partial
9976 UTF-8-encoded code point.
9977
9978 This is useful when a fixed-length buffer is being tested for
9979 being well-formed UTF-8, but the final few bytes in it don't
9980 comprise a full character; that is, it is split somewhere in
9981 the middle of the final code point's UTF-8 representation.
9982 (Presumably when the buffer is refreshed with the next chunk of
9983 data, the new first bytes will complete the partial code
9984 point.) This function is used to verify that the final bytes
9985 in the current buffer are in fact the legal beginning of some
9986 code point, so that if they aren't, the failure can be
9987 signalled without having to wait for the next read.
9988
9989 bool is_utf8_valid_partial_char(const U8 * const s,
9990 const U8 * const e)
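
The idea of a "valid partial character" can be sketched in standalone C. This is an approximation under stated assumptions, not the Perl function: demo_is_partial_char is a hypothetical name, an ASCII platform is assumed, and only start bytes of 2- to 4-byte sequences are modelled.

```c
#include <assert.h>
#include <stddef.h>

/* Returns 1 if the bytes in [s, e) are a proper, non-empty prefix of some
 * well-formed 2- to 4-byte sequence; returns 0 if they are empty, already
 * complete, or can never be completed into a well-formed character. */
static int demo_is_partial_char(const unsigned char *s, const unsigned char *e)
{
    size_t have, need, i;

    assert(s != NULL && e != NULL);
    if (s >= e)
        return 0;
    have = (size_t)(e - s);

    if (s[0] >= 0xC2 && s[0] <= 0xDF)
        need = 2;
    else if (s[0] >= 0xE0 && s[0] <= 0xEF)
        need = 3;
    else if (s[0] >= 0xF0 && s[0] <= 0xF7)
        need = 4;
    else
        return 0;   /* ASCII (already complete) or not a modelled start byte */

    if (have >= need)
        return 0;                       /* nothing is missing */
    for (i = 1; i < have; i++)
        if ((s[i] & 0xC0) != 0x80)
            return 0;                   /* cannot be completed */
    if (have >= 2 && s[0] == 0xE0 && s[1] < 0xA0)
        return 0;                       /* would be an overlong 3-byte form */
    if (have >= 2 && s[0] == 0xF0 && s[1] < 0x90)
        return 0;                       /* would be an overlong 4-byte form */
    return 1;
}
```

A chunked reader could call such a check on the tail of each buffer: a result of 1 means the tail should be carried over to the next read, while 0 on an incomplete tail means the failure can be reported immediately.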
9991
9992 is_utf8_valid_partial_char_flags
9993 Like "is_utf8_valid_partial_char", it returns a boolean giving
9994 whether or not the input is a valid UTF-8 encoded partial
9995 character, but it takes an extra parameter, "flags", which can
9996 further restrict which code points are considered valid.
9997
9998 If "flags" is 0, this behaves identically to
9999 "is_utf8_valid_partial_char". Otherwise "flags" can be any
10000 combination of the "UTF8_DISALLOW_foo" flags accepted by
10001 "utf8n_to_uvchr". If there is any sequence of bytes that can
10002 complete the input partial character in such a way that a non-
10003 prohibited character is formed, the function returns TRUE;
10004 otherwise FALSE. Non-character code points cannot be
10005 determined based on partial character input. But many of the
10006 other possible excluded types can be determined from just the
10007 first one or two bytes.
10008
10009 bool is_utf8_valid_partial_char_flags(
10010 const U8 * const s, const U8 * const e,
10011 const U32 flags
10012 )
10013
10014 isUTF8_CHAR
10015 Evaluates to non-zero if the first few bytes of the string
10016 starting at "s" and looking no further than "e - 1" are well-
10017 formed UTF-8, as extended by Perl, that represents some code
10018 point; otherwise it evaluates to 0. If non-zero, the value
10019 gives how many bytes starting at "s" comprise the code point's
10020 representation. Any bytes remaining before "e", but beyond the
10021 ones needed to form the first code point in "s", are not
10022 examined.
10023
10024 The code point can be any that will fit in an IV on this
10025 machine, using Perl's extension to official UTF-8 to represent
10026 those higher than the Unicode maximum of 0x10FFFF. That means
10027 that this macro is used to efficiently decide if the next few
10028 bytes in "s" are legal UTF-8 for a single character.
10029
10030 Use "isSTRICT_UTF8_CHAR" to restrict the acceptable code points
10031 to those defined by Unicode to be fully interchangeable across
10032 applications; "isC9_STRICT_UTF8_CHAR" to use the Unicode
10033 Corrigendum #9
10034 <http://www.unicode.org/versions/corrigendum9.html> definition
10035 of allowable code points; and "isUTF8_CHAR_flags" for a more
10036 customized definition.
10037
10038 Use "is_utf8_string", "is_utf8_string_loc", and
10039 "is_utf8_string_loclen" to check entire strings.
10040
10041 Note also that a UTF-8 "invariant" character (i.e. ASCII on
10042 non-EBCDIC machines) is a valid UTF-8 character.
10043
10044 STRLEN isUTF8_CHAR(const U8 *s, const U8 *e)
10045
10046 isUTF8_CHAR_flags
10047 Evaluates to non-zero if the first few bytes of the string
10048 starting at "s" and looking no further than "e - 1" are well-
10049 formed UTF-8, as extended by Perl, that represents some code
10050 point, subject to the restrictions given by "flags"; otherwise
10051 it evaluates to 0. If non-zero, the value gives how many bytes
10052 starting at "s" comprise the code point's representation. Any
10053 bytes remaining before "e", but beyond the ones needed to form
10054 the first code point in "s", are not examined.
10055
10056 If "flags" is 0, this gives the same results as "isUTF8_CHAR";
10057 if "flags" is "UTF8_DISALLOW_ILLEGAL_INTERCHANGE", this gives
10058 the same results as "isSTRICT_UTF8_CHAR"; and if "flags" is
10059 "UTF8_DISALLOW_ILLEGAL_C9_INTERCHANGE", this gives the same
10060 results as "isC9_STRICT_UTF8_CHAR". Otherwise "flags" may be
10061 any combination of the "UTF8_DISALLOW_foo" flags understood by
10062 "utf8n_to_uvchr", with the same meanings.
10063
10064 The three alternative macros are for the most commonly needed
10065 validations; they are likely to run somewhat faster than this
10066 more general one, as they can be inlined into your code.
10067
10068 Use "is_utf8_string_flags", "is_utf8_string_loc_flags", and
10069 "is_utf8_string_loclen_flags" to check entire strings.
10070
10071 STRLEN isUTF8_CHAR_flags(const U8 *s, const U8 *e,
10072 const U32 flags)
10073
10074 pv_uni_display
10075 Build to the scalar "dsv" a displayable version of the string
10076 "spv", length "len", the displayable version being at most
10077 "pvlim" bytes long (if longer, the rest is truncated and "..."
10078 will be appended).
10079
10080 The "flags" argument can have "UNI_DISPLAY_ISPRINT" set to
10081 display "isPRINT()"able characters as themselves,
10082 "UNI_DISPLAY_BACKSLASH" to display the "\\[nrfta\\]" as the
10083 backslashed versions (like "\n") ("UNI_DISPLAY_BACKSLASH" is
10084 preferred over "UNI_DISPLAY_ISPRINT" for "\\").
10085 "UNI_DISPLAY_QQ" (and its alias "UNI_DISPLAY_REGEX") have both
10086 "UNI_DISPLAY_BACKSLASH" and "UNI_DISPLAY_ISPRINT" turned on.
10087
10088 The pointer to the PV of the "dsv" is returned.
10089
10090 See also "sv_uni_display".
10091
10092 char* pv_uni_display(SV *dsv, const U8 *spv,
10093 STRLEN len, STRLEN pvlim,
10094 UV flags)
10095
10096 REPLACEMENT_CHARACTER_UTF8
10097 This is a macro that evaluates to a string constant of the
10098 UTF-8 bytes that define the Unicode REPLACEMENT CHARACTER
10099 (U+FFFD) for the platform that perl is compiled on. This
10100 allows code to use a mnemonic for this character that works on
10101 both ASCII and EBCDIC platforms.
10102 "sizeof(REPLACEMENT_CHARACTER_UTF8) - 1" can be used to get its
10103 length in bytes.
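
The "sizeof(...) - 1" idiom works because the macro expands to a string literal. A standalone sketch follows; DEMO_REPLACEMENT_CHARACTER_UTF8 and demo_append_replacement are hypothetical stand-ins, and the byte values assume an ASCII platform, where U+FFFD is encoded as EF BF BD.

```c
#include <assert.h>
#include <string.h>

/* Assumption: ASCII platform.  Stand-in for the real macro. */
#define DEMO_REPLACEMENT_CHARACTER_UTF8 "\xEF\xBF\xBD"

/* Appends the replacement character to the NUL-terminated string dst,
 * which must have room for three more bytes plus the terminating NUL.
 * Returns the number of bytes appended. */
static size_t demo_append_replacement(char *dst)
{
    /* sizeof counts the literal's trailing NUL, hence the "- 1" */
    const size_t n = sizeof(DEMO_REPLACEMENT_CHARACTER_UTF8) - 1;

    assert(dst != NULL);
    memcpy(dst + strlen(dst), DEMO_REPLACEMENT_CHARACTER_UTF8, n + 1);
    return n;
}
```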
10104
10105 sv_cat_decode
10106 "encoding" is assumed to be an "Encode" object, the PV of "ssv"
10107 is assumed to be octets in that encoding and decoding the input
10108 starts from the position which "(PV + *offset)" pointed to.
10109 "dsv" will be concatenated with the decoded UTF-8 string from
10110 "ssv". Decoding will terminate when the string "tstr" appears
10111 in decoding output or the input ends on the PV of "ssv". The
10112 value to which "offset" points will be modified to the last
10113 input position on "ssv".
10114
10115 Returns TRUE if the terminator was found, else returns FALSE.
10116
10117 bool sv_cat_decode(SV* dsv, SV *encoding, SV *ssv,
10118 int *offset, char* tstr, int tlen)
10119
10120 sv_recode_to_utf8
10121 "encoding" is assumed to be an "Encode" object, on entry the PV
10122 of "sv" is assumed to be octets in that encoding, and "sv" will
10123 be converted into Unicode (and UTF-8).
10124
10125 If "sv" already is UTF-8 (or if it is not "POK"), or if
10126 "encoding" is not a reference, nothing is done to "sv". If
10127 "encoding" is not an "Encode::XS" Encoding object, bad things
10128 will happen. (See cpan/Encode/encoding.pm and Encode.)
10129
10130 The PV of "sv" is returned.
10131
10132 char* sv_recode_to_utf8(SV* sv, SV *encoding)
10133
10134 sv_uni_display
10135 Build to the scalar "dsv" a displayable version of the scalar
10136 "sv", the displayable version being at most "pvlim" bytes long
10137 (if longer, the rest is truncated and "..." will be appended).
10138
10139 The "flags" argument is as in "pv_uni_display"().
10140
10141 The pointer to the PV of the "dsv" is returned.
10142
10143 char* sv_uni_display(SV *dsv, SV *ssv, STRLEN pvlim,
10144 UV flags)
10145
10146 to_utf8_fold
10147 DEPRECATED! It is planned to remove this function from a
10148 future release of Perl. Do not use it for new code; remove it
10149 from existing code.
10150
10151 Instead use "toFOLD_utf8_safe".
10152
10153 UV to_utf8_fold(const U8 *p, U8* ustrp,
10154 STRLEN *lenp)
10155
10156 to_utf8_lower
10157 DEPRECATED! It is planned to remove this function from a
10158 future release of Perl. Do not use it for new code; remove it
10159 from existing code.
10160
10161 Instead use "toLOWER_utf8_safe".
10162
10163 UV to_utf8_lower(const U8 *p, U8* ustrp,
10164 STRLEN *lenp)
10165
10166 to_utf8_title
10167 DEPRECATED! It is planned to remove this function from a
10168 future release of Perl. Do not use it for new code; remove it
10169 from existing code.
10170
10171 Instead use "toTITLE_utf8_safe".
10172
10173 UV to_utf8_title(const U8 *p, U8* ustrp,
10174 STRLEN *lenp)
10175
10176 to_utf8_upper
10177 DEPRECATED! It is planned to remove this function from a
10178 future release of Perl. Do not use it for new code; remove it
10179 from existing code.
10180
10181 Instead use "toUPPER_utf8_safe".
10182
10183 UV to_utf8_upper(const U8 *p, U8* ustrp,
10184 STRLEN *lenp)
10185
10186 utf8n_to_uvchr
10187 THIS FUNCTION SHOULD BE USED IN ONLY VERY SPECIALIZED
10188 CIRCUMSTANCES. Most code should use "utf8_to_uvchr_buf"()
10189 rather than call this directly.
10190
10191 Bottom level UTF-8 decode routine. Returns the native code
10192 point value of the first character in the string "s", which is
10193 assumed to be in UTF-8 (or UTF-EBCDIC) encoding, and no longer
10194 than "curlen" bytes; *retlen (if "retlen" isn't NULL) will be
10195 set to the length, in bytes, of that character.
10196
10197 The value of "flags" determines the behavior when "s" does not
10198 point to a well-formed UTF-8 character. If "flags" is 0,
10199 encountering a malformation causes zero to be returned and
10200 *retlen is set so that ("s" + *retlen) is the next possible
10201 position in "s" that could begin a non-malformed character.
10202 Also, if UTF-8 warnings haven't been lexically disabled, a
10203 warning is raised. Some UTF-8 input sequences may contain
10204 multiple malformations. This function tries to find every
10205 possible one in each call, so multiple warnings can be raised
10206 for the same sequence.
10207
10208 Various ALLOW flags can be set in "flags" to allow (and not
10209 warn on) individual types of malformations, such as the
10210 sequence being overlong (that is, when there is a shorter
10211 sequence that can express the same code point; overlong
10212 sequences are expressly forbidden in the UTF-8 standard due to
10213 potential security issues). Another malformation example is
10214 the first byte of a character not being a legal first byte.
10215 See utf8.h for the list of such flags. Even if allowed, this
10216 function generally returns the Unicode REPLACEMENT CHARACTER
10217 when it encounters a malformation. There are flags in utf8.h
10218 to override this behavior for the overlong malformations, but
10219 don't do that except for very specialized purposes.
10220
10221 The "UTF8_CHECK_ONLY" flag overrides the behavior when a non-
10222 allowed (by other flags) malformation is found. If this flag
10223 is set, the routine assumes that the caller will raise a
10224 warning, and this function will silently just set "retlen" to
10225 "-1" (cast to "STRLEN") and return zero.
10226
10227 Note that this API requires disambiguation between successfully
10228 decoding a "NUL" character, and an error return (unless the
10229 "UTF8_CHECK_ONLY" flag is set), as in both cases, 0 is
10230 returned, and, depending on the malformation, "retlen" may be
10231 set to 1. To disambiguate, upon a zero return, see if the
10232 first byte of "s" is 0 as well. If so, the input was a "NUL";
10233 if not, the input had an error. Or you can use
10234 "utf8n_to_uvchr_error".
10235
10236 Certain code points are considered problematic. These are
10237 Unicode surrogates, Unicode non-characters, and code points
10238 above the Unicode maximum of 0x10FFFF. By default these are
10239 considered regular code points, but certain situations warrant
10240 special handling for them, which can be specified using the
10241 "flags" parameter. If "flags" contains
10242 "UTF8_DISALLOW_ILLEGAL_INTERCHANGE", all three classes are
10243 treated as malformations and handled as such. The flags
10244 "UTF8_DISALLOW_SURROGATE", "UTF8_DISALLOW_NONCHAR", and
10245 "UTF8_DISALLOW_SUPER" (meaning above the legal Unicode maximum)
10246 can be set to disallow these categories individually.
10247 "UTF8_DISALLOW_ILLEGAL_INTERCHANGE" restricts the allowed
10248 inputs to the strict UTF-8 traditionally defined by Unicode.
10249 Use "UTF8_DISALLOW_ILLEGAL_C9_INTERCHANGE" to use the
10250 strictness definition given by Unicode Corrigendum #9
10251 <http://www.unicode.org/versions/corrigendum9.html>. The
10252 difference between traditional strictness and C9 strictness is
10253 that the latter does not forbid non-character code points.
10254 (They are still discouraged, however.) For more discussion see
10255 "Noncharacter code points" in perlunicode.
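
The three problematic classes, and how the two INTERCHANGE combinations relate to them, can be sketched in standalone C. The DEMO_ names and their bit values are invented for this illustration and are not the values defined in utf8.h; the code point ranges are the ones Unicode defines.

```c
#include <stdint.h>

/* Hypothetical flag bits; the real values live in utf8.h. */
#define DEMO_DISALLOW_SURROGATE 0x1u
#define DEMO_DISALLOW_NONCHAR   0x2u
#define DEMO_DISALLOW_SUPER     0x4u

/* C9 strictness forbids surrogates and above-Unicode code points only. */
#define DEMO_DISALLOW_ILLEGAL_C9_INTERCHANGE \
    (DEMO_DISALLOW_SURROGATE | DEMO_DISALLOW_SUPER)
/* Traditional strictness additionally forbids non-characters. */
#define DEMO_DISALLOW_ILLEGAL_INTERCHANGE \
    (DEMO_DISALLOW_ILLEGAL_C9_INTERCHANGE | DEMO_DISALLOW_NONCHAR)

/* Returns the single DEMO_DISALLOW_ bit for the class cp falls in,
 * or 0 for an ordinary code point. */
static unsigned demo_problem_class(uint64_t cp)
{
    if (cp > 0x10FFFF)
        return DEMO_DISALLOW_SUPER;
    if (cp >= 0xD800 && cp <= 0xDFFF)
        return DEMO_DISALLOW_SURROGATE;
    if ((cp >= 0xFDD0 && cp <= 0xFDEF) || (cp & 0xFFFE) == 0xFFFE)
        return DEMO_DISALLOW_NONCHAR;
    return 0;
}

/* Returns 1 if cp is acceptable under the given disallow mask. */
static int demo_is_allowed(uint64_t cp, unsigned disallow)
{
    return (demo_problem_class(cp) & disallow) == 0;
}
```

With a mask of 0 every code point is accepted, which corresponds to the default behavior described above.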
10256
10257 The flags "UTF8_WARN_ILLEGAL_INTERCHANGE",
10258 "UTF8_WARN_ILLEGAL_C9_INTERCHANGE", "UTF8_WARN_SURROGATE",
10259 "UTF8_WARN_NONCHAR", and "UTF8_WARN_SUPER" will cause warning
10260 messages to be raised for their respective categories, but
10261 otherwise the code points are considered valid (not
10262 malformations). To get a category to both be treated as a
10263 malformation and raise a warning, specify both the WARN and
10264 DISALLOW flags. (But note that warnings are not raised if
10265 lexically disabled nor if "UTF8_CHECK_ONLY" is also specified.)
10266
10267 Extremely high code points were never specified in any
10268 standard, and require an extension to UTF-8 to express, which
10269 Perl does. It is likely that programs written in something
10270 other than Perl would not be able to read files that contain
10271 these; nor would Perl understand files written by something
10272 that uses a different extension. For these reasons, there is a
10273 separate set of flags that can warn and/or disallow these
10274 extremely high code points, even if other above-Unicode ones
10275 are accepted. They are the "UTF8_WARN_PERL_EXTENDED" and
10276 "UTF8_DISALLOW_PERL_EXTENDED" flags. For more information see
10277 ""UTF8_GOT_PERL_EXTENDED"". Of course "UTF8_DISALLOW_SUPER"
10278 will treat all above-Unicode code points, including these, as
10279 malformations. (Note that the Unicode standard considers
10280 anything above 0x10FFFF to be illegal, but there are standards
10281 predating it that allow up to 0x7FFF_FFFF (2**31 - 1).)
10282
10283 A somewhat misleadingly named synonym for
10284 "UTF8_WARN_PERL_EXTENDED" is retained for backward
10285 compatibility: "UTF8_WARN_ABOVE_31_BIT". Similarly,
10286 "UTF8_DISALLOW_ABOVE_31_BIT" is usable instead of the more
10287 accurately named "UTF8_DISALLOW_PERL_EXTENDED". The names are
10288 misleading because these flags can apply to code points that
10289 actually do fit in 31 bits. This happens on EBCDIC platforms,
10290 and sometimes when the overlong malformation is also present.
10291 The new names accurately describe the situation in all cases.
10292
10293 All other code points corresponding to Unicode characters,
10294 including private use and those yet to be assigned, are never
10295 considered malformed and never warn.
10296
10297 UV utf8n_to_uvchr(const U8 *s, STRLEN curlen,
10298 STRLEN *retlen, const U32 flags)
10299
10300 utf8n_to_uvchr_error
10301 THIS FUNCTION SHOULD BE USED IN ONLY VERY SPECIALIZED
10302 CIRCUMSTANCES. Most code should use "utf8_to_uvchr_buf"()
10303 rather than call this directly.
10304
10305 This function is for code that needs to know what the precise
10306 malformation(s) are when an error is found. If you also need
10307 to know the generated warning messages, use
10308 "utf8n_to_uvchr_msgs"() instead.
10309
10310 It is like "utf8n_to_uvchr" but it takes an extra parameter
10311 placed after all the others, "errors". If this parameter is 0,
10312 this function behaves identically to "utf8n_to_uvchr".
10313 Otherwise, "errors" should be a pointer to a "U32" variable,
10314 which this function sets to indicate any errors found. Upon
10315 return, if *errors is 0, there were no errors found.
10316 Otherwise, *errors is the bit-wise "OR" of the bits described
10317 in the list below. Some of these bits will be set if a
10318 malformation is found, even if the input "flags" parameter
10319 indicates that the given malformation is allowed; those
10320 exceptions are noted:
10321
10322 "UTF8_GOT_PERL_EXTENDED"
10323 The input sequence is not standard UTF-8, but a Perl
10324 extension. This bit is set only if the input "flags"
10325 parameter contains either the "UTF8_DISALLOW_PERL_EXTENDED"
10326 or the "UTF8_WARN_PERL_EXTENDED" flags.
10327
10328 Code points above 0x7FFF_FFFF (2**31 - 1) were never
10329 specified in any standard, and so some extension must be
10330 used to express them. Perl uses a natural extension to
10331 UTF-8 to represent the ones up to 2**36-1, and invented a
10332 further extension to represent even higher ones, so that
10333 any code point that fits in a 64-bit word can be
10334 represented. Text using these extensions is not likely to
10335 be portable to non-Perl code. We lump both of these
10336 extensions together and refer to them as Perl extended
10337 UTF-8. There exist other extensions that people have
10338 invented, incompatible with Perl's.
10339
10340 On EBCDIC platforms starting in Perl v5.24, the Perl
10341 extension for representing extremely high code points kicks
10342 in at 0x3FFF_FFFF (2**30 - 1), which is lower than on ASCII.
10343 Prior to that, code points 2**31 and higher were simply
10344 unrepresentable, and a different, incompatible method was
10345 used to represent code points between 2**30 and 2**31 - 1.
10346
10347 On both platforms, ASCII and EBCDIC,
10348 "UTF8_GOT_PERL_EXTENDED" is set if Perl extended UTF-8 is
10349 used.
10350
10351 In earlier Perls, this bit was named
10352 "UTF8_GOT_ABOVE_31_BIT", which you still may use for
10353 backward compatibility. That name is misleading, as this
10354 flag may be set when the code point actually does fit in 31
10355 bits. This happens on EBCDIC platforms, and sometimes when
10356 the overlong malformation is also present. The new name
10357 accurately describes the situation in all cases.
10358
10359 "UTF8_GOT_CONTINUATION"
10360 The input sequence was malformed in that the first byte was
10361 a UTF-8 continuation byte.
10362
10363 "UTF8_GOT_EMPTY"
10364 The input "curlen" parameter was 0.
10365
10366 "UTF8_GOT_LONG"
10367 The input sequence was malformed in that there is some
10368 other sequence that evaluates to the same code point, but
10369 that sequence is shorter than this one.
10370
10371 Until Unicode 3.1, it was legal for programs to accept this
10372 malformation, but it was discovered that this created
10373 security issues.
10374
10375 "UTF8_GOT_NONCHAR"
10376 The code point represented by the input UTF-8 sequence is
10377 for a Unicode non-character code point. This bit is set
10378 only if the input "flags" parameter contains either the
10379 "UTF8_DISALLOW_NONCHAR" or the "UTF8_WARN_NONCHAR" flags.
10380
10381 "UTF8_GOT_NON_CONTINUATION"
10382 The input sequence was malformed in that a non-continuation
10383 type byte was found in a position where only a continuation
10384 type one should be. See also ""UTF8_GOT_SHORT"".
10385
10386 "UTF8_GOT_OVERFLOW"
10387 The input sequence was malformed in that it is for a code
10388 point that is not representable in the number of bits
10389 available in an IV on the current platform.
10390
10391 "UTF8_GOT_SHORT"
10392 The input sequence was malformed in that "curlen" is
10393 smaller than required for a complete sequence. In other
10394 words, the input is for a partial character sequence.
10395
10396 "UTF8_GOT_SHORT" and "UTF8_GOT_NON_CONTINUATION" both
10397 indicate a too short sequence. The difference is that
10398 "UTF8_GOT_NON_CONTINUATION" always indicates that there is
10399 an error, while "UTF8_GOT_SHORT" means that an incomplete
10400 sequence was looked at. If no other flags are present, it
10401 means that the sequence was valid as far as it went.
10402 Depending on the application, this could mean one of three
10403 things:
10404
10405 · The "curlen" length parameter passed in was too small,
10406 and the function was prevented from examining all the
10407 necessary bytes.
10408
10409 · The buffer being looked at is based on reading data,
10410 and the data received so far stopped in the middle of a
10411 character, so that the next read will read the
10412 remainder of this character. (It is up to the caller
10413 to deal with the split bytes somehow.)
10414
10415 · This is a real error, and the partial sequence is all
10416 we're going to get.
10417
10418 "UTF8_GOT_SUPER"
10419 The input sequence was malformed in that it is for a non-
10420 Unicode code point; that is, one above the legal Unicode
10421 maximum. This bit is set only if the input "flags"
10422 parameter contains either the "UTF8_DISALLOW_SUPER" or the
10423 "UTF8_WARN_SUPER" flags.
10424
10425 "UTF8_GOT_SURROGATE"
10426 The input sequence was malformed in that it is for a
10427 Unicode UTF-16 surrogate code point. This bit is set only
10428 if the input "flags" parameter contains either the
10429 "UTF8_DISALLOW_SURROGATE" or the "UTF8_WARN_SURROGATE"
10430 flags.
10431
10432 To do your own error handling, call this function with the
10433 "UTF8_CHECK_ONLY" flag to suppress any warnings, and then
10434 examine the *errors return.
10435
10436 UV utf8n_to_uvchr_error(const U8 *s, STRLEN curlen,
10437 STRLEN *retlen,
10438 const U32 flags,
10439 U32 * errors)
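
How a decoder can report several distinct malformations through an "errors" bit mask is sketched below in standalone C. This is a simplified model, not the Perl routine: demo_decode and the DEMO_GOT_ bits are hypothetical, only sequences of up to 4 bytes are handled (DEMO_GOT_UNSUPPORTED exists only because of that limit), no flags are modelled, and zero is returned on every malformation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical error bits, loosely modelled on the UTF8_GOT_ names. */
#define DEMO_GOT_EMPTY            0x01u
#define DEMO_GOT_CONTINUATION     0x02u
#define DEMO_GOT_NON_CONTINUATION 0x04u
#define DEMO_GOT_SHORT            0x08u
#define DEMO_GOT_LONG             0x10u
#define DEMO_GOT_UNSUPPORTED      0x20u   /* sketch-only: 5+ byte forms */

/* Decodes one character from s[0 .. curlen-1].  On success returns the
 * code point and sets *errors to 0.  On a malformation returns 0, sets
 * bits in *errors, and sets *retlen to the number of bytes to skip. */
static uint32_t demo_decode(const unsigned char *s, size_t curlen,
                            size_t *retlen, unsigned *errors)
{
    size_t need, avail, i;
    uint32_t cp, min;

    assert(s != NULL && retlen != NULL && errors != NULL);
    *errors = 0;
    *retlen = 0;
    if (curlen == 0) { *errors = DEMO_GOT_EMPTY; return 0; }
    if (s[0] < 0x80) { *retlen = 1; return s[0]; }
    if ((s[0] & 0xC0) == 0x80) {
        *errors = DEMO_GOT_CONTINUATION;
        *retlen = 1;
        return 0;
    }
    if      ((s[0] & 0xE0) == 0xC0) { need = 2; cp = s[0] & 0x1F; min = 0x80; }
    else if ((s[0] & 0xF0) == 0xE0) { need = 3; cp = s[0] & 0x0F; min = 0x800; }
    else if ((s[0] & 0xF8) == 0xF0) { need = 4; cp = s[0] & 0x07; min = 0x10000; }
    else { *errors = DEMO_GOT_UNSUPPORTED; *retlen = 1; return 0; }

    avail = curlen < need ? curlen : need;
    for (i = 1; i < avail; i++) {
        if ((s[i] & 0xC0) != 0x80) {
            *errors |= DEMO_GOT_NON_CONTINUATION;
            break;
        }
        cp = (cp << 6) | (s[i] & 0x3F);
    }
    *retlen = i;
    if (i == avail && curlen < need)
        *errors |= DEMO_GOT_SHORT;      /* valid so far, but cut off */
    if (*errors == 0 && cp < min)
        *errors |= DEMO_GOT_LONG;       /* a shorter encoding exists */
    return *errors ? 0 : cp;
}
```

Because *errors is 0 only on success, a caller can tell a decoded "NUL" (return value 0, no error bits) from a failure (return value 0, some error bit set) without inspecting the input again.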
10440
10441 utf8n_to_uvchr_msgs
10442 NOTE: this function is experimental and may change or be
10443 removed without notice.
10444
10445 THIS FUNCTION SHOULD BE USED IN ONLY VERY SPECIALIZED
10446 CIRCUMSTANCES. Most code should use "utf8_to_uvchr_buf"()
10447 rather than call this directly.
10448
10449 This function is for code that needs to know what the precise
10450 malformation(s) are when an error is found, and wants the
10451 corresponding warning and/or error messages to be returned to
10452 the caller rather than be displayed. All messages that would
10453 have been displayed if all lexical warnings are enabled will be
10454 returned.
10455
10456 It is just like "utf8n_to_uvchr_error" but it takes an extra
10457 parameter placed after all the others, "msgs". If this
10458 parameter is 0, this function behaves identically to
10459 "utf8n_to_uvchr_error". Otherwise, "msgs" should be a pointer
10460 to an "AV *" variable, in which this function creates a new AV
10461 to contain any appropriate messages. The elements of the array
10462 are ordered so that the first message that would have been
10463 displayed is in the 0th element, and so on. Each element is a
10464 hash with three key-value pairs, as follows:
10465
10466 "text"
10467 The text of the message as a "SVpv".
10468
10469 "warn_categories"
10470 The warning category (or categories) packed into a "SVuv".
10471
10472 "flag"
10473 A single flag bit associated with this message, in a
10474 "SVuv". The bit corresponds to some bit in the *errors
10475 return value, such as "UTF8_GOT_LONG".
10476
10477 It's important to note that specifying this parameter as non-
10478 null will cause any warnings this function would otherwise
10479 generate to be suppressed, and instead be placed in *msgs. The
10480 caller can check the lexical warnings state (or not) when
10481 choosing what to do with the returned messages.
10482
10483 If the flag "UTF8_CHECK_ONLY" is passed, no warnings are
10484 generated, and hence no AV is created.
10485
10486 The caller, of course, is responsible for freeing any returned
10487 AV.
10488
10489 UV utf8n_to_uvchr_msgs(const U8 *s, STRLEN curlen,
10490 STRLEN *retlen,
10491 const U32 flags,
10492 U32 * errors, AV ** msgs)
10493
10494 utf8n_to_uvuni
10495 Instead use "utf8_to_uvchr_buf", or rarely, "utf8n_to_uvchr".
10496
10497 This function was useful for code that wanted to handle both
10498 EBCDIC and ASCII platforms with Unicode properties, but
10499 starting in Perl v5.20, the distinctions between the platforms
10500 have mostly been made invisible to most code, so this function
10501 is quite unlikely to be what you want. If you do need this
10502 precise functionality, use instead
10503 "NATIVE_TO_UNI(utf8_to_uvchr_buf(...))" or
10504 "NATIVE_TO_UNI(utf8n_to_uvchr(...))".
10505
10506 UV utf8n_to_uvuni(const U8 *s, STRLEN curlen,
10507 STRLEN *retlen, U32 flags)
10508
10509 UTF8SKIP
10510 Returns the number of bytes in the UTF-8 encoded character
10511 whose first (perhaps only) byte is pointed to by "s".
10512
10513 STRLEN UTF8SKIP(char* s)
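
A skip value lets code walk a well-formed buffer one character at a time without decoding it. The standalone sketch below uses hypothetical names (demo_utf8skip, demo_count_chars), assumes an ASCII platform, and models only the 1- to 4-byte forms; returning 1 for every other byte is a simplification of the sketch, not a statement about Perl's table.

```c
#include <assert.h>
#include <stddef.h>

/* Number of bytes in the character whose first byte is *s, for the
 * 1- to 4-byte forms only. */
static size_t demo_utf8skip(const unsigned char *s)
{
    assert(s != NULL);
    if (*s < 0xC0) return 1;    /* ASCII, or a stray continuation byte */
    if (*s < 0xE0) return 2;
    if (*s < 0xF0) return 3;
    if (*s < 0xF8) return 4;
    return 1;                   /* longer forms are not modelled here */
}

/* Counts the characters in [s, e), which must already be known to be
 * well formed; otherwise the walk could step past e. */
static size_t demo_count_chars(const unsigned char *s, const unsigned char *e)
{
    size_t n = 0;

    while (s < e) {
        s += demo_utf8skip(s);
        n++;
    }
    return n;
}
```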
10514
10515 utf8_distance
10516 Returns the number of UTF-8 characters between the UTF-8
10517 pointers "a" and "b".
10518
10519 WARNING: use only if you *know* that the pointers point inside
10520 the same UTF-8 buffer.
10521
10522 IV utf8_distance(const U8 *a, const U8 *b)
10523
10524 utf8_hop
10525 Return the UTF-8 pointer "s" displaced by "off" characters,
10526 either forward or backward.
10527
10528 WARNING: do not use the following unless you *know* "off" is
10529 within the UTF-8 data pointed to by "s" *and* that on entry "s"
10530 is aligned on the first byte of a character or just after the
10531 last byte of a character.
10532
10533 U8* utf8_hop(const U8 *s, SSize_t off)
10534
10535 utf8_hop_back
10536 Return the UTF-8 pointer "s" displaced by up to "off"
10537 characters, backward.
10538
10539 "off" must be non-positive.
10540
10541 "s" must be after or equal to "start".
10542
10543 When moving backward it will not move before "start".
10544
10545 Will not exceed this limit even if the string is not valid
10546 "UTF-8".
10547
10548 U8* utf8_hop_back(const U8 *s, SSize_t off,
10549 const U8 *start)
10550
10551 utf8_hop_forward
10552 Return the UTF-8 pointer "s" displaced by up to "off"
10553 characters, forward.
10554
10555 "off" must be non-negative.
10556
10557 "s" must be before or equal to "end".
10558
10559 When moving forward it will not move beyond "end".
10560
10561 Will not exceed this limit even if the string is not valid
10562 "UTF-8".
10563
10564 U8* utf8_hop_forward(const U8 *s, SSize_t off,
10565 const U8 *end)
10566
10567 utf8_hop_safe
10568 Return the UTF-8 pointer "s" displaced by up to "off"
10569 characters, either forward or backward.
10570
10571 When moving backward it will not move before "start".
10572
10573 When moving forward it will not move beyond "end".
10574
10575 Will not exceed those limits even if the string is not valid
10576 "UTF-8".
10577
10578 U8* utf8_hop_safe(const U8 *s, SSize_t off,
10579 const U8 *start, const U8 *end)
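
The clamping behavior of the bounded hop functions can be pictured with a standalone C sketch. It is not Perl's code: demo_hop_safe is a hypothetical name, and the sketch finds character boundaries purely by skipping continuation bytes (bit pattern 10xxxxxx on ASCII platforms), which is why it stays inside the limits even on malformed input.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_IS_CONTINUATION(b) (((b) & 0xC0) == 0x80)

/* Moves s by up to off characters (forward if positive, backward if
 * negative) without ever leaving the range [start, end]. */
static const unsigned char *demo_hop_safe(const unsigned char *s,
                                          ptrdiff_t off,
                                          const unsigned char *start,
                                          const unsigned char *end)
{
    assert(start <= s && s <= end);

    while (off > 0 && s < end) {
        s++;                            /* step over the start byte */
        while (s < end && DEMO_IS_CONTINUATION(*s))
            s++;                        /* and over its continuation bytes */
        off--;
    }
    while (off < 0 && s > start) {
        s--;                            /* step onto the previous byte */
        while (s > start && DEMO_IS_CONTINUATION(*s))
            s--;                        /* back up to its start byte */
        off++;
    }
    return s;
}
```

Asking for more characters than exist simply returns "end" or "start", so the caller can detect the clamp by comparing the result with those limits.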
10580
10581 UTF8_IS_INVARIANT
10582 Evaluates to 1 if the byte "c" represents the same character
10583 when encoded in UTF-8 as when not; otherwise evaluates to 0.
10584 UTF-8 invariant characters can be copied as-is when converting
10585 to/from UTF-8, saving time.
10586
10587 In spite of the name, this macro gives the correct result if
10588 the input string from which "c" comes is not encoded in UTF-8.
10589
10590 See "UVCHR_IS_INVARIANT" for checking if a UV is invariant.
10591
10592 bool UTF8_IS_INVARIANT(char c)
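
As an illustration of why invariants save time, the following stand-alone sketch converts Latin-1 bytes to UTF-8, copying invariant bytes unchanged and expanding the others to two bytes. It assumes an ASCII platform, where the invariant bytes are 0x00 through 0x7F (EBCDIC platforms differ); is_invariant_sketch and latin1_to_utf8_sketch are hypothetical names, not part of the Perl API.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: ASCII platform, where the invariant bytes are 0x00..0x7F. */
static int is_invariant_sketch(unsigned char c) { return c < 0x80; }

/* Convert len Latin-1 bytes in src to UTF-8 in dst, which must have room
 * for 2 * len bytes. Returns the number of bytes written. */
static size_t
latin1_to_utf8_sketch(const unsigned char *src, size_t len, unsigned char *dst)
{
    size_t out = 0;
    for (size_t i = 0; i < len; i++) {
        unsigned char c = src[i];
        if (is_invariant_sketch(c)) {
            dst[out++] = c;                                   /* copied as-is */
        } else {
            dst[out++] = (unsigned char)(0xC0 | (c >> 6));    /* lead byte */
            dst[out++] = (unsigned char)(0x80 | (c & 0x3F));  /* continuation */
        }
    }
    return out;
}
```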
10593
10594 UTF8_IS_NONCHAR
10595 Evaluates to non-zero if the first few bytes of the string
10596 starting at "s" and looking no further than "e - 1" are well-
10597 formed UTF-8 that represents one of the Unicode non-character
10598 code points; otherwise it evaluates to 0. If non-zero, the
10599 value gives how many bytes starting at "s" comprise the code
10600 point's representation.
10601
10602 bool UTF8_IS_NONCHAR(const U8 *s, const U8 *e)
10603
10604 UTF8_IS_SUPER
10605 Recall that Perl recognizes an extension to UTF-8 that can
10606 encode code points larger than the ones defined by Unicode,
10607 which are 0..0x10FFFF.
10608
10609 This macro evaluates to non-zero if the first few bytes of the
10610 string starting at "s" and looking no further than "e - 1" are
10611 from this UTF-8 extension; otherwise it evaluates to 0. If
10612 non-zero, the value gives how many bytes starting at "s"
10613 comprise the code point's representation.
10614
10615 0 is returned if the bytes are not well-formed extended UTF-8,
10616 or if they represent a code point that cannot fit in a UV on
10617 the current platform. Hence this macro can give different
10618 results when run on a 64-bit word machine than on one with a
10619 32-bit word size.
10620
10621 Note that it is illegal to have code points that are larger
10622 than what can fit in an IV on the current machine.
10623
10624 bool UTF8_IS_SUPER(const U8 *s, const U8 *e)
10625
10626 UTF8_IS_SURROGATE
10627 Evaluates to non-zero if the first few bytes of the string
10628 starting at "s" and looking no further than "e - 1" are well-
10629 formed UTF-8 that represents one of the Unicode surrogate code
10630 points; otherwise it evaluates to 0. If non-zero, the value
10631 gives how many bytes starting at "s" comprise the code point's
10632 representation.
10633
10634 bool UTF8_IS_SURROGATE(const U8 *s, const U8 *e)
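
The three classes tested by "UTF8_IS_NONCHAR", "UTF8_IS_SUPER" and "UTF8_IS_SURROGATE" can be stated directly in terms of code points. The sketch below uses the Unicode definitions of those ranges; the function names are hypothetical, and unlike the macros it works on an already decoded code point rather than on UTF-8 bytes, so it returns a plain truth value instead of a byte count.

```c
#include <assert.h>
#include <stdint.h>

/* Surrogates: U+D800..U+DFFF. */
static int is_surrogate_cp(uint32_t cp) { return cp >= 0xD800 && cp <= 0xDFFF; }

/* Above-Unicode ("super") code points: anything beyond U+10FFFF. */
static int is_super_cp(uint32_t cp) { return cp > 0x10FFFF; }

/* Non-characters: U+FDD0..U+FDEF, plus the last two code points of
 * every plane (U+xxFFFE and U+xxFFFF). */
static int is_nonchar_cp(uint32_t cp)
{
    if (cp >= 0xFDD0 && cp <= 0xFDEF)
        return 1;
    return cp <= 0x10FFFF && (cp & 0xFFFE) == 0xFFFE;
}
```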
10635
10636 utf8_length
10637 Returns the number of characters in the sequence of
10638 UTF-8-encoded bytes starting at "s" and ending at the byte just
10639 before "e". If "s" and "e" point to the same place, it returns
10640 0 with no warning raised.
10641
10642 If "e < s" or if the scan would end up past "e", it raises a
10643 UTF8 warning and returns the number of valid characters.
10644
10645 STRLEN utf8_length(const U8* s, const U8 *e)
10646
10647 UTF8_SAFE_SKIP
10648 Returns 0 if "s >= e"; otherwise returns the number of bytes in
10649 the UTF-8 encoded character whose first byte is pointed to by
10650 "s", clipped so that the count never reaches beyond "e". On
10651 DEBUGGING builds, it asserts that "s <= e".
10652
10653 STRLEN UTF8_SAFE_SKIP(char* s, char* e)
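
The bounded behaviour of "utf8_length" and "UTF8_SAFE_SKIP" can be sketched in plain C as follows. This is an illustration under stated assumptions, not Perl's code: lead_len, safe_skip_sketch and utf8_length_sketch are hypothetical names, only sequences of up to 4 bytes on an ASCII platform are recognised, and no warning is raised.

```c
#include <assert.h>
#include <stddef.h>

/* Length implied by a lead byte (ASCII-platform UTF-8, at most 4 bytes
 * in this sketch); any other byte is treated as a 1-byte character. */
static size_t lead_len(unsigned char b)
{
    if ((b & 0xE0) == 0xC0) return 2;
    if ((b & 0xF0) == 0xE0) return 3;
    if ((b & 0xF8) == 0xF0) return 4;
    return 1;
}

/* Like UTF8_SAFE_SKIP: 0 at the end, else the character's length,
 * clipped to the bytes actually available before e. */
static size_t safe_skip_sketch(const unsigned char *s, const unsigned char *e)
{
    assert(s <= e);
    if (s >= e)
        return 0;
    size_t avail = (size_t)(e - s);
    size_t len = lead_len(*s);
    return len < avail ? len : avail;
}

/* Like utf8_length: count the characters in [s, e); a final character
 * cut off by e is not counted (Perl would also raise a warning). */
static size_t utf8_length_sketch(const unsigned char *s, const unsigned char *e)
{
    size_t n = 0;
    while (s < e) {
        size_t len = lead_len(*s);
        if ((size_t)(e - s) < len)
            break;
        s += len;
        n++;
    }
    return n;
}
```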
10654
10655 utf8_to_bytes
10656 NOTE: this function is experimental and may change or be
10657 removed without notice.
10658
10659 Converts a string "s" of length *lenp from UTF-8 into native
10660 byte encoding. Unlike "bytes_to_utf8", this over-writes the
10661 original string, and updates *lenp to contain the new length.
10662 Returns zero on failure, leaving "s" unchanged and setting
10663 *lenp to -1.
10664
10665 Upon successful return, the number of variants in the string
10666 can be computed by having saved the value of *lenp before the
10667 call, and subtracting the after-call value of *lenp from it.
10668
10669 If you need a copy of the string, see "bytes_from_utf8".
10670
10671 U8* utf8_to_bytes(U8 *s, STRLEN *lenp)
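
The in-place contract described above (overwrite on success, leave the buffer untouched on failure) can be sketched as a two-pass routine. The code is a stand-alone illustration for ASCII platforms with a hypothetical name, utf8_to_bytes_sketch; it is not the Perl function and uses size_t in place of STRLEN.

```c
#include <assert.h>
#include <stddef.h>

static unsigned char *
utf8_to_bytes_sketch(unsigned char *s, size_t *lenp)
{
    const unsigned char *end = s + *lenp;
    const unsigned char *p;
    unsigned char *d = s;

    /* First pass: verify that every character fits in one byte, so that
     * a failure leaves s untouched. */
    for (p = s; p < end; p++) {
        if (*p < 0x80)
            continue;
        if ((*p == 0xC2 || *p == 0xC3) && p + 1 < end
            && (p[1] & 0xC0) == 0x80) {
            p++;                        /* two-byte form of 0x80..0xFF */
            continue;
        }
        *lenp = (size_t)-1;
        return NULL;
    }

    /* Second pass: collapse each two-byte sequence into a single byte. */
    for (p = s; p < end; p++) {
        if (*p < 0x80) {
            *d++ = *p;
        } else {
            *d++ = (unsigned char)(((p[0] & 0x03) << 6) | (p[1] & 0x3F));
            p++;
        }
    }
    *lenp = (size_t)(d - s);
    return s;
}
```

With a 4-byte input holding one variant, the sketch leaves *lenp at 3, so the difference between the saved and the updated length (here 1) is the number of variants.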
10672
10673 utf8_to_uvchr
10674 DEPRECATED! It is planned to remove this function from a
10675 future release of Perl. Do not use it for new code; remove it
10676 from existing code.
10677
10678 Returns the native code point of the first character in the
10679 string "s" which is assumed to be in UTF-8 encoding; "retlen"
10680 will be set to the length, in bytes, of that character.
10681
10682 Some, but not all, UTF-8 malformations are detected, and in
10683 fact, some malformed input could cause reading beyond the end
10684 of the input buffer, which is why this function is deprecated.
10685 Use "utf8_to_uvchr_buf" instead.
10686
10687 If "s" points to one of the detected malformations, and UTF8
10688 warnings are enabled, zero is returned and *retlen is set (if
10689 "retlen" isn't "NULL") to -1. If those warnings are off, the
10690 computed value, if well-defined (or the Unicode REPLACEMENT
10691 CHARACTER if not), is silently returned, and *retlen is set (if
10692 "retlen" isn't "NULL") so that ("s" + *retlen) is the next
10693 possible position in "s" that could begin a non-malformed
10694 character. See "utf8n_to_uvchr" for details on when the
10695 REPLACEMENT CHARACTER is returned.
10696
10697 UV utf8_to_uvchr(const U8 *s, STRLEN *retlen)
10698
10699 utf8_to_uvchr_buf
10700 Returns the native code point of the first character in the
10701 string "s" which is assumed to be in UTF-8 encoding; "send"
10702 points to 1 beyond the end of "s". *retlen will be set to the
10703 length, in bytes, of that character.
10704
10705 If "s" does not point to a well-formed UTF-8 character and UTF8
10706 warnings are enabled, zero is returned and *retlen is set (if
10707 "retlen" isn't "NULL") to -1. If those warnings are off, the
10708 computed value, if well-defined (or the Unicode REPLACEMENT
10709 CHARACTER if not), is silently returned, and *retlen is set (if
10710 "retlen" isn't "NULL") so that ("s" + *retlen) is the next
10711 possible position in "s" that could begin a non-malformed
10712 character. See "utf8n_to_uvchr" for details on when the
10713 REPLACEMENT CHARACTER is returned.
10714
10715 UV utf8_to_uvchr_buf(const U8 *s, const U8 *send,
10716 STRLEN *retlen)
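
The reason for passing "send" is that a decoder can then refuse to read past the end of the buffer. The following stand-alone sketch shows that idea for the warnings-off case; it is not Perl's decoder. The name decode_buf_sketch is hypothetical, only sequences of up to 4 bytes are handled, overlong forms are not rejected, and the choice of *retlen for malformed input is a simplification.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define REPLACEMENT 0xFFFDu   /* Unicode REPLACEMENT CHARACTER */

static uint32_t
decode_buf_sketch(const unsigned char *s, const unsigned char *send,
                  size_t *retlen)
{
    assert(s < send);
    unsigned char b = *s;
    size_t need;
    uint32_t cp;

    if (b < 0x80)                { need = 1; cp = b; }
    else if ((b & 0xE0) == 0xC0) { need = 2; cp = b & 0x1F; }
    else if ((b & 0xF0) == 0xE0) { need = 3; cp = b & 0x0F; }
    else if ((b & 0xF8) == 0xF0) { need = 4; cp = b & 0x07; }
    else { if (retlen) *retlen = 1; return REPLACEMENT; }

    for (size_t i = 1; i < need; i++) {
        /* Stop at send, or at a byte that is not a continuation. */
        if (s + i >= send || (s[i] & 0xC0) != 0x80) {
            if (retlen) *retlen = i;
            return REPLACEMENT;
        }
        cp = (cp << 6) | (s[i] & 0x3F);
    }
    if (retlen) *retlen = need;
    return cp;
}
```

A caller walks a buffer by adding *retlen to "s" after each call, which always makes progress because *retlen is at least 1.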
10717
10718 utf8_to_uvuni_buf
10719 DEPRECATED! It is planned to remove this function from a
10720 future release of Perl. Do not use it for new code; remove it
10721 from existing code.
10722
10723 Only in very rare circumstances should code need to be dealing
10724 in Unicode (as opposed to native) code points. In those few
10725 cases, use "NATIVE_TO_UNI(utf8_to_uvchr_buf(...))" instead. If
10726 you are not absolutely sure this is one of those cases, then
10727 assume it isn't and use plain "utf8_to_uvchr_buf" instead.
10728
10729 Returns the Unicode (not-native) code point of the first
10730 character in the string "s" which is assumed to be in UTF-8
10731 encoding; "send" points to 1 beyond the end of "s". "retlen"
10732 will be set to the length, in bytes, of that character.
10733
10734 If "s" does not point to a well-formed UTF-8 character and UTF8
10735 warnings are enabled, zero is returned and *retlen is set (if
10736 "retlen" isn't "NULL") to -1. If those warnings are off, the
10737 computed value, if well-defined (or the Unicode REPLACEMENT
10738 CHARACTER if not), is silently returned, and *retlen is set (if
10739 "retlen" isn't "NULL") so that ("s" + *retlen) is the next
10740 possible position in "s" that could begin a non-malformed
10741 character. See "utf8n_to_uvchr" for details on when the
10742 REPLACEMENT CHARACTER is returned.
10743
10744 UV utf8_to_uvuni_buf(const U8 *s, const U8 *send,
10745 STRLEN *retlen)
10746
10747 UVCHR_IS_INVARIANT
10748 Evaluates to 1 if the representation of code point "cp" is the
10749 same whether or not it is encoded in UTF-8; otherwise evaluates
10750 to 0. UTF-8 invariant characters can be copied as-is when
10751 converting to/from UTF-8, saving time. "cp" is Unicode if
10752 above 255; otherwise it is platform-native.
10753
10754 bool UVCHR_IS_INVARIANT(UV cp)
10755
10756 UVCHR_SKIP
10757 Returns the number of bytes required to represent the code
10758 point "cp" when encoded as UTF-8. "cp" is a native (ASCII or
10759 EBCDIC) code point if less than 256; a Unicode code point
10760 otherwise.
10761
10762 STRLEN UVCHR_SKIP(UV cp)
10763
10764 uvchr_to_utf8
10765 Adds the UTF-8 representation of the native code point "uv" to
10766 the end of the string "d"; "d" should have at least
10767 "UVCHR_SKIP(uv)+1" (up to "UTF8_MAXBYTES+1") free bytes
10768 available. The return value is the pointer to the byte after
10769 the end of the new character. In other words,
10770
10771 d = uvchr_to_utf8(d, uv);
10772
10773 is the recommended wide native character-aware way of saying
10774
10775 *(d++) = uv;
10776
10777 This function accepts any code point from 0.."IV_MAX" as input.
10778 "IV_MAX" is typically 0x7FFF_FFFF in a 32-bit word.
10779
10780 It is possible to forbid or warn on non-Unicode code points, or
10781 those that may be problematic by using "uvchr_to_utf8_flags".
10782
10783 U8* uvchr_to_utf8(U8 *d, UV uv)
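
The "append and return the new end" convention, together with the byte count that "UVCHR_SKIP" reports, can be sketched in stand-alone C. The names skip_sketch and encode_sketch are hypothetical, and the sketch covers only the standard 1 to 4 byte forms on ASCII platforms; Perl itself accepts far larger code points through its UTF-8 extension.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bytes needed for cp in standard UTF-8; 0 if cp is beyond the 4-byte
 * range that this sketch supports. */
static size_t skip_sketch(uint32_t cp)
{
    if (cp < 0x80)     return 1;
    if (cp < 0x800)    return 2;
    if (cp < 0x10000)  return 3;
    if (cp < 0x200000) return 4;
    return 0;
}

/* Append cp at d and return the position just past it, or NULL if the
 * sketch cannot encode cp. The caller guarantees enough free bytes. */
static unsigned char *encode_sketch(unsigned char *d, uint32_t cp)
{
    static const unsigned char lead_mark[] = { 0, 0, 0xC0, 0xE0, 0xF0 };
    size_t n = skip_sketch(cp);

    if (n == 0)
        return NULL;
    if (n == 1) {
        *d = (unsigned char)cp;
        return d + 1;
    }
    /* Lead byte carries the length marker and the top payload bits. */
    d[0] = (unsigned char)(lead_mark[n] | (cp >> (6 * (n - 1))));
    /* Each continuation byte carries the next 6 payload bits. */
    for (size_t i = 1; i < n; i++)
        d[i] = (unsigned char)(0x80 | ((cp >> (6 * (n - 1 - i))) & 0x3F));
    return d + n;
}
```

Chaining calls as d = encode_sketch(d, cp) appends characters one after another, mirroring the d = uvchr_to_utf8(d, uv) idiom.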
10784
10785 uvchr_to_utf8_flags
10786 Adds the UTF-8 representation of the native code point "uv" to
10787 the end of the string "d"; "d" should have at least
10788 "UVCHR_SKIP(uv)+1" (up to "UTF8_MAXBYTES+1") free bytes
10789 available. The return value is the pointer to the byte after
10790 the end of the new character. In other words,
10791
10792 d = uvchr_to_utf8_flags(d, uv, flags);
10793
10794 or, in most cases,
10795
10796 d = uvchr_to_utf8_flags(d, uv, 0);
10797
10798 This is the Unicode-aware way of saying
10799
10800 *(d++) = uv;
10801
10802 If "flags" is 0, this function accepts any code point from
10803 0.."IV_MAX" as input. "IV_MAX" is typically 0x7FFF_FFFF in a
10804 32-bit word.
10805
10806 Specifying "flags" can further restrict what is allowed and not
10807 warned on, as follows:
10808
10809 If "uv" is a Unicode surrogate code point and
10810 "UNICODE_WARN_SURROGATE" is set, the function will raise a
10811 warning, provided UTF8 warnings are enabled. If instead
10812 "UNICODE_DISALLOW_SURROGATE" is set, the function will fail and
10813 return NULL. If both flags are set, the function will both
10814 warn and return NULL.
10815
10816 Similarly, the "UNICODE_WARN_NONCHAR" and
10817 "UNICODE_DISALLOW_NONCHAR" flags affect how the function
10818 handles a Unicode non-character.
10819
10820 And likewise, the "UNICODE_WARN_SUPER" and
10821 "UNICODE_DISALLOW_SUPER" flags affect the handling of code
10822 points that are above the Unicode maximum of 0x10FFFF.
10823 Languages other than Perl may not be able to accept files that
10824 contain these.
10825
10826 The flag "UNICODE_WARN_ILLEGAL_INTERCHANGE" selects all three
10827 of the above WARN flags; and
10828 "UNICODE_DISALLOW_ILLEGAL_INTERCHANGE" selects all three
10829 DISALLOW flags. "UNICODE_DISALLOW_ILLEGAL_INTERCHANGE"
10830 restricts the allowed inputs to the strict UTF-8 traditionally
10831 defined by Unicode. Similarly,
10832 "UNICODE_WARN_ILLEGAL_C9_INTERCHANGE" and
10833 "UNICODE_DISALLOW_ILLEGAL_C9_INTERCHANGE" are shortcuts to
10834 select the above-Unicode and surrogate flags, but not the non-
10835 character ones, as defined in Unicode Corrigendum #9
10836 <http://www.unicode.org/versions/corrigendum9.html>. See
10837 "Noncharacter code points" in perlunicode.
10838
10839 Extremely high code points were never specified in any
10840 standard, and expressing them requires an extension to UTF-8,
10841 which Perl provides. It is likely that programs written in something
10842 other than Perl would not be able to read files that contain
10843 these; nor would Perl understand files written by something
10844 that uses a different extension. For these reasons, there is a
10845 separate set of flags that can warn and/or disallow these
10846 extremely high code points, even if other above-Unicode ones
10847 are accepted. They are the "UNICODE_WARN_PERL_EXTENDED" and
10848 "UNICODE_DISALLOW_PERL_EXTENDED" flags. For more information
10849 see "UTF8_GOT_PERL_EXTENDED". Of course
10850 "UNICODE_DISALLOW_SUPER" will treat all above-Unicode code
10851 points, including these, as malformations. (Note that the
10852 Unicode standard considers anything above 0x10FFFF to be
10853 illegal, but there are standards predating it that allow up to
10854 0x7FFF_FFFF (2**31 - 1).)
10855
10856 A somewhat misleadingly named synonym for
10857 "UNICODE_WARN_PERL_EXTENDED" is retained for backward
10858 compatibility: "UNICODE_WARN_ABOVE_31_BIT". Similarly,
10859 "UNICODE_DISALLOW_ABOVE_31_BIT" is usable instead of the more
10860 accurately named "UNICODE_DISALLOW_PERL_EXTENDED". The names
10861 are misleading because on EBCDIC platforms, these flags can
10862 apply to code points that actually do fit in 31 bits. The new
10863 names accurately describe the situation in all cases.
10864
10865 U8* uvchr_to_utf8_flags(U8 *d, UV uv, UV flags)
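
The interplay of the WARN and DISALLOW flags can be modelled with a small stand-alone sketch. The SK_* constants and check_flags_sketch are hypothetical stand-ins (the real UNICODE_* values live in Perl's headers), only the surrogate and above-Unicode classes are modelled, and the warning is printed unconditionally rather than depending on whether UTF8 warnings are enabled.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical flag bits, for illustration only. */
enum {
    SK_WARN_SURROGATE     = 1 << 0,
    SK_DISALLOW_SURROGATE = 1 << 1,
    SK_WARN_SUPER         = 1 << 2,
    SK_DISALLOW_SUPER     = 1 << 3
};

/* Returns 1 if cp may be encoded under flags, 0 if it is disallowed.
 * A message is printed when the matching WARN bit is set; WARN and
 * DISALLOW are independent, so both can apply to the same code point. */
static int check_flags_sketch(uint32_t cp, unsigned flags)
{
    unsigned warn = 0, disallow = 0;

    if (cp >= 0xD800 && cp <= 0xDFFF) {
        warn = SK_WARN_SURROGATE;
        disallow = SK_DISALLOW_SURROGATE;
    } else if (cp > 0x10FFFF) {
        warn = SK_WARN_SUPER;
        disallow = SK_DISALLOW_SUPER;
    }
    if (flags & warn)
        fprintf(stderr, "code point 0x%lX is problematic\n",
                (unsigned long)cp);
    return (flags & disallow) ? 0 : 1;
}
```

A caller modelled on "uvchr_to_utf8_flags" would return NULL when the check yields 0 and encode normally otherwise.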
10866
10867 uvchr_to_utf8_flags_msgs
10868 NOTE: this function is experimental and may change or be
10869 removed without notice.
10870
10871 THIS FUNCTION SHOULD BE USED IN ONLY VERY SPECIALIZED
10872 CIRCUMSTANCES.
10873
10874 Most code should use "uvchr_to_utf8_flags()" rather than call
10875 this directly.
10876
10877 This function is for code that wants any warning and/or error
10878 messages to be returned to the caller rather than be displayed.
10879 All messages that would have been displayed if all lexical
10880 warnings were enabled will be returned.
10881
10882 It is just like "uvchr_to_utf8_flags" but it takes an extra
10883 parameter placed after all the others, "msgs". If this
10884 parameter is 0, this function behaves identically to
10885 "uvchr_to_utf8_flags". Otherwise, "msgs" should be a pointer
10886 to an "HV *" variable, in which this function creates a new HV
10887 to contain any appropriate messages. The hash has three key-
10888 value pairs, as follows:
10889
10890 "text"
10891 The text of the message as a "SVpv".
10892
10893 "warn_categories"
10894 The warning category (or categories) packed into a "SVuv".
10895
10896 "flag"
10897 A single flag bit associated with this message, in a
10898 "SVuv". The bit corresponds to some bit in the *errors
10899 return value, such as "UNICODE_GOT_SURROGATE".
10900
10901 It's important to note that specifying this parameter as non-
10902 null will cause any warnings this function would otherwise
10903 generate to be suppressed, and instead be placed in *msgs. The
10904 caller can check the lexical warnings state (or not) when
10905 choosing what to do with the returned messages.
10906
10907 The caller, of course, is responsible for freeing any returned
10908 HV.
10909
10910 U8* uvchr_to_utf8_flags_msgs(U8 *d, UV uv, UV flags,
10911 HV ** msgs)
10912
10913 uvoffuni_to_utf8_flags
10914 THIS FUNCTION SHOULD BE USED IN ONLY VERY SPECIALIZED
10915 CIRCUMSTANCES. Instead, almost all code should use
10916 "uvchr_to_utf8" or "uvchr_to_utf8_flags".
10917
10918 This function is like them, but the input is a strict Unicode
10919 (as opposed to native) code point. Only in very rare
10920 circumstances should code not be using the native code point.
10921
10922 For details, see the description for "uvchr_to_utf8_flags".
10923
10924 U8* uvoffuni_to_utf8_flags(U8 *d, UV uv,
10925 const UV flags)
10926
10927 uvuni_to_utf8_flags
10928 Instead you almost certainly want to use "uvchr_to_utf8" or
10929 "uvchr_to_utf8_flags".
10930
10931 This function is a deprecated synonym for
10932 "uvoffuni_to_utf8_flags", which itself, while not deprecated,
10933 should be used only in isolated circumstances. These functions
10934 were useful for code that wanted to handle both EBCDIC and
10935 ASCII platforms with Unicode properties, but starting in Perl
10936 v5.20, the distinctions between the platforms have mostly been
10937 made invisible to most code, so this function is quite unlikely
10938 to be what you want.
10939
10940 U8* uvuni_to_utf8_flags(U8 *d, UV uv, UV flags)
10941
10942 valid_utf8_to_uvchr
10943 Like "utf8_to_uvchr_buf", but should only be called when it is
10944 known that the next character in the input UTF-8 string "s" is
10945 well-formed (e.g., it passes "isUTF8_CHAR"). Surrogates, non-
10946 character code points, and non-Unicode code points are allowed.
10947
10948 UV valid_utf8_to_uvchr(const U8 *s, STRLEN *retlen)
10949
10951 newXSproto
10952 Used by "xsubpp" to hook up XSUBs as Perl subs. Adds Perl
10953 prototypes to the subs.
10954
10955 XS_APIVERSION_BOOTCHECK
10956 Macro to verify that the perl api version an XS module has been
10957 compiled against matches the api version of the perl
10958 interpreter it's being loaded into.
10959
10960 XS_APIVERSION_BOOTCHECK;
10961
10962 XS_VERSION
10963 The version identifier for an XS module. This is usually
10964 handled automatically by "ExtUtils::MakeMaker". See
10965 "XS_VERSION_BOOTCHECK".
10966
10967 XS_VERSION_BOOTCHECK
10968 Macro to verify that a PM module's $VERSION variable matches
10969 the XS module's "XS_VERSION" variable. This is usually handled
10970 automatically by "xsubpp". See "The VERSIONCHECK: Keyword" in
10971 perlxs.
10972
10973 XS_VERSION_BOOTCHECK;
10974
10976 ckWARN Returns a boolean as to whether or not warnings are enabled for
10977 the warning category "w". If the category is by default
10978 enabled even if not within the scope of "use warnings", instead
10979 use the "ckWARN_d" macro.
10980
10981 bool ckWARN(U32 w)
10982
10983 ckWARN2 Like "ckWARN", but takes two warnings categories as input, and
10984 returns TRUE if either is enabled. If either category is by
10985 default enabled even if not within the scope of "use warnings",
10986 instead use the "ckWARN2_d" macro. The categories must be
10987 completely independent; one may not be subclassed from the
10988 other.
10989
10990 bool ckWARN2(U32 w1, U32 w2)
10991
10992 ckWARN3 Like "ckWARN2", but takes three warnings categories as input,
10993 and returns TRUE if any is enabled. If any of the categories
10994 is by default enabled even if not within the scope of
10995 "use warnings", instead use the "ckWARN3_d" macro. The
10996 categories must be completely independent; one may not be
10997 subclassed from any other.
10998
10999 bool ckWARN3(U32 w1, U32 w2, U32 w3)
11000
11001 ckWARN4 Like "ckWARN3", but takes four warnings categories as input,
11002 and returns TRUE if any is enabled. If any of the categories
11003 is by default enabled even if not within the scope of
11004 "use warnings", instead use the "ckWARN4_d" macro. The
11005 categories must be completely independent; one may not be
11006 subclassed from any other.
11007
11008 bool ckWARN4(U32 w1, U32 w2, U32 w3, U32 w4)
11009
11010 ckWARN_d
11011 Like "ckWARN", but for use if and only if the warning category
11012 is by default enabled even if not within the scope of
11013 "use warnings".
11014
11015 bool ckWARN_d(U32 w)
11016
11017 ckWARN2_d
11018 Like "ckWARN2", but for use if and only if either warning
11019 category is by default enabled even if not within the scope of
11020 "use warnings".
11021
11022 bool ckWARN2_d(U32 w1, U32 w2)
11023
11024 ckWARN3_d
11025 Like "ckWARN3", but for use if and only if any of the warning
11026 categories is by default enabled even if not within the scope
11027 of "use warnings".
11028
11029 bool ckWARN3_d(U32 w1, U32 w2, U32 w3)
11030
11031 ckWARN4_d
11032 Like "ckWARN4", but for use if and only if any of the warning
11033 categories is by default enabled even if not within the scope
11034 of "use warnings".
11035
11036 bool ckWARN4_d(U32 w1, U32 w2, U32 w3, U32 w4)
11037
11038 croak This is an XS interface to Perl's "die" function.
11039
11040 Take a sprintf-style format pattern and argument list. These
11041 are used to generate a string message. If the message does not
11042 end with a newline, then it will be extended with some
11043 indication of the current location in the code, as described
11044 for "mess_sv".
11045
11046 The error message will be used as an exception, by default
11047 returning control to the nearest enclosing "eval", but subject
11048 to modification by a $SIG{__DIE__} handler. In any case, the
11049 "croak" function never returns normally.
11050
11051 For historical reasons, if "pat" is null then the contents of
11052 "ERRSV" ($@) will be used as an error message or object instead
11053 of building an error message from arguments. If you want to
11054 throw a non-string object, or build an error message in an SV
11055 yourself, it is preferable to use the "croak_sv" function,
11056 which does not involve clobbering "ERRSV".
11057
11058 void croak(const char *pat, ...)
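
The rule that a message lacking a trailing newline is extended with location information can be illustrated with a stand-alone sketch. It does not call into Perl: mess_sketch is a hypothetical helper, the file and line are passed in explicitly, and the " at FILE line N." wording is an assumption modelled on the familiar output of Perl's "die".

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/* Format pat and its arguments into buf; if the result does not end
 * with a newline, append " at FILE line N.\n". */
static void
mess_sketch(char *buf, size_t size, const char *file, int line,
            const char *pat, ...)
{
    va_list args;
    size_t len;

    assert(size > 0);
    va_start(args, pat);
    vsnprintf(buf, size, pat, args);
    va_end(args);

    len = strlen(buf);
    if (len == 0 || buf[len - 1] != '\n')
        snprintf(buf + len, size - len, " at %s line %d.\n", file, line);
}
```

Ending the format with a newline therefore suppresses the location suffix, which is the same convention XS authors rely on when calling "croak".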
11059
11060 croak_no_modify
11061 Exactly equivalent to "Perl_croak(aTHX_ "%s", PL_no_modify)",
11062 but generates terser object code than using "Perl_croak". Less
11063 code used on exception code paths reduces CPU cache pressure.
11064
11065 void croak_no_modify()
11066
11067 croak_sv
11068 This is an XS interface to Perl's "die" function.
11069
11070 "baseex" is the error message or object. If it is a reference,
11071 it will be used as-is. Otherwise it is used as a string, and
11072 if it does not end with a newline then it will be extended with
11073 some indication of the current location in the code, as
11074 described for "mess_sv".
11075
11076 The error message or object will be used as an exception, by
11077 default returning control to the nearest enclosing "eval", but
11078 subject to modification by a $SIG{__DIE__} handler. In any
11079 case, the "croak_sv" function never returns normally.
11080
11081 To die with a simple string message, the "croak" function may
11082 be more convenient.
11083
11084 void croak_sv(SV *baseex)
11085
11086 die Behaves the same as "croak", except for the return type. It
11087 should be used only where the "OP *" return type is required.
11088 The function never actually returns.
11089
11090 OP * die(const char *pat, ...)
11091
11092 die_sv Behaves the same as "croak_sv", except for the return type. It
11093 should be used only where the "OP *" return type is required.
11094 The function never actually returns.
11095
11096 OP * die_sv(SV *baseex)
11097
11098 vcroak This is an XS interface to Perl's "die" function.
11099
11100 "pat" and "args" are a sprintf-style format pattern and
11101 encapsulated argument list. These are used to generate a
11102 string message. If the message does not end with a newline,
11103 then it will be extended with some indication of the current
11104 location in the code, as described for "mess_sv".
11105
11106 The error message will be used as an exception, by default
11107 returning control to the nearest enclosing "eval", but subject
11108 to modification by a $SIG{__DIE__} handler. In any case, the
11109 "vcroak" function never returns normally.
11110
11111 For historical reasons, if "pat" is null then the contents of
11112 "ERRSV" ($@) will be used as an error message or object instead
11113 of building an error message from arguments. If you want to
11114 throw a non-string object, or build an error message in an SV
11115 yourself, it is preferable to use the "croak_sv" function,
11116 which does not involve clobbering "ERRSV".
11117
11118 void vcroak(const char *pat, va_list *args)
11119
11120 vwarn This is an XS interface to Perl's "warn" function.
11121
11122 "pat" and "args" are a sprintf-style format pattern and
11123 encapsulated argument list. These are used to generate a
11124 string message. If the message does not end with a newline,
11125 then it will be extended with some indication of the current
11126 location in the code, as described for "mess_sv".
11127
11128 The error message or object will by default be written to
11129 standard error, but this is subject to modification by a
11130 $SIG{__WARN__} handler.
11131
11132 Unlike with "vcroak", "pat" is not permitted to be null.
11133
11134 void vwarn(const char *pat, va_list *args)
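
The relationship between a variadic function and its "v" counterpart taking an encapsulated argument list (as with "warn" and "vwarn", or "croak" and "vcroak") follows a common C pattern, sketched below without any Perl headers. The names report_sketch, vreport_sketch and last_message are hypothetical, and the message is stored in a buffer rather than written to standard error.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

static char last_message[256];

/* "v" variant: takes a pointer to an already started va_list. */
static void vreport_sketch(const char *pat, va_list *args)
{
    assert(pat != NULL);    /* as with vwarn, a null pattern is not allowed */
    vsnprintf(last_message, sizeof last_message, pat, *args);
}

/* Variadic front end: packages its arguments and forwards them. */
static void report_sketch(const char *pat, ...)
{
    va_list args;

    va_start(args, pat);
    vreport_sketch(pat, &args);
    va_end(args);
}
```

Wrappers that add their own leading parameters can start a va_list in the same way and hand its address to the "v" variant.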
11135
11136 warn This is an XS interface to Perl's "warn" function.
11137
11138 Take a sprintf-style format pattern and argument list. These
11139 are used to generate a string message. If the message does not
11140 end with a newline, then it will be extended with some
11141 indication of the current location in the code, as described
11142 for "mess_sv".
11143
11144 The error message or object will by default be written to
11145 standard error, but this is subject to modification by a
11146 $SIG{__WARN__} handler.
11147
11148 Unlike with "croak", "pat" is not permitted to be null.
11149
11150 void warn(const char *pat, ...)
11151
11152 warn_sv This is an XS interface to Perl's "warn" function.
11153
11154 "baseex" is the error message or object. If it is a reference,
11155 it will be used as-is. Otherwise it is used as a string, and
11156 if it does not end with a newline then it will be extended with
11157 some indication of the current location in the code, as
11158 described for "mess_sv".
11159
11160 The error message or object will by default be written to
11161 standard error, but this is subject to modification by a
11162 $SIG{__WARN__} handler.
11163
11164 To warn with a simple string message, the "warn" function may
11165 be more convenient.
11166
11167 void warn_sv(SV *baseex)
11168
11170 The following functions have been flagged as part of the public API,
11171 but are currently undocumented. Use them at your own risk, as the
11172 interfaces are subject to change. Functions that are not listed in
11173 this document are not intended for public use, and should NOT be used
11174 under any circumstances.
11175
11176 If you feel you need to use one of these functions, first send email to
11177 perl5-porters@perl.org <mailto:perl5-porters@perl.org>. It may be that
11178 there is a good reason for the function not being documented, and it
11179 should be removed from this list; or it may just be that no one has
11180 gotten around to documenting it. In the latter case, you will be asked
11181 to submit a patch to document the function. Once your patch is
11182 accepted, it will indicate that the interface is stable (unless it is
11183 explicitly marked otherwise) and usable by you.
11184
11185 GetVars
11186 Gv_AMupdate
11187 PerlIO_clearerr
11188 PerlIO_close
11189 PerlIO_context_layers
11190 PerlIO_eof
11191 PerlIO_error
11192 PerlIO_fileno
11193 PerlIO_fill
11194 PerlIO_flush
11195 PerlIO_get_base
11196 PerlIO_get_bufsiz
11197 PerlIO_get_cnt
11198 PerlIO_get_ptr
11199 PerlIO_read
11200 PerlIO_seek
11201 PerlIO_set_cnt
11202 PerlIO_set_ptrcnt
11203 PerlIO_setlinebuf
11204 PerlIO_stderr
11205 PerlIO_stdin
11206 PerlIO_stdout
11207 PerlIO_tell
11208 PerlIO_unread
11209 PerlIO_write
11210 _variant_byte_number
11211 amagic_call
11212 amagic_deref_call
11213 any_dup
11214 atfork_lock
11215 atfork_unlock
11216 av_arylen_p
11217 av_iter_p
11218 block_gimme
11219 call_atexit
11220 call_list
11221 calloc
11222 cast_i32
11223 cast_iv
11224 cast_ulong
11225 cast_uv
11226 ck_warner
11227 ck_warner_d
11228 ckwarn
11229 ckwarn_d
11230 clear_defarray
11231 clone_params_del
11232 clone_params_new
11233 croak_memory_wrap
11234 croak_nocontext
11235 csighandler
11236 cx_dump
11237 cx_dup
11238 cxinc
11239 deb
11240 deb_nocontext
11241 debop
11242 debprofdump
11243 debstack
11244 debstackptrs
11245 delimcpy
11246 despatch_signals
11247 die_nocontext
11248 dirp_dup
11249 do_aspawn
11250 do_binmode
11251 do_close
11252 do_gv_dump
11253 do_gvgv_dump
11254 do_hv_dump
11255 do_join
11256 do_magic_dump
11257 do_op_dump
11258 do_open
11259 do_open9
11260 do_openn
11261 do_pmop_dump
11262 do_spawn
11263 do_spawn_nowait
11264 do_sprintf
11265 do_sv_dump
11266 doing_taint
11267 doref
11268 dounwind
11269 dowantarray
11270 dump_eval
11271 dump_form
11272 dump_indent
11273 dump_mstats
11274 dump_sub
11275 dump_vindent
11276 filter_add
11277 filter_del
11278 filter_read
11279 foldEQ_latin1
11280 form_nocontext
11281 fp_dup
11282 fprintf_nocontext
11283 free_global_struct
11284 free_tmps
11285 get_context
11286 get_mstats
11287 get_op_descs
11288 get_op_names
11289 get_ppaddr
11290 get_vtbl
11291 gp_dup
11292 gp_free
11293 gp_ref
11294 gv_AVadd
11295 gv_HVadd
11296 gv_IOadd
11297 gv_SVadd
11298 gv_add_by_type
11299 gv_autoload4
11300 gv_autoload_pv
11301 gv_autoload_pvn
11302 gv_autoload_sv
11303 gv_check
11304 gv_dump
11305 gv_efullname
11306 gv_efullname3
11307 gv_efullname4
11308 gv_fetchfile
11309 gv_fetchfile_flags
11310 gv_fetchpv
11311 gv_fetchpvn_flags
11312 gv_fetchsv
11313 gv_fullname
11314 gv_fullname3
11315 gv_fullname4
11316 gv_handler
11317 gv_name_set
11318 he_dup
11319 hek_dup
11320 hv_common
11321 hv_common_key_len
11322 hv_delayfree_ent
11323 hv_eiter_p
11324 hv_eiter_set
11325 hv_free_ent
11326 hv_ksplit
11327 hv_name_set
11328 hv_placeholders_get
11329 hv_placeholders_set
11330 hv_rand_set
11331 hv_riter_p
11332 hv_riter_set
11333 ibcmp_utf8
11334 init_global_struct
11335 init_stacks
11336 init_tm
11337 instr
11338 is_lvalue_sub
11339 leave_scope
11340 load_module_nocontext
11341 magic_dump
11342 malloc
11343 markstack_grow
11344 mess_nocontext
11345 mfree
11346 mg_dup
11347 mg_size
11348 mini_mktime
11349 moreswitches
11350 mro_get_from_name
11351 mro_get_private_data
11352 mro_set_mro
11353 mro_set_private_data
11354 my_atof
11355 my_atof2
11356 my_atof3
11357 my_chsize
11358 my_cxt_index
11359 my_cxt_init
11360 my_dirfd
11361 my_exit
11362 my_failure_exit
11363 my_fflush_all
11364 my_fork
11365 my_lstat
11366 my_pclose
11367 my_popen
11368 my_popen_list
11369 my_setenv
11370 my_socketpair
11371 my_stat
11372 my_strftime
11373 newANONATTRSUB
11374 newANONHASH
11375 newANONLIST
11376 newANONSUB
11377 newATTRSUB
11378 newAVREF
11379 newCVREF
11380 newFORM
11381 newGVREF
11382 newGVgen
11383 newGVgen_flags
11384 newHVREF
11385 newHVhv
11386 newIO
11387 newMYSUB
11388 newPROG
11389 newRV
11390 newSUB
11391 newSVREF
11392 newSVpvf_nocontext
11393 newSVsv_flags
11394 new_stackinfo
11395 op_refcnt_lock
11396 op_refcnt_unlock
11397 parser_dup
11398 perl_alloc_using
11399 perl_clone_using
11400 pmop_dump
11401 pop_scope
11402 pregcomp
11403 pregexec
11404 pregfree
11405 pregfree2
11406 printf_nocontext
11407 ptr_table_fetch
11408 ptr_table_free
11409 ptr_table_new
11410 ptr_table_split
11411 ptr_table_store
11412 push_scope
11413 re_compile
11414 re_dup_guts
11415 re_intuit_start
11416 re_intuit_string
11417 realloc
11418 reentrant_free
11419 reentrant_init
11420 reentrant_retry
11421 reentrant_size
11422 ref
11423 reg_named_buff_all
11424 reg_named_buff_exists
11425 reg_named_buff_fetch
11426 reg_named_buff_firstkey
11427 reg_named_buff_nextkey
11428 reg_named_buff_scalar
11429 regdump
11430 regdupe_internal
11431 regexec_flags
11432 regfree_internal
11433 reginitcolors
11434 regnext
11435 repeatcpy
11436 rsignal
11437 rsignal_state
11438 runops_debug
11439 runops_standard
11440 rvpv_dup
11441 safesyscalloc
11442 safesysfree
11443 safesysmalloc
11444 safesysrealloc
11445 save_I16
11446 save_I32
11447 save_I8
11448 save_adelete
11449 save_aelem
11450 save_aelem_flags
11451 save_alloc
11452 save_aptr
11453 save_ary
11454 save_bool
11455 save_clearsv
11456 save_delete
11457 save_destructor
11458 save_destructor_x
11459 save_freeop
11460 save_freepv
11461 save_freesv
11462 save_generic_pvref
11463 save_generic_svref
11464 save_hash
11465 save_hdelete
11466 save_helem
11467 save_helem_flags
11468 save_hints
11469 save_hptr
11470 save_int
11471 save_item
11472 save_iv
11473 save_list
11474 save_long
11475 save_mortalizesv
11476 save_nogv
11477 save_op
11478 save_padsv_and_mortalize
11479 save_pptr
11480 save_pushi32ptr
11481 save_pushptr
11482 save_pushptrptr
11483 save_re_context
11484 save_scalar
11485 save_set_svflags
11486 save_shared_pvref
11487 save_sptr
11488 save_svref
11489 save_vptr
11490 savestack_grow
11491 savestack_grow_cnt
11492 scan_num
11493 scan_vstring
11494 seed
11495 set_context
11496 share_hek
11497 si_dup
11498 ss_dup
11499 stack_grow
11500 start_subparse
11501 str_to_version
11502 sv_2iv
11503 sv_2pv
11504 sv_2uv
11505 sv_catpvf_mg_nocontext
11506 sv_catpvf_nocontext
11507 sv_dup
11508 sv_dup_inc
11509 sv_peek
11510 sv_pvn_nomg
11511 sv_setpvf_mg_nocontext
11512 sv_setpvf_nocontext
11513 sys_init
11514 sys_init3
11515 sys_intern_clear
11516 sys_intern_dup
11517 sys_intern_init
11518 sys_term
11519 taint_env
11520 taint_proper
11521 unlnk
11522 unsharepvn
11523 uvuni_to_utf8
11524 vdeb
11525 vform
11526 vload_module
11527 vnewSVpvf
11528 vwarner
11529 warn_nocontext
11530 warner
11531 warner_nocontext
11532 whichsig
11533 whichsig_pv
11534 whichsig_pvn
11535 whichsig_sv
11536
11538 Until May 1997, this document was maintained by Jeff Okamoto
11539 <okamoto@corp.hp.com>. It is now maintained as part of Perl itself.
11540
11541 With lots of help and suggestions from Dean Roehrich, Malcolm Beattie,
11542 Andreas Koenig, Paul Hudson, Ilya Zakharevich, Paul Marquess, Neil
11543 Bowers, Matthew Green, Tim Bunce, Spider Boardman, Ulrich Pfeifer,
11544 Stephen McCamant, and Gurusamy Sarathy.
11545
11546 API Listing originally by Dean Roehrich <roehrich@cray.com>.
11547
11548 Updated to be autogenerated from comments in the source by Benjamin
11549 Stuhl.
11550
11552 perlguts, perlxs, perlxstut, perlintern
11553
11554
11555
11556perl v5.30.1 2019-11-29 PERLAPI(1)