XS(3) User Contributed Perl Documentation XS(3)

6 Cpanel::JSON::XS - cPanel fork of JSON::XS, fast and correct
7 serializing
8
10 use Cpanel::JSON::XS;
11
12 # exported functions, they croak on error
13 # and expect/generate UTF-8
14
15 $utf8_encoded_json_text = encode_json $perl_hash_or_arrayref;
16 $perl_hash_or_arrayref = decode_json $utf8_encoded_json_text;
17
18 # OO-interface
19
20 $coder = Cpanel::JSON::XS->new->ascii->pretty->allow_nonref;
21 $pretty_printed_unencoded = $coder->encode ($perl_scalar);
22 $perl_scalar = $coder->decode ($unicode_json_text);
23
24 # Note that 5.6 misses most smart utf8 and encoding functionalities
25 # of newer releases.
26
27 # Note that L<JSON::MaybeXS> will automatically use Cpanel::JSON::XS
28 # if available, at virtually no speed overhead either, so you should
29 # be able to just:
30
31 use JSON::MaybeXS;
32
33 # and do the same things, except that you have a pure-perl fallback now.
34
35 Note that this module will be replaced by a new JSON::Safe module soon,
36 with the same API just guaranteed safe defaults.
37
39 This module converts Perl data structures to JSON and vice versa. Its
40 primary goal is to be correct and its secondary goal is to be fast. To
41 reach the latter goal it was written in C.
42
43 As this is the n-th-something JSON module on CPAN, what was the reason
44 to write yet another JSON module? While it seems there are many JSON
45 modules, none of them correctly handle all corner cases, and in most
46 cases their maintainers are unresponsive, gone missing, or not
47 listening to bug reports for other reasons.
48
49 See below for the cPanel fork.
50
51 See MAPPING, below, on how Cpanel::JSON::XS maps perl values to JSON
52 values and vice versa.
53
54 FEATURES
55 • correct Unicode handling
56
This module knows how to handle Unicode with Perl versions newer
than 5.8.5, documents how and when it does so, and even documents
what "correct" means.
60
61 • round-trip integrity
62
63 When you serialize a perl data structure using only data types
64 supported by JSON and Perl, the deserialized data structure is
65 identical on the Perl level. (e.g. the string "2.0" doesn't
66 suddenly become "2" just because it looks like a number). There are
67 minor exceptions to this, read the MAPPING section below to learn
68 about those.
69
70 • strict checking of JSON correctness
71
There is no guessing, no generating of illegal JSON texts by
default, and only JSON is accepted as input by default. The latter
is a security feature.
75
76 • fast
77
78 Compared to other JSON modules and other serializers such as
79 Storable, this module usually compares favourably in terms of
80 speed, too.
81
82 • simple to use
83
84 This module has both a simple functional interface as well as an
85 object oriented interface.
86
87 • reasonably versatile output formats
88
89 You can choose between the most compact guaranteed-single-line
90 format possible (nice for simple line-based protocols), a pure-
91 ASCII format (for when your transport is not 8-bit clean, still
92 supports the whole Unicode range), or a pretty-printed format (for
93 when you want to read that stuff). Or you can combine those
94 features in whatever way you like.
95
96 cPanel fork
Since the original author MLEHMANN has no public bugtracker, this
cPanel fork is now hosted on GitHub.
99
100 src repo: <https://github.com/rurban/Cpanel-JSON-XS> original:
101 <http://cvs.schmorp.de/JSON-XS/>
102
103 RT: <https://github.com/rurban/Cpanel-JSON-XS/issues> or
104 <https://rt.cpan.org/Public/Dist/Display.html?Queue=Cpanel-JSON-XS>
105
106 Changes to JSON::XS
107
108 - bare hashkeys are now checked for utf8. (GH #209)
109
110 - stricter decode_json() as documented. non-refs are disallowed.
111 safe by default.
112 added a 2nd optional argument. decode() honors now allow_nonref.
113
114 - fixed encode of numbers for dual-vars. Different string
115 representations are preserved, but numbers with temporary strings
116 which represent the same number are here treated as numbers, not
117 strings. Cpanel::JSON::XS is a bit slower, but preserves numeric
118 types better.
119
- numbers ending with .0 stay numbers, are not converted to
integers. [#63] dual-vars which are represented as number not
integer (42+"bar" != 5.8.9) are now encoded as number (=> 42.0)
because internally it's now a NOK type. However !!1 which is
wrongly encoded in 5.8 as "1"/1.0 is still represented as integer.
125
126 - different handling of inf/nan. Default now to null, optionally with
127 stringify_infnan() to "inf"/"nan". [#28, #32]
128
129 - added "binary" extension, non-JSON and non JSON parsable, allows
130 "\xNN" and "\NNN" sequences.
131
132 - 5.6.2 support; sacrificing some utf8 features (assuming bytes
133 all-over), no multi-byte unicode characters with 5.6.
134
135 - interop for true/false overloading. JSON::XS, JSON::PP and Mojo::JSON
136 representations for booleans are accepted and JSON::XS accepts
137 Cpanel::JSON::XS booleans [#13, #37]
Fixed overloading of booleans. Cpanel::JSON::XS::true stringifies
again to "1", not "true", analogous to all other JSON modules.
141
142 - native boolean mapping of yes and no to true and false, as in
143 YAML::XS.
144 In perl "!0" is yes, "!1" is no.
145 The JSON value true maps to 1, false maps to 0. [#39]
146
147 - support arbitrary stringification with encode, with convert_blessed
148 and allow_blessed.
149
- ithread support. Cpanel::JSON::XS is thread-safe, JSON::XS is not.

- is_bool can be called as a method, JSON::XS::is_bool cannot.
153
154 - performance optimizations for threaded Perls
155
156 - relaxed mode, allowing many popular extensions
157
158 - protect our magic object from corruption by wrong or missing external
159 methods, like FREEZE/THAW or serialization with other methods.
160
161 - additional fixes for:
162
163 - #208 - no security-relevant out-of-bounds reading of module memory
164 when decoding hash keys without ending ':'
165
166 - [cpan #88061] AIX atof without USE_LONG_DOUBLE
167
168 - #10 unshare_hek crash
169
170 - #7, #29 avoid re-blessing where possible. It fails in JSON::XS for
171 READONLY values, i.e. restricted hashes.
172
173 - #41 overloading of booleans, use the object not the reference.
174
175 - #62 -Dusequadmath conversion and no SEGV.
176
- #72 parsing of values followed by \0, like 1\0, does fail.
178
179 - #72 parsing of illegal unicode or non-unicode characters.
180
181 - #96 locale-insensitive numeric conversion.
182
183 - #154 numeric conversion fixed since 5.22, using the same strtold as perl5.
184
185 - #167 sort tied hashes with canonical.
186
187 - #212 fix utf8 object stringification
188
189 - public maintenance and bugtracker
190
191 - use ppport.h, sanify XS.xs comment styles, harness C coding style
192
193 - common::sense is optional. When available it is not used in the
194 published production module, just during development and testing.
195
- extended testsuite, passes all http://seriot.ch/parsing_json.html
tests. In fact it is the only known JSON decoder which does so,
while also being the fastest.
199
200 - support many more options and methods from JSON::PP:
201 stringify_infnan, allow_unknown, allow_stringify, allow_barekey,
202 encode_stringify, allow_bignum, allow_singlequote,
203 dupkeys_as_arrayref,
204 sort_by (partially), escape_slash, convert_blessed, ...
205 optional decode_json(, allow_nonref) arg.
206 relaxed implements allow_dupkeys.
207
208 - support all 5 unicode BOM's: UTF-8, UTF-16LE, UTF-16BE, UTF-32LE,
209 UTF-32BE, encoding internally to UTF-8.
210
212 The following convenience methods are provided by this module. They are
213 exported by default:
214
215 $json_text = encode_json $perl_scalar, [json_type]
216 Converts the given Perl data structure to a UTF-8 encoded, binary
217 string (that is, the string contains octets only). Croaks on error.
218
219 This function call is functionally identical to:
220
221 $json_text = Cpanel::JSON::XS->new->utf8->encode ($perl_scalar, $json_type)
222
except being faster.
224
225 For the type argument see Cpanel::JSON::XS::Type.
226
227 $perl_scalar = decode_json $json_text [, $allow_nonref [, my $json_type
228 ] ]
The opposite of "encode_json": expects a UTF-8 (binary) string
holding a JSON reference (an object or array) and tries to parse
that as a UTF-8 encoded JSON text, returning the resulting
reference. Croaks on error.
232
233 This function call is functionally identical to:
234
235 $perl_scalar = Cpanel::JSON::XS->new->utf8->decode ($json_text, $json_type)
236
237 except being faster.
238
Note that decode_json in Cpanel::JSON::XS versions older than
3.0116 and in JSON::XS did not set allow_nonref but allowed
non-references anyway due to a bug in the decoder.
242
If the new 2nd optional $allow_nonref argument is set and not
false, the "allow_nonref" option will be set and the function will
act as described in the relaxed RFC 7159, allowing all values
such as objects, arrays, strings, numbers, "null", "true", and
"false". See ""OLD" VS. "NEW" JSON (RFC 4627 VS. RFC 7159)" below
for why you don't want to do that.
249
250 For the 3rd optional type argument see Cpanel::JSON::XS::Type.
251
252 $is_boolean = Cpanel::JSON::XS::is_bool $scalar
253 Returns true if the passed scalar represents either
254 "JSON::PP::true" or "JSON::PP::false", two constants that act like
255 1 and 0, respectively and are used to represent JSON "true" and
256 "false" values in Perl. (Also recognizes the booleans produced by
257 JSON::XS.)
258
259 See MAPPING, below, for more information on how JSON values are
260 mapped to Perl.
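
As a minimal sketch of the functional interface, the following
round-trips a hash through encode_json and decode_json, checks a
decoded boolean with is_bool, and uses the optional $allow_nonref
argument. The sample data and variable names are illustrative only.

```perl
use strict;
use warnings;
use Cpanel::JSON::XS qw(encode_json decode_json);

# A reference to 1 is encoded as the JSON literal true (see MAPPING).
my $json_text = encode_json({ name => "caf\x{e9}", active => \1 });

# $json_text holds UTF-8 octets; decode_json turns them back into
# Perl characters.
my $data = decode_json($json_text);
print "round trip ok\n" if $data->{name} eq "caf\x{e9}";
print "active is a JSON boolean\n"
    if Cpanel::JSON::XS::is_bool($data->{active});

# A bare JSON string is not a reference, so the second argument is
# needed to accept it.
my $scalar = decode_json('"hello"', 1);
print "$scalar\n";
```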
261
263 from_json
264 from_json has been renamed to decode_json
265
266 to_json
267 to_json has been renamed to encode_json
268
270 Since this often leads to confusion, here are a few very clear words on
271 how Unicode works in Perl, modulo bugs.
272
273 1. Perl strings can store characters with ordinal values > 255.
274 This enables you to store Unicode characters as single characters
275 in a Perl string - very natural.
276
277 2. Perl does not associate an encoding with your strings.
278 ... until you force it to, e.g. when matching it against a regex,
279 or printing the scalar to a file, in which case Perl either
280 interprets your string as locale-encoded text, octets/binary, or as
281 Unicode, depending on various settings. In no case is an encoding
282 stored together with your data, it is use that decides encoding,
283 not any magical meta data.
284
285 3. The internal utf-8 flag has no meaning with regards to the encoding
286 of your string.
287 4. A "Unicode String" is simply a string where each character can be
288 validly interpreted as a Unicode code point.
289 If you have UTF-8 encoded data, it is no longer a Unicode string,
290 but a Unicode string encoded in UTF-8, giving you a binary string.
291
292 5. A string containing "high" (> 255) character values is not a UTF-8
293 string.
294 6. Unicode noncharacters only warn, as in core.
295 The 66 Unicode noncharacters U+FDD0..U+FDEF, and U+*FFFE, U+*FFFF
296 just warn, see <http://www.unicode.org/versions/corrigendum9.html>.
297 But illegal surrogate pairs fail to parse.
298
299 7. Raw non-Unicode characters above U+10FFFF are disallowed.
Raw non-Unicode characters outside the valid unicode range fail to
parse, because "A string is a sequence of zero or more Unicode
characters" (RFC 7159 section 1) and "JSON text SHALL be encoded in
Unicode" (RFC 7159 section 8.1). We now use the UTF8_DISALLOW_SUPER
flag when parsing unicode.
305
306 I hope this helps :)
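
The difference between a Unicode (character) string and its UTF-8
encoded (binary) form can be sketched as follows. This is only an
illustration using the core Encode module; the variable names are
made up for the example.

```perl
use strict;
use warnings;
use Encode qw(encode);
use Cpanel::JSON::XS;

my $chars = "\x{20ac}";                 # one character: EURO SIGN
my $bytes = encode("UTF-8", $chars);    # three octets: E2 82 AC

printf "%d character, %d octets\n", length($chars), length($bytes);

# Without ->utf8 the result is a character string ...
my $unicode_json = Cpanel::JSON::XS->new->encode([$chars]);

# ... with ->utf8 it is a binary string of UTF-8 octets.
my $utf8_json = Cpanel::JSON::XS->new->utf8->encode([$chars]);

printf "%d characters vs. %d octets\n",
    length($unicode_json), length($utf8_json);
```

Encoding the character result with Encode yields the same octets
that the utf8 flag produces directly.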
307
309 The object oriented interface lets you configure your own encoding or
310 decoding style, within the limits of supported formats.
311
312 $json = new Cpanel::JSON::XS
313 Creates a new JSON object that can be used to de/encode JSON
314 strings. All boolean flags described below are by default disabled.
315
316 The mutators for flags all return the JSON object again and thus
317 calls can be chained:
318
319 my $json = Cpanel::JSON::XS->new->utf8->space_after->encode ({a => [1,2]})
320 => {"a": [1, 2]}
321
322 $json = $json->ascii ([$enable])
323 $enabled = $json->get_ascii
324 If $enable is true (or missing), then the "encode" method will not
325 generate characters outside the code range 0..127 (which is ASCII).
326 Any Unicode characters outside that range will be escaped using
either a single "\uXXXX" (BMP characters) or a double
"\uHHHH\uLLLL" escape sequence, as per RFC4627. The resulting
329 encoded JSON text can be treated as a native Unicode string, an
330 ascii-encoded, latin1-encoded or UTF-8 encoded string, or any other
331 superset of ASCII.
332
333 If $enable is false, then the "encode" method will not escape
334 Unicode characters unless required by the JSON syntax or other
335 flags. This results in a faster and more compact format.
336
337 See also the section ENCODING/CODESET FLAG NOTES later in this
338 document.
339
340 The main use for this flag is to produce JSON texts that can be
341 transmitted over a 7-bit channel, as the encoded JSON texts will
342 not contain any 8 bit characters.
343
344 Cpanel::JSON::XS->new->ascii (1)->encode ([chr 0x10401])
345 => ["\ud801\udc01"]
346
347 $json = $json->latin1 ([$enable])
348 $enabled = $json->get_latin1
349 If $enable is true (or missing), then the "encode" method will
350 encode the resulting JSON text as latin1 (or ISO-8859-1), escaping
351 any characters outside the code range 0..255. The resulting string
352 can be treated as a latin1-encoded JSON text or a native Unicode
353 string. The "decode" method will not be affected in any way by this
354 flag, as "decode" by default expects Unicode, which is a strict
355 superset of latin1.
356
357 If $enable is false, then the "encode" method will not escape
358 Unicode characters unless required by the JSON syntax or other
359 flags.
360
361 See also the section ENCODING/CODESET FLAG NOTES later in this
362 document.
363
364 The main use for this flag is efficiently encoding binary data as
365 JSON text, as most octets will not be escaped, resulting in a
366 smaller encoded size. The disadvantage is that the resulting JSON
367 text is encoded in latin1 (and must correctly be treated as such
368 when storing and transferring), a rare encoding for JSON. It is
369 therefore most useful when you want to store data structures known
370 to contain binary data efficiently in files or databases, not when
371 talking to other JSON encoders/decoders.
372
Cpanel::JSON::XS->new->latin1->encode (["\x{89}\x{abc}"])
=> ["\x{89}\\u0abc"] # (perl syntax, U+abc escaped, U+89 not)
375
376 $json = $json->binary ([$enable])
$enabled = $json->get_binary
If the $enable argument is true (or missing), then the "encode"
method will not try to detect a UTF-8 encoding in any JSON string;
it will strictly interpret it as a byte sequence. The result might
contain new "\xNN" sequences, which are not parsable as JSON. The
"decode" method forbids "\uNNNN" sequences and accepts "\xNN" and
octal "\NNN" sequences.
384
There is also special logic for perl 5.6 and utf8. 5.6 encodes
any string to utf-8 automatically when seeing a codepoint >= 0x80
and < 0x100. With the binary flag enabled, the perl utf8 encoded
string is decoded to the original byte encoding and then encoded
with "\xNN" escapes. This results in the same encodings as with
newer perls. But note that binary multi-byte codepoints with 5.6
will result in "illegal unicode character in binary string" errors,
unlike with newer perls.
393
If $enable is false, then the "encode" method will smartly try to
detect Unicode characters, unless other behavior is required by the
JSON syntax or other flags, and hex and octal sequences are
forbidden.
397
398 See also the section ENCODING/CODESET FLAG NOTES later in this
399 document.
400
401 The main use for this flag is to avoid the smart unicode detection
402 and possible double encoding. The disadvantage is that the
403 resulting JSON text is encoded in new "\xNN" and in latin1
404 characters and must correctly be treated as such when storing and
405 transferring, a rare encoding for JSON. It will produce non-
406 readable JSON strings in the browser. It is therefore most useful
407 when you want to store data structures known to contain binary data
408 efficiently in files or databases, not when talking to other JSON
409 encoders/decoders. The binary decoding method can also be used
410 when an encoder produced a non-JSON conformant hex or octal
411 encoding "\xNN" or "\NNN".
412
413 Cpanel::JSON::XS->new->binary->encode (["\x{89}\x{abc}"])
414 5.6: Error: malformed or illegal unicode character in binary string
415 >=5.8: ['\x89\xe0\xaa\xbc']
416
417 Cpanel::JSON::XS->new->binary->encode (["\x{89}\x{bc}"])
418 => ["\x89\xbc"]
419
Cpanel::JSON::XS->new->binary->decode ('["\x89\ua001"]')
Error: malformed or illegal unicode character in binary string

Cpanel::JSON::XS->new->decode ('["\x89"]')
Error: illegal hex character in non-binary string
425
426 $json = $json->utf8 ([$enable])
427 $enabled = $json->get_utf8
428 If $enable is true (or missing), then the "encode" method will
429 encode the JSON result into UTF-8, as required by many protocols,
while the "decode" method expects to be handed a UTF-8-encoded
431 string. Please note that UTF-8-encoded strings do not contain any
432 characters outside the range 0..255, they are thus useful for
433 bytewise/binary I/O. In future versions, enabling this option might
434 enable autodetection of the UTF-16 and UTF-32 encoding families, as
435 described in RFC4627.
436
437 If $enable is false, then the "encode" method will return the JSON
string as a (non-encoded) Unicode string, while "decode" thus
expects a Unicode string. Any decoding or encoding (e.g. to UTF-8 or
440 UTF-16) needs to be done yourself, e.g. using the Encode module.
441
442 See also the section ENCODING/CODESET FLAG NOTES later in this
443 document.
444
445 Example, output UTF-16BE-encoded JSON:
446
447 use Encode;
448 $jsontext = encode "UTF-16BE", Cpanel::JSON::XS->new->encode ($object);
449
450 Example, decode UTF-32LE-encoded JSON:
451
452 use Encode;
453 $object = Cpanel::JSON::XS->new->decode (decode "UTF-32LE", $jsontext);
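
To compare the three codeset flags side by side, here is a small
sketch that encodes the same two characters with "ascii", "latin1"
and "utf8". The sample string is an assumption chosen so that one
character fits into latin1 and the other does not.

```perl
use strict;
use warnings;
use Cpanel::JSON::XS;

my $data = ["\x{e9}\x{20ac}"];    # U+00E9 and U+20AC

my $ascii  = Cpanel::JSON::XS->new->ascii->encode($data);
my $latin1 = Cpanel::JSON::XS->new->latin1->encode($data);
my $utf8   = Cpanel::JSON::XS->new->utf8->encode($data);

# ascii:  both characters escaped, result is 7-bit clean
# latin1: U+00E9 kept as the single octet 0xE9, U+20AC escaped
# utf8:   both characters written as UTF-8 octets
print "ascii:  $ascii\n";
printf "latin1: %d octets\n", length($latin1);
printf "utf8:   %d octets\n", length($utf8);
```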
454
455 $json = $json->pretty ([$enable])
456 This enables (or disables) all of the "indent", "space_before" and
457 "space_after" (and in the future possibly more) flags in one call
458 to generate the most readable (or most compact) form possible.
459
460 Example, pretty-print some simple structure:
461
462 my $json = Cpanel::JSON::XS->new->pretty(1)->encode ({a => [1,2]})
463 =>
464 {
465 "a" : [
466 1,
467 2
468 ]
469 }
470
471 $json = $json->indent ([$enable])
472 $enabled = $json->get_indent
473 If $enable is true (or missing), then the "encode" method will use
474 a multiline format as output, putting every array member or
475 object/hash key-value pair into its own line, indenting them
476 properly.
477
478 If $enable is false, no newlines or indenting will be produced, and
479 the resulting JSON text is guaranteed not to contain any
480 "newlines".
481
482 This setting has no effect when decoding JSON texts.
483
484 $json = $json->indent_length([$number_of_spaces])
485 $length = $json->get_indent_length()
Set the indent length (default 3). This option is only useful when
you also enable indent or pretty. The acceptable range is from 0
(no indentation) to 15.
489
490 $json = $json->space_before ([$enable])
491 $enabled = $json->get_space_before
492 If $enable is true (or missing), then the "encode" method will add
493 an extra optional space before the ":" separating keys from values
494 in JSON objects.
495
496 If $enable is false, then the "encode" method will not add any
497 extra space at those places.
498
499 This setting has no effect when decoding JSON texts. You will also
500 most likely combine this setting with "space_after".
501
502 Example, space_before enabled, space_after and indent disabled:
503
504 {"key" :"value"}
505
506 $json = $json->space_after ([$enable])
507 $enabled = $json->get_space_after
508 If $enable is true (or missing), then the "encode" method will add
509 an extra optional space after the ":" separating keys from values
510 in JSON objects and extra whitespace after the "," separating key-
511 value pairs and array members.
512
513 If $enable is false, then the "encode" method will not add any
514 extra space at those places.
515
516 This setting has no effect when decoding JSON texts.
517
518 Example, space_before and indent disabled, space_after enabled:
519
520 {"key": "value"}
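
The whitespace flags can be combined freely. The following sketch
encodes the same small structures with each flag on its own; the
data is made up for illustration.

```perl
use strict;
use warnings;
use Cpanel::JSON::XS;

my $data = { key => "value" };

my $compact = Cpanel::JSON::XS->new->encode($data);
my $before  = Cpanel::JSON::XS->new->space_before->encode($data);
my $after   = Cpanel::JSON::XS->new->space_after->encode($data);

print "$compact\n";    # {"key":"value"}
print "$before\n";     # {"key" :"value"}
print "$after\n";      # {"key": "value"}

# indent puts every array member on its own line; indent_length
# changes the width of one indentation level from 3 to 2 spaces.
my $indented = Cpanel::JSON::XS->new->indent->indent_length(2)
                               ->encode([1, 2]);
print $indented;
```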
521
522 $json = $json->relaxed ([$enable])
523 $enabled = $json->get_relaxed
If $enable is true (or missing), then "decode" will accept some
extensions to normal JSON syntax (see below). "encode" will not be
affected in any way. Be aware that this option makes you accept
invalid JSON texts as if they were valid! I suggest only using
this option to parse application-specific files written by humans
(configuration files, resource files etc.)
530
531 If $enable is false (the default), then "decode" will only accept
532 valid JSON texts.
533
534 Currently accepted extensions are:
535
536 • list items can have an end-comma
537
538 JSON separates array elements and key-value pairs with commas.
539 This can be annoying if you write JSON texts manually and want
540 to be able to quickly append elements, so this extension
541 accepts comma at the end of such items not just between them:
542
543 [
544 1,
545 2, <- this comma not normally allowed
546 ]
547 {
548 "k1": "v1",
549 "k2": "v2", <- this comma not normally allowed
550 }
551
552 • shell-style '#'-comments
553
554 Whenever JSON allows whitespace, shell-style comments are
555 additionally allowed. They are terminated by the first
556 carriage-return or line-feed character, after which more white-
557 space and comments are allowed.
558
559 [
560 1, # this comment not allowed in JSON
561 # neither this one...
562 ]
563
564 • literal ASCII TAB characters in strings
565
Literal ASCII TAB characters are now allowed in strings (and
treated as "\t") in relaxed mode, although JSON mandates that
a TAB character be written as the "\t" escape sequence.
569
570 [
571 "Hello\tWorld",
572 "Hello<TAB>World", # literal <TAB> would not normally be allowed
573 ]
574
575 • allow_singlequote
576
577 Single quotes are accepted instead of double quotes. See the
578 "allow_singlequote" option.
579
580 { "foo":'bar' }
581 { 'foo':"bar" }
582 { 'foo':'bar' }
583
584 • allow_barekey
585
586 Accept unquoted object keys instead of with mandatory double
587 quotes. See the "allow_barekey" option.
588
589 { foo:"bar" }
590
591 • allow_dupkeys
592
593 Allow decoding of duplicate keys in hashes. By default
594 duplicate keys are forbidden. See
595 <http://seriot.ch/parsing_json.php#24>: RFC 7159 section 4:
596 "The names within an object should be unique." See the
597 "allow_dupkeys" option.
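
A short sketch of relaxed decoding, using a hand-written
configuration text with '#'-comments and trailing commas. The
configuration content is invented for the example.

```perl
use strict;
use warnings;
use Cpanel::JSON::XS;

my $text = <<'EOT';
{
    # settings written by hand
    "name": "demo",
    "ports": [
        8080,
        8443,   # trailing comma after the last item
    ],
}
EOT

# The relaxed decoder accepts the comments and trailing commas ...
my $data = Cpanel::JSON::XS->new->relaxed->decode($text);
print "second port: $data->{ports}[1]\n";

# ... while the default decoder croaks on the same text.
my $strict_ok = eval { Cpanel::JSON::XS->new->decode($text); 1 };
print "strict decode failed: $@" unless $strict_ok;
```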
598
599 $json = $json->canonical ([$enable])
600 $enabled = $json->get_canonical
If $enable is true (or missing), then the "encode" method will
output JSON objects by sorting their keys. This adds a
comparatively high overhead.
604
605 If $enable is false, then the "encode" method will output key-value
606 pairs in the order Perl stores them (which will likely change
607 between runs of the same script, and can change even within the
608 same run from 5.18 onwards).
609
This option is useful if you want the same data structure to be
encoded as the same JSON text (given the same overall settings). If
it is disabled, the same hash might be encoded differently even if
it contains the same data, as key-value pairs have no inherent
ordering in Perl.
615
616 This setting has no effect when decoding JSON texts.
617
This is now also done with tied hashes, contrary to JSON::XS. But
note that for most large tied hashes stored as a tree it is better
to have the iterator return sorted keys and not to sort the hash
output here. Most such iterators are already sorted, e.g. DB_File
with "DB_BTREE".
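
As a brief illustration of "canonical" (the hash contents are
arbitrary), the keys below always come out in sorted order,
regardless of how Perl stores them internally:

```perl
use strict;
use warnings;
use Cpanel::JSON::XS;

my %h = (b => 2, c => 3, a => 1);

my $sorted = Cpanel::JSON::XS->new->canonical->encode(\%h);
print "$sorted\n";    # {"a":1,"b":2,"c":3}
```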
623
624 $json = $json->sort_by (undef, 0, 1 or a block)
625 This currently only (un)sets the "canonical" option, and ignores
626 custom sort blocks.
627
628 This setting has no effect when decoding JSON texts.
629
630 This setting has currently no effect on tied hashes.
631
632 $json = $json->escape_slash ([$enable])
633 $enabled = $json->get_escape_slash
According to the JSON grammar, the forward slash character (U+002F)
"/" may be escaped, but does not have to be. By default strings are
encoded without escaping slashes in all perl JSON encoders.
637
638 If $enable is true (or missing), then "encode" will escape slashes,
639 "\/".
640
641 This setting has no effect when decoding JSON texts.
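
A small sketch of the effect, with an example URL chosen only for
illustration. Both forms decode to the same Perl string:

```perl
use strict;
use warnings;
use Cpanel::JSON::XS;

my $url = ["http://example.com/"];

my $plain   = Cpanel::JSON::XS->new->encode($url);
my $escaped = Cpanel::JSON::XS->new->escape_slash->encode($url);

print "$plain\n";      # ["http://example.com/"]
print "$escaped\n";    # ["http:\/\/example.com\/"]

# Decoding accepts the escaped form without any extra flag.
my $back = Cpanel::JSON::XS->new->decode($escaped);
print "$back->[0]\n";
```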
642
643 $json = $json->unblessed_bool ([$enable])
644 $enabled = $json->get_unblessed_bool
646
647 If $enable is true (or missing), then "decode" will return Perl
648 non-object boolean variables (1 and 0) for JSON booleans ("true"
649 and "false"). If $enable is false, then "decode" will return
650 "JSON::PP::Boolean" objects for JSON booleans.
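
The difference can be sketched by decoding the same text twice, once
with and once without "unblessed_bool". The input text is
illustrative.

```perl
use strict;
use warnings;
use Cpanel::JSON::XS;

my $text = '[true,false]';

my $objects = Cpanel::JSON::XS->new->decode($text);
my $plain   = Cpanel::JSON::XS->new->unblessed_bool->decode($text);

# Default: boolean objects.
print ref($objects->[0]), "\n";    # JSON::PP::Boolean

# unblessed_bool: ordinary scalars that are simply true or false.
print ref($plain->[0]) ? "object" : "plain scalar", "\n";
print $plain->[0] ? "true" : "false", "\n";
print $plain->[1] ? "true" : "false", "\n";
```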
651
652 $json = $json->allow_singlequote ([$enable])
653 $enabled = $json->get_allow_singlequote
655
If $enable is true (or missing), then "decode" will accept
strings quoted with single quotes, which are not valid JSON.

$json->allow_singlequote->decode(q({"foo":'bar'}));
$json->allow_singlequote->decode(q({'foo':"bar"}));
$json->allow_singlequote->decode(q({'foo':'bar'}));

This is also enabled with "relaxed". As with the "relaxed"
option, this option may be used to parse application-specific files
written by humans.
666
667 $json = $json->allow_barekey ([$enable])
668 $enabled = $json->get_allow_barekey
670
If $enable is true (or missing), then "decode" will accept bare
(unquoted) object keys, which are not valid JSON.

As with the "relaxed" option, this option may be used to parse
application-specific files written by humans.
676
677 $json->allow_barekey->decode('{foo:"bar"}');
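
Both flags can be enabled on one decoder. A minimal sketch, with an
input text invented for the example:

```perl
use strict;
use warnings;
use Cpanel::JSON::XS;

my $json = Cpanel::JSON::XS->new->allow_singlequote->allow_barekey;

# One bare key with a single-quoted value, one regular pair.
my $data = $json->decode(q({foo:'bar', "baz":"qux"}));

print "$data->{foo} $data->{baz}\n";
```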
678
679 $json = $json->allow_bignum ([$enable])
680 $enabled = $json->get_allow_bignum
682
If $enable is true (or missing), then "decode" will convert big
integers that Perl cannot handle as integers into Math::BigInt
objects and convert any floating-point number into a
Math::BigFloat object.

Conversely, "encode" converts "Math::BigInt" and "Math::BigFloat"
objects into JSON numbers when "allow_blessed" is enabled.
690
691 $json->allow_nonref->allow_blessed->allow_bignum;
692 $bigfloat = $json->decode('2.000000000000000000000000001');
693 print $json->encode($bigfloat);
694 # => 2.000000000000000000000000001
695
696 See "MAPPING" about the normal conversion of JSON number.
697
698 $json = $json->allow_bigint ([$enable])
699 This option is obsolete and replaced by allow_bignum.
700
701 $json = $json->allow_nonref ([$enable])
702 $enabled = $json->get_allow_nonref
703 If $enable is true (or missing), then the "encode" method can
704 convert a non-reference into its corresponding string, number or
705 null JSON value, which is an extension to RFC4627. Likewise,
706 "decode" will accept those JSON values instead of croaking.
707
708 If $enable is false, then the "encode" method will croak if it
709 isn't passed an arrayref or hashref, as JSON texts must either be
710 an object or array. Likewise, "decode" will croak if given
711 something that is not a JSON object or array.
712
713 Example, encode a Perl scalar as JSON value with enabled
714 "allow_nonref", resulting in an invalid JSON text:
715
716 Cpanel::JSON::XS->new->allow_nonref->encode ("Hello, World!")
717 => "Hello, World!"
718
719 $json = $json->allow_unknown ([$enable])
720 $enabled = $json->get_allow_unknown
721 If $enable is true (or missing), then "encode" will not throw an
722 exception when it encounters values it cannot represent in JSON
723 (for example, filehandles) but instead will encode a JSON "null"
value. Note that blessed objects are not included here and are
handled separately by "allow_blessed" and "convert_blessed".
726
727 If $enable is false (the default), then "encode" will throw an
728 exception when it encounters anything it cannot encode as JSON.
729
730 This option does not affect "decode" in any way, and it is
731 recommended to leave it off unless you know your communications
732 partner.
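
A hedged sketch of the difference, using a code reference as an
example of a value that has no JSON representation (the data is
illustrative):

```perl
use strict;
use warnings;
use Cpanel::JSON::XS;

my $data = [1, sub { 42 }];

# Default: encode throws an exception for the code reference.
my $strict = eval { Cpanel::JSON::XS->new->encode($data) };
print "strict encode failed: $@" unless defined $strict;

# allow_unknown: the unrepresentable value becomes null.
my $lenient = Cpanel::JSON::XS->new->allow_unknown->encode($data);
print "$lenient\n";
```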
733
734 $json = $json->allow_stringify ([$enable])
735 $enabled = $json->get_allow_stringify
736 If $enable is true (or missing), then "encode" will stringify the
737 non-object perl value or reference. Note that blessed objects are
738 not included here and are handled separately by "allow_blessed" and
739 "convert_blessed". String references are stringified to the string
740 value, other references as in perl.
741
742 This option does not affect "decode" in any way.
743
744 This option is special to this module, it is not supported by other
745 encoders. So it is not recommended to use it.
746
747 $json = $json->require_types ([$enable])
$enabled = $json->get_require_types
750
If $enable is true (or missing), then "encode" will require either
"type_all_string" to be enabled or a second argument supplying the
JSON types. See Cpanel::JSON::XS::Type. When "type_all_string" is
not enabled and the second argument is not provided (or is undef),
"encode" croaks. It also croaks when the type provided for the
structure passed to "encode" is incomplete.
757
758 $json = $json->type_all_string ([$enable])
$enabled = $json->get_type_all_string
761
If $enable is true (or missing), then "encode" will always produce
stable, deterministic JSON string types in the resulting output.

When $enable is false, the encoded JSON output may differ between
Perl versions and may depend on the loaded modules.

This is useful if you need deterministic JSON types, independent
of the Perl version and other modules in use, but do not want to
write complicated type definitions for Cpanel::JSON::XS::Type.
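
As a rough sketch, assuming that "type_all_string" emits every
plain scalar as a JSON string, numbers and strings end up with the
same JSON type. The sample values are arbitrary.

```perl
use strict;
use warnings;
use Cpanel::JSON::XS;

my $data = [1, 2.5, "x"];

my $default = Cpanel::JSON::XS->new->encode($data);
my $strings = Cpanel::JSON::XS->new->type_all_string->encode($data);

print "$default\n";
print "$strings\n";
```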
772
773 $json = $json->allow_dupkeys ([$enable])
774 $enabled = $json->get_allow_dupkeys
775 If $enable is true (or missing), then the "decode" method will not
776 die when it encounters duplicate keys in a hash. "allow_dupkeys"
777 is also enabled in the "relaxed" mode.
778
The JSON spec allows duplicate names in objects but recommends
against them. Perl hashes cannot hold duplicate keys, so when
duplicates are allowed, decoding silently drops the earlier values
and keeps the last value found.
783
784 See <http://seriot.ch/parsing_json.php#24>: RFC 7159 section 4:
785 "The names within an object should be unique."
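
A minimal sketch of both behaviors, using an input text made up for
the example:

```perl
use strict;
use warnings;
use Cpanel::JSON::XS;

my $text = '{"a":"b","a":"c"}';

# Default: duplicate keys make decode die.
my $strict_ok = eval { Cpanel::JSON::XS->new->decode($text); 1 };
print "strict decode failed: $@" unless $strict_ok;

# allow_dupkeys: the last value wins.
my $data = Cpanel::JSON::XS->new->allow_dupkeys->decode($text);
print "a = $data->{a}\n";
```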
786
787 $json = $json->dupkeys_as_arrayref ([$enable])
788 $enabled = $json->get_dupkeys_as_arrayref
789 If enabled, allow decoding of duplicate keys in hashes and store
790 the values as arrayref in the hash instead. By default duplicate
791 keys are forbidden. Enabling this also enables the "allow_dupkeys"
792 option, but disabling this does not disable the "allow_dupkeys"
793 option.
794
795 Example:
796
797 $json->dupkeys_as_arrayref;
798 print encode_json ($json->decode ('{"a":"b","a":"c"}'));
799
800 => {"a":["b","c"]}
801
802 This changes the result structure, thus cannot be enabled by
803 default. The client must be aware of it. The resulting arrayref is
804 not yet marked somehow (blessed or such).
805
806 $json = $json->allow_blessed ([$enable])
807 $enabled = $json->get_allow_blessed
808 If $enable is true (or missing), then the "encode" method will not
809 barf when it encounters a blessed reference. Instead, the value of
810 the convert_blessed option will decide whether "null"
811 ("convert_blessed" disabled or no "TO_JSON" method found) or a
812 representation of the object ("convert_blessed" enabled and
813 "TO_JSON" method found) is being encoded. Has no effect on
814 "decode".
815
816 If $enable is false (the default), then "encode" will throw an
817 exception when it encounters a blessed object without
818 "convert_blessed" and a "TO_JSON" method.
819
820 This setting has no effect on "decode".
821
822 $json = $json->convert_blessed ([$enable])
823 $enabled = $json->get_convert_blessed
824 If $enable is true (or missing), then "encode", upon encountering a
825 blessed object, will check for the availability of the "TO_JSON"
826 method on the object's class. If found, it will be called in scalar
827 context and the resulting scalar will be encoded instead of the
828 object. If no "TO_JSON" method is found, a stringification overload
829 method is tried next. If both are not found, the value of
830 "allow_blessed" will decide what to do.
831
832 The "TO_JSON" method may safely call die if it wants. If "TO_JSON"
833 returns other blessed objects, those will be handled in the same
834 way. "TO_JSON" must take care of not causing an endless recursion
835 cycle (== crash) in this case. The same care must be taken with
836 calling encode in stringify overloads (even if this works by luck
837 in older perls) or other callbacks. The name of "TO_JSON" was
838 chosen because other methods called by the Perl core (== not by the
839 user of the object) are usually in upper case letters and to avoid
840 collisions with any "to_json" function or method.
841
842 If $enable is false (the default), then "encode" will not consider
843 this type of conversion.
844
845 This setting has no effect on "decode".
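
A small sketch of how "allow_blessed" and "convert_blessed" interact. The class "My::Temperature" is hypothetical and exists only for this illustration:

```perl
use strict;
use warnings;
use Cpanel::JSON::XS ();

# Hypothetical class used only for this illustration.
package My::Temperature {
    sub new     { my ($class, $c) = @_; bless { celsius => $c }, $class }
    sub TO_JSON { my ($self) = @_; return { celsius => $self->{celsius} } }
}

my $obj = My::Temperature->new (21);

# allow_blessed alone: the object is replaced by null.
my $as_null = Cpanel::JSON::XS->new->allow_blessed->encode ([$obj]);

# convert_blessed: TO_JSON is called and its result is encoded.
my $as_hash = Cpanel::JSON::XS->new->convert_blessed->encode ([$obj]);

print "$as_null\n";   # [null]
print "$as_hash\n";   # [{"celsius":21}]
```

With neither option enabled, the same "encode" call throws an exception instead.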
846
847 $json = $json->allow_tags ([$enable])
848 $enabled = $json->get_allow_tags
849 See "OBJECT SERIALIZATION" for details.
850
851 If $enable is true (or missing), then "encode", upon encountering a
852 blessed object, will check for the availability of the "FREEZE"
853 method on the object's class. If found, it will be used to
854 serialize the object into a nonstandard tagged JSON value (that
855 JSON decoders cannot decode).
856
857 It also causes "decode" to parse such tagged JSON values and
858 deserialize them via a call to the "THAW" method.
859
860 If $enable is false (the default), then "encode" will not consider
861 this type of conversion, and tagged JSON values will cause a parse
862 error in "decode", as if tags were not part of the grammar.
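
A round-trip sketch with tagged values. "My::Point" is a hypothetical class, and the "THAW" calling convention shown (class name, the string "JSON", then the frozen values) follows the protocol described under "OBJECT SERIALIZATION":

```perl
use strict;
use warnings;
use Cpanel::JSON::XS ();

# Hypothetical class implementing both halves of the protocol.
package My::Point {
    sub new { my ($class, $x, $y) = @_; bless { x => $x, y => $y }, $class }

    # Called by encode as $obj->FREEZE ("JSON"); returns constructor args.
    sub FREEZE { my ($self, $serializer) = @_; return ($self->{x}, $self->{y}) }

    # Called by decode as My::Point->THAW ("JSON", @args).
    sub THAW { my ($class, $serializer, $x, $y) = @_; return $class->new ($x, $y) }
}

my $coder  = Cpanel::JSON::XS->new->allow_tags;
my $tagged = $coder->encode ([My::Point->new (1, 2)]);
print "$tagged\n";

my $copy = $coder->decode ($tagged)->[0];
printf "%s at (%d, %d)\n", ref $copy, $copy->{x}, $copy->{y};
```

The encoded text is not valid JSON, so it should only be exchanged between peers that both enable "allow_tags".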
863
864 $json = $json->filter_json_object ([$coderef->($hashref)])
When $coderef is specified, it will be called from "decode" each
time it decodes a JSON object. The only argument is a reference to
the newly-created hash. If the code reference returns a single
scalar (which need not be a reference), this value (i.e. a copy of
that scalar to avoid aliasing) is inserted into the deserialized
data structure. If it returns an empty list (NOTE: not "undef",
which is a valid scalar), the original deserialized hash will be
inserted. This setting can slow down decoding considerably.
873
874 When $coderef is omitted or undefined, any existing callback will
875 be removed and "decode" will not change the deserialized hash in
876 any way.
877
878 Example, convert all JSON objects into the integer 5:
879
    my $js = Cpanel::JSON::XS->new->filter_json_object (sub { 5 });
    # returns [5]
    $js->decode ('[{}]');
    # throws an exception because allow_nonref is not enabled,
    # so a lone 5 is not allowed.
    $js->decode ('{"a":1, "b":2}');
886
887 $json = $json->filter_json_single_key_object ($key [=>
888 $coderef->($value)])
Works somewhat like "filter_json_object", but is only called for
JSON objects having a single key named $key.
891
892 This $coderef is called before the one specified via
893 "filter_json_object", if any. It gets passed the single value in
894 the JSON object. If it returns a single value, it will be inserted
895 into the data structure. If it returns nothing (not even "undef"
896 but the empty list), the callback from "filter_json_object" will be
897 called next, as if no single-key callback were specified.
898
899 If $coderef is omitted or undefined, the corresponding callback
900 will be disabled. There can only ever be one callback for a given
901 key.
902
As this callback gets called less often than the
"filter_json_object" one, decoding speed will not usually suffer as
much. Therefore, single-key objects make excellent targets to
serialize Perl objects into, especially as single-key JSON objects
are as close to the type-tagged value concept as JSON gets (it's
basically an ID/VALUE tuple). Of course, JSON does not support this
in any way, so you need to make sure your data never looks like a
serialized Perl hash.
911
912 Typical names for the single object key are "__class_whatever__",
913 or "$__dollars_are_rarely_used__$" or "}ugly_brace_placement", or
914 even things like "__class_md5sum(classname)__", to reduce the risk
915 of clashing with real hashes.
916
917 Example, decode JSON objects of the form "{ "__widget__" => <id> }"
918 into the corresponding $WIDGET{<id>} object:
919
    # return whatever is in $WIDGET{5}:
    Cpanel::JSON::XS
        ->new
        ->filter_json_single_key_object (__widget__ => sub {
            $WIDGET{ $_[0] }
        })
        ->decode ('{"__widget__": 5}')
927
928 # this can be used with a TO_JSON method in some "widget" class
929 # for serialization to json:
930 sub WidgetBase::TO_JSON {
931 my ($self) = @_;
932
933 unless ($self->{id}) {
934 $self->{id} = ..get..some..id..;
935 $WIDGET{$self->{id}} = $self;
936 }
937
938 { __widget__ => $self->{id} }
939 }
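
A self-contained version of the decoding half might look like the sketch below. The %WIDGET registry and its contents are made up for the illustration:

```perl
use strict;
use warnings;
use Cpanel::JSON::XS ();

# Hypothetical registry of live objects, keyed by id.
my %WIDGET = (5 => { name => "dial" });

my $coder = Cpanel::JSON::XS->new->filter_json_single_key_object (
    __widget__ => sub { $WIDGET{ $_[0] } },
);

# Only the single-key object named __widget__ is replaced; other
# objects are decoded into ordinary hashes.
my $data = $coder->decode ('[{"__widget__": 5}, {"other": 1}]');
print $data->[0]{name},  "\n";   # dial
print $data->[1]{other}, "\n";   # 1
```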
940
941 $json = $json->shrink ([$enable])
942 $enabled = $json->get_shrink
943 Perl usually over-allocates memory a bit when allocating space for
944 strings. This flag optionally resizes strings generated by either
945 "encode" or "decode" to their minimum size possible. This can save
946 memory when your JSON texts are either very very long or you have
947 many short strings. It will also try to downgrade any strings to
948 octet-form if possible: perl stores strings internally either in an
949 encoding called UTF-X or in octet-form. The latter cannot store
950 everything but uses less space in general (and some buggy Perl or C
951 code might even rely on that internal representation being used).
952
953 The actual definition of what shrink does might change in future
954 versions, but it will always try to save space at the expense of
955 time.
956
957 If $enable is true (or missing), the string returned by "encode"
958 will be shrunk-to-fit, while all strings generated by "decode" will
959 also be shrunk-to-fit.
960
961 If $enable is false, then the normal perl allocation algorithms are
962 used. If you work with your data, then this is likely to be
963 faster.
964
965 In the future, this setting might control other things, such as
966 converting strings that look like integers or floats into integers
967 or floats internally (there is no difference on the Perl level),
968 saving space.
969
970 $json = $json->max_depth ([$maximum_nesting_depth])
971 $max_depth = $json->get_max_depth
972 Sets the maximum nesting level (default 512) accepted while
973 encoding or decoding. If a higher nesting level is detected in JSON
974 text or a Perl data structure, then the encoder and decoder will
975 stop and croak at that point.
976
977 Nesting level is defined by number of hash- or arrayrefs that the
978 encoder needs to traverse to reach a given point or the number of
979 "{" or "[" characters without their matching closing parenthesis
980 crossed to reach a given character in a string.
981
Setting the maximum depth to one disallows any nesting, which
ensures that the value is only a single hash/object or array.
984
985 If no argument is given, the highest possible setting will be used,
986 which is rarely useful.
987
988 Note that nesting is implemented by recursion in C. The default
989 value has been chosen to be as large as typical operating systems
990 allow without crashing.
991
992 See "SECURITY CONSIDERATIONS", below, for more info on why this is
993 useful.
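
A brief sketch of the limit in action, using an arbitrarily chosen depth of 2:

```perl
use strict;
use warnings;
use Cpanel::JSON::XS ();

my $coder = Cpanel::JSON::XS->new->max_depth (2);

# Two levels of nesting are within the limit.
my $ok = $coder->decode ('[[1]]');

# Three levels exceed it, so decode croaks; eval traps the error.
my $too_deep = eval { $coder->decode ('[[[1]]]'); 1 };
print $too_deep ? "accepted\n" : "rejected: $@";
```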
994
995 $json = $json->max_size ([$maximum_string_size])
996 $max_size = $json->get_max_size
Set the maximum length a JSON text may have (in bytes) where
decoding is being attempted. The default is 0, meaning no limit.
When "decode" is called on a string that is longer than this many
bytes, it will not attempt to decode the string but throw an
exception. This setting has no effect on "encode" (yet).
1002
1003 If no argument is given, the limit check will be deactivated (same
1004 as when 0 is specified).
1005
1006 See "SECURITY CONSIDERATIONS", below, for more info on why this is
1007 useful.
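
For illustration, the sketch below uses an arbitrary 16-byte limit, and the oversized input is generated on the spot:

```perl
use strict;
use warnings;
use Cpanel::JSON::XS ();

my $coder = Cpanel::JSON::XS->new->max_size (16);

# 7 bytes: below the limit, decoded normally.
my $small = $coder->decode ('[1,2,3]');

# Several hundred bytes: rejected before any parsing is attempted.
my $big      = '[' . join (',', 1 .. 100) . ']';
my $accepted = eval { $coder->decode ($big); 1 };
print $accepted ? "accepted\n" : "rejected: $@";
```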
1008
1009 $json->stringify_infnan ([$infnan_mode = 1])
1010 $infnan_mode = $json->get_stringify_infnan
1011 Get or set how Cpanel::JSON::XS encodes "inf", "-inf" or "nan" for
1012 numeric values. Also qnan, snan or negative nan on some platforms.
1013
1014 "null": infnan_mode = 0. Similar to most JSON modules in other
1015 languages. Always null.
1016
1017 stringified: infnan_mode = 1. As in Mojo::JSON. Platform specific
1018 strings. Stringified via sprintf(%g), with double quotes.
1019
1020 inf/nan: infnan_mode = 2. As in JSON::XS, and older releases.
1021 Passes through platform dependent values, invalid JSON. Stringified
1022 via sprintf(%g), but without double quotes.
1023
1024 "inf/-inf/nan": infnan_mode = 3. Platform independent inf/nan/-inf
1025 strings. No QNAN/SNAN/negative NAN support, unified to "nan". Much
1026 easier to detect, but may conflict with valid strings.
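
The two platform-independent modes can be compared with a sketch like this one. It assumes that 9**9**9 overflows to infinity, which it does on common platforms:

```perl
use strict;
use warnings;
use Cpanel::JSON::XS ();

my $inf    = 9**9**9;          # overflows to infinity
my @values = ($inf, -$inf);

my $null_coder = Cpanel::JSON::XS->new;
$null_coder->stringify_infnan (0);
my $as_null = $null_coder->encode (\@values);

my $name_coder = Cpanel::JSON::XS->new;
$name_coder->stringify_infnan (3);
my $as_name = $name_coder->encode (\@values);

print "$as_null\n";   # [null,null]
print "$as_name\n";   # ["inf","-inf"]
```

Modes 1 and 2 are left out because their output depends on the platform's sprintf.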
1027
1028 $json_text = $json->encode ($perl_scalar, $json_type)
1029 Converts the given Perl data structure (a simple scalar or a
1030 reference to a hash or array) to its JSON representation. Simple
1031 scalars will be converted into JSON string or number sequences,
1032 while references to arrays become JSON arrays and references to
1033 hashes become JSON objects. Undefined Perl values (e.g. "undef")
1034 become JSON "null" values. Neither "true" nor "false" values will
1035 be generated.
1036
1037 For the type argument see Cpanel::JSON::XS::Type.
1038
1039 $perl_scalar = $json->decode ($json_text, my $json_type)
1040 The opposite of "encode": expects a JSON text and tries to parse
1041 it, returning the resulting simple scalar or reference. Croaks on
1042 error.
1043
1044 JSON numbers and strings become simple Perl scalars. JSON arrays
1045 become Perl arrayrefs and JSON objects become Perl hashrefs. "true"
1046 becomes 1, "false" becomes 0 and "null" becomes "undef".
1047
1048 For the type argument see Cpanel::JSON::XS::Type.
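
A minimal round trip through "encode" and "decode" could look as follows. The data is invented, and "canonical" is enabled only so that the key order of the output is predictable:

```perl
use strict;
use warnings;
use Cpanel::JSON::XS ();

my $coder = Cpanel::JSON::XS->new->canonical;

my $text = $coder->encode ({
    name  => "widget",
    tags  => ["a", "b"],
    stock => undef,
});
print "$text\n";   # {"name":"widget","stock":null,"tags":["a","b"]}

my $data = $coder->decode ($text);
print $data->{tags}[1], "\n";                          # b
print defined $data->{stock} ? "set\n" : "undef\n";    # undef
```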
1049
1050 ($perl_scalar, $characters) = $json->decode_prefix ($json_text)
1051 This works like the "decode" method, but instead of raising an
1052 exception when there is trailing garbage after the first JSON
1053 object, it will silently stop parsing there and return the number
1054 of characters consumed so far.
1055
1056 This is useful if your JSON texts are not delimited by an outer
1057 protocol and you need to know where the JSON text ends.
1058
1059 Cpanel::JSON::XS->new->decode_prefix ("[1] the tail")
1060 => ([1], 3)
1061
1062 $json->to_json ($perl_hash_or_arrayref)
1063 Deprecated method for perl 5.8 and newer. Use encode_json instead.
1064
1065 $json->from_json ($utf8_encoded_json_text)
1066 Deprecated method for perl 5.8 and newer. Use decode_json instead.
1067
1069 In some cases, there is the need for incremental parsing of JSON texts.
1070 While this module always has to keep both JSON text and resulting Perl
1071 data structure in memory at one time, it does allow you to parse a JSON
1072 stream incrementally. It does so by accumulating text until it has a
1073 full JSON object, which it then can decode. This process is similar to
1074 using "decode_prefix" to see if a full JSON object is available, but is
1075 much more efficient (and can be implemented with a minimum of method
1076 calls).
1077
Cpanel::JSON::XS will only attempt to parse the JSON text once it is
sure it has enough text to get a decisive result, using a very simple
but truly incremental parser. This means that it sometimes won't stop
as early as the full parser; for example, it doesn't detect mismatched
parentheses. The only thing it guarantees is that it starts decoding as
soon as a syntactically valid JSON text has been seen. This means you
need to set resource limits (e.g. "max_size") to ensure the parser will
stop parsing in the presence of syntax errors.
1086
1087 The following methods implement this incremental parser.
1088
1089 [void, scalar or list context] = $json->incr_parse ([$string])
1090 This is the central parsing function. It can both append new text
1091 and extract objects from the stream accumulated so far (both of
1092 these functions are optional).
1093
1094 If $string is given, then this string is appended to the already
1095 existing JSON fragment stored in the $json object.
1096
1097 After that, if the function is called in void context, it will
1098 simply return without doing anything further. This can be used to
1099 add more text in as many chunks as you want.
1100
1101 If the method is called in scalar context, then it will try to
1102 extract exactly one JSON object. If that is successful, it will
1103 return this object, otherwise it will return "undef". If there is a
1104 parse error, this method will croak just as "decode" would do (one
1105 can then use "incr_skip" to skip the erroneous part). This is the
1106 most common way of using the method.
1107
1108 And finally, in list context, it will try to extract as many
1109 objects from the stream as it can find and return them, or the
1110 empty list otherwise. For this to work, there must be no separators
1111 between the JSON objects or arrays, instead they must be
1112 concatenated back-to-back. If an error occurs, an exception will be
1113 raised as in the scalar context case. Note that in this case, any
1114 previously-parsed JSON texts will be lost.
1115
1116 Example: Parse some JSON arrays/objects in a given string and
1117 return them.
1118
1119 my @objs = Cpanel::JSON::XS->new->incr_parse ("[5][7][1,2]");
1120
1121 $lvalue_string = $json->incr_text (>5.8 only)
This method returns the currently stored JSON fragment as an
lvalue, that is, you can manipulate it. This only works 1. when a
preceding call to "incr_parse" in scalar context successfully
returned an object, and 2. only with Perl >= 5.8.

Under all other circumstances you must not call this function (I
mean it. Although in simple tests it might actually work, it will
fail under real world conditions). As a special exception, you can
also call this method before having parsed anything.
1131
1132 This function is useful in two cases: a) finding the trailing text
1133 after a JSON object or b) parsing multiple JSON objects separated
1134 by non-JSON text (such as commas).
1135
1136 $json->incr_skip
1137 This will reset the state of the incremental parser and will remove
1138 the parsed text from the input buffer so far. This is useful after
1139 "incr_parse" died, in which case the input buffer and incremental
1140 parser state is left unchanged, to skip the text parsed so far and
1141 to reset the parse state.
1142
1143 The difference to "incr_reset" is that only text until the parse
1144 error occurred is removed.
1145
1146 $json->incr_reset
1147 This completely resets the incremental parser, that is, after this
1148 call, it will be as if the parser had never parsed anything.
1149
1150 This is useful if you want to repeatedly parse JSON objects and
1151 want to ignore any trailing data, which means you have to reset the
1152 parser after each successful decode.
1153
1154 LIMITATIONS
1155 All options that affect decoding are supported, except "allow_nonref".
1156 The reason for this is that it cannot be made to work sensibly: JSON
1157 objects and arrays are self-delimited, i.e. you can concatenate them
1158 back to back and still decode them perfectly. This does not hold true
1159 for JSON numbers, however.
1160
For example, is the string 1 a single JSON number, or is it simply the
start of 12? Or is 12 a single JSON number, or the concatenation of 1
and 2? In neither case can you tell, and this is why Cpanel::JSON::XS
takes the conservative route and disallows this case.
1165
1166 EXAMPLES
1167 Some examples will make all this clearer. First, a simple example that
1168 works similarly to "decode_prefix": We want to decode the JSON object
1169 at the start of a string and identify the portion after the JSON
1170 object:
1171
    my $text = "[1,2,3] hello";

    my $json = Cpanel::JSON::XS->new;

    my $obj = $json->incr_parse ($text)
        or die "expected JSON object or array at beginning of string";

    my $tail = $json->incr_text;
    # $tail now contains " hello"
1181
1182 Easy, isn't it?
1183
1184 Now for a more complicated example: Imagine a hypothetical protocol
1185 where you read some requests from a TCP stream, and each request is a
1186 JSON array, without any separation between them (in fact, it is often
1187 useful to use newlines as "separators", as these get interpreted as
1188 whitespace at the start of the JSON text, which makes it possible to
1189 test said protocol with "telnet"...).
1190
1191 Here is how you'd do it (it is trivial to write this in an event-based
1192 manner):
1193
    my $json = Cpanel::JSON::XS->new;

    # read some data from the socket
    while (sysread $socket, my $buf, 4096) {

        # split and decode as many requests as possible
        for my $request ($json->incr_parse ($buf)) {
            # act on the $request
        }
    }
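
To try this without a socket, the reads can be simulated with a list of chunks. The chunk boundaries below are chosen arbitrarily so that they fall in the middle of the JSON texts:

```perl
use strict;
use warnings;
use Cpanel::JSON::XS ();

my $json = Cpanel::JSON::XS->new;

# Stand-in for successive sysread results; boundaries are arbitrary.
my @chunks = ('[1,2', ',3][4', ',5]');

my @requests;
for my $chunk (@chunks) {
    # push supplies list context, so every complete text is returned.
    push @requests, $json->incr_parse ($chunk);
}

print scalar (@requests), " requests\n";   # 2 requests
print "@{ $requests[0] }\n";               # 1 2 3
print "@{ $requests[1] }\n";               # 4 5
```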
1204
1205 Another complicated example: Assume you have a string with JSON objects
1206 or arrays, all separated by (optional) comma characters (e.g. "[1],[2],
1207 [3]"). To parse them, we have to skip the commas between the JSON
1208 texts, and here is where the lvalue-ness of "incr_text" comes in
1209 useful:
1210
    my $text = "[1],[2], [3]";
    my $json = Cpanel::JSON::XS->new;

    # void context, so no parsing done
    $json->incr_parse ($text);

    # now extract as many objects as possible. note the
    # use of scalar context so incr_text can be called.
    while (my $obj = $json->incr_parse) {
        # do something with $obj

        # now skip the optional comma
        $json->incr_text =~ s/^ \s* , //x;
    }
1225
Now let's go for a very complex example: Assume that you have a gigantic
JSON array-of-objects, many gigabytes in size, and you want to parse
it, but you cannot load it into memory fully (this has actually
happened in the real world :).
1230
1231 Well, you lost, you have to implement your own JSON parser. But
1232 Cpanel::JSON::XS can still help you: You implement a (very simple)
1233 array parser and let JSON decode the array elements, which are all full
1234 JSON objects on their own (this wouldn't work if the array elements
1235 could be JSON numbers, for example):
1236
    my $json = Cpanel::JSON::XS->new;

    # open the monster
    open my $fh, "<", "bigfile.json"
        or die "bigfile: $!";

    # first parse the initial "["
    for (;;) {
        sysread $fh, my $buf, 65536
            or die "read error: $!";
        $json->incr_parse ($buf); # void context, so no parsing

        # Exit the loop once we found and removed(!) the initial "[".
        # In essence, we are (ab-)using the $json object as a simple scalar
        # we append data to.
        last if $json->incr_text =~ s/^ \s* \[ //x;
    }

    # now that we have skipped the initial "[", continue
    # parsing all the elements.
    for (;;) {
        # in this loop we read data until we got a single JSON object
        for (;;) {
            if (my $obj = $json->incr_parse) {
                # do something with $obj
                last;
            }

            # add more data
            sysread $fh, my $buf, 65536
                or die "read error: $!";
            $json->incr_parse ($buf); # void context, so no parsing
        }

        # in this loop we read data until we either found and parsed the
        # separating "," between elements, or the final "]"
        for (;;) {
            # first skip whitespace
            $json->incr_text =~ s/^\s*//;

            # if we find "]", we are done
            if ($json->incr_text =~ s/^\]//) {
                print "finished.\n";
                exit;
            }

            # if we find ",", we can continue with the next element
            if ($json->incr_text =~ s/^,//) {
                last;
            }

            # if we find anything else, we have a parse error!
            if (length $json->incr_text) {
                die "parse error near ", $json->incr_text;
            }

            # else add more data
            sysread $fh, my $buf, 65536
                or die "read error: $!";
            $json->incr_parse ($buf); # void context, so no parsing
        }
    }
1298
1299 This is a complex example, but most of the complexity comes from the
1300 fact that we are trying to be correct (bear with me if I am wrong, I
1301 never ran the above example :).
1302
Cpanel::JSON::XS detects all Unicode Byte Order Marks on decode:
UTF-8, UTF-16LE, UTF-16BE, UTF-32LE and UTF-32BE.
1306
1307 The BOM encoding is set only for one specific decode call, it does not
1308 change the state of the JSON object.
1309
Warning: With perls older than 5.20 you need to load the Encode module
before decoding a text with a multibyte BOM, i.e. >= UTF-16. Otherwise
an error is thrown. This is an implementation limitation and might get
fixed later.
1313
1314 See <https://tools.ietf.org/html/rfc7159#section-8.1> "JSON text SHALL
1315 be encoded in UTF-8, UTF-16, or UTF-32."
1316
1317 "Implementations MUST NOT add a byte order mark to the beginning of a
1318 JSON text", "implementations (...) MAY ignore the presence of a byte
1319 order mark rather than treating it as an error".
1320
1321 See also <http://www.unicode.org/faq/utf_bom.html#BOM>.
1322
1323 Beware that Cpanel::JSON::XS is currently the only JSON module which
1324 does accept and decode a BOM.
1325
The latest JSON spec
<https://www.greenbytes.de/tech/webdav/rfc8259.html#character.encoding>
forbids the usage of UTF-16 or UTF-32; the character encoding is UTF-8.
Thus in subsequent updates BOMs of UTF-16 or UTF-32 will throw an
error.
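
As a small sketch of the UTF-8 case, the input below is built by hand from the three BOM bytes followed by an ordinary JSON array:

```perl
use strict;
use warnings;
use Cpanel::JSON::XS ();

# UTF-8 BOM (EF BB BF) followed by a JSON text.
my $bytes = "\xEF\xBB\xBF" . '["ok"]';

# The BOM is detected and skipped for this one decode call.
my $data = Cpanel::JSON::XS->new->utf8->decode ($bytes);
print $data->[0], "\n";   # ok
```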
1331
1333 This section describes how Cpanel::JSON::XS maps Perl values to JSON
1334 values and vice versa. These mappings are designed to "do the right
1335 thing" in most circumstances automatically, preserving round-tripping
1336 characteristics (what you put in comes out as something equivalent).
1337
1338 For the more enlightened: note that in the following descriptions,
1339 lowercase perl refers to the Perl interpreter, while uppercase Perl
1340 refers to the abstract Perl language itself.
1341
1342 JSON -> PERL
1343 object
1344 A JSON object becomes a reference to a hash in Perl. No ordering of
1345 object keys is preserved (JSON does not preserve object key
1346 ordering itself).
1347
1348 array
1349 A JSON array becomes a reference to an array in Perl.
1350
1351 string
1352 A JSON string becomes a string scalar in Perl - Unicode codepoints
1353 in JSON are represented by the same codepoints in the Perl string,
1354 so no manual decoding is necessary.
1355
1356 number
1357 A JSON number becomes either an integer, numeric (floating point)
1358 or string scalar in perl, depending on its range and any fractional
1359 parts. On the Perl level, there is no difference between those as
1360 Perl handles all the conversion details, but an integer may take
1361 slightly less memory and might represent more values exactly than
1362 floating point numbers.
1363
1364 If the number consists of digits only, Cpanel::JSON::XS will try to
1365 represent it as an integer value. If that fails, it will try to
1366 represent it as a numeric (floating point) value if that is
1367 possible without loss of precision. Otherwise it will preserve the
1368 number as a string value (in which case you lose roundtripping
1369 ability, as the JSON number will be re-encoded to a JSON string).
1370
1371 Numbers containing a fractional or exponential part will always be
1372 represented as numeric (floating point) values, possibly at a loss
1373 of precision (in which case you might lose perfect roundtripping
1374 ability, but the JSON number will still be re-encoded as a JSON
1375 number).
1376
1377 Note that precision is not accuracy - binary floating point values
1378 cannot represent most decimal fractions exactly, and when
1379 converting from and to floating point, "Cpanel::JSON::XS" only
1380 guarantees precision up to but not including the least significant
1381 bit.
1382
1383 true, false
1384 When "unblessed_bool" is set to true, then JSON "true" becomes 1
1385 and JSON "false" becomes 0.
1386
1387 Otherwise these JSON atoms become "JSON::PP::true" and
1388 "JSON::PP::false", respectively. They are "JSON::PP::Boolean"
1389 objects and are overloaded to act almost exactly like the numbers 1
1390 and 0. You can check whether a scalar is a JSON boolean by using
1391 the "Cpanel::JSON::XS::is_bool" function.
1392
The other way round, from perl to JSON, "!0" which is represented
as "yes" becomes "true", and "!1" which is represented as "no"
becomes "false".
1396
1397 Via Cpanel::JSON::XS::Type you can now even force negation in
1398 "encode", without overloading of "!":
1399
1400 my $false = Cpanel::JSON::XS::false;
1401 print($json->encode([!$false], [JSON_TYPE_BOOL]));
1402 => [true]
1403
1404 null
1405 A JSON null atom becomes "undef" in Perl.
1406
1407 shell-style comments ("# text")
1408 As a nonstandard extension to the JSON syntax that is enabled by
1409 the "relaxed" setting, shell-style comments are allowed. They can
1410 start anywhere outside strings and go till the end of the line.
1411
1412 tagged values ("(tag)value").
1413 Another nonstandard extension to the JSON syntax, enabled with the
1414 "allow_tags" setting, are tagged values. In this implementation,
1415 the tag must be a perl package/class name encoded as a JSON string,
1416 and the value must be a JSON array encoding optional constructor
1417 arguments.
1418
1419 See "OBJECT SERIALIZATION", below, for details.
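
The boolean and null mappings above can be checked with a short sketch like the following (it assumes a release that provides "unblessed_bool"):

```perl
use strict;
use warnings;
use Cpanel::JSON::XS ();

my $data = Cpanel::JSON::XS->new->decode ('[true, false, null, 12, "x"]');

# true and false arrive as boolean objects that behave like 1 and 0.
my $is_bool = Cpanel::JSON::XS::is_bool ($data->[0]);
print "is_bool: ", ($is_bool ? "yes" : "no"), "\n";
print "false is falsy\n" unless $data->[1];
print "null is undef\n"  unless defined $data->[2];

# With unblessed_bool they arrive as plain scalars instead.
my $plain = Cpanel::JSON::XS->new->unblessed_bool->decode ('[true, false]');
print "plain scalar\n" unless ref $plain->[0];
```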
1420
1421 PERL -> JSON
1422 The mapping from Perl to JSON is slightly more difficult, as Perl is a
1423 truly typeless language, so we can only guess which JSON type is meant
1424 by a Perl value.
1425
1426 hash references
1427 Perl hash references become JSON objects. As there is no inherent
1428 ordering in hash keys (or JSON objects), they will usually be
1429 encoded in a pseudo-random order that can change between runs of
1430 the same program but stays generally the same within a single run
1431 of a program. Cpanel::JSON::XS can optionally sort the hash keys
1432 (determined by the canonical flag), so the same datastructure will
1433 serialize to the same JSON text (given same settings and version of
1434 Cpanel::JSON::XS), but this incurs a runtime overhead and is only
1435 rarely useful, e.g. when you want to compare some JSON text against
1436 another for equality.
1437
1438 array references
1439 Perl array references become JSON arrays.
1440
1441 other references
1442 Other unblessed references are generally not allowed and will cause
1443 an exception to be thrown, except for references to the integers 0
1444 and 1, which get turned into "false" and "true" atoms in JSON.
1445
1446 With the option "allow_stringify", you can ignore the exception and
1447 return the stringification of the perl value.
1448
1449 With the option "allow_unknown", you can ignore the exception and
1450 return "null" instead.
1451
1452 encode_json [\"x"] # => cannot encode reference to scalar 'SCALAR(0x..)'
1453 # unless the scalar is 0 or 1
1454 encode_json [\0, \1] # yields [false,true]
1455
1456 allow_stringify->encode_json [\"x"] # yields "x" unlike JSON::PP
1457 allow_unknown->encode_json [\"x"] # yields null as in JSON::PP
1458
1459 Cpanel::JSON::XS::true, Cpanel::JSON::XS::false
1460 These special values become JSON true and JSON false values,
1461 respectively. You can also use "\1" and "\0" or "!0" and "!1"
1462 directly if you want.
1463
1464 encode_json [Cpanel::JSON::XS::false, Cpanel::JSON::XS::true] # yields [false,true]
1465 encode_json [!1, !0], [JSON_TYPE_BOOL, JSON_TYPE_BOOL] # yields [false,true]
1466
1467 eq/ne comparisons with true, false:
1468
1469 false is eq to the empty string or the string 'false' or the
1470 special empty string "!!0" or "!1", i.e. "SV_NO", or the numbers 0
1471 or 0.0.
1472
1473 true is eq to the string 'true' or to the special string "!0" (i.e.
1474 "SV_YES") or to the numbers 1 or 1.0.
1475
1476 blessed objects
1477 Blessed objects are not directly representable in JSON, but
1478 "Cpanel::JSON::XS" allows various optional ways of handling
1479 objects. See "OBJECT SERIALIZATION", below, for details.
1480
See the "allow_blessed" and "convert_blessed" methods for the
various options on how to deal with this: basically, you can choose
between throwing an exception, encoding the reference as if it
weren't blessed, using the object's overloaded stringification
method or providing your own serializer method.
1486
1487 simple scalars
Simple Perl scalars (any scalar that is not a reference) are the
most difficult objects to encode: Cpanel::JSON::XS will encode
undefined scalars or inf/nan as JSON "null" values, and other
scalars as either a number or a string in a non-deterministic way
that may be affected by the Perl version or by any other loaded
Perl module.

If you want stable and deterministic types from the JSON encoder,
use Cpanel::JSON::XS::Type.

An alternative way to get deterministic types is the
"type_all_string" method, which encodes all perl scalars as JSON
strings.

The non-deterministic behavior is as follows: scalars that were
last used in a string context before encoding are encoded as JSON
strings, and anything else as a number value:
1504
1505 # dump as number
1506 encode_json [2] # yields [2]
1507 encode_json [-3.0e17] # yields [-3e+17]
1508 my $value = 5; encode_json [$value] # yields [5]
1509
1510 # used as string, but the two representations are for the same number
1511 print $value;
1512 encode_json [$value] # yields [5]
1513
1514 # used as different string (non-matching dual-var)
1515 my $str = '0 but true';
1516 my $num = 1 + $str;
1517 encode_json [$num, $str] # yields [1,"0 but true"]
1518
1519 # undef becomes null
1520 encode_json [undef] # yields [null]
1521
1522 # inf or nan becomes null, unless you answered
1523 # "Do you want to handle inf/nan as strings" with yes
1524 encode_json [9**9**9] # yields [null]
1525
1526 You can force the type to be a JSON string by stringifying it:
1527
1528 my $x = 3.1; # some variable containing a number
1529 "$x"; # stringified
1530 $x .= ""; # another, more awkward way to stringify
1531 print $x; # perl does it for you, too, quite often
1532
1533 You can force the type to be a JSON number by numifying it:
1534
1535 my $x = "3"; # some variable containing a string
1536 $x += 0; # numify it, ensuring it will be dumped as a number
1537 $x *= 1; # same thing, the choice is yours.
1538
1539 Note that numerical precision has the same meaning as under Perl
1540 (so binary to decimal conversion follows the same rules as in Perl,
1541 which can differ to other languages). Also, your perl interpreter
1542 might expose extensions to the floating point numbers of your
1543 platform, such as infinities or NaN's - these cannot be represented
1544 in JSON, and thus null is returned instead. Optionally you can
1545 configure it to stringify inf and nan values.
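As a hedged sketch of that option: the example below assumes the "stringify_infnan" method documented elsewhere in this manual, with mode 0 meaning "always null" and mode 1 meaning "stringified". The exact text produced in mode 1 is platform specific, so no output is shown for it.

```perl
use strict;
use warnings;
use Cpanel::JSON::XS;

my $inf = 9**9**9;    # overflows to infinity

# Mode 0: inf and nan are encoded as null.
my $as_null = Cpanel::JSON::XS->new->stringify_infnan(0)->encode([$inf]);
print "$as_null\n";    # [null]

# Mode 1: inf and nan are encoded as a quoted, platform-specific string.
my $as_text = Cpanel::JSON::XS->new->stringify_infnan(1)->encode([$inf]);
print "$as_text\n";
```

Setting the mode explicitly avoids depending on the answer given at build time.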
1546
1547 OBJECT SERIALIZATION
1548 As JSON cannot directly represent Perl objects, you have to choose
1549 between a pure JSON representation (without the ability to deserialize
1550 the object automatically again), and a nonstandard extension to the
1551 JSON syntax, tagged values.
1552
1553 SERIALIZATION
1554
1555 What happens when "Cpanel::JSON::XS" encounters a Perl object depends
1556 on the "allow_blessed", "convert_blessed" and "allow_tags" settings,
1557 which are used in this order:
1558
1559 1. "allow_tags" is enabled and the object has a "FREEZE" method.
1560 In this case, "Cpanel::JSON::XS" uses the Types::Serialiser object
1561 serialization protocol to create a tagged JSON value, using a
1562 nonstandard extension to the JSON syntax.
1563
1564 This works by invoking the "FREEZE" method on the object, with the
1565 first argument being the object to serialize, and the second
1566 argument being the constant string "JSON" to distinguish it from
1567 other serializers.
1568
1569 The "FREEZE" method can return any number of values (i.e. zero or
1570 more). These values and the package/classname of the object will
1571 then be encoded as a tagged JSON value in the following format:
1572
1573 ("classname")[FREEZE return values...]
1574
1575 e.g.:
1576
1577 ("URI")["http://www.google.com/"]
1578 ("MyDate")[2013,10,29]
1579 ("ImageData::JPEG")["Z3...VlCg=="]
1580
1581 For example, the hypothetical "My::Object" "FREEZE" method might
1582 use the object's "type" and "id" members to encode the object:
1583
1584 sub My::Object::FREEZE {
1585 my ($self, $serializer) = @_;
1586
1587 ($self->{type}, $self->{id})
1588 }
1589
1590 2. "convert_blessed" is enabled and the object has a "TO_JSON" method.
1591 In this case, the "TO_JSON" method of the object is invoked in
1592 scalar context. It must return a single scalar that can be directly
1593 encoded into JSON. This scalar replaces the object in the JSON
1594 text.
1595
1596 For example, the following "TO_JSON" method will convert all URI
1597 objects to JSON strings when serialized. The fact that these values
1598 originally were URI objects is lost.
1599
1600 sub URI::TO_JSON {
1601 my ($uri) = @_;
1602 $uri->as_string
1603 }
1604
1605 3. "convert_blessed" is enabled and the object has a stringification
1606 overload.
1607 In this case, the overloaded "" method of the object is invoked in
1608 scalar context. It must return a single scalar that can be directly
1609 encoded into JSON. This scalar replaces the object in the JSON
1610 text.
1611
1612 For example, the following "" method will convert all URI objects
1613 to JSON strings when serialized. The fact that these values
1614 originally were URI objects is lost.
1615
1616 package URI;
1617 use overload '""' => sub { shift->as_string };
1618
1619 4. "allow_blessed" is enabled.
1620 The object will be serialized as a JSON null value.
1621
1622 5. none of the above
1623 If none of the settings are enabled or the respective methods are
1624 missing, "Cpanel::JSON::XS" throws an exception.
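The order of these cases can be sketched with a small hypothetical class that implements both "FREEZE" and "TO_JSON". The class name and its fields are assumptions made for this illustration only.

```perl
use strict;
use warnings;
use Cpanel::JSON::XS;

# Hypothetical class offering both serialization protocols.
package My::Object {
    sub new     { my ($class, %args) = @_; return bless {%args}, $class }
    sub FREEZE  { my ($self, $serializer) = @_; return ($self->{type}, $self->{id}) }
    sub TO_JSON { my ($self) = @_; return { type => $self->{type}, id => $self->{id} } }
}

my $obj = My::Object->new(type => "widget", id => 42);

# Case 1: allow_tags takes precedence over convert_blessed.
my $tagged = Cpanel::JSON::XS->new->allow_tags->convert_blessed->encode([$obj]);
print "$tagged\n";    # [("My::Object")["widget",42]]

# Case 2: convert_blessed alone uses TO_JSON (canonical sorts the keys).
my $plain = Cpanel::JSON::XS->new->convert_blessed->canonical->encode([$obj]);
print "$plain\n";     # [{"id":42,"type":"widget"}]

# Case 4: allow_blessed alone yields null.
my $null = Cpanel::JSON::XS->new->allow_blessed->encode([$obj]);
print "$null\n";      # [null]

# Case 5: with no setting enabled, encoding throws an exception.
my $error = eval { Cpanel::JSON::XS->new->encode([$obj]); 1 } ? "" : $@;
print "encode failed\n" if $error;
```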
1625
1626 DESERIALIZATION
1627
1628 For deserialization there are only two cases to consider: either
1629 nonstandard tagging was used, in which case "allow_tags" decides, or
1630 objects cannot be automatically deserialized, in which case you can
1631 use postprocessing or the "filter_json_object" or
1632 "filter_json_single_key_object" callbacks to get some real objects out
1633 of your JSON.
1634
1635 This section only considers the tagged value case: If a tagged JSON
1636 object is encountered during decoding and "allow_tags" is disabled, a
1637 parse error will result (as if tagged values were not part of the
1638 grammar).
1639
1640 If "allow_tags" is enabled, "Cpanel::JSON::XS" will look up the "THAW"
1641 method of the package/classname used during serialization (it will not
1642 attempt to load the package as a Perl module). If there is no such
1643 method, the decoding will fail with an error.
1644
1645 Otherwise, the "THAW" method is invoked with the classname as first
1646 argument, the constant string "JSON" as second argument, and all the
1647 values from the JSON array (the values originally returned by the
1648 "FREEZE" method) as remaining arguments.
1649
1650 The method must then return the object. While technically you can
1651 return any Perl scalar, you might have to enable the "allow_nonref"
1652 setting to make that work in all cases, so better return an actual
1653 blessed reference.
1654
1655 As an example, let's implement a "THAW" function that regenerates the
1656 "My::Object" from the "FREEZE" example earlier:
1657
1658 sub My::Object::THAW {
1659 my ($class, $serializer, $type, $id) = @_;
1660
1661 $class->new (type => $type, id => $id)
1662 }
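A complete round trip through "FREEZE" and "THAW" might look like the following sketch. The "My::Point" class is hypothetical and exists only for this example; the tagged value is wrapped in an array so that the JSON text is not a bare non-reference value.

```perl
use strict;
use warnings;
use Cpanel::JSON::XS;

# Hypothetical class implementing both halves of the protocol.
package My::Point {
    sub new    { my ($class, %args) = @_; return bless {%args}, $class }
    sub FREEZE { my ($self, $serializer) = @_; return ($self->{x}, $self->{y}) }
    sub THAW   { my ($class, $serializer, $x, $y) = @_; return $class->new(x => $x, y => $y) }
}

my $coder = Cpanel::JSON::XS->new->allow_tags;

my $json = $coder->encode([My::Point->new(x => 1, y => 2)]);
print "$json\n";    # [("My::Point")[1,2]]

my $copy = $coder->decode($json)->[0];
print ref($copy), " $copy->{x},$copy->{y}\n";    # My::Point 1,2

# Without allow_tags the tagged syntax is a parse error.
my $strict_ok = eval { Cpanel::JSON::XS->new->decode($json); 1 };
print "strict decoder rejected the tagged value\n" unless $strict_ok;
```

Only enable "allow_tags" on a decoder when the JSON text comes from a trusted source.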
1663
1664 See the "SECURITY CONSIDERATIONS" section below. Allowing external JSON
1665 objects to be deserialized into perl objects is usually a very bad idea.
1666
1668 The interested reader might have seen a number of flags that signify
1669 encodings or codesets - "utf8", "latin1", "binary" and "ascii". There
1670 seems to be some confusion on what these do, so here is a short
1671 comparison:
1672
1673 "utf8" controls whether the JSON text created by "encode" (and expected
1674 by "decode") is UTF-8 encoded or not, while "latin1" and "ascii" only
1675 control whether "encode" escapes character values outside their
1676 respective codeset range. None of these flags conflict with each
1677 other, although some combinations make less sense than others.
1678
1679 Care has been taken to make all flags symmetrical with respect to
1680 "encode" and "decode", that is, texts encoded with any combination of
1681 these flag values will be correctly decoded when the same flags are
1682 used - in general, if you use different flag settings while encoding
1683 vs. when decoding you likely have a bug somewhere.
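The symmetry can be checked with a short sketch like the one below, which encodes and decodes the same data with several flag combinations. The sample data and the labels are arbitrary choices for this illustration.

```perl
use strict;
use warnings;
use Cpanel::JSON::XS;

# One Latin-1 character, one character above 255, and plain ASCII.
my $data = ["caf\x{e9}", "\x{20ac}", "plain"];

my %coders = (
    'no flags'     => Cpanel::JSON::XS->new,
    'utf8'         => Cpanel::JSON::XS->new->utf8,
    'ascii'        => Cpanel::JSON::XS->new->ascii,
    'latin1'       => Cpanel::JSON::XS->new->latin1,
    'utf8 + ascii' => Cpanel::JSON::XS->new->utf8->ascii,
);

my %round_trip_ok;
for my $name (sort keys %coders) {
    my $coder = $coders{$name};
    # Encode and decode with the SAME coder, i.e. the same flags.
    my $copy = $coder->decode($coder->encode($data));
    $round_trip_ok{$name} = join("\0", @$copy) eq join("\0", @$data) ? 1 : 0;
    print "$name: ", ($round_trip_ok{$name} ? "ok" : "MISMATCH"), "\n";
}
```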
1684
1685 Below comes a verbose discussion of these flags. Note that a "codeset"
1686 is simply an abstract set of character-codepoint pairs, while an
1687 encoding takes those codepoint numbers and encodes them, in our case
1688 into octets. Unicode is (among other things) a codeset, UTF-8 is an
1689 encoding, and ISO-8859-1 (= latin 1) and ASCII are both codesets and
1690 encodings at the same time, which can be confusing.
1691
1692 "utf8" flag disabled
1693 When "utf8" is disabled (the default), then "encode"/"decode"
1694 generate and expect Unicode strings, that is, characters with high
1695 ordinal Unicode values (> 255) will be encoded as such characters,
1696 and likewise such characters are decoded as-is, no changes to them
1697 will be done, except "(re-)interpreting" them as Unicode codepoints
1698 or Unicode characters, respectively (to Perl, these are the same
1699 thing in strings unless you do funny/weird/dumb stuff).
1700
1701 This is useful when you want to do the encoding yourself (e.g. when
1702 you want to have UTF-16 encoded JSON texts) or when some other
1703 layer does the encoding for you (for example, when printing to a
1704 terminal using a filehandle that transparently encodes to UTF-8 you
1705 certainly do NOT want to UTF-8 encode your data first and have Perl
1706 encode it another time).
1707
1708 "utf8" flag enabled
1709 If the "utf8"-flag is enabled, "encode"/"decode" will encode all
1710 characters using the corresponding UTF-8 multi-byte sequence, and
1711 will expect your input strings to be encoded as UTF-8, that is, no
1712 "character" of the input string must have any value > 255, as UTF-8
1713 does not allow that.
1714
1715 The "utf8" flag therefore switches between two modes: disabled
1716 means you will get a Unicode string in Perl, enabled means you get
1717 a UTF-8 encoded octet/binary string in Perl.
1718
1719 "latin1", "binary" or "ascii" flags enabled
1720 With "latin1" (or "ascii") enabled, "encode" will escape characters
1721 with ordinal values > 255 (> 127 with "ascii") and encode the
1722 remaining characters as specified by the "utf8" flag. With
1723 "binary" enabled, ordinal values > 255 are illegal.
1724
1725 If "utf8" is disabled, then the result is also correctly encoded in
1726 those character sets (as both are proper subsets of Unicode,
1727 meaning that a Unicode string with all character values < 256 is
1728 the same thing as an ISO-8859-1 string, and a Unicode string with
1729 all character values < 128 is the same thing as an ASCII string in
1730 Perl).
1731
1732 If "utf8" is enabled, you still get a correct UTF-8-encoded string,
1733 regardless of these flags, just some more characters will be
1734 escaped using "\uXXXX" than before.
1735
1736 Note that ISO-8859-1-encoded strings are not compatible with UTF-8
1737 encoding, while ASCII-encoded strings are. That is because the
1738 ISO-8859-1 encoding is NOT a subset of UTF-8 (despite the
1739 ISO-8859-1 codeset being a subset of Unicode), while ASCII is.
1740
1741 Surprisingly, "decode" will ignore these flags and so treat all
1742 input values as governed by the "utf8" flag. If it is disabled,
1743 this allows you to decode ISO-8859-1- and ASCII-encoded strings, as
1744 both are strict subsets of Unicode. If it is enabled, you can correctly
1745 decode UTF-8 encoded strings.
1746
1747 So neither "latin1", "binary" nor "ascii" are incompatible with the
1748 "utf8" flag - they only govern when the JSON output engine escapes
1749 a character or not.
1750
1751 The main use for "latin1" or "binary" is to relatively efficiently
1752 store binary data as JSON, at the expense of breaking compatibility
1753 with most JSON decoders.
1754
1755 The main use for "ascii" is to force the output to not contain
1756 characters with values > 127, which means you can interpret the
1757 resulting string as UTF-8, ISO-8859-1, ASCII, KOI8-R or just about
1758 any character set and 8-bit-encoding, and still get the same data
1759 structure back. This is useful when your channel for JSON transfer
1760 is not 8-bit clean or the encoding might be mangled in between
1761 (e.g. in mail), and works because ASCII is a proper subset of most
1762 8-bit and multibyte encodings in use in the world.
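The effect of each flag on the encoder output can be seen in a small sketch such as this one. The sample string is an arbitrary choice; lengths are printed instead of the raw text to avoid "Wide character" warnings on output.

```perl
use strict;
use warnings;
use Cpanel::JSON::XS;

my $data = ["\x{e9}\x{20ac}"];    # U+00E9 and U+20AC

# utf8 disabled: a Unicode string of 6 characters.
my $chars = Cpanel::JSON::XS->new->encode($data);
printf "%d characters\n", length $chars;    # 6 characters

# utf8 enabled: UTF-8 octets (2 for U+00E9, 3 for U+20AC).
my $octets = Cpanel::JSON::XS->new->utf8->encode($data);
printf "%d octets\n", length $octets;       # 9 octets

# ascii: everything above 127 is escaped.
my $ascii = Cpanel::JSON::XS->new->ascii->encode($data);
print "$ascii\n";                           # ["\u00e9\u20ac"]

# latin1: only characters above 255 are escaped.
my $latin1 = Cpanel::JSON::XS->new->latin1->encode($data);
printf "%d characters\n", length $latin1;   # 11 characters

# decode ignores ascii/latin1, so it accepts unescaped input as well.
my $decoded = Cpanel::JSON::XS->new->ascii->decode($chars);
```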
1763
1764 JSON and ECMAscript
1765 JSON syntax is based on how literals are represented in javascript (the
1766 not-standardized predecessor of ECMAscript) which is presumably why it
1767 is called "JavaScript Object Notation".
1768
1769 However, JSON is not a subset (and also not a superset of course) of
1770 ECMAscript (the standard) or javascript (whatever browsers actually
1771 implement).
1772
1773 If you want to use javascript's "eval" function to "parse" JSON, you
1774 might run into parse errors for valid JSON texts, or the resulting data
1775 structure might not be queryable:
1776
1777 One of the problems is that U+2028 and U+2029 are valid characters
1778 inside JSON strings, but are not allowed in ECMAscript string literals,
1779 so the following Perl fragment will not output something that can be
1780 guaranteed to be parsable by javascript's "eval":
1781
1782 use Cpanel::JSON::XS;
1783
1784 print encode_json [chr 0x2028];
1785
1786 The right fix for this is to use a proper JSON parser in your
1787 javascript programs, and not rely on "eval" (see for example Douglas
1788 Crockford's json2.js parser).
1789
1790 If this is not an option, you can, as a stop-gap measure, simply encode
1791 to ASCII-only JSON:
1792
1793 use Cpanel::JSON::XS;
1794
1795 print Cpanel::JSON::XS->new->ascii->encode ([chr 0x2028]);
1796
1797 Note that this will enlarge the resulting JSON text quite a bit if you
1798 have many non-ASCII characters. You might be tempted to run some
1799 regexes to only escape U+2028 and U+2029, e.g.:
1800
1801 # DO NOT USE THIS!
1802 my $json = Cpanel::JSON::XS->new->utf8->encode ([chr 0x2028]);
1803 $json =~ s/\xe2\x80\xa8/\\u2028/g; # escape U+2028
1804 $json =~ s/\xe2\x80\xa9/\\u2029/g; # escape U+2029
1805 print $json;
1806
1807 Note that this is a bad idea: the above only works for U+2028 and
1808 U+2029 and thus only for fully ECMAscript-compliant parsers. Many
1809 existing javascript implementations, however, have issues with other
1810 characters as well - using "eval" naively simply will cause problems.
1811
1812 Another problem is that some javascript implementations reserve some
1813 property names for their own purposes (which probably makes them non-
1814 ECMAscript-compliant). For example, Iceweasel reserves the "__proto__"
1815 property name for its own purposes.
1816
1817 If that is a problem, you could try to filter the resulting JSON
1818 output for these property strings, e.g.:
1819
1820 $json =~ s/"__proto__"\s*:/"__proto__renamed":/g;
1821
1822 This works because "__proto__" is not valid outside of strings, so
1823 every occurrence of ""__proto__"\s*:" must be a string used as property
1824 name.
1825
1826 Unicode non-characters between U+FFFD and U+10FFFF are decoded either
1827 to the recommended U+FFFD REPLACEMENT CHARACTER (see Unicode PR #121:
1828 Recommended Practice for Replacement Characters), or in the binary or
1829 relaxed mode left as is, keeping the illegal non-characters as before.
1830
1831 Raw non-Unicode characters outside the valid unicode range now fail to
1832 parse, because "A string is a sequence of zero or more Unicode
1833 characters" (RFC 7159 section 1) and "JSON text SHALL be encoded in
1834 Unicode" (RFC 7159 section 8.1). We now use the UTF8_DISALLOW_SUPER flag
1835 when parsing unicode.
1836
1837 If you know of other incompatibilities, please let me know.
1838
1839 JSON and YAML
1840 You often hear that JSON is a subset of YAML. However, in general there is
1841 no way to configure JSON::XS to output a data structure as valid YAML that
1842 works in all cases. If you really must use Cpanel::JSON::XS to
1843 generate YAML, you should use this algorithm (subject to change in
1844 future versions):
1845
1846 my $to_yaml = Cpanel::JSON::XS->new->utf8->space_after (1);
1847 my $yaml = $to_yaml->encode ($ref) . "\n";
1848
1849 This will usually generate JSON texts that also parse as valid YAML.
1850
1851 SPEED
1852 It seems that JSON::XS is surprisingly fast, as shown in the following
1853 tables. They have been generated with the help of the "eg/bench"
1854 program in the JSON::XS distribution, to make it easy to compare on
1855 your own system.
1856
1857 Together with Data::MessagePack and Sereal, JSON::XS is one of the
1858 fastest serializers, because JSON and JSON::XS do not support backrefs
1859 (no graph structures), only trees. Storable supports backrefs, i.e.
1860 graphs. Data::MessagePack encodes its data in binary (as Storable does)
1861 and supports only a very simple subset of JSON.
1862
1863 First comes a comparison between various modules using a very short
1864 single-line JSON string (also available at
1865 <http://dist.schmorp.de/misc/json/short.json>).
1866
1867 {"method": "handleMessage", "params": ["user1",
1868 "we were just talking"], "id": null, "array":[1,11,234,-5,1e5,1e7,
1869 1, 0]}
1870
1871 It shows the number of encodes/decodes per second (JSON::XS uses the
1872 functional interface, while Cpanel::JSON::XS/2 uses the OO interface
1873 with pretty-printing and hash key sorting enabled, Cpanel::JSON::XS/3
1874 enables shrink. JSON::DWIW/DS uses the deserialize function, while
1875 JSON::DWIW/FJ uses the from_json method). Higher is better:
1876
1877 module | encode | decode |
1878 --------------+------------+------------+
1879 JSON::DWIW/DS | 86302.551 | 102300.098 |
1880 JSON::DWIW/FJ | 86302.551 | 75983.768 |
1881 JSON::PP | 15827.562 | 6638.658 |
1882 JSON::Syck | 63358.066 | 47662.545 |
1883 JSON::XS | 511500.488 | 511500.488 |
1884 JSON::XS/2 | 291271.111 | 388361.481 |
1885 JSON::XS/3 | 361577.931 | 361577.931 |
1886 Storable | 66788.280 | 265462.278 |
1887 --------------+------------+------------+
1888
1889 That is, JSON::XS is almost six times faster than JSON::DWIW on
1890 encoding, about five times faster on decoding, and over thirty to
1891 seventy times faster than JSON's pure perl implementation. It also
1892 compares favourably to Storable for small amounts of data.
1893
1894 Using a longer test string (roughly 18KB, generated from Yahoo! Locals
1895 search API, <http://dist.schmorp.de/misc/json/long.json>):
1896
1897 module | encode | decode |
1898 --------------+------------+------------+
1899 JSON::DWIW/DS | 1647.927 | 2673.916 |
1900 JSON::DWIW/FJ | 1630.249 | 2596.128 |
1901 JSON::PP | 400.640 | 62.311 |
1902 JSON::Syck | 1481.040 | 1524.869 |
1903 JSON::XS | 20661.596 | 9541.183 |
1904 JSON::XS/2 | 10683.403 | 9416.938 |
1905 JSON::XS/3 | 20661.596 | 9400.054 |
1906 Storable | 19765.806 | 10000.725 |
1907 --------------+------------+------------+
1908
1909 Again, JSON::XS leads by far (except for Storable which,
1910 unsurprisingly, decodes a bit faster).
1911
1912 On large strings containing lots of high Unicode characters, some
1913 modules (such as JSON::PC) seem to decode faster than JSON::XS, but the
1914 result will be broken due to missing (or wrong) Unicode handling.
1915 Others refuse to decode or encode properly, so it was impossible to
1916 prepare a fair comparison table for that case.
1917
1918 For updated graphs see
1919 <https://github.com/Sereal/Sereal/wiki/Sereal-Comparison-Graphs>
1920
1922 As long as you only serialize data that can be directly expressed in
1923 JSON, "Cpanel::JSON::XS" is incapable of generating invalid JSON output
1924 (modulo bugs, but "JSON::XS" has found more bugs in the official JSON
1925 testsuite (1) than the official JSON testsuite has found in "JSON::XS"
1926 (0)). "Cpanel::JSON::XS" is currently the only known JSON decoder
1927 which passes all <http://seriot.ch/parsing_json.html> tests, while
1928 also being the fastest.
1929
1930 When you have trouble decoding JSON generated by this module using
1931 other decoders, then it is very likely that you have an encoding
1932 mismatch or the other decoder is broken.
1933
1934 When decoding, "JSON::XS" is strict by default and will likely catch
1935 all errors. There are currently two settings that change this:
1936 "relaxed" makes "JSON::XS" accept (but not generate) some non-standard
1937 extensions, and "allow_tags" or "allow_blessed" will allow you to
1938 encode and decode Perl objects, at the cost of being totally insecure
1939 and not outputting valid JSON anymore.
1940
1941 JSON-XS-3.01 broke interoperability with JSON-2.90 with booleans. See
1942 JSON.
1943
1944 Cpanel::JSON::XS needs to know the JSON and JSON::XS versions to be
1945 able to work with those objects, especially when encoding a boolean like
1946 "{"is_true":true}". So you need to load these modules beforehand.
1947
1948 true/false overloading and boolean representations are supported.
1949
1950 JSON::XS and JSON::PP representations are accepted and older JSON::XS
1951 accepts Cpanel::JSON::XS booleans. All JSON modules (JSON, JSON::PP,
1952 JSON::XS, Cpanel::JSON::XS) produce JSON::PP::Boolean objects; only Mojo
1953 and JSON::YAJL do not. Mojo produces Mojo::JSON::_Bool and
1954 JSON::YAJL::Parser just an unblessed IV.
1955
1956 Cpanel::JSON::XS accepts JSON::PP::Boolean and Mojo::JSON::_Bool
1957 objects as booleans.
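A brief sketch of how decoded booleans behave and how they survive a round trip. It assumes the "Cpanel::JSON::XS::is_bool" helper is available in the installed release; the key names are arbitrary.

```perl
use strict;
use warnings;
use Cpanel::JSON::XS;

my $data = decode_json('{"is_true":true,"is_false":false}');

# Decoded booleans are objects that behave as true/false in conditions.
print "class of a decoded boolean: ", ref($data->{is_true}), "\n";
print "is_true is truthy\n" if $data->{is_true};
print "is_false is falsy\n" unless $data->{is_false};

# is_bool distinguishes them from ordinary scalars such as 1 or 0.
my $recognized = Cpanel::JSON::XS::is_bool($data->{is_true}) ? 1 : 0;

# Encoding them again yields JSON true/false, not 1/0 or strings.
my $json = Cpanel::JSON::XS->new->canonical->encode($data);
print "$json\n";    # {"is_false":false,"is_true":true}
```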
1958
1959 I cannot think of any reason to still use JSON::XS anymore.
1960
1961 TAGGED VALUE SYNTAX AND STANDARD JSON EN/DECODERS
1962 When you use "allow_tags" to use the extended (and also nonstandard and
1963 invalid) JSON syntax for serialized objects, and you still want to
1964 decode the generated serialized objects, you can run a regex to replace
1965 the tagged syntax by standard JSON arrays (it only works for "normal"
1966 package names without comma, newlines or single colons). First, the
1967 readable Perl version:
1968
1969 # if your FREEZE methods return no values, you need this replace first:
1970 $json =~ s/\( \s* (" (?: [^\\":,]+|\\.|::)* ") \s* \) \s* \[\s*\]/[$1]/gx;
1971
1972 # this works for non-empty constructor arg lists:
1973 $json =~ s/\( \s* (" (?: [^\\":,]+|\\.|::)* ") \s* \) \s* \[/[$1,/gx;
1974
1975 And here is a less readable version that is easy to adapt to other
1976 languages:
1977
1978 $json =~ s/\(\s*("([^\\":,]+|\\.|::)*")\s*\)\s*\[/[$1,/g;
1979
1980 Here is an ECMAScript version (same regex):
1981
1982 json = json.replace (/\(\s*("([^\\":,]+|\\.|::)*")\s*\)\s*\[/g, "[$1,");
1983
1984 Since this syntax converts to standard JSON arrays, it might be hard to
1985 distinguish serialized objects from normal arrays. You can prepend a
1986 "magic number" as first array element to reduce chances of a collision:
1987
1988 $json =~ s/\(\s*("([^\\":,]+|\\.|::)*")\s*\)\s*\[/["XU1peReLzT4ggEllLanBYq4G9VzliwKF",$1,/g;
1989
1990 And after decoding the JSON text, you could walk the data structure
1991 looking for arrays with a first element of
1992 "XU1peReLzT4ggEllLanBYq4G9VzliwKF".
1993
1994 The same approach can be used to create the tagged format with another
1995 encoder. First, you create an array with the magic string as first
1996 member, the classname as second, and constructor arguments last, encode
1997 it as part of your JSON structure, and then:
1998
1999 $json =~ s/\[\s*"XU1peReLzT4ggEllLanBYq4G9VzliwKF"\s*,\s*("([^\\":,]+|\\.|::)*")\s*,/($1)[/g;
2000
2001 Again, this has some limitations - the magic string must not be encoded
2002 with character escapes, and the constructor arguments must be non-
2003 empty.
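Both directions of the conversion can be tried on a sample text with plain Perl, as in this sketch. The sample JSON text and the "My::Date" class name are made up for the illustration; the regexes are the ones given above, with the magic string interpolated from a variable.

```perl
use strict;
use warnings;

my $magic  = "XU1peReLzT4ggEllLanBYq4G9VzliwKF";
my $tagged = '{"when":("My::Date")[2013,10,29]}';

# Tagged syntax -> standard JSON array, magic string first.
(my $standard = $tagged)
    =~ s/\(\s*("([^\\":,]+|\\.|::)*")\s*\)\s*\[/["$magic",$1,/g;
print "$standard\n";
# {"when":["XU1peReLzT4ggEllLanBYq4G9VzliwKF","My::Date",2013,10,29]}

# Standard JSON array with magic string -> tagged syntax.
(my $restored = $standard)
    =~ s/\[\s*"\Q$magic\E"\s*,\s*("([^\\":,]+|\\.|::)*")\s*,/($1)[/g;
print "$restored\n";
# {"when":("My::Date")[2013,10,29]}
```

The intermediate text is standard JSON, so any decoder can read it; the restored text again needs a decoder with "allow_tags" enabled.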
2004
2006 Since this module was written, Google has written a new JSON RFC, RFC
2007 7159 (and RFC7158). Unfortunately, this RFC breaks compatibility with
2008 both the original JSON specification on www.json.org and RFC4627.
2009
2010 As far as I can see, you can get partial compatibility when parsing by
2011 using "->allow_nonref". However, consider the security implications of
2012 doing so.
2013
2014 I haven't decided yet when to break compatibility with RFC4627 by
2015 default (and potentially leave applications insecure) and change the
2016 default to follow RFC7159, but application authors are well advised to
2017 call "->allow_nonref(0)" even if this is the current default, if they
2018 cannot handle non-reference values, in preparation for the day when the
2019 default will change.
2020
SECURITY CONSIDERATIONS
2022 JSON::XS and Cpanel::JSON::XS are not only fast. JSON is generally the
2023 most secure serializing format, because it is the only one besides
2024 Data::MessagePack that does not deserialize objects by default. This
2025 holds for all languages, not just perl. The binary variant BSON (MongoDB)
2026 does more but is unsafe.
2027
2028 It is trivial for any attacker to create such serialized objects in
2029 JSON and trick perl into expanding them, thereby triggering certain
2030 methods. Watch <https://www.youtube.com/watch?v=Gzx6KlqiIZE> for an
2031 exploit demo for "CVE-2015-1592 SixApart MovableType Storable Perl Code
2032 Execution" for a deserializer which expands objects. Deserializing
2033 even coderefs (methods, functions) or external data would be considered
2034 the most dangerous.
2035
2036 Security relevant overview of serializers regarding deserializing
2037 objects by default:
2038
2039 Objects Coderefs External Data
2040
2041 Data::Dumper YES YES YES
2042 Storable YES NO (def) NO
2043 Sereal YES NO NO
2044 YAML YES NO NO
2045 B::C YES YES YES
2046 B::Bytecode YES YES YES
2047 BSON YES YES NO
2048 JSON::SL YES NO YES
2049 JSON NO (def) NO NO
2050 Data::MessagePack NO NO NO
2051 XML NO NO YES
2052
2053 Pickle YES YES YES
2054 PHP Deserialize YES NO NO
2055
2056 When you are using JSON in a protocol, talking to untrusted potentially
2057 hostile creatures requires relatively few measures.
2058
2059 First of all, your JSON decoder should be secure, that is, should not
2060 have any buffer overflows. Obviously, this module should ensure that.
2061
2062 Second, you need to avoid resource-starving attacks. That means you
2063 should limit the size of JSON texts you accept, or make sure that when
2064 your resources run out, that's just fine (e.g. by using a separate
2065 process that can crash safely). The size of a JSON text in octets or
2066 characters is usually a good indication of the size of the resources
2067 required to decode it into a Perl structure. While JSON::XS can check
2068 the size of the JSON text, it might be too late when you already have
2069 it in memory, so you might want to check the size before you accept the
2070 string.
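One way to combine both checks is sketched below: the length is tested before parsing, and "max_size" on the coder acts as a backstop. The limit of 64 octets and the helper name "decode_limited" are assumptions chosen for this example only.

```perl
use strict;
use warnings;
use Cpanel::JSON::XS;

my $max_octets = 64;    # hypothetical limit for this example
my $coder = Cpanel::JSON::XS->new->utf8->max_size($max_octets);

# Hypothetical helper: returns the decoded data, or undef if the text
# is too large or does not parse.
sub decode_limited {
    my ($text) = @_;
    return if length($text) > $max_octets;    # reject before parsing
    return eval { $coder->decode($text) };
}

my $small = decode_limited('[1,2,3]');
print scalar(@$small), " elements\n";         # 3 elements

my $big      = '[' . join(',', (1) x 100) . ']';
my $rejected = decode_limited($big);
print "oversized text rejected\n" unless defined $rejected;

# The coder itself also refuses oversized input.
my $backstop_ok = eval { $coder->decode($big); 1 };
print "max_size refused the text\n" unless $backstop_ok;
```

In a real protocol the length check belongs where the text is read, for example by bounding the number of octets read from the socket.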
2071
2072 Third, Cpanel::JSON::XS recurses using the C stack when decoding
2073 objects and arrays. The C stack is a limited resource: for instance, on
2074 my amd64 machine with 8MB of stack size I can decode around 180k nested
2075 arrays but only 14k nested JSON objects (due to perl itself recursing
2076 deeply on croak to free the temporary). If that is exceeded, the
2077 program crashes. To be conservative, the default nesting limit is set
2078 to 512. If your process has a smaller stack, you should adjust this
2079 setting accordingly with the "max_depth" method.
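As a small sketch of that setting, a coder limited to three nesting levels accepts a text nested three deep and rejects one nested four deep. The limit of 3 is deliberately tiny for illustration.

```perl
use strict;
use warnings;
use Cpanel::JSON::XS;

my $coder = Cpanel::JSON::XS->new->max_depth(3);

# Three nested arrays: within the limit.
my $shallow_ok = eval { $coder->decode('[[[1]]]'); 1 };
print "3 levels: ", ($shallow_ok ? "accepted" : "rejected"), "\n";    # accepted

# Four nested arrays: exceeds the limit, decode throws an exception.
my $deep_ok = eval { $coder->decode('[[[[1]]]]'); 1 };
print "4 levels: ", ($deep_ok ? "accepted" : "rejected"), "\n";       # rejected
```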
2080
2081 Also keep in mind that Cpanel::JSON::XS might leak contents of your
2082 Perl data structures in its error messages, so when you serialize
2083 sensitive information you might want to make sure that exceptions
2084 thrown by JSON::XS will not end up in front of untrusted eyes.
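One possible pattern, sketched under the assumption that detailed errors may be kept in a private log while clients only see a generic message. The helper name "encode_for_client", the "Secret::Thing" class and the in-memory log are all hypothetical.

```perl
use strict;
use warnings;
use Cpanel::JSON::XS;

my @private_log;    # stand-in for a log that only operators can read

# Hypothetical helper: never lets the module's own error text escape.
sub encode_for_client {
    my ($data) = @_;
    my $json = eval { encode_json($data) };
    return $json if defined $json;
    push @private_log, "JSON encoding failed: $@";
    die "Internal error while building the response\n";
}

my $ok = encode_for_client({ status => "ok" });
print "$ok\n";    # {"status":"ok"}

# A blessed object cannot be encoded with the default settings.
my $error = eval { encode_for_client([ bless {}, "Secret::Thing" ]); 1 } ? "" : $@;
print "client sees: $error";
```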
2085
2086 If you are using Cpanel::JSON::XS to return packets for consumption by
2087 JavaScript scripts in a browser you should have a look at
2088 <http://blog.archive.jpsykes.com/47/practical-csrf-and-json-security/>
2089 to see whether you are vulnerable to some common attack vectors (which
2090 really are browser design bugs, but it is still you who will have to
2091 deal with it, as major browser developers care only for features, not
2092 about getting security right). You might also want to look at the
2093 Mojo::JSON special escape rules to prevent XSS attacks.
2094
2096 TL;DR: Due to security concerns, Cpanel::JSON::XS will not allow scalar
2097 data in JSON texts by default - you need to create your own
2098 Cpanel::JSON::XS object and enable "allow_nonref":
2099
2100 my $json = Cpanel::JSON::XS->new->allow_nonref;
2101
2102 $text = $json->encode ($data);
2103 $data = $json->decode ($text);
2104
2105 The long version: JSON being an important and supposedly stable format,
2106 the IETF standardized it as RFC 4627 in 2006. Unfortunately the
2107 inventor of JSON, Douglas Crockford, unilaterally changed the definition
2108 of JSON in javascript. Rather than create a fork, the IETF decided to
2109 standardize the new syntax (apparently, so I was told, without finding
2110 it very amusing).
2111
2112 The biggest difference between the original JSON and the new JSON is
2113 that the new JSON supports scalars (anything other than arrays and
2114 objects) at the top-level of a JSON text. While this is strictly
2115 backwards compatible to older versions, it breaks a number of protocols
2116 that relied on sending JSON back-to-back, and is a minor security
2117 concern.
2118
2119 For example, imagine you have two banks communicating, and on one side,
2120 the JSON coder gets upgraded. Two messages, such as 10 and 1000 might
2121 then be confused to mean 101000, something that couldn't happen in the
2122 original JSON, because neither of these messages would be valid JSON.
2123
2124 If one side accepts these messages, then an upgrade in the coder on
2125 either side could result in this becoming exploitable.
2126
2127 This module has always allowed these messages as an optional extension,
2128 by default disabled. The security concerns are the reason why the
2129 default is still disabled, but future versions might/will likely
2130 upgrade to the newer RFC as default format, so you are advised to check
2131 your implementation and/or override the default with "->allow_nonref
2132 (0)" to ensure that future versions are safe.
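The difference between the two settings can be sketched as follows. Both coders set "allow_nonref" explicitly, so the example does not depend on what the default is in a given release.

```perl
use strict;
use warnings;
use Cpanel::JSON::XS;

my $strict  = Cpanel::JSON::XS->new->allow_nonref(0);
my $relaxed = Cpanel::JSON::XS->new->allow_nonref(1);

# A bare string at the top level is rejected by the strict coder ...
my $strict_decoded = eval { $strict->decode('"hello"') };
print "strict decode failed\n" unless defined $strict_decoded;

# ... and accepted by the relaxed one.
my $relaxed_decoded = $relaxed->decode('"hello"');
print "$relaxed_decoded\n";    # hello

# The same holds for encoding.
my $relaxed_json = $relaxed->encode("hello");
print "$relaxed_json\n";       # "hello"

my $strict_encode_ok = eval { $strict->encode("hello"); 1 };
print "strict encode failed\n" unless $strict_encode_ok;
```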
2133
2135 Cpanel::JSON::XS has proper ithreads support, unlike JSON::XS. If you
2136 encounter any bugs with thread support please report them.
2137
2138 From version 4.00 to 4.19 you couldn't encode true with threads::shared
2139 magic.
2140
2142 While the goal of the Cpanel::JSON::XS module is to be correct, that
2143 unfortunately does not mean it's bug-free, only that the author thinks
2144 its design is bug-free. If you keep reporting bugs and tests they will
2145 be fixed swiftly, though.
2146
2147 Since the JSON::XS author refuses to use a public bugtracker and
2148 prefers private emails, we use the tracker at github, so you might want
2149 to report any issues twice: once in private to MLEHMANN to be fixed in
2150 JSON::XS and once to our public tracker. Issues fixed by JSON::XS
2151 with a new release will also be backported to Cpanel::JSON::XS and
2152 5.6.2, as long as cPanel relies on 5.6.2 and Cpanel::JSON::XS as our
2153 serializer of choice.
2154
2155 <https://github.com/rurban/Cpanel-JSON-XS/issues>
2156
2158 This module is available under the same licences as perl, the Artistic
2159 license and the GPL.
2160
2162 The cpanel_json_xs command line utility for quick experiments.
2163
2164 JSON, JSON::XS, JSON::MaybeXS, Mojo::JSON, Mojo::JSON::MaybeXS,
2165 JSON::SL, JSON::DWIW, JSON::YAJL, JSON::Any, Test::JSON,
2166 Locale::Wolowitz, <https://metacpan.org/search?q=JSON>
2167
2168 <https://tools.ietf.org/html/rfc7159>
2169
2170 <https://tools.ietf.org/html/rfc4627>
2171
2173 Reini Urban <rurban@cpan.org>
2174
2175 Marc Lehmann <schmorp@schmorp.de>, http://home.schmorp.de/
2176
2178 Reini Urban <rurban@cpan.org>
2179
2180
2181
2182perl v5.36.0 2023-02-22 XS(3)