1XS(3) User Contributed Perl Documentation XS(3)
2
3
4
5 NAME
6 Cpanel::JSON::XS - cPanel fork of JSON::XS, fast and correct
7 serializing
8
9 SYNOPSIS
10 use Cpanel::JSON::XS;
11
12 # exported functions, they croak on error
13 # and expect/generate UTF-8
14
15 $utf8_encoded_json_text = encode_json $perl_hash_or_arrayref;
16 $perl_hash_or_arrayref = decode_json $utf8_encoded_json_text;
17
18 # OO-interface
19
20 $coder = Cpanel::JSON::XS->new->ascii->pretty->allow_nonref;
21 $pretty_printed_unencoded = $coder->encode ($perl_scalar);
22 $perl_scalar = $coder->decode ($unicode_json_text);
23
24 # Note that Perl 5.6 lacks most of the smart utf8 and encoding features
25 # of newer releases.
26
27 # Note that L<JSON::MaybeXS> will automatically use Cpanel::JSON::XS
28 # if available, at virtually no speed overhead either, so you should
29 # be able to just:
30
31 use JSON::MaybeXS;
32
33 # and do the same things, except that you have a pure-perl fallback now.
34
35 Note that this module will be replaced by a new JSON::Safe module soon,
36 with the same API but with guaranteed safe defaults.
37
38 DESCRIPTION
39 This module converts Perl data structures to JSON and vice versa. Its
40 primary goal is to be correct and its secondary goal is to be fast. To
41 reach the latter goal it was written in C.
42
43 As this is the n-th-something JSON module on CPAN, what was the reason
44 to write yet another JSON module? While it seems there are many JSON
45 modules, none of them correctly handle all corner cases, and in most
46 cases their maintainers are unresponsive, gone missing, or not
47 listening to bug reports for other reasons.
48
49 See below for the cPanel fork.
50
51 See MAPPING, below, on how Cpanel::JSON::XS maps perl values to JSON
52 values and vice versa.
53
54 FEATURES
55 • correct Unicode handling
56
57 This module knows how to handle Unicode with Perl versions higher
58 than 5.8.5, documents how and when it does so, and even documents
59 what "correct" means.
60
61 • round-trip integrity
62
63 When you serialize a perl data structure using only data types
64 supported by JSON and Perl, the deserialized data structure is
65 identical on the Perl level. (e.g. the string "2.0" doesn't
66 suddenly become "2" just because it looks like a number). There are
67 minor exceptions to this, read the MAPPING section below to learn
68 about those.
69
70 • strict checking of JSON correctness
71
72 There is no guessing, no generating of illegal JSON texts by
73 default, and only JSON is accepted as input by default. The latter
74 is a security feature.
75
76 • fast
77
78 Compared to other JSON modules and other serializers such as
79 Storable, this module usually compares favourably in terms of
80 speed, too.
81
82 • simple to use
83
84 This module has both a simple functional interface as well as an
85 object oriented interface.
86
87 • reasonably versatile output formats
88
89 You can choose between the most compact guaranteed-single-line
90 format possible (nice for simple line-based protocols), a pure-
91 ASCII format (for when your transport is not 8-bit clean, still
92 supports the whole Unicode range), or a pretty-printed format (for
93 when you want to read that stuff). Or you can combine those
94 features in whatever way you like.
95
96 cPanel fork
97 Since the original author MLEHMANN has no public bugtracker, this
98 cPanel fork is now hosted on GitHub.
99
100 src repo: <https://github.com/rurban/Cpanel-JSON-XS>
101 original: <http://cvs.schmorp.de/JSON-XS/>
102
103 RT: <https://github.com/rurban/Cpanel-JSON-XS/issues> or
104 <https://rt.cpan.org/Public/Dist/Display.html?Queue=Cpanel-JSON-XS>
105
106 Changes to JSON::XS
107
108 - bare hashkeys are now checked for utf8. (GH #209)
109
110 - stricter decode_json() as documented: non-refs are disallowed,
111 which makes it safe by default.
112 Added a 2nd optional argument. decode() now honors allow_nonref.
113
114 - fixed encode of numbers for dual-vars. Different string
115 representations are preserved, but numbers with temporary strings
116 which represent the same number are here treated as numbers, not
117 strings. Cpanel::JSON::XS is a bit slower, but preserves numeric
118 types better.
119
120 - numbers ending with .0 stay numbers, they are not converted to
121 integers. [#63] dual-vars which are represented as number not
122 integer (42+"bar" != 5.8.9) are now encoded as number (=> 42.0)
123 because internally it's now a NOK type. However !!1 which is
124 wrongly encoded in 5.8 as "1"/1.0 is still represented as integer.
125
126 - different handling of inf/nan. Default now to null, optionally with
127 stringify_infnan() to "inf"/"nan". [#28, #32]
128
129 - added "binary" extension, non-JSON and non JSON parsable, allows
130 "\xNN" and "\NNN" sequences.
131
132 - 5.6.2 support; sacrificing some utf8 features (assuming bytes
133 all-over), no multi-byte unicode characters with 5.6.
134
135 - interop for true/false overloading. JSON::XS, JSON::PP and Mojo::JSON
136 representations for booleans are accepted and JSON::XS accepts
137 Cpanel::JSON::XS booleans [#13, #37]
138 Fixed overloading of booleans. Cpanel::JSON::XS::true stringifies
139 again to "1", not "true", analogous to all other JSON modules.
141
142 - native boolean mapping of yes and no to true and false, as in
143 YAML::XS.
144 In perl "!0" is yes, "!1" is no.
145 The JSON value true maps to 1, false maps to 0. [#39]
146
147 - support arbitrary stringification with encode, with convert_blessed
148 and allow_blessed.
149
150 - ithread support. Cpanel::JSON::XS is thread-safe, JSON::XS is not.
151
152 - is_bool can be called as a method, JSON::XS::is_bool cannot.
153
154 - performance optimizations for threaded Perls
155
156 - relaxed mode, allowing many popular extensions
157
158 - protect our magic object from corruption by wrong or missing external
159 methods, like FREEZE/THAW or serialization with other methods.
160
161 - additional fixes for:
162
163 - #208 - no security-relevant out-of-bounds reading of module memory
164 when decoding hash keys without ending ':'
165
166 - [cpan #88061] AIX atof without USE_LONG_DOUBLE
167
168 - #10 unshare_hek crash
169
170 - #7, #29 avoid re-blessing where possible. It fails in JSON::XS for
171 READONLY values, i.e. restricted hashes.
172
173 - #41 overloading of booleans, use the object not the reference.
174
175 - #62 -Dusequadmath conversion and no SEGV.
176
177 - #72 parsing of values followed by \0, like 1\0, does fail.
178
179 - #72 parsing of illegal unicode or non-unicode characters.
180
181 - #96 locale-insensitive numeric conversion.
182
183 - #154 numeric conversion fixed since 5.22, using the same strtold as perl5.
184
185 - #167 sort tied hashes with canonical.
186
187 - #212 fix utf8 object stringification
188
189 - public maintenance and bugtracker
190
191 - use ppport.h, sanify XS.xs comment styles, harness C coding style
192
193 - common::sense is optional. When available it is not used in the
194 published production module, just during development and testing.
195
196 - extended testsuite, passes all
197 http://seriot.ch/projects/parsing_json.html
198 tests. In fact it is the only known JSON decoder which does so,
199 while also being the fastest.
200
201 - support many more options and methods from JSON::PP:
202 stringify_infnan, allow_unknown, allow_stringify, allow_barekey,
203 encode_stringify, allow_bignum, allow_singlequote,
204 dupkeys_as_arrayref,
205 sort_by (partially), escape_slash, convert_blessed, ...
206 optional decode_json(, allow_nonref) arg.
207 relaxed implements allow_dupkeys.
208
209 - support all 5 unicode BOMs: UTF-8, UTF-16LE, UTF-16BE, UTF-32LE,
210 UTF-32BE, encoding internally to UTF-8.
211
212 FUNCTIONAL INTERFACE
213 The following convenience methods are provided by this module. They are
214 exported by default:
215
216 $json_text = encode_json $perl_scalar, [json_type]
217 Converts the given Perl data structure to a UTF-8 encoded, binary
218 string (that is, the string contains octets only). Croaks on error.
219
220 This function call is functionally identical to:
221
222 $json_text = Cpanel::JSON::XS->new->utf8->encode ($perl_scalar, $json_type)
223
224 except being faster.
225
226 For the type argument see Cpanel::JSON::XS::Type.
227
228 $perl_scalar = decode_json $json_text [, $allow_nonref [, my $json_type
229 ] ]
230 The opposite of "encode_json": expects a UTF-8 (binary) string
231 holding a JSON object or array and tries to parse it as UTF-8 encoded
232 JSON text, returning the resulting reference. Croaks on error.
233
234 This function call is functionally identical to:
235
236 $perl_scalar = Cpanel::JSON::XS->new->utf8->decode ($json_text, $json_type)
237
238 except being faster.
239
240 Note that decode_json in Cpanel::JSON::XS versions older than
241 3.0116 and in JSON::XS did not set allow_nonref but allowed
242 non-references anyway, due to a bug in the decoder.
243
244 If the new 2nd optional $allow_nonref argument is set and not
245 false, the "allow_nonref" option will be set and the function will
246 act as described in the more relaxed RFC 7159, allowing all values
247 such as objects, arrays, strings, numbers, "null", "true", and
248 "false". See ""OLD" VS. "NEW" JSON (RFC 4627 VS. RFC 7159)" below
249 for why you don't want to do that.
250
251 For the 3rd optional type argument see Cpanel::JSON::XS::Type.
252
253 $is_boolean = Cpanel::JSON::XS::is_bool $scalar
254 Returns true if the passed scalar represents either
255 "JSON::PP::true" or "JSON::PP::false", two constants that act like
256 1 and 0, respectively and are used to represent JSON "true" and
257 "false" values in Perl. (Also recognizes the booleans produced by
258 JSON::XS.)
259
260 See MAPPING, below, for more information on how JSON values are
261 mapped to Perl.
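
The three functions above can be tried together in a short script. This is a minimal sketch, not part of the original documentation; the variable names are illustrative, and the strict call is wrapped in eval because it is expected to croak on a non-reference value.

```perl
use strict;
use warnings;
use Cpanel::JSON::XS qw(encode_json decode_json);

# Encode a Perl structure; the result is a string of UTF-8 octets.
my $json_text = encode_json({ name => "caf\x{e9}", tags => ["a", "b"] });

# Decode it again; the top-level value must be an object or array.
my $data = decode_json($json_text);
print "round trip ok\n" if $data->{name} eq "caf\x{e9}";

# A bare JSON string should only be accepted when the 2nd argument is true.
my $strict = eval { decode_json('"hello"') };
print "strict decode croaked\n" unless defined $strict;
my $loose = decode_json('"hello"', 1);

# JSON booleans decode to special values that is_bool recognizes,
# while a plain number does not count as a boolean.
my $flags         = decode_json('{"ok":true,"count":1}');
my $ok_is_bool    = Cpanel::JSON::XS::is_bool($flags->{ok})    ? 1 : 0;
my $count_is_bool = Cpanel::JSON::XS::is_bool($flags->{count}) ? 1 : 0;
print "ok is a boolean, count is not\n" if $ok_is_bool && !$count_is_bool;
```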
262
263 DEPRECATED FUNCTIONS
264 from_json
265 from_json has been renamed to decode_json
266
267 to_json
268 to_json has been renamed to encode_json
269
270 A FEW NOTES ON UNICODE AND PERL
271 Since this often leads to confusion, here are a few very clear words on
272 how Unicode works in Perl, modulo bugs.
273
274 1. Perl strings can store characters with ordinal values > 255.
275 This enables you to store Unicode characters as single characters
276 in a Perl string - very natural.
277
278 2. Perl does not associate an encoding with your strings.
279 ... until you force it to, e.g. when matching it against a regex,
280 or printing the scalar to a file, in which case Perl either
281 interprets your string as locale-encoded text, octets/binary, or as
282 Unicode, depending on various settings. In no case is an encoding
283 stored together with your data, it is use that decides encoding,
284 not any magical meta data.
285
286 3. The internal utf-8 flag has no meaning with regards to the encoding
287 of your string.
288 4. A "Unicode String" is simply a string where each character can be
289 validly interpreted as a Unicode code point.
290 If you have UTF-8 encoded data, it is no longer a Unicode string,
291 but a Unicode string encoded in UTF-8, giving you a binary string.
292
293 5. A string containing "high" (> 255) character values is not a UTF-8
294 string.
295 6. Unicode noncharacters only warn, as in core.
296 The 66 Unicode noncharacters U+FDD0..U+FDEF, and U+*FFFE, U+*FFFF
297 just warn, see <http://www.unicode.org/versions/corrigendum9.html>.
298 But illegal surrogate pairs fail to parse.
299
300 7. Raw non-Unicode characters above U+10FFFF are disallowed.
301 Raw non-Unicode characters outside the valid unicode range fail to
302 parse, because "A string is a sequence of zero or more Unicode
303 characters" RFC 7159 section 1 and "JSON text SHALL be encoded in
304 Unicode" RFC 7159 section 8.1. We now use the UTF8_DISALLOW_SUPER
305 flag when parsing unicode.
306
307 I hope this helps :)
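
The difference between a Unicode (character) string and its UTF-8 encoded (binary) form can be made visible by comparing lengths. The following is a small illustrative sketch, assuming the core Encode module is available; it uses a single EURO SIGN as test data.

```perl
use strict;
use warnings;
use Encode qw(encode);
use Cpanel::JSON::XS;

my $chars  = "\x{20ac}";               # one character: EURO SIGN
my $octets = encode("UTF-8", $chars);  # the same text as three octets

# Without utf8 the encoder returns a character string:
# bracket, quote, euro sign, quote, bracket = 5 characters.
my $as_chars = Cpanel::JSON::XS->new->encode([$chars]);

# With utf8 the encoder returns octets: the euro sign takes 3 of 7 bytes.
my $as_octets = Cpanel::JSON::XS->new->utf8->encode([$chars]);

printf "%d %d %d %d\n",
    length $chars, length $octets, length $as_chars, length $as_octets;
# prints "1 3 5 7"
```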
308
309 OBJECT-ORIENTED INTERFACE
310 The object oriented interface lets you configure your own encoding or
311 decoding style, within the limits of supported formats.
312
313 $json = new Cpanel::JSON::XS
314 Creates a new JSON object that can be used to de/encode JSON
315 strings. All boolean flags described below are by default disabled.
316
317 The mutators for flags all return the JSON object again and thus
318 calls can be chained:
319
320 my $json = Cpanel::JSON::XS->new->utf8->space_after->encode ({a => [1,2]})
321 => {"a": [1, 2]}
322
323 $json = $json->ascii ([$enable])
324 $enabled = $json->get_ascii
325 If $enable is true (or missing), then the "encode" method will not
326 generate characters outside the code range 0..127 (which is ASCII).
327 Any Unicode characters outside that range will be escaped using
328 either a single "\uXXXX" (BMP characters) or a double
329 "\uHHHH\uLLLL" escape sequence, as per RFC4627. The resulting
330 encoded JSON text can be treated as a native Unicode string, an
331 ascii-encoded, latin1-encoded or UTF-8 encoded string, or any other
332 superset of ASCII.
333
334 If $enable is false, then the "encode" method will not escape
335 Unicode characters unless required by the JSON syntax or other
336 flags. This results in a faster and more compact format.
337
338 See also the section ENCODING/CODESET FLAG NOTES later in this
339 document.
340
341 The main use for this flag is to produce JSON texts that can be
342 transmitted over a 7-bit channel, as the encoded JSON texts will
343 not contain any 8 bit characters.
344
345 Cpanel::JSON::XS->new->ascii (1)->encode ([chr 0x10401])
346 => ["\ud801\udc01"]
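
As a rough check of the 7-bit property, the sketch below encodes one Latin-1 character and one character outside the BMP, verifies that the result contains only ASCII, and decodes it back. The test data is made up for illustration.

```perl
use strict;
use warnings;
use Cpanel::JSON::XS;

my $coder = Cpanel::JSON::XS->new->ascii;
my $text  = $coder->encode(["\x{e9}", "\x{10401}"]);

# Every character of the result lies in the range 0..127.
print "pure ASCII\n" unless $text =~ /[^\x00-\x7f]/;

# Decoding turns the escapes back into the original characters.
my $back = $coder->decode($text);
print "round trip ok\n"
    if $back->[0] eq "\x{e9}" && $back->[1] eq "\x{10401}";
```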
347
348 $json = $json->latin1 ([$enable])
349 $enabled = $json->get_latin1
350 If $enable is true (or missing), then the "encode" method will
351 encode the resulting JSON text as latin1 (or ISO-8859-1), escaping
352 any characters outside the code range 0..255. The resulting string
353 can be treated as a latin1-encoded JSON text or a native Unicode
354 string. The "decode" method will not be affected in any way by this
355 flag, as "decode" by default expects Unicode, which is a strict
356 superset of latin1.
357
358 If $enable is false, then the "encode" method will not escape
359 Unicode characters unless required by the JSON syntax or other
360 flags.
361
362 See also the section ENCODING/CODESET FLAG NOTES later in this
363 document.
364
365 The main use for this flag is efficiently encoding binary data as
366 JSON text, as most octets will not be escaped, resulting in a
367 smaller encoded size. The disadvantage is that the resulting JSON
368 text is encoded in latin1 (and must correctly be treated as such
369 when storing and transferring), a rare encoding for JSON. It is
370 therefore most useful when you want to store data structures known
371 to contain binary data efficiently in files or databases, not when
372 talking to other JSON encoders/decoders.
373
374 Cpanel::JSON::XS->new->latin1->encode (["\x{89}\x{abc}"])
375 => ["\x{89}\\u0abc"] # (perl syntax, U+abc escaped, U+89 not)
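
To see which characters are escaped under latin1, one can count the characters in the output. This is an illustrative sketch with made-up data: U+00E9 fits into latin1 and stays a single character, while U+20AC does not and becomes a six-character escape.

```perl
use strict;
use warnings;
use Cpanel::JSON::XS;

my $text = Cpanel::JSON::XS->new->latin1->encode(["\x{e9}\x{20ac}"]);

# 2 brackets + 2 quotes + 1 character for U+E9 + 6 for the \uXXXX escape.
printf "%d characters\n", length $text;   # prints "11 characters"

# decode needs no special flag to read the text back.
my $back = Cpanel::JSON::XS->new->decode($text);
print "round trip ok\n" if $back->[0] eq "\x{e9}\x{20ac}";
```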
376
377 $json = $json->binary ([$enable])
378 $enabled = $json->get_binary
379 If the $enable argument is true (or missing), then the "encode"
380 method will not try to detect a UTF-8 encoding in any JSON string,
381 it will strictly interpret it as a byte sequence. The result might
382 contain new "\xNN" sequences, which is unparsable JSON. The
383 "decode" method forbids "\uNNNN" sequences and accepts "\xNN" and
384 octal "\NNN" sequences.
385
386 There is also special logic for perl 5.6 and utf8. 5.6 encodes
387 any string to utf-8 automatically when seeing a codepoint >= 0x80
388 and < 0x100. With the binary flag enabled, the encoder decodes the
389 perl utf8 encoded string to the original byte encoding and encodes
390 this with "\xNN" escapes. This results in the same encodings as with
391 newer perls. But note that binary multi-byte codepoints with 5.6
392 will result in "illegal unicode character in binary string" errors,
393 unlike with newer perls.
394
395 If $enable is false, then the "encode" method will smartly try to
396 detect Unicode characters, subject to the JSON syntax and other
397 flags, and hex and octal sequences are forbidden.
398
399 See also the section ENCODING/CODESET FLAG NOTES later in this
400 document.
401
402 The main use for this flag is to avoid the smart unicode detection
403 and possible double encoding. The disadvantage is that the
404 resulting JSON text is encoded in new "\xNN" and in latin1
405 characters and must correctly be treated as such when storing and
406 transferring, a rare encoding for JSON. It will produce non-
407 readable JSON strings in the browser. It is therefore most useful
408 when you want to store data structures known to contain binary data
409 efficiently in files or databases, not when talking to other JSON
410 encoders/decoders. The binary decoding method can also be used
411 when an encoder produced a non-JSON conformant hex or octal
412 encoding "\xNN" or "\NNN".
413
414 Cpanel::JSON::XS->new->binary->encode (["\x{89}\x{abc}"])
415 5.6: Error: malformed or illegal unicode character in binary string
416 >=5.8: ['\x89\xe0\xaa\xbc']
417
418 Cpanel::JSON::XS->new->binary->encode (["\x{89}\x{bc}"])
419 => ["\x89\xbc"]
420
421 Cpanel::JSON::XS->new->binary->decode ('["\x89\ua001"]')
422 Error: malformed or illegal unicode character in binary string
423
424 Cpanel::JSON::XS->new->decode ('["\x89"]')
425 Error: illegal hex character in non-binary string
426
427 $json = $json->utf8 ([$enable])
428 $enabled = $json->get_utf8
429 If $enable is true (or missing), then the "encode" method will
430 encode the JSON result into UTF-8, as required by many protocols,
431 while the "decode" method expects to be handed a UTF-8-encoded
432 string. Please note that UTF-8-encoded strings do not contain any
433 characters outside the range 0..255, they are thus useful for
434 bytewise/binary I/O. In future versions, enabling this option might
435 enable autodetection of the UTF-16 and UTF-32 encoding families, as
436 described in RFC4627.
437
438 If $enable is false, then the "encode" method will return the JSON
439 string as a (non-encoded) Unicode string, while "decode" thus
440 expects a Unicode string. Any decoding or encoding (e.g. to UTF-8 or
441 UTF-16) needs to be done yourself, e.g. using the Encode module.
442
443 See also the section ENCODING/CODESET FLAG NOTES later in this
444 document.
445
446 Example, output UTF-16BE-encoded JSON:
447
448 use Encode;
449 $jsontext = encode "UTF-16BE", Cpanel::JSON::XS->new->encode ($object);
450
451 Example, decode UTF-32LE-encoded JSON:
452
453 use Encode;
454 $object = Cpanel::JSON::XS->new->decode (decode "UTF-32LE", $jsontext);
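
The two Encode examples can be combined into a round trip. The sketch below is illustrative only: with the utf8 flag off, the coder works purely on character strings and the Encode module does the conversion to and from UTF-16BE.

```perl
use strict;
use warnings;
use Encode qw(encode decode);
use Cpanel::JSON::XS;

my $coder = Cpanel::JSON::XS->new;   # utf8 flag disabled

# Character string out of the coder, then into UTF-16BE octets.
my $wire = encode("UTF-16BE", $coder->encode(["\x{20ac}"]));
printf "%d octets\n", length $wire;  # 5 characters, 2 octets each: "10 octets"

# Octets back into a character string, then into a Perl structure.
my $back = $coder->decode(decode("UTF-16BE", $wire));
print "round trip ok\n" if $back->[0] eq "\x{20ac}";
```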
455
456 $json = $json->pretty ([$enable])
457 This enables (or disables) all of the "indent", "space_before" and
458 "space_after" (and in the future possibly more) flags in one call
459 to generate the most readable (or most compact) form possible.
460
461 Example, pretty-print some simple structure:
462
463 my $json = Cpanel::JSON::XS->new->pretty(1)->encode ({a => [1,2]})
464 =>
465 {
466    "a" : [
467       1,
468       2
469    ]
470 }
471
472 $json = $json->indent ([$enable])
473 $enabled = $json->get_indent
474 If $enable is true (or missing), then the "encode" method will use
475 a multiline format as output, putting every array member or
476 object/hash key-value pair into its own line, indenting them
477 properly.
478
479 If $enable is false, no newlines or indenting will be produced, and
480 the resulting JSON text is guaranteed not to contain any
481 "newlines".
482
483 This setting has no effect when decoding JSON texts.
484
485 $json = $json->indent_length([$number_of_spaces])
486 $length = $json->get_indent_length()
487 Set the indent length (default 3). This option is only useful when
488 you also enable indent or pretty. The acceptable range is from 0
489 (no indentation) to 15
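
A short sketch of how indent_length combines with pretty; the value 2 is an arbitrary choice for illustration. Each nesting level should then be indented by two spaces instead of the default three.

```perl
use strict;
use warnings;
use Cpanel::JSON::XS;

my $coder = Cpanel::JSON::XS->new->pretty->indent_length(2);
my $out   = $coder->encode({ a => [1, 2] });
print $out;

# get_indent_length reports the configured width.
print "indent length: ", $coder->get_indent_length, "\n";
```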
490
491 $json = $json->space_before ([$enable])
492 $enabled = $json->get_space_before
493 If $enable is true (or missing), then the "encode" method will add
494 an extra optional space before the ":" separating keys from values
495 in JSON objects.
496
497 If $enable is false, then the "encode" method will not add any
498 extra space at those places.
499
500 This setting has no effect when decoding JSON texts. You will also
501 most likely combine this setting with "space_after".
502
503 Example, space_before enabled, space_after and indent disabled:
504
505 {"key" :"value"}
506
507 $json = $json->space_after ([$enable])
508 $enabled = $json->get_space_after
509 If $enable is true (or missing), then the "encode" method will add
510 an extra optional space after the ":" separating keys from values
511 in JSON objects and extra whitespace after the "," separating key-
512 value pairs and array members.
513
514 If $enable is false, then the "encode" method will not add any
515 extra space at those places.
516
517 This setting has no effect when decoding JSON texts.
518
519 Example, space_before and indent disabled, space_after enabled:
520
521 {"key": "value"}
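
The two spacing flags can be compared side by side. This sketch only restates the examples above in runnable form; the data is illustrative.

```perl
use strict;
use warnings;
use Cpanel::JSON::XS;

my $data = { key => "value" };

my $before = Cpanel::JSON::XS->new->space_before->encode($data);
my $after  = Cpanel::JSON::XS->new->space_after->encode($data);
my $both   = Cpanel::JSON::XS->new->space_before->space_after->encode($data);

print "$before\n";   # {"key" :"value"}
print "$after\n";    # {"key": "value"}
print "$both\n";     # {"key" : "value"}

# space_after also adds a space after the commas in arrays.
my $list = Cpanel::JSON::XS->new->space_after->encode([1, 2]);
print "$list\n";     # [1, 2]
```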
522
523 $json = $json->relaxed ([$enable])
524 $enabled = $json->get_relaxed
525 If $enable is true (or missing), then "decode" will accept some
526 extensions to normal JSON syntax (see below). "encode" will not be
527 affected in any way. Be aware that this option makes you accept
528 invalid JSON texts as if they were valid! I suggest using this
529 option only to parse application-specific files written by humans
530 (configuration files, resource files etc.).
531
532 If $enable is false (the default), then "decode" will only accept
533 valid JSON texts.
534
535 Currently accepted extensions are:
536
537 • list items can have an end-comma
538
539 JSON separates array elements and key-value pairs with commas.
540 This can be annoying if you write JSON texts manually and want
541 to be able to quickly append elements, so this extension
542 accepts a comma at the end of such items, not just between them:
543
544 [
545 1,
546 2, <- this comma not normally allowed
547 ]
548 {
549 "k1": "v1",
550 "k2": "v2", <- this comma not normally allowed
551 }
552
553 • shell-style '#'-comments
554
555 Whenever JSON allows whitespace, shell-style comments are
556 additionally allowed. They are terminated by the first
557 carriage-return or line-feed character, after which more white-
558 space and comments are allowed.
559
560 [
561 1, # this comment not allowed in JSON
562 # neither this one...
563 ]
564
565 • literal ASCII TAB characters in strings
566
567 Literal ASCII TAB characters are now allowed in strings (and
568 treated as "\t") in relaxed mode, even though JSON mandates that
569 a TAB character be written as the "\t" escape sequence.
570
571 [
572 "Hello\tWorld",
573 "Hello<TAB>World", # literal <TAB> would not normally be allowed
574 ]
575
576 • allow_singlequote
577
578 Single quotes are accepted instead of double quotes. See the
579 "allow_singlequote" option.
580
581 { "foo":'bar' }
582 { 'foo':"bar" }
583 { 'foo':'bar' }
584
585 • allow_barekey
586
587 Accept unquoted object keys instead of with mandatory double
588 quotes. See the "allow_barekey" option.
589
590 { foo:"bar" }
591
592 • allow_dupkeys
593
594 Allow decoding of duplicate keys in hashes. By default
595 duplicate keys are forbidden. See
596 <http://seriot.ch/projects/parsing_json.php#24>: RFC 7159
597 section 4: "The names within an object should be unique." See
598 the "allow_dupkeys" option.
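
As an illustration of relaxed mode, the sketch below parses a hand-written configuration snippet that uses a '#'-comment and trailing commas. The configuration keys are made up; the same text is also fed to a strict decoder, which is expected to reject it.

```perl
use strict;
use warnings;
use Cpanel::JSON::XS;

my $text = <<'EOT';
{
  # hand-edited settings
  "name": "demo",
  "ports": [80, 443,],
}
EOT

# relaxed accepts the comment and both trailing commas.
my $cfg = Cpanel::JSON::XS->new->relaxed->decode($text);
print "ports: @{ $cfg->{ports} }\n";   # ports: 80 443

# The default decoder only accepts valid JSON and croaks here.
my $strict = eval { Cpanel::JSON::XS->new->decode($text) };
print "strict decode failed\n" unless defined $strict;
```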
599
600 $json = $json->canonical ([$enable])
601 $enabled = $json->get_canonical
602 If $enable is true (or missing), then the "encode" method will
603 output JSON objects by sorting their keys. This adds a
604 comparatively high overhead.
605
606 If $enable is false, then the "encode" method will output key-value
607 pairs in the order Perl stores them (which will likely change
608 between runs of the same script, and can change even within the
609 same run from 5.18 onwards).
610
611 This option is useful if you want the same data structure to be
612 encoded as the same JSON text (given the same overall settings). If
613 it is disabled, the same hash might be encoded differently even if
614 it contains the same data, as key-value pairs have no inherent
615 ordering in Perl.
616
617 This setting has no effect when decoding JSON texts.
618
619 This is now also done with tied hashes, contrary to JSON::XS. But
620 note that for most large tied hashes stored as a tree it is advised
621 to sort in the iterator already and not to sort the hash output here.
622 Most such iterators are already sorted, e.g. DB_File with
623 "DB_BTREE".
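
A minimal sketch of why canonical is useful for comparing or hashing output; the hash contents are illustrative. With the flag set, the keys come out sorted regardless of Perl's internal hash order.

```perl
use strict;
use warnings;
use Cpanel::JSON::XS;

my $coder = Cpanel::JSON::XS->new->canonical;

my $first  = $coder->encode({ b => 1, a => 2, c => 3 });
my $second = $coder->encode({ c => 3, a => 2, b => 1 });

print "$first\n";   # {"a":2,"b":1,"c":3}
print "identical\n" if $first eq $second;
```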
624
625 $json = $json->sort_by (undef, 0, 1 or a block)
626 This currently only (un)sets the "canonical" option, and ignores
627 custom sort blocks.
628
629 This setting has no effect when decoding JSON texts.
630
631 This setting has currently no effect on tied hashes.
632
633 $json = $json->escape_slash ([$enable])
634 $enabled = $json->get_escape_slash
635 According to the JSON grammar, the forward slash character (U+002F)
636 "/" may be escaped, but does not need to be. By default strings are
637 encoded without escaping slashes in all perl JSON encoders.
638
639 If $enable is true (or missing), then "encode" will escape slashes,
640 "\/".
641
642 This setting has no effect when decoding JSON texts.
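
The effect of escape_slash is easiest to see next to the default output. A small illustrative sketch:

```perl
use strict;
use warnings;
use Cpanel::JSON::XS;

my $plain   = Cpanel::JSON::XS->new->encode(["a/b"]);
my $escaped = Cpanel::JSON::XS->new->escape_slash->encode(["a/b"]);

print "$plain\n";     # ["a/b"]
print "$escaped\n";   # ["a\/b"]

# Both forms decode to the same Perl string.
my $back = Cpanel::JSON::XS->new->decode($escaped);
print "same value\n" if $back->[0] eq "a/b";
```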
643
644 $json = $json->unblessed_bool ([$enable])
645 $enabled = $json->get_unblessed_bool
648 If $enable is true (or missing), then "decode" will return Perl
649 non-object boolean variables (1 and 0) for JSON booleans ("true"
650 and "false"). If $enable is false, then "decode" will return
651 "JSON::PP::Boolean" objects for JSON booleans.
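
The difference between the two decoding modes shows up in what ref() reports. This is an illustrative sketch; it only checks truthiness and whether the value is a reference, not the exact scalar returned.

```perl
use strict;
use warnings;
use Cpanel::JSON::XS;

my $plain = Cpanel::JSON::XS->new->unblessed_bool->decode('[true,false]');
my $objs  = Cpanel::JSON::XS->new->decode('[true,false]');

# With unblessed_bool the values are ordinary, non-reference scalars.
print "plain scalars\n" if !ref $plain->[0] && !ref $plain->[1];
print "true is true, false is false\n" if $plain->[0] && !$plain->[1];

# By default the values are blessed boolean objects.
print "default class: ", ref $objs->[0], "\n";
```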
652
653 $json = $json->allow_singlequote ([$enable])
654 $enabled = $json->get_allow_singlequote
657 If $enable is true (or missing), then "decode" will accept JSON
658 strings quoted with single quotes, which is not valid JSON.
659
660 $json->allow_singlequote->decode(q({"foo":'bar'}));
661 $json->allow_singlequote->decode(q({'foo':"bar"}));
662 $json->allow_singlequote->decode(q({'foo':'bar'}));
663
664 This is also enabled with "relaxed". As with the "relaxed"
665 option, this option may be used to parse application-specific files
666 written by humans.
667
668 $json = $json->allow_barekey ([$enable])
669 $enabled = $json->get_allow_barekey
672 If $enable is true (or missing), then "decode" will accept bare
673 (unquoted) JSON object keys, which are not valid JSON.
674
675 Same as with the "relaxed" option, this option may be used to parse
676 application-specific files written by humans.
677
678 $json->allow_barekey->decode('{foo:"bar"}');
679
680 $json = $json->allow_bignum ([$enable])
681 $enabled = $json->get_allow_bignum
684 If $enable is true (or missing), then "decode" will convert big
685 integers that Perl cannot handle as integers into Math::BigInt
686 objects and convert any floating-point number into a Math::BigFloat.
687
688 Conversely, "encode" converts "Math::BigInt" objects and
689 "Math::BigFloat" objects into JSON numbers when "allow_blessed" is
690 enabled.
691
692 $json->allow_nonref->allow_blessed->allow_bignum;
693 $bigfloat = $json->decode('2.000000000000000000000000001');
694 print $json->encode($bigfloat);
695 # => 2.000000000000000000000000001
696
697 See "MAPPING" about the normal conversion of JSON number.
698
699 $json = $json->allow_bigint ([$enable])
700 This option is obsolete and replaced by allow_bignum.
701
702 $json = $json->allow_nonref ([$enable])
703 $enabled = $json->get_allow_nonref
704 If $enable is true (or missing), then the "encode" method can
705 convert a non-reference into its corresponding string, number or
706 null JSON value, which is an extension to RFC4627. Likewise,
707 "decode" will accept those JSON values instead of croaking.
708
709 If $enable is false, then the "encode" method will croak if it
710 isn't passed an arrayref or hashref, as JSON texts must either be
711 an object or array. Likewise, "decode" will croak if given
712 something that is not a JSON object or array.
713
714 Example, encode a Perl scalar as JSON value with enabled
715 "allow_nonref", resulting in an invalid JSON text:
716
717 Cpanel::JSON::XS->new->allow_nonref->encode ("Hello, World!")
718 => "Hello, World!"
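
A brief sketch of both directions with allow_nonref enabled, plus the default behaviour for comparison. The default call is wrapped in eval because it is expected to croak; the sketch reports whichever outcome occurs.

```perl
use strict;
use warnings;
use Cpanel::JSON::XS;

my $coder = Cpanel::JSON::XS->new->allow_nonref;

# A plain scalar becomes a JSON string, including the quotes.
my $text = $coder->encode("Hello, World!");
print "$text\n";   # "Hello, World!"

# A top-level JSON number is accepted as well.
my $num = $coder->decode('42');
print "decoded: $num\n";   # decoded: 42

# Without the flag, encode expects an arrayref or hashref.
my $strict = eval { Cpanel::JSON::XS->new->encode("Hello, World!") };
print defined $strict ? "encoded anyway\n" : "croaked: $@";
```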
719
720 $json = $json->allow_unknown ([$enable])
721 $enabled = $json->get_allow_unknown
722 If $enable is true (or missing), then "encode" will not throw an
723 exception when it encounters values it cannot represent in JSON
724 (for example, filehandles) but instead will encode a JSON "null"
725 value. Note that blessed objects are not included here and are
726 handled separately by "allow_blessed" and "convert_blessed".
727
728 If $enable is false (the default), then "encode" will throw an
729 exception when it encounters anything it cannot encode as JSON.
730
731 This option does not affect "decode" in any way, and it is
732 recommended to leave it off unless you know your communications
733 partner.
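
As an illustration, a code reference is one kind of value that has no JSON representation. The sketch below assumes a code reference falls under the values covered by this option; it encodes one with and without allow_unknown.

```perl
use strict;
use warnings;
use Cpanel::JSON::XS;

my $data = [ sub { 1 } ];   # a code reference cannot be represented in JSON

# With allow_unknown the value is replaced by null.
my $lenient = Cpanel::JSON::XS->new->allow_unknown->encode($data);
print "$lenient\n";   # [null]

# By default encode throws an exception instead.
my $strict = eval { Cpanel::JSON::XS->new->encode($data) };
print "default encode croaked\n" unless defined $strict;
```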
734
735 $json = $json->allow_stringify ([$enable])
736 $enabled = $json->get_allow_stringify
737 If $enable is true (or missing), then "encode" will stringify the
738 non-object perl value or reference. Note that blessed objects are
739 not included here and are handled separately by "allow_blessed" and
740 "convert_blessed". String references are stringified to the string
741 value, other references as in perl.
742
743 This option does not affect "decode" in any way.
744
745 This option is special to this module, it is not supported by other
746 encoders. So it is not recommended to use it.
747
748 $json = $json->require_types ([$enable])
749 $enabled = $json->get_require_types
752 If $enable is true (or missing), then "encode" will require either
753 "type_all_string" to be enabled or a second argument supplying the
754 JSON types. See Cpanel::JSON::XS::Type. When "type_all_string" is
755 not enabled and the second argument is not provided (or is undef),
756 "encode" croaks. It also croaks when the type for the provided
757 structure in "encode" is incomplete.
758
759 $json = $json->type_all_string ([$enable])
760 $enabled = $json->get_type_all_string
763 If $enable is true (or missing), then "encode" will always produce
764 stable, deterministic JSON string types in the resulting output.
765
766 When $enable is false, the encoded JSON output may differ between
767 Perl versions and may depend on the loaded
768 modules.
769
770 This is useful if you need deterministic JSON types, independent
771 of the Perl version used and of other modules, but do not want to
772 write complicated type definitions for Cpanel::JSON::XS::Type.
773
774 $json = $json->allow_dupkeys ([$enable])
775 $enabled = $json->get_allow_dupkeys
776 If $enable is true (or missing), then the "decode" method will not
777 die when it encounters duplicate keys in a hash. "allow_dupkeys"
778 is also enabled in the "relaxed" mode.
779
780 The JSON spec allows duplicate names in objects but recommends
781 against them. With Perl hashes they are impossible, so parsing
782 JSON in Perl silently ignores duplicate names, using the last value
783 found.
784
785 See <http://seriot.ch/projects/parsing_json.php#24>: RFC 7159
786 section 4: "The names within an object should be unique."
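
A minimal sketch of the last-value-wins behaviour with allow_dupkeys; the input is made up for illustration. The default decoder is called inside eval because it is expected to reject the duplicate key.

```perl
use strict;
use warnings;
use Cpanel::JSON::XS;

my $text = '{"a":1,"a":2}';

# With allow_dupkeys the later value overwrites the earlier one.
my $hash = Cpanel::JSON::XS->new->allow_dupkeys->decode($text);
print "a = $hash->{a}\n";   # a = 2

# Without it the duplicate key is an error.
my $strict = eval { Cpanel::JSON::XS->new->decode($text) };
print defined $strict ? "accepted\n" : "croaked: $@";
```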
787
788 $json = $json->dupkeys_as_arrayref ([$enable])
789 $enabled = $json->get_dupkeys_as_arrayref
790 If enabled, allow decoding of duplicate keys in hashes and store
791 the values as arrayref in the hash instead. By default duplicate
792 keys are forbidden. Enabling this also enables the "allow_dupkeys"
793 option, but disabling this does not disable the "allow_dupkeys"
794 option.
795
796 Example:
797
798 $json->dupkeys_as_arrayref;
799 print encode_json ($json->decode ('{"a":"b","a":"c"}'));
800
801 => {"a":["b","c"]}
802
803 This changes the result structure, thus cannot be enabled by
804 default. The client must be aware of it. The resulting arrayref is
805 not yet marked somehow (blessed or such).
806
807 $json = $json->allow_blessed ([$enable])
808 $enabled = $json->get_allow_blessed
809 If $enable is true (or missing), then the "encode" method will not
810 barf when it encounters a blessed reference. Instead, the value of
811 the convert_blessed option will decide whether "null"
812 ("convert_blessed" disabled or no "TO_JSON" method found) or a
813 representation of the object ("convert_blessed" enabled and
814 "TO_JSON" method found) is being encoded. Has no effect on
815 "decode".
816
817 If $enable is false (the default), then "encode" will throw an
818 exception when it encounters a blessed object without
819 "convert_blessed" and a "TO_JSON" method.
820
821 This setting has no effect on "decode".
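
A small sketch of both behaviors, using a hypothetical class "My::Thing" that has no "TO_JSON" method:

```perl
use strict;
use warnings;
use Cpanel::JSON::XS;

# My::Thing is a made-up class for illustration; it defines no methods.
my $obj = bless { id => 7 }, 'My::Thing';

# Default: encode throws on the blessed reference.
my $strict_ok = eval { Cpanel::JSON::XS->new->encode ([$obj]); 1 };

# allow_blessed: the object is encoded as null instead.
my $out = Cpanel::JSON::XS->new->allow_blessed->encode ([$obj]);

print "default encode ", ($strict_ok ? "succeeded" : "failed"), "\n";
print "$out\n";
```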
822
823 $json = $json->convert_blessed ([$enable])
824 $enabled = $json->get_convert_blessed
If $enable is true (or missing), then "encode", upon encountering a
blessed object, will check for the availability of the "TO_JSON"
method on the object's class. If found, it will be called in scalar
context and the resulting scalar will be encoded instead of the
object. If no "TO_JSON" method is found, a stringification overload
method is tried next. If neither is found, the value of
"allow_blessed" will decide what to do.
832
833 The "TO_JSON" method may safely call die if it wants. If "TO_JSON"
834 returns other blessed objects, those will be handled in the same
835 way. "TO_JSON" must take care of not causing an endless recursion
836 cycle (== crash) in this case. The same care must be taken with
837 calling encode in stringify overloads (even if this works by luck
838 in older perls) or other callbacks. The name of "TO_JSON" was
839 chosen because other methods called by the Perl core (== not by the
840 user of the object) are usually in upper case letters and to avoid
841 collisions with any "to_json" function or method.
842
843 If $enable is false (the default), then "encode" will not consider
844 this type of conversion.
845
846 This setting has no effect on "decode".
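
To illustrate, here is a sketch with a hypothetical "My::Point" class whose "TO_JSON" method returns a plain hash reference. The "canonical" flag is only used to make the key order predictable:

```perl
use strict;
use warnings;
use Cpanel::JSON::XS;

# My::Point is a made-up class for illustration.
package My::Point;

sub new {
    my ($class, %args) = @_;
    return bless { %args }, $class;
}

# Called by encode in scalar context; returns an unblessed structure.
sub TO_JSON {
    my ($self) = @_;
    return { x => $self->{x}, y => $self->{y} };
}

package main;

my $json = Cpanel::JSON::XS->new->canonical->convert_blessed;
my $out  = $json->encode ([ My::Point->new (x => 1, y => 2) ]);
print "$out\n";
```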
847
848 $json = $json->allow_tags ([$enable])
849 $enabled = $json->get_allow_tags
850 See "OBJECT SERIALIZATION" for details.
851
852 If $enable is true (or missing), then "encode", upon encountering a
853 blessed object, will check for the availability of the "FREEZE"
854 method on the object's class. If found, it will be used to
855 serialize the object into a nonstandard tagged JSON value (that
856 JSON decoders cannot decode).
857
858 It also causes "decode" to parse such tagged JSON values and
859 deserialize them via a call to the "THAW" method.
860
861 If $enable is false (the default), then "encode" will not consider
862 this type of conversion, and tagged JSON values will cause a parse
863 error in "decode", as if tags were not part of the grammar.
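
A round trip through tagged values might look like the sketch below. "My::Temp" is a hypothetical class, and the "THAW" argument order (class, serializer name, then the frozen values) is an assumption based on the Types::Serialiser protocol referenced under "OBJECT SERIALIZATION":

```perl
use strict;
use warnings;
use Cpanel::JSON::XS;

# My::Temp is a made-up class for illustration.
package My::Temp;

sub new {
    my ($class, $deg) = @_;
    return bless { deg => $deg }, $class;
}

# Returns the values needed to rebuild the object later.
sub FREEZE {
    my ($self, $serializer) = @_;
    return ($self->{deg});
}

# Assumed to be called as a class method with the serializer name
# followed by the values FREEZE returned.
sub THAW {
    my ($class, $serializer, $deg) = @_;
    return $class->new ($deg);
}

package main;

my $json = Cpanel::JSON::XS->new->allow_tags;
my $text = $json->encode ([ My::Temp->new (21) ]);
my $back = $json->decode ($text);

print "$text\n";
print ref ($back->[0]), " with deg ", $back->[0]{deg}, "\n";
```

The encoded text is not valid JSON, so only decoders that understand this tagged extension can read it back.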
864
865 $json = $json->filter_json_object ([$coderef->($hashref)])
When $coderef is specified, it will be called from "decode" each
time it decodes a JSON object. The only argument is a reference to
the newly-created hash. If the code reference returns a single
scalar (which need not be a reference), this value (i.e. a copy of
that scalar to avoid aliasing) is inserted into the deserialized
data structure. If it returns an empty list (NOTE: not "undef",
which is a valid scalar), the original deserialized hash will be
inserted. This setting can slow down decoding considerably.
874
875 When $coderef is omitted or undefined, any existing callback will
876 be removed and "decode" will not change the deserialized hash in
877 any way.
878
879 Example, convert all JSON objects into the integer 5:
880
    my $js = Cpanel::JSON::XS->new->filter_json_object (sub { 5 });
    # returns [5]
    $js->decode ('[{}]');
    # throws an exception because allow_nonref is not enabled,
    # so a lone 5 is not allowed.
    $js->decode ('{"a":1, "b":2}');
887
888 $json = $json->filter_json_single_key_object ($key [=>
889 $coderef->($value)])
890 Works remotely similar to "filter_json_object", but is only called
891 for JSON objects having a single key named $key.
892
893 This $coderef is called before the one specified via
894 "filter_json_object", if any. It gets passed the single value in
895 the JSON object. If it returns a single value, it will be inserted
896 into the data structure. If it returns nothing (not even "undef"
897 but the empty list), the callback from "filter_json_object" will be
898 called next, as if no single-key callback were specified.
899
900 If $coderef is omitted or undefined, the corresponding callback
901 will be disabled. There can only ever be one callback for a given
902 key.
903
As this callback gets called less often than the
"filter_json_object" one, decoding speed will not usually suffer as
much. Therefore, single-key objects make excellent targets to
serialize Perl objects into, especially as single-key JSON objects
are as close to the type-tagged value concept as JSON gets (it's
basically an ID/VALUE tuple). Of course, JSON does not support this
in any way, so you need to make sure your data never looks like a
serialized Perl hash.
912
913 Typical names for the single object key are "__class_whatever__",
914 or "$__dollars_are_rarely_used__$" or "}ugly_brace_placement", or
915 even things like "__class_md5sum(classname)__", to reduce the risk
916 of clashing with real hashes.
917
918 Example, decode JSON objects of the form "{ "__widget__" => <id> }"
919 into the corresponding $WIDGET{<id>} object:
920
    # return whatever is in $WIDGET{5}:
    Cpanel::JSON::XS
        ->new
        ->filter_json_single_key_object (__widget__ => sub {
            $WIDGET{ $_[0] }
        })
        ->decode ('{"__widget__": 5}')

    # this can be used with a TO_JSON method in some "widget" class
    # for serialization to json:
    sub WidgetBase::TO_JSON {
        my ($self) = @_;

        unless ($self->{id}) {
            $self->{id} = ..get..some..id..;
            $WIDGET{$self->{id}} = $self;
        }

        { __widget__ => $self->{id} }
    }
941
942 $json = $json->shrink ([$enable])
943 $enabled = $json->get_shrink
944 Perl usually over-allocates memory a bit when allocating space for
945 strings. This flag optionally resizes strings generated by either
946 "encode" or "decode" to their minimum size possible. This can save
947 memory when your JSON texts are either very very long or you have
948 many short strings. It will also try to downgrade any strings to
949 octet-form if possible: perl stores strings internally either in an
950 encoding called UTF-X or in octet-form. The latter cannot store
951 everything but uses less space in general (and some buggy Perl or C
952 code might even rely on that internal representation being used).
953
954 The actual definition of what shrink does might change in future
955 versions, but it will always try to save space at the expense of
956 time.
957
958 If $enable is true (or missing), the string returned by "encode"
959 will be shrunk-to-fit, while all strings generated by "decode" will
960 also be shrunk-to-fit.
961
962 If $enable is false, then the normal perl allocation algorithms are
963 used. If you work with your data, then this is likely to be
964 faster.
965
966 In the future, this setting might control other things, such as
967 converting strings that look like integers or floats into integers
968 or floats internally (there is no difference on the Perl level),
969 saving space.
970
971 $json = $json->max_depth ([$maximum_nesting_depth])
972 $max_depth = $json->get_max_depth
973 Sets the maximum nesting level (default 512) accepted while
974 encoding or decoding. If a higher nesting level is detected in JSON
975 text or a Perl data structure, then the encoder and decoder will
976 stop and croak at that point.
977
978 Nesting level is defined by number of hash- or arrayrefs that the
979 encoder needs to traverse to reach a given point or the number of
980 "{" or "[" characters without their matching closing parenthesis
981 crossed to reach a given character in a string.
982
983 Setting the maximum depth to one disallows any nesting, so that
984 ensures that the object is only a single hash/object or array.
985
986 If no argument is given, the highest possible setting will be used,
987 which is rarely useful.
988
989 Note that nesting is implemented by recursion in C. The default
990 value has been chosen to be as large as typical operating systems
991 allow without crashing.
992
993 See "SECURITY CONSIDERATIONS", below, for more info on why this is
994 useful.
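
A short sketch of the limit in action, with an arbitrarily chosen depth of 2:

```perl
use strict;
use warnings;
use Cpanel::JSON::XS;

my $json = Cpanel::JSON::XS->new->max_depth (2);

# Two levels of nesting are within the limit.
my $shallow = $json->decode ('[[1]]');

# Three levels exceed it, so decode croaks and eval returns false.
my $deep_ok = eval { $json->decode ('[[[1]]]'); 1 };

print "shallow value: $shallow->[0][0]\n";
print "deep decode ", ($deep_ok ? "succeeded" : "failed"), "\n";
```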
995
996 $json = $json->max_size ([$maximum_string_size])
997 $max_size = $json->get_max_size
Sets the maximum length (in bytes) a JSON text may have when
decoding is attempted. The default is 0, meaning no limit.
When "decode" is called on a string that is longer than this many
bytes, it will not attempt to decode the string but throw an
exception. This setting has no effect on "encode" (yet).
1003
1004 If no argument is given, the limit check will be deactivated (same
1005 as when 0 is specified).
1006
1007 See "SECURITY CONSIDERATIONS", below, for more info on why this is
1008 useful.
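
For illustration, a sketch with an arbitrarily small limit of 8 bytes:

```perl
use strict;
use warnings;
use Cpanel::JSON::XS;

my $json = Cpanel::JSON::XS->new->max_size (8);

# 7 bytes: within the limit.
my $small = $json->decode ('[1,2,3]');

# 11 bytes: over the limit, so decode throws before parsing.
my $big_ok = eval { $json->decode ('[1,2,3,4,5]'); 1 };

print "small decode returned ", scalar (@$small), " elements\n";
print "big decode ", ($big_ok ? "succeeded" : "failed"), "\n";
```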
1009
1010 $json->stringify_infnan ([$infnan_mode = 1])
1011 $infnan_mode = $json->get_stringify_infnan
1012 Get or set how Cpanel::JSON::XS encodes "inf", "-inf" or "nan" for
1013 numeric values. Also qnan, snan or negative nan on some platforms.
1014
1015 "null": infnan_mode = 0. Similar to most JSON modules in other
1016 languages. Always null.
1017
1018 stringified: infnan_mode = 1. As in Mojo::JSON. Platform specific
1019 strings. Stringified via sprintf(%g), with double quotes.
1020
1021 inf/nan: infnan_mode = 2. As in JSON::XS, and older releases.
1022 Passes through platform dependent values, invalid JSON. Stringified
1023 via sprintf(%g), but without double quotes.
1024
1025 "inf/-inf/nan": infnan_mode = 3. Platform independent inf/nan/-inf
1026 strings. No QNAN/SNAN/negative NAN support, unified to "nan". Much
1027 easier to detect, but may conflict with valid strings.
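
The two platform-independent modes can be compared with the sketch below. Modes 1 and 2 are left out because their output is platform specific. The setter is called on its own line rather than chained, since the signature above does not show a return value:

```perl
use strict;
use warnings;
use Cpanel::JSON::XS;

my $inf = 9**9**9;    # overflows to infinity

# Mode 0: infinities are encoded as null.
my $as_null = Cpanel::JSON::XS->new;
$as_null->stringify_infnan (0);
my $null_out = $as_null->encode ([$inf]);

# Mode 3: platform-independent quoted strings.
my $as_string = Cpanel::JSON::XS->new;
$as_string->stringify_infnan (3);
my $string_out = $as_string->encode ([$inf, -$inf]);

print "$null_out\n$string_out\n";
```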
1028
1029 $json_text = $json->encode ($perl_scalar, $json_type)
1030 Converts the given Perl data structure (a simple scalar or a
1031 reference to a hash or array) to its JSON representation. Simple
1032 scalars will be converted into JSON string or number sequences,
1033 while references to arrays become JSON arrays and references to
1034 hashes become JSON objects. Undefined Perl values (e.g. "undef")
1035 become JSON "null" values. Neither "true" nor "false" values will
1036 be generated.
1037
1038 For the type argument see Cpanel::JSON::XS::Type.
1039
1040 $perl_scalar = $json->decode ($json_text, my $json_type)
1041 The opposite of "encode": expects a JSON text and tries to parse
1042 it, returning the resulting simple scalar or reference. Croaks on
1043 error.
1044
1045 JSON numbers and strings become simple Perl scalars. JSON arrays
1046 become Perl arrayrefs and JSON objects become Perl hashrefs. "true"
1047 becomes 1, "false" becomes 0 and "null" becomes "undef".
1048
1049 For the type argument see Cpanel::JSON::XS::Type.
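
A minimal round trip through "decode" and "encode", as a sketch. The "canonical" flag is only there to make the key order of the output predictable:

```perl
use strict;
use warnings;
use Cpanel::JSON::XS;

my $json = Cpanel::JSON::XS->new->canonical;

# JSON object -> hashref, JSON array -> arrayref, numbers -> scalars.
my $data = $json->decode ('{"name":"demo","tags":["a","b"],"count":3}');

$data->{count}++;               # still a number after the increment
push @{ $data->{tags} }, "c";

my $text = $json->encode ($data);
print "$text\n";
```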
1050
1051 ($perl_scalar, $characters) = $json->decode_prefix ($json_text)
1052 This works like the "decode" method, but instead of raising an
1053 exception when there is trailing garbage after the first JSON
1054 object, it will silently stop parsing there and return the number
1055 of characters consumed so far.
1056
1057 This is useful if your JSON texts are not delimited by an outer
1058 protocol and you need to know where the JSON text ends.
1059
1060 Cpanel::JSON::XS->new->decode_prefix ("[1] the tail")
1061 => ([1], 3)
1062
1063 $json->to_json ($perl_hash_or_arrayref)
1064 Deprecated method for perl 5.8 and newer. Use encode_json instead.
1065
1066 $json->from_json ($utf8_encoded_json_text)
1067 Deprecated method for perl 5.8 and newer. Use decode_json instead.
1068
1070 In some cases, there is the need for incremental parsing of JSON texts.
1071 While this module always has to keep both JSON text and resulting Perl
1072 data structure in memory at one time, it does allow you to parse a JSON
1073 stream incrementally. It does so by accumulating text until it has a
1074 full JSON object, which it then can decode. This process is similar to
1075 using "decode_prefix" to see if a full JSON object is available, but is
1076 much more efficient (and can be implemented with a minimum of method
1077 calls).
1078
Cpanel::JSON::XS will only attempt to parse the JSON text once it is
sure it has enough text to get a decisive result, using a very simple
but truly incremental parser. This means that it sometimes won't stop
as early as the full parser; for example, it doesn't detect mismatched
parentheses. The only thing it guarantees is that it starts decoding as
soon as a syntactically valid JSON text has been seen. This means you
need to set resource limits (e.g. "max_size") to ensure the parser will
stop parsing in the presence of syntax errors.
1087
1088 The following methods implement this incremental parser.
1089
1090 [void, scalar or list context] = $json->incr_parse ([$string])
1091 This is the central parsing function. It can both append new text
1092 and extract objects from the stream accumulated so far (both of
1093 these functions are optional).
1094
1095 If $string is given, then this string is appended to the already
1096 existing JSON fragment stored in the $json object.
1097
1098 After that, if the function is called in void context, it will
1099 simply return without doing anything further. This can be used to
1100 add more text in as many chunks as you want.
1101
1102 If the method is called in scalar context, then it will try to
1103 extract exactly one JSON object. If that is successful, it will
1104 return this object, otherwise it will return "undef". If there is a
1105 parse error, this method will croak just as "decode" would do (one
1106 can then use "incr_skip" to skip the erroneous part). This is the
1107 most common way of using the method.
1108
1109 And finally, in list context, it will try to extract as many
1110 objects from the stream as it can find and return them, or the
1111 empty list otherwise. For this to work, there must be no separators
1112 between the JSON objects or arrays, instead they must be
1113 concatenated back-to-back. If an error occurs, an exception will be
1114 raised as in the scalar context case. Note that in this case, any
1115 previously-parsed JSON texts will be lost.
1116
1117 Example: Parse some JSON arrays/objects in a given string and
1118 return them.
1119
1120 my @objs = Cpanel::JSON::XS->new->incr_parse ("[5][7][1,2]");
1121
1122 $lvalue_string = $json->incr_text (>5.8 only)
This method returns the currently stored JSON fragment as an
lvalue, that is, you can manipulate it. This only works 1. when a
preceding call to "incr_parse" in scalar context successfully
returned an object, and 2. only with Perl >= 5.8.

Under all other circumstances you must not call this function (I
mean it: although in simple tests it might actually work, it will
fail under real world conditions). As a special exception, you can
also call this method before having parsed anything.
1132
1133 This function is useful in two cases: a) finding the trailing text
1134 after a JSON object or b) parsing multiple JSON objects separated
1135 by non-JSON text (such as commas).
1136
1137 $json->incr_skip
1138 This will reset the state of the incremental parser and will remove
1139 the parsed text from the input buffer so far. This is useful after
1140 "incr_parse" died, in which case the input buffer and incremental
1141 parser state is left unchanged, to skip the text parsed so far and
1142 to reset the parse state.
1143
1144 The difference to "incr_reset" is that only text until the parse
1145 error occurred is removed.
1146
1147 $json->incr_reset
1148 This completely resets the incremental parser, that is, after this
1149 call, it will be as if the parser had never parsed anything.
1150
1151 This is useful if you want to repeatedly parse JSON objects and
1152 want to ignore any trailing data, which means you have to reset the
1153 parser after each successful decode.
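
As a sketch of that pattern: parse one array, drop whatever follows it, and start again with fresh input:

```perl
use strict;
use warnings;
use Cpanel::JSON::XS;

my $json = Cpanel::JSON::XS->new;

# Scalar context: extract exactly one JSON text from the buffer.
my $first = $json->incr_parse ('[1] trailing junk');

# Discard the leftover text and all parser state.
$json->incr_reset;

# The parser now behaves as if nothing had been fed to it before.
my $second = $json->incr_parse ('[2]');

print "first: $first->[0], second: $second->[0]\n";
```

Without the call to "incr_reset", the leftover text would still be in the buffer in front of the second input.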
1154
1155 LIMITATIONS
1156 All options that affect decoding are supported, except "allow_nonref".
1157 The reason for this is that it cannot be made to work sensibly: JSON
1158 objects and arrays are self-delimited, i.e. you can concatenate them
1159 back to back and still decode them perfectly. This does not hold true
1160 for JSON numbers, however.
1161
For example, is the string 1 a single JSON number, or is it simply the
start of 12? Or is 12 a single JSON number, or the concatenation of 1
and 2? In neither case can you tell, and this is why Cpanel::JSON::XS
takes the conservative route and disallows this case.
1166
1167 EXAMPLES
1168 Some examples will make all this clearer. First, a simple example that
1169 works similarly to "decode_prefix": We want to decode the JSON object
1170 at the start of a string and identify the portion after the JSON
1171 object:
1172
    my $text = "[1,2,3] hello";

    my $json = Cpanel::JSON::XS->new;

    my $obj = $json->incr_parse ($text)
        or die "expected JSON object or array at beginning of string";

    my $tail = $json->incr_text;
    # $tail now contains " hello"
1182
1183 Easy, isn't it?
1184
1185 Now for a more complicated example: Imagine a hypothetical protocol
1186 where you read some requests from a TCP stream, and each request is a
1187 JSON array, without any separation between them (in fact, it is often
1188 useful to use newlines as "separators", as these get interpreted as
1189 whitespace at the start of the JSON text, which makes it possible to
1190 test said protocol with "telnet"...).
1191
1192 Here is how you'd do it (it is trivial to write this in an event-based
1193 manner):
1194
    my $json = Cpanel::JSON::XS->new;

    # read some data from the socket
    while (sysread $socket, my $buf, 4096) {

        # split and decode as many requests as possible
        for my $request ($json->incr_parse ($buf)) {
            # act on the $request
        }
    }
1205
1206 Another complicated example: Assume you have a string with JSON objects
1207 or arrays, all separated by (optional) comma characters (e.g. "[1],[2],
1208 [3]"). To parse them, we have to skip the commas between the JSON
1209 texts, and here is where the lvalue-ness of "incr_text" comes in
1210 useful:
1211
    my $text = "[1],[2], [3]";
    my $json = Cpanel::JSON::XS->new;

    # void context, so no parsing done
    $json->incr_parse ($text);

    # now extract as many objects as possible. note the
    # use of scalar context so incr_text can be called.
    while (my $obj = $json->incr_parse) {
        # do something with $obj

        # now skip the optional comma
        $json->incr_text =~ s/^ \s* , //x;
    }
1226
Now let's go for a very complex example: Assume that you have a gigantic
JSON array-of-objects, many gigabytes in size, and you want to parse
it, but you cannot load it into memory fully (this has actually
happened in the real world :).
1231
1232 Well, you lost, you have to implement your own JSON parser. But
1233 Cpanel::JSON::XS can still help you: You implement a (very simple)
1234 array parser and let JSON decode the array elements, which are all full
1235 JSON objects on their own (this wouldn't work if the array elements
1236 could be JSON numbers, for example):
1237
    my $json = Cpanel::JSON::XS->new;

    # open the monster
    open my $fh, "<", "bigfile.json"
        or die "bigfile: $!";

    # first parse the initial "["
    for (;;) {
        sysread $fh, my $buf, 65536
            or die "read error: $!";
        $json->incr_parse ($buf); # void context, so no parsing

        # Exit the loop once we found and removed(!) the initial "[".
        # In essence, we are (ab-)using the $json object as a simple scalar
        # we append data to.
        last if $json->incr_text =~ s/^ \s* \[ //x;
    }

    # now we have skipped the initial "[", so continue
    # parsing all the elements.
    for (;;) {
        # in this loop we read data until we got a single JSON object
        for (;;) {
            if (my $obj = $json->incr_parse) {
                # do something with $obj
                last;
            }

            # add more data
            sysread $fh, my $buf, 65536
                or die "read error: $!";
            $json->incr_parse ($buf); # void context, so no parsing
        }

        # in this loop we read data until we either found and parsed the
        # separating "," between elements, or the final "]"
        for (;;) {
            # first skip whitespace
            $json->incr_text =~ s/^\s*//;

            # if we find "]", we are done
            if ($json->incr_text =~ s/^\]//) {
                print "finished.\n";
                exit;
            }

            # if we find ",", we can continue with the next element
            if ($json->incr_text =~ s/^,//) {
                last;
            }

            # if we find anything else, we have a parse error!
            if (length $json->incr_text) {
                die "parse error near ", $json->incr_text;
            }

            # else add more data
            sysread $fh, my $buf, 65536
                or die "read error: $!";
            $json->incr_parse ($buf); # void context, so no parsing
        }
    }
1299
1300 This is a complex example, but most of the complexity comes from the
1301 fact that we are trying to be correct (bear with me if I am wrong, I
1302 never ran the above example :).
1303
Cpanel::JSON::XS detects all Unicode Byte Order Marks on decode:
UTF-8, UTF-16LE, UTF-16BE, UTF-32LE and UTF-32BE.
1307
1308 The BOM encoding is set only for one specific decode call, it does not
1309 change the state of the JSON object.
1310
Warning: With perls older than 5.20 you need to load the Encode module
before loading a multibyte BOM, i.e. >= UTF-16. Otherwise an error is
thrown. This is an implementation limitation and might get fixed later.
1314
1315 See <https://tools.ietf.org/html/rfc7159#section-8.1> "JSON text SHALL
1316 be encoded in UTF-8, UTF-16, or UTF-32."
1317
1318 "Implementations MUST NOT add a byte order mark to the beginning of a
1319 JSON text", "implementations (...) MAY ignore the presence of a byte
1320 order mark rather than treating it as an error".
1321
1322 See also <http://www.unicode.org/faq/utf_bom.html#BOM>.
1323
1324 Beware that Cpanel::JSON::XS is currently the only JSON module which
1325 does accept and decode a BOM.
1326
The latest JSON spec
<https://www.greenbytes.de/tech/webdav/rfc8259.html#character.encoding>
forbids the usage of UTF-16 or UTF-32; the character encoding is UTF-8.
Thus in subsequent updates, BOMs of UTF-16 or UTF-32 will throw an
error.
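
As a sketch of the UTF-8 case, assuming the BOM handling described above, a JSON text with a leading UTF-8 BOM can be passed straight to "decode_json":

```perl
use strict;
use warnings;
use Cpanel::JSON::XS;

# The three bytes EF BB BF form the UTF-8 byte order mark.
my $with_bom = "\xEF\xBB\xBF" . '["ok"]';

# The BOM is detected and skipped for this one decode call only.
my $data = decode_json ($with_bom);

print "$data->[0]\n";
```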
1332
1334 This section describes how Cpanel::JSON::XS maps Perl values to JSON
1335 values and vice versa. These mappings are designed to "do the right
1336 thing" in most circumstances automatically, preserving round-tripping
1337 characteristics (what you put in comes out as something equivalent).
1338
1339 For the more enlightened: note that in the following descriptions,
1340 lowercase perl refers to the Perl interpreter, while uppercase Perl
1341 refers to the abstract Perl language itself.
1342
1343 JSON -> PERL
1344 object
1345 A JSON object becomes a reference to a hash in Perl. No ordering of
1346 object keys is preserved (JSON does not preserve object key
1347 ordering itself).
1348
1349 array
1350 A JSON array becomes a reference to an array in Perl.
1351
1352 string
1353 A JSON string becomes a string scalar in Perl - Unicode codepoints
1354 in JSON are represented by the same codepoints in the Perl string,
1355 so no manual decoding is necessary.
1356
1357 number
1358 A JSON number becomes either an integer, numeric (floating point)
1359 or string scalar in perl, depending on its range and any fractional
1360 parts. On the Perl level, there is no difference between those as
1361 Perl handles all the conversion details, but an integer may take
1362 slightly less memory and might represent more values exactly than
1363 floating point numbers.
1364
1365 If the number consists of digits only, Cpanel::JSON::XS will try to
1366 represent it as an integer value. If that fails, it will try to
1367 represent it as a numeric (floating point) value if that is
1368 possible without loss of precision. Otherwise it will preserve the
1369 number as a string value (in which case you lose roundtripping
1370 ability, as the JSON number will be re-encoded to a JSON string).
1371
1372 Numbers containing a fractional or exponential part will always be
1373 represented as numeric (floating point) values, possibly at a loss
1374 of precision (in which case you might lose perfect roundtripping
1375 ability, but the JSON number will still be re-encoded as a JSON
1376 number).
1377
1378 Note that precision is not accuracy - binary floating point values
1379 cannot represent most decimal fractions exactly, and when
1380 converting from and to floating point, "Cpanel::JSON::XS" only
1381 guarantees precision up to but not including the least significant
1382 bit.
1383
1384 true, false
1385 When "unblessed_bool" is set to true, then JSON "true" becomes 1
1386 and JSON "false" becomes 0.
1387
1388 Otherwise these JSON atoms become "JSON::PP::true" and
1389 "JSON::PP::false", respectively. They are "JSON::PP::Boolean"
1390 objects and are overloaded to act almost exactly like the numbers 1
1391 and 0. You can check whether a scalar is a JSON boolean by using
1392 the "Cpanel::JSON::XS::is_bool" function.
1393
The other way round, from perl to JSON, "!0" (which is represented
as "yes") becomes "true", and "!1" (which is represented as "no")
becomes "false".
1397
1398 Via Cpanel::JSON::XS::Type you can now even force negation in
1399 "encode", without overloading of "!":
1400
1401 my $false = Cpanel::JSON::XS::false;
1402 print($json->encode([!$false], [JSON_TYPE_BOOL]));
1403 => [true]
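
The two decoding modes can be compared with the following sketch, assuming a release that provides "unblessed_bool":

```perl
use strict;
use warnings;
use Cpanel::JSON::XS;

# Default: JSON booleans become JSON::PP::Boolean objects.
my $default = Cpanel::JSON::XS->new->decode ('[true,false]');
my $class   = ref $default->[0];
my $is_bool = Cpanel::JSON::XS::is_bool ($default->[0]);

# unblessed_bool: JSON booleans become plain scalars 1 and 0.
my $plain = Cpanel::JSON::XS->new->unblessed_bool->decode ('[true,false]');

print "default class: $class\n";
print "plain values: $plain->[0], $plain->[1]\n";
```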
1404
1405 null
1406 A JSON null atom becomes "undef" in Perl.
1407
1408 shell-style comments ("# text")
1409 As a nonstandard extension to the JSON syntax that is enabled by
1410 the "relaxed" setting, shell-style comments are allowed. They can
1411 start anywhere outside strings and go till the end of the line.
1412
1413 tagged values ("(tag)value").
1414 Another nonstandard extension to the JSON syntax, enabled with the
1415 "allow_tags" setting, are tagged values. In this implementation,
1416 the tag must be a perl package/class name encoded as a JSON string,
1417 and the value must be a JSON array encoding optional constructor
1418 arguments.
1419
1420 See "OBJECT SERIALIZATION", below, for details.
1421
1422 PERL -> JSON
1423 The mapping from Perl to JSON is slightly more difficult, as Perl is a
1424 truly typeless language, so we can only guess which JSON type is meant
1425 by a Perl value.
1426
1427 hash references
1428 Perl hash references become JSON objects. As there is no inherent
1429 ordering in hash keys (or JSON objects), they will usually be
1430 encoded in a pseudo-random order that can change between runs of
1431 the same program but stays generally the same within a single run
1432 of a program. Cpanel::JSON::XS can optionally sort the hash keys
1433 (determined by the canonical flag), so the same datastructure will
1434 serialize to the same JSON text (given same settings and version of
1435 Cpanel::JSON::XS), but this incurs a runtime overhead and is only
1436 rarely useful, e.g. when you want to compare some JSON text against
1437 another for equality.
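
For example, this sketch uses the "canonical" flag to get a stable key order:

```perl
use strict;
use warnings;
use Cpanel::JSON::XS;

my %settings = (b => 1, a => 2, c => 3);

# With canonical enabled, keys are emitted in sorted order, so the
# same data always serializes to the same text.
my $out = Cpanel::JSON::XS->new->canonical->encode (\%settings);
print "$out\n";
```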
1438
1439 array references
1440 Perl array references become JSON arrays.
1441
1442 other references
1443 Other unblessed references are generally not allowed and will cause
1444 an exception to be thrown, except for references to the integers 0
1445 and 1, which get turned into "false" and "true" atoms in JSON.
1446
1447 With the option "allow_stringify", you can ignore the exception and
1448 return the stringification of the perl value.
1449
1450 With the option "allow_unknown", you can ignore the exception and
1451 return "null" instead.
1452
    encode_json [\"x"]   # => cannot encode reference to scalar 'SCALAR(0x..)'
                         # unless the scalar is 0 or 1
    encode_json [\0, \1] # yields [false,true]

    Cpanel::JSON::XS->new->allow_stringify->encode ([\"x"]) # yields "x" unlike JSON::PP
    Cpanel::JSON::XS->new->allow_unknown->encode ([\"x"])   # yields null as in JSON::PP
1459
1460 Cpanel::JSON::XS::true, Cpanel::JSON::XS::false
1461 These special values become JSON true and JSON false values,
1462 respectively. You can also use "\1" and "\0" or "!0" and "!1"
1463 directly if you want.
1464
1465 encode_json [Cpanel::JSON::XS::false, Cpanel::JSON::XS::true] # yields [false,true]
1466 encode_json [!1, !0], [JSON_TYPE_BOOL, JSON_TYPE_BOOL] # yields [false,true]
1467
1468 eq/ne comparisons with true, false:
1469
1470 false is eq to the empty string or the string 'false' or the
1471 special empty string "!!0" or "!1", i.e. "SV_NO", or the numbers 0
1472 or 0.0.
1473
1474 true is eq to the string 'true' or to the special string "!0" (i.e.
1475 "SV_YES") or to the numbers 1 or 1.0.
1476
1477 blessed objects
1478 Blessed objects are not directly representable in JSON, but
1479 "Cpanel::JSON::XS" allows various optional ways of handling
1480 objects. See "OBJECT SERIALIZATION", below, for details.
1481
See the "allow_blessed" and "convert_blessed" methods for various
options on how to deal with this: basically, you can choose between
throwing an exception, encoding the reference as if it weren't
blessed, using the object's overloaded stringification method or
providing your own serializer method.
1487
1488 simple scalars
Simple Perl scalars (any scalar that is not a reference) are the
most difficult objects to encode: Cpanel::JSON::XS will encode
undefined scalars or inf/nan as JSON "null" values and other
scalars as either a number or a string in a non-deterministic way,
which may be affected by the Perl version or by any other loaded
Perl module.

If you want stable and deterministic types in the JSON encoder,
use Cpanel::JSON::XS::Type.

An alternative way to get deterministic types is the
"type_all_string" method, with which all perl scalars are encoded
as JSON strings.

The non-deterministic behavior is as follows: scalars that have
last been used in a string context before encoding are encoded as
JSON strings, and anything else as a number value:
1505
1506 # dump as number
1507 encode_json [2] # yields [2]
1508 encode_json [-3.0e17] # yields [-3e+17]
1509 my $value = 5; encode_json [$value] # yields [5]
1510
1511 # used as string, but the two representations are for the same number
1512 print $value;
1513 encode_json [$value] # yields [5]
1514
1515 # used as different string (non-matching dual-var)
1516 my $str = '0 but true';
1517 my $num = 1 + $str;
1518 encode_json [$num, $str] # yields [1,"0 but true"]
1519
1520 # undef becomes null
1521 encode_json [undef] # yields [null]
1522
1523 # inf or nan becomes null, unless you answered
1524 # "Do you want to handle inf/nan as strings" with yes
1525 encode_json [9**9**9] # yields [null]
1526
1527 You can force the type to be a JSON string by stringifying it:
1528
1529 my $x = 3.1; # some variable containing a number
1530 "$x"; # stringified
1531 $x .= ""; # another, more awkward way to stringify
1532 print $x; # perl does it for you, too, quite often
1533
1534 You can force the type to be a JSON number by numifying it:
1535
1536 my $x = "3"; # some variable containing a string
1537 $x += 0; # numify it, ensuring it will be dumped as a number
1538 $x *= 1; # same thing, the choice is yours.
1539
Note that numerical precision has the same meaning as under Perl
(so binary to decimal conversion follows the same rules as in Perl,
which can differ from other languages). Also, your perl interpreter
might expose extensions to the floating point numbers of your
platform, such as infinities or NaNs. These cannot be represented
in JSON, and thus null is returned instead. Optionally you can
configure the module to stringify inf and nan values.
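The typing rules above can be tried out with a short sketch. The variable names are made up for illustration, and the last call assumes that Cpanel::JSON::XS::Type is installed and that "encode_json" accepts a type specification as its second argument:

```perl
use strict;
use warnings;
use Cpanel::JSON::XS qw(encode_json);
use Cpanel::JSON::XS::Type;    # exports JSON_TYPE_INT, JSON_TYPE_STRING, ...

# Numify a string: the scalar now holds an integer and is dumped as a number.
my $count = "3";
$count += 0;
my $as_number = encode_json([$count]);

# Stringify a number: the scalar now holds a string and is dumped as a string.
my $ratio = 3.1;
$ratio .= "";
my $as_string = encode_json([$ratio]);

# Deterministic alternative: pass a type specification, so the output
# no longer depends on how the scalars were used before encoding.
my $typed = encode_json(
    [ 10, "10", 10.25 ],
    [ JSON_TYPE_INT, JSON_TYPE_INT, JSON_TYPE_STRING ]
);

print "$as_number $as_string $typed\n";
```

The first two calls depend on the history of each scalar. The typed call does not, which is usually what you want when the consumer of the JSON is strict about types.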
1547
1548 OBJECT SERIALIZATION
1549 As JSON cannot directly represent Perl objects, you have to choose
1550 between a pure JSON representation (without the ability to deserialize
1551 the object automatically again), and a nonstandard extension to the
1552 JSON syntax, tagged values.
1553
1554 SERIALIZATION
1555
1556 What happens when "Cpanel::JSON::XS" encounters a Perl object depends
1557 on the "allow_blessed", "convert_blessed" and "allow_tags" settings,
1558 which are used in this order:
1559
1560 1. "allow_tags" is enabled and the object has a "FREEZE" method.
1561 In this case, "Cpanel::JSON::XS" uses the Types::Serialiser object
1562 serialization protocol to create a tagged JSON value, using a
1563 nonstandard extension to the JSON syntax.
1564
1565 This works by invoking the "FREEZE" method on the object, with the
1566 first argument being the object to serialize, and the second
1567 argument being the constant string "JSON" to distinguish it from
1568 other serializers.
1569
The "FREEZE" method can return any number of values (i.e. zero or
more). These values and the package/classname of the object will
then be encoded as a tagged JSON value in the following format:
1573
1574 ("classname")[FREEZE return values...]
1575
1576 e.g.:
1577
1578 ("URI")["http://www.google.com/"]
1579 ("MyDate")[2013,10,29]
1580 ("ImageData::JPEG")["Z3...VlCg=="]
1581
For example, the hypothetical "My::Object" "FREEZE" method might
use the object's "type" and "id" members to encode the object:
1584
1585 sub My::Object::FREEZE {
1586 my ($self, $serializer) = @_;
1587
1588 ($self->{type}, $self->{id})
1589 }
1590
1591 2. "convert_blessed" is enabled and the object has a "TO_JSON" method.
1592 In this case, the "TO_JSON" method of the object is invoked in
1593 scalar context. It must return a single scalar that can be directly
1594 encoded into JSON. This scalar replaces the object in the JSON
1595 text.
1596
1597 For example, the following "TO_JSON" method will convert all URI
1598 objects to JSON strings when serialized. The fact that these values
1599 originally were URI objects is lost.
1600
1601 sub URI::TO_JSON {
1602 my ($uri) = @_;
1603 $uri->as_string
1604 }
1605
1606 3. "convert_blessed" is enabled and the object has a stringification
1607 overload.
1608 In this case, the overloaded "" method of the object is invoked in
1609 scalar context. It must return a single scalar that can be directly
1610 encoded into JSON. This scalar replaces the object in the JSON
1611 text.
1612
1613 For example, the following "" method will convert all URI objects
1614 to JSON strings when serialized. The fact that these values
1615 originally were URI objects is lost.
1616
1617 package URI;
1618 use overload '""' => sub { shift->as_string };
1619
1620 4. "allow_blessed" is enabled.
1621 The object will be serialized as a JSON null value.
1622
1623 5. none of the above
1624 If none of the settings are enabled or the respective methods are
1625 missing, "Cpanel::JSON::XS" throws an exception.
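The order of these cases can be observed with a small sketch. "My::Point" and "My::Opaque" are hypothetical classes invented for this illustration:

```perl
use strict;
use warnings;
use Cpanel::JSON::XS;

# Hypothetical class that offers both serialization hooks.
package My::Point;
sub new     { my ($class, %args) = @_; return bless { %args }, $class }
sub FREEZE  { my ($self, $serializer) = @_; return ($self->{x}, $self->{y}) }
sub TO_JSON { my ($self) = @_; return { x => $self->{x}, y => $self->{y} } }

# Hypothetical class without any hooks.
package My::Opaque;
sub new { return bless {}, shift }

package main;

my $point  = My::Point->new(x => 1, y => 2);
my $opaque = My::Opaque->new;

# Case 1: allow_tags takes precedence when a FREEZE method exists.
my $tagged = Cpanel::JSON::XS->new->allow_tags->convert_blessed->encode([$point]);

# Case 2: convert_blessed alone uses TO_JSON (canonical sorts the keys).
my $converted = Cpanel::JSON::XS->new->canonical->convert_blessed->encode([$point]);

# Case 4: allow_blessed alone turns the object into null.
my $nulled = Cpanel::JSON::XS->new->allow_blessed->encode([$opaque]);

# Case 5: nothing is enabled, so encode throws an exception.
my $encoded_ok = eval { Cpanel::JSON::XS->new->encode([$opaque]); 1 } ? 1 : 0;
my $error      = $encoded_ok ? '' : $@;

print "$tagged\n$converted\n$nulled\n";
print "error: $error" unless $encoded_ok;
```

Only the first output is nonstandard JSON. The second and third are plain JSON, but the class information is lost.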
1626
1627 DESERIALIZATION
1628
For deserialization there are only two cases to consider: either
nonstandard tagging was used, in which case "allow_tags" decides, or
objects cannot be deserialized automatically, in which case you can
use postprocessing or the "filter_json_object" or
"filter_json_single_key_object" callbacks to get some real objects out
of your JSON.

This section only considers the tagged value case: if a tagged JSON
object is encountered during decoding and "allow_tags" is disabled, a
parse error will result (as if tagged values were not part of the
grammar).
1640
1641 If "allow_tags" is enabled, "Cpanel::JSON::XS" will look up the "THAW"
1642 method of the package/classname used during serialization (it will not
1643 attempt to load the package as a Perl module). If there is no such
1644 method, the decoding will fail with an error.
1645
1646 Otherwise, the "THAW" method is invoked with the classname as first
1647 argument, the constant string "JSON" as second argument, and all the
1648 values from the JSON array (the values originally returned by the
1649 "FREEZE" method) as remaining arguments.
1650
The method must then return the object. While technically you can
return any Perl scalar, you might have to enable the "allow_nonref"
setting to make that work in all cases, so better return an actual
blessed reference.
1655
1656 As an example, let's implement a "THAW" function that regenerates the
1657 "My::Object" from the "FREEZE" example earlier:
1658
1659 sub My::Object::THAW {
1660 my ($class, $serializer, $type, $id) = @_;
1661
1662 $class->new (type => $type, id => $id)
1663 }
1664
See the "SECURITY CONSIDERATIONS" section below. Allowing external JSON
objects to be deserialized into Perl objects is usually a very bad idea.
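Putting "FREEZE" and "THAW" together, a round trip might look like the following sketch. The "new" constructor of "My::Object" is an assumption made for this example, and the decoder should only ever see trusted input:

```perl
use strict;
use warnings;
use Cpanel::JSON::XS;

package My::Object;

# Hypothetical constructor, assumed for this sketch.
sub new { my ($class, %args) = @_; return bless { %args }, $class }

sub FREEZE {
    my ($self, $serializer) = @_;
    return ($self->{type}, $self->{id});
}

sub THAW {
    my ($class, $serializer, $type, $id) = @_;
    return $class->new(type => $type, id => $id);
}

package main;

my $coder = Cpanel::JSON::XS->new->allow_tags;

# Encode an object as a tagged value, then decode it again.
my $text = $coder->encode([ My::Object->new(type => "user", id => 42) ]);
my $copy = $coder->decode($text)->[0];

# A decoder without allow_tags treats the same text as a parse error.
my $plain_ok = eval { Cpanel::JSON::XS->new->decode($text); 1 } ? 1 : 0;

print "$text\n";
print ref($copy), " $copy->{type} $copy->{id}\n";
```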
1667
1669 The interested reader might have seen a number of flags that signify
1670 encodings or codesets - "utf8", "latin1", "binary" and "ascii". There
1671 seems to be some confusion on what these do, so here is a short
1672 comparison:
1673
"utf8" controls whether the JSON text created by "encode" (and expected
by "decode") is UTF-8 encoded or not, while "latin1" and "ascii" only
control whether "encode" escapes character values outside their
respective codeset range. None of these flags conflict with each
other, although some combinations make less sense than others.
1679
1680 Care has been taken to make all flags symmetrical with respect to
1681 "encode" and "decode", that is, texts encoded with any combination of
1682 these flag values will be correctly decoded when the same flags are
1683 used - in general, if you use different flag settings while encoding
1684 vs. when decoding you likely have a bug somewhere.
1685
1686 Below comes a verbose discussion of these flags. Note that a "codeset"
1687 is simply an abstract set of character-codepoint pairs, while an
1688 encoding takes those codepoint numbers and encodes them, in our case
1689 into octets. Unicode is (among other things) a codeset, UTF-8 is an
1690 encoding, and ISO-8859-1 (= latin 1) and ASCII are both codesets and
1691 encodings at the same time, which can be confusing.
1692
1693 "utf8" flag disabled
1694 When "utf8" is disabled (the default), then "encode"/"decode"
1695 generate and expect Unicode strings, that is, characters with high
1696 ordinal Unicode values (> 255) will be encoded as such characters,
1697 and likewise such characters are decoded as-is, no changes to them
1698 will be done, except "(re-)interpreting" them as Unicode codepoints
1699 or Unicode characters, respectively (to Perl, these are the same
1700 thing in strings unless you do funny/weird/dumb stuff).
1701
1702 This is useful when you want to do the encoding yourself (e.g. when
1703 you want to have UTF-16 encoded JSON texts) or when some other
1704 layer does the encoding for you (for example, when printing to a
1705 terminal using a filehandle that transparently encodes to UTF-8 you
1706 certainly do NOT want to UTF-8 encode your data first and have Perl
1707 encode it another time).
1708
1709 "utf8" flag enabled
1710 If the "utf8"-flag is enabled, "encode"/"decode" will encode all
1711 characters using the corresponding UTF-8 multi-byte sequence, and
1712 will expect your input strings to be encoded as UTF-8, that is, no
1713 "character" of the input string must have any value > 255, as UTF-8
1714 does not allow that.
1715
1716 The "utf8" flag therefore switches between two modes: disabled
1717 means you will get a Unicode string in Perl, enabled means you get
1718 an UTF-8 encoded octet/binary string in Perl.
1719
1720 "latin1", "binary" or "ascii" flags enabled
1721 With "latin1" (or "ascii") enabled, "encode" will escape characters
1722 with ordinal values > 255 (> 127 with "ascii") and encode the
1723 remaining characters as specified by the "utf8" flag. With
1724 "binary" enabled, ordinal values > 255 are illegal.
1725
If "utf8" is disabled, then the result is also correctly encoded in
those character sets (as both are proper subsets of Unicode,
meaning that a Unicode string with all character values < 256 is
the same thing as an ISO-8859-1 string, and a Unicode string with
all character values < 128 is the same thing as an ASCII string in
Perl).

If "utf8" is enabled, you still get a correct UTF-8-encoded string,
regardless of these flags, just some more characters will be
escaped using "\uXXXX" than before.
1736
1737 Note that ISO-8859-1-encoded strings are not compatible with UTF-8
1738 encoding, while ASCII-encoded strings are. That is because the
1739 ISO-8859-1 encoding is NOT a subset of UTF-8 (despite the
1740 ISO-8859-1 codeset being a subset of Unicode), while ASCII is.
1741
Surprisingly, "decode" will ignore these flags and so treat all
input values as governed by the "utf8" flag. If it is disabled,
this allows you to decode ISO-8859-1- and ASCII-encoded strings, as
both are strict subsets of Unicode. If it is enabled, you can correctly
decode UTF-8 encoded strings.
1747
1748 So neither "latin1", "binary" nor "ascii" are incompatible with the
1749 "utf8" flag - they only govern when the JSON output engine escapes
1750 a character or not.
1751
1752 The main use for "latin1" or "binary" is to relatively efficiently
1753 store binary data as JSON, at the expense of breaking compatibility
1754 with most JSON decoders.
1755
The main use for "ascii" is to force the output to not contain
characters with values > 127, which means you can interpret the
resulting string as UTF-8, ISO-8859-1, ASCII, KOI8-R or just about
any character set and 8-bit-encoding, and still get the same data
structure back. This is useful when your channel for JSON transfer
is not 8-bit clean or the encoding might be mangled in between
(e.g. in mail), and works because ASCII is a proper subset of most
8-bit and multibyte encodings in use in the world.
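The effect of the flags discussed above can be compared side by side. This is only an illustrative sketch with a made-up two-character test string:

```perl
use strict;
use warnings;
use Cpanel::JSON::XS;

my $data = [ "\x{e9}\x{20ac}" ];    # U+00E9 (e acute) and U+20AC (euro sign)

# utf8 disabled (the default): the result is a Perl character string.
my $chars = Cpanel::JSON::XS->new->encode($data);

# utf8 enabled: the result is a string of UTF-8 encoded octets.
my $octets = Cpanel::JSON::XS->new->utf8->encode($data);

# ascii: everything above 127 is escaped, so the output is 7-bit clean.
my $ascii = Cpanel::JSON::XS->new->ascii->encode($data);

# latin1: only characters above 255 are escaped.
my $latin1 = Cpanel::JSON::XS->new->latin1->encode($data);

# Decoding with the same flags restores the original data.
my $from_octets = Cpanel::JSON::XS->new->utf8->decode($octets);
my $from_ascii  = Cpanel::JSON::XS->new->ascii->decode($ascii);

printf "chars:  %d characters\n", length $chars;
printf "octets: %d octets\n",     length $octets;
printf "latin1: %d characters\n", length $latin1;
print  "ascii:  $ascii\n";
```

The character string and the octet string describe the same JSON text; they differ only in who is responsible for the final encoding step.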
1764
1765 JSON and ECMAscript
1766 JSON syntax is based on how literals are represented in javascript (the
1767 not-standardized predecessor of ECMAscript) which is presumably why it
1768 is called "JavaScript Object Notation".
1769
1770 However, JSON is not a subset (and also not a superset of course) of
1771 ECMAscript (the standard) or javascript (whatever browsers actually
1772 implement).
1773
1774 If you want to use javascript's "eval" function to "parse" JSON, you
1775 might run into parse errors for valid JSON texts, or the resulting data
1776 structure might not be queryable:
1777
1778 One of the problems is that U+2028 and U+2029 are valid characters
1779 inside JSON strings, but are not allowed in ECMAscript string literals,
1780 so the following Perl fragment will not output something that can be
1781 guaranteed to be parsable by javascript's "eval":
1782
1783 use Cpanel::JSON::XS;
1784
1785 print encode_json [chr 0x2028];
1786
1787 The right fix for this is to use a proper JSON parser in your
1788 javascript programs, and not rely on "eval" (see for example Douglas
1789 Crockford's json2.js parser).
1790
1791 If this is not an option, you can, as a stop-gap measure, simply encode
1792 to ASCII-only JSON:
1793
1794 use Cpanel::JSON::XS;
1795
1796 print Cpanel::JSON::XS->new->ascii->encode ([chr 0x2028]);
1797
1798 Note that this will enlarge the resulting JSON text quite a bit if you
1799 have many non-ASCII characters. You might be tempted to run some
1800 regexes to only escape U+2028 and U+2029, e.g.:
1801
1802 # DO NOT USE THIS!
1803 my $json = Cpanel::JSON::XS->new->utf8->encode ([chr 0x2028]);
1804 $json =~ s/\xe2\x80\xa8/\\u2028/g; # escape U+2028
1805 $json =~ s/\xe2\x80\xa9/\\u2029/g; # escape U+2029
1806 print $json;
1807
1808 Note that this is a bad idea: the above only works for U+2028 and
1809 U+2029 and thus only for fully ECMAscript-compliant parsers. Many
1810 existing javascript implementations, however, have issues with other
1811 characters as well - using "eval" naively simply will cause problems.
1812
1813 Another problem is that some javascript implementations reserve some
1814 property names for their own purposes (which probably makes them non-
1815 ECMAscript-compliant). For example, Iceweasel reserves the "__proto__"
1816 property name for its own purposes.
1817
If that is a problem, you could try to filter the resulting JSON
output for these property strings, e.g.:
1820
1821 $json =~ s/"__proto__"\s*:/"__proto__renamed":/g;
1822
1823 This works because "__proto__" is not valid outside of strings, so
1824 every occurrence of ""__proto__"\s*:" must be a string used as property
1825 name.
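As a quick check of that reasoning, the following sketch applies the substitution to a made-up JSON text in which "__proto__" appears both as a property name and inside a string value. Only the property names should change:

```perl
use strict;
use warnings;

# Made-up input: two property names and one string value mention __proto__.
my $json = '{"__proto__":1,"note":"a \"__proto__\": b","nested":{"__proto__" :2}}';

# Inside a JSON string the quotes are escaped, so the pattern
# cannot match there; it only matches real property names.
(my $renamed = $json) =~ s/"__proto__"\s*:/"__proto__renamed":/g;

print "$renamed\n";
```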
1826
1827 Unicode non-characters between U+FFFD and U+10FFFF are decoded either
1828 to the recommended U+FFFD REPLACEMENT CHARACTER (see Unicode PR #121:
1829 Recommended Practice for Replacement Characters), or in the binary or
1830 relaxed mode left as is, keeping the illegal non-characters as before.
1831
Raw non-Unicode characters outside the valid Unicode range now fail to
parse, because "A string is a sequence of zero or more Unicode
characters" (RFC 7159 section 1) and "JSON text SHALL be encoded in
Unicode" (RFC 7159 section 8.1). We now use the UTF8_DISALLOW_SUPER
flag when parsing Unicode.
1837
1838 If you know of other incompatibilities, please let me know.
1839
1840 JSON and YAML
You often hear that JSON is a subset of YAML. In general, however,
there is no way to configure JSON::XS to output a data structure as
valid YAML that works in all cases. If you really must use
Cpanel::JSON::XS to generate YAML, you should use this algorithm
(subject to change in future versions):
1846
1847 my $to_yaml = Cpanel::JSON::XS->new->utf8->space_after (1);
1848 my $yaml = $to_yaml->encode ($ref) . "\n";
1849
1850 This will usually generate JSON texts that also parse as valid YAML.
1851
1852 SPEED
1853 It seems that JSON::XS is surprisingly fast, as shown in the following
1854 tables. They have been generated with the help of the "eg/bench"
1855 program in the JSON::XS distribution, to make it easy to compare on
1856 your own system.
1857
JSON::XS is, along with Data::MessagePack and Sereal, one of the
fastest serializers, because JSON and JSON::XS do not support backrefs
(no graph structures), only trees. Storable supports backrefs, i.e.
graphs. Data::MessagePack encodes its data in binary (as does Storable)
and supports only a very simple subset of JSON.
1863
1864 First comes a comparison between various modules using a very short
1865 single-line JSON string (also available at
1866 <http://dist.schmorp.de/misc/json/short.json>).
1867
1868 {"method": "handleMessage", "params": ["user1",
1869 "we were just talking"], "id": null, "array":[1,11,234,-5,1e5,1e7,
1870 1, 0]}
1871
It shows the number of encodes/decodes per second (JSON::XS uses the
functional interface, while JSON::XS/2 uses the OO interface with
pretty-printing and hash key sorting enabled, and JSON::XS/3 enables
shrink. JSON::DWIW/DS uses the deserialize function, while
JSON::DWIW/FJ uses the from_json method). Higher is better:
1877
1878 module | encode | decode |
1879 --------------|------------|------------|
1880 JSON::DWIW/DS | 86302.551 | 102300.098 |
1881 JSON::DWIW/FJ | 86302.551 | 75983.768 |
1882 JSON::PP | 15827.562 | 6638.658 |
1883 JSON::Syck | 63358.066 | 47662.545 |
1884 JSON::XS | 511500.488 | 511500.488 |
1885 JSON::XS/2 | 291271.111 | 388361.481 |
1886 JSON::XS/3 | 361577.931 | 361577.931 |
1887 Storable | 66788.280 | 265462.278 |
1888 --------------+------------+------------+
1889
1890 That is, JSON::XS is almost six times faster than JSON::DWIW on
1891 encoding, about five times faster on decoding, and over thirty to
1892 seventy times faster than JSON's pure perl implementation. It also
1893 compares favourably to Storable for small amounts of data.
1894
Using a longer test string (roughly 18KB, generated from the Yahoo!
Local search API, <http://dist.schmorp.de/misc/json/long.json>):
1897
1898 module | encode | decode |
1899 --------------|------------|------------|
1900 JSON::DWIW/DS | 1647.927 | 2673.916 |
1901 JSON::DWIW/FJ | 1630.249 | 2596.128 |
1902 JSON::PP | 400.640 | 62.311 |
1903 JSON::Syck | 1481.040 | 1524.869 |
1904 JSON::XS | 20661.596 | 9541.183 |
1905 JSON::XS/2 | 10683.403 | 9416.938 |
1906 JSON::XS/3 | 20661.596 | 9400.054 |
1907 Storable | 19765.806 | 10000.725 |
1908 --------------+------------+------------+
1909
Again, JSON::XS leads by far (except for Storable which,
unsurprisingly, decodes a bit faster).
1912
1913 On large strings containing lots of high Unicode characters, some
1914 modules (such as JSON::PC) seem to decode faster than JSON::XS, but the
1915 result will be broken due to missing (or wrong) Unicode handling.
1916 Others refuse to decode or encode properly, so it was impossible to
1917 prepare a fair comparison table for that case.
1918
1919 For updated graphs see
1920 <https://github.com/Sereal/Sereal/wiki/Sereal-Comparison-Graphs>
1921
1923 As long as you only serialize data that can be directly expressed in
1924 JSON, "Cpanel::JSON::XS" is incapable of generating invalid JSON output
1925 (modulo bugs, but "JSON::XS" has found more bugs in the official JSON
1926 testsuite (1) than the official JSON testsuite has found in "JSON::XS"
(0)). "Cpanel::JSON::XS" is currently the only known JSON decoder
which passes all <http://seriot.ch/projects/parsing_json.html> tests,
while also being the fastest.
1930
1931 When you have trouble decoding JSON generated by this module using
1932 other decoders, then it is very likely that you have an encoding
1933 mismatch or the other decoder is broken.
1934
1935 When decoding, "JSON::XS" is strict by default and will likely catch
1936 all errors. There are currently two settings that change this:
1937 "relaxed" makes "JSON::XS" accept (but not generate) some non-standard
1938 extensions, and "allow_tags" or "allow_blessed" will allow you to
1939 encode and decode Perl objects, at the cost of being totally insecure
1940 and not outputting valid JSON anymore.
1941
1942 JSON-XS-3.01 broke interoperability with JSON-2.90 with booleans. See
1943 JSON.
1944
Cpanel::JSON::XS needs to know the JSON and JSON::XS versions to be
able to work with those objects, especially when encoding booleans like
"{"is_true":true}". So you need to load these modules beforehand.
1948
1949 true/false overloading and boolean representations are supported.
1950
JSON::XS and JSON::PP representations are accepted, and older JSON::XS
accepts Cpanel::JSON::XS booleans. All of the JSON modules (JSON,
JSON::PP, JSON::XS, Cpanel::JSON::XS) produce JSON::PP::Boolean
objects; only Mojo and JSON::YAJL do not. Mojo produces
Mojo::JSON::_Bool and JSON::YAJL::Parser just an unblessed IV.
1956
1957 Cpanel::JSON::XS accepts JSON::PP::Boolean and Mojo::JSON::_Bool
1958 objects as booleans.
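A short sketch of the boolean handling described above, using only the module's own "true" constant and "is_bool" function:

```perl
use strict;
use warnings;
use Cpanel::JSON::XS qw(encode_json decode_json);

# Decoded booleans are objects that behave like true and false.
my $decoded = decode_json('{"is_true":true,"is_false":false}');
my $is_bool = Cpanel::JSON::XS::is_bool($decoded->{is_true}) ? 1 : 0;

# Encoding the module's own constant yields a JSON true, not 1 or "1".
my $text = encode_json({ is_true => Cpanel::JSON::XS::true() });

print "is_bool: $is_bool\n";
print "is_true is ",  ($decoded->{is_true}  ? "true" : "false"), "\n";
print "is_false is ", ($decoded->{is_false} ? "true" : "false"), "\n";
print "$text\n";
```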
1959
1960 I cannot think of any reason to still use JSON::XS anymore.
1961
1962 TAGGED VALUE SYNTAX AND STANDARD JSON EN/DECODERS
When you use "allow_tags" to use the extended (and also nonstandard and
invalid) JSON syntax for serialized objects, and you still want to
decode the generated serialized objects, you can run a regex to replace
the tagged syntax by standard JSON arrays (it only works for "normal"
package names without commas, newlines or single colons). First, the
readable Perl version:
1969
1970 # if your FREEZE methods return no values, you need this replace first:
1971 $json =~ s/\( \s* (" (?: [^\\":,]+|\\.|::)* ") \s* \) \s* \[\s*\]/[$1]/gx;
1972
1973 # this works for non-empty constructor arg lists:
1974 $json =~ s/\( \s* (" (?: [^\\":,]+|\\.|::)* ") \s* \) \s* \[/[$1,/gx;
1975
1976 And here is a less readable version that is easy to adapt to other
1977 languages:
1978
1979 $json =~ s/\(\s*("([^\\":,]+|\\.|::)*")\s*\)\s*\[/[$1,/g;
1980
1981 Here is an ECMAScript version (same regex):
1982
1983 json = json.replace (/\(\s*("([^\\":,]+|\\.|::)*")\s*\)\s*\[/g, "[$1,");
1984
1985 Since this syntax converts to standard JSON arrays, it might be hard to
1986 distinguish serialized objects from normal arrays. You can prepend a
1987 "magic number" as first array element to reduce chances of a collision:
1988
1989 $json =~ s/\(\s*("([^\\":,]+|\\.|::)*")\s*\)\s*\[/["XU1peReLzT4ggEllLanBYq4G9VzliwKF",$1,/g;
1990
1991 And after decoding the JSON text, you could walk the data structure
1992 looking for arrays with a first element of
1993 "XU1peReLzT4ggEllLanBYq4G9VzliwKF".
1994
1995 The same approach can be used to create the tagged format with another
1996 encoder. First, you create an array with the magic string as first
1997 member, the classname as second, and constructor arguments last, encode
1998 it as part of your JSON structure, and then:
1999
2000 $json =~ s/\[\s*"XU1peReLzT4ggEllLanBYq4G9VzliwKF"\s*,\s*("([^\\":,]+|\\.|::)*")\s*,/($1)[/g;
2001
2002 Again, this has some limitations - the magic string must not be encoded
2003 with character escapes, and the constructor arguments must be non-
2004 empty.
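The two substitutions can be checked against each other on a literal tagged text. This sketch needs no JSON module at all; the sample text is made up from the examples used earlier in this document:

```perl
use strict;
use warnings;

my $magic  = "XU1peReLzT4ggEllLanBYq4G9VzliwKF";
my $tagged = '[("My::Object")["user",42],("URI")["http://www.google.com/"]]';

# Tagged syntax -> standard JSON arrays, marked with the magic string.
(my $standard = $tagged)
    =~ s/\(\s*("([^\\":,]+|\\.|::)*")\s*\)\s*\[/["$magic",$1,/g;

# Standard JSON arrays -> tagged syntax again.
(my $restored = $standard)
    =~ s/\[\s*"\Q$magic\E"\s*,\s*("([^\\":,]+|\\.|::)*")\s*,/($1)[/g;

print "$standard\n";
print "$restored\n";
```

The intermediate text is plain JSON that any decoder should accept, and the second substitution restores the original tagged text, within the limitations stated above.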
2005
2007 Since this module was written, Google has written a new JSON RFC, RFC
2008 7159 (and RFC7158). Unfortunately, this RFC breaks compatibility with
2009 both the original JSON specification on www.json.org and RFC4627.
2010
2011 As far as I can see, you can get partial compatibility when parsing by
2012 using "->allow_nonref". However, consider the security implications of
2013 doing so.
2014
2015 I haven't decided yet when to break compatibility with RFC4627 by
2016 default (and potentially leave applications insecure) and change the
2017 default to follow RFC7159, but application authors are well advised to
2018 call "->allow_nonref(0)" even if this is the current default, if they
2019 cannot handle non-reference values, in preparation for the day when the
2020 default will change.
2021
JSON::XS and Cpanel::JSON::XS are not only fast. JSON is generally the
most secure serializing format, because it is the only one besides
Data::MessagePack which does not deserialize objects by default. This
holds for all languages, not just perl. The binary variant BSON
(MongoDB) does more but is unsafe.
2028
2029 It is trivial for any attacker to create such serialized objects in
2030 JSON and trick perl into expanding them, thereby triggering certain
2031 methods. Watch <https://www.youtube.com/watch?v=Gzx6KlqiIZE> for an
2032 exploit demo for "CVE-2015-1592 SixApart MovableType Storable Perl Code
2033 Execution" for a deserializer which expands objects. Deserializing
2034 even coderefs (methods, functions) or external data would be considered
2035 the most dangerous.
2036
2037 Security relevant overview of serializers regarding deserializing
2038 objects by default:
2039
2040 Objects Coderefs External Data
2041
2042 Data::Dumper YES YES YES
2043 Storable YES NO (def) NO
2044 Sereal YES NO NO
2045 YAML YES NO NO
2046 B::C YES YES YES
2047 B::Bytecode YES YES YES
2048 BSON YES YES NO
2049 JSON::SL YES NO YES
2050 JSON NO (def) NO NO
2051 Data::MessagePack NO NO NO
2052 XML NO NO YES
2053
2054 Pickle YES YES YES
2055 PHP Deserialize YES NO NO
2056
2057 When you are using JSON in a protocol, talking to untrusted potentially
2058 hostile creatures requires relatively few measures.
2059
2060 First of all, your JSON decoder should be secure, that is, should not
2061 have any buffer overflows. Obviously, this module should ensure that.
2062
Second, you need to avoid resource-starving attacks. That means you
should limit the size of JSON texts you accept, or make sure that when
your resources run out, that's just fine (e.g. by using a separate
process that can crash safely). The size of a JSON text in octets or
2067 characters is usually a good indication of the size of the resources
2068 required to decode it into a Perl structure. While JSON::XS can check
2069 the size of the JSON text, it might be too late when you already have
2070 it in memory, so you might want to check the size before you accept the
2071 string.
2072
2073 Third, Cpanel::JSON::XS recurses using the C stack when decoding
2074 objects and arrays. The C stack is a limited resource: for instance, on
2075 my amd64 machine with 8MB of stack size I can decode around 180k nested
2076 arrays but only 14k nested JSON objects (due to perl itself recursing
2077 deeply on croak to free the temporary). If that is exceeded, the
2078 program crashes. To be conservative, the default nesting limit is set
2079 to 512. If your process has a smaller stack, you should adjust this
2080 setting accordingly with the "max_depth" method.
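Both limits can be set on the coder object. The numbers in this sketch are arbitrary and chosen only to make the effect visible; pick values that fit your protocol:

```perl
use strict;
use warnings;
use Cpanel::JSON::XS;

# Illustrative limits: at most 16 nesting levels and 1024 octets of input.
my $strict = Cpanel::JSON::XS->new->utf8->max_depth(16)->max_size(1024);

my $deep = ("[" x 100) . ("]" x 100);            # 100 nested arrays
my $big  = "[" . join(",", (1) x 2000) . "]";    # roughly 4KB of JSON text

# Both texts are valid JSON but are rejected with an exception.
my $deep_ok = eval { $strict->decode($deep); 1 } ? 1 : 0;
my $big_ok  = eval { $strict->decode($big);  1 } ? 1 : 0;

# Small, shallow texts are still accepted.
my $small = $strict->decode('{"ok":[1,2,3]}');

print "deep accepted: $deep_ok, big accepted: $big_ok\n";
```

Note that "max_size" only helps once the text is already in memory, so checking the size while reading from the network is still advisable.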
2081
2082 Also keep in mind that Cpanel::JSON::XS might leak contents of your
2083 Perl data structures in its error messages, so when you serialize
2084 sensitive information you might want to make sure that exceptions
2085 thrown by JSON::XS will not end up in front of untrusted eyes.
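One way to keep such messages away from untrusted eyes is to catch the exception and hand out a generic error instead. The helper name "decode_request" and the in-memory log are made up for this sketch:

```perl
use strict;
use warnings;
use Cpanel::JSON::XS;

my $coder = Cpanel::JSON::XS->new->utf8;
my @server_log;    # stand-in for a real, access-restricted log

# Hypothetical helper: returns ($data, undef) on success and
# (undef, $generic_message) on failure.
sub decode_request {
    my ($octets) = @_;
    my $data;
    if (eval { $data = $coder->decode($octets); 1 }) {
        return ($data, undef);
    }
    push @server_log, "JSON decode failed: $@";    # full detail stays server-side
    return (undef, "invalid request");             # the peer only sees this
}

# The closing brace is missing, so decoding fails.
my ($data, $client_error) = decode_request('{"password":"hunter2"');

print "client sees: $client_error\n";
```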
2086
If you are using Cpanel::JSON::XS to return packets for consumption by
JavaScript scripts in a browser you should have a look at
<http://blog.archive.jpsykes.com/47/practical-csrf-and-json-security/>
to see whether you are vulnerable to some common attack vectors (which
really are browser design bugs, but it is still you who will have to
deal with it, as major browser developers care only for features, not
about getting security right). You might also want to look at the
special escape rules of Mojo::JSON, which help prevent XSS attacks.
2095
2097 TL;DR: Due to security concerns, Cpanel::JSON::XS will not allow scalar
2098 data in JSON texts by default - you need to create your own
2099 Cpanel::JSON::XS object and enable "allow_nonref":
2100
my $json = Cpanel::JSON::XS->new->allow_nonref;
2102
2103 $text = $json->encode ($data);
2104 $data = $json->decode ($text);
2105
The long version: JSON being an important and supposedly stable format,
the IETF standardized it as RFC 4627 in 2006. Unfortunately the
inventor of JSON, Douglas Crockford, unilaterally changed the definition
of JSON in javascript. Rather than create a fork, the IETF decided to
standardize the new syntax (apparently, so I was told, without finding
it very amusing).
2112
2113 The biggest difference between the original JSON and the new JSON is
2114 that the new JSON supports scalars (anything other than arrays and
2115 objects) at the top-level of a JSON text. While this is strictly
2116 backwards compatible to older versions, it breaks a number of protocols
2117 that relied on sending JSON back-to-back, and is a minor security
2118 concern.
2119
2120 For example, imagine you have two banks communicating, and on one side,
2121 the JSON coder gets upgraded. Two messages, such as 10 and 1000 might
2122 then be confused to mean 101000, something that couldn't happen in the
2123 original JSON, because neither of these messages would be valid JSON.
2124
2125 If one side accepts these messages, then an upgrade in the coder on
2126 either side could result in this becoming exploitable.
2127
2128 This module has always allowed these messages as an optional extension,
2129 by default disabled. The security concerns are the reason why the
2130 default is still disabled, but future versions might/will likely
2131 upgrade to the newer RFC as default format, so you are advised to check
2132 your implementation and/or override the default with "->allow_nonref
2133 (0)" to ensure that future versions are safe.
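Setting the flag explicitly makes the behaviour independent of whatever the default happens to be. A minimal sketch:

```perl
use strict;
use warnings;
use Cpanel::JSON::XS;

# Explicitly strict: top-level scalars are refused, whatever the default is.
my $strict   = Cpanel::JSON::XS->new->allow_nonref(0);
my $rejected = eval { $strict->decode('1000'); 1 } ? 0 : 1;

# Explicitly lenient: top-level scalars are accepted in both directions.
my $lenient = Cpanel::JSON::XS->new->allow_nonref;
my $number  = $lenient->decode('1000');
my $text    = $lenient->encode("hello");

print "strict coder rejected scalar: $rejected\n";
print "lenient coder: $number $text\n";
```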
2134
2136 Cpanel::JSON::XS has proper ithreads support, unlike JSON::XS. If you
2137 encounter any bugs with thread support please report them.
2138
From version 4.00 to 4.19 you could not encode true with
threads::shared magic.
2141
2143 While the goal of the Cpanel::JSON::XS module is to be correct, that
2144 unfortunately does not mean it's bug-free, only that the author thinks
2145 its design is bug-free. If you keep reporting bugs and tests they will
2146 be fixed swiftly, though.
2147
Since the JSON::XS author refuses to use a public bugtracker and
prefers private emails, we use the tracker at github, so you might want
to report any issues twice: once in private to MLEHMANN to be fixed in
JSON::XS and once to our public tracker. Issues fixed by JSON::XS
with a new release will also be backported to Cpanel::JSON::XS and
5.6.2, as long as cPanel relies on 5.6.2 and Cpanel::JSON::XS as our
serializer of choice.
2155
2156 <https://github.com/rurban/Cpanel-JSON-XS/issues>
2157
2159 This module is available under the same licences as perl, the Artistic
2160 license and the GPL.
2161
2163 The cpanel_json_xs command line utility for quick experiments.
2164
2165 JSON, JSON::XS, JSON::MaybeXS, Mojo::JSON, Mojo::JSON::MaybeXS,
2166 JSON::SL, JSON::DWIW, JSON::YAJL, JSON::Any, Test::JSON,
2167 Locale::Wolowitz, <https://metacpan.org/search?q=JSON>
2168
2169 <https://tools.ietf.org/html/rfc7159>
2170
2171 <https://tools.ietf.org/html/rfc4627>
2172
2174 Reini Urban <rurban@cpan.org>
2175
2176 Marc Lehmann <schmorp@schmorp.de>, http://home.schmorp.de/
2177
2179 Reini Urban <rurban@cpan.org>
2180
2181
2182
2183perl v5.38.0 2023-07-20 XS(3)