XS(3)            User Contributed Perl Documentation            XS(3)

NAME
Cpanel::JSON::XS - cPanel fork of JSON::XS, fast and correct
serializing

SYNOPSIS
10 use Cpanel::JSON::XS;
11
12 # exported functions, they croak on error
13 # and expect/generate UTF-8
14
15 $utf8_encoded_json_text = encode_json $perl_hash_or_arrayref;
16 $perl_hash_or_arrayref = decode_json $utf8_encoded_json_text;
17
18 # OO-interface
19
20 $coder = Cpanel::JSON::XS->new->ascii->pretty->allow_nonref;
21 $pretty_printed_unencoded = $coder->encode ($perl_scalar);
22 $perl_scalar = $coder->decode ($unicode_json_text);
23
24 # Note that 5.6 misses most smart utf8 and encoding functionalities
25 # of newer releases.
26
27 # Note that L<JSON::MaybeXS> will automatically use Cpanel::JSON::XS
28 # if available, at virtually no speed overhead either, so you should
29 # be able to just:
30
31 use JSON::MaybeXS;
32
33 # and do the same things, except that you have a pure-perl fallback now.

DESCRIPTION
36 This module converts Perl data structures to JSON and vice versa. Its
37 primary goal is to be correct and its secondary goal is to be fast. To
38 reach the latter goal it was written in C.
39
40 As this is the n-th-something JSON module on CPAN, what was the reason
41 to write yet another JSON module? While it seems there are many JSON
42 modules, none of them correctly handle all corner cases, and in most
43 cases their maintainers are unresponsive, gone missing, or not
44 listening to bug reports for other reasons.
45
46 See below for the cPanel fork.
47
48 See MAPPING, below, on how Cpanel::JSON::XS maps perl values to JSON
49 values and vice versa.
50
51 FEATURES
52 · correct Unicode handling
53
54 This module knows how to handle Unicode with Perl version higher
55 than 5.8.5, documents how and when it does so, and even documents
56 what "correct" means.
57
58 · round-trip integrity
59
60 When you serialize a perl data structure using only data types
61 supported by JSON and Perl, the deserialized data structure is
62 identical on the Perl level. (e.g. the string "2.0" doesn't
63 suddenly become "2" just because it looks like a number). There are
64 minor exceptions to this, read the MAPPING section below to learn
65 about those.
66
67 · strict checking of JSON correctness
68
There is no guessing, no generating of illegal JSON texts by
default, and only JSON is accepted as input by default. The latter
is a security feature.
72
73 · fast
74
75 Compared to other JSON modules and other serializers such as
76 Storable, this module usually compares favourably in terms of
77 speed, too.
78
79 · simple to use
80
81 This module has both a simple functional interface as well as an
82 object oriented interface.
83
84 · reasonably versatile output formats
85
86 You can choose between the most compact guaranteed-single-line
87 format possible (nice for simple line-based protocols), a pure-
88 ASCII format (for when your transport is not 8-bit clean, still
89 supports the whole Unicode range), or a pretty-printed format (for
90 when you want to read that stuff). Or you can combine those
91 features in whatever way you like.
92
93 cPanel fork
Since the original author MLEHMANN has no public bugtracker, this
cPanel fork now lives on GitHub.
96
97 src repo: <https://github.com/rurban/Cpanel-JSON-XS> original:
98 <http://cvs.schmorp.de/JSON-XS/>
99
100 RT: <https://github.com/rurban/Cpanel-JSON-XS/issues> or
101 <https://rt.cpan.org/Public/Dist/Display.html?Queue=Cpanel-JSON-XS>
102
103 Changes to JSON::XS
104
- stricter decode_json() as documented: non-refs are disallowed.
A second optional argument was added, and decode() now honors
allow_nonref.
107
108 - fixed encode of numbers for dual-vars. Different string
109 representations are preserved, but numbers with temporary strings
110 which represent the same number are here treated as numbers, not
111 strings. Cpanel::JSON::XS is a bit slower, but preserves numeric
112 types better.
113
- numbers ending with .0 stay numbers and are not converted to
integers. [#63] Dual-vars which are represented as number, not
integer (42+"bar" != 5.8.9), are now encoded as number (=> 42.0)
because internally it's now a NOK type. However !!1, which is
wrongly encoded in 5.8 as "1"/1.0, is still represented as integer.
119
120 - different handling of inf/nan. Default now to null, optionally with
121 stringify_infnan() to "inf"/"nan". [#28, #32]
122
- added "binary" extension, which is not JSON and not parsable as
JSON; it allows "\xNN" and "\NNN" sequences.
125
126 - 5.6.2 support; sacrificing some utf8 features (assuming bytes
127 all-over), no multi-byte unicode characters with 5.6.
128
- interop for true/false overloading. JSON::XS, JSON::PP and Mojo::JSON
representations for booleans are accepted and JSON::XS accepts
Cpanel::JSON::XS booleans [#13, #37].
Fixed overloading of booleans. Cpanel::JSON::XS::true stringifies
again to "1", not "true", analogous to all other JSON modules.
135
136 - native boolean mapping of yes and no to true and false, as in
137 YAML::XS.
138 In perl "!0" is yes, "!1" is no.
139 The JSON value true maps to 1, false maps to 0. [#39]
140
141 - support arbitrary stringification with encode, with convert_blessed
142 and allow_blessed.
143
- ithread support. Cpanel::JSON::XS is thread-safe, JSON::XS is not.

- is_bool can be called as a method, JSON::XS::is_bool cannot.
147
148 - performance optimizations for threaded Perls
149
150 - relaxed mode, allowing many popular extensions
151
152 - additional fixes for:
153
154 - [cpan #88061] AIX atof without USE_LONG_DOUBLE
155
156 - #10 unshare_hek crash
157
158 - #7, #29 avoid re-blessing where possible. It fails in JSON::XS for
159 READONLY values, i.e. restricted hashes.
160
161 - #41 overloading of booleans, use the object not the reference.
162
163 - #62 -Dusequadmath conversion and no SEGV.
164
- #72 parsing of values followed by \0, like 1\0, now fails.
166
167 - #72 parsing of illegal unicode or non-unicode characters.
168
169 - #96 locale-insensitive numeric conversion
170
171 - public maintenance and bugtracker
172
173 - use ppport.h, sanify XS.xs comment styles, harness C coding style
174
175 - common::sense is optional. When available it is not used in the
176 published production module, just during development and testing.
177
- extended testsuite, passes all http://seriot.ch/parsing_json.html
tests. In fact it is the only known JSON decoder which does so,
while also being the fastest.
181
182 - support many more options and methods from JSON::PP:
183 stringify_infnan, allow_unknown, allow_stringify, allow_barekey,
184 encode_stringify, allow_bignum, allow_singlequote, sort_by
185 (partially), escape_slash, convert_blessed, ... optional
186 decode_json(, allow_nonref) arg.
187 relaxed implements allow_dupkeys.
188
- support all 5 unicode BOMs: UTF-8, UTF-16LE, UTF-16BE, UTF-32LE,
UTF-32BE, encoding internally to UTF-8.

FUNCTIONAL INTERFACE
193 The following convenience methods are provided by this module. They are
194 exported by default:
195
196 $json_text = encode_json $perl_scalar, [json_type]
197 Converts the given Perl data structure to a UTF-8 encoded, binary
198 string (that is, the string contains octets only). Croaks on error.
199
200 This function call is functionally identical to:
201
202 $json_text = Cpanel::JSON::XS->new->utf8->encode ($perl_scalar)
203
except being faster.
205
206 For the type argument see Cpanel::JSON::XS::Type.
207
208 $perl_scalar = decode_json $json_text [, $allow_nonref ]
The opposite of "encode_json": expects a UTF-8 (binary) string
containing a JSON reference (an object or array) and tries to parse
it as UTF-8 encoded JSON text, returning the resulting reference.
Croaks on error.
212
213 This function call is functionally identical to:
214
215 $perl_scalar = Cpanel::JSON::XS->new->utf8->decode ($json_text)
216
217 except being faster.
218
Note that decode_json in Cpanel::JSON::XS versions older than
3.0116, and in JSON::XS, did not set allow_nonref but accepted
non-reference values anyway due to a bug in the decoder.

If the new optional $allow_nonref argument is set and not false,
the allow_nonref option will be set and the function will act as
described in the more relaxed RFC 7159, allowing all values such as
objects, arrays, strings, numbers, "null", "true", and "false".
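
As a minimal sketch of the optional second argument (assuming a Cpanel::JSON::XS release of 3.0116 or later is installed; the variable names are illustrative only), the following decodes a top-level array normally and a top-level string only when $allow_nonref is true:

```perl
use strict;
use warnings;
use Cpanel::JSON::XS qw(decode_json);

# A top-level array or object needs no extra argument.
my $list = decode_json('[1, 2, 3]');

# A top-level string is accepted when the second argument is true.
my $string = decode_json('"hello"', 1);

# Without the second argument the same text is expected to croak,
# so the call is wrapped in eval to inspect the error instead of dying.
my $accepted = eval { decode_json('"hello"'); 1 };
print "strict decode_json refused a bare string: $@" unless $accepted;
```

Wrapping the strict call in eval keeps the sketch runnable on versions that differ in how they treat non-reference input.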
227
228 $is_boolean = Cpanel::JSON::XS::is_bool $scalar
229 Returns true if the passed scalar represents either
230 "JSON::XS::true" or "JSON::XS::false", two constants that act like
231 1 and 0, respectively and are used to represent JSON "true" and
232 "false" values in Perl.
233
234 See MAPPING, below, for more information on how JSON values are
235 mapped to Perl.
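
A short sketch of how is_bool can tell a decoded JSON boolean apart from an ordinary string, assuming the default decoder settings (the hash keys "active" and "name" are made up for illustration):

```perl
use strict;
use warnings;
use Cpanel::JSON::XS qw(decode_json);

my $data = decode_json('{"active":true,"name":"true"}');

# The JSON literal true decodes to a boolean constant...
my $active_is_bool = Cpanel::JSON::XS::is_bool($data->{active}) ? 1 : 0;

# ...while the JSON string "true" is just an ordinary Perl string.
my $name_is_bool = Cpanel::JSON::XS::is_bool($data->{name}) ? 1 : 0;

print "active: $active_is_bool, name: $name_is_bool\n";
```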

DEPRECATED FUNCTIONS
238 from_json
239 from_json has been renamed to decode_json
240
241 to_json
242 to_json has been renamed to encode_json

A FEW NOTES ON UNICODE AND PERL
245 Since this often leads to confusion, here are a few very clear words on
246 how Unicode works in Perl, modulo bugs.
247
248 1. Perl strings can store characters with ordinal values > 255.
249 This enables you to store Unicode characters as single characters
250 in a Perl string - very natural.
251
252 2. Perl does not associate an encoding with your strings.
253 ... until you force it to, e.g. when matching it against a regex,
254 or printing the scalar to a file, in which case Perl either
255 interprets your string as locale-encoded text, octets/binary, or as
256 Unicode, depending on various settings. In no case is an encoding
257 stored together with your data, it is use that decides encoding,
258 not any magical meta data.
259
3. The internal utf-8 flag has no meaning with regards to the encoding
of your string.

4. A "Unicode String" is simply a string where each character can be
validly interpreted as a Unicode code point.
If you have UTF-8 encoded data, it is no longer a Unicode string,
but a Unicode string encoded in UTF-8, giving you a binary string.

5. A string containing "high" (> 255) character values is not a UTF-8
string.

269 6. Unicode noncharacters only warn, as in core.
270 The 66 Unicode noncharacters U+FDD0..U+FDEF, and U+*FFFE, U+*FFFF
271 just warn, see <http://www.unicode.org/versions/corrigendum9.html>.
272 But illegal surrogate pairs fail to parse.
273
7. Raw non-Unicode characters above U+10FFFF are disallowed.
Raw non-Unicode characters outside the valid Unicode range fail to
parse, because "A string is a sequence of zero or more Unicode
characters" (RFC 7159 section 1) and "JSON text SHALL be encoded in
Unicode" (RFC 7159 section 8.1). We now use the UTF8_DISALLOW_SUPER
flag when parsing Unicode.
280
281 I hope this helps :)
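
The difference between a Unicode string and its UTF-8 encoded form (points 1, 4 and 5 above) can be illustrated with the core Encode module alone. This is only a sketch; the variable names are illustrative:

```perl
use strict;
use warnings;
use Encode qw(encode decode);

# One character with an ordinal value above 255 (U+263A).
my $chars = "\x{263A}";

# Encoding it as UTF-8 gives a binary string of three octets.
my $bytes = encode('UTF-8', $chars);

my $char_len = length $chars;    # counts characters
my $byte_len = length $bytes;    # counts octets

# Decoding the octets gives back the original Unicode string.
my $roundtrip = decode('UTF-8', $bytes);

print "characters: $char_len, octets: $byte_len\n";
```

The same data therefore has two different lengths depending on whether it is held as characters or as encoded octets, which is why the two forms should not be mixed.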

OBJECT-ORIENTED INTERFACE
284 The object oriented interface lets you configure your own encoding or
285 decoding style, within the limits of supported formats.
286
287 $json = new Cpanel::JSON::XS
288 Creates a new JSON object that can be used to de/encode JSON
289 strings. All boolean flags described below are by default disabled.
290
291 The mutators for flags all return the JSON object again and thus
292 calls can be chained:
293
294 my $json = Cpanel::JSON::XS->new->utf8->space_after->encode ({a => [1,2]})
295 => {"a": [1, 2]}
296
297 $json = $json->ascii ([$enable])
298 $enabled = $json->get_ascii
299 If $enable is true (or missing), then the "encode" method will not
300 generate characters outside the code range 0..127 (which is ASCII).
Any Unicode characters outside that range will be escaped using
either a single "\uXXXX" (BMP characters) or a double
"\uHHHH\uLLLL" escape sequence, as per RFC4627. The resulting
304 encoded JSON text can be treated as a native Unicode string, an
305 ascii-encoded, latin1-encoded or UTF-8 encoded string, or any other
306 superset of ASCII.
307
308 If $enable is false, then the "encode" method will not escape
309 Unicode characters unless required by the JSON syntax or other
310 flags. This results in a faster and more compact format.
311
312 See also the section ENCODING/CODESET FLAG NOTES later in this
313 document.
314
315 The main use for this flag is to produce JSON texts that can be
316 transmitted over a 7-bit channel, as the encoded JSON texts will
317 not contain any 8 bit characters.
318
319 Cpanel::JSON::XS->new->ascii (1)->encode ([chr 0x10401])
320 => ["\ud801\udc01"]
321
322 $json = $json->latin1 ([$enable])
323 $enabled = $json->get_latin1
324 If $enable is true (or missing), then the "encode" method will
325 encode the resulting JSON text as latin1 (or ISO-8859-1), escaping
326 any characters outside the code range 0..255. The resulting string
327 can be treated as a latin1-encoded JSON text or a native Unicode
328 string. The "decode" method will not be affected in any way by this
329 flag, as "decode" by default expects Unicode, which is a strict
330 superset of latin1.
331
332 If $enable is false, then the "encode" method will not escape
333 Unicode characters unless required by the JSON syntax or other
334 flags.
335
336 See also the section ENCODING/CODESET FLAG NOTES later in this
337 document.
338
339 The main use for this flag is efficiently encoding binary data as
340 JSON text, as most octets will not be escaped, resulting in a
341 smaller encoded size. The disadvantage is that the resulting JSON
342 text is encoded in latin1 (and must correctly be treated as such
343 when storing and transferring), a rare encoding for JSON. It is
344 therefore most useful when you want to store data structures known
345 to contain binary data efficiently in files or databases, not when
346 talking to other JSON encoders/decoders.
347
Cpanel::JSON::XS->new->latin1->encode (["\x{89}\x{abc}"])
=> ["\x{89}\\u0abc"] # (perl syntax, U+abc escaped, U+89 not)
350
351 $json = $json->binary ([$enable])
$enabled = $json->get_binary
353 If the $enable argument is true (or missing), then the "encode"
354 method will not try to detect an UTF-8 encoding in any JSON string,
355 it will strictly interpret it as byte sequence. The result might
356 contain new "\xNN" sequences, which is unparsable JSON. The
357 "decode" method forbids "\uNNNN" sequences and accepts "\xNN" and
358 octal "\NNN" sequences.
359
There is also special logic for perl 5.6 and utf8. 5.6 encodes
any string to utf-8 automatically when seeing a codepoint >= 0x80
and < 0x100. With the binary flag enabled, the perl utf8 encoded
string is decoded to the original byte encoding and then encoded
with "\xNN" escapes. This results in the same encodings as with
newer perls. But note that binary multi-byte codepoints with 5.6
will result in "illegal unicode character in binary string" errors,
unlike with newer perls.
368
If $enable is false, then the "encode" method will smartly try to
detect Unicode characters and will not escape them unless required
by the JSON syntax or other flags, and hex and octal sequences are
forbidden.
372
373 See also the section ENCODING/CODESET FLAG NOTES later in this
374 document.
375
376 The main use for this flag is to avoid the smart unicode detection
377 and possible double encoding. The disadvantage is that the
378 resulting JSON text is encoded in new "\xNN" and in latin1
379 characters and must correctly be treated as such when storing and
380 transferring, a rare encoding for JSON. It will produce non-
381 readable JSON strings in the browser. It is therefore most useful
382 when you want to store data structures known to contain binary data
383 efficiently in files or databases, not when talking to other JSON
384 encoders/decoders. The binary decoding method can also be used
385 when an encoder produced a non-JSON conformant hex or octal
386 encoding "\xNN" or "\NNN".
387
388 Cpanel::JSON::XS->new->binary->encode (["\x{89}\x{abc}"])
389 5.6: Error: malformed or illegal unicode character in binary string
390 >=5.8: ['\x89\xe0\xaa\xbc']
391
392 Cpanel::JSON::XS->new->binary->encode (["\x{89}\x{bc}"])
393 => ["\x89\xbc"]
394
Cpanel::JSON::XS->new->binary->decode ('["\x89\ua001"]')
Error: malformed or illegal unicode character in binary string

Cpanel::JSON::XS->new->decode ('["\x89"]')
399 Error: illegal hex character in non-binary string
400
401 $json = $json->utf8 ([$enable])
402 $enabled = $json->get_utf8
If $enable is true (or missing), then the "encode" method will
encode the JSON result into UTF-8, as required by many protocols,
while the "decode" method expects to be handed a UTF-8-encoded
string. Please note that UTF-8-encoded strings do not contain any
407 characters outside the range 0..255, they are thus useful for
408 bytewise/binary I/O. In future versions, enabling this option might
409 enable autodetection of the UTF-16 and UTF-32 encoding families, as
410 described in RFC4627.
411
If $enable is false, then the "encode" method will return the JSON
string as a (non-encoded) Unicode string, while "decode" thus
expects a Unicode string. Any decoding or encoding (e.g. to UTF-8 or
UTF-16) needs to be done by you, e.g. using the Encode module.
416
417 See also the section ENCODING/CODESET FLAG NOTES later in this
418 document.
419
420 Example, output UTF-16BE-encoded JSON:
421
422 use Encode;
423 $jsontext = encode "UTF-16BE", Cpanel::JSON::XS->new->encode ($object);
424
425 Example, decode UTF-32LE-encoded JSON:
426
427 use Encode;
428 $object = Cpanel::JSON::XS->new->decode (decode "UTF-32LE", $jsontext);
429
430 $json = $json->pretty ([$enable])
431 This enables (or disables) all of the "indent", "space_before" and
432 "space_after" (and in the future possibly more) flags in one call
433 to generate the most readable (or most compact) form possible.
434
435 Example, pretty-print some simple structure:
436
my $json = Cpanel::JSON::XS->new->pretty(1)->encode ({a => [1,2]})
=>
{
   "a" : [
      1,
      2
   ]
}
445
446 $json = $json->indent ([$enable])
447 $enabled = $json->get_indent
448 If $enable is true (or missing), then the "encode" method will use
449 a multiline format as output, putting every array member or
450 object/hash key-value pair into its own line, indenting them
451 properly.
452
453 If $enable is false, no newlines or indenting will be produced, and
454 the resulting JSON text is guaranteed not to contain any
455 "newlines".
456
457 This setting has no effect when decoding JSON texts.
458
459 $json = $json->indent_length([$number_of_spaces])
460 $length = $json->get_indent_length()
Set the indent length (default 3). This option is only useful when
you also enable indent or pretty. The acceptable range is from 0
(no indentation) to 15.
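
A small sketch of combining "indent" with "indent_length", assuming a release that provides the indent_length method as described above; it should print a nested array indented by two spaces per level:

```perl
use strict;
use warnings;
use Cpanel::JSON::XS;

# Enable multiline output and shrink the indent step from 3 to 2.
my $json = Cpanel::JSON::XS->new->indent->indent_length(2);

my $indented = $json->encode([1, [2]]);
print $indented;
```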
464
465 $json = $json->space_before ([$enable])
466 $enabled = $json->get_space_before
467 If $enable is true (or missing), then the "encode" method will add
468 an extra optional space before the ":" separating keys from values
469 in JSON objects.
470
471 If $enable is false, then the "encode" method will not add any
472 extra space at those places.
473
474 This setting has no effect when decoding JSON texts. You will also
475 most likely combine this setting with "space_after".
476
477 Example, space_before enabled, space_after and indent disabled:
478
479 {"key" :"value"}
480
481 $json = $json->space_after ([$enable])
482 $enabled = $json->get_space_after
483 If $enable is true (or missing), then the "encode" method will add
484 an extra optional space after the ":" separating keys from values
485 in JSON objects and extra whitespace after the "," separating key-
486 value pairs and array members.
487
488 If $enable is false, then the "encode" method will not add any
489 extra space at those places.
490
491 This setting has no effect when decoding JSON texts.
492
493 Example, space_before and indent disabled, space_after enabled:
494
495 {"key": "value"}
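
The two spacing flags can be compared side by side. The following sketch encodes the same single-key hash four ways (the variable names are illustrative only):

```perl
use strict;
use warnings;
use Cpanel::JSON::XS;

my %pair = (key => "value");

my $compact = Cpanel::JSON::XS->new->encode(\%pair);
my $before  = Cpanel::JSON::XS->new->space_before->encode(\%pair);
my $after   = Cpanel::JSON::XS->new->space_after->encode(\%pair);
my $both    = Cpanel::JSON::XS->new->space_before->space_after->encode(\%pair);

print "$_\n" for $compact, $before, $after, $both;
# {"key":"value"}
# {"key" :"value"}
# {"key": "value"}
# {"key" : "value"}
```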
496
497 $json = $json->relaxed ([$enable])
498 $enabled = $json->get_relaxed
If $enable is true (or missing), then "decode" will accept some
extensions to normal JSON syntax (see below). "encode" will not be
affected in any way. Be aware that this option makes you accept
invalid JSON texts as if they were valid! I suggest using this
option only to parse application-specific files written by humans
(configuration files, resource files etc.).
505
506 If $enable is false (the default), then "decode" will only accept
507 valid JSON texts.
508
509 Currently accepted extensions are:
510
511 · list items can have an end-comma
512
513 JSON separates array elements and key-value pairs with commas.
514 This can be annoying if you write JSON texts manually and want
515 to be able to quickly append elements, so this extension
516 accepts comma at the end of such items not just between them:
517
[
   1,
   2, <- this comma not normally allowed
]
{
   "k1": "v1",
   "k2": "v2", <- this comma not normally allowed
}
526
527 · shell-style '#'-comments
528
529 Whenever JSON allows whitespace, shell-style comments are
530 additionally allowed. They are terminated by the first
531 carriage-return or line-feed character, after which more white-
532 space and comments are allowed.
533
534 [
535 1, # this comment not allowed in JSON
536 # neither this one...
537 ]
538
539 · literal ASCII TAB characters in strings
540
Literal ASCII TAB characters are now allowed in strings (and
treated as "\t") in relaxed mode, even though JSON mandates that
a TAB character be written as the "\t" escape sequence.
544
545 [
546 "Hello\tWorld",
547 "Hello<TAB>World", # literal <TAB> would not normally be allowed
548 ]
549
550 · allow_singlequote
551
552 Single quotes are accepted instead of double quotes. See the
553 "allow_singlequote" option.
554
555 { "foo":'bar' }
556 { 'foo':"bar" }
557 { 'foo':'bar' }
558
559 · allow_barekey
560
561 Accept unquoted object keys instead of with mandatory double
562 quotes. See the "allow_barekey" option.
563
564 { foo:"bar" }
565
566 · allow_dupkeys
567
568 Allow decoding of duplicate keys in hashes. By default
569 duplicate keys are forbidden. See
570 <http://seriot.ch/parsing_json.php#24>: RFC 7159 section 4:
571 "The names within an object should be unique." See the
572 "allow_dupkeys" option.
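
As an illustration of the extensions listed above, the sketch below decodes a hand-written configuration text (a hypothetical example, with made-up keys) that uses a comment and trailing commas, and shows that a strict decoder rejects the same text:

```perl
use strict;
use warnings;
use Cpanel::JSON::XS;

my $config_text = <<'JSON';
{
   # comments and trailing commas are tolerated in relaxed mode
   "hosts": ["alpha", "beta",],
   "retries": 3,
}
JSON

# Relaxed mode accepts the human-friendly extensions.
my $config = Cpanel::JSON::XS->new->relaxed->decode($config_text);

# The default (strict) decoder croaks on the same input.
my $strict_ok = eval { Cpanel::JSON::XS->new->decode($config_text); 1 } ? 1 : 0;

print "hosts: @{ $config->{hosts} }, strict accepted: $strict_ok\n";
```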
573
574 $json = $json->canonical ([$enable])
575 $enabled = $json->get_canonical
If $enable is true (or missing), then the "encode" method will
output JSON objects by sorting their keys. This adds a
comparatively high overhead.
579
580 If $enable is false, then the "encode" method will output key-value
581 pairs in the order Perl stores them (which will likely change
582 between runs of the same script, and can change even within the
583 same run from 5.18 onwards).
584
This option is useful if you want the same data structure to be
encoded as the same JSON text (given the same overall settings). If
it is disabled, the same hash might be encoded differently even if
it contains the same data, as key-value pairs have no inherent
ordering in Perl.
590
591 This setting has no effect when decoding JSON texts.
592
593 This setting has currently no effect on tied hashes.
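
A brief sketch of canonical output: with the flag enabled, the keys of a hash are emitted in sorted order regardless of how Perl happens to store them (the hash contents are arbitrary):

```perl
use strict;
use warnings;
use Cpanel::JSON::XS;

my %unordered = (b => 1, a => 2, c => 3);

# Keys come out sorted, so the text is stable across runs.
my $sorted = Cpanel::JSON::XS->new->canonical->encode(\%unordered);
print "$sorted\n";    # {"a":2,"b":1,"c":3}
```

Stable output of this kind is convenient when JSON texts are compared, hashed, or kept under version control.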
594
595 $json = $json->sort_by (undef, 0, 1 or a block)
596 This currently only (un)sets the "canonical" option, and ignores
597 custom sort blocks.
598
599 This setting has no effect when decoding JSON texts.
600
601 This setting has currently no effect on tied hashes.
602
603 $json = $json->escape_slash ([$enable])
604 $enabled = $json->get_escape_slash
According to the JSON grammar, the forward slash character (U+002F)
"/" may be escaped as "\/", but escaping it is not required. By
default, strings are encoded without escaping slashes in all Perl
JSON encoders.
608
609 If $enable is true (or missing), then "encode" will escape slashes,
610 "\/".
611
612 This setting has no effect when decoding JSON texts.
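
A minimal sketch comparing the default output with escape_slash enabled (the string "a/b" is just sample data):

```perl
use strict;
use warnings;
use Cpanel::JSON::XS;

my $plain   = Cpanel::JSON::XS->new->encode(["a/b"]);
my $escaped = Cpanel::JSON::XS->new->escape_slash->encode(["a/b"]);

print "$plain\n";      # ["a/b"]
print "$escaped\n";    # ["a\/b"]
```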
613
614 $json = $json->unblessed_bool ([$enable])
615 $enabled = $json->get_unblessed_bool
618 If $enable is true (or missing), then "decode" will return Perl
619 non-object boolean variables (1 and 0) for JSON booleans ("true"
620 and "false"). If $enable is false, then "decode" will return
621 "Cpanel::JSON::XS::Boolean" objects for JSON booleans.
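
The effect of unblessed_bool on decoding can be sketched as follows, assuming a release that provides this option; with the flag set the booleans come back as plain scalars, otherwise as objects:

```perl
use strict;
use warnings;
use Cpanel::JSON::XS;

my $text = '[true,false]';

# Plain Perl scalars instead of boolean objects.
my $plain = Cpanel::JSON::XS->new->unblessed_bool->decode($text);

# Default behaviour: blessed boolean objects.
my $objects = Cpanel::JSON::XS->new->decode($text);

printf "plain is a reference: %s, default is a reference: %s\n",
    (ref $plain->[0] ? "yes" : "no"), (ref $objects->[0] ? "yes" : "no");
```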
622
623 $json = $json->allow_singlequote ([$enable])
624 $enabled = $json->get_allow_singlequote
If $enable is true (or missing), then "decode" will accept JSON
strings quoted with single quotes, which is not valid JSON.
629
$json->allow_singlequote->decode(q({"foo":'bar'}));
$json->allow_singlequote->decode(q({'foo':"bar"}));
$json->allow_singlequote->decode(q({'foo':'bar'}));

This is also enabled with "relaxed". As with the "relaxed"
option, this option may be used to parse application-specific files
written by humans.
637
638 $json = $json->allow_barekey ([$enable])
639 $enabled = $json->get_allow_barekey
If $enable is true (or missing), then "decode" will accept bare
(unquoted) keys in JSON objects, which is not valid JSON.
644
645 Same as with the "relaxed" option, this option may be used to parse
646 application-specific files written by humans.
647
648 $json->allow_barekey->decode('{foo:"bar"}');
649
650 $json = $json->allow_bignum ([$enable])
651 $enabled = $json->get_allow_bignum
If $enable is true (or missing), then "decode" will convert big
integers that Perl cannot handle as integers into Math::BigInt
objects and convert any floating-point number into a Math::BigFloat
object.

Conversely, "encode" converts "Math::BigInt" and "Math::BigFloat"
objects into JSON numbers when "allow_blessed" is enabled.
661
662 $json->allow_nonref->allow_blessed->allow_bignum;
663 $bigfloat = $json->decode('2.000000000000000000000000001');
664 print $json->encode($bigfloat);
665 # => 2.000000000000000000000000001
666
667 See "MAPPING" about the normal conversion of JSON number.
668
669 $json = $json->allow_bigint ([$enable])
670 This option is obsolete and replaced by allow_bignum.
671
672 $json = $json->allow_nonref ([$enable])
673 $enabled = $json->get_allow_nonref
674 If $enable is true (or missing), then the "encode" method can
675 convert a non-reference into its corresponding string, number or
676 null JSON value, which is an extension to RFC4627. Likewise,
677 "decode" will accept those JSON values instead of croaking.
678
679 If $enable is false, then the "encode" method will croak if it
680 isn't passed an arrayref or hashref, as JSON texts must either be
681 an object or array. Likewise, "decode" will croak if given
682 something that is not a JSON object or array.
683
684 Example, encode a Perl scalar as JSON value with enabled
685 "allow_nonref", resulting in an invalid JSON text:
686
687 Cpanel::JSON::XS->new->allow_nonref->encode ("Hello, World!")
688 => "Hello, World!"
689
690 $json = $json->allow_unknown ([$enable])
691 $enabled = $json->get_allow_unknown
If $enable is true (or missing), then "encode" will not throw an
exception when it encounters values it cannot represent in JSON
(for example, filehandles) but instead will encode a JSON "null"
value. Note that blessed objects are not included here and are
handled separately by "allow_blessed" and "convert_blessed".
697
698 If $enable is false (the default), then "encode" will throw an
699 exception when it encounters anything it cannot encode as JSON.
700
701 This option does not affect "decode" in any way, and it is
702 recommended to leave it off unless you know your communications
703 partner.
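
As a hedged sketch of allow_unknown, the example below uses a code reference as the unrepresentable value (an assumption chosen for illustration; the text above names filehandles as an example). With the flag it is expected to be encoded as null, and without it encode croaks:

```perl
use strict;
use warnings;
use Cpanel::JSON::XS;

my @values = (1, sub { 42 });    # a code reference has no JSON form

# With allow_unknown the unrepresentable value becomes null.
my $lenient = Cpanel::JSON::XS->new->allow_unknown->encode(\@values);

# Without it, encode throws, so the call is wrapped in eval.
my $strict_ok = eval { Cpanel::JSON::XS->new->encode(\@values); 1 } ? 1 : 0;

print "lenient: $lenient, strict accepted: $strict_ok\n";
```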
704
705 $json = $json->allow_stringify ([$enable])
706 $enabled = $json->get_allow_stringify
707 If $enable is true (or missing), then "encode" will stringify the
708 non-object perl value or reference. Note that blessed objects are
709 not included here and are handled separately by "allow_blessed" and
710 "convert_blessed". String references are stringified to the string
711 value, other references as in perl.
712
713 This option does not affect "decode" in any way.
714
715 This option is special to this module, it is not supported by other
716 encoders. So it is not recommended to use it.
717
718 $json = $json->allow_dupkeys ([$enable])
719 $enabled = $json->get_allow_dupkeys
720 If $enable is true (or missing), then the "decode" method will not
721 die when it encounters duplicate keys in a hash. "allow_dupkeys"
722 is also enabled in the "relaxed" mode.
723
The JSON spec allows duplicate names in objects but recommends
against them. Perl hashes cannot hold duplicate keys, so when
duplicates are allowed, decoding silently keeps only the last value
found for each name.
728
729 See <http://seriot.ch/parsing_json.php#24>: RFC 7159 section 4:
730 "The names within an object should be unique."
731
732 $json = $json->allow_blessed ([$enable])
733 $enabled = $json->get_allow_blessed
734 If $enable is true (or missing), then the "encode" method will not
735 barf when it encounters a blessed reference. Instead, the value of
736 the convert_blessed option will decide whether "null"
737 ("convert_blessed" disabled or no "TO_JSON" method found) or a
738 representation of the object ("convert_blessed" enabled and
739 "TO_JSON" method found) is being encoded. Has no effect on
740 "decode".
741
742 If $enable is false (the default), then "encode" will throw an
743 exception when it encounters a blessed object.
744
745 This setting has no effect on "decode".
746
747 $json = $json->convert_blessed ([$enable])
748 $enabled = $json->get_convert_blessed
749 If $enable is true (or missing), then "encode", upon encountering a
750 blessed object, will check for the availability of the "TO_JSON"
751 method on the object's class. If found, it will be called in scalar
752 context and the resulting scalar will be encoded instead of the
753 object. If no "TO_JSON" method is found, a stringification overload
754 method is tried next. If both are not found, the value of
755 "allow_blessed" will decide what to do.
756
757 The "TO_JSON" method may safely call die if it wants. If "TO_JSON"
758 returns other blessed objects, those will be handled in the same
759 way. "TO_JSON" must take care of not causing an endless recursion
760 cycle (== crash) in this case. The name of "TO_JSON" was chosen
761 because other methods called by the Perl core (== not by the user
762 of the object) are usually in upper case letters and to avoid
763 collisions with any "to_json" function or method.
764
765 If $enable is false (the default), then "encode" will not consider
766 this type of conversion.
767
768 This setting has no effect on "decode".
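
The interplay of "convert_blessed" and "allow_blessed" can be sketched with a small class. The Point package and its fields are hypothetical and exist only for this example:

```perl
use strict;
use warnings;
use Cpanel::JSON::XS;

{
    package Point;    # hypothetical class used only for illustration

    sub new {
        my ($class, %args) = @_;
        return bless {%args}, $class;
    }

    # Called by encode when convert_blessed is enabled.
    sub TO_JSON {
        my ($self) = @_;
        return { x => $self->{x}, y => $self->{y} };
    }
}

my $point = Point->new(x => 1, y => 2);

# convert_blessed: the result of TO_JSON is encoded instead of the object.
my $converted = Cpanel::JSON::XS->new->canonical->convert_blessed->encode([$point]);

# allow_blessed alone: the object is encoded as null.
my $nulled = Cpanel::JSON::XS->new->allow_blessed->encode([$point]);

print "$converted\n";    # [{"x":1,"y":2}]
print "$nulled\n";       # [null]
```

The canonical flag is used here only to make the key order of the output predictable.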
769
770 $json = $json->allow_tags ([$enable])
771 $enabled = $json->get_allow_tags
772 See "OBJECT SERIALIZATION" for details.
773
774 If $enable is true (or missing), then "encode", upon encountering a
775 blessed object, will check for the availability of the "FREEZE"
776 method on the object's class. If found, it will be used to
777 serialize the object into a nonstandard tagged JSON value (that
778 JSON decoders cannot decode).
779
780 It also causes "decode" to parse such tagged JSON values and
781 deserialize them via a call to the "THAW" method.
782
783 If $enable is false (the default), then "encode" will not consider
784 this type of conversion, and tagged JSON values will cause a parse
785 error in "decode", as if tags were not part of the grammar.
786
787 $json = $json->filter_json_object ([$coderef->($hashref)])
788 When $coderef is specified, it will be called from "decode" each
789 time it decodes a JSON object. The only argument is a reference to
the newly-created hash. If the code reference returns a single
791 scalar (which need not be a reference), this value (i.e. a copy of
792 that scalar to avoid aliasing) is inserted into the deserialized
793 data structure. If it returns an empty list (NOTE: not "undef",
794 which is a valid scalar), the original deserialized hash will be
795 inserted. This setting can slow down decoding considerably.
796
797 When $coderef is omitted or undefined, any existing callback will
798 be removed and "decode" will not change the deserialized hash in
799 any way.
800
Example, convert all JSON objects into the integer 5:

    my $js = Cpanel::JSON::XS->new->filter_json_object (sub { 5 });
    # returns [5]
    $js->decode ('[{}]');
    # throws an exception because allow_nonref is not enabled,
    # so a lone 5 is not allowed.
    $js->decode ('{"a":1, "b":2}');
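
The difference between returning a scalar and returning the empty
list can be sketched as follows. The "celsius" key is a made-up
convention for this illustration only, not something the module
knows about:

```perl
use strict;
use warnings;
use Cpanel::JSON::XS;

# Hypothetical convention: objects carrying a "celsius" key are
# replaced by the bare number; every other object is left alone.
my $json = Cpanel::JSON::XS->new->filter_json_object (sub {
    my ($hash) = @_;
    return $hash->{celsius} if exists $hash->{celsius};
    return;    # empty list: keep the original hash
});

my $data = $json->decode ('[{"celsius":21.5},{"name":"probe"}]');

# $data->[0] is now a plain number, $data->[1] is still a hashref
printf "%s %s\n", $data->[0], ref $data->[1];
```

Note the bare "return;" in the callback: returning "undef" instead
would replace every other object with a JSON null.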
809
810 $json = $json->filter_json_single_key_object ($key [=>
811 $coderef->($value)])
Works somewhat like "filter_json_object", but is only called
for JSON objects having a single key named $key.
814
815 This $coderef is called before the one specified via
816 "filter_json_object", if any. It gets passed the single value in
817 the JSON object. If it returns a single value, it will be inserted
818 into the data structure. If it returns nothing (not even "undef"
819 but the empty list), the callback from "filter_json_object" will be
820 called next, as if no single-key callback were specified.
821
822 If $coderef is omitted or undefined, the corresponding callback
823 will be disabled. There can only ever be one callback for a given
824 key.
825
As this callback gets called less often than the
"filter_json_object" one, decoding speed will not usually suffer as
much. Therefore, single-key objects make excellent targets to
serialize Perl objects into, especially as single-key JSON objects
are as close to the type-tagged value concept as JSON gets (it's
basically an ID/VALUE tuple). Of course, JSON does not support this
in any way, so you need to make sure your data never looks like a
serialized Perl hash.
834
835 Typical names for the single object key are "__class_whatever__",
836 or "$__dollars_are_rarely_used__$" or "}ugly_brace_placement", or
837 even things like "__class_md5sum(classname)__", to reduce the risk
838 of clashing with real hashes.
839
840 Example, decode JSON objects of the form "{ "__widget__" => <id> }"
841 into the corresponding $WIDGET{<id>} object:
842
    # return whatever is in $WIDGET{5}:
    Cpanel::JSON::XS
       ->new
       ->filter_json_single_key_object (__widget__ => sub {
             $WIDGET{ $_[0] }
          })
       ->decode ('{"__widget__": 5}')

    # this can be used with a TO_JSON method in some "widget" class
    # for serialization to json:
    sub WidgetBase::TO_JSON {
       my ($self) = @_;

       unless ($self->{id}) {
          $self->{id} = ..get..some..id..;
          $WIDGET{$self->{id}} = $self;
       }

       return { __widget__ => $self->{id} };
    }
863
864 $json = $json->shrink ([$enable])
865 $enabled = $json->get_shrink
866 Perl usually over-allocates memory a bit when allocating space for
867 strings. This flag optionally resizes strings generated by either
868 "encode" or "decode" to their minimum size possible. This can save
869 memory when your JSON texts are either very very long or you have
870 many short strings. It will also try to downgrade any strings to
871 octet-form if possible: perl stores strings internally either in an
872 encoding called UTF-X or in octet-form. The latter cannot store
873 everything but uses less space in general (and some buggy Perl or C
874 code might even rely on that internal representation being used).
875
876 The actual definition of what shrink does might change in future
877 versions, but it will always try to save space at the expense of
878 time.
879
880 If $enable is true (or missing), the string returned by "encode"
881 will be shrunk-to-fit, while all strings generated by "decode" will
882 also be shrunk-to-fit.
883
884 If $enable is false, then the normal perl allocation algorithms are
885 used. If you work with your data, then this is likely to be
886 faster.
887
888 In the future, this setting might control other things, such as
889 converting strings that look like integers or floats into integers
890 or floats internally (there is no difference on the Perl level),
891 saving space.
892
893 $json = $json->max_depth ([$maximum_nesting_depth])
894 $max_depth = $json->get_max_depth
895 Sets the maximum nesting level (default 512) accepted while
896 encoding or decoding. If a higher nesting level is detected in JSON
897 text or a Perl data structure, then the encoder and decoder will
898 stop and croak at that point.
899
900 Nesting level is defined by number of hash- or arrayrefs that the
901 encoder needs to traverse to reach a given point or the number of
902 "{" or "[" characters without their matching closing parenthesis
903 crossed to reach a given character in a string.
904
Setting the maximum depth to one disallows any nesting, which
ensures that the object is only a single hash/object or array.
907
908 If no argument is given, the highest possible setting will be used,
909 which is rarely useful.
910
911 Note that nesting is implemented by recursion in C. The default
912 value has been chosen to be as large as typical operating systems
913 allow without crashing.
914
915 See SECURITY CONSIDERATIONS, below, for more info on why this is
916 useful.
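
As a minimal sketch of the depth limit in action (the variable names
are arbitrary), a decoder restricted to depth one accepts a flat
array but croaks on a nested one:

```perl
use strict;
use warnings;
use Cpanel::JSON::XS;

my $flat = Cpanel::JSON::XS->new->max_depth (1);

# a single array with no nesting is within the limit
my $ok = $flat->decode ('[1,2,3]');

# one more level of nesting exceeds it, so decode croaks
my $nested = eval { $flat->decode ('[[1]]') };
my $error  = $@;

print "flat array has ", scalar @$ok, " elements\n";
print "nested decode failed: $error" unless defined $nested;
```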
917
918 $json = $json->max_size ([$maximum_string_size])
919 $max_size = $json->get_max_size
Set the maximum length a JSON text may have (in bytes) where
decoding is being attempted. The default is 0, meaning no limit.
When "decode" is called on a string that is longer than this many
bytes, it will not attempt to decode the string but throw an
exception. This setting has no effect on "encode" (yet).
925
926 If no argument is given, the limit check will be deactivated (same
927 as when 0 is specified).
928
929 See "SECURITY CONSIDERATIONS", below, for more info on why this is
930 useful.
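
A short sketch of the size limit, with an arbitrary limit of 16
bytes chosen purely for illustration:

```perl
use strict;
use warnings;
use Cpanel::JSON::XS;

my $json = Cpanel::JSON::XS->new->max_size (16);

# 7 bytes: below the limit, decoded normally
my $small = $json->decode ('[1,2,3]');

# far more than 16 bytes: rejected before any parsing is attempted
my $long_text = '[' . join (',', 1 .. 100) . ']';
my $big   = eval { $json->decode ($long_text) };
my $error = $@;

print "small text decoded, ", scalar @$small, " elements\n";
print "long text rejected: $error" unless defined $big;
```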
931
932 $json->stringify_infnan ([$infnan_mode = 1])
933 $infnan_mode = $json->get_stringify_infnan
934 Get or set how Cpanel::JSON::XS encodes "inf", "-inf" or "nan" for
935 numeric values. Also qnan, snan or negative nan on some platforms.
936
937 "null": infnan_mode = 0. Similar to most JSON modules in other
938 languages. Always null.
939
940 stringified: infnan_mode = 1. As in Mojo::JSON. Platform specific
941 strings. Stringified via sprintf(%g), with double quotes.
942
943 inf/nan: infnan_mode = 2. As in JSON::XS, and older releases.
944 Passes through platform dependent values, invalid JSON. Stringified
945 via sprintf(%g), but without double quotes.
946
947 "inf/-inf/nan": infnan_mode = 3. Platform independent inf/nan/-inf
948 strings. No QNAN/SNAN/negative NAN support, unified to "nan". Much
949 easier to detect, but may conflict with valid strings.
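
The modes can be compared side by side with a sketch like the
following. It only relies on the mode numbers listed above; the
exact strings printed for mode 3 are left to the module, so run it
to see what your installed version produces:

```perl
use strict;
use warnings;
use Cpanel::JSON::XS;

my $inf = 9**9**9;    # overflows to infinity

# mode 0: infinities are encoded as null
my $null_coder = Cpanel::JSON::XS->new;
$null_coder->stringify_infnan (0);
my $as_null = $null_coder->encode ([$inf, -$inf]);

# mode 3: platform independent, quoted strings
my $named_coder = Cpanel::JSON::XS->new;
$named_coder->stringify_infnan (3);
my $as_names = $named_coder->encode ([$inf, -$inf]);

print "mode 0: $as_null\n";
print "mode 3: $as_names\n";
```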
950
951 $json_text = $json->encode ($perl_scalar)
952 Converts the given Perl data structure (a simple scalar or a
953 reference to a hash or array) to its JSON representation. Simple
954 scalars will be converted into JSON string or number sequences,
955 while references to arrays become JSON arrays and references to
956 hashes become JSON objects. Undefined Perl values (e.g. "undef")
957 become JSON "null" values. Neither "true" nor "false" values will
958 be generated.
959
960 $perl_scalar = $json->decode ($json_text)
961 The opposite of "encode": expects a JSON text and tries to parse
962 it, returning the resulting simple scalar or reference. Croaks on
963 error.
964
965 JSON numbers and strings become simple Perl scalars. JSON arrays
966 become Perl arrayrefs and JSON objects become Perl hashrefs. "true"
967 becomes 1, "false" becomes 0 and "null" becomes "undef".
968
969 ($perl_scalar, $characters) = $json->decode_prefix ($json_text)
970 This works like the "decode" method, but instead of raising an
971 exception when there is trailing garbage after the first JSON
972 object, it will silently stop parsing there and return the number
973 of characters consumed so far.
974
975 This is useful if your JSON texts are not delimited by an outer
976 protocol and you need to know where the JSON text ends.
977
978 Cpanel::JSON::XS->new->decode_prefix ("[1] the tail")
979 => ([1], 3)
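
Building on the example above, here is a hedged sketch of how
"decode_prefix" could be used to split a string of back-to-back JSON
texts. The input string is invented for the illustration, and the
incremental parser described later is usually the better tool for
this job:

```perl
use strict;
use warnings;
use Cpanel::JSON::XS;

my $json = Cpanel::JSON::XS->new;
my $text = '[1][2,3]{"a":4}';

my @values;
while (length $text) {
    # decode one JSON text and learn how many characters it used
    my ($value, $consumed) = $json->decode_prefix ($text);
    push @values, $value;

    # drop the part that was just parsed
    substr ($text, 0, $consumed, '');
}

print scalar (@values), " JSON texts found\n";
```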
980
981 $json->to_json ($perl_hash_or_arrayref)
982 Deprecated method for perl 5.8 and newer. Use encode_json instead.
983
984 $json->from_json ($utf8_encoded_json_text)
985 Deprecated method for perl 5.8 and newer. Use decode_json instead.
986
988 In some cases, there is the need for incremental parsing of JSON texts.
989 While this module always has to keep both JSON text and resulting Perl
990 data structure in memory at one time, it does allow you to parse a JSON
991 stream incrementally. It does so by accumulating text until it has a
992 full JSON object, which it then can decode. This process is similar to
993 using "decode_prefix" to see if a full JSON object is available, but is
994 much more efficient (and can be implemented with a minimum of method
995 calls).
996
Cpanel::JSON::XS will only attempt to parse the JSON text once it is
sure it has enough text to get a decisive result, using a very simple
but truly incremental parser. This means that it sometimes won't stop
as early as the full parser, for example, it doesn't detect mismatched
parentheses. The only thing it guarantees is that it starts decoding as
soon as a syntactically valid JSON text has been seen. This means you
need to set resource limits (e.g. "max_size") to ensure the parser will
stop parsing in the presence of syntax errors.
1005
1006 The following methods implement this incremental parser.
1007
1008 [void, scalar or list context] = $json->incr_parse ([$string])
1009 This is the central parsing function. It can both append new text
1010 and extract objects from the stream accumulated so far (both of
1011 these functions are optional).
1012
1013 If $string is given, then this string is appended to the already
1014 existing JSON fragment stored in the $json object.
1015
1016 After that, if the function is called in void context, it will
1017 simply return without doing anything further. This can be used to
1018 add more text in as many chunks as you want.
1019
1020 If the method is called in scalar context, then it will try to
1021 extract exactly one JSON object. If that is successful, it will
1022 return this object, otherwise it will return "undef". If there is a
1023 parse error, this method will croak just as "decode" would do (one
1024 can then use "incr_skip" to skip the erroneous part). This is the
1025 most common way of using the method.
1026
1027 And finally, in list context, it will try to extract as many
1028 objects from the stream as it can find and return them, or the
1029 empty list otherwise. For this to work, there must be no separators
1030 between the JSON objects or arrays, instead they must be
1031 concatenated back-to-back. If an error occurs, an exception will be
1032 raised as in the scalar context case. Note that in this case, any
1033 previously-parsed JSON texts will be lost.
1034
1035 Example: Parse some JSON arrays/objects in a given string and
1036 return them.
1037
1038 my @objs = Cpanel::JSON::XS->new->incr_parse ("[5][7][1,2]");
1039
1040 $lvalue_string = $json->incr_text (>5.8 only)
This method returns the currently stored JSON fragment as an
lvalue, that is, you can manipulate it. This only works when 1. a
preceding call to "incr_parse" in scalar context successfully
returned an object, and 2. only with Perl >= 5.8.

Under all other circumstances you must not call this function (I
mean it: although in simple tests it might actually work, it will
fail under real world conditions). As a special exception, you can
also call this method before having parsed anything.
1050
1051 This function is useful in two cases: a) finding the trailing text
1052 after a JSON object or b) parsing multiple JSON objects separated
1053 by non-JSON text (such as commas).
1054
1055 $json->incr_skip
This will reset the state of the incremental parser and will remove
the text parsed so far from the input buffer. This is useful after
"incr_parse" died, in which case the input buffer and incremental
parser state are left unchanged, to skip the text parsed so far and
to reset the parse state.
1061
1062 The difference to "incr_reset" is that only text until the parse
1063 error occurred is removed.
1064
1065 $json->incr_reset
1066 This completely resets the incremental parser, that is, after this
1067 call, it will be as if the parser had never parsed anything.
1068
1069 This is useful if you want to repeatedly parse JSON objects and
1070 want to ignore any trailing data, which means you have to reset the
1071 parser after each successful decode.
1072
1073 LIMITATIONS
1074 All options that affect decoding are supported, except "allow_nonref".
1075 The reason for this is that it cannot be made to work sensibly: JSON
1076 objects and arrays are self-delimited, i.e. you can concatenate them
1077 back to back and still decode them perfectly. This does not hold true
1078 for JSON numbers, however.
1079
For example, is the string 1 a single JSON number, or is it simply the
start of 12? Or is 12 a single JSON number, or the concatenation of 1
and 2? In neither case can you tell, and this is why Cpanel::JSON::XS
takes the conservative route and disallows this case.
1084
1085 EXAMPLES
1086 Some examples will make all this clearer. First, a simple example that
1087 works similarly to "decode_prefix": We want to decode the JSON object
1088 at the start of a string and identify the portion after the JSON
1089 object:
1090
    my $text = "[1,2,3] hello";

    my $json = Cpanel::JSON::XS->new;

    my $obj = $json->incr_parse ($text)
       or die "expected JSON object or array at beginning of string";

    my $tail = $json->incr_text;
    # $tail now contains " hello"
1100
1101 Easy, isn't it?
1102
1103 Now for a more complicated example: Imagine a hypothetical protocol
1104 where you read some requests from a TCP stream, and each request is a
1105 JSON array, without any separation between them (in fact, it is often
1106 useful to use newlines as "separators", as these get interpreted as
1107 whitespace at the start of the JSON text, which makes it possible to
1108 test said protocol with "telnet"...).
1109
1110 Here is how you'd do it (it is trivial to write this in an event-based
1111 manner):
1112
    my $json = Cpanel::JSON::XS->new;

    # read some data from the socket
    while (sysread $socket, my $buf, 4096) {

       # split and decode as many requests as possible
       for my $request ($json->incr_parse ($buf)) {
          # act on the $request
       }
    }
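
The same loop can be tried without a network connection. The
following sketch replaces the socket with a hard-coded list of
chunks (an assumption made only so the example is self-contained)
that deliberately split the requests at awkward places:

```perl
use strict;
use warnings;
use Cpanel::JSON::XS;

my $json = Cpanel::JSON::XS->new;

# stand-in for data arriving from a socket in arbitrary pieces
my @chunks = ('[1,', '2][3', ',4]', '[5]');

my @requests;
for my $chunk (@chunks) {
    # list context: append the chunk, then extract every request
    # that is complete so far (possibly none)
    push @requests, $json->incr_parse ($chunk);
}

print scalar (@requests), " requests decoded\n";
```

Incomplete requests simply stay in the parser's buffer until a later
chunk completes them.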
1123
1124 Another complicated example: Assume you have a string with JSON objects
1125 or arrays, all separated by (optional) comma characters (e.g. "[1],[2],
1126 [3]"). To parse them, we have to skip the commas between the JSON
1127 texts, and here is where the lvalue-ness of "incr_text" comes in
1128 useful:
1129
    my $text = "[1],[2], [3]";
    my $json = Cpanel::JSON::XS->new;

    # void context, so no parsing done
    $json->incr_parse ($text);

    # now extract as many objects as possible. note the
    # use of scalar context so incr_text can be called.
    while (my $obj = $json->incr_parse) {
       # do something with $obj

       # now skip the optional comma
       $json->incr_text =~ s/^ \s* , //x;
    }
1144
Now let's go for a very complex example: Assume that you have a gigantic
JSON array-of-objects, many gigabytes in size, and you want to parse
it, but you cannot load it into memory fully (this has actually
happened in the real world :).
1149
1150 Well, you lost, you have to implement your own JSON parser. But
1151 Cpanel::JSON::XS can still help you: You implement a (very simple)
1152 array parser and let JSON decode the array elements, which are all full
1153 JSON objects on their own (this wouldn't work if the array elements
1154 could be JSON numbers, for example):
1155
    my $json = Cpanel::JSON::XS->new;

    # open the monster
    open my $fh, "<", "bigfile.json"
       or die "bigfile: $!";

    # first parse the initial "["
    for (;;) {
       sysread $fh, my $buf, 65536
          or die "read error: $!";
       $json->incr_parse ($buf); # void context, so no parsing

       # Exit the loop once we found and removed(!) the initial "[".
       # In essence, we are (ab-)using the $json object as a simple scalar
       # we append data to.
       last if $json->incr_text =~ s/^ \s* \[ //x;
    }

    # now we have skipped the initial "[", so continue
    # parsing all the elements.
    for (;;) {
       # in this loop we read data until we got a single JSON object
       for (;;) {
          if (my $obj = $json->incr_parse) {
             # do something with $obj
             last;
          }

          # add more data
          sysread $fh, my $buf, 65536
             or die "read error: $!";
          $json->incr_parse ($buf); # void context, so no parsing
       }

       # in this loop we read data until we either found and parsed the
       # separating "," between elements, or the final "]"
       for (;;) {
          # first skip whitespace
          $json->incr_text =~ s/^\s*//;

          # if we find "]", we are done
          if ($json->incr_text =~ s/^\]//) {
             print "finished.\n";
             exit;
          }

          # if we find ",", we can continue with the next element
          if ($json->incr_text =~ s/^,//) {
             last;
          }

          # if we find anything else, we have a parse error!
          if (length $json->incr_text) {
             die "parse error near ", $json->incr_text;
          }

          # else add more data
          sysread $fh, my $buf, 65536
             or die "read error: $!";
          $json->incr_parse ($buf); # void context, so no parsing
       }
    }
1217
1218 This is a complex example, but most of the complexity comes from the
1219 fact that we are trying to be correct (bear with me if I am wrong, I
1220 never ran the above example :).
1221
All Unicode Byte Order Marks are detected on decode: UTF-8,
UTF-16LE, UTF-16BE, UTF-32LE and UTF-32BE.

The BOM encoding is set only for one specific decode call; it does not
change the state of the JSON object.
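
As a small, hedged illustration, the following sketch prepends the
three-byte UTF-8 BOM to a JSON text by hand and decodes it. It
assumes a Cpanel::JSON::XS release recent enough to include the BOM
detection described here:

```perl
use strict;
use warnings;
use Cpanel::JSON::XS qw(decode_json);

# the UTF-8 BOM is the byte sequence EF BB BF
my $bytes = "\xEF\xBB\xBF" . '[1,2,3]';

# the BOM is detected and skipped for this one call only
my $data = decode_json ($bytes);

print scalar (@$data), " elements decoded\n";
```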
1228
Warning: With perls older than 5.20 you need to load the Encode module
before decoding a text with a multibyte BOM, i.e. >= UTF-16. Otherwise
an error is thrown. This is an implementation limitation and might get
fixed later.
1232
1233 See <https://tools.ietf.org/html/rfc7159#section-8.1> "JSON text SHALL
1234 be encoded in UTF-8, UTF-16, or UTF-32."
1235
1236 "Implementations MUST NOT add a byte order mark to the beginning of a
1237 JSON text", "implementations (...) MAY ignore the presence of a byte
1238 order mark rather than treating it as an error".
1239
1240 See also <http://www.unicode.org/faq/utf_bom.html#BOM>.
1241
1242 Beware that Cpanel::JSON::XS is currently the only JSON module which
1243 does accept and decode a BOM.
1244
The latest JSON spec
<https://www.greenbytes.de/tech/webdav/rfc8259.html#character.encoding>
forbids the usage of UTF-16 or UTF-32, the character encoding is UTF-8.
Thus, in subsequent updates, UTF-16 or UTF-32 BOMs will throw an
error.
1250
1252 This section describes how Cpanel::JSON::XS maps Perl values to JSON
1253 values and vice versa. These mappings are designed to "do the right
1254 thing" in most circumstances automatically, preserving round-tripping
1255 characteristics (what you put in comes out as something equivalent).
1256
1257 For the more enlightened: note that in the following descriptions,
1258 lowercase perl refers to the Perl interpreter, while uppercase Perl
1259 refers to the abstract Perl language itself.
1260
1261 JSON -> PERL
1262 object
1263 A JSON object becomes a reference to a hash in Perl. No ordering of
1264 object keys is preserved (JSON does not preserve object key
1265 ordering itself).
1266
1267 array
1268 A JSON array becomes a reference to an array in Perl.
1269
1270 string
1271 A JSON string becomes a string scalar in Perl - Unicode codepoints
1272 in JSON are represented by the same codepoints in the Perl string,
1273 so no manual decoding is necessary.
1274
1275 number
1276 A JSON number becomes either an integer, numeric (floating point)
1277 or string scalar in perl, depending on its range and any fractional
1278 parts. On the Perl level, there is no difference between those as
1279 Perl handles all the conversion details, but an integer may take
1280 slightly less memory and might represent more values exactly than
1281 floating point numbers.
1282
1283 If the number consists of digits only, Cpanel::JSON::XS will try to
1284 represent it as an integer value. If that fails, it will try to
1285 represent it as a numeric (floating point) value if that is
1286 possible without loss of precision. Otherwise it will preserve the
1287 number as a string value (in which case you lose roundtripping
1288 ability, as the JSON number will be re-encoded to a JSON string).
1289
1290 Numbers containing a fractional or exponential part will always be
1291 represented as numeric (floating point) values, possibly at a loss
1292 of precision (in which case you might lose perfect roundtripping
1293 ability, but the JSON number will still be re-encoded as a JSON
1294 number).
1295
1296 Note that precision is not accuracy - binary floating point values
1297 cannot represent most decimal fractions exactly, and when
1298 converting from and to floating point, "Cpanel::JSON::XS" only
1299 guarantees precision up to but not including the least significant
1300 bit.
1301
1302 true, false
1303 These JSON atoms become "Cpanel::JSON::XS::true" and
1304 "Cpanel::JSON::XS::false", respectively. They are
1305 "JSON::PP::Boolean" objects and are overloaded to act almost
1306 exactly like the numbers 1 and 0. You can check whether a scalar is
1307 a JSON boolean by using the "Cpanel::JSON::XS::is_bool" function.
1308
The other way round, from perl to JSON, "!0" which is represented as
"yes" becomes "true", and "!1" which is represented as "no" becomes
"false".
1312
1313 Via Cpanel::JSON::XS::Type you can now even force negation in
1314 "encode", without overloading of "!":
1315
1316 my $false = Cpanel::JSON::XS::false;
1317 print($json->encode([!$false], [JSON_TYPE_BOOL]));
1318 => [true]
1319
1320 null
1321 A JSON null atom becomes "undef" in Perl.
1322
1323 shell-style comments ("# text")
1324 As a nonstandard extension to the JSON syntax that is enabled by
1325 the "relaxed" setting, shell-style comments are allowed. They can
1326 start anywhere outside strings and go till the end of the line.
1327
1328 tagged values ("(tag)value").
1329 Another nonstandard extension to the JSON syntax, enabled with the
1330 "allow_tags" setting, are tagged values. In this implementation,
1331 the tag must be a perl package/class name encoded as a JSON string,
1332 and the value must be a JSON array encoding optional constructor
1333 arguments.
1334
1335 See "OBJECT SERIALIZATION", below, for details.
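
To make the handling of the "true", "false" and "null" atoms
concrete, here is a minimal sketch that decodes a small array and
checks each element with "Cpanel::JSON::XS::is_bool". The input text
is made up for the illustration:

```perl
use strict;
use warnings;
use Cpanel::JSON::XS qw(decode_json);

my $data = decode_json ('[true,false,null,1]');

# is_bool is true only for the decoded JSON booleans, not for
# undef (from null) or for the plain number 1
my @flags = map { Cpanel::JSON::XS::is_bool ($_) ? 1 : 0 } @$data;

# the boolean objects still behave like 1 and 0 in conditions
my $first_is_true   = $data->[0] ? 1 : 0;
my $second_is_false = $data->[1] ? 0 : 1;

print "is_bool flags: @flags\n";
```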
1336
1337 PERL -> JSON
1338 The mapping from Perl to JSON is slightly more difficult, as Perl is a
1339 truly typeless language, so we can only guess which JSON type is meant
1340 by a Perl value.
1341
1342 hash references
1343 Perl hash references become JSON objects. As there is no inherent
1344 ordering in hash keys (or JSON objects), they will usually be
1345 encoded in a pseudo-random order that can change between runs of
1346 the same program but stays generally the same within a single run
1347 of a program. Cpanel::JSON::XS can optionally sort the hash keys
1348 (determined by the canonical flag), so the same datastructure will
1349 serialize to the same JSON text (given same settings and version of
1350 Cpanel::JSON::XS), but this incurs a runtime overhead and is only
1351 rarely useful, e.g. when you want to compare some JSON text against
1352 another for equality.
1353
1354 array references
1355 Perl array references become JSON arrays.
1356
1357 other references
1358 Other unblessed references are generally not allowed and will cause
1359 an exception to be thrown, except for references to the integers 0
1360 and 1, which get turned into "false" and "true" atoms in JSON.
1361
1362 With the option "allow_stringify", you can ignore the exception and
1363 return the stringification of the perl value.
1364
1365 With the option "allow_unknown", you can ignore the exception and
1366 return "null" instead.
1367
    encode_json [\"x"]    # => cannot encode reference to scalar 'SCALAR(0x..)'
                          # unless the scalar is 0 or 1
    encode_json [\0, \1]  # yields [false,true]

    Cpanel::JSON::XS->new->allow_stringify->encode ([\"x"]) # yields "x" unlike JSON::PP
    Cpanel::JSON::XS->new->allow_unknown->encode ([\"x"])   # yields null as in JSON::PP
1374
1375 Cpanel::JSON::XS::true, Cpanel::JSON::XS::false
1376 These special values become JSON true and JSON false values,
1377 respectively. You can also use "\1" and "\0" or "!0" and "!1"
1378 directly if you want.
1379
    encode_json [Cpanel::JSON::XS::false, Cpanel::JSON::XS::true] # yields [false,true]
    encode_json [!1, !0] # yields [false,true]
1382
1383 eq/ne comparisons with true, false:
1384
1385 false is eq to the empty string or the string 'false' or the
1386 special empty string "!!0", i.e. "SV_NO", or the numbers 0 or 0.0.
1387
1388 true is eq to the string 'true' or to the special string "!0" (i.e.
1389 "SV_YES") or to the numbers 1 or 1.0.
1390
1391 blessed objects
1392 Blessed objects are not directly representable in JSON, but
1393 "Cpanel::JSON::XS" allows various optional ways of handling
1394 objects. See "OBJECT SERIALIZATION", below, for details.
1395
See the "allow_blessed" and "convert_blessed" methods for various
options on how to deal with this: basically, you can choose between
throwing an exception, encoding the reference as if it weren't
blessed, using the object's overloaded stringification method or
providing your own serializer method.
1401
1402 simple scalars
1403 Simple Perl scalars (any scalar that is not a reference) are the
1404 most difficult objects to encode: Cpanel::JSON::XS will encode
1405 undefined scalars or inf/nan as JSON "null" values, scalars that
1406 have last been used in a string context before encoding as JSON
1407 strings, and anything else as number value:
1408
1409 # dump as number
1410 encode_json [2] # yields [2]
1411 encode_json [-3.0e17] # yields [-3e+17]
1412 my $value = 5; encode_json [$value] # yields [5]
1413
1414 # used as string, but the two representations are for the same number
1415 print $value;
1416 encode_json [$value] # yields [5]
1417
1418 # used as different string (non-matching dual-var)
1419 my $str = '0 but true';
1420 my $num = 1 + $str;
1421 encode_json [$num, $str] # yields [1,"0 but true"]
1422
1423 # undef becomes null
1424 encode_json [undef] # yields [null]
1425
1426 # inf or nan becomes null, unless you answered
1427 # "Do you want to handle inf/nan as strings" with yes
1428 encode_json [9**9**9] # yields [null]
1429
1430 You can force the type to be a JSON string by stringifying it:
1431
1432 my $x = 3.1; # some variable containing a number
1433 "$x"; # stringified
1434 $x .= ""; # another, more awkward way to stringify
1435 print $x; # perl does it for you, too, quite often
1436
1437 You can force the type to be a JSON number by numifying it:
1438
1439 my $x = "3"; # some variable containing a string
1440 $x += 0; # numify it, ensuring it will be dumped as a number
1441 $x *= 1; # same thing, the choice is yours.
1442
Note that numerical precision has the same meaning as under Perl
(so binary to decimal conversion follows the same rules as in Perl,
which can differ from other languages). Also, your perl interpreter
might expose extensions to the floating point numbers of your
platform, such as infinities or NaNs - these cannot be represented
in JSON, and thus null is returned instead. Optionally you can
configure it to stringify inf and nan values.
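
The rules for hash references and simple scalars can be checked with
a short sketch like this one. The data is invented; the only
settings used are "canonical" and the stringify/numify idioms
described above:

```perl
use strict;
use warnings;
use Cpanel::JSON::XS qw(encode_json);

# canonical sorts the keys, so the output is reproducible
my $sorted = Cpanel::JSON::XS->new->canonical
                ->encode ({ b => 2, c => 3, a => 1 });

my $number = "3";    # starts out as a string
$number += 0;        # numified: will be dumped as a number

my $string = 3.1;    # starts out as a number
$string .= "";       # stringified: will be dumped as a string

my $typed = encode_json ([$number, $string]);

print "$sorted\n";   # {"a":1,"b":2,"c":3}
print "$typed\n";    # [3,"3.1"]
```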
1450
1451 OBJECT SERIALIZATION
1452 As JSON cannot directly represent Perl objects, you have to choose
1453 between a pure JSON representation (without the ability to deserialize
1454 the object automatically again), and a nonstandard extension to the
1455 JSON syntax, tagged values.
1456
1457 SERIALIZATION
1458
What happens when "Cpanel::JSON::XS" encounters a Perl object depends
on the "allow_tags", "convert_blessed" and "allow_blessed" settings,
which are checked in this order:
1462
1463 1. "allow_tags" is enabled and the object has a "FREEZE" method.
1464 In this case, "Cpanel::JSON::XS" uses the Types::Serialiser object
1465 serialization protocol to create a tagged JSON value, using a
1466 nonstandard extension to the JSON syntax.
1467
1468 This works by invoking the "FREEZE" method on the object, with the
1469 first argument being the object to serialize, and the second
1470 argument being the constant string "JSON" to distinguish it from
1471 other serializers.
1472
The "FREEZE" method can return any number of values (i.e. zero or
more). These values and the package/classname of the object will
then be encoded as a tagged JSON value in the following format:
1476
1477 ("classname")[FREEZE return values...]
1478
1479 e.g.:
1480
1481 ("URI")["http://www.google.com/"]
1482 ("MyDate")[2013,10,29]
1483 ("ImageData::JPEG")["Z3...VlCg=="]
1484
For example, the hypothetical "My::Object" "FREEZE" method might
use the object's "type" and "id" members to encode the object:
1487
1488 sub My::Object::FREEZE {
1489 my ($self, $serializer) = @_;
1490
1491 ($self->{type}, $self->{id})
1492 }
1493
1494 2. "convert_blessed" is enabled and the object has a "TO_JSON" method.
1495 In this case, the "TO_JSON" method of the object is invoked in
1496 scalar context. It must return a single scalar that can be directly
1497 encoded into JSON. This scalar replaces the object in the JSON
1498 text.
1499
1500 For example, the following "TO_JSON" method will convert all URI
1501 objects to JSON strings when serialized. The fact that these values
1502 originally were URI objects is lost.
1503
1504 sub URI::TO_JSON {
1505 my ($uri) = @_;
1506 $uri->as_string
1507 }
1508
3. "convert_blessed" is enabled and the object has a stringification
overload.
In this case, the overloaded "" method of the object is invoked in
scalar context. It must return a single scalar that can be directly
encoded into JSON. This scalar replaces the object in the JSON
text.

For example, the following "" method will convert all URI objects
to JSON strings when serialized. The fact that these values
originally were URI objects is lost.

    package URI;
    use overload '""' => sub { shift->as_string };

4. "allow_blessed" is enabled.
The object will be serialized as a JSON null value.

5. none of the above
If none of the settings are enabled or the respective methods are
missing, "Cpanel::JSON::XS" throws an exception.
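
The "convert_blessed", "allow_blessed" and "none of the above" cases
can be compared with a small sketch. "My::Point" is a hypothetical
class written only for this illustration:

```perl
use strict;
use warnings;
use Cpanel::JSON::XS;

# Hypothetical class used only for this illustration.
package My::Point;
sub new     { my ($class, %args) = @_; return bless {%args}, $class }
sub TO_JSON { my ($self) = @_; return { x => $self->{x}, y => $self->{y} } }

package main;

my $point = My::Point->new (x => 1, y => 2);

# convert_blessed: TO_JSON is called and its result is encoded
my $converted = Cpanel::JSON::XS->new->canonical->convert_blessed
                   ->encode ([$point]);

# allow_blessed only: the object is replaced by null
my $nulled = Cpanel::JSON::XS->new->allow_blessed->encode ([$point]);

# neither setting: encode croaks
my $croaked = !defined (eval { Cpanel::JSON::XS->new->encode ([$point]) });

print "$converted\n";   # [{"x":1,"y":2}]
print "$nulled\n";      # [null]
print "plain encode croaked\n" if $croaked;
```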
1529
1530 DESERIALIZATION
1531
For deserialization there are only two cases to consider: either
nonstandard tagging was used, in which case "allow_tags" decides, or
objects cannot automatically be deserialized, in which case you can
use postprocessing or the "filter_json_object" or
"filter_json_single_key_object" callbacks to get some real objects out
of your JSON.

This section only considers the tagged value case: If a tagged JSON
object is encountered during decoding and "allow_tags" is disabled, a
parse error will result (as if tagged values were not part of the
grammar).
1543
1544 If "allow_tags" is enabled, "Cpanel::JSON::XS" will look up the "THAW"
1545 method of the package/classname used during serialization (it will not
1546 attempt to load the package as a Perl module). If there is no such
1547 method, the decoding will fail with an error.
1548
1549 Otherwise, the "THAW" method is invoked with the classname as first
1550 argument, the constant string "JSON" as second argument, and all the
1551 values from the JSON array (the values originally returned by the
1552 "FREEZE" method) as remaining arguments.
1553
The method must then return the object. While technically you can
return any Perl scalar, you might have to enable the "allow_nonref"
setting to make that work in all cases, so it is better to return an
actual blessed reference.
1558
1559 As an example, let's implement a "THAW" function that regenerates the
1560 "My::Object" from the "FREEZE" example earlier:
1561
    sub My::Object::THAW {
        my ($class, $serializer, $type, $id) = @_;

        return $class->new(type => $type, id => $id);
    }
1567
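To see "FREEZE" and "THAW" working together, here is a hedged round-trip sketch. The constructor and the "FREEZE" method shown here are assumptions made so that the example is self-contained; they are meant to mirror, not replace, the examples in this document.

```perl
use strict;
use warnings;
use Cpanel::JSON::XS;

{
    package My::Object;

    # Assumed constructor, for illustration only.
    sub new {
        my ($class, %args) = @_;
        return bless { type => $args{type}, id => $args{id} }, $class;
    }

    # Assumed counterpart of THAW: returns the constructor arguments.
    sub FREEZE {
        my ($self, $serializer) = @_;
        return ($self->{type}, $self->{id});
    }

    sub THAW {
        my ($class, $serializer, $type, $id) = @_;
        return $class->new(type => $type, id => $id);
    }
}

# The same coder object is used for both directions.
my $coder = Cpanel::JSON::XS->new->allow_tags;

my $text = $coder->encode([ My::Object->new(type => 'widget', id => 42) ]);
my $copy = $coder->decode($text)->[0];

print "$text\n";
printf "%s type=%s id=%s\n", ref $copy, $copy->{type}, $copy->{id};
```

Only enable "allow_tags" on the decoding side when the JSON text comes from a source you trust.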
See the "SECURITY CONSIDERATIONS" section below. Allowing external JSON
objects to be deserialized into Perl objects is usually a very bad idea.
1570
1572 The interested reader might have seen a number of flags that signify
1573 encodings or codesets - "utf8", "latin1", "binary" and "ascii". There
1574 seems to be some confusion on what these do, so here is a short
1575 comparison:
1576
"utf8" controls whether the JSON text created by "encode" (and expected
by "decode") is UTF-8 encoded or not, while "latin1" and "ascii" only
control whether "encode" escapes character values outside their
respective codeset range. None of these flags conflict with each
other, although some combinations make less sense than others.
1582
1583 Care has been taken to make all flags symmetrical with respect to
1584 "encode" and "decode", that is, texts encoded with any combination of
1585 these flag values will be correctly decoded when the same flags are
1586 used - in general, if you use different flag settings while encoding
1587 vs. when decoding you likely have a bug somewhere.
1588
1589 Below comes a verbose discussion of these flags. Note that a "codeset"
1590 is simply an abstract set of character-codepoint pairs, while an
1591 encoding takes those codepoint numbers and encodes them, in our case
1592 into octets. Unicode is (among other things) a codeset, UTF-8 is an
1593 encoding, and ISO-8859-1 (= latin 1) and ASCII are both codesets and
1594 encodings at the same time, which can be confusing.
1595
1596 "utf8" flag disabled
1597 When "utf8" is disabled (the default), then "encode"/"decode"
1598 generate and expect Unicode strings, that is, characters with high
1599 ordinal Unicode values (> 255) will be encoded as such characters,
1600 and likewise such characters are decoded as-is, no changes to them
1601 will be done, except "(re-)interpreting" them as Unicode codepoints
1602 or Unicode characters, respectively (to Perl, these are the same
1603 thing in strings unless you do funny/weird/dumb stuff).
1604
1605 This is useful when you want to do the encoding yourself (e.g. when
1606 you want to have UTF-16 encoded JSON texts) or when some other
1607 layer does the encoding for you (for example, when printing to a
1608 terminal using a filehandle that transparently encodes to UTF-8 you
1609 certainly do NOT want to UTF-8 encode your data first and have Perl
1610 encode it another time).
1611
1612 "utf8" flag enabled
1613 If the "utf8"-flag is enabled, "encode"/"decode" will encode all
1614 characters using the corresponding UTF-8 multi-byte sequence, and
1615 will expect your input strings to be encoded as UTF-8, that is, no
1616 "character" of the input string must have any value > 255, as UTF-8
1617 does not allow that.
1618
The "utf8" flag therefore switches between two modes: disabled
means you will get a Unicode string in Perl, enabled means you get
a UTF-8 encoded octet/binary string in Perl.
1622
1623 "latin1", "binary" or "ascii" flags enabled
1624 With "latin1" (or "ascii") enabled, "encode" will escape characters
1625 with ordinal values > 255 (> 127 with "ascii") and encode the
1626 remaining characters as specified by the "utf8" flag. With
1627 "binary" enabled, ordinal values > 255 are illegal.
1628
If "utf8" is disabled, then the result is also correctly encoded in
those character sets (as both are proper subsets of Unicode,
meaning that a Unicode string with all character values < 256 is
the same thing as an ISO-8859-1 string, and a Unicode string with
all character values < 128 is the same thing as an ASCII string in
Perl).
1635
If "utf8" is enabled, you still get a correct UTF-8-encoded string,
regardless of these flags; just some more characters will be
escaped using "\uXXXX" than before.
1639
1640 Note that ISO-8859-1-encoded strings are not compatible with UTF-8
1641 encoding, while ASCII-encoded strings are. That is because the
1642 ISO-8859-1 encoding is NOT a subset of UTF-8 (despite the
1643 ISO-8859-1 codeset being a subset of Unicode), while ASCII is.
1644
Surprisingly, "decode" will ignore these flags and so treat all
input values as governed by the "utf8" flag. If it is disabled,
this allows you to decode ISO-8859-1- and ASCII-encoded strings, as
both are strict subsets of Unicode. If it is enabled, you can
correctly decode UTF-8 encoded strings.
1650
1651 So neither "latin1", "binary" nor "ascii" are incompatible with the
1652 "utf8" flag - they only govern when the JSON output engine escapes
1653 a character or not.
1654
1655 The main use for "latin1" or "binary" is to relatively efficiently
1656 store binary data as JSON, at the expense of breaking compatibility
1657 with most JSON decoders.
1658
The main use for "ascii" is to force the output to not contain
characters with values > 127, which means you can interpret the
resulting string as UTF-8, ISO-8859-1, ASCII, KOI8-R or just about
any character set and 8-bit encoding, and still get the same data
structure back. This is useful when your channel for JSON transfer
is not 8-bit clean or the encoding might be mangled in between
(e.g. in mail), and works because ASCII is a proper subset of most
8-bit and multibyte encodings in use in the world.
1667
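The effect of these flags is easiest to see by encoding the same data three ways. The following is a minimal sketch; the sample character U+263A is an arbitrary choice.

```perl
use strict;
use warnings;
use Cpanel::JSON::XS;

my $data = ["\x{263A}"];    # a single character with an ordinal above 255

my $chars  = Cpanel::JSON::XS->new->encode($data);         # utf8 disabled: Unicode string
my $octets = Cpanel::JSON::XS->new->utf8->encode($data);   # utf8 enabled: UTF-8 octets
my $ascii  = Cpanel::JSON::XS->new->ascii->encode($data);  # non-ASCII escaped as \uXXXX

# The character string has 5 characters: [ " U+263A " ].
# The octet string has 7 octets, as U+263A needs three octets in UTF-8.
printf "chars=%d octets=%d ascii=%s\n", length $chars, length $octets, $ascii;

# Decoding with the same flags restores the original character.
my $from_octets = Cpanel::JSON::XS->new->utf8->decode($octets)->[0];
my $from_chars  = Cpanel::JSON::XS->new->decode($chars)->[0];

# ASCII-only output decodes to the same data with or without "utf8".
my $ascii_plain = Cpanel::JSON::XS->new->decode($ascii)->[0];
my $ascii_utf8  = Cpanel::JSON::XS->new->utf8->decode($ascii)->[0];

print "all round trips agree\n"
    if  $from_octets eq "\x{263A}" && $from_chars  eq "\x{263A}"
    &&  $ascii_plain eq "\x{263A}" && $ascii_utf8  eq "\x{263A}";
```

Only lengths and the ASCII form are printed, so the script does not depend on the encoding layer of the output filehandle.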
1668 JSON and ECMAscript
1669 JSON syntax is based on how literals are represented in javascript (the
1670 not-standardized predecessor of ECMAscript) which is presumably why it
1671 is called "JavaScript Object Notation".
1672
1673 However, JSON is not a subset (and also not a superset of course) of
1674 ECMAscript (the standard) or javascript (whatever browsers actually
1675 implement).
1676
1677 If you want to use javascript's "eval" function to "parse" JSON, you
1678 might run into parse errors for valid JSON texts, or the resulting data
1679 structure might not be queryable:
1680
1681 One of the problems is that U+2028 and U+2029 are valid characters
1682 inside JSON strings, but are not allowed in ECMAscript string literals,
1683 so the following Perl fragment will not output something that can be
1684 guaranteed to be parsable by javascript's "eval":
1685
1686 use Cpanel::JSON::XS;
1687
1688 print encode_json [chr 0x2028];
1689
1690 The right fix for this is to use a proper JSON parser in your
1691 javascript programs, and not rely on "eval" (see for example Douglas
1692 Crockford's json2.js parser).
1693
1694 If this is not an option, you can, as a stop-gap measure, simply encode
1695 to ASCII-only JSON:
1696
1697 use Cpanel::JSON::XS;
1698
1699 print Cpanel::JSON::XS->new->ascii->encode ([chr 0x2028]);
1700
1701 Note that this will enlarge the resulting JSON text quite a bit if you
1702 have many non-ASCII characters. You might be tempted to run some
1703 regexes to only escape U+2028 and U+2029, e.g.:
1704
1705 # DO NOT USE THIS!
1706 my $json = Cpanel::JSON::XS->new->utf8->encode ([chr 0x2028]);
1707 $json =~ s/\xe2\x80\xa8/\\u2028/g; # escape U+2028
1708 $json =~ s/\xe2\x80\xa9/\\u2029/g; # escape U+2029
1709 print $json;
1710
1711 Note that this is a bad idea: the above only works for U+2028 and
1712 U+2029 and thus only for fully ECMAscript-compliant parsers. Many
1713 existing javascript implementations, however, have issues with other
1714 characters as well - using "eval" naively simply will cause problems.
1715
1716 Another problem is that some javascript implementations reserve some
1717 property names for their own purposes (which probably makes them non-
1718 ECMAscript-compliant). For example, Iceweasel reserves the "__proto__"
1719 property name for its own purposes.
1720
If that is a problem, you could try to filter the resulting JSON
output for these property strings, e.g.:
1723
1724 $json =~ s/"__proto__"\s*:/"__proto__renamed":/g;
1725
1726 This works because "__proto__" is not valid outside of strings, so
1727 every occurrence of ""__proto__"\s*:" must be a string used as property
1728 name.
1729
1730 Unicode non-characters between U+FFFD and U+10FFFF are decoded either
1731 to the recommended U+FFFD REPLACEMENT CHARACTER (see Unicode PR #121:
1732 Recommended Practice for Replacement Characters), or in the binary or
1733 relaxed mode left as is, keeping the illegal non-characters as before.
1734
Raw non-Unicode characters outside the valid Unicode range now fail to
parse, because "A string is a sequence of zero or more Unicode
characters" (RFC 7159 section 1) and "JSON text SHALL be encoded in
Unicode" (RFC 7159 section 8.1). We now use the UTF8_DISALLOW_SUPER
flag when parsing Unicode.
1740
1741 If you know of other incompatibilities, please let me know.
1742
1743 JSON and YAML
You often hear that JSON is a subset of YAML. In general, however,
there is no way to configure Cpanel::JSON::XS to output a data
structure as valid YAML that works in all cases. If you really must
use Cpanel::JSON::XS to generate YAML, you should use this algorithm
(subject to change in future versions):
1749
1750 my $to_yaml = Cpanel::JSON::XS->new->utf8->space_after (1);
1751 my $yaml = $to_yaml->encode ($ref) . "\n";
1752
1753 This will usually generate JSON texts that also parse as valid YAML.
1754
1755 SPEED
1756 It seems that JSON::XS is surprisingly fast, as shown in the following
1757 tables. They have been generated with the help of the "eg/bench"
1758 program in the JSON::XS distribution, to make it easy to compare on
1759 your own system.
1760
JSON::XS is, together with Data::MessagePack and Sereal, one of the
fastest serializers, because JSON and JSON::XS do not support backrefs
(no graph structures), only trees. Storable supports backrefs, i.e.
graphs. Data::MessagePack encodes its data in binary form (as Storable
does) and supports only a very simple subset of JSON.
1766
1767 First comes a comparison between various modules using a very short
1768 single-line JSON string (also available at
1769 <http://dist.schmorp.de/misc/json/short.json>).
1770
1771 {"method": "handleMessage", "params": ["user1",
1772 "we were just talking"], "id": null, "array":[1,11,234,-5,1e5,1e7,
1773 1, 0]}
1774
It shows the number of encodes/decodes per second (JSON::XS uses the
functional interface, while JSON::XS/2 uses the OO interface with
pretty-printing and hash key sorting enabled, and JSON::XS/3 enables
shrink. JSON::DWIW/DS uses the deserialize function, while
JSON::DWIW/FJ uses the from_json method). Higher is better:
1780
1781 module | encode | decode |
1782 --------------|------------|------------|
1783 JSON::DWIW/DS | 86302.551 | 102300.098 |
1784 JSON::DWIW/FJ | 86302.551 | 75983.768 |
1785 JSON::PP | 15827.562 | 6638.658 |
1786 JSON::Syck | 63358.066 | 47662.545 |
1787 JSON::XS | 511500.488 | 511500.488 |
1788 JSON::XS/2 | 291271.111 | 388361.481 |
1789 JSON::XS/3 | 361577.931 | 361577.931 |
1790 Storable | 66788.280 | 265462.278 |
1791 --------------+------------+------------+
1792
1793 That is, JSON::XS is almost six times faster than JSON::DWIW on
1794 encoding, about five times faster on decoding, and over thirty to
1795 seventy times faster than JSON's pure perl implementation. It also
1796 compares favourably to Storable for small amounts of data.
1797
Using a longer test string (roughly 18KB, generated from Yahoo! Local's
search API, <http://dist.schmorp.de/misc/json/long.json>):
1800
1801 module | encode | decode |
1802 --------------|------------|------------|
1803 JSON::DWIW/DS | 1647.927 | 2673.916 |
1804 JSON::DWIW/FJ | 1630.249 | 2596.128 |
1805 JSON::PP | 400.640 | 62.311 |
1806 JSON::Syck | 1481.040 | 1524.869 |
1807 JSON::XS | 20661.596 | 9541.183 |
1808 JSON::XS/2 | 10683.403 | 9416.938 |
1809 JSON::XS/3 | 20661.596 | 9400.054 |
1810 Storable | 19765.806 | 10000.725 |
1811 --------------+------------+------------+
1812
Again, JSON::XS leads by far (except for Storable, which unsurprisingly
decodes a bit faster).
1815
1816 On large strings containing lots of high Unicode characters, some
1817 modules (such as JSON::PC) seem to decode faster than JSON::XS, but the
1818 result will be broken due to missing (or wrong) Unicode handling.
1819 Others refuse to decode or encode properly, so it was impossible to
1820 prepare a fair comparison table for that case.
1821
1822 For updated graphs see
1823 <https://github.com/Sereal/Sereal/wiki/Sereal-Comparison-Graphs>
1824
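If you want a rough comparison on your own system without the "eg/bench" program, a sketch along the following lines may help. It uses the core Benchmark module and the short test string from above, and compares only against JSON::PP; this selection of modules and the one-second run time are assumptions of this example, and the numbers it prints will differ from the tables above.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);
use Cpanel::JSON::XS ();
use JSON::PP ();

my $text = '{"method": "handleMessage", "params": ["user1", '
         . '"we were just talking"], "id": null, '
         . '"array":[1,11,234,-5,1e5,1e7, 1, 0]}';

my $xs = Cpanel::JSON::XS->new->utf8;
my $pp = JSON::PP->new->utf8;

my $data = $xs->decode($text);

# A negative count means: run each sub for at least that many CPU seconds.
print "encode:\n";
cmpthese(-1, {
    'Cpanel::JSON::XS' => sub { $xs->encode($data) },
    'JSON::PP'         => sub { $pp->encode($data) },
});

print "decode:\n";
cmpthese(-1, {
    'Cpanel::JSON::XS' => sub { $xs->decode($text) },
    'JSON::PP'         => sub { $pp->decode($text) },
});
```

"cmpthese" prints the rate of each entry and the relative difference between them.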
1826 As long as you only serialize data that can be directly expressed in
1827 JSON, "Cpanel::JSON::XS" is incapable of generating invalid JSON output
1828 (modulo bugs, but "JSON::XS" has found more bugs in the official JSON
1829 testsuite (1) than the official JSON testsuite has found in "JSON::XS"
(0)). "Cpanel::JSON::XS" is currently the only known JSON decoder
which passes all <http://seriot.ch/parsing_json.html> tests, while
also being the fastest.
1833
1834 When you have trouble decoding JSON generated by this module using
1835 other decoders, then it is very likely that you have an encoding
1836 mismatch or the other decoder is broken.
1837
1838 When decoding, "JSON::XS" is strict by default and will likely catch
1839 all errors. There are currently two settings that change this:
1840 "relaxed" makes "JSON::XS" accept (but not generate) some non-standard
1841 extensions, and "allow_tags" or "allow_blessed" will allow you to
1842 encode and decode Perl objects, at the cost of being totally insecure
1843 and not outputting valid JSON anymore.
1844
1845 JSON-XS-3.01 broke interoperability with JSON-2.90 with booleans. See
1846 JSON.
1847
Cpanel::JSON::XS needs to know the JSON and JSON::XS versions to be
able to work with those objects, especially when encoding booleans
like "{"is_true":true}". So you need to load these modules beforehand.
1851
1852 true/false overloading and boolean representations are supported.
1853
JSON::XS and JSON::PP representations are accepted, and older JSON::XS
accepts Cpanel::JSON::XS booleans. All of the JSON modules JSON,
JSON::PP, JSON::XS and Cpanel::JSON::XS produce JSON::PP::Boolean
objects; only Mojo and JSON::YAJL do not. Mojo produces
Mojo::JSON::_Bool and JSON::YAJL::Parser just an unblessed IV.
1859
1860 Cpanel::JSON::XS accepts JSON::PP::Boolean and Mojo::JSON::_Bool
1861 objects as booleans.
1862
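The boolean handling can be checked with a short script. This is a minimal sketch using "Cpanel::JSON::XS::is_bool" and the default decoder settings; the key names are arbitrary.

```perl
use strict;
use warnings;
use Cpanel::JSON::XS qw(decode_json);

my $data = decode_json('{"is_true":true,"is_false":false}');

printf "class: %s\n", ref $data->{is_true};
print "is_true is a JSON boolean\n" if Cpanel::JSON::XS::is_bool($data->{is_true});
print "is_false is false in boolean context\n" unless $data->{is_false};

# canonical sorts the keys so that the output is stable.
my $coder = Cpanel::JSON::XS->new->canonical;

# Decoded booleans are encoded as true/false again.
my $again = $coder->encode($data);

# References to the integers 1 and 0 are also encoded as true/false.
my $from_refs = $coder->encode({ is_true => \1, is_false => \0 });

print "$again\n$from_refs\n";
```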
1863 I cannot think of any reason to still use JSON::XS anymore.
1864
1865 TAGGED VALUE SYNTAX AND STANDARD JSON EN/DECODERS
When you use "allow_tags" to use the extended (and also nonstandard and
invalid) JSON syntax for serialized objects, and you still want to
decode the generated serialized objects, you can run a regex to replace
the tagged syntax by standard JSON arrays (it only works for "normal"
package names without commas, newlines or single colons). First, the
readable Perl version:
1872
1873 # if your FREEZE methods return no values, you need this replace first:
1874 $json =~ s/\( \s* (" (?: [^\\":,]+|\\.|::)* ") \s* \) \s* \[\s*\]/[$1]/gx;
1875
1876 # this works for non-empty constructor arg lists:
1877 $json =~ s/\( \s* (" (?: [^\\":,]+|\\.|::)* ") \s* \) \s* \[/[$1,/gx;
1878
1879 And here is a less readable version that is easy to adapt to other
1880 languages:
1881
1882 $json =~ s/\(\s*("([^\\":,]+|\\.|::)*")\s*\)\s*\[/[$1,/g;
1883
1884 Here is an ECMAScript version (same regex):
1885
1886 json = json.replace (/\(\s*("([^\\":,]+|\\.|::)*")\s*\)\s*\[/g, "[$1,");
1887
1888 Since this syntax converts to standard JSON arrays, it might be hard to
1889 distinguish serialized objects from normal arrays. You can prepend a
1890 "magic number" as first array element to reduce chances of a collision:
1891
1892 $json =~ s/\(\s*("([^\\":,]+|\\.|::)*")\s*\)\s*\[/["XU1peReLzT4ggEllLanBYq4G9VzliwKF",$1,/g;
1893
1894 And after decoding the JSON text, you could walk the data structure
1895 looking for arrays with a first element of
1896 "XU1peReLzT4ggEllLanBYq4G9VzliwKF".
1897
1898 The same approach can be used to create the tagged format with another
1899 encoder. First, you create an array with the magic string as first
1900 member, the classname as second, and constructor arguments last, encode
1901 it as part of your JSON structure, and then:
1902
1903 $json =~ s/\[\s*"XU1peReLzT4ggEllLanBYq4G9VzliwKF"\s*,\s*("([^\\":,]+|\\.|::)*")\s*,/($1)[/g;
1904
1905 Again, this has some limitations - the magic string must not be encoded
1906 with character escapes, and the constructor arguments must be non-
1907 empty.
1908
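As a worked example of the conversion to standard JSON arrays, the following sketch applies the two readable regexes from above to a hand-written tagged text. The class names "My::Object" and "My::Empty" are placeholders chosen for this example.

```perl
use strict;
use warnings;
use Cpanel::JSON::XS qw(decode_json);

# Tagged text as it might be produced with allow_tags (written by hand here).
my $json = '[("My::Object")["widget",42],("My::Empty")[]]';

# Empty argument lists must be handled first ...
$json =~ s/\( \s* (" (?: [^\\":,]+|\\.|::)* ") \s* \) \s* \[\s*\]/[$1]/gx;

# ... then the non-empty constructor argument lists.
$json =~ s/\( \s* (" (?: [^\\":,]+|\\.|::)* ") \s* \) \s* \[/[$1,/gx;

print "$json\n";

# The result is plain JSON and no longer needs allow_tags to decode.
my $data = decode_json($json);
printf "class=%s args=%s,%s\n", @{ $data->[0] };
```

After the substitutions, each serialized object is an array whose first element is the class name, followed by the values that "FREEZE" returned.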
1910 Since this module was written, Google has written a new JSON RFC, RFC
1911 7159 (and RFC7158). Unfortunately, this RFC breaks compatibility with
1912 both the original JSON specification on www.json.org and RFC4627.
1913
1914 As far as I can see, you can get partial compatibility when parsing by
1915 using "->allow_nonref". However, consider the security implications of
1916 doing so.
1917
1918 I haven't decided yet when to break compatibility with RFC4627 by
1919 default (and potentially leave applications insecure) and change the
1920 default to follow RFC7159, but application authors are well advised to
1921 call "->allow_nonref(0)" even if this is the current default, if they
1922 cannot handle non-reference values, in preparation for the day when the
1923 default will change.
1924
JSON::XS and Cpanel::JSON::XS are not only fast. JSON is generally the
most secure serializing format because, besides Data::MessagePack, it
is the only one that does not deserialize objects by default. This
holds for all languages, not just Perl. The binary variant BSON
(MongoDB) does more but is unsafe.
1931
1932 It is trivial for any attacker to create such serialized objects in
1933 JSON and trick perl into expanding them, thereby triggering certain
1934 methods. Watch <https://www.youtube.com/watch?v=Gzx6KlqiIZE> for an
1935 exploit demo for "CVE-2015-1592 SixApart MovableType Storable Perl Code
1936 Execution" for a deserializer which expands objects. Deserializing
1937 even coderefs (methods, functions) or external data would be considered
1938 the most dangerous.
1939
1940 Security relevant overview of serializers regarding deserializing
1941 objects by default:
1942
1943 Objects Coderefs External Data
1944
1945 Data::Dumper YES YES YES
1946 Storable YES NO (def) NO
1947 Sereal YES NO NO
1948 YAML YES NO NO
1949 B::C YES YES YES
1950 B::Bytecode YES YES YES
1951 BSON YES YES NO
1952 JSON::SL YES NO YES
1953 JSON NO (def) NO NO
1954 Data::MessagePack NO NO NO
1955 XML NO NO YES
1956
1957 Pickle YES YES YES
1958 PHP Deserialize YES NO NO
1959
1960 When you are using JSON in a protocol, talking to untrusted potentially
1961 hostile creatures requires relatively few measures.
1962
1963 First of all, your JSON decoder should be secure, that is, should not
1964 have any buffer overflows. Obviously, this module should ensure that.
1965
Second, you need to avoid resource-starving attacks. That means you
should limit the size of JSON texts you accept, or make sure that when
your resources run out, that's just fine (e.g. by using a separate
process that can crash safely). The size of a JSON text in octets or
characters is usually a good indication of the size of the resources
required to decode it into a Perl structure. While JSON::XS can check
the size of the JSON text, it might be too late when you already have
it in memory, so you might want to check the size before you accept the
string.
1975
1976 Third, Cpanel::JSON::XS recurses using the C stack when decoding
1977 objects and arrays. The C stack is a limited resource: for instance, on
1978 my amd64 machine with 8MB of stack size I can decode around 180k nested
1979 arrays but only 14k nested JSON objects (due to perl itself recursing
1980 deeply on croak to free the temporary). If that is exceeded, the
1981 program crashes. To be conservative, the default nesting limit is set
1982 to 512. If your process has a smaller stack, you should adjust this
1983 setting accordingly with the "max_depth" method.
1984
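Both limits can be set on the coder object with the "max_size" and "max_depth" methods. The following is a minimal sketch; the limits of 64 octets and 3 nesting levels are deliberately tiny, illustrative values and not recommendations.

```perl
use strict;
use warnings;
use Cpanel::JSON::XS;

# Illustrative limits only; choose values that fit your protocol.
my $coder = Cpanel::JSON::XS->new->utf8->max_depth(3)->max_size(64);

# Three nesting levels are within the limit.
my $ok = eval { $coder->decode('[[[1]]]') };

# Four nesting levels exceed max_depth, so decode croaks.
my $too_deep_ok = eval { $coder->decode('[[[[1]]]]'); 1 };

# A text of 201 octets exceeds max_size, so decode croaks.
my $long_text   = '[' . join(',', (1) x 100) . ']';
my $too_long_ok = eval { $coder->decode($long_text); 1 };

print "depth 3 accepted\n"    if $ok;
print "depth 4 rejected\n"    unless $too_deep_ok;
print "long text rejected\n"  unless $too_long_ok;
```

Note that "max_size" only helps once the text is already in memory, so a size check at the transport level is still advisable.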
1985 Also keep in mind that Cpanel::JSON::XS might leak contents of your
1986 Perl data structures in its error messages, so when you serialize
1987 sensitive information you might want to make sure that exceptions
1988 thrown by JSON::XS will not end up in front of untrusted eyes.
1989
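One way to keep error details away from untrusted parties is to catch the exception and hand out only a generic message. This is a hedged sketch; the helper name "safe_decode" and the message text are hypothetical and not part of this module.

```perl
use strict;
use warnings;
use Cpanel::JSON::XS;

my $coder = Cpanel::JSON::XS->new->utf8;

# Hypothetical helper: returns (data, undef) on success and
# (undef, generic message) on failure.
sub safe_decode {
    my ($octets) = @_;

    my $data;
    my $ok = eval { $data = $coder->decode($octets); 1 };
    return ($data, undef) if $ok;

    # Keep the detailed message on the trusted side only, e.g. in a log.
    my $detail = $@;
    warn "JSON decode failed: $detail";

    return (undef, 'invalid JSON');
}

# The trailing comma makes this text invalid in the default strict mode.
my ($data, $public_error) = safe_decode('{"password":"hunter2",}');
print "client sees: $public_error\n";
```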
If you are using Cpanel::JSON::XS to return packets for consumption by
JavaScript scripts in a browser you should have a look at
<http://blog.archive.jpsykes.com/47/practical-csrf-and-json-security/>
to see whether you are vulnerable to some common attack vectors (which
really are browser design bugs, but it is still you who will have to
deal with it, as major browser developers care only for features, not
about getting security right). You might also want to look at the
special escape rules of Mojo::JSON to prevent XSS attacks.
1998
2000 TL;DR: Due to security concerns, Cpanel::JSON::XS will not allow scalar
2001 data in JSON texts by default - you need to create your own
2002 Cpanel::JSON::XS object and enable "allow_nonref":
2003
    my $json = Cpanel::JSON::XS->new->allow_nonref;

    $text = $json->encode ($data);
    $data = $json->decode ($text);
2008
The long version: JSON being an important and supposedly stable format,
the IETF standardized it as RFC 4627 in 2006. Unfortunately the
inventor of JSON, Douglas Crockford, unilaterally changed the
definition of JSON in javascript. Rather than create a fork, the IETF
decided to standardize the new syntax (apparently, so I was told,
without finding it very amusing).
2015
2016 The biggest difference between the original JSON and the new JSON is
2017 that the new JSON supports scalars (anything other than arrays and
2018 objects) at the top-level of a JSON text. While this is strictly
2019 backwards compatible to older versions, it breaks a number of protocols
2020 that relied on sending JSON back-to-back, and is a minor security
2021 concern.
2022
2023 For example, imagine you have two banks communicating, and on one side,
2024 the JSON coder gets upgraded. Two messages, such as 10 and 1000 might
2025 then be confused to mean 101000, something that couldn't happen in the
2026 original JSON, because neither of these messages would be valid JSON.
2027
2028 If one side accepts these messages, then an upgrade in the coder on
2029 either side could result in this becoming exploitable.
2030
2031 This module has always allowed these messages as an optional extension,
2032 by default disabled. The security concerns are the reason why the
2033 default is still disabled, but future versions might/will likely
2034 upgrade to the newer RFC as default format, so you are advised to check
2035 your implementation and/or override the default with "->allow_nonref
2036 (0)" to ensure that future versions are safe.
2037
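The difference between the two settings can be demonstrated as follows. This is a minimal sketch; both coder objects set "allow_nonref" explicitly, so it does not depend on the default of the installed version.

```perl
use strict;
use warnings;
use Cpanel::JSON::XS;

my $strict  = Cpanel::JSON::XS->new->allow_nonref(0);
my $lenient = Cpanel::JSON::XS->new->allow_nonref;

# A top-level scalar is rejected by the strict coder ...
my $strict_ok = eval { $strict->decode('10'); 1 };

# ... but arrays and objects are always fine.
my $array = $strict->decode('[10]');

# With allow_nonref, scalars are accepted and generated at the top level.
my $value = $lenient->decode('10');
my $text  = $lenient->encode('hello');

print "strict coder rejected top-level scalar\n" unless $strict_ok;
print "lenient coder decoded $value and encoded $text\n";
```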
2039 Cpanel::JSON::XS has proper ithreads support, unlike JSON::XS. If you
2040 encounter any bugs with thread support please report them.
2041
2043 While the goal of the Cpanel::JSON::XS module is to be correct, that
2044 unfortunately does not mean it's bug-free, only that the author thinks
2045 its design is bug-free. If you keep reporting bugs and tests they will
2046 be fixed swiftly, though.
2047
Since the JSON::XS author refuses to use a public bugtracker and
prefers private emails, we use the tracker at github, so you might want
to report any issues twice: once in private to MLEHMANN to be fixed in
JSON::XS, and once to our public tracker. Issues fixed by JSON::XS
with a new release will also be backported to Cpanel::JSON::XS and
5.6.2, as long as cPanel relies on 5.6.2 and Cpanel::JSON::XS as our
serializer of choice.
2055
2056 <https://github.com/rurban/Cpanel-JSON-XS/issues>
2057
2059 This module is available under the same licences as perl, the Artistic
2060 license and the GPL.
2061
2063 The cpanel_json_xs command line utility for quick experiments.
2064
2065 JSON, JSON::XS, JSON::MaybeXS, Mojo::JSON, Mojo::JSON::MaybeXS,
2066 JSON::SL, JSON::DWIW, JSON::YAJL, JSON::Any, Test::JSON,
2067 Locale::Wolowitz, <https://metacpan.org/search?q=JSON>
2068
2069 <https://tools.ietf.org/html/rfc7159>
2070
2071 <https://tools.ietf.org/html/rfc4627>
2072
2074 Reini Urban <rurban@cpan.org>
2075
2076 Marc Lehmann <schmorp@schmorp.de>, http://home.schmorp.de/
2077
2079 Reini Urban <rurban@cpan.org>
2080
2081
2082
2083perl v5.28.1 2019-03-26 XS(3)