Sereal::Encoder(3)    User Contributed Perl Documentation   Sereal::Encoder(3)

NAME

6       Sereal::Encoder - Fast, compact, powerful binary serialization
7

SYNOPSIS

         use Sereal::Encoder qw(encode_sereal sereal_encode_with_object);

         my $encoder = Sereal::Encoder->new({...options...});
         my $out = $encoder->encode($structure);

         # alternatively the functional interface:
         $out = sereal_encode_with_object($encoder, $structure);

         # much slower functional interface with no persistent objects:
         $out = encode_sereal($structure, {... options ...});

DESCRIPTION

21       This library implements an efficient, compact-output, and feature-rich
22       serializer using a binary protocol called Sereal.  Its sister module
23       Sereal::Decoder implements a decoder for this format.  The two are
24       released separately to allow for independent and safer upgrading.  If
25       you care greatly about performance, consider reading the
26       Sereal::Performance documentation after finishing this document.
27
28       The Sereal protocol version emitted by this encoder implementation is
29       currently protocol version 4 by default.
30
31       The protocol specification and many other bits of documentation can be
32       found in the github repository. Right now, the specification is at
33       <https://github.com/Sereal/Sereal/blob/master/sereal_spec.pod>, there
34       is a discussion of the design objectives in
35       <https://github.com/Sereal/Sereal/blob/master/README.pod>, and the
36       output of our benchmarks can be seen at
37       <https://github.com/Sereal/Sereal/wiki/Sereal-Comparison-Graphs>.  For
38       more information on getting the best performance out of Sereal, have a
39       look at the "PERFORMANCE" section below.
40

CLASS METHODS

42   new
43       Constructor. Optionally takes a hash reference as first parameter. This
44       hash reference may contain any number of options that influence the
45       behaviour of the encoder.
46
       Currently, the following options are recognized. None of them is
       enabled by default.
49
50       compress
51
       If this option is provided and true, compression of the document body
       is enabled.  As of Sereal version 4, three different compression
       techniques are supported and can be enabled by setting "compress" to
       the respective named constants (exportable from the "Sereal::Encoder"
       module): Snappy (named constant: "SRL_SNAPPY"), Zlib ("SRL_ZLIB") and
       Zstd ("SRL_ZSTD").  For your convenience, there is also a
       "SRL_UNCOMPRESSED" constant.
59
60       If this option is set, then the Snappy-related options below are
61       ignored. They are otherwise recognized for compatibility only.
62
63       compress_threshold
64
       The size threshold (in bytes) of the uncompressed output below which
       compression is not attempted, even if enabled.  Defaults to one
       kilobyte (1024 bytes). Set this to 0 and "compress" to a
       non-"SRL_UNCOMPRESSED" value to always attempt to compress.  Note that
       the document will not be compressed if the resulting size would be
       bigger than the original size (even if "compress_threshold" is 0).
71
72       compress_level
73
       If Zlib or Zstd compression is used, this option sets the compression
       level. Zlib accepts levels from 1 (fastest) to 9 (best) and defaults
       to 6. Zstd accepts levels from 1 (fastest) to 22 (best) and defaults
       to 3.
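
       As a rough sketch of how the compression options combine, the
       following compares uncompressed and Zlib-compressed output. The sample
       data is made up for illustration; actual savings depend entirely on
       your data.

```perl
use strict;
use warnings;
use Sereal::Encoder qw(SRL_UNCOMPRESSED SRL_ZLIB);

# Highly repetitive sample data (invented for this sketch).
my $data = [ map { +{ id => $_, note => "the same text again and again" } } 1 .. 500 ];

my $plain = Sereal::Encoder->new({ compress => SRL_UNCOMPRESSED });
my $zlib  = Sereal::Encoder->new({
    compress           => SRL_ZLIB,
    compress_level     => 9,    # Zlib range is 1 (fastest) to 9 (best)
    compress_threshold => 0,    # always attempt compression
});

my $plain_out = $plain->encode($data);
my $zlib_out  = $zlib->encode($data);

printf "uncompressed: %d bytes, zlib: %d bytes\n",
    length($plain_out), length($zlib_out);
```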
78
79       snappy
80
81       See also the "compress" option. This option is provided only for
82       compatibility with Sereal V1.
83
84       If set, the main payload of the Sereal document will be compressed
85       using Google's Snappy algorithm. This can yield anywhere from no effect
86       to significant savings on output size at rather low run time cost.  If
87       in doubt, test with your data whether this helps or not.
88
89       The decoder (version 0.04 and up) will know how to handle Snappy-
90       compressed Sereal documents transparently.
91
92       Note: The "snappy_incr" and "snappy" options are identical in Sereal
93       protocol v2 and up (so by default). If using an older protocol version
94       (see "protocol_version" and "use_protocol_v1" options below) to emit
95       Sereal V1 documents, this emits non-incrementally decodable documents.
96       See "snappy_incr" in those cases.
97
98       snappy_incr
99
100       See also the "compress" option. This option is provided only for
101       compatibility with Sereal V1.
102
103       Same as the "snappy" option for default operation (that is in Sereal v2
104       or up).
105
106       In Sereal V1, enables a version of the Snappy protocol which is
107       suitable for incremental parsing of packets. See also the "snappy"
108       option above for more details.
109
110       snappy_threshold
111
112       See also the "compress" option. This option is provided only for
113       compatibility with Sereal V1.
114
115       This option is a synonym for the "compress_threshold" option, but only
116       if Snappy compression is enabled.
117
118       croak_on_bless
119
120       If this option is set, then the encoder will refuse to serialize
121       blessed references and throw an exception instead.
122
123       This can be important because blessed references can mean executing a
124       destructor on a remote system or generally executing code based on
125       data.
126
       See also "no_bless_objects" to skip the blessing of objects.  When both
       flags are set, "croak_on_bless" has a higher precedence than
       "no_bless_objects".
130
131       freeze_callbacks
132
133       This option was introduced in Sereal v2 and needs a Sereal v2 decoder.
134
135       If this option is set, the encoder will check for and possibly invoke
136       the "FREEZE" method on any object in the input data. An object that was
137       serialized using its "FREEZE" method will have its corresponding "THAW"
138       class method called during deserialization. The exact semantics are
139       documented below under "FREEZE/THAW CALLBACK MECHANISM".
140
141       Beware that using this functionality means a significant slowdown for
142       object serialization. Even when serializing objects without a "FREEZE"
143       method, the additional method look up will cost a small amount of
144       runtime.  Yes, "Sereal::Encoder" is so fast that this may make a
145       difference.
146
147       no_bless_objects
148
149       If this option is set, then the encoder will serialize blessed
150       references without the bless information and provide plain data
151       structures instead.
152
153       See also the "croak_on_bless" option above for more details.
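
       The difference between "croak_on_bless" and "no_bless_objects" can be
       sketched as follows. The class name "My::Thing" is hypothetical, and
       "decode_sereal" comes from the separate Sereal::Decoder distribution.

```perl
use strict;
use warnings;
use Sereal::Encoder;
use Sereal::Decoder qw(decode_sereal);

# Hypothetical class name, used only for illustration.
my $obj = bless { name => "example" }, "My::Thing";

# croak_on_bless: encoding a blessed reference throws an exception.
my $strict = Sereal::Encoder->new({ croak_on_bless => 1 });
my $ok = eval { $strict->encode($obj); 1 };
print "croak_on_bless refused the object\n" unless $ok;

# no_bless_objects: the class name is dropped, the data itself is kept.
my $plain   = Sereal::Encoder->new({ no_bless_objects => 1 });
my $decoded = decode_sereal($plain->encode($obj));
print "decoded as: ", ref($decoded), "\n";
```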
154
155       undef_unknown
156
157       If set, unknown/unsupported data structures will be encoded as "undef"
158       instead of throwing an exception.
159
160       Mutually exclusive with "stringify_unknown".  See also "warn_unknown"
161       below.
162
163       stringify_unknown
164
165       If set, unknown/unsupported data structures will be stringified and
166       encoded as that string instead of throwing an exception. The
167       stringification may cause a warning to be emitted by perl.
168
169       Mutually exclusive with "undef_unknown".  See also "warn_unknown"
170       below.
171
172       warn_unknown
173
       Only has an effect if "undef_unknown" or "stringify_unknown" is
       enabled.
176
177       If set to a positive integer, any unknown/unsupported data structure
178       encountered will emit a warning. If set to a negative integer, it will
179       warn for unsupported data structures just the same as for a positive
180       value with one exception: For blessed, unsupported items that have
181       string overloading, we silently stringify without warning.
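
       A small sketch of the three behaviours for unsupported data, using a
       code reference as the unsupported item. The hash key "callback" is
       invented for the example.

```perl
use strict;
use warnings;
use Sereal::Encoder;
use Sereal::Decoder qw(decode_sereal);

# Code references are not serializable by Sereal.
my $data = { callback => sub { 1 } };

# Default: an exception is thrown.
my $default_ok = eval { Sereal::Encoder->new->encode($data); 1 };

# undef_unknown: the unsupported item is encoded as undef.
my $as_undef = decode_sereal(
    Sereal::Encoder->new({ undef_unknown => 1 })->encode($data)
);

# stringify_unknown: the unsupported item is encoded as its string form.
my $as_string = decode_sereal(
    Sereal::Encoder->new({ stringify_unknown => 1 })->encode($data)
);

print "default threw an exception\n" unless $default_ok;
print "stringified form: $as_string->{callback}\n";
```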
182
183       max_recursion_depth
184
185       "Sereal::Encoder" is recursive. If you pass it a Perl data structure
186       that is deeply nested, it will eventually exhaust the C stack.
187       Therefore, there is a limit on the depth of recursion that is accepted.
188       It defaults to 10000 nested calls. You may choose to override this
189       value with the "max_recursion_depth" option. Beware that setting it too
190       high can cause hard crashes, so only do that if you KNOW that it is
191       safe to do so.
192
193       Do note that the setting is somewhat approximate. Setting it to 10000
194       may break at somewhere between 9997 and 10003 nested structures
195       depending on their types.
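
       To illustrate the limit, this sketch builds an array nested 50 levels
       deep and encodes it with a deliberately low "max_recursion_depth".
       The depth values are arbitrary and chosen only for the demonstration.

```perl
use strict;
use warnings;
use Sereal::Encoder;

# Build an array nested 50 levels deep.
my $deep   = [];
my $cursor = $deep;
for (1 .. 50) {
    my $next = [];
    push @$cursor, $next;
    $cursor = $next;
}

# A limit well below the nesting depth makes the encoder refuse the input.
my $shallow = Sereal::Encoder->new({ max_recursion_depth => 10 });
my $ok = eval { $shallow->encode($deep); 1 };
print "refused: $@" unless $ok;

# The default limit handles this structure without trouble.
my $out = Sereal::Encoder->new->encode($deep);
printf "encoded %d bytes with the default limit\n", length $out;
```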
196
197       canonical
198
       Enable all options which are related to producing canonical output, so
       that two structures with similar contents produce the same serialized
       form.
202
203       See the caveats elsewhere in this document about producing canonical
204       output.
205
       Currently sets the default for the following parameters:
       "canonical_refs" and "sort_keys". If either of those options is set
       explicitly, the explicit value is used instead of this default.  More
       options may be added in the future.
209
210       You are warned that use of this option may incur additional performance
211       penalties in a future release by enabling other options than those
212       listed here.
213
214       canonical_refs
215
       Normally "Sereal::Encoder" will use ARRAYREF and HASHREF tags when the
       item contains fewer than 16 items and is not referenced more than
       once. This flag will override this optimization and use a standard
       REFN ARRAY style tag output. This is primarily useful for producing
       canonical output and for testing Sereal itself.
221
222       See "CANONICAL REPRESENTATION" for why you might want to use this, and
223       for the various caveats involved.
224
225       sort_keys
226
227       Normally "Sereal::Encoder" will output hashes in whatever order is
228       convenient, generally that used by perl to actually store the hash, or
229       whatever order was returned by a tied hash.
230
231       If this option is enabled then the Encoder will sort the keys before
232       outputting them. It uses more memory, and is quite a bit slower than
233       the default.
234
235       Generally speaking this should mean that a hash and a copy should
236       produce the same output. Nevertheless the user is warned that Perl has
237       a way of "morphing" variables on use, and some of its rules are a
238       little arcane (for instance utf8 keys), and so two hashes that might
239       appear to be the same might still produce different output as far as
240       Sereal is concerned.
241
242       As of 3.006_007 (prerelease candidate for 3.007) the sort order has
243       been changed to the following: order by length of keys (in bytes)
244       ascending, then by byte order of the raw underlying string, then by
245       utf8ness, with non-utf8 first. This order was chosen because it is the
246       most efficient to implement, both in terms of memory and time. This
247       sort order is enabled when sort_keys is set to 1.
248
249       You may also produce output in Perl "cmp" order, by setting sort_keys
250       to 2.  And for backwards compatibility you may also produce output in
251       reverse Perl "cmp" order by setting sort_keys to 3. Prior to 3.006_007
252       this was the only sort order possible, although it was not explicitly
253       defined what it was.
254
       Note that comparatively speaking both of the "cmp" sort orders are slow
       and memory inefficient. Unless you have a really good reason, stick to
       the default, which is fast and as lean as possible.
258
259       Unless you are concerned with "cross process canonical representation"
260       then it doesn't matter what option you choose.
261
262       See "CANONICAL REPRESENTATION" for why you might want to use this, and
263       for the various caveats involved.
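
       A minimal sketch of what "canonical" (and thereby "sort_keys") buys
       you: two hashes with the same contents, filled in a different order,
       encode to the same bytes. The keys and values are invented for the
       example, and the caveats about canonical output still apply to less
       trivial data.

```perl
use strict;
use warnings;
use Sereal::Encoder;

my @keys = qw(apple banana cherry date);

# Same contents, different insertion order.
my (%first, %second);
$first{$_}  = length $_ for @keys;
$second{$_} = length $_ for reverse @keys;

my $encoder    = Sereal::Encoder->new({ canonical => 1 });
my $first_out  = $encoder->encode(\%first);
my $second_out = $encoder->encode(\%second);

print $first_out eq $second_out
    ? "encodings are identical\n"
    : "encodings differ\n";
```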
264
265       no_shared_hashkeys
266
267       When the "no_shared_hashkeys" option is set to a true value, then the
268       encoder will disable the detection and elimination of repeated hash
269       keys. This only has an effect for serializing structures containing
270       hashes.  By skipping the detection of repeated hash keys, performance
271       goes up a bit, but the size of the output can potentially be much
272       larger.
273
274       Do not disable this unless you have a reason to.
275
276       dedupe_strings
277
       If this option is enabled/true then Sereal will use a hash to encode
       duplicates of strings during serialization efficiently using (internal)
280       backreferences. This has a performance and memory penalty during
281       encoding so it defaults to off.  On the other hand, data structures
282       with many duplicated strings will see a significant reduction in the
283       size of the encoded form. Currently only strings longer than 3
284       characters will be deduped, however this may change in the future.
285
286       Note that Sereal will perform certain types of deduping automatically
287       even without this option. In particular class names and hash keys (see
288       also the "no_shared_hashkeys" setting) are deduped regardless of this
289       option. Only enable this if you have good reason to believe that there
290       are many duplicated strings as values in your data structure.
291
292       Use of this option does not require an upgraded decoder (this option
293       was added in Sereal::Encoder 0.32). The deduping is performed in such a
294       way that older decoders should handle it just fine.  In other words,
295       the output of a Sereal decoder should not depend on whether this option
296       was used during encoding. See also below: aliased_dedupe_strings.
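
       The effect of "dedupe_strings" can be sketched with an array of
       identical string values. The sample string is arbitrary; whether the
       option pays off for real data needs to be measured.

```perl
use strict;
use warnings;
use Sereal::Encoder;
use Sereal::Decoder qw(decode_sereal);

# 200 separate scalars that all hold the same string value.
my @values = map { "a fairly long repeated value" } 1 .. 200;

my $plain_out  = Sereal::Encoder->new->encode(\@values);
my $dedupe_out = Sereal::Encoder->new({ dedupe_strings => 1 })->encode(\@values);

printf "default: %d bytes, dedupe_strings: %d bytes\n",
    length($plain_out), length($dedupe_out);

# The decoded structure does not depend on whether deduping was used.
my $decoded = decode_sereal($dedupe_out);
```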
297
298       aliased_dedupe_strings
299
300       This is an advanced option that should be used only after fully
301       understanding its ramifications.
302
303       This option enables a mode of operation that is similar to
304       dedupe_strings and if both options are set, aliased_dedupe_strings
305       takes precedence.
306
307       The behaviour of aliased_dedupe_strings differs from dedupe_strings in
308       that the duplicate occurrences of strings are emitted as Perl language
309       level aliases instead of as Sereal-internal backreferences. This means
310       that using this option actually produces a different output data
311       structure when decoding. The upshot is that with this option, the
312       application using (decoding) the data may save a lot of memory in some
313       situations but at the cost of potential action at a distance due to the
314       aliasing.
315
316       Beware: The test suite currently does not cover this option as well as
317       it probably should. Patches welcome.
318
319       protocol_version
320
       Specifies the version of the Sereal protocol to emit. Valid are
       integers between 1 and the current version. If not specified, the most
       recent protocol version will be used. See also "use_protocol_v1".
324
325       It is strongly advised to use the latest protocol version outside of
326       migration periods.
327
328       use_protocol_v1
329
330       This option is deprecated in favour of the "protocol_version" option
331       (see above).
332
333       If set, the encoder will emit Sereal documents following protocol
334       version 1.  This is strongly discouraged except for temporary
335       compatibility/migration purposes.
336

INSTANCE METHODS

338   encode
339       Given a Perl data structure, serializes that data structure and returns
340       a binary string that can be turned back into the original data
341       structure by Sereal::Decoder. The method expects a data structure to
342       serialize as first argument, optionally followed by a header data
343       structure.
344
       A header is intended for embedding small amounts of metadata, such as
       routing information, in a document, which allows users to avoid
       deserializing the main body needlessly.
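
       As a sketch of the header mechanism, the following encodes a body
       together with a small header and then reads back only the header. The
       "route" key is a made-up example of routing metadata, and
       "decode_only_header" is a method of the separate Sereal::Decoder
       module.

```perl
use strict;
use warnings;
use Sereal::Encoder;
use Sereal::Decoder;

my $encoder = Sereal::Encoder->new;
my $decoder = Sereal::Decoder->new;

my $body   = { payload => [ 1 .. 5 ] };
my $header = { route => "queue-a" };    # hypothetical routing metadata

# The header is passed as the second argument to encode().
my $doc = $encoder->encode($body, $header);

# Inspect the header without deserializing the body.
my $only_header = $decoder->decode_only_header($doc);
print "route: $only_header->{route}\n";

# Decode the body when it is actually needed.
my $only_body = $decoder->decode($doc);
```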
348
349   encode_to_file
           Sereal::Encoder->encode_to_file($file,$data,$append);
           $encoder->encode_to_file($file,$data,$append);

       Encode the data specified and write it to the named file.  If $append
       is true then the written data is appended to any existing data,
       otherwise any existing data will be overwritten.  Dies if any errors
       occur during writing the encoded data.
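
       A short sketch of writing a document to a file and reading it back.
       The temporary file and the sample data are placeholders; the file is
       read back by hand here to keep the example independent of any file
       helpers in the decoder.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);
use Sereal::Encoder;
use Sereal::Decoder qw(decode_sereal);

# Temporary file used only for this demonstration.
my ($tmp_fh, $file) = tempfile(UNLINK => 1);
close $tmp_fh;

my $encoder = Sereal::Encoder->new;

# A false $append overwrites any existing content.
$encoder->encode_to_file($file, { answer => 42 }, 0);

# Read the raw bytes back and decode them.
open my $in, '<:raw', $file or die "Cannot open $file: $!";
my $blob = do { local $/; <$in> };
close $in;

my $data = decode_sereal($blob);
print "answer: $data->{answer}\n";
```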
357

EXPORTABLE FUNCTIONS

359   sereal_encode_with_object
360       The functional interface that is equivalent to using "encode". Takes an
361       encoder object reference as first argument, followed by a data
362       structure and optional header to serialize.
363
364       This functional interface is marginally faster than the OO interface
365       since it avoids method resolution overhead and, on sufficiently modern
366       Perl versions, can usually avoid subroutine call overhead.
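
       The following sketch shows the functional and the OO interface side
       by side. For a simple structure like this one, both are expected to
       produce the same bytes; the data is arbitrary.

```perl
use strict;
use warnings;
use Sereal::Encoder qw(sereal_encode_with_object);

my $encoder = Sereal::Encoder->new;
my $data    = [ 1, 2, 3, "four" ];

# OO interface.
my $via_method = $encoder->encode($data);

# Functional interface, reusing the same encoder object.
my $via_function = sereal_encode_with_object($encoder, $data);

print "same output\n" if $via_method eq $via_function;
```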
367
368   encode_sereal
369       The functional interface that is equivalent to using "new" and
370       "encode".  Expects a data structure to serialize as first argument,
371       optionally followed by a hash reference of options (see documentation
372       for "new()").
373
374       This function cannot be used for encoding a data structure with a
375       header.  See "encode_sereal_with_header_data".
376
377       This functional interface is significantly slower than the OO interface
378       since it cannot reuse the encoder object.
379
380   encode_sereal_with_header_data
381       The functional interface that is equivalent to using "new" and
382       "encode".  Expects a data structure and a header to serialize as first
383       and second arguments, optionally followed by a hash reference of
384       options (see documentation for "new()").
385
386       This functional interface is significantly slower than the OO interface
387       since it cannot reuse the encoder object.
388

PERFORMANCE

390       See Sereal::Performance for detailed considerations on performance
391       tuning. Let it just be said that:
392
393       If you care about performance at all, then use
394       "sereal_encode_with_object" or the OO interface instead of
395       "encode_sereal". It's a significant difference in performance if you
396       are serializing small data structures.
397
398       The exact performance in time and space depends heavily on the data
399       structure to be serialized. Often there is a trade-off between space
400       and time. If in doubt, do your own testing and most importantly ALWAYS
401       TEST WITH REAL DATA. If you care purely about speed at the expense of
402       output size, you can use the "no_shared_hashkeys" option for a small
403       speed-up. If you need smaller output at the cost of higher CPU load and
404       more memory used during encoding/decoding, try the "dedupe_strings"
405       option and enable Snappy compression.
406
407       For ready-made comparison scripts, see the author_tools/bench.pl and
408       author_tools/dbench.pl programs that are part of this distribution.
409       Suffice to say that this library is easily competitive in both time and
410       space efficiency with the best alternatives.
411

FREEZE/THAW CALLBACK MECHANISM

       This mechanism is enabled using the "freeze_callbacks" option of the
       encoder.  It is inspired by the equivalent mechanism in CBOR::XS and
       differs only in one minor detail, explained below. The general
       mechanism is documented in the A GENERIC OBJECT SERIALIATION PROTOCOL
       section of Types::Serialiser.  Similar to CBOR using "CBOR", Sereal
       uses the string "Sereal" as a serializer identifier for the callbacks.
419
420       The one difference to the mechanism as supported by CBOR is that in
421       Sereal, the "FREEZE" callback must return a single value. That value
422       can be any data structure supported by Sereal (hopefully without
423       causing infinite recursion by including the original object). But
424       "FREEZE" can't return a list as with CBOR.  This should not be any
425       practical limitation whatsoever. Just return an array reference instead
426       of a list.
427
428       Here is a contrived example of a class implementing the "FREEZE" /
429       "THAW" mechanism.
430
         package
           File;

         use Moo;

         has 'path' => (is => 'ro');
         has 'fh' => (is => 'rw');

         # open file handle if necessary and return it
         sub get_fh {
           my $self = shift;
           # This could also be done with fancier Moo(se) syntax
           my $fh = $self->fh;
           if (not $fh) {
             open $fh, "<", $self->path or die $!;
             $self->fh($fh);
           }
           return $fh;
         }

         sub FREEZE {
           my ($self, $serializer) = @_;
           # Could switch on $serializer here: JSON, CBOR, Sereal, ...
           # But this case is so simple that it will work with ALL of them.
           # Do not try to serialize our file handle! Path will be enough
           # to recreate.
           return $self->path;
         }

         sub THAW {
           my ($class, $serializer, $data) = @_;
           # Turn back into object.
           return $class->new(path => $data);
         }
465
466       Why is the "FREEZE"/"THAW" mechanism important here? Our contrived
467       "File" class may contain a file handle which can't be serialized. So
468       "FREEZE" not only returns just the path (which is more compact than
469       encoding the actual object contents), but it strips the file handle
470       which can be lazily reopened on the other side of the
471       serialization/deserialization pipe.  But this example also shows that a
472       naive implementation can easily end up with subtle bugs. A file handle
473       itself has state (position in file, etc).  Thus the deserialization in
474       the above example won't accurately reproduce the original state. It
475       can't, of course, if it's deserialized in a different environment
476       anyway.
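
       For a complete round trip, here is a smaller sketch that does not
       depend on Moo. The class "My::Point" is hypothetical. "FREEZE"
       returns a single array reference, as required above, and "THAW"
       rebuilds the object from it.

```perl
use strict;
use warnings;
use Sereal::Encoder;
use Sereal::Decoder qw(decode_sereal);

package My::Point;    # hypothetical class, for illustration only

sub new {
    my ($class, %args) = @_;
    return bless { %args }, $class;
}

# Return a single value: an array reference holding the coordinates.
sub FREEZE {
    my ($self, $serializer) = @_;
    return [ $self->{x}, $self->{y} ];
}

# Rebuild the object from the value returned by FREEZE.
sub THAW {
    my ($class, $serializer, $data) = @_;
    return $class->new(x => $data->[0], y => $data->[1]);
}

package main;

my $encoder = Sereal::Encoder->new({ freeze_callbacks => 1 });
my $copy    = decode_sereal($encoder->encode(My::Point->new(x => 1, y => 2)));

printf "%s at (%d, %d)\n", ref($copy), $copy->{x}, $copy->{y};
```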
477

THREAD-SAFETY

       "Sereal::Encoder" is thread-safe on Perl 5.8.7 and higher. Here,
       "thread-safe" means that if you create a new thread, all
       "Sereal::Encoder" objects will become a reference to undef in the new
       thread. This might change in a future release to become a full clone of
       the encoder object.
484

CANONICAL REPRESENTATION

486       You might want to compare two data structures by comparing their
487       serialized byte strings.  For that to work reliably the serialization
488       must take extra steps to ensure that identical data structures are
489       encoded into identical serialized byte strings (a so-called "canonical
490       representation").
491
492       Unfortunately in Perl there is no such thing as a "canonical
493       representation".  Most people are interested in "structural
494       equivalence" but even that is less well defined than most people think.
495       For instance in the following example:
496
           my $array1= [ 0, 0 ];
           my $array2= do {
               my $zero= 0;
               sub{ \@_ }->($zero,$zero);
           };
502
503       the question of whether $array1 is structurally equivalent to $array2
504       is a subjective one. Sereal for instance would NOT consider them
505       equivalent but "Test::Deep" would.  There are many examples of this in
506       Perl. Simply stringifying a number technically changes the scalar.
507       Storable would notice this, but Sereal generally would not.
508
       Despite this, as of 3.002 the Sereal encoder supports a "canonical"
       option which will make a "best effort" attempt at producing a canonical
       representation of a data structure.  This mode is actually a
       combination of several other modes which may also be enabled
       independently. As and when we add new options to the encoder that
       would assist in this regard, the "canonical" option will also enable
       them.  These options may come with a performance penalty, so care
       should be taken to read the Changes file and test the performance
       implications when upgrading a system that uses this option.
518
       It is important to note that using canonical representation to
       determine if two data structures are different is subject to false
       positives. If two Sereal encodings are identical you can generally
       assume that the two data structures are functionally equivalent from
       the point of view of normal Perl code (XS code might disagree). However,
       if two Sereal encodings differ, the data structures may still be
       functionally equivalent.  In practice it seems that the false-positive
       rate is low, but your mileage may vary.
527
528       Some of the issues with producing a true canonical representation are
529       outlined below:
530
       Sereal doesn't order the hash keys by default.
           This can be enabled via the "sort_keys" option, which is itself
           enabled by the "canonical" option.
534
535       Sereal output is sensitive to refcounts
536           This can be somewhat mitigated by the use of "canonical_refs", see
537           above.
538
539       There are multiple valid Sereal documents that you can produce for the
540       same Perl data structure.
541           Just sorting hash keys is not enough.  Some of the reasons are
542           outlined below. These issues are especially relevant when
543           considering language interoperability.
544
545           PAD bytes
546               A trivial example is PAD bytes which mean nothing and are
547               skipped. They mostly exist for encoder optimizations to prevent
548               certain nasty backtracking situations from becoming O(n) at the
549               cost of one byte of output. An explicit canonical mode would
550               have to outlaw them (or add more of them) and thus require a
               much more complicated implementation of refcount/weakref
               handling in the encoder while at the same time causing some
553               operations to go from O(1) to a full memcpy of everything after
554               the point of where we backtracked to. Nasty.
555
556           COPY tag
557               Another example is COPY. The COPY tag indicates that the next
558               element is an identical copy of a previous element (which is
559               itself forbidden from including COPY's other than for class
560               names). COPY is purely internal. The Perl/XS implementation
561               uses it to share hash keys and class names. One could use it
               for other strings (theoretically), but doesn't for time-
               efficiency reasons. We'd have to outlaw the use of this
               (significant) optimization for canonicalization.
565
566           REF representation
567               Sereal represents a reference to an array as a sequence of tags
568               which, in its simplest form, reads REF, ARRAY $array_length
569               TAG1 TAG2 ....  The separation of "REF" and "ARRAY" is
570               necessary to properly implement all of Perl's referencing and
571               aliasing semantics correctly. Quite frequently, however, your
572               array is only referenced once and plainly so. If it's also at
573               most 15 elements long, Sereal optimizes all of the "REF" and
574               "ARRAY" tags, as well as the length into a special one byte
575               ARRAYREF tag. This is a very significant optimization for
576               common cases. This, however, does mean that most arrays up to
577               15 elements could be represented in two different, yet
578               perfectly valid forms. ARRAYREF would have to be outlawed for a
579               properly canonical form. The exact same logic applies to HASH
580               vs. HASHREF. This behavior can be overridden by the
581               "canonical_refs" option, which disables use of HASHREF and
582               ARRAYREF.
583
584           Numeric representation
               Similar to how Sereal can represent arrays and hashes in a full
               and a compact form, integers have more than one encoding. For
               small integers (between -16 and +15 inclusive), Sereal emits
               only one byte including the encoding of the type of data. For
               larger integers, it can use either varints (positive only) or
               zigzag encoding, which can also represent negative numbers. For
               a canonical mode, the space optimizations would have to be
               turned off and it would have to be explicitly specified whether
               varint or zigzag encoding is to be used for encoding positive
               integers.
594
595               Perl may choose to retain multiple representations of a scalar.
596               Specifically, it can convert integers, floating point numbers,
597               and strings on the fly and will aggressively cache the results.
598               Normally, it remembers which of the representations can be
599               considered canonical, that means, which can be used to recreate
600               the others reliably. For example, 0 and "0" can both be
601               considered canonical since they naturally transform into each
602               other. Beyond intrinsic ambiguity, there are ways to trick Perl
603               into allowing a single scalar to have distinct string, integer,
604               and floating point representations that are all flagged as
605               canonical, but can't be transformed into each other. These are
606               the so-called dualvars. Sereal cannot represent dualvars (and
607               that's a good thing).
608
609               Floating point values can appear to be the same but serialize
610               to different byte strings due to insignificant 'noise' in the
611               floating point representation. Sereal supports different
612               floating point precisions and will generally choose the most
613               compact that can represent your floating point number
614               correctly.
615
           There are also a few cases where Sereal will produce different
           documents for values that you might think are the same thing,
           because if you compared them with "eq" or "==", perl itself would
           consider them equivalent. However, for the purposes of
           serialization they're not the same value.
621
           A good example of these cases is where Test::Deep and Sereal's
           canonical mode differ. We have tests for some of these cases in
           t/030_canonical_vs_test_deep.t. Here are the issues we've noticed
           so far:
626
627           Sereal considers ASCII strings with the UTF-8 flag to be different
628           from the same string without the UTF-8 flag
               Consider:

                   my $language_code = "en";

               vs.:

                   my $language_code = "en";
                   utf8::upgrade($language_code);
637
638               Sereal's canonical mode will encode these strings differently,
639               as it should, since the UTF-8 flag will be passed along on
640               interpolation.
641
               But this can be confusing if you receive user-supplied ASCII
               strings whose UTF-8 flag you may inadvertently turn on, e.g.
               because you're comparing an ASCII value in a database to a
               value submitted in a UTF-8 web form.
646
647           Sereal will encode strings that look like numbers as strings,
648           unless they've been used in numeric context
               I.e. the two values in each of these pairs will be encoded
               differently:
650
651                   my $IV_x = "12345";
652                   my $IV_y = "12345" + 0;
653                   my $NV_x = "12.345";
654                   my $NV_y = "12.345" + 0;
655
656               But as noted above something like Test::Deep will consider
657               these to be the same thing.
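
               The two cases above can be made visible with core modules
               alone. The following is a minimal sketch, not Sereal code:
               describe() is a hypothetical helper that lists the internal
               flags of a scalar, which are what make these "equal" values
               different.

```perl
use strict;
use warnings;
use B ();

# Hypothetical helper for this sketch (not a Sereal API): list the internal
# flags of the caller's scalar, inspected in place via $_[0].
sub describe {
    my $flags = B::svref_2object(\$_[0])->FLAGS;
    return join ",",
        ( utf8::is_utf8($_[0])  ? "UTF8" : () ),
        ( $flags & B::SVf_POK() ? "POK"  : () ),
        ( $flags & B::SVf_IOK() ? "IOK"  : () ),
        ( $flags & B::SVf_NOK() ? "NOK"  : () );
}

my $plain    = "en";
my $upgraded = "en";
utf8::upgrade($upgraded);

my $str_int = "12345";
my $num_int = "12345" + 0;
my $str_flt = "12.345";
my $num_flt = "12.345" + 0;

print "plain:    ", describe($plain),    "\n";
print "upgraded: ", describe($upgraded), "\n";
print "same under eq\n" if $plain eq $upgraded;

# Inspect the flags before comparing: "==" itself puts its operands into
# numeric context.
print "str_int: ", describe($str_int), "  num_int: ", describe($num_int), "\n";
print "str_flt: ", describe($str_flt), "  num_flt: ", describe($num_flt), "\n";
print "same under ==\n" if $str_int == $num_int && $str_flt == $num_flt;
```

               Note the ordering: the flags are printed before the "=="
               comparison, because that comparison is itself a use in
               numeric context.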
658
           We might add more aggressive flags to the canonical mode in the
           future to deal with this. For the cases noted above, some
           combination of turning the UTF-8 flag on for all strings, or
           stripping it from strings that have it but are ASCII-only, would
           "work". Similarly, we could scan strings to see if they match
           "looks_like_number()" and if so numify them.

           This would produce output that either would be a lot bigger (having
           to encode all numbers as strings), or would be more expensive to
           generate (having to scan strings for numeric or non-ASCII content),
           and for some cases, like the UTF-8 flag munging, wouldn't be
           suitable for general use outside of canonicalization.
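
           Until such flags exist, an application can apply the same kind of
           normalization itself before encoding. The following is only a
           sketch of that idea: normalize_scalar() is a hypothetical helper,
           not part of Sereal, and it handles plain scalars only.

```perl
use strict;
use warnings;
use Scalar::Util qw(looks_like_number);

# Hypothetical helper (not part of Sereal): return a normalized copy of a
# plain, non-reference scalar.
sub normalize_scalar {
    my ($value) = @_;
    return $value if !defined $value || ref $value;

    # Numify anything that Perl would accept as a number.
    return $value + 0 if looks_like_number($value);

    # Drop the UTF-8 flag from strings that contain only ASCII characters.
    utf8::downgrade($value)
        if utf8::is_utf8($value) && $value !~ /[^\x00-\x7F]/;

    return $value;
}

my $upgraded = "en";
utf8::upgrade($upgraded);

my $lang  = normalize_scalar($upgraded);   # ASCII-only: UTF-8 flag removed
my $count = normalize_scalar("12345");     # now a number, not a string
```

           Such a pass is lossy, which is one reason it is unsuitable as a
           default: for example, the string "0e0" is true in Perl, whereas
           the number it numifies to is false.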
671
672       Often, people don't actually care about "canonical" in the strict sense
673       required for real identity checking. They just require a best-effort
674       sort of thing for caching. But it's a slippery slope!
675
676       In a nutshell, the "canonical" option may be sufficient for an
677       application which is simply serializing a cache key, and thus there's
678       little harm in an occasional false-negative, but think carefully before
679       applying Sereal in other use-cases.
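
       The cache-key use case might look roughly like the following sketch.
       It assumes that Sereal::Encoder and the core Digest::SHA module are
       installed; cache_key() is a hypothetical helper, and hashing the
       output with SHA-256 is a choice made for this example only.

```perl
use strict;
use warnings;
use Sereal::Encoder;
use Digest::SHA qw(sha256_hex);

my $encoder = Sereal::Encoder->new({ canonical => 1 });

# Hypothetical helper: derive a cache key from a data structure.
sub cache_key {
    my ($data) = @_;
    return sha256_hex( $encoder->encode($data) );
}

my $key_a = cache_key({ lang => "en", page => 2 });
my $key_b = cache_key({ page => 2, lang => "en" });    # other key order
my $key_c = cache_key({ lang => "en", page => "2" });  # page is a string

print "a and b share a key\n"             if $key_a eq $key_b;
print "c gets its own key (cache miss)\n" if $key_c ne $key_a;
```

       The third key shows the kind of false negative described above: the
       data is "the same" to most Perl code, but the string "2" and the
       number 2 are encoded differently, so the cache lookup misses.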
680

KNOWN ISSUES

682       Strings Or Numbers
683           Perl does not make a strong distinction between strings and
684           numbers, and from an internal point of view it can be difficult to
685           tell what the "right" representation is for a given variable.
686
           Sereal tries not to be lossy. So if it detects that the string
           value and the numeric value of a variable are different, it will
           generally round trip the *string* value. This means that "special"
           strings often used in Perl function returns, like "0 but true" and
           "0e0", will round trip in a way that preserves their normal Perl
           semantics. However, this also means that "non canonical" values,
           like " 100 ", which will numify as 100 without warnings, will round
           trip as their string values.
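
           The semantics at stake can be seen in plain Perl. The following
           sketch uses no Sereal code and only shows why keeping the string
           value is the safer choice for these values:

```perl
use strict;
use warnings;

# "Special" strings: true as booleans, zero as numbers.
for my $special ("0 but true", "0e0") {
    my $truth  = $special ? "true" : "false";
    my $number = $special + 0;
    print qq{"$special" is $truth as a boolean and $number as a number\n};
}

# A "non canonical" value: numerically 100, but a different string.
my $padded = " 100 ";
print "numerically equal to 100\n"   if $padded == 100;
print "but not the string \"100\"\n" if $padded ne "100";
```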
695
           Perl also has some operators, the bitwise operators ^, | and &,
           which do different things depending on whether their arguments
           have been used in numeric context, as the following examples show:
699
700               perl -le'my $x="1"; $i=int($x); print unpack "H*", $x ^ "1"'
701               30
702
703               perl -le'my $x="1"; print unpack "H*", $x ^ "1"'
704               00
705
706               perl -le'my $x=" 1 "; $i=int($x); print unpack "H*", $x ^ "1"'
707               30
708
709               perl -le'my $x=" 1 "; print unpack "H*", $x ^ "1"'
710               113120
711
712           Sereal currently cannot round trip this property properly.
713
714           An extreme case of this problem is that of "dualvars", which can be
715           created using the Scalar::Util::dualvar() function. This function
716           allows one to create variables which have string and integer values
717           which are completely unrelated to each other.  Sereal currently
718           will choose the *string* value when it detects these items.
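
           A minimal sketch of such a value, using only the core
           Scalar::Util module (the variable name and values are
           illustrative):

```perl
use strict;
use warnings;
use Scalar::Util qw(dualvar);

# One scalar with unrelated numeric and string values.
my $status = dualvar(404, "Not Found");

print "as number: ", $status + 0, "\n";
print "as string: $status\n";
```

           Going by the behaviour described above, only the string value
           ("Not Found" here) would be expected to survive a round trip
           through Sereal.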
719
720           It is possible that a future release of the protocol will fix these
721           issues.
722

BUGS, CONTACT AND SUPPORT

724       For reporting bugs, please use the github bug tracker at
725       <http://github.com/Sereal/Sereal/issues>.
726
727       For support and discussion of Sereal, there are two Google Groups:
728
729       Announcements around Sereal (extremely low volume):
730       <https://groups.google.com/forum/?fromgroups#!forum/sereal-announce>
731
732       Sereal development list:
733       <https://groups.google.com/forum/?fromgroups#!forum/sereal-dev>
734

AUTHORS AND CONTRIBUTORS

736       Yves Orton <demerphq@gmail.com>
737
738       Damian Gryski
739
740       Steffen Mueller <smueller@cpan.org>
741
742       Rafaël Garcia-Suarez
743
744       Ævar Arnfjörð Bjarmason <avar@cpan.org>
745
746       Tim Bunce
747
748       Daniel Dragan <bulkdd@cpan.org> (Windows support and bugfixes)
749
750       Zefram
751
752       Borislav Nikolov
753
754       Ivan Kruglov <ivan.kruglov@yahoo.com>
755
756       Some inspiration and code was taken from Marc Lehmann's excellent
757       JSON::XS module due to obvious overlap in problem domain. Thank you!
758

ACKNOWLEDGMENT

760       This module was originally developed for Booking.com.  With approval
761       from Booking.com, this module was generalized and published on CPAN,
762       for which the authors would like to express their gratitude.
763

COPYRIGHT AND LICENSE

       Copyright (C) 2012, 2013, 2014 by Steffen Mueller
       Copyright (C) 2012, 2013, 2014 by Yves Orton
767
768       The license for the code in this distribution is the following, with
769       the exceptions listed below:
770
771       This library is free software; you can redistribute it and/or modify it
772       under the same terms as Perl itself.
773
774       Except portions taken from Marc Lehmann's code for the JSON::XS module,
775       which is licensed under the same terms as this module.
776
       Also except the code for the Snappy compression library, whose license
       is reproduced below and which, to the best of our knowledge, is
       compatible with this module's license. The license for the enclosed
       Snappy code is:
781
782         Copyright 2011, Google Inc.
783         All rights reserved.
784
785         Redistribution and use in source and binary forms, with or without
786         modification, are permitted provided that the following conditions are
787         met:
788
789           * Redistributions of source code must retain the above copyright
790         notice, this list of conditions and the following disclaimer.
791           * Redistributions in binary form must reproduce the above
792         copyright notice, this list of conditions and the following disclaimer
793         in the documentation and/or other materials provided with the
794         distribution.
795           * Neither the name of Google Inc. nor the names of its
796         contributors may be used to endorse or promote products derived from
797         this software without specific prior written permission.
798
799         THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
800         "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
801         LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
802         A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
803         OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
804         SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
805         LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
806         DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
807         THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
808         (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
809         OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
810
811
812
813perl v5.30.1                      2020-02-04                Sereal::Encoder(3)