Sereal::Encoder(3)    User Contributed Perl Documentation   Sereal::Encoder(3)

NAME

       Sereal::Encoder - Fast, compact, powerful binary serialization

SYNOPSIS

         use Sereal::Encoder qw(encode_sereal sereal_encode_with_object);

         my $encoder = Sereal::Encoder->new({...options...});
         my $out = $encoder->encode($structure);

         # alternatively the functional interface:
         $out = sereal_encode_with_object($encoder, $structure);

         # much slower functional interface with no persistent objects:
         $out = encode_sereal($structure, {... options ...});

DESCRIPTION

       This library implements an efficient, compact-output, and feature-rich
       serializer using a binary protocol called Sereal.  Its sister module
       Sereal::Decoder implements a decoder for this format.  The two are
       released separately to allow for independent and safer upgrading.  If
       you care greatly about performance, consider reading the
       Sereal::Performance documentation after finishing this document.

       By default, this encoder implementation currently emits Sereal
       protocol version 4.

       The protocol specification and many other bits of documentation can be
       found in the GitHub repository.  Right now, the specification is at
       <https://github.com/Sereal/Sereal/blob/master/sereal_spec.pod>, a
       discussion of the design objectives is in
       <https://github.com/Sereal/Sereal/blob/master/README.pod>, and the
       output of our benchmarks can be seen at
       <https://github.com/Sereal/Sereal/wiki/Sereal-Comparison-Graphs>.  For
       more information on getting the best performance out of Sereal, have a
       look at the "PERFORMANCE" section below.

CLASS METHODS

   new
       Constructor.  Optionally takes a hash reference as first parameter.
       This hash reference may contain any number of options that influence
       the behaviour of the encoder.

       Currently, the following options are recognized.  None of them are on
       by default.

       compress

       If this option is provided and true, compression of the document body
       is enabled.  As of Sereal version 4, three different compression
       techniques are supported and can be enabled by setting "compress" to
       the respective named constants (exportable from the "Sereal::Encoder"
       module): Snappy (named constant: "SRL_SNAPPY"), Zlib ("SRL_ZLIB") and
       Zstd ("SRL_ZSTD").  For your convenience, there is also a
       "SRL_UNCOMPRESSED" constant.

       If this option is set, then the Snappy-related options below are
       ignored.  They are otherwise recognized for compatibility only.

       compress_threshold

       The size threshold (in bytes) of the uncompressed output below which
       compression is not attempted, even if it is enabled.  Defaults to one
       kilobyte (1024 bytes).  Set this to 0 and "compress" to a
       non-"SRL_UNCOMPRESSED" value to always attempt to compress.  Note that
       the document will not be compressed if the resulting size would be
       bigger than the original size (even if "compress_threshold" is 0).

       compress_level

       If Zlib or Zstd compression is used, then this option sets the
       compression level.  Zlib uses a range from 1 (fastest) to 9 (best)
       and defaults to 6.  Zstd uses a range from 1 (fastest) to 22 (best)
       and defaults to 3.

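       As a minimal sketch of how the compression options fit together, the
       following compares uncompressed and Zlib-compressed output.  The
       payload is hypothetical and deliberately repetitive; the actual
       savings depend entirely on your data.

```perl
use strict;
use warnings;
use Sereal::Encoder qw(SRL_UNCOMPRESSED SRL_ZLIB);

# Hypothetical, highly repetitive payload so that compression pays off.
my $data = { map { ("key_$_" => "value " x 50) } 1 .. 100 };

my $plain = Sereal::Encoder->new({ compress => SRL_UNCOMPRESSED });
my $zlib  = Sereal::Encoder->new({
    compress           => SRL_ZLIB,
    compress_level     => 9,    # 1 (fastest) .. 9 (best) for Zlib
    compress_threshold => 0,    # always attempt compression
});

my $plain_out = $plain->encode($data);
my $zlib_out  = $zlib->encode($data);

printf "uncompressed: %d bytes, zlib: %d bytes\n",
    length($plain_out), length($zlib_out);
```

       Swapping "SRL_ZLIB" for "SRL_SNAPPY" or "SRL_ZSTD" selects one of the
       other supported techniques.
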
       snappy

       See also the "compress" option.  This option is provided only for
       compatibility with Sereal V1.

       If set, the main payload of the Sereal document will be compressed
       using Google's Snappy algorithm.  This can yield anywhere from no
       effect to significant savings on output size at rather low run time
       cost.  If in doubt, test with your data whether this helps or not.

       The decoder (version 0.04 and up) will know how to handle
       Snappy-compressed Sereal documents transparently.

       Note: The "snappy_incr" and "snappy" options are identical in Sereal
       protocol v2 and up (so by default).  If using an older protocol
       version (see the "protocol_version" and "use_protocol_v1" options
       below) to emit Sereal V1 documents, this emits non-incrementally
       decodable documents.  See "snappy_incr" in those cases.

       snappy_incr

       See also the "compress" option.  This option is provided only for
       compatibility with Sereal V1.

       Same as the "snappy" option for default operation (that is, in Sereal
       v2 or up).

       In Sereal V1, enables a version of the Snappy protocol which is
       suitable for incremental parsing of packets.  See also the "snappy"
       option above for more details.

       snappy_threshold

       See also the "compress" option.  This option is provided only for
       compatibility with Sereal V1.

       This option is a synonym for the "compress_threshold" option, but only
       if Snappy compression is enabled.

       croak_on_bless

       If this option is set, then the encoder will refuse to serialize
       blessed references and throw an exception instead.

       This can be important because blessed references can mean executing a
       destructor on a remote system or generally executing code based on
       data.

       See also "no_bless_objects" to skip the blessing of objects.  When both
       flags are set, "croak_on_bless" has a higher precedence than
       "no_bless_objects".

       freeze_callbacks

       This option was introduced in Sereal v2 and needs a Sereal v2 decoder.

       If this option is set, the encoder will check for and possibly invoke
       the "FREEZE" method on any object in the input data.  An object that
       was serialized using its "FREEZE" method will have its corresponding
       "THAW" class method called during deserialization.  The exact
       semantics are documented below under "FREEZE/THAW CALLBACK MECHANISM".

       Beware that using this functionality means a significant slowdown for
       object serialization.  Even when serializing objects without a
       "FREEZE" method, the additional method lookup will cost a small amount
       of runtime.  Yes, "Sereal::Encoder" is so fast that this may make a
       difference.

       no_bless_objects

       If this option is set, then the encoder will serialize blessed
       references without the bless information and provide plain data
       structures instead.

       See also the "croak_on_bless" option above for more details.

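       The two object-related options can be contrasted with a short sketch.
       The class name "My::Thing" is hypothetical and exists only for this
       illustration.

```perl
use strict;
use warnings;
use Sereal::Encoder;

# Hypothetical blessed object used only for this illustration.
my $object = bless { id => 42 }, 'My::Thing';

# croak_on_bless: encoding a blessed reference throws an exception.
my $strict = Sereal::Encoder->new({ croak_on_bless => 1 });
my $ok = eval { $strict->encode($object); 1 };
print "croak_on_bless refused the object: $@" unless $ok;

# no_bless_objects: the same data is encoded without the class information.
my $plain  = Sereal::Encoder->new({ no_bless_objects => 1 });
my $normal = Sereal::Encoder->new;
my $without_class = $plain->encode($object);
my $with_class    = $normal->encode($object);

printf "with class: %d bytes, without class: %d bytes\n",
    length($with_class), length($without_class);
```

       Wrapping the call in "eval" is one way to handle the exception; any
       other exception handling mechanism works equally well.
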
       undef_unknown

       If set, unknown/unsupported data structures will be encoded as "undef"
       instead of throwing an exception.

       Mutually exclusive with "stringify_unknown".  See also "warn_unknown"
       below.

       stringify_unknown

       If set, unknown/unsupported data structures will be stringified and
       encoded as that string instead of throwing an exception.  The
       stringification may cause a warning to be emitted by perl.

       Mutually exclusive with "undef_unknown".  See also "warn_unknown"
       below.

       warn_unknown

       Only has an effect if "undef_unknown" or "stringify_unknown" is
       enabled.

       If set to a positive integer, any unknown/unsupported data structure
       encountered will emit a warning.  If set to a negative integer, it
       will warn for unsupported data structures just the same as for a
       positive value, with one exception: blessed, unsupported items that
       have string overloading are silently stringified without a warning.

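       The following sketch illustrates the difference between the default
       behaviour and "undef_unknown".  It assumes that a code reference
       counts as unsupported data, and the structure itself is hypothetical.

```perl
use strict;
use warnings;
use Sereal::Encoder;

# A code reference stands in for data the encoder cannot serialize.
my $data = { name => 'job', callback => sub { 1 } };

# Default behaviour: the encoder throws an exception.
my $default_ok = eval { Sereal::Encoder->new->encode($data); 1 };
print "default encoder failed: $@" unless $default_ok;

# undef_unknown: the unsupported value is encoded as undef instead.
my $lenient = Sereal::Encoder->new({ undef_unknown => 1 });
my $out     = $lenient->encode($data);
printf "lenient encoder produced %d bytes\n", length($out);
```

       Adding "warn_unknown => 1" to the lenient encoder would additionally
       report each replaced value as a warning.
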
       max_recursion_depth

       "Sereal::Encoder" is recursive.  If you pass it a Perl data structure
       that is deeply nested, it will eventually exhaust the C stack.
       Therefore, there is a limit on the depth of recursion that is accepted.
       It defaults to 10000 nested calls.  You may choose to override this
       value with the "max_recursion_depth" option.  Beware that setting it
       too high can cause hard crashes, so only do that if you KNOW that it
       is safe to do so.

       Do note that the setting is somewhat approximate.  Setting it to 10000
       may break at somewhere between 9997 and 10003 nested structures
       depending on their types.

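       A small sketch of the limit in action: a structure nested 200 levels
       deep is refused by an encoder limited to 100 levels, while the default
       limit accepts it.  The nesting depth and the limit are arbitrary
       values chosen for illustration.

```perl
use strict;
use warnings;
use Sereal::Encoder;

# Build an array nested 200 levels deep (hypothetical test data).
my $deep   = [];
my $cursor = $deep;
for (1 .. 200) {
    my $next = [];
    push @$cursor, $next;
    $cursor = $next;
}

my $limited = Sereal::Encoder->new({ max_recursion_depth => 100 });
my $default = Sereal::Encoder->new;

# Encoding past the limit throws, so wrap both calls in eval.
my $limited_ok = eval { $limited->encode($deep); 1 };
my $default_ok = eval { $default->encode($deep); 1 };

print "limit 100: ",     ($limited_ok ? "encoded" : "refused"), "\n";
print "default limit: ", ($default_ok ? "encoded" : "refused"), "\n";
```
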
       canonical

       Enable all options which are related to producing canonical output, so
       that two structures with similar contents produce the same serialized
       form.

       See the caveats elsewhere in this document about producing canonical
       output.

       Currently sets the default for the following parameters:
       "canonical_refs" and "sort_keys".  If one of those options is set
       explicitly, then this setting is ignored for it.  More options may be
       added in the future.

       You are warned that use of this option may incur additional performance
       penalties in a future release by enabling other options than those
       listed here.

       canonical_refs

       Normally "Sereal::Encoder" will use ARRAYREF and HASHREF tags when the
       item contains fewer than 16 items and is not referenced more than
       once.  This flag will override this optimization and use a standard
       REFN ARRAY style tag output.  This is primarily useful for producing
       canonical output and for testing Sereal itself.

       See "CANONICAL REPRESENTATION" for why you might want to use this, and
       for the various caveats involved.

       sort_keys

       Normally "Sereal::Encoder" will output hashes in whatever order is
       convenient, generally that used by perl to actually store the hash, or
       whatever order was returned by a tied hash.

       If this option is enabled then the encoder will sort the keys before
       outputting them.  This uses more memory and is quite a bit slower than
       the default.

       Generally speaking this should mean that a hash and a copy should
       produce the same output.  Nevertheless the user is warned that Perl
       has a way of "morphing" variables on use, and some of its rules are a
       little arcane (for instance utf8 keys), and so two hashes that might
       appear to be the same might still produce different output as far as
       Sereal is concerned.

       As of 3.006_007 (prerelease candidate for 3.007) the sort order has
       been changed to the following: order by length of keys (in bytes)
       ascending, then by byte order of the raw underlying string, then by
       utf8ness, with non-utf8 first.  This order was chosen because it is
       the most efficient to implement, both in terms of memory and time.
       This sort order is enabled when sort_keys is set to 1.

       You may also produce output in Perl "cmp" order by setting sort_keys
       to 2.  And for backwards compatibility you may also produce output in
       reverse Perl "cmp" order by setting sort_keys to 3.  Prior to 3.006_007
       this was the only sort order possible, although it was not explicitly
       defined what it was.

       Note that comparatively speaking both of the "cmp" sort orders are slow
       and memory inefficient.  Unless you have a really good reason, stick
       to the default sort order (sort_keys set to 1), which is fast and as
       lean as possible.

       Unless you are concerned with "cross process canonical representation"
       then it doesn't matter what option you choose.

       See "CANONICAL REPRESENTATION" for why you might want to use this, and
       for the various caveats involved.

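       As an illustration of "sort_keys", the sketch below encodes two hashes
       that hold the same keys and values but were populated in opposite
       orders.  The data is hypothetical, and the caveats about "morphing"
       variables described above still apply to real-world data.

```perl
use strict;
use warnings;
use Sereal::Encoder;

my $encoder = Sereal::Encoder->new({ sort_keys => 1 });

# Two hashes with the same content, populated in different orders.
my (%h1, %h2);
my @keys = ('a' .. 'z');
$h1{$_} = 1 for @keys;
$h2{$_} = 1 for reverse @keys;

# With sorted keys the serialized forms can be compared byte for byte.
my $same = $encoder->encode(\%h1) eq $encoder->encode(\%h2);
print "identical output: ", ($same ? "yes" : "no"), "\n";
```
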
       no_shared_hashkeys

       When the "no_shared_hashkeys" option is set to a true value, then the
       encoder will disable the detection and elimination of repeated hash
       keys.  This only has an effect for serializing structures containing
       hashes.  By skipping the detection of repeated hash keys, performance
       goes up a bit, but the size of the output can potentially be much
       larger.

       Do not disable the detection unless you have a reason to.

       dedupe_strings

       If this option is enabled/true then Sereal will use a hash to encode
       duplicates of strings during serialization efficiently using (internal)
       backreferences.  This has a performance and memory penalty during
       encoding so it defaults to off.  On the other hand, data structures
       with many duplicated strings will see a significant reduction in the
       size of the encoded form.  Currently only strings longer than 3
       characters will be deduped, however this may change in the future.

       Note that Sereal will perform certain types of deduping automatically
       even without this option.  In particular class names and hash keys
       (see also the "no_shared_hashkeys" setting) are deduped regardless of
       this option.  Only enable this if you have good reason to believe that
       there are many duplicated strings as values in your data structure.

       Use of this option does not require an upgraded decoder (this option
       was added in Sereal::Encoder 0.32).  The deduping is performed in such
       a way that older decoders should handle it just fine.  In other words,
       the output of a Sereal decoder should not depend on whether this option
       was used during encoding.  See also "aliased_dedupe_strings" below.

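       To get a feeling for the effect of "dedupe_strings", the sketch below
       encodes a hypothetical list of records that all carry the same long
       string value and compares the output sizes.  Real data will rarely be
       this repetitive.

```perl
use strict;
use warnings;
use Sereal::Encoder;

# Hypothetical data: the same long string appears as a value 1000 times.
my $data = [
    map { +{ id => $_, status => 'processing completed successfully' } } 1 .. 1000
];

my $default = Sereal::Encoder->new;
my $deduped = Sereal::Encoder->new({ dedupe_strings => 1 });

my $default_size = length $default->encode($data);
my $deduped_size = length $deduped->encode($data);

printf "default: %d bytes, dedupe_strings: %d bytes\n",
    $default_size, $deduped_size;
```
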
       aliased_dedupe_strings

       This is an advanced option that should be used only after fully
       understanding its ramifications.

       This option enables a mode of operation that is similar to
       dedupe_strings.  If both options are set, aliased_dedupe_strings
       takes precedence.

       The behaviour of aliased_dedupe_strings differs from dedupe_strings in
       that the duplicate occurrences of strings are emitted as Perl language
       level aliases instead of as Sereal-internal backreferences.  This means
       that using this option actually produces a different output data
       structure when decoding.  The upshot is that with this option, the
       application using (decoding) the data may save a lot of memory in some
       situations but at the cost of potential action at a distance due to the
       aliasing.

       Beware: The test suite currently does not cover this option as well as
       it probably should.  Patches welcome.

       use_standard_double

       This option can be used to force Perls built with uselongdouble or
       quadmath to use DOUBLE instead of the native floating point.  This can
       be helpful when interoperating with Perls which do not support larger
       sized floats.  Note that "uselongdouble" means different things in
       different places, so this option may be helpful for such builds.  We
       do not enable this option by default for backwards compatibility
       reasons, and because doing so would lose precision.

       protocol_version

       Specifies the version of the Sereal protocol to emit.  Valid values
       are integers between 1 and the current version.  If not specified, the
       most recent protocol version will be used.  See also
       "use_protocol_v1".

       It is strongly advised to use the latest protocol version outside of
       migration periods.

       use_protocol_v1

       This option is deprecated in favour of the "protocol_version" option
       (see above).

       If set, the encoder will emit Sereal documents following protocol
       version 1.  This is strongly discouraged except for temporary
       compatibility/migration purposes.

INSTANCE METHODS

   encode
       Given a Perl data structure, serializes that data structure and returns
       a binary string that can be turned back into the original data
       structure by Sereal::Decoder.  The method expects a data structure to
       serialize as first argument, optionally followed by a header data
       structure.

       A header is intended for embedding small amounts of metadata, such as
       routing information, in a document, so that users can avoid
       deserializing the main body needlessly.

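       A minimal sketch of encoding a document with a header follows.  The
       routing fields and the payload are hypothetical.  Reading the header
       back is the job of the decoder; see the Sereal::Decoder documentation
       for the corresponding methods.

```perl
use strict;
use warnings;
use Sereal::Encoder;

my $encoder = Sereal::Encoder->new;

# Hypothetical routing metadata and a larger payload.
my $header = { route => 'eu-west', priority => 1 };
my $body   = { user => 'alice', events => [ 1 .. 1000 ] };

# The header is passed as the optional second argument.
my $with_header    = $encoder->encode($body, $header);
my $without_header = $encoder->encode($body);

printf "document: %d bytes, header overhead: %d bytes\n",
    length($with_header), length($with_header) - length($without_header);
```
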
   encode_to_file
           Sereal::Encoder->encode_to_file($file,$data,$append);
           $encoder->encode_to_file($file,$data,$append);

       Encode the data specified and write it to the named file.  If $append
       is true then the written data is appended to any existing data,
       otherwise any existing data will be overwritten.  Dies if any errors
       occur while writing the encoded data.

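       The overwrite and append modes can be sketched as follows.  The
       temporary file and the two small documents are hypothetical and only
       serve to show the effect of the $append argument.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);
use Sereal::Encoder;

my $encoder = Sereal::Encoder->new;

# Temporary file that is removed again when the program exits.
my (undef, $file) = tempfile(UNLINK => 1);

# Overwrite the file with one document, then append a second one.
$encoder->encode_to_file($file, { batch => 1 }, 0);
my $size_after_first = -s $file;

$encoder->encode_to_file($file, { batch => 2 }, 1);
my $size_after_second = -s $file;

print "after first write: $size_after_first bytes, ",
      "after append: $size_after_second bytes\n";
```
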

EXPORTABLE FUNCTIONS

   sereal_encode_with_object
       The functional interface that is equivalent to using "encode".  Takes
       an encoder object reference as first argument, followed by a data
       structure and optional header to serialize.

       This functional interface is marginally faster than the OO interface
       since it avoids method resolution overhead and, on sufficiently modern
       Perl versions, can usually avoid subroutine call overhead.

   encode_sereal
       The functional interface that is equivalent to using "new" and
       "encode".  Expects a data structure to serialize as first argument,
       optionally followed by a hash reference of options (see documentation
       for "new()").

       This function cannot be used for encoding a data structure with a
       header.  See "encode_sereal_with_header_data".

       This functional interface is significantly slower than the OO interface
       since it cannot reuse the encoder object.

   encode_sereal_with_header_data
       The functional interface that is equivalent to using "new" and
       "encode".  Expects a data structure and a header to serialize as first
       and second arguments, optionally followed by a hash reference of
       options (see documentation for "new()").

       This functional interface is significantly slower than the OO interface
       since it cannot reuse the encoder object.

PERFORMANCE

       See Sereal::Performance for detailed considerations on performance
       tuning.  Let it just be said that:

       If you care about performance at all, then use
       "sereal_encode_with_object" or the OO interface instead of
       "encode_sereal".  It's a significant difference in performance if you
       are serializing small data structures.

       The exact performance in time and space depends heavily on the data
       structure to be serialized.  Often there is a trade-off between space
       and time.  If in doubt, do your own testing and most importantly ALWAYS
       TEST WITH REAL DATA.  If you care purely about speed at the expense of
       output size, you can use the "no_shared_hashkeys" option for a small
       speed-up.  If you need smaller output at the cost of higher CPU load
       and more memory used during encoding/decoding, try the
       "dedupe_strings" option and enable Snappy compression.

       For ready-made comparison scripts, see the author_tools/bench.pl and
       author_tools/dbench.pl programs that are part of this distribution.
       Suffice to say that this library is easily competitive in both time and
       space efficiency with the best alternatives.

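       For a quick check on your own machine, the three interfaces can be
       compared with the core Benchmark module.  This is only a sketch: the
       data structure is hypothetical and tiny, and meaningful numbers
       require testing with your real data.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);
use Sereal::Encoder qw(encode_sereal sereal_encode_with_object);

# Hypothetical small structure; per-call overhead dominates here.
my $data    = { id => 1, tags => [qw(a b c)] };
my $encoder = Sereal::Encoder->new;

# Run each variant for at least one CPU second and print a comparison table.
cmpthese(-1, {
    encode_sereal => sub {
        my $out = encode_sereal($data);
    },
    sereal_encode_with_object => sub {
        my $out = sereal_encode_with_object($encoder, $data);
    },
    oo_encode => sub {
        my $out = $encoder->encode($data);
    },
});
```
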

FREEZE/THAW CALLBACK MECHANISM

       Some objects do not lend themselves naturally to naive Perl data
       structure level serialization.  For instance XS code might use a
       hidden structure that would not get serialized, or an object may
       contain volatile data like a filehandle that would not be
       reconstituted properly.  To support cases like this "Sereal" supports
       a FREEZE and THAW API.  When objects are serialized their FREEZE
       method is asked for a replacement representation, and when objects are
       deserialized their THAW method is asked to convert that replacement
       back to something useful.

       This mechanism is enabled using the "freeze_callbacks" option of the
       encoder.  It is inspired by the equivalent mechanism in CBOR::XS.  The
       general mechanism is documented in the A GENERIC OBJECT SERIALIATION
       PROTOCOL section of Types::Serialiser.  Similar to CBOR using "CBOR",
       Sereal uses the string "Sereal" as a serializer identifier for the
       callbacks.

       Here is a contrived example of a class implementing the "FREEZE" /
       "THAW" mechanism.

         package
           File;

         use Moo;

         has 'path' => (is => 'ro');
         has 'fh' => (is => 'rw');

         # open file handle if necessary and return it
         sub get_fh {
           my $self = shift;
           # This could also be done with fancier Moo(se) syntax
           my $fh = $self->fh;
           if (not $fh) {
             open $fh, "<", $self->path or die $!;
             $self->fh($fh);
           }
           return $fh;
         }

         sub FREEZE {
           my ($self, $serializer) = @_;
           # Could switch on $serializer here: JSON, CBOR, Sereal, ...
           # But this case is so simple that it will work with ALL of them.
           # Do not try to serialize our file handle! Path will be enough
           # to recreate.
           return $self->path;
         }

         sub THAW {
           my ($class, $serializer, $data) = @_;
           # Turn back into object.
           return $class->new(path => $data);
         }

       Why is the "FREEZE"/"THAW" mechanism important here?  Our contrived
       "File" class may contain a file handle which can't be serialized.  So
       "FREEZE" not only returns just the path (which is more compact than
       encoding the actual object contents), but it also strips the file
       handle, which can be lazily reopened on the other side of the
       serialization/deserialization pipe.  But this example also shows that
       a naive implementation can easily end up with subtle bugs.  A file
       handle itself has state (position in file, etc).  Thus the
       deserialization in the above example won't accurately reproduce the
       original state.  It can't, of course, if it's deserialized in a
       different environment anyway.

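       A complete round trip might look like the sketch below.  It uses a
       hypothetical plain-Perl class "My::Document" instead of the Moo based
       "File" class so that it is self-contained, and it assumes that
       Sereal::Decoder is installed alongside the encoder.

```perl
use strict;
use warnings;
use Sereal::Encoder;
use Sereal::Decoder;

# Hypothetical class for illustration: a path plus a volatile cache.
package My::Document {
    sub new {
        my ($class, %args) = @_;
        return bless { path => $args{path}, cache => undef }, $class;
    }
    sub path { $_[0]{path} }

    # Called by the encoder with the serializer name ("Sereal").
    sub FREEZE {
        my ($self, $serializer) = @_;
        return $self->{path};    # only the path is serialized
    }

    # Called by the decoder as a class method with the frozen data.
    sub THAW {
        my ($class, $serializer, $path) = @_;
        return $class->new(path => $path);
    }
}

my $encoder = Sereal::Encoder->new({ freeze_callbacks => 1 });
my $decoder = Sereal::Decoder->new;

my $original = My::Document->new(path => '/tmp/example.txt');
$original->{cache} = 'volatile data that should not travel';

# Encode via FREEZE, decode via THAW.
my $copy = $decoder->decode($encoder->encode($original));
print ref($copy), " path=", $copy->path, "\n";
```

       The volatile "cache" field is not part of the frozen representation,
       so the decoded copy starts with an empty cache.
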

THREAD-SAFETY

       "Sereal::Encoder" is thread-safe on Perl 5.8.7 and higher.  This means
       "thread-safe" in the sense that if you create a new thread, all
       "Sereal::Encoder" objects will become a reference to undef in the new
       thread.  This might change in a future release to become a full clone
       of the encoder object.

CANONICAL REPRESENTATION

498       You might want to compare two data structures by comparing their
499       serialized byte strings.  For that to work reliably the serialization
500       must take extra steps to ensure that identical data structures are
501       encoded into identical serialized byte strings (a so-called "canonical
502       representation").
503
504       Unfortunately in Perl there is no such thing as a "canonical
505       representation".  Most people are interested in "structural
506       equivalence" but even that is less well defined than most people think.
507       For instance in the following example:
508
509           my $array1= [ 0, 0 ];
510           my $array2= do {
511               my $zero= 0;
512               sub{ \@_ }->($zero,$zero);
513           };
514
515       the question of whether $array1 is structurally equivalent to $array2
516       is a subjective one. Sereal for instance would NOT consider them
517       equivalent but "Test::Deep" would.  There are many examples of this in
518       Perl. Simply stringifying a number technically changes the scalar.
519       Storable would notice this, but Sereal generally would not.
520
521       Despite this as of 3.002 the Sereal encoder supports a "canonical"
522       option which will make a "best effort" attempt at producing a canonical
523       representation of a data structure.  This mode is actually a
524       combination of several other modes which may also be enabled
525       independently, and as and when we add new options to the encoder that
526       would assist in this regard then the "canonical" will also enable them.
527       These options may come with a performance penalty so care should be
528       taken to read the Changes file and test the performance implications
529       when upgrading a system that uses this option.
530
531       It is important to note that using canonical representation to
532       determine if two data structures are different is subject to false-
533       positives. If two Sereal encodings are identical you can generally
534       assume that the two data structures are functionally equivalent from
535       the point of view of normal Perl code (XS code might disagree). However
536       if two Sereal encodings differ the data structures may actually be
537       functionally equivalent.  In practice it seems the the false-positive
538       rate is low, but your milage may vary.
539
540       Some of the issues with producing a true canonical representation are
541       outlined below:
542
543       Sereal doesn't order the hash keys by default.
544           This can be enabled via the "sort_keys", which is itself enabled by
545           "canonical" option.
546
547       Sereal output is sensitive to refcounts
548           This can be somewhat mitigated by the use of "canonical_refs", see
549           above.
550
551       There are multiple valid Sereal documents that you can produce for the
552       same Perl data structure.
553           Just sorting hash keys is not enough.  Some of the reasons are
554           outlined below. These issues are especially relevant when
555           considering language interoperability.
556
557           PAD bytes
558               A trivial example is PAD bytes which mean nothing and are
559               skipped. They mostly exist for encoder optimizations to prevent
560               certain nasty backtracking situations from becoming O(n) at the
561               cost of one byte of output. An explicit canonical mode would
562               have to outlaw them (or add more of them) and thus require a
563               much more complicated implementation of refcount/weakref
564               handing in the encoder while at the same time causing some
565               operations to go from O(1) to a full memcpy of everything after
566               the point of where we backtracked to. Nasty.
567
568           COPY tag
569               Another example is COPY. The COPY tag indicates that the next
570               element is an identical copy of a previous element (which is
571               itself forbidden from including COPY's other than for class
572               names). COPY is purely internal. The Perl/XS implementation
573               uses it to share hash keys and class names. One could use it
574               for other strings (theoretically), but doesn't for time-
575               efficiency reasons. We'd have to outlaw the use of this
576               (significant) optimization of canonicalization.
577
578           REF representation
579               Sereal represents a reference to an array as a sequence of tags
580               which, in its simplest form, reads REF, ARRAY $array_length
581               TAG1 TAG2 ....  The separation of "REF" and "ARRAY" is
582               necessary to properly implement all of Perl's referencing and
583               aliasing semantics correctly. Quite frequently, however, your
584               array is only referenced once and plainly so. If it's also at
585               most 15 elements long, Sereal optimizes all of the "REF" and
586               "ARRAY" tags, as well as the length into a special one byte
587               ARRAYREF tag. This is a very significant optimization for
588               common cases. This, however, does mean that most arrays up to
589               15 elements could be represented in two different, yet
590               perfectly valid forms. ARRAYREF would have to be outlawed for a
591               properly canonical form. The exact same logic applies to HASH
592               vs. HASHREF. This behavior can be overridden by the
593               "canonical_refs" option, which disables use of HASHREF and
594               ARRAYREF.
595
596           Numeric representation
597               Similar to how Sereal can represent arrays and hashes in a full
598               and a compact form. For small integers (between -16 and +15
599               inclusive), Sereal emits only one byte including the encoding
600               of the type of data. For larger integers, it can use either
601               variants (positive only) or zigzag encoding, which can also
602               represent negative numbers. For a canonical mode, the space
603               optimizations would have to be turned off and it would have to
604               be explicitly specified whether variant or zigzag encoding is
605               to be used for encoding positive integers.
606
607               Perl may choose to retain multiple representations of a scalar.
608               Specifically, it can convert integers, floating point numbers,
609               and strings on the fly and will aggressively cache the results.
610               Normally, it remembers which of the representations can be
611               considered canonical, that means, which can be used to recreate
612               the others reliably. For example, 0 and "0" can both be
613               considered canonical since they naturally transform into each
614               other. Beyond intrinsic ambiguity, there are ways to trick Perl
615               into allowing a single scalar to have distinct string, integer,
616               and floating point representations that are all flagged as
617               canonical, but can't be transformed into each other. These are
618               the so-called dualvars. Sereal cannot represent dualvars (and
619               that's a good thing).
620
621               Floating point values can appear to be the same but serialize
622               to different byte strings due to insignificant 'noise' in the
623               floating point representation. Sereal supports different
624               floating point precisions and will generally choose the most
625               compact that can represent your floating point number
626               correctly.
627
628           There are also a few cases where Sereal will produce different
629           documents for values that you might think are the same thing,
630           because Perl itself would treat them as equivalent if you
631           compared them with "eq" or "==". However, for the purposes of
632           serialization they are not the same value.
633
634           A good example of these cases is where Test::Deep and Sereal's
635           canonical mode differ. We have tests for some of these cases in
636           t/030_canonical_vs_test_deep.t. Here are the issues we've noticed
637           so far:
638
639           Sereal considers ASCII strings with the UTF-8 flag to be different
640           from the same string without the UTF-8 flag
641               Consider:
642
643                   my $language_code = "en";
644
645               vs.:
646
647                   my $language_code = "en";
648                   utf8::upgrade($language_code);
649
650               Sereal's canonical mode will encode these strings differently,
651               as it should, since the UTF-8 flag will be passed along on
652               interpolation.
653
654               But this can be confusing if you're just getting some user-
655               supplied ASCII strings that you may inadvertently toggle the
656               UTF-8 flag on, e.g. because you're comparing an ASCII value in
657               a database to a value submitted in a UTF-8 web form.
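
                  The difference between the two strings above can be checked
                  directly. This is a sketch, not part of the Sereal test
                  suite: the first half uses only core Perl, and the second
                  half runs only if Sereal::Encoder is installed. The variable
                  names are illustrative.

```perl
use strict;
use warnings;

my $plain    = "en";
my $upgraded = "en";
utf8::upgrade($upgraded);    # same characters, UTF-8 flag now set

# Perl's string comparison ignores the flag ...
my $same_text = ($plain eq $upgraded) ? 1 : 0;

# ... but the flag itself is observable.
my $plain_flag    = utf8::is_utf8($plain)    ? 1 : 0;
my $upgraded_flag = utf8::is_utf8($upgraded) ? 1 : 0;

# Optional: compare the canonical-mode documents when the encoder is available.
my $docs_differ;
if (eval { require Sereal::Encoder; 1 }) {
    my $encoder = Sereal::Encoder->new({ canonical => 1 });
    $docs_differ = ($encoder->encode($plain) ne $encoder->encode($upgraded)) ? 1 : 0;
}

print "eq: $same_text, flags: $plain_flag/$upgraded_flag\n";
print "documents differ: $docs_differ\n" if defined $docs_differ;
```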
658
659           Sereal will encode strings that look like numbers as strings,
660           unless they've been used in numeric context
661               I.e. these values will be encoded differently, respectively:
662
663                   my $IV_x = "12345";
664                   my $IV_y = "12345" + 0;
665                   my $NV_x = "12.345";
666                   my $NV_y = "12.345" + 0;
667
668               But as noted above something like Test::Deep will consider
669               these to be the same thing.
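
                  One way to see why these values differ is to inspect which
                  internal value slots Perl has marked as valid. The sketch
                  below uses the core B module; the slots() helper is
                  hypothetical and exists only for this illustration. It shows
                  Perl's own flags, not Sereal's output.

```perl
use strict;
use warnings;
use B qw(svref_2object SVf_IOK SVf_NOK SVf_POK);

# Hypothetical helper: list the public value slots that are valid on a scalar.
sub slots {
    my $flags = svref_2object(\$_[0])->FLAGS;
    my @valid;
    push @valid, "IV" if $flags & SVf_IOK;    # integer slot
    push @valid, "NV" if $flags & SVf_NOK;    # floating point slot
    push @valid, "PV" if $flags & SVf_POK;    # string slot
    return join ",", @valid;
}

my $IV_x = "12345";
my $IV_y = "12345" + 0;
my $NV_x = "12.345";
my $NV_y = "12.345" + 0;

my @report = map { slots($_) } ($IV_x, $IV_y, $NV_x, $NV_y);
print join(" ", @report), "\n";

# Using a string in numeric context may cache a numeric value on it,
# so inspect the flags again after a comparison.
my $equal = ($IV_x == $IV_y) ? 1 : 0;
print "after ==: ", slots($IV_x), "\n";
```

                  Note that the flags are read before any numeric comparison,
                  since the comparison itself is a use in numeric context.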
670
671           We might add more aggressive flags to the canonical mode in the
672           future to deal with this. For the cases noted above, some
673           combination of turning the UTF-8 flag on for all strings, or
674           stripping it from strings that have it but are ASCII-only, would
675           "work". Similarly, we could scan strings to see if they match
676           "looks_like_number()" and if so numify them.
677
678           This would produce output that either would be a lot bigger (having
679           to encode all numbers as strings), or would be more expensive to
680           generate (having to scan strings for numeric or non-ASCII content),
681           and for some cases, like the UTF-8 flag munging, wouldn't be
682           suitable for general use outside of canonicalization.
683
684       Often, people don't actually care about "canonical" in the strict sense
685       required for real identity checking. They just require a best-effort
686       sort of thing for caching. But it's a slippery slope!
687
688       In a nutshell, the "canonical" option may be sufficient for an
689       application which is simply serializing a cache key, and thus there's
690       little harm in an occasional false-negative, but think carefully before
691       applying Sereal in other use-cases.
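
          The cache-key use case can be sketched as follows. This is an
          illustration under assumptions, not a recommended recipe: it hashes
          the canonical-mode output with the core Digest::SHA module, and it
          only runs the encoding step if Sereal::Encoder is installed. The
          hash contents and variable names are made up for the example.

```perl
use strict;
use warnings;
use Digest::SHA qw(sha256_hex);

my ($key_a, $key_b);
if (eval { require Sereal::Encoder; 1 }) {
    my $encoder = Sereal::Encoder->new({ canonical => 1 });

    # Same content, built in a different order.
    my %first  = (x => 1, y => 2, z => 3);
    my %second = (z => 3, y => 2, x => 1);

    $key_a = sha256_hex($encoder->encode(\%first));
    $key_b = sha256_hex($encoder->encode(\%second));
    print "cache keys match\n" if $key_a eq $key_b;
}
else {
    print "Sereal::Encoder not installed; skipping\n";
}
```

          If one of the values had instead been the string "1", the caveats
          described above would apply and the keys could differ, which is the
          occasional false negative a cache can tolerate.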
692

KNOWN ISSUES

694       Strings Or Numbers
695           Perl does not make a strong distinction between strings and
696           numbers, and from an internal point of view it can be difficult to
697           tell what the "right" representation is for a given variable.
698
699           Sereal tries not to be lossy. So if it detects that the string
700           value of a variable and its numeric value are different, it will
701           generally round trip the *string* value. This means that "special"
702           strings often used in Perl function returns, like "0 but true" and
703           "0e0", will round trip in a way that preserves their normal Perl
704           semantics. However, this also means that "non canonical" values,
705           like " 100 ", which will numify as 100 without warnings, will round
706           trip as their string values.
707
708           Perl also has some operators, the bitwise operators ^, | and &,
709           which do different things depending on whether their arguments
710           have been used in numeric context, as the following examples show:
711
712               perl -le'my $x="1"; $i=int($x); print unpack "H*", $x ^ "1"'
713               30
714
715               perl -le'my $x="1"; print unpack "H*", $x ^ "1"'
716               00
717
718               perl -le'my $x=" 1 "; $i=int($x); print unpack "H*", $x ^ "1"'
719               30
720
721               perl -le'my $x=" 1 "; print unpack "H*", $x ^ "1"'
722               113120
723
724           Sereal currently cannot round trip this property properly.
725
726           An extreme case of this problem is that of "dualvars", which can be
727           created using the Scalar::Util::dualvar() function. This function
728           allows one to create variables which have string and integer values
729           which are completely unrelated to each other.  Sereal currently
730           will choose the *string* value when it detects these items.
731
732           It is possible that a future release of the protocol will fix these
733           issues.
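
              The dualvar case described above can be reproduced with the core
              Scalar::Util module. This sketch only shows what a dualvar looks
              like on the Perl side; it does not encode anything, and the
              values are arbitrary.

```perl
use strict;
use warnings;
use Scalar::Util qw(dualvar isdual);

# A scalar whose numeric and string values are unrelated.
my $dv = dualvar(42, "forty-two");

my $is_dual   = isdual($dv) ? 1 : 0;
my $as_string = "$dv";     # string context sees the string value
my $as_number = $dv + 0;   # numeric context sees the numeric value

print "dual: $is_dual, string: $as_string, number: $as_number\n";
```

              Since Sereal keeps the string value for such scalars, the
              numeric half of a dualvar like this one should be expected to be
              lost on a round trip.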
734
735       Booleans
736           As of Perl 5.36 and protocol version 5, Sereal supports booleans.
737           The new tags SRL_HDR_YES and SRL_HDR_NO represent Perl bools. The
738           tags SRL_HDR_TRUE and SRL_HDR_FALSE for the old special variables
739           may still be generated, but beyond being readonly these are
740           equivalent to SRL_HDR_YES and SRL_HDR_NO.
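
              What counts as a boolean on the Perl side can be checked with
              builtin::is_bool, which is available from Perl 5.36 (where it is
              still marked experimental). The sketch below is guarded so that
              it does nothing on older perls; it illustrates Perl's notion of
              a bool only and does not inspect Sereal's tags.

```perl
use strict;
use warnings;

my ($bool_is_bool, $int_is_bool);
if ($] >= 5.036) {
    no warnings;    # builtin::is_bool is experimental in perl 5.36

    my $flag  = !!1;    # a real boolean
    my $count = 1;      # merely a true integer

    $bool_is_bool = builtin::is_bool($flag)  ? 1 : 0;
    $int_is_bool  = builtin::is_bool($count) ? 1 : 0;
    print "flag: $bool_is_bool, count: $int_is_bool\n";
}
```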
741

BUGS, CONTACT AND SUPPORT

743       For reporting bugs, please use the github bug tracker at
744       <http://github.com/Sereal/Sereal/issues>.
745
746       For support and discussion of Sereal, there are two Google Groups:
747
748       Announcements around Sereal (extremely low volume):
749       <https://groups.google.com/forum/?fromgroups#!forum/sereal-announce>
750
751       Sereal development list:
752       <https://groups.google.com/forum/?fromgroups#!forum/sereal-dev>
753

AUTHORS AND CONTRIBUTORS

755       Yves Orton <demerphq@gmail.com>
756
757       Damian Gryski
758
759       Steffen Mueller <smueller@cpan.org>
760
761       Rafaël Garcia-Suarez
762
763       Ævar Arnfjörð Bjarmason <avar@cpan.org>
764
765       Tim Bunce
766
767       Daniel Dragan <bulkdd@cpan.org> (Windows support and bugfixes)
768
769       Zefram
770
771       Borislav Nikolov
772
773       Ivan Kruglov <ivan.kruglov@yahoo.com>
774
775       Some inspiration and code was taken from Marc Lehmann's excellent
776       JSON::XS module due to obvious overlap in problem domain. Thank you!
777

ACKNOWLEDGMENT

779       This module was originally developed for Booking.com.  With approval
780       from Booking.com, this module was generalized and published on CPAN,
781       for which the authors would like to express their gratitude.
782

COPYRIGHT AND LICENSE

784       Copyright (C) 2012, 2013, 2014 by Steffen Mueller
785       Copyright (C) 2012, 2013, 2014 by Yves Orton
786
787       The license for the code in this distribution is the following, with
788       the exceptions listed below:
789
790       This library is free software; you can redistribute it and/or modify it
791       under the same terms as Perl itself.
792
793       Except portions taken from Marc Lehmann's code for the JSON::XS module,
794       which is licensed under the same terms as this module.
795
796       Also except the code for Snappy compression library, whose license is
797       reproduced below and which, to the best of our knowledge, is compatible
798       with this module's license. The license for the enclosed Snappy code
799       is:
800
801         Copyright 2011, Google Inc.
802         All rights reserved.
803
804         Redistribution and use in source and binary forms, with or without
805         modification, are permitted provided that the following conditions are
806         met:
807
808           * Redistributions of source code must retain the above copyright
809         notice, this list of conditions and the following disclaimer.
810           * Redistributions in binary form must reproduce the above
811         copyright notice, this list of conditions and the following disclaimer
812         in the documentation and/or other materials provided with the
813         distribution.
814           * Neither the name of Google Inc. nor the names of its
815         contributors may be used to endorse or promote products derived from
816         this software without specific prior written permission.
817
818         THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
819         "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
820         LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
821         A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
822         OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
823         SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
824         LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
825         DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
826         THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
827         (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
828         OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
829
830
831
832perl v5.36.0                      2022-09-04                Sereal::Encoder(3)