1ECB(1) User Contributed Perl Documentation ECB(1)
2
3
4
6 ABOUT LIBECB
7 Libecb is currently a simple header file that doesn't require any
8 configuration to use or include in your project.
9
10 It's part of the e-suite of libraries, other members of which include
11 libev and libeio.
12
13 Its homepage can be found here:
14
15 http://software.schmorp.de/pkg/libecb
16
17 It mainly provides a number of wrappers around many compiler built-ins,
18 together with replacement functions for other compilers. In addition to
19 this, it provides a number of other lowlevel C utilities, such as
20 endianness detection, byte swapping or bit rotations.
21
Or in other words, things that should be built into any standard C
system, but aren't, implemented as efficiently as possible with GCC
(clang, msvc...), and still correct with other compilers.
25
26 More might come.
27
28 ABOUT THE HEADER
29 At the moment, all you have to do is copy ecb.h somewhere where your
30 compiler can find it and include it:
31
32 #include <ecb.h>
33
34 The header should work fine for both C and C++ compilation, and gives
35 you all of inttypes.h in addition to the ECB symbols.
36
37 There are currently no object files to link to - future versions might
38 come with an (optional) object code library to link against, to reduce
39 code size or gain access to additional features.
40
41 It also currently includes everything from inttypes.h.
42
43 ABOUT THIS MANUAL / CONVENTIONS
44 This manual mainly describes each (public) function available after
45 including the ecb.h header. The header might define other symbols than
46 these, but these are not part of the public API, and not supported in
47 any way.
48
When the manual mentions a "function" then this could be defined either
as an inline function, a macro, or an external symbol.
51
52 When functions use a concrete standard type, such as "int" or
53 "uint32_t", then the corresponding function works only with that type.
54 If only a generic name is used ("expr", "cond", "value" and so on),
55 then the corresponding function relies on C to implement the correct
56 types, and is usually implemented as a macro. Specifically, a "bool" in
57 this manual refers to any kind of boolean value, not a specific type.
58
59 TYPES / TYPE SUPPORT
60 ecb.h makes sure that the following types are defined (in the expected
61 way):
62
int8_t uint8_t
int16_t uint16_t
int32_t uint32_t
int64_t uint64_t
67 int_fast8_t uint_fast8_t
68 int_fast16_t uint_fast16_t
69 int_fast32_t uint_fast32_t
70 int_fast64_t uint_fast64_t
71 intptr_t uintptr_t
72
73 The macro "ECB_PTRSIZE" is defined to the size of a pointer on this
74 platform (currently 4 or 8) and can be used in preprocessor
75 expressions.
76
77 For "ptrdiff_t" and "size_t" use "stddef.h"/"cstddef".
78
79 LANGUAGE/ENVIRONMENT/COMPILER VERSIONS
80 All the following symbols expand to an expression that can be tested in
81 preprocessor instructions as well as treated as a boolean (use "!!" to
82 ensure it's either 0 or 1 if you need that).
83
84 ECB_C
True if the implementation defines the "__STDC__" macro to a true
value, while not claiming to be C++, i.e. C, but not C++.
87
88 ECB_C99
89 True if the implementation claims to be compliant to C99 (ISO/IEC
90 9899:1999) or any later version, while not claiming to be C++.
91
92 Note that later versions (ECB_C11) remove core features again (for
93 example, variable length arrays).
94
95 ECB_C11, ECB_C17
True if the implementation claims to be compliant to C11/C17
(ISO/IEC 9899:2011, :2018) or any later version, while not
claiming to be C++.
99
100 ECB_CPP
True if the implementation defines the "__cplusplus" macro to a
true value, which is typically true for C++ compilers.
103
104 ECB_CPP11, ECB_CPP14, ECB_CPP17
105 True if the implementation claims to be compliant to
106 C++11/C++14/C++17 (ISO/IEC 14882:2011, :2014, :2017) or any later
107 version.
108
109 Note that many C++20 features will likely have their own feature
110 test macros (see e.g. <http://eel.is/c++draft/cpp.predefined#1.8>).
111
112 ECB_OPTIMIZE_SIZE
113 Is 1 when the compiler optimizes for size, 0 otherwise. This symbol
114 can also be defined before including ecb.h, in which case it will
115 be unchanged.
116
117 ECB_GCC_VERSION (major, minor)
118 Expands to a true value (suitable for testing by the preprocessor)
119 if the compiler used is GNU C and the version is the given version,
120 or higher.
121
122 This macro tries to return false on compilers that claim to be GCC
123 compatible but aren't.
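As a rough illustration of how such a version test can be built, here is a minimal sketch. "MY_GCC_VERSION" and "MY_HAVE_GCC_4_5" are hypothetical names, this is not libecb's definition, and unlike "ECB_GCC_VERSION" the sketch makes no attempt to reject compilers that merely claim GCC compatibility:

```c
/* Hypothetical stand-in for a GCC version test; not libecb's definition. */
#if defined __GNUC__ && defined __GNUC_MINOR__
  #define MY_GCC_VERSION(major,minor) \
    (__GNUC__ > (major) || (__GNUC__ == (major) && __GNUC_MINOR__ >= (minor)))
#else
  #define MY_GCC_VERSION(major,minor) 0
#endif

/* Usable in preprocessor conditionals ... */
#if MY_GCC_VERSION (4, 5)
  #define MY_HAVE_GCC_4_5 1
#else
  #define MY_HAVE_GCC_4_5 0
#endif

/* ... and as an ordinary C expression. */
static int
have_gcc_4_5 (void)
{
  return MY_GCC_VERSION (4, 5);
}
```

Both forms evaluate the same comparison, so the preprocessor result and the run-time result always agree.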
124
125 ECB_EXTERN_C
126 Expands to "extern "C"" in C++, and a simple "extern" in C.
127
128 This can be used to declare a single external C function:
129
130 ECB_EXTERN_C int printf (const char *format, ...);
131
132 ECB_EXTERN_C_BEG / ECB_EXTERN_C_END
133 These two macros can be used to wrap multiple "extern "C""
134 definitions - they expand to nothing in C.
135
136 They are most useful in header files:
137
138 ECB_EXTERN_C_BEG
139
140 int mycfun1 (int x);
141 int mycfun2 (int x);
142
143 ECB_EXTERN_C_END
144
145 ECB_STDFP
If this evaluates to a true value (suitable for testing by the
preprocessor), then "float" and "double" use IEEE 754
single/binary32 and double/binary64 representations internally and
the endianness of both types matches the endianness of "uint32_t" and
"uint64_t".
151
152 This means you can just copy the bits of a "float" (or "double") to
153 an "uint32_t" (or "uint64_t") and get the raw IEEE 754 bit
154 representation without having to think about format or endianness.
155
156 This is true for basically all modern platforms, although ecb.h
157 might not be able to deduce this correctly everywhere and might err
158 on the safe side.
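The bit copy described above could look roughly like this. It is a sketch that assumes a platform where the "ECB_STDFP" guarantees hold (a 4 byte IEEE 754 "float"); "float_to_bits" is a hypothetical helper, not part of libecb:

```c
#include <stdint.h>
#include <string.h>

/* Returns the raw bit pattern of f. The result is only the IEEE 754
   binary32 encoding on platforms where the ECB_STDFP guarantees hold. */
static uint32_t
float_to_bits (float f)
{
  uint32_t bits;

  memcpy (&bits, &f, sizeof bits); /* assumes sizeof (float) == 4 */

  return bits;
}
```

Using "memcpy" instead of a pointer cast avoids aliasing problems, and compilers typically reduce it to a single register move.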
159
160 ECB_64BIT_NATIVE
161 Evaluates to a true value (suitable for both preprocessor and C
162 code testing) if 64 bit integer types on this architecture are
163 evaluated "natively", that is, with similar speeds as 32 bit
164 integers. While 64 bit integer support is very common (and in fact
165 required by libecb), 32 bit cpus have to emulate operations on
166 them, so you might want to avoid them.
167
168 ECB_AMD64, ECB_AMD64_X32
169 These two macros are defined to 1 on the x86_64/amd64 ABI and the
170 X32 ABI, respectively, and undefined elsewhere.
171
172 The designers of the new X32 ABI for some inexplicable reason
173 decided to make it look exactly like amd64, even though it's
174 completely incompatible to that ABI, breaking about every piece of
175 software that assumed that "__x86_64" stands for, well, the x86-64
176 ABI, making these macros necessary.
177
178 MACRO TRICKERY
179 ECB_CONCAT (a, b)
180 Expands any macros in "a" and "b", then concatenates the result to
181 form a single token. This is mainly useful to form identifiers from
182 components, e.g.:
183
184 #define S1 str
185 #define S2 cpy
186
187 ECB_CONCAT (S1, S2)(dst, src); // == strcpy (dst, src);
188
189 ECB_STRINGIFY (arg)
190 Expands any macros in "arg" and returns the stringified version of
191 it. This is mainly useful to get the contents of a macro in string
192 form, e.g.:
193
194 #define SQL_LIMIT 100
195 sql_exec ("select * from table limit " ECB_STRINGIFY (SQL_LIMIT));
196
197 ECB_STRINGIFY_EXPR (expr)
198 Like "ECB_STRINGIFY", but additionally evaluates "expr" to make
199 sure it is a valid expression. This is useful to catch typos or
200 cases where the macro isn't available:
201
202 #include <errno.h>
203
204 ECB_STRINGIFY (EDOM); // "33" (on my system at least)
205 ECB_STRINGIFY_EXPR (EDOM); // "33"
206
207 // now imagine we had a typo:
208
209 ECB_STRINGIFY (EDAM); // "EDAM"
210 ECB_STRINGIFY_EXPR (EDAM); // error: EDAM undefined
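The reason these macros expand their arguments first is a two-level indirection. The following sketch shows the common idiom with hypothetical "MY_" names; it is not necessarily how ecb.h defines its macros:

```c
#include <stddef.h>
#include <string.h>

/* The outer macro expands its arguments, the inner one applies ## or #. */
#define MY_CONCAT_(a, b) a ## b
#define MY_CONCAT(a, b)  MY_CONCAT_ (a, b)
#define MY_STRINGIFY_(x) # x
#define MY_STRINGIFY(x)  MY_STRINGIFY_ (x)

#define S1 str
#define S2 len
#define SQL_LIMIT 100

static size_t
demo_concat (void)
{
  return MY_CONCAT (S1, S2) ("abc"); /* expands to strlen ("abc") */
}

static const char *
demo_stringify (void)
{
  return "limit " MY_STRINGIFY (SQL_LIMIT); /* "limit 100" */
}
```

Applying "#" or "##" directly in a single macro would operate on the unexpanded names ("S1", "SQL_LIMIT") instead.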
211
212 ATTRIBUTES
213 A major part of libecb deals with additional attributes that can be
214 assigned to functions, variables and sometimes even types - much like
215 "const" or "volatile" in C. They are implemented using either GCC
attributes or other compiler/language specific features. Attribute
declarations must be put before the whole declaration:
218
219 ecb_const int mysqrt (int a);
220 ecb_unused int i;
221
222 ecb_unused
223 Marks a function or a variable as "unused", which simply suppresses
224 a warning by the compiler when it detects it as unused. This is
225 useful when you e.g. declare a variable but do not always use it:
226
227 {
228 ecb_unused int var;
229
230 #ifdef SOMECONDITION
231 var = ...;
232 return var;
233 #else
234 return 0;
235 #endif
236 }
237
238 ecb_deprecated
239 Similar to "ecb_unused", but marks a function, variable or type as
240 deprecated. This makes some compilers warn when the type is used.
241
242 ecb_deprecated_message (message)
Same as "ecb_deprecated", but if possible, the specified diagnostic
is used instead of a generic deprecation message when the object
is being used.
246
247 ecb_inline
248 Expands either to (a compiler-specific equivalent of) "static
249 inline" or to just "static", if inline isn't supported. It should
250 be used to declare functions that should be inlined, for code size
251 or speed reasons.
252
253 Example: inline this function, it surely will reduce codesize.
254
255 ecb_inline int
256 negmul (int a, int b)
257 {
258 return - (a * b);
259 }
260
261 ecb_noinline
262 Prevents a function from being inlined - it might be optimised
263 away, but not inlined into other functions. This is useful if you
264 know your function is rarely called and large enough for inlining
265 not to be helpful.
266
267 ecb_noreturn
268 Marks a function as "not returning, ever". Some typical functions
269 that don't return are "exit" or "abort" (which really works hard to
270 not return), and now you can make your own:
271
272 ecb_noreturn void
273 my_abort (const char *errline)
274 {
275 puts (errline);
276 abort ();
277 }
278
279 In this case, the compiler would probably be smart enough to deduce
280 it on its own, so this is mainly useful for declarations.
281
282 ecb_restrict
283 Expands to the "restrict" keyword or equivalent on compilers that
284 support them, and to nothing on others. Must be specified on a
285 pointer type or an array index to indicate that the memory doesn't
286 alias with any other restricted pointer in the same scope.
287
288 Example: multiply a vector, and allow the compiler to parallelise
289 the loop, because it knows it doesn't overwrite input values.
290
291 void
multiply (float *ecb_restrict src,
          float *ecb_restrict dst,
          int len, float factor)
295 {
296 int i;
297
298 for (i = 0; i < len; ++i)
299 dst [i] = src [i] * factor;
300 }
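For readers without ecb.h at hand, a portability macro of this kind might be sketched as follows. "my_restrict" and "scale" are hypothetical names and the compiler checks are assumptions, not libecb's actual logic:

```c
/* Hypothetical portability macro, not libecb's definition. */
#if defined __STDC_VERSION__ && __STDC_VERSION__ >= 199901L
  #define my_restrict restrict
#elif defined __GNUC__
  #define my_restrict __restrict__
#else
  #define my_restrict
#endif

/* The qualifier applies to the pointer itself, so it follows the "*". */
static void
scale (const float *my_restrict src, float *my_restrict dst,
       int len, float factor)
{
  int i;

  for (i = 0; i < len; ++i)
    dst [i] = src [i] * factor;
}
```

Callers must make sure that "src" and "dst" really do not overlap, otherwise the behaviour is undefined.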
301
302 ecb_const
303 Declares that the function only depends on the values of its
304 arguments, much like a mathematical function. It specifically does
305 not read or write any memory any arguments might point to, global
306 variables, or call any non-const functions. It also must not have
307 any side effects.
308
309 Such a function can be optimised much more aggressively by the
310 compiler - for example, multiple calls with the same arguments can
311 be optimised into a single call, which wouldn't be possible if the
312 compiler would have to expect any side effects.
313
314 It is best suited for functions in the sense of mathematical
315 functions, such as a function returning the square root of its
316 input argument.
317
318 Not suited would be a function that calculates the hash of some
319 memory area you pass in, prints some messages or looks at a global
320 variable to decide on rounding.
321
322 See "ecb_pure" for a slightly less restrictive class of functions.
323
324 ecb_pure
325 Similar to "ecb_const", declares a function that has no side
326 effects. Unlike "ecb_const", the function is allowed to examine
327 global variables and any other memory areas (such as the ones
328 passed to it via pointers).
329
330 While these functions cannot be optimised as aggressively as
331 "ecb_const" functions, they can still be optimised away in many
332 occasions, and the compiler has more freedom in moving calls to
333 them around.
334
335 Typical examples for such functions would be "strlen" or "memcmp".
336 A function that calculates the MD5 sum of some input and updates
337 some MD5 state passed as argument would NOT be pure, however, as it
338 would modify some memory area that is not the return value.
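The difference between the two attributes can be illustrated with a small sketch. The "my_const" and "my_pure" macros are hypothetical stand-ins that map to GCC attributes and expand to nothing elsewhere; the helper functions are made up for this example:

```c
#include <stddef.h>

/* Hypothetical macros; GCC-compatible compilers only, empty elsewhere. */
#if defined __GNUC__
  #define my_const __attribute__ ((__const__))
  #define my_pure  __attribute__ ((__pure__))
#else
  #define my_const
  #define my_pure
#endif

/* Depends on nothing but its argument value: a candidate for "const". */
my_const static unsigned
square (unsigned x)
{
  return x * x;
}

/* Reads memory through a pointer but modifies nothing: "pure", not "const". */
my_pure static size_t
count_char (const char *s, char c)
{
  size_t n = 0;

  for (; *s; ++s)
    n += *s == c;

  return n;
}
```

Marking "count_char" as const would be wrong, because its result depends on the memory "s" points to and not only on the pointer value.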
339
340 ecb_hot
341 This declares a function as "hot" with regards to the cache - the
342 function is used so often, that it is very beneficial to keep it in
343 the cache if possible.
344
345 The compiler reacts by trying to place hot functions near to each
346 other in memory.
347
348 Whether a function is hot or not often depends on the whole
349 program, and less on the function itself. "ecb_cold" is likely more
350 useful in practise.
351
352 ecb_cold
353 The opposite of "ecb_hot" - declares a function as "cold" with
354 regards to the cache, or in other words, this function is not
355 called often, or not at speed-critical times, and keeping it in the
356 cache might be a waste of said cache.
357
358 In addition to placing cold functions together (or at least away
359 from hot functions), this knowledge can be used in other ways, for
360 example, the function will be optimised for size, as opposed to
361 speed, and codepaths leading to calls to those functions can
362 automatically be marked as if "ecb_expect_false" had been used to
363 reach them.
364
365 Good examples for such functions would be error reporting
366 functions, or functions only called in exceptional or rare cases.
367
368 ecb_artificial
369 Declares the function as "artificial", in this case meaning that
370 this function is not really meant to be a function, but more like
371 an accessor - many methods in C++ classes are mere accessor
372 functions, and having a crash reported in such a method, or single-
373 stepping through them, is not usually so helpful, especially when
374 it's inlined to just a few instructions.
375
376 Marking them as artificial will instruct the debugger about just
377 this, leading to happier debugging and thus happier lives.
378
379 Example: in some kind of smart-pointer class, mark the pointer
380 accessor as artificial, so that the whole class acts more like a
381 pointer and less like some C++ abstraction monster.
382
383 template<typename T>
384 struct my_smart_ptr
385 {
386 T *value;
387
388 ecb_artificial
389 operator T *()
390 {
391 return value;
392 }
393 };
394
395 OPTIMISATION HINTS
396 bool ecb_is_constant (expr)
397 Returns true iff the expression can be deduced to be a compile-time
398 constant, and false otherwise.
399
400 For example, when you have a "rndm16" function that returns a 16
401 bit random number, and you have a function that maps this to a
402 range from 0..n-1, then you could use this inline function in a
403 header file:
404
405 ecb_inline uint32_t
406 rndm (uint32_t n)
407 {
408 return (n * (uint32_t)rndm16 ()) >> 16;
409 }
410
However, for powers of two, you could use a normal mask, but that
is only worth it if, at compile time, you can detect this case.
This is the case when the passed number is a constant and also a
power of two ("(n & (n - 1)) == 0"):

ecb_inline uint32_t
rndm (uint32_t n)
{
  return ecb_is_constant (n) && !(n & (n - 1))
         ? rndm16 () & (n - 1)
         : (n * (uint32_t)rndm16 ()) >> 16;
}
423
424 ecb_expect (expr, value)
425 Evaluates "expr" and returns it. In addition, it tells the compiler
426 that the "expr" evaluates to "value" a lot, which can be used for
427 static branch optimisations.
428
429 Usually, you want to use the more intuitive "ecb_expect_true" and
430 "ecb_expect_false" functions instead.
431
432 bool ecb_expect_true (cond)
433 bool ecb_expect_false (cond)
These two functions expect an expression that is true or false and
return 1 or 0, respectively, so when used in the condition of an
"if" or other conditional statement, they will not change the
program:
438
439 /* these two do the same thing */
440 if (some_condition) ...;
441 if (ecb_expect_true (some_condition)) ...;
442
443 However, by using "ecb_expect_true", you tell the compiler that the
444 condition is likely to be true (and for "ecb_expect_false", that it
445 is unlikely to be true).
446
447 For example, when you check for a null pointer and expect this to
448 be a rare, exceptional, case, then use "ecb_expect_false":
449
450 void my_free (void *ptr)
451 {
452 if (ecb_expect_false (ptr == 0))
453 return;
454 }
455
Consistent use of these functions to mark away exceptional cases or
to tell the compiler what the hot path through a function is can
increase performance considerably.
459
460 You might know these functions under the name "likely" and
461 "unlikely" - while these are common aliases, we find that the
462 expect name is easier to understand when quickly skimming code. If
463 you wish, you can use "ecb_likely" instead of "ecb_expect_true" and
464 "ecb_unlikely" instead of "ecb_expect_false" - these are simply
465 aliases.
466
467 A very good example is in a function that reserves more space for
468 some memory block (for example, inside an implementation of a
469 string stream) - each time something is added, you have to check
470 for a buffer overrun, but you expect that most checks will turn out
471 to be false:
472
473 /* make sure we have "size" extra room in our buffer */
474 ecb_inline void
475 reserve (int size)
476 {
477 if (ecb_expect_false (current + size > end))
478 real_reserve_method (size); /* presumably noinline */
479 }
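On GCC-compatible compilers, hints like these are commonly built on "__builtin_expect". The following is a sketch with hypothetical "my_" names that falls back to the plain expression elsewhere; it is not ecb.h's actual definition:

```c
/* Hypothetical macros, not libecb's definitions. */
#if defined __GNUC__
  #define my_expect(expr, value) __builtin_expect ((expr), (value))
#else
  #define my_expect(expr, value) (expr)
#endif

/* "!!" normalises the condition to 0 or 1 before it is compared. */
#define my_expect_true(cond)  my_expect (!!(cond), 1)
#define my_expect_false(cond) my_expect (!!(cond), 0)

static int
checked_div (int a, int b)
{
  if (my_expect_false (b == 0))
    return 0; /* rare error path */

  return a / b;
}
```

The hint only influences code layout and branch prediction; the result of the condition is unchanged either way.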
480
481 ecb_assume (cond)
482 Tries to tell the compiler that some condition is true, even if
483 it's not obvious. This is not a function, but a statement: it
484 cannot be used in another expression.
485
This can be used to teach the compiler about invariants or other
conditions that might improve code generation, but which are
impossible to deduce from the code itself.
489
490 For example, the example reservation function from the
491 "ecb_expect_false" description could be written thus (only
492 "ecb_assume" was added):
493
494 ecb_inline void
495 reserve (int size)
496 {
497 if (ecb_expect_false (current + size > end))
498 real_reserve_method (size); /* presumably noinline */
499
500 ecb_assume (current + size <= end);
501 }
502
503 If you then call this function twice, like this:
504
505 reserve (10);
506 reserve (1);
507
508 Then the compiler might be able to optimise out the second call
509 completely, as it knows that "current + 1 > end" is false and the
510 call will never be executed.
511
512 ecb_unreachable ()
513 This function does nothing itself, except tell the compiler that it
514 will never be executed. Apart from suppressing a warning in some
515 cases, this function can be used to implement "ecb_assume" or
516 similar functionality.
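How an assume statement can be layered on top of an unreachable marker might be sketched like this. The "my_" names are hypothetical, "__builtin_unreachable" is assumed to be available on GCC-compatible compilers, and other compilers simply get no hint:

```c
/* Hypothetical macros, not libecb's definitions. */
#if defined __GNUC__
  #define my_unreachable() __builtin_unreachable ()
#else
  #define my_unreachable() ((void)0)
#endif

/* A statement, not an expression: the false branch is declared impossible. */
#define my_assume(cond) do { if (!(cond)) my_unreachable (); } while (0)

static unsigned
low_nibble_div (unsigned x, unsigned d)
{
  my_assume (d != 0); /* the caller guarantees this */

  return (x & 15) / d;
}
```

If the assumed condition is ever false at run time, the behaviour is undefined, so this is only appropriate for true invariants.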
517
518 ecb_prefetch (addr, rw, locality)
Tells the compiler to try to prefetch memory at the given address "addr"
for either reading ("rw" = 0) or writing ("rw" = 1). A "locality"
521 of 0 means that there will only be one access later, 3 means that
522 the data will likely be accessed very often, and values in between
523 mean something... in between. The memory pointed to by the address
524 does not need to be accessible (it could be a null pointer for
525 example), but "rw" and "locality" must be compile-time constants.
526
527 This is a statement, not a function: you cannot use it as part of
528 an expression.
529
530 An obvious way to use this is to prefetch some data far away, in a
531 big array you loop over. This prefetches memory some 128 array
532 elements later, in the hope that it will be ready when the CPU
533 arrives at that location.
534
535 int sum = 0;
536
537 for (i = 0; i < N; ++i)
538 {
sum += arr [i];
540 ecb_prefetch (arr + i + 128, 0, 0);
541 }
542
543 It's hard to predict how far to prefetch, and most CPUs that can
544 prefetch are often good enough to predict this kind of behaviour
545 themselves. It gets more interesting with linked lists, especially
546 when you do some fair processing on each list element:
547
548 for (node *n = start; n; n = n->next)
549 {
550 ecb_prefetch (n->next, 0, 0);
551 ... do medium amount of work with *n
552 }
553
554 After processing the node, (part of) the next node might already be
555 in cache.
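A self-contained version of the linked list loop might look like the sketch below. "my_prefetch", "struct node" and "sum_list" are hypothetical; the macro maps to GCC's "__builtin_prefetch" where available and does nothing elsewhere:

```c
#include <stddef.h>

/* Hypothetical macro, not libecb's definition. */
#if defined __GNUC__
  #define my_prefetch(addr, rw, locality) \
    __builtin_prefetch ((addr), (rw), (locality))
#else
  #define my_prefetch(addr, rw, locality) ((void)0)
#endif

struct node
{
  struct node *next;
  int value;
};

static long
sum_list (const struct node *n)
{
  long sum = 0;

  for (; n; n = n->next)
    {
      my_prefetch (n->next, 0, 0); /* harmless even when n->next is null */
      sum += n->value;
    }

  return sum;
}
```

With so little work per node the prefetch is unlikely to help; it pays off when each node needs enough processing to hide the memory latency.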
556
557 BIT FIDDLING / BIT WIZARDRY
558 bool ecb_big_endian ()
559 bool ecb_little_endian ()
560 These two functions return true if the byte order is big endian
561 (most-significant byte first) or little endian (least-significant
562 byte first) respectively.
563
564 On systems that are neither, their return values are unspecified.
565
566 int ecb_ctz32 (uint32_t x)
567 int ecb_ctz64 (uint64_t x)
568 int ecb_ctz (T x) [C++]
569 Returns the index of the least significant bit set in "x" (or
570 equivalently the number of bits set to 0 before the least
571 significant bit set), starting from 0. If "x" is 0 the result is
572 undefined.
573
574 For smaller types than "uint32_t" you can safely use "ecb_ctz32".
575
576 The overloaded C++ "ecb_ctz" function supports "uint8_t",
577 "uint16_t", "uint32_t" and "uint64_t" types.
578
579 For example:
580
581 ecb_ctz32 (3) = 0
582 ecb_ctz32 (6) = 1
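To make the definition concrete, here is a portable sketch of what the function computes. "my_ctz32" is a hypothetical name and this loop is not libecb's implementation, which uses compiler built-ins where possible:

```c
#include <stdint.h>

/* Counts the zero bits below the lowest set bit. Must not be called
   with x == 0: the result is undefined, and this sketch would loop forever. */
static int
my_ctz32 (uint32_t x)
{
  int n = 0;

  while (!(x & 1))
    {
      x >>= 1;
      ++n;
    }

  return n;
}
```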
583
584 bool ecb_is_pot32 (uint32_t x)
bool ecb_is_pot64 (uint64_t x)
586 bool ecb_is_pot (T x) [C++]
587 Returns true iff "x" is a power of two or "x == 0".
588
589 For smaller types than "uint32_t" you can safely use
590 "ecb_is_pot32".
591
592 The overloaded C++ "ecb_is_pot" function supports "uint8_t",
593 "uint16_t", "uint32_t" and "uint64_t" types.
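The test itself is the classic "clear the lowest set bit" trick. A minimal sketch, with "my_is_pot32" as a hypothetical name:

```c
#include <stdint.h>

/* x & (x - 1) clears the lowest set bit, so the result is zero exactly
   when at most one bit was set, which includes x == 0. */
static int
my_is_pot32 (uint32_t x)
{
  return !(x & (x - 1));
}
```

Callers that need to exclude zero have to check for it separately.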
594
595 int ecb_ld32 (uint32_t x)
596 int ecb_ld64 (uint64_t x)
int ecb_ld (T x) [C++]
598 Returns the index of the most significant bit set in "x", or the
599 number of digits the number requires in binary (so that "2**ld <= x
600 < 2**(ld+1)"). If "x" is 0 the result is undefined. A common use
601 case is to compute the integer binary logarithm, i.e. "floor (log2
602 (n))", for example to see how many bits a certain number requires
603 to be encoded.
604
605 This function is similar to the "count leading zero bits" function,
606 except that that one returns how many zero bits are "in front" of
607 the number (in the given data type), while "ecb_ld" returns how
608 many bits the number itself requires.
609
610 For smaller types than "uint32_t" you can safely use "ecb_ld32".
611
612 The overloaded C++ "ecb_ld" function supports "uint8_t",
613 "uint16_t", "uint32_t" and "uint64_t" types.
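A portable sketch of the integer binary logarithm, with the hypothetical name "my_ld32" (not libecb's implementation):

```c
#include <stdint.h>

/* Index of the most significant set bit, i.e. floor (log2 (x)) for x > 0.
   For x == 0 the documented result is undefined; this sketch returns 0. */
static int
my_ld32 (uint32_t x)
{
  int r = 0;

  while (x >>= 1)
    ++r;

  return r;
}
```

The number of bits needed to encode a nonzero "x" is then "my_ld32 (x) + 1".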
614
615 int ecb_popcount32 (uint32_t x)
616 int ecb_popcount64 (uint64_t x)
617 int ecb_popcount (T x) [C++]
618 Returns the number of bits set to 1 in "x".
619
620 For smaller types than "uint32_t" you can safely use
621 "ecb_popcount32".
622
623 The overloaded C++ "ecb_popcount" function supports "uint8_t",
624 "uint16_t", "uint32_t" and "uint64_t" types.
625
626 For example:
627
628 ecb_popcount32 (7) = 3
629 ecb_popcount32 (255) = 8
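For compilers without a population count built-in, a simple fallback can be sketched like this ("my_popcount32" is a hypothetical name, and libecb's own fallback may differ):

```c
#include <stdint.h>

/* Each iteration clears the lowest set bit, so the loop runs once per 1 bit. */
static int
my_popcount32 (uint32_t x)
{
  int n = 0;

  for (; x; x &= x - 1)
    ++n;

  return n;
}
```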
630
631 uint8_t ecb_bitrev8 (uint8_t x)
632 uint16_t ecb_bitrev16 (uint16_t x)
633 uint32_t ecb_bitrev32 (uint32_t x)
634 T ecb_bitrev (T x) [C++]
635 Reverses the bits in x, i.e. the MSB becomes the LSB, MSB-1 becomes
636 LSB+1 and so on.
637
638 The overloaded C++ "ecb_bitrev" function supports "uint8_t",
639 "uint16_t" and "uint32_t" types.
640
641 Example:
642
ecb_bitrev8 (0xa7) = 0xe5
644 ecb_bitrev32 (0xffcc4411) = 0x882233ff
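Bit reversal can be checked by hand with a straightforward loop. The following sketch uses hypothetical "my_" names and favours clarity over speed; it is not libecb's implementation:

```c
#include <stdint.h>

/* Shifts bits out of x at the bottom and into r at the bottom, so the
   first bit taken (the LSB of x) ends up as the MSB of r. */
static uint32_t
my_bitrev32 (uint32_t x)
{
  uint32_t r = 0;
  int i;

  for (i = 0; i < 32; ++i)
    {
      r = (r << 1) | (x & 1);
      x >>= 1;
    }

  return r;
}

/* Reverse as a 32 bit value, then keep the top byte. */
static uint8_t
my_bitrev8 (uint8_t x)
{
  return (uint8_t)(my_bitrev32 (x) >> 24);
}
```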
645
651 uint32_t ecb_bswap16 (uint32_t x)
652 uint32_t ecb_bswap32 (uint32_t x)
653 uint64_t ecb_bswap64 (uint64_t x)
T ecb_bswap (T x) [C++]
655 These functions return the value of the 16-bit (32-bit, 64-bit)
656 value "x" after reversing the order of bytes (0x11223344 becomes
657 0x44332211 in "ecb_bswap32").
658
659 The overloaded C++ "ecb_bswap" function supports "uint8_t",
660 "uint16_t", "uint32_t" and "uint64_t" types.
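A portable byte swap can be written with shifts and masks. This is a sketch with the hypothetical name "my_bswap32"; libecb uses compiler built-ins where they exist:

```c
#include <stdint.h>

/* Moves each of the four bytes to its mirrored position. */
static uint32_t
my_bswap32 (uint32_t x)
{
  return  (x >> 24)
       | ((x >>  8) & 0x0000ff00u)
       | ((x <<  8) & 0x00ff0000u)
       |  (x << 24);
}
```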
661
662 uint8_t ecb_rotl8 (uint8_t x, unsigned int count)
663 uint16_t ecb_rotl16 (uint16_t x, unsigned int count)
664 uint32_t ecb_rotl32 (uint32_t x, unsigned int count)
665 uint64_t ecb_rotl64 (uint64_t x, unsigned int count)
666 uint8_t ecb_rotr8 (uint8_t x, unsigned int count)
667 uint16_t ecb_rotr16 (uint16_t x, unsigned int count)
668 uint32_t ecb_rotr32 (uint32_t x, unsigned int count)
669 uint64_t ecb_rotr64 (uint64_t x, unsigned int count)
These two families of functions return the value of "x" after
rotating all the bits by "count" positions to the right
("ecb_rotr") or left ("ecb_rotl"). There are no restrictions on the
value of "count", i.e. both zero and values equal to or larger than the
word width work correctly. Also, notwithstanding "count" being
unsigned, negative numbers work and rotate in the opposite
direction.
677
678 Current GCC/clang versions understand these functions and usually
679 compile them to "optimal" code (e.g. a single "rol" or a
680 combination of "shld" on x86).
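One way to get rotations that tolerate any "count" is to mask the count on both shifts. This is a sketch with hypothetical "my_" names; whether ecb.h uses exactly this form is an assumption:

```c
#include <stdint.h>

/* Masking with 31 keeps both shift amounts in the range 0..31, so a
   count of 0, 32 or more never causes an out-of-range shift. */
static uint32_t
my_rotl32 (uint32_t x, unsigned int count)
{
  count &= 31;

  return (x << count) | (x >> (-count & 31));
}

static uint32_t
my_rotr32 (uint32_t x, unsigned int count)
{
  count &= 31;

  return (x >> count) | (x << (-count & 31));
}
```

Because "count" is unsigned, a negative argument wraps around modulo a multiple of 32, which is why it behaves like a rotation in the other direction.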
681
682 T ecb_rotl (T x, unsigned int count) [C++]
683 T ecb_rotr (T x, unsigned int count) [C++]
684 Overloaded C++ rotl/rotr functions.
685
686 "T" must be one of "uint8_t", "uint16_t", "uint32_t" or "uint64_t".
687
688 BIT MIXING, HASHING
689 Sometimes you have an integer and want to distribute its bits well, for
690 example, to use it as a hash in a hashtable. A common example is
691 pointer values, which often only have a limited range (e.g. low and
692 high bits are often zero).
693
694 The following functions try to mix the bits to get a good bias-free
695 distribution. They were mainly made for pointers, but the underlying
696 integer functions are exposed as well.
697
As an added benefit, the functions are reversible, so if you find it
convenient to store only the hash value, you can recover the original
pointer from the hash ("unmix"), as long as your pointers are 32 or 64
bit (if this isn't the case on your platform, drop us a note and we
will add functions for other bit widths).
703
The unmix functions are very slightly slower than the mix functions, so
it is equally very slightly preferable to store the original values
when convenient.
707
The underlying algorithm is subject to change, so currently these
functions are not suitable for persistent hash tables, as their result
value can change between different versions of libecb.
711
712 uintptr_t ecb_ptrmix (void *ptr)
713 Mixes the bits of a pointer so the result is suitable for hash
714 table lookups. In other words, this hashes the pointer value.
715
716 uintptr_t ecb_ptrmix (T *ptr) [C++]
717 Overload the "ecb_ptrmix" function to work for any pointer in C++.
718
719 void *ecb_ptrunmix (uintptr_t v)
720 Unmix the hash value into the original pointer. This only works as
721 long as the hash value is not truncated, i.e. you used "uintptr_t"
722 (or equivalent) throughout to store it.
723
724 T *ecb_ptrunmix<T> (uintptr_t v) [C++]
725 The somewhat less useful template version of "ecb_ptrunmix" for
726 C++. Example:
727
728 sometype *myptr;
729 uintptr_t hash = ecb_ptrmix (myptr);
730 sometype *orig = ecb_ptrunmix<sometype> (hash);
731
732 uint32_t ecb_mix32 (uint32_t v)
733 uint64_t ecb_mix64 (uint64_t v)
Sometimes you don't have a pointer but an integer whose values are
very badly distributed. In this case you can use these integer
versions of the mixing function. No C++ template is provided
currently.
738
739 uint32_t ecb_unmix32 (uint32_t v)
740 uint64_t ecb_unmix64 (uint64_t v)
741 The reverse of the "ecb_mix" functions - they take a mixed/hashed
742 value and recover the original value.
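To illustrate why a mixing function can be reversible at all, here is a sketch of an invertible mixer built from xorshift and odd multiplication steps. This is NOT libecb's algorithm: the "my_" names, the multiplier and the structure are all assumptions chosen for the example:

```c
#include <stdint.h>

#define MY_MIX_MUL 0x9e3779b1u /* arbitrary odd constant, an assumption */

/* Inverse of an odd number modulo 2**32, found by Newton iteration. */
static uint32_t
mul_inverse32 (uint32_t a)
{
  uint32_t inv = a; /* a * a == 1 (mod 8), so 3 bits are already correct */
  int i;

  for (i = 0; i < 4; ++i)
    inv *= 2 - a * inv; /* each step doubles the number of correct bits */

  return inv;
}

static uint32_t
my_mix32 (uint32_t v)
{
  v ^= v >> 16;
  v *= MY_MIX_MUL;
  v ^= v >> 16;

  return v;
}

/* Undoes my_mix32 by applying the inverse of each step in reverse order. */
static uint32_t
my_unmix32 (uint32_t v)
{
  v ^= v >> 16; /* a xorshift by half the word width is its own inverse */
  v *= mul_inverse32 (MY_MIX_MUL);
  v ^= v >> 16;

  return v;
}
```

Every step is a bijection on 32 bit values, so the composition is one as well, and no two inputs can produce the same mixed value.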
743
744 HOST ENDIANNESS CONVERSION
745 uint_fast16_t ecb_be_u16_to_host (uint_fast16_t v)
746 uint_fast32_t ecb_be_u32_to_host (uint_fast32_t v)
747 uint_fast64_t ecb_be_u64_to_host (uint_fast64_t v)
748 uint_fast16_t ecb_le_u16_to_host (uint_fast16_t v)
749 uint_fast32_t ecb_le_u32_to_host (uint_fast32_t v)
750 uint_fast64_t ecb_le_u64_to_host (uint_fast64_t v)
751 Convert an unsigned 16, 32 or 64 bit value from big or little
752 endian to host byte order.
753
The naming convention is "ecb_(be|le)_u(16|32|64)_to_host",
where "be" and "le" stand for big endian and little endian,
respectively.
757
758 uint_fast16_t ecb_host_to_be_u16 (uint_fast16_t v)
759 uint_fast32_t ecb_host_to_be_u32 (uint_fast32_t v)
760 uint_fast64_t ecb_host_to_be_u64 (uint_fast64_t v)
761 uint_fast16_t ecb_host_to_le_u16 (uint_fast16_t v)
762 uint_fast32_t ecb_host_to_le_u32 (uint_fast32_t v)
763 uint_fast64_t ecb_host_to_le_u64 (uint_fast64_t v)
764 Like above, but converts from host byte order to the specified
765 endianness.
766
767 In C++ the following additional template functions are supported:
768
769 T ecb_be_to_host (T v)
770 T ecb_le_to_host (T v)
771 T ecb_host_to_be (T v)
772 T ecb_host_to_le (T v)
773
774 These functions work like their C counterparts, above, but use
775 templates, which make them useful in generic code.
776
777 "T" must be one of "uint8_t", "uint16_t", "uint32_t" or "uint64_t" (so
778 unlike their C counterparts, there is a version for "uint8_t", which
779 again can be useful in generic code).
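What such a conversion does can be sketched portably by going through a byte array, which works without knowing the host byte order. The "my_" functions are hypothetical, use "uint32_t" rather than "uint_fast32_t" for simplicity, and are not libecb's implementation:

```c
#include <stdint.h>
#include <string.h>

/* Lay out the bytes most-significant first, then read them back as a
   native integer: the result holds v in big endian byte order. */
static uint32_t
my_host_to_be_u32 (uint32_t v)
{
  unsigned char b[4];
  uint32_t r;

  b[0] = (unsigned char)(v >> 24);
  b[1] = (unsigned char)(v >> 16);
  b[2] = (unsigned char)(v >>  8);
  b[3] = (unsigned char)(v      );

  memcpy (&r, b, sizeof r);

  return r;
}

/* The reverse direction: interpret the in-memory bytes as big endian. */
static uint32_t
my_be_u32_to_host (uint32_t v)
{
  unsigned char b[4];

  memcpy (b, &v, sizeof b);

  return ((uint32_t)b[0] << 24) | ((uint32_t)b[1] << 16)
       | ((uint32_t)b[2] <<  8) |  (uint32_t)b[3];
}
```

On a big endian host both functions return their argument unchanged; on a little endian host they amount to a byte swap.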
780
781 UNALIGNED LOAD/STORE
These functions load or store unaligned multi-byte values.
783
784 uint_fast16_t ecb_peek_u16_u (const void *ptr)
785 uint_fast32_t ecb_peek_u32_u (const void *ptr)
786 uint_fast64_t ecb_peek_u64_u (const void *ptr)
787 These functions load an unaligned, unsigned 16, 32 or 64 bit value
788 from memory.
789
790 uint_fast16_t ecb_peek_be_u16_u (const void *ptr)
791 uint_fast32_t ecb_peek_be_u32_u (const void *ptr)
792 uint_fast64_t ecb_peek_be_u64_u (const void *ptr)
793 uint_fast16_t ecb_peek_le_u16_u (const void *ptr)
794 uint_fast32_t ecb_peek_le_u32_u (const void *ptr)
795 uint_fast64_t ecb_peek_le_u64_u (const void *ptr)
796 Like above, but additionally convert from big endian ("be") or
797 little endian ("le") byte order to host byte order while doing so.
798
799 ecb_poke_u16_u (void *ptr, uint16_t v)
800 ecb_poke_u32_u (void *ptr, uint32_t v)
801 ecb_poke_u64_u (void *ptr, uint64_t v)
802 These functions store an unaligned, unsigned 16, 32 or 64 bit value
803 to memory.
804
805 ecb_poke_be_u16_u (void *ptr, uint_fast16_t v)
806 ecb_poke_be_u32_u (void *ptr, uint_fast32_t v)
807 ecb_poke_be_u64_u (void *ptr, uint_fast64_t v)
808 ecb_poke_le_u16_u (void *ptr, uint_fast16_t v)
809 ecb_poke_le_u32_u (void *ptr, uint_fast32_t v)
810 ecb_poke_le_u64_u (void *ptr, uint_fast64_t v)
811 Like above, but additionally convert from host byte order to big
812 endian ("be") or little endian ("le") byte order while doing so.
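The usual portable techniques for unaligned access are "memcpy" for native byte order and byte-wise assembly for a fixed byte order. A sketch with hypothetical "my_" names, using "uint32_t" throughout for simplicity (not libecb's implementation):

```c
#include <stdint.h>
#include <string.h>

/* Native byte order: memcpy has no alignment requirements. */
static uint32_t
my_peek_u32_u (const void *ptr)
{
  uint32_t v;

  memcpy (&v, ptr, sizeof v);

  return v;
}

/* Little endian: the first byte in memory is the least significant. */
static uint32_t
my_peek_le_u32_u (const void *ptr)
{
  const unsigned char *p = (const unsigned char *)ptr;

  return  (uint32_t)p[0]        | ((uint32_t)p[1] <<  8)
       | ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Big endian: the first byte in memory is the most significant. */
static void
my_poke_be_u32_u (void *ptr, uint32_t v)
{
  unsigned char *p = (unsigned char *)ptr;

  p[0] = (unsigned char)(v >> 24);
  p[1] = (unsigned char)(v >> 16);
  p[2] = (unsigned char)(v >>  8);
  p[3] = (unsigned char)(v      );
}
```

Casting the pointer to "uint32_t *" and dereferencing it instead would be undefined behaviour on misaligned addresses, even on CPUs that tolerate it.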
813
814 In C++ the following additional template functions are supported:
815
816 T ecb_peek<T> (const void *ptr)
817 T ecb_peek_be<T> (const void *ptr)
818 T ecb_peek_le<T> (const void *ptr)
819 T ecb_peek_u<T> (const void *ptr)
820 T ecb_peek_be_u<T> (const void *ptr)
821 T ecb_peek_le_u<T> (const void *ptr)
822 Similarly to their C counterparts, these functions load an unsigned
823 8, 16, 32 or 64 bit value from memory, with optional conversion
824 from big/little endian.
825
826 Since the type cannot be deduced, it has to be specified
827 explicitly, e.g.
828
829 uint_fast16_t v = ecb_peek<uint16_t> (ptr);
830
831 "T" must be one of "uint8_t", "uint16_t", "uint32_t" or "uint64_t".
832
Unlike their C counterparts, these functions support 8 bit
quantities ("uint8_t") and also have an aligned version (without
the "_u" suffix), all of which hopefully makes them more useful in
generic code.
837
838 ecb_poke (void *ptr, T v)
839 ecb_poke_be (void *ptr, T v)
840 ecb_poke_le (void *ptr, T v)
841 ecb_poke_u (void *ptr, T v)
842 ecb_poke_be_u (void *ptr, T v)
843 ecb_poke_le_u (void *ptr, T v)
Again, similarly to their C counterparts, these functions store an
unsigned 8, 16, 32 or 64 bit value to memory, with optional
conversion to big/little endian.
847
848 "T" must be one of "uint8_t", "uint16_t", "uint32_t" or "uint64_t".
849
Unlike their C counterparts, these functions support 8 bit
quantities ("uint8_t") and also have an aligned version (without
the "_u" suffix), all of which hopefully makes them more useful in
generic code.
854
855 FAST INTEGER TO STRING
Libecb defines a set of very fast integer to decimal string (or integer
to ASCII, "i2a" for short) functions. These work by converting the
integer to a fixed point representation and then successively
multiplying out the topmost digits. Unlike some other, also very fast,
libraries, ecb's algorithm should be completely branchless per digit,
and does not rely on the presence of special CPU instructions (such as
clz).
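The fixed point idea can be sketched as follows for exactly four
digits. This is an illustration written for this manual, not the code
found in ecb.h: the name "sketch_i2a_04", the 32.32 fixed point layout
and the scale constant are assumptions chosen to keep the example easy
to verify.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical sketch, not libecb's code: write exactly 4 decimal digits. */
static char *sketch_i2a_04 (char *ptr, uint32_t value)
{
  /* 2^32 / 1000, rounded up, so the 32.32 fixed point product below is */
  /* never smaller than the exact quotient value / 1000 */
  const uint64_t scale = (UINT64_C (1) << 32) / 1000 + 1;
  uint64_t fix;
  int i;

  assert (value < 10000);

  fix = value * scale;   /* integer part (upper 32 bits) = leading digit */

  for (i = 0; i < 4; ++i)
    {
      *ptr++ = (char)('0' + (fix >> 32));         /* emit the integer part */
      fix = (fix & UINT64_C (0xffffffff)) * 10;   /* move the next digit up */
    }

  return ptr;   /* points just after the last digit, no 0 is written */
}
```

Each digit costs one shift, one mask and one multiplication, with no
data-dependent branch and no division. The rounding error introduced by
the scale constant is small enough for four digits, but wider outputs
need more fractional bits.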
862
863 There is a high level API that takes an "int32_t", "uint32_t",
864 "int64_t" or "uint64_t" as argument, and a low-level API, which is
865 harder to use but supports slightly more formatting options.
866
867 HIGH LEVEL API
868
869 The high level API consists of four functions, one each for "int32_t",
870 "uint32_t", "int64_t" and "uint64_t":
871
872 Example:
873
874 char buf[ECB_I2A_MAX_DIGITS + 1];
875 char *end = ecb_i2a_i32 (buf, 17262);
876 *end = 0;
877 // buf now contains "17262"
878
ECB_I2A_U32_DIGITS (=10)
char *ecb_i2a_u32 (char *ptr, uint32_t value)
Takes a "uint32_t" value and formats it as a decimal number
starting at ptr, using at most "ECB_I2A_U32_DIGITS" characters.
Returns a pointer to just after the generated string, where you
would normally put the terminating 0 character. This function
outputs the minimum number of digits.

ECB_I2A_I32_DIGITS (=11)
char *ecb_i2a_i32 (char *ptr, int32_t value)
Same as "ecb_i2a_u32", but formats an "int32_t" value, including a
minus sign if needed, using at most "ECB_I2A_I32_DIGITS" characters.
891
ECB_I2A_U64_DIGITS (=21)
char *ecb_i2a_u64 (char *ptr, uint64_t value)
ECB_I2A_I64_DIGITS (=20)
char *ecb_i2a_i64 (char *ptr, int64_t value)
Similar to their 32 bit counterparts, these take a 64 bit argument.
897
898 ECB_I2A_MAX_DIGITS (=21)
899 Instead of using a type specific length macro, you can just use
900 "ECB_I2A_MAX_DIGITS", which is good enough for any "ecb_i2a"
901 function.
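To make the calling convention concrete, here is a slow reference
version with the same interface as described for "ecb_i2a_i32". It uses
plain division instead of libecb's algorithm, and the name
"sketch_i2a_i32" is made up for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical reference version, not libecb's algorithm. */
static char *sketch_i2a_i32 (char *ptr, int32_t value)
{
  char tmp[10];   /* a 32 bit magnitude has at most 10 digits */
  int n = 0;
  /* negate in unsigned arithmetic so that INT32_MIN is handled, too */
  uint32_t u = value < 0 ? 0u - (uint32_t)value : (uint32_t)value;

  assert (ptr != 0);

  if (value < 0)
    *ptr++ = '-';

  do
    tmp[n++] = (char)('0' + u % 10);   /* least significant digit first */
  while (u /= 10);

  while (n)
    *ptr++ = tmp[--n];                 /* copy out in reverse order */

  return ptr;   /* the caller stores the terminating 0 here if needed */
}
```

The longest possible output is "-2147483648", which is 11 characters
and matches the value given for "ECB_I2A_I32_DIGITS". A buffer for a
0-terminated string therefore needs one byte more than that.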
902
903 LOW-LEVEL API
904
905 The functions above use a number of low-level APIs which have some
906 strict limitations, but can be used as building blocks (studying
907 "ecb_i2a_i32" and related functions is recommended).
908
909 There are three families of functions: functions that convert a number
910 to a fixed number of digits with leading zeroes ("ecb_i2a_0N", 0 for
911 "leading zeroes"), functions that generate up to N digits, skipping
912 leading zeroes ("_N"), and functions that can generate more digits, but
913 the leading digit has limited range ("_xN").
914
915 None of the functions deal with negative numbers.
916
Example: convert an IP address stored in a "uint32_t" into dotted-quad:
918
919 uint32_t ip = 0x0a000164; // 10.0.1.100
920 char ips[3 * 4 + 3 + 1];
921 char *ptr = ips;
922 ptr = ecb_i2a_3 (ptr, ip >> 24 ); *ptr++ = '.';
923 ptr = ecb_i2a_3 (ptr, (ip >> 16) & 0xff); *ptr++ = '.';
924 ptr = ecb_i2a_3 (ptr, (ip >> 8) & 0xff); *ptr++ = '.';
925 ptr = ecb_i2a_3 (ptr, ip & 0xff); *ptr++ = 0;
926 printf ("ip: %s\n", ips); // prints "ip: 10.0.1.100"
927
928 char *ecb_i2a_02 (char *ptr, uint32_t value) // 32 bit
929 char *ecb_i2a_03 (char *ptr, uint32_t value) // 32 bit
930 char *ecb_i2a_04 (char *ptr, uint32_t value) // 32 bit
931 char *ecb_i2a_05 (char *ptr, uint32_t value) // 64 bit
932 char *ecb_i2a_06 (char *ptr, uint32_t value) // 64 bit
933 char *ecb_i2a_07 (char *ptr, uint32_t value) // 64 bit
934 char *ecb_i2a_08 (char *ptr, uint32_t value) // 64 bit
935 char *ecb_i2a_09 (char *ptr, uint32_t value) // 64 bit
The "ecb_i2a_0N" functions take an unsigned value and convert it
to exactly N digits, returning a pointer to the first character
after the digits. The value must be in range, that is, it must not
need more than N digits. The functions marked with 32 bit do their
calculations internally in 32 bit, the ones marked with 64 bit
internally use 64 bit integers, which might be slow on 32 bit
architectures (the high level API decides on 32 vs. 64 bit versions
using "ECB_64BIT_NATIVE").
943
944 char *ecb_i2a_2 (char *ptr, uint32_t value) // 32 bit
945 char *ecb_i2a_3 (char *ptr, uint32_t value) // 32 bit
946 char *ecb_i2a_4 (char *ptr, uint32_t value) // 32 bit
947 char *ecb_i2a_5 (char *ptr, uint32_t value) // 64 bit
948 char *ecb_i2a_6 (char *ptr, uint32_t value) // 64 bit
949 char *ecb_i2a_7 (char *ptr, uint32_t value) // 64 bit
950 char *ecb_i2a_8 (char *ptr, uint32_t value) // 64 bit
951 char *ecb_i2a_9 (char *ptr, uint32_t value) // 64 bit
Similarly, the "ecb_i2a_N" functions take an unsigned value and
convert it to at most N digits, suppressing leading zeroes, and
returning a pointer to the first character after the digits.
955
956 ECB_I2A_MAX_X5 (=59074)
957 char *ecb_i2a_x5 (char *ptr, uint32_t value) // 32 bit
958 ECB_I2A_MAX_X10 (=2932500665)
959 char *ecb_i2a_x10 (char *ptr, uint32_t value) // 64 bit
960 The "ecb_i2a_xN" functions are similar to the "ecb_i2a_N"
961 functions, but they can generate one digit more, as long as the
962 number is within range, which is given by the symbols
963 "ECB_I2A_MAX_X5" (almost 16 bit range) and "ECB_I2A_MAX_X10" (a bit
964 more than 31 bit range), respectively.
965
For example, the digit part of a 32 bit signed integer just fits
into the "ECB_I2A_MAX_X10" range, so while "ecb_i2a_x10" cannot
convert every 10 digit number, it can convert all 32 bit signed
numbers. Sadly, it's not good enough for 32 bit unsigned numbers.
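The range argument can be checked with a few lines of C. The sketch
below copies the limit quoted above into a local macro (with ecb.h you
would use "ECB_I2A_MAX_X10" directly) and assumes that the limit itself
is still a valid input. All "sketch_" names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Copy of the limit quoted above, for illustration only. */
#define SKETCH_I2A_MAX_X10 UINT32_C (2932500665)

/* Assumption: the limit is inclusive. */
static int sketch_fits_x10 (uint32_t v)
{
  return v <= SKETCH_I2A_MAX_X10;
}

/* Digit part (magnitude) of a signed 32 bit number. */
static uint32_t sketch_magnitude_i32 (int32_t v)
{
  /* negate in unsigned arithmetic so that INT32_MIN does not overflow */
  uint32_t u = v < 0 ? 0u - (uint32_t)v : (uint32_t)v;

  assert (sketch_fits_x10 (u));   /* always true, as u <= 2147483648 */
  return u;
}
```

The largest magnitude, 2147483648, is below the limit, while the
largest "uint32_t" value, 4294967295, is above it.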
970
971 FLOATING POINT FIDDLING
972 ECB_INFINITY [-UECB_NO_LIBM]
973 Evaluates to positive infinity if supported by the platform,
974 otherwise to a truly huge number.
975
976 ECB_NAN [-UECB_NO_LIBM]
977 Evaluates to a quiet NAN if supported by the platform, otherwise to
978 "ECB_INFINITY".
979
980 float ecb_ldexpf (float x, int exp) [-UECB_NO_LIBM]
981 Same as "ldexpf", but always available.
982
983 uint32_t ecb_float_to_binary16 (float x) [-UECB_NO_LIBM]
984 uint32_t ecb_float_to_binary32 (float x) [-UECB_NO_LIBM]
985 uint64_t ecb_double_to_binary64 (double x) [-UECB_NO_LIBM]
986 These functions each take an argument in the native "float" or
987 "double" type and return the IEEE 754 bit representation of it
988 (binary16/half, binary32/single or binary64/double precision).
989
990 The bit representation is just as IEEE 754 defines it, i.e. the
991 sign bit will be the most significant bit, followed by exponent and
992 mantissa.
993
These functions should work even when the native floating point
format isn't IEEE compliant, of course at a speed and code size
penalty, and of course also within reasonable limits (they try to
convert NaNs, infinities and denormals, but will likely convert
negative zero to positive zero).
999
1000 On all modern platforms (where "ECB_STDFP" is true), the compiler
1001 should be able to completely optimise away the 32 and 64 bit
1002 functions.
1003
1004 These functions can be helpful when serialising floats to the
1005 network - you can serialise the return value like a normal
1006 uint16_t/uint32_t/uint64_t.
1007
1008 Another use for these functions is to manipulate floating point
1009 values directly.
1010
1011 Silly example: toggle the sign bit of a float.
1012
/* On gcc-4.7 on amd64, */
/* this results in a single add instruction to toggle the bit, and 4 extra */
/* instructions to move the float value to an integer register and back. */

x = ecb_binary32_to_float (ecb_float_to_binary32 (x) ^ 0x80000000U);
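On platforms where the native "float" already is an IEEE 754 binary32
value, the conversion amounts to reinterpreting the bits. The sketch
below shows that special case only. The "sketch_" functions are
hypothetical stand-ins, not the ecb.h functions, which are documented
to also cope with other native formats.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: only valid where "float" is IEEE 754 binary32. */
static uint32_t sketch_float_to_binary32 (float x)
{
  uint32_t bits;

  assert (sizeof (float) == sizeof (uint32_t));
  memcpy (&bits, &x, sizeof bits);   /* reinterpret, do not convert */
  return bits;
}

/* Hypothetical helper: the reverse direction. */
static float sketch_binary32_to_float (uint32_t bits)
{
  float x;

  assert (sizeof (float) == sizeof (uint32_t));
  memcpy (&x, &bits, sizeof x);
  return x;
}

/* The sign bit is the most significant bit, so one xor toggles it. */
static float sketch_toggle_sign (float x)
{
  return sketch_binary32_to_float (sketch_float_to_binary32 (x) ^ 0x80000000U);
}
```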
1018
1019 float ecb_binary16_to_float (uint16_t x) [-UECB_NO_LIBM]
1020 float ecb_binary32_to_float (uint32_t x) [-UECB_NO_LIBM]
1021 double ecb_binary64_to_double (uint64_t x) [-UECB_NO_LIBM]
The reverse operation of the previous functions - takes the bit
representation of an IEEE binary16, binary32 or binary64 number
(half, single or double precision) and converts it to the native
"float" or "double" format.

These functions should work even when the native floating point
format isn't IEEE compliant, of course at a speed and code size
penalty, and of course also within reasonable limits (they try to
convert normals and denormals, and might be lucky for infinities,
and with extraordinary luck, also for negative zero).

On all modern platforms (where "ECB_STDFP" is true), the compiler
should be able to optimise away these functions completely.
1035
1036 uint16_t ecb_binary32_to_binary16 (uint32_t x)
1037 uint32_t ecb_binary16_to_binary32 (uint16_t x)
Convert an IEEE binary32/single precision value to binary16/half
format, and vice versa, handling all details (round-to-nearest-even,
subnormals, infinity and NaNs) correctly.

These functions are available even under "-DECB_NO_LIBM", since they
do not rely on the platform floating point format. The
"ecb_float_to_binary16" and "ecb_binary16_to_float" functions are
usually what you want.
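To show what such a pure integer conversion involves, here is a sketch
of the widening direction (binary16 to binary32), which is exact and
needs no rounding. It is written for this manual and is not the ecb.h
implementation. The narrowing direction additionally has to round and
is not shown.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical sketch, not libecb's code: widen binary16 bits to binary32. */
static uint32_t sketch_binary16_to_binary32 (uint16_t h)
{
  uint32_t sign = (uint32_t)(h & 0x8000u) << 16;
  int      exp  = (h >> 10) & 0x1f;   /* 5 bit exponent, bias 15 */
  uint32_t man  = h & 0x3ffu;         /* 10 bit mantissa */

  if (exp == 0x1f)                    /* infinity or NaN */
    return sign | 0x7f800000u | (man << 13);

  if (exp == 0)
    {
      if (man == 0)
        return sign;                  /* signed zero */

      /* subnormal: shift until the implicit leading bit appears */
      exp = 1;
      while (!(man & 0x400u))
        {
          man <<= 1;
          --exp;
        }
      man &= 0x3ffu;                  /* drop the now implicit bit */
    }

  /* rebias the exponent from 15 to 127 and widen the mantissa to 23 bits */
  return sign | ((uint32_t)(exp + 112) << 23) | (man << 13);
}

/* A few spot checks against well-known bit patterns. */
static void sketch_binary16_examples (void)
{
  assert (sketch_binary16_to_binary32 (0x3c00) == 0x3f800000u);   /* 1.0 */
  assert (sketch_binary16_to_binary32 (0xc000) == 0xc0000000u);   /* -2.0 */
  assert (sketch_binary16_to_binary32 (0x7c00) == 0x7f800000u);   /* +inf */
}
```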
1046
1047 ARITHMETIC
1048 x = ecb_mod (m, n)
Returns "m" modulo "n", which is the same as the non-negative
remainder of the division operation between "m" and "n", using
floored division. Unlike the C remainder operator "%", this function
ensures that the return value is never negative and that the two
numbers m and m' = m + i * n result in the same value modulo n - in
other words, "ecb_mod" implements the mathematical modulo
operation, which is missing in the language.
1056
1057 "n" must be strictly positive (i.e. ">= 1"), while "m" must be
1058 negatable, that is, both "m" and "-m" must be representable in its
1059 type (this typically excludes the minimum signed integer value, the
1060 same limitation as for "/" and "%" in C).
1061
1062 Current GCC/clang versions compile this into an efficient
1063 branchless sequence on almost all CPUs.
1064
1065 For example, when you want to rotate forward through the members of
1066 an array for increasing "m" (which might be negative), then you
1067 should use "ecb_mod", as the "%" operator might give either
1068 negative results, or change direction for negative values:
1069
for (m = -100; m <= 100; ++m)
  {
    int elem = myarray [ecb_mod (m, ecb_array_length (myarray))];
  }
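The difference to "%" can be made explicit with a small stand-alone
sketch. "sketch_mod" is a hypothetical helper for "int" arguments only,
not the ecb.h implementation, and it uses a conditional where libecb
aims for a branchless sequence.

```c
#include <assert.h>

/* Hypothetical helper, not libecb's implementation: floored modulo. */
static int sketch_mod (int m, int n)
{
  int r;

  assert (n > 0);
  r = m % n;                  /* C99: r is zero or has the sign of m */
  return r < 0 ? r + n : r;   /* shift negative remainders into [0, n) */
}
```

With this helper, "sketch_mod (-1, 5)" yields 4, whereas "-1 % 5"
yields -1 in C99 and later.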
1072
1073 x = ecb_div_rd (val, div)
1074 x = ecb_div_ru (val, div)
1075 Returns "val" divided by "div" rounded down or up, respectively.
1076 "val" and "div" must have integer types and "div" must be strictly
1077 positive. Note that these functions are implemented with macros in
1078 C and with function templates in C++.
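A possible way to obtain both rounding modes from C's truncating
division is sketched below for "int" arguments. The "sketch_" names are
hypothetical, and ecb.h may compute the same results differently.

```c
#include <assert.h>

/* Hypothetical helper: val / div rounded towards negative infinity. */
static int sketch_div_rd (int val, int div)
{
  assert (div > 0);
  /* C truncates towards zero, so only negative inexact results need fixing */
  return val / div - (val % div < 0);
}

/* Hypothetical helper: val / div rounded towards positive infinity. */
static int sketch_div_ru (int val, int div)
{
  assert (div > 0);
  /* only positive inexact results need fixing */
  return val / div + (val % div > 0);
}
```

For example, -7 divided by 2 gives -4 when rounded down and -3 when
rounded up, while 7 divided by 2 gives 3 and 4, respectively.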
1079
1080 UTILITY
1081 element_count = ecb_array_length (name)
1082 Returns the number of elements in the array "name". For example:
1083
int primes[] = { 2, 3, 5, 7, 11 };
int sum = 0;

for (int i = 0; i < ecb_array_length (primes); i++)
  sum += primes [i];
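The classic way to compute an element count in plain C is the sizeof
idiom sketched below. "SKETCH_ARRAY_LENGTH" is a made-up name for this
example, and ecb.h may define "ecb_array_length" differently.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in: total size divided by the size of one element. */
#define SKETCH_ARRAY_LENGTH(name) (sizeof (name) / sizeof ((name)[0]))

static int sketch_sum_primes (void)
{
  static const int primes[] = { 2, 3, 5, 7, 11 };
  int sum = 0;
  size_t i;

  assert (SKETCH_ARRAY_LENGTH (primes) == 5);

  for (i = 0; i < SKETCH_ARRAY_LENGTH (primes); i++)
    sum += primes[i];

  return sum;   /* 2 + 3 + 5 + 7 + 11 = 28 */
}
```

This idiom only works on real arrays. Applied to a pointer (for example
an array that was passed as a function parameter), it silently yields a
wrong count.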
1089
1090 SYMBOLS GOVERNING COMPILATION OF ECB.H ITSELF
1091 These symbols need to be defined before including ecb.h the first time.
1092
1093 ECB_NO_THREADS
1094 If ecb.h is never used from multiple threads, then this symbol can
1095 be defined, in which case memory fences (and similar constructs)
1096 are completely removed, leading to more efficient code and fewer
1097 dependencies.
1098
1099 Setting this symbol to a true value implies "ECB_NO_SMP".
1100
1101 ECB_NO_SMP
1102 The weaker version of "ECB_NO_THREADS" - if ecb.h is used from
1103 multiple threads, but never concurrently (e.g. if the system the
1104 program runs on has only a single CPU with a single core, no
1105 hyperthreading and so on), then this symbol can be defined, leading
1106 to more efficient code and fewer dependencies.
1107
1108 ECB_NO_LIBM
1109 When defined to 1, do not export any functions that might introduce
1110 dependencies on the math library (usually called -lm) - these are
1111 marked with [-UECB_NO_LIBM].
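As a usage sketch, a single-threaded program that wants to avoid the
math library could define the symbols right before the first include.
Which combination makes sense depends on the project, so this is only
one possible configuration.

```c
/* must come before the first #include of ecb.h */
#define ECB_NO_THREADS 1   /* true value, also implies ECB_NO_SMP */
#define ECB_NO_LIBM 1      /* hides everything marked [-UECB_NO_LIBM] */

#include <ecb.h>
```

Alternatively, the symbols can be passed on the compiler command line,
for example with -DECB_NO_LIBM=1, which ensures that every translation
unit sees the same setting.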
1112
UNDOCUMENTED FUNCTIONALITY
1114 ecb.h is full of undocumented functionality as well, some of which is
1115 intended to be internal-use only, some of which we forgot to document,
1116 and some of which we hide because we are not sure we will keep the
1117 interface stable.
1118
1119 While you are welcome to rummage around and use whatever you find
1120 useful (we don't want to stop you), keep in mind that we will change
1121 undocumented functionality in incompatible ways without thinking twice,
1122 while we are considerably more conservative with documented things.
1123
AUTHORS
1125 "libecb" is designed and maintained by:
1126
1127 Emanuele Giaquinta <e.giaquinta@glauco.it>
1128 Marc Alexander Lehmann <schmorp@schmorp.de>
1129
1130
1131
1132perl v5.34.0 2022-01-20 ECB(1)