<float.h>(0P)              POSIX Programmer's Manual             <float.h>(0P)
2
3
4
6 float.h - floating types
7
9 #include <float.h>
10
12 The characteristics of floating types are defined in terms of a model
13 that describes a representation of floating-point numbers and values
14 that provide information about an implementation's floating-point
15 arithmetic.
16
The following parameters are used to define the model for each
floating-point type:
19
20 s Sign (±1).
21
22 b Base or radix of exponent representation (an integer >1).
23
24 e Exponent (an integer between a minimum e_min and a maximum
25 e_max).
26
27 p Precision (the number of base-b digits in the significand).
28
29 f_k Non-negative integers less than b (the significand digits).
30
31
A floating-point number x is defined by the following model:

      x = s * b**e * SUM(k=1..p) f_k * b**(-k),   e_min <= e <= e_max
33
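As a rough illustration of the model, the following sketch splits a double into the parameters s, e, and f_k and rebuilds the value from them. It assumes that FLT_RADIX is 2 and that the argument is finite and normalized. The name rebuild_from_model is hypothetical, and frexp(), ldexp(), and fabs() come from <math.h>, not from this header.

```c
#include <float.h>
#include <math.h>

#if FLT_RADIX != 2
#error "this sketch assumes a binary floating-point format"
#endif

/* Decompose x into sign s, exponent e, and digits f_1..f_p, then rebuild
   it as s * b**e * sum of f_k * b**(-k). Assumes x is finite and
   normalized (or zero). */
double rebuild_from_model(double x)
{
    if (x == 0.0)
        return x;

    int s = (x < 0.0) ? -1 : 1;
    int e;
    double m = frexp(fabs(x), &e);   /* |x| = m * 2**e with 0.5 <= m < 1 */

    double sum = 0.0;
    double weight = 1.0;
    for (int k = 1; k <= DBL_MANT_DIG; k++) {
        weight /= FLT_RADIX;         /* weight is now b**(-k) */
        m *= FLT_RADIX;
        int f_k = (int)m;            /* next digit, 0 <= f_k < b */
        m -= f_k;
        sum += f_k * weight;
    }
    return s * ldexp(sum, e);        /* s * b**e * sum */
}
```

Because frexp() returns a fraction in [0.5, 1), the first extracted digit f_1 is always 1 for a normalized value, which matches the condition f_1>0.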
In addition to normalized floating-point numbers (f_1>0 if x!=0),
floating types may be able to contain other kinds of floating-point
numbers, such as subnormal floating-point numbers (x!=0, e=e_min,
f_1=0) and unnormalized floating-point numbers (x!=0, e>e_min,
f_1=0), and values that are not floating-point numbers, such as
infinities and NaNs. A NaN is an encoding signifying Not-a-Number. A
quiet NaN propagates through almost every arithmetic operation without
raising a floating-point exception; a signaling NaN generally raises a
floating-point exception when occurring as an arithmetic operand.
43
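The kinds of values described above can be told apart at run time. The following is a minimal sketch using the fpclassify() and isfinite() macros from <math.h> (ISO C99); the three helper names are hypothetical. Whether subnormal numbers exist at all depends on the implementation.

```c
#include <float.h>
#include <math.h>

/* 1 if x is a normalized floating-point number (f_1 > 0). */
int is_normalized(double x)
{
    return fpclassify(x) == FP_NORMAL;
}

/* 1 if x is nonzero with e = e_min and f_1 = 0. */
int is_subnormal(double x)
{
    return fpclassify(x) == FP_SUBNORMAL;
}

/* 1 if x is an infinity or a NaN, that is, not a floating-point number
   in the sense of the model. */
int is_non_number(double x)
{
    return !isfinite(x);
}
```

For example, is_normalized(DBL_MIN) is expected to hold, while DBL_MIN / 2 is either subnormal or, on an implementation without subnormals, zero.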
44 The accuracy of the floating-point operations ( '+', '-', '*', '/' )
45 and of the library functions in <math.h> and <complex.h> that return
46 floating-point results is implementation-defined. The implementation
47 may state that the accuracy is unknown.
48
All integer values in the <float.h> header, except FLT_ROUNDS, shall be
constant expressions suitable for use in #if preprocessing directives;
all floating values shall be constant expressions. All except
DECIMAL_DIG, FLT_EVAL_METHOD, FLT_RADIX, and FLT_ROUNDS have separate
names for all three floating-point types. The floating-point model
representation is provided for all values except FLT_EVAL_METHOD and
FLT_ROUNDS.
56
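The distinction between the integer and floating values matters in practice: only the integer ones can be tested by the preprocessor. A small sketch, in which the macro DOUBLE_HAS_53_BITS and the function names are hypothetical:

```c
#include <float.h>

/* Integer-valued macros other than FLT_ROUNDS can drive #if. */
#if DBL_MANT_DIG >= 53 && FLT_RADIX == 2
#define DOUBLE_HAS_53_BITS 1
#else
#define DOUBLE_HAS_53_BITS 0
#endif

/* Floating values such as DBL_MAX are constant expressions, so they can
   initialize objects with static storage duration, but they cannot
   appear in #if. */
static const double largest_double = DBL_MAX;

int double_has_53_bits(void)
{
    return DOUBLE_HAS_53_BITS;
}

double largest(void)
{
    return largest_double;
}
```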
57 The rounding mode for floating-point addition is characterized by the
58 implementation-defined value of FLT_ROUNDS:
59
60 -1 Indeterminable.
61
62 0 Toward zero.
63
64 1 To nearest.
65
66 2 Toward positive infinity.
67
68 3 Toward negative infinity.
69
70
71 All other values for FLT_ROUNDS characterize implementation-defined
72 rounding behavior.
73
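Since FLT_ROUNDS need not be a constant expression, a program normally inspects it at run time. The following sketch maps its value to the wording of the list above; both function names are hypothetical.

```c
#include <float.h>

/* Map a FLT_ROUNDS value to a description of the rounding mode. */
const char *flt_rounds_name(int mode)
{
    switch (mode) {
    case -1: return "indeterminable";
    case 0:  return "toward zero";
    case 1:  return "to nearest";
    case 2:  return "toward positive infinity";
    case 3:  return "toward negative infinity";
    default: return "implementation-defined";
    }
}

/* Evaluate FLT_ROUNDS at the point of the call. */
const char *current_rounding_mode(void)
{
    return flt_rounds_name(FLT_ROUNDS);
}
```

Some implementations report the same value regardless of the rounding mode currently in effect, so the result is best treated as a hint.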
The values of operations with floating operands and values subject to
the usual arithmetic conversions and of floating constants are
evaluated to a format whose range and precision may be greater than
required by the type. The use of evaluation formats is characterized by
the implementation-defined value of FLT_EVAL_METHOD:
79
80 -1 Indeterminable.
81
0 Evaluate all operations and constants just to the range and
  precision of the type.
84
85 1 Evaluate operations and constants of type float and double to
86 the range and precision of the double type; evaluate long double
87 operations and constants to the range and precision of the long
88 double type.
89
90 2 Evaluate all operations and constants to the range and precision
91 of the long double type.
92
93
All other negative values for FLT_EVAL_METHOD characterize
implementation-defined behavior.
96
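One way to see the evaluation method in code is through the types float_t and double_t, which ISO C99 defines in <math.h> (not in this header) to follow FLT_EVAL_METHOD. The sketch below compares their sizes against the expected types; the function name is hypothetical, and equal size is only a rough stand-in for "same type".

```c
#include <float.h>
#include <math.h>   /* float_t and double_t (ISO C99) */

/* 1 if float_t and double_t have the sizes implied by FLT_EVAL_METHOD. */
int evaluation_types_match(void)
{
#if FLT_EVAL_METHOD == 0
    return sizeof(float_t) == sizeof(float)
        && sizeof(double_t) == sizeof(double);
#elif FLT_EVAL_METHOD == 1
    return sizeof(float_t) == sizeof(double)
        && sizeof(double_t) == sizeof(double);
#elif FLT_EVAL_METHOD == 2
    return sizeof(float_t) == sizeof(long double)
        && sizeof(double_t) == sizeof(long double);
#else
    return 1;   /* indeterminable or implementation-defined: nothing to check */
#endif
}
```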
The values given in the following list shall be defined as constant
expressions with implementation-defined values that are greater than or
equal in magnitude (absolute value) to those shown, with the same sign.
100
101 * Radix of exponent representation, b.
102
103 FLT_RADIX
104 2
105
106
107 * Number of base-FLT_RADIX digits in the floating-point significand,
108 p.
109
110 FLT_MANT_DIG
111
112 DBL_MANT_DIG
113
114 LDBL_MANT_DIG
115
116
117 * Number of decimal digits, n, such that any floating-point number in
118 the widest supported floating type with p_max radix b digits can be
119 rounded to a floating-point number with n decimal digits and back
120 again without change to the value.
121
122 DECIMAL_DIG
123 10
124
125
126 * Number of decimal digits, q, such that any floating-point number
127 with q decimal digits can be rounded into a floating-point number
128 with p radix b digits and back again without change to the q decimal
129 digits.
130
131 FLT_DIG
132 6
133
134 DBL_DIG
135 10
136
137 LDBL_DIG
138 10
139
140
* Minimum negative integer such that FLT_RADIX raised to one less than
  that power is a normalized floating-point number, e_min.
143
144 FLT_MIN_EXP
145
146 DBL_MIN_EXP
147
148 LDBL_MIN_EXP
149
150
151 * Minimum negative integer such that 10 raised to that power is in the
152 range of normalized floating-point numbers.
153
154 FLT_MIN_10_EXP
155 -37
156
157 DBL_MIN_10_EXP
158 -37
159
160 LDBL_MIN_10_EXP
161 -37
162
163
* Maximum integer such that FLT_RADIX raised to one less than that
  power is a representable finite floating-point number, e_max.
166
167 FLT_MAX_EXP
168
169 DBL_MAX_EXP
170
171 LDBL_MAX_EXP
172
173
174 * Maximum integer such that 10 raised to that power is in the range of
175 representable finite floating-point numbers.
176
177 FLT_MAX_10_EXP
178 +37
179
180 DBL_MAX_10_EXP
181 +37
182
183 LDBL_MAX_10_EXP
184 +37
185
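The minimum magnitudes listed above can be checked at compile time, and the DECIMAL_DIG guarantee can be exercised at run time. The following is a sketch only; the function name round_trip is hypothetical, and the exact round trip also relies on the C library's printf() and strtod() conversions being correctly rounded.

```c
#include <float.h>
#include <stdio.h>
#include <stdlib.h>

/* Compile-time check of the minimum magnitudes from the list above. */
#if FLT_RADIX < 2 || DECIMAL_DIG < 10 || FLT_DIG < 6 || DBL_DIG < 10 || LDBL_DIG < 10
#error "implementation does not meet the <float.h> digit minimums"
#endif
#if FLT_MIN_10_EXP > -37 || DBL_MIN_10_EXP > -37 || LDBL_MIN_10_EXP > -37
#error "implementation does not meet the <float.h> minimum exponents"
#endif
#if FLT_MAX_10_EXP < 37 || DBL_MAX_10_EXP < 37 || LDBL_MAX_10_EXP < 37
#error "implementation does not meet the <float.h> maximum exponents"
#endif

/* Write x with DECIMAL_DIG significant decimal digits and read it back.
   DECIMAL_DIG is sized for the widest floating type, so it is enough
   digits for a double as well. */
double round_trip(double x)
{
    char buf[64];
    snprintf(buf, sizeof buf, "%.*e", DECIMAL_DIG - 1, x);
    return strtod(buf, NULL);
}
```

With these assumptions, round_trip(x) == x is expected for any finite double x.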
186
187 The values given in the following list shall be defined as constant
188 expressions with implementation-defined values that are greater than or
189 equal to those shown:
190
191 * Maximum representable finite floating-point number.
192
193 FLT_MAX
194 1E+37
195
196 DBL_MAX
197 1E+37
198
199 LDBL_MAX
200 1E+37
201
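In terms of the model, ISO C gives the maximum finite value as (1 - b**(-p)) * b**e_max; that formula is not stated on this page and is quoted here as background. The sketch below rebuilds DBL_MAX from it, assuming FLT_RADIX is 2, and checks the 1E+37 lower bounds; the function names are hypothetical.

```c
#include <float.h>
#include <math.h>

/* (1 - b**(-p)) * b**e_max for type double, assuming b == 2. */
double model_max(void)
{
    double significand = 1.0 - ldexp(1.0, -DBL_MANT_DIG);
    return ldexp(significand, DBL_MAX_EXP);
}

/* 1 if all three maxima meet the lower bound from the list above. */
int max_meets_minimum(void)
{
    return FLT_MAX >= 1E+37F && DBL_MAX >= 1E+37 && LDBL_MAX >= 1E+37L;
}
```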
202
203 The values given in the following list shall be defined as constant
204 expressions with implementation-defined (positive) values that are less
205 than or equal to those shown:
206
* The difference between 1 and the least value greater than 1 that is
  representable in the given floating-point type, b**(1-p).
209
210 FLT_EPSILON
211 1E-5
212
213 DBL_EPSILON
214 1E-9
215
216 LDBL_EPSILON
217 1E-9
218
219
* Minimum normalized positive floating-point number, b**(e_min-1).
221
222 FLT_MIN
223 1E-37
224
225 DBL_MIN
226 1E-37
227
228 LDBL_MIN
229 1E-37
230
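Both quantities in this list can be recomputed from the model parameters, reading the exponents as b**(1-p) and b**(e_min-1). The following sketch does so for type double, assuming FLT_RADIX is 2; the function names are hypothetical and ldexp() comes from <math.h>.

```c
#include <float.h>
#include <math.h>

/* b**(1-p) for type double, assuming b == 2. */
double model_epsilon(void)
{
    return ldexp(1.0, 1 - DBL_MANT_DIG);
}

/* b**(e_min-1) for type double, assuming b == 2. */
double model_min(void)
{
    return ldexp(1.0, DBL_MIN_EXP - 1);
}

/* 1 if DBL_EPSILON is the gap between 1 and the next larger double.
   The volatile store forces the sum to double precision even when
   FLT_EVAL_METHOD is not 0. */
int epsilon_is_gap_above_one(void)
{
    volatile double next = 1.0 + DBL_EPSILON;
    return next > 1.0 && next - 1.0 == DBL_EPSILON;
}
```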
231
232 The following sections are informative.
233
235 None.
236
238 None.
239
241 None.
242
244 <complex.h>, <math.h>
245
247 Portions of this text are reprinted and reproduced in electronic form
248 from IEEE Std 1003.1, 2003 Edition, Standard for Information Technology
249 -- Portable Operating System Interface (POSIX), The Open Group Base
250 Specifications Issue 6, Copyright (C) 2001-2003 by the Institute of
251 Electrical and Electronics Engineers, Inc and The Open Group. In the
252 event of any discrepancy between this version and the original IEEE and
253 The Open Group Standard, the original IEEE and The Open Group Standard
254 is the referee document. The original Standard can be obtained online
255 at http://www.opengroup.org/unix/online.html .
256
257
258
IEEE/The Open Group                  2003                        <float.h>(0P)