.\" Copyright (c) 1985 Regents of the University of California.
.\" All rights reserved.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\" 1. Redistributions of source code must retain the above copyright
.\"    notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\"    notice, this list of conditions and the following disclaimer in the
.\"    documentation and/or other materials provided with the distribution.
.\" 4. Neither the name of the University nor the names of its contributors
.\"    may be used to endorse or promote products derived from this software
.\"    without specific prior written permission.
.\"
.\" THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND
.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
.\" ARE DISCLAIMED.  IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE
.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
.\" SUCH DAMAGE.
.\"
.\"     from: @(#)ieee.3        6.4 (Berkeley) 5/6/91
.\" $FreeBSD$
.\"
.Dd January 26, 2005
.Dt IEEE 3
.Os
.Sh NAME
.Nm ieee
.Nd IEEE standard 754 for floating-point arithmetic
.Sh DESCRIPTION
The IEEE Standard 754 for Binary Floating-Point Arithmetic
defines representations of floating-point numbers and abstract
properties of arithmetic operations relating to precision,
rounding, and exceptional cases, as described below.
.Ss IEEE STANDARD 754 Floating-Point Arithmetic
Radix: Binary.
.Pp
Overflow and underflow:
.Bd -ragged -offset indent -compact
Overflow goes by default to a signed \*(If.
Underflow is
.Em gradual .
.Ed
.Pp
Zero is represented ambiguously as +0 or \-0.
.Bd -ragged -offset indent -compact
Its sign transforms correctly through multiplication or
division, and is preserved by addition of zeros
with like signs; but x\-x yields +0 for every
finite x.
The only operations that reveal zero's
sign are division by zero and
.Fn copysign x \(+-0 .
In particular, comparison (x > y, x \(>= y, etc.)\&
cannot be affected by the sign of zero; but if
finite x = y then \*(If = 1/(x\-y) \(!= \-1/(y\-x) = \-\*(If.
.Ed
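.Pp
The properties of signed zeros stated above can be probed with a short C sketch.
The function names are hypothetical and exist only for illustration;
the sketch assumes the default round-to-nearest mode and a compiler
that is not asked to relax IEEE 754 semantics.
.Bd -literal -offset indent
```c
#include <math.h>

/* Returns 1 when +0 and -0 compare equal. */
int zeros_compare_equal(void)
{
    volatile double pz = 0.0, nz = -0.0;
    return pz == nz;
}

/* Returns 1 when division by zero reveals the sign of zero:
 * 1/(+0) is +infinity while 1/(-0) is -infinity. */
int division_reveals_sign(void)
{
    volatile double pz = 0.0, nz = -0.0;
    return 1.0 / pz > 0.0 && 1.0 / nz < 0.0;
}

/* Returns 1 when copysign() reveals the sign of a negative zero. */
int copysign_reveals_sign(void)
{
    volatile double nz = -0.0;
    return copysign(1.0, nz) == -1.0;
}

/* Returns 1 when x - x is +0 rather than -0 for the finite x given. */
int x_minus_x_is_positive_zero(double x)
{
    volatile double v = x;
    volatile double d = v - v;
    return d == 0.0 && !signbit(d);
}
```
.Ed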
.Pp
Infinity is signed.
.Bd -ragged -offset indent -compact
It persists when added to itself
or to any finite number.
Its sign transforms
correctly through multiplication and division;
(finite)/\(+-\*(If\0=\0\(+-0 and
(nonzero)/0 = \(+-\*(If.
But
\*(If\-\*(If, \*(If\(**0 and \*(If/\*(If
are, like 0/0 and sqrt(\-3),
invalid operations that produce \*(Na. ...
.Ed
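.Pp
A minimal C sketch of the infinity rules above follows.
The function names are hypothetical; the
.Dv INFINITY
macro and
.Fn signbit
come from
.In math.h .
.Bd -literal -offset indent
```c
#include <math.h>

/* Returns 1 when infinity persists under addition of itself
 * and of the finite number given. */
int infinity_persists(double finite)
{
    volatile double inf = INFINITY, f = finite;
    return inf + f == inf && inf + inf == inf;
}

/* Returns 1 when finite/infinity is a zero with the expected sign. */
int finite_over_infinity_is_signed_zero(void)
{
    volatile double inf = INFINITY;
    volatile double q1 = 1.0 / inf;     /* +0 */
    volatile double q2 = 1.0 / -inf;    /* -0 */
    return q1 == 0.0 && !signbit(q1) && q2 == 0.0 && signbit(q2) != 0;
}

/* Returns 1 when inf-inf, inf*0 and inf/inf all yield NaN. */
int invalid_operations_give_nan(void)
{
    volatile double inf = INFINITY, zero = 0.0;
    volatile double a = inf - inf, b = inf * zero, c = inf / inf;
    return a != a && b != b && c != c;
}
```
.Ed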
.Pp
Reserved operands (\*(Nas):
.Bd -ragged -offset indent -compact
An \*(Na is
.Em ( N Ns ot Em a N Ns umber ) .
Some \*(Nas, called Signaling \*(Nas, trap any floating-point operation
performed upon them; they are used to mark missing
or uninitialized values, or nonexistent elements
of arrays.
The rest are Quiet \*(Nas; they are
the default results of Invalid Operations, and
propagate through subsequent arithmetic operations.
If x \(!= x then x is \*(Na; every other predicate
(x > y, x = y, x < y, ...) is FALSE if \*(Na is involved.
.Ed
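.Pp
The comparison rules for quiet \*(Nas can be sketched in C as follows.
All three function names are hypothetical, and the sketch assumes the
compiler preserves IEEE 754 comparison semantics.
.Bd -literal -offset indent
```c
/* Produces a quiet NaN as the default result of 0/0. */
double make_nan(void)
{
    volatile double zero = 0.0;
    return zero / zero;
}

/* Returns 1 when x is a NaN, using only the x != x test. */
int is_nan_by_self_comparison(double x)
{
    volatile double v = x;
    return v != v;
}

/* Counts how many of the ordered predicates hold for (x, y).
 * With a NaN operand the count is 0. */
int true_ordered_predicates(double x, double y)
{
    volatile double a = x, b = y;
    return (a > b) + (a == b) + (a < b) + (a >= b) + (a <= b);
}
```
.Ed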
.Pp
Rounding:
.Bd -ragged -offset indent -compact
Every algebraic operation (+, \-, \(**, /,
\(sr)
is rounded by default to within half an
.Em ulp ,
and when the rounding error is exactly half an
.Em ulp
then
the rounded value's least significant bit is zero.
(An
.Em ulp
is one
.Em U Ns nit
in the
.Em L Ns ast
.Em P Ns lace . )
This kind of rounding is usually the best kind,
sometimes provably so; for instance, for every
x = 1.0, 2.0, 3.0, 4.0, ..., 2.0**52, we find
(x/3.0)\(**3.0 == x and (x/10.0)\(**10.0 == x and ...
even though both the quotients and the products
have been rounded.
Only rounding like IEEE 754 can do that.
But no single kind of rounding can be
proved best for every circumstance, so IEEE 754
provides rounding towards zero or towards
+\*(If or towards \-\*(If
at the programmer's option.
.Ed
.Pp
Exceptions:
.Bd -ragged -offset indent -compact
IEEE 754 recognizes five kinds of floating-point exceptions,
listed below in declining order of probable importance.
.Bl -column -offset indent "Invalid Operation" "Gradual Underflow"
.It Em Exception Ta Em Default Result
.It Invalid Operation Ta \*(Na, or FALSE
.It Overflow Ta \(+-\*(If
.It Divide by Zero Ta \(+-\*(If
.It Underflow Ta Gradual Underflow
.It Inexact Ta Rounded value
.El
.Pp
NOTE: An Exception is not an Error unless handled
badly.
What makes a class of exceptions exceptional
is that no single default response can be satisfactory
in every instance.
On the other hand, if a default
response will serve most instances satisfactorily,
the unsatisfactory instances cannot justify aborting
computation every time the exception occurs.
.Ed
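.Pp
One illustration of a default response serving satisfactorily is sketched below.
The helper
.Fn parallel_resistance
is hypothetical and is not part of any library: with one operand equal
to zero it passes through a Divide by Zero, yet the default result
carries the computation to the correct answer.
.Bd -literal -offset indent
```c
/* Hypothetical helper: resistance of two resistors in parallel,
 * computed as 1/(1/r1 + 1/r2) with no special case for zero. */
double parallel_resistance(double r1, double r2)
{
    volatile double g1 = 1.0 / r1;  /* r1 == 0 gives +infinity */
    volatile double g2 = 1.0 / r2;
    volatile double g = g1 + g2;    /* infinity persists in the sum */
    return 1.0 / g;                 /* 1/infinity is 0 */
}
```
.Ed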
.Ss Data Formats
Single-precision:
.Bd -ragged -offset indent -compact
Type name:
.Vt float
.Pp
Wordsize: 32 bits.
.Pp
Precision: 24 significant bits,
roughly like 7 significant decimals.
.Bd -ragged -offset indent -compact
If x and x' are consecutive positive single-precision
numbers (they differ by 1
.Em ulp ) ,
then
.Bd -ragged -compact
5.9e\-08 < 0.5**24 < (x'\-x)/x \(<= 0.5**23 < 1.2e\-07.
.Ed
.Ed
.Pp
.Bl -column "XXX" -compact
.It Range: Ta Overflow threshold  = 2.0**128 = 3.4e38
.It \& Ta Underflow threshold = 0.5**126 = 1.2e\-38
.El
.Bd -ragged -offset indent -compact
Underflowed results round to the nearest
integer multiple of 0.5**149 = 1.4e\-45.
.Ed
.Ed
.Pp
Double-precision:
.Bd -ragged -offset indent -compact
Type name:
.Vt double
.Bd -ragged -offset indent -compact
On some architectures,
.Vt long double
is the same as
.Vt double .
.Ed
.Pp
Wordsize: 64 bits.
.Pp
Precision: 53 significant bits,
roughly like 16 significant decimals.
.Bd -ragged -offset indent -compact
If x and x' are consecutive positive double-precision
numbers (they differ by 1
.Em ulp ) ,
then
.Bd -ragged -compact
1.1e\-16 < 0.5**53 < (x'\-x)/x \(<= 0.5**52 < 2.3e\-16.
.Ed
.Ed
.Pp
.Bl -column "XXX" -compact
.It Range: Ta Overflow threshold  = 2.0**1024 = 1.8e308
.It \& Ta Underflow threshold = 0.5**1022 = 2.2e\-308
.El
.Bd -ragged -offset indent -compact
Underflowed results round to the nearest
integer multiple of 0.5**1074 = 4.9e\-324.
.Ed
.Ed
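.Pp
The double-precision underflow figures can be probed with the sketch below.
It assumes
.Vt double
is the IEEE 754 double-precision format, so that
.Dv DBL_MIN
is 0.5**1022 and
.Dv DBL_EPSILON
is 0.5**52, and that subnormal numbers are not flushed to zero;
the function names are hypothetical.
.Bd -literal -offset indent
```c
#include <float.h>

/* Returns 1 when halving the underflow threshold gives a nonzero
 * subnormal result instead of zero. */
int underflow_is_gradual(void)
{
    volatile double tiny = DBL_MIN;
    volatile double half = tiny / 2;
    return half > 0.0 && half < DBL_MIN;
}

/* Returns 1 when 0.5**1074 is nonzero and half of it rounds to 0,
 * so no positive number lies below it. */
int smallest_subnormal_is_2_pow_minus_1074(void)
{
    volatile double least = DBL_MIN * DBL_EPSILON;
    volatile double below = least / 2;
    return least > 0.0 && below == 0.0;
}

/* Returns 1 when x != y implies x - y != 0. */
int difference_never_vanishes(double x, double y)
{
    volatile double a = x, b = y;
    volatile double d = a - b;
    return a == b || d != 0.0;
}
```
.Ed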
.Pp
Extended-precision:
.Bd -ragged -offset indent -compact
Type name:
.Vt long double
(when supported by the hardware)
.Pp
Wordsize: 96 bits.
.Pp
Precision: 64 significant bits,
roughly like 19 significant decimals.
.Bd -ragged -offset indent -compact
If x and x' are consecutive positive extended-precision
numbers (they differ by 1
.Em ulp ) ,
then
.Bd -ragged -compact
5.4e\-20 < 0.5**64 < (x'\-x)/x \(<= 0.5**63 < 1.1e\-19.
.Ed
.Ed
.Pp
.Bl -column "XXX" -compact
.It Range: Ta Overflow threshold  = 2.0**16384 = 1.2e4932
.It \& Ta Underflow threshold = 0.5**16382 = 3.4e\-4932
.El
.Bd -ragged -offset indent -compact
Underflowed results round to the nearest
integer multiple of 0.5**16445 = 3.6e\-4951.
.Ed
.Ed
.Pp
Quad-extended-precision:
.Bd -ragged -offset indent -compact
Type name:
.Vt long double
(when supported by the hardware)
.Pp
Wordsize: 128 bits.
.Pp
Precision: 113 significant bits,
roughly like 34 significant decimals.
.Bd -ragged -offset indent -compact
If x and x' are consecutive positive quad-extended-precision
numbers (they differ by 1
.Em ulp ) ,
then
.Bd -ragged -compact
9.6e\-35 < 0.5**113 < (x'\-x)/x \(<= 0.5**112 < 2.0e\-34.
.Ed
.Ed
.Pp
.Bl -column "XXX" -compact
.It Range: Ta Overflow threshold  = 2.0**16384 = 1.2e4932
.It \& Ta Underflow threshold = 0.5**16382 = 3.4e\-4932
.El
.Bd -ragged -offset indent -compact
Underflowed results round to the nearest
integer multiple of 0.5**16494 = 6.5e\-4966.
.Ed
.Ed
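.Pp
Because
.Vt long double
may denote any of the last three formats, a program can ask which one
it was built with.
The sketch below judges only by the number of significant bits reported in
.In float.h ;
the function name and the returned strings are hypothetical.
.Bd -literal -offset indent
```c
#include <float.h>

/* Names the format behind long double on the build platform,
 * judged only by its number of significant bits. */
const char *long_double_format(void)
{
    switch (LDBL_MANT_DIG) {
    case 53:
        return "double";
    case 64:
        return "extended";
    case 113:
        return "quad";
    default:
        return "other";
    }
}
```
.Ed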
.Ss Additional Information Regarding Exceptions
For each kind of floating-point exception, IEEE 754
provides a Flag that is raised each time its exception
is signaled, and stays raised until the program resets
it.
Programs may also test, save and restore a flag.
Thus, IEEE 754 provides three ways by which programs
may cope with exceptions for which the default result
might be unsatisfactory:
.Bl -enum
.It
Test for a condition that might cause an exception
later, and branch to avoid the exception.
.It
Test a flag to see whether an exception has occurred
since the program last reset its flag.
.It
Test a result to see whether it is a value that only
an exception could have produced.
.Pp
CAUTION: The only reliable ways to discover
whether Underflow has occurred are to test whether
products or quotients lie closer to zero than the
underflow threshold, or to test the Underflow
flag.
(Sums and differences cannot underflow in
IEEE 754; if x \(!= y then x\-y is correct to
full precision and certainly nonzero regardless of
how tiny it may be.)
Products and quotients that
underflow gradually can lose accuracy gradually
without vanishing, so comparing them with zero
(as one might on a VAX) will not reveal the loss.
Fortunately, if a gradually underflowed value is
destined to be added to something bigger than the
underflow threshold, as is almost always the case,
digits lost to gradual underflow will not be missed
because they would have been rounded off anyway.
So gradual underflows are usually
.Em provably
ignorable.
The same cannot be said of underflows flushed to 0.
.El
.Pp
At the option of an implementor conforming to IEEE 754,
other ways to cope with exceptions may be provided:
.Bl -enum
.It
ABORT.
This mechanism classifies an exception in
advance as an incident to be handled by means
traditionally associated with error-handling
statements like "ON ERROR GO TO ...".
Different
languages offer different forms of this statement,
but most share the following characteristics:
.Bl -dash
.It
No means is provided to substitute a value for
the offending operation's result and resume
computation from what may be the middle of an
expression.
An exceptional result is abandoned.
.It
In a subprogram that lacks an error-handling
statement, an exception causes the subprogram to
abort within whatever program called it, and so
on back up the chain of calling subprograms until
an error-handling statement is encountered or the
whole task is aborted and memory is dumped.
.El
.It
STOP.
This mechanism, requiring an interactive
debugging environment, is more for the programmer
than the program.
It classifies an exception in
advance as a symptom of a programmer's error; the
exception suspends execution as near as it can to
the offending operation so that the programmer can
look around to see how it happened.
Quite often
the first several exceptions turn out to be quite
unexceptionable, so the programmer ought ideally
to be able to resume execution after each one as if
execution had not been stopped.
.It
\&... Other ways lie beyond the scope of this document.
.El
.Pp
Ideally, each
elementary function should act as if it were indivisible, or
atomic, in the sense that ...
.Bl -enum
.It
No exception should be signaled that is not deserved by
the data supplied to that function.
.It
Any exception signaled should be identified with that
function rather than with one of its subroutines.
.It
The internal behavior of an atomic function should not
be disrupted when a calling program changes from
one to another of the five or so ways of handling
exceptions listed above, although the definition
of the function may be correlated intentionally
with exception handling.
.El
.Pp
The functions in
.Nm libm
are only approximately atomic.
They signal no inappropriate exception except possibly ...
.Bl -tag -width indent -offset indent -compact
.It Xo
Over/Underflow
.Xc
when a result, if properly computed, might have lain barely within range, and
.It Xo
Inexact in
.Fn cabs ,
.Fn cbrt ,
.Fn hypot ,
.Fn log10
and
.Fn pow
.Xc
when it happens to be exact, thanks to fortuitous cancellation of errors.
.El
Otherwise, ...
.Bl -tag -width indent -offset indent -compact
.It Xo
Invalid Operation is signaled only when
.Xc
any result but \*(Na would probably be misleading.
.It Xo
Overflow is signaled only when
.Xc
the exact result would be finite but beyond the overflow threshold.
.It Xo
Divide-by-Zero is signaled only when
.Xc
a function takes exactly infinite values at finite operands.
.It Xo
Underflow is signaled only when
.Xc
the exact result would be nonzero but tinier than the underflow threshold.
.It Xo
Inexact is signaled only when
.Xc
greater range or precision would be needed to represent the exact result.
.El
.Sh SEE ALSO
.Xr fenv 3 ,
.Xr ieee_test 3 ,
.Xr math 3
.Pp
An explanation of IEEE 754 and its proposed extension p854
was published in the IEEE magazine MICRO in August 1984 under
the title "A Proposed Radix- and Word-length-independent
Standard for Floating-point Arithmetic" by
.An "W. J. Cody"
et al.
The manuals for Pascal, C and BASIC on the Apple Macintosh
document the features of IEEE 754 pretty well.
Articles in the IEEE magazine COMPUTER vol.\& 14 no.\& 3 (Mar.\&
1981), and in the ACM SIGNUM Newsletter Special Issue of
Oct.\& 1979, may be helpful although they pertain to
superseded drafts of the standard.
.Sh STANDARDS
.St -ieee754