In computing, octuple precision is a binary floating-point-based computer number format that occupies 32 bytes (256 bits) in computer memory. This 256-bit octuple precision is for applications requiring results in higher than quadruple precision. This format is rarely (if ever) used and very few environments support it.
In its 2008 revision, the IEEE 754 standard specifies a binary256 format among the interchange formats (it is not a basic format), as having:
Sign bit: 1 bit
Exponent width: 19 bits
Significand precision: 237 bits (236 explicitly stored)
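The format is written with an implicit leading bit with value 1 unless the exponent field is all zeros. Thus only 236 bits of the significand appear in the memory format, but the total precision is 237 bits (approximately 71 decimal digits: log_{10}(2^{237}) ≈ 71.344). The bits are laid out as follows, from most to least significant: the sign bit (bit 255), the 19-bit exponent (bits 254 to 236), and the 236-bit stored significand (bits 235 to 0).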
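The field layout can be sketched in Python by treating a binary256 value as a 256-bit unsigned integer. This is only an illustration: `split_fields` and the constant names are hypothetical, not part of any standard library.

```python
# Field widths of binary256: 1 + 19 + 236 = 256 bits (names are illustrative).
SIGN_BITS, EXPONENT_BITS, FRACTION_BITS = 1, 19, 236

def split_fields(bits: int) -> tuple[int, int, int]:
    """Split a 256-bit pattern into (sign, stored exponent, stored significand)."""
    if not 0 <= bits < 1 << 256:
        raise ValueError("expected a 256-bit unsigned integer")
    fraction = bits & ((1 << FRACTION_BITS) - 1)                     # bits 235..0
    exponent = (bits >> FRACTION_BITS) & ((1 << EXPONENT_BITS) - 1)  # bits 254..236
    sign = bits >> (FRACTION_BITS + EXPONENT_BITS)                   # bit 255
    return sign, exponent, fraction

# Bit pattern of 1.0: sign 0, stored exponent 0x3FFFF, stored significand 0.
one = 0x3FFFF << FRACTION_BITS
print(split_fields(one))  # (0, 262143, 0)
```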
The octuple-precision binary floating-point exponent is encoded using an offset binary representation with a zero offset of 262143, known as the exponent bias in the IEEE 754 standard.
Thus, as defined by the offset binary representation, in order to get the true exponent the offset of 262143 has to be subtracted from the stored exponent.
The stored exponents 00000_{16} and 7FFFF_{16} are interpreted specially.
Exponent | Significand zero | Significand non-zero | Equation |
---|---|---|---|
00000_{16} | 0, −0 | subnormal numbers | (-1)^{signbit} × 2^{−262142} × 0.significandbits_{2} |
00001_{16}, ..., 7FFFE_{16} | normalized value | normalized value | (-1)^{signbit} × 2^{exponentbits_{2} − 262143} × 1.significandbits_{2} |
7FFFF_{16} | ±∞ | NaN (quiet, signalling) | |
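
The interpretation rules in the table can be sketched as a small decoder. This is a minimal illustration, assuming the value arrives as a 256-bit unsigned integer. The function name `decode` and the string return values for infinities and NaN are choices made for this example only.

```python
from fractions import Fraction

BIAS = 262143            # exponent bias, 0x3FFFF
FRACTION_BITS = 236      # explicitly stored significand bits
EXPONENT_MAX = 0x7FFFF   # all-ones exponent field (also used as the field mask)

def decode(bits: int):
    """Return the exact value of a 256-bit pattern as a Fraction,
    or one of the strings "+inf", "-inf", "nan" for the special encodings."""
    fraction = bits & ((1 << FRACTION_BITS) - 1)
    exponent = (bits >> FRACTION_BITS) & EXPONENT_MAX
    sign = -1 if bits >> 255 else 1

    if exponent == EXPONENT_MAX:
        if fraction:
            return "nan"
        return "+inf" if sign > 0 else "-inf"
    if exponent == 0:
        # Zero or subnormal: no implicit leading 1, exponent fixed at 1 - BIAS.
        significand, true_exponent = fraction, 1 - BIAS
    else:
        # Normal: restore the implicit leading 1 and subtract the bias.
        significand, true_exponent = fraction | (1 << FRACTION_BITS), exponent - BIAS
    # The integer significand is scaled by 2**236, hence the extra shift.
    return sign * Fraction(significand) * Fraction(2) ** (true_exponent - FRACTION_BITS)

print(decode(0x3FFFF << 236))   # 1
print(decode(0x7FFFF << 236))   # +inf
```

Because the result is an exact rational, this sketch cannot distinguish −0 from +0; a fuller decoder would return the sign separately.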
The minimum strictly positive (subnormal) value is 2^{−262378} ≈ 10^{−78984} and has a precision of only one bit. The minimum positive normal value is 2^{−262142} ≈ 2.4824 × 10^{−78913}. The maximum representable value is 2^{262144} − 2^{261907} ≈ 1.6113 × 10^{78913}.
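
These range limits can be checked approximately with Python's standard `decimal` module, whose default exponent range is wide enough for them. The variable names are illustrative, and the working precision of 30 digits is an arbitrary choice for this sketch.

```python
from decimal import Decimal, getcontext

getcontext().prec = 30  # decimal working precision; far more than the digits printed

two = Decimal(2)
min_subnormal = two ** -262378   # 2^(-262142) * 2^(-236)
min_normal = two ** -262142
# The subtracted term is 2^(-237) times the first, so it vanishes at this
# working precision; it is kept only to mirror the exact formula.
max_normal = two ** 262144 - two ** 261907

for name, value in (("min subnormal", min_subnormal),
                    ("min normal", min_normal),
                    ("max normal", max_normal)):
    print(f"{name:>13}: {value:.4E}")
```

The printed mantissas and decimal exponents should agree with the approximations quoted above to the digits shown.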
These examples are given in bit representation, in hexadecimal, of the floating-point value. This includes the sign, (biased) exponent, and significand.
0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000_{16} = +0
8000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000_{16} = −0
7fff f000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000_{16} = +infinity
ffff f000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000_{16} = −infinity
0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0001_{16} = 2^{−262142} × 2^{−236} = 2^{−262378} ≈ 2.24800708647703657297018614776265182597360918266100276294348974547709294462 × 10^{−78984} (smallest positive subnormal number)
0000 0fff ffff ffff ffff ffff ffff ffff ffff ffff ffff ffff ffff ffff ffff ffff_{16} = 2^{−262142} × (1 − 2^{−236}) ≈ 2.4824279514643497882993282229138717236776877060796468692709532979137875392 × 10^{−78913} (largest subnormal number)
0000 1000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000_{16} = 2^{−262142} ≈ 2.48242795146434978829932822291387172367768770607964686927095329791378756168 × 10^{−78913} (smallest positive normal number)
7fff efff ffff ffff ffff ffff ffff ffff ffff ffff ffff ffff ffff ffff ffff ffff_{16} = 2^{262143} × (2 − 2^{−236}) ≈ 1.61132571748576047361957211845200501064402387454966951747637125049607182699 × 10^{78913} (largest normal number)
3fff efff ffff ffff ffff ffff ffff ffff ffff ffff ffff ffff ffff ffff ffff ffff_{16} = 1 − 2^{−237} ≈ 0.999999999999999999999999999999999999999999999999999999999999999999999995472 (largest number less than one)
3fff f000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000_{16} = 1 (one)
3fff f000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0001_{16} = 1 + 2^{−236} ≈ 1.00000000000000000000000000000000000000000000000000000000000000000000000906 (smallest number larger than one)
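
Bit patterns like the ones listed above can be reproduced by packing the three fields into a 256-bit integer and printing it as sixteen groups of four hexadecimal digits. The helper `to_hex_groups` is a hypothetical name used only for this sketch.

```python
def to_hex_groups(sign: int, stored_exponent: int, stored_significand: int) -> str:
    """Pack sign (1 bit), exponent (19 bits) and significand (236 bits)
    into 256 bits and format them as sixteen 4-digit hex groups."""
    bits = (sign << 255) | (stored_exponent << 236) | stored_significand
    digits = f"{bits:064x}"
    return " ".join(digits[i:i + 4] for i in range(0, 64, 4))

print(to_hex_groups(0, 0x3FFFF, 0))               # one
print(to_hex_groups(0, 0x7FFFE, (1 << 236) - 1))  # largest normal number
print(to_hex_groups(1, 0x7FFFF, 0))               # negative infinity
```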
By default, 1/3 rounds down, as it does in double precision, because the significand has an odd number of bits. The bits beyond the rounding point are 0101..., which is less than 1/2 of a unit in the last place.
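
The rounding of 1/3 can be verified with exact rational arithmetic. The sketch below is illustrative only: it scales 1/3 so that one unit in the last place equals 1, inspects the discarded tail, and then builds the resulting bit pattern. That bit pattern is derived by the sketch rather than quoted from a reference.

```python
from fractions import Fraction

PRECISION = 237   # significand bits, including the implicit leading 1
BIAS = 262143

exponent = -2     # 1/3 = 1.0101..._2 x 2**-2
# Exact significand of 1/3, measured in units of the last place (ulps).
scaled = Fraction(1, 3) * 2 ** (PRECISION - 1 - exponent)
truncated = scaled.numerator // scaled.denominator   # the 237 bits that are kept
remainder = scaled - truncated                       # discarded tail, in ulps

print(remainder)  # 1/3

# Round to nearest; no tie occurs here, so the ties-to-even rule is not exercised.
rounded = truncated + 1 if remainder > Fraction(1, 2) else truncated

stored = rounded - (1 << (PRECISION - 1))            # drop the implicit leading 1
bits = ((exponent + BIAS) << (PRECISION - 1)) | stored
print(f"{bits:064x}")
```

The remainder of 1/3 ulp is below one half, so the kept bits are left unchanged and the stored significand is the repeating pattern 0101...01.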
Octuple precision is rarely implemented because it is so rarely needed. Apple Inc. had an implementation of addition, subtraction and multiplication of octuple-precision numbers with a 224-bit two's complement significand and a 32-bit exponent.^{[1]} One can use general arbitrary-precision arithmetic libraries to obtain octuple (or higher) precision, but specialized octuple-precision implementations may achieve higher performance.
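
As a rough illustration of the arbitrary-precision route, the sketch below emulates a single correctly rounded binary256 multiplication using Python's exact `Fraction` type. The helper `round_to_precision` is hypothetical, and it deliberately ignores the exponent range (overflow and subnormals), so it models only the 237-bit significand.

```python
from fractions import Fraction

PRECISION = 237  # binary256 significand bits, including the implicit one

def round_to_precision(x: Fraction, precision: int = PRECISION) -> Fraction:
    """Round x to `precision` significant bits, round-half-to-even.

    Hypothetical helper: the exponent range (overflow, subnormals) is ignored.
    """
    if x == 0:
        return Fraction(0)
    sign = -1 if x < 0 else 1
    x = abs(x)
    # Estimate floor(log2(x)); the bit-length difference is at most one too high.
    e = x.numerator.bit_length() - x.denominator.bit_length()
    if Fraction(2) ** e > x:
        e -= 1
    ulp = Fraction(2) ** (e - precision + 1)  # spacing of representable values near x
    q, r = divmod(x, ulp)                     # x = q * ulp + r with 0 <= r < ulp
    if 2 * r > ulp or (2 * r == ulp and q % 2 == 1):
        q += 1
    return sign * q * ulp

a = round_to_precision(Fraction(1, 3))
b = round_to_precision(Fraction(1, 10))
product = round_to_precision(a * b)  # exact product, then a single rounding
print(float(product))
```

Exact rationals keep the example short but are slow; a dedicated implementation would work on fixed-width integer significands instead.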
There is no hardware support for octuple precision. Apart from astrophysical simulations, octuple-precision arithmetic is too impractical for most commercial uses, making its implementation very rare.
The IEEE Standard for Floating-Point Arithmetic (IEEE 754) is a technical standard for floating-point arithmetic established in 1985 by the Institute of Electrical and Electronics Engineers (IEEE). The standard addressed many problems found in the diverse floating-point implementations that made them difficult to use reliably and portably. Many hardware floating-point units use the IEEE 754 standard.
The standard defines:
arithmetic formats: sets of binary and decimal floating-point data, which consist of finite numbers (including signed zeros and subnormal numbers), infinities, and special "not a number" values (NaNs)
interchange formats: encodings (bit strings) that may be used to exchange floating-point data in an efficient and compact form
rounding rules: properties to be satisfied when rounding numbers during arithmetic and conversions
operations: arithmetic and other operations (such as trigonometric functions) on arithmetic formats
exception handling: indications of exceptional conditions (such as division by zero, overflow, etc.)
The IEEE 754-2008 revision, published in August 2008, includes nearly all of the original IEEE 754-1985 standard plus the IEEE 854-1987 Standard for Radix-Independent Floating-Point Arithmetic; it has since been superseded by IEEE 754-2019.
Precision (computer science)
In computer science, the precision of a numerical quantity is a measure of the detail in which the quantity is expressed. This is usually measured in bits, but sometimes in decimal digits. It is related to precision in mathematics, which describes the number of digits that are used to express a value.
Some of the standardized precision formats are
Half-precision floating-point format
Single-precision floating-point format
Double-precision floating-point format
Quadruple-precision floating-point format
Octuple-precision floating-point format
Of these, octuple-precision format is rarely used. The single- and double-precision formats are most widely used and supported on nearly all platforms. The use of half-precision format has been increasing, especially in the field of machine learning, since many machine learning algorithms are inherently error-tolerant.
This page is based on a Wikipedia article.
Text is available under the CC BY-SA 3.0 license; additional terms may apply.