Doubleprecision floatingpoint format is a computer number format, usually occupying 64 bits in computer memory; it represents a wide dynamic range of numeric values by using a floating radix point.
Floating point is used to represent fractional values, or when a wider range is needed than is provided by fixed point (of the same bit width), even if at the cost of precision. Double precision may be chosen when the range or precision of single precision would be insufficient.
In the IEEE 7542008 standard, the 64bit base2 format is officially referred to as binary64; it was called double in IEEE 7541985. IEEE 754 specifies additional floatingpoint formats, including 32bit base2 single precision and, more recently, base10 representations.
One of the first programming languages to provide single and doubleprecision floatingpoint data types was Fortran. Before the widespread adoption of IEEE 7541985, the representation and properties of floatingpoint data types depended on the computer manufacturer and computer model, and upon decisions made by programminglanguage implementers. E.g., GWBASIC's doubleprecision data type was the 64bit MBF floatingpoint format.
Doubleprecision binary floatingpoint is a commonly used format on PCs, due to its wider range over singleprecision floating point, in spite of its performance and bandwidth cost. As with singleprecision floatingpoint format, it lacks precision on integer numbers when compared with an integer format of the same size. It is commonly known simply as double. The IEEE 754 standard specifies a binary64 as having:
The sign bit determines the sign of the number (including when this number is zero, which is signed).
The exponent field can be interpreted as either an 11bit signed integer from −1024 to 1023 (2's complement) or an 11bit unsigned integer from 0 to 2047, which is the accepted biased form in the IEEE 754 binary64 definition. If the unsigned integer format is used, the exponent value used in the arithmetic is the exponent shifted by a bias – for the IEEE 754 binary64 case, an exponent value of 1023 represents the actual zero (i.e. for 2^{e − 1023} to be one, e must be 1023). Exponents range from −1022 to +1023 because exponents of −1023 (all 0s) and +1024 (all 1s) are reserved for special numbers.
The 53bit significand precision gives from 15 to 17 significant decimal digits precision (2^{−53} ≈ 1.11 × 10^{−16}). If a decimal string with at most 15 significant digits is converted to IEEE 754 doubleprecision representation, and then converted back to a decimal string with the same number of digits, the final result should match the original string. If an IEEE 754 doubleprecision number is converted to a decimal string with at least 17 significant digits, and then converted back to doubleprecision representation, the final result must match the original number.^{[1]}
The format is written with the significand having an implicit integer bit of value 1 (except for special data, see the exponent encoding below). With the 52 bits of the fraction significand appearing in the memory format, the total precision is therefore 53 bits (approximately 16 decimal digits, 53 log_{10}(2) ≈ 15.955). The bits are laid out as follows:
The real value assumed by a given 64bit doubleprecision datum with a given biased exponent and a 52bit fraction is
or
Between 2^{52}=4,503,599,627,370,496 and 2^{53}=9,007,199,254,740,992 the representable numbers are exactly the integers. For the next range, from 2^{53} to 2^{54}, everything is multiplied by 2, so the representable numbers are the even ones, etc. Conversely, for the previous range from 2^{51} to 2^{52}, the spacing is 0.5, etc.
The spacing as a fraction of the numbers in the range from 2^{n} to 2^{n+1} is 2^{n−52}. The maximum relative rounding error when rounding a number to the nearest representable one (the machine epsilon) is therefore 2^{−53}.
The 11 bit width of the exponent allows the representation of numbers between 10^{−308} and 10^{308}, with full 15–17 decimal digits precision. By compromising precision, the subnormal representation allows even smaller values up to about 5 × 10^{−324}.
The doubleprecision binary floatingpoint exponent is encoded using an offsetbinary representation, with the zero offset being 1023; also known as exponent bias in the IEEE 754 standard. Examples of such representations would be:
e =00000000001_{2} =001_{16} =1:

(smallest exponent for normal numbers)  
e =01111111111_{2} =3ff_{16} =1023:

(zero offset)  
e =10000000101_{2} =405_{16} =1029:


e =11111111110_{2} =7fe_{16} =2046:

(highest exponent) 
The exponents 000_{16}
and 7ff_{16}
have a special meaning:
00000000000_{2}
=000_{16}
is used to represent a signed zero (if F = 0) and subnormals (if F ≠ 0); and11111111111_{2}
=7ff_{16}
is used to represent ∞ (if F = 0) and NaNs (if F ≠ 0),where F is the fractional part of the significand. All bit patterns are valid encoding.
Except for the above exceptions, the entire doubleprecision number is described by:
In the case of subnormals (e = 0) the doubleprecision number is described by:
Although the ubiquitous x86 processors of today use littleendian storage for all types of data (integer, floating point, BCD), there are a number of hardware architectures where floatingpoint numbers are represented in bigendian form while integers are represented in littleendian form.^{[2]} There are ARM processors that have half littleendian, half bigendian floatingpoint representation for doubleprecision numbers: both 32bit words are stored in littleendian like integer registers, but the most significant one first. Because there have been many floatingpoint formats with no "network" standard representation for them, the XDR standard uses bigendian IEEE 754 as its representation. It may therefore appear strange that the widespread IEEE 754 floatingpoint standard does not specify endianness.^{[3]} Theoretically, this means that even standard IEEE floatingpoint data written by one machine might not be readable by another. However, on modern standard computers (i.e., implementing IEEE 754), one may in practice safely assume that the endianness is the same for floatingpoint numbers as for integers, making the conversion straightforward regardless of data type. (Small embedded systems using special floatingpoint formats may be another matter however.)
0 01111111111 0000000000000000000000000000000000000000000000000000_{2} ≙ +2^{0} × 1 = 1

0 01111111111 0000000000000000000000000000000000000000000000000001_{2} ≙ +2^{0} × (1 + 2^{−52}) ≈ 1.0000000000000002, the smallest number > 1

0 01111111111 0000000000000000000000000000000000000000000000000010_{2} ≙ +2^{0} × (1 + 2^{−51}) ≈ 1.0000000000000004

0 10000000000 0000000000000000000000000000000000000000000000000000_{2} ≙ +2^{1} × 1 = 2

1 10000000000 0000000000000000000000000000000000000000000000000000_{2} ≙ −2^{1} × 1 = −2

0 10000000000 1000000000000000000000000000000000000000000000000000_{2} ≙ +2^{1} × 1.1_{2}

= 11_{2} = 3 
0 10000000001 0000000000000000000000000000000000000000000000000000_{2} ≙ +2^{2} × 1

= 100_{2} = 4 
0 10000000001 0100000000000000000000000000000000000000000000000000_{2} ≙ +2^{2} × 1.01_{2}

= 101_{2} = 5 
0 10000000001 1000000000000000000000000000000000000000000000000000_{2} ≙ +2^{2} × 1.1_{2}

= 110_{2} = 6 
0 10000000011 0111000000000000000000000000000000000000000000000000_{2} ≙ +2^{4} × 1.0111_{2}

= 10111_{2} = 23 
0 01111111000 1000000000000000000000000000000000000000000000000000_{2} ≙ +2^{7} × 1.1_{2}

= 0.00000011_{2} = 0.01171875 (3/256) 
0 00000000000 0000000000000000000000000000000000000000000000000001_{2}

≙ +2^{−1022} × 2^{−52} = 2^{−1074} ≈ 4.9406564584124654 × 10^{−324} 
(Min. subnormal positive double) 
0 00000000000 1111111111111111111111111111111111111111111111111111_{2}

≙ +2^{−1022} × (1 − 2^{−52}) ≈ 2.2250738585072009 × 10^{−308} 
(Max. subnormal double) 
0 00000000001 0000000000000000000000000000000000000000000000000000_{2}

≙ +2^{−1022} × 1 ≈ 2.2250738585072014 × 10^{−308} 
(Min. normal positive double) 
0 11111111110 1111111111111111111111111111111111111111111111111111_{2}

≙ +2^{1023} × (1 + (1 − 2^{−52})) ≈ 1.7976931348623157 × 10^{308} 
(Max. Double) 
0 00000000000 0000000000000000000000000000000000000000000000000000_{2} ≙ +0


1 00000000000 0000000000000000000000000000000000000000000000000000_{2} ≙ −0


0 11111111111 0000000000000000000000000000000000000000000000000000_{2} ≙ +∞

(positive infinity)  
1 11111111111 0000000000000000000000000000000000000000000000000000_{2} ≙ −∞

(negative infinity)  
0 11111111111 0000000000000000000000000000000000000000000000000001_{2} ≙ NaN

(sNaN on most processors, such as x86 and ARM)  
0 11111111111 1000000000000000000000000000000000000000000000000001_{2} ≙ NaN

(qNaN on most processors, such as x86 and ARM)  
0 11111111111 1111111111111111111111111111111111111111111111111111_{2} ≙ NaN

(an alternative encoding) 
0 01111111101 0101010101010101010101010101010101010101010101010101_{2} = 3fd5 5555 5555 5555_{16}

≙ +2^{−2} × (1 + 2^{−2} + 2^{−4} + ... + 2^{−52}) ≈ ^{1}/_{3} 
0 10000000000 1001001000011111101101010100010001000010110100011000_{2} = 4009 21fb 5444 2d18_{16}

≈ pi 
Encodings of qNaN and sNaN are not completely specified in IEEE 754 and depend on the processor. Most processors, such as the x86 family and the ARM family processors, use the most significant bit of the significand field to indicate a quiet NaN; this is what is recommended by IEEE 754. The PARISC processors use the bit to indicate a signaling NaN.
By default, ^{1}/_{3} rounds down, instead of up like single precision, because of the odd number of bits in the significand.
In more detail:
Given the hexadecimal representation 3FD5 5555 5555 5555_{16}, Sign = 0 Exponent = 3FD_{16} = 1021 Exponent Bias = 1023 (constant value; see above) Fraction = 5 5555 5555 5555_{16} Value = 2^{(Exponent − Exponent Bias)} × 1.Fraction – Note that Fraction must not be converted to decimal here = 2^{−2} × (15 5555 5555 5555_{16} × 2^{−52}) = 2^{−54} × 15 5555 5555 5555_{16} = 0.333333333333333314829616256247390992939472198486328125 ≈ 1/3
Using doubleprecision floatingpoint variables and mathematical functions (e.g., sin, cos, atan2, log, exp and sqrt) are slower than working with their single precision counterparts. One area of computing where this is a particular issue is for parallel code running on GPUs. For example, when using NVIDIA's CUDA platform, calculations with double precision take approximately 2 times longer to complete compared to those done using single precision.^{[4]}
Doubles are implemented in many programming languages in different ways such as the following. On processors with only dynamic precision, such as x86 without SSE2 (or when SSE2 is not used, for compatibility purpose) and with extended precision used by default, software may have difficulties to fulfill some requirements.
C and C++ offer a wide variety of arithmetic types. Double precision is not required by the standards (except by the optional annex F of C99, covering IEEE 754 arithmetic), but on most systems, the double
type corresponds to double precision. However, on 32bit x86 with extended precision by default, some compilers may not conform to the C standard and/or the arithmetic may suffer from double rounding.^{[5]}
Common Lisp provides the types SHORTFLOAT, SINGLEFLOAT, DOUBLEFLOAT and LONGFLOAT. Most implementations provide SINGLEFLOATs and DOUBLEFLOATs with the other types appropriate synonyms. Common Lisp provides exceptions for catching floatingpoint underflows and overflows, and the inexact floatingpoint exception, as per IEEE 754. No infinities and NaNs are described in the ANSI standard, however, several implementations do provide these as extensions.
On Java before version 1.2, every implementation had to be IEEE 754 compliant. Version 1.2 allowed implementations to bring extra precision in intermediate computations for platforms like x87. Thus a modifier strictfp was introduced to enforce strict IEEE 754 computations.
As specified by the ECMAScript standard, all arithmetic in JavaScript shall be done using doubleprecision floatingpoint arithmetic.^{[6]}
170 (one hundred [and] seventy) is the natural number following 169 and preceding 171.
Exponentially modified Gaussian distributionIn probability theory, an exponentially modified Gaussian (EMG) distribution (exGaussian distribution) describes the sum of independent normal and exponential random variables. An exGaussian random variable Z may be expressed as Z = X + Y, where X and Y are independent, X is Gaussian with mean μ and variance σ2, and Y is exponential of rate λ. It has a characteristic positive skew from the exponential component.
It may also be regarded as a weighted function of a shifted exponential with the weight being a function of the normal distribution.
Glossary of computer scienceMost of the terms listed in Wikipedia glossaries are already defined and explained within Wikipedia itself. However, glossaries like this one are useful for looking up, comparing and reviewing large numbers of terms together. You can help enhance this page by adding new terms or writing definitions for existing ones.
This glossary of computer science terms is a list of definitions about computer science, its subdisciplines, and related fields.
IBM hexadecimal floating pointIBM System/360 computers, and subsequent machines based on that architecture (mainframes), support a hexadecimal floatingpoint format (HFP).In comparison to IEEE 754 floatingpoint, the IBM floatingpoint format has a longer significand, and a shorter exponent. All IBM floatingpoint formats have 7 bits of exponent with a bias of 64. The normalized range of representable numbers is from 16−65 to 1663 (approx. 5.39761 × 10−79 to 7.237005 × 1075).
The number is represented as the following formula: (−1)sign × 0.significand × 16exponent−64.
JSONIn computing, JavaScript Object Notation (JSON) ( "Jason", ) is an openstandard file format that uses humanreadable text to transmit data objects consisting of attribute–value pairs and array data types (or any other serializable value). It is a very common data format used for asynchronous browser–server communication, including as a replacement for XML in some AJAXstyle systems.JSON is a languageindependent data format. It was derived from JavaScript, but as of 2017 many programming languages include code to generate and parse JSONformat data. The official Internet media type for JSON is application/json. JSON filenames use the extension .json.
Douglas Crockford originally specified the JSON format in the early 2000s; two competing standards, RFC 8259 and ECMA404, defined it in 2017. The ECMA standard describes only the allowed syntax, whereas the RFC covers some security and interoperability considerations.A restricted profile of JSON, known as IJSON (short for "Internet JSON"), seeks to overcome some of the interoperability problems with JSON. It is defined in RFC 7493.
John Gustafson (scientist)John Leroy Gustafson (born January 19, 1955) is an American computer scientist and businessman, chiefly known for his work in High Performance Computing (HPC) such as the invention of Gustafson's law, introducing the first commercial computer cluster, measuring with QUIPS, leading the reconstruction of the Atanasoff–Berry computer, inventing the unum number format and computation system, and several awards for computer speedup. Currently he is the Chief Technology Officer at Ceranovo, Inc. He was the Chief Graphics Product Architect and Senior Fellow at AMD from September 2012 until June 2013, and he previously held the positions of Director of Intel LabsSC, CEO of Massively Parallel Technologies, Inc. and CTO at ClearSpeed Technology. Gustafson holds applied mathematics degrees from the California Institute of Technology and Iowa State University.
Numeric precision in Microsoft ExcelAs with other spreadsheets, Microsoft Excel works only to limited accuracy because it retains only a certain number of figures to describe numbers (it has limited precision). With some exceptions regarding erroneous values, infinities, and denormalized numbers, Excel calculates in doubleprecision floatingpoint format from the IEEE 754 specification (besides numbers, Excel uses a few other data types). Although Excel can display 30 decimal places, its precision for a specified number is confined to 15 significant figures, and calculations may have an accuracy that is even less due to three issues: round off, truncation, and binary storage.
Orders of magnitude (numbers)This list contains selected positive numbers in increasing order, including counts of things, dimensionless quantity and probabilities. Each number is given a name in the short scale, which is used in Englishspeaking countries, as well as a name in the long scale, which is used in some of the countries that do not have English as their national language.
Poisson distributionIn probability theory and statistics, the Poisson distribution (French pronunciation: [pwasɔ̃]; in English often rendered ), named after French mathematician Siméon Denis Poisson, is a discrete probability distribution that expresses the probability of a given number of events occurring in a fixed interval of time or space if these events occur with a known constant rate and independently of the time since the last event. The Poisson distribution can also be used for the number of events in other specified intervals such as distance, area or volume.
For instance, an individual keeping track of the amount of mail they receive each day may notice that they receive an average number of 4 letters per day. If receiving any particular piece of mail does not affect the arrival times of future pieces of mail, i.e., if pieces of mail from a wide range of sources arrive independently of one another, then a reasonable assumption is that the number of pieces of mail received in a day obeys a Poisson distribution. Other examples that may follow a Poisson include the number of phone calls received by a call center per hour and the number of decay events per second from a radioactive source.
Power of twoIn mathematics, a power of two is a number of the form 2n where n is an integer, that is, the result of exponentiation with number two as the base and integer n as the exponent.
In a context where only integers are considered, n is restricted to nonnegative values, so we have 1, 2, and 2 multiplied by itself a certain number of times.Because two is the base of the binary numeral system, powers of two are common in computer science. Written in binary, a power of two always has the form 100…000 or 0.00…001, just like a power of ten in the decimal system.
Precision (computer science)In computer science, the precision of a numerical quantity is a measure of the detail in which the quantity is expressed. This is usually measured in bits, but sometimes in decimal digits. It is related to precision in mathematics, which describes the number of digits that are used to express a value.
Some of the standardized precision formats are
Halfprecision floatingpoint format
Singleprecision floatingpoint format
Doubleprecision floatingpoint format
Quadrupleprecision floatingpoint format
Octupleprecision floatingpoint formatOf these, octupleprecision format is rarely used. The single and doubleprecision formats are most widely used and supported on nearly all platforms. The use of halfprecision format has been increasing especially in the field of machine learning since many machine learning algorithms are inherently errortolerant.
Radeon HD 5000 SeriesThe Evergreen series is a family of GPUs developed by Advanced Micro Devices for its Radeon line under the ATI brand name. It was employed in Radeon HD 5000 graphics card series and competed directly with Nvidia's GeForce 400 Series.
SineIn mathematics, the sine is a trigonometric function of an angle. The sine of an acute angle is defined in the context of a right triangle: for the specified angle, it is the ratio of the length of the side that is opposite that angle to the length of the longest side of the triangle (the hypotenuse).
More generally, the definition of sine (and other trigonometric functions) can be extended to any real value in terms of the length of a certain line segment in a unit circle. More modern definitions express the sine as an infinite series or as the solution of certain differential equations, allowing their extension to arbitrary positive and negative values and even to complex numbers.
The sine function is commonly used to model periodic phenomena such as sound and light waves, the position and velocity of harmonic oscillators, sunlight intensity and day length, and average temperature variations throughout the year.
The function sine can be traced to the jyā and koṭijyā functions used in Gupta period Indian astronomy (Aryabhatiya, Surya Siddhanta), via translation from Sanskrit to Arabic and then from Arabic to Latin. The word "sine" (Latin "sinus") comes from a Latin mistranslation of the Arabic jiba, which is a transliteration of the Sanskrit word for half the chord, jyaardha.
Unit in the last placeIn computer science and numerical analysis, unit in the last place or unit of least precision (ULP) is the spacing between floatingpoint numbers, i.e., the value the least significant digit represents if it is 1. It is used as a measure of accuracy in numeric calculations.
Uninterpreted  

Numeric  
Pointer  
Text  
Composite  
Other  
Related topics  
See also platformdependent and independent units of information 
This page is based on a Wikipedia article written by authors
(here).
Text is available under the CC BYSA 3.0 license; additional terms may apply.
Images, videos and audio are available under their respective licenses.