Densely packed decimal

Densely packed decimal (DPD) is an efficient method for binary encoding decimal digits.

The traditional system of binary encoding for decimal digits, known as binary-coded decimal (BCD), uses four bits to encode each digit, resulting in significant wastage of binary data bandwidth (since four bits can store 16 states and are being used to store only 10). Densely packed decimal is a more efficient code that packs three digits into ten bits using a scheme that allows compression from, or expansion to, BCD with only two or three gate delays in hardware.[1]

The densely packed decimal encoding is a refinement of Chen–Ho encoding; it gives the same compression and speed advantages, but the particular arrangement of bits used confers additional advantages:

  • Compression of one or two digits (into the optimal four or seven bits respectively) is achieved as a subset of the three-digit encoding. This means that arbitrary numbers of decimal digits (not just multiples of three digits) can be encoded efficiently. For example, 38 = 12×3 + 2 decimal digits can be encoded in 12×10 + 7 = 127 bits – that is, 12 sets of three decimal digits can be encoded using 12 sets of ten binary bits and the remaining two decimal digits can be encoded using a further seven binary bits (see the sketch after this list).
  • The subset encoding mentioned above is simply the rightmost bits of the standard three-digit encoding; the encoded value can be widened simply by adding leading 0 bits.
  • All seven-bit BCD numbers (0 through 79) are encoded identically by DPD. This makes conversions of common small numbers trivial. (This must break down at 80, because that requires eight bits for BCD, but the above property requires that the DPD encoding must fit into seven bits.)
  • The low-order bit of each digit is copied unmodified. Thus, the non-trivial portion of the encoding can be considered a conversion from three base-5 digits to seven binary bits. Further, digit-wise logical values (in which each digit is either 0 or 1) can be manipulated directly without any encoding or decoding being necessary.
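
The bit counts in the first item above can be written as a small helper; this is an illustrative sketch, not a function from any standard or library:

    def dpd_bits(num_digits):
        """Bits needed to hold num_digits decimal digits in DPD:
        ten bits per full group of three digits, plus an optimally packed
        remainder of one digit (4 bits) or two digits (7 bits)."""
        groups, leftover = divmod(num_digits, 3)
        return groups * 10 + {0: 0, 1: 4, 2: 7}[leftover]

    assert dpd_bits(38) == 12 * 10 + 7 == 127   # the example from the list above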

History

In 1971, Tien Chi Chen[2] and Irving Tze Ho[3] devised a lossless prefix code (known as Chen–Ho encoding since 2000[4]) which packed three decimal digits into ten binary bits using a scheme which allowed compression from or expansion to BCD with only two or three gate delays in hardware. Densely packed decimal is a refinement of this, devised by Mike F. Cowlishaw in 2002,[1] which was incorporated into the IEEE 754-2008[5] and ISO/IEC/IEEE 60559:2011[6] standards for decimal floating-point.

Encoding

Like Chen–Ho encoding, DPD encoding classifies each decimal digit into one of two ranges, depending on the most significant bit of the binary form: "small" digits have values 0 through 7 (binary 0000–0111), and "large" digits, 8 through 9 (binary 1000–1001). Once it is known or has been indicated that a digit is small, three more bits are still required to specify the value. If a large value has been indicated, only one bit is required to distinguish between the values 8 and 9.

When encoding, the most significant bits of the three digits to be encoded select one of eight coding patterns for the remaining bits, according to the following table. The table shows how, on decoding, the ten bits of the coded form in columns b9 through b0 are copied into the three digits d2 through d0, and how the remaining bits are filled in with constant zeros or ones.

Densely packed decimal encoding rules[7]

DPD encoded value                 Decimal digits
b9 b8 b7 b6 b5 b4 b3 b2 b1 b0     d2    d1    d0     Values encoded       Description
a  b  c  d  e  f  0  g  h  i      0abc  0def  0ghi   (0–7) (0–7) (0–7)    Three small digits
a  b  c  d  e  f  1  0  0  i      0abc  0def  100i   (0–7) (0–7) (8–9)    Two small digits, one large
a  b  c  g  h  f  1  0  1  i      0abc  100f  0ghi   (0–7) (8–9) (0–7)    Two small digits, one large
g  h  c  d  e  f  1  1  0  i      100c  0def  0ghi   (8–9) (0–7) (0–7)    Two small digits, one large
g  h  c  0  0  f  1  1  1  i      100c  100f  0ghi   (8–9) (8–9) (0–7)    One small digit, two large
d  e  c  0  1  f  1  1  1  i      100c  0def  100i   (8–9) (0–7) (8–9)    One small digit, two large
a  b  c  1  0  f  1  1  1  i      0abc  100f  100i   (0–7) (8–9) (8–9)    One small digit, two large
x  x  c  1  1  f  1  1  1  i      100c  100f  100i   (8–9) (8–9) (8–9)    Three large digits

Bits b7, b4 and b0 (c, f and i) are passed through the encoding unchanged, and do not affect the meaning of the other bits. The remaining seven bits can be considered a seven-bit encoding for three base-5 digits.

Bits b8 and b9 are not needed, and are ignored, when decoding DPD groups with three large digits (marked as "x" in the last row of the table above), but they are filled with zeros when encoding.

The eight decimal values whose digits are all 8s or 9s have four codings each. The bits marked x in the table above are ignored on input, but will always be 0 in computed results. (The 8×3 = 24 non-standard encodings fill in the gap between 10^3 = 1000 and 2^10 = 1024.)
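
As a concrete sketch of the table, the following Python functions pack and unpack one three-digit group; they are a direct transcription of the rows above, and the function names are illustrative rather than taken from any library:

    def dpd_encode(d2, d1, d0):
        """Pack three decimal digits (0-9 each) into one 10-bit DPD group,
        following the table above.  The bit names a..i match the table."""
        l2, l1, l0 = d2 >= 8, d1 >= 8, d0 >= 8          # "large" digit flags
        a, b, c = (d2 >> 2) & 1, (d2 >> 1) & 1, d2 & 1
        d, e, f = (d1 >> 2) & 1, (d1 >> 1) & 1, d1 & 1
        g, h, i = (d0 >> 2) & 1, (d0 >> 1) & 1, d0 & 1
        if not l2 and not l1 and not l0:
            bits = (a, b, c, d, e, f, 0, g, h, i)        # three small digits
        elif not l2 and not l1 and l0:
            bits = (a, b, c, d, e, f, 1, 0, 0, i)
        elif not l2 and l1 and not l0:
            bits = (a, b, c, g, h, f, 1, 0, 1, i)
        elif l2 and not l1 and not l0:
            bits = (g, h, c, d, e, f, 1, 1, 0, i)
        elif l2 and l1 and not l0:
            bits = (g, h, c, 0, 0, f, 1, 1, 1, i)
        elif l2 and not l1 and l0:
            bits = (d, e, c, 0, 1, f, 1, 1, 1, i)
        elif not l2 and l1 and l0:
            bits = (a, b, c, 1, 0, f, 1, 1, 1, i)
        else:
            bits = (0, 0, c, 1, 1, f, 1, 1, 1, i)        # three large digits, x bits = 0
        code = 0
        for bit in bits:                                 # bits are listed b9 .. b0
            code = (code << 1) | bit
        return code

    def dpd_decode(code):
        """Expand one 10-bit DPD group back into three decimal digits."""
        b = [(code >> n) & 1 for n in range(10)]         # b[n] is bit bn
        if b[3] == 0:                                    # three small digits
            return (4*b[9] + 2*b[8] + b[7], 4*b[6] + 2*b[5] + b[4],
                    4*b[2] + 2*b[1] + b[0])
        if (b[2], b[1]) == (0, 0):
            return (4*b[9] + 2*b[8] + b[7], 4*b[6] + 2*b[5] + b[4], 8 + b[0])
        if (b[2], b[1]) == (0, 1):
            return (4*b[9] + 2*b[8] + b[7], 8 + b[4], 4*b[6] + 2*b[5] + b[0])
        if (b[2], b[1]) == (1, 0):
            return (8 + b[7], 4*b[6] + 2*b[5] + b[4], 4*b[9] + 2*b[8] + b[0])
        if (b[6], b[5]) == (0, 0):                       # b2 b1 == 1 1 from here on
            return (8 + b[7], 8 + b[4], 4*b[9] + 2*b[8] + b[0])
        if (b[6], b[5]) == (0, 1):
            return (8 + b[7], 4*b[9] + 2*b[8] + b[4], 8 + b[0])
        if (b[6], b[5]) == (1, 0):
            return (4*b[9] + 2*b[8] + b[7], 8 + b[4], 8 + b[0])
        return (8 + b[7], 8 + b[4], 8 + b[0])            # three large digits

    # These match the worked examples in the next section.
    assert dpd_encode(0, 8, 0) == 0b0000001010
    assert dpd_encode(9, 9, 9) == 0b0011111111
    assert dpd_decode(0b0011111111) == (9, 9, 9)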

Examples

This table shows some representative decimal numbers and their encodings in BCD, Chen–Ho, and densely packed decimal (DPD):

Decimal   BCD              Chen–Ho        DPD
005       0000 0000 0101   000 000 0101   000 000 0101
009       0000 0000 1001   110 000 0001   000 000 1001
055       0000 0101 0101   000 010 1101   000 101 0101
079       0000 0111 1001   110 011 1001   000 111 1001
080       0000 1000 0000   101 000 0000   000 000 1010
099       0000 1001 1001   111 000 1001   000 101 1111
555       0101 0101 0101   010 110 1101   101 101 0101
999       1001 1001 1001   111 111 1001   001 111 1111

References

  1. ^ a b Cowlishaw, Mike F. (May 2002). "Densely Packed Decimal Encoding". IEE Proceedings – Computers and Digital Techniques. London: Institution of Electrical Engineers (IEE). 149 (3): 102–104. doi:10.1049/ip-cdt:20020407. ISSN 1350-2387. Retrieved 2016-02-07.
  2. ^ "CHEN Tien Chi". Archived from the original on 2015-10-23. Retrieved 2016-02-07.
  3. ^ Tseng, Li-Ling (1988-04-01). "High-Tech Leadership: Irving T. Ho". Taiwan Info. Archived from the original on 2016-01-01. Retrieved 2016-02-08.
  4. ^ Cowlishaw, Mike F. (2014) [2000]. "A Summary of Chen-Ho Decimal Data encoding". IBM. Archived from the original on 2015-09-24. Retrieved 2016-02-07.
  5. ^ IEEE Computer Society (2008-08-29). "IEEE Standard for Floating-Point Arithmetic". IEEE. doi:10.1109/IEEESTD.2008.4610935. ISBN 978-0-7381-5753-5. IEEE Std 754-2008. Retrieved 2016-02-08.
  6. ^ "ISO/IEC/IEEE 60559:2011". 2011. Retrieved 2016-02-08.
  7. ^ Cowlishaw, Michael Frederic (2007-02-13) [2000-10-03]. "A Summary of Densely Packed Decimal encoding". IBM. Archived from the original on 2015-09-24. Retrieved 2016-02-07.

Binary-coded decimal

In computing and electronic systems, binary-coded decimal (BCD) is a class of binary encodings of decimal numbers where each decimal digit is represented by a fixed number of bits, usually four or eight. Special bit patterns are sometimes used for a sign or for other indications (e.g., error or overflow).

In byte-oriented systems (i.e. most modern computers), the term unpacked BCD usually implies a full byte for each digit (often including a sign), whereas packed BCD typically encodes two decimal digits within a single byte by taking advantage of the fact that four bits are enough to represent the range 0 to 9. The precise four-bit encoding may, however, vary for technical reasons; see Excess-3, for instance. The ten states representing a BCD digit are sometimes called tetrades (the nibble typically needed to hold them is also known as a tetrade), while the unused don't-care states are named pseudo-tetrad(e)s or pseudo-decimal digits.

BCD's main virtue, in comparison to binary positional systems, is its more accurate representation and rounding of decimal quantities, as well as its ease of conversion into human-readable representations. Its principal drawbacks are a small increase in the complexity of the circuits needed to implement basic arithmetic, and slightly less dense storage.
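
A minimal sketch of the packed form described above (unsigned values only, no sign nibble; the function names are illustrative):

    def to_packed_bcd(number):
        """Pack a non-negative integer's decimal digits two to a byte."""
        digits = [int(ch) for ch in str(number)]
        if len(digits) % 2:                 # pad to an even number of digits
            digits.insert(0, 0)
        return bytes((hi << 4) | lo for hi, lo in zip(digits[::2], digits[1::2]))

    def from_packed_bcd(data):
        """Expand packed BCD bytes back into an integer."""
        value = 0
        for byte in data:
            value = value * 100 + (byte >> 4) * 10 + (byte & 0x0F)
        return value

    # 1234 packs into two bytes, 0x12 and 0x34; each nibble holds one digit.
    assert to_packed_bcd(1234) == bytes([0x12, 0x34])
    assert from_packed_bcd(bytes([0x12, 0x34])) == 1234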

BCD was used in many early decimal computers, and is implemented in the instruction set of machines such as the IBM System/360 series and its descendants, Digital Equipment Corporation's VAX, the Burroughs B1700, and the Motorola 68000-series processors. Although BCD per se is not as widely used as in the past and is no longer implemented in newer computers' instruction sets (such as ARM; x86 does not support its BCD instructions in long mode any more), decimal fixed-point and floating-point formats are still important and continue to be used in financial, commercial, and industrial computing, where subtle conversion and fractional rounding errors that are inherent in floating point binary representations cannot be tolerated.

Binary integer decimal

The IEEE 754-2008 standard includes an encoding format for decimal floating point numbers in which the significand and the exponent (and the payloads of NaNs) can be encoded in two ways, referred to in the draft as binary encoding and decimal encoding.

Both formats break a number down into a sign bit s, an exponent q (between qmin and qmax), and a p-digit significand c (between 0 and 10^p − 1). The value encoded is (−1)^s × 10^q × c. In both formats the range of possible values is identical, but they differ in how the significand c is represented. In the decimal encoding, it is encoded as a series of p decimal digits (using the densely packed decimal (DPD) encoding). This makes conversion to decimal form efficient, but requires a specialized decimal ALU to process. In the binary integer decimal (BID) encoding, it is encoded as a binary number.
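
The shared value semantics can be sketched as follows; this models only the formula above, not the BID or DPD bit layouts, and the function name is illustrative:

    from fractions import Fraction

    def decimal_fp_value(s, q, c):
        """Value denoted by a decimal floating-point triple: (-1)^s * 10^q * c."""
        return (-1) ** s * Fraction(10) ** q * c

    # 1.23 can be carried as c=123, q=-2, or as c=1230, q=-3 (the same value,
    # different members of its cohort); BID and DPD differ only in how c is stored.
    assert decimal_fp_value(0, -2, 123) == Fraction(123, 100)
    assert decimal_fp_value(0, -3, 1230) == Fraction(123, 100)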

Chen–Ho encoding

Chen–Ho encoding is a memory-efficient alternate system of binary encoding for decimal digits.

The traditional system of binary encoding for decimal digits, known as binary-coded decimal (BCD), uses four bits to encode each digit, resulting in significant wastage of binary data bandwidth (since four bits can store 16 states and are being used to store only 10).

The encoding reduces the storage requirements of two decimal digits (100 states) from 8 to 7 bits, and those of three decimal digits (1000 states) from 12 to 10 bits, using only simple Boolean transformations and avoiding any complex arithmetic operations such as a base conversion.

Decimal

The decimal numeral system (also called the base-ten positional numeral system, and occasionally called denary) is the standard system for denoting integer and non-integer numbers. It is the extension to non-integer numbers of the Hindu–Arabic numeral system. The way of denoting numbers in the decimal system is often referred to as decimal notation.

A decimal numeral, or just decimal, or casually decimal number, refers generally to the notation of a number in the decimal numeral system. Decimals may sometimes be identified by a decimal separator (for example the "." in 10.00 or 3.14159). "Decimal" may also refer specifically to the digits after the decimal separator, such as in "3.14 is the approximation of π to two decimals".

The numbers that may be represented in the decimal system are the decimal fractions, that is the fractions of the form a/10^n, where a is an integer, and n is a non-negative integer.

The decimal system has been extended to infinite decimals, for representing any real number, by using an infinite sequence of digits after the decimal separator (see Decimal representation). In this context, the decimal numerals with a finite number of non-zero places after the decimal separator are sometimes called terminating decimals. A repeating decimal is an infinite decimal that after some place repeats indefinitely the same sequence of digits (for example 5.123144144144144..., in which the digit group 144 repeats forever). An infinite decimal represents a rational number if and only if it is a repeating decimal or has a finite number of nonzero digits.
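
The correspondence between fractions and repeating decimals can be illustrated by long division with remainder tracking; the following sketch (with illustrative names) returns the integer part, the non-repeating digits and the repetend of a positive fraction:

    def decimal_expansion(numerator, denominator):
        """Expand numerator/denominator (positive integers) into integer part,
        non-repeating fractional digits, and repeating block (possibly empty)."""
        integer_part, remainder = divmod(numerator, denominator)
        digits, seen = [], {}
        while remainder and remainder not in seen:
            seen[remainder] = len(digits)       # remember where this remainder occurred
            remainder *= 10
            digit, remainder = divmod(remainder, denominator)
            digits.append(str(digit))
        if remainder == 0:                      # terminating decimal
            return integer_part, "".join(digits), ""
        start = seen[remainder]                 # the repetend begins here
        return integer_part, "".join(digits[:start]), "".join(digits[start:])

    # 1/8 terminates; 1/3 repeats "3"; 3227/555 = 5.8144144... repeats "144".
    assert decimal_expansion(1, 8) == (0, "125", "")
    assert decimal_expansion(1, 3) == (0, "", "3")
    assert decimal_expansion(3227, 555) == (5, "8", "144")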

Decimal128 floating-point format

In computing, decimal128 is a decimal floating-point computer numbering format that occupies 16 bytes (128 bits) in computer memory.

It is intended for applications where it is necessary to emulate decimal rounding exactly, such as financial and tax computations.

Decimal128 supports 34 decimal digits of significand and an exponent range of −6143 to +6144, i.e. ±0.000000000000000000000000000000000×10^−6143 to ±9.999999999999999999999999999999999×10^6144. (Equivalently, ±0000000000000000000000000000000000×10^−6176 to ±9999999999999999999999999999999999×10^6111.) Therefore, decimal128 has the greatest range of values compared with other IEEE basic floating point formats. Because the significand is not normalized, most values with less than 34 significant digits have multiple possible representations; 1×10^2 = 0.1×10^3 = 0.01×10^4, etc. Zero has 12288 possible representations (24576 if both signed zeros are included).
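
The effect of an unnormalized significand can be illustrated with Python's decimal module, which follows the same arithmetic model, although it is not restricted to the 34-digit decimal128 format:

    from decimal import Decimal

    # The same value can be carried by different (coefficient, exponent) pairs,
    # i.e. it has several members in its "cohort".
    a = Decimal("100")     # coefficient 100, exponent 0
    b = Decimal("1E+2")    # coefficient 1,   exponent 2

    assert a == b                          # numerically equal ...
    assert a.as_tuple() != b.as_tuple()    # ... but stored differently
    print(a.as_tuple())   # DecimalTuple(sign=0, digits=(1, 0, 0), exponent=0)
    print(b.as_tuple())   # DecimalTuple(sign=0, digits=(1,), exponent=2)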

Decimal128 floating point is a relatively new decimal floating-point format, formally introduced in the 2008 version of IEEE 754 as well as with ISO/IEC/IEEE 60559:2011.

Decimal32 floating-point format

In computing, decimal32 is a decimal floating-point computer numbering format that occupies 4 bytes (32 bits) in computer memory.

It is intended for applications where it is necessary to emulate decimal rounding exactly, such as financial and tax computations. Like the binary16 format, it is intended as a memory-saving storage format.

Decimal32 supports 7 decimal digits of significand and an exponent range of −95 to +96, i.e. ±0.000000×10^−95 to ±9.999999×10^96. (Equivalently, ±0000000×10^−101 to ±9999999×10^90.) Because the significand is not normalized (there is no implicit leading "1"), most values with less than 7 significant digits have multiple possible representations; 1×10^2 = 0.1×10^3 = 0.01×10^4, etc. Zero has 192 possible representations (384 when both signed zeros are included).

Decimal32 floating point is a relatively new decimal floating-point format, formally introduced in the 2008 version of IEEE 754 as well as with ISO/IEC/IEEE 60559:2011.

Decimal64 floating-point format

In computing, decimal64 is a decimal floating-point computer numbering format that occupies 8 bytes (64 bits) in computer memory.

It is intended for applications where it is necessary to emulate decimal rounding exactly, such as financial and tax computations.

Decimal64 supports 16 decimal digits of significand and an exponent range of −383 to +384, i.e. ±0.000000000000000×10^−383 to ±9.999999999999999×10^384. (Equivalently, ±0000000000000000×10^−398 to ±9999999999999999×10^369.) In contrast, the corresponding binary format, which is the most commonly used type, has an approximate range of ±0.000000000000001×10^−308 to ±1.797693134862315×10^308. Because the significand is not normalized, most values with less than 16 significant digits have multiple possible representations; 1×10^2 = 0.1×10^3 = 0.01×10^4, etc. Zero has 768 possible representations (1536 if both signed zeros are included).

Decimal64 floating point is a relatively new decimal floating-point format, formally introduced in the 2008 version of IEEE 754 as well as with ISO/IEC/IEEE 60559:2011.

Decimal computer

Decimal computers are computers which can represent numbers and addresses in decimal as well as providing instructions to operate on those numbers and addresses directly in decimal, without conversion to a pure binary representation. Some also had a variable wordlength, which enabled operations on numbers with a large number of digits.

Decimal floating point

Decimal floating-point (DFP) arithmetic refers to both a representation and operations on decimal floating-point numbers. Working directly with decimal (base-10) fractions can avoid the rounding errors that otherwise typically occur when converting between decimal fractions (common in human-entered data, such as measurements or financial information) and binary (base-2) fractions.
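
A small illustration of the difference, contrasting Python's binary floats with its decimal module:

    from decimal import Decimal

    # In binary floating point, 0.1 and 0.2 are only approximated, so their sum
    # is not exactly 0.3; decimal arithmetic represents these fractions exactly.
    print(0.1 + 0.2 == 0.3)                                   # False
    print(Decimal("0.1") + Decimal("0.2") == Decimal("0.3"))  # True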

The advantage of decimal floating-point representation over decimal fixed-point and integer representation is that it supports a much wider range of values. For example, while a fixed-point representation that allocates 8 decimal digits and 2 decimal places can represent the numbers 123456.78, 8765.43, 123.00, and so on, a floating-point representation with 8 decimal digits could also represent 1.2345678, 1234567.8, 0.000012345678, 12345678000000000, and so on. This wider range can dramatically slow the accumulation of rounding errors during successive calculations; for example, the Kahan summation algorithm can be used in floating point to add many numbers with no asymptotic accumulation of rounding error.
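
For reference, a sketch of the Kahan (compensated) summation idea mentioned above, written here for ordinary binary floats:

    def kahan_sum(values):
        """Compensated (Kahan) summation of a sequence of floats."""
        total = 0.0
        compensation = 0.0                  # running estimate of lost low-order bits
        for x in values:
            y = x - compensation
            t = total + y                   # low-order digits of y may be lost here
            compensation = (t - total) - y  # recover what was lost
            total = t
        return total

    # Adding a million tiny values to 1.0: the plain sum loses them entirely,
    # while the compensated sum retains their contribution (about 1e-10).
    values = [1.0] + [1e-16] * 1_000_000
    print(sum(values), kahan_sum(values))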

IEEE 754

The IEEE Standard for Floating-Point Arithmetic (IEEE 754) is a technical standard for floating-point arithmetic established in 1985 by the Institute of Electrical and Electronics Engineers (IEEE). The standard addressed many problems found in the diverse floating-point implementations that made them difficult to use reliably and portably. Many hardware floating-point units use the IEEE 754 standard.

The standard defines:

  • arithmetic formats: sets of binary and decimal floating-point data, which consist of finite numbers (including signed zeros and subnormal numbers), infinities, and special "not a number" values (NaNs)
  • interchange formats: encodings (bit strings) that may be used to exchange floating-point data in an efficient and compact form
  • rounding rules: properties to be satisfied when rounding numbers during arithmetic and conversions
  • operations: arithmetic and other operations (such as trigonometric functions) on arithmetic formats
  • exception handling: indications of exceptional conditions (such as division by zero, overflow, etc.)

The current version, IEEE 754-2008, published in August 2008, includes nearly all of the original IEEE 754-1985 standard plus the IEEE 854-1987 Standard for Radix-Independent Floating-Point Arithmetic.

IEEE 754-2008 revision

IEEE 754-2008 (previously known as IEEE 754r) was published in August 2008 and is a significant revision to, and replaces, the IEEE 754-1985 floating point standard. The revision extended the previous standard where it was necessary, added decimal arithmetic and formats, tightened up certain areas of the original standard which were left undefined, and merged in IEEE 854 (the radix-independent floating-point standard).

In a few cases, where stricter definitions of binary floating-point arithmetic might be performance-incompatible with some existing implementation, they were made optional.

Mike Cowlishaw

Mike Frederic Cowlishaw is a Visiting Professor at the Department of Computer Science at the University of Warwick, and a Fellow of the Royal Academy of Engineering. He is a retired IBM Fellow, and was a Fellow of the Institution of Engineering and Technology and the British Computer Society. He was educated at Monkton Combe School and the University of Birmingham.

Offset binary

Offset binary, also referred to as excess-K, excess-N, excess code or biased representation, is a digital coding scheme where all-zero corresponds to the minimal negative value and all-one to the maximal positive value. There is no standard for offset binary, but most often the offset K for an n-bit binary word is K = 2^(n−1). This has the consequence that the "zero" value is represented by a 1 in the most significant bit and zero in all other bits, and in general the effect is conveniently the same as using two's complement except that the most significant bit is inverted. It also has the consequence that in a logical comparison operation, one gets the same result as with a true form numerical comparison operation, whereas, in two's complement notation a logical comparison will agree with true form numerical comparison operation if and only if the numbers being compared have the same sign. Otherwise the sense of the comparison will be inverted, with all negative values being taken as being larger than all positive values.

One historically prominent example of offset-64 (excess-64) notation was the floating-point (exponential) notation in the IBM System/360 and System/370 generations of computers. The "characteristic" (exponent) took the form of a seven-bit excess-64 number (the high-order bit of the same byte contained the sign of the significand).

The 8-bit exponent in Microsoft Binary Format, a floating-point format used in various programming languages (in particular BASIC) in the 1970s and 1980s, was encoded using an offset-129 notation (excess-129).

The IEEE Standard for Floating-Point Arithmetic (IEEE 754) uses various sizes of exponent, but also uses offset notation for the format of each precision. Unusually, however, instead of using "excess 2^(n−1)" it uses "excess 2^(n−1) − 1" (i.e. excess-15, excess-127, excess-1023, excess-16383), which means that inverting the leading (high-order) bit of the exponent will not convert the exponent to correct two's complement notation.

Offset binary is often used in digital signal processing (DSP). Most analog-to-digital (A/D) and digital-to-analog (D/A) chips are unipolar, which means that they cannot handle bipolar signals (signals with both positive and negative values). A simple solution to this is to bias the analog signals with a DC offset equal to half of the A/D and D/A converter's range. The resulting digital data then ends up being in offset binary format.

Most standard computer CPU chips cannot handle the offset binary format directly. CPU chips typically can only handle signed and unsigned integers, and floating-point value formats. Offset binary values can be handled in several ways by these CPU chips. The data may just be treated as unsigned integers, requiring the programmer to deal with the zero offset in software. The data may also be converted to signed integer format (which the CPU can handle natively) by simply subtracting the zero offset. As a consequence of the most common offset for an n-bit word being 2^(n−1), which implies that the first bit is inverted relative to two's complement, there is no need for a separate subtraction step: one can simply invert the first bit. This sometimes is a useful simplification in hardware, and can be convenient in software as well.

Table of offset binary for four bits, with two's complement for comparison

Offset binary may be converted into two's complement by inverting the most significant bit. For example, with 8-bit values, the offset binary value may be XORed with 0x80 in order to convert to two's complement. In specialised hardware it may be simpler to accept the bit as it stands, but to apply its value in inverted significance.
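
A small sketch of this conversion, which also prints the four-bit comparison referred to by the heading above (the function name is illustrative):

    def offset_to_twos_complement(value, bits=8):
        """Convert an offset-binary (excess-2^(bits-1)) code to a signed integer
        by inverting the most significant bit, as described above."""
        flipped = value ^ (1 << (bits - 1))          # invert the MSB
        if flipped >= 1 << (bits - 1):               # reinterpret as two's complement
            flipped -= 1 << bits
        return flipped

    # With 8-bit values, XOR with 0x80 maps offset binary to two's complement.
    assert offset_to_twos_complement(0x80, 8) == 0       # "zero" code 1000 0000
    assert offset_to_twos_complement(0xFF, 8) == 127     # maximal positive
    assert offset_to_twos_complement(0x00, 8) == -128    # minimal negative

    # Four-bit table: offset binary (K = 8) next to the two's-complement pattern.
    for code in range(16):
        signed = offset_to_twos_complement(code, bits=4)
        print(f"{signed:+3d}  offset binary {code:04b}  two's complement {signed & 0xF:04b}")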
