IEEE-754 FLOATING POINT REPRESENTATION OF VARIABLES MANTISSA EXPONENT PUNTO FLOTANTE S.A.

Tutorial: the IEEE-754 standard for storing floating-point variables in memory, as used by the C18 compiler.

When using high-level languages such as Basic or C in microcontroller systems, the variables used in a program can be defined as positive integers, signed integers, or floating-point values.

The floating-point representation stores real numbers (i.e., values that can be positive or negative and can have a fractional part) over a very wide range of magnitudes, always in a fixed format of 24 or 32 bits of memory, depending on the language and compiler used.

The most widely used standard for microcontroller applications is the 32-bit IEEE-754 format. This format is used by the C18 compiler and is described below.

IEEE-754 standard for the representation of real numbers in floating point format:

When you define a variable of type "float", the value is stored in memory in 4 bytes (32 bits), distributed as follows: a sign bit, an 8-bit exponent, and a 23-bit mantissa.

BYTE 1	BYTE 2	BYTE 3	BYTE 4
S EEEEEEE	E MMMMMMM	MMMMMMMM	MMMMMMMM

S = sign bit, E = exponent bit, M = mantissa bit.
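As a rough illustration of this layout, the sketch below packs the three fields into one 32-bit word, with byte 1 as the most significant byte. The function name pack_fields and its parameters are hypothetical, and the code assumes a C99 compiler that provides <stdint.h>; it is not taken from the C18 libraries.

```c
#include <stdint.h>

/* Hypothetical helper (not part of C18): packs the sign bit, the stored
   8-bit exponent and the 23-bit mantissa into one 32-bit word.
   Bit 31 = sign, bits 30..23 = exponent, bits 22..0 = mantissa. */
uint32_t pack_fields(uint8_t sign, uint8_t stored_exp, uint32_t mantissa23)
{
    return ((uint32_t)(sign & 1u) << 31)
         | ((uint32_t)stored_exp << 23)
         | (mantissa23 & 0x7FFFFFu);
}
```

For example, pack_fields(0, 0x7F, 0) gives 3F800000H, which is the pattern for +1.0.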

Floating-point numbers are built from a mantissa and an exponent, and can represent very large or very small values. The exponent moves the binary point of the mantissa: a positive exponent shifts it to the right and a negative exponent shifts it to the left. For normalized numbers the real exponent ranges from -126 to +127, because the stored exponent values 00H and FFH are reserved for special cases (zero, denormalized numbers, infinity, and NaN).

To convert the floating-point format to a decimal value, an implicit "1" is placed in front of the 23-bit mantissa, forming a 24-bit mantissa.
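The conversion can be sketched in C as follows. This is only an illustration under stated assumptions: the name fields_to_value is hypothetical, the exponent is passed as stored in memory (with the 7FH offset used by this format), and only normalized numbers are handled.

```c
#include <stdint.h>

/* Hypothetical helper: rebuilds the numeric value from the three fields.
   Valid only for normalized numbers (stored exponent 01H..FEH); the
   special patterns for zero, infinity and NaN are not handled. */
double fields_to_value(uint8_t sign, uint8_t stored_exp, uint32_t mantissa23)
{
    /* Add the implicit 1 as bit 23, giving a 24-bit mantissa. */
    uint32_t mantissa24 = (mantissa23 & 0x7FFFFFu) | 0x800000u;

    /* Put the binary point right after the implicit 1: 1.xxxx */
    double value = (double)mantissa24 / 8388608.0;   /* divide by 2^23 */

    /* Remove the 7FH offset, then shift the binary point. */
    int real_exp = (int)stored_exp - 0x7F;
    while (real_exp > 0) { value *= 2.0; real_exp--; }   /* shift right */
    while (real_exp < 0) { value /= 2.0; real_exp++; }   /* shift left  */

    return sign ? -value : value;
}
```

The loops are used instead of a library call so that the shifting of the binary point stays visible, one position per iteration.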

The complete 32 bit representation of the floating point value is organized as follows. Please see picture.

Sign: bit 7 of byte 1. If it is 0, the number is positive; if it is 1, the number is negative.

Exponent: an 8-bit value stored with an offset of 7FH. To find the real value of the exponent, subtract 7FH from the value stored in memory. The 8 exponent bits are formed by the seven least significant bits of byte 1 followed by the most significant bit of byte 2. The real exponent gives the number of positions that the binary point in the mantissa must move: to the right when it is positive, to the left when it is negative.

Mantissa: consists of 23 bits, namely the 7 least significant bits of byte 2 and all 8 bits of bytes 3 and 4. When converting to the real value, you must add an implicit "1", as indicated below.

The implicit "1": when converting the binary value stored in memory to decimal, a "1" must be added to the left of the 23-bit mantissa to form the 24-bit representation.

The binary point position: the binary point, which separates the integer part from the fractional part, initially sits immediately to the right of the implicit "1" of the mantissa. From this initial position it moves to the right or left according to the real exponent value. Please see the examples below.
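The field descriptions above can be turned into a small decoding routine. The following is a minimal sketch, not C18 library code: the struct float_fields and the function decode_fields are hypothetical names, b[0] stands for byte 1 and b[3] for byte 4, and the implicit "1" is added unconditionally, which is only correct for normalized numbers.

```c
#include <stdint.h>

/* Hypothetical container for the decoded fields. */
typedef struct {
    uint8_t  sign;        /* 0 = positive, 1 = negative */
    uint8_t  stored_exp;  /* exponent as stored, including the 7FH offset */
    int      real_exp;    /* stored_exp minus 7FH */
    uint32_t mantissa24;  /* 23 stored bits with the implicit 1 as bit 23 */
} float_fields;

/* Hypothetical helper: b[0] is byte 1 (sign byte), b[3] is byte 4. */
float_fields decode_fields(const uint8_t b[4])
{
    float_fields f;

    /* Sign: bit 7 of byte 1. */
    f.sign = (uint8_t)(b[0] >> 7);

    /* Exponent: low 7 bits of byte 1, then the top bit of byte 2. */
    f.stored_exp = (uint8_t)(((b[0] & 0x7F) << 1) | (b[1] >> 7));
    f.real_exp = (int)f.stored_exp - 0x7F;

    /* Mantissa: low 7 bits of byte 2, bytes 3 and 4, plus implicit 1. */
    f.mantissa24 = (uint32_t)0x800000
                 | ((uint32_t)(b[1] & 0x7F) << 16)
                 | ((uint32_t)b[2] << 8)
                 | (uint32_t)b[3];
    return f;
}
```

Decoding the bytes C3 78 C0 00 this way gives sign 1, stored exponent 86H, real exponent +7 and a 24-bit mantissa of F8C000H, that is, binary 1.111100011 shifted seven places to the right: 11111000.11, or -248.75.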

Examples:

Decimal number	Floating point format (hex)	Sign bit	Real exponent (hex)	Mantissa including implicit "1", binary point shifted
+1.0	3F 80 00 00	0	7F - 7F = 0	1.0 ...
+2.0	40 00 00 00	0	80 - 7F = +1	10.0 ...
+3.0	40 40 00 00	0	80 - 7F = +1	11.0 ...
-3.0	C0 40 00 00	1	80 - 7F = +1	11.0 ...
+0.5	3F 00 00 00	0	7E - 7F = -1	0.10 ...
+10.0	41 20 00 00	0	82 - 7F = +3	1010.0 ...
-100.0	C2 C8 00 00	1	85 - 7F = +6	1100100.0 ...
+3.1416	40 49 0F F9	0	80 - 7F = +1	11.00100100001 ...
-1.25	BF A0 00 00	1	7F - 7F = 0	1.010 ...
-248.75	C3 78 C0 00	1	86 - 7F = +7	11111000.110 ...
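The rows of the table can be checked on a PC with a few lines of C. The sketch below is an assumption-laden illustration: bits_to_float is a hypothetical name, and the code relies on the host compiler using 32-bit IEEE-754 for "float", which the assert verifies only in terms of size.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: reinterprets a 32-bit pattern (byte 1 in the
   most significant position) as a float.
   Assumes the host "float" is a 32-bit IEEE-754 single. */
float bits_to_float(uint32_t bits)
{
    float value;
    assert(sizeof value == sizeof bits);
    memcpy(&value, &bits, sizeof value);   /* copy the raw bit pattern */
    return value;
}
```

Under that assumption, bits_to_float(0xC2C80000) should compare equal to -100.0f and bits_to_float(0x40490FF9) to 3.1416f, as in the table.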

DEFINITION OF A FLOATING POINT VARIABLE IN C:

[Image: PI.jpg]
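A minimal sketch of such a definition is shown here. It is not a reproduction of the original figure: the variable name pi and the helper float_bits are assumptions made for illustration, and the code targets a C99 compiler with <stdint.h> and 32-bit IEEE-754 floats rather than C18 specifically.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* A floating-point variable is declared with the type "float". */
float pi = 3.1416f;

/* Hypothetical helper: returns the 32-bit pattern stored for a float,
   with the sign bit in bit 31. */
uint32_t float_bits(float value)
{
    uint32_t bits;
    assert(sizeof bits == sizeof value);
    memcpy(&bits, &value, sizeof bits);   /* copy the raw bit pattern */
    return bits;
}
```

With IEEE-754 floats, float_bits(pi) should return 40490FF9H, the same pattern listed for +3.1416 in the examples.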