Conversion of double precision

Quote:
>Subject: Conversion of double precision

>Date: Sat, Nov 15, 1997 19:32 EST

>I have a file in which numbers are stored in double precision format. I'm
>trying to convert these stored values to numbers. In the file the numbers
>are represented by 8 bytes (64 bits). I'm looking to convert the 64 bits
>into the corresponding number. I already know that bits 0 to 51 represent
>the mantissa, bits 52 to 62 the exponent, and bit 63 the sign.
>Can someone explain how to construct the original number? I'm
>programming in Clipper and Clipper does not have a function that does the
>conversion for me. If I can understand how the calculation with the 64
>bits works, I'll write a function to do the job.

>Ronny

All floating point numbers in computers are basically a binary version of the
familiar base-10 scientific notation.

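Let s be the sign bit, e the 11-bit exponent field read as an unsigned integer
(0 to 2047), m the 52-bit mantissa field, and n the value of the number.  Let
^ denote exponentiation.  In every case below, n is negated when s = 1.  Then
you have 4 cases: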

1.    if 0 < e < 2047, then n = 1.m * 2^(e-1023)   ["normal" numbers]

Note that m is really only the fractional part (the part to the right of the
binary point) of the true mantissa; there's an implied 1 to the left of the
binary point.  In other words, 1.m means 1 + m / 2^52 when m is read as a
52-bit integer.  For example, if the number to be represented is 1.0 * 2^0,
then e = 1023 and m = 0, *not 1*, because the 1 to the left of the binary
point is already implied.  Numbers of this form always have 53 bits of
accuracy (52 bits of m + the implied digit of 1).
Most double precision numbers fall into case 1.
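
As a rough illustration of case 1, here is a minimal sketch in Python (used only because it is compact; the same shifts and arithmetic carry over to a Clipper function).  It assumes the 8 bytes are stored least significant byte first, as on Intel PCs; if your file uses the opposite byte order, reverse the bytes first.  The names split_fields and decode_normal are made up for this example.

```python
def split_fields(raw):
    """Split 8 bytes (assumed little-endian) into the sign, exponent and mantissa fields."""
    bits = int.from_bytes(raw, "little")   # the 64 bits as one integer
    s = bits >> 63                         # bit 63: sign
    e = (bits >> 52) & 0x7FF               # bits 52..62: exponent field
    m = bits & ((1 << 52) - 1)             # bits 0..51: mantissa field
    return s, e, m

def decode_normal(s, e, m):
    """Case 1 only (0 < e < 2047): n = 1.m * 2^(e-1023)."""
    value = (1 + m / 2**52) * 2.0 ** (e - 1023)
    return -value if s else value

# The bytes 00 00 00 00 00 00 F0 3F hold s = 0, e = 1023, m = 0, i.e. 1.0.
print(decode_normal(*split_fields(bytes.fromhex("000000000000f03f"))))  # 1.0
```
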

2. if e = 0, then n = 0.m * 2^(-1022)               [denormals]

Note that, unlike the first case, the e=0 case has an implied 0 rather than a
1 to the left of the binary point, and like case 1, m is only the fractional
part of the true mantissa.  Also note that the exponent is fixed at -1022.
So for example, the number 0 is represented with e=0 and m=0.

Note that since it's a 0 rather than a 1 to the left of the binary point, the
leading zeros in m, along with the implied 0, only serve as placeholders
rather than significant bits.  Case 2 numbers are also smaller than case 1
numbers (the true mantissa is < 1 in case 2 but >= 1 in case 1 because of the
difference in the implied bit, and the smallest true exponent in case 1 is
1-1023 = -1022 = the exponent in case 2).  All of this means that the case 2
numbers are too small to be represented with the 53 bits of accuracy of
case 1.  These "underflowed" numbers are called denormals.
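
To make case 2 concrete, here is a small sketch in Python (again only an illustration; the function name decode_denormal is invented here).  It takes the sign bit s and the 52-bit mantissa field m as integers and evaluates 0.m * 2^(-1022).

```python
def decode_denormal(s, m):
    """Case 2 only (exponent field e == 0): n = 0.m * 2^(-1022)."""
    value = (m / 2**52) * 2.0 ** -1022     # implied 0, exponent fixed at -1022
    return -value if s else value

# m = 0 gives zero; m = 1 gives the smallest positive double, 2^(-1074).
assert decode_denormal(0, 0) == 0.0
assert decode_denormal(0, 1) == 2.0 ** -1074
```
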

3. if e = 2047 and m = 0, then n = infinity.

Believe it or not, you can represent infinity as a double precision value!
The sign bit selects +infinity or -infinity.  However, since we normally
don't do math with infinities, any case 3 number can be thrown out.

4. if e = 2047 and m > 0, then n = NaN (Not a Number).

This value represents results such as Sqrt(-1) and ln(-1) that are not real
numbers.  Since they aren't valid numbers, case 4 values must be thrown out.
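
Before throwing anything out, you have to recognize which case a value belongs to.  A minimal sketch of that test in Python, assuming e and m have already been extracted as integers (the function name classify and the returned labels are hypothetical, chosen for this example):

```python
def classify(e, m):
    """Return which of the four cases an (exponent, mantissa) field pair falls into."""
    if e == 2047:
        return "infinity" if m == 0 else "nan"   # cases 3 and 4
    if e == 0:
        return "denormal"                         # case 2 (includes zero when m == 0)
    return "normal"                               # case 1

assert classify(1023, 0) == "normal"
assert classify(2047, 0) == "infinity"
assert classify(2047, 1) == "nan"
```
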

In real life, nearly all nonzero numbers fall under the case 1 definition.
Zero itself (e = 0, m = 0) falls under case 2 and is common in data files, so
handle at least that; other case 2 numbers are rare.  Almost no numbers
should fall under the case 3 and 4 definitions.
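
Putting the four cases together, a complete decoder might look like the following Python sketch.  It is only an illustration under the same assumption as before (bytes stored least significant byte first), and decode_double is a name invented for this example.  The loop at the end cross-checks the hand-written arithmetic against Python's own struct module.

```python
import struct

def decode_double(raw):
    """Decode 8 bytes (assumed little-endian) using the four cases described above."""
    bits = int.from_bytes(raw, "little")
    s = bits >> 63                     # sign bit
    e = (bits >> 52) & 0x7FF           # exponent field
    m = bits & ((1 << 52) - 1)         # mantissa field
    if e == 2047:                      # cases 3 and 4
        value = float("inf") if m == 0 else float("nan")
    elif e == 0:                       # case 2: denormals and zero
        value = (m / 2**52) * 2.0 ** -1022
    else:                              # case 1: normal numbers
        value = (1 + m / 2**52) * 2.0 ** (e - 1023)
    return -value if s else value

# Cross-check against the built-in conversion for a few sample values.
for x in (1.0, -2.5, 0.1, 0.0, 5e-324, 1.7976931348623157e308):
    assert decode_double(struct.pack("<d", x)) == x
```

In a Clipper version you would probably return an error flag instead of infinity or NaN for the e = 2047 cases.
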

Good luck and hope it helps.

