Decimal can be Binary Too (was decimal or rational)

(If this is a dup post, pls excuse, orig got lost -D)

Quote:

>    ...
> > <And thanks to more conversation with Remco Gerlich I think that it
> > is decimal, not rational that is the way to go, somebody try to talk us
> > out of it>  is exactly what Burke was talking about.

> I'm not a candidate for "talking (anybody) out of it", where
> "it" is having (e.g.) 7.35 mean a floating-point-decimal 7.35
> rather than a rational 7.35, because I'm _deeply_ undecided
> myself.  My criterion: what *WOULD* be harder to explain to (and
> have accepted by) an intelligent beginner?  Potentially *VERY
> SLOW* (and memory-consuming) behavior for computation, or the
> risk of unexpected (to said beginner) 'strange' results?  And
> I'm not talking about a vacuum -- I'm talking about a real-world
> where people's expectations ARE very much shaped by the
> calculators they've dealt with for years, and those ARE FP
> decimal (at least typically).  A computation doesn't slow
> down as it proceeds, but it does often lose precision...

If by "floating-point-decimal" you mean internal representation of
both the mantissa and the exponent by some sort of a binary-coded
decimal, let me point out that it is not necessary to go to full
decimal in order to achieve the `7.35` == "7.35" goal.

By letting the exponent represent powers of 10 rather than 2, the
base (or mantissa) and exponent can both be represented in binary
as an int or long.  Thus, the internal calculations can use the
fast integer hardware instructions, rather than decimal arithmetic
which would have to be done by software.

Scaling would require a multiply or divide by 10 every time the
exponent is incremented/decremented by one, not as fast as a
left or right bit shift; but it can be done pretty fast considering
that a multiply by 10 is equivalent to a left shift by 3 (*8)
followed by two adds (10*x == (8+2)*x == 8*x + 2*x == 8*x + x + x):

Quote:
>>> x = 42
>>> (x<<3) + x + x    # multiply by 10 without using * op
420

I'm not sure that this is any faster than x*10; in Python, probably
not, but in C it may be.  (I haven't looked at any CPU timing charts
lately, so I may be all wet here; it used to be that mult & div were
very expensive.)  A quick timing check shows that the above
shift-and-add calculation takes about 3 times longer in Python than
a straight *10, as expected.
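
For the curious, that comparison can be reproduced with the standard timeit
module.  The absolute numbers are machine-dependent, so treat this as a
sketch rather than a benchmark:

```python
import timeit

x = 42
assert (x << 3) + x + x == x * 10   # the two forms agree

# In Python the shift-and-add form pays for extra bytecodes and
# temporary objects, so expect it to lose to the single multiply,
# by a ratio that varies with machine and interpreter.
t_mul = timeit.timeit("x * 10", globals={"x": 42}, number=1_000_000)
t_shift = timeit.timeit("(x << 3) + x + x", globals={"x": 42}, number=1_000_000)
print(f"x*10: {t_mul:.3f}s    (x<<3)+x+x: {t_shift:.3f}s")
```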

Obviously, this binary-based decimal-powered floating-point would
not be as fast as using the hardware FPU, but it wouldn't be as bad as
doing all calculations in decimal as with a full decimal implementation.

I'm in the process of writing an "efloat" class which implements
these ideas as a proof of concept.  "efloat" for expected float,
i.e., one that meets your expectations, avoiding surprises like the
current:

Quote:
>>> 7.35
7.3499999999999996
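
As a rough sketch of the idea (the class name and methods here are only
illustrative, not the actual work in progress): keep an int coefficient and
an int power of 10, and all arithmetic stays in integer land.

```python
class EFloat:
    """Toy decimal float: value is coef * 10**exp, both plain ints.
    Illustrative sketch only: multiplication and display, no rounding."""

    def __init__(self, coef, exp):
        self.coef, self.exp = coef, exp

    @classmethod
    def from_string(cls, s):
        # "7.35" -> coef=735, exp=-2; no binary fraction, nothing is lost
        whole, _, frac = s.partition('.')
        return cls(int(whole + frac), -len(frac))

    def __mul__(self, other):
        # Exact: multiply coefficients, add exponents -- pure int hardware.
        return EFloat(self.coef * other.coef, self.exp + other.exp)

    def __str__(self):
        if self.exp >= 0:
            return str(self.coef) + '0' * self.exp
        digits = str(abs(self.coef)).rjust(1 - self.exp, '0')
        sign = '-' if self.coef < 0 else ''
        return sign + digits[:self.exp] + '.' + digits[self.exp:]

print(EFloat.from_string("7.35"))                               # 7.35, exactly
print(EFloat.from_string("5.12") * EFloat.from_string("4.22"))  # 21.6064
```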

If someone can think of a better name please let me know.
I didn't want to call it dfloat since that should be reserved for
a full decimal implementation.  Come to think of it, efloat may
not be such a good name since it can be confused with the "e"
in the binary float literal display (0.735e1).

I'll post the source code when I'm done, in case anyone is interested
in checking it out.

As far as decimal vs rational for a built-in Python type, couldn't
we have both?  And if we do get both, I think:

7.35     or  735d-2   should be the literal for the decimal float
7.35e0   or  735e-2   should be the literal for the binary float
735/100  or  735r-2   should be the literal for rationals

-Don

Wed, 19 Nov 2003 08:39:02 GMT
Decimal can be Binary Too (was decimal or rational)
[Don O'Donnell]

Quote:
> ...
> If by "floating-point-decimal" you mean internal representation of
> both the mantissa and the exponent by some sort of a binary-coded
> decimal, let me point out that it is not necessary to go to full
> decimal in order to achieve the `7.35` == "7.35" goal.
> By letting the exponent represent powers of 10 rather than 2, the
> base (or mantissa) and exponent can both be represented in binary
> as an int or long.  Thus, the internal calculations can use the
> fast integer hardware instructions, rather than decimal arithmetic
> which would have to be done by software.
> ...

See

<ftp://ftp.python.org/pub/python/contrib-09-Dec-1999/DataStructures/Fi...nt.py>

for a fixed-point version of that.  For "typical" commercial use, the
problem is that converting between base-2 (internal) and base-10 (string) is
very expensive relative to the one or two arithmetic operations typically
performed on each input.  For example, hook up to a database with a million
sales records, and report on the sum.  The database probably delivers the
sale amounts as strings, like "25017.18".  Even adding them into the total
*as* strings would be cheaper than converting them to a binary format first.
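
To make the "adding them *as* strings" point concrete, here is one toy way
to sum amount strings digit by digit, never building a full binary value.
The helper name and the two-decimal-places assumption are mine, purely for
illustration:

```python
def add_amount_strings(a, b):
    """Add two non-negative amount strings like '25017.18' digit by digit,
    without ever converting the full values to binary numbers.
    Sketch only: assumes exactly two digits after the decimal point."""
    a, b = a.replace('.', ''), b.replace('.', '')
    width = max(len(a), len(b)) + 1              # room for a final carry
    a, b = a.rjust(width, '0'), b.rjust(width, '0')
    out, carry = [], 0
    for da, db in zip(reversed(a), reversed(b)):
        carry, digit = divmod(int(da) + int(db) + carry, 10)
        out.append(str(digit))                   # single-digit work only
    digits = ''.join(reversed(out)).lstrip('0').rjust(3, '0')
    return digits[:-2] + '.' + digits[-2:]

print(add_amount_strings("25017.18", "0.82"))    # 25018.00
```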

In addition, once you're outside the range of a native platform integer,
you're doing multiprecision operations by hand anyway.  Python's longs use
"digits" in base 2**15 internally, but *could* use, say, base 10**4 instead.
The code would be almost the same, except for using different low-level
tricks to find twodigits/base and twodigits%base efficiently.
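
That point can be illustrated with a toy multiprecision adder whose only
base-dependent step is the divmod.  The helpers below are hypothetical,
with BASE = 10**4 chosen so the limbs line up with blocks of four decimal
digits, which is what makes decimal string output cheap:

```python
BASE = 10 ** 4   # hypothetical limb base, vs. CPython's 2**15 "digits"

def limbs_from_int(n):
    """Split a non-negative int into little-endian base-BASE limbs."""
    limbs = []
    while True:
        n, r = divmod(n, BASE)
        limbs.append(r)
        if n == 0:
            return limbs

def add_limbs(a, b):
    """Schoolbook multiprecision addition; the code shape is the same
    whatever BASE is -- only the divmod below depends on it."""
    out, carry = [], 0
    for i in range(max(len(a), len(b))):
        s = (a[i] if i < len(a) else 0) + (b[i] if i < len(b) else 0) + carry
        carry, digit = divmod(s, BASE)
        out.append(digit)
    if carry:
        out.append(carry)
    return out

def limbs_to_int(limbs):
    return sum(d * BASE ** i for i, d in enumerate(limbs))

print(limbs_from_int(123456789))   # [6789, 2345, 1] -- readable 4-digit chunks
```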

Wed, 19 Nov 2003 09:30:40 GMT
Decimal can be Binary Too (was decimal or rational)

Quote:

> [Don O'Donnell]
> > ...
> > If by "floating-point-decimal" you mean internal representation of
> > both the mantissa and the exponent by some sort of a binary-coded
> > decimal, let me point out that it is not necessary to go to full
> > decimal in order to achieve the `7.35` == "7.35" goal.
> > By letting the exponent represent powers of 10 rather than 2, the
> > base (or mantissa) and exponent can both be represented in binary
> > as an int or long.  Thus, the internal calculations can use the
> > fast integer hardware instructions, rather than decimal arithmetic
> > which would have to be done by software.
> > ...

> See

> <ftp://ftp.python.org/pub/python/contrib-09-Dec-1999/DataStructures/Fi...nt.py>

> for a fixed-point version of that.  For "typical" commercial use, the
> problem is that converting between base-2 (internal) and base-10 (string) is
> very expensive relative to the one or two arithmetic operations typically
> performed on each input.  For example, hook up to a database with a million
> sales records, and report on the sum.  The database probably delivers the
> sale amounts as strings, like "25017.18".  Even adding them into the total
> *as* strings would be cheaper than converting them to a binary format first.

> In addition, once you're outside the range of a native platform integer,
> you're doing multiprecision operations by hand anyway.  Python's longs use
> "digits" in base 2**15 internally, but *could* use, say, base 10**4 instead.
> The code would be almost the same, except for using different low-level
> tricks to find twodigits/base and twodigits%base efficiently.

In most commercial environments, the input, moving, formatting and
output of numbers exceeds the amount of actual calculation that is done
with them.  Hence the early business-oriented computers did all their
calculations in some form of decimal format, to save the costly
dec-bin-dec conversion steps.  The "revolutionary" IBM 360 of the '60s
was the first to have both floating-point hardware for scientific
processing as well as fixed-point "packed-decimal" hardware for
business use.

With today's fast processors, however, the radix conversion steps are
hardly noticeable.  I've done a lot of COBOL (yuck) programming on
Univac/Unisys mainframes, which, lacking hardware decimal instructions,
did all their fixed-point processing in binary.  I never encountered
any performance problems.  Quite the contrary, they were incredibly
fast machines for commercial work.

I took a look at your FixedPoint.py module.  Very nice work; thanks
for the update.  I notice that you are also using a long integer
internally to store the base number and an int to store a power of 10,
as I suggested in my original posting.  I was thinking more along the
lines of a floating-point type rather than your fixed-point.  I.e.:

With your FixedPoint class:
5.12 * 4.22 == 21.61      (it's rounded to two decimal places)

With my dfloat class:
5.12 * 4.22 == 21.6064    (result is exact)

I think there is a real need for both types of numbers.

Especially in view of the fact that with Python's built in types,
what we get today is:

Quote:
>>> 5.12 * 4.22
21.606400000000001

Do you think it would be possible or desirable to extend/generalize
your FixedPoint class to handle the "floating decimal" as an option?
Or would it be better to make it a separate class or subclass?
Any opinions?

-Don

Fri, 21 Nov 2003 23:56:49 GMT
Decimal can be Binary Too (was decimal or rational)

Quote:

> [Don O'Donnell]
> > ...
> > If by "floating-point-decimal" you mean internal representation of
> > both the mantissa and the exponent by some sort of a binary-coded
> > decimal, let me point out that it is not necessary to go to full
> > decimal in order to achieve the `7.35` == "7.35" goal.
> > By letting the exponent represent powers of 10 rather than 2, the
> > base (or mantissa) and exponent can both be represented in binary
> > as an int or long.  Thus, the internal calculations can use the
> > fast integer hardware instructions, rather than decimal arithmetic
> > which would have to be done by software.
> > ...

> See

> <ftp://ftp.python.org/pub/python/contrib-09-Dec-1999/DataStructures/Fi...nt.py>

> for a fixed-point version of that.  For "typical" commercial use, the
> problem is that converting between base-2 (internal) and base-10 (string) is
> very expensive relative to the one or two arithmetic operations typically
> performed on each input.  For example, hook up to a database with a million
> sales records, and report on the sum.  The database probably delivers the
> sale amounts as strings, like "25017.18".  Even adding them into the total
> *as* strings would be cheaper than converting them to a binary format first.

> In addition, once you're outside the range of a native platform integer,
> you're doing multiprecision operations by hand anyway.  Python's longs use
> "digits" in base 2**15 internally, but *could* use, say, base 10**4 instead.
> The code would be almost the same, except for using different low-level
> tricks to find twodigits/base and twodigits%base efficiently.

In most commercial environments, the input, moving, formatting and
output of numbers exceeds the amount of actual calculations that are
done with them.  Hence the early business oriented computers did all
their calculations in some form of decimal format, to save the costly
dec-bin-dec conversion steps.  The "revolutionary" IBM 360 of the '60s
was the first to have both floating-point hardware for scientific
processing as well as fixed-point "packed-decimal" hardware for
business use.

With today's fast processors however, the radix conversion steps are
hardly noticeable.  I've done a lot of COBOL (yuck) programming on
Univac/Unisys mainframes, which, lacking hardware decimal instructions,
did all their fixed-point processing in binary.  I never encountered
any performance problems.  Quite the contrary, they were incredibly
fast machines for commercial work.

I took a look at your FixedPoint.py module.  Very nice work; thanks
for the update.  I notice that you are also using a long integer
internally to store the base number and an int to store a power of 10,
as I suggested in my original posting.  I was thinking more along the
lines of a floating-point type rather than your fixed-point.  I.e.:

With your FixedPoint class:
5.12 * 4.22 == 21.61      (it's rounded to two decimal places)

With my dfloat class:
5.12 * 4.22 == 21.6064    (result is exact)

I think there is a real need for both types of numbers.

Especially in view of the fact that with Python's built in types,
what we get today is:

Quote:
>>> 5.12 * 4.22
21.606400000000001

Do you think it would be possible or desirable to extend/generalize
your FixedPoint class to handle the "floating decimal" as an option?
Or would it be better to make it a separate class or subclass?
Any opinions?

BTW, I believe we also could use a rational type for representing
numbers like 1/3 which can't be represented exactly by a finite
number of decimal or binary digits.

-Don

Sat, 22 Nov 2003 00:33:21 GMT
Decimal can be Binary Too (was decimal or rational)
[Don O'Donnell]

Quote:
> ...
> Do you think it would be possible or desirable to extend/generalize
> your FixedPoint class to handle the "floating decimal" as an option?
> Or would it be better to make it a separate class or subclass?  Any
> opinions?

Sorry to skip most of the text, but I'm very short on time tonight.  The
short and the long course is that I don't want Python to reinvent this wheel
yet again, so have been arguing for serious consideration of IBM's Standard
Decimal Arithmetic proposal:

http://www2.hursley.ibm.com/decimal/

Floating decimal is very much a part of that.  Aahz is in the process of
implementing it as a Python module; see his Decimal.py at

http://starship.python.net/crew/aahz/

If, after people get to kick Decimal's tires (it's not done yet), they're
happy (OK, I'll settle for happier <wink>) with this approach, Aahz has been
writing it in such a way that the performance-critical parts can be migrated
to C easily.  If not, hey -- it's Python!  It's painless to throw away (and
especially if you're not Aahz <wink>).


Ah, in *that* case, FixedPoint.py does way too many magical coercions, the
test suite should be 100x larger, and I'm not sure the intent of lines like

# n1/10**p / (n2/10**p) = n1/n2 = (n1*10**p/n2)/10**p

is obvious to anyone but me -- but you got what you paid for <wink>.

Sat, 22 Nov 2003 09:45:38 GMT
Decimal can be Binary Too (was decimal or rational)

Quote:

>If, after (it's not done yet) people get to kick Decimal's tires, they're
>happy (OK, I'll settle for happier <wink>) with this approach, Aahz has been
>writing it in such a way that the performance-critical parts can be migrated
>to C easily.  If not, hey -- it's Python!  It's painless to throw away (and
>especially if you're not Aahz <wink>).

Not only that, but the intent (and the reason I started this project in
the first place) is to have the C code release the Global Interpreter
Lock so that multi-CPU machines get a speedup in decimal arithmetic.
--

Hugs and backrubs -- I break Rule 6

"Do not taunt happy fun for loops. Do not change lists you are looping over."
--Remco Gerlich, comp.lang.python

Sat, 22 Nov 2003 12:53:39 GMT
Decimal can be Binary Too (was decimal or rational)

Quote:

> If someone can think of a better name please let me know.

--
Greg Ewing, Computer Science Dept, University of Canterbury,
Christchurch, New Zealand
http://www.cosc.canterbury.ac.nz/~greg

Tue, 25 Nov 2003 09:20:58 GMT
Decimal can be Binary Too (was decimal or rational)

>  Scaling would require a multiply or divide by 10 every time the
>  exponent is incremented/decremented by one, not as fast as a
>  left or right bit shift; but it can be done pretty fast considering
>  that a multiply by 10 is equivalent to a left shift by 3 (*8)
>  followed by two adds (10*x == (8+2)*x == 8*x + 2*x == 8*x + x + x):
>
> >>> x = 42
> >>> (x<<3) + x + x    # multiply by 10 without using * op
>  420

Why not make that one left shift by 3, one left shift by 1, and one add?

Quote:
>>> x = 42
>>> (x<<3) + (x<<1)
420

What's more expensive here, add or shift? I'm more into hardware, and
hardwired shifts are costless (disregarding overflow checks), while adds
are not.

>  I'm not sure that this is any faster than x*10; in Python, probably
>  not, but in C it may be. (Haven't looked at any CPU timing charts
>  lately, so I may be all wet here; it used to be that mult & div were
>  very expensive.)  (A quick timing check shows that
>  the above shift and add calculation takes about 3 times longer in
>  Python than a straight *10, as expected.)

Neither am I. There's too much unknown here for me.

/Mikael

-----------------------------------------------------------------------

WWW:     http://www.dtr.isy.liu.se/dtr/staff/mikael
Phone:   +46 - (0)13 - 28 1343
Telefax: +46 - (0)13 - 28 1339
Date:    08-Jun-2001
Time:    08:28:40

/"\
\ /     ASCII Ribbon Campaign
X      Against HTML Mail
/ \

This message was sent by XF-Mail.
-----------------------------------------------------------------------

Tue, 25 Nov 2003 14:39:30 GMT
Decimal can be Binary Too (was decimal or rational)

<snip>

Quote:
> ... For "typical" commercial use, the
> problem is that converting between base-2 (internal) and base-10 (string) is
> very expensive relative to the one or two arithmetic operations typically
> performed on each input.  For example, hook up to a database with a million
> sales records, and report on the sum.  The database probably delivers the
> sale amounts as strings, like "25017.18".  Even adding them into the total
> *as* strings would be cheaper than converting them to a binary format first.

Boo! I was hoping that by base-2 you were talking about base -2. We
don't need no steenkeeng two's complement.

--
Andrew Stribblehill
Systems programmer, IT Service, University of Durham, England

Tue, 25 Nov 2003 22:09:50 GMT
Decimal can be Binary Too (was decimal or rational)

Quote:
> .....< about fastest way to multiply by ten>...
> What's more expensive here, add or shift? I'm more into hardware, and
> hardwired shifts are costless (disregarding overflow checks), while

I am not an expert on modern CPU architecture, but I strongly suspect that
things like multiplying two integers are already optimized, and that there
would be no advantage to trying to be tricky about it.  Besides, you should
let the C compiler decide which way is fastest.

Mon, 01 Dec 2003 06:43:51 GMT
