Decimal can be Binary Too (was decimal or rational)
Don O'Donnell #1 / 11

Decimal can be Binary Too (was decimal or rational)
(If this is a dup post, pls excuse, orig got lost :-D) Quote:
> ...
> > <And thanks to more conversation with Remco Gerlich I think that it
> > is decimal, not rational that is the way to go, somebody try to talk us
> > out of it> is exactly what Burke was talking about.
> I'm not a candidate for "talking (anybody) out of it", where
> "it" is having (e.g.) 7.35 mean a floating-point-decimal 7.35
> rather than a rational 7.35, because I'm _deeply_ undecided
> myself.  My criterion: what *WOULD* be harder to explain to (and
> have accepted by) an intelligent beginner?  Potentially *VERY
> SLOW* (and memory-consuming) behavior for computation, or the
> risk of unexpected (to said beginner) 'strange' results?  And
> I'm not talking about a vacuum -- I'm talking about a real world
> where people's expectations ARE very much shaped by the
> calculators they've dealt with for years, and those ARE FP
> decimal (at least typically).  A computation doesn't slow
> down as it proceeds, but it does often lose precision...
If by "floating-point-decimal" you mean internal representation of both
the mantissa and the exponent by some sort of binary-coded decimal, let
me point out that it is not necessary to go to full decimal in order to
achieve the `7.35` == "7.35" goal.

By letting the exponent represent powers of 10 rather than 2, the base
(or mantissa) and exponent can both be represented in binary as an int
or long.  Thus, the internal calculations can use the fast integer
hardware instructions, rather than decimal arithmetic, which would have
to be done in software.  Scaling would require a multiply or divide by
10 every time the exponent is incremented/decremented by one -- not as
fast as a left or right bit shift, but it can be done pretty fast,
considering that a multiply by 10 is equivalent to a left shift by 3
(*8) followed by two adds (10*x == (8+2)*x == 8*x + 2*x == 8*x + x + x):

Quote:
>>> x = 42
>>> (x<<3) + x + x    # multiply by 10 without using * op
420

I'm not sure that this is any faster than x*10; in Python, probably
not, but in C it may be.  (Haven't looked at any CPU timing charts
lately, so I may be all wet here; it used to be that mult & div were
very expensive.)  (A quick timing check shows that the above
shift-and-add calculation takes about 3 times longer in Python than a
straight *10, as expected.)

Obviously, this binary-based decimal-powered floating point would not
be as fast as using the hardware FPU, but it wouldn't be as bad as
doing all calculations in decimal as with a full decimal
implementation.

I'm in the process of writing an "efloat" class which implements these
ideas as a proof of concept.  "efloat" for expected float, i.e., one
that meets your expectations, no surprises, as with:

Quote:
>>> 7.35
7.3499999999999996

If someone can think of a better name, please let me know.  I didn't
want to call it dfloat, since that should be reserved for a full
decimal implementation.  Come to think of it, efloat may not be such a
good name either, since it can be confused with the "e" in the binary
float literal display (0.735e1).  I'll post the source code when I'm
done, in case anyone is interested in checking it out.

As far as decimal vs. rational for a built-in Python type, couldn't we
have both?  And if we do get both, I think:

    7.35    or 735d-2 should be the literal for the decimal-float
    7.35e0  or 735e-2 should be the literal for the binary float
    735/100 or 735r-2 should be the literal for rationals

Don
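A minimal sketch of the representation Don describes: a binary integer mantissa paired with a base-10 exponent, so decimal literals round-trip exactly while arithmetic stays in integer operations. The class name `EFloat` and all details are illustrative assumptions, not Don's actual efloat code.

```python
# Illustrative sketch only, not Don's efloat: value = mantissa * 10**exp10,
# with both fields held as plain binary integers.

class EFloat:
    def __init__(self, mantissa, exp10):
        # EFloat(735, -2) represents 7.35 exactly
        self.m = mantissa
        self.e = exp10

    def __mul__(self, other):
        # mantissas multiply with fast integer hardware; exponents add
        return EFloat(self.m * other.m, self.e + other.e)

    def __add__(self, other):
        # align to the smaller exponent by scaling the other mantissa by 10s
        a, b = (self, other) if self.e >= other.e else (other, self)
        return EFloat(a.m * 10 ** (a.e - b.e) + b.m, b.e)

    def __str__(self):
        sign = '-' if self.m < 0 else ''
        s = str(abs(self.m))
        if self.e >= 0:
            return sign + s + '0' * self.e
        if len(s) <= -self.e:                 # pad, e.g. 5e-3 -> 0.005
            s = '0' * (-self.e - len(s) + 1) + s
        return sign + s[:self.e] + '.' + s[self.e:]

print(EFloat(735, -2))                       # 7.35 -- exact, no surprise
print(EFloat(512, -2) * EFloat(422, -2))     # 21.6064 -- exact product
```

Note that multiplication never loses digits here; the exponent just grows, which is exactly the "floating decimal" behavior discussed later in the thread.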

Wed, 19 Nov 2003 08:39:02 GMT 


Tim Peters #2 / 11

Decimal can be Binary Too (was decimal or rational)
[Don O'Donnell]
Quote:
> ...
> If by "floating-point-decimal" you mean internal representation of
> both the mantissa and the exponent by some sort of binary-coded
> decimal, let me point out that it is not necessary to go to full
> decimal in order to achieve the `7.35` == "7.35" goal.
>
> By letting the exponent represent powers of 10 rather than 2, the
> base (or mantissa) and exponent can both be represented in binary
> as an int or long.  Thus, the internal calculations can use the
> fast integer hardware instructions, rather than decimal arithmetic
> which would have to be done by software.
> ...
See
<ftp://ftp.python.org/pub/python/contrib09Dec1999/DataStructures/Fi...nt.py>
for a fixed-point version of that.

For "typical" commercial use, the problem is that converting between
base-2 (internal) and base-10 (string) is very expensive relative to
the one or two arithmetic operations typically performed on each input.
For example, hook up to a database with a million sales records, and
report on the sum.  The database probably delivers the sale amounts as
strings, like "25017.18".  Even adding them into the total *as* strings
would be cheaper than converting them to a binary format first.

In addition, once you're outside the range of a native platform
integer, you're doing multi-precision operations by hand anyway.
Python's longs use "digits" in base 2**15 internally, but *could* use,
say, base 10**4 instead.  The code would be almost the same, except for
using different low-level tricks to find twodigits/base and
twodigits%base efficiently.
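Tim's point can be sketched directly: when summing money amounts delivered as strings, parse each straight to integer cents (one scale multiply per record) rather than converting to binary floating point at all. The record values below are made up for illustration.

```python
# Sum money strings exactly by keeping integer cents throughout;
# no base-2 floating-point conversion is ever needed.

def cents(amount):
    # "25017.18" -> 2501718; assumes exactly two digits after the point
    whole, frac = amount.split('.')
    return int(whole) * 100 + int(frac)

records = ["25017.18", "19.99", "0.01"]       # stand-ins for DB rows
total = sum(cents(r) for r in records)
print("%d.%02d" % divmod(total, 100))         # 25037.18 -- exact
```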

Wed, 19 Nov 2003 09:30:40 GMT 



Don O'Donnell #4 / 11

Decimal can be Binary Too (was decimal or rational)
Quote:
> [Don O'Donnell]
> > ...
> > If by "floating-point-decimal" you mean internal representation of
> > both the mantissa and the exponent by some sort of binary-coded
> > decimal, let me point out that it is not necessary to go to full
> > decimal in order to achieve the `7.35` == "7.35" goal.
> >
> > By letting the exponent represent powers of 10 rather than 2, the
> > base (or mantissa) and exponent can both be represented in binary
> > as an int or long.  Thus, the internal calculations can use the
> > fast integer hardware instructions, rather than decimal arithmetic
> > which would have to be done by software.
> > ...
> See
> <ftp://ftp.python.org/pub/python/contrib09Dec1999/DataStructures/Fi...nt.py>
> for a fixed-point version of that.  For "typical" commercial use, the
> problem is that converting between base-2 (internal) and base-10
> (string) is very expensive relative to the one or two arithmetic
> operations typically performed on each input.  For example, hook up
> to a database with a million sales records, and report on the sum.
> The database probably delivers the sale amounts as strings, like
> "25017.18".  Even adding them into the total *as* strings would be
> cheaper than converting them to a binary format first.
> In addition, once you're outside the range of a native platform
> integer, you're doing multi-precision operations by hand anyway.
> Python's longs use "digits" in base 2**15 internally, but *could*
> use, say, base 10**4 instead.  The code would be almost the same,
> except for using different low-level tricks to find twodigits/base
> and twodigits%base efficiently.
Thanks for your comments, Tim.

I agree that in most commercial environments, input, moving, formatting
and output of numbers exceeds the amount of actual calculation that is
done with them.  Hence the early business-oriented computers did all
their calculations in some form of decimal format, to save the costly
dec-bin-dec conversion steps.  The "revolutionary" IBM 360 of the '60s
was the first to have both floating-point hardware for scientific
processing and fixed-point "packed-decimal" hardware for business use.

With today's fast processors, however, the radix conversion steps are
hardly noticeable.  I've done a lot of COBOL (yuck) programming on
Univac/Unisys mainframes, which, lacking hardware decimal instructions,
did all their fixed-point processing in binary.  I never encountered
any performance problems.  Quite the contrary, they were incredibly
fast machines for commercial work.

I took a look at your FixedPoint.py module.  Very nice work, thanks.
As it turns out, I had already downloaded ver 0.0.3, but had forgotten
about it.  Thanks for the update.  I notice that you are also using a
long integer internally to store the base number and an int to store a
power of 10, as I suggested in my original posting.

I was thinking more along the lines of a floating-point type rather
than your fixed-point.  I.e., with your FixedPoint class:

    5.12 * 4.22 == 21.61    (it's rounded to two decimal places)

With my dfloat class:

    5.12 * 4.22 == 21.6064  (result is exact)

I think there is a real need for both types of numbers, especially in
view of the fact that with Python's built-in types, what we get today
is:

Quote:
>>> 5.12 * 4.22
21.606400000000001

Do you think it would be possible or desirable to extend/generalize
your FixedPoint class to handle the "floating decimal" as an option?
Or would it be better to make it a separate class or subclass?  Any
opinions?

BTW, I believe we could also use a rational type, for representing
numbers like 1/3 which can't be represented exactly by a finite number
of decimal or binary digits.

still-learning-from-reading-your-code-ly y'rs
Don
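The fixed-point vs. floating-decimal distinction Don draws can be sketched with scaled integers: a fixed-point type rounds the product back to two decimal places, while a floating-decimal type keeps the exact product by letting the exponent grow.

```python
# Both operands are integer mantissas scaled by 10**-2 (i.e. cents-style).

a, b = 512, 422                  # represent 5.12 and 4.22

exact = a * b                    # exponent becomes -4: 21.6064, exact
fixed = (exact + 50) // 100      # re-rounded back to exponent -2: 21.61

print(exact)                     # 216064  (i.e. 21.6064)
print(fixed)                     # 2161    (i.e. 21.61)
```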

Sat, 22 Nov 2003 00:33:21 GMT 


Tim Peters #5 / 11

Decimal can be Binary Too (was decimal or rational)
[Don O'Donnell]
Quote:
> ...
> Do you think it would be possible or desirable to extend/generalize
> your FixedPoint class to handle the "floating decimal" as an option?
> Or would it be better to make it a separate class or subclass?  Any
> opinions?
Sorry to skip most of the text, but I'm very short on time tonight.
The short and the long course is that I don't want Python to reinvent
this wheel yet again, so I have been arguing for serious consideration
of IBM's Standard Decimal Arithmetic proposal:

    http://www2.hursley.ibm.com/decimal/

Floating decimal is very much a part of that.  Aahz is in the process
of implementing it as a Python module; see his Decimal.py at

    http://starship.python.net/crew/aahz/

If, after (it's not done yet) people get to kick Decimal's tires,
they're happy (OK, I'll settle for happier <wink>) with this approach,
Aahz has been writing it in such a way that the performance-critical
parts can be migrated to C easily.  If not, hey -- it's Python!  It's
painless to throw away (and especially if you're not Aahz <wink>).

Quote:
> still-learning-from-reading-your-code-ly y'rs

Ah, in *that* case, FixedPoint.py does way too many magical coercions,
the test suite should be 100x larger, and I'm not sure the intent of
seemingly tautological comments like

    # n1/10**p / (n2/10**p) = n1/n2 = (n1*10**p/n2)/10**p

is obvious to anyone but me -- but you got what you paid for <wink>.
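For context: the IBM specification Tim cites is the one that Python's standard decimal module (added in Python 2.4, descended from Aahz's Decimal.py work) eventually implemented, so the behavior being debated in this thread can be tried directly today.

```python
# Python's stdlib decimal module implements IBM's General Decimal
# Arithmetic Specification -- the proposal referenced above.
from decimal import Decimal

print(Decimal("7.35"))                    # 7.35 -- no binary surprise
print(Decimal("5.12") * Decimal("4.22"))  # 21.6064 -- exact product
```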

Sat, 22 Nov 2003 09:45:38 GMT 


Aahz Maruch #6 / 11

Decimal can be Binary Too (was decimal or rational)
Quote:
>If, after (it's not done yet) people get to kick Decimal's tires,
>they're happy (OK, I'll settle for happier <wink>) with this approach,
>Aahz has been writing it in such a way that the performance-critical
>parts can be migrated to C easily.  If not, hey -- it's Python!  It's
>painless to throw away (and especially if you're not Aahz <wink>).
Not only that, but the intent (and the reason I started this project in
the first place) is to have the C code release the Global Interpreter
Lock, so that multi-CPU machines get a speedup in decimal arithmetic.
--
Androgynous poly {*filter*} vanilla {*filter*} het Pythonista
http://www.***.com/
Hugs and backrubs -- I break Rule 6

"Do not taunt happy fun for loops.  Do not change lists you are looping
over."  --Remco Gerlich, comp.lang.python

Sat, 22 Nov 2003 12:53:39 GMT 


Greg Ewing #7 / 11

Decimal can be Binary Too (was decimal or rational)
Quote:
> If someone can think of a better name, please let me know.
How about "float10"?

--
Greg Ewing, Computer Science Dept,
University of Canterbury, Christchurch, New Zealand
To get my email address, please visit my web page:
http://www.cosc.canterbury.ac.nz/~greg

Tue, 25 Nov 2003 09:20:58 GMT 


Mikael Olofsson #8 / 11

Decimal can be Binary Too (was decimal or rational)
Quote:
> Scaling would require a multiply or divide by 10 every time the
> exponent is incremented/decremented by one, not as fast as a
> left or right bit shift; but it can be done pretty fast considering
> that a multiply by 10 is equivalent to a left shift by 3 (*8)
> followed by two adds (10*x == (8+2)*x == 8*x + 2*x == 8*x + x + x):
>
> >>> x = 42
> >>> (x<<3) + x + x    # multiply by 10 without using * op
> 420

Why not make that one left shift by 3, one left shift by 1, and one
add?

Quote:
>>> x = 42
>>> (x<<3) + (x<<1)
420

What's more expensive here, add or shift?  I'm more into hardware, and
hardwired shifts are costless (disregarding overflow checks), while
hardwired adds are not.

Quote:
> I'm not sure that this is any faster than x*10; in Python, probably
> not, but in C it may be.  (Haven't looked at any CPU timing charts
> lately, so I may be all wet here; it used to be that mult & div were
> very expensive.)  (A quick timing check shows that the above
> shift-and-add calculation takes about 3 times longer in Python than
> a straight *10, as expected.)

Neither am I.  There's too much unknown here for me.

/Mikael
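The timing question above can be checked with the stdlib timeit module; in Python the interpreter overhead swamps any hardware shift-vs-multiply difference, so this mostly confirms Don's earlier "about 3 times longer" observation rather than measuring the silicon.

```python
# Compare x*10 against the shift-and-add equivalent; both compute 420.
import timeit

mul   = timeit.timeit("x * 10", setup="x = 42")
shift = timeit.timeit("(x << 3) + (x << 1)", setup="x = 42")
print("x*10: %.3fs   shifts+add: %.3fs" % (mul, shift))
```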
--
WWW:     http://www.dtr.isy.liu.se/dtr/staff/mikael
Phone:   +46 (0)13 28 1343
Telefax: +46 (0)13 28 1339
Date:    08-Jun-2001  Time: 08:28:40

/"\
\ /  ASCII Ribbon Campaign
 X   Against HTML Mail
/ \

This message was sent by XFMail.

Tue, 25 Nov 2003 14:39:30 GMT 




Andrew Stribblehill #10 / 11

Decimal can be Binary Too (was decimal or rational)
<snip>
Quote:
> ...  For "typical" commercial use, the problem is that converting
> between base-2 (internal) and base-10 (string) is very expensive
> relative to the one or two arithmetic operations typically performed
> on each input.  For example, hook up to a database with a million
> sales records, and report on the sum.  The database probably
> delivers the sale amounts as strings, like "25017.18".  Even adding
> them into the total *as* strings would be cheaper than converting
> them to a binary format first.
Boo!  I was hoping that by base-2 you were talking about base -2.  We
don't need no s{*filter*}keeng two's complement.

--
Andrew Stribblehill
Systems programmer, IT Service, University of Durham, England

Tue, 25 Nov 2003 22:09:50 GMT 


Nick Perkins #11 / 11

Decimal can be Binary Too (was decimal or rational)
Quote:
> ..... < about fastest way to multiply by ten > ...
> What's more expensive here, add or shift?  I'm more into hardware,
> and hardwired shifts are costless (disregarding overflow checks),
> while hardwired adds are not.
I am not an expert on modern CPU architecture, but I strongly suspect
that things like multiplying two integers are already optimized, and
that there would be no advantage to trying to be tricky about it.
Besides, you should let the C compiler decide which way is fastest.

Mon, 01 Dec 2003 06:43:51 GMT 


