fixed point vs floating point
Matthew Heaney #1 / 48

fixed point vs floating point
Matthew asks:

<<Can you elaborate on that? I've often observed that floating point is used when fixed point is clearly called for, i.e., a bearing or latitude. What are the specific reasons to avoid fixed point?>>

Robert replies:

<<Most obviously, on most architectures, fixed-point is drastically slower than floating-point.>>

Matthew asks:

<<Well, can you elaborate on that? Isn't fixed-point arithmetic implemented using integers? What are the specific reasons why it would be slower? Is it because floating point can be done in parallel on many machines?

My real reason for all these questions is that I still don't understand why someone declares heading as

   type Heading_In_Degrees is digits 6 range 0.0 .. 360.0;

when a better fit to the problem seems to be

   type Heading_In_Degrees is delta 0.1 range 0.0 .. 360.0;

The error in my value is the same no matter which way I'm pointing, so why wouldn't I declare the type as a fixed point type? Almost universally, however, programmers use a floating point type, but I suspect the reason has nothing to do with efficiency considerations. Any ideas?>>

Robert replies:

<<Probably this is no longer GNAT-specific enough to be here; this thread should be moved to CLA. Briefly, fpt is faster than integer because chips are designed that way (I am specifically thinking of multiplication and division here). Second, your definition of Heading_In_Degrees is wrong; a small of 0.0625 seems quite inappropriate. Third, I think you should look at generated code to better understand fixed-point issues.>>

Matthew replies, and then asks: OK, OK, I was being lazy. How about this definition of heading:

   type Heading_In_Degrees_Base is
      delta 512.0 * System.Fine_Delta range -512.0 .. 512.0;
   subtype Heading_In_Degrees is Heading_In_Degrees_Base range 0.0 .. 360.0;

Now I know on GNAT that the delta is considerably smaller than 0.0625. (I really do wish implementations would give me extra accuracy and not extra range. Why do they do that?)

Anyway, I'll accept that fixed point operations are slower, but the question remains: Why do most programmers insist on using floating point when a fixed point type better models the abstraction? Is the reason solely based on efficiency?

By the way, are there off-the-shelf subprograms (in Ada preferably, but I'd take any language) for computing the elementary functions (sin, cos, etc.) of a fixed point angle?

Matthew Heaney
Software Development Consultant
(818) 985-1271
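[Editor's note: a minimal sketch of the absolute-vs-relative point Matthew is raising. The type names are invented for illustration, and the printed spacings assume a typical compiler that maps digits 6 to a 32-bit float and picks a small of 0.0625 for delta 0.1:

   with Ada.Text_IO; use Ada.Text_IO;

   procedure Spacing_Demo is
      type Heading_Float is digits 6 range 0.0 .. 360.0;
      type Heading_Fixed is delta 0.1 range 0.0 .. 360.0;
   begin
      --  Floating point: the gap between adjacent machine numbers
      --  grows with magnitude, so absolute error depends on the heading.
      Put_Line (Heading_Float'Image (Heading_Float'Succ (1.0) - 1.0));
      Put_Line (Heading_Float'Image (Heading_Float'Succ (359.0) - 359.0));
      --  Fixed point: the gap is the small everywhere.
      Put_Line (Heading_Fixed'Image (Heading_Fixed'Small));
   end Spacing_Demo;
]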

Wed, 10 May 2000 03:00:00 GMT 


Tucker Taft #2 / 48

fixed point vs floating point
: Matthew asks:
: <<Can you elaborate on that? I've often observed that floating point is
: used when fixed point is clearly called for, i.e., a bearing or latitude.
: What are the specific reasons to avoid fixed point?>>
: Robert replies:
: <<Most obviously, on most architectures, fixed-point is drastically slower
: than floating-point.>>

I'm not sure to what Robert is referring here. Fixed + Fixed is as efficient as the normal integer operation, as are Fixed - Fixed, Fixed * Integer, and Fixed / Integer. The only operations that are potentially inefficient are Fixed * Fixed and Fixed / Fixed, neither of which is particularly likely when the fixed-point type represents an angle ;).

Even if you do use Fixed * Fixed or Fixed / Fixed, the inefficiency has to do with sometimes having to shift the result or one of the operands after/before performing the operation (presuming binary smalls). If the machine has efficient shifting, this is not a major overhead. However, it may be that on some compilers the fixed-fixed multiplication/division operations are handled out of line, and the procedure-call overhead is the primary added expense.

: ...
: Matthew Heaney
: Software Development Consultant
: (818) 985-1271
Intermetrics, Inc. Burlington, MA USA
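[Editor's note: to make Tuck's shifting point concrete, here is a sketch of the work a compiler generates for Fixed * Fixed with binary smalls. Q15_Multiply is a hypothetical helper for 16-bit operands whose small is 2.0**(-15):

   with Interfaces; use Interfaces;

   function Q15_Multiply (A, B : Integer_16) return Integer_16 is
      --  The double-length product carries small 2.0**(-30); dividing
      --  by 2**15 (a shift on most machines) restores 2.0**(-15).
      Wide : constant Integer_32 := Integer_32 (A) * Integer_32 (B);
   begin
      --  A = B = Integer_16'First overflows the result range; a real
      --  routine would saturate or accept the Constraint_Error.
      return Integer_16 (Wide / 2**15);
   end Q15_Multiply;

On a machine with a fast shifter this is one multiply plus one shift, which is Tuck's point; the extra cost appears only when smalls are not powers of two.]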

Wed, 10 May 2000 03:00:00 GMT 


Ken Garlington #3 / 48

fixed point vs floating point
Quote:
> <<The primary reason we use floating point and not fixed point is that
> fixed point is, to us, unreliable. By that I mean the ultimate result
> is not what we want. We agree that fixed point is much faster, but
> floating point is, almost always, plenty fast enough.>>
> *very odd*
> First, please do some experiments, all of those who believe that fixed-point
> is faster than floating-point. Especially if you include some multiplications,
> and especially if you are using precisions greater than 32 bits on a 32-bit
> machine, you are going to be MIGHTY surprised, I would even say shocked,
> by what you see. Fixed point arithmetic other than very simple addition
> and subtraction of identical types (which is the same speed as fpt on
> typical machines) is drastically slower than floating-point. If you don't
> believe me, try it out and see!
This is probably true if you include your earlier qualifier -- on "modern" architectures. On MIL-STD-1750s, however, most implementations have far more clock cycles required for floating-point multiplies and divides than for integer shifts -- on the order of 30-50 cycles for the floats vs. 5-8 for the shifts. There is also a penalty if you're dealing with slow memory, and having to fetch the 32-bit float vs. the 16-bit integer. So, IF...

 - you can live with the precision of 16 bits, AND
 - you use power-of-two deltas, AND
 - you have an Ada compiler that optimizes well for fixed-point

you could get better performance for fixed-point (see the sketch at the end of this post). I was on a project where one team used fixed-point extensively, and another used mostly floating-point. The fixed-point and floating-point appeared to be about the same performance-wise, mostly because the Ada 83 compiler used did not do a lot of fixed-point optimization.

Quote:
> Second, fixed-point is FAR more reliable in practice, because as Bob
> Eachus notes, the error analysis is far simpler. Mars seems to say that
> fixed-point is unreliable because of incompetent programmers and
> unreliable compilers, but this is a comment on a particular set of
> incompetent programmers and unreliable compilers, not on the fundamental
> nature of fixed vs floating-point.
We did notice our compiler generating incorrect object code for fixed-point, but I agree that fixed-point is more reliable in the abstract.

Quote:
> In my experience programmers are woefully ignorant about floating-point
> (see the completely extraordinary thread on CLA), and this leads to a lot
> of serious problems in floating-point code.
> Interestingly, in highly critical systems, the objection to fpt is related
> to Mars' concern about reliability. The general perception is that fpt
> hardware is too complex to be reliable. Intel's widely publicized
> difficulties in this area can only help to confirm this worry!
We've never had anyone raise this objection in our systems, although I have yet to use a 1750 CPU that didn't have a bug in its floating point implementation. I guess it's due to the "obvious" fact that hardware is more reliable than software :)
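[Editor's note: the kind of declaration Ken's three conditions describe might look like the following; Angle is a hypothetical name, and the Size clause assumes a compiler that supports 16-bit fixed-point representations:

   --  16 bits, power-of-two delta: rescaling compiles to cheap shifts
   type Angle is delta 2.0 ** (-6) range 0.0 .. 360.0;
   for Angle'Size use 16;   --  360 * 64 = 23040 values fit in 16 bits
]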

Wed, 10 May 2000 03:00:00 GMT 


Matthew Heaney #4 / 48

fixed point vs floating point
Quote:
> Sure, if you never do multiplications, then the penalty for fixed-point
> can be close to zero (compared with floating-point), unless of course
> you want more than 32 bits of precision and you are on a 32-bit machine;
> then fixed-point gets very expensive.
Let's change the problem a bit. What's more expensive: to calculate the sine of a fixed point number with a binary small, or to calculate the sine of a floating point number? What about making the fixed point 32 bits instead of 64 -- will that make it more efficient?

That processor Intermetrics is building an Ada compiler for (SHARC, or something like that) comes with "fixed point math in hardware." Will that make any difference in efficiency?

BTW: how do I calculate the sine of a fixed point number?

Matthew Heaney
Software Development Consultant
(818) 985-1271

Wed, 10 May 2000 03:00:00 GMT 


Robert Dewar #5 / 48

fixed point vs floating point
Tuck said

<<I'm not sure to what Robert is referring here. Fixed + Fixed is as efficient as the normal integer operation, as are Fixed - Fixed, Fixed * Integer, and Fixed / Integer. The only operations that are potentially inefficient are Fixed * Fixed and Fixed / Fixed, neither of which is particularly likely when the fixed-point type represents an angle ;). Even if you do use Fixed * Fixed or Fixed / Fixed, the inefficiency has to do with sometimes having to shift the result or one of the operands after/before performing the operation (presuming binary smalls). If the machine has efficient shifting, this is not a major overhead. However, it may be that on some compilers the fixed-fixed multiplication/division operations are handled out of line, and the procedure-call overhead is the primary added expense.>>
Surely, Tuck, you know that on almost all modern machines, integer multiplication is much slower than floating-point multiplication? So even without the shift, you are behind. Sure, if you never do multiplications, then the penalty for fixed-point can be close to zero (compared with floating-point), unless of course you want more than 32 bits of precision and you are on a 32-bit machine; then fixed-point gets very expensive.

Wed, 10 May 2000 03:00:00 GMT 


Geert Bosch #6 / 48

fixed point vs floating point
Matthew wrote:

<<Anyway, I'll accept that fixed point operations are slower, but the question remains: Why do most programmers insist on using floating point when a fixed point type better models the abstraction? Is the reason solely based on efficiency? By the way, are there off-the-shelf subprograms (in Ada preferably, but I'd take any language) for computing the elementary functions (sin, cos, etc) of a fixed point angle?>>

Maybe your second question answers the first. It is really useless to try hard to calculate a sine using integer arithmetic when you have a perfectly fine, very fast, well-tested floating-point unit in your computer that can calculate the sine with a delta smaller than 0.000000000000001, probably. Converting a fixed point value to fpt, issuing the sine instruction, and converting it back is much faster than even thinking about doing it in fixed point. ;)

Regards,
   Geert
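[Editor's note: a sketch of Geert's suggestion. Heading is an invented type; Sin with a Cycle parameter is the Ada 95 Ada.Numerics.Elementary_Functions version, so the argument can stay in degrees:

   with Ada.Text_IO;
   with Ada.Numerics.Elementary_Functions;

   procedure Sine_Via_Float is
      use Ada.Numerics.Elementary_Functions;
      type Heading is delta 0.1 range 0.0 .. 360.0;
      H : constant Heading := 30.0;
      S : Float;
   begin
      --  fixed -> float, sine in the FPU, result used or converted back
      S := Sin (Float (H), Cycle => 360.0);   --  0.5 for 30 degrees
      Ada.Text_IO.Put_Line (Float'Image (S));
   end Sine_Via_Float;
]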

Thu, 11 May 2000 03:00:00 GMT 


Matthew Heaney #7 / 48

fixed point vs floating point
Quote:
> Maybe your second question answers the first. It is really useless to
> try hard to calculate a sine using integer arithmetic when you have a
> perfectly fine and very fast, well tested floating-point unit in your
> computer that can calculate the sine with a delta smaller than
> 0.000000000000001, probably.
> Converting a fixed point value to fpt, issuing the sine instruction,
> and converting it back is much faster than even thinking about doing it
> in fixed point. ;)
Sherman, Peabody: I think it's time for us to step into the wayback machine. Remember that thread a few months back about floating point comparison? As Robert reminded us, you have to know what you're doing when you use floating point arithmetic. I was trying to avoid floating point types, because I'm not a numerical analyst, and I don't want to just *hope* my floating point calculations produce a correct result. I want to *know* they're correct. My error analysis is simpler if I use a fixed point type, right?

If my abstraction calls for a real type having an absolute error and not a relative error, then clearly a fixed point type is called for, even if it's less efficient, right? Isn't it always true that we should code for correctness first, by using a type that directly models the abstraction, and then optimize only if necessary? Are you making an exception to this rule for fixed point types?

You seem to be suggesting -- in the name of efficiency -- that we convert a fixed point number into a floating point number, calculate the sine, and then convert the result back into fixed point. OK, fair enough, but what is the accuracy of the result? Isn't this exactly the kind of thing Robert was warning us about? How much precision does one require for the floating point type? And are you sure that the conversion process doesn't add more expense than just calculating the sine directly in fixed point?

Tell me, Geert, why do we even have fixed point in the language at all, if it's too inefficient to use? All my Ada books tell me to use a fixed point type when my abstraction has absolute error. But I don't remember them saying, Do not use fixed point types because they're too inefficient. None say "you shouldn't even think about calculating a sine using fixed point." Under what circumstances do you use fixed point? Or are you saying never to use fixed point?

I still would like someone to give me a reason -- unrelated to efficiency -- why you'd model a heading (or any other kind of angle) using a floating point type instead of a fixed point type.

Matthew Heaney
Software Development Consultant
(818) 985-1271

Thu, 11 May 2000 03:00:00 GMT 


Robert Dewar #8 / 48

fixed point vs floating point
Matthew says

<<Remember that thread a few months back about floating point comparison? As Robert reminded us, you have to know what you're doing when you use floating point arithmetic. I was trying to avoid floating point types, because I'm not a numerical analyst, and don't want to just *hope* my floating point calculations produce a correct result. I want to *know* they're correct. My error analysis is simpler if I use a fixed point type, right? If my abstraction calls for a real type having an absolute error and not a relative error, then clearly a fixed point type is called for, even if it's less efficient, right? Isn't it always true that we should code for correctness first, by using a type that directly models the abstraction, and then optimize only if necessary? Are you making an exception to this rule for fixed point types?>>

When Robert was reminding you that you need to know what you are doing when using floating-point arithmetic, he did not for a moment mean to say that somehow the problem is trivial in fixed-point arithmetic. Computing trig functions in limited-precision fixed-point and retaining sufficient accuracy is a VERY hard problem.

Note that neither fixed-point nor floating-point models the abstraction, since the abstraction is real! The only difference is whether the error control is relative or absolute. For some purposes relative error is *easier* to analyze than *absolute* error, but there is no sense in which fixed-point is somehow a more accurate abstract representation of real than floating-point!
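[Editor's note: stated symbolically, Robert's distinction is the standard one. Rounding a real x to floating point bounds the relative error; rounding it to fixed point bounds the absolute error:

   |fl(x) - x| <= eps * |x|     (floating point; eps = machine epsilon)
   |fx(x) - x| <= small / 2     (fixed point; small = the type's quantum)

Neither bound makes one representation a more faithful model of the reals; they are simply different shapes of error.]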

Thu, 11 May 2000 03:00:00 GMT 


Tom Moran #9 / 48

fixed point vs floating point
Quote:
> Converting a fixed point value to fpt, issuing the sine instruction,
> and converting it back is much faster than even thinking about doing it
> in fixed point. ;)
But "type degrees is delta 1.0/16 range 0.0 .. 360.0;" has only 5760 possible values, and a table lookup with a table of 5760 entries is often quite reasonable, and surely faster than converting to fp, calculating a sin, and converting back. Not to mention if the function is not just sin, but, say the 6th power of the cos (in a Phong illumination calculation, say). I can't seem to find my Cody&Waite, but I think they give fixed point algorithms, and I also think I've seen mention of Ada code for it in the PAL or some such public place.

Thu, 11 May 2000 03:00:00 GMT 


Herman Rubin #10 / 48

fixed point vs floating point
Quote:
> Matthew says
> <<Remember that thread a few months back about floating point comparison?
> As Robert reminded us, you have to know what you're doing when you use
> floating point arithmetic. I was trying to avoid floating point types,
> because I'm not a numerical analyst, and don't want to just *hope* my
> floating point calculations produce a correct result. I want to *know*
> they're correct. My error analysis is simpler if I use a fixed point type,
> right?
> If my abstraction calls for a real type having an absolute error and not a
> relative error, then clearly a fixed point type is called for, even if it's
> less efficient, right? Isn't it always true that we should code for
> correctness first, by using a type that directly models the abstraction,
> and then optimize only if necessary? Are you making an exception to this
> rule for fixed point types?>>
In principle, there is no such thing as floating point arithmetic. Fixed point arithmetic can be done to arbitrary precision; to do floating arithmetic to much more precision than what is provided, it is necessary to emulate the floating in fixed, while doing fixed point in floating is quite clumsy.

That fixed point arithmetic is so slow on computers is due to the fact that the hardware manufacturers, partly influenced by those gurus who do not see the uses of fixed point, have combined several fixed point instructions to produce floating point in such a way that it is very difficult to use them for anything else. It would not be difficult for the hardware to allow the fixed point part of the floating unit to be available to the user, and it would allow considerable speedup.

--
This address is for information only. I do not claim that these views are those of the Statistics Department or of Purdue University.
Herman Rubin, Dept. of Statistics, Purdue Univ., West Lafayette, IN 47907-1399

Fri, 12 May 2000 03:00:00 GMT 


Vince Del Vecchio #11 / 48

fixed point vs floating point
Quote:
> Let's change the problem a bit. What's more expensive: to calculate the
> sine of a fixed point number with a binary small, or to calculate the
> sine of a floating point number? What about making the fixed point
> 32 bits instead of 64 -- will that make it more efficient?
> That processor Intermetrics is building an Ada compiler for (SHARC, or
> something like that) comes with "fixed point math in hardware." Will that
> make any difference in efficiency?
The SHARC (and other Analog Devices DSPs) only support fixed-point math with a fixed binary point. Specifically, they support

   type Signed_Fixed is delta 2.0 ** (-Word_Size + 1) range -1.0 .. 1.0;

and

   type Unsigned_Fixed is delta 2.0 ** (-Word_Size) range 0.0 .. 1.0;

where small == delta, the upper endpoint is not a member of the type, and, for the SHARC, Word_Size == 32. The difference from many other processors is that they have operations for

   Signed_Fixed * Signed_Fixed => Signed_Fixed
   Unsigned_Fixed * Unsigned_Fixed => Unsigned_Fixed

where the shifting you need to do is built into the instruction. For arbitrary smalls, it probably won't be any faster than on other processors, but if you restricted yourself to certain smalls, it would be possible to be as efficient as floating-point.

On the other hand, for the SHARC specifically, there is no fixed point sin routine in the C runtime library (or the Ada runtime library, for that matter). That is mostly because the processor also has floating point, and I think most people use the floating point. Fixed point transcendental operations are much more prevalent on our fixed-point-only processors.

Vince Del Vecchio
Not speaking for Analog Devices

Fri, 12 May 2000 03:00:00 GMT 


Geert Bosch #12 / 48

fixed point vs floating point
``If my abstraction calls for a real type having an absolute error and not a relative error, then clearly a fixed point type is called for, even if it's less efficient, right? Isn't it always true that we should code for correctness first, by using a type that directly models the abstraction, and then optimize only if necessary? Are you making an exception to this rule for fixed point types? You seem to be suggesting -- in the name of efficiency -- that we convert a fixed point number into a floating point, calculate the sine, and then convert the result back into fixed point. OK, fair enough, but what is the accuracy of the result?''

The fixed point type is the abstraction that you do calculations with when you are calculating with currencies, for example. The heading of a ship is of course a real value, so any difference between this real value and the calculated one (whether fixed or float) is error for you.

When converting a real value between 0.0 and 360.0 to a value in radians and then taking the sine using floating point, you will have a result that has a certain relative error. Since you also know that the result lies in the interval -1.0 .. 1.0, you have a result with a known absolute precision: with |sin x| <= 1, a relative error bound is also an absolute error bound. On typical hardware this absolute error will be much smaller than 1.0E-12.

Regards,
   Geert

Disclaimer: I'm not a numerical analyst!

Fri, 12 May 2000 03:00:00 GMT 



Robert Dewar #14 / 48

fixed point vs floating point
Herman says

<<That fixed point arithmetic is so slow on computers is due to the fact that the hardware manufacturers, partly influenced by those gurus who do not see the uses of fixed point, have combined several fixed point instructions to produce floating point in such a way that it is very difficult to use it for anything else. It would not be difficult for the hardware to allow the fixed point part of the floating unit to be available to the user, and it would allow considerable speedup.>>

There is no problem in implementing fast integer or fractional integer multiply and divide, using the same techniques as are used for floating-point. However, apart from a few people on soapboxes, there is no demand for such instructions, let alone fast implementations of them (the Transputer is one of the very few machines that provides a fractional binary scaled multiply instruction).

Fri, 12 May 2000 03:00:00 GMT 


Joe Gwinn #15 / 48

fixed point vs floating point
Quote:
> There is no problem in implementing fast integer or fractional integer
> multiply and divide, using the same techniques as are used for
> floating-point. However, apart from a few people on soapboxes, there is
> no demand for such instructions, let alone fast implementations of them
> (the Transputer is one of the very few machines that provides a
> fractional binary scaled multiply instruction).
Robert also said in some other message that floating point is faster than integer simply because the hardware is designed that way. This is certainly true in general, and was true for us. Performance follows resources. I don't know what came over me. I never agree with Robert. It won't happen again.

Until perhaps 5 years ago, we always used fixed point (scaled binary) in the code, regardless of language (except Ada 83, actually; see below), because we couldn't afford to design good enough floating point hardware into the computers. In those days, we built our own purpose-built single-board computers (SBCs) for major realtime projects. What we couldn't afford was SBC real estate, not money for components. The FP hardware would always be in the original plan, but soon got pushed off the board by some other more important hardware component, most often added onboard memory. Push come to shove, more memory always won.

Now the issue is gone, for two reasons. First, the floating point hardware is generally built into the processor chips, so there is no real estate to save. Second, we no longer build our own SBCs for standard busses such as VME; we buy them, unless cornered.

The fixed point arithmetic in the Ada 83 compilers that we used and tested in those days didn't work well or even correctly in many cases, so we never used it. I don't offhand recall doing an Ada job that didn't use floating point; Ada 83 was mostly used on the larger computers, such as VAXes and Silicon Graphics boxes, so the issue never came up. The exception may have been XD Ada from System Designers (where are they now?) on the 68020.

So, why do we prefer floating point, speed aside? Because it's easier to program with, as scaling issues go away, and so it's much easier to get right. Remember, the traditional comparison was with manually implemented scaled binary. The fact that floating point arithmetic is approximate isn't much of an issue for most physics-based mathematical code, and those places that are exceptions are still *much* less trouble to deal with than the traditional scaled binary.

A random note: When declaring an angle in Ada, it's necessary to declare the range as something like -0.1 to 360.1 degrees (not 0 to 360 degrees), to prevent the fuzziness of floating point arithmetic from causing random-appearing constraint errors. The fuzz can extend ever so slightly over the line, causing unnecessary and unexpected constraint errors if one tries to set the range too tightly.

With Ada 95 (which I assume has fixed the Ada 83 problems with fixed-point arithmetic) on modern machines, use of fixed point has never been proposed, simply because floating point is still simpler to program, because scaling is handled automatically, and because more programmers are familiar with floating point than with scaled fixed point. One less thing to worry about, and to get wrong.

As for speed, float versus fixed, in practice it seems to depend more on the compiler and its level of excessive generalization and code optimization (and level of overly paranoid and/or redundant range checking) than on the raw hardware speed of arithmetic. This is especially true of math library elements such as Sine and Cosine and Square Root, which, being precompiled, don't usually respond to changes in the optimization settings. Again, performance follows engineering resources.

In one recent example, Square Root was taking 60 microseconds on a 50-MHz 68060; this should take no more than 2 us. Where did the time go?
To range check after range check. To internal use of 96-bit extended floats (which must be done in the 68060 FP instruction emulation library rather than the CPU hardware), all to implement a 32-bit float function. To functions within functions, all checking everything. Etc. Same kind of story with the transcendental functions, only worse. One needs to read the generated assembly code to know what a given compiler is up to.

We only need a few functions, specifically Sine, Cosine, Arc Tangent, and Square Root; speed is very much of the essence; and we need only 10e-4 to 10e-5 (16-bit) accuracy anyway. So we will write our own versions of these functions, in Ada, C, or even assembly, as determined by performance tests.

Someone asked how to compute the Sine in fixed point; nobody answered. There are three traditional methods: polynomial approximations, brute-force table lookup, or table lookup with interpolation. These methods are approximately equal in speed when it's all said and done, although brute-force table lookup can trade memory for speed better than polynomials, while polynomials can be faster than table lookup on some machines because polynomials can avoid float-fix conversions and index arithmetic and range checks (to prevent accessing beyond the ends of the table). Any of these methods can outperform the standard mathlib functions precisely because they do exactly what's asked, and nothing more.

As for the polynomial approximations, the bible to this day is "Approximations for Digital Computers"; Cecil Hastings, Jr.; Princeton University Press; 1955. This has been a classic since its publication. In those days, the computers were small, so the programmers had to be very focused on performance.

For computation of polynomials in real systems, convert the published polynomials into Horner's form. This is discussed in many texts on numerical methods. In short, a + bX + cX^2 becomes a + X(b + cX), which is faster to compute and has better numerical properties than the standard power-series form.

Joe Gwinn
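[Editor's note: an illustrative instance of the Horner's-form advice above. The coefficients are truncated Taylor terms for sin((Pi/2)*X) on 0.0 .. 1.0 (worst-case error about 5.0E-3 near X = 1.0), not Hastings' minimax values; a production routine would take its coefficients from Hastings' tables:

   function Approx_Sin (X : Float) return Float is
      --  sin((Pi/2)*X) ~= X*(C1 + X**2*(C2 + X**2*C3)) in Horner's form;
      --  the polynomial is odd, so one X is factored out first.
      C1 : constant := 1.570_796;    --   Pi/2
      C2 : constant := -0.645_964;   -- -(Pi/2)**3 / 6
      C3 : constant := 0.079_693;    --  (Pi/2)**5 / 120
      X2 : constant Float := X * X;
   begin
      return X * (C1 + X2 * (C2 + X2 * C3));
   end Approx_Sin;
]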

Sat, 13 May 2000 03:00:00 GMT 


