FPU instructions etc.

Hello, (world!)
I am just about to enter the wonderful world of the math co-processor. I
was wondering if someone could point out some good resources explaining
things.
And what's this thing about 'unrolling' I've seen in some code snippets?
If you unroll this a couple of times it'll be faster? Huh?

Appreciated,
Jussi

Wed, 13 Nov 2002 03:00:00 GMT
FPU instructions etc.
Vulture's site:
http://www.ice-digga.com/programming/vul.html - I haven't looked at his
FPU tutorial, but hey, see how you like it :)

Quote:

> Hello, (world!)
> I am just about to enter the wonderful world of the math co-processor. I
> was wondering if someone could point out some good resources explaining
> things.
> And whats this thing about 'unrolling' I've seen in some code snippets?
> If you unroll this a couple of times it'll be faster? Huh?

> Appreciated,
> Jussi

--

Team2k PC/Palm Pilot Programming Team:
http://ppilot.homepage.com

To email me, remove '3*&' from my email address. This is to deter spam :)

Wed, 13 Nov 2002 03:00:00 GMT
FPU instructions etc.

Quote:

> I am just about to enter the wonderful world of the math co-processor. I
> was wondering if someone could point out some good resources explaining
> things.

I haven't found a good one other than the original documentation.

Quote:
> And whats this thing about 'unrolling' I've seen in some code snippets?
> If you unroll this a couple of times it'll be faster? Huh?

Take the loop:

BASIC:
let y=0
for x=1 to 10
y=y+x
next x

C++:
int y=0;
for (int x=1;x<11;x++) y+=x;

Now, such a loop, when executed, would spend a lot of time in the looping
code and not a lot of time in the addition stage.  In pseudo-machine
language, it would look like this:

y=0;
x=1;
loop:   is x<11?
if no, exit loop
add x to y      // of interest
add 1 to x
branch back up to loop
exit loop:
.
.
.

Now, the line of interest above is the only line that actually does
something computation-wise.  The rest of it is just to control the program
flow.  Stepping through all that code ten times wastes time, and on modern
processors the branching bit takes relatively far more time than the
addition part.  So a good compiler would optimise it by unrolling the
loop, thus:

y=0
y=y+1
y=y+2
y=y+3
y=y+4
y=y+5
y=y+6
y=y+7
y=y+8
y=y+9
y=y+10

This would be unrolling the loop ten times.  All I've done is take away
the control code and list the 'inner loop' the number of times it would
actually execute.

You could also unroll it just once by getting the loop to do two additions
per pass (so it loops five times), or have it do five additions per pass
and loop twice.  You'll note my 'optimisation' above doesn't account for x
should it be needed later on, and any self-respecting compiler would
optimise the whole lot to simply y=55.
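
As a rough sketch of the partial unrolling described above, here is the
same sum in C++ with two additions per trip through the loop.  The function
names sum_rolled and sum_unrolled_by_2 are made up for illustration, and the
unrolled version assumes the trip count is even.

```cpp
#include <cassert>

// Straightforward loop: one addition per compare-and-branch.
int sum_rolled(int n) {
    int y = 0;
    for (int x = 1; x <= n; x++) {
        y += x;
    }
    return y;
}

// Unrolled by a factor of two: two additions per compare-and-branch,
// so the loop body is entered half as many times.
// Assumption: n is even (there is no cleanup code for a leftover pass).
int sum_unrolled_by_2(int n) {
    assert(n >= 0 && n % 2 == 0);
    int y = 0;
    for (int x = 1; x <= n; x += 2) {
        y += x;
        y += x + 1;
    }
    return y;
}
```

For n = 10 both functions return 55; the second one just makes five trips
through the loop instead of ten.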

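Modern processors also hide some of the loop overhead themselves: branch
prediction and out-of-order execution let the next pass start before the
branch has been resolved, which amounts to a limited form of unrolling in
hardware.  In fact, unrolling loops can decrease execution speed in certain
cases, for example when the bigger code no longer fits in the instruction
cache.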

Unrolling loops is all about increasing speed.  The tradeoff is that the
program size generally gets bigger.
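
To make the size tradeoff concrete, here is a sketch of unrolling by four
when the trip count is not known to be a multiple of four.  The name
sum_unrolled_by_4 is hypothetical.  Note the extra cleanup loop: it is part
of the price paid in code size.

```cpp
#include <cassert>

// Sum 1..n with the loop body unrolled four times.
// A cleanup loop handles the 0 to 3 passes left over when n is not
// a multiple of four.
int sum_unrolled_by_4(int n) {
    assert(n >= 0);
    int y = 0;
    int x = 1;
    // Main loop: four additions per compare-and-branch.
    for (; x + 3 <= n; x += 4) {
        y += x;
        y += x + 1;
        y += x + 2;
        y += x + 3;
    }
    // Cleanup loop: whatever is left, one addition at a time.
    for (; x <= n; x++) {
        y += x;
    }
    return y;
}
```

The function now contains two loops and five addition statements where the
original had one of each, which is the size growth in miniature.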

Richard Cavell

Thu, 14 Nov 2002 03:00:00 GMT
FPU instructions etc.
For basic info, see Ch 14 of "The Art of Assembly Language Programming"
at http://webster.cs.ucr.edu
Randy Hyde

Quote:
> Hello, (world!)
> I am just about to enter the wonderful world of the math co-processor. I
> was wondering if someone could point out some good resources explaining
> things.
> And whats this thing about 'unrolling' I've seen in some code snippets?
> If you unroll this a couple of times it'll be faster? Huh?

> Appreciated,
> Jussi

Sat, 16 Nov 2002 03:00:00 GMT
