Double precision runs faster than single precision in MS Fortran
 Double precision runs faster than single precision in MS Fortran

I have a program that uses straightforward multiplication
to do convolution of discrete probabilities.  I have a double
precision version and a single precision version.  The only
difference is that the real variables are declared in single
precision in one and in double precision in the other.  They both
give similar results.  One strange thing is: when compiled with MS
PowerStation the double precision one runs slightly faster than the
single precision one.  I tried MS Fortran 5.1 and the same thing
happened.  Since the program uses a lot of MS extensions (all F90
compatible) I cannot try it on our SUN F77 compiler.
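
For reference, a rough sketch of the kind of double precision loop involved
(variable and routine names here are illustrative, not from the actual program;
the single precision version would simply use REAL*4 throughout):

      SUBROUTINE CONVOL(P, NP, Q, NQ, R)
C     Sketch only: straightforward convolution of two discrete
C     distributions P(1..NP) and Q(1..NQ); result in R(1..NP+NQ-1).
      INTEGER NP, NQ, I, J
      REAL*8 P(NP), Q(NQ), R(NP+NQ-1)
      DO 10 I = 1, NP+NQ-1
         R(I) = 0.0D0
   10 CONTINUE
      DO 30 I = 1, NP
         DO 20 J = 1, NQ
            R(I+J-1) = R(I+J-1) + P(I)*Q(J)
   20    CONTINUE
   30 CONTINUE
      RETURN
      END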

Any similar experience or insight?

J.G.Liao



Tue, 28 Jan 1997 06:11:53 GMT  
 Double precision runs faster than single precision in MS Fortran
  I get faster execution time in double precision on the IBM 590 workstation,
which is a 64-bit machine. I was told single precision on this platform has to
be promoted to 64 bits, processed, and then demoted to single precision again. I
guess any possible cache benefit from using only 32 bits does not overcome this
disadvantage, at least for my code. I don't understand why they just don't
define 64 bits as their single precision size like the Cray does?

Regards,

Tom



Tue, 28 Jan 1997 23:54:03 GMT  
 Double precision runs faster than single precision in MS Fortran

Quote:

>   I get faster execution time in double precision on the IBM 590 workstation,
> which is a 64-bit machine. I was told single precision on this platform has to
> be promoted to 64 bits, processed, and then demoted to single precision again. I
> guess any possible cache benefit from using only 32 bits does not overcome this
> disadvantage, at least for my code. I don't understand why they just don't
> define 64 bits as their single precision size like the Cray does?

We didn't do that because it would have taken away useful functionality for
those who need the storage density benefits of (32-bit) single precision
more than they need the computational speed of (64-bit) double precision.

If you are looking for a way to do everything in double precision without
changing your source, you should look at the various forms of the
autodbl option.
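
For instance (an illustration only; the file name is made up, and you should
check your xlf manual for the exact forms, such as dbl, dbl4 and dblpad, that
your release supports), a compile line along the lines of

      xlf -O -qautodbl=dbl4 convolve.f -o convolve

promotes the default 4-byte REAL entities to DOUBLE PRECISION at compile time
without touching the source.
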
--



Wed, 29 Jan 1997 14:37:24 GMT  
 Double precision runs faster than single precision in MS Fortran

Quote:
>  I get faster execution time in double precision on the IBM 590 workstation,
>which is a 64-bit machine. I was told single precision on this platform has to
>be promoted to 64 bits, processed, and then demoted to single precision again. I
>guess any possible cache benefit from using only 32 bits does not overcome this
>disadvantage, at least for my code. I don't understand why they just don't
>define 64 bits as their single precision size like the Cray does?

Because if you're using _large_ arrays, the difference in swapping time
will be far more important than the difference in computing time,
especially if you can't/don't keep the references local.

The supercomputers have very high memory bandwidth and they usually
don't use virtual memory, so they are not affected by the same
problems as your 32-bit workstation.

Dan
--
Dan Pop
CERN, CN Division

Mail:  CERN - PPE, Bat. 31 R-004, CH-1211 Geneve 23, Switzerland



Thu, 30 Jan 1997 06:47:30 GMT  
 Double precision runs faster than single precision in MS Fortran

Quote:
>I have a program that uses straightforward multiplication
>to do convolution of discrete probabilities.  I have a double
>precision version and a single precision version.  The only
>difference is that the real variables are declared in single
>precision in one and in double precision in the other.  They both
>give similar results.  One strange thing is: when compiled with MS
>PowerStation the double precision one runs slightly faster than the
>single precision one.  I tried MS Fortran 5.1 and the same thing
>happened.  Since the program uses a lot of MS extensions (all F90
>compatible) I cannot try it on our SUN F77 compiler.

>Any similar experience or insight?

This is due to the PC FPU architecture. All the computations are performed
internally in extended double precision and the results are converted to
single or double precision when they are stored in variables. It seems
that the conversion to/from double precision is faster than the
conversion to/from single precision.
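
You can see the effect for yourself by timing the same kernel in both
precisions. A rough sketch (the workload is arbitrary, and SECNDS() is a
VAX/MS extension; substitute whatever timer your compiler provides):

      PROGRAM TIMEIT
C     Time the same accumulation loop in REAL*4 and REAL*8.
      INTEGER I, N
      PARAMETER (N = 10000000)
      REAL*4 S4, T0
      REAL*8 S8
      T0 = SECNDS(0.0)
      S4 = 0.0
      DO 10 I = 1, N
         S4 = S4 + 1.0/REAL(I)
   10 CONTINUE
      WRITE (*,*) 'REAL*4 time:', SECNDS(T0), '  sum =', S4
      T0 = SECNDS(0.0)
      S8 = 0.0D0
      DO 20 I = 1, N
         S8 = S8 + 1.0D0/DBLE(I)
   20 CONTINUE
      WRITE (*,*) 'REAL*8 time:', SECNDS(T0), '  sum =', S8
      END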

You may get similar results on the IBM RS/6000, which performs all its
floating point calculations in double precision, but on other RISC
platforms the single precision version will usually be faster.

Dan
--
Dan Pop
CERN, CN Division

Mail:  CERN - PPE, Bat. 31 R-004, CH-1211 Geneve 23, Switzerland



Thu, 30 Jan 1997 06:41:12 GMT  
 Double precision runs faster than single precision in MS Fortran

Quote:


> >  I get faster execution time in double precision on the IBM 590 workstation,
> >which is a 64-bit machine. I was told single precision on this platform has to
> >be promoted to 64 bits, processed, and then demoted to single precision again. I
> >guess any possible cache benefit from using only 32 bits does not overcome this
> >disadvantage, at least for my code. I don't understand why they just don't
> >define 64 bits as their single precision size like the Cray does?

> Because if you're using _large_ arrays, the difference in swapping time
> will be far more important than the difference in computing time,
> especially if you can't/don't keep the references local.

> The supercomputers have a very high memory bandwidth and they don't
> use, usually, virtual memory, so they are not affected by the same
> problems as your 32 bit workstation.

> Dan

  I think it has been the consensus that portability was IBM's highest concern on this
issue, rather than a small performance increase. As far as compensating for a virtual
memory system, what issues would one have to address that are different from those for
making efficient use of cache? Both want to maximize data locality for computation.
What else is there to address? A direct-mapped cache adds a few problems, but
padding your arrays properly should address that issue. Am I underestimating the
problem here? If so, please tell me why, because I would like to know.

  We used an old Cray trick to solve the scaling problem of cache-based machines for
larger data sets. In your data-crunching subroutines, instead of calling the routine once
and passing it a set of huge arrays to work on, call it 10, 100, 1000, etc. times and only
send a section of your array for processing each time. On the Cray we use this for
multitasking, i.e. splitting the problem among processors, but the same technique works
well, at least for some problems, for making your calculations cache efficient on RISC
hardware.
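
In Fortran terms the idea is roughly the following (the routine name, array
size and chunk size are made up for illustration, not taken from our code):

      PROGRAM BLOCKD
C     Sketch of the blocking trick described above: process the
C     array in cache-sized sections instead of all at once.
      INTEGER N, NCHUNK, I, LSEC
      PARAMETER (N = 1000000, NCHUNK = 10000)
      REAL*8 A(N)
      DO 5 I = 1, N
         A(I) = DBLE(I)
    5 CONTINUE
      DO 10 I = 1, N, NCHUNK
         LSEC = MIN(NCHUNK, N - I + 1)
         CALL CRUNCH(A(I), LSEC)
   10 CONTINUE
      END

      SUBROUTINE CRUNCH(X, M)
C     Works on one section at a time; on the Cray the same structure
C     lets each section go to a different processor.
      INTEGER M, J
      REAL*8 X(M)
      DO 20 J = 1, M
         X(J) = X(J)*X(J) + 1.0D0
   20 CONTINUE
      RETURN
      END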

  Several individuals suggested the use of the IBM xlf auto-double compile options. That
had mixed results in our original tests: some programs performed well with these options,
others started giving NaN results when they previously worked. It was easier to write a
sed script that explicitly gave us REAL or DOUBLE PRECISION as the default, and to use a
defined constant for the number of bytes in a floating point value/array for certain
system routines that were called. Some IBM guy took a copy of our code that the
auto-double option choked on, so maybe that will be fixed in the next patch or two.

Thanx for the ideas/comments/experiences everyone sent,

Tom



Sat, 01 Feb 1997 04:07:28 GMT  
 