Fast memory fill 

Hi,

Quote:
> Are there any ways to make this faster?  For example, can MMX instructions
> be used to fill 64 bits at a time (I have zero knowledge of MMX, just
> wondering).

You can use MMX, and it's faster if the destination is not in the Level 1
cache. If it is already in the L1 cache, the rep stosd version is the fastest
one. Even using the SIMD FPU (SSE) to store 128 bits at a time isn't faster.
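
To make "fill 64 bits at a time" concrete, here is a minimal portable C sketch of the idea. It is not MMX code: it only shows the access pattern that a MOVQ loop would use, and the function name fill32_by_qwords is made up for this illustration.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Fill `count` 32-bit words at `dst` with `value`, storing 8 bytes per
   iteration. Hypothetical helper, standing in for an MMX MOVQ loop. */
static void fill32_by_qwords(uint32_t *dst, uint32_t value, size_t count)
{
    /* Both halves hold the same word, so byte order does not matter. */
    uint64_t pair = ((uint64_t)value << 32) | value;
    unsigned char *p = (unsigned char *)dst;
    size_t pairs = count / 2;

    for (size_t i = 0; i < pairs; i++) {
        /* memcpy avoids alignment and aliasing problems; compilers
           normally turn it into a single 64-bit store. */
        memcpy(p, &pair, sizeof pair);
        p += sizeof pair;
    }
    if (count & 1)              /* odd count: one trailing 32-bit word */
        memcpy(p, &value, sizeof value);
}
```

Whether this beats a plain 32-bit loop depends on the CPU and on whether the destination is cached, so measure it on your target.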

MfG Alexander



Thu, 08 Aug 2002 03:00:00 GMT  

Quote:

> > Are there any ways to make this faster?  For example, can MMX instructions
> > be used to fill 64 bits at a time (I have zero knowledge of MMX, just
> > wondering).
> You can use MMX, and it's faster, if the destination is not in Level 1
> Cache, but if it is in L1-Cache, the rep stosd version is the fastest
> one. Even using the SIMD FPU to move 128 bits at one time isn't faster

This is a lot like the "fastest block copy" question, and is ultimately
processor dependent.  So you have to pick your target CPU[s].

Intel has special circuitry on P6s to speed up block string instructions
[REP  MOVS and STOS] and claims these are the fastest way to move/set
blocks.  This also has the virtue of being backwards compatible with
all x86's, even if it is suboptimal on some.
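
For reference, a REP STOSD fill can be sketched from C roughly as below. This assumes GCC or Clang inline assembly on x86 and falls back to a plain loop elsewhere; the wrapper name fill32_rep_stos is made up for this example.

```c
#include <stddef.h>
#include <stdint.h>

/* Fill `count` 32-bit words at `dst` with `value` using REP STOSD.
   Hypothetical wrapper; relies on the direction flag being clear, which
   the usual x86 calling conventions guarantee on function entry. */
static void fill32_rep_stos(uint32_t *dst, uint32_t value, size_t count)
{
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    /* EAX = fill value, (E/R)DI = destination, (E/R)CX = word count.
       "rep stosl" is the AT&T spelling of REP STOSD. */
    __asm__ volatile ("rep stosl"
                      : "+D"(dst), "+c"(count)
                      : "a"(value)
                      : "memory");
#else
    /* Portable fallback for other compilers and architectures. */
    while (count--)
        *dst++ = value;
#endif
}
```

This is the variant that runs on every x86, which is the backwards-compatibility point made above.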

AFAIK, AMD K6 and K7 cores do not have this circuitry, and AMD claims
MMX (MOVQ) instructions are the fastest.  This of course requires MMX
support that older Pentiums and PPros lack, so fallback instructions
are necessary.  For these older processors, FPU instructions are the
fastest (FLD/FST qword), sometimes even the slow integer variations
(FILD/FISTP qword).
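
One way to organize the per-CPU choice is to pick a fill routine once at startup and call it through a pointer. The sketch below only shows that dispatch structure: the cpu_class values are a hypothetical grouping taken from this post, how you detect the CPU is left out, and the two fill routines are portable C stand-ins rather than real REP STOSD, MOVQ or FLD/FST code.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef void (*fill_fn)(uint32_t *dst, uint32_t value, size_t count);

/* Hypothetical CPU grouping, following the post above. */
enum cpu_class { CPU_P6, CPU_K6_K7, CPU_PRE_MMX };

/* Stand-in for the REP STOSD path: one 32-bit store per word. */
static void fill_dwords(uint32_t *dst, uint32_t value, size_t count)
{
    for (size_t i = 0; i < count; i++)
        dst[i] = value;
}

/* Stand-in for the 64-bit paths (MOVQ, or FLD/FST qword on pre-MMX
   parts): one 8-byte store per pair of words, plus an odd tail. */
static void fill_qwords(uint32_t *dst, uint32_t value, size_t count)
{
    uint64_t pair = ((uint64_t)value << 32) | value;
    unsigned char *p = (unsigned char *)dst;

    for (size_t i = 0; i < count / 2; i++, p += sizeof pair)
        memcpy(p, &pair, sizeof pair);
    if (count & 1)
        memcpy(p, &value, sizeof value);
}

/* Choose a routine once; callers then fill through the pointer. */
static fill_fn choose_fill(enum cpu_class cpu)
{
    switch (cpu) {
    case CPU_P6:      return fill_dwords;  /* string instructions */
    case CPU_K6_K7:   return fill_qwords;  /* MMX-style 64-bit stores */
    case CPU_PRE_MMX: return fill_qwords;  /* FPU-style 64-bit stores */
    }
    return fill_dwords;                    /* safe default */
}
```

In a real program the two qword cases would point at different routines; they share one here only because the stand-in is the same.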

-- Robert



Thu, 08 Aug 2002 03:00:00 GMT  

Quote:

> AFAIK, AMD K6 and K7 cores do not have this circuitry, and AMD claims
> MMX (MOVQ) instructions are the fastest.  This of course requires MMX
> support that older Pentiums and PPro's lack, so fallback instructions
> are necessary.  For these older processors, FPU instructions are the
> fastest (FLD/FST qword), sometimes even the slow integer variations
> (FILD/FIST qword).

OK, I sorta suspected that.  Since portability (to Pentiums and their clones)
is also an issue, I suppose what I've already got is probably the best
all-around solution.

Thanks guys,

Mark



Thu, 08 Aug 2002 03:00:00 GMT  
 