FFT routine for F90 
Author Message
 FFT routine for F90

Hi !

I am looking for a fortran 90 function or module which can perform a
one-dimensional FFT and which shouldn't be limited by a maximum number of
data points. I found a lot of f77 routines out there but they either were
inaccurate or I couldn't port them to an f90 code (which is important for my
application).

Thanks

THorsten



Wed, 18 Jun 1902 08:00:00 GMT  
 FFT routine for F90

Quote:

> Hi !

> I am looking for a Fortran 90 function or module which can perform a
> one-dimensional FFT and which shouldn't be limited by a maximum number of
> data points. I found a lot of f77 routines out there but they either were
> inaccurate or I couldn't port them to an f90 code (which is important for my
> application).

> Thanks

> THorsten

See the aAlan Miller Page. There alot of routines in F90.  If you don't found
it. I can send you a copy.

Sincerely

Javier

--
                               \\|//
                               (o o)
--------------------------ooO---(_)---Ooo--------------------------
Dr. Javier Gonzalez Platas
Grupo de Rayox-X
Universidad de La Laguna        Tel (+34) 922 31 83 00
Dpto. Fisica Fundamental II         (+34) 922 31 83 01
E-30204 La Laguna. Tenerife     Fax (+34) 922 25 69 73



-------------------------------------------------------------------
                             (_)   (_)



Wed, 18 Jun 1902 08:00:00 GMT  
 FFT routine for F90
Hi !

You probably mean the singlton.f90 module. I tried that. It works perfect
when I call it under Windows NT and Fortran Powerstation but as soon as I
port this module to any other platform (SUN, SGI, or PC with Linux) and
various compilers, I get an error, saying that there is an undefined
reference to fft_1d in the module.

Do I call the module maybe wrong or are there any compiler settings required
under Unix platforms?

How would a sample program look like? I would appreciate to see a code how
others implemented the singlton module.

Thanks

THorsten



Quote:

> > Hi !

> > I am looking for a Fortran 90 function or module which can perform a
> > one-dimensional FFT and which shouldn't be limited by a maximum number
of
> > data points. I found a lot of f77 routines out there but they either
were
> > inaccurate or I couldn't port them to an f90 code (which is important
for my
> > application).

> > Thanks

> > THorsten

> See the aAlan Miller Page. There alot of routines in F90.  If you don't
found
> it. I can send you a copy.

> Sincerely

> Javier

> --
>                                \\|//
>                                (o o)
> --------------------------ooO---(_)---Ooo--------------------------
> Dr. Javier Gonzalez Platas
> Grupo de Rayox-X
> Universidad de La Laguna        Tel (+34) 922 31 83 00
> Dpto. Fisica Fundamental II         (+34) 922 31 83 01
> E-30204 La Laguna. Tenerife     Fax (+34) 922 25 69 73



> -------------------------------------------------------------------
>                              (_)   (_)



Wed, 18 Jun 1902 08:00:00 GMT  
 FFT routine for F90

Thorsten W. Hertel wrote

Quote:
>Hi !
>I am looking for a Fortran 90 function or module which can perform a
>one-dimensional FFT and which shouldn't be limited by a maximum number of
>data points. I found a lot of f77 routines out there but they either were
>inaccurate or I couldn't port them to an f90 code (which is important for
my
>application).

Following primitive subroutine is 1 dim. mixed radix FFT routine.
The length of series is not necessarily a power of 2. If the length of
series is the prime number, the time for calculation is as long as DFT.
This code is simple, but is slower than other tuned FFT routines
because it has wasted calculations.
If you use Digital Visual Fortran, you can make DLL by coping this code.
If "double complex" is not available, sorry, please remake.

This subroutine calculates:
<Transform>
C_k = \sum_{j=0}^{n-1} c_j*exp(-2*pi*i*j*k/n) (k=1,2,..,n-1)
or
<Inverse Transform>
c_j = \sum_{k=0}^{n-1} C_k*exp(+2*pi*i*k*j/n) (j=1,2,..,n-1)
( i=sqrt(-1), C_k and c_j are complex number.)
You have to normalize above results in order to adapt to
your use.

----- begging of code -----
! DCFFTMR -------- subroutine for inteface
! rsvfft ------- recursive subroutine which is called by DCFFTMR

! subroutine DCFFTMR(cr, ci, n,isn,icon)
 ! cr [input and output]
 !  1 dim array of real part of complex data cr(0:n-1)
 ! ci [input and output]
 !  1 dim array of imaginary part of complex data ci(0:n-1)
 ! n  [input]
 !  number of elements of cr(:), ci(:)
 ! isn [input]
 !  indicates transform(isn=+1) or inverse transform(isn=-1)
 ! icon [output]
 !  condition code( 0:no error, 1:n<=1, 2:isn/=+-1)

! recursive subroutine rsvfft(n, pr, c, tmp)
 ! n  [input]
 !  number of elements of c(:) and tmp(:)
 ! pr [input]
 !  lower bound of array sections
 ! c  [input]
 !  1 dim complex array of data c(0:n-1)
 ! tmp [output]
 !  1 dim temporary complex array tmp(0:n-1)

subroutine DCFFTMR(cr, ci, n,isn,icon)
 implicit none

 ! Specify that DCFFTMR is exported to a DLL
 ! and that the external name is 'DCFFTMR'
 !DEC$ ATTRIBUTES DLLEXPORT :: DCFFTMR
 !DEC$ ATTRIBUTES ALIAS:'DCFFTMR' :: DCFFTMR

!-----dummy arguments-----
 integer n, isn, icon
 double precision, dimension(0:(n-1)):: cr, ci
!-----local variables-----
 double complex, dimension(0:(n-1)):: cin, tmpin
 double precision phase
 integer pr, i

 if(n<=1) then
  icon = 1
  return
 end if

 if((isn/=1).eqv.(isn/=-1)) then
  icon = 2
  return
 end if

 cin = dcmplx(cr,ci)
 phase = -isn*4.0d0*datan(1.0d0)*2.0d0/n
 pr = 0

 call rsvfft(n, pr, phase, cin, tmpin)

 cr = DBLE(cin)
 ci = DIMAG(cin)
 icon = 0
 return
end subroutine DCFFTMR

recursive subroutine rsvfft(n, pr, phase, c, tmp)
 implicit none

!-----dummy arguments-----
 integer n, pr
 double precision phase
 double complex, dimension(0:(n-1)):: c, tmp
!-----local variables-----
 integer rd, nrd, j, m, r
 double complex xc, w

 if(n <= 1) return
 rd = 2
 do while(rd*rd <= n)
  if(mod(n,rd) == 0) exit
  rd = rd + 1
 end do

 if(mod(n,rd) /= 0) rd = n
 nrd = n/rd

 do j = 0, nrd-1
  do m = 0, rd-1
   xc = c(pr+j)
   do r = nrd, n-1, nrd
    w = dcmplx(DCOS(phase*m*r),DSIN(phase*m*r))
    xc = xc + w*c(pr+r+j)
   end do
   w = dcmplx(DCOS(phase*m*j),DSIN(phase*m*j))
   tmp(pr+m*nrd+j) = xc * w
  end do
 end do

 do r = 0, n-1, nrd
  call rsvfft(nrd, pr+r, phase*rd, tmp, c)
 end do

 do j = 0, nrd-1
  do m = 0, rd-1
   c(pr+rd*j+m) = tmp(pr+nrd*m+j)
  end do
 end do

end subroutine rsvfft
----- end of code -----

I'm sorry for long writing.
Thanks

----------
Hiroshi Kobayashi




Wed, 18 Jun 1902 08:00:00 GMT  
 FFT routine for F90

Quote:

> Hi !

> I am looking for a Fortran 90 function or module which can perform a
> one-dimensional FFT and which shouldn't be limited by a maximum number of
> data points. I found a lot of f77 routines out there but they either were
> inaccurate or I couldn't port them to an f90 code (which is important for my
> application).

Check out the recursive FFT algorith, by Jose M. Perez-Jorda,
which he published in:

  "In-place self-sorting fast Fourier transform algorithm
  with local memory references,"
  Computer Physics Communications, v108, pp 1-8 (1998).

The interesting feature about it is that the instructions
have been rearranged so as to intensively work on one small
area of memory at a time as much as possible, rather than
skipping around everywhere.

Why is that a benefit?  Well, because nowadays most computers
have some form of cache memory, and so Perez-Jorda's algorithm
achieves a very high cache hit percentage.  For small arrays,
his benchmarks indicate no improvement, but for large ones the
speedup is around a factor of three.  YMMMV, depending on what
type of machine you're using, and the ratio of speeds between
your cache and regular memory.

Oh, that algorithm also doesn't need the trig and bit-reversal
tables, either.



Wed, 18 Jun 1902 08:00:00 GMT  
 FFT routine for F90

]
]Check out the recursive FFT algorith, by Jose M. Perez-Jorda,
]which he published in:
]
]  "In-place self-sorting fast Fourier transform algorithm
]  with local memory references,"
]  Computer Physics Communications, v108, pp 1-8 (1998).

 Do you know whether the article or FFT code is available on-line ?
 Thanks.



Wed, 18 Jun 1902 08:00:00 GMT  
 FFT routine for F90

Quote:


> ]
> ]Check out the recursive FFT algorith, by Jose M. Perez-Jorda,
> ]which he published in:
> ]
> ]  "In-place self-sorting fast Fourier transform algorithm
> ]  with local memory references,"
> ]  Computer Physics Communications, v108, pp 1-8 (1998).

>  Do you know whether the article or FFT code is available on-line ?
>  Thanks.

It's not on line, that I know of.  You just have to hike over to the
library to get it (a terrible 20th Century habit, even though we are
almost into the 21st<G>).

Fortunately, the source code is very compact and it is listed in the
article.
The only bad thing is that you have to pay close attention to the
distinction between lower case "l" and numeral "1", since he uses a lot
of mixed case.



Wed, 18 Jun 1902 08:00:00 GMT  
 FFT routine for F90

Quote:

>It's not on line, that I know of.  You just have to hike over to the
>library to get it (a terrible 20th Century habit, even though we are
>almost into the 21st<G>).
>Fortunately, the source code is very compact and it is listed in the
>article.
>The only bad thing is that you have to pay close attention to the
>distinction between lower case "l" and numeral "1", since he uses a lot
>of mixed case.

It's a shame that it's not online because even a subroutine as small
as this is not all that easy to key in correctly.  I compared it with
the FFT code in TFFT.FOR from Polyhedron's web site and it was about
3X as fast for complex vectors of length 2**19 (about 23 vs. 72
seconds to execute 40 FFTs) on my home PC, although for some reason
my PC is anomalously slow on the TFFT test, by far its worst
comparative performance.

One change to be considered in the code might be to replace the lines

            call FASTFT(DATA(1:N),L,2*ISIGN)
            call FASTFT(DATA(N+1:),L,2*ISIGN)

with

            call FASTFT(DATA(1),L,2*ISIGN)
            call FASTFT(DATA(N+1),L,2*ISIGN)

In DVF 6.0b at least, the array section syntax induces the compiler
to make temporary copies of the array sections making the code take
over half again as long to run and requiring several megabytes of
stack space to run.  Is this kind of problem (allocating unnecessary
array temporaries) unique to DVF, or pervasive in all F95 compilers?
The author (Jose Perez-Jorda) didn't seem to encounter a problem
with it (using xlf90?), getting 3X performance gains for large
vectors as well.



Wed, 18 Jun 1902 08:00:00 GMT  
 FFT routine for F90

Quote:
> It's a shame that it's not online because even a subroutine as small
> as this is not all that easy to key in correctly.

Are you sure it is not available on machine-readable media from the
CPC Program Library? There may be a fee for this, just like there is
for subscribing to the journal, but your convenience and peace of mind
might be worth it. Alternatively, you could try a scanner and OCR
software, followed by some proofreading and ftnchek'ing (and running
your usual FFT test suite, naturally).

Quote:
>             call FASTFT(DATA(1:N),L,2*ISIGN)
>             call FASTFT(DATA(N+1:),L,2*ISIGN)

> with

>             call FASTFT(DATA(1),L,2*ISIGN)
>             call FASTFT(DATA(N+1),L,2*ISIGN)

> In DVF 6.0b at least, the array section syntax induces the compiler
> to make temporary copies of the array sections making the code take
> over half again as long to run and requiring several megabytes of
> stack space to run.  Is this kind of problem (allocating unnecessary
> array temporaries) unique to DVF, or pervasive in all F95 compilers?

It's certainly not unique to DVF. In fact, I find it disappointing
that DVF 6 (according to your report) still doesn't optimise this
copy away when the array section is contiguous in memory. I'd call
it a bug (particularly if it occurs at high optimisation settings).
But most Fortran 90 compilers go through a phase early in their
development in which all array sections used as actual arguments where
the dummy is not assumed-shape trigger a copy-in/copy-out. It's just
that one would expect the more mature products to have grown out of
such na?vet by now. If it is legal for you to use DATA(1) as the
actual argument instead of DATA(1:N) --- and you should check by
making the interface to FASTFT explicit at the point of call ---
then DATA must already be known to be contiguous (i.e. it cannot be
an assumed-shape dummy or a pointer), in which case there are no excuses.

Quote:
> The author (Jose Perez-Jorda) didn't seem to encounter a problem
> with it (using xlf90?), getting 3X performance gains for large
> vectors as well.

If any compilers do this sort of optimisation, I would expect XLF
to be among them.


Wed, 18 Jun 1902 08:00:00 GMT  
 FFT routine for F90

]
]
]>It's not on line, that I know of.  You just have to hike over to the
]>library to get it (a terrible 20th Century habit, even though we are
]>almost into the 21st<G>).
]
]>Fortunately, the source code is very compact and it is listed in the
]>article.
]>The only bad thing is that you have to pay close attention to the
]>distinction between lower case "l" and numeral "1", since he uses a lot
]>of mixed case.
]
]It's a shame that it's not online because even a subroutine as small
]as this is not all that easy to key in correctly.

 What kind of license it's under ? Perhaps, you could post it here ?
 Thanks.

] I compared it with
]the FFT code in TFFT.FOR from Polyhedron's web site and it was about
]3X as fast for complex vectors of length 2**19 (about 23 vs. 72
]seconds to execute 40 FFTs) on my home PC, although for some reason
]my PC is anomalously slow on the TFFT test, by far its worst
]comparative performance.

 What about vectors of power of 2 length ?

]One change to be considered in the code might be to replace the lines
]
]            call FASTFT(DATA(1:N),L,2*ISIGN)
]            call FASTFT(DATA(N+1:),L,2*ISIGN)
]
]with
]
]            call FASTFT(DATA(1),L,2*ISIGN)
]            call FASTFT(DATA(N+1),L,2*ISIGN)
]
]In DVF 6.0b at least, the array section syntax induces the compiler
]to make temporary copies of the array sections making the code take
]over half again as long to run and requiring several megabytes of
]stack space to run.  Is this kind of problem (allocating unnecessary
]array temporaries) unique to DVF, or pervasive in all F95 compilers?
]The author (Jose Perez-Jorda) didn't seem to encounter a problem
]with it (using xlf90?), getting 3X performance gains for large
]vectors as well.

 This is diturbing. Compilers haven't got up to speed with F90 optimizations,
 and the committee wants to put OO stuff in. I think that it would make sense
 to define som kind of "minimal Fortran" which includes only the essential
 features, some kind of standardised ELF, for people who only care about
 numerical performance.



Wed, 18 Jun 1902 08:00:00 GMT  
 FFT routine for F90

Quote:
> In DVF 6.0b at least, the array section syntax induces the compiler
> to make temporary copies of the array sections making the code take
> over half again as long to run and requiring several megabytes of
> stack space to run.

The B3 update, released 16 June, has in its release notes a bullet point
saying they improved just this kind of behaviour. Check it out and tell
us of your results.

        Jan



Wed, 18 Jun 1902 08:00:00 GMT  
 FFT routine for F90

Quote:


> >It's not on line, that I know of.  You just have to hike over to the
> >library to get it (a terrible 20th Century habit, even though we are
> >almost into the 21st<G>).

> >Fortunately, the source code is very compact and it is listed in the
> >article.
> >The only bad thing is that you have to pay close attention to the
> >distinction between lower case "l" and numeral "1", since he uses a lot
> >of mixed case.

> It's a shame that it's not online because even a subroutine as small
> as this is not all that easy to key in correctly.  I compared it with

People who want "official" copies of the routine could always try
e-mailing

if
he will e-mail out the source code.


Wed, 18 Jun 1902 08:00:00 GMT  
 FFT routine for F90

Quote:

>The B3 update, released 16 June, has in its release notes a bullet point
>saying they improved just this kind of behaviour. Check it out and tell
>us of your results.

Where did you see this bullet point?  The release notes file is getting
kind of long and I couldn't find the text "6.0B3" in it.  The behavior
was not improved in my case.  Another poster suggested making the
interface explicit at the point of call, but this was already achieved
because the subroutine was an internal subroutine in the main program
and the calls posted earlier were actually recursive calls (yes, the
subroutine was declared recursive).

When I first tried running the code with 6.0B3, I just went to
Developer Studio and hit the Rebuild All switch and got execution
times of 65/17 seconds for TFFT/FASTFT, but later builds reverted
to something like 72-74 seconds for TFFT and 26-27 seconds for the
faster version of FASTFT.  Closing Developer Studio and then
restarting it and hitting the Rebuild All switch again got me back
down to 68.4/18.7 seconds.  The pattern here seems to be that the
first time the Build | Execute button is hit, I am getting
significantly better performance.



Wed, 18 Jun 1902 08:00:00 GMT  
 FFT routine for F90

Quote:
> >The B3 update, released 16 June, has in its release notes a bullet point
> >saying they improved just this kind of behaviour. Check it out and tell
> >us of your results.

> Where did you see this bullet point?  The release notes file is getting
> kind of long and I couldn't find the text "6.0B3" in it.

No, the release notes don;t contain that string. I can't check at the moment,
but possibly this remark is in the page announcing the update.

Quote:
> When I first tried running the code with 6.0B3, I just went to
> Developer Studio and hit the Rebuild All switch and got execution
> times of 65/17 seconds for TFFT/FASTFT, but later builds reverted
> to something like 72-74 seconds for TFFT and 26-27 seconds for the
> faster version of FASTFT.

Dunno for sure, but is Developer Studio using an incremental linker? If so,
that might explain the phenomenon you're seeing.

        Jan



Wed, 18 Jun 1902 08:00:00 GMT  
 
 [ 14 post ] 

 Relevant Pages 

1. 2-D FFT routines in F90

2. f90 FFT routine

3. FFT routine in assembly???

4. fft routine (was Re: m88000 benchmarks)

5. Counter Intuitive Results: Optimising an FFT routine for Speed

6. FFT & other math routines

7. FFT routines

8. my FFT routines

9. assembler FFT routines

10. Efficient Column-wise FFT Routines?

11. FFT & F90

12. FFT subroutines for F90

 

 
Powered by phpBB® Forum Software