Author 
Message 
Thorsten W. Herte #1 / 14

FFT routine for F90
Hi ! I am looking for a fortran 90 function or module which can perform a onedimensional FFT and which shouldn't be limited by a maximum number of data points. I found a lot of f77 routines out there but they either were inaccurate or I couldn't port them to an f90 code (which is important for my application). Thanks THorsten

Wed, 18 Jun 1902 08:00:00 GMT 


Javier Gonzalez Plata #2 / 14

FFT routine for F90
Quote:
> Hi ! > I am looking for a Fortran 90 function or module which can perform a > onedimensional FFT and which shouldn't be limited by a maximum number of > data points. I found a lot of f77 routines out there but they either were > inaccurate or I couldn't port them to an f90 code (which is important for my > application). > Thanks > THorsten
See the aAlan Miller Page. There alot of routines in F90. If you don't found it. I can send you a copy. Sincerely Javier  \\// (o o) ooO(_)Ooo Dr. Javier Gonzalez Platas Grupo de RayoxX Universidad de La Laguna Tel (+34) 922 31 83 00 Dpto. Fisica Fundamental II (+34) 922 31 83 01 E30204 La Laguna. Tenerife Fax (+34) 922 25 69 73
 (_) (_)

Wed, 18 Jun 1902 08:00:00 GMT 


Thorsten W. Herte #3 / 14

FFT routine for F90
Hi ! You probably mean the singlton.f90 module. I tried that. It works perfect when I call it under Windows NT and Fortran Powerstation but as soon as I port this module to any other platform (SUN, SGI, or PC with Linux) and various compilers, I get an error, saying that there is an undefined reference to fft_1d in the module. Do I call the module maybe wrong or are there any compiler settings required under Unix platforms? How would a sample program look like? I would appreciate to see a code how others implemented the singlton module. Thanks THorsten
Quote:
> > Hi ! > > I am looking for a Fortran 90 function or module which can perform a > > onedimensional FFT and which shouldn't be limited by a maximum number of > > data points. I found a lot of f77 routines out there but they either were > > inaccurate or I couldn't port them to an f90 code (which is important for my > > application). > > Thanks > > THorsten > See the aAlan Miller Page. There alot of routines in F90. If you don't found > it. I can send you a copy. > Sincerely > Javier >  > \\// > (o o) > ooO(_)Ooo > Dr. Javier Gonzalez Platas > Grupo de RayoxX > Universidad de La Laguna Tel (+34) 922 31 83 00 > Dpto. Fisica Fundamental II (+34) 922 31 83 01 > E30204 La Laguna. Tenerife Fax (+34) 922 25 69 73
>  > (_) (_)

Wed, 18 Jun 1902 08:00:00 GMT 


Kobayash #4 / 14

FFT routine for F90
Thorsten W. Hertel wrote Quote: >Hi ! >I am looking for a Fortran 90 function or module which can perform a >onedimensional FFT and which shouldn't be limited by a maximum number of >data points. I found a lot of f77 routines out there but they either were >inaccurate or I couldn't port them to an f90 code (which is important for my >application).
Following primitive subroutine is 1 dim. mixed radix FFT routine. The length of series is not necessarily a power of 2. If the length of series is the prime number, the time for calculation is as long as DFT. This code is simple, but is slower than other tuned FFT routines because it has wasted calculations. If you use Digital Visual Fortran, you can make DLL by coping this code. If "double complex" is not available, sorry, please remake. This subroutine calculates: <Transform> C_k = \sum_{j=0}^{n1} c_j*exp(2*pi*i*j*k/n) (k=1,2,..,n1) or <Inverse Transform> c_j = \sum_{k=0}^{n1} C_k*exp(+2*pi*i*k*j/n) (j=1,2,..,n1) ( i=sqrt(1), C_k and c_j are complex number.) You have to normalize above results in order to adapt to your use.  begging of code  ! DCFFTMR  subroutine for inteface ! rsvfft  recursive subroutine which is called by DCFFTMR ! subroutine DCFFTMR(cr, ci, n,isn,icon) ! cr [input and output] ! 1 dim array of real part of complex data cr(0:n1) ! ci [input and output] ! 1 dim array of imaginary part of complex data ci(0:n1) ! n [input] ! number of elements of cr(:), ci(:) ! isn [input] ! indicates transform(isn=+1) or inverse transform(isn=1) ! icon [output] ! condition code( 0:no error, 1:n<=1, 2:isn/=+1) ! recursive subroutine rsvfft(n, pr, c, tmp) ! n [input] ! number of elements of c(:) and tmp(:) ! pr [input] ! lower bound of array sections ! c [input] ! 1 dim complex array of data c(0:n1) ! tmp [output] ! 1 dim temporary complex array tmp(0:n1) subroutine DCFFTMR(cr, ci, n,isn,icon) implicit none ! Specify that DCFFTMR is exported to a DLL ! and that the external name is 'DCFFTMR' !DEC$ ATTRIBUTES DLLEXPORT :: DCFFTMR !DEC$ ATTRIBUTES ALIAS:'DCFFTMR' :: DCFFTMR !dummy arguments integer n, isn, icon double precision, dimension(0:(n1)):: cr, ci !local variables double complex, dimension(0:(n1)):: cin, tmpin double precision phase integer pr, i if(n<=1) then icon = 1 return end if if((isn/=1).eqv.(isn/=1)) then icon = 2 return end if cin = dcmplx(cr,ci) phase = isn*4.0d0*datan(1.0d0)*2.0d0/n pr = 0 call rsvfft(n, pr, phase, cin, tmpin) cr = DBLE(cin) ci = DIMAG(cin) icon = 0 return end subroutine DCFFTMR recursive subroutine rsvfft(n, pr, phase, c, tmp) implicit none !dummy arguments integer n, pr double precision phase double complex, dimension(0:(n1)):: c, tmp !local variables integer rd, nrd, j, m, r double complex xc, w if(n <= 1) return rd = 2 do while(rd*rd <= n) if(mod(n,rd) == 0) exit rd = rd + 1 end do if(mod(n,rd) /= 0) rd = n nrd = n/rd do j = 0, nrd1 do m = 0, rd1 xc = c(pr+j) do r = nrd, n1, nrd w = dcmplx(DCOS(phase*m*r),DSIN(phase*m*r)) xc = xc + w*c(pr+r+j) end do w = dcmplx(DCOS(phase*m*j),DSIN(phase*m*j)) tmp(pr+m*nrd+j) = xc * w end do end do do r = 0, n1, nrd call rsvfft(nrd, pr+r, phase*rd, tmp, c) end do do j = 0, nrd1 do m = 0, rd1 c(pr+rd*j+m) = tmp(pr+nrd*m+j) end do end do end subroutine rsvfft  end of code  I'm sorry for long writing. Thanks  Hiroshi Kobayashi

Wed, 18 Jun 1902 08:00:00 GMT 


Kenneth G. Hamilto #5 / 14

FFT routine for F90
Quote:
> Hi ! > I am looking for a Fortran 90 function or module which can perform a > onedimensional FFT and which shouldn't be limited by a maximum number of > data points. I found a lot of f77 routines out there but they either were > inaccurate or I couldn't port them to an f90 code (which is important for my > application).
Check out the recursive FFT algorith, by Jose M. PerezJorda, which he published in: "Inplace selfsorting fast Fourier transform algorithm with local memory references," Computer Physics Communications, v108, pp 18 (1998). The interesting feature about it is that the instructions have been rearranged so as to intensively work on one small area of memory at a time as much as possible, rather than skipping around everywhere. Why is that a benefit? Well, because nowadays most computers have some form of cache memory, and so PerezJorda's algorithm achieves a very high cache hit percentage. For small arrays, his benchmarks indicate no improvement, but for large ones the speedup is around a factor of three. YMMMV, depending on what type of machine you're using, and the ratio of speeds between your cache and regular memory. Oh, that algorithm also doesn't need the trig and bitreversal tables, either.

Wed, 18 Jun 1902 08:00:00 GMT 


Michael Kagalen #6 / 14

FFT routine for F90
] ]Check out the recursive FFT algorith, by Jose M. PerezJorda, ]which he published in: ] ] "Inplace selfsorting fast Fourier transform algorithm ] with local memory references," ] Computer Physics Communications, v108, pp 18 (1998). Do you know whether the article or FFT code is available online ? Thanks.

Wed, 18 Jun 1902 08:00:00 GMT 


Kenneth G. Hamilto #7 / 14

FFT routine for F90
Quote:
> ] > ]Check out the recursive FFT algorith, by Jose M. PerezJorda, > ]which he published in: > ] > ] "Inplace selfsorting fast Fourier transform algorithm > ] with local memory references," > ] Computer Physics Communications, v108, pp 18 (1998). > Do you know whether the article or FFT code is available online ? > Thanks.
It's not on line, that I know of. You just have to hike over to the library to get it (a terrible 20th Century habit, even though we are almost into the 21st<G>). Fortunately, the source code is very compact and it is listed in the article. The only bad thing is that you have to pay close attention to the distinction between lower case "l" and numeral "1", since he uses a lot of mixed case.

Wed, 18 Jun 1902 08:00:00 GMT 


James Van Buskir #8 / 14

FFT routine for F90
Quote:
>It's not on line, that I know of. You just have to hike over to the >library to get it (a terrible 20th Century habit, even though we are >almost into the 21st<G>). >Fortunately, the source code is very compact and it is listed in the >article. >The only bad thing is that you have to pay close attention to the >distinction between lower case "l" and numeral "1", since he uses a lot >of mixed case.
It's a shame that it's not online because even a subroutine as small as this is not all that easy to key in correctly. I compared it with the FFT code in TFFT.FOR from Polyhedron's web site and it was about 3X as fast for complex vectors of length 2**19 (about 23 vs. 72 seconds to execute 40 FFTs) on my home PC, although for some reason my PC is anomalously slow on the TFFT test, by far its worst comparative performance. One change to be considered in the code might be to replace the lines call FASTFT(DATA(1:N),L,2*ISIGN) call FASTFT(DATA(N+1:),L,2*ISIGN) with call FASTFT(DATA(1),L,2*ISIGN) call FASTFT(DATA(N+1),L,2*ISIGN) In DVF 6.0b at least, the array section syntax induces the compiler to make temporary copies of the array sections making the code take over half again as long to run and requiring several megabytes of stack space to run. Is this kind of problem (allocating unnecessary array temporaries) unique to DVF, or pervasive in all F95 compilers? The author (Jose PerezJorda) didn't seem to encounter a problem with it (using xlf90?), getting 3X performance gains for large vectors as well.

Wed, 18 Jun 1902 08:00:00 GMT 


<bg.. #9 / 14

FFT routine for F90
Quote: > It's a shame that it's not online because even a subroutine as small > as this is not all that easy to key in correctly.
Are you sure it is not available on machinereadable media from the CPC Program Library? There may be a fee for this, just like there is for subscribing to the journal, but your convenience and peace of mind might be worth it. Alternatively, you could try a scanner and OCR software, followed by some proofreading and ftnchek'ing (and running your usual FFT test suite, naturally). Quote: > call FASTFT(DATA(1:N),L,2*ISIGN) > call FASTFT(DATA(N+1:),L,2*ISIGN) > with > call FASTFT(DATA(1),L,2*ISIGN) > call FASTFT(DATA(N+1),L,2*ISIGN) > In DVF 6.0b at least, the array section syntax induces the compiler > to make temporary copies of the array sections making the code take > over half again as long to run and requiring several megabytes of > stack space to run. Is this kind of problem (allocating unnecessary > array temporaries) unique to DVF, or pervasive in all F95 compilers?
It's certainly not unique to DVF. In fact, I find it disappointing that DVF 6 (according to your report) still doesn't optimise this copy away when the array section is contiguous in memory. I'd call it a bug (particularly if it occurs at high optimisation settings). But most Fortran 90 compilers go through a phase early in their development in which all array sections used as actual arguments where the dummy is not assumedshape trigger a copyin/copyout. It's just that one would expect the more mature products to have grown out of such na?vet by now. If it is legal for you to use DATA(1) as the actual argument instead of DATA(1:N)  and you should check by making the interface to FASTFT explicit at the point of call  then DATA must already be known to be contiguous (i.e. it cannot be an assumedshape dummy or a pointer), in which case there are no excuses. Quote: > The author (Jose PerezJorda) didn't seem to encounter a problem > with it (using xlf90?), getting 3X performance gains for large > vectors as well.
If any compilers do this sort of optimisation, I would expect XLF to be among them.

Wed, 18 Jun 1902 08:00:00 GMT 


Michael Kagalen #10 / 14

FFT routine for F90
]
] ]>It's not on line, that I know of. You just have to hike over to the ]>library to get it (a terrible 20th Century habit, even though we are ]>almost into the 21st<G>). ] ]>Fortunately, the source code is very compact and it is listed in the ]>article. ]>The only bad thing is that you have to pay close attention to the ]>distinction between lower case "l" and numeral "1", since he uses a lot ]>of mixed case. ] ]It's a shame that it's not online because even a subroutine as small ]as this is not all that easy to key in correctly. What kind of license it's under ? Perhaps, you could post it here ? Thanks. ] I compared it with ]the FFT code in TFFT.FOR from Polyhedron's web site and it was about ]3X as fast for complex vectors of length 2**19 (about 23 vs. 72 ]seconds to execute 40 FFTs) on my home PC, although for some reason ]my PC is anomalously slow on the TFFT test, by far its worst ]comparative performance. What about vectors of power of 2 length ? ]One change to be considered in the code might be to replace the lines ] ] call FASTFT(DATA(1:N),L,2*ISIGN) ] call FASTFT(DATA(N+1:),L,2*ISIGN) ] ]with ] ] call FASTFT(DATA(1),L,2*ISIGN) ] call FASTFT(DATA(N+1),L,2*ISIGN) ] ]In DVF 6.0b at least, the array section syntax induces the compiler ]to make temporary copies of the array sections making the code take ]over half again as long to run and requiring several megabytes of ]stack space to run. Is this kind of problem (allocating unnecessary ]array temporaries) unique to DVF, or pervasive in all F95 compilers? ]The author (Jose PerezJorda) didn't seem to encounter a problem ]with it (using xlf90?), getting 3X performance gains for large ]vectors as well. This is diturbing. Compilers haven't got up to speed with F90 optimizations, and the committee wants to put OO stuff in. I think that it would make sense to define som kind of "minimal Fortran" which includes only the essential features, some kind of standardised ELF, for people who only care about numerical performance.

Wed, 18 Jun 1902 08:00:00 GMT 


Jan Vorbruegge #11 / 14

FFT routine for F90
Quote: > In DVF 6.0b at least, the array section syntax induces the compiler > to make temporary copies of the array sections making the code take > over half again as long to run and requiring several megabytes of > stack space to run.
The B3 update, released 16 June, has in its release notes a bullet point saying they improved just this kind of behaviour. Check it out and tell us of your results. Jan

Wed, 18 Jun 1902 08:00:00 GMT 


Kenneth G. Hamilto #12 / 14

FFT routine for F90
Quote:
> >It's not on line, that I know of. You just have to hike over to the > >library to get it (a terrible 20th Century habit, even though we are > >almost into the 21st<G>). > >Fortunately, the source code is very compact and it is listed in the > >article. > >The only bad thing is that you have to pay close attention to the > >distinction between lower case "l" and numeral "1", since he uses a lot > >of mixed case. > It's a shame that it's not online because even a subroutine as small > as this is not all that easy to key in correctly. I compared it with
People who want "official" copies of the routine could always try emailing
if he will email out the source code.

Wed, 18 Jun 1902 08:00:00 GMT 


James Van Buskir #13 / 14

FFT routine for F90
Quote:
>The B3 update, released 16 June, has in its release notes a bullet point >saying they improved just this kind of behaviour. Check it out and tell >us of your results.
Where did you see this bullet point? The release notes file is getting kind of long and I couldn't find the text "6.0B3" in it. The behavior was not improved in my case. Another poster suggested making the interface explicit at the point of call, but this was already achieved because the subroutine was an internal subroutine in the main program and the calls posted earlier were actually recursive calls (yes, the subroutine was declared recursive). When I first tried running the code with 6.0B3, I just went to Developer Studio and hit the Rebuild All switch and got execution times of 65/17 seconds for TFFT/FASTFT, but later builds reverted to something like 7274 seconds for TFFT and 2627 seconds for the faster version of FASTFT. Closing Developer Studio and then restarting it and hitting the Rebuild All switch again got me back down to 68.4/18.7 seconds. The pattern here seems to be that the first time the Build  Execute button is hit, I am getting significantly better performance.

Wed, 18 Jun 1902 08:00:00 GMT 


Jan Vorbruegge #14 / 14

FFT routine for F90
Quote: > >The B3 update, released 16 June, has in its release notes a bullet point > >saying they improved just this kind of behaviour. Check it out and tell > >us of your results. > Where did you see this bullet point? The release notes file is getting > kind of long and I couldn't find the text "6.0B3" in it.
No, the release notes don;t contain that string. I can't check at the moment, but possibly this remark is in the page announcing the update. Quote: > When I first tried running the code with 6.0B3, I just went to > Developer Studio and hit the Rebuild All switch and got execution > times of 65/17 seconds for TFFT/FASTFT, but later builds reverted > to something like 7274 seconds for TFFT and 2627 seconds for the > faster version of FASTFT.
Dunno for sure, but is Developer Studio using an incremental linker? If so, that might explain the phenomenon you're seeing. Jan

Wed, 18 Jun 1902 08:00:00 GMT 


