accessing 3D FFT data as 1D/3D complex/real arrays
Author 
Message 
yind #1 / 11

accessing 3D FFT data as 1D/3D complex/real arrays
I have a code that uses 3D fast Fourier transforms to solve Poisson's equation for electrostatics on a grid. The FFT library routines that I use have explicit, overloaded interfaces, so the data to be transformed (the actual argument) must be a 1D complex array, not a 3D complex array, and not real arrays. The data is initially real (imaginary part zero). After the forward transform it becomes complex and is manipulated. The backward transform bring the data back to the real domain (imaginary part zero again). The calculations before the forward transform and after the backward transform can be done much more efficiently if the data is stored contiguously in a real array without the interleaving zero imaginary part. Unfortunately the FFT library does not allow the real and imaginary parts of the data to be split into two separate arrays. Question 1: Without using storage equivalencing (the array is allocatable) or violating the shape matching requirement between actual and dummy arguments (explicit interfaces are always used), is there another standard conforming way to access the data as 1D/3D complex arrays or as 1D/4D [sic] real arrays in different parts of the program? Combinations of the intrinsic functions reshape, real, and cmplx move the data around and can require temporary storage. Due to the large size of the FFT grid, this is not desired. Question 2: Is there a good way to insert and remove the zero imaginary part quickly from the complex array before and after the FFTs? The obvious use of the intrinsic functions real and cmplx to convert the data seems to be slow especially when the array is large, undoing the speedup that could have been gained. I use Intel's fortran compiler v8.1 and Math Kernel Library v7.2, and the FFTW library v3.0.1. Thanks! Dewey Yin

Sat, 25 Aug 2007 12:19:02 GMT 


glen herrmannsfeld #2 / 11

accessing 3D FFT data as 1D/3D complex/real arrays
Quote:
> I have a code that uses 3D fast Fourier transforms to solve Poisson's > equation for electrostatics on a grid. The FFT library routines that I > use have explicit, overloaded interfaces, so the data to be transformed > (the actual argument) must be a 1D complex array, not a 3D complex > array, and not real arrays.
In the static array days, this would be done through EQUIVALENCE, but I don't believe it can for automatic, allocatable, or pointers. Quote: > The data is initially real (imaginary part zero). After the forward > transform it becomes complex and is manipulated. The backward > transform bring the data back to the real domain (imaginary part zero > again). The calculations before the forward transform and after the > backward transform can be done much more efficiently if the data is > stored contiguously in a real array without the interleaving zero > imaginary part. Unfortunately the FFT library does not allow the real > and imaginary parts of the data to be split into two separate arrays.
Note that the FFT of a real array is symmetric (even), and so takes up twice as much space in your case as it needs to. I believe that there is a system for converting a real array to a complex array with the right transform properties, though I have not tried it. Quote: > Question 1: Without using storage equivalencing (the array is > allocatable) or violating the shape matching requirement between actual > and dummy arguments (explicit interfaces are always used), is there > another standard conforming way to access the data as 1D/3D complex > arrays or as 1D/4D [sic] real arrays in different parts of the program? > Combinations of the intrinsic functions reshape, real, and cmplx move > the data around and can require temporary storage. Due to the large > size of the FFT grid, this is not desired.
The REAL and CMPLX intrinsic should not slow down data movement. They should not, for example, be implemented as function calls, but just a different way to address the storage. (That is a quality of implementation question, you should check your specific case.) Quote: > Question 2: Is there a good way to insert and remove the zero > imaginary part quickly from the complex array before and after the > FFTs? The obvious use of the intrinsic functions real and cmplx to > convert the data seems to be slow especially when the array is large, > undoing the speedup that could have been gained.
You mean it is slower than doing the same amount of data movement on real arrays? If not, post the assembly generated code for a very small program that shows the effect. So, my belief is that REAL and CMPLX won't slow it down any more than it otherwise would. Other than that, I believe it can be done with pointers, but I couldn't find the place in the standard where it would say that. That is, have a pointer to a COMPLEX array and assign it to a pointer to a REAL array that is twice as big in the appropriate way. (Someone will tell me if that won't work.) You then need to find the data access pattern that is fastest at moving data around. In general it is the one where the leftmost subscript varies fastest. If you are not doing that, it will be very slow.  glen

Sat, 25 Aug 2007 17:21:55 GMT 


Jan Vorbrügge #3 / 11

accessing 3D FFT data as 1D/3D complex/real arrays
Quote: > Question 2: Is there a good way to insert and remove the zero > imaginary part quickly from the complex array before and after the > FFTs? The obvious use of the intrinsic functions real and cmplx to > convert the data seems to be slow especially when the array is large, > undoing the speedup that could have been gained.
At least for the forward FFT, use a different routine, one that is optimized for a real input array. For the backward transform, there should also be a variant that discards the imaginary part, whatever the result might have been  this risks loosing data when the inter mediate form does not retain the correct symmetry. Jan

Sat, 25 Aug 2007 21:06:39 GMT 


Gordon Sand #4 / 11

accessing 3D FFT data as 1D/3D complex/real arrays
Quote: >> Question 1: Without using storage equivalencing (the array is >> allocatable) or violating the shape matching requirement between actual >> and dummy arguments (explicit interfaces are always used), is there >> another standard conforming way to access the data as 1D/3D complex >> arrays or as 1D/4D [sic] real arrays in different parts of the program? >> Combinations of the intrinsic functions reshape, real, and cmplx move >> the data around and can require temporary storage. Due to the large >> size of the FFT grid, this is not desired.
If you have a 1D complex array then you can pass it to a subroutine which uses assumed sizes (the "*" or F77 variable dimension form) and treat the array as twice as large but real. Then the real parts will be there at a stride of 2. This makes two "heroic" assumptions that are pretty save. The first is that the real and imaginary parts are laid out adjacently and use a Cartesian representation. It is possible that the complexes are in a polar representation but that has many disadvantages so has never been observed "in the wild". And that the form is real then imaginary. Again it is not "guaranteed" but is a sure bet. The second assumption is that you can do the mixed mode call. You need to "hide" the code from the inquiring eyes of the compiler. So abandon the safety of F90 and resort to all the old bad F77 tricks. (My understanding is that the complex components are there on an "as if" basis but that is short of a guarantee in standard speak. The type mismatch of complex to real works in practice but real to complex might have alignment issues. Corrections welcome.) All of this is based on two observations. 1. An allocated array is just an array for the purposes of a call. 2. Various "equivalence tricks" are available through sequence association across calls. So you have to violate the standard to take the complex data type apart and to mismatch the complex to real on a call. The other F77 tricks are within the standard but not quite what F90 intended. Explicit interfaces with assumed shape are perfectly OK and match type and count of arguments. You need to cheat on the complex/real type mismatch. FFT symmetries for 1D are so suggestive of sensible compact forms that Hermitian symmetric for 1D is pretty standard. (I do not mean the trick of putting the n/2 purely real coefficient into the imaginary part of the 0'th purely real coefficient in a complex array. It does not work for odd lengths and is a truly strange hack.) When the Hermitian symmetry is applied to 2D data the suggestive form leaves a bit of ambiguity so you are left to Read The Fine Manual very carefully. For 3D the symmetries leave even less specified.

Sat, 25 Aug 2007 22:02:08 GMT 


stev.. #5 / 11

accessing 3D FFT data as 1D/3D complex/real arrays
1) As Gordon wrote, in practice a 3d complex array is stored as a 1d chunk of contiguous memory with interleaved real/imaginary parts, and you can therefore pass it to an F77 (or C) routine which expects a "1d" array. (This isn't mandated by the standard, but everyone has relied on it for so long that it might as well be.) 2) If you are using MKL or FFTW, they have specialized routines for transforming purely real data and saving a factor of (roughly) 2 in time and memory because of the Hermitian symmetry of the output. (They also have the dual transform, from Hermitiansymmetry input to real output.) Regards, Steven G. Johnson

Sun, 26 Aug 2007 05:29:16 GMT 


glen herrmannsfeld #6 / 11

accessing 3D FFT data as 1D/3D complex/real arrays
Quote:
> 1) As Gordon wrote, in practice a 3d complex array is stored as a 1d > chunk of contiguous memory with interleaved real/imaginary parts, and > you can therefore pass it to an F77 (or C) routine which expects a "1d" > array. (This isn't mandated by the standard, but everyone has relied > on it for so long that it might as well be.)
6.2.2.2 specifies array element order in memory 5.1.2.5.4 For assumed size arrays the actual and dummy argument may have different rank and extents, only the size is assumed. 16.4.3.1 COMPLEX occupies two contiguous storage units 16.5.6 However, when a variable of type default real is partially associated with 23 a variable of type default complex, the complex variable does not become undefined when 24 the real variable becomes defined and the real variable does not become undefined when 25 the complex variable becomes defined. 4.4.3 The first real value is called the real part, and the second real value is called the imaginary part. (This might only apply to the text representation.)

Sun, 26 Aug 2007 06:38:01 GMT 


Richard E Main #7 / 11

accessing 3D FFT data as 1D/3D complex/real arrays
Quote: > This makes two "heroic" assumptions that are > pretty save. The first is that the real and imaginary parts are laid out > adjacently and use a Cartesian representation. It is possible that the > complexes are in a polar representation but that has many disadvantages > so has never been observed "in the wild". And that the form is real then > imaginary. Again it is not "guaranteed" but is a sure bet.
The standard seems rather explicit about exactly such guarantees. Add in all the usual caveats about how the underlying implementation may vary as long as you can't tell the difference with standardconforming code. But then, if you can't tell the difference, it shouldn't much matter. The supporting material is a bit spread around, making it hard to cite concisely, but I'll claim that this is actually guaranteed as opposed to just observed in practice. Now the matter of type disagreement in a procedure call  that's different. There I can't find any support in the standard (though I'm certainly aware that it works in many/most/all? implementations, at least as long as you stick to what I will sloppily refer to as f77style calls).  Richard Maine  Good judgment comes from experience; email: my first.last at org.domain  experience comes from bad judgment. org: nasa, domain: gov   Mark Twain

Mon, 27 Aug 2007 00:33:02 GMT 


Gordon Sand #8 / 11

accessing 3D FFT data as 1D/3D complex/real arrays
Quote:
>>This makes two "heroic" assumptions that are >>pretty save. The first is that the real and imaginary parts are laid out >>adjacently and use a Cartesian representation. It is possible that the >>complexes are in a polar representation but that has many disadvantages >>so has never been observed "in the wild". And that the form is real then >>imaginary. Again it is not "guaranteed" but is a sure bet. > The standard seems rather explicit about exactly such guarantees. Add > in all the usual caveats about how the underlying implementation may > vary as long as you can't tell the difference with standardconforming > code. But then, if you can't tell the difference, it shouldn't much > matter. The supporting material is a bit spread around, making it hard > to cite concisely, but I'll claim that this is actually guaranteed as > opposed to just observed in practice. > Now the matter of type disagreement in a procedure call  that's > different. There I can't find any support in the standard (though I'm > certainly aware that it works in many/most/all? implementations, at > least as long as you stick to what I will sloppily refer to as f77style > calls).
I was hoping Richard would settle the issue. But I think we both ended up saying that it looks like a Cartesian representation but might not be one as long as it acts like one. I may have been sloppy (or the snipping overly aggressive) as I was intending to discuss the internal representation in that fragment and I think else where had said the external representation was Cartesian. The conversion intrinsics are based on a Cartesian form. A polar representation is strictly "roll your own" with abs, atan2, sin and cos readily available. (My spill chucker keeps trying to turn atan2 into satan so maybe it knows more math than I think!) There are lots of things that tell one that the external representation is as a pair of numbers but I could not find anything that would say I could equivalence a complex with a real array of extent two and be guaranteed that the first array element would be the real part and the second array element would be the imaginary part. The FFT idiom often uses the ability to treat the equivalence as actually working and even guaranteed to work. The equivalence boiler plate (as reported in F95 Handbook) is that there is no guaranteed mathematical relationship which one understands to mean the single and double need not share bit patterns, etc, etc and which could also mean that a real and complex might have any relationship (real part, imaginary part, magnitude are plausible things that would not require {*filter*} implementations) as none is guaranteed. The other part of the FFT idiom is knowing the index mapping of various dimensional arrays. That is completely and precisely covered by sequence mapping so poses no issue. It does mean staying with assumed size ("*" or F77 style dummy arrays) arrays to allow for rank mismatch when needed. The type mismatch could fail if one had a Cartesian representation (i.e. as a pair of numbers) but that the convention was to provide the address of the second member (imaginary part) and then looks in front of it for the other member (real part). The usual observed implementation passes the first address and then looks just after it for the other member. Equivalence can be used to cause other than natural alignments (which can cause performance issues) but things are still supposed to work. There are three parts to the FFT storage idioms for Fortran. 1. The ability to look inside a complex and find the real and imaginary parts as adjacent elements of a real array. This is not standard conforming but works on all know implementations. 2. Index mapping under programmer control exactly as given by sequence association. Needs to use assumed size, rather than assumed shape, for full effect. 3. The equivalence noted in 1 will work naturally across calls without additional fuss. This is not standard conforming. With fuss it could be reduced to just relying on the equivalence. In any case it works on all known implementations.

Mon, 27 Aug 2007 01:57:52 GMT 


Richard E Main #9 / 11

accessing 3D FFT data as 1D/3D complex/real arrays
Quote: > >This makes two "heroic" assumptions that are > >pretty save. The first is that the real and imaginary parts are laid out > >adjacently and use a Cartesian representation.... > >Again it is not "guaranteed" but is a sure bet. > There are lots of things that tell one that the external representation > is as a pair of numbers but I could not find anything that would say > I could equivalence a complex with a real array of extent two and > be guaranteed that the first array element would be the real part and > the second array element would be the imaginary part.
Well, the 3rd sentence of 4.3 (in f2003) says "The first real value is called the real part, and the second real value is called the imaginary part." Admittedly, this might be what you call the "external" representation, and what I might call a descriptive model. However, 16.5.5 (Events that cause variables to become defined) talks about defining a complex entity by defining both of its parts through partial association. Elsewhere we see that partial association is what happens between reals and complex in equivalence or common. Ok, I don't quite see the explicit link that the 2 parts mentioned in 4.3 are the same 2 parts mentioned in 16.5.5, but I claim that is the only fully consistent interpretation (since the standard doesn't mention any other 2 parts). Wouldn't surprise me if the link might be explicitly clear in earlier versions of the standard, but I've not time for that right now. Anyway, I'd suspect I could guess the answer if it were submitted as an interp request. So I claim that the standard does guarantee that the equivalence works as stated. Of course, as usual, that doesn't mean that the compiler couldn't do all kinds of strange things behind your back to make it work, but I do claim that it has to work. Interestingly.... I had forgotten that the guarantees all apply to default complex only. I don't see any such stuff for other kinds of complex. The description of the 2 parts as real and imaginary still applies, but not the bit about being able to equivalence them. There, I agree that we have to fall back to the old "it will be that way in practice" instead of any guarantee of the standard. (And you may run into compilers {*filter*}ing about it, at least with a warning).  Richard Maine  Good judgment comes from experience; email: my first.last at org.domain  experience comes from bad judgment. org: nasa, domain: gov   Mark Twain

Mon, 27 Aug 2007 02:35:36 GMT 


glen herrmannsfeld #10 / 11

accessing 3D FFT data as 1D/3D complex/real arrays
(snip) Quote: > I was hoping Richard would settle the issue. > But I think we both ended up saying that it looks like a Cartesian > representation but might not be one as long as it acts like one. > I may have been sloppy (or the snipping overly aggressive) as I > was intending to discuss the internal representation in that fragment > and I think else where had said the external representation was > Cartesian. The conversion intrinsics are based on a Cartesian form. > A polar representation is strictly "roll your own" with abs, atan2, > sin and cos readily available. (My spill chucker keeps trying to > turn atan2 into satan so maybe it knows more math than I think!)
I believe it has to act like one even in the case of EQUIVALENCE and COMMON. I am pretty sure this was all required in F66 and F77, and it is a little harder to find the words in newer standards, but I believe it is there. Consider the description of partial association, for example: "When both parts of a default complex entity become defined as a result of partially associated default real or default complex entities becoming defined, the default complex entity becomes defined." My understanding is that you can EQUIVALENCE a complex variable and real variables or arrays, and define one through defining the other. I don't actually see where it requires the real part to come first and the imaginary second, but I don't know why anyone wouldn't do that. They are required to be adjacent in memory. Quote: > There are lots of things that tell one that the external representation > is as a pair of numbers but I could not find anything that would say > I could equivalence a complex with a real array of extent two and > be guaranteed that the first array element would be the real part and > the second array element would be the imaginary part. The FFT idiom > often uses the ability to treat the equivalence as actually working and > even guaranteed to work. The equivalence boiler plate (as reported in > F95 Handbook) is that there is no guaranteed mathematical relationship > which one understands to mean the single and double need not share bit > patterns, etc, etc and which could also mean that a real and complex > might have any relationship (real part, imaginary part, magnitude are > plausible things that would not require {*filter*} implementations) as none > is guaranteed. The other part of the FFT idiom is knowing the index > mapping of various dimensional arrays. That is completely and precisely > covered by sequence mapping so poses no issue. It does mean staying > with assumed size ("*" or F77 style dummy arrays) arrays to allow > for rank mismatch when needed. > The type mismatch could fail if one had a Cartesian representation > (i.e. as a pair of numbers) but that the convention was to provide the > address of the second member (imaginary part) and then looks in front > of it for the other member (real part).
If that were true you couldn't EQUIVALENCE a complex variable to the first element of a COMMON block, among other things. Quote: > The usual observed > implementation passes the first address and then looks just after it > for the other member. Equivalence can be used to cause other than > natural alignments (which can cause performance issues) but things > are still supposed to work. > There are three parts to the FFT storage idioms for Fortran. 1. The > ability to look inside a complex and find the real and imaginary parts > as adjacent elements of a real array. This is not standard conforming > but works on all know implementations.
I believe it is required that they be adjacent. Quote: > 2. Index mapping under programmer > control exactly as given by sequence association. Needs to use assumed > size, rather than assumed shape, for full effect.
Yes, it needs to be assumed size, as does most F77 compatibility. Quote: > 3. The equivalence > noted in 1 will work naturally across calls without additional fuss. > This is not standard conforming. With fuss it could be reduced to > just relying on the equivalence. In any case it works on all known > implementations.
To the extent that newer standards include F77, they must be standard conforming. The previous standards were specifically written to allow this, as it was commonly done. The only one I can't find is the one that says that the real part comes first.  glen

Mon, 27 Aug 2007 02:46:33 GMT 


Gordon Sand #11 / 11

accessing 3D FFT data as 1D/3D complex/real arrays
Quote:
>>>This makes two "heroic" assumptions that are >>>pretty save. The first is that the real and imaginary parts are laid out >>>adjacently and use a Cartesian representation.... >>>Again it is not "guaranteed" but is a sure bet. >>There are lots of things that tell one that the external representation >>is as a pair of numbers but I could not find anything that would say >>I could equivalence a complex with a real array of extent two and >>be guaranteed that the first array element would be the real part and >>the second array element would be the imaginary part. > Well, the 3rd sentence of 4.3 (in f2003) says "The first real value is > called the real part, and the second real value is called the imaginary > part." Admittedly, this might be what you call the "external" > representation, and what I might call a descriptive model. > However, 16.5.5 (Events that cause variables to become defined) talks > about defining a complex entity by defining both of its parts through > partial association. Elsewhere we see that partial association is what > happens between reals and complex in equivalence or common. Ok, I don't > quite see the explicit link that the 2 parts mentioned in 4.3 are the > same 2 parts mentioned in 16.5.5, but I claim that is the only fully > consistent interpretation (since the standard doesn't mention any other > 2 parts). Wouldn't surprise me if the link might be explicitly clear in > earlier versions of the standard, but I've not time for that right now. > Anyway, I'd suspect I could guess the answer if it were submitted as an > interp request.
Thank you. I was being overly parnoid about what was not locked down. I had long since learned to treat equivalence as undefing the other variables and had not looked for the exceptions. Quote: > So I claim that the standard does guarantee that the equivalence works > as stated. Of course, as usual, that doesn't mean that the compiler > couldn't do all kinds of strange things behind your back to make it > work, but I do claim that it has to work. > Interestingly.... I had forgotten that the guarantees all apply to > default complex only. I don't see any such stuff for other kinds of > complex. The description of the 2 parts as real and imaginary still > applies, but not the bit about being able to equivalence them. There, I > agree that we have to fall back to the old "it will be that way in > practice" instead of any guarantee of the standard. (And you may run > into compilers {*filter*}ing about it, at least with a warning).

Mon, 27 Aug 2007 03:49:01 GMT 


