top 32 bits of 64-bit product of two 32-bit integers
> I would be interested if any of {C,C++,Fortran90} can do the
> following:
> AFAIK, most modern CPUs calculate the product of two unsigned 32-bit
> integers as a 64-bit integer, and return the bottom 32 bits. Is
> there a fast way to access the top 32 bits of that product in
> {C,C++,Fortran90}, or is there a quick way to access the assembly
> language routine that does so?
Some compilers let you do this through nonstandard extensions such as
embedded assembly code; otherwise you can call a separate assembly routine.
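If the compiler happens to provide a 64-bit integer type, no assembly is needed at all. The following is a minimal sketch, assuming a C99-or-later compiler with <stdint.h>; the names mulhi32 and mulhi32_check are made up for illustration. Many compilers turn the widening multiply into a single instruction, but that is not guaranteed.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: high 32 bits of the 64-bit product a*b.
** Assumes uint64_t exists (C99 <stdint.h>); not plain ISO C90.
*/
uint32_t mulhi32(uint32_t a, uint32_t b)
{
    /* Widen one operand first so the multiply is done in 64 bits. */
    return (uint32_t)(((uint64_t)a * b) >> 32);
}

/* A few spot checks worked out by hand. */
void mulhi32_check(void)
{
    assert(mulhi32(0xFFFFFFFFu, 0xFFFFFFFFu) == 0xFFFFFFFEu);
    assert(mulhi32(0x10000u, 0x10000u) == 1u);
    assert(mulhi32(3u, 5u) == 0u);
}
```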
> Or is the
> only way to do this to use the "doubled-precision-multiply" technique:
> top32(A*B) = top16( top16( bot16(A)*bot16(B) )
> + bot16(A)*top16(B)
> + top16(A)*bot16(B) )
> + top16(A)*top16(B)
> And would the resulting code be somewhat portable?
That is the portable way to do it. Note, however, that the formula as written
isn't suitable for direct implementation: the sum of the three inner terms can
exceed 32 bits, so the carry out of it is lost. Here is an implementation that
should be portable:
/* ISO C 32x32 to 64 bit unsigned multiply */
#define LOW16(x) ((x) & 0xffff)
#define HIGH16(x) (((x)>>16) & 0xffff)

/* mpy32 returns the full product of two 32-bit integers.
** It should work with any ISO-compliant C compiler.
*/
void mpy32 (
    unsigned long a,   /* multiplicand, 0 .. (2^32-1) */
    unsigned long b,   /* multiplier, 0 .. (2^32-1) */
    unsigned long *ph, /* MS 32 bits of a*b */
    unsigned long *pl  /* LS 32 bits of a*b */
) {
    unsigned long pm;  /* intermediate term */

    *pl = LOW16(a) * LOW16(b);
    pm = HIGH16(a) * LOW16(b);
    *ph = HIGH16(a) * HIGH16(b) + HIGH16(pm);
    pm = LOW16(pm) + LOW16(a) * HIGH16(b) + HIGH16(*pl);
    *ph += HIGH16(pm);
    *pl = (LOW16(pm) << 16) + LOW16(*pl);
}
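To see the overflow concretely, here is a small sketch that evaluates the quoted formula literally in 32-bit arithmetic. It assumes C99's uint32_t from <stdint.h>; the names BOT16, TOP16, naive_top32 and naive_top32_check are hypothetical. For A = B = 0xFFFFFFFF the inner sum is 0x1FFFD0000, which does not fit in 32 bits, so the result comes out as 0xFFFEFFFE instead of the correct 0xFFFFFFFE.

```c
#include <assert.h>
#include <stdint.h>

#define BOT16(x) ((uint32_t)(x) & 0xffffu)
#define TOP16(x) (((uint32_t)(x) >> 16) & 0xffffu)

/* The quoted formula taken literally, all in 32-bit arithmetic. */
uint32_t naive_top32(uint32_t a, uint32_t b)
{
    uint32_t mid = TOP16(BOT16(a) * BOT16(b))
                 + BOT16(a) * TOP16(b)
                 + TOP16(a) * BOT16(b);   /* can wrap modulo 2^32 */
    return TOP16(mid) + TOP16(a) * TOP16(b);
}

void naive_top32_check(void)
{
    /* Small operands: no carry is lost, the formula is right. */
    assert(naive_top32(0x10000u, 0x10000u) == 1u);
    /* Large operands: the carry out of the inner sum is dropped. */
    assert(naive_top32(0xFFFFFFFFu, 0xFFFFFFFFu) == 0xFFFEFFFEu);
    assert(naive_top32(0xFFFFFFFFu, 0xFFFFFFFFu) != 0xFFFFFFFEu);
}
```

mpy32 avoids this by adding the two cross terms one at a time and keeping only 16-bit pieces in each partial sum, so no intermediate value exceeds 2^32-1.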
> How about getting the top 64 bits of the 128-bit product of two
> 64-bit integers [I work on a DEC Alpha AXP running OSF/1]?
Same concept, same technique: split each 64-bit operand into 32-bit halves.
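As a rough sketch of how that scales up, the same four-partial-product scheme with 32-bit halves looks like this. It assumes a 64-bit unsigned type (here C99's uint64_t; on an Alpha, unsigned long would serve), and the names LOW32, HIGH32, mpy64 and mpy64_check are made up for this example.

```c
#include <assert.h>
#include <stdint.h>

#define LOW32(x)  ((x) & 0xffffffffu)
#define HIGH32(x) ((x) >> 32)

/* Hypothetical 64x64 to 128 bit unsigned multiply, same scheme as mpy32. */
void mpy64(uint64_t a, uint64_t b,
           uint64_t *ph,  /* MS 64 bits of a*b */
           uint64_t *pl)  /* LS 64 bits of a*b */
{
    uint64_t lo = LOW32(a) * LOW32(b);
    uint64_t pm = HIGH32(a) * LOW32(b);          /* first cross term */
    uint64_t hi = HIGH32(a) * HIGH32(b) + HIGH32(pm);

    /* Second cross term plus carried-in pieces; cannot exceed 2^64-1. */
    pm = LOW32(pm) + LOW32(a) * HIGH32(b) + HIGH32(lo);
    hi += HIGH32(pm);

    *ph = hi;
    *pl = (LOW32(pm) << 32) + LOW32(lo);
}

void mpy64_check(void)
{
    uint64_t h, l;

    /* (2^64-1)^2 = 2^128 - 2^65 + 1 */
    mpy64(UINT64_C(0xFFFFFFFFFFFFFFFF), UINT64_C(0xFFFFFFFFFFFFFFFF), &h, &l);
    assert(h == UINT64_C(0xFFFFFFFFFFFFFFFE) && l == 1);

    /* 2^32 * 2^32 = 2^64 */
    mpy64(UINT64_C(0x100000000), UINT64_C(0x100000000), &h, &l);
    assert(h == 1 && l == 0);
}
```

Local variables are used for the running sums so that the function still behaves if ph and pl happen to alias the inputs' storage or each other's intermediate use.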

