Bill Stewart (bill.stewart@pobox.com)
Mon, 26 Apr 1999 00:22:25 -0700
At 10:45 AM 4/23/99 +0200, Mok-Kong Shen wrote:
>Olivier Langlois wrote:
>> My question is: Is it possible to write a legal C construction that would
>> generate the correct assembler code?
If you want code that's both optimally fast and portable across
hardware platforms and compilers, no.
Many C compilers now have 64-bit "long long" or similar types,
and some compilers for some platforms have 64-bit longs.
For those cases, you can write legal C that's dependable,
just not portable.
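
As a rough sketch of that case, assuming a compiler that supplies
<stdint.h> with a 64-bit type (substitute "unsigned long long" or the
vendor's own 64-bit type where it doesn't), the widening multiply can be
written as below. The name mul32x32 is made up for illustration. Whether
the compiler turns this into a single multiply instruction is exactly the
thing you have to check in the assembler output.

```c
#include <stdint.h>

/* 32x32->64 unsigned multiply.
   Assumption: the compiler provides uint32_t and uint64_t. */
uint64_t mul32x32(uint32_t a, uint32_t b)
{
    /* Widen one operand first so the product is computed in 64 bits
       rather than being truncated to 32 bits before the assignment. */
    return (uint64_t)a * b;
}
```

The cast has to be on an operand, not on the result: (uint64_t)(a * b)
would do the multiply in 32 bits and throw the top half away.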
If you want a non-portable solution, you can play games with
your environment by trying various combinations of variable declarations,
casts, parentheses, order of operations, etc., which can sometimes
be a big performance win, especially if you know things about
the range of input and output values that there isn't a good
way to express in C. Instead of doing "edit, compile, test",
you do "edit, compile, read assembler" :-)
A long time ago I was writing Mandelbrot set calculators that
ran in the background on the AT&T Blit terminals (a 10MHz 68000-based
intelligent terminal with a 1Kx1K graphics screen and a really dumb
C compiler).
Portable code was pretty slow - the compiler called a subroutine
to do the fixed-point multiplication, incurring subroutine call times and
doing lots of manipulation to be correct in the general case.
(Floating-point was even slower, since there was no float hardware :-)
The 68000 has 32-bit registers, but ints on that compiler were 16 bits.
By putting the critical variables in registers and doing the
multiplies in the right order, it was possible to get the compiler
to emit just the 16x16->32 multiply, without calling the subroutine or
truncating back down to 16 bits, then accumulate in 32 bits and shift
to renormalize.
Part of the process of playing around with it was to make sure
I could put enough variables in registers to do the fast work
without running out of registers - a much nicer process on 68000 than 8086.
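
This is not the original Blit code, just a minimal sketch of the same idea
in present-day C. The Q4.12 fixed-point format, the FRAC_BITS constant and
the name mandel_iters are all assumptions made for the example. It assumes
the real and imaginary parts of c are within about -2..2 so that the
16-bit values cannot overflow, and it relies on right-shifting a negative
value being an arithmetic shift, which is implementation-defined in C.

```c
#include <stdint.h>

#define FRAC_BITS 12   /* hypothetical Q4.12 format: 1.0 is 4096 */

/* Count Mandelbrot iterations for c = cx + i*cy, both in Q4.12.
   Assumption: |cx| and |cy| are at most about 2.0 (8192). */
int mandel_iters(int16_t cx, int16_t cy, int max_iter)
{
    int16_t x = 0, y = 0;
    int i;

    for (i = 0; i < max_iter; i++) {
        /* Cast one operand so each product is a 16x16->32 multiply;
           the results are in Q8.24. */
        int32_t xx = (int32_t)x * x;
        int32_t yy = (int32_t)y * y;
        int32_t xy = (int32_t)x * y;

        /* Escape test done on the 32-bit values: |z|^2 > 4.0 */
        if (xx + yy > ((int32_t)4 << (2 * FRAC_BITS)))
            break;

        /* Accumulate in 32 bits, then shift to renormalize to Q4.12.
           Shifting by FRAC_BITS - 1 folds in the factor 2 of 2*x*y.
           Right shift of a negative value is implementation-defined. */
        x = (int16_t)(((xx - yy) >> FRAC_BITS) + cx);
        y = (int16_t)((xy >> (FRAC_BITS - 1)) + cy);
    }
    return i;
}
```

The point of keeping x and y as 16-bit values and casting only at the
multiply is to give the compiler every chance to pick the hardware
multiply; whether it actually does is still a matter of reading the
assembler.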
>32 bit hardware usually provides an extra register that can be utilized
>for multiplication of two 32 bit operands. Since however high-level
>programming languages don't (and can't) specify the availability
>of results outside of the domain of 'integers', compilers don't
>take that extra register into account. Hence the answer to your
>question is NO. But you can certainly write (slow running)
>high-level code to do multi-precision arithmetic.
In many cases, the MPI package will already contain a wide
selection of tricks to get high performance out of Intel hardware.
Either reuse it or recycle the code...
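
If neither is an option, the slow but portable route can be sketched
roughly as follows: split each operand into 16-bit halves and add up the
four partial products by hand, using nothing wider than 32 bits. The name
mul32_portable and the hi/lo output parameters are made up for the
example, and it assumes uint32_t is available and that int is no wider
than 32 bits.

```c
#include <stdint.h>

/* Portable 32x32->64 unsigned multiply using only 32-bit arithmetic.
   The 64-bit product is returned as two 32-bit halves.
   Assumption: uint32_t exists and int is at most 32 bits wide. */
void mul32_portable(uint32_t a, uint32_t b, uint32_t *hi, uint32_t *lo)
{
    uint32_t a_lo = a & 0xFFFFu, a_hi = a >> 16;
    uint32_t b_lo = b & 0xFFFFu, b_hi = b >> 16;

    /* Four 16x16->32 partial products; none can overflow 32 bits. */
    uint32_t p0 = a_lo * b_lo;   /* weight 2^0  */
    uint32_t p1 = a_lo * b_hi;   /* weight 2^16 */
    uint32_t p2 = a_hi * b_lo;   /* weight 2^16 */
    uint32_t p3 = a_hi * b_hi;   /* weight 2^32 */

    /* Sum of the 2^16 column: three values below 2^16, so no overflow. */
    uint32_t mid = (p0 >> 16) + (p1 & 0xFFFFu) + (p2 & 0xFFFFu);

    *lo = (p0 & 0xFFFFu) | (mid << 16);
    *hi = p3 + (p1 >> 16) + (p2 >> 16) + (mid >> 16);
}
```

That is four multiplies plus a handful of shifts, masks and adds where
the hardware could have done it in one instruction, which is the cost of
staying inside what C guarantees.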
Thanks!
Bill
Bill Stewart, bill.stewart@pobox.com
PGP Fingerprint D454 E202 CBC8 40BF 3C85 B884 0ABE 4639