Ben Laurie (ben@algroup.co.uk)
Fri, 23 Apr 1999 14:03:17 +0100
Enzo Michelangeli wrote:
>
> With VC++ 4.2, optimized with /Ox, casting i to __int64 as suggested by Eric
> backfires resulting in very lame code with two subroutine calls:
>
> unsigned test( unsigned i, unsigned j )
> {
> return(((__int64)i*i)%j);
> }
>
> ------------- 8< -------------
> [...]
> ; File q.c
> _i$ = 8
> _j$ = 12
> _test PROC NEAR
> ; Line 2
> mov eax, DWORD PTR _i$[esp-4]
> sub esp, 8
> mov DWORD PTR -8+[esp+8], eax
> push esi
> mov DWORD PTR -8+[esp+16], 0
> ; Line 3
> mov ecx, DWORD PTR -8+[esp+16]
> mov edx, DWORD PTR -8+[esp+12]
> push ecx
> push edx
> push ecx
> push eax
> call __allmul
> mov ecx, DWORD PTR _j$[esp+8]
> push 0
> push ecx
> push edx
> push eax
> call __allrem
> ; Line 4
> pop esi
> add esp, 8
> ret 0
> [...]
> ------------- 8< -------------
>
> However, this seems to work nicely:
>
> unsigned test( unsigned i, unsigned j )
> {
> register __int64 ii;
> ii = i*i;
> return (ii%j);
> }
>
> as it produces a tight:
>
> ------------- 8< -------------
> ; File q.c
> _i$ = 8
> _j$ = 12
> _test PROC NEAR
> ; Line 5
> mov eax, DWORD PTR _i$[esp-4]
> sub edx, edx <---------- you also need to zero edx!
> imul eax, eax
> div DWORD PTR _j$[esp-4]
> mov eax, edx
> ; Line 6
> ret 0
> ------------- 8< -------------
>
> There is still life for the old "register" keyword :-)
Isn't this exactly the same (broken) code he is trying to avoid? You've
said "do a 32-bit multiply and widen _the result_ to 64 bit", which gets
you nowhere (and is why the resulting code is "tight", no doubt).
The one line equivalent of what you wrote is "(__int64)(i*i)%j", which
is not the same as "((__int64)i*i)%j".
Cheers,
Ben.
>
> Cheers --
>
> Enzo
>
> -----Original Message-----
> From: Olivier Langlois <olanglois@sympatico.ca>
> To: Derek Atkins <warlord@MIT.EDU>
> Cc: CodherPlunks@toad.com <CodherPlunks@toad.com>
> Date: Friday, April 23, 1999 5:16 PM
> Subject: Re: (x * x) % y
>
> >
> >-----Original Message-----
> >From: Derek Atkins <warlord@MIT.EDU>
> >To: Olivier Langlois <olanglois@sympatico.ca>
> >Cc: CodherPlunks@toad.com <CodherPlunks@toad.com>
> >Date: 22 avril, 1999 20:07
> >Subject: Re: (x * x) % y
> >
> >
> >>This would be a compiler bug, not a 'C' bug (assuming it is
> >>a bug). What compiler are you using, and on what platform?
> >
> >The code was generated by VC++ on Windows.
> >But I don't think it is a compiler bug. Like someone else said, 32 bits
> >multiplication produce a 32 bits product.
> >
> >First I did a mistake. The correct assembler code would be:
> >
> >mov eax, DWORD PTR _i$[esp-4]
> >mul eax
> >div DWORD PTR _j$[esp-4]
> >mov eax, edx
> >
> >The 1 operand version of the instruction mul multiply the operand (in that
> >case eax) with the register eax. My first version result was unpredictable
> >because eax wasn't initialized.
> >
> >The idea of my post is that the x86 processor can expand a 32 bits
> >multiplication into a 64 bits value
> >that it stores in a pair of registers (EDX:EAX). and the instruction DIV
> >divide
> >this value. So the x86 architecture would allow you to perform (i*i)%j for
> >any value of i. But the compiler doesn't use this capability (or I don't
> >know how to write correctly the C statement). It XOR the
> >higher 32 bits in EDX and use a version of the MUL instruction (the 2
> >operands
> >version) that store the
> >product in a 32 bits register.
> >
> >>GCC on Linux/x86 certainly gives different code.
> >>
> >>-derek
> >>--
> >> Derek Atkins, SB '93 MIT EE, SM '95 MIT Media Laboratory
> >> Member, MIT Student Information Processing Board (SIPB)
> >> URL: http://web.mit.edu/warlord/ PP-ASEL N1NWH
> >> warlord@MIT.EDU PGP key available
> >>
> >
> >
-- http://www.apache-ssl.org/ben.html"My grandfather once told me that there are two kinds of people: those who work and those who take the credit. He told me to try to be in the first group; there was less competition there." - Indira Gandhi
The following archive was created by hippie-mail 7.98617-22 on Thu May 27 1999 - 23:44:22