Enzo Michelangeli (em@who.net)
Fri, 23 Apr 1999 17:59:59 +0800
With VC++ 4.2, optimized with /Ox, casting i to __int64 as suggested by Eric
backfires resulting in very lame code with two subroutine calls:
unsigned test( unsigned i, unsigned j )
{
return(((__int64)i*i)%j);
}
------------- 8< -------------
[...]
; File q.c
_i$ = 8
_j$ = 12
_test PROC NEAR
; Line 2
mov eax, DWORD PTR _i$[esp-4]
sub esp, 8
mov DWORD PTR -8+[esp+8], eax
push esi
mov DWORD PTR -8+[esp+16], 0
; Line 3
mov ecx, DWORD PTR -8+[esp+16]
mov edx, DWORD PTR -8+[esp+12]
push ecx
push edx
push ecx
push eax
call __allmul
mov ecx, DWORD PTR _j$[esp+8]
push 0
push ecx
push edx
push eax
call __allrem
; Line 4
pop esi
add esp, 8
ret 0
[...]
------------- 8< -------------
However, this seems to work nicely:
unsigned test( unsigned i, unsigned j )
{
register __int64 ii;
ii = i*i;
return (ii%j);
}
as it produces a tight:
------------- 8< -------------
; File q.c
_i$ = 8
_j$ = 12
_test PROC NEAR
; Line 5
mov eax, DWORD PTR _i$[esp-4]
sub edx, edx <---------- you also need to zero edx!
imul eax, eax
div DWORD PTR _j$[esp-4]
mov eax, edx
; Line 6
ret 0
------------- 8< -------------
There is still life for the old "register" keyword :-)
Cheers --
Enzo
-----Original Message-----
From: Olivier Langlois <olanglois@sympatico.ca>
To: Derek Atkins <warlord@MIT.EDU>
Cc: CodherPlunks@toad.com <CodherPlunks@toad.com>
Date: Friday, April 23, 1999 5:16 PM
Subject: Re: (x * x) % y
>
>-----Original Message-----
>From: Derek Atkins <warlord@MIT.EDU>
>To: Olivier Langlois <olanglois@sympatico.ca>
>Cc: CodherPlunks@toad.com <CodherPlunks@toad.com>
>Date: 22 avril, 1999 20:07
>Subject: Re: (x * x) % y
>
>
>>This would be a compiler bug, not a 'C' bug (assuming it is
>>a bug). What compiler are you using, and on what platform?
>
>The code was generated by VC++ on Windows.
>But I don't think it is a compiler bug. Like someone else said, 32 bits
>multiplication produce a 32 bits product.
>
>First I did a mistake. The correct assembler code would be:
>
>mov eax, DWORD PTR _i$[esp-4]
>mul eax
>div DWORD PTR _j$[esp-4]
>mov eax, edx
>
>The 1 operand version of the instruction mul multiply the operand (in that
>case eax) with the register eax. My first version result was unpredictable
>because eax wasn't initialized.
>
>The idea of my post is that the x86 processor can expand a 32 bits
>multiplication into a 64 bits value
>that it stores in a pair of registers (EDX:EAX). and the instruction DIV
>divide
>this value. So the x86 architecture would allow you to perform (i*i)%j for
>any value of i. But the compiler doesn't use this capability (or I don't
>know how to write correctly the C statement). It XOR the
>higher 32 bits in EDX and use a version of the MUL instruction (the 2
>operands
>version) that store the
>product in a 32 bits register.
>
>>GCC on Linux/x86 certainly gives different code.
>>
>>-derek
>>--
>> Derek Atkins, SB '93 MIT EE, SM '95 MIT Media Laboratory
>> Member, MIT Student Information Processing Board (SIPB)
>> URL: http://web.mit.edu/warlord/ PP-ASEL N1NWH
>> warlord@MIT.EDU PGP key available
>>
>
>
The following archive was created by hippie-mail 7.98617-22 on Thu May 27 1999 - 23:44:22