Peter Gutmann (peterg@kcbbs.gen.nz)
Thu, 26 Mar 1998 03:02:07 +1200 (NZST)
>One thing that occurs to me is that if VTune is fairly automatic, and
>able to do its job quickly, perhaps someone could hack up a loader
>(as in load/run time) VTune which optimised for your particular CPU at
>load time. Or if VTune isn't that fast store some of VTune's
>decisions in a compact format ready to speed load time VTuneing.
Doesn't VTune just do static analysis based on its built-in model of the CPU's
behaviour?
The easiest option to get the best performance would be to include code for
different processors in cases where there's a significant difference and call
the appropriate routine in the same manner in which NT kernel calls work: The
first time you make the call to the function, it calls a dynamic-linking
routine which detects the CPU type and sets the function pointer to the
appropriate CPU-specific function. After that, the function gets called
directly:
  functionPtr = initFunction;

  initFunction:
    detect CPU type;
    functionPtr = CPU-optimised code;
    functionPtr();

  encryptFunction:
    functionPtr();
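
As a rough illustration, the pseudocode above could be sketched in C along the following lines. All of the names here (detect_cpu, encrypt_generic, encrypt_586, encrypt_init, encrypt_ptr) are made up for the example, the CPU detection is a stub that always reports a generic CPU, and both "optimised" routines are the same trivial XOR placeholder rather than real cipher code:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical CPU classes; a real detector would query the CPU (e.g. CPUID). */
typedef enum { CPU_GENERIC, CPU_586 } cpu_type;

typedef void (*encrypt_fn)(uint8_t *buf, size_t len);

/* Assumption: stub detector that always reports a generic CPU. */
static cpu_type detect_cpu(void) { return CPU_GENERIC; }

/* Placeholders for the CPU-specific routines (both just XOR with 0x5A). */
static void encrypt_generic(uint8_t *buf, size_t len) {
    for (size_t i = 0; i < len; i++) buf[i] ^= 0x5A;
}
static void encrypt_586(uint8_t *buf, size_t len) {
    for (size_t i = 0; i < len; i++) buf[i] ^= 0x5A;
}

static void encrypt_init(uint8_t *buf, size_t len);

/* The pointer starts out aimed at the resolver (functionPtr = initFunction). */
static encrypt_fn encrypt_ptr = encrypt_init;

/* First call lands here: pick a routine, rebind the pointer, forward the call. */
static void encrypt_init(uint8_t *buf, size_t len) {
    encrypt_ptr = (detect_cpu() == CPU_586) ? encrypt_586 : encrypt_generic;
    encrypt_ptr(buf, len);
}

/* Public entry point: always a single indirect call through the pointer. */
void encrypt(uint8_t *buf, size_t len) {
    encrypt_ptr(buf, len);
}
```

The first call to encrypt() goes through encrypt_init(), which rebinds encrypt_ptr; every later call goes straight to the selected routine. Note that this sketch makes no attempt to synchronise the pointer update between threads.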
However, given that the 586-asm looks like the best overall performer, it's
probably not worth accommodating anything other than this.
Peter.