Eric Young (eay@cryptsoft.com)
Thu, 26 Mar 1998 10:12:14 +1000 (EST)
On Wed, 25 Mar 1998, Adam Back wrote:
> Hmmm, try again, Eric's timings were in Blowfish ecb's (blocks, being
> 8 bytes, so I need to x8 to compare), all normalised to 166 Mhz.
>
> Intel AMD k6 AMD k5 Intel Pentium Pentium
> MMX MMX non-MMX non-MMX pro II
> ======================================================================
> bfs-072m-gcc 3.8 5.3 4.0
> bfs-082b-gcc 4.0 4.8 5.9 2.4 5.9
> bfs-082b-586-asm 8.3 5.9 5.8
> bfs-082b-686-asm 6.8 6.1 4.3 6.0
> ======================================================================
> bfs-072m-gcc-ptr2 4.6 5.8 4.9
> bfs-082b-gcc-ptr2 2.9 3.2 3.3 1.2 3.6
> bfs-082b-586-asm-ptr2 8.3 5.9 5.8
> bfs-082b-686-asm-ptr2 6.8 4.8 5.9
> ======================================================================
Hmm... OK, in the interests of us not going around and around in circles for
days :-), I've built 5 binaries for linux:
586.bf, 686.bf, def.bf, ptr.bf and ptr2.bf.
These are all the possible build options.
I have put them in ftp://pandora.cryptsoft.com/pub/eay/bf, along with the
correct linux libc.so shared library :-). It should also be noted that the
BF_PTR etc. options only affect the inner blowfish loop of the C code, so these
options have no effect on the assembler versions.
FYI, a diff against the 072m code I have indicates that its C code is the same
as the 082b code...
So if we want accurate comparisons of CPUs, I give you the binaries, and the
output from them on a pentium 100 and a pentium pro 200, both running linux.
All were built with -O3 -fomit-frame-pointer -DCPU=pentium -m486
Also, as far as I know, the only difference between a pentium and a pentium
mmx is the addition of the mmx instructions and a larger on-chip cache.
For a pentium pro vs pentium II, the same deal. The pentium pro/pentium II
differ from the pentium in their speculative execution, which actually makes
the pentium pro/pentium II much better for general C code. Often, though,
the pentium can be better for hand-tuned code.
From the figures I have just run, I now get
                    Intel   AMD k6  AMD k5   Intel    Pentium  Pentium
                    MMX     MMX     non-MMX  non-MMX  pro      II
======================================================================
bfs-072m-gcc        3.8     5.3     4.0
bfs-082b-gcc        4.0     4.8     5.9      3.9      6.0
bfs-082b-gcc-ptr                             4.1      5.6
bfs-082b-gcc-ptr2                            2.9      3.6
bfs-082b-586-asm    8.3     5.9     5.8      7.9      6.1
bfs-082b-686-asm    6.8     6.1     4.3      6.6      6.7
======================================================================
Anyway, the raw numbers are appended.
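
As a cross-check, the pentium 100 and pentium pro 200 figures in the table can be reproduced from the raw ecb numbers appended below, by scaling linearly to 166 MHz and expressing the result in millions of bytes per second. A minimal sketch in Python; the names RAW_ECB and normalise are illustrative only, and the linear scaling with clock speed is an assumption taken from the "normalised to 166 Mhz" remark in the quoted text:

```python
# "Blowfish raw ecb bytes per sec" figures copied from the appended output.
# Layout: machine -> (clock in MHz, {binary: bytes per second}).
RAW_ECB = {
    "pentium 100": (100, {
        "586.bf": 4784038.400,
        "686.bf": 3990870.400,
        "def.bf": 2349132.800,
        "ptr.bf": 2465136.000,
        "ptr2.bf": 1725177.600,
    }),
    "pentium pro 200": (200, {
        "586.bf": 7390034.240,
        "686.bf": 8082935.510,
        "def.bf": 7252084.422,
        "ptr.bf": 6790041.935,
        "ptr2.bf": 4327555.375,
    }),
}


def normalise(bytes_per_sec, clock_mhz, target_mhz=166):
    """Scale a throughput linearly to target_mhz, in 10^6 bytes/sec."""
    return bytes_per_sec * target_mhz / clock_mhz / 1e6


for machine, (mhz, results) in RAW_ECB.items():
    for binary, bps in results.items():
        print(f"{machine:16s} {binary:8s} {normalise(bps, mhz):.1f}")
```

Rounded to one decimal place, 586.bf comes out at 7.9 on the pentium 100 and 6.1 on the pentium pro 200, which matches the bfs-082b-586-asm row.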
I suppose I should take this offline soon, since we are going on a bit :-).
Anyway, back to the relevance to general algorithm encoding: libdes
has a large number of variations.
I have 3 different styles, index or ptr, and 4 or 16 loop unrolling. The silly
thing is that different CPUs/compilers all like different versions.
For a PPC604 100MHz AIX system:
options  des ecb/s
4 c p    275118.61  100.0%
16 c p   224483.98   81.6%
Full loop unrolling slows things down by nearly 20%. This happens on quite a few
CPUs and is in some ways unexpected.
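
The percentage column is just each variant's rate relative to the fastest variant on the same machine. A small sketch of that arithmetic, using the PPC604 figures above; the function name relative_to_best is made up for illustration and is not part of libdes:

```python
def relative_to_best(results):
    """Map each option string to its speed as a percentage of the fastest."""
    best = max(results.values())
    return {opts: 100.0 * (rate / best) for opts, rate in results.items()}


# des ecb/s for the PPC604 100MHz AIX system, keyed by option string.
ppc604 = {"4 c p": 275118.61, "16 c p": 224483.98}

for opts, pct in relative_to_best(ppc604).items():
    print(f"{opts:8s} {ppc604[opts]:10.2f} {pct:5.1f}%")
```

For these two rows the fully unrolled build runs at 81.6% of the 4-way unrolled one, i.e. a slowdown of a bit over 18%.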
For a quick example of how the best option varies...
16 r2 i   43552.51  100.0%  Solaris 2.4, 486 50MHz, gcc 2.6.3
16 r1 p  158667.40  100.0%  Pentium 100, Linux
16 r2 i  434454.80  100.0%  Pentium pro 200, FreeBSD
4 c p    275118.61  100.0%  PPC604 100, AIX
4 r2 p   181146.14  100.0%  Alpha, 165MHz?, cc
16 c i   149448.90  100.0%  HPUX 10 - 9000/887 - cc
16 c i   124382.70  100.0%  Solaris 2.5.1 - sparc 10 50MHz - gcc 2.7.2
16 r1 p  108791.59   87.5%  Solaris 2.5.1 - sparc 10 50MHz - gcc 2.7.2
16 r1 p  475516.90  100.0%  Solaris 2.5.1 - usparc 167MHz?? - SC4.0
16 c i   427001.40   89.8%  Solaris 2.5.1 - usparc 167MHz?? - SC4.0
The sparc 10 vs usparc rows show differences due to both CPU and compiler...
If I remember correctly, there is a middle-ground set of options that both
sparcs don't mind too much.
Anyway for those interested in these things, I could go on for hours :-)
eric
pentium 100
586.bf
Blowfish set_key per sec = 1065.200 ( 938.791uS)
Blowfish raw ecb bytes per sec = 4784038.400 ( 1.672uS)
Blowfish cbc bytes per sec = 4279808.000 ( 1.869uS)
686.bf
Blowfish set_key per sec = 898.800 ( 1112.595uS)
Blowfish raw ecb bytes per sec = 3990870.400 ( 2.005uS)
Blowfish cbc bytes per sec = 3657011.200 ( 2.188uS)
def.bf
Blowfish set_key per sec = 544.400 ( 1836.885uS)
Blowfish raw ecb bytes per sec = 2349132.800 ( 3.406uS)
Blowfish cbc bytes per sec = 1917644.800 ( 4.172uS)
ptr.bf
Blowfish set_key per sec = 570.400 ( 1753.156uS)
Blowfish raw ecb bytes per sec = 2465136.000 ( 3.245uS)
Blowfish cbc bytes per sec = 1993216.000 ( 4.014uS)
ptr2.bf
Blowfish set_key per sec = 376.400 ( 2656.748uS)
Blowfish raw ecb bytes per sec = 1725177.600 ( 4.637uS)
Blowfish cbc bytes per sec = 1329049.600 ( 6.019uS)
pentium pro 200
586.bf
Blowfish set_key per sec = 1556.548 ( 642.447uS)
Blowfish raw ecb bytes per sec = 7390034.240 ( 1.083uS)
Blowfish cbc bytes per sec = 7141371.888 ( 1.120uS)
686.bf
Blowfish set_key per sec = 1798.993 ( 555.867uS)
Blowfish raw ecb bytes per sec = 8082935.510 ( 0.990uS)
Blowfish cbc bytes per sec = 7717407.347 ( 1.037uS)
def.bf
Blowfish set_key per sec = 1639.431 ( 609.968uS)
Blowfish raw ecb bytes per sec = 7252084.422 ( 1.103uS)
Blowfish cbc bytes per sec = 5780346.076 ( 1.384uS)
ptr.bf
Blowfish set_key per sec = 1549.798 ( 645.246uS)
Blowfish raw ecb bytes per sec = 6790041.935 ( 1.178uS)
Blowfish cbc bytes per sec = 5501813.065 ( 1.454uS)
ptr2.bf
Blowfish set_key per sec = 995.719 ( 1004.300uS)
Blowfish raw ecb bytes per sec = 4327555.375 ( 1.849uS)
Blowfish cbc bytes per sec = 3719978.883 ( 2.151uS)
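
The bracketed times in the output above appear to be the time per operation: one key setup for set_key, and one 8-byte block for the ecb and cbc lines. A short sketch of that relationship, assuming the 8-byte Blowfish block size mentioned in the quoted text; usec_per_op is a hypothetical helper, not a function from the benchmark program:

```python
def usec_per_op(rate_per_sec, units_per_op=1):
    """Microseconds per operation, given a rate in units per second."""
    return units_per_op * 1e6 / rate_per_sec


# pentium 100, 586.bf figures from the output above.
print(f"set_key: {usec_per_op(1065.200):.3f}uS")
print(f"ecb:     {usec_per_op(4784038.400, units_per_op=8):.3f}uS")
print(f"cbc:     {usec_per_op(4279808.000, units_per_op=8):.3f}uS")
```

With these inputs the results agree with the bracketed 938.791uS, 1.672uS and 1.869uS values to three decimal places.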