Sandy Harris (sandy.harris@sympatico.ca)
Mon, 31 Aug 1998 23:13:55 -0400
I've thought up a method of cryptanalysis which I don't recall
reading of anywhere, probably for several of the following reasons:
it's pretty obvious
it's hopelessly inefficient
my memory's not all that good
I haven't read all that much
But it looks to me like it isn't entirely hopeless, & even if it
is, then proving it hopeless for a given cipher might be useful.
Consider a cipher that operates on 64-bit blocks & uses a 128-bit
key. Can we write equations in any useful algebra for the output
in terms of the key & input? Given enough matched input/output
(plaintext/ciphertext) pairs known to use the same key, can we
just solve for the key?
This is why we need non-linear operations in a cipher. If the
operations were all linear & boolean, each input/output pair
would give us 64 linear equations for the output variables in
terms of (known) inputs & key. Two pairs, 128 bits of known
plaintext, & we'd have 128 linear equations in 128 variables.
Solving for the key would be straightforward.
Against any decent real cipher, it certainly won't be at all
straightforward. My questions are whether it is possible at
all, if so, whether it's pratical and whether the answers
to those questions can be proved.
Choosing an algebra:
The attacker can choose whatever algebra gives the best attack.
For most ciphers, you would choose an algebra that coped well
with some of the cipher's operations and then spend some effort
expressing the other cipher operations in that algebra.
For example, if you're attacking CAST-128 or Twofish, you likely
want to use Boolean algebra, so you need addition expressed in
Boolean terms. Use the relation:
a + b = (a^b) + (a&b)<<1
applied recursively until the shift goes off the end of the
wordsize.
On the other hand, if you're attacking IDEA you might want to
use arithmetic modulo 2^16 or modulo 2^16*(2^16+1), expressing
bitwise XOR in arithmetic terms as:
a ^ b = ((a+b)%2) + 2((a/2+b/2)%2) + . . .
where / is truncating integer division and % is remainder.
Whether you're expressing Boolean operations arithmetically or
vice versa, you expect to get some moderately messy non-linear
expressions.
What to attack:
For some ciphers, you'd solve for the primary key. For others,
you might solve for the round keys. Choose whichever is easier,
since both break the cipher.
Going for the round keys gives you more variables to solve for,
but avoids dealing with complexity in the key schedule. With a
simple key schedule, you obviously go for the primary key. If
the key schedule is more complex, look at the tradeoffs.
For ciphers using key-dependent s-boxes, you might treat the
s-box contents as unknowns & try to solve for them. This looks
highly impractical against Blowfish with its 32K bits of s-box
material, but perhaps less so for Twofish.
Example:
I know that Boolean problems can be thoroughly intractable,
in particular that SAT is NP-complete, and that the equations
for a good cipher will not be linear. But are there any proofs
that the problem is intractable for real ciphers?
Take a simple variant of CAST, for example, in which round keys
are just chunks of the primary key and all the combining operations
(key-text, s-box outputs & Feistel halves) are just XOR. The only
non-linear part is the s-box lookup.
S-boxes can be written as Boolean assertions along the lines:
(!a0 && !a1 && ... !a7 && b0 && !b1 && . . . b31) || (a0 && !a1 && ..) ...
Where a0...a7 are the 8 input bits, b0...b31 the 32 outputs, and the
rest of the notaion is from C, ! for negation and && and || for AND and
OR. An s-box lookup makes exactly one of the 256 bracketed clauses
true. Each clause describes one possible input & the matching output.
In my example, the first clause maps 8-bit input 00...0 to 32-bit
output 10...1.
Now I rather strongly suspect that if I extended this to 16-round
CAST, got a big collection of such assertions, handed them to an
equation-solver & asked it for the key, I'd have a spectacularly
long wait.
But can we prove the project is doomed from the start because I'd
need more than 2^64 different plaintexts & that's not possible?
This would prove this CAST immune to this attack in just the way
various ciphers are shown to be immune to differential or linear
cryptanalysis.
Or, given that CAST uses bent functions as s-box columns and that
overall s-box non-linearity is high, can we prove some interesting
lower bound on the work required to solve that system of equations?
Is there perhaps even an obvious bound because equation-solvers use
a form of search?
-- Sandy Harris sandy.harris@sympatico.ca Help secure the Internet: http://www.cygnus.com/~gnu/swan.html
The following archive was created by hippie-mail 7.98617-22 on Sat Apr 10 1999 - 01:11:02