Mok-Kong Shen (mok-kong.shen@stud.uni-muenchen.de)
Fri, 08 Jan 1999 16:29:31 +0100
NOTE: The following is a revised (slightly extended) version of an
article I posted elsewhere.
In light of the forthcoming 56-bit key length restriction it appears
valuable to re-contemplate on all available techniques in cryptology
(independent of whether they are favoured in current practice) in order
to assess their value (even when employed as a minor 'adjuvant') in
contributing to (combined) encryption schemes that can offer adequate
security to confidential communications of innocent and law-obeying
people of the world despite the severe limitation posed by the 56-bit
key length upper bound. I propose therefore to discuss on the topic of
pseudo-OTP.
An ideal true OTP is by definition absolutely unpredictable and,
furthermore, used only once. This, as far as I understand, is
theoretical and not strictly obtainable in practice. Using physical
noise and taking care to avoid bias gives some near approximation to
an ideal true OTP. But, as is well known, such OTP is disadvantageous
in handling, since a copy has to be transported and securely kept by
the communication partners. Thus there is a place in practical
encryption of employing pseudo-OTP that does not have this management
problem though on the other hand may be (I like to argue for 'not
necessarily') of security that is inferior to OTP obtained through
physical means.
What I actually like to have discussed here is a method of generating
pseudo-OTP that is given further below. But before doing that I want
to recall that a pseudo-OTP can be obtained trivially from a PRNG, if
the sequence is not longer than the period of the PRNG and each
subsequence is never reused. The problem is of course that most PRNGs
that are good for numerical computations are cryptologically very weak.
On the other hand, there are cryptologically strong PRNGs known in the
literature. Sometime ago I myself also made a humble attempt (in my
compound PRNG, see [1]) to obtain pseudo-random number sequences
of arbitrarily huge periods that are (I believe) difficult to analyze
and that are generated through interleaving the outputs of an
arbitrarily large number of (weak) PRNGs and passing that multiplex
through a shuffling buffer and then post-processing the numbers
outgoing from the shuffling buffer through a pairwise addition mod 1.
(This is a special case of the device of Wichmann and Hill originally
designed for improving the statistical properties of PRNGs. The
general case of adding together more than two numbers mod 1 is
employed in the 56-bit EX-versions of my WEAK-series of algorithms.)
Thus, summarizing the above, obtaining pseudo-OTP from PRNG is one of
the viable ways in practice.
Now I want to describe another practical way of obtaining pseudo-OTP.
(REMARK: the stuff is known, no novelty is claimed.) One may have
psychologically some natural reservations against PRNGs. For the
numbers are ultimately (despite transformations, shuffling, etc.)
originated from some simple mathematical equations. Hence the actual
sequences could have some auto-correlations in them, however minute.
(This is why physical random numbers are believed to be superior.)
This motivates us to look for sources other than mathematics for
obtaining pseudo-OTP. One simple and convenient such source suggests
itself: natural language texts. For these are abundant and easily
available everywhere, including regions of the globe outside of the
export-regulating countries. Of course, from the very beginning of
cryptology it has been known that natural language texts are highly
auto-correlated and some classical methods of analysis are in fact
mainly based on frequency distributions of single letters, bigrams,
trigrams, etc. However, we can attempt to remove, more exactly to
arbitrarily reduce, the auto-correlations in several ways and these
preferably in combination (thus hopefully approaching white noise):
1. Apply (mono- or polyalphabetical) substitutions or homophones.
2. Apply permutations (transpositions), including reversal.
3. Combining different streams through:
a. bitwise XOR.
b. addition mod 2^32 (for example), also with circular shifts.
c. substitution, e.g. mapping a pair of bytes, one from each
stream, to a pair of bytes.
4. Compression through:
a. common compression techniques (the Huffman encoding is a
substitution with a variable length code).
b. XORing groups of bits, e.g. 32 bits, i.e. taking their parity.
5. Applying other simple encryption techniques, e.g. auto-keying
applied to a group of words.
Note that for (3) an arbitrary number of streams can be combined
and that all techniques can be applied recursively, for example
(3b) can be applied after (4b). Of course at this point one normally
tends to recall the efficiency issue. However, as I recently pointed
out [2,3], one practical and simple way of coping with the 56-bit
limitation is to make use of a new paradigm, namely 'security through
inefficiency', i.e. paying the price of some longer processing time.
This can at least be justified in a poorman's environment and appears
to be useful in a substantial part of civilian applications as well.
One remarkable advantage of using pseudo-OTP generated from natural
language texts as compared to OTP obtained from physical means is that
these need not be securely stored. In fact, the user may display the
natural language texts and read them for entertainment when he is not
doing encryption. Neither is it necessary nor even advantageous to
pre-compute the pseudo-OTP once for all subsequent uses. For one
needs only to record the offsets of the different streams
participating in the pseudo-OTP that has been arrived at by the
previously sent messages. The 'key' (the secret to be kept by the
communication partners) consists of the names or ISBN of the texts
and their very first starting offsets. (As a variation one could
introduce some amounts of skips after each message sent). Since
one's library (digitalized to be quickly used) may contain a large
number of texts (maybe of several different languages which is
advantageous from the point of view of combination), each having the
potential of participating in the pseudo-OTP, the mere knowledge of
the content of the library of the user is virtually useless to the
analyst. Further, an almost unlimited number of pseudo-OTPs can be
generated after some having been exhausted in actual communications.
Some remarks to special cases: Some of the participating streams could
even be from the same text, i.e. differing only in offsets. In place
of natural language texts one could also use mathematical constants,
e.g. Pi. (Transcendental functions other than Pi have practical
problems of dealing with arbitrarily given offsets, unless they have
been computed to a sufficiently large number of digits and are easily
available.) Further, since the sources are entirely public, these need
not even be stored at the user's place but downloaded as needed from
some server of the internet. Finally there is apparently nothing
againt using a (weak) PRNG as one of the participating streams. Also
in principle one can, if one wishes, incorporate other sources,
including digitalized voice, photos, picture, etc. into the scheme.
(Even recordings of hardware noise, i.e. OTP, if already available at
the communication partners, can be used; the naturaul language texts
then serves in some sense to prolong enormously the useful lifespan
of such recordings.)
Some words to avoid misunderstanding: Since the techniques for
constructing pseudo-OTP mentioned above are all very simple, people
outside of the export-regulating countries can easily implement these
(without having to import them) in order to be able to communicate
confidentially either with themselves or with people of the 33
countries. Hence export regulations have no effect on pseudo-OTP.
The items listed are simply those that I personally believe to be
potentially useful; they are in no sense exhaustive. Experimental
studies are certainly needed in order to find optimal schemes for the
practice, thus leading to additions and/or refinements of the
techniques.
In the above I have made a humble attempt to sketch one possible way
of obtaining a pseudo-OTP. I should appreciate your opinions on that
and suggestions of other ways of advantageously generating such for
applications in the future 56-bit environment.
M. K. Shen
------------------------
Literature
[1] http://www.stud.uni-muenchen.de/~mok-kong.shen/#paper9
[2] http://www.stud.uni-muenchen.de/~mok-kong.shen/#paper15
[3] http://www.stud.uni-muenchen.de/~mok-kong.shen/#paper17
The following archive was created by hippie-mail 7.98617-22 on Sat Apr 10 1999 - 01:18:02