Anonymous (nobody@zoom.nu)
Fri, 27 Mar 1998 03:42:01 +0100
I'm working on some stego tools. Here is the model.
The work is broken into two parts. The common interface is a file full
of random bits (50% 0's, 50% 1's). One part converts between encrypted
messages and random files, stripping off any identifying information about
the encrypted messages, rescaling bignum values to look random, etc.
The other part works with random files and does the stego operation,
inserting them into or extracting them from the data which hides them.
By dividing the work like this, people who know the crypto formats can
write converters between random files and various encrypted message
formats. Other people can write stego tools which work directly with
random files and hide them in various ways.
I have created an algorithm which creates a special kind of random number
generator given the random input file. You give it a probability, like
4/13, and it uses the random bits to produce a boolean value which is
true with probability 4/13. This is done in such a way that later,
knowing the probability (4/13) and the choice (true or false) allows
you to infer the random bits which were used. A series of such choices
gradually uses up the random input, and at the other end allows the
receiver to reconstruct it.
This is conceptually tricky because it involves fractional bits and
arithmetic encoding, but the actual algorithm works out to be quite
simple.
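
For illustration, here is a minimal sketch of one way such a reversible generator could work. It is not necessarily the algorithm described above: it uses exact rational arithmetic in the style of an arithmetic decoder, the names BitChooser and recover_bits are hypothetical, and it assumes the receiver learns the number of hidden bits by some other means.

```python
from fractions import Fraction
import math


class BitChooser:
    """Sender side: spends a non-empty string of '0'/'1' characters on biased choices."""

    def __init__(self, bits):
        self.nbits = len(bits)
        # Treat the bits as an integer N and hide the point (N + 1/2) / 2**n
        # inside the unit interval [0, 1).
        self.x = Fraction(2 * int(bits, 2) + 1, 2 ** (self.nbits + 1))
        self.lo, self.hi = Fraction(0), Fraction(1)

    def exhausted(self):
        # Once the interval is this narrow it pins down every input bit.
        return self.hi - self.lo <= Fraction(1, 2 ** self.nbits)

    def choose(self, p):
        """Return True with probability close to p (0 < p < 1) for uniform input bits."""
        split = self.lo + (self.hi - self.lo) * p
        if self.x < split:
            self.hi = split
            return True
        self.lo = split
        return False


def recover_bits(choices, nbits):
    """Receiver side: rebuild the bit string from (probability, outcome) pairs."""
    lo, hi = Fraction(0), Fraction(1)
    for p, outcome in choices:
        split = lo + (hi - lo) * p
        if outcome:
            hi = split
        else:
            lo = split
    if hi - lo > Fraction(1, 2 ** nbits):
        raise ValueError("not enough choices to determine every bit")
    # Exactly one point of the form (N + 1/2) / 2**nbits lies in [lo, hi).
    n = math.ceil(lo * 2 ** nbits - Fraction(1, 2))
    return format(n, "0{}b".format(nbits))


# Round trip: keep choosing at probability 4/13 until the input is used up.
message = "1011001110001111"
source = BitChooser(message)
log = []
while not source.exhausted():
    p = Fraction(4, 13)
    log.append((p, source.choose(p)))
assert recover_bits(log, len(message)) == message
```

Each choice narrows the interval by a factor of p or 1 - p, so a likely outcome uses up only a fraction of a bit and an unlikely one uses up more than a bit. A practical version would presumably use fixed-precision integers rather than ever-growing fractions.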
Using this algorithm, I've created a simple text justification program.
It expands lines to be constant width by inserting spaces randomly into
each line. The choice of where to put the spaces is driven by the
RNG which takes a file as input. There is another program which, given
the justified file, reconstructs the random file which was used to drive
the RNG.
This is not a particularly dense encoding; it averages three or four
bits per line. So you must use a very long message to encode any
significant amount of material. But it does demonstrate the principle.
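
As a rough illustration of the justification idea, the sketch below pads one line and then reads the choices back out of it. It is an assumption about how such a program might be organized, not the actual tool: the names justify_line and read_line_choices are hypothetical, the unjustified text is assumed to have exactly one space between words, and `choose` stands for any function that returns a boolean with the requested probability, such as the file-driven RNG.

```python
from fractions import Fraction
import re


def justify_line(words, width, choose):
    """Pad one line to `width` columns; choose(p) decides which gaps get widened."""
    gaps = len(words) - 1
    extra = width - (sum(len(w) for w in words) + gaps)  # spaces beyond one per gap
    if gaps < 1 or extra < 0:
        return " ".join(words)  # nothing to choose on this line
    base, remaining = divmod(extra, gaps)  # `remaining` gaps get one more space
    out = [words[0]]
    for i in range(gaps):
        left = gaps - i  # gaps not yet decided, including this one
        if remaining == 0:
            wide = False  # forced: no wide gaps left to place
        elif remaining == left:
            wide = True  # forced: every gap that is left must be wide
        else:
            wide = bool(choose(Fraction(remaining, left)))
        remaining -= wide
        out.append(" " * (1 + base + wide) + words[i + 1])
    return "".join(out)


def read_line_choices(line):
    """Return the (probability, outcome) pairs that produced a justified line."""
    widths = [len(g) for g in re.findall(r" +", line.strip())]
    if not widths:
        return []
    base = min(widths)
    remaining = sum(w > base for w in widths)  # number of widened gaps
    choices = []
    for i, w in enumerate(widths):
        left = len(widths) - i
        wide = w > base
        if 0 < remaining < left:  # a real choice was made at this gap
            choices.append((Fraction(remaining, left), wide))
        remaining -= wide
    return choices
```

Picking which k of g gaps to widen this way makes every arrangement equally likely, so a line carries about log2 of (g choose k) bits. With 4 gaps and 1 widened gap that is 2 bits.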
Next I want to write a parody generator. Given a base document, this
extracts the frequencies of letter combinations and then uses them to produce
another document which has the same combinations of letters. Depending
on the lengths of the letter strings used, the resulting document can
often look superficially like real text.
Choosing each letter involves an RNG, so this would be a good
use for the stego RNG. As long as both sides have the same base document
(it could be some famous text), the receiver can extract the random
file which was used as input.
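
To make the idea concrete, here is a small sketch of a letter-level parody generator whose choices can be read back by anyone holding the same base document. It is only one possible design: the names build_table, generate and read_choices are hypothetical, the context length `order` is assumed to be at least 1, and `choose` again stands for any function that returns a boolean with the requested probability.

```python
from collections import Counter, defaultdict
from fractions import Fraction


def build_table(base_text, order):
    """Map each `order`-letter context to the sorted (letter, count) pairs following it."""
    counts = defaultdict(Counter)
    for i in range(len(base_text) - order):
        counts[base_text[i:i + order]][base_text[i + order]] += 1
    return {ctx: sorted(c.items()) for ctx, c in counts.items()}


def generate(table, seed, length, choose):
    """Extend `seed` (one context long) letter by letter, up to `length` characters."""
    order = len(seed)
    out = seed
    while len(out) < length and out[-order:] in table:
        options = table[out[-order:]]
        total = sum(n for _, n in options)
        for letter, n in options:
            # The last candidate is forced, so no choice is spent on it.
            if n == total or choose(Fraction(n, total)):
                out += letter
                break
            total -= n
    return out


def read_choices(table, text, order):
    """Replay `text` against the table and return the (probability, outcome) pairs."""
    choices = []
    for i in range(order, len(text)):
        options = table[text[i - order:i]]
        total = sum(n for _, n in options)
        for letter, n in options:
            if n == total:
                break  # forced candidate: no choice was made here
            picked = letter == text[i]
            choices.append((Fraction(n, total), picked))
            if picked:
                break
            total -= n
    return choices
```

Sorting the candidate letters gives both sides the same order in which to offer them, and that shared order is what lets the receiver replay the choices. Letters that are the only possible successor of their context carry no hidden information.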
Can anyone point to source for a parody generator like this? Emacs has
one in its dissociated-press command, but I don't want to download megabytes
of source to dig this out. I think there was an article in Scientific
American a few years ago. I tried writing one using a simple algorithm
but it did not generalize well to longer letter strings.
Thanks!