Wednesday, July 05, 2006

A hard-to-beat cipher?

We have the set of integers from, say, 0 to 100 million -- call it S -- as our primary source of cipher numbers. Supposing we have 100 plaintext characters we wish to represent, we intend to establish 100 subsets of S at 1 million integers per set. We then randomly draw numbers from S and assign them to each subset. That is, each subset will have 1 million integers chosen randomly from S.
We now assign a plaintext character to each subset, meaning that any element of the subset represents the character.
For example, the letter "e" would be represented by any one of a million numbers.
When a message is sent, a program randomly, or pseudorandomly, selects a number from the set assigned to a character and then discards it for the remainder of the transmission. That is, "e" would be represented by a different integer for every occurrence. But the decipherer program would check each number against the subsets and would know which character was intended.
What this means is that an enciphered message would contain no repeating numbers.
Hence, frequency analysis would fail. It's impossible to determine the probability that some number represents say an "e" because although "e" shows up often in the message, "e" is represented by a set of unique numbers, each of which shows up only once.
If an identical message was sent twice, it is highly improbable that its numerical representation would recur. For example the probability of "hello" being represented by the same digit string twice is about 10^(-30).
Even if a particular message was decoded without use of the key, there is only a very low chance of a subsequent message being decoded.
No program -- not even one of the proposed quantum computer designs -- is likely to be able to crack that cipher given a sample of a few messages, though, given enough messages, that possibility exists. However, it's possible to either randomly rebuild the subsets after some interval or to begin with a number much higher than 10^8.
Advantage of the program is that it doesn't need to hide frequency in a smokescreen of false noise, thus making it fairly compact for an enciphered program. Disadvantage is that it requires that receiver and sender have the key. Yet, the program could be used in concert with a public key encryption system.
A reason for not using public key encryption only is because of the efficiency issue. Some might eschew the public key method altogether because sometimes the primes are low enough for supercomputers to crack, or because there is no telling how far advanced quantum computers have come in classified research.
Enciphered programs usually tend to be less compact that unenciphered ones simply because most of the numbers used require more binary digits than the corresponding ASCII character requires (though ASCII is not highly efficient).
I realize this stuff is old hat. But suppose you have a legitimate need to communicate privately and don't trust commercial encryption programs because of the possibility of a backdoor key given to the feds? Now you have something simple that should work quite well.


Post a Comment

Subscribe to Post Comments [Atom]

Links to this post:

Create a Link

<< Home