## Efficient bit permutation using delta swaps

A common operation in cryptographic block cipher algorithms is to shuffle the input bits in some specified way. While such bit permutations are easy to do in hardware, they tend to be difficult to implement efficiently in software. The Initial Permutation (IP) in the DES block cipher is a good case in point. It is defined as the following rearrangement of the 64 input bits:

A naive implementation would loop through each of the 64 input bits one by one and place them at the correct position in the output. Needless to say, this would be rather slow. A pre-computed lookup table for a 64-bit permutation would be too huge to even consider, while multiple piece-wise lookup tables would most likely not offer any performance benefit over a naive loop. However, there is a clever solution using only 30 bitwise operations (15 XORs, 10 shifts, 5 ANDs). The key idea is to repeatedly swap non-overlapping groups of bits with an operation called a delta swap:

In a nutshell, the delta swap operation can simultaneously swap multiple bit pairs where the two bits in each pair are separated by the same distance from each other. Let’s examine how this works. (Notation: $\oplus$ means XOR, $\ll$ means bitwise shift left, $\gg$ means bitwise shift right)

We start with the input block represented as a sequence of 64 bits:

$a=a_{1}a_{2}...a_{64}$

Suppose we want to swap the bit at position $i$ with the bit at position $j$, where $j > i$. The distance between the two bits is

$\delta=j-i$

Now notice that if we construct a temporary value $b=b_{1}b_{2}...b_{64}$ such that

$b_{n} = \left\{ \begin{array}{lr}a_{i} \oplus a_{j} & : n \in \{i,j\} \\ 0 & : n \notin \{i,j\} \end{array} \right.$

then $a \oplus b$ swaps the bits $a_{i}$ and $a_{j}$ as required, because of the following properties of XOR:

$a_{i} \oplus (a_{i} \oplus a_{j})=a_{j} \\ a_{j} \oplus (a_{i} \oplus a_{j})=a_{i}$

$a_{i} \oplus 0=a_{i}$

This is exactly how a delta swap works. The temporary value $b$ is constructed by shifting the input $a$ to the right $\delta$ bits, then XORing with the unshifted version of $a$. This places $a_{i} \oplus a_{j}$ in $b_{j}$. The other bits of $b$ are then zeroed out by ANDing with a mask that only has the bit at position $j$ turned on. Finally, the bit swap in $a$ is accomplished by XORing $a$ with an unshifted $b$ and then with $b$ shifted left $\delta$ bits. This is illustrated diagrammatically below.

In order to swap multiple bit pairs simultaneously, we need only turn on the appropriate bits in the AND mask. Specifically, for each 1 bit in the mask, the corresponding bit in the input will be swapped with the input bit that is $\delta$ positions to its left.

Now that we’re armed with the delta swap technique, let’s return to the original problem of implementing the DES Initial Permutation efficiently. Notice that if we arrange the 64-bit input block in the DES algorithm as an 8 x 8 bit matrix, then the Initial Permutation can be visualized as a flip of the matrix about its diagonal (the one that extends from bit 57 to bit 08), followed by the shuffling of its rows. We can accomplish this with a sequence of five delta swaps. The first three flip the matrix about its diagonal, and the last two shuffle the rows.

The DES Final Permutation (FP), being the inverse of the Initial Permutation, can be implemented by performing the same delta swaps in reverse order. The reason why the delta swap technique is so useful is that it works for any permutation of bits. In fact, it can be shown that for 64-bit permutations the required number of delta swaps will never exceed 11.

## Encrypted Notes Could Hold Key To Unsolved Murder

The FBI is asking amateur code breakers to help with an unsolved murder case. In 1999, the body of 41-year-old Ricky McCormick was found in a field near St. Louis, Missouri. The only possible clues to his death are two hand-written notes that were found stuffed in his pocket. The notes appear to be encrypted with a deceptively simple-looking cipher, but have so far resisted all attempts at cryptanalysis. The FBI’s code-breaking unit is hoping more eyes on the problem will help crack this puzzle.

A word of warning to would-be decipherers: Several famous codes and ciphers have tantalized cryptanalysts for decades or even centuries, and have never been solved (a comprehensive list is maintained by Elonka Dunin).

## Automated Solution of Simple Substitution Ciphers

About a year ago, I was working on the solutions to the cryptograms in the first chapter of the Mathematical Cryptography book, and the idea struck me that it would be nice to have a computer program that could automatically solve such puzzles. It turns out that this is easier said than done, especially if the spaces between words are not preserved in the ciphertext and the puzzle author took pains to manipulate the character frequencies of the underlying message.

Exercise 1.4 (c) from the book is a good example. I urge you to try to solve it by hand in order to get an appreciation for the nature of the problem. It is not easy! Then think about how you would teach a computer to perform this feat automatically.

GSZES GNUBE SZGUG SNKGX CSUUE QNZOQ EOVJN VXKNG XGAHS

AWSZZ BOVUE SIXCQ NQESX NGEUG AHZQA QHNSP CIPQA OIDLV

JXGAK CGJCG SASUB FVQAV CIAWN VWOVP SNSXV JGPCV NODIX

GJQAE VOOXC SXXCG OGOVA XGNVU BAVKX QZVQD LVJXQ EXCQO

VKCQG AMVAX VWXCG OOBOX VZCSO SPPSN VAXUB DVVAX QJQAJ

VSUXC SXXCV OVJCS NSJXV NOJQA MVBSZ VOOSH VSAWX QHGMV

GWVSX CSXXC VBSNV ZVNVN SAWQZ ORVXJ CVOQE JCGUW NVA

Over the past 25 years, many papers have been published on the subject in various journals, and many interesting approaches have been proposed. It may be a surprise for some to learn that even state-of-the-art algorithms are not capable of reliably solving very short cryptograms (under 200 characters in length), especially when word divisions are eliminated.

My own attempt, building on work done by Prof. Richard Spillman (who first proposed attacking the problem via genetic algorithms) and Sam Hasinoff (who first suggested using character n-gram models), is an open source program called Alkindus. It works quite well, and is easily capable of solving puzzles similar to the ones from Mathematical Cryptography in a fully automated manner.

The work-in-progress paper describing the theory and empirical results is in need of a large collection of cryptograms completely independent of the Project Gutenberg corpus of English-language text that I used to train the underlying n-gram model. To use cryptograms derived from the training data itself would be cheating! So, I am especially interested in obtaining large amounts of English text (varying from say, 50 to 2500 characters in length), which could be used to put the program through a proper, rigorous test. If you have any suggestions, please let me know.