## When worse is better

Let me begin this post with a puzzle.

Alex and Bob worked as financial advisers for the same company. They drew equal salaries from the company. They behaved well at the office. Both worked on similar assignments. Each assignment required a yes-no decision. The company used the decisions made by them to make profits.

After the recession hit the company very badly, they have to fire one of them. Both Alex and Bob have worked on almost the same number of assignments in the last ten years. Alex has been consistently taking about 80% decisions correctly every year. Bob, on the other hand, has been taking only about 5% correct decisions every year.

The company has decided to keep Bob and fire Alex. Why?

If you want to solve this puzzle yourself, you might want to stop here and think for a while. There are spoilers ahead.

Before giving away the solution to this puzzle, let me discuss something else.

Shannon's noisy channel coding theorem guarantees that in a binary symmetric channel with an error rate $$p$$, it is possible to transmit digital data with arbitrarily small probability of error up to a transmission rate of $1 + p \mathop{\log_2} p + (1 - p) \mathop{\log_2} (1 - p)$

We'll call this the optimal transmission rate. The error rate $$p$$ is the probability that the channel will corrupt a bit from 1 to 0 or vice versa. The transmission rate is the ratio of the number of data bits to the total number of bits used to represent them. If we employ error-correcting codes, the transmission rate would be less than 1 because we'll need extra bits for error correction. As a result, the number of bits used to represent them would be more than the number of data bits.

For $$p = 0$$, the transmission rate is maximum. That's obvious. Loosely speaking, if the channel is error-free, we do not need extra bits for error correction. The transmission rate is poorest when $$p = \frac{1}{2}$$. In this case, each bit received is completely random since the channel is just as likely to send 0 or 1 irrespective of the original bit sent by the transmitter. However, it might be surprising that as the error rate increases from $$0.5$$ to $$1$$, the optimal transmission rate increases. When $$p = 1$$, the optimal transmission rate is maximum again. It is easy to undersand this intuitively. At $$p = 1$$, the channel is guaranteed to corrupt each bit. The receiver can correct the error by inverting each bit in the received message, so we do not need extra bits for error correction in this case as well.

Now, let me get back to the puzzle I mentioned while beginning this post.

The company decided to keep Bob because he was more useful to the company. He helped the company to take 95% correct decisions. They simply did the opposite of what Bob recommended and made huge profits in the last ten years. Alex on the other hand helped them take only 80% decisions correctly, so he has to go. It's unfair but it's more profitable.

## From the Tower of Hanoi to counting bits

A few weeks ago, I watched Rise of the Planet of the Apes in a movie theatre. The movie showed a genetically engineered chimpanzee trying to solve a puzzle involving 4 discs, initially stacked in decreasing size on one of three pegs. The chimpanzee was supposed to transer the entire stack to one of the other pegs, moving only one disc at a time and never moving a larger one onto a smaller one.

The problem was called the Lucas' problem in the movie. I couldn't help myself from remarking to my friend sitting next to me that the problem is actually known as the Tower of Hanoi and the minimum number of moves required to solve the problem is 2n − 1 if there are n discs involved. The math lectures were not very welcome by her in the middle of the movie.

During the intermission, she was curious to know why it was called the Lucas' problem instead of the Tower of Hanoi in the movie. I think it's probably because the puzzle was invented by the French mathematician Édouard Lucas in 1883. There is another version of this problem known as the Tower of Brahma that involves 64 discs made of pure gold and three diamond needles. According to a legend, a group of Brahmin priests are working at the problem and the world will end when the last move of the puzzle is completed. Now, even if they make one move every second, it'll take 18,446,744,073,709,551,615 seconds to complete the puzzle. That's about 584.54 billion years.

Since, the puzzle in the movie involved only 4 discs, it can be solved in 24 − 1 = 15 moves as illustrated in an animated solution by André Karwath that I found in the Wikipedia article for the puzzle.

I'll not discuss the solution of this puzzle in this blog post. There are plenty of articles on the web including the Wikpedia article that describes why it takes a minimum of 2n − 1 moves to complete the puzzle. In this post, I'll talk about an interesting result I discovered while playing with this puzzle one April afternoon this year.

An interesting thing to note is that if we use Tn to denote the minimum number of moves required to solve the Tower of Hanoi problem involving n discs, then Tn when expressed in binary is the largest possible n-bit integer. For example, T4 = 1510 = 11112. After all, the largest n-bit integer is 2n − 1. Now, let me describe the result I came across while playing with it by asking a question.

This problem deals with arbitrary-precision integers. Some definitions:

1. A positive integer N is called an n-bit integer if and only if the minimum number of bits required to express the integer is n, or in other words, ⌊log2(N)⌋ + 1 = n.
2. Tn represents the largest possible n-bit integer.

Some assumptions:

1. Let the time complexity of adding or subtracting two integers be O(1).
2. Let the time complexity of counting the number of 1-bits in an n-bit integer be O(n).

A positive integer n is given. What is the most efficient way, in terms of time complexity, to compute the number of 1-bits in T1 + T2 + … + Tn?

The naive approach would involve adding all the n integers and counting the number of 1-bits in the sum. The time required to add the n integers is O(n). The sum would require (n + 1) bits to be represented. Hence, it would take O(n) time to count the number of bits in the sum. Since the sum is (n + 1) bits long, the naive approach requires O(n) memory as well. If n is as large as 264, over 2 exbibytes of memory would be required to store the sum.

There is a more efficient solution.

T1 + … + Tn
= 2(n + 1) − 2 − n
= [2(n + 1) − 1] − (n + 1)

Now, 2(n + 1) − 1 has (n + 1) 1-bits. Subtracting (n + 1) from it is equivalent to removing one 1-bit in 2(n + 1) − 1 for each 1-bit in (n + 1). So,

[number of 1-bits in T1 + … + Tn]  =  [(n + 1)]  −  [number of 1-bits in (n + 1)].

So, the computation involves counting the number of 1-bits in n + 1 which takes O(log n) time and subtracting it from n + 1 which takes O(1) time. The problem is solved in O(log n) time. It occupies O(log n) memory. What required over 2 exbibytes of memory with the naive approach requires only a little over 8 bytes of memory with this approach.

## Langford pairing

I came across this problem more than 2 years ago when Pranave, a colleague at RSA, posted this in the internal mathematics forum.

Given a list of first n natural numbers with each natural number occurring exactly twice in the list, we need to arrange the list in a manner such that there are exactly k numbers between two k's for each k in the sequence.

For a small list, say, (1, 1, 2, 2, 3, 3, 4, 4), the answer can be obtained by trial and error: (4, 1, 3, 1, 2, 4, 3, 2).

But what if the list is very long? There is another important question: Is there a solution for every possible such list? The answer turns out to be: No. One can easily see that there is no valid arrangements for (1, 1) or (1, 1, 2, 2) that satisfy the condition specified in the problem statement.

So, we have two things to solve now.

1. For what values of n is the problem solvable?
2. If the problem is solvable for some value of n, how do we solve it?

In this blog post, I'll show how I solved it. I took some help from the computer by writing a Python program to see if I could notice any patterns. Here is the output.

n =  1 :          0 solutions (     0 s)
n =  2 :          0 solutions (     0 s)
n =  3 :          2 solutions (     0 s)
n =  4 :          2 solutions (     0 s)
n =  5 :          0 solutions (     0 s)
n =  6 :          0 solutions (     0 s)
n =  7 :         52 solutions (     0 s)
n =  8 :        300 solutions (     0 s)
n =  9 :          0 solutions (     1 s)
n = 10 :          0 solutions (     6 s)
n = 11 :      35584 solutions (    45 s)
n = 12 :     216288 solutions (   390 s)


From the output, I could form a conjecture:

There exists a permutation of (1, 1, 2, 2, …, n, n) such that for each k in the sequence (1 ≤ k ≤ n) there are k numbers between the two k's if and only if n ≡ 3 (mod 4) or n ≡ 0 (mod 4).

Let me show how I proved this conjecture.

### Proof of the necessity of the condition n ≡ 3 (mod 4) or n ≡ 0 (mod 4)

Let (1, 1, 2, 2, …, n, n) be a sequence which can be rearranged such that there are k numbers between the two k's for each k in the sequence. Let us number the position of each number in this permutation as 1, 2, …, 2n.

Partition the numbers in this partition into two halves:

• Partition 1: All numbers in odd numbered positions.
• Partition 2: All numbers in even numbered positions.

The length of each partition is n.

Note that if k (1 ≤ k ≤ n) is even, one of the two k's is located on an odd numbered position and the other one on even numbered position.

If k is odd the two k's are both located on odd numbered positions or both located on even numbered positions. The two k's will never be split in two different partitions. It will be together in one partition.

From the above discussion it is obvious that the number of even numbers in each partition is equal. Let the number of even numbers in each partition be m. Let us assume there are p pairs of odd numbers in partition 1 and q pairs of odd numbers in partition 2. Thus, there are 2p odd numbers in partition 1 and 2q odd numbers in partition 2.

Since, the length of each partition is n, n = m + 2p = m + 2q. Hence, p = q. This implies that there must be p + q = 2p pairs of odd numbers in the two partitions, i.e. the number of positive odd numbers less than or equal to n is 2p.

In other words, the number of positive odd numbers less than or equal to n must be even. … (1)

We know that, for a positive integer n, the number of positive odd numbers less than or equal to n is ⌈n/2⌉. Let q be the quotient and r be the remainder obtained after dividing n by 4. We have four possible cases.

1. If n = 4q + 1, number of positive odd numbers less than or equal to n is, ⌈(4q + 1)/2⌉ = 2q + 1.
2. If n = 4q + 2, number of positive odd numbers less than or equal to n is, ⌈(4q + 2)/2⌉ = 2q + 1.
3. If n = 4q + 3, number of positive odd numbers less than or equal to n is, ⌈(4q + 3)/2⌉ = 2q + 2.
4. If n = 4q + 4, number of positive odd numbers less than or equal to n is, ⌈(4q + 4)/2⌉ = 2q + 2.
We can see that the number of odd numbers less than or equal to n is even if and only if n ≡ 3 (mod 4) or n ≡ 0 (mod 4). … (2)

From (1) and (2), the necessity of the condition n ≡ 3 (mod 4) or n ≡ 0 (mod 4) is proven.

### Proof of the sufficiency of the condition n ≡ 3 (mod 4) or n ≡ 0 (mod 4):

If we can show that we can construct a permutation that satisfies the condition in the question for every positive integer n such that n ≡ 3 (mod 4) or n ≡ 0 (mod 4), then the proof would be complete. We have two cases to deal with:

1. n ≡ 3 (mod 4)
2. n ≡ 0 (mod 4)

When, n ≡ 3 (mod 4), let x = (n + 1) / 4. Partition the first n natural numbers in 4 sequences, p1, p2, p3 and p4 such that each sequence contains x − 1 numbers and 3 single numbers a, b and c according to the following rules. This is possible since, 4(x − 1) + 3 = 4x − 1 = n + 1 − 1 = n. The partitioning rules are as follows.

1. a = 2x − 1
2. b = 4x − 2
3. c = 4x − 1
4. p1 = Odd numbers between 1 and a in ascending order. (1, …, 2x − 3).
5. p2 = Even numbers between 1 and a in ascending order. (2, …, 2x − 2).
6. p3 = Odd numbers between a and b in ascending order. (2x + 1, …, 4x − 3).
7. p4 = Even numbers between a and b in ascending order. (2x, …, 4x − 4).

It can be verified that the above partitioning rules imply that the length of each partition pi is x − 1 where 1 ≤ i ≤ 4. Also, all natural numbers less than or equal to n are present in the 4 partitions and 3 single numbers above.

Let us consider an array A of size 2n. The first index of array is 1. Now, a valid permutation can be formed with the following placement rules.

1. A[1] to A[x − 1] contains reversed p4.
2. A[x] to A[2x − 2] contains reversed p1.
3. A[2x − 1] contains b.
4. A[2x] to A[3x − 2] contains p1.
5. A[3x − 1] contains c.
6. A[3x] to A[4x − 2] contains p4.
7. A[4x − 1] contains a.
8. A[4x] to A[5x − 2] contains reversed p3.
9. A[5x − 1] to A[6x − 3] contains reversed p2.
10. A[6x − 2] contains b.
11. A[6x − 1] contains a.
12. A[6x] to A[7x − 2] contains p2.
13. A[7x − 1] contains c.
14. A[7x] to A[8x − 2] contains p3.

Note that 8x − 2 = 2n. So, the last rule can be rewritten as "A[7x] to A[2n] contains p3".

When n ≡ 0 (mod 4), let x = n / 4. Partition the first natural numbers in 4 sequences, p1, p2, p3 and p4 such that each sequence contains x − 1 numbers and 4 single numbers a, b, c and d. The partition rules contain the 7 rules discussed above and one more rule. The additional rule is: d = 4x.

The placement are same for all cells in the array with one exception and two additional rules for the two new cells.

1. A[4x − 1] = d. (Exception. Modifies rule 7.)
2. A[8x − 1] = a.
3. A[8x] = d.

With these rules it can be shown that two k's have k elements in between. In other words, the difference in the array indices of the cells where two k's are present is k + 1.

a = 2x − 1. The two a's are located at A[4x − 1] and A[6x − 1]. The difference in array indices is 2x. Hence, two a's have 2x − 1 = a numbers in between. Similarly, this can be verified for b, c and d.

The two 1's are located at A[2x − 2] and A[2x]. The difference in array indices is 2. Hence, the two 1's have 1 number in between. From this it can be shown that for each k in p1, there are k numbers in between. Take an arbitrary element m in p1. From the construction rules, we see that, the two m's are located in A[2x − 2 − (m − 1)/2] and A[2x + (m − 1)/2]. The difference in the indices is: 2x + (m − 1)/2 − 2x + 2 + (m − 1)/2 = m + 1. Similarly, it can be shown that for any k in any parition pi (1 ≤ i ≤ 4), there are k numbers between the two k's.

Here is an example.

n = 11
x = (11 + 1) / 4 = 3
a = 5
b = 10
c = 11
p1 = (1, 3)
p2 = (2, 4)
p3 = (7, 9)
p4 = (6, 8)

(a)
+------------+
(c)   |            |
+-----|------------|------+
(b)    |     |            |      |
+------|-----|----------+ |      |
|      |     |          | |      |
8 6 3 1 10 1 3 11 6 8 5 9 7 4 2 10 5 2 4 11 7 9
=== ===    ===    ===   === ===      ===    ===
|   |      |      |     |   |        |      |
|   +------+ (p1) |     |   +--------+ (p2) |
+-----------------+     +-------------------+
(p4)                      (p3)


### Notes

1. There is an entry for this problem in OEIS: http://oeis.org/A014552.
2. The internal symmetries in the permutation are not very obvious from the proof but the permutation simply requires us to place a, b, c, p1, p2 and p3 in this order: (p4' p1' b p1 c p4 a p3' p2' b a p2 c p3) when n ≡ 3 (mod 4), where pi' is reversed pi (1 ≤ i ≤ 4).
3. For n = 3, we get x = 1. So, length of the partitions p1, p2, p3 and p4 is x − 1 = 0 each. So, rules that involve the partitions do not apply in case of n = 3.
4. In this proof, I have shown one set of rules to construct a valid permutation for a given n such that n ≡ 3 (mod 4) or n ≡ 0 (mod 4). But actually, I found 2 such sets of rules. The variables a, b, c and d remain same in the other rule. Only the rules for  p1, p2, p3, p4 and placement rules differ.
5. The above two points imply that, except for n = 3, using construction rules, we can construct at least two different permutations for a given value of n such that n ≡ 3 (mod 4) or n ≡ 0 (mod 4).
6. It seemed like for values of n > 20, there might be over a quadrillion possible solutions. It wasn't feasible to verify this on my laptop with the program I wrote. When I discussed this with a person who was interested in this problem, he asked, "What? Don't you have a quad(rillion)-core processor?"
7. I worked through a Saturday night till 3:30 AM of the following Sunday morning and I could prove only the necessity of the condition n ≡ 3 (mod 4) or n ≡ 0 (mod 4). When a friend asked me about the progress, I could manage to say this, "Half of it. Need to prove a 'necessary and sufficient' condition. Proved the necessary part only but this much is sufficient for me today. The sufficient part is necessary for the proof to be complete but i think, now it has become necessary to do the sufficient part tomorrow as it is too late already."
8. The above point reminds me of a famous quote by George Pólya, "Mathematics consists in proving the most obvious thing in the least obvious way".
9. The above point reminds me of http://humor.beecy.net/misc/principia/.

## From out shuffles to multiplicative order

A perfect riffle shuffle or bridge shuffle is when we split a deck of cards into two halves (one in each hand), and then alternately drop one card from each half till all have fallen. For the sake of this problem, let us assume that the top card will always be the top card and the bottom card will always be the bottom card after shuffling. However the rest of the cards will be deranged.

How many such riffle shuffles do we need to perform on a deck of 52 cards, in order to obtain the exact same starting order?

Such a style of shuffling is also known as out shuffle.

Let us assume that there are 2n cards where n is a positive integer. Let us number the positions of the cards as 0, 1, … 2n − 1 where position 0 is the top of the deck and position 2n − 1 is the bottom of the deck. Here is an example with 10 cards called c0, c1, … c9 with positions indicated in the first column.

0. c0                     c0
1. c1                     c5
2. c2                     c1
3. c3                     c6
4. c4 ———> Shuffling ———> c2
5. c5                     c7
6. c6                     c3
7. c7                     c8
8. c8                     c4
9. c9                     c9

Let us consider a card at position k where 0 ≤ k < 2n. It is easy to see from the description of the problem that a card at position k moves to position 2k if the card belongs to the top half of the deck, otherwise it moves to 2(k − n) + 1.

2(k − n) + 1 ≡ 2k − (2n − 1) ≡ 2k (mod 2n − 1)

So, we can say that a card at position k moves to position k' ≡ 2k mod (2n − 1) after an out shuffle where 0 ≤ k' < 2n.

So, after m shuffles a card at position k moves to position k ≡ 2m · k (mod 2n − 1) where 0 ≤ k < 2n. So, we are looking for the smallest positive integer m such that for all integers k, where 0 ≤ k < 2n,

2m · k ≡ k (mod 2n − 1)

Removing k from both sides we get

2m ≡ 1 (mod   (2n−1)gcd(k, 2n−1))

For simplicity, let us represent (2n−1)gcd(k, 2n−1)) as Fnk. So, a card at position k returns to the same position after m out shuffles where m is the smallest positive integer that satisfies the congruence relation

2m ≡ 1 (mod Fnk)

Since 2n − 1 is odd, Fnk must be odd. So, gcd(2, Fnk) = 1. The smallest positive integer m that satisfies the congruence relation above is known as the multiplicative order of 2 modulo Fnk and is denoted as ordFnk(2).

If n = 1, Fnk = 1 and 2m ≡ 1 (mod Fnk) is true for all positive integers m. So, with two cards, an out shuffle doesn't change the positions of the cards.

If n > 2, let us consider a k which is coprime to 2n − 1 and 1 < k < 2n − 1. There exists at least one such k because φ(x) ≥ 2 for x > 2 where φ(x) is the Euler's totient of a positive integer x which is defined as the number of positive integers less than or equal to x that are coprime to x. Now, gcd(k, 2n − 1) = 1. So, now the congruence relation

2m ≡ 1 (mod   (2n−1)gcd(k, 2n−1))

reduces to

2m ≡ 1 (mod 2n − 1)

So, far we understand that after ord2n−1(2) out shuffles, any card at a position number coprime to 2n − 1 would be back to its original position. What about cards at other positions?

For a card at position k such that k is not coprime to 2n − 1, gcd(k, 2n − 1) ≠ 1. In this case the number of out shuffles required to bring the card at its original position is ordFnk2.

2ord2n−1(2) ≡ 1 (mod 2n − 1) ⇒ 2ord2n−1(2) ≡ 1 (mod Fnk) … (since Fnk | 2n − 1)

As a consequence of Lagrange's theorem, ordFnk(2) | ord2n−1(2).

In other words, ord2n−1(2) = c · ordFnk(2) for some positive integer c.

So, performing ord2n−1(2) out shuffles is same as repeating ordFnk(2) out shuffles c times. Every ordFnk(2) shuffles brings back the card at position k to the same position. So, repeating it c times would leave it at the same position as well.

Hence, we conclude that ord2n−1(2) out shuffles would restore a deck of 2n cards to its original configuration, where n is a positive integer.

For a deck of 52 cards, n = 26. So, ord51(2) out shuffles would restore the deck to its original configuration.

Let us compute ord51(2).

From Carmichael's theorem, 2λ(51) ≡ 1 (mod 51), where λ(n) is the smallest positive integer m such that am ≡ 1 (mod n) for every positive integer a coprime to and smaller than n.

λ(51) = lcm(λ(3), λ(17)) = lcm(2, 16) = 16.

Again as a consequence of Lagrange's theorem, ord51(2) | λ(51). In other words, ord51(2) | 16.

Let us check all the factors of 16 and see which one is the multiplicative order of 2 modulo 51.

22 ≡ 4 (mod 51) 24 ≡ 16 (mod 51) 28 ≡ 256 ≡ 1 (mod 51).

∴ ord51(2) = 8.

## A hidden form and Fermat's Last Theorem

Here is a problem I made one night for my friends who love to solve mathematical puzzles.

Find all integer solutions for the equation: y2 + 3 = x318.

You might want to stop here and think for a while if you want to solve this yourself. There are spoilers ahead.

In this problem, I have hidden a Diophantine equation of the form an + bn = cn. The first clue to this hidden form might be the equation with its terms rearranged.

x3 = 18y2 + 54

The right hand side is 2(9y2 + 33). This is of the form 2(3a2b + b3) which is equal to (a + b)3 − (a − b)3. Now, the hidden form must be clear. So, let me write down the solution to this problem neatly.

y2 + 3 = x318
⇒ x3 = 18y2 + 54
⇒ x3 = (y + 3)3 − (y − 3)3
⇒ x3 + (y − 3)3 = (y + 3)3

From Fermat's last theorem, we know that an equation of the form, an + bn = cn has got a trivial solution a = b = 0 and no other solutions for positive integers a, b and c when n > 2. Now, let us examine x3 + (y − 3)3 = (y + 3)3 with the help of Fermat's Last Theorem. A trivial solution in this case is ruled out because there is no value of y for which y − 3 = y + 3 = 0.

However, if we consider y − 3 = 0, or y = 3, we get an equation that is no longer of the form required by Fermat's Last Theorem.

x3 = 63
⇒ x = 6

If we consider y + 3 = 0, or y = −3, again we get

x3 + (−6)3 = 0
⇒ x = 6.

Hence, the integral solutions to the equation are:

x = 6
y = ±3