Let me share a very interesting C programming puzzle that I learnt from one of my colleagues.

Consider the following code-snippet written in the C programming language.

int i;
for (i = 0; i < 6; i--)
    printf(".");
This code invokes undefined behaviour. The value in variable i decrements to INT_MIN after |INT_MIN| + 1 iterations. In the next iteration, there is a negative overflow which is undefined for signed integers in C. On many implementations though, INT_MIN − 1 would result in INT_MAX which is not less than 6 and thus the loop would terminate. With such implementations, this code would print |INT_MIN| + 1 dots. With 32-bit integers, that would be 2147483649 dots. However, it could also be optimized into an endless loop by the compiler if it wants to exploit the fact that a negative overflow for signed integers is undefined. gcc in fact does optimize that to an endless loop if you compile it with the -O3 option.

Now, here is the puzzle.

Add or modify exactly one operator in the following code such that it prints exactly 6 dots.

int i;
for (i = 0; i < 6; i--)
    printf(".");

An obvious solution is:

int i;
for (i = 0; i < 6; i++)
    printf(".");

There are at least 3 other solutions. I found one of them very interesting. I'll discuss the interesting solution tomorrow. Till then, feel free to try and post your solutions below as comments.

Update: The discussion on the solution is now available at: Solutions to 'Loopy C Puzzle'.

Rise of the Planet of the Apes (Poster) A few weeks ago, I watched Rise of the Planet of the Apes in a movie theatre. The movie showed a genetically engineered chimpanzee trying to solve a puzzle involving 4 discs, initially stacked in decreasing size on one of three pegs. The chimpanzee was supposed to transer the entire stack to one of the other pegs, moving only one disc at a time and never moving a larger one onto a smaller one.

The problem was called the Lucas' problem in the movie. I couldn't help myself from remarking to my friend sitting next to me that the problem is actually known as the Tower of Hanoi and the minimum number of moves required to solve the problem is 2n − 1 if there are n discs involved. The math lectures were not very welcome by her in the middle of the movie.

During the intermission, she was curious to know why it was called the Lucas' problem instead of the Tower of Hanoi in the movie. I think it's probably because the puzzle was invented by the French mathematician Édouard Lucas in 1883. There is another version of this problem known as the Tower of Brahma that involves 64 discs made of pure gold and three diamond needles. According to a legend, a group of Brahmin priests are working at the problem and the world will end when the last move of the puzzle is completed. Now, even if they make one move every second, it'll take 18,446,744,073,709,551,615 seconds to complete the puzzle. That's about 584.54 billion years.

Animated solution Since, the puzzle in the movie involved only 4 discs, it can be solved in 24 − 1 = 15 moves as illustrated in an animated solution by André Karwath that I found in the Wikipedia article for the puzzle.

I'll not discuss the solution of this puzzle in this blog post. There are plenty of articles on the web including the Wikpedia article that describes why it takes a minimum of 2n − 1 moves to complete the puzzle. In this post, I'll talk about an interesting result I discovered while playing with this puzzle one April afternoon this year.

An interesting thing to note is that if we use Tn to denote the minimum number of moves required to solve the Tower of Hanoi problem involving n discs, then Tn when expressed in binary is the largest possible n-bit integer. For example, T4 = 1510 = 11112. After all, the largest n-bit integer is 2n − 1. Now, let me describe the result I came across while playing with it by asking a question.

This problem deals with arbitrary-precision integers. Some definitions:

  1. A positive integer N is called an n-bit integer if and only if the minimum number of bits required to express the integer is n, or in other words, ⌊log2(N)⌋ + 1 = n.
  2. Tn represents the largest possible n-bit integer.

Some assumptions:

  1. Let the time complexity of adding or subtracting two integers be O(1).
  2. Let the time complexity of counting the number of 1-bits in an n-bit integer be O(n).

A positive integer n is given. What is the most efficient way, in terms of time complexity, to compute the number of 1-bits in T1 + T2 + … + Tn?

The naive approach would involve adding all the n integers and counting the number of 1-bits in the sum. The time required to add the n integers is O(n). The sum would require (n + 1) bits to be represented. Hence, it would take O(n) time to count the number of bits in the sum. Since the sum is (n + 1) bits long, the naive approach requires O(n) memory as well. If n is as large as 264, over 2 exbibytes of memory would be required to store the sum.

There is a more efficient solution.

T1 + … + Tn
 = 2(n + 1) − 2 − n
 = [2(n + 1) − 1] − (n + 1)

Now, 2(n + 1) − 1 has (n + 1) 1-bits. Subtracting (n + 1) from it is equivalent to removing one 1-bit in 2(n + 1) − 1 for each 1-bit in (n + 1). So,

[number of 1-bits in T1 + … + Tn]  =  [(n + 1)]  −  [number of 1-bits in (n + 1)].

So, the computation involves counting the number of 1-bits in n + 1 which takes O(log n) time and subtracting it from n + 1 which takes O(1) time. The problem is solved in O(log n) time. It occupies O(log n) memory. What required over 2 exbibytes of memory with the naive approach requires only a little over 8 bytes of memory with this approach.

I came across this problem more than 2 years ago when Pranave, a colleague at RSA, posted this in the internal mathematics forum.

Given a list of first n natural numbers with each natural number occurring exactly twice in the list, we need to arrange the list in a manner such that there are exactly k numbers between two k's for each k in the sequence.

For a small list, say, (1, 1, 2, 2, 3, 3, 4, 4), the answer can be obtained by trial and error: (4, 1, 3, 1, 2, 4, 3, 2).

But what if the list is very long? There is another important question: Is there a solution for every possible such list? The answer turns out to be: No. One can easily see that there is no valid arrangements for (1, 1) or (1, 1, 2, 2) that satisfy the condition specified in the problem statement.

So, we have two things to solve now.

  1. For what values of n is the problem solvable?
  2. If the problem is solvable for some value of n, how do we solve it?

In this blog post, I'll show how I solved it. I took some help from the computer by writing a Python program to see if I could notice any patterns. Here is the output.

n =  1 :          0 solutions (     0 s)
n =  2 :          0 solutions (     0 s)
n =  3 :          2 solutions (     0 s)
n =  4 :          2 solutions (     0 s)
n =  5 :          0 solutions (     0 s)
n =  6 :          0 solutions (     0 s)
n =  7 :         52 solutions (     0 s)
n =  8 :        300 solutions (     0 s)
n =  9 :          0 solutions (     1 s)
n = 10 :          0 solutions (     6 s)
n = 11 :      35584 solutions (    45 s)
n = 12 :     216288 solutions (   390 s)

From the output, I could form a conjecture:

There exists a permutation of (1, 1, 2, 2, …, n, n) such that for each k in the sequence (1 ≤ k ≤ n) there are k numbers between the two k's if and only if n ≡ 3 (mod 4) or n ≡ 0 (mod 4).

Let me show how I proved this conjecture.

Proof of the necessity of the condition n ≡ 3 (mod 4) or n ≡ 0 (mod 4)

Let (1, 1, 2, 2, …, n, n) be a sequence which can be rearranged such that there are k numbers between the two k's for each k in the sequence. Let us number the position of each number in this permutation as 1, 2, …, 2n.

Partition the numbers in this partition into two halves:

  • Partition 1: All numbers in odd numbered positions.
  • Partition 2: All numbers in even numbered positions.

The length of each partition is n.

Note that if k (1 ≤ k ≤ n) is even, one of the two k's is located on an odd numbered position and the other one on even numbered position.

If k is odd the two k's are both located on odd numbered positions or both located on even numbered positions. The two k's will never be split in two different partitions. It will be together in one partition.

From the above discussion it is obvious that the number of even numbers in each partition is equal. Let the number of even numbers in each partition be m. Let us assume there are p pairs of odd numbers in partition 1 and q pairs of odd numbers in partition 2. Thus, there are 2p odd numbers in partition 1 and 2q odd numbers in partition 2.

Since, the length of each partition is n, n = m + 2p = m + 2q. Hence, p = q. This implies that there must be p + q = 2p pairs of odd numbers in the two partitions, i.e. the number of positive odd numbers less than or equal to n is 2p.

In other words, the number of positive odd numbers less than or equal to n must be even. … (1)

We know that, for a positive integer n, the number of positive odd numbers less than or equal to n is ⌈n/2⌉. Let q be the quotient and r be the remainder obtained after dividing n by 4. We have four possible cases.

  1. If n = 4q + 1, number of positive odd numbers less than or equal to n is, ⌈(4q + 1)/2⌉ = 2q + 1.
  2. If n = 4q + 2, number of positive odd numbers less than or equal to n is, ⌈(4q + 2)/2⌉ = 2q + 1.
  3. If n = 4q + 3, number of positive odd numbers less than or equal to n is, ⌈(4q + 3)/2⌉ = 2q + 2.
  4. If n = 4q + 4, number of positive odd numbers less than or equal to n is, ⌈(4q + 4)/2⌉ = 2q + 2.
We can see that the number of odd numbers less than or equal to n is even if and only if n ≡ 3 (mod 4) or n ≡ 0 (mod 4). … (2)

From (1) and (2), the necessity of the condition n ≡ 3 (mod 4) or n ≡ 0 (mod 4) is proven.

Proof of the sufficiency of the condition n ≡ 3 (mod 4) or n ≡ 0 (mod 4):

If we can show that we can construct a permutation that satisfies the condition in the question for every positive integer n such that n ≡ 3 (mod 4) or n ≡ 0 (mod 4), then the proof would be complete. We have two cases to deal with:

  1. n ≡ 3 (mod 4)
  2. n ≡ 0 (mod 4)

When, n ≡ 3 (mod 4), let x = (n + 1) / 4. Partition the first n natural numbers in 4 sequences, p1, p2, p3 and p4 such that each sequence contains x − 1 numbers and 3 single numbers a, b and c according to the following rules. This is possible since, 4(x − 1) + 3 = 4x − 1 = n + 1 − 1 = n. The partitioning rules are as follows.

  1. a = 2x − 1
  2. b = 4x − 2
  3. c = 4x − 1
  4. p1 = Odd numbers between 1 and a in ascending order. (1, …, 2x − 3).
  5. p2 = Even numbers between 1 and a in ascending order. (2, …, 2x − 2).
  6. p3 = Odd numbers between a and b in ascending order. (2x + 1, …, 4x − 3).
  7. p4 = Even numbers between a and b in ascending order. (2x, …, 4x − 4).

It can be verified that the above partitioning rules imply that the length of each partition pi is x − 1 where 1 ≤ i ≤ 4. Also, all natural numbers less than or equal to n are present in the 4 partitions and 3 single numbers above.

Let us consider an array A of size 2n. The first index of array is 1. Now, a valid permutation can be formed with the following placement rules.

  1. A[1] to A[x − 1] contains reversed p4.
  2. A[x] to A[2x − 2] contains reversed p1.
  3. A[2x − 1] contains b.
  4. A[2x] to A[3x − 2] contains p1.
  5. A[3x − 1] contains c.
  6. A[3x] to A[4x − 2] contains p4.
  7. A[4x − 1] contains a.
  8. A[4x] to A[5x − 2] contains reversed p3.
  9. A[5x − 1] to A[6x − 3] contains reversed p2.
  10. A[6x − 2] contains b.
  11. A[6x − 1] contains a.
  12. A[6x] to A[7x − 2] contains p2.
  13. A[7x − 1] contains c.
  14. A[7x] to A[8x − 2] contains p3.

Note that 8x − 2 = 2n. So, the last rule can be rewritten as "A[7x] to A[2n] contains p3".

When n ≡ 0 (mod 4), let x = n / 4. Partition the first natural numbers in 4 sequences, p1, p2, p3 and p4 such that each sequence contains x − 1 numbers and 4 single numbers a, b, c and d. The partition rules contain the 7 rules discussed above and one more rule. The additional rule is: d = 4x.

The placement are same for all cells in the array with one exception and two additional rules for the two new cells.

  1. A[4x − 1] = d. (Exception. Modifies rule 7.)
  2. A[8x − 1] = a.
  3. A[8x] = d.

With these rules it can be shown that two k's have k elements in between. In other words, the difference in the array indices of the cells where two k's are present is k + 1.

a = 2x − 1. The two a's are located at A[4x − 1] and A[6x − 1]. The difference in array indices is 2x. Hence, two a's have 2x − 1 = a numbers in between. Similarly, this can be verified for b, c and d.

The two 1's are located at A[2x − 2] and A[2x]. The difference in array indices is 2. Hence, the two 1's have 1 number in between. From this it can be shown that for each k in p1, there are k numbers in between. Take an arbitrary element m in p1. From the construction rules, we see that, the two m's are located in A[2x − 2 − (m − 1)/2] and A[2x + (m − 1)/2]. The difference in the indices is: 2x + (m − 1)/2 − 2x + 2 + (m − 1)/2 = m + 1. Similarly, it can be shown that for any k in any parition pi (1 ≤ i ≤ 4), there are k numbers between the two k's.

Here is an example.

n = 11
x = (11 + 1) / 4 = 3
a = 5
b = 10
c = 11
p1 = (1, 3)
p2 = (2, 4)
p3 = (7, 9)
p4 = (6, 8)

(a) +------------+ (c) | | +-----|------------|------+ (b) | | | | +------|-----|----------+ | | | | | | | | 8 6 3 1 10 1 3 11 6 8 5 9 7 4 2 10 5 2 4 11 7 9 === === === === === === === === | | | | | | | | | +------+ (p1) | | +--------+ (p2) | +-----------------+ +-------------------+ (p4) (p3)

Notes

  1. There is an entry for this problem in OEIS: http://oeis.org/A014552.
  2. The internal symmetries in the permutation are not very obvious from the proof but the permutation simply requires us to place a, b, c, p1, p2 and p3 in this order: (p4' p1' b p1 c p4 a p3' p2' b a p2 c p3) when n ≡ 3 (mod 4), where pi' is reversed pi (1 ≤ i ≤ 4).
  3. For n = 3, we get x = 1. So, length of the partitions p1, p2, p3 and p4 is x − 1 = 0 each. So, rules that involve the partitions do not apply in case of n = 3.
  4. In this proof, I have shown one set of rules to construct a valid permutation for a given n such that n ≡ 3 (mod 4) or n ≡ 0 (mod 4). But actually, I found 2 such sets of rules. The variables a, b, c and d remain same in the other rule. Only the rules for  p1, p2, p3, p4 and placement rules differ.
  5. The above two points imply that, except for n = 3, using construction rules, we can construct at least two different permutations for a given value of n such that n ≡ 3 (mod 4) or n ≡ 0 (mod 4).
  6. It seemed like for values of n > 20, there might be over a quadrillion possible solutions. It wasn't feasible to verify this on my laptop with the program I wrote. When I discussed this with a person who was interested in this problem, he asked, "What? Don't you have a quad(rillion)-core processor?"
  7. I worked through a Saturday night till 3:30 AM of the following Sunday morning and I could prove only the necessity of the condition n ≡ 3 (mod 4) or n ≡ 0 (mod 4). When a friend asked me about the progress, I could manage to say this, "Half of it. Need to prove a 'necessary and sufficient' condition. Proved the necessary part only but this much is sufficient for me today. The sufficient part is necessary for the proof to be complete but i think, now it has become necessary to do the sufficient part tomorrow as it is too late already."
  8. The above point reminds me of a famous quote by George Pólya, "Mathematics consists in proving the most obvious thing in the least obvious way".
  9. The above point reminds me of http://humor.beecy.net/misc/principia/.

I first came to know about Sunny Vaghela about one and a half years ago when I got an email from Sandip Dev of Sun Microsystems (now acquired by Oracle).

A couple of days ago, I found him again at attrition.org, a website that used to be the largest mirror of defaced websites.

Sunny Vaghela has found a place in attrition.org's charlatans watch list: Sunny Vaghela: Claims of Orkut Vulnerability Research. He is the third Indian to get into this list after Ankit Fadia and Sahil Khan.

Here is a C puzzle I like to ask my friends and colleagues. It is one of those silly puzzles that may bring a smile on your face if you aren't able to figure it out immediately.

#include <stdio.h>

int main()
{
    http://susam.in/
    printf("hello, world\n");
    return 0;
}

This code compiles and runs successfully even though the current draft of the C99 standard doesn't mention anywhere that URLs are valid syntactic elements in C. How does this code work then?

RSS