## James Randi and a graphologist

A few weeks ago, I watched this video: James Randi and a Graphologist.

After 1 minute 50 seconds in the video, Randi says, "Now the expected result by chance would be to get 1 out of the 5 correct, by chance alone, and there's a less than 1% chance of your getting all 5 correct."

There were 5 ladies in the video and the graphologist was supposed to assign a profession to each lady from a given list of 5 professions. Now, there are 5! = 120 possible ways of assigning them. I tried verifying that the expected number of ladies for which the graphologist assigns the correct profession is exactly 1. I did this in a very convoluted manner using the concept of derangements. I'll describe this first and in the end, I'll provide a really simple way of verifying this using the concept of linearity of expectation.

Let us count the number of ways to assign them in which he gets exactly 1 correct and the remaining 4 incorrect. There are 5C1 ways of choosing the lady for which the assignment is correct and if we call the number of ways wrong profession is assigned to each of the remaining 4 ladies as d4, then the number of ways to assign the professions such that exactly 1 lady has been assigned the correct profession is: 5C1 ⋅ d4.

Similarly, the number of ways he can get exactly 2 correct is 5C2 ⋅ d3 where d3 is the number of ways wrong profession is assigned to each of the remaining 3 ladies. Let us denote the number of ways wrong profession can be assigned to k ladies from a list of k unique professions as dk, where 1 ≤ k ≤ 5. Now, we can say that in general, the number of ways he can get exactly k correct is 5Ck ⋅ d5-k. So, the probability p(k) that he can get exactly k correct is (5Ck ⋅ d5-k)5!.

The number dk is the possible number of derangements of k elements. A derangement of a given sequence is a permutation of the elements in the sequence in which none of the elements appear in its original position. It is obvious that d1 = 0 since there is only one permutation of a sequence containing a single element and this only permutation is not a derangement. d2 = 1 since the only possible derangement of (1, 2) is (2, 1). d3 = 2 since the only possible derangements of (1, 2, 3) is (2, 3, 1) and (3, 1, 2). To calculate d4 and d5, I'll use the recurrence relation, dn = (n - 1)(dn-1 + dn-2). The recurrence relation is easy to understand. I'm including an explanation from the Wikipedia article on derangement:

Suppose that there are n persons numbered 1, 2, …, n. Let there be n hats also numbered 1, 2, …, n. We have to find the number of ways in which no one gets the hat having same number as his/her number. Let us assume that first person takes the hat i. There are n − 1 ways for first person to choose the number i. Now there are 2 options -
• Person i takes the hat of 1. So Now the problem reduces to n − 2 persons and n − 2 hats.
• Person i does not take the hat 1. This situation can be simulated as renumbering hat 1 as i. So now case is equivalent to solving problem with n − 1 persons n − 1 hats.
So, d4 = 3(d3 + d2) = 3(2 + 1) = 9 and d5 = 4(d4 + d3) = 4(9 + 2) = 44.

The expected value of the number of ladies for which the professions are assigned correctly by the graphologist turns out to be:

0 ⋅ p(0) + 1 ⋅ p(1) + 2 ⋅ p(2) + 3 ⋅ p(3) + 4 ⋅ p(4) + 5 ⋅ p(5)

= 0 + (5C1 ⋅ d4)5! + 2 ⋅ (5C2 ⋅ d3)5! + 3 ⋅ (5C3 ⋅ d2)5! + 4 ⋅ (5C4 ⋅ d1)5! + 5 ⋅ (5C5 ⋅ d0)5!

= (0 + 45 + 40 + 30 + 0 + 5)120

= 1

On trying to generalize this result for n ladies and n professions, we get the same result. The expected number of correct assignments by the graphologist is always 1, no matter how many persons are involved.

This happens because

0 ⋅ nC0 ⋅ dn + 1 ⋅ nC1 ⋅ dn-1 + … + n ⋅ nCn ⋅ d0 = n!

where n is a positive integer, d0 = 1, dk is the number of derangements of k elements.

As a result, for n number of persons, where n is any integer, the number of correct assignments of professions turns out to be:

0 ⋅ p(0) + 1 ⋅ p(1) + … + n ⋅ p(n) = 0 ⋅ (nC0 ⋅ dn)n! + (nC1 ⋅ dn - 1)n! + … + 5 ⋅ (nCn ⋅ d0)n! = (0 ⋅ nC0 ⋅ dn + 1 ⋅ nC1 ⋅ dn-1 + … + n ⋅ nCn ⋅ d0)n! = n!n! = 1

That was too much work to verify a simple fact. Here is a simpler way to verify it. Let us denote the graphologist's set of assignments of professions using a sequence of n numbers. The kth number in the sequence is 1 if the kth person is assigned the correct profession; it is 0 otherwise. For each person, the probability that he is assigned the correct profession is 1n. This is because out of the n! possible ways of assigning n professions to n people, for a given k, the number of ways the kth person is assigned the correct profession is (n - 1)! and (n - 1)!n! = 1n.

So, the expected value of each number in the sequence is: 1/n. It can be proven that expectation is linear. One result of this is, E(x + y) = E(x) + E(y), where E(r) is used to denote the expected value of a random variable r defined on a given probability space. In other words, expectation of sum of random variables is equal to the sum of expectations of each random variable. So, the sum of all the numbers in the sequence is n⋅1n = 1.

If you enjoyed this, here is a little fun problem for you that I made up while thinking on the concepts of derangement and expected value. Prove that

nC0d0 + nC1d1 + … + nCndn = n!

## Factoring step in Impossible Puzzle

In my last blog post on solving the Impossible Puzzle, I promised that I would prove the following.

For any positive integer N, there exists only one pair of positive integers a and b such that 1 < a < b and ab = N if and only if N is a product of two different primes, a cube of a prime, or a fourth power of a prime.
In the last post, I gave an intuitive demonstration of why this should be true. However, intuition may be misleading and should not be relied upon. We need to prove this by pure logic.

From the fundamental theorem of arithmetic, we know that any integer greater than 1 can be expressed as a unique product of primes. A corollary of this theorem is:

Any positive integer N > 1 can be written uniquely in a canonical form N = p1k1p2k2 … pnkn where, each ki is a positive integer and each pi is a prime for positive integers i, 1 ≤ i ≤ n.
From this, let me prove the following lemma.

Lemma: The number of possible factors of N is (k1 + 1)(k2 + 1)…(kn + 1)

Proof: Any factor of N can be expressed as p1r1p2r2 … pnrn, where 0 ≤ ri ≤ ki for 1 ≤ i ≤ n.

In the expression for the factor of N, we can see that each pi can be raised by any power between 0 and ki (including 0 and ki). So, there are ki + 1 different ways to choose the exponent for each pi in the above expression.

Hence, the total number of possible factors is: (k1 + 1)(k2 + 1)…(kn + 1). ∎

Now, let me prove the theorem I mentioned in the beginning of this post.

Theorem: For any positive integer N, there exists only one pair of positive integers a and b such that 1 < a < b and ab = N if and only if N is a product of two different primes, a cube of a prime, or a fourth power of a prime.

Proof: The sufficiency of the condition is easy to show.

If N is a product of two different primes, p1 and p2, where p1 < p2, the set of factors of N is {1, p1, p2, N}. Thus, it has only two factors a = p1 and b = p2 such that 1 < a < b and ab = N.

If N is a cube of a prime p, N = p3 and the set of factors of N is {1, p, p2, p3}. a = p and b = p2 are the only factors such that 1 < a < b and ab = N.

If N is a fourth power of a prime p, N = p4 and the set of factors of N is {1, p, p2, p3, p4}. a = p and b = p3 are the only factors such that 1 < a < b and ab = N.

Now, let us prove the necessity of the condition.

If a positive integer N has only one two factors a and b such that 1 < a < b, it implies that there are two possibilities of the total set of factors of N. If N is not a perfect square, the total set of factors of N is {1, a, b, N} where 1 × N = a × b. If N is a perfect square, such that √N = r, the total set of factors of N is {1, a, r, b, N} where 1 × N = a × b = r × r.

So, in any case, the total number of factors N has, is either 4 or 5.

Let us write N as p1k1p2k2…pnkn for primes p1, p2, &hellip, pn and positive integers k1, k2, …, kn.

For n ≥ 3, the total number of factors N has, is (k1 + 1)(k2 + 1)…(kn + 1) ≥ (k1 + 1)(k2 + 1)(k3 + 1) ≥ 6. But we have shown that the total number of factors of N is either 4 or 5. So, n must be less than 3. Hence, we can write N as p1k1p2k2 and thus the total number of factors it has can be written as (k1 + 1)(k2 + 1). Without loss of generality, let us assume k1 ≤ k2.

All possible factorings of 4 and 5 are as follows.

• 4 = 1 × 4 = (0 + 1)(3 + 1)
• 4 = 2 × 2 = (1 + 1)(1 + 1)
• 5 = 1 × 5 = (0 + 1)(4 + 1)
From the above enumeration of factorings we can see that one of the following must be true.
• k1 = 0 and k2 = 3
• k1 = k2 = 1
• k1 = 0 and k2 = 4
If k1 = 0 and k2 = 3, N = p23.

If k1 = k2 = 1, N = p1p2.

If k1 = 0 and k2 = 4, N = p24.

So, we have shown that N is a product of two primes, a cube of a prime, or a fourth power of a prime.∎

For a few more interesting results on factors, see http://susam.in/downloads/mathematics/theorems/factors.pdf.

## Solving the Impossible Puzzle

Last evening, I stumbled upon this interesting puzzle: http://en.wikipedia.org/wiki/Impossible_Puzzle. Let me quote the problem statement from the Wikipedia article.

Given are X and Y, two integers, greater than 1, are not equal, and their sum is less than 100. S and P are two mathematicians; S is given the sum X + Y, and P is given the product X × Y of these numbers.
1. P says "I cannot find these numbers."
2. S says "I was sure that you could not find them."
3. P says "Then, I found these numbers."
4. S says "If you could find them, then I also found them."
What are these numbers?
I'll explain how I solved it. Without loss of generality, we can assume X to be smaller than Y. X can not be greater than 49, otherwise the sum X + Y wouldn't be less than 100. For example, if X is 50, the sum is 101 even for the smallest possible value of Y, i.e. Y = 51. Since, the sum of X and Y is less than 100 and X can take a minimum value of 2, Y can not be greater than 97. The smallest sum possible is 5 for X = 2 and Y = 3.

Let me explain these things once again, algebraically.

X < Y and X + Y < 100 ⇒ X + 1 ≤ Y and X + Y ≤ 99 ⇒ 2X + 1 ≤ X + Y ≤ 99 ⇒ X ≤ 49

X > 1 and X + Y < 100 &rArr 2 ≤ X and X + Y ≤ 99 ⇒ 2 + Y ≤ X + Y ≤ 99 ⇒ Y ≤ 97.

Y > X and X > 1 ⇒ Y ≥ X + 1 and X ≥ 2 ⇒ Y ≥ 3

Y > X > 1 ⇒ Y ≥ X + 1 and X ≥ 2 ⇒ X + Y ≥ 2X + 1 ≥ 2 ⋅ 2 + 1 = 5

The above constraints automatically place some constraints on the product XY.

X > 1 and Y > X ⇒ X ≥ 2 and Y ≥ X + 1 ⇒ XY ≥ X(X + 1) ≥ 6

X + Y < 100 ⇒ X + Y ≤ 99 ⇒ Y ≤ 99 − X ⇒ XY ≤ X(99 − X)

ddX X(99 − X) = ddX 99X − X2 = 99 − 2X

This value is positive for all real numbers X in the closed interval [2, 49]. So, X(99 − X) is monotonically increasing in this interval. Therefore, the largest possible value of X(99 − X) is 49(99 − 49) = 2450.

Hence, XY ≤ 2450

Let me write down all the constraints we have found so far neatly.

1. 2 ≤ X ≤ 49
2. 3 ≤ Y ≤ 97
3. 5 ≤ X + Y ≤ 99
4. 6 ≤ XY ≤ 2450
Now, let us analyze what the two mathematicians said.

P says "I cannot find these numbers."

If X and Y are two prime numbers, P can easily factor the product to find the numbers X and Y. For example, if XY = 35, he can easily determine that the two numbers are 5 and 7. The only other way of factoring it is 1 × 35 but X = 1 is not allowed in this puzzle.

If XY is a cube of a prime number, he can find the two numbers as X = ∛(XY) and Y = [∛(XY)]2. For example, if XY = 125, he can say that the two numbers are 5 and 25.

If XY is a fourth power of a prime number, he can find the two numbers as X = ∜(XY) and [∜(XY)]3. For example, if XY = 81, he can say that the two numbers are 3 and 27. Factoring it as 9 × 9 is not useful because X = Y is not allowed in this puzzle.

It is not possible to find X and Y for any other type of product. For example, if P is given 12, there is no way for him to know whether the numbers are X = 2 and Y = 6, or X = 3 and Y = 4.

If you are not convinced with this demonstration, please wait for my next blog post in which I will prove:

For any positive integer N, there exists only one pair of positive integers a and b such that 1 < a < b and ab = N if and only if N is a product of two different primes, a cube of a prime, or a fourth power of a prime.
Update: The blog post is now available: Factoring step in Impossible Puzzle

So, the information we get from the first sentence uttered by P is that the product given to him is definitely neither a product of two primes nor a cube of a prime nor a fourth power of a prime.

S says "I was sure that you could not find them."

S can be sure that P wouldn't find them only if he has got a number that can not be expressed as a sum of two primes or as a sum of a prime and its square or a sum of a prime and its cube. So, if S is given 11, he would be sure that P can not find X and Y. I wrote a Python program to find all such sums.

Here is the output from the program: possible-sums.py

11 17 23 27 29 35 37 41 47 51 53 57 59 65 67 71 77 79 83 87 89 93
95 97

P says "Then, I found these numbers."

P realizes that if S was sure of what he said, then the sum S has got can not be expressed as a sum of two primes, sum of a prime and its square, or sum of a prime and its cube. From the output of the program above, we saw that there are 24 such possible sums.

Now, P must have factored the product with him into two factors and compared their sum with the numbers in this list of possible sums. He must have found exactly one way of factoring (ignoring the ones that involve 1 or equal factors) possible such that the sum of the factors belonged to the list. Hence, he could decide that the pair of factors from the only possible factoring were X and Y.

Let us take 24 as an example. 24 can be factored as 2 × 12, 3 × 8 and 4 × 6. However, 3 and 8 is the only pair of factors which has a sum that belongs to the list of possible sums with S. So, if P is given 24, he can determine that X = 3 and Y = 8.

However, if P is given 30, he would find that it can be factored in two different ways such that their sum is a possible sum that S might have. For example, if 30 is factored into 2 and 15, the sum of the factors is 17 which is a possible sum S might have. If it factored into 5 and 6, their sum is 11 which again is a possible sum S might have. So, in such a case, P can't determine whether X = 2 and Y = 15, or X = 5 and Y = 6.

So, P definitely has a product which can be factored into X and Y in only one way such that X + Y is a possible sum that S might have. I wrote another Python program to find all such products.

Here are the first 10 lines of output from the program: possible-products.py

If P has 18: X = 2, Y = 9, S has 11
If P has 24: X = 3, Y = 8, S has 11
If P has 28: X = 4, Y = 7, S has 11
If P has 50: X = 2, Y = 25, S has 27
If P has 52: X = 4, Y = 13, S has 17
If P has 54: X = 2, Y = 27, S has 29
If P has 76: X = 4, Y = 19, S has 23
If P has 92: X = 4, Y = 23, S has 27
If P has 96: X = 3, Y = 32, S has 35
If P has 98: X = 2, Y = 49, S has 51

S says "If you could find them, then I also found them."

From the above output, we can see that there are certain sums that S might have for which there are multiple possible products that might be with P. For example, if S has 11, P might have 18, 24, etc. In such a case S can not conclude on a particular product P has, and the values of X and Y.

But since S said that he could also find X and Y, S certainly has a sum for which there is exactly one possible product that P might have. The program finds such sums and prints it in the end. Here are the last 10 lines of output.

If P has 2296: X = 41, Y = 56, S has 97
If P has 2310: X = 42, Y = 55, S has 97
If P has 2322: X = 43, Y = 54, S has 97
If P has 2332: X = 44, Y = 53, S has 97
If P has 2340: X = 45, Y = 52, S has 97
If P has 2346: X = 46, Y = 51, S has 97
If P has 2350: X = 47, Y = 50, S has 97
If P has 2352: X = 48, Y = 49, S has 97

X = 4 and Y = 13 has a unique sum.


## Product of negatives

Negative multiplied by negative is positive.

This is usually taught to us in school after our knowledge about numbers expands to include the negative numbers. Most of us take this fact for granted. I took it on faith as well.

One of my pastimes is to ponder on these facts that I once learnt by faith and observable evidence rather than reason and logic. I see whether I can see these facts in a new light. A few years ago, I thought over this one as well and proving this fact was easy.

First let me provide an intuitive understanding of why it should be so. In the discussion to follow, we will assume that we already know some basic properties like distributive property, a × (−b) = −(a × b), etc. because I don't want to prove each and every thing involved here right from the Dedekind–Peano axioms.

We know that 7 × 8 = 56. Let us express 7 as (10 − 3) and 8 as (10 − 2).

So, the following must be true.

(10 − 3) × (10 − 2) = 56

Using the distributive property of multiplication over addition, we get,

10 × 10 + (−3) × 10 + 10 × (−2) + (−3) × (−2) = 56

We already know that −3 × 10 = −30 and 10 × −2 = −20. But we do not know what (−3) × (−2) is. So, let us see what we get by using what we know.

10 × 10 + (−3) × 10 + 10 × (−2) + (−3) × (−2) = 56

⇔ 100 + (−30) + (−20) + (−2) × (−3) = 56

⇔ 50 + (−2) × (−3) = 56

⇔ (−2) × (−3) = 6

We see that (10 − 3) × (10 − 2) = 56 is true if and only if (−2) × (−3) is considered as 6.

However this is not a proper proof. One may ask whether this rule is true for all such calculations even when the numbers are different. Can we prove that −a × −b = a × b is absolutely necessary for all real numbers a and b? So, let me try a proper algebraic proof.

Let a and b be two positive real numbers. We know that a − a = 0.

Multiplying both sides by −b, we get,

(a − a) × (−b) = 0 × (−b)

Again, using the distributive property of multiplication over addition, we get,

a × (−b) + (−a) × (−b) = 0.

We know that a × (−b) = −(a × b), but we do not know what (−a) × (−b) is. Using what we know, we get,

−(a × b) + (−a) * (−b) = 0.

Adding (a × b) to both sides, we get,

(−a) × (−b) = a × b

## Repeating calendars

A lot of people ask me why I love solving mathematical puzzles. Some ask me how solving puzzles would be useful in my life. The simplest and the only answer I can give is: I enjoy solving puzzles. The concepts learnt while solving a puzzle sometimes turn out to be useful at work or while designing an algorithm. Last night, it helped me earn a pizza for dinner.

A few days back, a friend of mine called me and told me about something that she felt was an interesting thing she had read somewhere. She had read that February 2010 is a special month since each day of the week occurs exactly four times. While I was wondering how obvious this was, she also made the bold claim that this happens only once in 11 years.

I told her that whatever she had read was wrong but she didn't agree. She was ready to bet on it. She was so sure of what she had read that inspite of reminding her that the problem was mathematical and I was better at mathematics than she was, she went ahead with the bet. Of course she lost it and treated me to a pizza shop.

The month of February in any non leap year has 28 days and 28 is a multiple of 7. So, each day would occur exactly 4 times in the month of February of any non leap year. Have a look.

susam@swift:~$paste <(cal feb 2010) <(cal feb 2011) February 2010 February 2011 Su Mo Tu We Th Fr Sa Su Mo Tu We Th Fr Sa 1 2 3 4 5 6 1 2 3 4 5 7 8 9 10 11 12 13 6 7 8 9 10 11 12 14 15 16 17 18 19 20 13 14 15 16 17 18 19 21 22 23 24 25 26 27 20 21 22 23 24 25 26 28 27 28  A few weeks ago, Prunthaban told me about another variant of this hoax. As per the hoax, February 2010 is special because it starts with a Monday and each day of the week occurs exactly four times. We thought a little over it and found it to be wrong. Here is a counter counterexample: susam@swift:~$ paste <(cal
feb 2010) <(cal feb 2021) <(cal feb 2032)
February 2010           February 2021           February 2032
Su Mo Tu We Th Fr Sa    Su Mo Tu We Th Fr Sa    Su Mo Tu We Th Fr Sa
1  2  3  4  5  6        1  2  3  4  5  6     1  2  3  4  5  6  7
7  8  9 10 11 12 13     7  8  9 10 11 12 13     8  9 10 11 12 13 14
14 15 16 17 18 19 20    14 15 16 17 18 19 20    15 16 17 18 19 20 21
21 22 23 24 25 26 27    21 22 23 24 25 26 27    22 23 24 25 26 27 28
28                      28                      29

February 2032 doesn't start on Monday. This hoax became popular probably because some people found this to hold true for 2021 and they never bothered to check it for 2032.

An interesting question that came to Prunthaban's mind was: Which year will have a calendar which looks exactly like this year's calendar? In other words, after how many years will we have a calendar where the day of week on which each date of the year falls on is exactly equal to the day of the week each date of this year falls on?

We found that after every 28 years the calendars repeat. So, the calendar of 2038 would look exactly like the calendar of 2010. The mathematics behind this is pretty simple.

Each year has 365 days. A leap year has 366 days. Every fourth year is a leap year. Note that the number of years after which the calendar repeats must be a multiple of 4. Let the number of years after which the calendar repeats be p.

If a certain year begins on a particular day and another year begins on the same day, then the number of days between these two years must be a multiple of 7. So, the number of days in these p years must be a multiple of 7.

Now, p must be a multiple of 4. This will ensure that if year x is a leap year if and only if year x + p is a leap year. This condition is necessary because the calendar of a leap year and that of a non leap year can never match. So, let p = 4k, where k is a positive integer.

As per the above discussion, the following must be true:

365 · 3k + 366 · k ≡ 0 (mod 7) ⇒ 1461k ≡ 0 (mod 7)

Now 7 ∤ 1461 and 7 is a prime number. So, 7 must divide k. The smallest positive integer k for which 7 ∣ k is k = 7. So, p = 28.

However, while writing this blog post, I realized that there is a subtle exception. Every fourth year is not really a leap year. A year that is divisible by 100 but not by 400 is not a leap year.

susam@swift:~\$ paste <(cal
feb 1900) <(cal feb 2000) <(cal feb 2100)
February 1900           February 2000           February 2100
Su Mo Tu We Th Fr Sa    Su Mo Tu We Th Fr Sa    Su Mo Tu We Th Fr Sa
1  2  3           1  2  3  4  5        1  2  3  4  5  6
4  5  6  7  8  9 10     6  7  8  9 10 11 12     7  8  9 10 11 12 13
11 12 13 14 15 16 17    13 14 15 16 17 18 19    14 15 16 17 18 19 20
18 19 20 21 22 23 24    20 21 22 23 24 25 26    21 22 23 24 25 26 27
25 26 27 28             27 28 29                28

One can see that 2000 is a leap year while 1900 and 2100 aren't since 1900 as well as 2100 are divisible by 100 but not by 400. So, the calendars of 2100 and 2128 do not match since 2100 is not a leap year while 2128 is.

A more general answer must consider this as well. A fix to the previous solution that works across centuries must have p = 400k. Considering that in a period of 400 years, we have three years that are divisible by 100 but not by 400, we conclude that the number of leap years in a period of 400 years is 97. So,

365 · (400 − 97) · k + 366 · 97k ≡ 0 (mod 7) ⇒ 146097k ≡ 0 (mod 7)

Since, 7 ∣ 146097, the smallest positive integer k for which the above equivalence holds true is k = 1. So, p = 400.

Hence, the calendar repeats every 400 years.