Let me begin this post with a puzzle.

Alex and Bob worked as financial advisers for the same company. They
drew equal salaries from the company. They behaved well at the office.
Both worked on similar assignments. Each assignment required a yes-no
decision. The company used the decisions made by them to make profits.

After the recession hit the company very badly, the company has to fire
one of them. Both Alex and Bob have worked on almost the same number of
assignments in the last ten years. Alex has consistently made the
correct decision about 80% of the time every year. Bob, on the other
hand, has made the correct decision only about 5% of the time every
year.

The company has decided to keep Bob and fire Alex. Why?

If you want to solve this puzzle yourself, you might want to stop here
and think for a while. There are spoilers ahead.

Before giving away the solution to this puzzle, let me discuss
something else.

Shannon's noisy channel coding theorem guarantees that in a binary
symmetric channel with an error rate \(p\), it is possible to transmit
digital data with arbitrarily small probability of error up to a
transmission rate of
\[
1 + p \mathop{\log_2} p + (1 - p) \mathop{\log_2} (1 - p)
\]

We'll call this the optimal transmission rate. The error rate \(p\) is
the probability that the channel will corrupt a bit from 1 to 0 or vice
versa. The transmission rate is the ratio of the number of data bits to
the total number of bits used to represent them. If we employ
error-correcting codes, the transmission rate would be less than 1
because we'll need extra bits for error correction. As a result, the
number of bits used to represent them would be more than the number of
data bits.

For \(p = 0\), the optimal transmission rate is at its maximum of 1.
Loosely speaking, if the channel is error-free, we do not need extra
bits for error correction. The optimal transmission rate is lowest,
namely 0, when \(p = \frac{1}{2}\). In this case, each bit
received is completely random since the channel is just as likely to
deliver 0 or 1 irrespective of the original bit sent by the transmitter.
However, it might be surprising that as the error rate increases from
\(0.5\) to \(1\), the optimal transmission rate increases. When
\(p = 1\), the optimal transmission rate is at its maximum again. It is
easy to understand this intuitively. At \(p = 1\), the channel is
guaranteed to corrupt each bit. The receiver can correct the error by
inverting each bit in the received message, so we do not need extra bits
for error correction in this case as well.
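The symmetry described above can be checked numerically. The following is a minimal sketch that evaluates the formula for the optimal transmission rate at a few error rates. The function name `bsc_rate` is made up for this illustration, and the code assumes the usual convention that \(0 \log_2 0\) is taken to be 0.

```python
import math

def bsc_rate(p: float) -> float:
    """Evaluate 1 + p*log2(p) + (1 - p)*log2(1 - p) for an error rate p."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be between 0 and 1")
    rate = 1.0
    for q in (p, 1.0 - p):
        # Assumption: a term with q == 0 contributes 0 (the limiting value),
        # so we skip it instead of calling log2(0).
        if q > 0.0:
            rate += q * math.log2(q)
    return rate

# The rate falls as p goes from 0 to 0.5 and rises again from 0.5 to 1.
for p in (0.0, 0.05, 0.2, 0.5, 0.8, 0.95, 1.0):
    print(f"p = {p:.2f}  optimal rate = {bsc_rate(p):.4f}")
```

The printed values for \(p\) and \(1 - p\) coincide, which is the formula's way of saying that a channel that is wrong in a predictable way is as useful as one that is right equally often.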

Now, let me get back to the puzzle I mentioned at the beginning of this
post.

The company decided to keep Bob because he was more useful to the
company. He helped the company make the correct decision 95% of the
time. They simply did the opposite of what Bob recommended and made huge
profits in the last ten years. Alex, on the other hand, helped them make
the correct decision only 80% of the time, so he has to go. It's unfair
but it's more profitable.
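
The company's reasoning can be sketched as a small simulation. This is only an illustration under assumptions that are not part of the puzzle: each yes-no assignment has a correct answer chosen by a fair coin flip, an adviser's advice is right independently with a fixed probability, and the company inverts the advice of any adviser who is right less than half the time. The names `usable_accuracy` and `simulate` are made up for this sketch.

```python
import random

def usable_accuracy(accuracy: float) -> float:
    """Best accuracy the company can reach by following or inverting advice."""
    return max(accuracy, 1.0 - accuracy)

def simulate(accuracy: float, trials: int = 100_000, seed: int = 0) -> float:
    """Estimate how often the company decides correctly using one adviser."""
    rng = random.Random(seed)
    invert = accuracy < 0.5  # do the opposite of a mostly-wrong adviser
    correct = 0
    for _ in range(trials):
        truth = rng.random() < 0.5  # assumed: correct answer is a fair coin
        advice = truth if rng.random() < accuracy else not truth
        decision = (not advice) if invert else advice
        correct += decision == truth
    return correct / trials

for name, accuracy in (("Alex", 0.80), ("Bob", 0.05)):
    print(name, usable_accuracy(accuracy), simulate(accuracy))
```

Under these assumptions, the simulated success rate with Bob's inverted advice comes out close to 95%, while Alex's stays close to 80%.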