A particular type of question comes up again and again in C programming forums. Two things about such questions irritate the experienced programmers there. First, they are so common that many people do not want to respond at all, not even by posting a link to another thread where a similar question has already been answered. Second, and more importantly, even when someone provides the correct answer, many others ignore it and fill the thread with incorrect answers.

The questions usually ask for the output of code like this.

#include <stdio.h>

int main()
{
    int i = 5;
    printf("%d %d %d\n", i, i--, ++i);
    return 0;
}

The output is 5 6 5 when compiled with gcc and 6 6 6 when compiled with the Microsoft C/C++ compiler that comes with Microsoft Visual Studio. The compiler versions that gave me these outputs are gcc (Debian 4.3.2-1.1) 4.3.2 and Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 14.00.50727.42 for 80x86.

Another such frequently asked question looks more or less like this.

#include <stdio.h>

int main()
{
    int a = 5;
    a += a++ + a++;
    printf("%d\n", a);
    return 0;
}

In this case, I got 17 as the output with both compilers.

The behaviour of such C programs is undefined. In the statements printf("%d %d %d\n", i, i--, ++i); and a += a++ + a++;, no sequence point separates the modifications of i or of a. The semicolon, which ends the full expression, marks the only sequence point that matters here. C guarantees that all side effects of an expression are complete by the next sequence point in the program. If the value of a variable is modified more than once between two consecutive sequence points, the behaviour is undefined. Such code may behave differently when compiled with different compilers.
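To make the point concrete, here is a minimal sketch of how each statement could be rewritten so that every modification is separated by a sequence point. The function names print_left_to_right and sum_then_increment are hypothetical, and the left-to-right reading they implement is only an assumption about what the questioner might have meant. It is not the "correct" output of the original code, which has none.

```c
#include <stdio.h>

/* Hypothetical helper: one well-defined reading of
 *     printf("%d %d %d\n", i, i--, ++i);
 * assuming the arguments were meant to be evaluated left to right.
 * The three values are also stored in out[] so they can be inspected. */
void print_left_to_right(int i, int out[3])
{
    out[0] = i;   /* first argument: current value of i */
    out[1] = i;   /* second argument: value of i before the decrement */
    i--;          /* the decrement happens in its own statement */
    ++i;          /* the increment happens in its own statement */
    out[2] = i;   /* third argument: value of i after the increment */
    printf("%d %d %d\n", out[0], out[1], out[2]);
}

/* Hypothetical helper: one well-defined reading of
 *     a += a++ + a++;
 * assuming each a++ completes before the next operand is read. */
int sum_then_increment(int a)
{
    int first = a;        /* value yielded by the first a++ */
    a++;
    int second = a;       /* value yielded by the second a++ */
    a++;
    a += first + second;  /* a is modified only once in this statement */
    return a;
}
```

With a starting value of 5, the first function prints 5 5 5 and the second returns 18. Neither matches what the compilers above produced for the original code, which is exactly why the original must not be relied upon.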

Before I quote the relevant sections from the ISO/IEC standard, let me quote something from K&R. In Section 2.12 (Precedence and Order of Evaluation) of the book, the authors write,

C, like most languages, does not specify the order in which the operands of an operator are evaluated. (The exceptions are &&, ||, ?:, and ','.) For example, in a statement like
x = f() + g();
f may be evaluated before g or vice versa; thus if either f or g alters a variable on which the other depends, x can depend on the order of evaluation. Intermediate results can be stored in temporary variables to ensure a particular sequence.
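The advice about temporary variables can be sketched as follows. The bodies of f and g, the shared counter, and the helper names are assumptions made up for illustration; they are not from the book.

```c
/* Shared state that both functions alter (hypothetical example). */
static int counter = 0;

void reset_counter(void) { counter = 0; }

int f(void) { counter += 1; return counter; }  /* adds one, returns the new value */
int g(void) { counter *= 2; return counter; }  /* doubles, returns the new value */

/* Forces f to run before g: each declaration ends a full expression,
 * so there is a sequence point between the two calls. */
int sum_f_then_g(void)
{
    int fv = f();
    int gv = g();
    return fv + gv;
}

/* Forces the opposite order, to show that the result really differs. */
int sum_g_then_f(void)
{
    int gv = g();
    int fv = f();
    return fv + gv;
}
```

Starting from a counter of 0, sum_f_then_g() returns 3 while sum_g_then_f() returns 1. A plain f() + g() is allowed to produce either value.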

In the next paragraph, they write,

Similarly, the order in which function arguments are evaluated is not specified, so the statement
printf("%d %d\n", ++n, power(2, n));   /* WRONG */
can produce different results with different compilers, depending on whether n is incremented before power is called. The solution, of course, is to write
++n;
printf("%d %d\n", n, power(2, n));

They provide one more example in this section.

One unhappy situation is typified by the statement
a[i] = i++;
The question is whether the subscript is the old value of i or the new. Compilers can interpret this in different ways, and generate different answers depending on their interpretation.
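Either intention can be written unambiguously by moving the increment into its own statement. The following is a minimal sketch; the function names are hypothetical, and each function returns the updated value of i.

```c
/* Use the old value of i as the subscript, then increment i. */
int store_at_old_index(int a[], int i)
{
    a[i] = i;
    i++;
    return i;
}

/* Increment i first, then store the old value at the new subscript. */
int store_at_new_index(int a[], int i)
{
    int old = i;
    i++;
    a[i] = old;
    return i;
}
```

Both versions modify i exactly once per statement, so the result no longer depends on how a compiler chooses to interpret a[i] = i++;.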

If you want to read more on this, download the ISO/IEC 9899 C standard and turn to page 438 for Annex C (Sequence Points), which lists all the sequence points. The end of a full expression, which a semicolon typically marks, is one of them. The + and ++ operators do not introduce sequence points.

Next, read section 5.1.2.3 (Program execution), point 2.

Accessing a volatile object, modifying an object, modifying a file, or calling a function that does any of those operations are all side effects, which are changes in the state of the execution environment. Evaluation of an expression may produce side effects. At certain specified points in the execution sequence called sequence points, all side effects of previous evaluations shall be complete and no side effects of subsequent evaluations shall have taken place. (A summary of the sequence points is given in annex C.)
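By contrast, the operators that K&R list as exceptions do introduce sequence points, so modifying a variable on both sides of them is well defined. Here is a small sketch with hypothetical function names, assuming the && and comma operators behave as Annex C describes.

```c
/* && has a sequence point after its left operand, so the first i++
 * is complete before the second one is evaluated. The right operand
 * is skipped entirely when the left operand yields 0. */
int and_example(int i, int *both_nonzero)
{
    *both_nonzero = i++ && i++;
    return i;
}

/* The comma operator has a sequence point after its left operand,
 * so the right operand reads the already incremented value of i. */
int comma_example(int i)
{
    return (i++, i * 10);
}
```

Called with 5, and_example returns 7 and sets *both_nonzero to 1; called with 0, it returns 1 because the right operand is never evaluated. comma_example(5) returns 60.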

4 comments

Veetrag said:

I completely agree with what you are trying to say. I tried to explain the same stuff 6 years back when I was put in charge of designing questions for a programming competition. I tried to explain to the organizers how ambiguous these questions can be, considering the fact that most participants/compilers do not follow the ANSI standard properly. Secondly, these questions are not a real test of programming skills. They are just gimmicks to 'wow' people; real-life programming is much different.

Susam Pal said:

I understand what you mean. I look at it in a different way. These questions would be a good test of how well one knows the language. If one tries to find the output for such programs and further tries to justify why the output should be so, clearly, he doesn't understand the language. One who understands the language would mention that the code has undefined behaviour.

I also believe that such questions are very pertinent to professional programming. It is important to know what leads to undefined behaviour and to write programs that behave uniformly across OSes, architectures and compilers.

Javin Paul said:

There are similar features in other programming languages too, like threading in Java, which is completely dependent on JVM behavior, or garbage collection in Java, which varies from JVM to JVM.

CH Gowri Kumar said:

Well, the reason why the same/similar questions arise is simple - people, out of ignorance, read wrong books and there are few who can guide them to use the right books and the right platform/environment to learn C. The Indian market is full of books that *still* teach C using Turbo C and the questions/exercises contain precisely questions about these. Many of the examples in such books include conio.h!
