This is one of my favourite C-cum-ASM puzzles I often ask friends and colleagues who are interested in C as well as assembly language. The solution to this puzzle is implementation-dependent. I am pretty sure there is no way to solve this in an implementation-independent manner.

Here is the puzzle.

Code for the puzzle:

susam@swift:~/lab$ cat overwrite.c
#include <stdio.h>

void f()
{
}

int main()
{
    printf("1\n");
    f();
    printf("2\n");
    printf("3\n");
}

The problem is to define the function void f() such that the output is 1 followed by 3 as shown below:

susam@swift:~/lab$ gcc overwrite.c && ./a.out
1
3

Printing 3 in void f() and exiting is not allowed as a solution.

With this constraint, the puzzle essentially involves figuring out what code we can place in the body of void f() such that it causes the program to skip the instructions to carry out printf("2\n");. Let me show how I solved this for two implementations.

  1. gcc 4.3.2 on 64-bit Debian 5.0.3 running on 64-bit Intel Core 2 Duo.
  2. Microsoft Visual Studio 2005 on 32-bit Windows XP running on 64-bit Intel Core 2 Duo.

Let me first show step by step how I approached this problem for gcc when I first came across this puzzle. I added the statement char a = 7; to the function void f(). Next, I compiled the code and analyzed the assembly code generated for f() and main() functions.

susam@nifty:~$ cat overwrite.c
#include <stdio.h>

void f()
{
    char a = 7;
}

int main()
{
    printf("1\n");
    f();
    printf("2\n");
    printf("3\n");
    return 0;
}
susam@nifty:~$ gcc -c overwrite.c && objdump -d overwrite.o

overwrite.o:     file format elf64-x86-64


Disassembly of section .text:

0000000000000000 <f>:
   0:   55                      push   %rbp
   1:   48 89 e5                mov    %rsp,%rbp
   4:   c6 45 ff 07             movb  $0x7,-0x1(%rbp)
   8:   c9                      leaveq
   9:   c3                      retq

000000000000000a <main>: a: 55 push %rbp b: 48 89 e5 mov %rsp,%rbp e: bf 00 00 00 00 mov $0x0,%edi 13: e8 00 00 00 00 callq 18 <main+0xe> 18: b8 00 00 00 00 mov $0x0,%eax 1d: e8 00 00 00 00 callq 22 <main+0x18> 22: bf 00 00 00 00 mov $0x0,%edi 27: e8 00 00 00 00 callq 2c <main+0x22> 2c: bf 00 00 00 00 mov $0x0,%edi 31: e8 00 00 00 00 callq 36 <main+0x2c> 36: b8 00 00 00 00 mov $0x0,%eax 3b: c9 leaveq 3c: c3 retq

When main() calls f(), the microprocessor saves the return address in the stack. Line with label 1d in the objdump listing for main() is the call to f(). After f() is executed, the code at line with label 22 is executed. So, the return address that is saved on stack is the address at which the instructions at line with label 22 would be present at runtime. The code highlighted in orange is the assembly code for printf("2\n");. This is the code we want to skip. In other words, we want to modify the return address in the stack to that of the instructions at line labelled 2c. From the listing we can figure that it is equivalent to skipping 10 bytes in the instructions for main or adding 10 to the return address saved in the stack.

So, we know now how many instructions to skip. But how do we get hold of the return address saved in the stack when f() is being executed? This is where the statement char a = 7; that I added in void f() helps. The lines highlighted in yellow in the objdump listing for f() is the assembly code for this statement. We can see that the RSP (stack pointer) is copied into RBP (base pointer) and the variable a is allocated one location below memory address pointed to by RBP. Also, note that the old RBP was pushed into the stack by f() before it copied RSP to RBP.

Let us have a look at the sequence of steps performed in the call to f() and allocation of the variable a.

  1. The microprocessor saves the return address by pushing RIP (instruction pointer) into the stack when main() calls f().
  2. f() saves RBP by pushing it into the stack.
  3. f() copies RSP to RBP and stores 7 at the memory address in RBP minus 1. In other words, the variable a is allocated in the address in RBP minus 1.
At this point, the stack would appear as shown below.

ContentSize (in bytes)
Return address (RIP)8
Base pointer (RBP)8
Variable a1

So, we see that if we add 9 to the address of a, we get hold of the return address. So, here is the solution.

susam@nifty:~$ cat overwrite.c
#include <stdio.h>

void f()
{
    char a;
    * (char **) (&a + 9) += 10;
}

int main()
{
    printf("1\n");
    f();
    printf("2\n");
    printf("3\n");
    return 0;
}
susam@nifty:~$ gcc overwrite.c && ./a.out
1
3

Now, let me present the dumpbin listing and the solution for Visual Studio 2005.

C:\>type overwrite.c
#include <stdio.h>

void f()
{
    char a = 7;
}

int main()
{
    printf("1\n");
    f();
    printf("2\n");
    printf("3\n");
    return 0;
}

C:\>cl overwrite.c
Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 14.00.50727.42
for 80x86
Copyright (C) Microsoft Corporation.  All rights reserved.

overwrite.c
Microsoft (R) Incremental Linker Version 8.00.50727.42
Copyright (C) Microsoft Corporation.  All rights reserved.

/out:overwrite.exe
overwrite.obj

C:\>dumpbin /disasm overwrite.obj
Microsoft (R) COFF/PE Dumper Version 8.00.50727.42
Copyright (C) Microsoft Corporation.  All rights reserved.


Dump of file overwrite.obj

File Type: COFF OBJECT

_f:
  00000000: 55                 push        ebp
  00000001: 8B EC              mov         ebp,esp
  00000003: 51                 push        ecx
  00000004: C6 45 FF 07        mov         byte ptr [ebp-1],7
  00000008: 8B E5              mov         esp,ebp
  0000000A: 5D                 pop         ebp
  0000000B: C3                 ret
  0000000C: CC                 int         3
  0000000D: CC                 int         3
  0000000E: CC                 int         3
  0000000F: CC                 int         3
_main:
  00000010: 55                 push        ebp
  00000011: 8B EC              mov         ebp,esp
  00000013: 68 00 00 00 00     push        offset $SG2224
  00000018: E8 00 00 00 00     call        _printf
  0000001D: 83 C4 04           add         esp,4
  00000020: E8 00 00 00 00     call        _f
  00000025: 68 00 00 00 00     push        offset $SG2225
  0000002A: E8 00 00 00 00     call        _printf
  0000002F: 83 C4 04           add         esp,4
  00000032: 68 00 00 00 00     push        offset $SG2226
  00000037: E8 00 00 00 00     call        _printf
  0000003C: 83 C4 04           add         esp,4
  0000003F: 33 C0              xor         eax,eax
  00000041: 5D                 pop         ebp
  00000042: C3                 ret

  Summary

           B .data
          57 .debug$S
          2F .drectve
          43 .text

Just like in the objdump listing, in this listing too, the lines in yellow show where the variable a is allocated and the lines in orange show the instructions we want to skip. Here is the solution.

C:\>type overwrite.c
#include <stdio.h>

void f()
{
    char a;
    * (char **) (&a + 5) += 13;
}

int main()
{
    printf("1
");
    f();
    printf("2
");
    printf("3
");
    return 0;
}

C:\>cl /w overwrite.c
Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 14.00.50727.42
for 80x86
Copyright (C) Microsoft Corporation.  All rights reserved.

overwrite.c
Microsoft (R) Incremental Linker Version 8.00.50727.42
Copyright (C) Microsoft Corporation.  All rights reserved.

/out:overwrite.exe
overwrite.obj

C:\>overwrite.exe
1
3

Do you have more creative ways of solving this puzzle?

3 comments

Anonymous said:

Good article. This is how buffer overflow is used as a exploit. But I guess, kernel provides some random stack alignment to prevent this.

Anonymous said:

I think you mean ASLR and this can be turned off using echo 0 > /proc/sys/kernel/randomize_va_space (on Linux) - And you can turn of DEP in your Control Panel's System → Advanced → Performance → Settings tab on Windows, iirc.

Anonymous said:

Post a comment

RSS