7455608.c4

BlackHoodie 2018 reverse engineering challenge

Intro

This year I attended BlackHoodie 2018, a women-only reverse engineering bootcamp. It was three days filled with awesome talks and some hard-core RE. During the event, a challenge was set by @N1aKan. The challenge can be found on github and an official writeup has been released. But I wanted to write my own, so here we are! This challenge can be solved in three different ways, and I will try to cover all of them!

Minimap:

Tools

I initially solved this challenge with IDA Pro, mostly because I am most comfortable with it and time was precious. However, IDA is ridiculously expensive (and I only have access to one due to my School, for my research), so for this writeup I will be using radare2. I have never used r2, but I have heard many great things about it, so I wanted to give it a try, and it will be a learning experience for me too! GDB is a good alternative too.

File recon

We were given a file, 7355608.c4 and told to find the flag. I have never encountered the .c4 extension before, so I used file, to get some information about it.

$> file 7355608.c4 
7355608.c4: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, for GNU/Linux 2.6.32, BuildID[sha1]=17c8857a28c514ad309346cff48d88a3459b0731, not stripped

So we know we are dealing with a 64-bit ELF executable. Which means we can run it! (I initially did so in a VM that wasn’t networked, just in case )

$> ./7355608.c4 
The bomb has been planted.
BOOM !

Okay, so the file does not expect any input. Probably not a buffer overflow of sorts. But the output should be a good staring point for reverse engineering it. At this point I guessed we need to somehow defuse the bomb. Let’s have a look.

RE recon

I used this cheatsheet, this tutorial and the book for my crash-course into radare2!

I generally try to understand as much as possible what a binary does, figure out what functions are actually used and what is random junk. In this section I’ll be explaining the various parts of the program I’ve identified as relevant. This challenge can be solved in 15 minutes by debugging it. But that is not my purpose right now.

Load the binary with r2 7355608.c4. I used aaa to get r2 to analyse the binary (resolve symbols & refs, analyse function calls, etc). We then use afl to list all the functions in the binary. We spot the function main, and we can easily hypothesise this is most likely a C program. We can tell r2 to find the starting point by using s main. Some interesting function names appear, such as ptrace, boom(), func_cipher42, func_cipher13, func_cipher6 and nuke. We’ll keep an eye out for these, they may or may not be used, but good to notice them. We’ll now tell r2 to go to the main function: s main.

[0x004005e0]> aaa [x] Analyze all flags starting with sym. and entry0 (aa) [x] Analyze function calls (aac) [x] Analyze len bytes of instructions for references (aar) [x] Constructing a function name for fcn.* and sym.func.* functions (aan) [x] Type matching analysis for all functions (afta) [x] Use -AA or aaaa to perform additional experimental analysis. [0x004005e0]> afl 0x00400528 3 26 sym._init 0x00400560 1 6 sym.imp.putchar 0x00400570 1 6 sym.imp.__stack_chk_fail 0x00400580 1 6 sym.imp.__libc_start_main 0x00400590 1 6 sym.imp.malloc 0x004005a0 1 6 sym.imp.ptrace 0x004005b0 1 6 sym.imp.exit 0x004005c0 1 6 sym.imp.__ctype_b_loc 0x004005d0 1 6 sub.__gmon_start_5d0 0x004005e0 1 41 entry0 0x00400610 4 50 -> 41 sym.deregister_tm_clones 0x00400650 4 58 -> 55 sym.register_tm_clones 0x00400690 3 28 sym.__do_global_dtors_aux 0x004006b0 4 38 -> 35 entry1.init 0x004006d6 4 123 sym.boom 0x00400751 4 110 sym.func_cipher42 0x004007bf 4 85 sym.func_cipher13 0x00400814 4 97 sym.func_cipher6 0x00400875 4 88 sym.nuke 0x004008cd 1 95 sym.apoptosis 0x0040092c 31 1402 main 0x00400eb0 4 101 sym.__libc_csu_init 0x00400f20 1 2 sym.__libc_csu_fini 0x00400f24 1 9 sym._fini [0x004005e0]> s main [0x0040092c]>

We can see in the r2 cursor the address jumped to 0x0040092c, which is the start of the main() function. We’re now going to use V to enter visual mode. This will display the hex values (hex editor), but we are looking for the graph view, so press V again. We investigate the function and we notice many variables. In the middle of all the variable assignment, we can see there are two malloc calls: one of 44 bytes and one of 120 bytes. I renamed these to array_44 and array_120 (r2 commands to rename the variable used in the loop: in Visual Mode, go with the cursor at the line where the variable is used, hit d and then n, input new name).

; size_t size ; ',' ; 44
edi = 0x2c
; void *malloc(size_t size)
sym.imp.malloc ();[ga]
qword [array_44] = rax   
; size_t size ; 'x' ; 120
edi = 0x78
; void *malloc(size_t size)
sym.imp.malloc ();[ga] 
qword [array_120] = rax  

Moving on, we can see a lot of local variables being assigned the values of the characters from the strings The bomb has been planted. and I see you debugging me!. These are great finds, especially since we see the first string as output when running the binary. We can assume these are statically defined arrays.

dword [local_f0h] = 0x54 ; 'T' ; 84
dword [local_ech] = 0x68 ; 'h' ; 104
dword [local_e8h] = 0x65 ; 'e' ; 101
dword [local_e4h] = 0x20 ; 32
dword [local_e0h] = 0x62 ; 'b' ; 98
dword [local_dch] = 0x6f ; 'o' ; 111
dword [local_d8h] = 0x6d ; 'm' ; 109
dword [local_d4h] = 0x62 ; 'b' ; 98
[ ... ]

First output

Looking past the first (massive) basic block, we see the first interesting set of blocks. The function putchar() is being called in a loop which runs 25 times; I renamed loc_16ch to index, as this is the variable tested in the loop (r2 commands to change value from hex to decimal: in Visual Mode, go with the cursor at the line where the variable is used, hit d and then i, enter desired base - d). The variables loaded start with local_f0h, which we identified above. So this piece of code is the one printing The bomb has been planted.. Great!

planted

isDigit()

The following block of instructions was quite had to understand what it does, but I found these two resources (__ctype_b_loc StackOverflow and codesdope.com) really useful. local_168h is assigned the value 0x7f in the first basic block of the program. The instructions below makes a lookup to the table __ctype_b_loc for the value stored in local_168h (0x7f). It then masks the returned value by 0x800 (masking the _ISdigit flag) and checks if it is set (basically, it checks if the value of local_168h is a digit). The check will not pass, so we will have to change that (make note of address 0x400cb1). At this point I decided to rename local_168h to flag_7b.

isDigit

The following small snippet allows you to check which flags are set for the 0x7f character. I wrote it in order to confirm my understanding of the __ctype_b_loc table. You can run it here.

#include <stdio.h>
#include <ctype.h>

int main () {
   int var = 0x7f;
   printf("Alphanum::                %i\n", isalnum(var) ); 
   printf("Alphabet:                 %i\n", isalpha(var) );
   printf("Control character:        %i\n", iscntrl(var) );
   printf("Digit:                    %i\n", isdigit(var) ); 
   printf("Graphical representation: %i\n", isgraph(var) );
   printf("Lowercase:                %i\n", islower(var) );
   printf("Printable:                %i\n", isprint(var) );
   printf("Punctuation:              %i\n", ispunct(var) );
   printf("Space character:          %i\n", isspace(var) ); 
   printf("Uppercase:                %i\n", isupper(var) ); 
   printf("Hexadecimal digit:        %i\n", isxdigit(var) ); 
   return(0);
}        

Boom and ptrace

By quickly inspecting the control flow graph, we realise there are many paths which lead the program to call the boom() function. This prints the characters BOOM! and exits the program. We will want to avoid this function!

Carrying on the investigation, we can see ptrace is called, in order to determine if the program is being debugged. If it detects debugging, it prints 23 characters starting with the variable local_150h, which corresponds to the string I see you debugging me!. After this, the program calls the boom() function.

ptrace

If the ptrace detection is passed successfully (which it won’t if debugging it, make note of address 0x400cd5), the program checks whether the variable flag_7b is not equal to 0x7B (in the first basic block flag_7b is set to 0x7B and has not been altered so far). Again, the program will take the wrong route here, so we make note of address 0x400d27.

Cipher42

If this comparison is successful, func_cipher42 is called, with three parameters (local_170h which should be 0, local_80h which seems to be another statically defined array, and array_120 which is an array with the size of 120 bytes). The function iterates over the elements of the array starting with local_80h (30 iterations), and computes a xor of each element with the value 0x42, and saves in the the array array_120. Also, local_170h is set to 0x7B. (r2 commands: s sym.func_cipher42 to seek the function). We can see that the variables starting with local_80h are defined as 4 byte values. This, combined with the fact there are 30 iterations and that the array size is 120 bytes, leads us to believe array_120 will hold 30 values, and array_44 will hold 11 values.

cipher42

Apoptosis

After this, we have another comparison of flag_7b with local_170h, which has been set to 0x7B in func_cipher42. If this check fails, boom() is called, otherwise apoptosis is called, with two parameters: array_120 which is the array previously used and array_44 which is the other allocated array, of 11 elements. The function performs a mod 88 operation on the first element of the array array_120 and stores it in the first element of array_44. While the instructions may seem complicated, they basically compute array_120[0] / 88, then multiply the result by 88 and subtract that from the initial value. The compiler optimises the modulus operation by computing the value stored in edx = 0x2e8ba2e9 as a fixed point approximation of (2**35 / 88 + 1) * 2 (without approximation = 780903147) (source used here).

apoptosis

Cipher13

Next, the function func_cipher13 is called, with the argument array_120. A quick inspection of the function reveals the array is being iterated over and from each element the value 13 (0x1d) is being subtracted.

cipher13

isAlpha()

We have another check on the variable index. Following the previous rationale, we determine the mask 0x400 will be checking whether the character is a letter of the alphabet. index should have last been used in the loop that prints the string The bomb has been planted., therefore its value should be 0x1a. Running the code snippet above on 0x1a reveals the only flag set is _IScntrl. Therefore, this check will lead us to yet another boom() function (make note of address 0x400dc4).

Path choices

Next, the value in the eax register (should be 0 from the previous blocks) is compared to 0x400. If it is equal, the path goes on to call func_cipher42 again. If it is not equal, it compares eax with -1. If the values are equal, the path leads to a new function, func_cipher6. All paths then converge and continue with the basic block at address 0x400e0d.

Cipher6

func_cipher6 is called with one argument, array_120. This function was a bit harder to decipher than the previous two cipher functions, as the operation performed wasn’t only dependent on the elements of array_120.

cipher6

There is another index that loops over the 30 elements of the array. The key to understanding the function relies in the following lines of instructions (comments to help understand it):

edx = dword [index] ; edx = index
eax = edx           ; eax = index                       
eax += eax          ; eax = 2 * index
eax += edx          ; eax = 3 * index
eax += eax          ; eax = 6 * index
eax ~= eax          ; eax = - (6 * index)

So, at each iteration, the function subtracts (6 * index) from each element of array_120.

Final boom

Finally, the converging basic block checks whether the 3nd and 15th (offsets 0x8 and 0x38) element of array_120 are equal. If they are not, the path leads us to one final boom(). If they are equal, the program goes on to print each of the 30 characters of array_120, masked by the value 0x7f.

That’s it for understanding the program. Now we know everything it does, and we can go on to solving it!

Solving the challenge

We’re going to look at the ways of solving this challenge: by debugging it (while it’s running), by patching it, and by implementing the necessary functions. Let’s get started!

Debugging

I loaded the binary with the -d flag, which enables debugging (r2 -d 7355608.c4). Ran aaa and s main again. In the above section, I mentioned some interesting addresses to remember. We’ll start debugging by setting some breakpoints.

The first one will be at 0x400cb2. Breakpoints can be set with the db command: db 0x400cb2. Without any arguments, db list the breakpoints you have set. Program execution until the first breakpoint encountered is achieved with dc. Once we hit the breakpoint, we can see we have passed the string printing loop:

:> dc The bomb has been planted. hit breakpoint at: 400cb1 :>

Our breakpoint is on a je instruction (jump equal), and the preceding instruction is a test:

 0x00400caf      85c0           test eax, eax    ; eax = 0
 0x00400cb1 b    74d0           je 0x400c83

This means that, if the eax register is zero, the jump will be taken. However, that leads us to triggering boom(), so we’d like to avoid this. If the eax register is indeed zero, the test instruction will have set the ZF bit (zero flag) in the FLAGS register. We can view all the registers with dr=. If Z appears in the value of rflags, then ZF is set. We’ll have to clear it, such that the jump is not taken. We can do this simply by assigning zero to the bit: dr zf=0. We can check if the path taken is correct by single stepping with dso or by pressing F8.

(note: radare2 has a nice view of the assembly & the registers which gets live updated as you step through the instructions. It’s reached by Vpp.)

The next breakpoint will be 0x00400cd6. It compares the rax register with -1. If ptrace detects the program is being debugged, then rax will indeed be set to -1 and we can see that reflected in its value. Again, we will have to clear ZF, in order to avoid running into boom().

 0x00400cd1      4883f8ff       cmp rax, -1        ; rax = 0xffffffff
 0x00400cd5 b    7549           jne 0x400d20

Our next breakpoing is 0x00400d27. We compare the value of flag_7b to 0x7b. We can inspect the value of flag_7b with .afvd flag_7b (analyze function variables display). If they are equal, it will trigger boom(), so once again we have to clear ZF.

 0x00400d20      83bd98feffff.  cmp dword [flag_7b], 0x7b       ; flag_7b = 0x7b               
 0x00400d27 b    750c           jne 0x400d35            

We then let the program run through func_cipher42 and set a breakpoint at 0x00400d5e. If the variable local_170h is not equal to flag_7b, then we end up running into boom(). However, by inspecting the two variables, we can see they are the same and we can carry on.

 0x00400d52      8b8590feffff   mov eax, dword [local_170h]    ; local_170h = 0x7b
 0x00400d58      398598feffff   cmp dword [flag_7b], eax       ; flag_7b = 0x7b   
 0x00400d5e b    740c           je 0x400d6c   

The next breakpoint will be 0x00400dc4. We will let the program run through apoptosis (which we know is useless) and func_cipher13 unhindered. As we determined, 0x1a is not a letter of the alphabet, therefore we need to once again clear ZF and force the program to go on our desired path.

 0x00400db1      2500040000     and eax, 0x400           ; eax = 0   
 0x00400db6      89859cfeffff   mov dword [index], eax   ; index = 0  
 0x00400dbc      8b859cfeffff   mov eax, dword [index]   ; eax = 0 
 0x00400dc2      85c0           test eax, eax   
 0x00400dc4 b    741d           je 0x400de3

We can step two instructions (to reach 0x00400dcb) and verify the program does not take the jump.

 0x00400dc6      3d00040000     cmp eax, 0x400           ; eax = 0  
 0x00400dcb      7422           je 0x400def

Two more steps and we reach the comparison of eax with -1 (address 0x00400dd0). If they are not equal, the flow will skip straight to printing array_120. However, we want to pass through func_cipher6, so we have to this time set ZF: dr zf=1.

 0x00400dcd      83f8ff         cmp eax, -1              ; eax = 0 
 0x00400dd0      753b           jne 0x400e0d

Finally, we break on 0x00400e29. We compared the 3rd and 15th element in array_120 to each other, and if they are not equal, the path leads us to boom(). However, they are equal, and so we can continue as we were (see Final boom).

We set one final breakpoint on 0x00400e7c and let it run. It will print the flag for us!

:> db 0x00400e7c :> dc The bomb has been defused ! :) hit breakpoint at: 400e7c :>

Patching

Patching a binary means changing its bytes, in order to change the instructions, ultimately changing the behaviour of the binary. We’re going to want to change all the jumps from the debugging section, where we had to clear or set ZF, to their opposites (e.g. je to jne), except for the ptrace check. As we’ll simply be running the binary, without a debugger attached, this check will naturally pass.

First of all, I made a copy of the binary. It’s easy to mess up, so better safe than sorry: cp 7355608.c4 7355608-patch.c4. I loaded the binary with the -w flag, which allows r2 to write to it: r2 -w 7355608-patch.c4, ran aaa and found the entry point s main.

The first patch will be at 0x400cb2. We need to change je to jne. To do this, we hit A and enter our new instruction jne 0x400c83, then save the changes (press Enter). And that’s it! Let’s carry on.

As I mentioned, we are not going to patch 0x00400cd6, as ptrace should not detect anything.

All the patches:

0x00400cb1      75d0           jne 0x400c83 
0x00400d27      740c           je  0x400d35
0x00400dc4      751d           jne 0x400de3  
0x00400dd0      743b           je  0x400e0d 

Let’s test out the patch!

$> ./7355608-patch.c4 
The bomb has been planted.

The bomb has been defused ! :)

Success! Patching is quite easy in radare2, I’m very pleased with that!

Implementing the code

This works for the challenge because it’s not too complicated and big, and because we managed to understand what all the binary does in the RE recon section. And we can now select the relevant information and values from the binary, and create our answer.

We know we have three functions, func_cipher42, func_cipher13 and func_cipher6, always being applied on the elements of array_120. We can extract the values of the array from the first basic block. We also have a good idea of the order the functions should be applied. Towards the end we had a choice of either applying func_cipher42 a second time or going through func_cipher6, and my guess was we needed to go via the latter one.

Below is the code that outputs the flag!

def ciph42(arr):
    for i in range(len(arr)):
        arr[i] ^= 0x42
    return arr

def ciph13(arr):
    for i in range(len(arr)):
        arr[i] -= 13
    return arr

def ciph6(arr):
    for i in range(len(arr)):
        arr[i] -= 6 * i
    return arr

def and_7f(arr):
    for i in range(len(arr)):
        arr[i] &= 0x7f
    return arr

if __name__ == '__main__':
    arr = [ 0x23, 0x39, 0x3c, 0x7d, 0xc5, 0xd8, 0xdc, 0xdb, 0x1f, 0xe9, 0xe8, 0x80, 0x37, 0xff, 0x84, 0x8e, 0x99, 0xd1, 0x9f, 0xa6, 0xa9, 0x142,0x146,0xbe,0x143,0x81, 0x88, 0x8d, 0xad, 0xa6 ]

    arr = ciph42(arr)
    arr = ciph13(arr)
    arr = ciph6(arr)
    arr = and_7f(arr)
    mstr = ''
    for a in arr:
        mstr += (chr(a))

    print("string:", mstr)
    # string: The bomb has been defused ! :)

Conclusion

And that’s it! We’ve managed to reverse engineer a binary, understand all it does, learn to debug it and patch it all in one go! Good job for making it through!

Using radare2 has been a learning experience and I liked it. While debugging I felt I miss the feedback GDB-peda gives you when you’re at a jump instruction (is/is not taken). But it wasn’t that hard to figure it out. I think I’ll be using r2 from now on at CTFs, on RE and pwn challenges.