Basics of Binary Exploitation

Posted on Aug 19, 2019

Intro to assembly

Each personal computer has a microprocessor that manages the computer’s arithmetical, logical, and control activities.

Each family of processors has its own set of instructions for handling operations like getting user input, displaying info on screen, etc. This set of instructions is called ‘machine language instructions’. A processor can only understand these machine language instructions, which are basically strings of 0’s & 1’s. Writing those by hand is impractical, so here comes the need for our low-level assembly, which gives each instruction a human-readable name.

Assembly can be intimidating, so I will sum it up for you. This is enough to start pwning some binaries.

  • In assembly you are given 8-32 global variables of fixed size to work with, which are called “registers”.
  • There are some special registers too. The most important is the “program counter”, which tells the CPU which instruction we’re executing next. This is the same as the IP (instruction pointer) - don’t get confused.
  • Technically, all the computation is executed on registers. A 64-bit processor has 64-bit registers, which let the CPU hold 64-bit values and 64-bit memory addresses. A 32-bit processor cannot execute code built for a 64-bit instruction set, while 64-bit processors such as x86-64 keep a compatibility mode for 32-bit code. Therefore, most programs written for 32-bit processors can run on 64-bit computers, while 64-bit programs cannot run on 32-bit machines.
  • But big programs need more space, so they access memory. Memory is accessed either directly by address or through push & pop operations on a stack.
  • Control flow is handled by altering the program counter directly using jumps, branches, or calls. You can think of these instructions as “GOTOs”.
  • Status flags are generally 1 bit wide. Each one tells whether a condition is set or reset, for example whether the last result was zero.
  • Branches are just GOTOs that are predicated on a status flag, like, “GOTO this address only if the last arithmetic operation resulted in zero”.
  • A CALL is just an unconditional GOTO that pushes the next address on the stack, so a RET instruction can later pop it off and keep going where the CALL left off.

I think this is enough info about assembly and you’re ready to dive into binary exploitation. If you want to learn more, this book is awesome - here

Let’s start pwning binaries

To start, you will need a disassembler (converts 0’s & 1’s [machine code] into assembly) like radare2, IDA, objdump, etc. and a debugger (used to debug programs) like gdb, OllyDbg, etc.

Let’s get started. Here is the code that I wrote, and we will try to exploit it. It’s a simple license checker which compares two strings. The source will be available on my GitHub.

crackme1.c

#include <stdio.h>
#include <string.h>
int main(int argc, char *argv[]){      
	if(argc==2){
		printf("Checking Licence: %s\n", argv[1]);
		if(strcmp(argv[1], "hello_stranger")==0){
			printf("Access Granted!\n");
			printf("Your are 1337 h4xx0r\n");
		}
		else{
			printf("Wrong!\n");
		}	
	}
	else{
		fprintf(stderr, "Usage: %s <name>\n", argv[0]);
		return 1;
	}
	
	return 0;
}

This code is pretty simple and I hope you can understand it. So let’s compile it.

$ gcc crackme1.c -o crackme1

Now we will use gdb to debug our program

$ gdb crackme1

Now we know that every program has a main function. So let’s disassemble it.

(gdb) disassemble main

It will print this:

Dump of assembler code for function main:
   0x0000000000001169 <+0>:	push   %rbp
   0x000000000000116a <+1>:	mov    %rsp,%rbp
   0x000000000000116d <+4>:	sub    $0x10,%rsp
   0x0000000000001171 <+8>:	mov    %edi,-0x4(%rbp)
   0x0000000000001174 <+11>:	mov    %rsi,-0x10(%rbp)
   0x0000000000001178 <+15>:	cmpl   $0x2,-0x4(%rbp)
   0x000000000000117c <+19>:	jne    0x11e3 <main+122>
   0x000000000000117e <+21>:	mov    -0x10(%rbp),%rax
   0x0000000000001182 <+25>:	add    $0x8,%rax
   0x0000000000001186 <+29>:	mov    (%rax),%rax
   0x0000000000001189 <+32>:	mov    %rax,%rsi
   0x000000000000118c <+35>:	lea    0xe71(%rip),%rdi        # 0x2004
   0x0000000000001193 <+42>:	mov    $0x0,%eax
   0x0000000000001198 <+47>:	callq  0x1040 <printf@plt>
   0x000000000000119d <+52>:	mov    -0x10(%rbp),%rax
   0x00000000000011a1 <+56>:	add    $0x8,%rax
   0x00000000000011a5 <+60>:	mov    (%rax),%rax
   0x00000000000011a8 <+63>:	lea    0xe6b(%rip),%rsi        # 0x201a
   0x00000000000011af <+70>:	mov    %rax,%rdi
   0x00000000000011b2 <+73>:	callq  0x1050 <strcmp@plt>
   0x00000000000011b7 <+78>:	test   %eax,%eax
   0x00000000000011b9 <+80>:	jne    0x11d5 <main+108>
   0x00000000000011bb <+82>:	lea    0xe67(%rip),%rdi        # 0x2029
   0x00000000000011c2 <+89>:	callq  0x1030 <puts@plt>
   0x00000000000011c7 <+94>:	lea    0xe6b(%rip),%rdi        # 0x2039
   0x00000000000011ce <+101>:	callq  0x1030 <puts@plt>
   0x00000000000011d3 <+106>:	jmp    0x120c <main+163>
   0x00000000000011d5 <+108>:	lea    0xe72(%rip),%rdi        # 0x204e
   0x00000000000011dc <+115>:	callq  0x1030 <puts@plt>
   0x00000000000011e1 <+120>:	jmp    0x120c <main+163>
   0x00000000000011e3 <+122>:	mov    -0x10(%rbp),%rax
   0x00000000000011e7 <+126>:	mov    (%rax),%rdx
   0x00000000000011ea <+129>:	mov    0x2e6f(%rip),%rax        # 0x4060 <stderr@@GLIBC_2.2.5>
   0x00000000000011f1 <+136>:	lea    0xe5d(%rip),%rsi        # 0x2055
   0x00000000000011f8 <+143>:	mov    %rax,%rdi
   0x00000000000011fb <+146>:	mov    $0x0,%eax
   0x0000000000001200 <+151>:	callq  0x1060 <fprintf@plt>
   0x0000000000001205 <+156>:	mov    $0x1,%eax
   0x000000000000120a <+161>:	jmp    0x1211 <main+168>
   0x000000000000120c <+163>:	mov    $0x0,%eax
   0x0000000000001211 <+168>:	leaveq 
   0x0000000000001212 <+169>:	retq   
End of assembler dump.

This looks ugly, right? Well, it’s AT&T syntax. Change it to Intel syntax using:

(gdb) set disassembly-flavor intel

For a permanent change, create ~/.gdbinit and add

set disassembly-flavor intel

Disassemble main again and you will get more readable code

Dump of assembler code for function main:
   0x0000000000001169 <+0>:	push   rbp
   0x000000000000116a <+1>:	mov    rbp,rsp
   0x000000000000116d <+4>:	sub    rsp,0x10
   0x0000000000001171 <+8>:	mov    DWORD PTR [rbp-0x4],edi
   0x0000000000001174 <+11>:	mov    QWORD PTR [rbp-0x10],rsi
   0x0000000000001178 <+15>:	cmp    DWORD PTR [rbp-0x4],0x2
   0x000000000000117c <+19>:	jne    0x11e3 <main+122>
   0x000000000000117e <+21>:	mov    rax,QWORD PTR [rbp-0x10]
   0x0000000000001182 <+25>:	add    rax,0x8
   0x0000000000001186 <+29>:	mov    rax,QWORD PTR [rax]
   0x0000000000001189 <+32>:	mov    rsi,rax
   0x000000000000118c <+35>:	lea    rdi,[rip+0xe71]        # 0x2004
   0x0000000000001193 <+42>:	mov    eax,0x0
   0x0000000000001198 <+47>:	call   0x1040 <printf@plt>
   0x000000000000119d <+52>:	mov    rax,QWORD PTR [rbp-0x10]
   0x00000000000011a1 <+56>:	add    rax,0x8
   0x00000000000011a5 <+60>:	mov    rax,QWORD PTR [rax]
   0x00000000000011a8 <+63>:	lea    rsi,[rip+0xe6b]        # 0x201a
   0x00000000000011af <+70>:	mov    rdi,rax
   0x00000000000011b2 <+73>:	call   0x1050 <strcmp@plt>
   0x00000000000011b7 <+78>:	test   eax,eax
   0x00000000000011b9 <+80>:	jne    0x11d5 <main+108>
   0x00000000000011bb <+82>:	lea    rdi,[rip+0xe67]        # 0x2029
   0x00000000000011c2 <+89>:	call   0x1030 <puts@plt>
   0x00000000000011c7 <+94>:	lea    rdi,[rip+0xe6b]        # 0x2039
   0x00000000000011ce <+101>:	call   0x1030 <puts@plt>
   0x00000000000011d3 <+106>:	jmp    0x120c <main+163>
   0x00000000000011d5 <+108>:	lea    rdi,[rip+0xe72]        # 0x204e
   0x00000000000011dc <+115>:	call   0x1030 <puts@plt>
   0x00000000000011e1 <+120>:	jmp    0x120c <main+163>
   0x00000000000011e3 <+122>:	mov    rax,QWORD PTR [rbp-0x10]
   0x00000000000011e7 <+126>:	mov    rdx,QWORD PTR [rax]
   0x00000000000011ea <+129>:	mov    rax,QWORD PTR [rip+0x2e6f]        # 0x4060 <stderr@@GLIBC_2.2.5>
   0x00000000000011f1 <+136>:	lea    rsi,[rip+0xe5d]        # 0x2055
   0x00000000000011f8 <+143>:	mov    rdi,rax
   0x00000000000011fb <+146>:	mov    eax,0x0
   0x0000000000001200 <+151>:	call   0x1060 <fprintf@plt>
   0x0000000000001205 <+156>:	mov    eax,0x1
   0x000000000000120a <+161>:	jmp    0x1211 <main+168>
   0x000000000000120c <+163>:	mov    eax,0x0
   0x0000000000001211 <+168>:	leave  
   0x0000000000001212 <+169>:	ret    
End of assembler dump.

Now make an assumption about how this binary works. When you run it without any argument, it displays the usage message. If you pass two arguments, where the first one is the program name itself and the second one is the license key, it displays an access granted or access denied message. Now apply that assumption to the assembly code.

For exploitation, we can ignore most of the stuff. At 0x1178 you can see a cmp instruction, which compares the value stored at [rbp-0x4] with hex 0x2 (which is 2 in decimal). That value was copied from edi at 0x1171, and according to our assumption it must be the argument count. Just below that, 0x117c has a jne (jump if not equal). So if the argument count is not 2, control flow jumps to addr 0x11e3, which prints the usage message.

At addr 0x1198 it calls printf, which is probably printing “Checking Licence:” when you run the binary. The next interesting addr is 0x11b2, which calls strcmp (string compare). It should be comparing our key with the correct key. strcmp leaves its result in eax: 0 if the strings match, non-zero otherwise. Next we have 0x11b7, which is a test instruction: test eax,eax sets the zero flag only when eax is 0. After that, the jne at 0x11b9 jumps to addr 0x11d5 (the “Wrong!” branch) if the zero flag is not set, that is, if the strings are not equal.

Otherwise execution falls through to 0x11c2 and 0x11ce, which call puts (it just prints stuff). This prints “Access Granted!” and some other text if we give the correct key. Next is 0x11d3, which jumps to 0x120c, where the return value is set to 0 before main returns. Now let’s exploit it using gdb to print access granted without using the key.

First set a breakpoint at main. A breakpoint is an address in the program where the debugger pauses execution.

(gdb) break *main

Now run the program and watch the control flow. You can use pen-paper for better understanding.

(gdb) run 
(gdb) ni

ni executes the next instruction. After that, just press Enter and gdb will repeat the command, executing the following instruction. Now try running the program with a key.

(gdb) run random_key
(gdb) ni

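Carefully watch the control flow this time. According to our assumption, if we change the value of eax to 0 while stopped at addr 0x11b7, we are telling the program that the strings matched, and it will print the access granted message. To do that, set breakpoint 2 at the address of test eax, eax. Once the program has been run, gdb shows relocated runtime addresses instead of the small offsets from before, so disassemble main again and use the address shown there.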

(gdb) disass main
(gdb) break *0x00005555555551b7

Run the program again with a random key.

(gdb) run random_key

After hitting the first breakpoint, type continue to run until the next breakpoint.

(gdb) continue
(gdb) info registers
(gdb) set $eax=0
(gdb) ni

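Here I set the value of eax to 0 and ran the program instruction by instruction. With eax=0, the test at 0x00005555555551b7 sets the zero flag, so the jne at 0x00005555555551b9 is executed but not taken, and execution falls through to the access granted code. Use ni to keep executing the next instruction. The full session looks like this: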

(gdb) run random_key
The program being debugged has been started already.
Start it from the beginning? (y or n) y
Starting program: crackme1 random_key

Breakpoint 1, 0x0000555555555169 in main ()
(gdb) continue
Continuing.
Checking Licence: random_key

Breakpoint 2, 0x00005555555551b7 in main ()
(gdb) info registers
rax            0x3                 3
rbx            0x0                 0
rcx            0xfff7fdff          4294442495
rdx            0x68                104
rsi            0x55555555601a      93824992239642
rdi            0x7fffffffe563      140737488348515
rbp            0x7fffffffe170      0x7fffffffe170
rsp            0x7fffffffe160      0x7fffffffe160
r8             0xffffffff          4294967295
r9             0x1d                29
r10            0xfffffffffffff1a9  -3671
r11            0x7ffff7f36140      140737353310528
r12            0x555555555070      93824992235632
r13            0x7fffffffe250      140737488347728
r14            0x0                 0
r15            0x0                 0
rip            0x5555555551b7      0x5555555551b7 <main+78>
eflags         0x206               [ PF IF ]
cs             0x33                51
ss             0x2b                43
ds             0x0                 0
es             0x0                 0
fs             0x0                 0
gs             0x0                 0
(gdb) set $eax=0
(gdb) info registers 
rax            0x0                 0
rbx            0x0                 0
rcx            0xfff7fdff          4294442495
rdx            0x68                104
rsi            0x55555555601a      93824992239642
rdi            0x7fffffffe563      140737488348515
rbp            0x7fffffffe170      0x7fffffffe170
rsp            0x7fffffffe160      0x7fffffffe160
r8             0xffffffff          4294967295
r9             0x1d                29
r10            0xfffffffffffff1a9  -3671
r11            0x7ffff7f36140      140737353310528
r12            0x555555555070      93824992235632
r13            0x7fffffffe250      140737488347728
r14            0x0                 0
r15            0x0                 0
rip            0x5555555551b7      0x5555555551b7 <main+78>
eflags         0x206               [ PF IF ]
cs             0x33                51
ss             0x2b                43
ds             0x0                 0
es             0x0                 0
fs             0x0                 0
gs             0x0                 0
(gdb) ni
0x00005555555551b9 in main ()
(gdb) 
0x00005555555551bb in main ()
(gdb) 
0x00005555555551c2 in main ()
(gdb) 
Access Granted!
0x00005555555551c7 in main ()
(gdb) 
0x00005555555551ce in main ()
(gdb) 
Your are 1337 h4xx0r
0x00005555555551d3 in main ()
(gdb)

Voila! You have cracked the program without knowing the correct key.

This one is just a basic intro to binary exploitation, and it is enough to get you started.