Reverse engineering C programs (64bit vs 32bit) - bin 0x10
Key Takeaways
Reverse engineering C programs by compiling small test cases with GCC and inspecting them in tools like IDA, GDB, and Hopper, comparing 64-bit and 32-bit builds and analyzing the assembler code for variables, control flow, and function calls.
Full Transcript
We have already had many episodes where we read assembler code and reverse engineered how a program works. We have even written our first exploit by using a buffer overflow vulnerability in a program written in C. In this video, I want to show you how you can learn to read the assembler produced from C code yourself. The idea is simple: just write some C code with different C language features and then look at the assembler code that is produced by compiling it.
This is often part of normal research. For example, listen to what Ian Beer from Google Project Zero says during a talk about his research on inter-process calls on OS X: "One approach to reversing, or to understanding how this kind of thing works, would be to sit in IDA and just reverse the serialization and deserialization code and slowly build up a picture of how it works. But another kind of quite nice way to do it is just write a test program to send little messages and then find the right place using LLDB to break and just start dumping hex." So because he had to understand a fairly complex data structure, he simply wrote a test program to analyze it instead of reversing a full application. There was also a talk and a paper from Black Hat USA in 2007 about how to reverse C++ programs by looking at C++ concepts and what they look like in assembler.
So now I have created three different C code test cases. You can find them in my GitHub repository or just write them yourself. One is about variables and data types, one is about function calls, and one is about control flow stuff like loops and ifs.
So let's start with variables.c. The first thing I want to point out are those triple X's. Those triple X's are defined as an assembler NOP instruction. The reason for that is that later, when we look into the disassembly, we can find those NOPs separating our tests, and that is pretty neat. It makes it easier to see which line of C code is responsible for which lines in assembler.
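A minimal sketch of the NOP-marker idea, assuming GCC or Clang on x86 (the `__asm__` inline-assembly syntax is compiler specific). The name `marker_demo` is hypothetical, and the exact definition in the repository may differ; the video places its tests directly in `main`.

```c
/* Assumption: GCC/Clang inline assembly. XXX emits one NOP instruction,
   which acts as a visible separator between tests in the disassembly. */
#define XXX __asm__ volatile ("nop")

/* Hypothetical test function; each statement is fenced off by a NOP. */
int marker_demo(void)
{
    volatile int a = 1;   /* volatile keeps the store even when optimizing */
    XXX;
    volatile int b = 2;
    XXX;
    return a + b;
}
```

Calling `marker_demo()` from `main`, building once normally and once with `-m32`, and then disassembling the function should show the two stores with a `nop` after each one.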
I will not go over every single test. This is something you can do yourself: simply pause the video at certain points or clone the repository. Anyway, let's get started.
First, we define a couple of numbers: unsigned and signed integers, floating point numbers, and different sizes with uint32_t or uint64_t. The latter is important because normal integers might have different sizes depending on a 32-bit or 64-bit architecture, which can lead to bugs. So better use data types where you are guaranteed to get a certain size. If you want to learn how to program C properly, there's a great article called "How to C (as of 2016)". After that, we create an array of 32-bit unsigned integers and access one of its elements. Then we look at a single character and also a string. Maybe you know that a star means pointer in C, so we define a variable that points to a string.
I have added a Makefile, so you can simply type make into the terminal to compile all files, or make clean to remove the binaries. This will create a 32-bit and a 64-bit version of the variables program. But as you can see, I get an error trying to compile a 32-bit version with -m32 on this 64-bit machine. So I have to install the 32-bit libraries first to be able to build the code. After installing those, the build works fine. A Makefile is just a little script that defines how a project has to be compiled.
So let's open the 32-bit and 64-bit versions next to each other in GDB and disassemble main, and also open the source code. Okay, now let's look at the first integer examples with negative values and signed and unsigned values. First of all, all those local variables are stored somewhere on the stack. You can see that because they are referenced relative to the base pointer. Then you notice that the assembler code doesn't know negative numbers. They show up as hex values starting with fff. If you are interested in how negative numbers are represented, watch my 10th episode about numbers.
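To make the point about fixed-size types and negative numbers concrete, here is a small sketch. The helpers `raw_bits32` and `fixed_widths_ok` are hypothetical names for illustration and are not taken from the repository.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: returns the raw 32-bit pattern of a signed value,
   i.e. the bytes the disassembly shows for that constant. */
uint32_t raw_bits32(int32_t value)
{
    uint32_t bits;
    memcpy(&bits, &value, sizeof bits);   /* copy the bytes unchanged */
    return bits;
}

/* Hypothetical helper: fixed-width types keep their size on both
   32-bit and 64-bit builds (assuming 8-bit bytes). */
int fixed_widths_ok(void)
{
    return sizeof(uint32_t) == 4 && sizeof(uint64_t) == 8;
}
```

With these helpers, `raw_bits32(-1)` yields `0xffffffff`, which matches the kind of value that appears as an immediate in the disassembly for a negative constant.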
Also, there is no difference between variables that are signed or unsigned. But there is one interesting difference between 32-bit and 64-bit code: we define one number to be 64 bits long, but in 32-bit the registers are only 32 bits wide. So if you want to write a full 64-bit value, you have to write twice. The floating point numbers are also interesting because they are stored somewhere else in the program, and that value is then moved into the local variable.
The array is also interesting. We created an array with 10 values but only set the first five values to a default value. As you can see, those values are first stored on the stack and then moved from that location on the stack to the real array location, instead of being written directly to the array. It does it this way. No idea why. And you can see down here, where we reference the third entry, that this is the real location of the array on the stack.
Next come the strings. You can see that a character is just a byte. It doesn't matter if we have an unsigned int with 8 bits or a char. It's the same. And strings are also referenced via an address. So the local variable is not an array of characters. The local variable contains an address pointing to a string.
Now let's have a look at the control flow. Open it in radare2, analyze all, seek to the main function, and enter visual mode. First we set a variable to zero. And then comes the if. This is done by loading this local variable into a register, comparing it to hex FF, and then jumping if it was less or equal. So you can see which branch it may take. Then comes a while loop. We load the local variable into a register again, compare it to a value, and either jump inside the block or leave. Inside the block we load this value again, increment it, and write it back. Now compare it to the for loop. It's basically the same. We start by setting the variable to zero. Then we check whether the loop condition is still true.
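The control-flow constructs discussed here can be sketched as follows. This is an illustrative reconstruction, not the repository file: the function names are hypothetical, and the `> 0xff` condition is an assumption inferred from the "compare to hex FF, jump if less or equal" pattern.

```c
/* Assumption: GCC/Clang inline assembly, used as a marker in the loop body. */
#define XXX __asm__ volatile ("nop")

/* An if like this typically compiles to "cmp ..., 0xff" followed by a
   jump-if-less-or-equal that skips the taken branch. */
int branch_demo(int value)
{
    if (value > 0xff) {
        return 1;
    }
    return 0;
}

/* while loop: compare, run the body, increment, jump back. */
int count_with_while(int limit)
{
    int i = 0;
    while (i < limit) {
        XXX;
        i++;
    }
    return i;
}

/* for loop: same compare, body, and increment, just written differently. */
int count_with_for(int limit)
{
    int i;
    for (i = 0; i < limit; i++) {
        XXX;
    }
    return i;
}
```

Disassembling `count_with_while` and `count_with_for` side by side in an unoptimized build is a quick way to compare the two loop forms.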
And inside the loop block, we can see our NOP. At the end of the block, we increment the variable by one. Exactly the same as the while loop. So you can see that for and while loops in C are basically the same.
Next, let's have a look at how functions are called. Again, open both the 32-bit and 64-bit versions. The first thing you notice is that the 64-bit version moves zero into EAX. No idea why. Otherwise, the function call looks the same, except look at the addresses. If you have no ASLR, then 64-bit code is generally at hex 40 something, while 32-bit code is at hex 80 something. Knowing stuff like that is helpful, because if you see an address with 40 something, you know immediately that it is pointing into your code.
The next function returns a value and we save it in a variable. You can see that in both cases the value is taken from the EAX register. Okay. So apparently return values are handled via EAX.
Now function 3 is interesting because we pass a parameter to it. In 32-bit, you can see that the value is loaded from somewhere and then stored on top of the stack, and then the function is called. But in 64-bit, we see that the value is loaded into the EDI register. This is our first big difference. Functions in 64-bit seem to be called with parameters in registers, while in 32-bit the parameters are stored on the stack. The next function uses two parameters, and again you can see how 32-bit just places the values on the stack: the first parameter on the top of the stack, the second a bit further down. But in 64-bit you can see that it uses ESI and EDI for that.
Now we get curious: what does 64-bit do when we have so many parameters that we cannot keep them all in registers? First of all, the 32-bit code. Again, you can see how the parameters are stored on the stack, and the first parameter is on top of the stack and is the last value moved. That is what we would expect. In 64-bit, we can see that the first couple of parameters are stored in the registers EDI, ESI, EDX and so forth.
But from the seventh parameter on, they get stored on the stack as well. Awesome.
Now you can identify all kinds of different assembler patterns. You don't need a decompiler all the time. You can do this all in your head. And when you reverse more and more programs, those patterns become easier to recognize, and you will not feel overwhelmed again by the mass of weird instructions. You will be able to scan over a function and say, "Ah, here's a local variable, then it calls this function with the variable as a parameter, and then the return value is used in a loop." And you can use the same method to understand how different disassemblers like Hopper, radare2, and GDB display code, or for example how different the AT&T assembler syntax is from the Intel syntax. I hope you have a lot of fun next time reversing a program.
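As a recap of the function-call observations, here is a sketch of what such a test file might contain. The function names are hypothetical and not taken from the repository. The register comments assume the System V AMD64 calling convention used on 64-bit Linux, and the usual stack-based convention for 32-bit x86 Linux.

```c
/* Return value: handed back in EAX on both 32-bit and 64-bit builds. */
int return_value(void)
{
    return 0x1234;
}

/* One parameter: pushed to the stack on 32-bit, passed in EDI on 64-bit. */
int one_param(int a)
{
    return a + 1;
}

/* Two parameters: 64-bit uses EDI for the first and ESI for the second. */
int two_params(int a, int b)
{
    return a - b;
}

/* Eight integer parameters: 64-bit uses EDI, ESI, EDX, ECX, R8D, R9D for
   the first six; the seventh and eighth are passed on the stack. */
int many_params(int a, int b, int c, int d, int e, int f, int g, int h)
{
    return a + b + c + d + e + f + g + h;
}
```

Calling each of these from `main` with constant arguments and disassembling `main` in both builds makes the difference between register and stack arguments easy to spot.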
Original Description
Learning how to reverse engineer programs written in C. We do this by comparing programs compiled for 64-bit and 32-bit.