Running a Buffer Overflow Attack - Computerphile

Computerphile · Intermediate ·🔐 Cybersecurity ·10y ago

Key Takeaways

The video demonstrates a buffer overflow attack on a Linux system, showcasing how to exploit a vulnerable program to gain root access and execute arbitrary code, using tools like GDB and Python.

Full Transcript

So we'll talk about something very different today, very different to my normal image-filtering sort of videos: buffer overflow exploits, what they are and how you do them, which is kind of fun. I'm obviously somewhat of a geek; I quite like these sort of things, low-level memory exploits.

A buffer overflow exploit is a situation where we're using, probably, a low-level C function or something to write a string or some other variable into a piece of memory that is only a certain length, but we're trying to write something in that's longer than that. It then overwrites the later memory addresses, and that can cause all kinds of problems.

The first thing we should talk about is roughly what happens in memory with a program when it's run. We're talking about C programs in Linux today, just because I happen to have a Linux VM running here and it's easier, but this will apply to many different languages and many different operating systems.

When a program is run by the operating system (so we're in some shell and we type in a command line to run a program), the operating system will effectively call, as a function, the main method of your code. But your actual process, your executable, will be held in memory in a very specific way, and it's consistent between different processes. So we have a big block of RAM. We don't know how big our RAM is, because it can vary, but we use something called virtual memory address translation to say that everything in here, this is 0x000..., this is the bottom of the memory as it were, and up here is 0xFFF.... So this is the equivalent of a memory address of 11111111..., all the way up to 32 or 64 bits, and this is nought.

Now, when you use this, there are certain areas of this memory that are always allocated to certain things. Up here we have kernel things, so this will be command-line parameters that we passed to our program, environment variables and so on. Down here we have
something called the text. That's the actual code of our program: the machine instructions that we've compiled get loaded in there. That's read-only, because we don't want to be messing about down there. In here we have data, so uninitialized and initialized variables get held here. And then we have the heap. The heap may have been mentioned from time to time; it's where you allocate large things in your memory, a big area of memory that you can allocate huge chunks on to do various things. What you do with that is of course up to your program. And then up here, perhaps the most important bit, in some ways anyway, is the stack. The stack holds the local variables for each of your functions, and when you call a new function, like, let's say, printf and then some parameters, that gets put on the end of the stack. So the heap grows in this direction as you add memory, and the stack grows in this direction.

Now that I've laid that out, we won't talk about it anymore; we'll just focus on the stack, because that's where a lot of these buffer overflows happen. You can have overflows in other areas, but we're not going to be dealing with them today. I'm going to turn this sideways, because I think it's just a little bit easier to understand; at least, that's how I tended to look at it.

So this is our memory again, nice and big. This is now our stack area (excuse my poor handwriting). Up here we have the high memory addresses, 0xFFF..., so something up here is high, and this is 0x000.... Of course the stack won't be taking up this whole region, but it doesn't matter. So, high memory addresses and low memory addresses, and the stack grows downwards: when we add something onto the end of the stack, it gets put on this side, and then this moves in this direction.

Of course, I'm talking about a stack without telling you what a stack is. Professor Brailsford's already talked about this, and probably done a much better job of explaining it than I would. There's a lot of computer
science depends on stacks. I sometimes think that stacks and trees is just about all computer science is about. So we'll just say that you know how a stack works, and then we'll move on.

We have some program that's calling a function. A function is some area of code that does something and then returns back to where it was before. So this is our calling function here. When the calling function wants to make use of something, it adds the parameters that it's passing onto the stack. So this will be parameter A and this will be parameter B, and they will be added onto the stack in reverse order. Then the assembler code for this function will make something called a call, and that will jump to somewhere else in memory and work with these two things. And it's the nature of this stack that causes us to have problems.

Let's look at some code and then we'll see how it works. I've got myself here a program that isn't very good; I wrote it. It's a very simple piece of C code that allocates some memory on the stack and then copies a string into it from the command line. Up here we've got the main function for C, which takes the number of parameters it's been given and then a pointer to those variables that you've got, and they'll be held in the kernel area of our memory. We allocate a buffer that's 500 characters long, and then we call a function called strcpy (string copy), which will copy our command-line parameter from argv into our buffer.

Our function puts on a return address, which is the place in the code that we need to go back to once we've done the strcpy; that's how main knows where to go after it's finished. And then we put on a reference to the base pointer of our previous function. We won't worry about that too much, because it's not particularly relevant to this video, so this is just going to be our EBP, the base pointer. This is our allocated space for our buffer, and it's 500 long. If we write into it something that's longer than 500, we're
going to go straight past the buffer, over this, and crucially over our return address, and that's where we point back to something we shouldn't be doing. So what I'm going to do is walk through it in the code, and then let's see if it works.

This is my Kali Linux distribution, which has all kinds of slightly dubious password-cracking tools and other penetration-testing tools. It's meant for ethical hacking; let's just make that clear. I've written here a small function that does our copy from the command line. Now, I've compiled it and I can run it. So I can run my vulnerable code with "hello", and that will copy "hello" into this buffer and then simply return. So nothing happens; it's the most boring program ever. Another program might do something like copy "hello" in there and then, now it's in the buffer, go off and process it. Maybe you've got a function that makes things all upper case, so you copy "hello" in, then you change this new copy to be all upper case, and then you output it to the screen. And this doesn't have to be main; this could be any function.

We're going to run something called GDB, which is the Linux command-line debugger. I wouldn't advise using GDB unless you really like seeing a lot of assembly and really doing low-level Linux things. There's a lot of text on the screen there. We don't have to worry about that; this text here is just warranty information. So now I'm going to type in list, and it shows us the code for our function. So we can see it's just a compiled function; it knows this because the compiler included this information along with the executable. We can also show the machine code for this, so we can say disassemble main, and we can see the code for main. These are the actual CPU instructions that will be run. We won't dwell on much of this, because assembly is perhaps a whole series of talks by someone other than me; Steve Bagley knows a lot about
assembler. However, a couple of really important things are this line here, sub 0x1f4 from ESP. That's allocating the 500 for the buffer. That is, we're here, and we go 500 in this direction, and that's where our buffer goes. So the buffer's sitting to the left on this image, but lower in memory than the rest of our variables.

Now, we can run this program from GDB, and if it crashes then we can look at the registers and find out what's happened. So we can say run hello, and it will start the program with "hello", and it's exited normally. Now we can pass something a little bit longer than "hello". If we pass something that's over 500, then this buffer will go over this base pointer and this return address and break the code. So that'll just crash it? It should just crash it.

Python, for example, can produce strings based on simple scripts on the command line. So what we do is we say run, and then we pass it a Python script that prints \x41 (that's the "A" character) 500 and, let's say, 6 times. Just a little bit more than 500, so it's going to cause somewhat of a problem but not a catastrophe. Then we run that, and it's received a segmentation fault. A segmentation fault is what the CPU will send back to you when you're trying to access something in memory you shouldn't be accessing. Now, that hasn't actually happened because we overwrote somewhere we shouldn't; what's happened is that the return address was half overwritten with these 41s, so it doesn't know what it is. There is nothing in memory at 0xb7004141, and if there is, it doesn't belong to this process; it's not allowed. So it gets a segmentation fault.

If we change this to 508, we're going two bytes further along, which means we're now overwriting the entirety of our return address; we're overwriting this return address here with 41s. Now, if there was some virus code at 0x41414141, that's a big problem. So that's where we're going with this. All right, so we run this, and you can see the return address is now 0x41414141. Now
I can show you the registers, and you can see that the instruction pointer is now trying to point at 0x41414141. That means it's read this return address and tried to return to that place in the code and run it, and of course it can't.

Now we can have a little bit more fun. We've broken our code; what can we do now? Well, what we need to do is change this return address to point somewhere where we've got some payload that we're trying to run. Luckily, if I quit this debugger, I have some pre-prepared payload just for this. In fact, this payload is just a very short program in assembler that puts some variables on the stack and then executes a system call to tell it to run a shell, to run a new command line. So if I show this shellcode: this code will depend on the Linux system, and on whether you're using an Intel CPU or something else. This is just a string of different commands. Crucially, this \xcd\x80 is throwing a system interrupt, which means that it's going to run the system call. That's all we'll go into about this. What this will actually do is run something called zsh, which is an old shell which doesn't have a lot of protections involved.

So let's go back to our debugger, and we're going to run again, but this time we're going to run a slightly more malicious piece of code. We're going to put in our \x41s times 508, and then we're going to put in our shellcode. There we go. So now we're doing all 41s and then a bunch of malicious code. That's actually now too long; we've gone too far, but we'll fix that in a minute. And finally, the last thing we want to add in is our return address, which we'll customize in a moment.

To craft an exploit from this, what we need to do is remember the fact that strcpy is going to copy into our buffer, so we're going to start here. We want to overwrite the memory of this return address
with somewhere pointing to our malicious code. Now, we can't necessarily know for sure where our malicious code might be stored elsewhere on the disk or in memory, so we don't worry about that; we want to put it in this buffer. So we're going to put some malicious code in here, and then we're going to have a return address that points back into it.

Now, memory moves around slightly when you run these programs: things change slightly, environment variables are added and removed, things move around. So we want to try and hedge our bets and get the rough area that this will go in. In here we put in something called a NOP sled (there's various other words for it). This is simply \x90, the machine instruction for "just move to the next one". So that's good: anywhere we land in that NOP sled is going to tick along to our malicious code. So we have a load of \x90s here, then we have our shellcode, that's our malicious payload that runs our shell, and then we have the return address, in the right place, that points back right smack in the middle of these \x90s. What that means is that even if these move a bit, it'll still work.

It's like having a slope, almost, is it? It's exactly like that, yes. Anywhere we land in here is going to cause a real problem. So we've got our bomb, or our, I don't know... Pit of lava? Yeah, it's the Sarlacc pit, isn't it? Your NOP sled takes you in, and then you get digested over 10,000 years or whatever it is.

So we've got three things we need to do: we need to put in some \x90s, we need to put in our shellcode, which I've already got, and we need to put in our return address. We'll worry about the return address last. So if we go back to my code, we change the first \x41 that we're putting in to \x90, so we're putting in a load of NOP operations, then we've got our shellcode, and then we've got what will eventually be our return address. We'll put in 10
of those, just to have a little bit of padding between our shellcode and our stack that's moving about. Now, this 508 here, people will have noticed, is now too big, because we're putting in extra information. If we write 508 bytes, it goes exactly where we want, over our return address, but we've now got 43 bytes of shellcode and 40 bytes of return address. So 508 - 40 - 43 is 425; we change this 508 to 425. And so now this exploit that we're looking at is exactly what I hoped it would be: an \x90 no-operation sled, then shellcode, and then our return address, which is 10 × 4 bytes. We run this and we've got a segmentation fault, which is exactly what we'd hope to get, because our return address hasn't been changed yet.

So now let's look at our memory and work out where our return address should go. In GDB, it's paused the program after the segmentation fault, so we can say list the registers, about, let's say, 200 of them, at the stack pointer minus 550. That's going to be right at the beginning of our buffer, and what we're seeing here is a load of 90s in a row. So we just need to pick a memory address right in the middle of them. Let's pick this one, let's say 0xbffffaba. I'm going to write that down so I don't forget it. Now, there's a nice quirk in this, which is that the Intel CPU is little-endian, which means I have to put it in backwards. Yet more things we have to learn, but it's fine: BF, FF, FA (I've got caps lock on, and I can't type when people are watching) and BA.

Now, theoretically, when I run this, what will happen is strcpy will do its thing, it will copy its string in, and then when it tries to return it will load this return address and execute the instruction there, which will be somewhere in this buffer, and then it will read on and run our shellcode. So we should get a shell. And we did. So that's a good start: we know our program works, albeit in the debugger, with very little side
effect. The question now is, can I take this and use it on the command line to gain access to this machine? Linux has quite restrictive policies on what can and can't be done from certain programs, but some programs, such as the one for changing your password, are run using something called SUID. What that means is that, for the sake of running that program, you are completely root: you have root access to that machine, because otherwise how could you change the password file? You're not normally allowed to even read it, the shadow file. So if you find a vulnerability in that kind of program, and there's more than I think there should be, that's when there's a real problem. Obviously these vulnerabilities are getting rarer, but it's catastrophic if you get one.

So let's leave this debugger and go back to our nice clear command-line environment. If I list the files, we've got this vulnerable program here, shown in red; that shows that it's SUID root, which means when we run it, it will be running as root, which is not great for security. That, and my shoddy programming, which means it's vulnerable to a buffer overflow. So if I copy my exploit, here we go. This is a big moment of truth, whether this whole video is going to work. I've put my code in just like it was in the debugger; I've tried to make it exactly the same so the memory doesn't move around. Let's just say whoami on Linux, and it says I am myself; I don't have root access. So can I, for example, look at the password file? I can say cat /etc/shadow: permission denied. No dice. Fair enough, I'm not supposed to be looking at that.

Now I run my exploit, so I run my vulnerable program with the right address, and we've got a shell. whoami: root. So now can I look at my shadow file? Root is like God for this system; in Linux there is nothing you can't do as root. So I've got my root shell and I'm root, so I can cat /etc/shadow and I can see what's
in the shadow file. But the point is that there's nothing I can't do now: I can wipe the machine or do anything like that myself. And then I can quit this, and then my program just gracefully exits, because it now returns to normal code, and hopefully no one is any the wiser that anything's gone on.

Now, there are things that the operating system does to try and stop this from happening: randomizing your memory layout, and things like no executing of stacks and stuff. There are ways around this; they're obviously for a different video. But at least things are definitely getting better.

Stealth and botnets usually go hand in hand, because from the point of view of a C&C server, it wants to ensure... Some years ago, it seems, the NSA got a back door in one of these routers, presumably because they got one of their people to get a job...
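
The payload arithmetic in the transcript (425 NOP bytes, 43 bytes of shellcode and ten copies of a 4-byte return address, 508 bytes in total) can be sketched in Python. This is a minimal sketch under stated assumptions, not the script typed in the video: the helper name build_payload is hypothetical, the shellcode is replaced by 43 placeholder bytes, and the address 0xbffffaba is simply the one read out in the video, which would differ on another machine or run.

```python
import struct

FRAME_BYTES = 508      # assumed layout: 500-byte buffer + 4-byte saved EBP + 4-byte return address
SHELLCODE_LEN = 43     # shellcode length quoted in the video
RET_COPIES = 10        # the return address is repeated 10 times


def build_payload(shellcode: bytes, ret_addr: int) -> bytes:
    """Lay out NOP sled + shellcode + repeated return address (hypothetical helper)."""
    # "<I" packs a 32-bit unsigned integer in little-endian byte order,
    # which is why the address appears "backwards" in the payload.
    ret = struct.pack("<I", ret_addr) * RET_COPIES
    sled_len = FRAME_BYTES - len(shellcode) - len(ret)
    if sled_len < 0:
        raise ValueError("shellcode and return addresses do not fit")
    return b"\x90" * sled_len + shellcode + ret


# Placeholder bytes only (0xCC is the x86 breakpoint opcode); this is NOT real shellcode.
placeholder = b"\xcc" * SHELLCODE_LEN
payload = build_payload(placeholder, 0xBFFFFABA)

print(len(payload))             # 508
print(payload.count(b"\x90"))   # 425
print(payload[-4:].hex())       # bafaffbf
```

With these numbers the tenth copy of the address occupies bytes 504 to 507, which is where the saved return address sits in the assumed layout.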

Original Description

Making yourself the all-powerful "Root" super-user on a computer using a buffer overflow attack. Assistant Professor Dr Mike Pound details how it's done.
Formerly titled "Buffer Overflow Attack" - Aug 2021
The Stack: https://youtu.be/7ha78yWRDlE
Botnets: https://youtu.be/UVFmC178_Vs
The Golden Key: iPhone Encryption: https://youtu.be/6RNKtwAGvqc
3D Stereo Vision: https://youtu.be/O7B2vCsTpC0
Brain Scanner: https://youtu.be/TQ0sL1ZGnQ4
http://www.facebook.com/computerphile
https://twitter.com/computer_phile
This video was filmed and edited by Sean Riley.
Computer Science at the University of Nottingham: http://bit.ly/nottscomputer
Computerphile is a sister project to Brady Haran's Numberphile. More at http://www.bradyharan.com

This video shows how a buffer overflow attack is carried out on a Linux system, exploiting a vulnerable SUID program to execute arbitrary code and gain root access. It covers how a process is laid out in memory, stack-based buffer overflows, and building a payload from a NOP sled, shellcode and an overwritten return address.

Key Takeaways
  1. Compile the vulnerable code
  2. Run the code with a short argument such as "hello" to confirm it exits normally
  3. Use GDB to inspect the code and memory layout
  4. Use list in GDB to view the source code of the function
  5. Disassemble the main function to see the CPU instructions
  6. Run the program from GDB and pass a string longer than the 500-byte buffer to cause a buffer overflow
  7. View the registers and find the return address
  8. Change the return address to point to a payload that executes a system call to run a shell
💡 A buffer overflow attack can be used to gain control of a machine and execute arbitrary code, but operating systems have measures to prevent this, such as randomizing memory layout and disabling stack execution.
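
Steps 6 and 7 above can be illustrated without running any native code. The following is a toy model in Python, assuming the frame layout the video implies (a 500-byte buffer, then a 4-byte saved EBP, then a 4-byte return address). The original return address and saved EBP values are made up, and the helper names are hypothetical. It mimics strcpy by copying the input plus a terminating NUL byte with no bounds check.

```python
import struct

BUF_LEN = 500                  # size of the buffer in the vulnerable program
ORIGINAL_RET = 0xB7E3A533      # made-up return address, for illustration only


def make_frame() -> bytearray:
    """Model of the frame, low to high address: buffer | saved EBP | return address."""
    frame = bytearray(BUF_LEN)
    frame += struct.pack("<I", 0xBFFFF6A8)   # made-up saved EBP
    frame += struct.pack("<I", ORIGINAL_RET)
    frame += bytearray(16)                   # a little caller memory beyond the frame
    return frame


def unchecked_copy(frame: bytearray, data: bytes) -> None:
    """Copy data plus a NUL terminator from offset 0, like strcpy, ignoring BUF_LEN."""
    src = data + b"\x00"
    frame[0:len(src)] = src


def return_address(frame: bytearray) -> int:
    """Read the 4-byte little-endian return address stored after the saved EBP."""
    return struct.unpack_from("<I", frame, BUF_LEN + 4)[0]


for n in (5, 506, 508):
    frame = make_frame()
    unchecked_copy(frame, b"A" * n)
    print(n, hex(return_address(frame)))
# 5 0xb7e3a533
# 506 0xb7004141
# 508 0x41414141
```

In this model a 506-byte input leaves the return address half overwritten (two "A" bytes plus the NUL terminator), while 508 bytes replace it entirely with 0x41414141, which is the difference between the two crashes described in the transcript.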
