All Kernel Page Faults Explained
Key Takeaways
The video explains the fundamentals of kernel page faults, including lazy loading design, page tables, and copy-on-write, with a focus on operating systems and backend engineering.
Full Transcript
Every first access to memory is going to fault because of the lazy loading design. This is another kind of page fault that is special. The page is still there, but it's not yours to write to, my friend. You're just a child. You're just a baby. Go, you're a fork. Go fork your stuff there. So the kernel will copy it and just do the magic. Another page fault.
Now, let's actually talk about page faults. If we're going to define a page fault, it goes back to this idea of translation. Who is actually doing the reading of memory? It's the CPU, right? The CPU is asked to read something from memory: hey, I want to read this area, please, this virtual memory. There is a component in the CPU called the MMU, the memory management unit, that translates the virtual address to a physical address. And this is not a feature of the kernel. That's a CPU feature. What does that mean? Every CPU that runs this kind of operating system ships with this ability to translate. And guess what? That means the kernel must abide by the CPU's data structures for translation, because how else would you translate? You see how complex this stuff is?
To translate from virtual memory to physical memory, you need a page table, and it needs a certain structure: a certain number of bits, and those bits have meaning. I don't want to go into details because that deserves its own episode, but there are levels to paging: a top-level index table, then another table, then the page directory, and then the final page table. There are sometimes four or five levels, for performance and efficiency. So this translation is a feature of the CPU itself.
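To make the idea of page-table levels a bit more concrete, here is a minimal sketch of how a virtual address can be cut into per-level indexes plus an offset. It assumes 4 KiB pages and four levels of 9 bits each, which is the common x86-64 layout; other CPUs and five-level setups use different numbers. The function name `split_virtual_address` is made up for illustration.

```python
from typing import List, Tuple

PAGE_SHIFT = 12   # assumption: 4 KiB pages, so the low 12 bits are the offset in the page
INDEX_BITS = 9    # assumption: 512 entries (2**9) per table, as on x86-64
LEVELS = 4        # assumption: 4-level paging


def split_virtual_address(vaddr: int) -> Tuple[List[int], int]:
    """Return ([top-level index, ..., last-level index], offset within the page)."""
    offset = vaddr & ((1 << PAGE_SHIFT) - 1)
    indexes = []
    # Walk from the highest level down, peeling off 9 bits per level.
    for level in range(LEVELS - 1, -1, -1):
        shift = PAGE_SHIFT + level * INDEX_BITS
        indexes.append((vaddr >> shift) & ((1 << INDEX_BITS) - 1))
    return indexes, offset


if __name__ == "__main__":
    indexes, offset = split_virtual_address(0x7F1234567ABC)
    print("table indexes:", indexes, "offset:", offset)
```

Each index selects one entry in the table at that level, and that entry points to the next table down. That chain of lookups is what makes the walk costly.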
Now, when a CPU wants to read a virtual address, it looks up the page tables using a single register. The kernel writes to that register so that it points to where the page tables live in kernel-managed memory, and those tables hold the virtual-to-physical translation. The CPU walks these tables, and it's literally called a page walk. The walk has a cost, right? It does the translation, and when it finds the entry, there is a bit, among other bits, that says whether the page is present or not: zero means not present, one means present. And by default everything is not present. That's how the kernel works. The kernel is lazy-loading by design. We always lazy load when we need it. You can argue with the maintainers if you want, but that's how it was designed. It's a lazy loading mechanism.
So if I find a zero, the page was not loaded. But wait a minute: if it's not loaded, it's not there, so the kernel needs to intervene to load it. How does that work? The CPU raises a page fault and says, "Hey kernel, take over. I'm faulting on this virtual memory page for this particular process. Can you please allocate physical memory for it?" The kernel does its stuff and then gives control back to the CPU, and the CPU moves on.
To do that, a page fault needs to switch the process from user mode to kernel mode. Why? Because we're dealing with kernel stuff. We're updating the page tables. We might update some VMAs. We might extend the stack. Who knows? I need to update memory structures that belong to the kernel, and I cannot do that as a user. It's a stinking user. We cannot do it.
So let's talk through examples of page faults. Now that we understand what a page fault is, the first example is the first access. Every first access will fault.
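The present-bit handshake between CPU and kernel can be sketched as a toy simulation. This is not how any real kernel is written; `ToyProcess`, `mmu_translate`, `kernel_handle_fault`, and the frame numbering are all hypothetical names and choices, used only to show the "fault, fix, retry" flow.

```python
PAGE_SIZE = 4096
FIRST_FREE_FRAME = 42  # arbitrary starting frame number for the toy allocator


class PageFault(Exception):
    """Raised by the toy MMU when a page-table entry is missing or not present."""

    def __init__(self, vpn):
        super().__init__(f"page fault on virtual page {vpn}")
        self.vpn = vpn


class ToyProcess:
    def __init__(self):
        self.page_table = {}                    # vpn -> {"present": bool, "frame": int}
        self.next_free_frame = FIRST_FREE_FRAME
        self.fault_count = 0

    def mmu_translate(self, vaddr):
        """CPU side: look up the entry and check the present bit."""
        vpn, offset = divmod(vaddr, PAGE_SIZE)
        entry = self.page_table.get(vpn)
        if entry is None or not entry["present"]:
            raise PageFault(vpn)
        return entry["frame"] * PAGE_SIZE + offset

    def kernel_handle_fault(self, vpn):
        """Kernel side: allocate a frame and mark the entry present."""
        self.fault_count += 1
        self.page_table[vpn] = {"present": True, "frame": self.next_free_frame}
        self.next_free_frame += 1

    def access(self, vaddr):
        """Try the access; on a fault, let the 'kernel' fix it, then retry."""
        try:
            return self.mmu_translate(vaddr)
        except PageFault as fault:
            self.kernel_handle_fault(fault.vpn)
            return self.mmu_translate(vaddr)


if __name__ == "__main__":
    proc = ToyProcess()
    proc.access(100)               # first touch of page 0: faults
    proc.access(200)               # same page: no new fault
    proc.access(PAGE_SIZE + 8)     # first touch of page 1: faults
    print("faults:", proc.fault_count)
```

Only the first touch of each page costs a fault. Later accesses to the same page translate directly.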
Every first access to memory is going to fault because of the lazy loading design. That means that if the CPU tries to access a virtual address, let's say 100, which lives in the first page for simplicity, the CPU first looks and says, all right, I don't have this mapping cached. This cache is called the TLB. It's not the topic of today, but think of it as a cache of these mappings, for performance of course, because that page walk is expensive. Anything in memory is expensive. It's so slow compared to what the CPU can do, right? So I don't have it there, and I need to walk the page tables. I walk them and discover that this page doesn't exist, because that's the first time I access it. Let's say I'm reading, or even writing; same thing.
So the CPU will raise a page fault because that page is not there. The kernel switches this process to kernel mode, which means we save the user's registers, we save the state of the process, and we load the instruction pointer to point to the kernel's code. Now we're in kernel mode. Okay, we took the hit. Now the kernel starts doing all this stuff. All right, let's load a new page. Oh, you're just trying to access this for the first time? You're trying to read or write to it? Sure. Let me find a fresh physical page for you. Here you go. That's one write. Then there's the second write. What's the second write? Well, I need to update the page table to point to this physical memory, and I mark it now as present: one instead of zero. So the CPU can use it for future accesses, this CPU or other CPUs, right? We don't have to do anything else for that. That's a simple operation. Done.
We access the memory. You did a write, you wrote it, and now the CPU remembers this. There is a TLB entry (TLB stands for translation lookaside buffer) which now maps this virtual page to this physical page. We don't map individual addresses.
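As a rough illustration of why the TLB caches one translation per page rather than one per address, here is a toy cache keyed by virtual page number. The class `ToyTLB` and its hit/miss counters are hypothetical. A real TLB is a small, fixed-size hardware structure with eviction, which this sketch leaves out.

```python
PAGE_SIZE = 4096


class ToyTLB:
    """Caches virtual-page -> physical-frame translations, one entry per page."""

    def __init__(self, page_table):
        self.page_table = page_table   # vpn -> frame; stands in for the real page tables
        self.entries = {}
        self.hits = 0
        self.misses = 0

    def translate(self, vaddr):
        vpn, offset = divmod(vaddr, PAGE_SIZE)
        if vpn in self.entries:
            self.hits += 1
        else:
            self.misses += 1                           # miss: pay for the page walk
            self.entries[vpn] = self.page_table[vpn]   # then remember the result
        return self.entries[vpn] * PAGE_SIZE + offset


if __name__ == "__main__":
    tlb = ToyTLB({0: 7})
    for addr in range(PAGE_SIZE):      # touch every byte of page 0
        tlb.translate(addr)
    print("entries:", len(tlb.entries), "misses:", tlb.misses, "hits:", tlb.hits)
```

All 4,096 addresses in the page share a single cached entry, so only the first access pays for the walk.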
We map the first address of each page because it's just more efficient to map one entry as opposed to about 4,000 entries, right? So that's one kind of page fault: the first access. Every time you access something for the first time, it causes a page fault.
So imagine a stack. Let's say you're calling functions. The first page is faulted in. That's the vocabulary that people use: it's faulted in by the kernel, right? Then I call a function and I hit the second page. It's faulted in. Another new page is faulted in. Okay, so we did two kernel switches, but this time I not only faulted in another page, I also expanded the stack. That's an in-place update of the stack's virtual memory area: instead of 0 to 100, now it's 0 to 300. I'm updating that VMA, so I need to lock the data structure so that other threads don't do the same thing and corrupt my data structures. So that was the second item: stack expansion. As I'm expanding the stack, I'm also taking page faults.
A third example is copy-on-write. Copy-on-write happens when you fork a process. That's another benefit of virtual memory, by the way. When you have a process and you fork it, nothing gets copied by default. The new process gets a brand new virtual address space that is almost identical to the first one, but all we do is point the child's page tables at the same physical pages. So both processes point to the same physical memory, which is beautiful. Until what? Until the child tries to write something, tries to change something in its memory. Because processes are isolated by design, we must not write on top of each other. So when the child writes, that is its first write access, and what the kernel does is actually more work than you might imagine. It copies the page.
It updates the child's virtual memory mapping for the accessed page to point to the new physical memory. And then it does the write, right? All of this happens because there was a page fault. This page fault happened not because the memory was not present. The memory was present, but it was protected, if you will. It's another bit in the entry, I forgot what it's called exactly, but the point is that it's two processes pointing to the same physical page. So this is another kind of page fault that is special. The page is still there, but it's not yours to write to, my friend. You're just a child. You're just a baby. Go, you're a fork. Go fork your stuff there. So the kernel will copy it and just do the magic. Another page fault. So it's just a page fault to deal with this stuff.
And the beauty of this is that as long as you're only reading, you're fine, right? Remember, this only happens when you're actually writing, when you want to change the state. The kernel only triggers this copy-on-write if you're actually going to write, and then it copies. So that's another kind of page fault: copy-on-write.
So the most expensive thing you can do after you fork is write into every single page that you share with the parent process. That's the worst-case scenario. The best-case scenario is if you only read as a child. An example here is the Redis snapshot backup procedure: when it wants to take a backup for durability reasons, it forks a process, and the child will just read, read, read and write to disk. So on the memory side that's only reads, right? And if the parent changes something in this case, whoever does the write takes the hit of the copy-on-write. That's the rule of copy-on-write.
Another example of a page fault is swap.
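Before moving on, the copy-on-write rule described above (share on fork, copy only when someone writes, and whoever writes pays) can be sketched as a toy model. Everything here is hypothetical: `Frame`, `ToyProcess`, and the reference count are illustration devices, not a description of any kernel's actual data structures.

```python
class Frame:
    """A toy physical page: some bytes plus a count of mappings that share it."""

    def __init__(self, data):
        self.data = bytearray(data)
        self.refcount = 1


class ToyProcess:
    def __init__(self, pages=None):
        # vpn -> {"frame": Frame, "writable": bool}
        self.pages = pages if pages is not None else {}
        self.cow_faults = 0

    def map_page(self, vpn, data):
        self.pages[vpn] = {"frame": Frame(data), "writable": True}

    def fork(self):
        """Share every frame with the child and write-protect both sides."""
        child_pages = {}
        for vpn, entry in self.pages.items():
            entry["writable"] = False
            entry["frame"].refcount += 1
            child_pages[vpn] = {"frame": entry["frame"], "writable": False}
        return ToyProcess(child_pages)

    def read(self, vpn, offset):
        return self.pages[vpn]["frame"].data[offset]   # reads never fault in this model

    def write(self, vpn, offset, value):
        entry = self.pages[vpn]
        if not entry["writable"]:
            self._handle_cow_fault(entry)              # the "page fault" path
        entry["frame"].data[offset] = value

    def _handle_cow_fault(self, entry):
        """Copy the page if it is still shared, then allow the write."""
        self.cow_faults += 1
        frame = entry["frame"]
        if frame.refcount > 1:
            frame.refcount -= 1
            entry["frame"] = Frame(bytes(frame.data))  # private copy for the writer
        entry["writable"] = True


if __name__ == "__main__":
    parent = ToyProcess()
    parent.map_page(0, b"hello")
    child = parent.fork()
    child.write(0, 0, ord("J"))
    print(bytes(parent.pages[0]["frame"].data), bytes(child.pages[0]["frame"].data))
```

In this sketch a child that only reads never triggers a copy, which is the cheap case the Redis snapshot example relies on.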
We talked about this a little bit: I am mapped to a physical page, but then the kernel decides that, hey, your process is just weak. You didn't even use these pages. Let me give them to someone who actually deserves and values this precious physical memory. You don't value this physical memory. So it will swap out that page, or multiple pages, of your process, and it will update the page table entry with where the page went. I forget the exact name, but the entry records which swap area and which slot the page was written to, and the kernel marks that, hey, this has been swapped out. Now we have free physical memory. Whenever this guy wants to come back and read this stuff, the CPU will walk the page tables, find that the page is not present, and raise a page fault to the kernel. We do a kernel mode switch, the kernel takes over, sees that the page was swapped out, and does all this stuff: it reads from disk, allocates physical memory (hopefully there is physical memory available at that point), swaps the page back in, and then does all this dance.
Another example is file-backed memory. I think it's called mmap, right? mmap. That's a very common way people use to build databases: you just create a mapping, and as you write to memory, it gets written to disk. It's a nice way to map an area of memory to actual pages of a file on disk, so that when you update it, the kernel writes the change back to the file (not necessarily immediately, unless you ask it to sync). So that's another way. But if you're trying to write to something that is mapped to a file, you're also going to raise a page fault if there is no physical memory allocated for it yet. So a page will be allocated in that case.
And finally, another example of a page fault is permissions. Permissions are very critical.
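The file-backed mapping described above can be tried from user space with Python's standard `mmap` module. This is a small sketch rather than a database design: the helper name `write_through_mapping` and the temporary file are made up for the example, and the page faults themselves happen invisibly inside the kernel when the mapped memory is first touched.

```python
import mmap
import os
import tempfile


def write_through_mapping(path, payload):
    """Write payload into a file via a memory mapping, then re-read it from the file."""
    with open(path, "r+b") as f:
        with mmap.mmap(f.fileno(), 0) as mapping:   # length 0 maps the whole file
            mapping[: len(payload)] = payload        # looks like a plain memory write
            mapping.flush()                          # ask for write-back to the file
    with open(path, "rb") as f:
        return f.read(len(payload))


if __name__ == "__main__":
    fd, path = tempfile.mkstemp()
    try:
        # Give the file one page of content; an empty file cannot be mapped.
        os.write(fd, b"\0" * mmap.PAGESIZE)
        os.close(fd)
        print(write_through_mapping(path, b"hello page cache"))
    finally:
        os.remove(path)
```

The write goes through ordinary memory assignment and no `write()` call is made on the file. The explicit `flush()` is there because write-back is otherwise left to the kernel's own schedule.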
An example: let's take a buffer overflow, where someone called a function and, instead of writing 100 bytes into a buffer, wrote way more, which caused the extra bytes to overwrite the return address. Those extra bytes happen to be specific instructions that are valid for this CPU, and at the end the attacker wrote the return address to point back to the code they just wrote. So in this case, when the function finishes executing, the return address gets popped and the instruction pointer jumps to it, and it happens to be an address that the attacker specified. The CPU will read instructions from the wrong location, because the stack doesn't have code. Remember, the stack holds only variables and function call data. It doesn't have code. The actual functions live in the text area, and that's where the instructions live. So the return address, instead of pointing somewhere in the text area, was pointing back somewhere in the stack. That's basically how buffer overflow, stack overflow attacks happen, right?
So the instruction pointer points back into the stack. The CPU says, oh, I'm supposed to read this instruction, which happens to be at this virtual address. Let me fetch it. It does a page walk, just like for anything else, and discovers that, oh, it's actually mapped to a physical page. Let me read it. And when it goes to read it: well, wait a minute. This is the stack. This is marked as read and write only. You can read from the stack. You can write to the stack. But you cannot execute anything from the stack. No, no, no, no. Wait a minute. What are you guys doing? This is marked as read and write, but you're fetching it as an instruction.
Page fault. That's what the CPU does. It literally raises its hands, too: "Hey, page fault." And the kernel just takes over. It's like, how did that happen? And the kernel here can report a segmentation fault or whatever. It will give you some bogus error that you'll never understand, and it will either crash the process or deal with it. But yeah, these attacks can rarely happen these days, right? The only way an attack like this can still work is when the attacker changes the return address to point not to a place in the stack where they wrote stuff, but to a place where they know there is code, like, I don't know, a shared library such as libc. You can execute a function in libc that is dangerous, like, I don't know, delete all files or something. Some such function exists. But yeah, that's the worst thing they can do. They point you to a function that is not controlled by the attacker. It was just a function that is already there, loaded in your text area, right? Or a shared
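The permission check from the stack-execution example can be sketched as a lookup of per-page access rights. The page numbers, the region names, and the `ProtectionFault` exception are all hypothetical. On real hardware the check is done by the MMU as part of the translation, and the kernel typically turns the resulting fault into a signal such as a segmentation fault.

```python
READ, WRITE, EXEC = "r", "w", "x"


class ProtectionFault(Exception):
    """Stands in for the page fault the CPU raises on a permission violation."""


# Hypothetical layout for illustration: vpn -> (region name, allowed access kinds)
PAGE_PERMISSIONS = {
    0x400: ("text", {READ, EXEC}),     # code: readable and executable, not writable
    0x7FF: ("stack", {READ, WRITE}),   # stack: readable and writable, not executable
}


def check_access(vpn, kind):
    """Return the region name if the access is allowed, else raise ProtectionFault."""
    region, allowed = PAGE_PERMISSIONS[vpn]
    if kind not in allowed:
        raise ProtectionFault(f"{kind!r} access denied on {region} page {vpn:#x}")
    return region


if __name__ == "__main__":
    print(check_access(0x7FF, WRITE))      # writing to the stack is fine
    try:
        check_access(0x7FF, EXEC)          # fetching an instruction from the stack
    except ProtectionFault as fault:
        print("fault:", fault)
```

The page is present in both stack cases. What differs is the kind of access, which is why this fault is about permissions rather than missing memory.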
Original Description
Fundamentals of Operating Systems Course
https://oscourse.win
Watch Full show on Page faults here
https://youtu.be/AsFxaZJ1M0k
NodeJS Internals and Architecture
https://node.win
Backend Troubleshooting Course
https://performance.husseinnasser.com
Fundamentals of Backend Engineering
https://backend.win
Fundamentals of Networking for Effective Backends udemy course (link redirects to udemy with coupon)
https://network.husseinnasser.com
Fundamentals of Database Engineering udemy course (link redirects to udemy with coupon)
https://database.husseinnasser.com
Follow me on Medium
https://medium.com/@hnasr/membership
Introduction to NGINX (link redirects to udemy with coupon)
https://nginx.husseinnasser.com
Python on the Backend (link redirects to udemy with coupon)
https://python.husseinnasser.com
Become a Member on YouTube
https://www.youtube.com/channel/UC_ML5xP23TOWKUcc-oAE_Eg/join
Buy me a coffee if you liked this
https://www.buymeacoffee.com/hnasr
Arabic Software Engineering Channel
https://www.youtube.com/channel/UChWZsjdoRvZ0T9QWZOD6UpA
🔥 Members Only Content
https://www.youtube.com/playlist?list=UUMO_ML5xP23TOWKUcc-oAE_Eg
🏭 Backend Engineering Videos in Order
https://backend.husseinnasser.com
💾 Database Engineering Videos
https://www.youtube.com/playlist?list=PLQnljOFTspQXjD0HOzN7P2tgzu7scWpl2
🎙️Listen to the Backend Engineering Podcast
https://husseinnasser.com/podcast
Gears and tools used on the Channel (affiliates)
🖼️ Slides and Thumbnail Design
Canva
https://partner.canva.com/c/2766475/647168/10068
Stay Awesome,
Hussein