The ONLY C keyword with no C++ equivalent

mCoding · Beginner ·🔢 Mathematical Foundations ·4y ago

Key Takeaways

The C keyword 'restrict' has no equivalent in C++ and is used to enable compiler optimizations by promising that objects accessible through a pointer will not be accessed through any other means. The video demonstrates the effect on the compiler's assembly output, including removed redundant reads and the use of vector instructions.

Full Transcript

Hello and welcome, I'm James Murphy. In this video we're going to be talking about the restrict keyword in C. It's the only keyword in C that has no analog in C++, and it's actually an important one.

C++ was originally started as essentially an extension of C to support classes, and as you can see, over the years C++ has gained a number of features that C just doesn't have. C is a much smaller and simpler language than C++. There are a few keywords on the left-hand side here for C that don't appear on the right-hand side for C++, like _Generic, but most of the things like _Generic have some analog in C++. In this case, _Generic is kind of C's way of doing what C++ uses templates for. However, restrict is a different case: there is nothing in C++ that does what restrict does in C.

restrict is what's called a type qualifier, like const or volatile, except restrict can only be applied to pointer types, so it doesn't make sense to have a restrict int. restrict is a promise: the programmer guarantees to the compiler that no object accessed through a restrict pointer will be accessed through any other means besides the restrict pointer. Making this promise allows the compiler to do more optimizations that it might not have been able to do before for correctness reasons.

Look what happens when I get rid of the restrict qualifier. You can see that the assembly has six instructions, but when I have the restrict in there, it's only five instructions. If we take a look at the assembly output here on the right, we see that the instructions without the restrict qualifier do something that's seemingly redundant. First off, it starts by making a copy of the amount variable and then adding that into *x. Then it makes another copy of the amount variable, the one that we just read two lines before, and then it does the subtraction and returns.

So why did the compiler choose to read again and make another copy of the amount variable when it just did the exact same thing two lines before? Well, the reason is the compiler doesn't know what x, y, and amount are pointing to. It's possible that someone passed in the same thing for x as they did for amount, and in that case this line, where we're changing the value that x is pointing to, is also changing the value that amount is pointing to. So when I go to do the subtraction in the next line, the value for amount has changed. Since it's theoretically possible that the value that amount is pointing to has changed, the compiler has no choice but to read it again.

Suppose, though, that as the programmer we know that it doesn't make sense to pass in the same thing for amount and x. Then we can tell the compiler: I guarantee you, compiler, that this variable is not pointed to or accessed by anything else. The way that we do that is with the restrict keyword. As you can see, once I add in the restrict keyword on x, the compiler deletes the redundant read. Logically speaking, it probably makes sense to mark all three of these pointers restrict, but in this case it doesn't allow any extra optimizations.

In this example we're implementing a simple addition of two vectors of length n. We have source one and source two, which are of length n, and we're supposed to store the answer to the addition in the destination variable. Looking at the assembly output, we see that there are about 46 instructions, but when I add the restrict keyword it goes down to only 29 instructions. That was a pretty big difference. Adding the restrict keyword onto the destination is essentially saying that I'm guaranteeing that neither of the source pointers overlaps with the destination, and just by doing that I've cut the code size quite significantly.

What exactly got cut out in all that mess? Glancing at the assembly, we see that there are a lot of these xmm word kind of things going on. What these registers are for are vector instructions. So what this code is trying to do is the vector addition using the vector instructions on the machine, meaning it's trying to add multiple elements together at once. As you can probably imagine, though, if the destination overlaps with either of the sources, it might not be correct to do multiple of these additions at once.

Let's compare to the assembly that's without the restrict keyword. In this case we still see that there are a bunch of vector operations happening: we have this label L4, then a bunch of vector operations, and then a jump back to L4, so that's a vectorized loop. But we also see that there is an unvectorized loop here: we have L3 and then a jump back to L3, just using the regular registers. So basically what's happening here is that there's a whole bunch of extra instructions that the compiler emits to check whether it would be okay to use the vectorized instructions. It does the vector thing if that's allowed; otherwise, if the source and destination are overlapping, it's forced to just do it one at a time.

In the normal case, when your destination and source pointers are actually pointing to different things and there's no overlapping going on, you're not going to see much of a performance improvement. The reason is that the code does just check whether they're overlapping, and if not, it does the vector operations. So if you have a really long array, the vector operations are still going to be able to work on your really long array; there was just that one branch at the very beginning that had to check whether or not it can do those operations. But the code for this is quite a bit bigger, so that could cause an instruction cache miss, and there could also be a misprediction penalty if the processor weren't able to predict the branch that goes through all these checks and decides that the vector operations are the correct ones.

So should you be going around slapping restrict onto all of your pointers? Probably not. Remember, restrict is a promise: you're promising the compiler that you're not going to be aliasing this pointer. Nothing that the restrict pointer is going to access is going to be accessed by a different pointer. If that makes sense in the context of your function, then go ahead, throw it in there; you might get a speedup. But if it doesn't make sense, then don't use it. You can run into some really, really hard-to-find bugs if you use restrict inappropriately.

Consider, for example, this Fibonacci function. It takes in a destination, which is pointing to enough memory for n elements, and then it populates those elements with the first n Fibonacci numbers. Suppose the implementer says: you know what, the formula for Fibonacci is just adding, right? Fibonacci of n plus 2 is just Fibonacci of n plus Fibonacci of n plus 1. So why don't I just use the vector addition and call it with destination plus two, destination, and destination plus one?

This is now violating the restrict promise, since the destination is just one or two elements ahead of the source pointers. Reading these pointers here is going to be reading from the destination pointer, essentially a few iterations later. But we promised that we wouldn't do that, that nothing else was going to be pointing there. Maybe I didn't know what restrict meant, and so I just went ahead and did it, and it seems like it's giving the correct output. All my tests are passing, so let's go ahead and push it to production. I compile it for my release build, and then all of a sudden I start getting the wrong answers.

Remember, the purpose of restrict was to enable more optimizations. If I was in a debug build with a low level of optimization, those optimizations might not have happened. Then when I go to compile for the release build, I get a different answer; I get the wrong answer now. And it doesn't matter how many times I test it: if my testing build is not -O3, then I'm going to get the right answer in all of my tests. This is an extremely tricky bug to find.

This is one of those cases where, because the n plus two term depends on the n term and the n plus one term, I need to do things in order. I can't do two of these operations at the same time like I would in a vectorized situation. That means that for correctness, in this case I really shouldn't have destination as a restrict pointer. Now I get rid of that, and I go back to the correct answers even at -O3.

A good practice to follow when you think you want to add restrict, but there might be some situation where the pointers may be overlapping, is to just have two versions of the function: one that explicitly allows overlapping, and one with the restrict keyword. Then just make sure that you use the correct one in your function. If I just call the vector add that allows overlapping, then there are no more issues with the restrict pointer, and I can still use the restrict pointer version for the majority of cases where my pointers are not overlapping.

If you've ever heard of memcpy or memmove, this is exactly the strategy that these two functions use. They essentially do the same thing: you have a source pointer and a destination pointer, and you're going to copy n bytes from the source into the destination. The only difference between memcpy and memmove is that memcpy has the restrict pointers, so when you do a memcpy you're not allowed to pass overlapping source and destination regions, but for memmove you are. You can see how this might force the implementations to be different: in memmove you might have to copy into a temporary buffer and then copy the temporary buffer into the destination, but in memcpy, since there's no overlapping, you could potentially copy directly from the source to the destination.

So what's the deal with C++? How come restrict is allowed in C but it's not allowed in C++? There was a proposal around 2014 trying to pave the way to add restrict into C++, but there was a lot of pushback, namely because restrict is really hard to make work with classes. How would restrict work with the this pointer in member functions? What would it mean to mark a member variable restrict? There were just too many questions that didn't have really good answers, and it would have been a ton of work, so it just never really made it in.

However, I have deceived you just a little bit. If you're willing to move away from standard C++, meaning the C++ that's actually defined in the standards document, then there is a way for you to use restrict in C++. Currently every major compiler, including Microsoft's, Clang, and GCC, supports a use of the restrict keyword which is not part of standard C++. They support a language extension that allows you to use it anyway, and it seems like __restrict is the way that it's spelled. So if you change all of your restricts to __restrict, or for some compilers __restrict__, then you can actually use a version of restrict in your C++ code.

However, the version of restrict that you get from your compiler may be different from the version of restrict that you get from a different compiler. They may work in different situations: some may support references and others not, some may support certain optimizations and others not. As for the actual semantics of what it means to use restrict, you now have to dig into your compiler manual and figure out exactly what the compiler is guaranteeing and what you're promising to the compiler when you use the restrict keyword. It's not as simple as in C. But if you're really trying to squeeze that last ounce of performance out of your compiler, it might be worth it to write code that's not portable to all of C++ and is only supported by that one compiler. That's something that's done a lot in real-world applications if you really, really need speed, but for most cases I'm guessing you're probably not going to need it.

Hey everyone, thanks for watching. I know that was a really technical one, and I'm not even sure how much my audience knows C and C++, but I think there are enough of you in there that it was worth making the video, so I hope you appreciated this little technical bit of C. As always, thank you to my patrons and to my donors for supporting me and allowing me to make more of these videos; I really appreciate your support. Lastly, if you enjoyed the video, don't forget to like and comment, and if you especially liked the video, please consider subscribing or becoming one of my patrons. Thanks, and see you next time.
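The first example in the transcript, where a redundant read of amount disappears once x is marked restrict, might look roughly like the sketch below. The on-screen code is not part of the transcript, so the function names (update, update_restrict), the int element type, and the exact signature are assumptions made for illustration only.

```c
/* Hypothetical reconstruction: names and types are assumed, not taken
   from the video's source code. */

/* Without restrict, `amount` may point at the same int as `x`, so the
   store to *x may change *amount. The compiler therefore has to load
   *amount a second time before the subtraction. */
void update(int *x, int *y, int *amount)
{
    *x += *amount;
    *y -= *amount;
}

/* With restrict on x, the caller promises that the int modified through
   x is not accessed through y or amount. The compiler may then reuse
   the value of *amount it already loaded. */
void update_restrict(int *restrict x, int *y, int *amount)
{
    *x += *amount;
    *y -= *amount;
}
```

Calling update(&v, &w, &v) is legal and well defined because that version makes no promise. Passing the same pointer for x and amount to update_restrict would break the promise, and the behavior would be undefined.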

Original Description

C has "restrict" but C++ does not. The restrict keyword in C is the only keyword that has no analogue in C++. The keyword promises to the compiler that objects accessible through the pointer will not be accessed (either read or written to) through any other means than through the restrict pointer. This promise to the compiler allows more potential optimizations, including removing redundant reads and potentially allowing automatic vectorization, which can lead to smaller and faster code. However, if you use it incorrectly, restrict can be the source of some seriously hard to find bugs because they may only appear at high optimization levels and not in your testing or debugging builds. Major C++ compilers do support restrict in a non-standardized way though, so if you are willing to write code that is specific to your compiler, then you may still be able to take advantage of restrict even though it is not part of standard C++.

Erratum: In the fib_upto_n example the check for n==0 should go before the write to dst[0], and the check for n==1 should go before the write to dst[1], it is fixed in the GitHub link. I'm very confused why I ever wrote it that way, but that's what I get for not writing tests :). This did not affect anything in the video because I was using n=10.

mCoding with James Murphy (https://mcoding.io)

Source code: https://github.com/mCodingLLC/VideosSampleCode
Compiler explorer: https://godbolt.org/z/4EzvTGcfo
restrict keyword: https://en.cppreference.com/w/c/language/restrict
StackOverflow on restrict: https://stackoverflow.com/questions/745870/realistic-usage-of-the-c99-restrict-keyword

SUPPORT ME ⭐
Patreon: https://patreon.com/mCoding
Paypal: https://www.paypal.com/donate/?hosted_button_id=VJY5SLZ8BJHEE
Other donations: https://mcoding.io/donate
Top patrons and donors: Laura M, Jameson, John M, Pieter G, Vahnekie, Sigmanificient
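The erratum in the description can be made concrete with a short sketch of fib_upto_n that applies the corrected order of checks. Only the name fib_upto_n and the dst parameter come from the description. The int element type, the starting values 0 and 1, the size_t length parameter, and the helper name vector_add_overlap are assumptions, and this is not the code from the linked repository.

```c
#include <stddef.h>

/* Assumed helper: element-wise add with NO restrict, so dst is allowed
   to overlap the sources and the loop must behave as if it ran one
   element at a time, in order. */
void vector_add_overlap(int *dst, const int *src1, const int *src2, size_t n)
{
    for (size_t i = 0; i < n; ++i) {
        dst[i] = src1[i] + src2[i];
    }
}

/* Fill dst[0..n-1] with the first n Fibonacci numbers (assumed to start
   at 0, 1). Each length check comes BEFORE the corresponding write, as
   the erratum asks, so nothing is written past a short buffer. */
void fib_upto_n(int *dst, size_t n)
{
    if (n == 0) {
        return;
    }
    dst[0] = 0;
    if (n == 1) {
        return;
    }
    dst[1] = 1;
    /* dst + 2 overlaps both sources, so the overlap-tolerant add is the
       correct one to call here. */
    vector_add_overlap(dst + 2, dst, dst + 1, n - 2);
}
```

With a 10-element buffer this yields 0, 1, 1, 2, 3, 5, 8, 13, 21, 34 at any optimization level, because no restrict promise is made that the overlapping call could violate.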


The C keyword 'restrict' has no equivalent in standard C++ and is used to enable compiler optimizations by promising that objects accessible through a pointer will not be accessed through any other means. In C++ it is available only through non-standard compiler extensions, so using it there trades portability across compilers for performance. The restrict keyword can lead to significant optimizations and a reduction in code size, but it can also lead to tricky bugs that appear only at high optimization levels if the promise is broken.

Key Takeaways
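  1. Adding the restrict keyword to a pointer promises the compiler that the objects accessed through it are not accessed through any other means
  2. Compare the assembly with and without the restrict qualifier to see the impact on code size and optimizations
  3. With restrict on the destination, the compiler can use vector instructions without a run-time overlap check
  4. Without restrict, the compiled code checks at run time whether the destination and source pointers overlap
  5. It uses the vector instructions only if they do not overlap, and otherwise falls back to a one-element-at-a-time loop
  6. The Fibonacci example populates an array with the first n Fibonacci numbers
  7. Using a restrict-qualified vector addition on overlapping parts of that array breaks the promise and can give wrong answers at -O3
💡 The restrict keyword in C has no direct equivalent in standard C++. C++ compilers offer it only as an extension such as __restrict, whose semantics are compiler-dependent, so check the compiler manual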
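The overlap check in takeaways 4 and 5, together with the two-version strategy from the transcript, can be sketched by hand as below. All names (vector_add, vector_add_overlap, ranges_overlap, vector_add_checked) and the int element type are hypothetical. The address comparison through uintptr_t assumes a flat address space and ignores wrap-around, and it only approximates whatever check a particular compiler actually emits.

```c
#include <stddef.h>
#include <stdint.h>

/* Fast version: restrict on dst promises that dst does not overlap
   either source, so the compiler is free to vectorize without a check. */
void vector_add(int *restrict dst, const int *src1, const int *src2, size_t n)
{
    for (size_t i = 0; i < n; ++i) {
        dst[i] = src1[i] + src2[i];
    }
}

/* Safe version: no promise, overlapping arguments are allowed. */
void vector_add_overlap(int *dst, const int *src1, const int *src2, size_t n)
{
    for (size_t i = 0; i < n; ++i) {
        dst[i] = src1[i] + src2[i];
    }
}

/* Assumed helper: nonzero if the n-element ranges starting at a and b
   share any byte. Comparing integer addresses is implementation-defined
   but works on common flat-memory platforms. */
static int ranges_overlap(const int *a, const int *b, size_t n)
{
    uintptr_t pa = (uintptr_t)a;
    uintptr_t pb = (uintptr_t)b;
    uintptr_t bytes = (uintptr_t)(n * sizeof(int));
    return pa < pb + bytes && pb < pa + bytes;
}

/* Dispatcher: pick the restrict version only when the promise holds. */
void vector_add_checked(int *dst, const int *src1, const int *src2, size_t n)
{
    if (ranges_overlap(dst, src1, n) || ranges_overlap(dst, src2, n)) {
        vector_add_overlap(dst, src1, src2, n);
    } else {
        vector_add(dst, src1, src2, n);
    }
}
```

In practice the caller usually knows whether its buffers can overlap and calls the right version directly, which is the memcpy versus memmove arrangement the transcript describes. The dispatcher is shown only to make the run-time check visible.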
