Set Intersection Size At Least Two - Leetcode 757 - Python
Key Takeaways
The video solves the Leetcode 757 problem, 'Set Intersection Size At Least Two', using a greedy approach and interval sorting in Python.
Full Transcript
Another day, another dollar from making these videos. At this point, I think I'd be better off getting a job at McDonald's or something. But anyways, today let's solve the problem set intersection size at least two. So, we're given a two-dimensional array of intervals. And as soon as you see that, I mean, when you see an interval problem, the name of the game is almost always going to be sorting, and this problem is no exception. Typically, we can do some kind of sorting and then greedy based approach with this problem or with all interval problems. If you don't know that yet, it's probably because you have not gone through the NEETCode 150 or the 250 list, and specifically the intervals section. So, I would recommend doing that if you haven't. But, I had a bit of trouble even understanding what this question was asking, to be honest. So, I'll try to make that clear. In this example here, we're given some intervals. So, just to draw it really quickly, we have 1 3. That's going to be like this. We have 3 7. That's going to be like this. And lastly, we have 8 9. It's going to be over here. And so, none of the intervals are really overlapping much. I mean, these are overlapping, and we do include the endpoints, so keep that in mind. And also, I think they guarantee us that all of the endpoints are going to be between zero and some positive number. And also, that each interval, we're never going to have like a single-point interval. The two endpoints do have to be different. The second one is always going to be greater than the first one. So, knowing that, you can kind of read through the description where it says, "The definition of a containing set is an array of numbers such as this one or this one, where each interval from the list has at least two integers in that list of numbers." So, the reason that was confusing to me is when they said has at least two integers, I took that to mean like the endpoints. But, that's not what they're talking about. They meant that like this interval is defined as 1 2 and 3. So, at least two of those points have to be in the list. Same thing with this one, it's defined as 3 4 5 6 7. At least two of these points have to be in the list, and then same thing with this guy. So, they give us two that are possible. They give us this 1 2 uh 4 7 8 9. That's one, and that has six points in it. And they give us a different one over here. 2 3 4 8 9. So, the reason we prefer this one is it only took us five points. So, what we're trying to do is create the minimum possible containing set, and we actually don't even have to create it. We just need to know the size of it. But, I I I think in this problem, we could have done either one. It's just as easy to create the set as it is to figure out the length of it. So, now we kind of have an idea. And this is why I said sorting is the name of the game, because obviously, if there's some overlap, we want to definitely pick the overlapping sections as like the points, so that we can kill two birds with one stone. So, naively, you might try just sorting them in ascending order. So, using the first point to sort them in ascending order. And if there's a tie, maybe we have this interval over here. So, these two both start at the same position, but they end at different positions. Maybe we'd still do this in ascending order. So, this interval would go first because its ending point is smaller than the other one. We're basically using the endpoint as the tiebreaker. So, that's what you might naively try. And so, let's say we did try that, and I'm going to use a more interesting example than this one, because in this example, it actually does work out pretty nicely. So, I guess I I will just quickly go through this example. Imagine we're iterating over these from left to right. We see this interval. Okay, we don't know much. And assuming we took care of everything that came before it, what we would want to do at this point, and since it's the first interval, we have to pick two points in this interval. But, which two points would we pick? Would we pick like the first two, or would we pick like the last two? Those are our only choices in this case. Well, my claim is definitely going to pick the last two, because that maximizes our chances of overlapping with the next interval, given that the next interval is definitely going to start after this one. Now, we have no idea where it's going to end, but we definitely know it's going to start before that. So, that's what we would do. We'd pick these two, and then we would go to the next interval. And we'd keep these like in two variables, let's say. And our variables are going to be two and three. So, now we're here. This interval starts at three, it ends at seven. What we would check is do both of these intervals fall in this one? Well, no, they don't. Like, we're trying to be greedy. We want both of them to fall in here, then we don't have to add any new points to our a containing set. But, this time they don't. Okay, then we would check, well, at least does the last one start in here? Is it contained here? Yes, it is. So, we got one out of two. And then for this one, well, we would get rid of it. Now, we'd find a new point in this one, and we'd probably pick the last one to maximize our chances. So, it seems like it works, but let me show you a really simple counterexample to this. And let's say this interval was bigger like that. And let's say then we have another interval that does start after this one, but it actually ends before. Let me actually make it really small. Let's just make it 2 3. Then, I don't know what else I'll add here, but maybe I'll change the rest of this example later. But, just from this one, we reach an issue, because we're starting we're going to go through this one first, this one second, this one third, because we sorted them based on the starting time. So, we pick this one. We want to pick the last two to maximize our chances, because we know for sure all the future intervals are going to start after this one. But, we don't know that they're going to start after this one ends. So, that's the edge case. So, that's the problem here. If we pick these two, we actually didn't maximize our chances. We kind of ruined our chances, because had we started with this interval, we could have picked these two, and then we would have guaranteed that the next one does end before this one. Maybe it doesn't start before this one ends, but still. So, the point I'm trying to make is if we're trying to be greedy in the way that I talked about, trying to pick the last points possible to maximize our chances, we shouldn't sort based on the starting time of these intervals. We should actually sort based on the ending time of these intervals. Okay. So, so far, so good. We can try that. And in this example, it will work. Like, I'm going to get these two points as my first points chosen. So, that's two and three. And then I'm going to go to this next interval. I'm going to find that these points are already contained in this guy, so that's good. And then I'm going to go to this interval, and then we're going to be forced to pick two new points. So, that will have been four points a total, and that works for this example. But, once again, let me show you a counterexample. Well, it's not really a counterexample, but it kind of helps us arrive at a more precise solution. So, what I'm about to explain to you next is probably the most complicated part of the problem. It's really, really subtle, so it's easy to miss this, unless we really go through the example and dry run through it. So, I'm going to add another interval. Let's make it 4 5. And so, I'm picking this interval because it also ends at this position. So, we already said we're sorting based on the end value, let's say the right part of the interval, and we're doing that in ascending order. But, if there's a tiebreaker, what do we do? Do we then just sort based on the left value in ascending order, or do we sort based on the left value in descending order? And at first, it might seem like it doesn't really matter. And that's because it doesn't. You can code it up both ways, but I'm going to show you why doing it in descending order based on the first value is a lot more simple. And I'm going to use some variables now. Now, we're actually going to go through the dry run. Just a few variables. One is going to keep track of the result, and that's going to be a integer, but just to make it really, really clear, I'm actually going to make it an array, so you actually know exactly which points we picked. So, let's do that. And I'm also going to keep track of my last two points that I have picked that are a part of the solution set. So, I'm going to call them P1 and P2. P1 is going to be the previous one, and then P2 is going to be like the most recent one. So, if we had two points like this, this is P1, this is P2. Just to make that really, really clear. So, now let's start. And uh we're going through this from left to right, using the ending values in ascending order. And I'm going to initially try to do it in the other way. I'm going to try to now also do the left values in ascending order as well. So, we're going to first go through this guy, and we're going to pick the last two points that this is a part of, because we're trying to be greedy. We hope that the last two points of this are also a part of the next interval, which is going to end before this one. Or rather, end after this one. So, we pick these two. We're going to get two and three here, and then that's exactly what these are going to be as well. In that case, we would have just incremented this by two, by the way. And we would have known that because P1 and P2, let's say initially they start at -1, because those are definitely going to be out of bounds. And then what we're going to check is just that P2, which is -1, is less than the starting of this one. And we know that P1 is always going to be less than P2. So, if that's the case, then we know that neither of these belong to this interval. So, we have to pick two new points. So, that was the initial case. Now, we're here. Now, we're going to go to this interval next. And we're going to check, is P2 less than the beginning of this? Is it less than the left? Is it less than one? No, it's not. So, that automatically tells us P2 is definitely a part of this interval because we know that P2 was a part of the previous interval and this one definitely ends after that one or it ends at the same value. So, we don't even have to check that P2 is less than this guy. And so, since we found out that P2 is bigger than the left, we already know it's less than the right. That means it falls somewhere in here. So, that's how I'm going to code it at least. So, now we know P2 is in there. Let's check P1. Okay, is P1 also greater than the left? Because we know P2 is somewhere in here. P2 is over here. P1 is definitely less than P2. So, if it's greater than the left point, then we know that P1 also falls somewhere in here. And so, right now it does. So, that tells us both of these guys are in this interval. We don't have to do anything. Keep them the same. Okay? And next, we're going to get to this interval and we're going to find that P2 is actually less than the left here, which is four. So, neither of these guys falls in this interval. So, then we have to pick two brand new ones. We pick the last two points of this. We pick right and right minus one, which is going to be four. So, it will replace both of these with four and five. And we'll add that to the result. So, this one actually worked out in this example. And had we done it the other order, had we done this one first and this one second, it would have actually worked out the exact same way. Well, not the exact same way, but the result would have been the same. The length, that is. Because we would have picked these two first. That's two, three. Then we would have gotten here and we would have picked four and five. And then we would have gotten here and we know four and five is definitely a part of this one. So, it seems like it works out in both cases. We just end up with Well, we actually ended up with the same result. And intuitively, it seems like it would work. And it does, but when you make the example a bit more interesting, things start to get kind of messy. So, I'm going to get rid of this interval, but really I'm just changing the starting point of it. Now, I'm going to have this one be like this. So, it's still like relative to the sorting order, still before this one. But I want to show you how we have to manipulate our pointers now, our points, rather. So, we start here. Same thing happens. We have two, three. We get two, three here. Two, three added to the result. So, our result is now two. Next, we're going to go to this point. So, what we're going to find is that yes, P2 is in bounds. It's not less than this point, so it's somewhere in here. And then we check the same for P1, but it is less than the beginning, so we can't include this one. So, we keep the three, but then we need one new point. So, what we're going to do is take this guy, move it over here. That's now going to be P1 cuz we always want P1 to be less than this one. And then, which point should we pick from this interval? We should probably pick the last one cuz cuz we know that every future interval is going to have an ending point greater than or equal to this one. So, that's what we do. We pick five. And now, this is the special case that you have to handle. You have to handle it one way or another and I'll show you how I'm going to handle it. But now, we get to this interval. So, we know that if there is a tiebreaker, the future intervals are going to have a bigger left value, a more restrictive left value than the previous ones. So, now, they're ending at the same position. We check, is P2 part of this interval? Well, it's not less than the left point, which is four, so it is part of the interval. Now, we check, is P1? Well, this is four. Three is less than that, so three is no longer a part of this interval. So, in my way, the way I want to code it up in a nice, clean way, what I'm going to say is okay, we have to pick a new point from this interval now, right? And remember the way we pick, we always pick the far right points that we can. So, what I'm going to do now is say, okay, well, P2, it can go over here. It can replace this guy, five. And then, P2 is going to be the last point from this interval. That's going to give us five. So, clearly, something went wrong. Even though so far our answer is still correct, logically, we messed up. These cannot be the same point. We can't double-count a point. So, now, if I have a future interval that looks like this, maybe five, six, what I'm going to find is that both of these points do fall within that interval, but that doesn't mean we have two points from that interval. We do need a six. Five is not enough. Five is just one point. We need to add a six, but our solution would continue with this as if everything is fine and it would say, okay, this is the solution set, two, three, five. That's incorrect. It should have been two, three, five, six, which would have given us a length of four, not a length of three. So, we could have handled this in two different ways. You could have not just taken this value and naively replaced it with this. We could have added an if statement that if P2 is already equal, so we're in this state right now, we find that three has to be replaced. We need to pick a new point here. So, you could add an if statement and say, okay, if five is already equal to the right value, well, we can't replace it with the right value. So, what we're going to do is update the left value. We're going to update P1 and we're going to set it to right minus one. We know for sure that is in bounds. So, you could do that. You could set this to four. And then that works perfectly fine. You'll just need an extra if statement in your code. But let me show you why sorting this in the other way actually solves this problem altogether. Had we sorted it such that this one went first and then this one went second, notice how if they have a tie, they end at the same position, anything I include from this one is definitely going to be included in this one as well because the starting point is going to be before this one. So, what I'm saying is not only are we sorting the right values in ascending order, we're also going to add a second condition, sort the left in descending order if there is a tiebreaker. So, this way we can keep our updates pretty simple. And this way the code would have been something like this or the dry one would have been Initially, we start at two, three. Then we get to this interval, which is going to show us that neither of these are actually in bounds. So, we end up picking four, five. And then these will become four and five. And then we'll get here and we'll find that four and five are already in bounds, so we're good. And then when we get to another interval, maybe a five, six, we're going to find that five is in bounds. So, we will put five over here. Four was not in bounds, so we're going to need to replace it with a new one, six. If this was confusing, I'll show you the code in both variations and show you why I prefer the more simple approach where we do add this second sorting condition. Okay, so let me just reset this and what we're going to do here is initialize the result to be zero. I'm going to sort the intervals and you can do that like this in Python. Set the key to be a lambda function where we're passing in an interval I and we're sorting it like this where the second value takes precedence and by default this will be in ascending order. And then the second value will take second precedence, but we're going to do that in reverse order. So, to do that, you can just add a negative sign. And then we're also going to initialize our points P1, P2, -1, -1. And then we start iterating through each interval. I'm going to just call them left and right as I unpack them like this. And we're going to end up returning the result, so I'll just put that there. So, now I coded it up in a really simple way, I think, which is like this. If P2 is less than left, we automatically know neither of these points fall within this interval. Because we know P1 is smaller than P2. If it's smaller than the left value, well, then it's not anywhere in this interval. So, then in that case, we increment the result and we pick the last two points from this interval. So, I'm going to set P1 and P2 to be this. It's going to be right minus one. That's P1. And P2 is just going to be right. The other case is if P2 is not less than left, so P2 is actually included in the interval, but P1 is less than left. So, in that case, sorry, in this case, we increment the result by two. In this case, we increment the result by one and we only have to replace P2 because P2 now is going to take the spot of P1. So, we can do this. P1 equals P2. And in the same line, you can say P2 is now going to be right minus one cuz we're being greedy. Sorry, not right minus one. It's just going to be the rightmost point cuz we're being greedy. And then, the last case is where neither of these points are in the interval. And in that case, you don't really do anything. I guess you could increment the result by zero, but that's not going to do anything, so we don't need that case anyway. So, I'm going to submit this now and you can see it works. It's pretty efficient. Now, let me show you. Had we sorted it the other way, it still would have possibly worked. This case, I think, is still going to be the same. If neither of the points are in the current interval, just pick the last two points. Okay, but if the first point is not in the interval, but the second one is, we can't just naively set P2 to be right because P2 might already be right. So, I'm going to add a if statement. Is P2 equal to right? If it's not, then we can do this code down here. But if it is, then we need to do something else. We need to update P1. P2 can stay the same, but P1 now must be right minus one, I believe. So, I will run that code now. Okay, you can see that one also works. It's pretty much the exact same time complexity, which by the way, I think you can probably tell is going to be bottlenecked based on the sorting, which is going to be n log n, where n is the size of the intervals array. Thanks for watching, and I'll see you guys soon.
Original Description
🚀 https://neetcode.io/yt - A better way to prepare for Coding Interviews
🧑💼 LinkedIn: https://www.linkedin.com/in/navdeep-singh-3aaa14161/
🐦 Twitter: https://twitter.com/neetcode1
⭐ BLIND-75 PLAYLIST: https://www.youtube.com/watch?v=KLlXCFG5TnA&list=PLot-Xpze53ldVwtstag2TL4HQhAnC8ATf
Problem Link: https://leetcode.com/problems/set-intersection-size-at-least-two/
0:00 - Read the problem
0:30 - Drawing Explanation
16:59 - Coding Explanation
leetcode 757
#neetcode #leetcode #python
Watch on YouTube ↗
(saves to browser)
Sign in to unlock AI tutor explanation · ⚡30
Playlist
Uploads from NeetCodeIO · NeetCodeIO · 0 of 60
← Previous
Next →
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
Leetcode 149 - Maximum Points on a Line - Python
NeetCodeIO
Design Linked List - Leetcode 707 - Python
NeetCodeIO
Minimum Time to Collect All Apples in a Tree - Leetcode 1443 - Python
NeetCodeIO
Design Browser History - Leetcode 1472 - Python
NeetCodeIO
Number of Good Paths - Leetcode 2421 - Python
NeetCodeIO
Flip String to Monotone Increasing - Leetcode 926 - Python
NeetCodeIO
Maximum Sum Circular Subarray - Leetcode 918 - Python
NeetCodeIO
Find Closest Node to Given Two Nodes - Leetcode 2359 - Python
NeetCodeIO
Concatenated Words - Leetcode 472 - Python
NeetCodeIO
Data Stream as Disjoint Intervals - Leetcode 352 - Python
NeetCodeIO
LFU Cache - Leetcode 460 - Python
NeetCodeIO
N-th Tribonacci Number - Leetcode 1137
NeetCodeIO
Best Team with no Conflicts - Leetcode 1626 - Python
NeetCodeIO
Greatest Common Divisor of Strings - Leetcode 1071 - Python
NeetCodeIO
Shortest Path in a Binary Matrix - Leetcode 1091 - Python
NeetCodeIO
Insert into a Binary Search Tree - Leetcode 701 - Python
NeetCodeIO
Delete Node in a BST - Leetcode 450 - Python
NeetCodeIO
Shuffle the Array (Constant Space) - Leetcode 1470 - Python
NeetCodeIO
Fruits into Basket - Leetcode 904 - Python
NeetCodeIO
Number of Subarrays of size K and Average Greater than or Equal to Threshold - Leetcode 1343 Python
NeetCodeIO
Naming a Company - Leetcode 2306 - Python
NeetCodeIO
As Far from Land as Possible - Leetcode 1162 - Python
NeetCodeIO
Shortest Path with Alternating Colors - Leetcode 1129 - Python
NeetCodeIO
Minimum Fuel Cost to Report to the Capital - Leetcode 2477 - Python
NeetCodeIO
Count Odd Numbers in an Interval Range - Leetcode 1523 - Python
NeetCodeIO
Contains Duplicate II - Leetcode 219 - Python
NeetCodeIO
Path with Maximum Probability - Leetcode 1514 - Python
NeetCodeIO
Add to Array-Form of Integer - Leetcode 989 - Python
NeetCodeIO
Unique Paths II - Leetcode 63 - Python
NeetCodeIO
Minimum Distance between BST Nodes - Leetcode 783 - Python
NeetCodeIO
Design Hashmap - Leetcode 706 - Python
NeetCodeIO
Range Sum Query Immutable - Leetcode 303 - Python
NeetCodeIO
Binary Tree Zigzag Level Order Traversal - Leetcode 103 - Python
NeetCodeIO
Middle of the Linked List - Leetcode 876 - Python
NeetCodeIO
Course Schedule IV - Leetcode 1462 - Python
NeetCodeIO
Single Element in a Sorted Array - Leetcode 540 - Python
NeetCodeIO
Capacity to Ship Packages - Leetcode 1011 - Python
NeetCodeIO
IPO - Leetcode 502 - Python
NeetCodeIO
Minimize Deviation in Array - Leetcode 1675 - Python
NeetCodeIO
Longest Turbulent Array - Leetcode 978 - Python
NeetCodeIO
Last Stone Weight II - Leetcode 1049 - Python
NeetCodeIO
Construct Quad Tree - Leetcode 427 - Python
NeetCodeIO
Find Duplicate Subtrees - Leetcode 652 - Python
NeetCodeIO
Sort an Array - Leetcode 912 - Python
NeetCodeIO
Ones and Zeroes - Leetcode 474 - Python
NeetCodeIO
Remove Duplicates from Sorted Array II - Leetcode 80 - Python
NeetCodeIO
Maximum Twin Sum of a Linked List - Leetcode 2130 - Python
NeetCodeIO
Concatenation of Array - Leetcode 1929 - Python
NeetCodeIO
Symmetric Tree - Leetcode 101 - Python
NeetCodeIO
Check Completeness of a Binary Tree - Leetcode 958 - Python
NeetCodeIO
Construct Binary Tree from Inorder and Postorder Traversal - Leetcode 106 - Python
NeetCodeIO
Find Peak Element - Leetcode 162 - Python
NeetCodeIO
Accounts Merge - Leetcode 721 - Python
NeetCodeIO
Binary Tree Preorder Traversal (Iterative) - Leetcode 144 - Python
NeetCodeIO
Binary Tree Postorder Traversal (Iterative) - Leetcode 145 - Python
NeetCodeIO
Number of Zero-Filled Subarrays - Leetcode 2348 - Python
NeetCodeIO
Minimum Score of a Path Between Two Cities - Leetcode 2492 - Python
NeetCodeIO
Sqrt(x) - Leetcode 69 - Python
NeetCodeIO
Successful Pairs of Spells and Potions - Leetcode 2300 - Python
NeetCodeIO
Optimal Partition of String - Leetcode 2405 - Python
NeetCodeIO
More on: LLM Engineering
View skill →Related AI Lessons
⚡
⚡
⚡
⚡
Bloom Filters, Explained Properly
Dev.to · Daksh Gargas
Prefix Sums: The Preprocessing Trick That Makes Range Queries Instant
Medium · Programming
I Thought I Was Ready for the Interview — Then One Simple Math Question Destroyed Me
Medium · Programming
Week 2(Day 10): LeetCode Two Pointers(slow & fast): Remove Duplicates from Sorted Array (Brute…
Medium · Python
Chapters (3)
Read the problem
0:30
Drawing Explanation
16:59
Coding Explanation
🎓
Tutor Explanation
DeepCamp AI