Longest common substring problem suffix array part 2

WilliamFiset · Beginner ·⚡ Algorithms & Data Structures ·8y ago

Key Takeaways

The video demonstrates how to solve the longest common substring problem using a suffix array, with a focus on a full example with four strings and a minimum of two strings sharing the longest common substring. The solution involves building a suffix array and an LCP array, and then using a window-based approach to find the longest common substring.

Full Transcript

welcome back we're going to finish where we left off in the last video in this video I want to do a full example solving the longest common substring problem with a suffix array for this example we're going to have four strings s1 s2 s3 and s4 I have also selected the value of K to be equal to two meaning that we want a minimum of two strings of our pool of four to share the longest common substring between them I have also provided you with a concatenated text we'll be working with as well as the solution at the bottom of the screen in case you want to pause the video and figure it out for yourself the first step in finding the longest common substring between a set of our four strings is to build the suffix array and the LCP array which I have displayed on the right side and the left side respectively while I will be conducting the longest common substring algorithm notice the variables on the left as they change the window LCP and the window LCS values will track the longest common prefix and the longest common substring values for the current window and the LCS length and the Alsea setter will track the best values so far so let's get started initially our window starts at the top and we want the window to contain two different colors so our rule is to expand downwards when we do not meet this criteria as I expand down the suffix is green so we still only have one color I expand down again and still one more green suffix I expand downwards again and now we arrive at a blue suffix and here we are able to perform a range query for our window however our query isn't fruitful because the window longest common prefix value is zero so there's no longest common substring here when we sass by the window color criteria like we do now we decrease the window size I decrease the window size by one and still nothing interesting and decrease the window size again and this time look what we found the current window contains a longest common prefix length of two so we obtained the longest common substring BC and add it to our solution set now we keep shrinking the interval size because we meet the color requirement our window size is now too small because K is 2 so we need two different color strings so we expand once more now something interesting has happened because we find an LC p value of 3 which is larger than our current best value so we update the solution set to have the string and BCD instead of just BC which is one character longer now get to shrink the window size the window was now too small so we expand the LC p value of 0 here so that is no good so shrink the window size now expand to meet the color requirement we get an LC p value of 1 but that doesn't beat our current best which is 3 so shrink the window now we need to meet a color requirement so expand we have only blue string so we keep expanding an LC p value of 1 for this window range so that's no good so shrink now we have an LC p value of 2 we're getting closer to our best but still not good enough so we have to shrink and like let go now expand now something interesting is going on here we have a window LCP value of 3 which is equal to our best so far so instead of saying that the CDE our newfound longest common substring value beats BCD which is of the same length we keep both in the solution set alright so now let's shrink our window interval because we meet the color requirement now expand it we still need one more color expand again our LCP window value is zero so shrink and shrink again now expand LCP value of 1 here that's not good enough so smaller expand i'll see p value of 2 okay we might be getting closer but we meet color requirement so shrink now expand to meet the color requirement these two strings have an Elsa p value is 0 shrink now expand now shrink now we've reached the end and found our solution to the longest common substring problem with 4 strings and a k value of 2 as I was doing the window expanding and shrinking I want you to notice that each time the window either expanded or shrank I only ever moved one of the endpoints downwards and they were always going downwards so we know that the number of windows has to be linear in proportion to the number of suffixes that we have and the number of suffixes that we have is the length of our text T so we come to the conclusion that there must be a linear amount of windows that we must consider which is really good because we want our time complexity be quick so that's for now thank you for watching I hope you learned something and I will catch you next time

Original Description

Related Videos: Suffix array intro: https://www.youtube.com/watch?v=zqKlL3ZpTqs Longest common prefix (LCP) array: https://www.youtube.com/watch?v=53VIWj8ksyI Counting unique substrings: https://www.youtube.com/watch?v=m2lZRmMjebw Longest common substring 1/2: https://www.youtube.com/watch?v=Ic80xQFWevc Longest common substring 2/2: https://www.youtube.com/watch?v=DTLjHSToxmo Longest repeated substring: https://www.youtube.com/watch?v=OptoHwC3D-Y Data structures repository: https://github.com/williamfiset/algorithms Kattis problem: https://open.kattis.com/problems/lifeforms My website: http://www.williamfiset.com =================================== Practicing for interviews? I have used, and recommend `Cracking the Coding Interview` which got me a job at Google. Link on Amazon: https://amzn.to/3cvMof5 A lot of the content on this channel is inspired by the book `Competitive Programming` by Steven Halim which I frequently use as a resource and reference. Link on Amazon: https://amzn.to/3wC2nix
Watch on YouTube ↗ (saves to browser)
Sign in to unlock AI tutor explanation · ⚡30

Playlist

Uploads from WilliamFiset · WilliamFiset · 0 of 60

← Previous Next →
1 JES Image Manipulation - 2 - Installation
JES Image Manipulation - 2 - Installation
WilliamFiset
2 JES Image Manipulation - 3 - User Interface
JES Image Manipulation - 3 - User Interface
WilliamFiset
3 JES Image Manipulation - 5 - Negative
JES Image Manipulation - 5 - Negative
WilliamFiset
4 JES Image Manipulation - 6 - Black & White
JES Image Manipulation - 6 - Black & White
WilliamFiset
5 JES Image Manipulation - 4 - Grayscale
JES Image Manipulation - 4 - Grayscale
WilliamFiset
6 JES Image Manipulation - 8 - Blur
JES Image Manipulation - 8 - Blur
WilliamFiset
7 JES Image Manipulation - 7 - Edge Detection
JES Image Manipulation - 7 - Edge Detection
WilliamFiset
8 JES Image Manipulation - 9 - Blend
JES Image Manipulation - 9 - Blend
WilliamFiset
9 JES Image Manipulation - 10 - Matte
JES Image Manipulation - 10 - Matte
WilliamFiset
10 JES Image Manipulation - 13 - Rotate90
JES Image Manipulation - 13 - Rotate90
WilliamFiset
11 JES Image Manipulation - 12 - Mirroring Picture
JES Image Manipulation - 12 - Mirroring Picture
WilliamFiset
12 JES Image Manipulation - 11  - Crop Image
JES Image Manipulation - 11 - Crop Image
WilliamFiset
13 JES Image Manipulation - 14 - Stretch picture
JES Image Manipulation - 14 - Stretch picture
WilliamFiset
14 Java Fractal Explorer [6/8]
Java Fractal Explorer [6/8]
WilliamFiset
15 Java Fractal Explorer [4/8]
Java Fractal Explorer [4/8]
WilliamFiset
16 Java Fractal Explorer [8/8]
Java Fractal Explorer [8/8]
WilliamFiset
17 Java Fractal Explorer [5/8]
Java Fractal Explorer [5/8]
WilliamFiset
18 Java Fractal Explorer [2/8]
Java Fractal Explorer [2/8]
WilliamFiset
19 Java Fractal Explorer [7/8]
Java Fractal Explorer [7/8]
WilliamFiset
20 Java Fractal Explorer [1/8]
Java Fractal Explorer [1/8]
WilliamFiset
21 Java Fractal Explorer [3/8]
Java Fractal Explorer [3/8]
WilliamFiset
22 Introduction [Programming Competition Problems]
Introduction [Programming Competition Problems]
WilliamFiset
23 String Manipulation 1 [Programming Competition Problems]
String Manipulation 1 [Programming Competition Problems]
WilliamFiset
24 String Manipulation 2 [Programming Competition Problems]
String Manipulation 2 [Programming Competition Problems]
WilliamFiset
25 Graph Theory 1 [Programming Competition Problems]
Graph Theory 1 [Programming Competition Problems]
WilliamFiset
26 Logic 1 [Programming Competition Problems]
Logic 1 [Programming Competition Problems]
WilliamFiset
27 Grid Problems 1 [Programming Competition Problems]
Grid Problems 1 [Programming Competition Problems]
WilliamFiset
28 Dynamic Programming 1 [Programming Competition Problems]
Dynamic Programming 1 [Programming Competition Problems]
WilliamFiset
29 Introduction to Big-O
Introduction to Big-O
WilliamFiset
30 Dynamic and Static Arrays
Dynamic and Static Arrays
WilliamFiset
31 Dynamic Array Code
Dynamic Array Code
WilliamFiset
32 Linked Lists Introduction
Linked Lists Introduction
WilliamFiset
33 Doubly Linked List Code
Doubly Linked List Code
WilliamFiset
34 Stack Introduction
Stack Introduction
WilliamFiset
35 Stack Implementation
Stack Implementation
WilliamFiset
36 Stack Code
Stack Code
WilliamFiset
37 Queue Introduction
Queue Introduction
WilliamFiset
38 Queue Implementation
Queue Implementation
WilliamFiset
39 Queue Code
Queue Code
WilliamFiset
40 Priority Queue Introduction
Priority Queue Introduction
WilliamFiset
41 Priority Queue Min Heaps and Max Heaps
Priority Queue Min Heaps and Max Heaps
WilliamFiset
42 Priority Queue Inserting Elements
Priority Queue Inserting Elements
WilliamFiset
43 Priority Queue Removing Elements
Priority Queue Removing Elements
WilliamFiset
44 Priority Queue Code
Priority Queue Code
WilliamFiset
45 Union Find Introduction
Union Find Introduction
WilliamFiset
46 Union Find Kruskal's Algorithm
Union Find Kruskal's Algorithm
WilliamFiset
47 Union Find - Union and Find Operations
Union Find - Union and Find Operations
WilliamFiset
48 Union Find Path Compression
Union Find Path Compression
WilliamFiset
49 Union Find Code
Union Find Code
WilliamFiset
50 Binary Search Tree Introduction
Binary Search Tree Introduction
WilliamFiset
51 Binary Search Tree Insertion
Binary Search Tree Insertion
WilliamFiset
52 Binary Search Tree Removal
Binary Search Tree Removal
WilliamFiset
53 Binary Search Tree Traversals
Binary Search Tree Traversals
WilliamFiset
54 Binary Search Tree Code
Binary Search Tree Code
WilliamFiset
55 Fenwick Tree range queries
Fenwick Tree range queries
WilliamFiset
56 Fenwick Tree point updates
Fenwick Tree point updates
WilliamFiset
57 Fenwick Tree construction
Fenwick Tree construction
WilliamFiset
58 Fenwick tree source code
Fenwick tree source code
WilliamFiset
59 Hash table hash function
Hash table hash function
WilliamFiset
60 Hash table separate chaining
Hash table separate chaining
WilliamFiset

This video teaches how to solve the longest common substring problem using a suffix array and a window-based approach. The solution involves building a suffix array and an LCP array, and then using a window-based approach to find the longest common substring. The time complexity of the algorithm is linear, making it efficient for large inputs.

Key Takeaways
  1. Build a suffix array
  2. Build an LCP array
  3. Initialize a window with two different colors
  4. Expand the window downwards until a longest common prefix is found
  5. Shrink the window size if the longest common prefix is not found
  6. Repeat the process until the end of the suffix array is reached
💡 The number of windows to consider is linear in proportion to the number of suffixes, resulting in a linear time complexity.

Related AI Lessons

Bloom Filters, Explained Properly
Learn how Bloom filters work and their benefits, including tiny memory and blazing speed, in exchange for potential false positives.
Dev.to · Daksh Gargas
Prefix Sums: The Preprocessing Trick That Makes Range Queries Instant
Learn how prefix sums enable instant range queries in arrays, boosting performance in various applications
Medium · Programming
I Thought I Was Ready for the Interview — Then One Simple Math Question Destroyed Me
A simple math question can destroy a developer's interview, highlighting the importance of being prepared for unexpected questions
Medium · Programming
Week 2(Day 10): LeetCode Two Pointers(slow & fast): Remove Duplicates from Sorted Array (Brute…
Learn to remove duplicates from a sorted array using the two pointers technique, improving from brute force to optimized solutions
Medium · Python
Up next
Stump Grinder Carbide Wheel Grinds Hardwood To Chips
Innoforge Studio
Watch →