Suffix array finding unique substrings
Key Takeaways
The video discusses using suffix arrays and LCP arrays to find and count unique substrings in a string, providing a space-efficient solution with improved time complexity.
Full Transcript
in this video I want to discuss a neat application of suffix arrays and LCP arrays and that is finding and counting unique sub strings there are a variety of interesting problems in computer science especially bioinformatics that require you to find all the unique substrings of a string the naive algorithm has a terrible time complexity of N squared which requires a lot of space the idea is to generate all the sub strings of the string and dump them inside a set a superior approach is to use the information stored inside the LC pra this provides not only a quick but also a space efficient solution I'm not saying that this is the canonical way of finding all unique sub strings because there exists other notable algorithms such as rabbin carp in combination with bloom filters let's now look at an example of how to find unique sub strings let's now look at an example of how to find all the uni substrings of the string a Z a Z a for every string there are exactly n times n plus 1 over 2 sub strings the proof of this I will leave as an exercise so listener but it's really not that hard to derive now notice that all the sub strings here there are a few duplicate ones I have highlighted the repeated sub strings there are exactly 6 of them and 9 unique ones now let's use the information inside the LCP array to figure out which of those sub strings really were the duplicate once in the table on the right I generated the LCP array for the string a a za za remember what the LCP array represents it represents that two adjacent suffixes of the original string shares certain Americ ters with each other so if the LCP value at a search index is say five and there are five characters in common between those two suffixes in other words there are five repeated sub strings between those two suffixes since they come from the same larger string so if we inspect the first LCP position at index one we see that has a value of one we know that the repeated string is the first character in a so we know that a is a repeated substring the next LCP value is three so there are three repeated substring between a z ay and a Z a Z a namely a AZ and ay z a the next interesting LCP values - for the last two suffixes so there are two repeated substrings here we can eliminate z and z a we can then come up with an interesting way of counting all unique substrings we know how many sub strings there are and that is n times n plus 1 over 2 and we also know the number of duplicate strings that is the sum of all the LCP values if this doesn't make immediate sense try out a few examples and play around with it if we go back to the example we just did with the string a Z a Z a and we set N equal to 5 which is the length of the string then we can get the correct answer of 9 by punching N equals 5 and then removing all the repeated substring values summed up in the LCP array alright so if you're looking for a programming challenge concerning counting substrings like I mentioned here you have the link to the cat is online judge for some practice I also included some source code for the suffix array and LCP are a available on github I get dot-com / William Fiza / - structures thank you for watching and in the next video we'll be talking about other applications of suffix arrays and LCP arrays stay tuned
Original Description
Related Videos:
Suffix array intro: https://www.youtube.com/watch?v=zqKlL3ZpTqs
Longest common prefix (LCP) array: https://www.youtube.com/watch?v=53VIWj8ksyI
Counting unique substrings: https://www.youtube.com/watch?v=m2lZRmMjebw
Longest common substring 1/2: https://www.youtube.com/watch?v=Ic80xQFWevc
Longest common substring 2/2: https://www.youtube.com/watch?v=DTLjHSToxmo
Longest repeated substring: https://www.youtube.com/watch?v=OptoHwC3D-Y
Github link:
https://github.com/williamfiset/algorithms
Kattis link:
https://open.kattis.com/problems/substrings
My website: http://www.williamfiset.com ===================================
Practicing for interviews? I have used, and recommend `Cracking the Coding Interview` which got me a job at Google. Link on Amazon: https://amzn.to/3cvMof5
A lot of the content on this channel is inspired by the book `Competitive Programming` by Steven Halim which I frequently use as a resource and reference. Link on Amazon: https://amzn.to/3wC2nix
Watch on YouTube ↗
(saves to browser)
Sign in to unlock AI tutor explanation · ⚡30
Playlist
Uploads from WilliamFiset · WilliamFiset · 0 of 60
← Previous
Next →
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
JES Image Manipulation - 2 - Installation
WilliamFiset
JES Image Manipulation - 3 - User Interface
WilliamFiset
JES Image Manipulation - 5 - Negative
WilliamFiset
JES Image Manipulation - 6 - Black & White
WilliamFiset
JES Image Manipulation - 4 - Grayscale
WilliamFiset
JES Image Manipulation - 8 - Blur
WilliamFiset
JES Image Manipulation - 7 - Edge Detection
WilliamFiset
JES Image Manipulation - 9 - Blend
WilliamFiset
JES Image Manipulation - 10 - Matte
WilliamFiset
JES Image Manipulation - 13 - Rotate90
WilliamFiset
JES Image Manipulation - 12 - Mirroring Picture
WilliamFiset
JES Image Manipulation - 11 - Crop Image
WilliamFiset
JES Image Manipulation - 14 - Stretch picture
WilliamFiset
Java Fractal Explorer [6/8]
WilliamFiset
Java Fractal Explorer [4/8]
WilliamFiset
Java Fractal Explorer [8/8]
WilliamFiset
Java Fractal Explorer [5/8]
WilliamFiset
Java Fractal Explorer [2/8]
WilliamFiset
Java Fractal Explorer [7/8]
WilliamFiset
Java Fractal Explorer [1/8]
WilliamFiset
Java Fractal Explorer [3/8]
WilliamFiset
Introduction [Programming Competition Problems]
WilliamFiset
String Manipulation 1 [Programming Competition Problems]
WilliamFiset
String Manipulation 2 [Programming Competition Problems]
WilliamFiset
Graph Theory 1 [Programming Competition Problems]
WilliamFiset
Logic 1 [Programming Competition Problems]
WilliamFiset
Grid Problems 1 [Programming Competition Problems]
WilliamFiset
Dynamic Programming 1 [Programming Competition Problems]
WilliamFiset
Introduction to Big-O
WilliamFiset
Dynamic and Static Arrays
WilliamFiset
Dynamic Array Code
WilliamFiset
Linked Lists Introduction
WilliamFiset
Doubly Linked List Code
WilliamFiset
Stack Introduction
WilliamFiset
Stack Implementation
WilliamFiset
Stack Code
WilliamFiset
Queue Introduction
WilliamFiset
Queue Implementation
WilliamFiset
Queue Code
WilliamFiset
Priority Queue Introduction
WilliamFiset
Priority Queue Min Heaps and Max Heaps
WilliamFiset
Priority Queue Inserting Elements
WilliamFiset
Priority Queue Removing Elements
WilliamFiset
Priority Queue Code
WilliamFiset
Union Find Introduction
WilliamFiset
Union Find Kruskal's Algorithm
WilliamFiset
Union Find - Union and Find Operations
WilliamFiset
Union Find Path Compression
WilliamFiset
Union Find Code
WilliamFiset
Binary Search Tree Introduction
WilliamFiset
Binary Search Tree Insertion
WilliamFiset
Binary Search Tree Removal
WilliamFiset
Binary Search Tree Traversals
WilliamFiset
Binary Search Tree Code
WilliamFiset
Fenwick Tree range queries
WilliamFiset
Fenwick Tree point updates
WilliamFiset
Fenwick Tree construction
WilliamFiset
Fenwick tree source code
WilliamFiset
Hash table hash function
WilliamFiset
Hash table separate chaining
WilliamFiset
More on: Algorithm Basics
View skill →Related AI Lessons
⚡
⚡
⚡
⚡
Bloom Filters, Explained Properly
Dev.to · Daksh Gargas
Prefix Sums: The Preprocessing Trick That Makes Range Queries Instant
Medium · Programming
I Thought I Was Ready for the Interview — Then One Simple Math Question Destroyed Me
Medium · Programming
Week 2(Day 10): LeetCode Two Pointers(slow & fast): Remove Duplicates from Sorted Array (Brute…
Medium · Python
🎓
Tutor Explanation
DeepCamp AI