Choosing Indexes for Similarity Search (Faiss in Python)

James Briggs · Intermediate · 🔍 RAG & Vector Search · 4y ago

Key Takeaways

The video covers Facebook AI Similarity Search (Faiss) in Python, focusing on how to choose an index that balances search quality against search speed. It walks through flat indexes, LSH, HNSW, and IVF on the Sift1M dataset and shows how each index's parameters affect recall, search time, and index size.

Full Transcript

Hi, welcome to the video. I'm going to take you through a few different indexes in Faiss today (Faiss for similarity search), and we're going to learn how we can decide which index to use based on our data. Now, these indexes are reasonably complex, but we're just going to have a high-level look at each one of them. At some point in the future we'll go into more depth for sure, but for now this is what we're going to do.

We're going to cover the indexes that you see on the screen at the moment. We have the flat indexes, which are just plain and simple, nothing special going on there. Then we're going to have a look at LSH, or locality sensitive hashing; HNSW, which is hierarchical navigable small worlds; and then finally we're going to have a look at an IVF index as well.

The first thing I'm going to show you is how to get some data for following along. We're going to be using the Sift1M dataset, which is 1 million vectors that we can use for testing similarity. There's a little bit of code, so I'm just going to show it to you. Here we're just downloading the data. There'll be a notebook for this in the description as well, so you can just use that and copy things across. We're downloading it from here, and this will give us a tar file. So we download that, and then here all we're doing is extracting all the files from inside the tar file, and then here I'm reading everything into the notebook. Inside that tar file we get these fvecs files, and we have to open them in a certain way, which is what we're doing here: we're setting up the function to read them here, and then here I'm reading in two files. We get a few different files (sorry, this should be sift). We get the base data, which is going to be the data that we're going to search through, and then we also have query data here. What I'm doing here is just selecting a single query, a single vector, to query with rather than all of them, because we get quite a few in
there. Then here we can see: this is our query vector, xq, and then we also have xb here, which is going to be the data that we'll index and search through, and we can see some of it there as well. So that's how we get data. Let's move on to some flat indexes.

What you can see at the moment is a visual representation of a flat L2 index. Up here, this is what we're doing: we have all these points, which are all of the xb points that we saw before, and this is our query vector, and we just calculate the distance between all of those. Then what we do is take the top three (the top k in reality, but in this case it's the top three). We also have IP, so we have both L2 distance and IP distance. IP works in a different way: we're using a different formula to calculate the distance or similarity there, so it's not exactly as you see here.

Before we write any code, I just want to say that flat indexes are 100% quality. Typically what we want to do with Faiss and similarity search indexes is balance search quality against speed: higher search quality usually means slower search speed. Flat indexes are pure search quality because they are an exhaustive search: they check the distance between your query vector and every other vector in the index. That's fine if you don't have a particularly big dataset or you don't care about time, but if you do, then you probably don't want to use that, because it can take an incredibly long time. If you have a billion vectors in your dataset and you do 100 queries a minute then, as far as I know, it's impossible to run that, and if you are going to run that you need some pretty insane hardware. So we can't use flat indexes and exhaustive search in most cases, but I will show you how to do it.

First I'm just going to define the dimensionality of our data, which is 128, which we can see up here. I'm also going to say how many, so how
many results we want to return; I'm going to say 10. We also need to import Faiss before we do anything, and then we can initialize our index. As I said, we have two: we have faiss IndexFlatL2 or IndexFlatIP. I'm going to use IP because it's very slightly faster; from my testing it seems very slightly faster, but there's hardly any difference in reality. That initializes our index, and then we want to add our data to it, so we add xb, and then we perform a search. Let me create a new cell and run this quickly. What I'm going to do is time it, so you can see how long this takes as well. So I'm going to do time, and we're going to do D, I = index.search, and in here we have our query vector and how many samples we'd like to return, so I'm going to go with k.

Okay, so that was reasonably quick, and that's because we don't have a huge dataset and we're just searching for one query, so it's not really too much of a problem there. But what I do want to show you is that if we print out I, that returns all of the IDs, or the indexes, of the 10 most similar vectors. I'm going to use that as a baseline for each of our other indexes. This is, like I said, 100% quality, and we can use it to test the accuracy of the other indexes as well. So what I'm going to do is take that and convert it into a list, and if we have a look at what we get, we'll see that we get a list like that. We're just going to use that, like I said, to see how our other indexes are performing.

So we'll move on to the other indexes. Like I said before, we want to try to go from this, the flat indexes, where it's just 100% search quality, to something that's more 50/50, but it depends on our use case as well: sometimes we might want more speed, sometimes higher quality. We will see a few of those trade-offs through these indexes.

We start with LSH. At a very high level, LSH works by grouping vectors into different buckets. What we can see on the screen now is a typical
hashing function for something like a Python dictionary. What these hashing functions do is try to minimize collisions. A collision is where we would have two items, say these two, being hashed into the same bucket. With a dictionary you don't want that, because you want every bucket to hold an independent value; otherwise it increases the complexity of extracting your values from a single bucket if they've collided. LSH is slightly different, because we actually do want to group things. We can see it as a dictionary, but whereas before we were avoiding those collisions (you can see here we're putting them into completely different buckets every time), now we're trying to maximize collisions. You can see here that we've pushed all three of these keys into this single bucket, and we've also pushed all of these keys into this single bucket, so we get groupings of our values.

When it comes to performing our search, we process our query through the same hashing function, and that will push it to one of our buckets. In the case of it maybe appearing in this bucket here, we use Hamming distance to find the nearest bucket, and then we restrict our scope to these values. So we've just restricted our scope there, which means that we do not need to search through everything: we are avoiding searching through those values down there.

Now let's have a look at how we implement that. It's pretty straightforward: we do IndexLSH, we have our dimensionality, and then we also have this other variable, which is called nbits. I will put that in a variable up here, nbits, and I'm going to make it d multiplied by four. nbits has to scale with the dimensionality of our data, which leads to another problem, the curse of dimensionality, which I'll talk more about in a moment. So here we have nbits, and then we add our
data like we did before, and then we can search our data just like we did before. So time, and we want D, I = index.search, and we are searching using our query, and we want to return 10 items. Okay, so quicker speed, as you can see here. What we can also do is compare the results to our 100% quality index, the flat index, and we do that using NumPy's in1d with baseline and I. I'm just going to look at it visually here. We can see we have quite a lot of matches: plenty of trues, a couple of falses, true, false. These are the top 10 that have been returned using our LSH algorithm, and we're checking whether they exist in the baseline results that we got from our flat index earlier. We're seeing that most of them are present in that baseline, so most of them do match. So it's reasonably good recall, and it was faster: we've got 17.6 milliseconds here. How much did we get up here? We got 157 milliseconds. So slightly less accurate, but what is that, ten times faster? So it's pretty good. And we can mess around with nbits: we can increase it to increase the accuracy of our index, or decrease it to increase the speed. So again, it's just trying to find that balance between them both.

Okay, so this is a graph showing you the recall with different nbits values. As we saw before, we increase the nbits value for good recall, but at the same time we have that curse of dimensionality. If we are multiplying our dimensionality value d by eight in order to get good recall, then if we have a dimensionality of four, that's not a very high number, so it's going to be reasonably fast. But if we increase that to a dimensionality of, for example, 512, that becomes very complex very quickly. So you have to be careful with your dimensionality: lower dimensionality is very good for LSH, otherwise it's not so good. You can see that here. At the bottom here (this is on the same dataset) I've used an nbits value of d multiplied by
two. There, LSH is super fast; it's faster than our flat index, which is what you would hope. But if we increase the nbits value quite a bit, so maybe we want very high performance, then it gets out of hand very quickly and our search time grows massively. So you have to find that balance. But what we got before was pretty good: we had d multiplied by four, I think, and we got reasonable performance and it was fast, so it's good. That also applies to the index size as well: with low nbits the index size isn't too bad; with higher nbits it's pretty huge. So that's also something to think about.

Let's move on to HNSW. The first part of it is NSW, which is navigable small world graphs. What makes a graph small world? It essentially means that the graph can be very large, but the number of hops, the number of steps you need to take between any two vertices (the points), is very low. In this example here we have this vertex over here, and to get over to this one on the opposite side we need to take one, two, three, four hops. This is obviously a very small network, so it doesn't really count, but you can see this sort of behavior in very large networks. I think in 2016 there was a study from Facebook. I don't remember the exact number of people they had on the platform at that point, but I think it's in the billions, and they found that the average number of hops you need to take between any two people on the platform is like 3.6. So that's a very good example of a navigable small world graph.

Hierarchical NSW graphs, which is what we are using, are built in the same way as an NSW graph, but then they're split across multiple layers, which is what you can see here. When we perform our search, the path it takes will hop between different layers in order to find our nearest neighbor. It's pretty complicated, and this is really oversimplifying it a lot, I think, but that's the general gist
of it. I'm not going to go any further into it; we will, I think, in a future video and article.

Now let's put that together in code. We have a few different variables here. We have M, which I'm going to set to 16; M is the number of connections that each vertex has, so of course greater connectivity means we're probably going to find our nearest neighbors more accurately. Then ef_search, which is the depth of our search every time we perform a search. We can set this to a higher value if we want to search more of the network, or a lower value if we want to search less of the network. Obviously lower values are going to be quicker, and higher values are going to be more accurate. And then we have ef_construction. This, similar to ef_search, is how much of the network we search, but not during the actual search: during the construction of the network. So this is essentially how efficiently and accurately we are going to build the network in the first place. This will increase the add time, but it makes no difference to the search time, so it's good to use a high number for this one, I think.

So we'll initialize our index, and it's IndexHNSWFlat. We can use different vectors here; we can use, I think, PQ there, and essentially what that's going to do is make the search faster but slightly less accurate. This is already really fast with flat, and that's what we're going to stick with, but like I said, we will return to this at some point in the future and cover it in a lot more detail. So we pass the dimensionality, and we need to pass in our M value here as well. Now we want to apply those two parameters, so we have hnsw.efSearch, which is obviously ef_search, and then we also have hnsw.efConstruction, which is obviously ef_construction. So that should be everything ready to go, and all we want to do now is add our data, so index.add with xb. Like I said, with ef_construction we use a reasonably high value, so you can see this is already taking a lot longer than the previous indexes to
actually add our vectors into it, but it's still not going to take that long. Once it is done, we are going to do our search just like we did every other time. So we have D, I = index.search, and we pass in our query and also k. Okay, so 43.6 seconds to add the vectors, so a fair bit longer, but then look at this: super fast, 3.7 milliseconds, so much faster than the last one. I think the last one was 16 milliseconds, right? Okay, the flat index was 157; with LSH we have 17.6. So really quick, which is cool, but how's the performance? Let's have a look. We get quite a few falses here and only a couple of trues, so it's not so great: it was really fast, but it's not very accurate. Fortunately we can fix that. Let's increase our ef_search; I'm going to increase it a fair bit, let's go to 32, and I would imagine this is probably more than enough to get good performance. So run this and run this. Now we see we get pretty good results. The wall time is higher, so it's just a case of balancing it, because this is now higher than LSH. But what we can do is increase the value for ef_construction, or increase or decrease those depending on what you want. So there's a lot of flexibility with this, and it can be really fast. HNSW is essentially one of the best performing indexes that you can use; if you look at the current state of the art, a lot of them are HNSW or based on HNSW in some way or another. So these are good ones to go with; you just need to play around with them a little bit.

This is some of the performance I found using the same dataset while messing around. We have the ef_construction values down here, so we start with 16 over here, up to 64.
The ef_search values are over here and our M values are over here, and we've got pretty good recall at an ef_construction of 64. ef_construction is a really good one to just increase, because it doesn't increase your search time, which is pretty cool, I think. And then here is the search time, again for HNSW, against M and ef_search. I didn't include ef_construction there because it doesn't make a difference. And this is the one thing with HNSW: the index size is absolutely huge. So that's just one thing to bear in mind: the index can take a lot of memory. But otherwise, a really cool index.

That leads us on to our final index, which is the IVF index. This is super popular, and with good reason: it is very good. The inverted file index is based on essentially clustering data points. We see here we have all of these different data points, the little crosses, and then we have these three other points, which are going to be our cluster centroids. Based on each of our cluster centroids, we expand a catchment radius around each of those, and as you can see here, where each of those circles collides it creates the edge of what are going to be almost like catchment cells. This is called a Voronoi diagram, or (it's a really hard word) Dirichlet tessellation. I don't know if that's pronounced correctly, but I think it sounds pretty cool, so I thought I'd throw that in there. So we create these cells, and any data point within one of those cells will be allocated to that given centroid. Then when you search, you pass your xq value in there, and the xq value will be compared to every single cluster centroid, but not to the other values within that cluster or the other clusters, only the cluster centroids. From that you find out which centroid is the closest to your query vector, and then what we do is restrict our search scope to only the data points within that cluster
or that cell, and then we calculate the nearest vector. At this point we have only the vectors within that cell, and we compare all of those to our query vector.

Now, there is one problem with this, which is called the edge problem. We're just showing this in two-dimensional space; obviously in reality, for example with the data that we're using, we have 128 dimensions, so the edge problem is kind of complicated when you think about it in hundreds of dimensions. But what this is: say we find our query vector is right on the edge of one of the cells. If we set our nprobe value (I mentioned nprobe here; that's how many cells we search) to one, it means that we're going to restrict our search to only that cell, even though, if you look at this, this one for sure is closer to our query vector than any of the magenta data points, and possibly also this one and this one, and maybe even this one. But we're not going to consider any of these, because we're restricting our search only to this cell, so we're only going to look at these data points and also these over here. So that's the edge problem. But we can get around that by not just searching one cell but by searching quite a few. In this case our nprobe value is eight, and that means we're going to search the eight nearest centroids, or centroid cells. And that's how IVF works.

Let's go ahead and implement that in code. The first thing we need to do is set our nlist value, which is the number of centroids that we will have within our data. Then this time, and this is a little bit different, we need to set the final vector search that we're going to do. This is kind of split into two different operations: we're searching based on clusters, and then we're actually comparing the full vectors within the selected clusters. So we need to define how
we're going to do that final search between our full vectors and our query vector. So what we do is write faiss.IndexFlat; we're going to use IndexFlatIP, though you can use L2 as well. We set our dimensionality, so we're just initializing a flat index there. Then what we're going to do is feed that into our IVF index. Our IVF index is faiss.IndexIVFFlat, because we're using the flat vectors there. We need to pass our quantizer (this step here, the other step of the search process), the dimensionality, and also our nlist value, so how many cells or clusters we're going to have in there.

With this, because we're clustering data, we need to do something else. In fact, let me show you: if we write index.is_trained, we get False. If we had run this on any of our other indexes, it would have been True, because they don't need to be trained, since we're not doing clustering or any other form of training or optimization there. So what we need to do is train our index before we use it. We write index.train and we just pass all of our vectors into that. It's very quick, so it's not really an issue. Then we do index.add and pass our data. One more thing I want to show you: we have our nprobe value. We'll search with one for now, so we search one cell, and to search we write D, I, as we have every other time, index.search with xq and k.

Okay, so super fast: 3.32 milliseconds. I think that's maybe the fastest, other than our badly performing, low-quality HNSW index. Let's see how that's performed. So we write np.in1d with baseline and I. You can see it's not too bad, to be fair: 50/50, almost, so that's actually pretty good. But what we can do if we want it to be even better is increase the nprobe value, so let's go up to four. That's increased the wall time quite a bit, from about 3 to 125 milliseconds, which is now super slow actually, but now we're getting perfect results. We can maybe decrease that to two. So now it's
faster. That could have been a one-off; occasionally you get a really slow search, it just happens sometimes. So with nprobe set to two it's super fast and super accurate. So that's a very good index as well.

These are the stats I got in terms of recall and search time in milliseconds for different nprobe values and different nlist values, so again, it's just about balancing it. As for index size, the only things that affect it here are the size of your data and the nlist value, but you can increase the nlist value loads and the index size hardly increases: it's increasing by about 100 kilobytes per doubling of the nlist value, so it's like nothing.

So that's it for this video. We covered quite a lot, so I'm going to leave it there, but I think all these indexes are super useful and quite interesting. Just playing around with them, like you've seen I've done loads with these graphs, seeing what is faster, what is slower, where the good quality is, and seeing what you can get out of the parameters is super useful for actually understanding these. What I do want to do going forward is explore each one of these indexes in more depth, because we've only covered them at a very high level at the moment. So in future videos and articles we're going to go into more depth and explore them a lot more, which will be pretty interesting, I think. So that's it for this video. Thank you very much for watching, and I'll see you in the next one. Bye.
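The LSH idea from the transcript (hash vectors so that similar ones collide, then compare hashes with Hamming distance) can be sketched in plain Python. This is only an illustration of one common variant, random-hyperplane hashing, and is not claimed to be how Faiss implements IndexLSH; the function names (make_hyperplanes, hash_vector, hamming, lsh_search) and the toy Gaussian data are assumptions made for the example.

```python
import random

def make_hyperplanes(d, nbits, seed=0):
    """Draw nbits random hyperplane normals in d dimensions."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(d)] for _ in range(nbits)]

def hash_vector(v, planes):
    """Build a binary code: one bit per hyperplane, set when v is on its positive side."""
    code = 0
    for p in planes:
        dot = sum(a * b for a, b in zip(p, v))
        code = (code << 1) | (1 if dot > 0 else 0)
    return code

def hamming(a, b):
    """Number of bit positions in which two codes differ."""
    return bin(a ^ b).count("1")

def lsh_search(codes, q_code, k):
    """Return the k stored IDs whose codes are closest to the query code."""
    ranked = sorted(range(len(codes)), key=lambda i: (hamming(codes[i], q_code), i))
    return ranked[:k]

# Toy data (assumption): 1,000 random 8-dimensional vectors centred on the origin.
rng = random.Random(1)
d = 8
nbits = d * 4  # same scaling rule as in the transcript: nbits = d * 4
xb = [[rng.gauss(0.0, 1.0) for _ in range(d)] for _ in range(1000)]
xq = xb[7]  # query with a stored vector, so an exact code match exists

planes = make_hyperplanes(d, nbits)
codes = [hash_vector(v, planes) for v in xb]
top = lsh_search(codes, hash_vector(xq, planes), k=10)
print(top)
```

Raising nbits gives finer-grained codes, which tends to help recall, but every stored code and every comparison grows with it. That is the trade-off the transcript describes when nbits has to scale with the dimensionality.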

Original Description

Facebook AI Similarity Search (Faiss) is a game-changer in the world of search. It allows us to efficiently search a huge range of media, from GIFs to articles - with incredible accuracy in sub-second timescales for billion+ size datasets. The success in Faiss is due to many reasons. One of those, in particular, is its flexibility. Faiss recognizes that there is no 'one-size-fits-all' in similarity search. Instead, Faiss comes with a wide range of search indexes - which we can mix and match to our choosing. However, this great flexibility produces a question - how do we know which size fits our use case? Which index do we choose? Should we use multiple indexes, or is one enough? This video will explore the pros and cons of some of the most important indexes - Flat, LSH, HNSW, and IVF. We will learn how we decide which to use and the impact of parameters in each index to build some of the best indexes for semantic search.

🌲 Pinecone Article: https://www.pinecone.io/learn/vector-indexes/
🎉 Sign-up For New Articles Every Week on Medium! https://medium.com/@jamescalam/membership
Download script for Sift1M dataset: https://gist.github.com/jamescalam/a09a16c17b677f2cf9c019114711f3bf
Similarity Search Series: https://www.youtube.com/playlist?list=PLIUOU7oqGTLhlWpTz4NnuT3FekouIVlqc
🤖 70% Discount on the NLP With Transformers in Python course: https://bit.ly/3DFvvY5
👾 Discord https://discord.gg/c5QtDB9RAP
Mining Massive Datasets Book (Similarity Search):
📚 https://amzn.to/3CC0zrc (3rd ed)
📚 https://amzn.to/3AtHSnV (1st ed, cheaper)
🕹️ Free AI-Powered Code Refactoring with Sourcery: https://sourcery.ai/?utm_source=YouTub&utm_campaign=JBriggs&utm_medium=aff


This video teaches how to use Faiss in Python for efficient similarity search, with a focus on choosing a suitable index for large datasets. It covers flat indexes, LSH, HNSW, and IVF, and provides practical examples and code snippets.
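The comparison step used throughout the video (checking which baseline IDs from the flat index also appear in an approximate index's results, as np.in1d does in the notebook) can be sketched with the standard library alone. The helper names membership, recall, and timed are hypothetical, and the ID lists are made-up placeholders rather than results from the video.

```python
import time

def membership(baseline, results):
    """For each baseline ID, report whether it also appears in results (like np.in1d)."""
    found = set(results)
    return [i in found for i in baseline]

def recall(baseline, results):
    """Fraction of baseline IDs that the approximate search recovered."""
    hits = membership(baseline, results)
    return sum(hits) / len(hits)

def timed(fn, *args):
    """Run fn(*args) and return (result, elapsed milliseconds)."""
    start = time.perf_counter()
    out = fn(*args)
    return out, (time.perf_counter() - start) * 1000.0

baseline = [3, 17, 42, 8, 99]  # hypothetical IDs from an exhaustive (flat) search
approx = [17, 3, 5, 99, 61]    # hypothetical IDs from an approximate index

print(membership(baseline, approx))
print(recall(baseline, approx))

# timed() can wrap any search call; sorted() stands in for one here.
_, ms = timed(sorted, list(range(100000, 0, -1)))
print(f"{ms:.2f} ms")
```

Recording recall and wall time side by side for each parameter setting is one simple way to reproduce the kind of trade-off charts shown in the video.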

Key Takeaways
  1. Download and extract the SIFT 1M dataset
  2. Import necessary libraries and initialize the index
  3. Add data to the index using index.add()
  4. Perform search using the index
  5. Compare the performance of different indexing techniques
  6. Fine-tune indexing parameters for optimal performance
💡 The choice of index and indexing parameters can significantly impact the performance of similarity search, and balancing index size and search time is crucial for optimal results.
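The steps above (build an index, add data, search, compare against a flat baseline, then tune a parameter) can be sketched for the IVF case in plain Python. This is a toy illustration under stated assumptions, not Faiss's implementation: centroids are random samples of the data rather than trained cluster centres, the data is random rather than Sift1M, and build_ivf, ivf_search, and flat_search are hypothetical names.

```python
import random

def l2(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def flat_search(xb, xq, k):
    """Exhaustive baseline: rank every vector by distance to the query."""
    ranked = sorted(range(len(xb)), key=lambda i: (l2(xb[i], xq), i))
    return ranked[:k]

def build_ivf(xb, nlist, seed=0):
    """Assign each vector to its nearest centroid (one inverted list per cell).

    Assumption: centroids are random samples of xb instead of trained clusters.
    """
    rng = random.Random(seed)
    centroids = [xb[i] for i in rng.sample(range(len(xb)), nlist)]
    cells = [[] for _ in range(nlist)]
    for i, v in enumerate(xb):
        nearest = min(range(nlist), key=lambda c: l2(centroids[c], v))
        cells[nearest].append(i)
    return centroids, cells

def ivf_search(xb, centroids, cells, xq, k, nprobe):
    """Search only the nprobe cells whose centroids are closest to the query."""
    order = sorted(range(len(centroids)), key=lambda c: l2(centroids[c], xq))
    candidates = [i for c in order[:nprobe] for i in cells[c]]
    candidates.sort(key=lambda i: (l2(xb[i], xq), i))
    return candidates[:k]

# Toy data (assumption): 2,000 random 8-dimensional vectors and one random query.
rng = random.Random(42)
d, n, nlist, k = 8, 2000, 16, 10
xb = [[rng.random() for _ in range(d)] for _ in range(n)]
xq = [rng.random() for _ in range(d)]

centroids, cells = build_ivf(xb, nlist)
baseline = flat_search(xb, xq, k)
for nprobe in (1, 4, nlist):
    found = ivf_search(xb, centroids, cells, xq, k, nprobe)
    recall = len(set(found) & set(baseline)) / k
    print(f"nprobe={nprobe:2d}  recall@{k}={recall:.1f}")
```

With nprobe equal to nlist every cell is scanned, so the result matches the exhaustive baseline; smaller values scan fewer vectors and may miss neighbours that sit just across a cell boundary, which is the edge problem described in the video.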
