The Knapsack Problem & Genetic Algorithms - Computerphile

Computerphile · Intermediate ·⚡ Algorithms & Data Structures ·5y ago

Key Takeaways

The video demonstrates the use of Genetic Algorithms to solve the Knapsack Problem, a classic problem in computer science, using techniques such as tournament selection, roulette selection, mutation, and crossover.

Full Transcript

i would like to talk about genetic algorithms they were particularly useful for me during my phd and i think in terms of being able to build something yourself and experiment with them i think they're quite good bang for buck and i think you can solve a lot of problems using them which would otherwise be able to do so and even if you can't solve them perfectly you get an indication of what's what and i think that's pretty neat there's a broader category called evolutionary algorithms and this is a particular type of them if you represent a problem in a specific way you can evolve solutions to that problem in a way which mimics biological evolution and it's quite visual as well where we start is we have to have a suitable problem in order in order to be applied genetic algorithm um some problems are more suitable than others and we're going to do a toy problem for this called the knapsack problem you have a knapsack which is like a rucksack or a bergen and you fill it with boxes and each box has a weight and a value and the objective is to fill the bergen with the most value without going over a specified weight limit so a good way to think about the knapsack problem is an aladdin's cave full of treasure so if you're in the desert and you had one rucksack you'd have to fill it with the most valuable items and that's a very good representation of this exact problem we're going to do a simple version abstract problem just because it's easy to understand so we're going to have four boxes and each one has a weight and then we'll write the values below which again will be just made up so if we do five four seven two so these are our four boxes and we've got a bergen here and the goal is to put the maximum value of boxes in this burgundy without going over a weight limit so seeing if these are the weight limits we're going to impose a limit of 15 kilograms so if the limit is 15 what we put into the bergen or the knapsack has got to be less than 15. we can actually represent a solution to this problem with four bits so four either zeros or ones and each one corresponding to a box and this one's pointing to this box what this says is that the first box this one won't be included in the knapsack the second one will the third one will and the fourth one won't and what we can do is then we can add up the values and the weights to see how well this satisfies this problem so the weights are two and one so that's three and three is less than 15 kilograms so what we can do is then we can give it a score on the value so we can actually return the score which is four plus seven which is eleven so this solution here scores 11. now if we were gonna do one which is perhaps not so successful so we can do one one zero one this solution has these two boxes in and these two boxes are quite heavy so what we know with the limit being 15 if we add 9 and 7 that's 16 which is above the limit so what this solution returns is zero so this is a useless solution we don't want it and the goal is is to get over time the best combination of boxes so that we can get the most value without being overweight so now we know how to represent our solutions what we can do is we can make a population of solutions which for now will just be random so we got no idea how well they're going to score so this is our population that we've got to start with so this is eight randomly generated solutions so what we did previously when we defined the problem that's fitness function so if we were going to write that into code what would happen is we would be able to give it one of these solutions and it would return how good that solution is so what we can do is we can go through each one of these and we can get a score to say how fit this solution is we can put the scores here so let's just have a little table so basically they get the score of the value but if the weight's too high to get zero exactly right yep that's it so we've now got our score so we've got an idea of how good these solutions are it's similar to survival of the fittest so in general with this algorithm the better solutions tend to survive and propagate over time what we do now is once we've got our population and we've got the fitnesses of that population we do something called selection we have to select some of the population to go forward to the next step there are many ways we can do this one of the ways which is quite common is something called roulette wheel selection where each one of these solutions will get a chunk of an artificial roulette wheel and the bigger the score the bigger the chunk and we spin it generally if you have a high score it's like you've got a bigger chunk and you're more likely to be selected however that's not always the case we'll do a different one which is objectively quite good and um it's easier to implement and that's called tournament selection so what we're gonna do is we're gonna have a mini tournament and we're gonna randomly select two members of the population two solutions to have a little battle and the one with the highest score wins these are just picked randomly so we're going to pick this solution and this one for the first tournament okay and the winner of this tournament is this one so the best score here for this tournament was top one is that six and then we're going to have another tournament between say this one and this one so these two have a little battle and it's this one that wins we've now got our two parents which we will draw here so they were the lucky winners of the tournaments so we've got zero one zero one for the other parent we've got zero zero one one so then what we can do is we can do something called crossover which is when we cross over some information from parent a with some information from parent b and we're going to do this by basically splitting in half and just swapping each part over so the idea of this is that then we get two children which have a little bit of parent a and a little bit of parent b and when we're using a genetic care and we do this according to it to a rate so we can set a parameter at the start called the crossover rate a lot of playing with genetic algorithms is changing the the rates and population size so yes we're gonna swap over our two parents to create two children so we're gonna do the first two for the first one and the last two and then so then we will have two children now and we'll draw them down here so the first one zero one from here and then one one from this parent and then we'll have zero zero from this one and then zero one from up here so these are our two children so we've done crossover to get the children so now we've got one final um operator and that's mutation so we also have a mutation rate as well as a crossover rate and a mutation rate is typically a lot lower than the um crossover rate um so typically something like 0.05 0.01 0.1 something like that and what we do is we loop over every bit of information that we've got in the two children and for each one of these we pull a random number between zero and one if it's less than the mutation rate we flip a bit okay so this mimics mutation in nature the idea that sometimes you can just get random unexpected changes which adds the variance to the population which over time is quite useful so what we're going to do is we are going to flip just this bit so now this child is we started off with we had our population we went through tournament selection we had a little mini tournament we've got two parents we then crossed the parents over in this instance that doesn't always happen but it did here and then we've gone through mutation and mutation has flipped one of the bits from a one to a zero so these are now our two completed children once we have our children here what we do is we repeat this process until we get an equivalent amount of children that were in our original population so what we do is we go back to this stage and we've already calculated the fitnesses of every single member of the population so what we do is we just generate another random tournament so it won't be this one and this one or it could be but it will likely be a different tournament so we could do this one and this one for the first tournament and then say this one and this one for the second tournament and then we would generate another two children and then we go back and do the genetic operations on those two children and then if we've done that twice we'll have four children so we'd repeat that process until we have the same amount of children as we had in the population and then once we've swapped over the population of children we've created with the old population that's one generation okay and typically what we do is you run this for hundreds 250 500 000 generations and over time what you should see is the average fitness of the population gets much better and the best fitness gets a lot better and generally what you see is you see a curve that goes up and then it tapers off at the end and that's generally a good time to stop the algorithm i'm guessing this is normally done with a lot more parameters than four right yes so you're exactly right this problem is trivial it's more interesting if you add other parameters so if instead of having weight and value you could have weight value and size or additionally you have something like weight value and robustness so if it's like an aladdin's k full of jewelry um you'd want to take the ones that are lowest in weight highest in value but if two were the same in that respect you'd probably elect to take a piece of gold rather than a china pot because it might break in your rucksack so when you start having more parameters it becomes less easy for to see what exactly is a good solution to this problem and indeed how good a given box is because if some parameters are really high and some are really low you don't quite know where it exists and when you start having a hundred a thousand of these this is when um jayhawns can be quite useful they sometimes don't give you the optimum solution in fact they often don't but they give you a good idea of what direction it's going in so you can get a better grasp of the problem at hand this is a very vanilla genetic algorithm and there are some things that could be improved and one of them is if we look at this sheet here this is our original population where we had scores this solution here wasn't picked even though it was the highest so it was just randomly not picked in this instance but hypothetically it is possible for this solution solution not to be picked at all and not to not be passed to the next generation so there are some strategies which you can use to overcome this one of them is elitism which is where you basically take the best solution from the population and then you put it in the next generation anyway so you just remove one from the the new eight that we had the new population and then just put that one in here if we were going to use this legitimately to solve this problem there are some parameters in here which would be better if they were changed so for example the population size 8 is good visually but generally if we're going to start out using genetic algorithms we use a population size for something like 100 250 a thousand something like that because with this it's very hard to get variation partly because we've only got eight solutions but partly because the problem's so simple so if we did have value weight and then we had size because some shapes are better than others and robustness and then we had a hundred boxes i'd probably start with a population size of 500 and i'd have a crossover rate of 0.5 so crossover happens half the time and probably a mutation rate of 0.05 and i'd use a tournament selection because very rare you have a tournament selection size too it's not particularly useful i have a tournament selection of something like four or eight those parameters just loosely should help preserve the heterogeneity within the population to make sure that you have a rich space to look through because ultimately this is a search algorithm this is just going through this search space here to try and find the best solution and if you solve other functions you can actually plot how the jet algorithm is moving through a specific function you only have to work out whether it's worth alerting the user if you find the key so you know you download the temporary exposure key you perform the encryption you generate the potential rpis and you compare them with the ones you've seen or if you want a more slightly comprehensible message it's saying maybe you haven't applied a function to enough arguments

Original Description

Tournament selection, roulette selection, mutation, crossover - all processes used in genetic algorithms. Dr Alex Turner explains using the Knapsack Problem. https://www.facebook.com/computerphile https://twitter.com/computer_phile This video was filmed and edited by Sean Riley. Computer Science at the University of Nottingham: https://bit.ly/nottscomputer Computerphile is a sister project to Brady Haran's Numberphile. More at http://www.bradyharan.com
Watch on YouTube ↗ (saves to browser)
Sign in to unlock AI tutor explanation · ⚡30

Playlist

Uploads from Computerphile · Computerphile · 0 of 60

← Previous Next →
1 Follow the Cookie Trail - Computerphile
Follow the Cookie Trail - Computerphile
Computerphile
2 EXTRA BITS - Follow the Cookie Trail - Computerphile
EXTRA BITS - Follow the Cookie Trail - Computerphile
Computerphile
3 Musical Floppy Drives - Computerphile
Musical Floppy Drives - Computerphile
Computerphile
4 The Hair Algorithm - Computerphile
The Hair Algorithm - Computerphile
Computerphile
5 Getting Sorted & Big O Notation - Computerphile
Getting Sorted & Big O Notation - Computerphile
Computerphile
6 Quick Sort - Computerphile
Quick Sort - Computerphile
Computerphile
7 Hyper History and Cyber War - Computerphile
Hyper History and Cyber War - Computerphile
Computerphile
8 Entropy in Compression - Computerphile
Entropy in Compression - Computerphile
Computerphile
9 Original Elite on the BBC B - Computerphile
Original Elite on the BBC B - Computerphile
Computerphile
10 IP Addresses and the Internet - Computerphile
IP Addresses and the Internet - Computerphile
Computerphile
11 A Career in Video Games - Computerphile
A Career in Video Games - Computerphile
Computerphile
12 Error Detection and Flipping the Bits - Computerphile
Error Detection and Flipping the Bits - Computerphile
Computerphile
13 Programming BASIC and Sorting - Computerphile
Programming BASIC and Sorting - Computerphile
Computerphile
14 Birthplace of the World Wide Web - Computerphile
Birthplace of the World Wide Web - Computerphile
Computerphile
15 Punch Card Programming - Computerphile
Punch Card Programming - Computerphile
Computerphile
16 Programming Paradigms - Computerphile
Programming Paradigms - Computerphile
Computerphile
17 CERN Computing Centre (and mouse farm) - Computerphile
CERN Computing Centre (and mouse farm) - Computerphile
Computerphile
18 Error Correction - Computerphile
Error Correction - Computerphile
Computerphile
19 Home-Made Code - Computerphile
Home-Made Code - Computerphile
Computerphile
20 Security of Data on Disk - Computerphile
Security of Data on Disk - Computerphile
Computerphile
21 Gesture Controls - Computerphile
Gesture Controls - Computerphile
Computerphile
22 How Intelligent is Artificial Intelligence? - Computerphile
How Intelligent is Artificial Intelligence? - Computerphile
Computerphile
23 Encryption and Security Agencies - Computerphile
Encryption and Security Agencies - Computerphile
Computerphile
24 Virtual Machines Power the Cloud - Computerphile
Virtual Machines Power the Cloud - Computerphile
Computerphile
25 Hacking Websites with SQL Injection - Computerphile
Hacking Websites with SQL Injection - Computerphile
Computerphile
26 How Huffman Trees Work - Computerphile
How Huffman Trees Work - Computerphile
Computerphile
27 Cracking Websites with Cross Site Scripting - Computerphile
Cracking Websites with Cross Site Scripting - Computerphile
Computerphile
28 Cloud Computing (Cloudy with a Chance of Pizza) - Computerphile
Cloud Computing (Cloudy with a Chance of Pizza) - Computerphile
Computerphile
29 Texting Cabbage with a Recorder - Computerphile
Texting Cabbage with a Recorder - Computerphile
Computerphile
30 Hashing Algorithms and Security - Computerphile
Hashing Algorithms and Security - Computerphile
Computerphile
31 How YouTube Works - Computerphile
How YouTube Works - Computerphile
Computerphile
32 How NOT to Store Passwords! - Computerphile
How NOT to Store Passwords! - Computerphile
Computerphile
33 A New Golden Age of Video Games - Computerphile
A New Golden Age of Video Games - Computerphile
Computerphile
34 A Universe of Triangles - Computerphile
A Universe of Triangles - Computerphile
Computerphile
35 Cross Site Request Forgery - Computerphile
Cross Site Request Forgery - Computerphile
Computerphile
36 The True Power of the Matrix (Transformations in Graphics) - Computerphile
The True Power of the Matrix (Transformations in Graphics) - Computerphile
Computerphile
37 The Great 202 Jailbreak - Computerphile
The Great 202 Jailbreak - Computerphile
Computerphile
38 EXTRA BITS - Printing and Typesetting History - Computerphile
EXTRA BITS - Printing and Typesetting History - Computerphile
Computerphile
39 Triangles to Pixels - Computerphile
Triangles to Pixels - Computerphile
Computerphile
40 The Problem with Time & Timezones - Computerphile
The Problem with Time & Timezones - Computerphile
Computerphile
41 The Visibility Problem - Computerphile
The Visibility Problem - Computerphile
Computerphile
42 Lights and Shadows in Graphics - Computerphile
Lights and Shadows in Graphics - Computerphile
Computerphile
43 The Penguin Barcode - Computerphile
The Penguin Barcode - Computerphile
Computerphile
44 Typesetters in the '80s - Computerphile
Typesetters in the '80s - Computerphile
Computerphile
45 The Font Magicians - Computerphile
The Font Magicians - Computerphile
Computerphile
46 The Little Mac with the Big Bite - Computerphile
The Little Mac with the Big Bite - Computerphile
Computerphile
47 EXTRA BITS - More on the Original Mac at 30 - Computerphile
EXTRA BITS - More on the Original Mac at 30 - Computerphile
Computerphile
48 XP to Ubuntu with an 8yr old Hacktop - Computerphile
XP to Ubuntu with an 8yr old Hacktop - Computerphile
Computerphile
49 EXTRA BITS - Hacktop Real-Time Boot Comparison - Computerphile
EXTRA BITS - Hacktop Real-Time Boot Comparison - Computerphile
Computerphile
50 EXTRA BITS - Making a Bootable USB in Linux - Computerphile
EXTRA BITS - Making a Bootable USB in Linux - Computerphile
Computerphile
51 EXTRA BITS - Installing Ubuntu Permanently - Computerphile
EXTRA BITS - Installing Ubuntu Permanently - Computerphile
Computerphile
52 The Dawn of Desktop Publishing - Computerphile
The Dawn of Desktop Publishing - Computerphile
Computerphile
53 What is Bootstrapping? - Computerphile
What is Bootstrapping? - Computerphile
Computerphile
54 Reverse Polish Notation and The Stack - Computerphile
Reverse Polish Notation and The Stack - Computerphile
Computerphile
55 Home-Made Z80 Retro Computer - Computerphile
Home-Made Z80 Retro Computer - Computerphile
Computerphile
56 Should Everybody Learn to Code? - Computerphile
Should Everybody Learn to Code? - Computerphile
Computerphile
57 Programming in PostScript - Computerphile
Programming in PostScript - Computerphile
Computerphile
58 Heartbleed, Running the Code - Computerphile
Heartbleed, Running the Code - Computerphile
Computerphile
59 YouTube's Secret Algorithm - Computerphile
YouTube's Secret Algorithm - Computerphile
Computerphile
60 YouTube Search & Discovery - Computerphile
YouTube Search & Discovery - Computerphile
Computerphile

This video teaches how to use Genetic Algorithms to solve the Knapsack Problem, a classic problem in computer science, and demonstrates the use of techniques such as tournament selection, roulette selection, mutation, and crossover. The video provides a comprehensive introduction to Genetic Algorithms and their application to optimization problems. By watching this video, viewers can learn how to apply evolutionary computation to complex problems and use Genetic Algorithms for unsupervised and s

Key Takeaways
  1. Represent the Knapsack Problem as a binary string of 0s and 1s
  2. Define a fitness function to evaluate the quality of a solution
  3. Generate a population of solutions randomly
  4. Evaluate each solution in the population using the fitness function
  5. Evolve the population over time to find the best combination of boxes that maximizes value without exceeding the weight limit
  6. Use tournament selection or roulette wheel selection to select parents
  7. Perform crossover to combine information from parents
  8. Perform mutation to introduce random changes
💡 Genetic Algorithms can be used to solve complex optimization problems such as the Knapsack Problem by iteratively generating new populations and selecting the fittest individuals.

Related Reads

📰
Advanced Stack ApplicationsData Structures and Algorithms Deep‑Dive — Advanced Stack Applications…
Learn advanced stack applications in data structures and algorithms to improve coding skills
Medium · Programming
📰
The Minecraft anvil is a tree-cost optimization problem in disguise
Optimize tree costs in Minecraft using graph theory and algorithms, just like the anvil repair system
Dev.to · Mark
📰
KMP Algorithm (Knuth-Morris-Pratt): The Smart Way to Perform String Matching in O(N)
Learn the KMP algorithm for efficient string matching in O(N) time complexity and improve your coding skills
Dev.to · Jaspreet singh
📰
Every Backtracking Problem Is the Same Three Lines. I Just Couldn't See the Tree.
Master backtracking problems with a simple three-line approach to improve problem-solving skills in coding interviews and challenges
Dev.to · Alex Mateo
Up next
Stump Grinder Carbide Wheel Grinds Hardwood To Chips
Innoforge Studio
Watch →