The Knapsack Problem & Genetic Algorithms - Computerphile

Computerphile · Intermediate ·⚡ Algorithms & Data Structures ·5y ago

Skills: ML Maths Basics80%Unsupervised Learning70%Supervised Learning60%

Key Takeaways

The video demonstrates the use of Genetic Algorithms to solve the Knapsack Problem, a classic problem in computer science, using techniques such as tournament selection, roulette selection, mutation, and crossover.

Full Transcript

i would like to talk about genetic algorithms they were particularly useful for me during my phd and i think in terms of being able to build something yourself and experiment with them i think they're quite good bang for buck and i think you can solve a lot of problems using them which would otherwise be able to do so and even if you can't solve them perfectly you get an indication of what's what and i think that's pretty neat there's a broader category called evolutionary algorithms and this is a particular type of them if you represent a problem in a specific way you can evolve solutions to that problem in a way which mimics biological evolution and it's quite visual as well where we start is we have to have a suitable problem in order in order to be applied genetic algorithm um some problems are more suitable than others and we're going to do a toy problem for this called the knapsack problem you have a knapsack which is like a rucksack or a bergen and you fill it with boxes and each box has a weight and a value and the objective is to fill the bergen with the most value without going over a specified weight limit so a good way to think about the knapsack problem is an aladdin's cave full of treasure so if you're in the desert and you had one rucksack you'd have to fill it with the most valuable items and that's a very good representation of this exact problem we're going to do a simple version abstract problem just because it's easy to understand so we're going to have four boxes and each one has a weight and then we'll write the values below which again will be just made up so if we do five four seven two so these are our four boxes and we've got a bergen here and the goal is to put the maximum value of boxes in this burgundy without going over a weight limit so seeing if these are the weight limits we're going to impose a limit of 15 kilograms so if the limit is 15 what we put into the bergen or the knapsack has got to be less than 15. we can actually represent a solution to this problem with four bits so four either zeros or ones and each one corresponding to a box and this one's pointing to this box what this says is that the first box this one won't be included in the knapsack the second one will the third one will and the fourth one won't and what we can do is then we can add up the values and the weights to see how well this satisfies this problem so the weights are two and one so that's three and three is less than 15 kilograms so what we can do is then we can give it a score on the value so we can actually return the score which is four plus seven which is eleven so this solution here scores 11. now if we were gonna do one which is perhaps not so successful so we can do one one zero one this solution has these two boxes in and these two boxes are quite heavy so what we know with the limit being 15 if we add 9 and 7 that's 16 which is above the limit so what this solution returns is zero so this is a useless solution we don't want it and the goal is is to get over time the best combination of boxes so that we can get the most value without being overweight so now we know how to represent our solutions what we can do is we can make a population of solutions which for now will just be random so we got no idea how well they're going to score so this is our population that we've got to start with so this is eight randomly generated solutions so what we did previously when we defined the problem that's fitness function so if we were going to write that into code what would happen is we would be able to give it one of these solutions and it would return how good that solution is so what we can do is we can go through each one of these and we can get a score to say how fit this solution is we can put the scores here so let's just have a little table so basically they get the score of the value but if the weight's too high to get zero exactly right yep that's it so we've now got our score so we've got an idea of how good these solutions are it's similar to survival of the fittest so in general with this algorithm the better solutions tend to survive and propagate over time what we do now is once we've got our population and we've got the fitnesses of that population we do something called selection we have to select some of the population to go forward to the next step there are many ways we can do this one of the ways which is quite common is something called roulette wheel selection where each one of these solutions will get a chunk of an artificial roulette wheel and the bigger the score the bigger the chunk and we spin it generally if you have a high score it's like you've got a bigger chunk and you're more likely to be selected however that's not always the case we'll do a different one which is objectively quite good and um it's easier to implement and that's called tournament selection so what we're gonna do is we're gonna have a mini tournament and we're gonna randomly select two members of the population two solutions to have a little battle and the one with the highest score wins these are just picked randomly so we're going to pick this solution and this one for the first tournament okay and the winner of this tournament is this one so the best score here for this tournament was top one is that six and then we're going to have another tournament between say this one and this one so these two have a little battle and it's this one that wins we've now got our two parents which we will draw here so they were the lucky winners of the tournaments so we've got zero one zero one for the other parent we've got zero zero one one so then what we can do is we can do something called crossover which is when we cross over some information from parent a with some information from parent b and we're going to do this by basically splitting in half and just swapping each part over so the idea of this is that then we get two children which have a little bit of parent a and a little bit of parent b and when we're using a genetic care and we do this according to it to a rate so we can set a parameter at the start called the crossover rate a lot of playing with genetic algorithms is changing the the rates and population size so yes we're gonna swap over our two parents to create two children so we're gonna do the first two for the first one and the last two and then so then we will have two children now and we'll draw them down here so the first one zero one from here and then one one from this parent and then we'll have zero zero from this one and then zero one from up here so these are our two children so we've done crossover to get the children so now we've got one final um operator and that's mutation so we also have a mutation rate as well as a crossover rate and a mutation rate is typically a lot lower than the um crossover rate um so typically something like 0.05 0.01 0.1 something like that and what we do is we loop over every bit of information that we've got in the two children and for each one of these we pull a random number between zero and one if it's less than the mutation rate we flip a bit okay so this mimics mutation in nature the idea that sometimes you can just get random unexpected changes which adds the variance to the population which over time is quite useful so what we're going to do is we are going to flip just this bit so now this child is we started off with we had our population we went through tournament selection we had a little mini tournament we've got two parents we then crossed the parents over in this instance that doesn't always happen but it did here and then we've gone through mutation and mutation has flipped one of the bits from a one to a zero so these are now our two completed children once we have our children here what we do is we repeat this process until we get an equivalent amount of children that were in our original population so what we do is we go back to this stage and we've already calculated the fitnesses of every single member of the population so what we do is we just generate another random tournament so it won't be this one and this one or it could be but it will likely be a different tournament so we could do this one and this one for the first tournament and then say this one and this one for the second tournament and then we would generate another two children and then we go back and do the genetic operations on those two children and then if we've done that twice we'll have four children so we'd repeat that process until we have the same amount of children as we had in the population and then once we've swapped over the population of children we've created with the old population that's one generation okay and typically what we do is you run this for hundreds 250 500 000 generations and over time what you should see is the average fitness of the population gets much better and the best fitness gets a lot better and generally what you see is you see a curve that goes up and then it tapers off at the end and that's generally a good time to stop the algorithm i'm guessing this is normally done with a lot more parameters than four right yes so you're exactly right this problem is trivial it's more interesting if you add other parameters so if instead of having weight and value you could have weight value and size or additionally you have something like weight value and robustness so if it's like an aladdin's k full of jewelry um you'd want to take the ones that are lowest in weight highest in value but if two were the same in that respect you'd probably elect to take a piece of gold rather than a china pot because it might break in your rucksack so when you start having more parameters it becomes less easy for to see what exactly is a good solution to this problem and indeed how good a given box is because if some parameters are really high and some are really low you don't quite know where it exists and when you start having a hundred a thousand of these this is when um jayhawns can be quite useful they sometimes don't give you the optimum solution in fact they often don't but they give you a good idea of what direction it's going in so you can get a better grasp of the problem at hand this is a very vanilla genetic algorithm and there are some things that could be improved and one of them is if we look at this sheet here this is our original population where we had scores this solution here wasn't picked even though it was the highest so it was just randomly not picked in this instance but hypothetically it is possible for this solution solution not to be picked at all and not to not be passed to the next generation so there are some strategies which you can use to overcome this one of them is elitism which is where you basically take the best solution from the population and then you put it in the next generation anyway so you just remove one from the the new eight that we had the new population and then just put that one in here if we were going to use this legitimately to solve this problem there are some parameters in here which would be better if they were changed so for example the population size 8 is good visually but generally if we're going to start out using genetic algorithms we use a population size for something like 100 250 a thousand something like that because with this it's very hard to get variation partly because we've only got eight solutions but partly because the problem's so simple so if we did have value weight and then we had size because some shapes are better than others and robustness and then we had a hundred boxes i'd probably start with a population size of 500 and i'd have a crossover rate of 0.5 so crossover happens half the time and probably a mutation rate of 0.05 and i'd use a tournament selection because very rare you have a tournament selection size too it's not particularly useful i have a tournament selection of something like four or eight those parameters just loosely should help preserve the heterogeneity within the population to make sure that you have a rich space to look through because ultimately this is a search algorithm this is just going through this search space here to try and find the best solution and if you solve other functions you can actually plot how the jet algorithm is moving through a specific function you only have to work out whether it's worth alerting the user if you find the key so you know you download the temporary exposure key you perform the encryption you generate the potential rpis and you compare them with the ones you've seen or if you want a more slightly comprehensible message it's saying maybe you haven't applied a function to enough arguments

Original Description

Tournament selection, roulette selection, mutation, crossover - all processes used in genetic algorithms. Dr Alex Turner explains using the Knapsack Problem. https://www.facebook.com/computerphile https://twitter.com/computer_phile This video was filmed and edited by Sean Riley. Computer Science at the University of Nottingham: https://bit.ly/nottscomputer Computerphile is a sister project to Brady Haran's Numberphile. More at http://www.bradyharan.com

Watch on YouTube ↗ (saves to browser)

Sign in to unlock AI tutor explanation · ⚡30

Playlist

Uploads from Computerphile · Computerphile · 0 of 60

← Previous Next →

Follow the Cookie Trail - Computerphile

Follow the Cookie Trail - Computerphile

EXTRA BITS - Follow the Cookie Trail - Computerphile

EXTRA BITS - Follow the Cookie Trail - Computerphile

Musical Floppy Drives - Computerphile

Musical Floppy Drives - Computerphile

The Hair Algorithm - Computerphile

The Hair Algorithm - Computerphile

Getting Sorted & Big O Notation - Computerphile

Getting Sorted & Big O Notation - Computerphile

Quick Sort - Computerphile

Quick Sort - Computerphile

Hyper History and Cyber War - Computerphile

Hyper History and Cyber War - Computerphile

Entropy in Compression - Computerphile

Entropy in Compression - Computerphile

Original Elite on the BBC B - Computerphile

Original Elite on the BBC B - Computerphile

IP Addresses and the Internet - Computerphile

IP Addresses and the Internet - Computerphile

A Career in Video Games - Computerphile

A Career in Video Games - Computerphile

Error Detection and Flipping the Bits - Computerphile

Error Detection and Flipping the Bits - Computerphile

Programming BASIC and Sorting - Computerphile

Programming BASIC and Sorting - Computerphile

Birthplace of the World Wide Web - Computerphile

Birthplace of the World Wide Web - Computerphile

Punch Card Programming - Computerphile

Punch Card Programming - Computerphile

Programming Paradigms - Computerphile

Programming Paradigms - Computerphile

CERN Computing Centre (and mouse farm) - Computerphile

CERN Computing Centre (and mouse farm) - Computerphile

Error Correction - Computerphile

Error Correction - Computerphile

Home-Made Code - Computerphile

Home-Made Code - Computerphile

Security of Data on Disk - Computerphile

Security of Data on Disk - Computerphile

Gesture Controls - Computerphile

Gesture Controls - Computerphile

How Intelligent is Artificial Intelligence? - Computerphile

How Intelligent is Artificial Intelligence? - Computerphile

Encryption and Security Agencies - Computerphile

Encryption and Security Agencies - Computerphile

Virtual Machines Power the Cloud - Computerphile

Virtual Machines Power the Cloud - Computerphile

Hacking Websites with SQL Injection - Computerphile

Hacking Websites with SQL Injection - Computerphile

How Huffman Trees Work - Computerphile

How Huffman Trees Work - Computerphile

Cracking Websites with Cross Site Scripting - Computerphile

Cracking Websites with Cross Site Scripting - Computerphile

Cloud Computing (Cloudy with a Chance of Pizza) - Computerphile

Cloud Computing (Cloudy with a Chance of Pizza) - Computerphile

Texting Cabbage with a Recorder - Computerphile

Texting Cabbage with a Recorder - Computerphile

Hashing Algorithms and Security - Computerphile

Hashing Algorithms and Security - Computerphile

How YouTube Works - Computerphile

How YouTube Works - Computerphile

How NOT to Store Passwords! - Computerphile

How NOT to Store Passwords! - Computerphile

A New Golden Age of Video Games - Computerphile

A New Golden Age of Video Games - Computerphile

A Universe of Triangles - Computerphile

A Universe of Triangles - Computerphile

Cross Site Request Forgery - Computerphile

Cross Site Request Forgery - Computerphile

The True Power of the Matrix (Transformations in Graphics) - Computerphile

The True Power of the Matrix (Transformations in Graphics) - Computerphile

The Great 202 Jailbreak - Computerphile

The Great 202 Jailbreak - Computerphile

EXTRA BITS - Printing and Typesetting History - Computerphile

EXTRA BITS - Printing and Typesetting History - Computerphile

Triangles to Pixels - Computerphile

Triangles to Pixels - Computerphile

The Problem with Time & Timezones - Computerphile

The Problem with Time & Timezones - Computerphile

The Visibility Problem - Computerphile

The Visibility Problem - Computerphile

Lights and Shadows in Graphics - Computerphile

Lights and Shadows in Graphics - Computerphile

The Penguin Barcode - Computerphile

The Penguin Barcode - Computerphile

Typesetters in the '80s - Computerphile

Typesetters in the '80s - Computerphile

The Font Magicians - Computerphile

The Font Magicians - Computerphile

The Little Mac with the Big Bite - Computerphile

The Little Mac with the Big Bite - Computerphile

EXTRA BITS - More on the Original Mac at 30 - Computerphile

EXTRA BITS - More on the Original Mac at 30 - Computerphile

XP to Ubuntu with an 8yr old Hacktop - Computerphile

XP to Ubuntu with an 8yr old Hacktop - Computerphile

EXTRA BITS - Hacktop Real-Time Boot Comparison - Computerphile

EXTRA BITS - Hacktop Real-Time Boot Comparison - Computerphile

EXTRA BITS - Making a Bootable USB in Linux - Computerphile

EXTRA BITS - Making a Bootable USB in Linux - Computerphile

EXTRA BITS - Installing Ubuntu Permanently - Computerphile

EXTRA BITS - Installing Ubuntu Permanently - Computerphile

The Dawn of Desktop Publishing - Computerphile

The Dawn of Desktop Publishing - Computerphile

What is Bootstrapping? - Computerphile

What is Bootstrapping? - Computerphile

Reverse Polish Notation and The Stack - Computerphile

Reverse Polish Notation and The Stack - Computerphile

Home-Made Z80 Retro Computer - Computerphile

Home-Made Z80 Retro Computer - Computerphile

Should Everybody Learn to Code? - Computerphile

Should Everybody Learn to Code? - Computerphile

Programming in PostScript - Computerphile

Programming in PostScript - Computerphile

Heartbleed, Running the Code - Computerphile

Heartbleed, Running the Code - Computerphile

YouTube's Secret Algorithm - Computerphile

YouTube's Secret Algorithm - Computerphile

YouTube Search & Discovery - Computerphile

YouTube Search & Discovery - Computerphile

This video teaches how to use Genetic Algorithms to solve the Knapsack Problem, a classic problem in computer science, and demonstrates the use of techniques such as tournament selection, roulette selection, mutation, and crossover. The video provides a comprehensive introduction to Genetic Algorithms and their application to optimization problems. By watching this video, viewers can learn how to apply evolutionary computation to complex problems and use Genetic Algorithms for unsupervised and s

Key Takeaways

Represent the Knapsack Problem as a binary string of 0s and 1s
Define a fitness function to evaluate the quality of a solution
Generate a population of solutions randomly
Evaluate each solution in the population using the fitness function
Evolve the population over time to find the best combination of boxes that maximizes value without exceeding the weight limit
Use tournament selection or roulette wheel selection to select parents
Perform crossover to combine information from parents
Perform mutation to introduce random changes

💡 Genetic Algorithms can be used to solve complex optimization problems such as the Knapsack Problem by iteratively generating new populations and selecting the fittest individuals.

🔒 Pro feature: Ask AI to explain this lesson →

More on: ML Maths Basics

View skill →

Important Steps I Have Followed To Improve My Data Science Skills- Sharing My Experience

Important Steps I Have Followed To Improve My Data Science Skills- Sharing My Experience

Learn Python FAST for Beginners 🚀#coding #conditionals #loops #functions

Learn Python FAST for Beginners 🚀#coding #conditionals #loops #functions

ChethanAIChronicles

“Hello, world” from scratch on a 6502 — Part 1

“Hello, world” from scratch on a 6502 — Part 1

PCA (Principal Component Analysis) in Python - Machine Learning From Scratch 11 - Python Tutorial

PCA (Principal Component Analysis) in Python - Machine Learning From Scratch 11 - Python Tutorial

ROC and AUC in R

ROC and AUC in R

StatQuest with Josh Starmer

Data Science Fundamentals: Data Cleaning in Python

Data Science Fundamentals: Data Cleaning in Python

Related Reads

Advanced Stack ApplicationsData Structures and Algorithms Deep‑Dive — Advanced Stack Applications…

Learn advanced stack applications in data structures and algorithms to improve coding skills

Medium · Programming

The Minecraft anvil is a tree-cost optimization problem in disguise

Optimize tree costs in Minecraft using graph theory and algorithms, just like the anvil repair system

KMP Algorithm (Knuth-Morris-Pratt): The Smart Way to Perform String Matching in O(N)

Learn the KMP algorithm for efficient string matching in O(N) time complexity and improve your coding skills

Dev.to · Jaspreet singh

Every Backtracking Problem Is the Same Three Lines. I Just Couldn't See the Tree.

Master backtracking problems with a simple three-line approach to improve problem-solving skills in coding interviews and challenges

Dev.to · Alex Mateo

Stump Grinder Carbide Wheel Grinds Hardwood To Chips

Innoforge Studio