LeDeepChef 👨‍🍳 Deep Reinforcement Learning Agent for Families of Text-Based Games

Yannic Kilcher · Advanced ·📄 Research Papers Explained ·6y ago


Key Takeaways

The LeDeepChef agent uses deep reinforcement learning to play a family of text-based cooking games, handling natural-language observations, inventory management, and navigation. Its architecture combines a policy network, a value network, and a recipe model. The policy and value networks are trained with advantage actor-critic learning and an entropy penalty, while the recipe model is trained with supervised learning.
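To make the training terms named above concrete, here is a minimal single-step sketch of an advantage actor-critic loss with an entropy term. It is not the authors' implementation: the function name `a2c_losses`, the coefficients `value_coef` and `entropy_coef`, and the sign convention for the entropy term are assumptions chosen for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def a2c_losses(scores, action, value, ret, value_coef=0.5, entropy_coef=0.01):
    """Loss terms for one step (hypothetical helper, coefficients are assumptions).

    scores: unnormalised policy scores for the K candidate commands
    action: index of the command that was actually taken
    value:  the value network's estimate for the current state
    ret:    the observed (discounted) return from this state
    """
    probs = softmax(scores)
    advantage = ret - value                              # better or worse than expected?
    policy_loss = -math.log(probs[action]) * advantage   # log-prob weighted by the advantage
    value_loss = (ret - value) ** 2                      # pull the value estimate toward the return
    entropy = -sum(p * math.log(p) for p in probs if p > 0)
    # Subtracting the entropy keeps the policy from collapsing onto one command too early.
    total = policy_loss + value_coef * value_loss - entropy_coef * entropy
    return {"policy": policy_loss, "value": value_loss, "entropy": entropy, "total": total}


out = a2c_losses(scores=[0.0, 0.0, 0.0, 0.0], action=0, value=0.5, ret=1.0)
print(out)
```

In a real agent the advantage is treated as a constant when differentiating the policy term, and the losses are averaged over many steps; both details are left out here to keep the sketch short.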

Full Transcript

Hi there. Today we're looking at "LeDeepChef: Deep Reinforcement Learning Agent for Families of Text-Based Games" by Leonard Adolphs and Thomas Hofmann. This is a paper about engineering an agent for a particular family of tasks. That is different from reinforcement learning agents that are just good at one game, let's say Pong, or even, I guess, things like StarCraft, though it kind of depends on what you mean by "game".

So what are we talking about here? These are text-based games where the goal is to cook recipes. Let's just jump in and see what goes on. The game starts by telling you that you are hungry, let's cook a delicious meal, and so on. The objective is always the same: find the cookbook, read the recipe that's in it, collect all the things that are in the recipe, prepare them in certain ways that are also specified by the recipe, and then at the end you have a meal. You can eat the meal, and that will give you points.

But since it's a text-based game, the input doesn't come structured; it comes as natural text. The game tells you, for example, "kitchen", so basically you're in the kitchen: you are now in the kitchen, I guess you better just go and list everything you see here, you hear a noise, you spin around. So the kind of input you get from the game is very playful and has a lot of descriptive elements. Sometimes it's like you see a closed oven, you make out a table, then on the counter you can make out a sliced fried red hot pepper, and so on. It's very much non-trivial to parse this in a traditional way. If you were to go about this by simply writing an algorithm that extracts things, it's very hard, because for example you might see that there's an oven, but it's a closed oven. "You make out a table" is kind of a synonym for "you see a table", meaning there is a table. You can make out a sliced fried red hot
pepper, and here it's important that you not only realize there is a red hot pepper but also that its state is sliced and fried. This is important because you need all ingredients in a certain state. Here you examine the stove, so there is a stove. All these things you need to understand.

If there is a recipe book in the room, then you can examine the recipe, and that's the command. The arrows here always indicate a user command, and these you have to type. That's the next thing your agent needs to do. You can't select from a predefined set of actions; you actually need to type in what you want to do. And there are a lot of possibilities for what you could type. Even if you restrict it to what you know the game accepts, there are still so many actions. It's way different from, for example, Atari games, which always have something like eight actions: there are eight buttons you could possibly press, and that's it. Here there are combinatorially many things you can do: you can prepare and take, combined with all the ingredients, and you don't know which ingredients will come.

So here you examine the recipe. Let's look at a recipe. It says you open the recipe and start reading, recipe number one. Here are the ingredients: red hot pepper. For right now that's just one ingredient. Then there are directions, so what do you need to do: slice the red hot pepper, fry the red hot pepper, and prepare the meal. Those are the directions of the recipe.

You also have this inventory command, which tells you what you're carrying. Next difficulty: the inventory is finite, so you can't carry everything. At some point you have to drop things that are unnecessary; you can't just take everything. Here you see the command take
red hot pepper. That only works if there's a red hot pepper in the room. And here it says you take the red hot pepper from the counter, and your score has just gone up by one point. Then if you type inventory, it says you're carrying a sliced fried red hot pepper. Again, it gives the state of the ingredient: the ingredient is the red hot pepper, and the state is sliced and fried. Then you can prepare the meal, and then you can eat the meal, and it says your score has just gone up by one point. These are the scores you collect.

There are a lot of difficulties that are not shown in this example. For example, there are different rooms. You may have noticed that here you're in the kitchen, but there could be other rooms, and you start in a random room. You also need to navigate through the rooms. The doors to other rooms could be closed, and then you need to open them, and so on. If this pepper here weren't already sliced and fried, you could only slice it if there is a knife in the room, and you could only fry it if there is a frying pan or an oven, sorry, a stove, in the room. So you'd have to notice that there is a knife; if there is no knife, you'd need to take the red hot pepper, bring it to a new room with a knife, and then slice it. So this is a vastly difficult game.

The last difficulty is that in the test set there will be ingredients that you haven't seen during training, so your agent needs to generalize. That's why it says a family of text-based games: the objective is always the same, to cook the recipe, but the things you have to do and the things that appear change from episode to episode, and the test set will be different from the training set, or at least there will be unseen data.

All right, so how does this paper go about solving this problem? This paper basically does the following, and we are going here from
high level to low level. On the highest level it's a reinforcement learning agent, and it's sort of how you would imagine an RL agent to work. Here at the end you have a policy, and the policy predicts an action. If you don't know what a policy and an action are in RL, these are the basic RL concepts; we'll skip them here and I'll assume everyone knows what they are. But essentially a policy specifies which action you take next given the current game state.

The policy scores different actions. At each step there are K actions available. As I said, there are almost infinitely many actions that you could take, so the first difficulty, and that's the thing that comes in here, is to reduce all of the possible actions, which you can't even list, to just K commands. We'll go into how this is done later, but one of the main contributions of this paper is how you even specify what would be reasonable to do in the current situation. The policy over here then only has to decide among those reasonable actions, not among all actions.

Given that you have K reasonable commands, you see here command one and so on coming in. These are embedded and then fed into GRUs, which are recurrent neural networks, so for each of these commands you get a 32-dimensional vector. These 32-dimensional vectors, here c_1 through c_K, are each combined with an encoding of the current state, which is 256-dimensional, and then fed into a neural network that outputs a probability distribution over these actions. This is pretty classic in deep reinforcement learning: you have an action encoding and a state encoding, and the policy decides based on those. The state encoding, you'll see, is the same everywhere, of course, because the current game state is the current game state. This comes from this
model up here. What this does is the following: over here you have what you would call the state, the current observation, and the current observation is composed of many things, specifically the following eight things.

The first one is actually called "observation". From an RL perspective I would call all of this the current observation, but the first item is itself called observation: it's whatever you saw, the big text you saw before, like you are in the kitchen, it looks like this, it smells like this, you turn around, and so on. It's what the game engine says at the current time step; it's just a piece of text.

Second, missing items. Third, unnecessary items. You might wonder how I know which items are missing and unnecessary. These come from another model that this paper trains, and we'll get into that later, but basically they have a method of specifying which items are still missing and which are unnecessary, and they list those here.

Then the description, which is the output of the last look command. In each room you can type look, and it will give you a description of the room and what's in there.

Then the previous commands. This is often used in RL, either explicitly or implicitly through a recurrent network, to give the agent an idea of what happened in the previous steps, or what it did, so that it doesn't repeat actions unnecessarily, or so that it learns to repeat actions when that is necessary.

Then the required utilities. Again, this is a model that tries to predict which utilities are required to perform some actions. As I said before, if you want to slice the red hot pepper you need a knife; if you want to fry it you need a stove.

Then the discovered locations. As I said, there are different rooms, and you don't know which rooms there are before you go in there, so before you go through a door and reach another room. So the list of previously discovered and visited locations is there, and then the name
of the current location is also there. So these are the eight things that make up the current observation. These eight things are just strings of text, and, as you can see here, each of them, from observation to location, is embedded and also fed into an RNN. For each of these eight things you obtain a 32-dimensional vector, and these are all concatenated to make up one big 256-dimensional vector. This 256-dimensional vector will contain all the necessary information about the current room: what's in there, what items you are still missing, what items you have in your inventory, which ones are unnecessary, and so on. If you train this correctly, this 256-dimensional vector will describe the current game state as it is relevant to your agent; every relevant piece of information will be encoded in this vector.

Now, this vector isn't the final state encoding yet. You feed it into an RNN that also takes input from the last time step. You have to imagine that in the last time step there was already an observation, and so on; I'm just copying this box over here. This entire thing was already done in the last step and already fed into an RNN. So this is an RNN that goes over time: whatever the output here is, it will be fed to the next step. This is a trick often done in reinforcement learning: you have a recurrent neural network over the time steps. At each time step you have a certain observation, you encode it, and then you feed this into an RNN. What the RNN can learn to do is react not only to the current observation but to the current observation conditioned on the history of previous observations. So it can learn: aha, before I was in this room, now I am in this new room, so I actually haven't, you know,
taken all the items from this room yet, because I just came into this room, and so on. So the component where you are able to look at what happened in the past is captured by this RNN here.

It's a fairly complicated architecture, but this state encoding, which is also conditioned on the history, then goes in here. That's the vector that is combined with each action, so with each of these K actions, and this is all fed through a neural network, and that gives you the policy. It looks very complicated, but if you look at it, it's actually not too difficult. You take your observations here, this is all observation; it is encoded and combined with the history to give you an encoding of the current state. On the other hand, you take all of the possible commands that you could perform right now, encode each one separately into an embedding, and then combine each of those with the state encoding from before. From that you make your decision about which action to take next. The action that's output here, the one you actually take next, is sampled from this policy.

The last thing you need is a value network, and this is important for reinforcement learning. It tells you, from this state here (I'm getting weird with colors here), which is the same state as this one, so you simply transfer it over: how valuable is that state? The value is: if I'm in this state and I act as I normally act, what are all my future rewards going to be, combined? So it basically gives you a value of this state. You can think of this in terms of chess: this h_t would be a description of the chessboard, and the value would be how valuable this position is for you. So if
you're very much ahead in material and position and so on, this value would be very high; if you're behind, this value would be very low. And this is a neural network simply trying to predict that value.

With all of this you now have a good basis to do reinforcement learning. You have a policy, you have a value network, and from that you can train a neural agent. This is done classically in an actor-critic way where you do advantage learning: here is the advantage, and you train the policy weighted by the advantage. Then you train the value network to be close to the reward, and then you have an entropy penalty. If you don't know what these things are, the video would get a bit too long if I were to go over these reinforcement learning concepts, but they are very standard. Basically, you can train these neural networks in the absence of labeled training data, because you don't know what the best action is in each step. There's no one telling you. You just have a reward: sometimes you get a point, and you don't know which actions led to that. These things allow you to train the neural networks using just the reward, without knowing which exact actions were right and wrong, and that's the core of reinforcement learning.

All right, one of the core ingredients is this recipe manager. The recipe manager is a sub-model that does the following. It takes as input the cookbook here, and it also takes as input the inventory, and it outputs something like this; this is a table representation of what it outputs. It outputs all the ingredients that you need for the recipe, whether or not each ingredient is currently missing from your inventory, and the actions to perform, so which actions still need to be performed. Let's look at this example. The recipe tells you the
ingredients are a carrot, a red hot pepper, and a white onion, and the inventory says you're carrying a white onion and a carrot. So down here you see: aha, we do actually have a carrot, so the carrot isn't missing; you have it in your inventory. The red hot pepper is missing: we don't have it in the inventory, but we need it for the recipe. The white onion we need for the recipe, but it's not missing.

Then, for each of the ingredients, this recipe model is also supposed to tell you what you still need to perform on it. Here the recipe says slice the carrot, roast the carrot, and you simply have a carrot. The inventory doesn't say sliced or roasted, which means it's not sliced or roasted. So the recipe model is supposed to output that you still need to slice and roast the carrot. For the white onion, for example, the recipe says fry the white onion, and as you can see, the inventory says you're carrying a fried white onion. So for the white onion we don't need to do anything anymore.

So the recipe model is basically trying to make this table here, and you can see this table as an intermediate step in order to do all the other things. The difference to a pure RL method, and this is important, is that this intermediate table representation is made explicitly. The recipe model really produces a table like this. In other RL methods, people make this recipe model output, let's say, a 200-dimensional vector that's supposed to encompass all of this information, and that doesn't appear to work as well. Often, if you simply train this end to end, it will not pick up on the important information, because the training signal tends to be way too weak. You have to imagine you already have this really big model construction here, and you're trying to learn it from a tiny reward signal that you get at the end. This
is a very noisy signal. If you now say: well, the inputs to these things, this command here, and we also saw the inputs to these, depend on this recipe model, which is also some giant neural network construction, and we'll train all of this end to end, and these will not be text but some sort of latent vectors, then that will often fail, because you're trying to extract information from too noisy a reward signal.

So the authors here make a pretty neat separation of that, and they train this recipe model with an augmented data set. They go to Freebase and get more food items, then they construct a data set that resembles this and train the model in a supervised way to output tables like this. This is pretty smart, and I think it's a good lesson if you ever attempt something like this: for really important information such as this, if you can train it in a supervised way as a kind of pre-processing step to your RL procedure, that's extremely helpful.

Here you can see how this is then used. By combining the table that was output from the recipe model with your inventory and the output of the look command, you can generate these commands. Before, we said it's important to reduce everything you could do, which is infinitely many things, to everything that is reasonable to do currently, and this model here does that. Given the table and given the description of what's currently in the room, you can generate these commands. For example, take knife, if you have to slice something, because you see a knife is in the room and you could conceivably take the knife. And since you know what's in your inventory and which things are still missing, you can generate commands like take the white onion, or drop the water, because you don't need the water. The authors also
group these things into high-level commands. Take all required items simply means take everything that's in the room that is not in the inventory but that you need. For an RL agent it makes sense to group these together, because it doesn't make sense to have them as two separate actions: if you need both, take both. Likewise, if you have unnecessary things in your inventory, drop all of them. That's a small optimization that apparently brought some gains, but the overarching message here is that once you have this information from the recipe model, you can use it in many ways to make life easier for your RL agent.

All right, so that is the entire model. It's quite convoluted, but basically you start with this recipe manager, and you output this table down here: which ingredients are in the recipe, are they still missing, and which actions do we need to perform. You then combine it with the information about the current room and your inventory to come up with a set of commands that are conceivable to do. You combine these commands with some commands that are always available, things like look and inventory. Prepare meal you add if the recipe manager does not output any missing items and the agent's location is the kitchen.

You also add the navigational commands, and we're not even going to get into that, because there are doors in these rooms and you need to navigate around. They actually train another model to detect the directions that you could move in, and to open doors for every closed door in the room. So that's another challenge that the agent needs to overcome: they have to build an entire model to predict which doors are there, are they closed, do you need to open them. So
these commands, if there are doors and you can move through them, are also added to this set of reasonable commands. So now we have a set of reasonable commands over here. Then you describe the room here, you put both into these embeddings, and finally your policy outputs an action. That's the entire process. It's very convoluted, very big, and it's quite astonishing that this works with RL, but in order to get it to work you actually need to do this supervised training.

The experimental evidence here is quite solid, in that they compare to baseline systems that use classic techniques, and they do some ablations over their individual parts. They got second place, I think, in a competition about these text-based games, so that's pretty good. That was it from me. Check it out, and bye bye.
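The policy step described in the transcript, where each of the K command encodings is combined with one shared state encoding and the result is turned into a probability distribution, can be sketched roughly as follows. The dimensions 32 and 256 come from the talk; everything else is an assumption for illustration: the vectors are random stand-ins for the outputs of the recurrent encoders, the scorer is a single linear layer rather than the paper's network, and "go east" is a hypothetical navigation command.

```python
import math
import random

STATE_DIM = 256  # size of the state encoding mentioned in the talk
CMD_DIM = 32     # size of each command encoding mentioned in the talk


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def command_distribution(state_vec, command_vecs, weights):
    """Score every (state, command) pair and normalise over the K commands."""
    scores = []
    for cmd_vec in command_vecs:
        joint = state_vec + cmd_vec  # list concatenation: 256 + 32 = 288 values
        scores.append(sum(w * x for w, x in zip(weights, joint)))
    return softmax(scores)


rng = random.Random(0)
# Assumption: a random linear scorer stands in for the trained policy network.
weights = [rng.uniform(-0.1, 0.1) for _ in range(STATE_DIM + CMD_DIM)]
# Assumption: random vectors stand in for the learned state and command encodings.
state = [rng.uniform(-1, 1) for _ in range(STATE_DIM)]
commands = ["take red hot pepper", "drop water", "go east"]
command_vecs = [[rng.uniform(-1, 1) for _ in range(CMD_DIM)] for _ in commands]

probs = command_distribution(state, command_vecs, weights)
chosen = rng.choices(commands, weights=probs, k=1)[0]  # sampled, not argmax
print(chosen, [round(p, 3) for p in probs])
```

Because the same state vector is paired with every command, the number of candidate commands K can change from step to step without changing the scorer's input size.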

Original Description

The AI cook is here! This agent learns to play a text-based game where the goal is to prepare a meal according to a recipe. Challenges? Many! The number of possible actions is huge, ingredients change and can include ones never seen before, you need to navigate rooms, use tools, manage an inventory and sequence everything correctly and all of this from a noisy textual description that the game engine throws at you. This paper mixes supervised explicit training with reinforcement learning in order to solve this task. Abstract: While Reinforcement Learning (RL) approaches lead to significant achievements in a variety of areas in recent history, natural language tasks remained mostly unaffected, due to the compositional and combinatorial nature that makes them notoriously hard to optimize. With the emerging field of Text-Based Games (TBGs), researchers try to bridge this gap. Inspired by the success of RL algorithms on Atari games, the idea is to develop new methods in a restricted game world and then gradually move to more complex environments. Previous work in the area of TBGs has mainly focused on solving individual games. We, however, consider the task of designing an agent that not just succeeds in a single game, but performs well across a whole family of games, sharing the same theme. In this work, we present our deep RL agent--LeDeepChef--that shows generalization capabilities to never-before-seen games of the same family with different environments and task descriptions. The agent participated in Microsoft Research's "First TextWorld Problems: A Language and Reinforcement Learning Challenge" and outperformed all but one competitor on the final test set. The games from the challenge all share the same theme, namely cooking in a modern house environment, but differ significantly in the arrangement of the rooms, the presented objects, and the specific goal (recipe to cook). To build an agent that achieves high scores across a whole family of games, we use an acto




Key Takeaways

Implement a policy network to predict actions given the current game state
Use a value network to predict the value of the state
Train the policy and value networks using advantage actor-critic learning with an entropy penalty
Implement a recipe model, trained with supervised learning, that outputs a table of ingredients and the actions still needed
Combine the output of the recipe model with inventory and room information to generate commands
Detect navigational items such as doors and their status
Evaluate the performance of the agent in text-based games
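The recipe table and the command generation step from the list above can be illustrated with a small rule-based sketch. In the paper the recipe manager is a learned model that reads raw text; here the recipe and inventory are assumed to be already parsed into dictionaries, and the names `recipe_table`, `candidate_commands`, and the `DONE` verb mapping are hypothetical. The carrot and white onion values follow the example in the talk; the actions for the red hot pepper are assumed.

```python
# Assumed mapping from a recipe verb to the state word shown in the inventory.
DONE = {"slice": "sliced", "dice": "diced", "chop": "chopped",
        "fry": "fried", "roast": "roasted", "grill": "grilled"}


def recipe_table(recipe, inventory):
    """One row per ingredient: is it missing, and which actions are still to do."""
    rows = []
    for ingredient, verbs in recipe.items():
        states = inventory.get(ingredient)  # None means we do not carry it
        done = states or set()
        todo = [v for v in verbs if DONE[v] not in done]
        rows.append({"ingredient": ingredient, "missing": states is None, "todo": todo})
    return rows


def candidate_commands(rows, inventory, room_items, location):
    """Reduce the open-ended command space to a short list of reasonable commands."""
    needed = {row["ingredient"] for row in rows}
    commands = ["look", "inventory"]  # always available
    for row in rows:
        if row["missing"] and row["ingredient"] in room_items:
            commands.append("take " + row["ingredient"])
    if "knife" in room_items and any("slice" in row["todo"] for row in rows):
        commands.append("take knife")
    for item in inventory:
        if item not in needed:
            commands.append("drop " + item)
    # As described in the talk: only offered in the kitchen once nothing is missing.
    if location == "kitchen" and not any(row["missing"] for row in rows):
        commands.append("prepare meal")
    return commands


recipe = {"carrot": ["slice", "roast"],
          "red hot pepper": ["slice", "fry"],  # assumed actions for this ingredient
          "white onion": ["fry"]}
inventory = {"white onion": {"fried"}, "carrot": set(), "water": set()}
room_items = {"red hot pepper", "knife"}

rows = recipe_table(recipe, inventory)
commands = candidate_commands(rows, inventory, room_items, location="kitchen")
print(rows)
print(commands)
```

Keeping the table as an explicit data structure, rather than a latent vector, is the design choice the talk highlights: each downstream rule can read it directly, and it can be checked by hand.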

💡 The LeDeepChef agent's architecture, which includes a policy network, value network, and recipe model, allows it to effectively handle natural language input, inventory management, and navigation in text-based games.
