George Hotz | Programming | Decision Transformer Reinforcement Learning (RL) | LunarLander | Part 1
Date of the stream: 6 Jan 2024.
Buy the comma 3X from $1250: https://comma.ai/shop/comma-3x & the best ADAS system in the world: https://openpilot.comma.ai
Original stream title:
- tinygrad: rewriting the scheduler
Sources:
- https://arxiv.org/pdf/2106.01345.pdf
- https://huggingface.co/blog/decision-transformers
- https://medium.com/@jscriptcoder/demystifying-upside-down-reinforcement-learning-a-k-a-ꓤ-b7bd4214b33f
- https://youtu.be/xc0jGZYFQLQ
tinygrad bounties:
- https://docs.google.com/spreadsheets/d/1WKHbT-7KOgjEawq5h5Ic1qUWzpfAzuD_J06N1JwOCGs/
Follow for notifications:
- https://twitch.tv/georgehotz
Support George:
- https://twitch.tv/subs/georgehotz
Pre-order tinybox:
- https://buy.stripe.com/5kAaGL6lk9uX9nW144 (https://tinygrad.org/)
Chapters:
00:00:00 lunarlander_transformer.py
00:04:25 twitch substance warning
00:06:00 perplexity decision transformer
00:12:00 assert not x.requires_grad
00:15:00 192 % start_pos
00:21:45 food
00:24:25 fixes needed in tinygrad
00:41:00 gpt2 works
00:46:40 contraction not explained
00:55:00 rant
01:00:25 Ron Paul
01:04:40 usa population pyramid
01:05:30 jit
01:08:55 africa documentaries
01:13:15 cross
01:19:00 not supported 768 %
01:23:20 do things team
01:24:50 tinygrad intern phone call
01:28:50 postmodernism
01:36:40 assert t.grad is not None
01:38:30 advice, schedule
01:43:20 decision transformer paper
01:53:00 not balancing
02:05:00 K=20
02:10:00 plt.show()
02:15:30 clip 50
02:19:00 lunarlander fails
02:20:00 uber eats scam
02:27:00 decision transformers on Hugging Face
02:31:30 logits
02:45:00 temperature
02:54:00 should never output 2
03:11:40 so many bugs
03:12:40 good idea from chat
03:15:00 lunarlander is not landing
03:16:30 128 clip
03:17:00 highest_reward bug
03:18:50 lunar lander rewards
03:24:30 let's make it work
03:29:00 unknown change
03:31:40 piano
03:34:20 reinforcement learning is impossible
03:37:25 write gym environment
03:50:00 stupid decision transformer
03:57:20 98%
03:58:50 that is what we get for smoking weed