From T5 to T5X: A Game-Changing Evolution with JAX & FLAX
Key Takeaways
The video discusses the evolution of the T5 model into T5X, which reimplements T5 on JAX and Flax, and its application in Google Search, along with a code implementation using Google Colab notebooks and TPUs.
Full Transcript
Hello community, let's talk about Transformer-based language models and the current state of affairs. We look at the T5X model and, of course, the PaLM model by Google. In the last video I showed you that BERT and GPT are more or less the encoder stack of the Transformer architecture (this is BERT) and the decoder stack of the Transformer architecture, which became GPT by OpenAI. Now, with GPT being unidirectional, we found out that we can use the decoder stack of the Transformer as a language model, that is, a model trained purely for the task of next-word prediction, while BERT, on the other hand, is bidirectional. It is an encoder-only stack, and it is great at making a single prediction per input token, so it is great for classification tasks.

Currently, if you ask, we have T5, and this is a multitask unified model, or MUM for short, and this is the technology that powers Google Search today, if you believe current research publications. To be specific, since mid-2022 you have T5X: the Text-to-Text Transfer Transformer in JAX and Flax, utilizing the compute infrastructure by Google. If you want to know more about JAX, I have a video about JAX, TensorFlow and PyTorch, and if you want to know about the hardware configuration of the H100 GPUs by NVIDIA or the ASICs by Google, I also have a video for this.

Now, in July 2020 we had the publication by Google about its T5 architecture and the ideas behind it. In a preprint publication they focused specifically on transfer learning in natural language processing. They say we often have this pre-training part with unsupervised learning on unlabeled data, because we have so much data free from the internet, and they examined this transfer learning setup, where a model is first pre-trained on a data-rich task before the system is fine-tuned on a downstream task. Now, the actual downstream tasks are question answering, document summarization and sentiment classification, and their idea was to put everything in
a text-to-text framework, and they wanted to explore the limits of transfer learning. Of course they used a Transformer architecture. This means, more or less, that they used on the encoder stack something like BERT's MLM (masked language modeling): they dropped out 15% of the tokens in the input sequence, and they even cared about reducing the computational cost of pre-training their model.

But the great thing about Google is that they open-sourced it. You have here the GitHub repository by Google Research where they provide you the code, and as you see it was updated just two weeks ago. They even provide you with a free Google Colab notebook where you can experience T5 yourself; of course not the full-fledged half-trillion-parameter model, but the smaller models that fit within the Google Colab notebook.

And here we are now in our Colab notebook from Google about T5. As you can see, it is about fine-tuning Text-to-Text Transfer Transformers for closed-book question answering, and you have here all the code. You see how you set it up on a TPU in a few easy steps, then you have Natural Questions, and you have the code for this in detail. If you want, I can make a video going with you step by step through the code, but it is really easy to implement. For the transfer to new tasks they explain in detail how it is done and how it is coded, a very nice implementation. You have the expected results, how you evaluate your model, of course, and they give you all this code to play around with. And then, of course, most important, the predict functionality of the model: you have here your question that you can define, and you see the output here of the T5 model. Of course, since you are working on a free Google Colab notebook, you cannot use the largest half-trillion-parameter model, but the smaller T5 implementations also show you what you can achieve with the current open-source T5 code. It is free for you to explore here on a free Google Colab notebook.
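The 15% token dropout mentioned above can be illustrated with a small, library-free sketch. This is a toy version of a T5-style denoising objective, not the code from the Google notebook: the function name `span_corrupt`, the whitespace tokenization, and the `<extra_id_N>` sentinel naming (borrowed from common T5 vocabularies) are assumptions made for illustration.

```python
import random


def span_corrupt(tokens, corruption_rate=0.15, seed=0):
    """Toy denoising example: drop roughly `corruption_rate` of the tokens.

    Each run of consecutive dropped tokens is replaced in the input by one
    sentinel; the target lists each sentinel followed by the tokens it hides,
    and ends with one extra closing sentinel.
    """
    rng = random.Random(seed)
    # Number of tokens to drop (at least one, never more than we have).
    n_drop = min(len(tokens), max(1, round(len(tokens) * corruption_rate)))
    drop = set(rng.sample(range(len(tokens)), n_drop))

    inputs, targets = [], []
    sentinel = 0
    prev_dropped = False
    for i, tok in enumerate(tokens):
        if i in drop:
            if not prev_dropped:
                # Start of a new dropped run: emit a fresh sentinel on both sides.
                marker = f"<extra_id_{sentinel}>"
                inputs.append(marker)
                targets.append(marker)
                sentinel += 1
            targets.append(tok)
            prev_dropped = True
        else:
            inputs.append(tok)
            prev_dropped = False
    # Closing sentinel marks the end of the target sequence.
    targets.append(f"<extra_id_{sentinel}>")
    return inputs, targets


if __name__ == "__main__":
    words = "Thank you for inviting me to your party last week".split()
    corrupted, target = span_corrupt(words)
    print("inputs :", " ".join(corrupted))
    print("targets:", " ".join(target))
```

The model would then be trained to generate the target string from the corrupted input, which is what makes the pre-training step itself a text-to-text task.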
So, summarizing, we can say T5X combines pre-training and fine-tuning for specific tasks: it is pre-trained on a multitask mixture before fine-tuning for a specific task, and Google itself claims it is 1,000 times more powerful than BERT. I can verify this, but we have a question now: what about the future of this, especially if you think about conversational AI tools like ChatGPT? Now, the answer is easy, but before I give you the answer in my next video, I want to show you that T5 also evolved into Flan-T5. I have two videos for you where I show you the code and the tuning of the hyperparameters. I hope you enjoyed this, and I see you in my next video.
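The text-to-text framing and the multitask mixture mentioned in the summary can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the helper names `to_text_to_text` and `sample_mixture` are hypothetical, the toy examples are made up, and weighting each task by its dataset size is one simple mixing choice rather than the exact recipe used for T5 or T5X.

```python
import random


def to_text_to_text(task_prefix, text, label):
    """Cast any task into an (inputs, targets) pair of plain strings."""
    return {"inputs": f"{task_prefix}: {text}", "targets": label}


# Toy datasets keyed by task prefix (illustrative data, not from the video).
TOY_TASKS = {
    "translate English to German": [("That is good.", "Das ist gut.")],
    "summarize": [
        ("The storm closed roads across the county on Monday.",
         "Storm closes roads."),
    ],
    "sst2 sentence": [
        ("A charming and often affecting film.", "positive"),
        ("A dull and lifeless two hours.", "negative"),
    ],
}


def sample_mixture(tasks, num_examples, seed=0):
    """Draw training examples from several tasks at once.

    Each task is sampled with probability proportional to its dataset size
    (an assumed, simple mixing strategy).
    """
    rng = random.Random(seed)
    names = list(tasks)
    weights = [len(tasks[name]) for name in names]
    batch = []
    for _ in range(num_examples):
        name = rng.choices(names, weights=weights, k=1)[0]
        text, label = rng.choice(tasks[name])
        batch.append(to_text_to_text(name, text, label))
    return batch


if __name__ == "__main__":
    for example in sample_mixture(TOY_TASKS, num_examples=4):
        print(example["inputs"], "->", example["targets"])
```

Because every task shares the same string-in, string-out format, fine-tuning for one specific task afterwards only means continuing training on examples from that task alone.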
Original Description
After explaining BERT vs GPT (last video) we now examine current tech like Google's T5X (for Google search) and in my next video new PaLM: Pathways Language Model (if combined w/ RLHF -Reinforcement Learning with Human feedback). T5X = Google's T5 on JAX and FLAX. Plus Code implementation for T5X.
my sources (all rights are with the corresponding authors):
Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer
https://arxiv.org/pdf/1910.10683.pdf
SentencePiece: A simple and language independent subword tokenizer and detokenizer for Neural Text Processing
https://arxiv.org/pdf/1808.06226.pdf
Illustrating Reinforcement Learning from Human Feedback (RLHF)
https://github.com/huggingface/blog/blob/main/rlhf.md
Fine-Tuning the Text-To-Text Transfer Transformer (T5) for Closed-Book Question Answering
https://colab.research.google.com/github/google-research/text-to-text-transfer-transformer/blob/main/notebooks/t5-trivia.ipynb#scrollTo=zSeyoqE7WMwu
PaLM + RLHF - Pytorch
https://github.com/lucidrains/PaLM-rlhf-pytorch
#ai
#t5
#chatgpt
#reinforcementlearning