Efficient Infinite Context Transformers #ai #machinelearning #research #llms #science
Key Takeaways
The video discusses a Google research paper on Efficient Infinite Context Transformers, which integrates compressive memory into a vanilla dot-product attention layer so that Transformer LLMs can process infinitely long inputs with a bounded memory footprint and computation. The proposed technique, Infini-attention, adds a compressive memory module to the vanilla attention mechanism and combines masked local attention with long-term linear attention in a single Transformer block.
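To make the "bounded memory footprint" claim concrete, here is a rough back-of-the-envelope comparison for a single attention head. It assumes the compressive memory is an associative matrix of shape d_key x d_value plus a normalization vector of length d_key, and it ignores the local key/value cache that is still kept for the current segment. The function names and the head size of 128 are illustrative assumptions, not values taken from the video.

```python
def kv_cache_entries(seq_len, d_key, d_value):
    # Standard attention keeps one key and one value vector per past token,
    # so the cache grows linearly with the sequence length.
    return seq_len * (d_key + d_value)


def compressive_memory_entries(d_key, d_value):
    # Assumed layout: a d_key x d_value matrix plus a d_key normalization
    # vector. The size does not depend on how many tokens were absorbed.
    return d_key * d_value + d_key


for seq_len in (1_000, 1_000_000):
    print(seq_len,
          kv_cache_entries(seq_len, 128, 128),
          compressive_memory_entries(128, 128))
```

The first count scales with the number of tokens while the second stays fixed, which is the sense in which the memory is bounded.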
Full Transcript
Hi everyone. So I have a new paper here, and this is a very exciting paper by Google that integrates compressive memory into a vanilla dot-product attention layer. The goal of this approach is to enable Transformer large language models to effectively process infinitely long inputs with bounded memory footprint and computation. So they propose a new attention technique called Infini-attention, which incorporates a compressive memory module into a vanilla attention mechanism. It builds both masked local attention and long-term linear attention into a single Transformer block. This allows the Infini-Transformer model to efficiently handle both long- and short-range contextual dependencies. This approach outperforms baseline models on long-context language modeling with a 114x compression ratio of memory. They also show that a 1-billion-parameter large language model can naturally scale to 1 million sequence length, and an 8-billion-parameter model achieves a new SOTA result on a 500K-length book summarization task. So given how important long-context large language models are becoming today, having an effective memory system could unlock powerful reasoning, planning, continual adaptation, and capabilities not seen before in large language models. Feel free to like and comment if you want to see more of these short summaries. See you in the next one.
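The mechanism described in the transcript can be sketched for a single head in NumPy. This is a minimal illustration, not the authors' implementation: it assumes an ELU+1 feature map for the linear-attention memory, a simple additive memory update, and a scalar gate `beta` (learned in a real model, fixed here). All function and variable names are hypothetical.

```python
import numpy as np


def elu_plus_one(x):
    # Assumed feature map: ELU(x) + 1, which keeps every entry positive.
    return np.where(x > 0, x + 1.0, np.exp(np.minimum(x, 0.0)))


def causal_softmax_attention(q, k, v):
    # Masked local attention: each position sees itself and earlier positions.
    n, d = q.shape
    scores = q @ k.T / np.sqrt(d)
    future = np.triu(np.ones((n, n), dtype=bool), k=1)
    scores = np.where(future, -np.inf, scores)
    scores = scores - scores.max(axis=-1, keepdims=True)
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ v


def infini_attention_segment(q, k, v, memory, norm, beta, eps=1e-6):
    sq, sk = elu_plus_one(q), elu_plus_one(k)
    # Long-term path: linear-attention read from the compressive memory.
    a_mem = (sq @ memory) / (sq @ norm + eps)[:, None]
    # Short-term path: ordinary causal attention inside the segment.
    a_local = causal_softmax_attention(q, k, v)
    # Sigmoid gate mixes the two paths.
    gate = 1.0 / (1.0 + np.exp(-beta))
    out = gate * a_mem + (1.0 - gate) * a_local
    # Fold this segment into the fixed-size memory for later segments.
    new_memory = memory + sk.T @ v
    new_norm = norm + sk.sum(axis=0)
    return out, new_memory, new_norm


rng = np.random.default_rng(0)
d_key, d_value, seg_len = 8, 8, 4
memory = np.zeros((d_key, d_value))
norm = np.zeros(d_key)
for _ in range(5):  # stream five segments through the same memory
    q = rng.normal(size=(seg_len, d_key))
    k = rng.normal(size=(seg_len, d_key))
    v = rng.normal(size=(seg_len, d_value))
    out, memory, norm = infini_attention_segment(q, k, v, memory, norm, beta=0.0)
print(out.shape, memory.shape, norm.shape)
```

However many segments are streamed through the loop, `memory` and `norm` keep the same shape, so long-range context is carried forward without a growing cache.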
Original Description
Very exciting paper by Google that integrates compressive memory into a vanilla dot-product attention layer.
The goal is to enable Transformer LLMs to effectively process infinitely long inputs with bounded memory footprint and computation.
They propose a new attention technique called Infini-attention which incorporates a compressive memory module into a vanilla attention mechanism...
Paper: https://arxiv.org/abs/2404.07143
#chatgpt #ai #llms #tutorial #programming