Zhiwen Fan - VLM-3R: Vision-Language Models Augmented with Instruction-Aligned 3D Reconstruction

Cohere · Advanced · 📄 Research Papers Explained · 6mo ago
3D foundation models are a key step toward spatial and physical intelligence, but their impact is often limited by heavy compute and scarce, well-curated 3D data. In this talk, I will present a set of 3D methods that make reconstruction real-time and integrate it with generative VLMs for real workloads, along with a scalable 3D data engine spanning 3D/4D environments and human avatars. Together, these components give VLMs strong spatial awareness and the ability to track temporal 3D changes, resulting in higher accuracy and better scalability. These advances move us toward AI systems that interact with the physical world with genuine spatial understanding and real-time performance.

Zhiwen (“Aaron”) Fan is an Assistant Professor in the Department of Electrical and Computer Engineering at Texas A&M University. He received his Ph.D. from The University of Texas at Austin. He was awarded the 2022 Qualcomm Innovation Fellowship and a Best Paper award at the CVPR AI4CC Workshop. He has served as an Area Chair for multiple AI/ML conferences and has completed research internships at Meta, NVIDIA, and Google.

This session is brought to you by the Cohere Labs Open Science Community, a space where ML researchers, engineers, linguists, social scientists, and lifelong learners connect and collaborate with each other. We'd like to extend a special thank you to Benedict Emoekabu and Mayank Bhaskar, Leads of our Computer Vision group, for their dedication in organizing this event.

If you’re interested in sharing your work, we welcome you to join us! Simply fill out the form at https://forms.gle/ALND9i6KouEEpCnz6 to express your interest in becoming a speaker. Join the Cohere Labs Open Science Community to see a full list of upcoming events (https://tinyurl.com/CohereLabsCommunityApp).

