Perception Language Models (PLMs) by Meta – A Fully Open SOTA VLM

AI Papers Academy · Advanced · 📄 Research Papers Explained · 11mo ago
In this video, we dive into Perception Language Models (PLMs), introduced in a recent paper from Meta titled PerceptionLM: Open-Access Data and Models for Detailed Visual Understanding. While most vision-language models (VLMs) today are either closed or trained via distillation from black-box models, PLMs are fully open-source and trained from scratch, without relying on proprietary systems. They achieve impressive performance, even setting new state-of-the-art results on image and video benchmarks that require detailed visual understanding.

🔗 Written Review - soon :)
🔗 Paper: https://arxiv…

Chapters (4)

0:00 Introduction
1:25 PLM Architecture
3:40 PLM Training & Data
7:30 Results