DETR Explained | End-to-End Object Detection with Transformers | DETR Tutorial Part 1

Name: DETR Explained | End-to-End Object Detection with Transformers | DETR Tutorial Part 1
Uploaded: 2025-04-19T18:51:43+00:00
Channel: ExplainingAI


This tutorial video covers DETR, end-to-end object detection with transformers. DETR transforms object detection into a direct set prediction problem. There are no anchors and no need for NMS, just elegant transformers. In this video, which is Part I of a two-part series, we go deep into DETR by Facebook AI and look at how it replaces traditional object detection pipelines with a transformer-based architecture for end-to-end object detection. The video goes over the DETR model, a breakdown of its architecture, how it removes the need for NMS and anchor boxes, and the Hungarian matching and loss used to train it.

The goal of this DETR tutorial is to break down everything that's relevant in the paper, explain the DETR model architecture, and by the end give clarity on how transformers are used for end-to-end object detection in DETR. In the next part, we will use the architecture and loss covered in this Part I video to implement DETR and train it on the VOC dataset.

⏱️ Timestamps:
00:00 DETR: End-to-end object detection with transformers
00:51 High Level Overview of DETR Architecture
13:10 Backbone of Detection Transformer
14:35 DETR Transformer Encoder
19:07 DETR Transformer Decoder
26:00 Hungarian matching for DETR Object Detection
38:04 Matching Strategy and Cost for DETR explained
42:57 DETR (Detection Transformer) Loss Explained
45:27 Auxiliary Loss in DETR
46:58 DETR Video Part I and Part II Outline

📖 Resources:
DETR Paper - https://tinyurl.com/exai-detr-paper
Hungarian Matching Notes - https://econweb.ucsd.edu/~jsobel/172aw02/notes8.pdf
Vision Transformer Videos
Patch Embedding Video - https://www.youtube.com/watch?v=lBicvB4iyYU
Attention Video - https://www.youtube.com/watch?v=zT_el_cjiJw
Transformer Module Implementation Video - https://www.youtube.com/watch?v=G6_IA5vKXRI
Cross Attention Segment from Stable Diffusion Video - https://www.youtube.com/watch?v=hEJjg7VUA8g&t=2096s
Generalized IOU Segment from YOLOv4 Video - https://youtu.be/b148nt9P8J
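For readers who want a feel for the matching step before watching, here is a minimal sketch of one-to-one matching between predictions and ground-truth objects. It is not the video's or the paper's code. The function names (`l1_distance`, `match_predictions`), the `box_weight` value, and the toy data are all assumptions made for illustration. The cost is simplified to a class-probability term plus an L1 box distance, leaving out the generalized IoU term. A brute-force search over permutations stands in for the Hungarian algorithm; on an input this small it should find the same minimum-cost assignment, just far less efficiently.

```python
from itertools import permutations


def l1_distance(box_a, box_b):
    """Sum of absolute coordinate differences between two boxes."""
    return sum(abs(a - b) for a, b in zip(box_a, box_b))


def match_predictions(predictions, targets, box_weight=5.0):
    """Assign each target to a distinct prediction at minimum total cost.

    predictions: list of (class_probs, box) tuples
    targets:     list of (label, box) tuples
    Assumes len(predictions) >= len(targets).
    Returns (matched, unmatched): a dict mapping prediction index to
    target index, and a list of prediction indices left without a target.
    """
    # cost[p][t]: low when prediction p is confident in target t's label
    # and its box is close to target t's box (simplified, no GIoU term).
    cost = [
        [-probs[label] + box_weight * l1_distance(box, t_box)
         for (label, t_box) in targets]
        for (probs, box) in predictions
    ]

    best_total, best_perm = float("inf"), None
    # Brute force: perm[t] is the prediction index assigned to target t.
    for perm in permutations(range(len(predictions)), len(targets)):
        total = sum(cost[p][t] for t, p in enumerate(perm))
        if total < best_total:
            best_total, best_perm = total, perm

    matched = {p: t for t, p in enumerate(best_perm)}
    unmatched = [p for p in range(len(predictions)) if p not in matched]
    return matched, unmatched


# Toy data (hypothetical): boxes are (cx, cy, w, h) in normalized coordinates.
predictions = [
    ([0.10, 0.80, 0.10], (0.5, 0.5, 0.2, 0.2)),
    ([0.70, 0.20, 0.10], (0.2, 0.3, 0.1, 0.1)),
    ([0.05, 0.05, 0.90], (0.9, 0.9, 0.1, 0.1)),
]
targets = [
    (0, (0.2, 0.3, 0.1, 0.1)),
    (1, (0.5, 0.5, 0.2, 0.2)),
]

matched, unmatched = match_predictions(predictions, targets)
print(matched, unmatched)
```

In this toy case prediction 1 pairs with target 0 and prediction 0 pairs with target 1, while prediction 2 is left unmatched. In a DETR-style setup, unmatched predictions are the ones trained to predict "no object", which is what lets the model do without NMS.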
