merve
posted an update 8 days ago
Real-time DEtection Transformer (RT-DETR) landed in transformers 🤩 with Apache 2.0 license 😍

🔖 models: https://huggingface.co/PekingU
🔖 demo: merve/RT-DETR-tracking-coco
📝 paper: DETRs Beat YOLOs on Real-time Object Detection (2304.08069)
📖 notebook: https://github.com/merveenoyan/example_notebooks/blob/main/RT_DETR_Notebook.ipynb

YOLO models are known to be super fast for real-time computer vision, but they have a downside: their speed and accuracy are sensitive to the non-maximum suppression (NMS) post-processing step and its thresholds 🥲
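
For anyone unfamiliar with NMS, here is a minimal sketch of greedy NMS in plain Python. It is only an illustration of the idea, not the code any YOLO release actually ships; the function names `iou` and `nms` and the toy boxes are made up for this example. It shows how the same detections give different results depending on the IoU threshold.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, iou_threshold):
    """Greedy NMS: visit boxes from highest to lowest score and keep a box
    only if it does not overlap an already kept box by more than the threshold.
    Returns the indices of the kept boxes."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


# Toy detections (hypothetical): two heavily overlapping boxes and one separate box.
boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]

strict = nms(boxes, scores, iou_threshold=0.5)  # second box is suppressed
loose = nms(boxes, scores, iou_threshold=0.7)   # second box survives
print(strict, loose)
```

The two overlapping boxes have an IoU of about 0.68, so whether the second one counts as a duplicate depends entirely on the threshold you pick. End-to-end detectors like RT-DETR skip this step.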

Transformer-based models, on the other hand, are not as computationally efficient 🥲

Isn't there something in between? Enter RT-DETR!

The authors combine a CNN backbone and a hybrid encoder over multi-scale features (mixing convs and attention) with a transformer decoder. In the paper, the authors also claim that one can adjust inference speed by changing the number of decoder layers, without retraining.
The authors report that the model beats previous state-of-the-art real-time detectors in both speed and accuracy. 🤩
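
If you want to try it from transformers, a minimal inference sketch could look like the one below. The checkpoint name `PekingU/rtdetr_r50vd` and the 0.3 score threshold are assumptions picked for illustration, and `detect_objects` is a hypothetical helper, not part of the library. See the notebook linked above for a full walkthrough.

```python
def detect_objects(image, checkpoint="PekingU/rtdetr_r50vd", threshold=0.3):
    """Run RT-DETR on a PIL image and return (label, score, box) tuples.

    The heavy dependencies are imported inside the function so that
    defining it does not require torch or transformers to be loaded yet.
    """
    import torch
    from transformers import RTDetrForObjectDetection, RTDetrImageProcessor

    processor = RTDetrImageProcessor.from_pretrained(checkpoint)
    model = RTDetrForObjectDetection.from_pretrained(checkpoint)
    model.eval()

    inputs = processor(images=image, return_tensors="pt")
    with torch.no_grad():
        outputs = model(**inputs)

    # PIL reports (width, height); post-processing expects (height, width).
    target_sizes = torch.tensor([image.size[::-1]])
    result = processor.post_process_object_detection(
        outputs, target_sizes=target_sizes, threshold=threshold
    )[0]

    detections = []
    for score, label_id, box in zip(result["scores"], result["labels"], result["boxes"]):
        label = model.config.id2label[label_id.item()]
        detections.append((label, score.item(), box.tolist()))
    return detections


if __name__ == "__main__":
    from PIL import Image

    # "example.jpg" is a placeholder path; point it at any local image.
    image = Image.open("example.jpg").convert("RGB")
    for label, score, box in detect_objects(image):
        print(f"{label}: {score:.2f} at {[round(v, 1) for v in box]}")
```

Note there is no NMS call anywhere: post-processing only applies a score threshold and rescales the boxes to the original image size.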