Detic: A Detector with image classes that can use image-level labels to easily train detectors.
Detecting Twenty-thousand Classes using Image-level Supervision, Xingyi Zhou, Rohit Girdhar, Armand Joulin, Philipp Krähenbühl, Ishan Misra, ECCV 2022 (arXiv 2201.02605)
Detic requires CLIP to be installed.
pip install git+https://github.com/openai/CLIP.git
First, go to the Detic project folder.
cd projects/Detic
Then, download the CLIP embeddings pre-computed from the dataset metainfo into the datasets/metadata folder.
The CLIP embeddings are loaded into the zero-shot classifier during inference.
For example, you can download LVIS's class name embeddings with the following command:
wget -P datasets/metadata https://raw.githubusercontent.com/facebookresearch/Detic/main/datasets/metadata/lvis_v1_clip_a%2Bcname.npy
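To illustrate how class-name embeddings can drive a zero-shot classifier, here is a minimal sketch that scores region features against the embeddings by cosine similarity. It is not the project's implementation: the function name `zero_shot_scores`, the `(num_classes, dim)` array layout, and the synthetic data are assumptions made for illustration only.

```python
import numpy as np


def zero_shot_scores(region_features, class_embeddings):
    """Return cosine similarities between regions and class embeddings.

    region_features: (num_regions, dim) array, a hypothetical detector output.
    class_embeddings: (num_classes, dim) array, assumed layout of the .npy file.
    """
    feats = region_features / np.linalg.norm(region_features, axis=1, keepdims=True)
    embeds = class_embeddings / np.linalg.norm(class_embeddings, axis=1, keepdims=True)
    return feats @ embeds.T  # shape: (num_regions, num_classes)


# With the real file one would load the embeddings instead, for example
# (path assumes the file was saved under its upstream name):
# class_embeddings = np.load("datasets/metadata/lvis_v1_clip_a+cname.npy")
rng = np.random.default_rng(0)
class_embeddings = rng.normal(size=(5, 8))  # synthetic stand-in: 5 classes, dim 8

# Synthetic region features: noisy copies of the embeddings of classes 3 and 1.
region_features = class_embeddings[[3, 1]] + 0.01 * rng.normal(size=(2, 8))

scores = zero_shot_scores(region_features, class_embeddings)
predicted = scores.argmax(axis=1)  # index of the best-matching class per region
print(predicted)
```

Because the classifier weights are just text embeddings, swapping the embedding matrix changes the vocabulary without retraining the detector.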
You can run the demo like this:
python demo.py \
${IMAGE_PATH} \
${CONFIG_PATH} \
${MODEL_PATH} \
--show \
--score-thr 0.5 \
--dataset lvis
You can detect custom classes with the --class-name option:
python demo.py \
${IMAGE_PATH} \
${CONFIG_PATH} \
${MODEL_PATH} \
--show \
--score-thr 0.3 \
--class-name headphone webcam paper coffe
Note that headphone, paper and coffe (typo intended) are not LVIS classes. Despite the misspelled class name, Detic can produce a reasonable detection for coffe.
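The two commands above suggest a command-line interface with three positional arguments and a few options. The following is a hypothetical sketch of such an interface using `argparse`, not the actual contents of demo.py. The argument names `image`, `config`, `checkpoint`, the default threshold, and the placeholder file names are all assumptions.

```python
import argparse


def build_parser():
    """Build a parser mirroring the flags used in the commands above (assumed)."""
    parser = argparse.ArgumentParser(description="Sketch of an open-vocabulary demo CLI")
    parser.add_argument("image", help="path to the input image")
    parser.add_argument("config", help="path to the model config file")
    parser.add_argument("checkpoint", help="path to the model weights")
    parser.add_argument("--show", action="store_true", help="display the result")
    # The default value here is illustrative, not taken from the project.
    parser.add_argument("--score-thr", type=float, default=0.3,
                        help="score threshold for displayed detections")
    parser.add_argument("--dataset", default=None,
                        help="use the vocabulary of a known dataset, e.g. lvis")
    # nargs="+" lets several space-separated class names follow the flag.
    parser.add_argument("--class-name", nargs="+", default=None,
                        help="custom class names to detect")
    return parser


# Placeholder paths stand in for ${IMAGE_PATH}, ${CONFIG_PATH}, ${MODEL_PATH}.
args = build_parser().parse_args([
    "demo.jpg", "config.py", "model.pth",
    "--show", "--score-thr", "0.3",
    "--class-name", "headphone", "webcam", "paper", "coffe",
])
print(args.class_name)
```

Collecting the class names as a list is what allows an arbitrary vocabulary, including names outside LVIS, to be passed in a single invocation.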
Here we only provide the Detic Swin-B model for the open vocabulary demo. Multi-dataset training and open-vocabulary testing will be supported in the future.
To find more variants, please visit the official model zoo.
Backbone | Training data | Config | Download |
---|---|---|---|
Swin-B | ImageNet-21K & LVIS & COCO | config | model |
If you find Detic useful in your research or applications, please consider giving a star 🌟 to the official repository and citing Detic with the following BibTeX entry.
@inproceedings{zhou2022detecting,
title={Detecting Twenty-thousand Classes using Image-level Supervision},
author={Zhou, Xingyi and Girdhar, Rohit and Joulin, Armand and Kr{\"a}henb{\"u}hl, Philipp and Misra, Ishan},
booktitle={ECCV},
year={2022}
}
[x] Milestone 1: PR-ready, and acceptable to be one of the projects/.
[ ] Milestone 2: Indicates a successful model implementation.
[ ] Milestone 3: Good to be a part of our core package!
[ ] Move your modules into the core package following the codebase's file hierarchy structure.