Feature pyramid networks have been widely adopted in the object detection literature to improve feature representations for better handling of variations in scale. In this paper, we present Feature Pyramid Grids (FPG), a deep multi-pathway feature pyramid, that represents the feature scale-space as a regular grid of parallel bottom-up pathways which are fused by multi-directional lateral connections. FPG can improve single-pathway feature pyramid networks by significantly increasing its performance at similar computation cost, highlighting importance of deep pyramid representations. In addition to its general and uniform structure, over complicated structures that have been found with neural architecture search, it also compares favorably against such approaches without relying on search. We hope that FPG with its uniform and effective nature can serve as a strong component for future work in object recognition.
We benchmark the new training schedule introduced in NAS-FPN (crop training, large batch size, unfrozen BN, 50 epochs). All backbones are ResNet-50 in PyTorch style.
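As a rough illustration, the four ingredients of this schedule could be written as MMDetection-style Python config dicts along the following lines. This is a minimal sketch, not the shipped config: the key names follow MMDetection 3.x conventions as we understand them, and the resize range, per-GPU batch size, worker count, and validation interval are assumed values. Only the 640 crop and the 50 epochs come from the config file names.

```python
# Sketch of the NAS-FPN-style schedule as MMDetection-style config dicts.
# Everything except the 640 crop and the 50 epochs is an assumption.

# Unfrozen BN: batch-norm parameters are trained rather than kept frozen.
norm_cfg = dict(type='BN', requires_grad=True)
model = dict(backbone=dict(norm_cfg=norm_cfg, norm_eval=False))

# Crop training: randomly rescale the image, then crop a 640x640 window.
train_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(type='LoadAnnotations', with_bbox=True),
    dict(type='RandomResize', scale=(640, 640),
         ratio_range=(0.8, 1.2), keep_ratio=True),  # assumed range
    dict(type='RandomCrop', crop_size=(640, 640)),
    dict(type='RandomFlip', prob=0.5),
    dict(type='PackDetInputs'),
]

# Large batch: assumed 8 images per GPU (64 in total on 8 GPUs).
train_dataloader = dict(batch_size=8, num_workers=4)

# 50 epochs, matching the "50e" in the config names.
train_cfg = dict(type='EpochBasedTrainLoop', max_epochs=50, val_interval=2)
```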
Method | Neck | LR schedule | Mem (GB) | Inference speed (fps) | box AP | mask AP | Config | Download |
---|---|---|---|---|---|---|---|---|
Faster R-CNN | FPG | 50e | 20.0 | - | 42.3 | - | config | model, log |
Faster R-CNN | FPG-chn128 | 50e | 11.9 | - | 41.2 | - | config | model, log |
Faster R-CNN | FPN | 50e | 20.0 | - | 38.9 | - | config | model, log |
Mask R-CNN | FPG | 50e | 23.2 | - | 43.0 | 38.1 | config | model, log |
Mask R-CNN | FPG-chn128 | 50e | 15.3 | - | 41.7 | 37.1 | config | model, log |
Mask R-CNN | FPN | 50e | 23.2 | - | 49.6 | 35.6 | config | model, log |
RetinaNet | FPG | 50e | 20.8 | - | 40.5 | - | config | model, log |
RetinaNet | FPG-chn128 | 50e | 19.9 | - | 39.9 | - | config | model, log |
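Note: chn128 means that the number of channels of the features and convs in the neck and bbox head is reduced from 256 (the default) to 128. This greatly decreases memory consumption without sacrificing much precision: for Faster R-CNN, memory drops from 20.0 GB to 11.9 GB while box AP falls from 42.3 to 41.2.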
@article{chen2020feature,
title={Feature pyramid grids},
author={Chen, Kai and Cao, Yuhang and Loy, Chen Change and Lin, Dahua and Feichtenhofer, Christoph},
journal={arXiv preprint arXiv:2004.03580},
year={2020}
}