@inproceedings{hu2018squeeze,
title={Squeeze-and-excitation networks},
author={Hu, Jie and Shen, Li and Sun, Gang},
booktitle={Proceedings of the IEEE conference on computer vision and pattern recognition},
pages={7132--7141},
year={2018}
}
@inproceedings{lin2014microsoft,
title={{Microsoft COCO}: Common objects in context},
author={Lin, Tsung-Yi and Maire, Michael and Belongie, Serge and Hays, James and Perona, Pietro and Ramanan, Deva and Doll{\'a}r, Piotr and Zitnick, C Lawrence},
booktitle={European conference on computer vision},
pages={740--755},
year={2014},
organization={Springer}
}

Results on COCO val2017, using a person detector with a human AP of 56.4 on the COCO val2017 dataset

| Arch | Input Size | AP | AP50 | AP75 | AR | AR50 | ckpt | log |
|---|---|---|---|---|---|---|---|---|
| pose_seresnet_50 | 256x192 | 0.729 | 0.903 | 0.807 | 0.784 | 0.941 | ckpt | log |
| pose_seresnet_50 | 384x288 | 0.748 | 0.904 | 0.819 | 0.799 | 0.941 | ckpt | log |
| pose_seresnet_101 | 256x192 | 0.734 | 0.905 | 0.814 | 0.790 | 0.941 | ckpt | log |
| pose_seresnet_101 | 384x288 | 0.754 | 0.907 | 0.823 | 0.805 | 0.943 | ckpt | log |
| pose_seresnet_152* | 256x192 | 0.730 | 0.899 | 0.810 | 0.787 | 0.939 | ckpt | log |
| pose_seresnet_152* | 384x288 | 0.753 | 0.906 | 0.824 | 0.806 | 0.945 | ckpt | log |

Note: * indicates a model trained without ImageNet pre-training.