Lightweight Human Pose Estimation Based on Self-Attention Mechanism

Authors

  • Youtao Luo
  • Xiaoming Gao

DOI:

https://doi.org/10.56028/aetr.4.1.253.2023

Keywords:

human pose estimation; lightweight; window self-attention; inverted residual network.

Abstract

To tackle the issues of numerous parameters, high computational complexity, and extended detection time prevalent in current human pose estimation network models, we have incorporated an hourglass structure to create a lightweight single-path network model, which has fewer parameters and a shorter computation time. To ensure model accuracy, we have implemented a window self-attention mechanism with a reduced parameter count. Additionally, we have redesigned this self-attention module to effectively extract local and global information, thereby enriching the feature information learned by the model. This module merges with the inverted residual network architecture, creating a separate module of WGNet. Finally, WGNet can be flexibly embedded into different stages of the model. Training and validation on COCO and MPII datasets demonstrate that this model reduces the number of parameters by 25%, computational complexity by 41%, and inference time by nearly two times, compared to Hrformer, which also utilizes the windowed self-attention mechanism, at the cost of only 3.5% accuracy.

Downloads

Published

2023-03-21