Representation Learning for Topology-adaptive Micro-gesture Recognition and Analysis
Shah, Atif; Chen, Haoyu; Zhao, Guoying (2023-10-25)
Shah, A., Chen, H. & Zhao, G. (2023). Representation Learning for Topology-adaptive Micro-gesture Recognition and Analysis. In Z. Guoying, B. W. Schuller, E. Adeli, T. Zhu, Z. Tingshao & H. Chen (Eds.), IJCAI-MIGA Workshop & Challenge on Micro-gesture Analysis for Hidden Emotion Understanding (MiGA) July 21, 2023 Macao, China. Retrieved from https://ceur-ws.org/Vol-3522/paper_7.pdf
https://creativecommons.org/licenses/by/4.0/
© 2023 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
Permanent link to this publication:
https://urn.fi/URN:NBN:fi:oulu-202312083559
Abstract
Human-to-human communication is strongly influenced by micro-gestures: non-verbal cues that reveal a person's true feelings and underlying intentions. However, micro-gestures are more challenging to recognize than ordinary gestures because they are subtle and appear for only milliseconds. In this work, we propose a graph-encoding convolutional network that extracts intrinsic joint representations from skeleton sequences using a self-attention graph convolution module in the spatial domain. A multi-scale temporal convolution module then extracts temporal representations in the time domain and passes them to a classification module that recognizes micro-gestures. We evaluate the proposed framework on two micro-gesture datasets, SMG and iMiGUE, and achieve state-of-the-art results.
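The abstract describes a three-stage pipeline: a self-attention graph convolution over the skeleton joints (spatial domain), a multi-scale temporal convolution (time domain), and a classification head. Below is a minimal PyTorch sketch of such a pipeline; all module names, tensor shapes, joint counts, and class counts are illustrative assumptions and do not reproduce the authors' implementation.

# Hypothetical sketch of the pipeline outlined in the abstract:
# self-attention graph convolution (spatial) -> multi-scale temporal
# convolution (temporal) -> classifier. Shapes and defaults are assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SelfAttentionGraphConv(nn.Module):
    """Spatial module: joint features attend to one another across the skeleton graph."""
    def __init__(self, in_channels, out_channels, num_joints):
        super().__init__()
        self.query = nn.Conv2d(in_channels, out_channels, kernel_size=1)
        self.key = nn.Conv2d(in_channels, out_channels, kernel_size=1)
        self.value = nn.Conv2d(in_channels, out_channels, kernel_size=1)
        # Learnable adjacency acts as a data-independent topology prior.
        self.adjacency = nn.Parameter(torch.eye(num_joints))

    def forward(self, x):
        # x: (batch, channels, frames, joints)
        q = self.query(x).mean(dim=2)   # pool over time -> (N, C', V)
        k = self.key(x).mean(dim=2)     # (N, C', V)
        attn = torch.softmax(q.transpose(1, 2) @ k / q.shape[1] ** 0.5, dim=-1)
        graph = attn + self.adjacency   # attention combined with the topology prior
        out = torch.einsum('nctv,nvw->nctw', self.value(x), graph)
        return F.relu(out)

class MultiScaleTemporalConv(nn.Module):
    """Temporal module: parallel temporal convolutions with different dilations."""
    def __init__(self, channels, dilations=(1, 2, 3)):
        super().__init__()
        self.branches = nn.ModuleList(
            nn.Conv2d(channels, channels, kernel_size=(3, 1),
                      padding=(d, 0), dilation=(d, 1))
            for d in dilations
        )

    def forward(self, x):
        # Sum the multi-scale temporal responses.
        return F.relu(sum(branch(x) for branch in self.branches))

class MicroGestureNet(nn.Module):
    def __init__(self, in_channels=3, hidden=64, num_joints=25, num_classes=17):
        # num_joints and num_classes are illustrative, not dataset-specific values.
        super().__init__()
        self.spatial = SelfAttentionGraphConv(in_channels, hidden, num_joints)
        self.temporal = MultiScaleTemporalConv(hidden)
        self.classifier = nn.Linear(hidden, num_classes)

    def forward(self, x):
        # x: (batch, channels, frames, joints), e.g. 3D joint coordinates over time.
        x = self.temporal(self.spatial(x))
        x = x.mean(dim=(2, 3))  # global average pool over time and joints
        return self.classifier(x)

# Example: a batch of 8 clips, 3-channel joints, 64 frames, 25 joints.
logits = MicroGestureNet()(torch.randn(8, 3, 64, 25))
print(logits.shape)  # torch.Size([8, 17])

Adding a learnable adjacency term to the attention weights is shown here as one plausible way to make the graph topology adaptive; the paper's actual topology-adaptation mechanism may differ.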
Collections
- Open access [34609]