OuluREPO – Oulun yliopiston julkaisuarkisto / University of Oulu repository

MaskFusionNet: A Dual-Stream Fusion Model with Masked Pre-training Mechanism for rPPG Measurement

Zhang, Yizhu; Shi, Jingang; Wang, Jiayin; Zong, Yuan; Zheng, Wenming; Zhao, Guoying (2024-07-03)

 

URL:
https://doi.org/10.1109/TCSVT.2024.3422849

Publisher: IEEE
Publication date: 2024-07-03

Y. Zhang, J. Shi, J. Wang, Y. Zong, W. Zheng and G. Zhao, "MaskFusionNet: A Dual-Stream Fusion Model With Masked Pre-Training Mechanism for rPPG Measurement," in IEEE Transactions on Circuits and Systems for Video Technology, vol. 34, no. 11, pp. 11521-11534, Nov. 2024, doi: 10.1109/TCSVT.2024.3422849

https://rightsstatements.org/vocab/InC/1.0/
© 2024 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.
Permanent address of this publication:
https://urn.fi/URN:NBN:fi:oulu-202409256044
Abstract

Remote photoplethysmography (rPPG) is important in areas such as disease diagnosis and emotion analysis. Recent rPPG models perform well because they extract heart rate information effectively. However, these models often focus on a limited region of interest (ROI) in the facial image, which makes them sensitive to interference: if the ROI is affected by muscle movement, lighting variation, or noise, performance degrades significantly. To address this limitation, we propose a two-stage model called MaskFusionNet. 1) In the pre-training stage, a mask-reconstruction mechanism with a tube masking strategy drives MaskFusionNet to learn rPPG information from various facial regions, which improves the model's resistance to interference. Based on the periodicity and continuity of the heart rate signal, we also design a novel spatio-temporal reconstruction loss function that focuses on the data's spatial features and temporal continuity. 2) In the fine-tuning stage, we propose the Multi-Scale Fusion Block (MFB) to combine multi-scale features from the dual-stream network. It allows the model to detect subtle heart rate variations in adjacent frames while minimizing the impact of interference by extracting features within longer segments. The transformer-based MaskFusionNet can extract multi-scale fused heart rate features from a wide range of skin regions while preserving the ability to model long-range sequence information. To validate its effectiveness, we extensively evaluate our model on three benchmark datasets (VIPL-HR, COHFACE, and PURE), demonstrating superior performance in both intra-dataset and cross-dataset testing.
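
The tube masking strategy named in the abstract can be illustrated with a minimal sketch. It assumes the usual meaning of the term in masked video pre-training: one random set of spatial patches is chosen and the same patches are hidden in every frame, so the model cannot recover a masked region by copying it from a neighbouring frame. The function name `tube_mask`, the patch grid size, and the mask ratio below are illustrative assumptions; the abstract does not give the values used in the paper.

```python
import numpy as np

def tube_mask(num_frames, grid_h, grid_w, mask_ratio, rng=None):
    """Return a boolean mask of shape (num_frames, grid_h, grid_w).

    True marks a masked patch. The spatial pattern is sampled once and
    repeated along the time axis, forming "tubes" through the clip.
    All parameter values are hypothetical, not taken from the paper.
    """
    rng = np.random.default_rng() if rng is None else rng
    num_patches = grid_h * grid_w
    num_masked = int(round(mask_ratio * num_patches))

    # Sample the masked spatial positions once, without replacement.
    masked_idx = rng.choice(num_patches, size=num_masked, replace=False)
    spatial = np.zeros(num_patches, dtype=bool)
    spatial[masked_idx] = True
    spatial = spatial.reshape(grid_h, grid_w)

    # Repeat the same spatial pattern for every frame.
    return np.broadcast_to(spatial, (num_frames, grid_h, grid_w)).copy()

# Example: an 8-frame clip split into a 4x4 patch grid, 75% of patches masked.
mask = tube_mask(num_frames=8, grid_h=4, grid_w=4, mask_ratio=0.75,
                 rng=np.random.default_rng(0))
```

In a reconstruction setup of this kind, only the unmasked patches would be passed to the encoder, and the loss would be computed on the masked ones. How MaskFusionNet does this in detail is described in the paper itself, not in this sketch.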