DSAF: Dual Space Alignment Framework for Visible-Infrared Person Re-Identification
Jiang, Yan; Cheng, Xu; Yu, Hao; Liu, Xingyu; Chen, Haoyu; Zhao, Guoying (2025-02-21)
Y. Jiang, X. Cheng, H. Yu, X. Liu, H. Chen and G. Zhao, "DSAF: Dual Space Alignment Framework for Visible-Infrared Person Re-Identification," in IEEE Transactions on Multimedia, doi: 10.1109/TMM.2025.3542988
https://rightsstatements.org/vocab/InC/1.0/
© 2025 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.
The permanent address of the publication is
https://urn.fi/URN:NBN:fi:oulu-202503101928
Abstract
Visible-infrared person re-identification (VI-ReID) is a cross-modality retrieval task that aims to match visible and infrared pedestrian images across non-overlapping cameras. However, we observe that three crucial challenges remain inadequately addressed by existing methods: (i) limited discriminative capacity of modality-shared representations, (ii) modality misalignment, and (iii) neglect of identity-consistency knowledge. To solve these issues, we propose a novel dual space alignment framework (DSAF) that constrains the two modalities in two specific spaces. Specifically, for (i), we design a lightweight, plug-and-play modality invariant enhancement (MIE) module that captures fine-grained semantic information and makes identities discriminative. This facilitates establishing correlations between the visible and infrared modalities, enabling the model to learn robust modality-shared features. To tackle (ii), dual space alignment (DSA) is introduced to perform pixel-level alignment in both Euclidean space and Hilbert space. DSA establishes an elastic relationship between these two spaces, preserving invariant knowledge across them. To solve (iii), we propose adaptive identity-consistent learning (AIL), which discovers identity-consistent knowledge between the visible and infrared modalities in a dynamic manner. Extensive experiments on mainstream VI-ReID benchmarks demonstrate the superiority and flexibility of the proposed method, which achieves competitive performance.
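For readers who want a concrete picture of the dual space alignment idea mentioned in the abstract, the sketch below shows one plausible loss that combines an L2 term in Euclidean space with a kernel maximum mean discrepancy (MMD) term as the Hilbert-space criterion. This is a minimal illustration only: the function names, the RBF kernel choice, and the weight lambda_hilbert are assumptions, not the paper's actual implementation.

# Hypothetical sketch of a dual-space alignment loss (PyTorch).
# All names and hyperparameters are illustrative, not taken from the paper.
import torch
import torch.nn.functional as F

def rbf_kernel(x: torch.Tensor, y: torch.Tensor, sigma: float = 1.0) -> torch.Tensor:
    # Gaussian RBF kernel matrix: k(x_i, y_j) = exp(-||x_i - y_j||^2 / (2 * sigma^2)).
    sq_dists = torch.cdist(x, y) ** 2
    return torch.exp(-sq_dists / (2.0 * sigma ** 2))

def mmd(x: torch.Tensor, y: torch.Tensor, sigma: float = 1.0) -> torch.Tensor:
    # Biased estimate of the squared maximum mean discrepancy between two
    # feature batches, i.e. their distance in the kernel's Hilbert space.
    return (rbf_kernel(x, x, sigma).mean()
            + rbf_kernel(y, y, sigma).mean()
            - 2.0 * rbf_kernel(x, y, sigma).mean())

def dual_space_alignment_loss(f_vis: torch.Tensor, f_ir: torch.Tensor,
                              lambda_hilbert: float = 1.0,
                              sigma: float = 1.0) -> torch.Tensor:
    # f_vis, f_ir: (batch, dim) features from the visible and infrared branches.
    euclidean_term = F.mse_loss(f_vis, f_ir)   # alignment in Euclidean space
    hilbert_term = mmd(f_vis, f_ir, sigma)     # alignment in Hilbert space
    return euclidean_term + lambda_hilbert * hilbert_term

In a VI-ReID pipeline, such a loss would typically be added to the usual identity-classification and triplet losses, with lambda_hilbert setting the trade-off between the two spaces.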
Collections
- Open access [38840]