Learning from Yourself to Others for Unsupervised Visible-Infrared Re-Identification
Ji, Wenhui; Cheng, Xu; Jiang, Yan; Sun, Zhaodong; Zhao, Guoying (2025-05-22)
W. Ji, X. Cheng, Y. Jiang, Z. Sun and G. Zhao, "Learning from Yourself to Others for Unsupervised Visible-Infrared Re-Identification," in IEEE Transactions on Circuits and Systems for Video Technology, doi: 10.1109/TCSVT.2025.3572697
https://rightsstatements.org/vocab/InC/1.0/
© 2025 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.
The permanent address of this publication is
https://urn.fi/URN:NBN:fi:oulu-202506034092
Abstract
Unsupervised visible-infrared person re-identification (US-VI-ReID) aims to match unlabeled pedestrian images captured under varying lighting conditions. The key challenge lies in generating accurate pseudo-labels while alleviating the significant modality gap between visible and infrared images. Existing methods mainly focus on mitigating the effects of noisy labels through loss functions during backward propagation. However, these noisy labels already influence the forward propagation, leading to incorrect cross-modality correspondences. To address this issue, we propose a Hierarchical Centrality Collaborative Learning (HCCL) framework for US-VI-ReID, which proactively identifies noisy labels during the forward propagation. The rationale behind HCCL is that intra-modality refinement serves as the foundation for establishing cross-modality correspondences, reflecting the principle of learning from yourself to others. For intra-modality learning, we propose Closeness Centrality Selection (CCS), which quantifies sample confidence using closeness centrality to identify noisy samples. By discarding the noisy samples during forward propagation, CCS mitigates their adverse effects and ensures identity-consistent representation learning. For cross-modality learning, Hierarchical Consistency Matching (HCM) is proposed to establish local instance-level label associations by leveraging bidirectional consistency with the most reliable samples identified during intra-modality learning. These local associations are then propagated to guide the global cluster-level cross-modality correspondences. Extensive experiments demonstrate that HCCL achieves competitive performance on mainstream datasets, even surpassing some supervised counterparts. Additionally, outstanding results on corrupted datasets verify its generalizability and robustness.
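The abstract describes two components whose concrete details appear only in the full paper: closeness-centrality-based confidence scoring for discarding noisy samples, and bidirectional (mutual) consistency for cross-modality matching. As a rough, hypothetical sketch of these two ideas in isolation (the function names, Euclidean distance metric, and keep ratio below are assumptions, not the authors' implementation):

```python
import numpy as np

def closeness_confidence(features):
    """Closeness centrality of each sample within one pseudo-labeled
    cluster: (n - 1) divided by the sum of distances to all other
    members. Core samples score high; outliers score low."""
    n = len(features)
    diff = features[:, None, :] - features[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))       # pairwise Euclidean
    return (n - 1) / (dist.sum(axis=1) + 1e-12)    # avoid divide-by-zero

def select_reliable(features, keep_ratio=0.8):
    """Keep the top `keep_ratio` fraction of a cluster by centrality,
    discarding the presumed-noisy tail before forward propagation.
    The 0.8 default is an arbitrary illustrative choice."""
    conf = closeness_confidence(features)
    k = max(1, int(round(keep_ratio * len(features))))
    return set(np.argsort(conf)[::-1][:k].tolist())

def mutual_nn_pairs(feats_a, feats_b):
    """Bidirectional consistency as mutual nearest neighbors: keep a
    cross-modality pair (i, j) only if j is i's nearest neighbor in
    modality B and i is j's nearest neighbor in modality A."""
    d = np.linalg.norm(feats_a[:, None, :] - feats_b[None, :, :], axis=-1)
    a2b = d.argmin(axis=1)   # nearest B-sample for each A-sample
    b2a = d.argmin(axis=0)   # nearest A-sample for each B-sample
    return [(i, int(a2b[i])) for i in range(len(feats_a)) if b2a[a2b[i]] == i]
```

In this toy form, a far-away outlier in a cluster receives the lowest closeness score and is dropped by `select_reliable`, while `mutual_nn_pairs` retains only cross-modality pairs that agree in both matching directions, mirroring the instance-level consistency idea described above.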
Collections
- Open access [38404]