Global Cross-Entropy Loss for Deep Face Recognition
Zhao, Weisong; Zhu, Xiangyu; Shi, Haichao; Zhang, Xiao-Yu; Zhao, Guoying; Lei, Zhen (2025-03-05)
W. Zhao, X. Zhu, H. Shi, X.-Y. Zhang, G. Zhao and Z. Lei, "Global Cross-Entropy Loss for Deep Face Recognition," in IEEE Transactions on Image Processing, vol. 34, pp. 1672-1685, 2025, doi: 10.1109/TIP.2025.3546481.
https://rightsstatements.org/vocab/InC/1.0/
© 2025 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.
The permanent address of this publication is
https://urn.fi/URN:NBN:fi:oulu-202503192107
Abstract
Contemporary deep face recognition techniques predominantly utilize the Softmax loss function, which is designed around the similarities between sample features and class prototypes. These similarities fall into four types: in-sample target, in-sample non-target, out-sample target, and out-sample non-target similarity. When a sample feature from a specific class is designated as the anchor, the similarity between this sample and any class prototype is referred to as in-sample similarity; the similarity between samples from other classes and any class prototype is known as out-sample similarity. The terms target and non-target indicate whether or not the sample and the class prototype used for the similarity calculation belong to the same identity. The conventional Softmax loss promotes higher in-sample target similarity than in-sample non-target similarity, but it overlooks the relations between in-sample and out-sample similarities. In this paper, we propose the Global Cross-Entropy (GCE) loss, which promotes 1) in-sample target similarity greater than both in-sample and out-sample non-target similarity, and 2) in-sample non-target similarity smaller than both in-sample and out-sample target similarity. In addition, we establish a bilateral margin penalty on both in-sample target and non-target similarity, which improves the discrimination and generalization of the deep face model. To bridge the gap between training and testing in face recognition, we adapt the GCE loss into a pairwise framework by randomly replacing some class prototypes with sample features. We designate the model trained with the proposed Global Cross-Entropy loss as GFace. Extensive experiments on several public face benchmarks, including LFW, CALFW, CPLFW, CFP-FP, AgeDB, IJB-C, IJB-B, MFR-Ongoing, and MegaFace, demonstrate the superiority of GFace over other methods. Additionally, GFace exhibits robust performance on general visual recognition tasks.
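The abstract describes the GCE objective only in words. The following PyTorch sketch illustrates one plausible way a globally normalized, bilateral-margin cross-entropy of this kind could be structured; the function name, margin form, and hyperparameter values are illustrative assumptions, not the paper's actual formulation.

```python
# A minimal, hypothetical sketch reconstructed from the abstract alone; the
# paper's actual GCE formulation, margin form, and hyperparameters may differ.
import torch
import torch.nn.functional as F

def global_cross_entropy_sketch(features, prototypes, labels,
                                scale=64.0, m_target=0.4, m_nontarget=0.1):
    """features: (B, D) embeddings; prototypes: (C, D) class prototypes;
    labels: (B,) class indices. m_target / m_nontarget realize a bilateral
    margin on target and non-target similarities (values are assumptions)."""
    # Cosine similarity between every sample and every class prototype: (B, C).
    sims = F.linear(F.normalize(features), F.normalize(prototypes))
    target_mask = F.one_hot(labels, sims.size(1)).bool()

    # Bilateral margin: shrink the target similarity and inflate the
    # non-target similarities before scaling.
    logits = torch.where(target_mask, sims - m_target, sims + m_nontarget) * scale

    in_target = logits[target_mask].unsqueeze(1)                 # (B, 1)
    non_target = logits.masked_fill(target_mask, float('-inf'))  # (B, C)

    # "Global" normalization: each anchor's target logit competes against the
    # non-target logits of the whole batch, not just its own row, so the
    # in-sample target similarity is also pushed above out-sample non-target
    # similarities, as the abstract requires.
    batch_non_target = non_target.reshape(1, -1).expand(len(labels), -1)
    global_logits = torch.cat([in_target, batch_non_target], dim=1)
    target_index = torch.zeros(len(labels), dtype=torch.long,
                               device=logits.device)
    return F.cross_entropy(global_logits, target_index)
```

The abstract's pairwise variant, in which some class prototypes are randomly replaced by sample features to narrow the train-test gap, is omitted here; under the same assumptions it would amount to swapping rows of `prototypes` with same-class entries of `features` before computing `sims`.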
Collections
- Open access [38865]