Unified Video and Image Representation for Boosted Video Face Forgery Detection
Liu, Haotian; Pan, Chenhui; Liu, Yang; Zhao, Guoying; Li, Xiaobai
IOS Press
Liu, H., Pan, C., Liu, Y., Zhao, G., & Li, X. (2024). Unified video and image representation for boosted video face forgery detection. In U. Endriss, F. S. Melo, K. Bach, A. Bugarín-Diz, J. M. Alonso-Moral, S. Barro, & F. Heintz (Eds.), Frontiers in Artificial Intelligence and Applications, 673-680. IOS Press. https://doi.org/10.3233/FAIA240548
https://creativecommons.org/licenses/by-nc/4.0/
© 2024 The Authors. This article is published online with Open Access by IOS Press and distributed under the terms of the Creative Commons Attribution Non-Commercial License 4.0 (CC BY-NC 4.0)
The permanent address of this publication is
https://urn.fi/URN:NBN:fi:oulu-202501141156
Abstract
Face forgery detection is crucial in preserving the security and integrity of facial data amidst the rapid developments in face manipulation techniques and deep generative models. Existing methods for video face forgery detection typically assume that all frames in a forged video are manipulated, while identifying partially forged videos with only a subset of altered frames is still a challenge to be solved. To address this issue, we propose a novel framework, i.e., the UVIF, that utilizes additional annotated images to provide fine-grained supervision for detecting partial forgeries in videos. The UVIF integrates a unified encoder and a multi-task learning paradigm to model both facial videos and images for boosted video face forgery detection. A 2D backbone with temporal fusion modules is employed for the unified encoder. A pseudo labeling process is also designed for facial video frames to bridge the representation of individual video frames and static images. Extensive experiments on benchmark datasets demonstrate the effectiveness of our framework, outperforming state-of-the-art methods in detecting partially forged videos while introducing no additional computational overhead. Our code is available at https://github.com/haotianll/UVIF.
Collections
- Open access [42497]

