arxiv:2409.14128

Present and Future Generalization of Synthetic Image Detectors

Published on Sep 21 · Submitted by dariog on Sep 25

Abstract

The continued release of new and better image generation models increases the demand for synthetic image detectors. In such a dynamic field, detectors need to be able to generalize widely and be robust to uncontrolled alterations. The present work is motivated by this setting, when looking at the role of time, image transformations and data sources, for detector generalization. In these experiments, none of the evaluated detectors is found universal, but results indicate an ensemble could be. Experiments on data collected in the wild show this task to be more challenging than the one defined by large-scale datasets, pointing to a gap between experimentation and actual practice. Finally, we observe a race equilibrium effect, where better generators lead to better detectors, and vice versa. We hypothesize this pushes the field towards a perpetually close race between generators and detectors.

Community

Paper author Paper submitter

The paper examines the current state of synthetic image detectors, analyzing 11 detectors and 17 synthetic datasets produced by 14 distinct image generators (e.g., Stable Diffusion, DALL·E 3, Midjourney, Flux, Firefly).

Results show that current detectors fail to generalize (every tested detector is both the best and the worst detector for at least one dataset), and typically fall into two categories: those that produce many false positives (real images labeled as fake) and those that produce many false negatives (fake images labeled as real).
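To make the two failure modes concrete, here is a minimal sketch of how the false positive and false negative rates of a single detector could be computed on a labeled evaluation set. The function name `error_rates` and the label convention (1 = synthetic, 0 = real) are assumptions made for illustration; they are not taken from the paper.

```python
def error_rates(labels, predictions):
    """Return (false_positive_rate, false_negative_rate).

    Assumed convention (illustrative only): 1 = synthetic (fake), 0 = real.
    A false positive is a real image labeled as fake;
    a false negative is a fake image labeled as real.
    """
    pairs = list(zip(labels, predictions))
    false_pos = sum(1 for y, p in pairs if y == 0 and p == 1)
    false_neg = sum(1 for y, p in pairs if y == 1 and p == 0)
    n_real = sum(1 for y in labels if y == 0)
    n_fake = sum(1 for y in labels if y == 1)
    # Guard against empty classes to avoid division by zero.
    fpr = false_pos / n_real if n_real else 0.0
    fnr = false_neg / n_fake if n_fake else 0.0
    return fpr, fnr


# Toy example: 4 real images followed by 4 synthetic ones.
labels = [0, 0, 0, 0, 1, 1, 1, 1]
predictions = [1, 1, 0, 0, 1, 1, 1, 0]
fpr, fnr = error_rates(labels, predictions)
```

A detector with a high `fpr` and a low `fnr` belongs to the first category described above; the reverse pattern corresponds to the second.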

Among the most relevant factors for detector performance, image resolution and dataset source stand out. The ethical implications of releasing detectors (which can be used to improve generators) are also discussed, in the context of an arms race.

These results motivate the private development of multiscale, multisource detectors, combined in an ensemble, to label synthetic images as such in line with current legislation (e.g., the AI Act), always under human supervision.
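As a rough illustration of the ensemble idea, the sketch below averages per-detector scores and flags an image for human review when the combined score crosses a threshold. The paper does not prescribe this combination rule: the weighted mean, the 0.5 threshold, and the names `ensemble_score` and `flag_for_review` are all assumptions chosen for simplicity.

```python
def ensemble_score(scores, weights=None):
    """Weighted mean of per-detector scores in [0, 1].

    Each score is assumed to be one detector's estimated probability
    that the image is synthetic. Uniform weights are used by default.
    """
    if not scores:
        raise ValueError("at least one detector score is required")
    if weights is None:
        weights = [1.0] * len(scores)
    if len(weights) != len(scores):
        raise ValueError("weights and scores must have the same length")
    total = sum(weights)
    return sum(w * s for w, s in zip(weights, scores)) / total


def flag_for_review(scores, threshold=0.5):
    """Return True if the ensemble suggests the image may be synthetic.

    The result is meant as a prompt for a human reviewer,
    not as a final decision. The threshold is a hypothetical default.
    """
    return ensemble_score(scores) >= threshold


# Three hypothetical detectors that disagree on the same image.
combined = ensemble_score([0.75, 0.25, 0.5])
needs_review = flag_for_review([0.75, 0.25, 0.5])
```

Averaging is only one option; majority voting or a learned combiner would fit the same description, and the choice would need to be validated on data from multiple sources and resolutions.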
