The Challenge of Detecting Synthetic Manipulations in ID Documents
(DeepID 2025)

The challenge is held in conjunction with IEEE/CVF ICCV 2025 in Honolulu, Hawaii, USA, in October 2025.

Overview

With the advancement and widespread availability of visual generative models, ID document tampering has become a pressing issue. DeepID is the first competition that challenges participants to detect synthetic manipulations (i.e., injection attacks, not presentation attacks) in ID documents. The results of the competition will be presented and discussed at the ICCV 2025 workshop of the same name. The top-performing teams in the challenge will be invited to be co-authors of the overview competition paper, which will be published in the ICCV proceedings, and will be asked to present their approaches during the workshop. The first-place team will also receive a monetary token of appreciation provided by PXL Vision.

The focus of the competition is on:

Important Dates

Competition Details

Two tracks for detecting Digital Manipulations in ID Documents

Track 1

Binary classification (bona fide vs. forged). For a given image of an ID card, submitted models return a single score between 0 and 1: a score closer to 1 means a bona fide card, and a score closer to 0 means a manipulated card. For evaluation, we use the F1 score, assuming a 0.5 decision threshold on the scores computed for all images.

Track 2

Localization (mask of manipulated regions). For a given image of an ID card, submitted models return a binary mask of the same size as the image, with value 0 for manipulated regions and value 1 for bona fide regions. For evaluation, we compute an aggregated F1 score based on image-level statistics.
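
As an illustration of the two expected output formats, here is a minimal sketch in Python (the variable names and image size are hypothetical; the exact submission API is defined by the baseline Docker code):

import numpy as np

# Hypothetical outputs for a single ID card image of height H and width W.
H, W = 480, 640

# Track 1: a single score in [0, 1]; closer to 1 means bona fide,
# closer to 0 means manipulated.
track1_score = 0.87

# Track 2: a binary mask with the same size as the image;
# 1 marks bona fide pixels, 0 marks manipulated regions.
track2_mask = np.ones((H, W), dtype=np.uint8)
track2_mask[100:150, 200:400] = 0  # a hypothetical manipulated region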

Evaluation details

Participants need to submit Docker containers with pre-trained models. A baseline model with baseline Docker code will be provided on May 30. Evaluation will be performed offline, on a standalone GPU machine (RTX 3090 with 24 GB) with no internet connection, using a private dataset. For each track, the results of the evaluations on the private dataset will be updated once per day in the leaderboard. All submissions will be evaluated under equal conditions and on the same evaluation data.

Dataset

We provide participants with the Fantasy ID Dataset for tuning or training their detection and localization models. They may also use any other public datasets of ID documents, such as MIDV-2020 or BID, as well as private datasets, for training/tuning their submitted models. The download link to the dataset will be provided upon registration.

Fantasy ID Dataset (for training/tuning)

We provide 262 Fantasy ID cards designed to resemble ID documents of 10 different countries and languages. The design of the cards mimics real ID documents and the corresponding cultural and language elements. The cards contain fantasy personal information (not of real people) but use faces of real people, taken from the following face datasets: American Multiracial Face Database (AMFD), Face London Research Dataset, and the High-Quality Wide Multi-Channel Attack (HQ-WMCA) dataset. The cards were printed with an Evolis Primacy 2 card printer and scanned with three devices (iPhone 15 Pro, Huawei Mate 30, and Kyocera TASKalfa 2554ci office scanner), resulting in 786 images. These constitute the bona fide samples. Manipulated versions were generated from the bona fide samples using two face-swapping methods and two text-inpainting methods (changing fields such as name, date, ID number, etc.). In the dataset, each attack type combines one face-swapping method with one text-inpainting method, so in the end there are 786*2 = 1572 manipulated samples. The full dataset (including more attacks and an additional test set) will be publicly released after the competition (target date for public release: August 2025). The majority of the data will be available under the CC BY 4.0 license.

In-domain test dataset

This test dataset is based on a different set of Fantasy ID cards and will be used for testing the submitted models. It also includes manipulations that are not present in the training data, which means this test set contains in-domain bona fide samples but unknown attacks. This test dataset will be publicly released after the competition. The leaderboard with the evaluation results on this test dataset will be updated every day during the competition.

Private out-of-domain test dataset (hidden testing)

This private, undisclosed set consists of bona fide images of ID documents from real individuals, provided by the ID verification company PXL Vision, and their forged versions. This data represents an out-of-domain set and will be used for evaluation only; no samples will be released or shown. The leaderboard with the evaluation results on these private samples will be updated every day during the competition.

Below are some examples of Fantasy ID cards: the original generated cards, the printed-and-scanned bona fide versions, and the forged versions, which were manipulated and then printed and re-scanned. Participants need to detect manipulated/forged IDs either as a classification task (is the card manipulated?) or as a localization task (which regions are manipulated?).

[Example images: original, bona fide (printed and scanned), and forged Fantasy ID cards for Portugal and Turkiye]

Register to Participate

To participate in the competition, please fill out the registration form.

Register

Leaderboard

Evaluation results are posted for each track (detection and localization) separately. Submissions are sorted by the Aggregate F1 score. The 'baseline' team in the tables is the baseline TruFor model that we provide in the baseline Docker code. If your submission is not listed, it means we could not process the Docker container: we could not read it (please submit the tarball in .tgz format, not zip), it was invalid, or it did not adhere to the API. Please use our baseline Docker code to prepare the submission and adhere to the suggested naming convention for the Docker files, <team_name>_<track_name>_<algorithm_name>_<version>.tgz, because we rely on the names to extract the team, algorithm, and version.
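
For illustration, a hypothetical Python helper for parsing this naming convention (the function name is ours; the actual parsing in the evaluation pipeline may differ):

from pathlib import Path

def parse_submission_name(filename):
    # e.g. "myteam_detection_mymodel_v1.tgz" ->
    #      ("myteam", "detection", "mymodel", "v1")
    stem = Path(filename).name
    if stem.endswith(".tgz"):
        stem = stem[:-len(".tgz")]
    parts = stem.split("_")
    team, track, version = parts[0], parts[1], parts[-1]
    algorithm = "_".join(parts[2:-1])  # algorithm names may contain underscores
    return team, track, algorithm, version

print(parse_submission_name("myteam_detection_mymodel_v1.tgz"))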

Best per team (detection track) (updated on July 15, 2025 at 11:23:43 Swiss time)
This is the final ranking.

Rank Team Algorithm Version Runtime-fantasy F1 on fantasy Runtime-private F1 on private Aggregate F1
1 Sunlight unetc v9 0:26:08 0.9906416660367655 1:50:57 0.7191297839379392 0.800583
2 Incode IccvDeepID v023 0:45:35 0.8679166343284155 4:10:16 0.7533954606111526 0.787752
3 AG edgedoc v15 0:56:39 0.9580710412020603 1:27:56 0.7108249627328226 0.784999
4 UAMBiDALab trufor v1.1 0:51:19 0.7116720789267821 1:18:41 0.7882604496971795 0.765284
5 hardik tuned v1 0:59:31 0.8388464501314228 1:28:16 0.6580754423848288 0.712307
6 Reagvis tuned_vlm v1 1:09:15 0.8158094677480847 2:12:05 0.663301034569992 0.709054
7 baseline trufor v0.1 0:44:46 0.8067118967592016 1:09:49 0.662222601003402 0.705569
8 LatentVibesOnly trufor v0.0 0:52:48 0.8067118967592016 1:26:10 0.662222601003402 0.705569
9 Deepakto baseline v0.1 0:47:44 0.8067118967592016 1:09:59 0.662222601003402 0.705569
10 VISION trufor v1 1:02:18 0.570987139473393 1:30:46 0.7506756800369373 0.696769
11 UNJ clip v6 0:02:25 0.6882444934886508 0:23:26 0.6553133084277257 0.665193
12 Asmodeus trufor v1.1 0:46:24 0.6882444934886508 1:11:21 0.6553133084277257 0.665193
13 KU-Forgeye forgerydetection v7 0:07:11 0.6038971637387054 1:00:18 0.6710191125688072 0.650883
14 SuperIdolSmile unet v0.2 0:12:04 0.8975010342736072 0:52:34 0.5386412954031168 0.646299
15 UVersumAI ensembled v4 0:02:56 0.537161310967704 0:22:38 0.656495936014569 0.620696
16 LJ resnet v2 0:09:33 0.500008301989212 2:08:23 0.6581210504898414 0.610687
17 DUCS ensemble v2 0:02:18 0.4576437408874124 0:22:51 0.6751082341269501 0.609869
18 ens-epfl peunet v2.25 0:22:38 0.32179688651450933 0:41:36 0.7147045144271646 0.596832
19 VCL-ITI swinface v1 0:16:09 0.31769049627017437 3:10:27 0.7010807403250812 0.586064
20 Aphi dpadb v2 0:03:32 0.2989340686324383 0:40:29 0.706648362562519 0.584334
21 IDNT docauth v0.11 0:05:49 0.37037163127979356 0:44:12 0.6553114962632904 0.569830
22 e0nia mmfusion v0 0:02:35 0.410882392561378 0:21:49 0.6157795362699562 0.554310
23 VisGen scu-net v1 0:05:42 0.48938910752959736 0:06:12 0.5506675117272216 0.532284
24 idvc st_bf_comp_no_sc v1.1 0:12:00 0.5632905869632292 0:45:39 0.5171312556765423 0.530979
25 Fake-Hunters bionet v3 0:01:48 0.32187103766106395 0:15:17 0.21610810639615832 0.247837
Full table with all detection results

Best per team (localization track) (updated on July 15, 2025 at 11:23:39 Swiss time)
This is the final ranking.

Rank Team Algorithm Version Runtime-fantasy F1 on fantasy Runtime-private F1 on private Aggregate F1
1 Sunlight unetc v7 0:24:31 0.78391260410437 1:51:43 0.7162494760503502 0.736548
2 UAMBiDALab trufor v1.1 0:51:19 0.6204511200997092 1:18:41 0.7569658860449113 0.716011
3 VISION trufor v1 1:02:18 0.6117324838968435 1:30:46 0.7378285553501286 0.700000
4 AG trudoc v5 8:30:49 0.6862783664612659 11:34:53 0.6618887933602785 0.669206
5 baseline trufor v0.1 0:44:46 0.5898009453750437 1:09:49 0.6274328723240761 0.616143
6 Reagvis trufor v2 0:58:16 0.5898009453750437 1:22:18 0.6274328723240761 0.616143
7 LatentVibesOnly trufor v0.0 0:52:48 0.5898009453750437 1:26:10 0.6274328723240761 0.616143
8 hardik tuned v1 0:59:31 0.5898009453750437 1:28:16 0.6274328723240761 0.616143
9 Deepakto baseline v0.1 0:47:44 0.5898009453750437 1:09:59 0.6274328723240761 0.616143
10 ens-epfl peunet v2.25 0:22:38 0.5634039904480761 0:41:36 0.6195108908471354 0.602679
11 Fake-Hunters bionet v2 0:04:26 0.5493777434348279 0:28:35 0.5896946730916813 0.577600
12 SuperIdolSmile unet v0.1 0:12:17 0.5635784925443205 0:52:13 0.5693047971614335 0.567587
13 IDNT docauth v0.12 0:06:16 0.5339141142472696 0:53:13 0.5504448403227907 0.545486
14 KU-Forgeye forgerydetection v5 0:06:04 0.5202362995180636 0:45:41 0.5381362843556258 0.532766
15 IDCH trufor_augmented v0.1 0:06:39 0.2629827806996036 0:47:18 0.1714349184560774 0.198899
16 VisGen scu-net v1 0:05:42 0.035723586318595124 0:06:12 0.16439219695195906 0.125792
Full table with all localization results

How we computed the tables

We ran all valid dockers on the dataset. At the moment, we have results for the test set of Fantasy ID cards, which contains 1385 cards. The Algorithm and Version columns are parsed from the file names of your submissions. The Runtime-<db_name> columns show how long the docker took to finish. Note that if a docker ran longer than 2 hours on the Fantasy dataset (1385 samples), it was not using the GPU, even though it was running on a machine with one. We may stop running dockers that do not use the GPU, because the private dataset contains 20K images and evaluation would otherwise take too long.
The F1 detection score is computed on each dataset from the predicted scores of the images using the f1_score function from scikit-learn: f1_score(labels, pred_labels, average='weighted'). This means the F1 score is weighted per class (bona fide and attack), which is a better way to compute the F1 score when the data is unbalanced, as in our case. We used a decision threshold of 0.5 to compute the predicted labels, assuming 1 is bona fide and 0 is an attack.
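
For illustration, a minimal sketch of this computation with hypothetical scores and labels:

import numpy as np
from sklearn.metrics import f1_score

# Hypothetical model scores in [0, 1] and ground-truth labels
# (1 = bona fide, 0 = attack).
scores = np.array([0.92, 0.11, 0.74, 0.03, 0.61])
labels = np.array([1, 0, 1, 0, 0])

pred_labels = (scores >= 0.5).astype(int)  # 0.5 decision threshold
detection_f1 = f1_score(labels, pred_labels, average='weighted')
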
The F1 localization score is computed for each image independently as follows:

import numpy as np

def localization_f1(mask, gt_mask, is_bonafide):
    # mask:    predicted binary mask (1 = bona fide pixel, 0 = manipulated pixel)
    # gt_mask: ground-truth mask with the same convention
    mask = np.asarray(mask, dtype=bool)
    gt_mask = np.asarray(gt_mask, dtype=bool)
    if is_bonafide:
        # Fully bonafide image: the whole image is 1 (no altered regions),
        # bonafide pixels are positives, zeros are negatives.
        # We want high F1 if the model predicted mostly 1s (bonafide).
        tp = np.sum(mask)    # all pixels should be 1s in the mask
        fn = np.sum(~mask)   # any zeros are falsely detected as the negative class
        tn = 0
        fp = 0
    else:
        # Manipulated image: the positive class is the manipulated region (0s).
        tn = np.sum(mask & gt_mask)      # predicted bonafide, truly bonafide
        tp = np.sum(~mask & ~gt_mask)    # predicted manipulated, truly manipulated
        fp = np.sum(~mask & gt_mask)     # predicted manipulated, truly bonafide
        fn = np.sum(mask & ~gt_mask)     # predicted bonafide, truly manipulated
    precision = tp / (tp + fp + 1e-8)
    recall = tp / (tp + fn + 1e-8)
    f1 = 2 * precision * recall / (precision + recall + 1e-8)
    return f1
The final F1 score is the mean of two per-class means, computed separately for the bona fide and attack classes: (mean(bonafide_f1_scores) + mean(attack_f1_scores)) / 2, so that the class with the larger number of samples does not dominate the final F1 score.
The Aggregate F1 is computed as a weighted average of the two F1 scores, the one on the Fantasy ID cards test set and the one on the private set of real documents: f1_fantasy*0.3 + f1_private*0.7. This weighting puts more importance on the results from the private set.
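
For illustration, a minimal sketch of both aggregation steps, using hypothetical per-image F1 lists and a hypothetical private-set F1:

import numpy as np

# Hypothetical per-image localization F1 scores.
bonafide_f1_scores = [0.95, 0.90, 0.97]
attack_f1_scores = [0.60, 0.72]

# Class-balanced mean, so the larger class does not dominate.
f1_fantasy = (np.mean(bonafide_f1_scores) + np.mean(attack_f1_scores)) / 2

# Aggregate F1 across the two test sets, weighting the private set more heavily.
f1_private = 0.66  # hypothetical value
aggregate_f1 = 0.3 * f1_fantasy + 0.7 * f1_private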

Workshop details

The workshop will feature two prominent keynote speakers who are well known in media forensics, deepfake detection, and face and ID document presentation attack detection. The overview and results of the challenge will be presented at the workshop. The top teams, who will be invited as co-authors of the overview paper, will also be asked to present the approaches they used in the challenge.

Workshop schedule: TBD

Keynote Speakers

Prof. Luisa Verdoliva
University Federico II of Naples
Digital forensics and deepfake detection

Dr. Juan Tapia
ATHENE National Research Center
Presentation/morphing attack detection

Organizers

Pavel Korshunov
Idiap Research Institute

Nevena Shamoska
CTO, PXL Vision

Magdalena Połać
PXL Vision

Vedrana Krivokuca
Idiap Research Institute

Vidit Vidit
Idiap Research Institute

Amir Mohammadi
Idiap Research Institute

Christophe Ecabert
Idiap Research Institute

Sébastien Marcel
Idiap Research Institute