legekka committed on
Commit
8d78495
1 Parent(s): 951bdf0

Update README.md

  1. README.md +58 -2
README.md CHANGED
@@ -1,5 +1,61 @@
  ---
  license: apache-2.0
  pipeline_tag: image-classification
- base-model: legekka/AI-Anime-Image-Detector-ViT
- ---
+ ---
+
+ # AI Anime Image Detector ViT
+
+ This model is a proof of concept for detecting AI-generated anime-style images. It is a Vision Transformer (ViT) trained on 1M human-made and 217K AI-generated anime images. During training, both classes appeared in equal amounts to avoid class bias. Training took about 40 hours (~35 epochs) on a single RTX 3090 GPU.
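
The README does not say how the two classes were balanced. One common approach, sketched here purely as an assumption, is to draw half of every batch from the human-made pool and half from the AI-generated pool, so the smaller pool is simply revisited more often. The helper name `balanced_batches` and the file names are hypothetical and are not taken from the training code.

```python
import random

def balanced_batches(real_items, ai_items, batch_size, num_batches, seed=0):
    """Yield batches with equal numbers of real and AI items.

    Hypothetical helper: samples with replacement from each pool, so the
    smaller pool is revisited more often than the larger one.
    """
    if batch_size % 2 != 0:
        raise ValueError("batch_size must be even to split classes equally")
    rng = random.Random(seed)
    half = batch_size // 2
    for _ in range(num_batches):
        batch = [(item, "real") for item in rng.choices(real_items, k=half)]
        batch += [(item, "ai") for item in rng.choices(ai_items, k=half)]
        rng.shuffle(batch)  # avoid a fixed real-then-ai ordering
        yield batch

# Toy pools standing in for the much larger real and AI image sets
real_pool = [f"real_{i}.jpg" for i in range(100)]
ai_pool = [f"ai_{i}.jpg" for i in range(20)]

for batch in balanced_batches(real_pool, ai_pool, batch_size=8, num_batches=2):
    labels = [label for _, label in batch]
    print(labels.count("real"), labels.count("ai"))
```

Each batch contains the same number of items from both classes regardless of how unequal the pools are, which is the property the paragraph above describes.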
+
+ ## Evaluation
+
+ Each checkpoint was evaluated on 500 real and 500 AI-generated images.
+ - Training Loss: 0.1009
+ - Eval Loss: 0.1386
+
+ Random crops seem to have helped the model generalize better. However, the training dataset contained only 512x512 images, so every cropped image was resized with bilinear interpolation. Training on 1024x1024 images could probably improve performance further.
+
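+ We ran a small comparison against currently available AI image detectors on 12 test images. The file names indicate the true class, and each cell shows the predicted label and its confidence: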
+
+ | Image              | Nahrawy/AIorNot | umm-maybe/AI-image-detector | Organika/sdxl-detector | Ours        |
+ |--------------------|-----------------|-----------------------------|------------------------|-------------|
+ | D:\test\ai_1.jpg   | ai (100%)       | human (86%)                 | artificial (100%)      | ai (100%)   |
+ | D:\test\ai_2.jpg   | ai (99%)        | human (96%)                 | artificial (100%)      | ai (100%)   |
+ | D:\test\ai_3.jpg   | ai (77%)        | human (98%)                 | artificial (100%)      | ai (100%)   |
+ | D:\test\ai_4.jpg   | real (66%)      | human (100%)                | human (100%)           | real (100%) |
+ | D:\test\ai_5.jpg   | ai (51%)        | human (99%)                 | artificial (55%)       | real (65%)  |
+ | D:\test\ai_6.jpg   | ai (100%)       | human (98%)                 | artificial (100%)      | ai (84%)    |
+ | D:\test\real_1.jpg | ai (99%)        | human (99%)                 | artificial (100%)      | ai (55%)    |
+ | D:\test\real_2.jpg | ai (88%)        | human (100%)                | artificial (100%)      | real (85%)  |
+ | D:\test\real_3.jpg | ai (95%)        | human (96%)                 | artificial (100%)      | real (97%)  |
+ | D:\test\real_4.jpg | real (90%)      | human (100%)                | artificial (97%)       | real (94%)  |
+ | D:\test\real_5.jpg | ai (75%)        | human (100%)                | human (57%)            | real (100%) |
+ | D:\test\real_6.jpg | ai (89%)        | human (98%)                 | human (100%)           | real (99%)  |
+ | **Accuracy:**      | 50%             | 50%                         | 58%                    | **75%**     |
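
The accuracy row can be reproduced from the table itself. The sketch below assumes that the file-name prefix (`ai_` or `real_`) is the ground truth and that `artificial` and `human` are the other detectors' names for the `ai` and `real` classes. Both assumptions are inferred from the table rather than stated in the README.

```python
# Ground truth is assumed from the file names: ai_1..ai_6, then real_1..real_6
truth = ["ai"] * 6 + ["real"] * 6

# Predicted labels copied from the table, top to bottom
predictions = {
    "Nahrawy/AIorNot": ["ai", "ai", "ai", "real", "ai", "ai",
                        "ai", "ai", "ai", "real", "ai", "ai"],
    "umm-maybe/AI-image-detector": ["human"] * 12,
    "Organika/sdxl-detector": (["artificial"] * 3 + ["human"] + ["artificial"] * 2
                               + ["artificial"] * 4 + ["human"] * 2),
    "Ours": ["ai", "ai", "ai", "real", "real", "ai",
             "ai", "real", "real", "real", "real", "real"],
}

# Each detector uses its own label names; map them onto two classes
CANONICAL = {"ai": "ai", "artificial": "ai", "real": "real", "human": "real"}

def accuracy(preds, truth):
    """Fraction of predictions whose canonical label matches the truth."""
    correct = sum(CANONICAL[p] == t for p, t in zip(preds, truth))
    return correct / len(truth)

results = {name: accuracy(preds, truth) for name, preds in predictions.items()}
for name, acc in results.items():
    print(f"{name}: {acc:.0%}")
```

With only 12 images, a single misclassification shifts accuracy by roughly 8 percentage points, so these figures are best read as a rough indication rather than a benchmark.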
+
+
+ ## Usage
+
+ Example inference code:
+
+ ```python
+ # AutoImageProcessor replaces the deprecated AutoFeatureExtractor for vision models
+ from transformers import AutoModelForImageClassification, AutoImageProcessor
+ import torch
+ from PIL import Image
+
+ model_id = "legekka/AI-Anime-Image-Detector-ViT"
+ model = AutoModelForImageClassification.from_pretrained(model_id)
+ processor = AutoImageProcessor.from_pretrained(model_id)
+
+ # Switch to inference mode (disables dropout)
+ model.eval()
+
+ # Convert to RGB so grayscale, palette, or RGBA files also work
+ image = Image.open("example.jpg").convert("RGB")
+ inputs = processor(images=image, return_tensors="pt")
+
+ # Gradients are not needed for inference
+ with torch.no_grad():
+     logits = model(**inputs).logits
+
+ # Softmax over the class dimension, then pick the most likely class
+ probs = torch.softmax(logits, dim=-1)[0]
+ pred = int(probs.argmax())
+ label = model.config.id2label[pred]
+ confidence = probs[pred].item()
+
+ print(f"Prediction: {label} ({round(confidence * 100)}%)")
+ ```