ImageDiff

Multi-Model AI Image Reproduction Evaluation

Evaluate how well different AI models (Nano Banana, GPT Image, Claude, etc.) reproduce an image, scored with multiple perceptual quality metrics.

Upload Image
Tip: For best comparison results, upload images with standard aspect ratios (16:9, 4:3, 1:1) or common dimensions (1024×1024, 1920×1080).

Drag & drop an image here or click to browse

Clone Prompt
Select AI Models for Comparison
Select one or more models to generate and compare image reproductions

Sensitivity for pixel difference detection (0-50)
All Models - Image Comparisons

Generate images with models to see comparisons...

Detailed Metrics Comparison (Current Image)

Click column headers to sort. Blue = best, red = worst.

Model | PSNR ↑ | SSIM ↑ | SSIMULACRA2 ↑ | Butteraugli ↓ | FLIP ↓ | LPIPS ↓ | % Diff ↓
No comparison data yet
Metric Interpretation Guide:
  • ↑ Higher is better: PSNR, SSIM, SSIMULACRA2
  • ↓ Lower is better: Butteraugli, FLIP, LPIPS, % Difference
  • Closest alignment with human perception: SSIM, SSIMULACRA2, LPIPS
  • SSIMULACRA2: Often regarded as one of the strongest metrics for perceptual quality, particularly for compression-style artifacts
  • LPIPS: Compares deep neural network features, which tends to track human similarity judgments well
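The direction rules in the guide above can be sketched as a small ranking helper that picks the best and worst model for one metric column. This is a minimal illustration, not the tool's actual code: the function name `best_and_worst`, the model names, and the scores are all hypothetical.

```python
# Direction of each metric, taken from the interpretation guide.
HIGHER_IS_BETTER = {
    "PSNR": True, "SSIM": True, "SSIMULACRA2": True,
    "Butteraugli": False, "FLIP": False, "LPIPS": False, "% Diff": False,
}

def best_and_worst(rows, metric):
    """Return (best_model, worst_model) for one metric column.

    rows maps a model name to a dict of metric name -> score.
    """
    # Sort descending when higher is better, ascending otherwise,
    # so the best model always ends up first.
    ranked = sorted(rows, key=lambda m: rows[m][metric],
                    reverse=HIGHER_IS_BETTER[metric])
    return ranked[0], ranked[-1]

# Hypothetical scores, for illustration only (not measured results).
example = {
    "model_a": {"PSNR": 28.0, "LPIPS": 0.20},
    "model_b": {"PSNR": 31.5, "LPIPS": 0.12},
    "model_c": {"PSNR": 24.3, "LPIPS": 0.35},
}
print(best_and_worst(example, "PSNR"))
print(best_and_worst(example, "LPIPS"))
```

Keeping the direction in one lookup table means the same sort works for every column, and the first and last entries are the ones a table would highlight as best and worst.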
Pixel Difference Analysis
Total Pixels: -
Different: -
Percent Diff: -
PSNR: -
MAE (R): -
MAE (G): -
MAE (B): -
MSE: -
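The pixel-level figures listed here could be computed roughly as follows. This is a sketch under stated assumptions, not the tool's implementation: images are flat lists of 8-bit (R, G, B) tuples, a pixel counts as "different" when any channel differs by more than the sensitivity threshold, and the name `pixel_diff_stats` is hypothetical.

```python
import math

def pixel_diff_stats(original, generated, sensitivity=10):
    """Compare two equal-sized images given as flat lists of (R, G, B) tuples."""
    if len(original) != len(generated) or not original:
        raise ValueError("images must be non-empty and the same size")
    total = len(original)
    different = 0
    abs_sum = [0, 0, 0]  # per-channel sums of absolute error
    sq_sum = 0           # sum of squared error over all channels
    for p, q in zip(original, generated):
        deltas = [abs(a - b) for a, b in zip(p, q)]
        # Assumption: any channel above the threshold marks the pixel as different.
        if max(deltas) > sensitivity:
            different += 1
        for c in range(3):
            abs_sum[c] += deltas[c]
            sq_sum += deltas[c] ** 2
    mse = sq_sum / (total * 3)
    # PSNR for 8-bit data: 10 * log10(MAX^2 / MSE), infinite for identical images.
    psnr = math.inf if mse == 0 else 10 * math.log10(255 ** 2 / mse)
    return {
        "total_pixels": total,
        "different": different,
        "percent_diff": 100 * different / total,
        "mae_r": abs_sum[0] / total,
        "mae_g": abs_sum[1] / total,
        "mae_b": abs_sum[2] / total,
        "mse": mse,
        "psnr": psnr,
    }
```

Raising the sensitivity lowers the "Different" count without affecting MAE, MSE, or PSNR, since those are computed from the raw channel errors.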
Advanced Perceptual Metrics
SSIM: -
Structural Similarity (0-1, higher better)
SSIMULACRA2: -
Perceptual quality (100 = identical, can go negative for very poor quality; higher better)
Butteraugli: -
Perceptual diff (0 = identical, typically 0-10; lower better)
FLIP: -
Perceived difference when flipping between two images (0-1, lower better)
LPIPS: -
Learned Perceptual Image Patch Similarity (lower better)
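To give a feel for what the SSIM score measures, here is a simplified sketch that applies the standard SSIM formula once over a whole grayscale image. Practical implementations average the formula over small local windows, so this global version is only an approximation for illustration; the function name `global_ssim` is hypothetical and this is not presented as the tool's method.

```python
def global_ssim(x, y, dynamic_range=255):
    """Single-window SSIM over two equal-length lists of grayscale values."""
    n = len(x)
    if n == 0 or n != len(y):
        raise ValueError("inputs must be non-empty and the same length")
    mu_x = sum(x) / n
    mu_y = sum(y) / n
    var_x = sum((a - mu_x) ** 2 for a in x) / n
    var_y = sum((b - mu_y) ** 2 for b in y) / n
    cov = sum((a - mu_x) * (b - mu_y) for a, b in zip(x, y)) / n
    # Standard stabilizing constants from the SSIM definition.
    c1 = (0.01 * dynamic_range) ** 2
    c2 = (0.03 * dynamic_range) ** 2
    numerator = (2 * mu_x * mu_y + c1) * (2 * cov + c2)
    denominator = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return numerator / denominator

# Hypothetical grayscale samples, for illustration only.
ramp = [10, 50, 90, 130, 170, 210]
inverted = [255 - v for v in ramp]
print(global_ssim(ramp, ramp))      # identical inputs score (approximately) 1
print(global_ssim(ramp, inverted))  # anti-correlated inputs score much lower
```

The score combines a brightness term (the means) with a contrast and structure term (the variances and covariance), which is why an inverted image scores poorly even though its contrast is unchanged. For anti-correlated inputs the value can fall below 0, although natural image pairs typically land between 0 and 1.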
Error Range | Pixel Count | Percentage | Visual
No comparison data yet
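A table with these columns could be filled in by bucketing each pixel's error into ranges. The sketch below is an assumption-laden illustration rather than the tool's logic: the bin edges in `BINS` are hypothetical, a pixel's error is taken as its largest per-channel absolute difference, and the "Visual" column is rendered as a plain text bar.

```python
# Hypothetical error ranges (inclusive) on the 0-255 scale.
BINS = [(0, 0), (1, 10), (11, 25), (26, 50), (51, 255)]

def error_range_table(original, generated, bins=BINS):
    """Return rows of (range label, pixel count, percentage, text bar)."""
    if len(original) != len(generated) or not original:
        raise ValueError("images must be non-empty and the same size")
    total = len(original)
    counts = [0] * len(bins)
    for p, q in zip(original, generated):
        # Assumption: per-pixel error is the largest channel difference.
        err = max(abs(a - b) for a, b in zip(p, q))
        for i, (lo, hi) in enumerate(bins):
            if lo <= err <= hi:
                counts[i] += 1
                break
    rows = []
    for (lo, hi), count in zip(bins, counts):
        label = str(lo) if lo == hi else f"{lo}-{hi}"
        pct = 100 * count / total
        bar = "#" * round(pct / 5)  # one '#' per 5 percentage points
        rows.append((label, count, pct, bar))
    return rows
```

Printing each row with the fields joined by " | " yields one line per error range, with the bar length giving a quick visual read of where most of the error mass sits.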