Training-Free Single-Image Diffusion Model Achieves Fast, High-Quality Generation
[Submitted on 3 Jun 2026]
Summary
This paper presents a training-free approach to single-image diffusion models. Instead of training a neural network on a single image (which is computationally expensive), the authors model the image using a dataset of its patches at different scales. They compute the score function for noisy patches using an optimal, closed-form denoiser, eliminating neural network training entirely. The method achieves state-of-the-art generation quality and diversity compared to trained single-image diffusion models, and enables applications like unconditional generation, text-guided stylization, image symmetrization, and retargeting. It also supports latent space diffusion and acceleration techniques for megapixel generation in one second and gigapixel generation in minutes.
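To make the closed-form idea concrete, here is a minimal sketch in Python with NumPy. It is not the authors' implementation. It assumes a grayscale image, a single patch scale, and an additive Gaussian noise model (noisy = clean + sigma * noise), none of which are specified in this summary. Under those assumptions, if clean patches are drawn uniformly from the patch dataset, the optimal denoiser is a softmax-weighted average of the dataset patches, and the score follows from Tweedie's formula. The function names extract_patches, optimal_denoise, and score are hypothetical.

```python
import numpy as np


def extract_patches(image, size):
    """Collect every size x size patch of a 2-D image as flattened rows.

    Assumption: single scale, grayscale. The paper uses patches at
    several scales; this sketch handles one scale only.
    """
    windows = np.lib.stride_tricks.sliding_window_view(image, (size, size))
    return windows.reshape(-1, size * size)


def optimal_denoise(noisy, patches, sigma):
    """Posterior mean E[x | noisy] for x uniform over `patches`.

    Assumes noisy = x + sigma * standard Gaussian noise. The result is a
    softmax over negative squared distances, applied to the patch rows.
    """
    sq_dist = np.sum((patches - noisy) ** 2, axis=1)
    logits = -sq_dist / (2.0 * sigma ** 2)
    logits -= logits.max()  # subtract the max for numerical stability
    weights = np.exp(logits)
    weights /= weights.sum()
    return weights @ patches


def score(noisy, patches, sigma):
    """Score of the noisy patch density via Tweedie's formula."""
    return (optimal_denoise(noisy, patches, sigma) - noisy) / sigma ** 2


# Usage on a random stand-in image (illustrative data only).
rng = np.random.default_rng(0)
image = rng.random((16, 16))
patches = extract_patches(image, 4)
noisy = patches[0] + 0.01 * rng.standard_normal(16)
denoised = optimal_denoise(noisy, patches, sigma=0.01)
```

At small sigma the weights concentrate on the nearest dataset patch, and at large sigma they flatten toward the mean patch. Every evaluation compares the noisy patch against the whole dataset, which is presumably why acceleration techniques matter at megapixel and gigapixel sizes.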
Key quotes
We model the image using a dataset of its patches at different scales.
The score function for a noisy patch can be computed tractably using an optimal, closed-form denoiser, eliminating the need for neural network training.
Our approach achieves state-of-the-art generation quality and diversity compared to trained single-image diffusion models.
We demonstrate multiple additional acceleration techniques to achieve megapixel single-image generation in one second, and gigapixel generation in minutes.
