Abstract
The sampling process of Denoising Diffusion Probabilistic Models (DDPMs) can be accelerated by leveraging second-order information in the form of approximations to the denoising posterior covariance, allowing samples of acceptable quality to be produced in fewer but larger sampling steps. Previous attempts at using such information have relied on drastic (e.g. diagonal) simplifications of the covariance. These do not do justice to the peculiar statistical structure of natural images, which exhibit strong non-diagonal correlations between pixels and color channels, and a slowly decaying power-law frequency spectrum. Here, we develop a novel covariance model that captures these features. Our Kronecker-DCT (K-DCT) model uses a Kronecker-factored decomposition into inter-color covariances and spatial covariances, with the latter modeled in the frequency domain using the Discrete Cosine Transform (DCT). The use of the DCT reduces the computational complexity from quadratic to log-linear, resulting in negligible computational and memory overhead in each denoising step. By learning K-DCT-structured amortizations of the denoising posterior covariance using pre-trained score models on the CIFAR-10, Celeb-A, ImageNet and LSUN datasets, we show improved performance compared to previous state-of-the-art (SOTA) denoising samplers, both in terms of FID and likelihoods, especially in the regime of few denoising steps.
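As a rough illustration of the structure the abstract describes, the Python sketch below applies a covariance of the form Σ = C ⊗ (Dᵀ diag(s) D) to an image, where C is a small inter-color covariance, D is the orthonormal 2-D DCT, and s holds one variance per spatial frequency. Everything here is an assumption made for illustration: the function names, the power-law spectrum, and the example color covariance are hypothetical and are not taken from the paper, whose actual parameterization and learned amortization may differ.

```python
import numpy as np
from scipy.fft import dctn, idctn


def power_law_spectrum(height, width, alpha=2.0):
    """Hypothetical per-frequency variances that decay as a power law."""
    u = np.arange(height)[:, None]
    v = np.arange(width)[None, :]
    return (1.0 + u**2 + v**2) ** (-alpha / 2.0)


def kdct_matvec(x, color_cov, spectrum):
    """Apply Sigma = color_cov (x) (D^T diag(spectrum) D) to x of shape (C, H, W)."""
    # Spatial factor: go to the DCT domain, scale each coefficient, come back.
    coeffs = dctn(x, type=2, norm="ortho", axes=(1, 2))
    spatial = idctn(coeffs * spectrum, type=2, norm="ortho", axes=(1, 2))
    # Color factor: mix channels with the C x C covariance.
    return np.einsum("cd,dhw->chw", color_cov, spatial)


def kdct_sample(rng, color_cov, spectrum):
    """Draw one sample from N(0, Sigma) using a square root of each factor."""
    channels = color_cov.shape[0]
    height, width = spectrum.shape
    z = rng.standard_normal((channels, height, width))
    # D^T diag(sqrt(s)) z is a square root of the spatial factor applied to white noise.
    spatial = idctn(np.sqrt(spectrum) * z, type=2, norm="ortho", axes=(1, 2))
    # The Cholesky factor of the color covariance correlates the channels.
    chol = np.linalg.cholesky(color_cov)
    return np.einsum("cd,dhw->chw", chol, spatial)


# Example inputs (assumed values, chosen only so that color_cov is positive definite).
color_cov = np.array([[1.0, 0.8, 0.6],
                      [0.8, 1.0, 0.8],
                      [0.6, 0.8, 1.0]])
spectrum = power_law_spectrum(8, 8, alpha=2.0)
noise = kdct_sample(np.random.default_rng(0), color_cov, spectrum)
```

In this sketch the full covariance matrix is never formed: each product costs one forward and one inverse fast DCT plus a small channel mix, which is where the log-linear scaling mentioned above would come from, whereas a dense matrix-vector product would be quadratic in the number of pixels.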
Citation
@inproceedings{xia2026,
author = {Xia, Rui and Das, Ayan and Artemev, Artem and Zhang, Andy
and Hennequin, Guillaume and Bernacchia, Alberto},
title = {Improved Denoising Diffusion Probabilistic Models with
Efficient Non-Diagonal Covariance Modeling},
booktitle = {Transactions on Machine Learning Research (TMLR), 2026},
date = {2026-06-15},
url = {https://openreview.net/pdf?id=V6FBm4kfML},
langid = {en}
}