Continuous Diffusion for Mixed-Type Tabular Data
Published:
The noise schedule is a key design parameter for diffusion models. It determines how the magnitude of the noise varies over the course of the diffusion process.
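As a rough illustration of what a noise schedule is, the sketch below maps a time t in [0, 1] to a noise standard deviation and uses it to corrupt a clean value. The geometric form, the function names `sigma` and `add_noise`, and the values of `sigma_min` and `sigma_max` are assumptions chosen for the example; they are not taken from the paper.

```python
import math
import random

def sigma(t, sigma_min=0.01, sigma_max=10.0):
    """Hypothetical geometric schedule: noise level at time t in [0, 1].

    Interpolates log-linearly from sigma_min (t = 0) to sigma_max (t = 1).
    """
    return sigma_min ** (1.0 - t) * sigma_max ** t

def add_noise(x0, t, rng):
    """Corrupt a clean value x0 with Gaussian noise of std sigma(t)."""
    return x0 + sigma(t) * rng.gauss(0.0, 1.0)

rng = random.Random(0)           # fixed seed so the sketch is reproducible
times = [0.0, 0.25, 0.5, 0.75, 1.0]
levels = [sigma(t) for t in times]                # noise grows with t
noisy = [add_noise(1.0, t, rng) for t in times]   # increasingly corrupted samples
```

Changing the shape of `sigma` changes how quickly information about the clean value is destroyed as t increases, which is the design choice the schedule controls.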
Our paper, Continuous Diffusion for Mixed-Type Tabular Data, was featured for its noise schedule design in a recent blog post by Google’s Sander Dieleman:
In Continuous Diffusion for Mixed-Type Tabular Data, Mueller et al. (2023) extend the time warping mechanism to heterogeneous data, and use it to learn different noise level distributions for different data types. This is useful in the context of continuous diffusion on embeddings which represent discrete categories, because a given corruption process may destroy the underlying categorical information at different rates for different data types. Adapting to the data type compensates for this, and ensures information is destroyed at the same rate across all data types.
If you are interested in diffusion probabilistic models in general, the full blog post is also worth reading here.