It is possible that this connection was emphasized for marketing purposes, to generate interest and excitement around the development of these models.
When I first learned about diffusion models, I immediately drew a connection to denoising autoencoders, as both attempt to recover underlying information from noisy data.
However, articles about diffusion models do not discuss the differences between the two. From my perspective, denoising autoencoders may have played a role in inspiring the development of diffusion models.
The differences are thoroughly discussed here. From my point of view, while there is this surface-level similarity, the training process for diffusion models is a direct result of the mathematical framework on which they are defined, which is distinct from that of denoising autoencoders.
I'd suggest reading the above blog if you are interested. It is thorough!
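To make the difference in training concrete, here is a minimal sketch of how a single training example might be built in each case. It assumes the standard DDPM formulation for the diffusion side; the function names, the noise level `sigma`, and the linear beta schedule values are illustrative choices of mine, not something taken from the blog.

```python
import numpy as np

rng = np.random.default_rng(0)

def dae_training_pair(x0, sigma=0.3):
    """Denoising autoencoder: one fixed noise level, target is the clean input."""
    noisy = x0 + sigma * rng.standard_normal(x0.shape)
    return noisy, x0  # (network input, regression target)

def ddpm_training_pair(x0, alpha_bar):
    """DDPM-style: a random timestep t, target is the noise that was added."""
    t = int(rng.integers(0, len(alpha_bar)))
    eps = rng.standard_normal(x0.shape)
    # Closed-form forward process: x_t = sqrt(alpha_bar_t) * x0 + sqrt(1 - alpha_bar_t) * eps
    x_t = np.sqrt(alpha_bar[t]) * x0 + np.sqrt(1.0 - alpha_bar[t]) * eps
    return (x_t, t), eps  # (network input incl. timestep, regression target)

# Assumed linear beta schedule; alpha_bar_t is the running product of (1 - beta_s).
betas = np.linspace(1e-4, 0.02, 1000)
alpha_bar = np.cumprod(1.0 - betas)

x0 = rng.standard_normal((4, 8))          # stand-in for a batch of data
noisy, dae_target = dae_training_pair(x0)
(x_t, t), ddpm_target = ddpm_training_pair(x0, alpha_bar)
```

In this sketch the autoencoder sees one corruption level and regresses to the clean data, while the diffusion model is conditioned on a timestep drawn from a whole noise schedule and regresses to the noise itself. That schedule and target come out of the forward process the model is defined on.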