A Comparative Analysis Between GAN and Diffusion Models in Image Generation

Authors

  • Yingying Peng

DOI:

https://doi.org/10.62051/0f1va465

Keywords:

Image generation; generative neural network; diffusion model.

Abstract

In the field of artificial intelligence, image-generation techniques have been a research hotspot. Two generative models that have attracted particular attention are diffusion models and generative adversarial networks (GANs). This review compares and analyzes GANs and diffusion models in the field of image generation, and provides a thorough discussion of their characteristics, applications, advantages, and drawbacks. First, the working principles of GANs and diffusion models are introduced; their applications and results in image generation are then reviewed. By comparing existing research results, the author finds that GANs excel at generating realistic images but suffer from problems such as mode collapse and unstable training, while diffusion models offer better stability and controllability. Building on the complementary strengths of the two approaches, this paper explores possible fusion methods and outlines future development directions in the field of image generation. These findings provide useful references and insights for further raising the quality and broadening the application scope of image-generation technology.
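For concreteness, the working principles the abstract refers to can be summarized by the standard textbook formulations of the two paradigms (a brief recap of the objectives of Goodfellow et al., 2014 and Ho et al., 2020, not an excerpt from the paper itself):

% Standard GAN minimax objective: generator G and discriminator D
% play a two-player adversarial game (Goodfellow et al., 2014).
\min_G \max_D V(D, G) =
    \mathbb{E}_{x \sim p_{\text{data}}}\bigl[\log D(x)\bigr]
  + \mathbb{E}_{z \sim p_z}\bigl[\log\bigl(1 - D(G(z))\bigr)\bigr]

% DDPM forward process: a fixed Markov chain gradually corrupts the
% image with Gaussian noise according to a variance schedule \beta_t.
q(x_t \mid x_{t-1}) = \mathcal{N}\bigl(x_t;\ \sqrt{1 - \beta_t}\, x_{t-1},\ \beta_t \mathbf{I}\bigr)

% DDPM reverse process: a learned network denoises step by step,
% inverting the forward chain (Ho et al., 2020).
p_\theta(x_{t-1} \mid x_t) = \mathcal{N}\bigl(x_{t-1};\ \mu_\theta(x_t, t),\ \Sigma_\theta(x_t, t)\bigr)

The contrast in these objectives mirrors the abstract's findings: the adversarial min-max game is the source of GANs' training instability and mode collapse, whereas diffusion training reduces to a sequence of simple denoising steps, which is why it is comparatively stable and controllable.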

Published

12-08-2024

How to Cite

Peng, Y. (2024) “A Comparative Analysis Between GAN and Diffusion Models in Image Generation”, Transactions on Computer Science and Intelligent Systems Research, 5, pp. 189–195. doi:10.62051/0f1va465.