Advanced text-to-image AI models typically require immense computational resources, limiting their accessibility to researchers and developers with substantial hardware. LightGen, an efficient text-to-image model, changes that: it delivers impressive results while dramatically reducing computational requirements, taking a significant step toward democratizing AI image generation.
LightGen was developed by a research team led by Prof. Harry YANG, Assistant Professor in the Division of Arts and Machine Creativity (AMC), in collaboration with Everlyn AI and the University of Central Florida. It combines Knowledge Distillation (KD) and Direct Preference Optimization (DPO) to cut training time from thousands of GPU days to just 88 GPU days. This breakthrough makes cutting-edge image generation technology more accessible to smaller research teams and developers with limited computing power.
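The article does not reproduce the paper's exact objectives, but the following is a minimal sketch of how a knowledge-distillation term and a DPO-style preference term can be combined in a single training objective. The function names, temperature, beta, and loss weighting are illustrative assumptions, not LightGen's actual implementation.

```python
import torch
import torch.nn.functional as F

# Hedged sketch: illustrates the general shape of a combined
# KD + DPO objective; hyperparameters and interfaces are assumptions.

def kd_loss(student_logits, teacher_logits, temperature=2.0):
    """Soft-target distillation: pull the student's output distribution
    toward the frozen teacher's, softened by a temperature."""
    t = temperature
    p_teacher = F.softmax(teacher_logits / t, dim=-1)
    log_p_student = F.log_softmax(student_logits / t, dim=-1)
    # Scale by t^2 to keep gradient magnitudes comparable across temperatures.
    return F.kl_div(log_p_student, p_teacher, reduction="batchmean") * t * t

def dpo_loss(logp_preferred, logp_rejected,
             ref_logp_preferred, ref_logp_rejected, beta=0.1):
    """DPO-style preference loss: reward the policy for ranking the
    preferred sample above the rejected one, measured relative to a
    frozen reference model."""
    margin = beta * ((logp_preferred - ref_logp_preferred)
                     - (logp_rejected - ref_logp_rejected))
    return -F.logsigmoid(margin).mean()

# One possible combined step (the weight lambda_dpo is an assumption):
# total_loss = kd_loss(student_out, teacher_out) + lambda_dpo * dpo_loss(...)
```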
Rather than relying on massive datasets, the team found that data diversity matters more than volume. Using just 2 million high-quality synthetic images generated from varied captions, they created a model that performs exceptionally well across image generation tasks, excelling particularly at single-object, dual-object, and color synthesis.
LightGen is currently available to developers and researchers on GitHub and Hugging Face. The findings were published in the paper titled "LightGen: Efficient Image Generation through Knowledge Distillation and Direct Preference Optimization".
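As a hedged starting point, the released checkpoint can presumably be fetched with the standard huggingface_hub client; the repository id below is a placeholder, not a confirmed identifier, so check the project's GitHub and Hugging Face pages for the actual repo name and inference code.

```python
# Hypothetical download of the released weights via the standard
# Hugging Face client; the repo id below is a placeholder, not confirmed.
from huggingface_hub import snapshot_download

local_dir = snapshot_download(repo_id="Everlyn-AI/LightGen")  # placeholder repo id
print(f"Checkpoint files downloaded to: {local_dir}")
```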
More coverage:
No hundred-GPU cluster required! HKUST and collaborators open-source LightGen: an ultra-low-cost text-to-image approach that rivals SOTA models
Sina News (新浪網) – https://news.sina.cn/ai/2025-03-19/detail-ineqcrww6073149.d.html
Sina Hong Kong (新浪網(香港)) – https://portal.sina.com.hk/technology/sina/2025/03/19/1142632/無需百卡集群!港科等開源lightgen-極低成本文生圖方案/