AMC Faculty Shine at CVPR 2025 with High-caliber Research and Workshop Leadership
16/07/2025
At the CVPR 2025 Workshop: AI for Creative Visual Content Generation, Editing and Understanding, co-organized by AMC faculty member Prof. Anyi RAO

The Division of Arts and Machine Creativity (AMC) proudly celebrates its contributions to the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2025, one of the world’s premier conferences in computer vision and artificial intelligence.


This year, AMC faculty members made significant strides at the intersection of AI, machine creativity, and visual computing, with eight research papers accepted and a major workshop co-organized at the conference. These achievements reflect the Division’s commitment to pioneering the convergence of art and technology.


Our warmest congratulations to Prof. FU Hongbo, Prof. LUO Wenhan, Prof. XUE Wei, Prof. RAO Anyi and Prof. Harry YANG, as well as all their collaborators, on these remarkable contributions. We look forward to seeing more innovative work from AMC faculty members as they continue to shape the future of machine creativity.

 

Accepted Research Papers

 

  1. SketchVideo: Sketch-based Video Generation and Editing
    Feng-Lin Liu, Hongbo Fu, Xintao Wang, Weicai Ye, Pengfei Wan, Di Zhang, Lin Gao

    Introduces a sketch-driven framework for video generation and editing, enabling precise spatial and motion control using keyframe sketches. The method supports fine-grained edits while preserving original content, offering a powerful tool for intuitive video creation.

  2. VODiff: Controlling Object Visibility Order in Text-to-Image Generation
    Dong Liang, Jinyuan Jia, Yuhao Liu, Zhanghan Ke, Hongbo Fu, Rynson W.H. Lau

    Presents a training-free framework for controlling object occlusion in text-to-image synthesis. Through a Sequential Denoising Process (SDP) and a Visibility-Order-Aware (VOA) Loss, VODiff generates photorealistic images that respect user-defined spatial layouts and object visibility orders.

  3. StyleMaster: Stylize Your Video with Artistic Generation and Translation
    Zixuan Ye, Huijuan Huang, Xintao Wang, Pengfei Wan, Di Zhang, Wenhan Luo

    Introduces a novel video stylization framework that enhances global and local style fidelity through prompt-patch similarity filtering, contrastive learning with model-generated pairs, and a lightweight motion adapter, achieving superior style resemblance and temporal coherence without content leakage.

  4. OSV: One Step is Enough for High-Quality Image to Video Generation
    Xiaofeng Mao, Zhengkai Jiang, Fu-Yun Wang, Jiangning Zhang, Hao Chen, Mingmin Chi, Yabiao Wang, Wenhan Luo

    Proposes a two-stage training framework that combines consistency distillation with GAN training, enhanced by an efficient video discriminator, to enable high-quality one-step video generation. The method outperforms existing methods while allowing optional multi-step refinement.

  5. VideoRepainter: Keyframe-Guided Creative Video Inpainting
    Yuwei Guo, Ceyuan Yang, Anyi Rao, Chenlin Meng, Omer Bar-Tal, Shuangrui Ding, Maneesh Agrawala, Dahua Lin, Bo Dai

    Introduces a two-stage framework for video inpainting that leverages keyframe-based image editing and propagates changes across frames. By integrating a symmetric condition mechanism and efficient mask synthesis, it enables visually coherent and creatively diverse video edits with reduced computational cost.

  6. VidMuse: A Simple Video-to-Music Generation Framework with Long-Short-Term Modeling
    Zeyue Tian, Zhaoyang Liu, Ruibin Yuan, Jiahao Pan, Qifeng Liu, Xu Tan, Qifeng Chen, Wei Xue, Yike Guo

    Establishes a pioneering video-to-music generation framework that first understands the visual content and then generates temporally and semantically aligned music. Meta's Movie Gen paper cites this work as the most closely related work on video-to-music generation.

  7. PSHuman: Photorealistic Single-image 3D Human Reconstruction using Cross-Scale Multiview Diffusion and Explicit Remeshing
    Peng Li, Wangguandong Zheng, Yuan Liu, Tao Yu, Yangguang Li, Xingqun Qi, Xiaowei Chi, Siyu Xia, Yan-Pei Cao, Wei Xue, Wenhan Luo, Yike Guo

    Introduces a novel framework using cross-scale multiview diffusion and explicit remeshing to generate highly detailed and photorealistic 3D human models from a single RGB image. The framework addresses key challenges such as face distortion and self-occlusions, producing high-fidelity multiview images with consistent geometry and realistic textures.

  8. Unveiling the Ignorance of MLLMs: Seeing Clearly, Answering Incorrectly
    Yexin Liu, Zhengyang Liang, Yueze Wang, Xianfeng Wu, Feilong Tang, Muyang He, Jian Li, Zheng Liu, Harry Yang, Sernam Lim, Bo Zhao

    Reveals a critical flaw in Multimodal Large Language Models (MLLMs): they often answer incorrectly even when they have correctly understood the visual content. Proposes new benchmarks and refinement strategies to improve visual attention and reasoning accuracy.

Workshop Co-Organization

 

AI for Creative Visual Content Generation, Editing and Understanding

Organized by Ozgur Kara, Fabian Caba Heilbron, Anyi Rao, Victor Escorcia, Ruihan Zhang, Mia Tang, Dong Liu, Maneesh Agrawala, James Rehg

 

Now in its sixth edition, this workshop unites researchers, artists, and innovators to explore how AI empowers creative visual content creation and interpretation across disciplines.
 

 

 

