Revolutionizing Video Generation: Vidu Unveils Groundbreaking Tech to Keep Any Subject in Frame
Vidu Revolutionizes Video Generation with “Subject Reference” Function
One month after its launch, Vidu, China’s first self-developed original video model, has received a major update. The new “Subject Reference” function enables consistent generation of any subject, making video generation more stable and controllable.
World’s First: One Picture is Enough to Control the Subject
The “Subject Reference” function lets users upload a single picture of any subject. Vidu then locks onto that subject’s appearance, switches scenes freely according to text descriptions, and outputs videos featuring the same subject. The feature is not limited to a single category: it works for “any subject”, whether a person, an animal, a product, an anime character, or a fictional creation.
For character references, whether real or fictional, Vidu keeps their appearance consistent across different environments and shots. For animals, it preserves detailed features through changing environments and large movements. For products, appearance and fine details remain highly consistent across scenes.
Changing the Rules of the Game for Video Creation
Competition among large video models is becoming increasingly fierce. Although new models keep emerging, they share a core problem: a lack of controllability and consistency. Vidu’s “Subject Reference” function changes this situation. It skips the traditional step of generating storyboard images and produces video material directly from “an uploaded subject image + a scene description”.
This approach not only greatly reduces the workload but also frees video content from the constraints of storyboards, allowing creators to build rich, flexible, and varied videos from text descriptions alone. The breakthrough brings unprecedented freedom and room for innovation to video creation.
Accelerate the Creation of Story and Advertising Videos
The “Subject Reference” function has already drawn high praise from front-line creators. For example, Li Ning, founder of Guangchi Matrix and a young director, used Vidu to pre-produce a video clip of a male protagonist: every scene featuring the character was generated from just three final makeup photos of the actor.
Shi Yuxiang (Senhai Fluorescence), a director at China Central Radio and Television Station and an AIGC artist, created the animated short film “Summer Gift”. Sharing his creative process, he said that compared with the basic image-to-video function, the “Subject Reference” function freed him from the constraints of static images; the generated footage is more expressive and unrestrained, greatly improving the coherence of his work.
Subject Reference is the Beginning of AI’s Complete Narrative
As China’s first self-developed original video model, Vidu has received widespread attention overseas since its release. After its official launch at the end of July, Vidu’s performance placed it in the “top echelon” of global video models, with strengths in dynamics, semantic understanding, animation style, and fast inference.
Tang Jiayu, co-founder and CEO of Shengshu Technology, said the launch of the new “Subject Reference” feature marks the beginning of complete AI narrative, and that AI video creation will move toward a more efficient and flexible stage. Whether for short videos, animation, or commercials, in the art of narrative a complete system is an organic combination of elements such as a consistent subject, consistent scenes, and a consistent style.
From a longer-term perspective, once full controllability is achieved, the video creation industry will undergo a disruptive change. Characters, scenes, styles, and even elements such as camera work and lighting effects will become flexibly adjustable parameters. Users will only need to adjust a few parameters to complete a video work, and behind each work will stand the user’s own worldview and self-expression, built with AI.
