ByteDance unveils OmniHuman-1, an AI model that can generate realistic videos from photos: Details

ByteDance, the parent company of TikTok, has introduced a new artificial intelligence model called OmniHuman-1. The model is designed to generate realistic videos from photos and sound clips. The development follows OpenAI's decision to expand access to its video-generation tool, Sora, for ChatGPT Plus and Pro users in December 2024. Google has also introduced a model capable of generating high-definition videos based on text or image inputs. However, neither OpenAI's nor Google's models that convert photos into videos are publicly available.

A technical paper, reviewed by the South China Morning Post, highlights that OmniHuman-1 specializes in generating videos of individuals speaking, singing, and moving. The research team behind the model claims that its performance surpasses existing AI tools that generate human videos based on audio. Although ByteDance has not released the model for public use, sample videos have circulated online. One of these, a 23-second clip of Albert Einstein appearing to give a speech, has been shared on YouTube.

Insights from ByteDance Researchers

ByteDance researchers, including Lin Gaojie, Jiang Jianwen, Yang Jiaqi, Zheng Zerong, and Liang Chao, have detailed their approach in a recent technical paper. They introduced a training method that integrates multiple datasets, combining text, audio, and movement to improve video-generation models. This strategy addresses scalability challenges that researchers have faced in advancing similar AI tools.

The research highlights that this method enhances video generation without reference models. By mixing different types of data, the AI can generate videos with varied aspect ratios and body proportions, ranging from close-up shots to full-body visuals. The model produces detailed facial expressions synchronized with audio, along with natural head and gesture movements. These features could lead to broader applications in various industries.

Among the sample videos released, one features a man delivering a TED Talk-style speech with hand gestures and lip movements synchronized with the audio. Observers noted that the video closely resembles a real-life recording.
