LeVanLoi'log, ⌚ 2023-11-17
***
Emu Video & Emu Edit
Author: Meta AI
Today we’re announcing two new milestones in our generative AI research: Emu Video & Emu Edit.
These two new models build upon our previous work in both images and videos to deliver impressive new results in high quality, diffusion-based text-to-video generation and controlled image editing using just text instructions.
-
More details ➡️ https://ai.meta.com/blog/emu-text-to-video-generation-image-editing-research/
1️⃣ Emu Video
This new text-to-video model leverages our Emu image generation model and can respond to text-only, image-only or combined text & image inputs to generate high quality video. It uses a factorized approach that not only allows us to train video generation models more efficiently but also produces higher quality video generations.
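The factorized approach described above can be sketched as two conditional generation steps: first a keyframe image is generated from the text prompt, then video frames are generated conditioned on both the prompt and that image. The function names below are hypothetical stand-in stubs, not Meta's actual API.

```python
# Hypothetical sketch of a factorized text-to-video pipeline.
# Both generator functions are placeholder stubs, not Meta's real models.

def generate_image(prompt: str) -> str:
    # Stage 1: a text-to-image diffusion model (Emu, in Meta's case) turns
    # the prompt into a single high-quality keyframe. Stubbed as a string.
    return f"image({prompt})"

def generate_video(prompt: str, keyframe: str, num_frames: int = 16) -> list:
    # Stage 2: a video diffusion model generates frames conditioned on BOTH
    # the text prompt and the stage-1 keyframe, so it mainly has to model
    # motion rather than appearance -- the claimed source of efficiency.
    return [f"frame{i}({prompt}, {keyframe})" for i in range(num_frames)]

def text_to_video(prompt: str, num_frames: int = 16) -> list:
    keyframe = generate_image(prompt)                     # text -> image
    return generate_video(prompt, keyframe, num_frames)   # (text, image) -> video

frames = text_to_video("a dog surfing a wave", num_frames=4)
print(len(frames))  # 4 frames, all conditioned on the same keyframe
```

Splitting generation this way lets each stage be trained on a simpler conditional problem than direct text-to-video.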
In human evaluations based on output quality, 96% of respondents preferred this model's outputs over those of our previous model.
2️⃣ Emu Edit
This new model is capable of free-form editing through text instructions. Unlike many existing models, Emu Edit precisely follows instructions and ensures that only the specified elements of the input image are edited, leaving areas unrelated to the instruction untouched. This enables more powerful and reliable editing and iteration.
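The key property claimed here, that regions unrelated to the instruction stay untouched, can be illustrated with a toy stand-in. Real Emu Edit operates on pixels with a diffusion model; the dict-of-regions representation and the `edit` function below are purely illustrative assumptions.

```python
# Toy illustration of instruction-faithful editing: only the region the
# instruction targets is changed; every other region is returned unchanged.

def edit(image: dict, target_region: str, new_value: str) -> dict:
    # Copy first, so the input image and all unrelated regions are preserved.
    edited = dict(image)
    if target_region in edited:
        edited[target_region] = new_value
    return edited

photo = {"sky": "blue", "dog": "sitting", "grass": "green"}
result = edit(photo, "dog", "wearing a hat")
# Only "dog" changed; "sky" and "grass" are identical to the input,
# and the original photo dict itself is unmodified.
```

The point of the toy is the contract, not the mechanism: an instruction-following editor should behave like an identity function everywhere the instruction does not apply.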
To train the model, we developed a dataset containing 10M synthesized samples of input images, instructions, and target outputs — the largest dataset of its kind to date. As a result, the model sets a new state of the art in both qualitative and quantitative evaluations across a range of image editing tasks.
While this work is still fundamental research, we see exciting potential for future use cases in which this technology enhances the way we share, communicate, and express ourselves creatively across our family of apps. We're excited to continue pushing this field of work forward.