Lumiere multimodal AI unveiled: it can create 5-second videos from text and images


Google has launched its latest AI model. Called Lumiere, the multimodal video generation tool is capable of producing realistic 5-second-long videos using just text or still images as prompts.

With the release, Google is positioning itself to challenge OpenAI’s dominance in AI.

Google has introduced its latest artificial intelligence model, Lumiere, a multimodal video generation tool capable of producing realistic 5-second-long videos.

Lumiere supports both text-to-video and image-to-video generation, using a Space-Time U-Net (STUNet) architecture to enhance the realism of motion in AI-generated videos.


Unlike existing models such as Runway Gen-2 and Pika 1.0, Lumiere has not been made public yet.

According to a preprint paper accompanying the release, Lumiere’s innovation lies in generating the entire video in a single process rather than combining still frames.

This approach allows for the simultaneous creation of both spatial (objects in the video) and temporal (movement within the video) aspects, resulting in a more natural perception of motion.

Lumiere generates 80 frames, compared to Stable Video Diffusion’s 25 frames, utilizing spatial and temporal down- and up-sampling and leveraging a pre-trained text-to-image diffusion model.
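The spatial and temporal down-sampling the paper describes can be illustrated in miniature. The snippet below is only a conceptual stand-in, using simple average pooling in NumPy rather than the model’s learned convolutional layers, to show how a Space-Time U-Net-style architecture shrinks a clip along both the spatial and temporal axes at once:

```python
import numpy as np

def downsample_space_time(video, s=2, t=2):
    """Block-average a video tensor over space and time.

    video: array of shape (frames, height, width, channels).
    Illustrative only: the real Space-Time U-Net uses trained
    convolutions, not average pooling.
    """
    f, h, w, c = video.shape
    # Trim so each dimension divides evenly by its pooling factor.
    video = video[: f - f % t, : h - h % s, : w - w % s]
    f, h, w, _ = video.shape
    # Split each axis into (blocks, block_size) and average the blocks.
    video = video.reshape(f // t, t, h // s, s, w // s, s, c)
    return video.mean(axis=(1, 3, 5))

# An 80-frame, 64x64 RGB clip shrinks to 40 frames at 32x32.
clip = np.random.rand(80, 64, 64, 3)
small = downsample_space_time(clip)
print(small.shape)  # (40, 32, 32, 3)
```

Pooling time and space together, rather than processing each frame independently and stitching frames afterward, is the intuition behind generating the whole clip in one pass.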

Although Lumiere is not available for testing, its website showcases various videos created using the AI model, along with the corresponding text prompts and input images.

The tool can produce videos in different styles, create cinemagraphs that animate only a selected region of a still image, and perform inpainting, filling in masked-out regions of a video based on prompts.

Google’s Lumiere competes with existing AI models like Runway Gen-2 (launched in March 2023) and Pika Labs’ Pika 1.0, both accessible to the public.

Pika can create 3-second-long videos (extendable by up to 4 more seconds), while Runway can generate videos up to 4 seconds long. Both models offer multimodal capabilities and support video editing.

(With inputs from agencies)


