Google DeepMind’s new AI tech will generate soundtracks for videos

Google’s DeepMind artificial intelligence laboratory is working on a new technology that can generate soundtracks, and even dialogue, to go along with videos. The lab has shared its progress on the video-to-audio (V2A) technology project, which can be paired with Google Veo and other video creation tools like OpenAI’s Sora. In its blog post, the DeepMind team explains that the system can understand raw pixels and combine that information with text prompts to create sound effects for what’s happening onscreen. Notably, the tool can also be used to make soundtracks for traditional footage, such as silent films and any other video without sound.

DeepMind’s researchers trained the technology on video, audio and AI-generated annotations that contain detailed descriptions of sounds and dialogue transcripts. They said that by doing so, the technology learned to associate specific sounds with visual scenes. As TechCrunch notes, DeepMind’s team isn’t the first to release an AI tool that can generate sound effects (ElevenLabs released one recently, as well), and it won’t be the last. “Our research stands out from existing video-to-audio solutions because it can understand raw pixels and adding a text prompt is optional,” the team writes.

While the text prompt is optional, it can be used to shape and refine the final product so that it’s as accurate and as realistic as possible. You can enter positive prompts to steer the output toward the sounds you want, for instance, or negative prompts to steer it away from the sounds you don’t want. In one sample, the team used the prompt: “Cinematic, thriller, horror film, music, tension, ambience, footsteps on concrete.”

The researchers admit that they’re still trying to address their V2A technology’s existing limitations, like the drop in the output’s audio quality that can happen if there are distortions in the source video. They’re also still working on improving lip synchronization for generated dialogue. In addition, they vow to put the technology through “rigorous safety assessments and testing” before releasing it to the world.
