These AI-Synthesized Sound Effects Are Realistic Enough to Fool Humans


Films are immersive experiences, made to impress viewers with engaging plotlines and dazzling special effects. While some sounds may be recorded at the time of filming, movies also rely on convincing sound effects — often created during post-production by someone known as a Foley artist — to fill in all-important background noises like footsteps, rustling leaves or falling raindrops and lend a film a sense of reality. Not surprisingly, creating and integrating such sound effects is a time-consuming and costly part of any film budget.

Now, new work from a University of Texas at San Antonio research team shows that the Foley process can be automated — using artificial intelligence that can analyze motion in a given video, and then generate its own matching artificial sound effects.

A ‘Deep Sound Synthesis Network’

Dubbed AutoFoley, the team’s system uses deep learning AI to create what they call a “deep sound synthesis network,” which can analyze, categorize and recognize what kind of action is happening in a video frame, and then produce the appropriate sound effect to enhance video that may or may not already have some sound.


“Unlike existing sound prediction and generation architectures, our algorithm is capable of precise recognition of actions as well as inter-frame relations in fast-moving video clips,” explained the researchers in their paper, which was recently published in IEEE Transactions on Multimedia.

To achieve this, AutoFoley first identifies the action in a video clip, then selects a suitable sound from a customized database and aligns it with the timing of the movements in each frame. The first part of the system analyzes the association between movement and timing in the video frames by extracting features such as color, using a multiscale recurrent neural network (RNN) combined with a convolutional neural network (CNN). For faster-moving actions, where visual information may be missing between consecutive frames, an interpolation technique using CNNs and a temporal relational network (TRN) "fills in" the gaps and links them smoothly, so the system can still accurately time the actions along with the predicted sound.
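The paper's learned networks are far beyond a short snippet, but the gap-filling idea behind the interpolation step can be illustrated with a minimal sketch (hypothetical, using plain NumPy rather than the CNN/TRN the authors describe): estimating the missing in-between frames by blending linearly between two sampled frames' feature vectors.

```python
import numpy as np

def interpolate_frames(frame_a, frame_b, num_missing):
    """Estimate feature vectors for frames missing between two sampled ones.

    A toy stand-in for AutoFoley's learned interpolation: here we simply
    blend linearly between the two known frames' features.
    """
    frame_a = np.asarray(frame_a, dtype=float)
    frame_b = np.asarray(frame_b, dtype=float)
    # weights exclude the endpoints, so only in-between frames are produced
    weights = np.linspace(0.0, 1.0, num_missing + 2)[1:-1]
    return [(1 - w) * frame_a + w * frame_b for w in weights]

# two consecutive sampled frames, reduced to toy 3-dimensional features
filled = interpolate_frames([0.0, 0.0, 0.0], [1.0, 2.0, 4.0], num_missing=3)
```

A real system would learn this mapping from data; linear blending merely shows why even a crude fill can keep the predicted sound aligned with fast motion.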

Diagram of the architecture of AutoFoley, showing the stages of sound prediction and sound generation.

Next, AutoFoley synthesizes a sound to correspond with the action identified from the video in the previous steps. To aid in its training, the team curated their own database of common sound effects, categorized in different “sound classes” that included things like rainfall, crackling fire, galloping horses, breaking objects, and typing.
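The paper does not publish its database layout, but the select-a-matching-sound step can be pictured as a simple class-keyed lookup. All class and file names below are hypothetical placeholders, not from the researchers' dataset:

```python
import random

# hypothetical subset of the "sound classes" described in the article
SOUND_DB = {
    "rainfall": ["rain_light.wav", "rain_heavy.wav"],
    "horse":    ["gallop_dirt.wav", "gallop_road.wav"],
    "typing":   ["keyboard_fast.wav"],
}

def pick_sound(action_class, rng=random):
    """Return a candidate sound file for the recognized action class."""
    candidates = SOUND_DB.get(action_class)
    if not candidates:
        raise KeyError(f"no sounds recorded for class {action_class!r}")
    return rng.choice(candidates)

clip = pick_sound("typing")
```

In the actual system, the chosen sound is then synthesized and time-aligned to the clip rather than played back verbatim.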

“Our interest is to enable our Foley generation network to be trained with the exact natural sound produced in a particular movie scene,” said the researchers. “To do so, we need to train the system explicitly with the specific categories of audio-visual scenes that are closely related to manually generated Foley tracks for silent movie clips.”

Some of the sounds in the database were created by the team, while others were culled from online videos. All told, the researchers' Automatic Foley Dataset (AFD) contains sounds from 1,000 videos across 12 classes, with each video averaging about five seconds. Applied to sample video clips, the resulting AI-synthesized audio sounds quite realistic.


To test how convincing the results were, the research team presented the finalized videos with the AI-generated sound effects to 57 volunteers. Surprisingly, 73% of participants believed that the synthesized AutoFoley sounds were actually the original soundtracks — a significant improvement over comparable methods that also generate sound from visual inputs.
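For scale: 73% of 57 listeners is roughly 42 people. A quick normal-approximation confidence interval (my own back-of-the-envelope calculation, not from the paper) shows how much uncertainty remains with that few raters:

```python
import math

n, p_hat = 57, 0.73
fooled = round(n * p_hat)  # roughly 42 participants judged the audio real

# standard error of a proportion, normal approximation
se = math.sqrt(p_hat * (1 - p_hat) / n)
lo, hi = p_hat - 1.96 * se, p_hat + 1.96 * se
# 95% CI is roughly 0.61 to 0.85: well above chance, but a wide margin
```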

To improve their model, the researchers now plan to expand their training dataset to include a wider variety of realistic-sounding audio clips, in addition to further optimizing time synchronization. The team also aims to boost the system's computational efficiency so that it can process and generate sound effects in real time. With AI now able to generate rather convincing pieces of music, literature, informational texts, and even faked videos of politicians or famous works of art that are almost indistinguishable from the real thing, it was only a matter of time before machines fooled humans with artificially created sounds as well.

Read more in the team’s paper.

Images: Eduardo Santos Gonzaga via Pixabay; University of Texas at San Antonio

Source: InApps.net
