I'm not saying this tool is bad, but I would be really careful about using tools like this in an environment where audio quality really matters (YouTube videos, podcasts, etc.).
Noise reduction tools work by removing specific frequencies from the source, some of which overlap with your natural voice.
This is why you start to sound robotic and get weird cutouts if you try to use tools to remove too much noise or background sound. It's one of those things where, if you're not used to hearing your entire vocal range, you might not be aware of how much is getting cut out by tools that reduce noise.
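For anyone curious, the classic spectral-subtraction idea looks roughly like the sketch below. All the numbers are made up purely for illustration; real tools estimate the noise floor from quiet frames and work on overlapping FFT windows. The point is just that bins where your voice and the noise overlap lose voice energy too.

    /* Rough sketch of classic spectral subtraction on one frame's
     * magnitude spectrum.  Hypothetical values, for illustration only. */
    #include <stdio.h>

    #define BINS 8

    int main(void) {
        /* Magnitude spectrum of one frame of speech (made-up numbers). */
        float speech[BINS]      = {0.9f, 2.1f, 3.0f, 1.2f, 0.8f, 0.5f, 0.3f, 0.2f};
        /* Estimated noise floor per bin (e.g. a keyboard or a fan). */
        float noise_floor[BINS] = {0.2f, 0.4f, 1.5f, 1.0f, 0.7f, 0.1f, 0.1f, 0.1f};

        for (int i = 0; i < BINS; i++) {
            /* Subtract the noise estimate and clamp at zero.  Bins where
             * voice and noise overlap lose voice energy too -- that's the
             * "choppy / robotic" artifact. */
            float cleaned = speech[i] - noise_floor[i];
            if (cleaned < 0.0f) cleaned = 0.0f;
            printf("bin %d: %.2f -> %.2f\n", i, speech[i], cleaned);
        }
        return 0;
    }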
It's too bad they don't have a before / after with a few voice samples in the readme.
It definitely sounds better than I thought it would, and I've watched tons of this guy's videos in the past.
It really distorts his voice / range in some cases, such as when he taps his desk with that orange hammer. The difference there is night and day. It chops out his natural voice's range. It seems to degrade his voice more as the background noise gets more intense, such as the leaf blower (lol), but that's reasonable to expect. At the same time, though, even the mechanical keyboard has a very noticeable negative effect on his range.
It's one of those things where I wish so much that it worked perfectly, but I couldn't realistically think about using it for any recording work due to things like the above. There are just too many common noises (typing, etc.) that drastically distort your voice.
9:23 in that video is hilarious though. Have to love Jerry!
I wonder if it's the algorithm degrading his voice or if the input sound is already degraded. Is it possible a leaf blower or a hammer would cause enough "noise" that our ears couldn't hear his voice clearly either? Then when you subtract out the portion of the sound attributed to the leaf blower, you're hearing the parts of his voice that weren't being jumbled by the leaf blower?
Hard to say, because softer noises like typing still make his voice sound like it's cutting out unnaturally. It's like the frequencies are being subtracted out of his normal tone, but it's more subtle than the leaf blower, so you may not notice it without good headphones. It makes him sound very choppy and mechanical.
With the leaf blower I suspect that when it gets too close the microphone/ADC is saturating, which clips his voice. I wonder if it would've sounded better had he attempted to lower the gain on the microphone.
The difference here is that RNNoise doesn't just remove specific frequencies; it uses a neural network to decide what to remove, which results in much higher quality than what you were implying.
I have personally not noticed voice quality suffering too much, but you are of course right. And this is not what it was made for. My personal use case is mostly voip where RNNoise (imo) does an amazing job.
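For reference, processing a frame with RNNoise looks roughly like the sketch below. It's written against the library's published C interface as I understand it, not NoiseTorch's actual code (which is Go); older releases of rnnoise_create() take no model argument.

    /* Minimal sketch of denoising one frame with RNNoise's C API. */
    #include <string.h>
    #include "rnnoise.h"

    #define FRAME_SIZE 480  /* RNNoise works on 10 ms frames at 48 kHz */

    void denoise_frame(DenoiseState *st, float *frame /* FRAME_SIZE samples */) {
        float out[FRAME_SIZE];
        /* Runs the neural network on the frame; samples are expected to be
         * in 16-bit PCM range (roughly -32768..32767) stored as floats. */
        rnnoise_process_frame(st, out, frame);
        memcpy(frame, out, sizeof(out));
    }

    int main(void) {
        DenoiseState *st = rnnoise_create(NULL);  /* NULL = built-in model */
        float frame[FRAME_SIZE] = {0};            /* would come from the mic */
        denoise_frame(st, frame);
        rnnoise_destroy(st);
        return 0;
    }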
Looks excellent, and I'm keen to delve into the code a bit.
One quick question since you'll clearly know the codebase - do you think this could easily be adapted to create a "playback-side" noise filter?
Use-case rationale here is noisy and poor quality podcasts or "other people's" audio - it would be awesome to be able to configure your tool as the output for Chrome or Firefox or whatever program I'm listening to, then route the cleaned audio from your tool to the physical audio port.
Is that something which would be feasible to do here?
Agreed, but now this has piqued my interest in a good way.
Having two instances loaded might be a bit confusing as you say - I imagine it would need to be something like "NoiseTorch for Recording" and "NoiseTorch for Playback".
I'd need to go and play around with Pulse but I guess it would be possible to present 2 interfaces into Pulse with different names, then hope users can see the distinction when selecting a microphone versus the output device.
Would it be possible to upload a few before / after samples with varying degrees of background noise? Even if it's all the same person that would be a huge help to gauge the quality.
Just a suggestion: if you do it, please include realistic room noises in some of the samples.
I looked at the RNNoise examples and they were pretty bad. I mean, the speaker's audio quality got completely mangled, but the background noise was also comically high. It sounded like the person sat down in the middle of the street in NYC or was inside a busy train terminal.
Yes and no. NoiseTorch also has VAD (Voice Activity Detection). RNNoise also returns the probability of a sound sample being voice; I use that to clamp the microphone completely if it's below the configured probability.
This works really well for situations like Discord or Teamspeak where you're usually not constantly talking, but doing things that can still set off "normal" voice activation. RNNoise's model often knows it's not voice, but cannot denoise it completely.
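In code, the gate is roughly the shape sketched below. This is a sketch of the idea rather than NoiseTorch's actual Go source; the threshold stands in for whatever probability you configure.

    /* Sketch of the VAD gate: rnnoise_process_frame() returns the
     * probability that the frame contains voice, and the frame is muted
     * outright when that probability is below the configured threshold. */
    #include <string.h>
    #include "rnnoise.h"

    #define FRAME_SIZE 480

    void process_with_gate(DenoiseState *st, float *frame, float vad_threshold) {
        float out[FRAME_SIZE];
        float voice_prob = rnnoise_process_frame(st, out, frame);

        if (voice_prob < vad_threshold) {
            /* The model is fairly sure this isn't voice (keyboard, clicks, ...)
             * but couldn't denoise it completely -- clamp the mic to silence. */
            memset(out, 0, sizeof(out));
        }
        memcpy(frame, out, sizeof(out));
    }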
Yes, classic noise suppression sounds very poor very quickly. Noisy or poor audio is like a blurred photo or video: very hard to fix. Noisy or shaky video, on the other hand, is easily fixed (temporal de-noising on video in particular is akin to magic; it can extend the camera's performance by multiple stops with very little image-quality impact).
That's why these ML tools are potentially huge; good ol' noise suppression just isn't good.
How long until we get some kind of open AI project that takes in bad-quality incoming voice and outputs clear, noiseless human speech (in our own voice, or whoever's voice we want), so podcasters don't have to buy expensive microphones and try to soundproof their rooms anymore?
I know we're not there yet, but I feel like we're about to break "garbage in garbage out" with AI.
I'm just a video course creator / podcaster who has spent a decent amount of time researching audio; I'm not a deep-down audio engineer.
But based on the results I see with automated software tools that only try to reduce noise, I would say we're nowhere near there, and a really good solution would involve things that haven't been invented yet. I think we'll have manned trips to Mars well before we have a software solution that can emulate the sound of a moderately treated room with ~2 ms of latency or less.
With that said, I think we're there today if all you want to do is help reduce the noise of an air conditioner so you can chat with a friend on Hangouts, Discord or Zoom. This is a scenario where audio quality doesn't matter, but not hearing an A/C or lawn mower is worth having the person talking sound like a choppy robot. You probably won't even notice it too much with earbuds.