Windows Audio Effects SDK

Audio Effects SDK for Windows delivers AI-based audio enhancement algorithms, improving end-to-end conversation quality.

The NVIDIA Windows Audio Effects SDK provides the following audio effects for broadcast use cases with real-time audio processing:

Background Noise Removal: removes common background noise while preserving the speaker’s natural voice, with improved accuracy for automated speech recognition.

Room Echo Cancellation: removes or suppresses reverberation introduced by the recording environment, improving speech clarity.

Background Noise Reduction + Room Echo Cancellation: removes unwanted noises and reverberations from audio, improving speech intelligibility.

Audio Super Resolution: improves sound quality by adding higher-frequency content to the audio stream. For low-frequency audio, this feature predicts the higher-frequency spectrum of the input.

Acoustic Echo Cancellation: removes acoustic echo and feedback from audio, which improves the bidirectional audio quality.

Studio Voice: enables ordinary headset, laptop, and desktop microphones to deliver the sound of a high-end studio mic, even if recorded in less-than-ideal acoustic environments with distortions such as reverberations or static noise. Studio Voice enhances and recovers speech degraded by noise reduction filters and beamforming algorithms, making the audio sound like it was recorded in a professional studio. This effect has two variations: Studio Voice High Quality and Studio Voice Low Latency.

Speaker Focus: identifies and isolates the primary speaker and removes all other speakers from the input audio. This significantly improves the intelligibility of the primary speaker’s voice when others are speaking in the background.

Voice Font: converts the input voice to match the reference speaker’s voice while keeping linguistic information and prosody unchanged. Currently available only as an early access (EA) feature.

Get Help

Refer to the programming guide for a quick start guide, the API reference, and more.

Get access to knowledge base articles and support cases or submit a ticket.

Publisher: NVIDIA

Latest Version: 2.1.0

Updated: March 16, 2026 UTC

Compressed Size: 973.13 MB
