Multilingual Silero VAD

Multilingual Silero Voice Activity Detection model


Model Overview

Description:

This model performs Voice Activity Detection (VAD) and serves as the first step in an Automatic Speech Recognition (ASR) pipeline. Silero VAD works with 8 kHz and 16 kHz sample rates, using fixed windows of 256 and 512 samples respectively (32 ms per window in both cases). It supports more than 6,000 languages.
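The fixed window sizes above determine how audio has to be chunked before it reaches the model. The following is a minimal sketch of that chunking, not the Silero or Riva implementation: the helper names are hypothetical, and zero-padding a short final window is an assumption (dropping it would be an equally reasonable choice).

```python
# Window sizes stated in this model card: 256 samples at 8 kHz, 512 at 16 kHz.
WINDOW_SAMPLES = {8000: 256, 16000: 512}


def window_size(sample_rate):
    """Return the fixed window length in samples for a supported sample rate."""
    if sample_rate not in WINDOW_SAMPLES:
        raise ValueError(f"unsupported sample rate: {sample_rate}")
    return WINDOW_SAMPLES[sample_rate]


def window_duration_ms(sample_rate):
    """Return the duration of one window in milliseconds."""
    return 1000.0 * window_size(sample_rate) / sample_rate


def split_into_windows(samples, sample_rate):
    """Split a sequence of samples into fixed-size windows.

    Assumption: a short final window is padded with zeros.
    """
    size = window_size(sample_rate)
    windows = []
    for start in range(0, len(samples), size):
        chunk = list(samples[start:start + size])
        chunk.extend([0.0] * (size - len(chunk)))  # pad only if chunk is short
        windows.append(chunk)
    return windows
```

Both supported rates give the same window duration, so one probability covers the same span of time regardless of the sample rate chosen.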

This model is ready for commercial use.

Third-Party Community Consideration

This model is not owned or developed by NVIDIA. This model has been developed and built to a third-party’s requirements for this application and use case; see the "Silero Voice Activity Detector | PyTorch" page.

License/Terms of Use:

This model is governed by the NVIDIA RIVA License Agreement.

Disclaimer: AI models generate responses and outputs based on complex algorithms and machine learning techniques, and those responses or outputs may be inaccurate or offensive. By downloading a model, you assume the risk of any harm caused by any response or output of the model.

By using this software or model, you are agreeing to the terms and conditions of the license, acceptable use policy and Silero VAD’s privacy policy. Silero VAD is released under the MIT license.

References:

Silero VAD website

Silero VAD citation:

@misc{Silero VAD,
  author = {Silero Team},
  title = {Silero VAD: pre-trained enterprise-grade Voice Activity Detector (VAD), Number Detector and Language Classifier},
  year = {2024},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/snakers4/silero-vad}},
  commit = {insert_some_commit_here},
  email = {hello@silero.ai}
}

Model Architecture:

Architecture Type: Unknown
Network Architecture: Silero VAD

Input:

Input Type(s): Audio
Input Format(s): Linear PCM, 16-bit, 1 channel (Audio)
Input Parameters: One-Dimensional (1D)
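As an illustration of the input format, the sketch below converts raw 16-bit mono linear PCM bytes into floating-point samples. Two details are assumptions rather than statements from this card: the bytes are taken to be little-endian, and the samples are scaled by 32768 into the range [-1.0, 1.0), which is a common convention for audio models. The function name is hypothetical.

```python
import struct


def pcm16_to_float(pcm_bytes):
    """Convert 16-bit mono linear PCM bytes to floats in [-1.0, 1.0).

    Assumptions: little-endian byte order; scaling by 32768.
    """
    if len(pcm_bytes) % 2 != 0:
        raise ValueError("16-bit PCM data must contain an even number of bytes")
    count = len(pcm_bytes) // 2
    # "<" selects little-endian, "h" is a signed 16-bit integer.
    ints = struct.unpack(f"<{count}h", pcm_bytes)
    return [value / 32768.0 for value in ints]
```

Stereo or other multi-channel audio would need to be downmixed to one channel before this step, since the card specifies a single channel.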

Output:

Output Type(s): Probabilities of speech
Output Format: Float
Output Parameters: 1D
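Because the output is a sequence of speech probabilities, a downstream step usually has to turn it into speech segments. The sketch below shows one simple way to do that, assuming one probability per input window. The 0.5 threshold and the function name are illustrative assumptions, not values taken from this card, and production pipelines often add smoothing or minimum-duration rules on top.

```python
def speech_segments(probabilities, sample_rate, window_samples, threshold=0.5):
    """Group consecutive windows at or above threshold into (start_s, end_s) pairs.

    Assumptions: one probability per window; threshold of 0.5 is illustrative.
    """
    window_seconds = window_samples / sample_rate
    segments = []
    start = None  # index of the first window in the current speech run
    for index, prob in enumerate(probabilities):
        is_speech = prob >= threshold
        if is_speech and start is None:
            start = index
        elif not is_speech and start is not None:
            segments.append((start * window_seconds, index * window_seconds))
            start = None
    if start is not None:
        # Close a run that extends to the end of the audio.
        segments.append((start * window_seconds, len(probabilities) * window_seconds))
    return segments
```

For 16 kHz audio with 512-sample windows, each probability covers 32 ms, so a run of two speech windows starting at window 1 yields a segment from roughly 0.032 s to 0.096 s.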

Supported Hardware Microarchitecture Compatibility:

  • NVIDIA Ampere
  • NVIDIA Blackwell

Supported Operating System(s):

  • Linux

Model Version(s): v5

Training Dataset:

  • Bible.is: Data Collection Method: Unknown; Labeling Method: Unknown

  • globalrecordings.net: Data Collection Method: Unknown; Labeling Method: Unknown

  • VoxLingua107: Data Collection Method: Unknown; Labeling Method: Unknown

  • Common Voice: Data Collection Method: Human; Labeling Method: Human

  • MLS: Data Collection Method: Human; Labeling Method: Human

Inference:

Engine: ONNX Runtime, Triton

Test Hardware:

  • A100
  • H100

For more detail on model usage, evaluation, training datasets, and implications, please refer to the Silero VAD GitHub repository.

Ethical Considerations:

NVIDIA believes Trustworthy AI is a shared responsibility and we have established policies and practices to enable development for a wide array of AI applications. When downloaded or used in accordance with our terms of service, developers should work with their internal model team to ensure this model meets requirements for the relevant industry and use case and addresses unforeseen product misuse.

Please report security vulnerabilities or NVIDIA AI Concerns [here](https://www.nvidia.com/en-us/support/submit-security-vulnerability/).

Publisher: NVIDIA
Latest Version: v5
Updated: December 11, 2024 UTC
Compressed Size: 2.22 MB
