Whisper ASR GGUF for Nv IGI SDK

Description: Whisper ASR GGUF for Nv IGI SDK ASR plugin

Publisher: -

Latest Version: 1.0

Modified: December 6, 2024

Size: 465.02 MB

Whisper Model Overview

Description:

Whisper is a pre-trained model for automatic speech recognition (ASR) and speech translation. Trained on 680k hours of labeled data, Whisper models generalize well to many datasets and domains without fine-tuning. Whisper-small, with 244M parameters, is one of the five configurations of the model.
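As a rough consistency check between the 244M parameter count and the listed package size of 465.02 MB, the sketch below estimates the raw weight size at a few storage widths. The function name and the candidate widths are illustrative assumptions; this page does not state which precision the packaged file uses, and the estimate ignores file headers and metadata.

```python
def estimated_weight_size_mib(parameter_count: int, bytes_per_parameter: float) -> float:
    """Rough size of the raw weights in MiB, ignoring file headers and metadata."""
    return parameter_count * bytes_per_parameter / (1024 ** 2)


# Parameter count taken from the description (a rounded figure).
WHISPER_SMALL_PARAMETERS = 244_000_000

# Candidate storage widths are assumptions for illustration only.
for label, width_bytes in [("32-bit", 4), ("16-bit", 2), ("8-bit", 1)]:
    size_mib = estimated_weight_size_mib(WHISPER_SMALL_PARAMETERS, width_bytes)
    print(f"{label} weights: about {size_mib:.0f} MiB")
```

Under these assumptions the 16-bit estimate lands close to the listed size, which is consistent with 16-bit weights but does not confirm them.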

This model version is formatted and packaged for the Nv IGI SDK plugin, which simplifies integration with native Windows applications for local inference.

For details on model usage, evaluation, training data, and implications, refer to the Whisper Model Card.

Third-Party Community Consideration

This model is not owned or developed by NVIDIA. This model has been developed and built to a third-party’s requirements for this application and use case; see Whisper Evaluated Use Policy.

Terms of Use

This model is governed by the NVIDIA Software and Model Evaluation License Agreement.

Disclaimer: AI models generate responses and outputs based on complex algorithms and machine learning techniques, and those responses or outputs may be inaccurate or offensive. By downloading a model, you assume the risk of any harm caused by any response or output of the model.

By using this software or model, you are agreeing to the terms and conditions of the license, acceptable use policy and Whisper’s privacy policy. Whisper is released under the Apache 2.0 License.

References:

Whisper Model Card on GitHub

Whisper website

Whisper paper

Model Architecture:

Architecture Type: Transformer

Network Architecture: Sequence to sequence (encoder-decoder)

Model Version: whisper-small

Input:

Input Type(s): Audio

Input Format(s): flac, ogg, wav

Input Parameters: Sampling rate
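Because the sampling rate is the one listed input parameter, it can help to inspect an audio file before passing it to the plugin. The sketch below reads a PCM WAV header with Python's standard `wave` module. The 16 kHz mono target reflects the preprocessing used by the reference Whisper implementation; whether the Nv IGI SDK plugin resamples internally is not stated here, so treat the target as an assumption. The names `WavInfo`, `inspect_wav`, and `needs_conversion` are hypothetical helpers, not part of any SDK, and FLAC or OGG input would need a different reader.

```python
import wave
from dataclasses import dataclass

# Assumption: target format matches the reference Whisper preprocessing (16 kHz mono).
TARGET_RATE_HZ = 16_000


@dataclass
class WavInfo:
    sample_rate_hz: int
    channels: int
    sample_width_bytes: int
    duration_s: float


def inspect_wav(path: str) -> WavInfo:
    """Read basic format information from a PCM WAV file header."""
    with wave.open(path, "rb") as wav_file:
        rate = wav_file.getframerate()
        frames = wav_file.getnframes()
        return WavInfo(
            sample_rate_hz=rate,
            channels=wav_file.getnchannels(),
            sample_width_bytes=wav_file.getsampwidth(),
            duration_s=frames / rate if rate else 0.0,
        )


def needs_conversion(info: WavInfo, target_rate_hz: int = TARGET_RATE_HZ) -> bool:
    """Return True if the audio is not already mono at the target sampling rate."""
    return info.sample_rate_hz != target_rate_hz or info.channels != 1
```

A caller could run `needs_conversion(inspect_wav("speech.wav"))` and resample or downmix with a tool of their choice when it returns `True`.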

Output:

Output Type(s): Text

Output Format: String

Output Parameters: None

Software Integration:

Runtime Engine(s): ggml

Supported Hardware Platform(s): Ada, Ampere

Supported Operating System(s): Windows 11

Inference:

Engine: GGML - CUDA / CPU

Test Hardware: RTX 4090

Ethical Considerations:

NVIDIA believes Trustworthy AI is a shared responsibility and we have established policies and practices to enable development for a wide array of AI applications. When downloaded or used in accordance with our terms of service, developers should work with their internal model team to ensure this model meets requirements for the relevant industry and use case and addresses unforeseen product misuse. Please report security vulnerabilities or NVIDIA AI Concerns here.