NVIDIA Riva is a GPU-accelerated SDK for building Speech AI applications that are customized for your use case and deliver real-time performance. Riva offers pre-trained speech models in NVIDIA NGC that can be fine-tuned with the TAO Toolkit on a custom data set, accelerating the development of domain-specific models.
Some of the major tasks that you can perform using Riva are:
Customizing a model with your data: With the NVIDIA TAO Toolkit, you can fine-tune a model on your own data and use the resulting custom-trained model in Riva. The TAO Toolkit is a low-code tool for fine-tuning models on a domain-specific dataset.
Deploying a model in Riva: Riva is designed for speech AI at scale. To help you serve models efficiently and robustly across multiple servers, NVIDIA provides push-button model deployment using Helm charts.
TAO models can be easily exported, optimized, and deployed as a speech service on premises or in the cloud with a single command using Helm charts.
Riva’s high-performance inference is powered by NVIDIA TensorRT optimizations and served using the NVIDIA Triton Inference Server. Riva services are available as gRPC-based microservices for low-latency streaming as well as high-throughput offline use cases. Riva is fully containerized and can scale to hundreds or thousands of parallel streams.
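To make the difference between the streaming and offline use cases concrete, here is a minimal sketch in plain Python. It does not call Riva at all. The function names, the 100 ms chunk duration, and the 16 kHz, 16-bit mono audio format are assumptions chosen for illustration. It only shows how a client might slice audio into small chunks for streaming, compared with sending one complete buffer for offline processing.

```python
from typing import Iterator


def audio_chunks(pcm: bytes, sample_rate_hz: int = 16000,
                 chunk_ms: int = 100, bytes_per_sample: int = 2) -> Iterator[bytes]:
    """Yield fixed-duration chunks of raw mono PCM audio.

    A streaming client would send each chunk as soon as it is captured,
    so the server can start returning results before the audio ends.
    All defaults here are illustrative assumptions, not Riva settings.
    """
    chunk_bytes = sample_rate_hz * chunk_ms // 1000 * bytes_per_sample
    if chunk_bytes <= 0:
        raise ValueError("chunk size must be at least one sample")
    for start in range(0, len(pcm), chunk_bytes):
        # The final chunk may be shorter than chunk_bytes.
        yield pcm[start:start + chunk_bytes]


def offline_request(pcm: bytes) -> bytes:
    """Offline mode: the whole recording travels in a single request."""
    return pcm
```

With these assumed defaults, one second of audio (32,000 bytes) becomes ten chunks of 3,200 bytes each in the streaming case, while the offline case passes the same 32,000 bytes through in one piece.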
Riva Speech Server: Riva Speech Skills is a Docker image containing a toolkit for production-grade conversational AI inference. The Riva Speech API server exposes a simple API for performing speech recognition, speech synthesis, and a variety of NLP inferences.
Riva Speech Client: Riva Speech Client is a Docker image containing sample command-line drivers for the Riva services. The client expects that a Riva server is running with models deployed, and all command-line drivers allow an optional argument to specify the location of the server. No GPU is required to run the sample clients.
Riva Quick Start Scripts: Riva includes Quick Start scripts to help you get started with Riva AI Services. These scripts are meant for deploying the services locally for testing and running our example applications.
Notebooks: Notebooks provide step-by-step directions on training and deploying models.
Models: Trainable and deployable versions of Riva Automatic Speech Recognition (ASR) and Speech Synthesis models.
Riva Speech Skills Helm chart: The Helm chart is used to deploy ASR, NLP, and text-to-speech (TTS) services automatically. Specifically, it is designed to automate the steps for push-button deployment to a Kubernetes cluster.
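The sample clients listed above accept an optional argument for the server location. The sketch below shows one way such an argument could be handled in a small Python driver. It is not the code of the actual Riva clients: the `--server` flag name, the `parse_server` helper, and the `localhost:50051` default are assumptions made for this example.

```python
import argparse
from typing import List, Optional, Tuple

# Assumed default address; change it to match your own deployment.
DEFAULT_SERVER = "localhost:50051"


def parse_server(argv: Optional[List[str]] = None) -> Tuple[str, int]:
    """Parse an optional --server host:port argument (hypothetical flag name)."""
    parser = argparse.ArgumentParser(description="Hypothetical Riva sample client")
    parser.add_argument("--server", default=DEFAULT_SERVER,
                        help="host:port of a running Riva server")
    args = parser.parse_args(argv)

    # Split on the last colon so the port is always the final component.
    host, sep, port = args.server.rpartition(":")
    if not sep or not host or not port.isdigit():
        # parser.error prints a usage message and exits with status 2.
        parser.error("--server must look like host:port")
    return host, int(port)
```

Calling `parse_server([])` falls back to the assumed default, and `parse_server(["--server", "riva.example.com:8001"])` returns the host and port that were supplied. Because no GPU is needed on the client side, a driver like this can run on any machine that can reach the server.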
Refer to the Riva Skills Quick Start page for step-by-step instructions on getting started with Riva.
By pulling and using Riva software, you accept the terms and conditions of the corresponding license below:
Building Transcription and Entity Recognition Apps using Riva
Developing a Question Answering application quickly using Riva
Building and deploying Conversational AI models using NVIDIA TAO Toolkit