In a typical Audio Effects Microservice deployment, a service provider configures and launches the Audio Effects Microservice on a GPU-based server. A client application (remote or local) connects to the microservice and negotiates the desired audio effects and connection parameters via gRPC. It then streams audio to the microservice (via RTP/UDP, gRPC/TCP, or another protocol) and receives the processed (enhanced) audio back.
The Audio Effects Microservice includes Docker containers which can be used to demonstrate the typical deployment scenario described above. Specifically, this quick start guide provides step-by-step instructions for the following:
Configuring the Audio Effects Microservice
Launching the Audio Effects Microservice
Testing the microservice using the Audio Effects Sample client, which connects with the launched microservice, streams audio to the microservice and receives the processed audio back.
Ensure the following prerequisites are available on your system for running the container:
Select CLI Command to copy the download command.
Ensure you have installed the NGC CLI tool.
After you install the tool, to start the download, paste the copied command in a Command Prompt window.
Log in to the NGC Docker registry.
docker login nvcr.io
This prompts for a username and password. Use $oauthtoken for the username and your NGC_API_KEY for the password.
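For scripted setups, the same login can be done without the interactive prompt. The following is a minimal sketch, assuming the API key has been exported as an environment variable named NGC_API_KEY. The function name ngc_docker_login is hypothetical; --password-stdin is the standard docker login option for reading the password from standard input.

```shell
# Hypothetical helper: log in to nvcr.io without the interactive prompt.
# Assumes the NGC API key is available in the NGC_API_KEY environment variable.
ngc_docker_login() {
  if [ -z "${NGC_API_KEY:-}" ]; then
    echo "NGC_API_KEY is not set" >&2
    return 1
  fi
  # The username is the literal string $oauthtoken, so single quotes keep
  # the shell from expanding it as a variable.
  printf '%s' "$NGC_API_KEY" | docker login nvcr.io --username '$oauthtoken' --password-stdin
}

# Usage (uncomment to run):
# export NGC_API_KEY=<your key>
# ngc_docker_login
```

Piping the key through standard input keeps it out of the shell history and the process list.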
The file config.sh contains the parameters used for configuring various aspects of the Maxine Audio Effects Microservice. Before launching the microservice, ensure that these parameters are set properly. An example config.sh file is included in the package.
(Required) audio_effects_api_port: The port used for the gRPC endpoint.
(Optional) input_sample_rate: The input audio sample rate supported by the service. The default value is 16000. When use_studio_voice_quality is enabled, the supported values are 16000 and 48000. When use_studio_voice_perf is enabled, the only supported value is 48000.
(Optional) output_sample_rate: The output audio sample rate generated by the service. The default value is 16000. Typically the output audio sample rate (output_sample_rate) matches the input audio sample rate (input_sample_rate), except when the audio super-resolution effect is enabled for 8 kHz to 16 kHz or 16 kHz to 48 kHz upscaling.
(Optional) audio_chunk_duration: The audio chunk duration, in milliseconds, used to load the audio-effect model. The default value is 10.
(Optional) max_streams: The maximum number of streams supported by the service. It defines the batch size of the audio-effect model. When use_studio_voice_quality or use_studio_voice_perf is enabled, this parameter should be equal to num_pipelines. Refer to the per-effect and per-GPU "Maximum Batch Size" values in the Audio Effects SDK documentation for the supported maximum. The default value is 10.
(Optional) To enable or disable audio effects in the service, set the corresponding parameter to true or false, respectively.
These flags are used to load the corresponding audio effect model. At least one audio effect must be enabled. The default value for all flags is false. If none of the flags is set to true, the service launch fails.
The microservice also supports chaining of certain effects. The user only needs to set the effects and sample rates, and the microservice internally chooses the appropriate chained effect. Currently the following chaining effects are supported:
(Optional) intensity_ratio: The intensity of the effects to be applied, specified as a float (intensity percent). When multiple supported effects are enabled, all of them are applied with the same intensity. This value applies to all client connections for a deployment. It is applicable to the Denoiser and Dereverb effects and is ignored if set for other effects.
(Optional) udp_port_range: The host port number or range to be mapped for UDP streaming, for example "9001" or "9001-9005". It is required if the service is used for UDP streaming input.
(Optional) grpc_worker_threads: The number of worker threads used by the gRPC async server. For max_streams=200, 2 or fewer threads are sufficient. Increasing the number of threads as max_streams increases may improve performance. The default value is 2.
(Optional) max_udp_ports_per_stream: The maximum number of UDP ports allocated per stream (session). It is set to 2 specifically for the case where the client sends RTCP packets (which the service ignores) on the odd port (e.g. 9001) next to the port used for RTP data packets (e.g. 9000). The default value is 2.
(Optional) num_pipelines: The number of GStreamer pipelines to launch. The max_streams streams are divided equally among the pipelines. It is recommended to use 250 streams for each pipeline. Note that GPU memory usage increases as num_pipelines increases. The default value is 1.
my_pod_ip: The pod or host IP where the microservice is deployed. It is required for UDP data streaming support. The microservice includes this value as the UDP host IP in its response to the client.
enable_traces: Enables or disables OpenTelemetry (https://opentelemetry.io/) traces.
enable_metrics: Enables or disables OpenTelemetry metrics.
use_ostream_exporter: If enabled, OpenTelemetry traces and metrics are printed on standard output. If disabled, OpenTelemetry uses the OTLP exporter (https://opentelemetry.io/docs/reference/specification/protocol/exporter/#configuration-options) and data is sent to Lightstep.
The following Lightstep-related parameters must be supplied if enable_traces=true and use_ostream_exporter=false:
lightstep_token_filepath: The Lightstep backend needs a token for access. This parameter provides the local file path of the token.
lightstep_cert_filepath: Accessing the Lightstep gRPC endpoint requires an SSL certificate. Use this parameter to provide its file path.
lightstep_endpoint: The URL used to access the Lightstep backend.
(Optional) Logging: The service prints logs to stdout or stderr. Use "docker logs " to check the service logs.
(Optional) GST_DEBUG: Controls GStreamer logs. The service uses a GStreamer pipeline for audio processing. For more information, refer to GStreamer - Printing Debug Information. The default value is 0.
(Optional) GLOG_logtostderr: Enables glog logging. More details about glog are available here. The default value is 0.
(Optional) GLOG_v: Controls the log verbosity level. A higher level produces more detailed logs. The default value is 0.
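A few of the constraints described above can be checked before launching the service. The following is a minimal sketch and is not part of the package. It assumes that config.sh assigns these parameters as plain shell variables (for example max_streams=10), so that sourcing the file makes them visible. The helper name check_audio_effects_config is hypothetical, and only the studio voice constraints stated in this guide are checked.

```shell
# Hypothetical pre-launch check for a few constraints described in this guide.
# Assumes the parameters are plain shell variables, e.g. after ". ./config.sh".
check_audio_effects_config() {
  # Fall back to the documented defaults when a parameter is unset.
  rate="${input_sample_rate:-16000}"
  streams="${max_streams:-10}"
  pipelines="${num_pipelines:-1}"
  quality="${use_studio_voice_quality:-false}"
  perf="${use_studio_voice_perf:-false}"
  rc=0

  if [ "$quality" = "true" ] && [ "$rate" != "16000" ] && [ "$rate" != "48000" ]; then
    echo "use_studio_voice_quality supports input_sample_rate 16000 or 48000, got $rate" >&2
    rc=1
  fi
  if [ "$perf" = "true" ] && [ "$rate" != "48000" ]; then
    echo "use_studio_voice_perf supports input_sample_rate 48000, got $rate" >&2
    rc=1
  fi
  if [ "$quality" = "true" ] || [ "$perf" = "true" ]; then
    # With studio voice enabled, max_streams should equal num_pipelines.
    if [ "$streams" != "$pipelines" ]; then
      echo "max_streams ($streams) should equal num_pipelines ($pipelines) for studio voice" >&2
      rc=1
    fi
  fi
  return "$rc"
}
```

After sourcing config.sh, calling check_audio_effects_config prints one message per violated constraint and returns a non-zero status if any constraint is violated.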
Set appropriate permissions for the downloaded scripts with the following command:
chmod -R 775 maxine_audio_quick_startv1.3.0
To initialize the required components, run the following command.
./audio_effects_init.sh
Start Audio Effects Microservice.
./audio_effects_start.sh
How to use custom pipeline IO tuning configs:
io_tuning_configs/
$ ./audio_effects_stop.sh
$ ./audio_effects_start.sh
The Audio Effects sample client is an application that can be run on any machine. It is provided in the Docker container in binary (executable) form together with its required dependencies. Once configured, the client application connects to the microservice, negotiates a session, streams audio to the microservice, and receives the processed audio back.
An example of Audio Effects sample client configuration file is included in the package.
audio_effects_test_client_config.json
For more information about each of the configuration fields in these files, refer to the file audio_effects_test_client.proto.
The ms_config_file field in the client config specifies the path to the file that contains the config packet to be sent to the microservice. Refer to the configs folder for more information. Refer to protos/audio_effects/<api-version>/audio_effects.proto for more details about the microservice config request fields.
To test the microservice using this pre-built client:
chmod +x audio_effects_client_start.sh
./audio_effects_client_start.sh
/opt/nvidia/maxine-microservices/bin/mic_pipeline -audio_effects_test_client_config=/host/audio_effects_test_client_config.json 2>&1 | tee /host/audio_effects_client_logs
Upon launching, the client performs the following steps:
Connects to the microservice at the address given in the audio_effects_uri field of the client configuration and establishes a gRPC channel.
Sends the config packet from the file named in the ms_config_file field of the client configuration and processes the response.
Reads the audio named in the input_audio_file field of the client configuration and streams data packets using gRPC or RTP/UDP, as configured in ms_config_file.
Writes the processed audio to the file named in the output_audio_file field of the client configuration.
Set the enable_denoiser_v2 field in config.sh as required:
enable_denoiser_v2=false to enable denoiser version 1
enable_denoiser_v2=true to enable denoiser version 2
A few audio test files with artifacts are included in the package for testing purposes. These files can be found at the following paths in the package:
To shut down the service, run the following command.
chmod +x audio_effects_stop.sh
./audio_effects_stop.sh
To check the logs, view the container logs on the host machine as follows:
docker logs -f maxine-audio-effects-service
To debug or troubleshoot issues while you run the client command, run the mic_pipeline client sample app with GST_DEBUG=3.
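As a sketch of the debug run described above, the client command shown earlier can be prefixed with the GST_DEBUG environment variable. The wrapper name run_client_with_gst_debug is hypothetical. The default binary and configuration paths are the ones used in this guide, and the arguments exist only so that the level and paths can be overridden.

```shell
# Hypothetical wrapper: run the sample client with GStreamer debug output enabled.
# Arguments: [debug level] [client binary] [client config file]
run_client_with_gst_debug() {
  level="${1:-3}"
  client_bin="${2:-/opt/nvidia/maxine-microservices/bin/mic_pipeline}"
  client_cfg="${3:-/host/audio_effects_test_client_config.json}"
  # GST_DEBUG is set only for this one command; stderr is merged into stdout
  # so the GStreamer messages can be piped or saved together with client logs.
  GST_DEBUG="$level" "$client_bin" -audio_effects_test_client_config="$client_cfg" 2>&1
}

# Usage (inside the client container):
# run_client_with_gst_debug 3 | tee /host/audio_effects_client_logs
```

Setting the variable as a command prefix leaves the rest of the shell session unaffected, so later runs return to the default log level.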
Refer to the Audio Effects User Guide (under Resources) for more detailed documentation.
By pulling and using Maxine software, you accept the terms and conditions of the corresponding license (Under Resources).