The Audio2Face Microservice converts speech into facial animation in the form of ARKit blendshapes. The facial animation includes emotional expression: the service automatically detects emotion in the input audio and uses it to drive the key poses and shapes that replicate the character's facial performance. Emotions can also be specified directly as part of the input to the microservice. A rendering engine can consume the resulting blendshapes to display a 3D avatar's performance.
The Audio2Face (A2F) Controller Microservice is designed to facilitate the management and integration of the A2F Microservice within larger workflows. It acts as both the origin of the A2F inputs and the destination of its outputs, simplifying interaction with the A2F service by providing a bi-directional API.
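The exact API is defined by the A2F Controller's gRPC interface and documented with the service; the sketch below only illustrates the bi-directional pattern described above, using hypothetical generated-module, stub, message, and field names (a2f_controller_pb2, A2FControllerStub, ProcessAudioStream, AudioChunk, blendshape_names, blendshape_weights) that stand in for the real interface.

```python
# Minimal sketch of the bi-directional flow: push audio chunks in, pull
# animation frames out. The generated modules, stub, RPC, and field names
# below are hypothetical placeholders for the real A2F Controller protos.
import grpc

import a2f_controller_pb2 as pb2            # hypothetical generated module
import a2f_controller_pb2_grpc as pb2_grpc  # hypothetical generated module


def audio_requests(pcm_chunks, sample_rate=16000):
    """Yield request messages wrapping raw PCM audio chunks."""
    for chunk in pcm_chunks:
        yield pb2.AudioChunk(audio_buffer=chunk, sample_rate=sample_rate)


def stream_audio(pcm_chunks, address="localhost:52000"):
    """Send audio to the controller and print the animation frames it returns."""
    with grpc.insecure_channel(address) as channel:
        stub = pb2_grpc.A2FControllerStub(channel)
        # Bi-directional streaming: responses arrive while requests are sent.
        for frame in stub.ProcessAudioStream(audio_requests(pcm_chunks)):
            # Each response is assumed to carry one frame of ARKit blendshape
            # weights (name -> value in [0, 1]) plus a time code.
            weights = dict(zip(frame.blendshape_names, frame.blendshape_weights))
            print(frame.time_code, weights)
```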
The Audio2Face Service Helm chart contains both the Audio2Face Microservice and the A2F Controller Microservice.
Please follow the Quick Start guide for prerequisites and dependencies.
This Helm chart is used as part of the Audio2Face Kubernetes quick deployment script.
Refer to the Audio2Face Kubernetes deployment documentation for more details.
Once deployed, you can use the A2F Controller sample application to send audio data and receive animation data back.
Refer to the Sample application connecting to A2F Controller for more details.
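The sample application documents the audio formats the service accepts; as an illustration only, the following sketch assumes mono 16-bit PCM WAV input and shows one way to split such a file into chunks for streaming. The chunk size and the stream_audio reference are assumptions carried over from the sketch above.

```python
# Sketch: read a local 16-bit PCM WAV file and split it into fixed-size
# chunks suitable for streaming to the controller. The chunk size and the
# expected audio format are assumptions; check the sample application and
# service documentation for the formats actually supported.
import wave


def read_pcm_chunks(path, chunk_frames=4096):
    """Yield raw PCM byte chunks from a mono 16-bit WAV file."""
    with wave.open(path, "rb") as wf:
        if wf.getsampwidth() != 2 or wf.getnchannels() != 1:
            raise ValueError("expected mono 16-bit PCM audio")
        while True:
            data = wf.readframes(chunk_frames)
            if not data:
                break
            yield data


# Example: feed the chunks into the streaming client sketched earlier.
# stream_audio(read_pcm_chunks("speech.wav"))
```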
OS: Ubuntu 22.04
CUDA: 12.1
Driver: 535.54
NOTE: Other Linux distributions should work but have not been tested by our teams. Some newer CUDA 12.x versions have not been fully tested and may encounter issues during TRT model generation.
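Before deploying, it can help to confirm that the installed driver matches the tested configuration above. The following sketch uses nvidia-smi's standard --query-gpu=driver_version query; the simple version comparison and the warning message are illustrative only.

```python
# Sketch: verify the installed NVIDIA driver against the tested version
# listed above (535.54). The nvidia-smi --query-gpu/--format flags are part
# of the standard tool; the comparison logic here is intentionally simple.
import subprocess

TESTED_DRIVER = (535, 54)


def installed_driver_version():
    """Return the driver version of the first GPU as a (major, minor) tuple."""
    out = subprocess.run(
        ["nvidia-smi", "--query-gpu=driver_version", "--format=csv,noheader"],
        capture_output=True, text=True, check=True,
    ).stdout.strip().splitlines()[0]
    return tuple(int(part) for part in out.split(".")[:2])


if __name__ == "__main__":
    installed = installed_driver_version()
    if installed < TESTED_DRIVER:
        print(f"Driver {installed} is older than the tested 535.54; "
              "TRT model generation may encounter issues.")
    else:
        print(f"Driver {installed} meets the tested baseline of 535.54.")
```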
Enterprise Support: Get access to knowledge base articles and support cases, or submit a ticket: https://www.nvidia.com/en-us/data-center/products/ai-enterprise-suite/support/
By downloading and using this software, you accept the terms and conditions of this license and the ACE EULA.
You are responsible for ensuring that your use of NVIDIA AI Foundation Models complies with all applicable laws.