This repository shows how to deploy a BERT question answering (QA) model on Azure Machine Learning with NVIDIA Triton Inference Server for high-performance inferencing. It includes a set of scripts and a Jupyter Notebook that provides a step-by-step guide.
To successfully run the included Jupyter Notebook, you need the following:
A `config.json` file from your Azure ML workspace, saved in the root of the repository. To create one, use the `create_config` method. To take a look at this, or any other, notebook, follow these steps:
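For reference, an Azure ML `config.json` typically has the shape below; the values shown are placeholders, not real identifiers:

```json
{
    "subscription_id": "<your-subscription-id>",
    "resource_group": "<your-resource-group>",
    "workspace_name": "<your-workspace-name>"
}
```

Once this file is in place, the notebook can load the workspace with `Workspace.from_config()` from the `azureml-core` package.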
If you're new to Azure Machine Learning deployments, check out How and where to deploy models and Troubleshooting and debugging for additional resources.
If you're new to the NVIDIA Triton Inference Server, check out this information page and the GitHub page.
Refer to the NVIDIA End User License Agreements included in the LICENSE file.