BERT QA on Azure ML with Triton Demo
This repository shows how to deploy a BERT QA model on Azure Machine Learning with the NVIDIA Triton Inference Server for high-performance inferencing. It includes a set of scripts and a Jupyter Notebook that provides a step-by-step guide.
To successfully run the included Jupyter Notebook, you need the following:
- A config.json from Azure ML, saved in the root of the directory. To create one, use the create_config method.
- The Azure Machine Learning SDK for Python. Refer to the documentation.
- An NVIDIA TensorRT-optimized BERT QA model. Refer to the developer blog to create one before you proceed.
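The config.json file referenced above is a small JSON document that records your workspace identifiers, which the SDK's Workspace.from_config() later reads to connect to the workspace. As one minimal sketch, you can write it by hand with the standard library (the placeholder values below are assumptions you must replace with your own Azure details):

```python
import json
import pathlib

# config.json in the format expected by azureml.core.Workspace.from_config().
# The three values are placeholders -- substitute your own Azure identifiers.
config = {
    "subscription_id": "<your-subscription-id>",
    "resource_group": "<your-resource-group>",
    "workspace_name": "<your-workspace-name>",
}

# Save to the root of the repository so the notebook can find it.
pathlib.Path("config.json").write_text(json.dumps(config, indent=4))

print(sorted(json.loads(pathlib.Path("config.json").read_text()).keys()))
```

Alternatively, if you already have a Workspace object in the SDK, its write_config() method produces the same file for you.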
Viewing the Jupyter Notebook in NGC
To take a look at this, or any other, notebook, follow these steps:
- Navigate to the File Browser tab of the asset in NGC
- Select the version you'd like to see
- Under the actions menu (three dots) for the .ipynb file, select "View Jupyter"
- There you have it! You can read a notebook for documentation and copy code samples without ever leaving NGC.
If you're new to Azure Machine Learning deployments, check out How and where to deploy models and Troubleshooting and debugging for additional resources.
If you're new to the NVIDIA Triton Inference Server, check out this information page and the GitHub page.
End User License Agreements
Refer to the NVIDIA End User License Agreements included in the LICENSE file.