Linux / arm64
Linux / amd64
The core of NVIDIA TensorRT is a C++ library that facilitates high-performance inference on NVIDIA graphics processing units (GPUs). TensorRT takes a trained network, which consists of a network definition and a set of trained parameters, and produces a highly optimized runtime engine that performs inference for that network.
You can describe a TensorRT network using a C++ or Python API, or you can import an existing Caffe, ONNX, or TensorFlow model using one of the provided parsers.
TensorRT provides C++ and Python APIs that let you express deep learning models through the Network Definition API or load a pre-defined model through the parsers, so that TensorRT can optimize and run them on an NVIDIA GPU. TensorRT applies graph optimizations and layer fusion, among other optimizations, and finds the fastest implementation of the model by drawing on a diverse collection of highly optimized kernels. TensorRT also supplies a runtime that you can use to execute this network on all of NVIDIA's GPUs from the Kepler generation onwards.
TensorRT also includes optional high-speed mixed-precision capabilities introduced in the Tegra X1 and extended with the Pascal, Volta, and Turing architectures.
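The import-and-optimize flow described above can be sketched with the Python API. This is a minimal sketch, not the method used by the container's samples: it assumes the `tensorrt` Python bindings at version 8.0 or later and an ONNX model on disk. The function name `build_engine_from_onnx` and its `fp16` switch are hypothetical names chosen for illustration.

```python
def build_engine_from_onnx(onnx_path, fp16=False):
    """Parse an ONNX file and return a serialized TensorRT engine."""
    # Imported lazily so this file can be loaded on machines without TensorRT.
    # Assumption: TensorRT 8.0 or later Python bindings are installed.
    import tensorrt as trt

    logger = trt.Logger(trt.Logger.WARNING)
    builder = trt.Builder(logger)

    # The ONNX parser works on explicit-batch network definitions.
    flags = 1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH)
    network = builder.create_network(flags)
    parser = trt.OnnxParser(network, logger)

    # Populate the network definition from the ONNX file.
    with open(onnx_path, "rb") as f:
        if not parser.parse(f.read()):
            errors = [str(parser.get_error(i)) for i in range(parser.num_errors)]
            raise RuntimeError("ONNX parse failed: " + "; ".join(errors))

    config = builder.create_builder_config()
    if fp16:
        # Allow reduced-precision kernels where the GPU supports them.
        config.set_flag(trt.BuilderFlag.FP16)

    # The builder selects kernels and applies fusions while building.
    serialized = builder.build_serialized_network(network, config)
    if serialized is None:
        raise RuntimeError("engine build failed")
    return serialized
```

The returned buffer can be written to disk or handed to `trt.Runtime(logger).deserialize_cuda_engine(...)` to obtain an engine for inference.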
Before you can run an NGC deep learning framework container, your Docker environment must support NVIDIA GPUs. To run a container, issue the appropriate command as explained in the Running A Container chapter in the NVIDIA Containers And Frameworks User Guide and specify the registry, repository, and tags. For more information about using NGC, refer to the NGC Container User Guide.
How GPU support is enabled on your system depends on the DGX OS version installed (for DGX systems), the specific NGC Cloud Image provided by a Cloud Service Provider, or the software that you have installed in preparation for running NGC containers on TITAN PCs, Quadro PCs, or vGPUs.
Select the Tags tab and locate the container image release that you want to run.
In the Pull Tag column, click the icon to copy the docker pull command.
Open a command prompt and paste the pull command. The pulling of the container image begins. Ensure the pull completes successfully before proceeding to the next step.
Run the container image.
If you have Docker 19.03 or later, a typical command to launch the container is:
docker run --gpus all -it --rm -v local_dir:container_dir nvcr.io/nvidia/tensorrt:xx.xx-py3
If you have Docker 19.02 or earlier, a typical command to launch the container is:
nvidia-docker run -it --rm -v local_dir:container_dir nvcr.io/nvidia/tensorrt:xx.xx-py3
--rm will delete the container when finished
xx.xx is the container version. For example,
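The pull step and the version-dependent launch commands above can be sketched as a small helper. This is an illustrative sketch only: the function names `pull_command` and `launch_command` are hypothetical, and the code only assembles the argument lists shown in this section without running Docker.

```python
def image_ref(tag):
    """Return the full image reference for a TensorRT container tag."""
    return f"nvcr.io/nvidia/tensorrt:{tag}"


def pull_command(tag):
    """Return the docker pull command as an argument list."""
    return ["docker", "pull", image_ref(tag)]


def launch_command(docker_version, local_dir, container_dir, tag):
    """Return the launch command that matches the installed Docker version."""
    # Compare only major.minor, so "19.03.5" and "19.03" behave the same.
    major, minor = (int(part) for part in docker_version.split(".")[:2])
    common = ["-it", "--rm", "-v", f"{local_dir}:{container_dir}", image_ref(tag)]
    if (major, minor) >= (19, 3):
        # Docker 19.03 or later: GPU access via the --gpus flag.
        return ["docker", "run", "--gpus", "all"] + common
    # Earlier Docker versions: use the nvidia-docker wrapper instead.
    return ["nvidia-docker", "run"] + common


if __name__ == "__main__":
    # "xx.xx-py3" is the placeholder tag used in this section; substitute a real tag.
    print(" ".join(pull_command("xx.xx-py3")))
    print(" ".join(launch_command("19.03", "local_dir", "container_dir", "xx.xx-py3")))
```

Returning argument lists rather than a single string means the result can be passed directly to `subprocess.run` without shell quoting concerns.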
To build and run the C++ samples inside the container:
cd /workspace/tensorrt/samples
make -j4
cd /workspace/tensorrt/bin
./sample_mnist
To run one of the Python samples:
cd /workspace/tensorrt/samples/python/introductory_parser_samples
python caffe_resnet50.py -d /workspace/tensorrt/python/data
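If you want to script the sample commands above rather than type them, a small driver along these lines is one option. It is a sketch under assumptions: the names `CPP_SAMPLE_STEPS`, `PYTHON_SAMPLE_STEPS`, and `run_steps` are hypothetical, the paths are the ones listed in this section, and it is meant to be run inside the container.

```python
import subprocess

# (working directory, command) pairs copied from the sample commands above.
CPP_SAMPLE_STEPS = [
    ("/workspace/tensorrt/samples", ["make", "-j4"]),
    ("/workspace/tensorrt/bin", ["./sample_mnist"]),
]
PYTHON_SAMPLE_STEPS = [
    (
        "/workspace/tensorrt/samples/python/introductory_parser_samples",
        ["python", "caffe_resnet50.py", "-d", "/workspace/tensorrt/python/data"],
    ),
]


def run_steps(steps, runner=subprocess.run):
    """Run each command in its directory, stopping at the first failure."""
    for cwd, argv in steps:
        # check=True makes subprocess.run raise CalledProcessError
        # when a command exits with a non-zero status.
        runner(argv, cwd=cwd, check=True)


if __name__ == "__main__":
    run_steps(CPP_SAMPLE_STEPS)
    run_steps(PYTHON_SAMPLE_STEPS)
```

The `runner` parameter exists only so the sequence can be exercised without executing anything, for example by passing a function that records its arguments.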
See /workspace/README.md inside the container for information on customizing your image.
In order to save space, some of the dependencies of the Python samples have not been pre-installed in the container. To install these dependencies, run the following command before you run these samples:
For the latest TensorRT container Release Notes, see the TensorRT Container Release Notes website.
For a full list of the supported software and specific versions that come packaged with this framework based on the container image, see the Frameworks Support Matrix.
For the latest TensorRT product Release Notes and the Developer and Installation Guides, see the TensorRT Product Documentation website.
To review known CVEs on the 21.07 image, please refer to the Known Issues section of the Product Release Notes.
By pulling and using the container, you accept the terms and conditions of this End User License Agreement.