NVIDIA L4T TensorRT

Description

NVIDIA TensorRT is a C++ library that facilitates high-performance inference on NVIDIA graphics processing units (GPUs). TensorRT takes a trained network and produces a highly optimized runtime engine that performs inference for that network.

Publisher

NVIDIA

Latest Tag

r10.3.0-devel

Modified

October 10, 2024

Compressed Size

4.06 GB

Linux / arm64

What Is TensorRT?

The core of NVIDIA TensorRT is a C++ library that facilitates high-performance inference on NVIDIA graphics processing units (GPUs). TensorRT takes a trained network, which consists of a network definition and a set of trained parameters, and produces a highly optimized runtime engine that performs inference for that network.

You can describe a TensorRT network using a C++ or Python API, or you can import an existing Caffe, ONNX, or TensorFlow model using one of the provided parsers.

TensorRT provides C++ and Python APIs that let you express deep learning models through the Network Definition API, or load a pre-defined model through the parsers, so that TensorRT can optimize and run them on an NVIDIA GPU. TensorRT applies graph optimizations and layer fusion, among other optimizations, and also finds the fastest implementation of the model by drawing on a diverse collection of highly optimized kernels. TensorRT also supplies a runtime that you can use to execute this network on all of NVIDIA's GPUs from the Kepler generation onwards.

TensorRT also includes optional high-speed mixed-precision capabilities introduced in the Tegra X1 and extended with the Pascal, Volta, and Turing architectures.
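As a rough illustration of turning a trained model into an optimized engine with optional reduced precision, the sketch below assembles a command line for trtexec, the command-line tool that ships with TensorRT. It only prints the command. The file names model.onnx and model.engine are hypothetical, and whether trtexec is present inside a given l4t-tensorrt image is an assumption you would need to check.

```shell
# Print a trtexec command that builds an engine from an ONNX model.
# Arguments: <onnx file> <engine file> [fp32|fp16|int8]
# The file names passed in are placeholders, not files from this page.
trtexec_cmd() {
  local onnx="$1" engine="$2" precision="${3:-fp32}"
  local cmd="trtexec --onnx=$onnx --saveEngine=$engine"
  case "$precision" in
    fp16) cmd="$cmd --fp16" ;;   # allow half-precision kernels
    int8) cmd="$cmd --int8" ;;   # allow 8-bit integer kernels
  esac
  printf '%s\n' "$cmd"
}

trtexec_cmd model.onnx model.engine fp16
```

Printing the command instead of running it keeps the sketch safe to try on a machine without TensorRT installed.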

Overview of Images

Currently, only the TensorRT runtime container is provided. The TensorRT runtime container image is intended to be used as a base image to containerize and deploy AI applications on Jetson. It uses the l4t-cuda runtime container as its base image. The container itself includes the TensorRT runtime components as well as the CUDA runtime and CUDA math libraries; these components are not mounted from the host by the NVIDIA Container Runtime. The NVIDIA Container Runtime still mounts platform-specific libraries and select device nodes into the container.

The image tag corresponds to the TensorRT release version. For example, the l4t-tensorrt:r8.0.1-runtime container is intended to run on devices running JetPack 4.6, which supports TensorRT version 8.0.1.
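The tag convention described above can be sketched as two small helpers that split a tag into its TensorRT version and its image flavor. The function names are made up for this illustration, and the sketch assumes tags always follow the r<version>-<flavor> pattern seen on this page.

```shell
# Print the TensorRT version encoded in a tag such as "r8.0.1-runtime".
tag_version() {
  local tag="$1"
  tag="${tag#r}"                 # drop the leading "r"
  printf '%s\n' "${tag%%-*}"     # drop everything from the first "-" onward
}

# Print the image flavor (the text after the first "-").
tag_flavor() {
  local tag="$1"
  printf '%s\n' "${tag#*-}"
}

tag_version "r8.0.1-runtime"
tag_flavor  "r8.0.1-runtime"
```

A script can use the version half to check that the image matches the TensorRT release supported by the JetPack version on the device before pulling.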

Running the container

Prerequisites

Ensure that NVIDIA Container Runtime is running on the Jetson device.

Note that NVIDIA Container Runtime is available for installation as part of NVIDIA JetPack.
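One way to check this prerequisite is to ask Docker which runtimes it has registered and look for one named nvidia. The helper below is a minimal sketch: its name is hypothetical, and it assumes the runtime is registered under the name "nvidia", which is the name the run command on this page passes to --runtime.

```shell
# Succeeds if the JSON on stdin lists a runtime named "nvidia".
# Intended input: the output of
#   sudo docker info --format '{{json .Runtimes}}'
has_nvidia_runtime() {
  grep -q '"nvidia"'
}

# Example, to be run on the Jetson device:
#   sudo docker info --format '{{json .Runtimes}}' | has_nvidia_runtime \
#     && echo "NVIDIA runtime is registered"
```

The pattern includes the surrounding quotes so that a path such as nvidia-container-runtime on its own does not count as a match.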

Pull the container

Before running the l4t-tensorrt runtime container, use docker pull to ensure that an up-to-date image is installed. Once the pull is complete, you can run the container image.

Procedure

In the Pull column, click the icon to copy the docker pull command for the l4t-tensorrt runtime container.
Open a command prompt and paste the pull command. Docker will initiate a pull of the container from the NGC registry.
Ensure the pull completes successfully before proceeding to the next step.
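The procedure above can also be scripted. The sketch below composes the image reference from the registry path used in the run command on this page and reports success only if docker pull exits cleanly. The function names are hypothetical, and the pull itself is left as a commented example because it needs a Jetson device with Docker and network access.

```shell
# Compose the full image reference for a given tag.
image_ref() {
  printf 'nvcr.io/nvidia/l4t-tensorrt:%s\n' "$1"
}

# Pull the image. docker exits non-zero when a pull fails, so the
# message is printed only after a successful pull.
pull_image() {
  local ref
  ref="$(image_ref "$1")"
  sudo docker pull "$ref" && echo "pulled $ref"
}

# Example, to be run on the Jetson device:
#   pull_image r8.0.1-runtime
```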

Run the container

To run the container:

Allow external applications to connect to the host's X display:

xhost +

Run the container using the docker command:

sudo docker run -it --rm --net=host --runtime nvidia -e DISPLAY=$DISPLAY -v /tmp/.X11-unix/:/tmp/.X11-unix nvcr.io/nvidia/l4t-tensorrt:r8.0.1-runtime

Options explained:

-it runs the container in interactive mode
--rm deletes the container when it exits
--runtime nvidia uses the NVIDIA container runtime while running the l4t-tensorrt container
-v specifies a directory to mount, and is used here to mount the host's X11 display into the container filesystem to render output videos
r8.0.1 is the tag for the image, corresponding to the TensorRT release

Exposing additional features

By default a limited set of device nodes and associated functionality is exposed within the cuda-runtime containers using the mount plugin capability. This list is documented here.

Users can expose additional devices using the --device option provided by docker.
Directories and files can be bind mounted using the -v option.

Note that usage of some devices might need associated libraries to be available inside the container.
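As an illustration of combining both options, the sketch below prints a docker run command that adds one device node and one bind mount to the options used earlier on this page. Nothing is executed. The device /dev/video0 and the directories /home/user/data and /data are placeholders chosen for the example, not values taken from this page.

```shell
# Print a docker run command that exposes one extra device node and
# one extra bind mount. Arguments: <device> <host dir> <container dir>
# Paths containing spaces are not handled by this simple sketch.
run_cmd_with_extras() {
  local device="$1" host_dir="$2" container_dir="$3"
  local image="nvcr.io/nvidia/l4t-tensorrt:r8.0.1-runtime"
  printf 'sudo docker run -it --rm --net=host --runtime nvidia'
  printf ' --device %s' "$device"                    # extra device node
  printf ' -v %s:%s' "$host_dir" "$container_dir"    # extra bind mount
  printf ' %s\n' "$image"
}

# Placeholder device and directories:
run_cmd_with_extras /dev/video0 /home/user/data /data
```

If the device needs user-space libraries that are not in the image, those would have to be installed in the container or mounted in the same way.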

Run a sample application

Once you have successfully launched the l4t-tensorrt container, you can run TensorRT samples inside it. For example, to run TensorRT samples inside the l4t-tensorrt runtime container, mount the TensorRT samples into the container using the -v option (-v ) during "docker run", and then run the samples from within the container.
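A minimal sketch of the mount step follows. It assumes the samples live under /usr/src/tensorrt on the host, which is a common location on JetPack installs but is not stated on this page; pass a different directory if yours differs. The helper name is hypothetical, and the directory is mounted at the same path inside the container.

```shell
# Print the -v option that mounts the samples directory at the same
# path inside the container. The default path is an assumption.
samples_mount_opt() {
  local dir="${1:-/usr/src/tensorrt}"
  printf '%s %s:%s\n' "-v" "$dir" "$dir"
}

samples_mount_opt

# Example, to be run on the Jetson device:
#   sudo docker run -it --rm --net=host --runtime nvidia \
#     $(samples_mount_opt) nvcr.io/nvidia/l4t-tensorrt:r8.0.1-runtime
```

Because the runtime image is meant for deployment, it may not contain build tools, so samples may need to be built on the host before they are mounted.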

License

By pulling and using the container, you accept the terms and conditions of this End User License Agreement.