
NVIDIA L4T TensorRT

Description: NVIDIA TensorRT is a C++ library that facilitates high-performance inference on NVIDIA graphics processing units (GPUs). TensorRT takes a trained network and produces a highly optimized runtime engine that performs inference for that network.
Publisher: NVIDIA
Latest Tag: r8.6.2-devel
Modified: March 1, 2024
Compressed Size: 5.5 GB
Multinode Support: No
Multi-Arch Support: No

What Is TensorRT?

The core of NVIDIA TensorRT is a C++ library that facilitates high-performance inference on NVIDIA graphics processing units (GPUs). TensorRT takes a trained network, which consists of a network definition and a set of trained parameters, and produces a highly optimized runtime engine that performs inference for that network.

You can describe a TensorRT network using a C++ or Python API, or you can import an existing Caffe, ONNX, or TensorFlow model using one of the provided parsers.

TensorRT provides C++ and Python APIs for expressing deep learning models through the Network Definition API, or for loading a pre-defined model via the parsers, which allows TensorRT to optimize and run them on an NVIDIA GPU. TensorRT applies graph optimizations and layer fusion, among other optimizations, while also finding the fastest implementation of the model by leveraging a diverse collection of highly optimized kernels. TensorRT also supplies a runtime that you can use to execute this network on all of NVIDIA's GPUs from the Kepler generation onwards.

TensorRT also includes optional high-speed mixed-precision capabilities, introduced with the Tegra X1 and extended with the Pascal, Volta, and Turing architectures.
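
As a concrete sketch of that workflow, an ONNX model can be converted into an optimized engine from the command line with trtexec, the command-line tool that ships with TensorRT (its availability depends on the image variant and on whether the TensorRT samples are present). The file names below are placeholders, and --fp16 simply opts in to the mixed-precision mode described above:

# model.onnx and model.engine are placeholder names; point --onnx at your own exported model
trtexec --onnx=model.onnx --saveEngine=model.engine --fp16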

Overview of Images

Currently, only the TensorRT runtime container is provided. The TensorRT runtime container image is intended to be used as a base image to containerize and deploy AI applications on Jetson. This container uses the l4t-cuda runtime container as its base image. The container includes within itself the TensorRT runtime components as well as the CUDA runtime and CUDA math libraries; these components are not mounted from the host by the NVIDIA container runtime. The NVIDIA container runtime still mounts platform-specific libraries and select device nodes into the container.

The image is tagged with the version corresponding to the TensorRT release version. For example, the l4t-tensorrt:r8.0.1-runtime container is intended to be run on devices running JetPack 4.6, which supports TensorRT version 8.0.1.
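
One way to confirm which release your device is running before choosing a tag is to read the L4T release file; the path below is typical of JetPack/L4T installations:

cat /etc/nv_tegra_release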

Running the container

Prerequisites

Ensure that the NVIDIA Container Runtime is installed and running on your Jetson device.

Note that the NVIDIA Container Runtime is available for installation as part of NVIDIA JetPack.
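
To verify that the runtime is registered with Docker, you can inspect Docker's runtime list; the grep filter below is just one convenient way to spot the nvidia entry:

sudo docker info | grep -i runtime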

Pull the container

Before running the l4t-tensorrt runtime container, use docker pull to ensure an up-to-date image is installed (an example pull command is shown after the procedure below). Once the pull is complete, you can run the container image.

Procedure

  1. In the Pull column, click the icon to copy the docker pull command for the l4t-tensorrt container.
  2. Open a command prompt and paste the pull command. Docker will initiate a pull of the container from the NGC registry.
    Ensure the pull completes successfully before proceeding to the next step.
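
Equivalently, the image can be pulled directly from the command line; the tag below matches the JetPack 4.6 / TensorRT 8.0.1 example used on this page:

sudo docker pull nvcr.io/nvidia/l4t-tensorrt:r8.0.1-runtime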

Run the container

To run the container:

  1. Allow external applications to connect to the host's X display:
xhost +
  2. Run the docker container using the docker command:
sudo docker run -it --rm --net=host --runtime nvidia -e DISPLAY=$DISPLAY -v /tmp/.X11-unix/:/tmp/.X11-unix nvcr.io/nvidia/l4t-tensorrt:r8.0.1-runtime

Options explained:

  • -it runs the container in interactive mode
  • --rm deletes the container when it exits
  • --runtime nvidia uses the NVIDIA container runtime while running the l4t-tensorrt container
  • -v mounts a host directory into the container; here it is used to mount the host's X11 display into the container filesystem so that output videos can be rendered
  • r8.0.1 is the image tag corresponding to the TensorRT release

Exposing additional features

By default, a limited set of device nodes and associated functionality is exposed within the cuda-runtime containers using the mount plugin capability. This list is documented here.

Users can expose additional devices using the --device option provided by Docker.
Directories and files can be bind mounted using the -v option.

Note that usage of some devices might need associated libraries to be available inside the container.
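
A minimal sketch of passing through an extra device and a host directory (the device node and paths below are placeholders, not requirements):

# /dev/video0 and the host/container paths are illustrative; substitute your own
sudo docker run -it --rm --runtime nvidia --device /dev/video0 -v /path/on/host:/path/in/container nvcr.io/nvidia/l4t-tensorrt:r8.0.1-runtime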

Run a sample application

Once you have successfully launched the l4t-tensorrt container, you can run TensorRT samples inside it. For example, to run the TensorRT samples inside the l4t-tensorrt runtime container, mount the samples into the container using the -v option during "docker run" and then run them from within the container, as sketched below.
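
The following sketch assumes the TensorRT samples are installed on the host under /usr/src/tensorrt (the typical JetPack location) and runs the prebuilt trtexec binary from inside the container; adjust the paths and model for your setup:

# /usr/src/tensorrt and the mnist.onnx sample model are assumptions based on a typical JetPack install
sudo docker run -it --rm --net=host --runtime nvidia -v /usr/src/tensorrt:/workspace/tensorrt nvcr.io/nvidia/l4t-tensorrt:r8.0.1-runtime /workspace/tensorrt/bin/trtexec --onnx=/workspace/tensorrt/data/mnist/mnist.onnx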

Suggested Reading

For the latest TensorRT container Release Notes see the TensorRT Container Release Notes website.

For a full list of the supported software and specific versions that come packaged with this framework based on the container image, see the Frameworks Support Matrix.

For the latest TensorRT product Release Notes, Developer and Installation Guides, see the TensorRT Product Documentation website.

License

By pulling and using the container, you accept the terms and conditions of this End User License Agreement.