DeepStream - CV Deployment

DeepStream SDK delivers a complete streaming analytics toolkit for AI-based video and image understanding and multi-sensor processing. The DeepStream SDK brings deep neural networks and other complex processing tasks into a stream processing pipeline.
May 17, 2024

What is DeepStream?

NVIDIA’s DeepStream SDK delivers a complete streaming analytics toolkit for AI-based multi-sensor processing and video and image understanding. DeepStream is an integral part of NVIDIA Metropolis, the platform for building end-to-end services and solutions that transform pixels and sensor data into actionable insights. The DeepStream SDK features hardware-accelerated building blocks, called plugins, that bring deep neural networks and other complex processing tasks into a stream processing pipeline. The SDK allows you to focus on building core deep learning networks and IP rather than designing end-to-end solutions from scratch.

The SDK uses AI to perceive pixels and analyze metadata while offering integration from the edge to the cloud. It can be used to build applications across a range of use cases, including retail analytics, patient monitoring in healthcare facilities, parking management, optical inspection, and logistics and operations management.

The SDK features:

  • Running inference in native TensorFlow and TensorFlow-TensorRT using Triton inference server

  • Development in Python using DeepStream Python bindings

  • Edge to cloud integration using standard message brokers like Kafka and MQTT or with Azure Edge IoT

  • IoT and manageability features: bi-directional messaging between edge and cloud, over the air model (OTA) update, smart recording and TLS based authentication for secure messaging

  • Turnkey deployment of models trained with TAO Toolkit

  • Latest releases of NVIDIA libraries for AI and other GPU computing tasks: TensorRT™ 7.0 and CUDA® 10.2/CUDA® 11

  • Hardware accelerated video and image decoding

  • Sample apps in C/C++ and Python to get started
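As a rough illustration of the edge-to-cloud messaging feature, the sketch below assembles a detection-event payload of the kind a DeepStream message broker adapter might publish to Kafka or MQTT. The field names here are a simplified, hypothetical example, not the official DeepStream message schema; consult the Gst-nvmsgconv documentation for the exact format.

```python
import json
from datetime import datetime, timezone

def build_detection_event(sensor_id, object_label, bbox, confidence):
    """Assemble a JSON payload for a single detection.

    The schema below is a simplified, hypothetical sketch of the kind of
    metadata DeepStream forwards to brokers such as Kafka or MQTT; the real
    schema is defined by the DeepStream message converter configuration.
    """
    return json.dumps({
        "sensorId": sensor_id,
        "@timestamp": datetime.now(timezone.utc).isoformat(),
        "object": {
            "label": object_label,
            "bbox": {"left": bbox[0], "top": bbox[1],
                     "width": bbox[2], "height": bbox[3]},
            "confidence": confidence,
        },
    })

# Example: a car detected by a (hypothetical) camera_01 sensor.
event = build_detection_event("camera_01", "car", (100, 40, 64, 32), 0.91)
print(event)
```

In a deployed pipeline this serialization is handled by the message converter and broker plugins; the snippet only shows the shape of the data crossing the edge-to-cloud boundary.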

See DeepStream and TAO in action by exploring our latest NVIDIA AI demos, including multi-camera tracking, Project Tokkio, self-checkout, and more.

What is in this Collection?

DeepStream Container

The NVIDIA DeepStream Container

TLT Models

Purpose-built pre-trained models ready for inference.



The purpose-built models are available on NGC. Under each model card there is a pruned version, which can be deployed as is, and an unpruned version, which can be used with TLT to fine-tune on your own dataset.

Model Name      Network Architecture    Number of Classes  Accuracy  Use Case
TrafficCamNet   DetectNet_v2-ResNet18   4                  83.5%     Detect and track cars
PeopleNet       DetectNet_v2-ResNet18   3                  80%       People counting, heatmap generation, social distancing
PeopleNet       DetectNet_v2-ResNet34   3                  84%       People counting, heatmap generation, social distancing
DashCamNet      DetectNet_v2-ResNet18   4                  80%       Identify objects from a moving object
FaceDetectIR    DetectNet_v2-ResNet18   1                  96%       Detect faces in a dark environment with an IR camera
VehicleMakeNet  ResNet18                20                 91%       Classifying car models
VehicleTypeNet  ResNet18                6                  96%       Classifying cars by type: coupe, sedan, truck, etc.
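To make the mapping from use case to model concrete, the short sketch below (plain Python, independent of TLT or DeepStream) encodes the table above and picks the most accurate model matching a use-case keyword. The keyword tags are informal labels added for illustration.

```python
# Summary of the purpose-built model table above (accuracy in percent).
# The last field is an informal use-case tag added for this example.
PRETRAINED_MODELS = [
    ("TrafficCamNet",  "DetectNet_v2-ResNet18", 4,  83.5, "traffic"),
    ("PeopleNet",      "DetectNet_v2-ResNet18", 3,  80.0, "people"),
    ("PeopleNet",      "DetectNet_v2-ResNet34", 3,  84.0, "people"),
    ("DashCamNet",     "DetectNet_v2-ResNet18", 4,  80.0, "dashcam"),
    ("FaceDetectIR",   "DetectNet_v2-ResNet18", 1,  96.0, "face"),
    ("VehicleMakeNet", "ResNet18",              20, 91.0, "vehicle make"),
    ("VehicleTypeNet", "ResNet18",              6,  96.0, "vehicle type"),
]

def best_model(use_case):
    """Return (name, backbone, accuracy) of the most accurate model
    whose use-case tag contains the query, or None if nothing matches."""
    candidates = [m for m in PRETRAINED_MODELS if use_case in m[4]]
    if not candidates:
        return None
    name, arch, _, acc, _ = max(candidates, key=lambda m: m[3])
    return (name, arch, acc)

print(best_model("people"))  # → ('PeopleNet', 'DetectNet_v2-ResNet34', 84.0)
```

Note that the two PeopleNet rows differ only by backbone; the deeper ResNet34 variant trades model size for higher accuracy.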

Architecture specific pre-trained models

In addition to purpose-built models, the Transfer Learning Toolkit supports several detection meta-architectures.

These detection meta-architectures can be used with 13 backbones (feature extractors) in TLT. For a complete list of the supported permutations, see the support matrix in the TLT documentation.

TLT 2.0 supports instance segmentation using the MaskRCNN architecture.

DeepStream container for x86 and NVIDIA GPU

This page describes the containers for NVIDIA data center GPUs, such as T4 or A100, running on x86 platforms.

Starting with the DeepStream 4.0.1 release, different container variants are released for x86 platforms with NVIDIA GPUs to cater to different user needs. They are differentiated by image tag and are described below:

  • Base: The DeepStream base container contains the plugins and libraries that are part of the DeepStream SDK, along with dependencies such as CUDA, TensorRT, GStreamer, etc. This image should be used as the base image for creating Docker images for your own DeepStream-based applications. Not supported on A100. Note that the base image does not contain sample apps (deepstream:5.0-20.07-base)

  • Samples: The DeepStream samples container extends the base container to also include the sample applications that are included in the DeepStream SDK, along with associated config files, models, and streams. It thereby provides a ready means to explore the DeepStream SDK using the samples. Not supported on A100 (deepstream:5.0-20.07-samples)

    • Limitations: On the samples container, a "Cuda failure: status=4" message is sometimes observed for the deepstream-test1 application on the T4 platform, but the test should still run fine.
  • IoT: The DeepStream IoT container extends the base container to include the DeepStream test5 application along with associated configs and models. The container can readily be used to build multi-stream DeepStream applications that integrate with various messaging backends, including Kafka, Azure IoT, and MQTT, thereby enabling IoT use cases. Not supported on A100 (deepstream:5.0-20.07-iot)

  • Development: The DeepStream development container further extends the samples container by including the build toolchains, development libraries, and packages necessary for building DeepStream reference applications from within the container. This container is slightly larger in size by virtue of including the build dependencies. Not supported on A100 (deepstream:5.0-20.07-devel)

  • Deployment with Triton: The DeepStream Triton container enables running inference using Triton Inference Server. With this, developers can run inference natively using TensorFlow, TensorFlow-TensorRT, PyTorch, and ONNX-RT. Inference with Triton is supported in the reference application (deepstream-app). To read more about how to use Triton with DeepStream, refer to the Plugins manual. This container is the biggest in size because it combines multiple containers. Not supported on A100 (deepstream:5.0-20.07-triton)

  • A100 container: The DeepStream A100 container enables DeepStream development and deployment on A100 GPUs running CUDA 11. The other containers run on CUDA 10.2 and will not work on A100. This container builds on top of the deepstream:5.0-20.07-devel container and adds CUDA 11 and A100 support. It includes all the build toolchains, development libraries, and packages necessary for building DeepStream reference applications from within the container. This container has a few known limitations, noted below, which will be addressed in future releases. (deepstream:5.0-20.08-devel-a100)

    • Known limitations in A100 container:
      1. NvDCF Tracker doesn’t work on this container
      2. NV Optical flow is not supported
      3. YOLO sample app doesn’t build properly
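Since the base variant is intended as a starting point for custom images, deriving an application image from it might look roughly like the following. This is a minimal sketch: the myapp/ directory, binary name, and config file are placeholders, not part of the SDK.

```dockerfile
# Hypothetical Dockerfile for a custom DeepStream application.
# The image tag comes from the variant list above; myapp/ and its
# contents are placeholder names for your own application.
FROM nvcr.io/nvidia/deepstream:5.0-20.07-base

# Copy the application binary and its config files into the image.
COPY myapp/ /opt/myapp/
WORKDIR /opt/myapp

# Launch the application when the container starts.
CMD ["./my-deepstream-app", "-c", "myapp_config.txt"]
```

Building on the base image keeps the final image small; use the devel variant instead if you need to compile the application inside the container.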

Running DeepStream


Ensure these prerequisites are available on your system:

  1. nvidia-docker: We recommend using Docker 19.03 along with the latest nvidia-container-toolkit, as described in the installation steps. Use of nvidia-docker2 packages in conjunction with prior Docker versions is now deprecated.

  2. NVIDIA display driver version 440+

Pull the container

Before running the container, use docker pull to ensure an up-to-date image is installed. Once the pull is complete, you can run the container image.


  1. In the Pull column, click the icon to copy the docker pull command for the DeepStream container of your choice.

  2. Open a command prompt and paste the pull command. The pulling of the container image begins. Ensure the pull completes successfully before proceeding to the next step.
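For example, pulling the samples variant from the NGC registry looks like the sketch below. The tag is one of the variants listed above; the actual pull is commented out because it requires Docker and network access to nvcr.io.

```shell
# Choose a container variant by tag (base, samples, iot, devel, triton).
TAG="5.0-20.07-samples"
IMAGE="nvcr.io/nvidia/deepstream:${TAG}"
echo "Pulling ${IMAGE}"
# Requires Docker and access to the NGC registry:
# docker pull "${IMAGE}"
```

Swap the tag to pull a different variant, e.g. 5.0-20.07-devel for the development container.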

Run the container

To run the container:

  1. Allow external applications to connect to the host's X display:
xhost +
  2. Run the Docker container (use the desired container tag in the command line below):
docker run --gpus all -it --rm -v /tmp/.X11-unix:/tmp/.X11-unix -e DISPLAY=$DISPLAY -w /opt/nvidia/deepstream/deepstream-5.0 nvcr.io/nvidia/deepstream:5.0-20.07-samples
If using nvidia-docker (deprecated) with a Docker version prior to 19.03: nvidia-docker run -it --rm -v /tmp/.X11-unix:/tmp/.X11-unix -e DISPLAY=$DISPLAY -w /opt/nvidia/deepstream/deepstream-5.0 nvcr.io/nvidia/deepstream:5.0-20.07-samples

Note that the command mounts the host's X11 display in the guest filesystem to render output videos.

Options explained:

  • -it runs the container in interactive mode

  • --gpus makes GPUs accessible inside the container

  • --rm deletes the container when it exits

  • -v mounts a directory; here it mounts the host's X11 display in the container filesystem to render output videos

  • 5.0-20.07-samples is the image tag; 20.07 refers to the version of the container for that release, and samples refers to the DeepStream container variant

  • Users can mount additional directories (using the -v option) as required, containing configuration files and models for access by applications executed from within the container

  • The --cap-add SYSLOG option must additionally be included to enable use of the nvds_logger functionality inside the container

  • To enable RTSP output, a network port must be mapped from the container to the host with the -p option to allow incoming connections; e.g., -p 8554:8554
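Putting these options together, the sketch below assembles a full run command including the optional logging capability, RTSP port mapping, and an extra volume mount. The /home/user/configs path is a placeholder for your own config/model directory; the final invocation is commented out because it needs a Docker host with an NVIDIA GPU.

```shell
# Assemble the docker run command from the options described above.
# /home/user/configs is a placeholder path for extra config/model mounts.
TAG="5.0-20.07-samples"
CMD="docker run --gpus all -it --rm \
  --cap-add SYSLOG \
  -p 8554:8554 \
  -v /tmp/.X11-unix:/tmp/.X11-unix \
  -v /home/user/configs:/configs \
  -e DISPLAY=\$DISPLAY \
  -w /opt/nvidia/deepstream/deepstream-5.0 \
  nvcr.io/nvidia/deepstream:${TAG}"
echo "$CMD"
# eval "$CMD"   # run on a host with Docker and an NVIDIA GPU
```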

See /opt/nvidia/deepstream/deepstream-5.0/README inside the container for deepstream-app usage.


There are known bugs and limitations in the SDK. To learn more about them, refer to the release notes.


The DeepStream SDK license is available within the container at the location /opt/nvidia/deepstream/deepstream-5.0/LicenseAgreement.pdf. By pulling and using the DeepStream SDK (deepstream) container in NGC, you accept the terms and conditions of this license.

Technical blogs

Suggested Reading

Ethical AI

NVIDIA’s platforms and application frameworks enable developers to build a wide array of AI applications. Consider potential algorithmic bias when choosing or creating the models being deployed. Work with the model’s developer to ensure that it meets the requirements for the relevant industry and use case; that the necessary instruction and documentation are provided to understand error rates, confidence intervals, and results; and that the model is being used under the conditions and in the manner intended.