DOCA Telemetry Service

Description
DOCA Telemetry Service (DTS) is a DOCA service for collecting and exporting telemetry data. Both predefined and user-defined telemetry counters are available, and data can be exported through Prometheus, Fluent Bit, OpenTelemetry, and NetFlow.
Publisher: NVIDIA
Latest Tag: 1.21.2-doca3.0.0-host
Modified: May 4, 2025
Compressed Size: 360.64 MB
Multinode Support: No
Multi-Arch Support: Yes
Security Scan Results for 1.21.2-doca3.0.0-host (Latest): Linux / arm64, Linux / amd64

Introduction

DOCA Telemetry Service (DTS) can run on hosts and BlueField, and collects data from built-in providers and external telemetry applications. The following providers are available:

  • sysfs
  • ethtool
  • ifconfig
  • amBER
  • PPCC (Port Programmable Congestion Control)
  • DCGM (Data Center GPU Manager)
  • NVIDIA SMI (System Management Interface)
  • Diagnostic Data
  • Inventory
  • tc (Traffic Control)

The following aggregation providers collect data from other applications over TCP:

  • fluent_aggr
  • prometheus_aggr

Additional telemetry applications written on top of the DOCA API can send data to the service over IPC (shared memory).

On BlueField, DTS providers that require additional permissions that DTS itself cannot grant can be executed via the DOCA Privileged Executer (DPE) server. Data collected by the DPE server is consumed by DTS in the same way as data from standard providers. For details, refer to the documentation.

Collected data is written or exported according to the configuration. The following options are available:

  • Binary files - Write data as binary files to storage.
  • Fluent Bit - Export data through Fluent Bit forwarding.
  • Prometheus - Create a Prometheus endpoint that keeps the most recent data available for scraping by Prometheus. Enabled by default on port 9100.
  • Prometheus remote write - Export data through the Prometheus remote-write protocol.
  • OpenTelemetry - Export data through the OpenTelemetry Protocol.
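
As a quick check that the Prometheus endpoint is reachable, it can be queried with curl. The following is a minimal sketch, not a documented DTS tool: the helper names are hypothetical, and the 127.0.0.1 address and the /metrics path are assumptions. Only the default port 9100 is stated above.

```shell
# Build the URL of the DTS Prometheus endpoint.
# Assumptions: the host defaults to 127.0.0.1 and metrics are served
# under /metrics; only the default port 9100 comes from the text above.
dts_metrics_url() {
    printf 'http://%s:%s/metrics\n' "${1:-127.0.0.1}" "${2:-9100}"
}

# Fetch the current metrics once; returns non-zero if the endpoint
# cannot be reached or answers with an HTTP error.
dts_scrape() {
    curl --silent --show-error --fail "$(dts_metrics_url "$@")"
}
```

Running `dts_scrape | head` on the machine where DTS runs should print the first few exported counters if the endpoint is enabled. Pass a host and port as arguments to query a different address.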

Note: Starting from DTS version 1.9.0-doca1.3.0, shared memory is mounted under "/dev/shm/telemetry/". Telemetry applications should mount the same folder.

Note: DTS version 1.9.0-doca1.3.0 is not backward compatible with previous DOCA API versions.
Note: DTS version 1.11.0-doca1.5.0 is not backward compatible with previous DOCA API versions.
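
The shared-memory path from the note above can be verified before starting a telemetry application. This is a minimal sketch with a hypothetical function name. It only checks that the directory exists and is writable by the current user, not that DTS is actually attached to it.

```shell
# Check that the DTS shared-memory directory is present and writable.
# The default path comes from the note above; pass another path to override.
check_dts_shm() {
    dir="${1:-/dev/shm/telemetry}"
    if [ ! -d "$dir" ]; then
        echo "missing: $dir" >&2
        return 1
    fi
    if [ ! -w "$dir" ]; then
        echo "not writable: $dir" >&2
        return 2
    fi
    echo "ok: $dir"
}
```

Calling `check_dts_shm` without arguments tests the default location. A non-zero return code indicates that the mount should be reviewed before the application is started.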

BlueField Deployment

Installation and Getting Started

All preparation steps are listed under DOCA's Container Deployment User Guide.

Note: The DOCA Service container is configured for K8S-based deployment, hence the use of the docker pull command is discouraged.

Preparation steps for the DOCA Service

None needed.

Adjusting the .yaml configuration

The .yaml configuration for the container is doca_telemetry.yaml:

wget https://api.ngc.nvidia.com/v2/resources/nvidia/doca/doca_telemetry/versions/1.20.2/files/configs/1.20.2/doca_telemetry_standalone.yaml

The .yaml file allows generating the DTS configuration from scratch and overwriting the Fluent Bit export configuration.

Note: The file is also stored with the rest of the .yaml configurations that were pulled from NGC in the previous steps (see "Installation and Getting Started").

Enabling Fluent Bit forwarding

To enable Fluent Bit forwarding, add the destination host and port to the "command" entry in the initContainers section:

command: ["/bin/bash", "-c", "/usr/bin/telemetry-init.sh && /usr/bin/enable-forward-to-morpheus.sh -i=127.0.0.1 -p=24224"]

Please note that the host and port shown above are just an example.

Spawning the container

Simply copy the updated doca_telemetry.yaml file to the /etc/kubelet.d directory. Kubelet will automatically pull the container image from NGC, and spawn a pod executing the container. The DOCA Telemetry Service application will start executing right away.
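
The copy step can be wrapped in a small guard so that a missing file or directory is reported before kubelet is involved. This is a minimal sketch. The function name deploy_dts_yaml is hypothetical, and the defaults simply mirror the file and directory named above.

```shell
# Copy the edited pod spec into the directory watched by kubelet.
# Both arguments are optional; the defaults mirror the names used above.
deploy_dts_yaml() {
    src="${1:-doca_telemetry.yaml}"
    dest_dir="${2:-/etc/kubelet.d}"
    if [ ! -f "$src" ]; then
        echo "no such file: $src" >&2
        return 1
    fi
    if [ ! -d "$dest_dir" ]; then
        echo "no such directory: $dest_dir" >&2
        return 2
    fi
    cp -- "$src" "$dest_dir"/
}
```

Writing to /etc/kubelet.d typically requires root privileges, so the function is meant to be called from a root shell.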

# View currently active pods, and their IDs (it might take up to 20 seconds for the pod to start)
crictl pods

# View currently active containers, and their IDs
crictl ps

# Examine logs of a given container
crictl logs <container-id>

# Examine kubelet logs, in case something didn't work as expected
journalctl -u kubelet
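
The lookup of the container ID and the log query above can be combined into one helper. This is a sketch under assumptions: the function name is hypothetical, and "doca-telemetry" is an assumed fragment of the container name, not a value taken from this page. Check the output of crictl ps for the actual name.

```shell
# Print the logs of the first running container whose name matches a pattern.
# "doca-telemetry" is an assumed name fragment; adjust it to the name
# reported by "crictl ps" on your system.
dts_logs() {
    pattern="${1:-doca-telemetry}"
    # --name filters containers by a name pattern; -q prints only container IDs.
    id="$(crictl ps --name "$pattern" -q | head -n 1)"
    if [ -z "$id" ]; then
        echo "no running container matches: $pattern" >&2
        return 1
    fi
    crictl logs "$id"
}
```

If no container matches, the function returns a non-zero status, which is a hint to inspect the kubelet logs as described above.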

Please refer to the documentation for more information.

Host Deployment

Note: Host deployment is available starting from DTS version 1.13.0-doca2.0.2.

The x86_64 host version runs as a Docker container. Refer to the Host Deployment section of the documentation for configuration and run commands.

Documentation

The DOCA Telemetry Service guide is available here.

License & EULA

DOCA is licensed under the NVIDIA DOCA License. By pulling and using the container, you accept the terms and conditions of this license.

Technical Support

Use the NVIDIA Developers forum for questions regarding this Software.