VLM Summarization (Jetson)

Description

Provides natural language interfaces for performing video summarization

Publisher

NVIDIA

Latest Tag

2.0.9

Modified

August 2, 2025

Compressed Size

11.6 GB

Multinode Support

Multi-Arch Support

2.0.9 (Latest) Security Scan Results

Linux / arm64

VLM Video Summarization

Video recording systems collect vast amounts of data, often with sparse but significant events of interest. An event may persist for some time without meriting a mention proportional to its duration. These aspects make efficient yet effective video summarization an important feature for the overall usefulness of an AI-based video system. Generative AI provides an accurate, generalizable technique, based on natural language interfaces, for performing video summarization, and the industry has actively investigated it. The video summarization microservice addresses these functional and design requirements and can be used out of the box. Its design and functionality are modeled after the Video Search and Summarization (VSS) Agent Blueprint that NVIDIA released for Tesla GPUs.

Using the video summarization service is a two-step process, which keeps it API-compatible with the VSS Blueprint. The user first uploads a file through the files API, which returns a handle. The user then launches summarization by invoking the summarize API with that handle.
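The two-step flow (upload, then summarize by handle) can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the service's documented client: the host and port, the endpoint paths `/files` and `/summarize`, the multipart field name `file`, and the JSON field name `id` are all hypothetical placeholders. Consult the microservice documentation for the actual request and response schemas.

```python
import json
import uuid
import urllib.request
from pathlib import Path

# Assumption: placeholder address; the real host and port depend on deployment.
BASE_URL = "http://localhost:5010"


def _send(req):
    """Send a request and decode the JSON response body."""
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read().decode("utf-8"))


def build_upload_request(path, base_url=BASE_URL):
    """Step 1: build a multipart/form-data POST carrying the video file.

    Assumption: the files API accepts a form field named "file".
    """
    boundary = uuid.uuid4().hex
    name = Path(path).name
    data = Path(path).read_bytes()
    body = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="file"; filename="{name}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n"
    ).encode("utf-8") + data + f"\r\n--{boundary}--\r\n".encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/files",
        data=body,
        method="POST",
        headers={"Content-Type": f"multipart/form-data; boundary={boundary}"},
    )


def build_summarize_request(file_id, base_url=BASE_URL):
    """Step 2: build a JSON POST that refers to the uploaded file by handle.

    Assumption: the summarize API takes the handle in a field named "id".
    """
    body = json.dumps({"id": file_id}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/summarize",
        data=body,
        method="POST",
        headers={"Content-Type": "application/json"},
    )


def summarize_video(path, send=_send):
    """Upload the file, then request a summary using the returned handle."""
    handle = send(build_upload_request(path))["id"]  # assumed response field
    return send(build_summarize_request(handle))
```

Calling `summarize_video("clip.mp4")` would perform both requests against `BASE_URL`. The `send` parameter exists only so the transport can be swapped out, for example to try the flow without a running service.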

The video summarization microservice is based on the NanoLLM framework for the NVIDIA Jetson platform.

For more information about this microservice, refer to https://docs.nvidia.com/jetson/jps/inference-services/video_summarization.html

License

By downloading or using the software and materials, you agree to the License Agreement for Jetson Platform Services.