NGC Catalog
VLM Inference Service (Jetson)

Description: AI Inference Service for using a VLM (visual language model) on streaming video for greater contextual understanding and natural language interaction
Publisher: NVIDIA
Latest Tag: 2.0.0
Modified: March 1, 2025
Compressed Size: 8.57 GB
Multinode Support: No
Multi-Arch Support: No
2.0.0 (Latest) Security Scan Results

Linux / arm64

VLM Inference Service

VLMs are multimodal models that accept images, video, and text by combining a large language model (LLM) with a vision transformer (ViT). This allows them to answer text prompts about videos and images, enabling capabilities such as chatting with a video and defining natural-language alerts.

The VLM AI service enables quick deployment of VLMs with Jetson Platform Services for video-insight applications. It exposes REST API endpoints to configure the video stream input, set alerts, and ask questions in natural language about the input video stream.
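As a rough sketch of how a client might drive those three operations, the snippet below builds the request URL and JSON body for registering a stream, setting alerts, and asking a question. The base URL, endpoint paths, and payload field names are illustrative assumptions, not the documented API; consult the Jetson Platform Services VLM documentation for the actual schema.

```python
import json

# Assumed base URL for the service; the real host/port depends on your deployment.
BASE_URL = "http://localhost:5010/api/v1"

def stream_request(rtsp_url: str, description: str) -> tuple[str, dict]:
    """Build a request to register an RTSP stream as the VLM input (hypothetical endpoint)."""
    return f"{BASE_URL}/live-stream", {
        "liveStreamUrl": rtsp_url,
        "description": description,
    }

def alert_request(stream_id: str, rules: list[str]) -> tuple[str, dict]:
    """Build a request that sets natural-language alert rules (hypothetical endpoint)."""
    return f"{BASE_URL}/alerts", {"id": stream_id, "alerts": rules}

def query_request(stream_id: str, question: str) -> tuple[str, dict]:
    """Build an ad-hoc natural-language question about the stream (hypothetical endpoint)."""
    return f"{BASE_URL}/chat/completions", {
        "id": stream_id,
        "messages": [{"role": "user", "content": question}],
    }
```

Each helper returns a `(url, body)` pair that could be POSTed with any HTTP client, e.g. `requests.post(url, data=json.dumps(body))`; the helper names and payloads here are placeholders to show the workflow, not the service's real contract.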

Additionally, the output of the VLM can be viewed as an RTSP stream. Alert states are stored by the jetson-monitoring service and sent over a WebSocket for integration with other services.
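A consumer of that WebSocket feed would parse each message and react to alerts that become active. The message schema below (fields `"alert"` and `"state"`) is an assumption for illustration only; check the jetson-monitoring documentation for the real payload format.

```python
import json

def firing_alerts(messages):
    """Yield the rule text of every alert whose state is active.

    `messages` is an iterable of raw JSON strings as they might arrive
    over the WebSocket; the field names are illustrative assumptions.
    """
    for raw in messages:
        event = json.loads(raw)
        if event.get("state"):
            yield event["alert"]

sample = [
    json.dumps({"alert": "is there a fire", "state": True}),
    json.dumps({"alert": "is a person present", "state": False}),
]
print(list(firing_alerts(sample)))  # only the active alert is reported
```

In a real integration, this filter would sit inside the WebSocket client's receive loop, forwarding active alerts to whatever downstream service consumes them.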

For more information on the VLM inference service and using it in applications, refer to https://docs.nvidia.com/jetson/jps/inference-services/vlm.html

License

By downloading or using the software and materials, you agree to the License Agreement for Jetson Platform Services.