NVIDIA
NVIDIA
Multi-LLM PB October 2025 (PB 25h2)
Container
NVIDIA
NVIDIA
Multi-LLM PB October 2025 (PB 25h2)

Multi-LLM NIM Production Branch October 2025 (PB25h2) offers a 9-month lifecycle for API stability, with monthly patches for high and critical software vulnerabilities. This release includes Government Ready images for regulated environments

Subscribe to get accessSubscribe to the product below to access this premium content:
NVIDIA AI Enterprise
NVIDIA AI EnterpriseAccelerate your AI agent development
Subscribe Now
Note: You can gain access to hundreds more GPU-optimized artifacts by creating a free NGC account.
Already Subscribed?Log in

Description:

This container houses the Multi-LLM PB October 2025 (PB 25h2), which is a NIM microservice designed for use with a broad range of LLMs. It includes a production-ready inference runtime with optimized inference engines from NVIDIA and the community— such as NVIDIA TensorRT-LLM, vLLM, and SGLang. When provided an LLM input at container runtime, stored in either Hugging Face or TensorRT-LLM formats, the NIM identifies the model’s format, architecture, and quantization, selects the inference backend, applies pre-configured settings for the LLM and backend, and starts serving the model for inference. For more information, see the NIM documentation and supported model architectures.

The container components are ready for commercial/non-commercial use.

The Multi-LLM NIM container production branch October 2025 (PB 25h2), exclusively available with NVIDIA AI Enterprise, is a 9-month supported, API-stable branch that includes monthly fixes for high and critical software vulnerabilities. This branch provides a stable and secure environment for building your mission-critical AI applications. The Multi-LLM NIM container production branch releases every six months with a three-month overlap in between two releases. This release includes Government Ready images for regulated environments.

Government Ready: STIG/FIPS Hardening

This ensures the highest level of security for regulated environments, the x86 container image for this branch is:

  • STIG Ubuntu 24.04 hardened
  • Supports FIPS 140-2 / 3 validated crypto / uses libraries that support FIPS crypto

To use this specific hardened image, navigate to the repository's Tags tab and look for the purple label indicating Gov ready displayed alongside the tag.

Learn more about NVIDIA's hardened image in the AI Software for Regulated Environments White Paper.

Compatible Infrastructure Software Versions

For the optimized performance, it is highly recommended to deploy the supported NVIDIA AI Enterprise Infrastructure software in conjunction with your AI software. Production Branch - October 2025 (25h2) is compatible with NVIDIA AI Enterprise Infrastructure 7.

Warning:

NVIDIA cannot guarantee the security of any models hosted on non-NVIDIA systems such as HuggingFace. Malicious or insecure models can result in serious security risks up to and including full remote code execution. We strongly recommend that before attempting to load it, you manually verify the safety of any model not provided by NVIDIA, through such mechanisms as a) ensuring that the model weights are serialized using the safetensors format, b) conducting a manual review of any model or inference code to ensure that it is free of obfuscated or malicious code, and c) validating the signature of the model, if available, to ensure that it comes from a trusted source and has not been modified.

License/Terms of Use:

GOVERNING TERMS: The NIM container is governed by the NVIDIA Software License Agreement and the Product-Specific Terms for NVIDIA AI Products.

Get Help

Enterprise Support

Get access to knowledge base articles and support cases or submit a ticket.

You are responsible for ensuring that your use of NVIDIA provided models complies with all applicable laws.

Deployment Geography:

Global

Getting started with NVIDIA NIM

Deploying and integrating NVIDIA NIM is straightforward thanks to our industry standard APIs. Visit the NIM Container LLM page for release documentation, deployment guides and more.

Security Vulnerabilities in Open Source Packages

Please review the Security Scanning (LINK) tab to view the latest security scan results. For certain open-source vulnerabilities listed in the scan results, NVIDIA provides a response in the form of a Vulnerability Exploitability eXchange (VEX) document. The VEX information can be reviewed and downloaded from the Security Scanning (LINK) tab

PB5 Multi-LLM

PB5 Multi-LLM Container includes the following model:

Model Name & LinkUse CaseHow to Pull the Model
PB5 Multi-LLMA production-ready inference runtime (using NVIDIA TensorRT-LLM, vLLM, or SGLang) for user-provided LLMs stored in Hugging Face or TensorRT-LLM formats.instruction-following tasks like chatbot creation, content generation, and question-answering.Automatic

Deployment Details:

Visit the NIM Container LLM page for release documentation, deployment guides, and more.

Our AI models are designed and/or optimized to run on NVIDIA GPU-accelerated systems. By leveraging NVIDIA’s hardware (e.g. GPU cores) and software frameworks (e.g., CUDA libraries), the model achieves faster training and inference times compared to CPU-only solutions.

Container Version(s):

PB5 Multi-LLM PB25H2

Ethical Considerations:

NVIDIA believes Trustworthy AI is a shared responsibility and we have established policies and practices to enable development for a wide array of AI applications. When downloaded or used in accordance with our terms of service, developers should work with their internal developer team to ensure this model meets requirements for the relevant industry and use case and addresses unforeseen product misuse.

Please report security vulnerabilities or NVIDIA AI Concerns here.

You are responsible for ensuring that your use of NVIDIA provided Models complies with all applicable laws.

Publisher
NVIDIA
NVIDIA
Latest Tag1.14.0-pb5.8-stig-fips-x86-64
UpdatedJune 25, 2026 UTC
Compressed Size9.24 GB
Multinode SupportNo
Multi-Arch SupportYes