GPU-optimized AI, Machine Learning, & HPC Software

NVIDIA

The MiniMax-M2.5 NIM Container is a deployable inference container for serving MiniMax-M2.5, a third-party text generation model optimized for complex agentic tasks including software engineering, tool use, and search.

Container

Qwen

Qwen3.5-397B-A17B

Next-gen Qwen 3.5 VLM (400B MoE) brings advanced vision, chat, RAG, and agentic capabilities.

Container

NVIDIA

gliner-pii

This container houses GLiNER PII, which detects and classifies a broad range of Personally Identifiable Information (PII) and Protected Health Information (PHI) in structured and unstructured text.

Container

NVIDIA

Nemotron-3-Super-120B-A12B (Experimental)

Nemotron-3-Super-120B-A12B is a large language model (LLM) trained from scratch by NVIDIA and designed as a unified model for both reasoning and non-reasoning tasks.

Container

Qwen

Qwen3.5-122B-A10B

Qwen3.5-122B-A10B is a multimodal vision-language Mixture-of-Experts model designed for native multimodal agent applications, supporting text, image, and video inputs.

Container

Qwen

Qwen3.5-35B-A3B

Qwen3.5-35B-A3B is a multimodal vision-language Mixture-of-Experts model designed for native multimodal agent applications, supporting text, image, and video inputs.

Container

NVIDIA

step-35-flash

Step 3.5 Flash is a sparse Mixture-of-Experts (MoE) large language model developed by StepFun, engineered to deliver frontier reasoning and agentic capabilities with exceptional efficiency.

Container

Moonshot AI

kimi-k2.5-Turbo

This turbo container houses the Kimi K2.5 model, an open-source, native multimodal agentic model built through continual pretraining on approximately 15 trillion mixed visual and text tokens atop Kimi-K2-Base.

Container

Mistral AI

Mistral Small-4 119b-2603

Mistral Small 4 is a hybrid model that can act as both a general instruction model and a reasoning model.

Container

NVIDIA

nvidia-ising-calibration-1

The NVIDIA Ising Calibration 1 NIM houses the NVIDIA-Ising-Calibration-1-35B-A3B-BF16 model, which is a purpose-built Mixture-of-Experts vision-language model (MoE VLM) built on Qwen3.5-35B-A3B,

Container

Z.Ai

GLM-5

GLM-5 is a next-generation large language model targeting complex systems engineering and long-horizon agentic tasks.

Container

Google

Gemma 4 31B IT

Gemma 4 31B IT is an open multimodal model built by Google DeepMind that handles text and image inputs, can process video as sequences of frames, and generates text output.

Container

NVIDIA

Nemotron-Content-Safety-Reasoning-4B (Experimental)

Nemotron Content Safety Reasoning 4B is a Large Language Model (LLM) classifier designed to function as a dynamic and adaptable guardrail for content safety and dialogue moderation (topic-following).

Container

NVIDIA

Nemotron-3-Nano-Omni-30B-A3B-Reasoning

Nemotron Nano V3 Omni is a multimodal large language model that unifies video, audio, image, and text understanding to support enterprise-grade Q&A, summarization, transcription, and document intelligence workflows.

Container

NVIDIA

GLM-5.1

This container houses GLM-5.1, a next-generation flagship model for agentic engineering with significantly stronger coding capabilities than its predecessor, GLM-5. The model achieves state-of-the-art performance on SWE-Bench Pro and leads GLM-5

Container

Xiaomi

Mimo-V2-Flash (Experimental)

This container houses the MiMo-V2-Flash model.

Container

Qwen

qwen3.6-27b

The Qwen3.6-27B NIM Container is a deployable inference container for serving Qwen3.6-27B, a third-party multimodal dense model capable of processing text, image, and video inputs for text generation.

Container

NVIDIA

nemotron-3.5-content-safety

The Nemotron 3.5 Content Safety NIM container packages NVIDIA's small language model (SLM) that uses Google's Gemma-3-4B-it as the base and is fine-tuned by NVIDIA on multimodal, multilingual, and reasoning-oriented content-safety datasets.

Container

DeepSeek AI

DeepSeek-V4-Pro

The DeepSeek-V4-Pro Container is a deployable inference container for serving DeepSeek-V4-Pro, a third-party sparse Mixture-of-Experts language model for reasoning, coding, and agentic tasks.

Container

NVIDIA

Nemotron-3 Content Safety VLM

This NIM container houses the Nemotron 3 Content Safety model, a small language model (SLM) that uses Google's Gemma-3-4B-it as the base and is fine-tuned by NVIDIA on multimodal and multilingual content-safety datasets.

Container

Mistral AI

Mistral-Medium-3.5-128b

Mistral-Medium-3.5-128B is a dense 128B vision-language model (VLM) with a 256k context window, handling instruction-following, reasoning, and coding in a single set of weights.

Container

NVIDIA

Nemotron-3-Ultra-550B-A55B

The Nemotron-3-Ultra-550B-A55B NIM container packages NVIDIA's large language model featuring a hybrid Latent Mixture-of-Experts (LatentMoE) architecture with Multi-Token Prediction (MTP) layers.

Container

Google

Gemma 4 26B A4B IT

Gemma 4 26B A4B IT is a Google multimodal instruction-tuned model packaged as an NVIDIA NIM container for deployment through NVIDIA NGC as a Downloadable NIM.

Container

Google

DiffusionGemma 4 26B A4B IT

The DiffusionGemma-4-26B-A4B-IT model is an open-weights multimodal generative model developed by Google DeepMind that processes text, image, and video inputs to produce text output via discrete diffusion.

Container