This container houses GLM-5.2, a flagship long-context large language model for agentic engineering and advanced reasoning, packaged as an NVIDIA NIM for staging.
GLM-5.2 NIM Overview
Description:
This container houses GLM-5.2, a flagship long-horizon reasoning and coding model from Z.AI. GLM-5.2 improves on GLM-5.1 with a solid 1M-token context window, stronger coding performance, flexible thinking effort levels, and architecture improvements for sparse attention and speculative decoding.
The GLM-5.2 NIM is distributed through NGC and runs with the NVIDIA NIM for Large Language Models runtime using SGLang-backed profiles for supported NVIDIA GPU-accelerated systems.
The container components are ready for commercial use.
Third-Party Community Consideration
The model embedded in the container is not owned or developed by NVIDIA. Please see link to Non-NVIDIA GLM-5.2 Model Card and GLM-5 GitHub repository.
License/Terms of Use:
Governing Terms: The NIM container is governed by the NVIDIA Software License Agreement and Product-Specific Terms for NVIDIA AI Products. Use of the model is governed by the NVIDIA Open Model Agreement. Additional Information: MIT License.
You are responsible for ensuring that your use of NVIDIA provided models complies with all applicable laws.
NVIDIA Legal Release Process provides instructions for getting legal support for license selection.
Deployment Geography:
Global
Release Date:
NGC: 06/30/2026 via NVIDIA NGC GLM-5.2 container page
Program Classes:
GLM-5.2 Container includes the following model:
| Model Name & Link | Use Case | How to Pull the Model |
|---|---|---|
| GLM-5.2 | GLM-5.2 is a long-horizon reasoning and coding model for advanced coding, tool use, terminal operations, repository-scale generation, and multi-step agentic workflows. | Automatic |
Deployment Details:
Our AI models are designed and/or optimized to run on NVIDIA GPU-accelerated systems. By leveraging NVIDIA's hardware (e.g. GPU cores) and software frameworks (e.g., CUDA libraries), the model achieves faster training and inference times compared to CPU-only solutions.
GLM-5.2 is deployed as a Downloadable NIM container from NGC. Users pull the container image from NGC, configure the service following the NGC container page and NVIDIA NIM documentation, and integrate the standard inference APIs into downstream applications and tools.
Supported Hardware Microarchitecture Compatibility: GPUs: 8 NVIDIA B200 Tensor Core GPUs, 8 NVIDIA H20 Tensor Core GPUs, 8 NVIDIA H200 Tensor Core GPUs, or compatible systems with more than 900 GB of aggregate GPU memory. CPU: x86_64 host CPU compatible with NVIDIA NIM for Large Language Models deployments. Memory: Host memory sized for the selected NIM deployment profile and workload concurrency. Storage: At least 736 GB of disk space for the GLM-5.2 NIM profile.
Reference(s):
- GLM-5 GitHub repository
- GLM-5.2 model page
- Z.AI GLM-5.2 blog
- Z.AI GLM-5.2 developer documentation
- NVIDIA NIM for Large Language Models documentation
Container Version(s):
GLM-5.2 NIM v1.11.0-variant - base Downloadable NIM container release.
Ethical Considerations
NVIDIA believes Trustworthy AI is a shared responsibility and we have established policies and practices to enable development for a wide array of AI applications. Developers should work with their internal developer team to ensure these software components meet requirements for the relevant industry and use case and address unforeseen product misuse.
Please report quality, risk, security vulnerabilities or NVIDIA AI Concerns here.
Get Help
Getting started with the NIM
Deploying and integrating the NIM is straightforward thanks to our industry standard APIs. Visit the NVIDIA NIM documentation for release documentation, deployment guides and more.
NVIDIA Developer Community Forum
For support, visit the NVIDIA Developer Community Forum.