Search thousands of GPU-optimized containers, pretrained models, SDKs, and Helm charts, ready to accelerate AI, digital twins, and HPC from cloud to edge.
Displaying 35 results
Knowledge Engineering Group (KEG) & Data Mining at Tsinghua University
ChatGLM3-6B Chat Int4
ChatGLM3-6B is the latest open-source model in the ChatGLM series. It introduces the following features: (1) a more powerful base model, (2) more comprehensive function support, and (3) a more comprehensive open-source series.
Model
Llama 2 is a large language AI model capable of generating text and code in response to prompts.
Model
RT-DETR object detection model for 2D warehouse applications
Model
Llama 2 is a large language AI model capable of generating text and code in response to prompts.
Model
The Mistral-7B-Instruct-v0.1 Large Language Model (LLM) is an instruct fine-tuned version of the Mistral-7B-v0.1 generative text model, trained on a variety of publicly available conversation datasets.
Model
Code Llama is a code-specialized version of Llama 2. It can generate code, and natural language about code, from both code and natural language prompts.
Model
Built with Meta Llama 3 - The Meta Llama 3 family of large language models (LLMs) is a collection of pretrained and instruction-tuned generative text models in 8B and 70B sizes.
Model
Whisper ASR GGUF for the NVIGI SDK ASR plugin
Model
A collection of models to enable OpenVoice support for the NVIDIA In-Game Inferencing (NVIGI) SDK.
Model
Meta Llama 3.2 3B Instruct INT4 ONNX model is the quantized version of the Meta Llama-3.2-3B-Instruct model, which is an auto-regressive language model that uses an optimized transformer architecture.
Model
Built with Llama - The Meta Llama 3.2 collection of multilingual large language models (LLMs) is a collection of pretrained and instruction-tuned text-to-text generative models in 1B and 3B sizes.
Model
The Phi-3-Mini-128K-Instruct is a 3.8-billion-parameter, lightweight, state-of-the-art open model trained using the Phi-3 datasets. These datasets include both synthetic data and filtered publicly available website data, with a focus on high-quality properties.
Model
The Phi-3-Mini-4K is a 3.8B-parameter, lightweight, state-of-the-art open model trained with the Phi-3 datasets, which include both synthetic data and filtered publicly available website data, with a focus on high-quality properties.
Model
Mistral-NeMo is a Large Language Model (LLM) composed of 12B parameters. This model leads accuracy on popular benchmarks across common sense reasoning, coding, math, multilingual and multi-turn chat tasks.
Model
Built with Llama - The Meta-Llama 3.1 8B Instruct INT4 ONNX model is the AWQ-quantized version of the model, an auto-regressive language model that uses an optimized transformer architecture for multilingual dialogue use cases.
Model
The Phi-3-Medium-128K-Instruct is a 14B-parameter, lightweight, open model trained with the Phi-3 datasets, which include both synthetic data and filtered publicly available website data, with a focus on high-quality and reasoning-dense properties.
Model
The Nemotron-Mini-4B Instruct model generates responses for roleplaying, retrieval-augmented generation, and function calling. It is a small language model optimized through distillation, pruning, and quantization for speed and on-device deployment.
Model
e5-large-unsupervised GGUF for the NVIGI SDK Embed plugin
Model
Gemma-7B is a 7B-parameter model from the Gemma family of models from Google. It has been instruction-tuned so it can respond to prompts in a conversational manner.
Model
Gemma-2B is a 2.5B-parameter model from the Gemma family of models from Google. It has been instruction-tuned so it can respond to prompts in a conversational manner.
Model
Built with Meta Llama 3.1 - The Meta Llama 3.1 collection of multilingual large language models (LLMs) is a collection of pretrained and instruction-tuned generative models in 8B, 70B, and 405B sizes.
Model
The Mistral-7B-Instruct-v0.3 INT4 ONNX model is the quantized version of the Mistral-7B-Instruct-v0.3 model, which is an instruct fine-tuned version of the Mistral-7B-v0.3 model used for text generation and question answering.
Model
The NVIDIA Phi-3.5-mini-Instruct INT4 ONNX model is the quantized version of the Microsoft Phi-3.5-mini-Instruct model which has 3.8B parameters and is a dense decoder-only Transformer model using the same tokenizer as Phi-3 Mini.
Model
CodeGemma is a collection of lightweight open code models built on top of Gemma. CodeGemma models are text-to-text decoder-only models. This is a 7-billion-parameter instruction-tuned variant for code chat and instruction following.
Model
