NVIDIA AI Foundation Models
Interact with the latest state-of-the-art AI model APIs optimized on the NVIDIA accelerated computing stack—from your browser.
Experience AI in Action

Join a passionate community and work with state-of-the-art models to kick-start your own development efforts.

Smaug 72B | Text and Code Generation
Smaug-72B is a large language model developed by Abacus.AI by fine-tuning a Qwen-72B-based model, which ultimately follows the Llama 2 architecture, using the DPO-Positive (DPOP) technique.
Kosmos-2 | Multimodal Large Language Model
Kosmos-2 is a multimodal large language model (MLLM) designed to ground text to the visual world, enabling it to understand and reason about visual elements in images.
StarCoder2-15B | Text and Code Generation
StarCoder2 is an LLM specializing in code completion, developed in partnership with the BigCode community.
Phi-2 | Text Generation
Phi-2 is a 2.7 billion parameter language model developed by Microsoft Research. It is best suited for prompts in the QA, chat, and code formats.
Gemma 2B | Text and Code Generation
Gemma is a family of lightweight, state-of-the-art open LLMs from Google.
Gemma 7B | Text and Code Generation
Gemma is a family of lightweight, state-of-the-art open LLMs from Google.
cuOpt | Vehicle Route Optimization
NVIDIA cuOpt is a world-record-breaking accelerated optimization engine. cuOpt helps teams solve complex routing problems with multiple constraints and deliver new capabilities, like dynamic rerouting, job scheduling, and robotic simulations.
Mamba-Chat | Text Generation
Mamba-Chat is a state-of-the-art AI model designed for efficient sequence modeling. It can be used for text generation and chat applications.
DePlot | Visual Language Reasoning on Charts and Plots
The Google DePlot model is a one-shot visual language understanding solution that translates images of plots or charts into linearized tables.
Maxine Live Portrait
Maxine Live Portrait is a generative model which animates a portrait photo with a driving video such that the facial expressions and movements of the person in the video are transferred to the photo.
Code Llama 70B | Code Generation
Code Llama is an LLM capable of generating code, and natural language about code, from both code and natural language prompts.
SeamlessM4T V2 - T2TT | Text-to-text translation
The SeamlessM4T V2-T2TT model is part of the SeamlessM4T-v2 collection of models, which are designed to provide high-quality speech and text translation.
Llama Guard | Text Generation
Llama Guard is a model for classifying the safety of LLM prompts and responses, using a taxonomy of safety risks.
Maxine Voice Font | Voice Conversion
Voice Font converts the speaker's voice in input audio to match the target voice while keeping linguistic information and prosody unchanged.
NV-Llama2-70B-RLHF | Text Generation
NV-Llama2-70B-RLHF-Chat is a 70 billion parameter generative language model, instruction-tuned from the Llama2-70B model. It accepts input with a context length of up to 4,096 tokens.
NV-Llama2-70B-SteerLM-Chat
Llama 2 SteerLM Chat is a large language model aligned using the SteerLM technique developed by NVIDIA. SteerLM lets you adjust response attributes (such as creativity, complexity, and verbosity) at inference time.
Mixtral 8x7B Instruct | Text Generation
Mixtral 8x7B Instruct is a language model that can follow instructions, complete requests, and generate creative text formats.
Yi-34B | Text and Code Generation
Yi-34B is a large language model trained from scratch by developers at 01.AI. It has been fine-tuned for various chat use cases and supports a context window of up to 200K tokens.
Nemotron-3-8B-Chat-SteerLM
Nemotron-3-8B-Chat-SteerLM is an 8 billion parameter generative language model based on the Nemotron-3-8B base model. It has been customized for user control of model outputs during inference using the SteerLM method developed by NVIDIA.
Llama 2 70B | Text Generation
Llama 2 is a large language model capable of generating text and code in response to prompts.
Llama 2 13B | Text Generation
Llama 2 is a large language model capable of generating text and code in response to prompts.
NVIDIA Retrieval QA Embedding
NVIDIA Retrieval QA Embedding is an embedding model that represents words, phrases, or other entities as vectors of numbers and understands the relation between words and phrases.
GenSLM: Genome-scale language model
A genome-scale language foundation model (GenSLM) is an LLM trained on all known genomes from a virus or bacterium. It learns the evolutionary landscape of viruses like SARS-CoV-2 and can accurately and rapidly identify new variants.
Code Llama 13B | Code Generation
Code Llama is an LLM capable of generating code, and natural language about code, from both code and natural language prompts.
Code Llama 34B | Code Generation
Code Llama is an LLM capable of generating code, and natural language about code, from both code and natural language prompts.
NeVA-22B | Image Conversation
NeVA is a multi-modal vision-language model that understands text and images and generates informative responses.
Mistral 7B Instruct | Text Generation
Mistral-7B-Instruct is a language model that can follow instructions, complete requests, and generate creative text formats.
Fuyu 8B | Image Conversation
Fuyu-8B is a multi-modal transformer that can perform a wide range of tasks, including image understanding, text generation, and code generation.
Nemotron-3-8B-QA | Text Generation
Nemotron-3-8B-QA is an 8 billion parameter generative language model based on the Nemotron-3-8B base model. NVIDIA has further fine-tuned it for instruction following, specifically for question answering.
Stable Diffusion XL | Image Generation
Stable Diffusion XL (SDXL) enables you to generate expressive images with shorter prompts and insert words inside images.
CLIP | Image Classification and Object Detection
The CLIP (Contrastive Language-Image Pretraining) model combines vision and language using contrastive learning. It understands images and text together, enabling tasks like image classification and object detection.