GPU-optimized AI, Machine Learning, & HPC Software

NGC | Catalog

Experience AI in Action

Join a passionate community and work with state-of-the-art models to kick-start your own development efforts.

Maxine Live Portrait | 2D Animation

Maxine Live Portrait is a generative model that animates a portrait photo using a driving video, transferring the facial expressions and movements of the person in the video to the photo.

NeVA-22B | Image Conversation

NeVA is a multi-modal vision-language model that understands text and images and generates informative responses.

StarCoder2-15B | Text and Code Generation

StarCoder2 is an LLM specializing in code completion, developed in partnership with the BigCode community.

Smaug 72B | Text and Code Generation

Smaug-72B is a large language model developed by Abacus.AI by fine-tuning a Qwen-72B-based model, which ultimately follows the Llama 2 architecture, using the DPO-Positive (DPOP) technique.

Kosmos-2 | Multimodal Large Language Model

Kosmos-2 is a groundbreaking multimodal large language model (MLLM) designed to ground text to the visual world, enabling it to understand and reason about visual elements in images.

Phi-2 | Text Generation

Phi-2 is a 2.7 billion parameter language model developed by Microsoft Research. It is best suited for prompts in the QA, chat, and code formats.

Gemma 2B | Text and Code Generation

Gemma is a family of lightweight, state-of-the-art open LLMs from Google.

Gemma 7B | Text and Code Generation

Gemma is a family of lightweight, state-of-the-art open LLMs from Google.

cuOpt | Vehicle Route Optimization

NVIDIA cuOpt is a world-record-breaking accelerated optimization engine. cuOpt helps teams solve complex routing problems with multiple constraints and deliver new capabilities, like dynamic rerouting, job scheduling, and robotic simulations.

Mamba-Chat | Text Generation

Mamba-Chat is a state-of-the-art AI model designed for efficient sequence modeling. It can be used for text generation and chat applications.

DePlot | Visual Language Reasoning on Charts and Plots

The Google DePlot model is a one-shot visual language understanding solution that translates images of plots or charts into linearized tables.

Code Llama 70B | Code Generation

Code Llama is an LLM capable of generating code, and natural language about code, from both code and natural language prompts.

SeamlessM4T V2 - T2TT | Text-to-Text Translation

The SeamlessM4T V2-T2TT model is a part of the SeamlessM4T-v2 collection of models, which are designed to provide high-quality translation for various tasks, including speech and text translation.

Llama Guard | Text Generation

Llama Guard is a model for classifying the safety of LLM prompts and responses, using a taxonomy of safety risks.

Maxine Voice Font | Voice Conversion

Voice Font converts the speaker's voice in input audio to match the target voice while keeping linguistic information and prosody unchanged.

NV-Llama2-70B-RLHF | Text Generation

NV-Llama2-70B-RLHF-Chat is a 70 billion parameter generative language model, instruction-tuned from the Llama2-70B model. It accepts inputs with a context length of up to 4,096 tokens.

NV-Llama2-70B-SteerLM-Chat | Text Generation

Llama 2 SteerLM Chat is a large language model aligned using the SteerLM technique developed by NVIDIA. This lets you adjust response attributes (such as creativity, complexity, and verbosity) at inference time.

Mixtral 8x7B Instruct | Text Generation

Mixtral 8x7B Instruct is a language model that can follow instructions, complete requests, and generate creative text formats.

Yi-34B | Text and Code Generation

Yi-34B is a large language model trained from scratch by developers at 01.AI. It has been fine-tuned for various chat use cases and supports a context window of up to 200K tokens.

Nemotron-3-8B-Chat-SteerLM | Text Generation

Nemotron-3-8B-Chat-SteerLM is an 8 billion parameter generative language model based on the Nemotron-3-8B base model. It has been customized for user control of model outputs during inference using the SteerLM method developed by NVIDIA.

Llama 2 70B | Text Generation

Llama 2 is a large language AI model capable of generating text and code in response to prompts.

Llama 2 13B | Text Generation

Llama 2 is a large language AI model capable of generating text and code in response to prompts.

NVIDIA Retrieval QA Embedding | Embedding Model

NVIDIA Retrieval QA Embedding is an embedding model that represents words, phrases, or other entities as vectors of numbers and understands the relation between words and phrases.
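As a rough illustration of how vector representations capture relatedness, the sketch below ranks passages against a query by cosine similarity. It does not call the NVIDIA Retrieval QA Embedding model: the three-dimensional vectors, the passage names, and the `rank_passages` helper are all hypothetical stand-ins, and a real embedding model would produce much larger vectors.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_passages(query_vec, passage_vecs):
    """Sort passages from most to least similar to the query vector."""
    scored = [
        (name, cosine_similarity(query_vec, vec))
        for name, vec in passage_vecs.items()
    ]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Hypothetical embeddings, made up for illustration only.
query = [0.9, 0.1, 0.0]
passages = {
    "gpu_pricing": [0.8, 0.2, 0.1],
    "cooking_tips": [0.0, 0.1, 0.9],
}

ranked = rank_passages(query, passages)
for name, score in ranked:
    print(f"{name}: {score:.3f}")
```

In a retrieval setting, the passage whose vector points in nearly the same direction as the query vector is treated as the most relevant, which is why the ranking uses the angle between vectors rather than their raw magnitudes.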

GenSLM: Genome-scale language model | Transformer Model

A genome-scale language foundation model (GenSLM) is an LLM trained on all known genomes of a virus or bacterium. It learns the evolutionary landscape of viruses like SARS-CoV-2 and can accurately and rapidly identify new variants.

Code Llama 13B | Code Generation

Code Llama is an LLM capable of generating code, and natural language about code, from both code and natural language prompts.

Code Llama 34B | Code Generation

Code Llama is an LLM capable of generating code, and natural language about code, from both code and natural language prompts.

Mistral 7B Instruct | Text Generation

Mistral-7B-Instruct is a language model that can follow instructions, complete requests, and generate creative text formats.

Fuyu 8B | Image Conversation

Fuyu-8B is a multi-modal transformer that can perform a wide range of tasks, including image understanding, text generation, and code generation.

Nemotron-3-8B-QA | Text Generation

Nemotron-3-8B-QA is an 8 billion parameter generative language model based on the Nemotron-3-8B base model. NVIDIA has further fine-tuned it for instruction following, specifically for question answering.

Stable Diffusion XL | Image Generation

Stable Diffusion XL (SDXL) enables you to generate expressive images with shorter prompts and insert words inside images.

CLIP | Image Classification and Object Detection

The CLIP (Contrastive Language-Image Pretraining) model combines vision and language using contrastive learning. It understands images and text together, enabling tasks like image classification and object detection.