Cosmos World Foundation Models

Description
Cosmos World Foundation Models: A family of highly performant pre-trained world foundation models purpose-built for generating physics-aware videos and world states for physical AI development.
Curator
NVIDIA
Modified
March 14, 2025

Cosmos Autoregressive: The Cosmos autoregressive models are a collection of pre-trained models for predicting and rapidly generating video sequences from video or image inputs for physical AI. They can serve as building blocks for applications or research related to video generation. The models are ready for commercial use under the NVIDIA Open Model License.

Cosmos Diffusion: The Cosmos diffusion models are a collection of diffusion-based world foundation models that generate dynamic, high-quality videos from text, image, or video inputs. They can serve as building blocks for applications or research related to generating video data for training physical AI systems. The models are ready for commercial use under the NVIDIA Open Model License.

Model Developer: NVIDIA

License:

These models are released under the NVIDIA Open Model License. For a custom license, please contact cosmos-license@nvidia.com.

Under the NVIDIA Open Model License, NVIDIA confirms:

  • Models are commercially usable.
  • You are free to create and distribute Derivative Models.
  • NVIDIA does not claim ownership of any outputs generated using the Models or Derivative Models.

Important Note: If you bypass, disable, reduce the efficacy of, or circumvent any technical limitation, safety guardrail or associated safety guardrail hyperparameter, encryption, security, digital rights management, or authentication mechanism contained in the Model, your rights under the NVIDIA Open Model License Agreement will automatically terminate.

Models List:

Autoregressive Models:

  1. Cosmos-1.0-Autoregressive-4B
    4-billion-parameter autoregressive model that generates high-fidelity, physics-aware videos from simple video inputs.

  2. Cosmos-1.0-Autoregressive-5B-Video2World
    5-billion-parameter autoregressive model that generates and predicts detailed video states from video and text inputs.

  3. Cosmos-1.0-Autoregressive-12B
    12-billion-parameter autoregressive model that generates high-fidelity, physics-aware videos from simple video inputs.

  4. Cosmos-1.0-Autoregressive-13B-Video2World
    13-billion-parameter autoregressive model that generates and predicts detailed video states from video and text inputs.

  5. Cosmos-1.0-Tokenizer-DV8x16x16
    A discrete video tokenizer with a compression rate of 8x temporally and 16x16 spatially, with a context of 49 temporal frames.
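
To make the compression figures concrete, the following Python sketch estimates how many discrete tokens a clip would occupy after 8x temporal and 16x16 spatial compression. Two things here are assumptions rather than facts from this page: the 640x1024 input resolution is hypothetical, and the formula 1 + (frames - 1) / 8 assumes the first frame is encoded on its own, which is consistent with the 49-frame context (49 = 1 + 6 x 8) but is not confirmed here. The names LatentGrid and latent_grid are illustrative only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LatentGrid:
    """Size of the token grid after compression (illustrative type)."""
    frames: int
    height: int
    width: int

    @property
    def tokens(self) -> int:
        # Total number of tokens in the compressed clip.
        return self.frames * self.height * self.width


def latent_grid(num_frames: int, height: int, width: int,
                t_factor: int = 8, s_factor: int = 16) -> LatentGrid:
    """Estimate the compressed grid for a clip.

    ASSUMPTION: the first frame is encoded alone and the remaining frames
    are grouped in blocks of `t_factor`, so num_frames must be 1 + k * t_factor.
    """
    if num_frames < 1 or (num_frames - 1) % t_factor != 0:
        raise ValueError("num_frames must be of the form 1 + k * t_factor")
    if height % s_factor != 0 or width % s_factor != 0:
        raise ValueError("height and width must be multiples of s_factor")
    return LatentGrid(
        frames=1 + (num_frames - 1) // t_factor,
        height=height // s_factor,
        width=width // s_factor,
    )


# Hypothetical 640x1024 input with the 49-frame context named on this page.
grid = latent_grid(num_frames=49, height=640, width=1024)
print(grid, grid.tokens)
```

Under these assumptions, a 49-frame 640x1024 clip maps to a 7 x 40 x 64 grid, or 17,920 tokens.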

Diffusion Models:

  1. Cosmos-1.0-Diffusion-7B-Text2World
    7-billion-parameter diffusion model that generates physics-aware videos from text prompts.

  2. Cosmos-1.0-Diffusion-7B-Video2World
    7-billion-parameter diffusion model that converts video inputs into real-world simulation outputs.

  3. Cosmos-1.0-Diffusion-14B-Text2World
    14-billion-parameter diffusion model that generates physics-aware videos from text prompts.

  4. Cosmos-1.0-Diffusion-14B-Video2World
    14-billion-parameter diffusion model that converts video inputs into real-world simulation outputs.

  5. Cosmos-1.0-Tokenizer-CV8x8x8
    A continuous video tokenizer with a compression rate of 8x temporally and 8x8 spatially, with 121 temporal frames context.
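
The tokenizer names appear to encode the properties given in their descriptions: C or D for continuous or discrete, V for video, then the temporal factor and the two spatial factors. The Python sketch below parses names on that assumption. The convention is inferred from the descriptions on this page, not from a documented specification, and TokenizerSpec and parse_tokenizer_specs are hypothetical helper names.

```python
import re
from typing import List, NamedTuple


class TokenizerSpec(NamedTuple):
    """Compression properties read from a tokenizer name (illustrative type)."""
    kind: str        # "continuous" or "discrete"
    temporal: int    # temporal compression factor
    spatial_h: int   # spatial compression factor (height)
    spatial_w: int   # spatial compression factor (width)


# ASSUMPTION: names follow <C|D>V<temporal>x<height>x<width>, e.g. "CV8x8x8".
_SPEC_PATTERN = re.compile(r"([CD])V(\d+)x(\d+)x(\d+)")
_KINDS = {"C": "continuous", "D": "discrete"}


def parse_tokenizer_specs(name: str) -> List[TokenizerSpec]:
    """Return every tokenizer spec found in a model name, in order."""
    return [
        TokenizerSpec(_KINDS[kind], int(t), int(h), int(w))
        for kind, t, h, w in _SPEC_PATTERN.findall(name)
    ]


for model in (
    "Cosmos-1.0-Tokenizer-DV8x16x16",
    "Cosmos-1.0-Tokenizer-CV8x8x8",
    "Cosmos-1.0-Diffusion-7B-Decoder-DV8x16x16ToCV8x8x8",
):
    print(model, parse_tokenizer_specs(model))
```

A name that carries two specs, such as the decoder listed under the supporting models, yields two entries, which would fit a model that maps one token space to the other.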

Cosmos Supporting Models:

  1. Cosmos-1.0-Guardrail
    Small, state-of-the-art guardrail models that help ensure safety and consistency in world models.

  2. Cosmos-1.0-PromptUpsampler-12B-Text2World
    12-billion-parameter neural network that improves generation quality by automatically enriching the descriptions and details in text prompts.

  3. Cosmos-1.0-Diffusion-7B-Decoder-DV8x16x16ToCV8x8x8
    Decodes autoregressive video sequences using a 7-billion-parameter model for augmented reality.