NVILA is a vision language model (VLM) developed by NVIDIA that can achieve state-of-the-art image and video understanding. Several follow-up works build on VILA, including VILA^2, LongVILA, and NVILA. This container card walks through the tools required to finetune the high-resolution video NVILA model using two popular approaches: LoRA (Low-Rank Adaptation) and full finetuning.
NVILA FTMS is a VLM finetuning microservice that lets customers finetune a pre-trained NVILA-Lite-15B high-resolution video model on video/image-text data at scale, enabling multi-image and video VLMs for user-specific downstream use cases.
The NVILA FTMS Early Access (EA) package comprises the following:
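To give a rough feel for the difference between the two finetuning approaches named above, the sketch below counts trainable parameters for a single weight matrix under full finetuning and under a rank-r LoRA update. The layer dimensions and rank are illustrative assumptions only; they are not taken from NVILA or from the FTMS configuration, and the function names are hypothetical.

```python
def full_finetune_params(d_out: int, d_in: int) -> int:
    """Trainable parameters when the whole d_out x d_in weight matrix is updated."""
    return d_out * d_in


def lora_params(d_out: int, d_in: int, rank: int) -> int:
    """Trainable parameters for a low-rank update W + B @ A,
    where B is d_out x rank and A is rank x d_in (W itself stays frozen)."""
    if rank <= 0 or rank > min(d_out, d_in):
        raise ValueError("rank must be between 1 and min(d_out, d_in)")
    return rank * (d_out + d_in)


# Illustrative layer size and rank only; NOT NVILA's actual configuration.
d_out, d_in, rank = 4096, 4096, 16
full = full_finetune_params(d_out, d_in)
lora = lora_params(d_out, d_in, rank)
print(f"full finetuning: {full:,} trainable parameters")
print(f"LoRA (rank {rank}): {lora:,} trainable parameters")
print(f"LoRA / full: {lora / full:.2%}")
```

Because the LoRA count grows with `rank * (d_out + d_in)` rather than `d_out * d_in`, it stays small for low ranks, which is why LoRA is usually the lighter of the two options in memory and storage.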
All containers needed to run the finetuning microservice can be pulled from this location. See the list below for all available containers in this registry.
Container Type | container_name:tag |
---|---|
NVILA Finetuning Microservice - Early Access | nvcr.io/nvidia/tao/vlm-finetuning-ea:0.2.0-ea |
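As a minimal sketch of how the container listed above might be pulled, the snippet below splits the image reference into registry, repository, and tag, then assembles the matching `docker pull` command. It assumes Docker is installed and that the host has already authenticated against `nvcr.io` (typically via `docker login nvcr.io` with an NGC API key). The helper names `parse_image_ref` and `docker_pull_command` are hypothetical and are not part of the FTMS package.

```python
from typing import List, NamedTuple


class ImageRef(NamedTuple):
    registry: str
    repository: str
    tag: str


def parse_image_ref(ref: str) -> ImageRef:
    """Split 'registry/path/to/image:tag' into registry, repository, and tag."""
    name, sep, tag = ref.rpartition(":")
    if not sep or not tag or "/" in tag:
        raise ValueError(f"expected an explicit tag in {ref!r}")
    registry, slash, repository = name.partition("/")
    if not slash or not repository:
        raise ValueError(f"expected a registry prefix in {ref!r}")
    return ImageRef(registry, repository, tag)


def docker_pull_command(ref: str) -> List[str]:
    """Build (but do not run) the docker pull command for a validated reference."""
    parse_image_ref(ref)  # raises ValueError on malformed references
    return ["docker", "pull", ref]


# Image reference copied from the container table above.
IMAGE = "nvcr.io/nvidia/tao/vlm-finetuning-ea:0.2.0-ea"

print(parse_image_ref(IMAGE))
print(" ".join(docker_pull_command(IMAGE)))
```

The command is returned as an argument list so it can be handed to `subprocess.run(cmd, check=True)` without shell quoting concerns; the sketch deliberately stops short of executing it.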
Model Name | Link |
---|---|
NVILA-Lite-15B-HighRes | nvidia/tao/nvila:nvila-lite-15b-highres-lita |
NGC Resource | Link |
---|---|
VLM Getting Started - Early Access | nvidia/tao/vlm-getting-started-ea:0.2.0-ea |
Access the latest in Vision AI development workflows with NVIDIA TAO Toolkit 5.0.
More information about TAO Toolkit and pre-trained models is available at the NVIDIA Developer Zone.
NVIDIA’s platforms and application frameworks enable developers to build a wide array of AI applications. Consider potential algorithmic bias when choosing or creating the models being deployed. Work with the model’s developer to ensure that it meets the requirements for the relevant industry and use case; that the necessary instruction and documentation are provided to understand error rates, confidence intervals, and results; and that the model is being used under the conditions and in the manner intended. Please report security vulnerabilities or NVIDIA AI Concerns here.
Security Vulnerabilities in Open Source Packages

Please review the Security Scanning tab to view the latest security scan results. For certain open-source vulnerabilities listed in the scan results, NVIDIA provides a response in the form of a Vulnerability Exploitability eXchange (VEX) document. The VEX information can be reviewed and downloaded from the Security Scanning tab.