Gemma-7B-INT4-RTX | NVIDIA NGC

Google

Gemma-7B-INT4-RTX

Resource

Google

Gemma-7B-INT4-RTX

Gemma-7B is a 7B parameter model from Gemma family of models from Google. It has been instruction-tuned so it can respond to prompts in a conversation manner.

Model Overview

Description:

Gemma-7B is a 7B parameter model from Gemma family of models from Google. It has been instruction-tuned so it can respond to prompts in a conversation manner. Nvidia has converted original Gemma weights and format into weight and format that can be consumed by Tensorrt-LLM.

Terms of use:

By accessing this model, you are agreeing to Gemma Terms of Use, Gemma Prohibited Use Policy .

References(s):

Gemma Model Card
Gemma blogpost

Input:

Input Format: Text

Input Parameters: None

Output:

Output Format: Text

Output Parameters: None

Software Integration:

Supported Hardware Platform(s): RTX 4090

Supported Operating System(s): Windows

Inference:

Windows Setup with TRT-LLM

TRT-LLM Inference Engine

Test Hardware:

RTX 4090

Publisher

Google

Latest Version1.1

UpdatedApril 17, 2024 UTC

Compressed Size7.41 GB

Labels