Supported platforms: Linux / amd64, Linux / arm64
Ardennes is a random forest model trained by NVIDIA on snowflake-arctic-embed-m-long embeddings to detect attempts to jailbreak large language models. At the time of release, it is the strongest publicly available model for detecting LLM jailbreak attempts that we are aware of.
Additional details about the model, including comparisons to other public models, are available in the accompanying paper, which was accepted to the 2025 AAAI Workshop on AI for Cyber Security (AICS).
One-time access setup, as needed:
export NGC_API_KEY=<YOUR NGC API KEY>
docker login nvcr.io # make sure you log in with the right key: if you've logged in with another key before, this can silently succeed using the cached credentials. In that case, delete the Docker config file that caches them (~/.docker/config.json by default) and try again.
# Username: $oauthtoken
# Password: <NGC_API_KEY>
We provide the Ardennes model as an NVIDIA NIM, so you can simply pull the image from NGC and run it with Docker.
#!/bin/bash
export NGC_API_KEY=<your NGC personal key with access to the "nvstaging/nim" org/team>
export NIM_IMAGE='nvcr.io/nvstaging/nim/ardennes-jailbreak-arctic-nim:v0.1'
export MODEL_NAME='ardennes-jailbreak-arctic'
docker pull $NIM_IMAGE
And go!
docker run -it --name=$MODEL_NAME \
--gpus=all --runtime=nvidia \
-e NGC_API_KEY="$NGC_API_KEY" \
-p 8000:8000 \
$NIM_IMAGE
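The container takes a little while to start. Before sending requests, you can poll its readiness endpoint; `/v1/health/ready` is the usual NIM health-check path, so the sketch below assumes that convention holds for this image (verify against the image's documentation). Only the Python standard library is used, and the helper name is ours:

```python
# Readiness polling sketch. /v1/health/ready is the common NIM health
# endpoint; confirm the path against this image's documentation.
import time
import urllib.error
import urllib.request


def wait_until_ready(base_url: str = "http://0.0.0.0:8000",
                     timeout_s: float = 300.0) -> bool:
    """Return True once the NIM reports ready, False if timeout_s elapses."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        try:
            with urllib.request.urlopen(f"{base_url}/v1/health/ready",
                                        timeout=5) as resp:
                if resp.status == 200:
                    return True
        except (urllib.error.URLError, OSError):
            pass  # container not up yet; retry
        time.sleep(5)
    return False
```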
The running NIM container exposes a standard REST API; send POST requests with a JSON body to the v1/classify endpoint to get model responses.
$ curl --data '{"input": "hello this is a test"}' --header "Content-Type: application/json" --header "Accept: application/json" http://0.0.0.0:8000/v1/classify
This returns a JSON dictionary with the model's prediction of whether the provided input is a jailbreak attempt.
{"jailbreak": false, "score": -0.9921652427737031}
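The same request can be made from code. Below is a minimal Python client sketch using only the standard library; the endpoint path and response fields ("jailbreak", "score") come from the examples above, while the helper names are our own:

```python
# Sketch of a Python client for the NIM's /v1/classify endpoint.
import json
import urllib.request


def classify(text: str, base_url: str = "http://0.0.0.0:8000") -> dict:
    """POST the input text to /v1/classify and return the parsed response."""
    req = urllib.request.Request(
        f"{base_url}/v1/classify",
        data=json.dumps({"input": text}).encode("utf-8"),
        headers={"Content-Type": "application/json",
                 "Accept": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def is_jailbreak(response: dict) -> bool:
    """Read the boolean verdict from a /v1/classify response."""
    return bool(response["jailbreak"])


# Interpreting the sample response shown above (no server needed):
sample = json.loads('{"jailbreak": false, "score": -0.9921652427737031}')
print(is_jailbreak(sample))  # False
```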
NemoGuard JailbreakDetect is the released name of the Ardennes model described above: a random forest classifier trained by NVIDIA on snowflake-arctic-embed-m-long embeddings to detect attempts to jailbreak large language models.
This container includes two models: the NemoGuard JailbreakDetect random forest classifier, and the snowflake-arctic-embed-m-long embedding model.
The container components are ready for commercial/non-commercial use.
This container includes a model that is not owned or developed by NVIDIA. This model has been developed and built to a third party's requirements for this application and use case; see the snowflake-arctic-embed-m-long model card for details.
GOVERNING TERMS: Use of the NIM container is governed by the NVIDIA Software License Agreement and the Product-Specific Terms for NVIDIA AI Products; use of this model is governed by the NVIDIA Community Model License.
You are responsible for ensuring that your use of NVIDIA AI Foundation Models complies with all applicable laws.
Deployment geography: Global
Build.Nvidia.com [08/14/2025] via https://build.nvidia.com/nvidia/nemoguard-jailbreak-detect
Hugging Face [01/15/2025] via https://huggingface.co/nvidia/NemoGuard-JailbreakDetect
NGC [08/14/2025] via https://catalog.ngc.nvidia.com/orgs/nim/teams/nvidia/containers/nemoguard-jailbreak-detect
The NemoGuard JailbreakDetect container includes the following models:
Model Name & Link | Use Case | How to Pull the Model |
---|---|---|
NemoGuard JailbreakDetect random forest classifier | Intended to be deployed as a guardrail in an LLM system, to scan user-provided prompts for jailbreaking attempts prior to sending those prompts to an LLM. | Automatic |
snowflake-arctic-embed-m-long embedding model | Provides input embeddings of user prompts which are then used by the NemoGuard JailbreakDetect random forest classifier above. | Automatic |
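The guardrail use case in the table can be sketched as a simple gate in front of the LLM. The function and stub names below are illustrative, not part of the NIM:

```python
# Guardrail gating sketch: check a prompt with the jailbreak classifier
# before forwarding it to the LLM. Helper and stub names are illustrative.
from typing import Callable


def guarded_generate(prompt: str,
                     classify: Callable[[str], dict],
                     generate: Callable[[str], str]) -> str:
    """Refuse flagged prompts; otherwise forward them to the LLM."""
    verdict = classify(prompt)  # e.g. a POST to the NIM's /v1/classify
    if verdict["jailbreak"]:
        return "Request blocked: prompt was flagged as a jailbreak attempt."
    return generate(prompt)


def stub_classify(prompt: str) -> dict:
    # Toy stand-in for the real classifier call.
    return {"jailbreak": "ignore previous instructions" in prompt.lower(),
            "score": 0.0}


def stub_generate(prompt: str) -> str:
    # Toy stand-in for the downstream LLM call.
    return f"LLM answer to: {prompt}"


print(guarded_generate("hello this is a test", stub_classify, stub_generate))
```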
Our AI models are designed and/or optimized to run on NVIDIA GPU-accelerated systems. By leveraging NVIDIA's hardware (e.g., GPU cores) and software frameworks (e.g., CUDA libraries), the model achieves faster training and inference times than CPU-only solutions.
NemoGuard-JailbreakDetect-v1.10.1: jailbreak detection model using snowflake-arctic-embed-m-long embeddings
NVIDIA believes Trustworthy AI is a shared responsibility and we have established policies and practices to enable development for a wide array of AI applications. When downloaded or used in accordance with our terms of service, developers should work with their internal developer team to ensure this model meets requirements for the relevant industry and use case and addresses unforeseen product misuse.
Please report security vulnerabilities or NVIDIA AI Concerns here.
You are responsible for ensuring that your use of NVIDIA AI Foundation Models complies with all applicable laws.