This is a multi-headed model which classifies English text prompts across task types and complexity dimensions. Tasks are classified across 11 common categories. Complexity is evaluated across 6 dimensions and ensembled to create an overall complexity score. Further information on the taxonomies can be found below.
This model is ready for commercial use.
Task types:
Complexity dimensions:
This model is released under the NVIDIA Open Model License Agreement.
The model architecture uses a DeBERTa backbone and incorporates multiple classification heads, each dedicated to a task categorization or complexity dimension. This approach enables the training of a unified network, allowing it to predict simultaneously during inference. Deberta-v3-base can theoretically handle up to 12k tokens, but default context length is set at 512 tokens.
NeMo Curator improves generative AI model accuracy by processing text, image, and video data at scale for training and customization. It also provides pre-built pipelines for generating synthetic data to customize and evaluate generative AI systems.
The inference code for this model is available through the NeMo Curator GitHub repository. Check out this example notebook to get started.
Prompt: Write a mystery set in a small town where an everyday object goes missing, causing a ripple of curiosity and suspicion. Follow the investigation and reveal the surprising truth behind the disappearance.
Task | Complexity | Creativity | Reasoning | Contextual Knowledge | Domain Knowledge | Constraints | # of Few Shots |
---|---|---|---|---|---|---|---|
Text Generation | 0.472 | 0.867 | 0.056 | 0.048 | 0.226 | 0.785 | 0 |
Prompt: Antibiotics are a type of medication used to treat bacterial infections. They work by either killing the bacteria or preventing them from reproducing, allowing the body’s immune system to fight off the infection. Antibiotics are usually taken orally in the form of pills, capsules, or liquid solutions, or sometimes administered intravenously. They are not effective against viral infections, and using them inappropriately can lead to antibiotic resistance. Explain the above in one sentence.
Task | Complexity | Creativity | Reasoning | Contextual Knowledge | Domain Knowledge | Constraints | # of Few Shots |
---|---|---|---|---|---|---|---|
Summarization | 0.133 | 0.003 | 0.014 | 0.003 | 0.644 | 0.211 | 0 |
NemoCurator Prompt Task and Complexity Classifier v1.1
Task distribution:
Task | Count |
---|---|
Open QA | 1214 |
Closed QA | 786 |
Text Generation | 480 |
Chatbot | 448 |
Classification | 267 |
Summarization | 230 |
Code Generation | 185 |
Rewrite | 169 |
Other | 104 |
Brainstorming | 81 |
Extraction | 60 |
Total | 4024 |
For evaluation, Top-1 accuracy metric was used, which involves matching the category with the highest probability to the expected answer. Additionally, n-fold cross-validation was used to produce n different values for this metric to verify the consistency of the results. The table below displays the average of the top-1 accuracy values for the N folds calculated for each complexity dimension separately.
Task Accuracy | Creative Accuracy | Reasoning Accuracy | Contextual Accuracy | FewShots Accuracy | Domain Accuracy | Constraint Accuracy | |
---|---|---|---|---|---|---|---|
Average of 10 Folds | 0.981 | 0.996 | 0.997 | 0.981 | 0.979 | 0.937 | 0.991 |
NVIDIA believes Trustworthy AI is a shared responsibility and we have established policies and practices to enable development for a wide array of AI applications. When downloaded or used in accordance with our terms of service, developers should work with their internal model team to ensure this model meets requirements for the relevant industry and use case and addresses unforeseen product misuse.
Please report security vulnerabilities or NVIDIA AI Concerns here.