Model
Open-vocabulary, multi-modal instance segmentation model trained on commercial data.
By using the Model(s), you agree to the Model License(s).
Use the NGC CLI to download:
| Field | Response |
|---|---|
| Intended Applications & Domains: | Open-Set Segmentation (Text) |
| Model Type: | Instance Segmentation with arbitrary object categories or reference descriptions |
| Intended Users: | This model is intended for developers working in smart spaces, retail, and industrial applications. |
| Output: | Bounding Boxes, Confidence Scores, and Segmentation masks |
| Describe how the model works: | This model predicts bounding boxes and segmentation masks for each object in the image. It can detect objects specified via open-vocabulary text input or referring expressions, allowing flexible object recognition. |
| Name the adversely impacted groups this has been tested to deliver comparable outcomes regardless of: | Not Applicable |
| Technical Limitations: | The model may perform poorly on data that differs from Flickr-style photographs, such as medical, satellite, and industrial imagery. |
| Verified to have met prescribed NVIDIA standards: | Yes |
| Performance Metrics: | Generalized Intersection over Union (gIoU), Target Accuracy (T_acc), No-Target Accuracy (N_acc), Mask Mean Average Precision (mask_mAP) |
| Potential Known Risks: | The model may mis-localize or over-segment objects when the expressions or categories are ambiguous or fall outside the training distribution. |
| Licensing: | NVIDIA Open Model License |
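
The table lists gIoU, T_acc, and N_acc as performance metrics but does not define them. The sketch below shows one way they could be computed. It assumes the convention commonly used for generalized referring expression segmentation: gIoU is the mean per-sample mask IoU, a no-target sample scores 1 when the prediction is empty and 0 otherwise, T_acc is the fraction of target samples that receive a non-empty prediction, and N_acc is the fraction of no-target samples that receive an empty prediction. The function names (`mask_iou`, `is_empty`, `evaluate`) and the nested-list mask format are hypothetical and are not part of the model's API.

```python
from typing import Dict, Sequence

# Assumption: a mask is a 2D grid of 0/1 values, where 1 marks an object pixel.
Mask = Sequence[Sequence[int]]


def is_empty(mask: Mask) -> bool:
    """Return True when the mask contains no object pixels."""
    return not any(any(row) for row in mask)


def mask_iou(pred: Mask, gt: Mask) -> float:
    """Intersection over union of two binary masks of the same shape."""
    inter = 0
    union = 0
    for pred_row, gt_row in zip(pred, gt):
        for p, g in zip(pred_row, gt_row):
            inter += 1 if (p and g) else 0
            union += 1 if (p or g) else 0
    # Two empty masks agree perfectly under the assumed convention.
    return 1.0 if union == 0 else inter / union


def evaluate(preds: Sequence[Mask], gts: Sequence[Mask]) -> Dict[str, float]:
    """Compute gIoU, T_acc, and N_acc under the assumptions stated above."""
    ious = []
    target_total = target_hit = 0
    no_target_total = no_target_hit = 0
    for pred, gt in zip(preds, gts):
        if is_empty(gt):
            # No-target sample: correct only if nothing was predicted.
            no_target_total += 1
            correct = is_empty(pred)
            no_target_hit += int(correct)
            ious.append(1.0 if correct else 0.0)
        else:
            # Target sample: counted as a hit if anything was predicted.
            target_total += 1
            target_hit += int(not is_empty(pred))
            ious.append(mask_iou(pred, gt))
    nan = float("nan")
    return {
        "gIoU": sum(ious) / len(ious) if ious else nan,
        "T_acc": target_hit / target_total if target_total else nan,
        "N_acc": no_target_hit / no_target_total if no_target_total else nan,
    }


if __name__ == "__main__":
    predictions = [
        [[1, 1], [0, 0]],  # overlaps half of the union with its ground truth
        [[0, 0], [0, 0]],  # correctly empty
        [[1, 0], [0, 0]],  # false detection on a no-target sample
    ]
    ground_truths = [
        [[1, 0], [0, 0]],
        [[0, 0], [0, 0]],
        [[0, 0], [0, 0]],
    ]
    print(evaluate(predictions, ground_truths))
```

In this toy example, one of the two no-target samples is handled correctly, so N_acc is 0.5. The single target sample is detected (T_acc of 1.0) with an IoU of 0.5, so the mean over the three samples gives a gIoU of 0.5.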