
DeepSAP

Description: DeepSAP is a transformer-based workflow designed to enhance splice junction detection in RNA-seq data.
Publisher: NVIDIA
Latest Tag: v0.0.1
Modified: May 2, 2025
Compressed Size: 18.18 GB
Multinode Support: No
Multi-Arch Support: No
v0.0.1 (Latest) Security Scan Results

Linux / amd64

DeepSAP

DeepSAP is a transformer-based workflow designed to enhance splice junction detection in RNA-seq data. By default, DeepSAP utilizes the highly sensitive GSNAP TGGA aligner for FASTQ inputs. Alternatively, it can process pre-aligned BAM files directly.

Table of Contents

  • Requirements
  • Installation
  • Usage
  • Command-line Arguments
  • License/Terms of Use

Requirements

  • Docker with GPU support
  • Input: a sorted alignment file in BAM/SAM format, or RNA-seq reads in FASTQ format
  • Reference file in FASTA format
  • Annotation file in GTF format

Installation

DeepSAP requires Docker with GPU support enabled and a local copy of the DeepSAP Docker image. Pull the image with the following command:

$ docker pull nvcr.io/nvidia/clara/clara-parabricks-deepsap:<TAG>
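
The `<TAG>` placeholder must be replaced with a published tag. As a minimal sketch, assuming the `v0.0.1` tag listed on this page as the latest is the one you want, the image reference can be built in one place and reused. The `deepsap_image` helper and the `DEEPSAP_TAG` and `DEEPSAP_IMAGE` variables are illustrative names, not part of DeepSAP:

```shell
# Hypothetical helper: print the image reference for a given tag.
# The default, v0.0.1, is the "Latest Tag" value shown on this page.
deepsap_image() {
    tag="${1:-v0.0.1}"
    printf 'nvcr.io/nvidia/clara/clara-parabricks-deepsap:%s\n' "$tag"
}

# Use DEEPSAP_TAG from the environment if set, otherwise the default above.
DEEPSAP_IMAGE="$(deepsap_image "${DEEPSAP_TAG:-}")"
echo "Image reference: ${DEEPSAP_IMAGE}"
```

The image can then be pulled with `docker pull "$DEEPSAP_IMAGE"`.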

Usage

You can use the accompanying dataset, malaria_short_pe, in the test folder to try DeepSAP with minimal setup. This dataset includes:

  • Paired-end FASTQ files for alignment.
  • Malaria reference genome in FASTA format.
  • Malaria annotation file in GTF format.

Update the paths in the provided example commands to point to the files in the test folder.
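
Before starting the container, it may help to confirm that the dataset files are where the commands expect them. The following is a minimal sketch of such a pre-flight check. The `check_deepsap_inputs` function is a hypothetical helper, not part of DeepSAP, and the file names are the ones used in the example commands on this page:

```shell
# Hypothetical pre-flight check: verify that the example inputs exist.
# Prints each missing file to stderr and returns non-zero if any are absent.
check_deepsap_inputs() {
    data_dir="$1"
    missing=0
    for f in \
        SRR14793977_10K_1.fastq.gz \
        SRR14793977_10K_2.fastq.gz \
        Plasmodium_falciparum.ASM276v2.60.gtf \
        Plasmodium_falciparum.ASM276v2.dna.toplevel.fa
    do
        if [ ! -f "$data_dir/$f" ]; then
            echo "missing: $data_dir/$f" >&2
            missing=1
        fi
    done
    return "$missing"
}
```

For example, `check_deepsap_inputs "$(pwd)/test/malaria_short_pe" && echo "inputs look complete"` checks the host directory that the example commands mount under /workdir.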

1- Running DeepSAP with short-read RNA-seq FASTQ files

docker run --gpus all --ipc=host --ulimit memlock=-1 --ulimit stack=67108864 --rm   \
    --volume $(pwd)/test:/workdir                                                   \
    --volume $(pwd)/test/outputdir:/outputdir                                       \
    nvcr.io/nvidia/clara/clara-parabricks-deepsap:latest                            \
    --out /outputdir/                                                               \
    --prefix test_run_10K                                                           \
    --mate_1 /workdir/malaria_short_pe/SRR14793977_10K_1.fastq.gz                   \
    --mate_2 /workdir/malaria_short_pe/SRR14793977_10K_2.fastq.gz                   \
    --gtf /workdir/malaria_short_pe/Plasmodium_falciparum.ASM276v2.60.gtf           \
    --fasta /workdir/malaria_short_pe/Plasmodium_falciparum.ASM276v2.dna.toplevel.fa

2- Running DeepSAP with short-read RNA-seq FASTQ files and a GSNAP index

docker run --gpus all --ipc=host --ulimit memlock=-1 --ulimit stack=67108864 --rm   \
    --volume $(pwd)/test:/workdir                                                   \
    --volume $(pwd)/test/outputdir:/outputdir                                       \
    nvcr.io/nvidia/clara/clara-parabricks-deepsap:latest                            \
    --out /outputdir/                                                               \
    --prefix test_run_10K                                                           \
    --mate_1 /workdir/malaria_short_pe/SRR14793977_10K_1.fastq.gz                   \
    --mate_2 /workdir/malaria_short_pe/SRR14793977_10K_2.fastq.gz                   \
    --gtf /workdir/malaria_short_pe/Plasmodium_falciparum.ASM276v2.60.gtf           \
    --fasta /workdir/malaria_short_pe/Plasmodium_falciparum.ASM276v2.dna.toplevel.fa \
    --gsnap_idx /outputdir/gsnap_idx/
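
In this example the index path /outputdir/gsnap_idx/ sits inside the mounted output folder, which corresponds to test/outputdir/gsnap_idx on the host. This page does not say how the index gets there, so the sketch below only assumes that a usable index may or may not already exist at that location, and adds the `--gsnap_idx` argument only when the directory is present. The `gsnap_idx_args` function is a hypothetical helper:

```shell
# Hypothetical helper: print the --gsnap_idx argument only if an index
# directory exists on the host. host_out is the host directory that the
# examples mount at /outputdir inside the container.
gsnap_idx_args() {
    host_out="$1"
    if [ -d "$host_out/gsnap_idx" ]; then
        # Container-side path, as used in example 2.
        printf '%s\n' "--gsnap_idx /outputdir/gsnap_idx/"
    fi
}
```

Appending an unquoted `$(gsnap_idx_args "$(pwd)/test/outputdir")` to the end of the first example's command would then pass the index when it is available and nothing otherwise. The substitution is left unquoted on purpose so that the flag and its value become two separate words.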

Command-line Arguments

| Argument | Description | Required |
|---|---|---|
| -o, --out | Path to the output folder | Yes |
| --prefix | Output files prefix string | Yes |
| -g, --gtf | Path to the GTF annotation file compatible with the BAM file | Yes |
| -f, --fasta | Path to the FASTA genome file compatible with the BAM file | Yes |
| -s, --sam | Path to the SAM/BAM file or directory of files | Yes (for SAM/BAM input) |
| --mate_1 | Path to FASTQ file of mate 1 (for paired-end reads) | Yes (for FASTQ input) |
| --mate_2 | Path to FASTQ file of mate 2 (for paired-end reads) | Yes (for FASTQ input) |
| --gsnap_idx | Path to GSNAP index | No |
| -c, --config | Config .json file to control DeepSAP internal parameters | No |
| --batch | Batch size for inference | No |
| --set_size | Set size to split datasets for inference | No |
| -t, --threads | Number of threads | No |
| --score_reads | Also classify reads with the transformer model and add scores to the SAM output, as opposed to scoring only splice junctions (SJ) | No |
| --n_reads | Number of reads to classify if --score_reads is used | No |
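
The usage examples above cover FASTQ input only. As a sketch of the pre-aligned case, the function below prints a command that replaces `--mate_1`/`--mate_2` with `--sam`, using only arguments listed in the table. The `deepsap_bam_cmd` helper and the `aligned.sorted.bam` file name are hypothetical; no BAM file is described as part of the test dataset, so you would need to supply your own sorted alignment:

```shell
# Hypothetical helper: print a DeepSAP command for sorted BAM/SAM input.
# It only prints the command; nothing is executed.
deepsap_bam_cmd() {
    bam="$1"      # container-side path to a sorted BAM/SAM file
    prefix="$2"   # output file prefix
    echo docker run --gpus all --ipc=host --ulimit memlock=-1 --ulimit stack=67108864 --rm \
        --volume "$(pwd)/test:/workdir" \
        --volume "$(pwd)/test/outputdir:/outputdir" \
        nvcr.io/nvidia/clara/clara-parabricks-deepsap:latest \
        --out /outputdir/ \
        --prefix "$prefix" \
        --sam "$bam" \
        --gtf /workdir/malaria_short_pe/Plasmodium_falciparum.ASM276v2.60.gtf \
        --fasta /workdir/malaria_short_pe/Plasmodium_falciparum.ASM276v2.dna.toplevel.fa
}

# aligned.sorted.bam is a placeholder name for a user-supplied file.
deepsap_bam_cmd /workdir/malaria_short_pe/aligned.sorted.bam test_run_bam
```

Printing the command first makes it easy to review the mounts and paths before running it. The GTF and FASTA files must match the reference that the BAM file was aligned against.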

License/Terms of Use

By pulling and using the Parabricks container, you accept the governing terms: The software and materials are governed by the NVIDIA Software License Agreement (found at https://www.nvidia.com/en-us/agreements/enterprise-software/nvidia-software-license-agreement/) and the Product-Specific Terms for NVIDIA AI Products (found at https://www.nvidia.com/en-us/agreements/enterprise-software/product-specific-terms-for-ai-products/); except for the model which is governed by the NVIDIA Models Community License Agreement (found at NVIDIA Community Model License). ADDITIONAL INFORMATION: Apache 2.0.