Supported platforms: Linux / amd64, Linux / arm64
Morpheus allows teams to build their own optimized pipelines that address cybersecurity and information security use cases. It provides development capabilities around dynamic protection, real-time telemetry, adaptive policies, and cyber defenses for detecting and remediating cybersecurity threats.
Ensure that Docker is configured to use the NVIDIA Container Runtime. The runtime can either be set as the Docker default or specified explicitly on each invocation:
docker run --rm -ti --runtime=nvidia --gpus=all $ANY_OTHER_DOCKER_ARGS nvcr.io/nvstaging/nvaie/morpheus-pb24h1:24.02.01-runtime bash
More detailed instructions for this mode can be found in the Getting Started Guide on GitHub.
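If you prefer to make the NVIDIA runtime the Docker default (so `--runtime=nvidia` can be omitted), the NVIDIA Container Toolkit documents an `/etc/docker/daemon.json` entry along these lines; restart the Docker daemon after editing. Exact paths can vary by installation, so treat this as a sketch:

```json
{
  "default-runtime": "nvidia",
  "runtimes": {
    "nvidia": {
      "path": "nvidia-container-runtime",
      "runtimeArgs": []
    }
  }
}
```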
The Morpheus pipeline can be configured in two ways:
1. Directly in Python, via the Morpheus pipeline API
2. Via the provided CLI (`morpheus`)

See the examples directory in the GitHub repo for examples of how to configure a pipeline via Python.

The provided CLI (`morpheus`) is capable of running the included tools as well as any linear pipeline.
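The Python-side pattern is the same linear shape the CLI builds: a source feeds an ordered chain of stages. The toy sketch below illustrates only that pattern; it is not the Morpheus API (recent Morpheus releases expose a `LinearPipeline` with similar `set_source`/`add_stage` methods, but real stage classes require a Morpheus install; see the examples directory for the actual API):

```python
# Toy illustration of a linear pipeline: each stage is a callable and
# data flows source -> stage -> ... -> sink. NOT the real Morpheus API.

class ToyLinearPipeline:
    def __init__(self):
        self._stages = []

    def set_source(self, source_fn):
        # The source always runs first.
        self._stages.insert(0, source_fn)

    def add_stage(self, stage_fn):
        # Each stage's output becomes the next stage's input.
        self._stages.append(stage_fn)

    def run(self):
        data = None
        for stage in self._stages:
            data = stage(data)
        return data

pipe = ToyLinearPipeline()
pipe.set_source(lambda _: [{"msg": "a"}, {"msg": "b"}])             # from-file analogue
pipe.add_stage(lambda rows: [dict(r, score=0.9) for r in rows])     # inference analogue
pipe.add_stage(lambda rows: [r for r in rows if r["score"] > 0.5])  # filter analogue
print(pipe.run())  # -> [{'msg': 'a', 'score': 0.9}, {'msg': 'b', 'score': 0.9}]
```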
Usage: morpheus [OPTIONS] COMMAND [ARGS]...
Options:
--debug / --no-debug [default: no-debug]
--log_level [CRITICAL|FATAL|ERROR|WARN|WARNING|INFO|DEBUG]
Specify the logging level to use. [default:
WARNING]
--log_config_file FILE Config file to use to configure logging. Use
only for advanced situations. Can accept
both JSON and ini style configurations
--plugin TEXT Adds a Morpheus CLI plugin. Can either be a
module name or path to a python module
--version Show the version and exit.
--help Show this message and exit.
Commands:
run Run one of the available pipelines
tools Run a utility tool
Usage: morpheus run [OPTIONS] COMMAND [ARGS]...
Options:
--num_threads INTEGER RANGE Number of internal pipeline threads to use [default: 80; x>=1]
--pipeline_batch_size INTEGER RANGE
Internal batch size for the pipeline. Can be much larger than the model batch size. Also used for Kafka consumers [default: 256; x>=1]
--model_max_batch_size INTEGER RANGE
Max batch size to use for the model [default: 8; x>=1]
--edge_buffer_size INTEGER RANGE
The size of buffered channels to use between nodes in a pipeline. Larger values reduce backpressure at the cost of memory. Smaller values
will push messages through the pipeline quicker. Must be greater than 1 and a power of 2 (i.e. 2, 4, 8, 16, etc.) [default: 128; x>=2]
--use_cpp BOOLEAN Whether or not to use C++ node and message types or to prefer python. Only use as a last resort if bugs are encountered [default: True]
--help Show this message and exit.
Commands:
pipeline-ae Run the inference pipeline with an AutoEncoder model
pipeline-fil Run the inference pipeline with a FIL model
pipeline-nlp Run the inference pipeline with a NLP model
pipeline-other Run a custom inference pipeline without a specific model type
Usage: morpheus run pipeline-ae [OPTIONS] COMMAND1 [ARGS]... [COMMAND2 [ARGS]...]...
Configure and run the pipeline. To configure the pipeline, list the stages in the order that data should flow. The output of each stage will become the input for the
next stage. For example, to read, classify and write to a file, the following stages could be used
pipeline from-file --filename=my_dataset.json deserialize preprocess inf-triton --model_name=my_model
--server_url=localhost:8001 filter --threshold=0.5 to-file --filename=classifications.json
Pipelines must follow a few rules:
1. Data must originate in a source stage. Current options are `from-file` or `from-kafka`
2. A `deserialize` stage must be placed between the source stages and the rest of the pipeline
3. Only one inference stage can be used. Zero is also fine
4. The following stages must come after an inference stage: `add-class`, `filter`, `gen-viz`
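These four rules can be checked mechanically against an ordered stage list. The snippet below is a purely illustrative checker (hypothetical, not part of the Morpheus CLI; the stage-name sets are abbreviated):

```python
# Illustrative checker for the pipeline-composition rules above.
# NOT part of Morpheus; stage-name sets are abbreviated examples.

SOURCES = {"from-file", "from-kafka"}
INFERENCE = {"inf-triton", "inf-pytorch", "inf-identity"}
POST_INFERENCE = {"add-class", "filter", "gen-viz"}

def check_pipeline(stages):
    """Return a list of rule violations for an ordered stage list."""
    errors = []
    # Rule 1: data must originate in a source stage.
    if not stages or stages[0] not in SOURCES:
        errors.append("rule 1: pipeline must start with a source stage")
    # Rule 2: a deserialize stage must appear after the source.
    if "deserialize" not in stages:
        errors.append("rule 2: missing deserialize stage")
    # Rule 3: at most one inference stage.
    inf_positions = [i for i, s in enumerate(stages) if s in INFERENCE]
    if len(inf_positions) > 1:
        errors.append("rule 3: more than one inference stage")
    # Rule 4: add-class / filter / gen-viz must come after inference.
    first_inf = inf_positions[0] if inf_positions else None
    for i, s in enumerate(stages):
        if s in POST_INFERENCE and (first_inf is None or i < first_inf):
            errors.append(f"rule 4: {s} must come after an inference stage")
    return errors

ok = ["from-file", "deserialize", "preprocess", "inf-triton", "filter", "to-file"]
bad = ["deserialize", "inf-triton", "inf-pytorch", "filter"]
print(check_pipeline(ok))   # -> []
print(check_pipeline(bad))  # -> violations of rules 1 and 3
```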
Options:
--columns_file DATA FILE [required]
--labels_file DATA FILE Specifies a file to read labels from in order to convert class IDs into labels. A label file is a simple text file where each line
corresponds to a label. If unspecified, only a single output label is created for FIL
--userid_column_name TEXT Which column to use as the User ID. [default: userIdentityaccountId; required]
--userid_filter TEXT Specifying this value will filter all incoming data to only use rows with matching User IDs. Which column is used for the User ID is
specified by `userid_column_name`
--feature_scaler [none|standard|gauss_rank]
Autoencoder feature scaler [default: standard]
--use_generic_model Whether to use a generic model when user does not have minimum number of training rows
--viz_file FILE Save a visualization of the pipeline at the specified location
--help Show this message and exit.
Commands:
add-class Add detected classifications to each message.
add-scores Add probability scores to each message.
buffer (Deprecated) Buffer results.
delay (Deprecated) Delay results for a certain duration.
filter Filter message by a classification threshold.
from-azure Source stage is used to load Azure Active Directory messages.
from-cloudtrail Load messages from a Cloudtrail directory.
from-duo Source stage is used to load Duo Authentication messages.
inf-pytorch Perform inference with PyTorch.
inf-triton Perform inference with Triton Inference Server.
monitor Display throughput numbers at a specific point in the pipeline.
preprocess Prepare Autoencoder input DataFrames for inference.
serialize Include & exclude columns from messages.
timeseries Perform time series anomaly detection and add prediction.
to-file Write all messages to a file.
to-kafka Write all messages to a Kafka cluster.
train-ae Train an Autoencoder model on incoming data.
trigger Buffer data until previous stage has completed.
validate Validate pipeline output for testing.
Usage: morpheus run pipeline-fil [OPTIONS] COMMAND1 [ARGS]... [COMMAND2 [ARGS]...]...
Configure and run the pipeline. To configure the pipeline, list the stages in the order that data should flow. The output of each stage will become the input for the
next stage. For example, to read, classify and write to a file, the following stages could be used
pipeline from-file --filename=my_dataset.json deserialize preprocess inf-triton --model_name=my_model
--server_url=localhost:8001 filter --threshold=0.5 to-file --filename=classifications.json
Pipelines must follow a few rules:
1. Data must originate in a source stage. Current options are `from-file` or `from-kafka`
2. A `deserialize` stage must be placed between the source stages and the rest of the pipeline
3. Only one inference stage can be used. Zero is also fine
4. The following stages must come after an inference stage: `add-class`, `filter`, `gen-viz`
Options:
--model_fea_length INTEGER RANGE
Number of features trained in the model [default: 29; x>=1]
--label TEXT Specify output labels. Ignored when --labels_file is specified [default: mining]
--labels_file DATA FILE Specifies a file to read labels from in order to convert class IDs into labels. A label file is a simple text file where each line
corresponds to a label. If unspecified the value specified by the --label flag will be used.
--columns_file DATA FILE Specifies a file to read column features. [default: data/columns_fil.txt]
--viz_file FILE Save a visualization of the pipeline at the specified location
--help Show this message and exit.
Commands:
add-class Add detected classifications to each message.
add-scores Add probability scores to each message.
buffer (Deprecated) Buffer results.
delay (Deprecated) Delay results for a certain duration.
deserialize Deserialize source data into Dataframes.
dropna Drop null data entries from a DataFrame.
filter Filter message by a classification threshold.
from-appshield Source stage used to load Appshield messages from one or more plugins into a DataFrame. It normalizes nested JSON messages and arranges them into a
DataFrame by snapshot and source (the source that generated the plugin messages).
from-file Load messages from a file.
from-kafka Load messages from a Kafka cluster.
inf-identity Perform inference for testing that performs a no-op.
inf-pytorch Perform inference with PyTorch.
inf-triton Perform inference with Triton Inference Server.
mlflow-drift Report model drift statistics to ML Flow.
monitor Display throughput numbers at a specific point in the pipeline.
preprocess Prepare FIL input DataFrames for inference.
serialize Include & exclude columns from messages.
to-file Write all messages to a file.
to-kafka Write all messages to a Kafka cluster.
trigger Buffer data until previous stage has completed.
validate Validate pipeline output for testing.
Usage: morpheus run pipeline-nlp [OPTIONS] COMMAND1 [ARGS]... [COMMAND2 [ARGS]...]...
Configure and run the pipeline. To configure the pipeline, list the stages in the order that data should flow. The output of each stage will become the input for the
next stage. For example, to read, classify and write to a file, the following stages could be used
pipeline from-file --filename=my_dataset.json deserialize preprocess inf-triton --model_name=my_model
--server_url=localhost:8001 filter --threshold=0.5 to-file --filename=classifications.json
Pipelines must follow a few rules:
1. Data must originate in a source stage. Current options are `from-file` or `from-kafka`
2. A `deserialize` stage must be placed between the source stages and the rest of the pipeline
3. Only one inference stage can be used. Zero is also fine
4. The following stages must come after an inference stage: `add-class`, `filter`, `gen-viz`
Options:
--model_seq_length INTEGER RANGE
Limits the length of the sequence returned. If tokenized string is shorter than max_length, output will be padded with 0s. If the
tokenized string is longer than max_length and do_truncate == False, there will be multiple returned sequences containing the overflowing
token-ids. Default value is 256 [default: 256; x>=1]
--label TEXT Specify output labels.
--labels_file DATA FILE Specifies a file to read labels from in order to convert class IDs into labels. A label file is a simple text file where each line
corresponds to a label. Ignored when --label is specified [default: data/labels_nlp.txt]
--viz_file FILE Save a visualization of the pipeline at the specified location
--help Show this message and exit.
Commands:
add-class Add detected classifications to each message.
add-scores Add probability scores to each message.
buffer (Deprecated) Buffer results.
delay (Deprecated) Delay results for a certain duration.
deserialize Deserialize source data into Dataframes.
dropna Drop null data entries from a DataFrame.
filter Filter message by a classification threshold.
from-file Load messages from a file.
from-kafka Load messages from a Kafka cluster.
gen-viz (Deprecated) Write out visualization DataFrames.
inf-identity Perform inference for testing that performs a no-op.
inf-pytorch Perform inference with PyTorch.
inf-triton Perform inference with Triton Inference Server.
mlflow-drift Report model drift statistics to ML Flow.
monitor Display throughput numbers at a specific point in the pipeline.
preprocess Prepare NLP input DataFrames for inference.
serialize Include & exclude columns from messages.
to-file Write all messages to a file.
to-kafka Write all messages to a Kafka cluster.
trigger Buffer data until previous stage has completed.
validate Validate pipeline output for testing.
Usage: morpheus run pipeline-other [OPTIONS] COMMAND1 [ARGS]... [COMMAND2 [ARGS]...]...
Configure and run the pipeline. To configure the pipeline, list the stages in the order that data should flow. The output of each stage will become the input for the
next stage. For example, to read, classify and write to a file, the following stages could be used
pipeline from-file --filename=my_dataset.json deserialize preprocess inf-triton --model_name=my_model
--server_url=localhost:8001 filter --threshold=0.5 to-file --filename=classifications.json
Pipelines must follow a few rules:
1. Data must originate in a source stage. Current options are `from-file` or `from-kafka`
2. A `deserialize` stage must be placed between the source stages and the rest of the pipeline
3. Only one inference stage can be used. Zero is also fine
4. The following stages must come after an inference stage: `add-class`, `filter`, `gen-viz`
Options:
--model_fea_length INTEGER RANGE
Number of features trained in the model [default: 1; x>=1]
--label TEXT Specify output labels. Ignored when --labels_file is specified
--labels_file DATA FILE Specifies a file to read labels from in order to convert class IDs into labels. A label file is a simple text file where each line
corresponds to a label.
--viz_file FILE Save a visualization of the pipeline at the specified location
--help Show this message and exit.
Commands:
add-class Add detected classifications to each message.
add-scores Add probability scores to each message.
buffer (Deprecated) Buffer results.
delay (Deprecated) Delay results for a certain duration.
deserialize Deserialize source data into Dataframes.
dropna Drop null data entries from a DataFrame.
filter Filter message by a classification threshold.
from-file Load messages from a file.
from-kafka Load messages from a Kafka cluster.
inf-identity Perform inference for testing that performs a no-op.
inf-pytorch Perform inference with PyTorch.
inf-triton Perform inference with Triton Inference Server.
mlflow-drift Report model drift statistics to ML Flow.
monitor Display throughput numbers at a specific point in the pipeline.
serialize Include & exclude columns from messages.
to-file Write all messages to a file.
to-kafka Write all messages to a Kafka cluster.
trigger Buffer data until previous stage has completed.
validate Validate pipeline output for testing.
Usage: morpheus tools [OPTIONS] COMMAND [ARGS]...
Options:
--help Show this message and exit.
Commands:
autocomplete Utility for installing/updating/removing shell completion for Morpheus
onnx-to-trt Converts an ONNX model to a TRT engine
Usage: morpheus tools autocomplete [OPTIONS] COMMAND [ARGS]...
Options:
--help Show this message and exit.
Commands:
install Install the Morpheus shell command completion
show Show the Morpheus shell command completion code
Usage: morpheus tools onnx-to-trt [OPTIONS]
Options:
--input_model PATH [required]
--output_model PATH [required]
--batches <INTEGER INTEGER>... [required]
--seq_length INTEGER [required]
--max_workspace_size INTEGER [default: 16000]
--help Show this message and exit.
NOTE: The conversion tooling requires the separate installation of TensorRT 8.2.
NVIDIA has observed automated vulnerability scanning tools falsely flagging packages against National Vulnerability Database (NVD) security bulletins and GitHub Security Advisories (GHSA). This can happen due to package name collisions (e.g., Mamba Boa with GPG Boa, the Python Docker SDK with Docker core). NVIDIA is committed to providing the highest quality software distribution to our customers.
Morpheus is distributed as open source software under the Apache Software License 2.0.
NVIDIA AI Enterprise provides global support for NVIDIA AI software, including Morpheus. For more information on NVIDIA AI Enterprise please consult this overview and the NVIDIA AI Enterprise End User License Agreement.