
Morpheus

Description
NVIDIA Morpheus is an open AI application framework for cybersecurity developers.
Publisher
NVIDIA
Latest Tag
v23.11.01-runtime
Modified
December 18, 2023
Compressed Size
7.5 GB
Multinode Support
No
Multi-Arch Support
No
v23.11.01-runtime (Latest) Security Scan Results

Linux / amd64


Morpheus

Morpheus allows teams to build their own optimized pipelines that address cybersecurity and information security use cases. Morpheus provides development capabilities around dynamic protection, real-time telemetry, adaptive policies, and cyber defenses for detecting and remediating cybersecurity threats.

Getting Started

Prerequisites
  • Pascal architecture or better (CUDA Compute Capability >= 6.0)
    • P100 and V100 not regularly tested
  • NVIDIA driver 520.61.05 or higher (CUDA 11.8)
  • Docker
  • The NVIDIA Container Toolkit
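A quick way to confirm that the driver and container toolkit are working before pulling the image is to run nvidia-smi through Docker. A minimal sketch; the nvidia/cuda base image tag here is an assumption, and any CUDA 11.8-capable image will do:

# Verify the host driver meets the CUDA 11.8 requirement
nvidia-smi

# Verify Docker can reach the GPU via the NVIDIA Container Toolkit
docker run --rm --gpus=all nvidia/cuda:11.8.0-base-ubuntu22.04 nvidia-smi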
Installation
Pre-built runtime Docker image

Pre-built Morpheus Docker images can be downloaded from NGC. The runtime image comes with Morpheus and its dependencies pre-installed:

docker pull nvcr.io/nvidia/morpheus/morpheus:23.11-runtime

There is also a Helm chart for deploying the Morpheus runtime container as a pod into a Kubernetes cluster.
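As a sketch of fetching that chart from the NGC Helm repository (the chart name and version below are assumptions; check NGC for the current chart):

helm fetch https://helm.ngc.nvidia.com/nvidia/morpheus/charts/morpheus-sdk-client-23.11.tgz \
  --username='$oauthtoken' --password=$NGC_API_KEY --untar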

Note: You must log in to the NGC public catalog to download the Morpheus image. For more information, see the NGC documentation on accessing the catalog.
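Authenticating to nvcr.io follows standard NGC practice: the literal username $oauthtoken with an NGC API key as the password:

docker login nvcr.io
# Username: $oauthtoken
# Password: <your NGC API key>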

There are two CalVer tags for each Morpheus runtime image published to NGC, but both ultimately refer to the same latest digest for a release image:

  • YY.MM-runtime
  • vYY.MM.[00..NN]-runtime

This is something of a technical detail, but it allows the tag used by the Helm charts to always point to the latest version, which may be a point release or hotfix within the same CalVer release. A hotfix image release therefore doesn't necessarily require a new Helm chart for that CalVer release. Refer to the note on Morpheus image versioning.
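To confirm that both tags resolve to the same image, pull each and compare digests:

docker pull nvcr.io/nvidia/morpheus/morpheus:23.11-runtime
docker pull nvcr.io/nvidia/morpheus/morpheus:v23.11.01-runtime
docker images --digests nvcr.io/nvidia/morpheus/morpheus   # both tags list the same digest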

Running Directly with Docker

If you choose to run the Morpheus runtime image directly without Kubernetes, you will need to ensure that Docker has been configured to use the NVIDIA Container Runtime. The runtime can be set as the Docker default or specified explicitly, like this:

docker run --rm -ti --runtime=nvidia --gpus=all $ANY_OTHER_DOCKER_ARGS nvcr.io/nvidia/morpheus/morpheus:23.11-runtime bash

More detailed instructions for this mode can be found in the Getting Started Guide on GitHub.
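Alternatively, to make the NVIDIA runtime the Docker default so that --runtime=nvidia can be omitted, the NVIDIA Container Toolkit documentation describes setting it in /etc/docker/daemon.json. A minimal sketch (note: this overwrites any existing daemon.json; merge by hand if you already have one):

sudo tee /etc/docker/daemon.json <<'EOF'
{
    "default-runtime": "nvidia",
    "runtimes": {
        "nvidia": {
            "path": "nvidia-container-runtime",
            "runtimeArgs": []
        }
    }
}
EOF
sudo systemctl restart docker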

Kafka Cluster and Triton Inference Server

Morpheus provides Kubernetes Helm charts for deploying a basic Kafka cluster, a single Triton Inference Server, and an MLflow server. These are also available in NGC.

Configuration

The Morpheus pipeline can be configured in two ways:

  1. Manual configuration in a Python script.
  2. Configuration via the provided CLI (i.e., morpheus).
Starting the Pipeline (via Manual Python Config)

See the examples directory in the GitHub repo for examples of how to configure a pipeline via Python.

Starting the Pipeline (via CLI)

The provided CLI (morpheus) is capable of running the included tools as well as any linear pipeline.

morpheus

Usage: morpheus [OPTIONS] COMMAND [ARGS]...

  Main entry point function for the CLI.

Options:
  --debug / --no-debug            [default: no-debug]
  --log_level [CRITICAL|FATAL|ERROR|WARN|WARNING|INFO|DEBUG]
                                  Specify the logging level to use.  [default:
                                  WARNING]
  --log_config_file FILE          Config file to use to configure logging. Use
                                  only for advanced situations. Can accept
                                  both JSON and ini style configurations
  --plugin TEXT                   Adds a Morpheus CLI plugin. Can either be a
                                  module name or path to a python module
  --version                       Show the version and exit.
  --help                          Show this message and exit.

Commands:
  run    Run one of the available pipelines
  tools  Run a utility tool

morpheus run

Usage: morpheus run [OPTIONS] COMMAND [ARGS]...

  Run subcommand, used for running a pipeline

Options:
  --num_threads INTEGER RANGE     Number of internal pipeline threads to use  [default: 128; x>=1]
  --pipeline_batch_size INTEGER RANGE
                                  Internal batch size for the pipeline. Can be much larger than the model batch size. Also used for Kafka consumers  [default: 256; x>=1]
  --model_max_batch_size INTEGER RANGE
                                  Max batch size to use for the model  [default: 8; x>=1]
  --edge_buffer_size INTEGER RANGE
                                  The size of buffered channels to use between nodes in a pipeline. Larger values reduce backpressure at the cost of memory. Smaller values
                                  will push messages through the pipeline quicker. Must be greater than 1 and a power of 2 (i.e. 2, 4, 8, 16, etc.)  [default: 128; x>=2]
  --use_cpp BOOLEAN               Whether or not to use C++ node and message types or to prefer python. Only use as a last resort if bugs are encountered  [default: True]
  --help                          Show this message and exit.

Commands:
  pipeline-ae     Run the inference pipeline with an AutoEncoder model
  pipeline-fil    Run the inference pipeline with a FIL model
  pipeline-nlp    Run the inference pipeline with a NLP model
  pipeline-other  Run a custom inference pipeline without a specific model type

morpheus run pipeline-ae

Usage: morpheus run pipeline-ae [OPTIONS] COMMAND1 [ARGS]... [COMMAND2 [ARGS]...]...

  Configure and run the pipeline. To configure the pipeline, list the stages in the order that data should flow. The output of each stage will become the input for the
  next stage. For example, to read, classify and write to a file, the following stages could be used

  pipeline from-file --filename=my_dataset.json deserialize preprocess inf-triton --model_name=my_model --server_url=localhost:8001 filter --threshold=0.5 to-file
  --filename=classifications.json

  Pipelines must follow a few rules: 1. Data must originate in a source stage. Current options are `from-file` or `from-kafka` 2. A `deserialize` stage must be placed
  between the source stages and the rest of the pipeline 3. Only one inference stage can be used. Zero is also fine 4. The following stages must come after an inference
  stage: `add-class`, `filter`, `gen-viz`

Options:
  --columns_file DATA FILE        Specifies a file to read column features.  [required]
  --labels_file DATA FILE         Specifies a file to read labels from in order to convert class IDs into labels. A label file is a simple text file where each line
                                  corresponds to a label.
  --userid_column_name TEXT       Which column to use as the User ID.  [default: userIdentityaccountId; required]
  --userid_filter TEXT            Specifying this value will filter all incoming data to only use rows with matching User IDs. Which column is used for the User ID is
                                  specified by `userid_column_name`
  --feature_scaler [NONE|STANDARD|GAUSSRANK]
                                  Autoencoder feature scaler  [default: STANDARD]
  --use_generic_model             Whether to use a generic model when user does not have minimum number of training rows
  --viz_file FILE                 Save a visualization of the pipeline at the specified location
  --viz_direction [BT|LR|RL|TB]   Set the direction for the Graphviz pipeline diagram, ignored unless --viz_file is also specified.  [default: LR]
  --timestamp_column_name TEXT    Which column to use as the timestamp.  [default: timestamp; required]
  --help                          Show this message and exit.

Commands:
  add-class                  Add detected classifications to each message.
  add-scores                 Add probability scores to each message.
  buffer                     (Deprecated) Buffer results.
  delay                      (Deprecated) Delay results for a certain duration.
  filter                     Filter message by a classification threshold.
  from-arxiv                 Source stage that downloads PDFs from arxiv and converts them to dataframes.
  from-azure                 Source stage is used to load Azure Active Directory messages.
  from-cloudtrail            Load messages from a Cloudtrail directory.
  from-databricks-deltalake  Source stage used to load messages from a DeltaLake table.
  from-duo                   Source stage is used to load Duo Authentication messages.
  from-http                  Source stage that starts an HTTP server and listens for incoming requests on a specified endpoint.
  from-http-client           Source stage that polls a remote HTTP server for incoming data.
  from-rss                   Load RSS feed items into a DataFrame.
  inf-pytorch                Perform inference with PyTorch.
  inf-triton                 Perform inference with Triton Inference Server.
  monitor                    Display throughput numbers at a specific point in the pipeline.
  preprocess                 Prepare Autoencoder input DataFrames for inference.
  serialize                  Includes & excludes columns from messages.
  timeseries                 Perform time series anomaly detection and add prediction.
  to-elasticsearch           This class writes the messages as documents to Elasticsearch.
  to-file                    Write all messages to a file.
  to-http                    Write all messages to an HTTP endpoint.
  to-http-server             Sink stage that starts an HTTP server and listens for incoming requests on a specified endpoint.
  to-kafka                   Write all messages to a Kafka cluster.
  train-ae                   Train an Autoencoder model on incoming data.
  trigger                    Buffer data until the previous stage has completed.
  validate                   Validate pipeline output for testing.
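For illustration, an autoencoder run assembled from the stages above. The --input_glob flag on from-cloudtrail and all file paths are assumptions for this sketch; only the top-level options are taken from the help text:

morpheus run pipeline-ae --columns_file=columns.txt \
  from-cloudtrail --input_glob=./cloudtrail-logs/*.json \
  train-ae \
  preprocess \
  inf-pytorch \
  add-scores \
  serialize \
  to-file --filename=./anomaly_scores.json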

morpheus run pipeline-fil

Usage: morpheus run pipeline-fil [OPTIONS] COMMAND1 [ARGS]... [COMMAND2 [ARGS]...]...

  Configure and run the pipeline. To configure the pipeline, list the stages in the order that data should flow. The output of each stage will become the input for the
  next stage. For example, to read, classify and write to a file, the following stages could be used

  pipeline from-file --filename=my_dataset.json deserialize preprocess inf-triton --model_name=my_model --server_url=localhost:8001 filter --threshold=0.5 to-file
  --filename=classifications.json

  Pipelines must follow a few rules: 1. Data must originate in a source stage. Current options are `from-file` or `from-kafka` 2. A `deserialize` stage must be placed
  between the source stages and the rest of the pipeline 3. Only one inference stage can be used. Zero is also fine 4. The following stages must come after an inference
  stage: `add-class`, `filter`, `gen-viz`

Options:
  --model_fea_length INTEGER RANGE
                                  Number of features trained in the model  [default: 18; x>=1]
  --label TEXT                    Specify output labels. Ignored when --labels_file is specified  [default: mining]
  --labels_file DATA FILE         Specifies a file to read labels from in order to convert class IDs into labels. A label file is a simple text file where each line
                                  corresponds to a label. If unspecified the value specified by the --label flag will be used.
  --columns_file DATA FILE        Specifies a file to read column features.
  --viz_file FILE                 Save a visualization of the pipeline at the specified location
  --viz_direction [BT|LR|RL|TB]   Set the direction for the Graphviz pipeline diagram, ignored unless --viz_file is also specified.  [default: LR]
  --help                          Show this message and exit.

Commands:
  add-class                  Add detected classifications to each message.
  add-scores                 Add probability scores to each message.
  buffer                     (Deprecated) Buffer results.
  delay                      (Deprecated) Delay results for a certain duration.
  deserialize                Messages are logically partitioned based on the pipeline config's `pipeline_batch_size` parameter.
  dropna                     Drop null data entries from a DataFrame.
  filter                     Filter message by a classification threshold.
  from-appshield             Source stage is used to load Appshield messages from one or more plugins into a dataframe. It normalizes nested json messages and arranges
                             them into a dataframe by snapshot and source.
  from-arxiv                 Source stage that downloads PDFs from arxiv and converts them to dataframes.
  from-databricks-deltalake  Source stage used to load messages from a DeltaLake table.
  from-file                  Load messages from a file.
  from-http                  Source stage that starts an HTTP server and listens for incoming requests on a specified endpoint.
  from-http-client           Source stage that polls a remote HTTP server for incoming data.
  from-kafka                 Load messages from a Kafka cluster.
  from-rss                   Load RSS feed items into a DataFrame.
  inf-identity               Perform inference for testing that performs a no-op.
  inf-pytorch                Perform inference with PyTorch.
  inf-triton                 Perform inference with Triton Inference Server.
  mlflow-drift               Report model drift statistics to MLflow.
  monitor                    Display throughput numbers at a specific point in the pipeline.
  preprocess                 Prepare FIL input DataFrames for inference.
  serialize                  Includes & excludes columns from messages.
  to-elasticsearch           This class writes the messages as documents to Elasticsearch.
  to-file                    Write all messages to a file.
  to-http                    Write all messages to an HTTP endpoint.
  to-http-server             Sink stage that starts an HTTP server and listens for incoming requests on a specified endpoint.
  to-kafka                   Write all messages to a Kafka cluster.
  trigger                    Buffer data until the previous stage has completed.
  validate                   Validate pipeline output for testing.

morpheus run pipeline-nlp

Usage: morpheus run pipeline-nlp [OPTIONS] COMMAND1 [ARGS]... [COMMAND2 [ARGS]...]...

  Configure and run the pipeline. To configure the pipeline, list the stages in the order that data should flow. The output of each stage will become the input for the
  next stage. For example, to read, classify and write to a file, the following stages could be used

  pipeline from-file --filename=my_dataset.json deserialize preprocess inf-triton --model_name=my_model --server_url=localhost:8001 filter --threshold=0.5 to-file
  --filename=classifications.json

  Pipelines must follow a few rules: 1. Data must originate in a source stage. Current options are `from-file` or `from-kafka` 2. A `deserialize` stage must be placed
  between the source stages and the rest of the pipeline 3. Only one inference stage can be used. Zero is also fine 4. The following stages must come after an inference
  stage: `add-class`, `filter`, `gen-viz`

Options:
  --model_seq_length INTEGER RANGE
                                  Limits the length of the sequence returned. If tokenized string is shorter than max_length, output will be padded with 0s. If the
                                  tokenized string is longer than max_length and do_truncate == False, there will be multiple returned sequences containing the overflowing
                                  token-ids. Default value is 256  [default: 256; x>=1]
  --label TEXT                    Specify output labels.
  --labels_file DATA FILE         Specifies a file to read labels from in order to convert class IDs into labels. A label file is a simple text file where each line
                                  corresponds to a label. Ignored when --label is specified  [default: data/labels_nlp.txt]
  --viz_file FILE                 Save a visualization of the pipeline at the specified location
  --viz_direction [BT|LR|RL|TB]   Set the direction for the Graphviz pipeline diagram, ignored unless --viz_file is also specified.  [default: LR]
  --help                          Show this message and exit.

Commands:
  add-class                  Add detected classifications to each message.
  add-scores                 Add probability scores to each message.
  buffer                     (Deprecated) Buffer results.
  delay                      (Deprecated) Delay results for a certain duration.
  deserialize                Messages are logically partitioned based on the pipeline config's `pipeline_batch_size` parameter.
  dropna                     Drop null data entries from a DataFrame.
  filter                     Filter message by a classification threshold.
  from-arxiv                 Source stage that downloads PDFs from arxiv and converts them to dataframes.
  from-databricks-deltalake  Source stage used to load messages from a DeltaLake table.
  from-doca                  A source stage used to receive raw packet data from a ConnectX-6 Dx NIC.
  from-file                  Load messages from a file.
  from-http                  Source stage that starts an HTTP server and listens for incoming requests on a specified endpoint.
  from-http-client           Source stage that polls a remote HTTP server for incoming data.
  from-kafka                 Load messages from a Kafka cluster.
  from-rss                   Load RSS feed items into a DataFrame.
  gen-viz                    (Deprecated) Write out visualization DataFrames.
  inf-identity               Perform inference for testing that performs a no-op.
  inf-pytorch                Perform inference with PyTorch.
  inf-triton                 Perform inference with Triton Inference Server.
  mlflow-drift               Report model drift statistics to MLflow.
  monitor                    Display throughput numbers at a specific point in the pipeline.
  preprocess                 Prepare NLP input DataFrames for inference.
  serialize                  Includes & excludes columns from messages.
  to-elasticsearch           This class writes the messages as documents to Elasticsearch.
  to-file                    Write all messages to a file.
  to-http                    Write all messages to an HTTP endpoint.
  to-http-server             Sink stage that starts an HTTP server and listens for incoming requests on a specified endpoint.
  to-kafka                   Write all messages to a Kafka cluster.
  trigger                    Buffer data until the previous stage has completed.
  validate                   Validate pipeline output for testing.
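Putting the pieces together, an illustrative NLP run built only from stages and flags shown above (the dataset, model name, and server URL are placeholders):

morpheus --log_level=INFO run --num_threads=8 --pipeline_batch_size=1024 \
  pipeline-nlp --model_seq_length=256 \
  from-file --filename=my_dataset.json \
  deserialize \
  preprocess \
  inf-triton --model_name=my_model --server_url=localhost:8001 \
  monitor \
  add-class \
  serialize \
  to-file --filename=classifications.json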

morpheus run pipeline-other

Usage: morpheus run pipeline-other [OPTIONS] COMMAND1 [ARGS]... [COMMAND2 [ARGS]...]...

  Configure and run the pipeline. To configure the pipeline, list the stages in the order that data should flow. The output of each stage will become the input for the
  next stage. For example, to read, classify and write to a file, the following stages could be used

  pipeline from-file --filename=my_dataset.json deserialize preprocess inf-triton --model_name=my_model --server_url=localhost:8001 filter --threshold=0.5 to-file
  --filename=classifications.json

  Pipelines must follow a few rules: 1. Data must originate in a source stage. Current options are `from-file` or `from-kafka` 2. A `deserialize` stage must be placed
  between the source stages and the rest of the pipeline 3. Only one inference stage can be used. Zero is also fine 4. The following stages must come after an inference
  stage: `add-class`, `filter`, `gen-viz`

Options:
  --model_fea_length INTEGER RANGE
                                  Number of features trained in the model  [default: 1; x>=1]
  --label TEXT                    Specify output labels. Ignored when --labels_file is specified
  --labels_file DATA FILE         Specifies a file to read labels from in order to convert class IDs into labels. A label file is a simple text file where each line
                                  corresponds to a label.
  --viz_file FILE                 Save a visualization of the pipeline at the specified location
  --viz_direction [BT|LR|RL|TB]   Set the direction for the Graphviz pipeline diagram, ignored unless --viz_file is also specified.  [default: LR]
  --help                          Show this message and exit.

Commands:
  add-class                  Add detected classifications to each message.
  add-scores                 Add probability scores to each message.
  buffer                     (Deprecated) Buffer results.
  delay                      (Deprecated) Delay results for a certain duration.
  deserialize                Messages are logically partitioned based on the pipeline config's `pipeline_batch_size` parameter.
  dropna                     Drop null data entries from a DataFrame.
  filter                     Filter message by a classification threshold.
  from-arxiv                 Source stage that downloads PDFs from arxiv and converts them to dataframes.
  from-databricks-deltalake  Source stage used to load messages from a DeltaLake table.
  from-file                  Load messages from a file.
  from-http                  Source stage that starts an HTTP server and listens for incoming requests on a specified endpoint.
  from-http-client           Source stage that polls a remote HTTP server for incoming data.
  from-kafka                 Load messages from a Kafka cluster.
  from-rss                   Load RSS feed items into a DataFrame.
  inf-identity               Perform inference for testing that performs a no-op.
  inf-pytorch                Perform inference with PyTorch.
  inf-triton                 Perform inference with Triton Inference Server.
  mlflow-drift               Report model drift statistics to MLflow.
  monitor                    Display throughput numbers at a specific point in the pipeline.
  serialize                  Includes & excludes columns from messages.
  to-elasticsearch           This class writes the messages as documents to Elasticsearch.
  to-file                    Write all messages to a file.
  to-http                    Write all messages to an HTTP endpoint.
  to-http-server             Sink stage that starts an HTTP server and listens for incoming requests on a specified endpoint.
  to-kafka                   Write all messages to a Kafka cluster.
  trigger                    Buffer data until the previous stage has completed.
  validate                   Validate pipeline output for testing.
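Because the rules above permit zero inference stages, pipeline-other can serve as a simple pass-through for inspecting data flow; an illustrative run (file names are placeholders):

morpheus run pipeline-other \
  from-file --filename=my_dataset.json \
  deserialize \
  monitor \
  serialize \
  to-file --filename=passthrough.json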

morpheus tools

Usage: morpheus tools [OPTIONS] COMMAND [ARGS]...

  Tools subcommand

Options:
  --help  Show this message and exit.

Commands:
  autocomplete  Utility for installing/updating/removing shell completion for Morpheus
  onnx-to-trt   Converts an ONNX model to a TRT engine

morpheus tools autocomplete

Usage: morpheus tools autocomplete [OPTIONS] COMMAND [ARGS]...

  Utility for installing/updating/removing shell completion for Morpheus

Options:
  --help  Show this message and exit.

Commands:
  install  Install the Morpheus shell command completion
  show     Show the Morpheus shell command completion code
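For example, installing completion for the current shell:

morpheus tools autocomplete install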

morpheus tools onnx-to-trt

Usage: morpheus tools onnx-to-trt [OPTIONS]

  Converts an ONNX model to a TRT engine

Options:
  --input_model PATH              [required]
  --output_model PATH             [required]
  --batches <INTEGER INTEGER>...  [required]
  --seq_length INTEGER            [required]
  --max_workspace_size INTEGER    [default: 16000]
  --help                          Show this message and exit.

NOTE: The conversion tooling requires the separate installation of TensorRT 8.2.
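An illustrative conversion using only the options above (model paths, batch ranges, and sequence length are placeholders):

morpheus tools onnx-to-trt --input_model=my_model.onnx --output_model=my_model.engine \
  --batches 1 8 --batches 1 16 --seq_length 256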

Container Security

NVIDIA has observed false positive identification, by automated vulnerability scanning tools, of packages against National Vulnerability Database (NVD) security bulletins and GitHub Security Advisories (GHSA). This can happen due to package name collisions (e.g., Mamba Boa with GPG Boa, the Python Docker SDK with Docker core). NVIDIA is committed to providing the highest-quality software distribution to our customers. The containers are purpose-built for Morpheus use cases, have several dependencies, and are not intended for general-purpose use such as web hosting.

In this release, we note the following new vulnerabilities:

  • GHSA-5wvp-7f3h-6wmm (pyarrow 11.0.0): A CVE in the pyarrow package, fixed upstream in release 14.0.1. Morpheus has patched this vulnerability using a public hotfix.
  • GHSA-5p3h-7fwh-92rc: A vulnerability in MLflow that will be fixed in the 2.9.0 release.

License

Morpheus is distributed as open source software under the Apache Software License 2.0.

NVIDIA AI Enterprise

NVIDIA AI Enterprise provides global support for NVIDIA AI software. For more information on NVIDIA AI Enterprise, please consult the NVIDIA AI Enterprise overview and the NVIDIA AI Enterprise End User License Agreement.