Morpheus

Description

NVIDIA Morpheus is an open AI application framework for cybersecurity developers.

Publisher

NVIDIA

Latest Tag

23.07.02-runtime

Modified

September 3, 2023

Compressed Size

7.33 GB

Multinode Support

No

Multi-Arch Support

No

Morpheus

Morpheus allows teams to build their own optimized pipelines that address cybersecurity and information security use cases. Morpheus provides development capabilities around dynamic protection, real-time telemetry, adaptive policies, and cyber defenses for detecting and remediating cybersecurity threats.

Getting Started

Prerequisites
  • Pascal architecture or better (CUDA Compute Capability >= 6.0)
    • P100 and V100 not regularly tested
  • NVIDIA driver 520.61.05 or higher (CUDA 11.8)
  • Docker
  • The NVIDIA Container Toolkit
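
The GPU and driver requirements above can be checked before pulling the image. The following is a minimal sketch and not part of Morpheus: the helper names are hypothetical, and it assumes the two version strings have already been read from the host (for example from `nvidia-smi --query-gpu=compute_cap,driver_version --format=csv,noheader` on a recent driver).

```python
from typing import Tuple

# Minimums taken from the prerequisites list above.
MIN_COMPUTE_CAPABILITY = (6, 0)   # Pascal
MIN_DRIVER = (520, 61, 5)         # 520.61.05 (CUDA 11.8)


def parse_version(text: str) -> Tuple[int, ...]:
    """Turn a dotted version string such as '520.61.05' into (520, 61, 5)."""
    return tuple(int(part) for part in text.strip().split("."))


def meets_prerequisites(compute_cap: str, driver_version: str) -> bool:
    """Hypothetical helper: True if both values satisfy the documented minimums."""
    return (parse_version(compute_cap) >= MIN_COMPUTE_CAPABILITY
            and parse_version(driver_version) >= MIN_DRIVER)
```

Comparing tuples of integers avoids the pitfalls of comparing version strings lexically, where "6.0" would sort after "10.0".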

Installation

Pre-built runtime Docker image

Pre-built Morpheus Docker images can be downloaded from NGC. The runtime image includes pre-installed Morpheus and its dependencies:

docker pull nvcr.io/nvidia/morpheus/morpheus:23.07-runtime

There is also a Helm chart for deploying the Morpheus runtime container as a pod into a Kubernetes cluster.

Note: You must log into the NGC public catalog to download the Morpheus image. For more information see this guide for accessing NGC.

Two CalVer tags are published to NGC for each Morpheus runtime image release; both ultimately refer to the same latest digest for that release:

  • YY.MM-runtime
  • vYY.MM.[00..NN]-runtime

This technical detail lets the tag used by the Helm charts always point to the latest version, which may be a point release or hotfix within the same CalVer release. As a result, a hotfix image release does not necessarily require a Helm chart update for that CalVer release. Refer to this note on Morpheus image versioning.
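
To illustrate how the two tag forms relate, here is a minimal sketch that groups tags by their CalVer release. It is not part of Morpheus: the function names are hypothetical, and the leading `v` is treated as optional because this page shows tags both with it (in the pattern above) and without it (in the latest tag).

```python
import re

# Matches YY.MM-runtime and [v]YY.MM.NN-runtime.
TAG_PATTERN = re.compile(
    r"^v?(?P<year>\d{2})\.(?P<month>\d{2})(?:\.(?P<patch>\d{2}))?-runtime$"
)


def release_of(tag: str) -> str:
    """Return the 'YY.MM' release that a runtime tag belongs to."""
    match = TAG_PATTERN.match(tag)
    if match is None:
        raise ValueError(f"not a Morpheus runtime tag: {tag!r}")
    return f"{match['year']}.{match['month']}"


def same_release(tag_a: str, tag_b: str) -> bool:
    """True if both tags belong to the same CalVer release."""
    return release_of(tag_a) == release_of(tag_b)
```

Under these assumptions, `23.07-runtime` and `23.07.02-runtime` map to the same release, which is why a Helm chart pinned to the short tag picks up hotfix images.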

Running Directly with Docker

If you choose to run the Morpheus runtime image directly without Kubernetes, you must ensure that Docker is configured to use the NVIDIA Container Runtime. The runtime can be set as Docker's default or specified explicitly, like this:

docker run --rm -ti --runtime=nvidia --gpus=all $ANY_OTHER_DOCKER_ARGS nvcr.io/nvidia/morpheus/morpheus:23.07-runtime bash

More detailed instructions for this mode can be found in the Getting Started Guide on GitHub.
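
When the container is launched from a script, the same command can be assembled programmatically. The sketch below only builds the argument list from the command shown above; the function name is hypothetical, and any extra arguments passed in (such as a `-v` volume mount) are illustrative stand-ins for `$ANY_OTHER_DOCKER_ARGS`.

```python
import shlex
from typing import List, Sequence

IMAGE = "nvcr.io/nvidia/morpheus/morpheus:23.07-runtime"


def docker_run_command(extra_args: Sequence[str] = (),
                       explicit_runtime: bool = True) -> List[str]:
    """Build the `docker run` argv for an interactive Morpheus shell."""
    argv = ["docker", "run", "--rm", "-ti"]
    if explicit_runtime:
        # Omit this when the NVIDIA runtime is already Docker's default.
        argv.append("--runtime=nvidia")
    argv.append("--gpus=all")
    argv.extend(extra_args)
    argv.extend([IMAGE, "bash"])
    return argv


# Print a shell-safe version of the command for inspection.
print(shlex.join(docker_run_command()))
```

Passing the list to `subprocess.run` rather than a joined string avoids shell quoting issues with the extra arguments.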

Kafka Cluster and Triton Inference Server

Morpheus provides Kubernetes Helm charts for deploying a basic Kafka cluster, a single Triton Inference Server, and an MLflow server. These are also available in NGC.

Configuration

The Morpheus pipeline can be configured in two ways:

  1. Manual configuration in a Python script.
  2. Configuration via the provided CLI (i.e., morpheus).

Starting the Pipeline (via Manual Python Config)

See the examples directory in the GitHub repo for examples of how to configure a pipeline via Python.

Starting the Pipeline (via CLI)

The provided CLI (morpheus) is capable of running the included tools as well as any linear pipeline.

morpheus

Usage: morpheus [OPTIONS] COMMAND [ARGS]...

Options:
  --debug / --no-debug            [default: no-debug]
  --log_level [CRITICAL|FATAL|ERROR|WARN|WARNING|INFO|DEBUG]
                                  Specify the logging level to use.  [default:
                                  WARNING]
  --log_config_file FILE          Config file to use to configure logging. Use
                                  only for advanced situations. Can accept
                                  both JSON and ini style configurations
  --plugin TEXT                   Adds a Morpheus CLI plugin. Can either be a
                                  module name or path to a python module
  --version                       Show the version and exit.
  --help                          Show this message and exit.

Commands:
  run    Run one of the available pipelines
  tools  Run a utility tool

morpheus run

Usage: morpheus run [OPTIONS] COMMAND [ARGS]...

Options:
  --num_threads INTEGER RANGE     Number of internal pipeline threads to use  [default: 80; x>=1]
  --pipeline_batch_size INTEGER RANGE
                                  Internal batch size for the pipeline. Can be much larger than the model batch size. Also used for Kafka consumers  [default: 256; x>=1]
  --model_max_batch_size INTEGER RANGE
                                  Max batch size to use for the model  [default: 8; x>=1]
  --edge_buffer_size INTEGER RANGE
                                  The size of buffered channels to use between nodes in a pipeline. Larger values reduce backpressure at the cost of memory. Smaller values
                                  will push messages through the pipeline quicker. Must be greater than 1 and a power of 2 (i.e. 2, 4, 8, 16, etc.)  [default: 128; x>=2]
  --use_cpp BOOLEAN               Whether or not to use C++ node and message types or to prefer python. Only use as a last resort if bugs are encountered  [default: True]
  --help                          Show this message and exit.

Commands:
  pipeline-ae     Run the inference pipeline with an AutoEncoder model
  pipeline-fil    Run the inference pipeline with a FIL model
  pipeline-nlp    Run the inference pipeline with a NLP model
  pipeline-other  Run a custom inference pipeline without a specific model type
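
The help text above requires `--edge_buffer_size` to be greater than 1 and a power of 2. A value can be checked before it is passed to the CLI; the following is a minimal sketch with hypothetical helper names, not code taken from Morpheus.

```python
def is_valid_edge_buffer_size(size: int) -> bool:
    """True if size is >= 2 and a power of two."""
    # A power of two has exactly one bit set, so size & (size - 1) is zero.
    return size >= 2 and (size & (size - 1)) == 0


def next_valid_edge_buffer_size(size: int) -> int:
    """Round a requested size up to the nearest accepted value."""
    candidate = 2
    while candidate < size:
        candidate *= 2
    return candidate
```

Rounding up rather than down follows the trade-off stated in the help text: larger buffers reduce backpressure at the cost of memory.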

morpheus run pipeline-ae

Usage: morpheus run pipeline-ae [OPTIONS] COMMAND1 [ARGS]... [COMMAND2 [ARGS]...]...

  Configure and run the pipeline. To configure the pipeline, list the stages in the order that data should flow. The output of each stage will become the input for the
  next stage. For example, to read, classify and write to a file, the following stages could be used

  pipeline from-file --filename=my_dataset.json deserialize preprocess inf-triton --model_name=my_model
  --server_url=localhost:8001 filter --threshold=0.5 to-file --filename=classifications.json

  Pipelines must follow a few rules:
  1. Data must originate in a source stage. Current options are `from-file` or `from-kafka`
  2. A `deserialize` stage must be placed between the source stages and the rest of the pipeline
  3. Only one inference stage can be used. Zero is also fine
  4. The following stages must come after an inference stage: `add-class`, `filter`, `gen-viz`

Options:
  --columns_file DATA FILE        [required]
  --labels_file DATA FILE         Specifies a file to read labels from in order to convert class IDs into labels. A label file is a simple text file where each line
                                  corresponds to a label. If unspecified, only a single output label is created for FIL
  --userid_column_name TEXT       Which column to use as the User ID.  [default: userIdentityaccountId; required]
  --userid_filter TEXT            Specifying this value will filter all incoming data to only use rows with matching User IDs. Which column is used for the User ID is
                                  specified by `userid_column_name`
  --feature_scaler [none|standard|gauss_rank]
                                  Autoencoder feature scaler  [default: standard]
  --use_generic_model             Whether to use a generic model when user does not have minimum number of training rows
  --viz_file FILE                 Save a visualization of the pipeline at the specified location
  --help                          Show this message and exit.

Commands:
  add-class        Add detected classifications to each message.
  add-scores       Add probability scores to each message.
  buffer           (Deprecated) Buffer results.
  delay            (Deprecated) Delay results for a certain duration.
  filter           Filter message by a classification threshold.
  from-azure       Source stage is used to load Azure Active Directory messages.
  from-cloudtrail  Load messages from a Cloudtrail directory.
  from-duo         Source stage is used to load Duo Authentication messages.
  inf-pytorch      Perform inference with PyTorch.
  inf-triton       Perform inference with Triton Inference Server.
  monitor          Display throughput numbers at a specific point in the pipeline.
  preprocess       Prepare Autoencoder input DataFrames for inference.
  serialize        Include & exclude columns from messages.
  timeseries       Perform time series anomaly detection and add prediction.
  to-file          Write all messages to a file.
  to-kafka         Write all messages to a Kafka cluster.
  train-ae         Train an Autoencoder model on incoming data.
  trigger          Buffer data until previous stage has completed.
  validate         Validate pipeline output for testing.

morpheus run pipeline-fil

Usage: morpheus run pipeline-fil [OPTIONS] COMMAND1 [ARGS]... [COMMAND2 [ARGS]...]...

  Configure and run the pipeline. To configure the pipeline, list the stages in the order that data should flow. The output of each stage will become the input for the
  next stage. For example, to read, classify and write to a file, the following stages could be used

  pipeline from-file --filename=my_dataset.json deserialize preprocess inf-triton --model_name=my_model
  --server_url=localhost:8001 filter --threshold=0.5 to-file --filename=classifications.json

  Pipelines must follow a few rules:
  1. Data must originate in a source stage. Current options are `from-file` or `from-kafka`
  2. A `deserialize` stage must be placed between the source stages and the rest of the pipeline
  3. Only one inference stage can be used. Zero is also fine
  4. The following stages must come after an inference stage: `add-class`, `filter`, `gen-viz`

Options:
  --model_fea_length INTEGER RANGE
                                  Number of features trained in the model  [default: 29; x>=1]
  --label TEXT                    Specify output labels. Ignored when --labels_file is specified  [default: mining]
  --labels_file DATA FILE         Specifies a file to read labels from in order to convert class IDs into labels. A label file is a simple text file where each line
                                  corresponds to a label. If unspecified the value specified by the --label flag will be used.
  --columns_file DATA FILE        Specifies a file to read column features.  [default: data/columns_fil.txt]
  --viz_file FILE                 Save a visualization of the pipeline at the specified location
  --help                          Show this message and exit.

Commands:
  add-class       Add detected classifications to each message.
  add-scores      Add probability scores to each message.
  buffer          (Deprecated) Buffer results.
  delay           (Deprecated) Delay results for a certain duration.
  deserialize     Deserialize source data into Dataframes.
  dropna          Drop null data entries from a DataFrame.
  filter          Filter message by a classification threshold.
  from-appshield  Source stage is used to load Appshield messages from one or more plugins into a dataframe. It normalizes nested json messages and arranges them into a
                  dataframe by snapshot and source(Determine which source generated the plugin messages).
  from-file       Load messages from a file.
  from-kafka      Load messages from a Kafka cluster.
  inf-identity    Perform inference for testing that performs a no-op.
  inf-pytorch     Perform inference with PyTorch.
  inf-triton      Perform inference with Triton Inference Server.
  mlflow-drift    Report model drift statistics to ML Flow.
  monitor         Display throughput numbers at a specific point in the pipeline.
  preprocess      Prepare FIL input DataFrames for inference.
  serialize       Include & exclude columns from messages.
  to-file         Write all messages to a file.
  to-kafka        Write all messages to a Kafka cluster.
  trigger         Buffer data until previous stage has completed.
  validate        Validate pipeline output for testing.
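
The interaction between `--label` and `--labels_file` for `pipeline-fil` can be sketched as follows. This is an illustration under stated assumptions, not the Morpheus implementation: the function names are hypothetical, blank lines in the file are skipped, class IDs are assumed to be zero-based line indices, and the label `benign` in the usage lines is made up for the example.

```python
import io
from typing import Iterable, List, Optional, TextIO


def read_labels(stream: TextIO) -> List[str]:
    """Read one label per line, skipping blank lines (an assumption)."""
    return [line.strip() for line in stream if line.strip()]


def resolve_fil_labels(label_flags: Iterable[str] = ("mining",),
                       labels_file: Optional[TextIO] = None) -> List[str]:
    """Apply the documented precedence: --labels_file overrides --label."""
    if labels_file is not None:
        return read_labels(labels_file)
    return list(label_flags)


# Usage: with a labels file, class ID 1 resolves to the second line.
labels = resolve_fil_labels(labels_file=io.StringIO("benign\nmining\n"))
print(labels[1])
```

Note that `pipeline-nlp` documents the opposite precedence (its `--labels_file` is ignored when `--label` is specified), so this sketch applies to the FIL pipeline only.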

morpheus run pipeline-nlp

Usage: morpheus run pipeline-nlp [OPTIONS] COMMAND1 [ARGS]... [COMMAND2 [ARGS]...]...

  Configure and run the pipeline. To configure the pipeline, list the stages in the order that data should flow. The output of each stage will become the input for the
  next stage. For example, to read, classify and write to a file, the following stages could be used

  pipeline from-file --filename=my_dataset.json deserialize preprocess inf-triton --model_name=my_model
  --server_url=localhost:8001 filter --threshold=0.5 to-file --filename=classifications.json

  Pipelines must follow a few rules:
  1. Data must originate in a source stage. Current options are `from-file` or `from-kafka`
  2. A `deserialize` stage must be placed between the source stages and the rest of the pipeline
  3. Only one inference stage can be used. Zero is also fine
  4. The following stages must come after an inference stage: `add-class`, `filter`, `gen-viz`

Options:
  --model_seq_length INTEGER RANGE
                                  Limits the length of the sequence returned. If tokenized string is shorter than max_length, output will be padded with 0s. If the
                                  tokenized string is longer than max_length and do_truncate == False, there will be multiple returned sequences containing the overflowing
                                  token-ids. Default value is 256  [default: 256; x>=1]
  --label TEXT                    Specify output labels.
  --labels_file DATA FILE         Specifies a file to read labels from in order to convert class IDs into labels. A label file is a simple text file where each line
                                  corresponds to a label.Ignored when --label is specified  [default: data/labels_nlp.txt]
  --viz_file FILE                 Save a visualization of the pipeline at the specified location
  --help                          Show this message and exit.

Commands:
  add-class     Add detected classifications to each message.
  add-scores    Add probability scores to each message.
  buffer        (Deprecated) Buffer results.
  delay         (Deprecated) Delay results for a certain duration.
  deserialize   Deserialize source data into Dataframes.
  dropna        Drop null data entries from a DataFrame.
  filter        Filter message by a classification threshold.
  from-file     Load messages from a file.
  from-kafka    Load messages from a Kafka cluster.
  gen-viz       (Deprecated) Write out vizualization DataFrames.
  inf-identity  Perform inference for testing that performs a no-op.
  inf-pytorch   Perform inference with PyTorch.
  inf-triton    Perform inference with Triton Inference Server.
  mlflow-drift  Report model drift statistics to ML Flow.
  monitor       Display throughput numbers at a specific point in the pipeline.
  preprocess    Prepare NLP input DataFrames for inference.
  serialize     Include & exclude columns from messages.
  to-file       Write all messages to a file.
  to-kafka      Write all messages to a Kafka cluster.
  trigger       Buffer data until previous stage has completed.
  validate      Validate pipeline output for testing.
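
The `--model_seq_length` description above covers three cases: padding short inputs with 0s, truncating long ones, and splitting long ones into several sequences when truncation is off. The sketch below illustrates that behavior on plain lists of token IDs. It is a simplification rather than the tokenizer Morpheus uses: the function name is hypothetical, and overflow sequences are assumed not to overlap.

```python
from typing import List, Sequence


def to_fixed_length(token_ids: Sequence[int], max_length: int = 256,
                    do_truncate: bool = False, pad_id: int = 0) -> List[List[int]]:
    """Return one or more sequences of exactly max_length token IDs."""
    if max_length < 1:
        raise ValueError("max_length must be >= 1")
    ids = list(token_ids)
    if do_truncate:
        # Keep only the first max_length tokens.
        chunks = [ids[:max_length]]
    else:
        # Split the overflow into additional sequences (no overlap assumed).
        chunks = [ids[i:i + max_length]
                  for i in range(0, len(ids), max_length)] or [[]]
    # Pad every sequence on the right with pad_id.
    return [chunk + [pad_id] * (max_length - len(chunk)) for chunk in chunks]
```

With `max_length=3`, seven tokens become three sequences, the last of which is padded with two 0s.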

morpheus run pipeline-other

Usage: morpheus run pipeline-other [OPTIONS] COMMAND1 [ARGS]... [COMMAND2 [ARGS]...]...

  Configure and run the pipeline. To configure the pipeline, list the stages in the order that data should flow. The output of each stage will become the input for the
  next stage. For example, to read, classify and write to a file, the following stages could be used

  pipeline from-file --filename=my_dataset.json deserialize preprocess inf-triton --model_name=my_model
  --server_url=localhost:8001 filter --threshold=0.5 to-file --filename=classifications.json

  Pipelines must follow a few rules:
  1. Data must originate in a source stage. Current options are `from-file` or `from-kafka`
  2. A `deserialize` stage must be placed between the source stages and the rest of the pipeline
  3. Only one inference stage can be used. Zero is also fine
  4. The following stages must come after an inference stage: `add-class`, `filter`, `gen-viz`

Options:
  --model_fea_length INTEGER RANGE
                                  Number of features trained in the model  [default: 1; x>=1]
  --label TEXT                    Specify output labels. Ignored when --labels_file is specified
  --labels_file DATA FILE         Specifies a file to read labels from in order to convert class IDs into labels. A label file is a simple text file where each line
                                  corresponds to a label.
  --viz_file FILE                 Save a visualization of the pipeline at the specified location
  --help                          Show this message and exit.

Commands:
  add-class     Add detected classifications to each message.
  add-scores    Add probability scores to each message.
  buffer        (Deprecated) Buffer results.
  delay         (Deprecated) Delay results for a certain duration.
  deserialize   Deserialize source data into Dataframes.
  dropna        Drop null data entries from a DataFrame.
  filter        Filter message by a classification threshold.
  from-file     Load messages from a file.
  from-kafka    Load messages from a Kafka cluster.
  inf-identity  Perform inference for testing that performs a no-op.
  inf-pytorch   Perform inference with PyTorch.
  inf-triton    Perform inference with Triton Inference Server.
  mlflow-drift  Report model drift statistics to ML Flow.
  monitor       Display throughput numbers at a specific point in the pipeline.
  serialize     Include & exclude columns from messages.
  to-file       Write all messages to a file.
  to-kafka      Write all messages to a Kafka cluster.
  trigger       Buffer data until previous stage has completed.
  validate      Validate pipeline output for testing.
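
The four pipeline rules in the help text can be expressed as a small pre-flight check on a list of stage names. The sketch below is a hypothetical helper, not part of the Morpheus CLI, and it encodes some interpretations: only `from-file` and `from-kafka` count as sources, `deserialize` is expected immediately after the source, inference stages are recognized by the `inf-` prefix, and the stages named in rule 4 are flagged when no inference stage precedes them.

```python
from typing import List

SOURCE_STAGES = {"from-file", "from-kafka"}                  # rule 1
INFERENCE_PREFIX = "inf-"                                    # e.g. inf-triton
POST_INFERENCE_STAGES = {"add-class", "filter", "gen-viz"}   # rule 4


def check_pipeline(stages: List[str]) -> List[str]:
    """Return rule violations; an empty list means the order looks valid."""
    problems = []
    if not stages or stages[0] not in SOURCE_STAGES:
        problems.append("rule 1: pipeline must start with a source stage")
    if len(stages) < 2 or stages[1] != "deserialize":
        problems.append("rule 2: `deserialize` must follow the source stage")
    inference = [i for i, s in enumerate(stages) if s.startswith(INFERENCE_PREFIX)]
    if len(inference) > 1:
        problems.append("rule 3: at most one inference stage is allowed")
    first_inference = inference[0] if inference else None
    for i, stage in enumerate(stages):
        if stage in POST_INFERENCE_STAGES and (
                first_inference is None or i < first_inference):
            problems.append(f"rule 4: `{stage}` must come after an inference stage")
    return problems
```

For example, `check_pipeline(["from-file", "deserialize", "inf-triton", "add-class", "serialize", "to-file"])` returns an empty list, while moving `filter` ahead of `inf-triton` reports a rule 4 violation.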

morpheus tools

Usage: morpheus tools [OPTIONS] COMMAND [ARGS]...

Options:
  --help  Show this message and exit.

Commands:
  autocomplete  Utility for installing/updating/removing shell completion for Morpheus
  onnx-to-trt   Converts an ONNX model to a TRT engine

morpheus tools autocomplete

Usage: morpheus tools autocomplete [OPTIONS] COMMAND [ARGS]...

Options:
  --help  Show this message and exit.

Commands:
  install  Install the Morpheus shell command completion
  show     Show the Morpheus shell command completion code

morpheus tools onnx-to-trt

Usage: morpheus tools onnx-to-trt [OPTIONS]

Options:
  --input_model PATH              [required]
  --output_model PATH             [required]
  --batches <INTEGER INTEGER>...  [required]
  --seq_length INTEGER            [required]
  --max_workspace_size INTEGER    [default: 16000]
  --help                          Show this message and exit.

NOTE: The conversion tooling requires the separate installation of TensorRT 8.2.

Container Security

NVIDIA has observed automated vulnerability scanning tools falsely matching packages against National Vulnerability Database (NVD) security bulletins and GitHub Security Advisories (GHSA). This can happen because of package name collisions (e.g., Mamba Boa with GPG Boa, the Python Docker SDK with Docker core). NVIDIA is committed to providing the highest quality software distribution to our customers. The containers are purpose-built for Morpheus use cases, have several dependencies, and are not intended for general-purpose use such as web hosting.

In this release, we note the following vulnerabilities:

  • CVE-2018-20225: A disputed vulnerability in pip which is undergoing reanalysis at NIST.
  • GHSA-r9hx-vwmv-q579: The setuptools dependency is pinned due to Python, pip, and conda dependencies. This will be fixed in a future release.
  • GHSA-ffw3-6378-cqgp+mlflow-2.5.0: An OS command injection CVE addressed in MLflow 2.6.0. Morpheus will upgrade in the next release.
  • CVE-2023-36632+python-3.11.4: A recursion exploit that has been disputed and rejected by the CPython upstream.

License

Morpheus is distributed as open source software under the Apache Software License 2.0.

NVIDIA AI Enterprise

NVIDIA AI Enterprise provides global support for NVIDIA AI software. For more information on NVIDIA AI Enterprise please consult this overview and the NVIDIA AI Enterprise End User License Agreement.