NVIDIA Morpheus is an open AI application framework that provides cybersecurity developers with a highly optimized AI pipeline and pre-trained AI capabilities that, for the first time, allow them to instantaneously inspect all IP traffic across their data center fabric. Bringing a new level of security to data centers, Morpheus provides dynamic protection, real-time telemetry, adaptive policies, and cyber defenses for detecting and remediating cybersecurity threats.
The Morpheus SDK container is packaged as a Kubernetes (k8s) deployment using a Helm chart. NVIDIA provides installation instructions for the NVIDIA Cloud Native Stack, which covers the setup of these platforms and tools. Morpheus and its use of the Triton Inference Server are designed to work with any GPU with CUDA Compute Capability greater than 6.0, including Hopper and Ampere models.
NGC API Key
First, set up your NGC API key to access the Morpheus components, following the instructions in the NGC Registry CLI User Guide. Once you have created the key, store it in an environment variable for use by the commands later in these instructions:
export API_KEY="<your key>"
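Because the later helm commands read the key from the environment, it can help to fail early when the variable is missing. The following is a minimal sketch; the function name require_api_key is hypothetical and not part of any NVIDIA tooling.

```shell
# Hypothetical helper: succeed only if API_KEY is set and non-empty.
require_api_key() {
  if [ -z "${API_KEY:-}" ]; then
    echo "API_KEY is not set; export it before running the helm commands" >&2
    return 1
  fi
  return 0
}
```

One way to use it is to chain it in front of a command that needs the key, for example `require_api_key && helm fetch ...`.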
After installing the Cloud Native Stack, install and configure the NGC Registry CLI using the instructions from the NGC Registry CLI User Guide.
Create Namespace for Morpheus
Create a namespace, and an environment variable holding its name, to organize the k8s cluster deployed via the Cloud Native Stack and to logically separate Morpheus-related deployments from other projects, using the following commands:
kubectl create namespace <some name>
export NAMESPACE="<some name>"
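Kubernetes only accepts namespace names that are valid DNS labels: at most 63 characters, lowercase alphanumerics or '-', starting and ending with an alphanumeric. As a sketch, a name could be checked before it is passed to kubectl; the helper name is_valid_namespace is hypothetical.

```shell
# Hypothetical helper: return 0 if the argument is a valid Kubernetes namespace name.
is_valid_namespace() {
  name="$1"
  # At most 63 characters.
  [ "${#name}" -le 63 ] || return 1
  # Lowercase alphanumerics or '-', starting and ending with an alphanumeric.
  printf '%s\n' "$name" | grep -Eq '^[a-z0-9]([-a-z0-9]*[a-z0-9])?$'
}
```

For example, `is_valid_namespace "$NAMESPACE" && kubectl create namespace "$NAMESPACE"` would skip the kubectl call for a name such as `Morpheus_SDK`.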
Install the Morpheus SDK CLI
helm fetch https://helm.ngc.nvidia.com/nvidia/morpheus/charts/morpheus-sdk-client-23.11.tgz --username='$oauthtoken' --password=$API_KEY --untar
helm install --set ngc.apiKey="$API_KEY" \
--set sdk.args="/bin/sleep infinity" \
--namespace $NAMESPACE \
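Helm's general form is `helm install [NAME] [CHART] [flags]`, so an install command needs a release name and a chart reference in addition to the flags shown above. The sketch below only prints a full command line rather than running it. Both defaults are assumptions for illustration: the release name morpheus-sdk is hypothetical, and morpheus-sdk-client is assumed to be the directory that `helm fetch --untar` creates.

```shell
# Print (do not run) a complete helm install command line.
# The default release name and chart directory are assumptions, not values from the chart docs.
build_sdk_install_cmd() {
  release_name="${1:-morpheus-sdk}"        # hypothetical release name
  chart_dir="${2:-morpheus-sdk-client}"    # assumed directory from `helm fetch --untar`
  printf 'helm install --set ngc.apiKey="$API_KEY" --set sdk.args="%s" --namespace "$NAMESPACE" %s %s\n' \
    "/bin/sleep infinity" "$release_name" "$chart_dir"
}
```

Calling `build_sdk_install_cmd` with no arguments prints the command with the assumed defaults, which can then be reviewed and adjusted before it is run.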
Chart values explained
Various fields required to access the container images from NGC. For the public catalog images, it should be sufficient to just specify the provided username and your API_KEY.
The identity of the public catalog Morpheus SDK client image, which can be overridden for other registry locations. As noted in the AI Engine chart overview, the withEngine flag specifies node affinity with the ai-engine deployment, which gives the best performance because the SDK pod and the Triton pod can share the CUDA device pool memory of the GPU. By default, the pod's args put it to sleep, so a user can run 'kubectl exec -it <sdk pod name> -- bash' to run Morpheus pipelines interactively or simply inspect the contents. The resources spec is empty by default because the SDK should typically be placed on the same GPU-enabled node as Triton. However, some CSPs, specifically GKE, will not load the required NVIDIA tools and libraries unless there is an explicit resource request for a GPU. GPU time-sharing must be enabled in GKE for this to work, with a minimum of two GPU clients.
args: "/bin/sleep infinity"
# args: "morpheus --help"
# args: "morpheus run pipeline-nlp ..."
# nvidia.com/gpu: 1
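Instead of passing each value with --set, the args could be kept in a small override file. This is a minimal sketch: the file name sdk-overrides.yaml is hypothetical, and sdk.args is the only key used because it is the one the install command sets via `--set sdk.args`.

```shell
# Write a hypothetical values override file for the chart.
# Only sdk.args is set here; it mirrors the --set sdk.args flag used at install time.
cat > sdk-overrides.yaml <<'EOF'
sdk:
  # Replace the default "/bin/sleep infinity" with a non-interactive command.
  args: "morpheus --help"
EOF
```

The file can then be supplied to helm install with Helm's standard `-f sdk-overrides.yaml` (or `--values`) flag.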
The SDK image incorporates Jupyter notebooks for some of the examples such as Production Digital Fingerprint, so a NodePort is opened for access.
A local host path which can be used by the charts for sharing models and datasets.
The imagePullPolicy determines whether the container runtime should retrieve an image from a registry to create the pod. Use 'Always' for development.
The SDK pod should normally run to completion and not be restarted. This setting supports the non-interactive use case, where a user wants the pipeline to run (possibly) indefinitely.
Image pull secrets provide the properly formatted credentials for accessing the container images from NGC. They essentially encode the provided API_KEY. Note that Fleet Command deployments create these secrets automatically based on the FC org, named literally 'imagepullsecret'.
- name: nvidia-registrykey-secret
# - name: imagepullsecret
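To illustrate what encoding the key means here: in a standard Kubernetes registry secret (type kubernetes.io/dockerconfigjson), the auth field is the base64 encoding of "username:password". The sketch below computes that value for NGC, where the user name is the literal string $oauthtoken. How the chart itself builds its secret is not shown in this document, so treat this as an assumption about the standard format only; the function name ngc_registry_auth is hypothetical.

```shell
# Hypothetical helper: compute the base64 "auth" value for an NGC registry credential.
# $1 is the NGC API key; the user name is the literal string $oauthtoken.
ngc_registry_auth() {
  # Single quotes keep $oauthtoken from being expanded by the shell.
  printf '%s:%s' '$oauthtoken' "$1" | base64 | tr -d '\n'
}
```

For a manually managed secret, `kubectl create secret docker-registry` with `--docker-server`, `--docker-username='$oauthtoken'`, and `--docker-password="$API_KEY"` produces the same kind of credential without hand-encoding.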
When deploying to OpenShift, a ServiceAccount must be created for attaching permissions, such as the use of hostPath volumes.
General flag for OpenShift adjustments.
Deployments in CSP environments such as AWS EC2 require a Load Balancer ingress.
Use a nodeSelector with OpenShift for GPU affinity. The default is nil for use cases without the GPU Operator or NFD. Also, refer to the AI Engine chart deployment.