Architecture: Linux / amd64
The DFP Pipeline container image contains a compiled Morpheus pipeline designed to perform DFP analysis on Azure AD logs streamed in via Kafka. This container image is part of the Digital Fingerprinting AI Workflow.
This image can be run in one of two modes: Inference or Training.
In Inference mode, the pipeline listens to a Kafka topic, preprocesses the incoming stream of data, calculates the risk score, and then publishes the results to an Elasticsearch index. This is typically started with the following arguments:
--tracking-uri=http://MLFLOW-INSTANCE --num-threads=10 --prometheus-port=8080 inference --kafka-bootstrap=BOOTSTRAP:PORT --kafka-input-topic=TOPIC --elastic-host=ELASTIC_HOSTNAME --elastic-port=PORT --elastic-user USERNAME --elastic-password PASSWORD --elastic-cacrt /etc/ca.crt --elastic-https
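As a minimal sketch, an inference run could be launched with Docker as shown below. DFP_PIPELINE_IMAGE is a placeholder for however the image is tagged after it has been pulled, and the bind mount for the Elasticsearch CA certificate is an assumption about how the certificate reaches the path used by --elastic-cacrt:

# Expose the Prometheus metrics port and mount the Elasticsearch CA certificate
docker run --rm \
  -p 8080:8080 \
  -v /path/to/ca.crt:/etc/ca.crt:ro \
  DFP_PIPELINE_IMAGE \
  --tracking-uri=http://MLFLOW-INSTANCE --num-threads=10 --prometheus-port=8080 \
  inference --kafka-bootstrap=BOOTSTRAP:PORT --kafka-input-topic=TOPIC \
  --elastic-host=ELASTIC_HOSTNAME --elastic-port=PORT --elastic-user USERNAME \
  --elastic-password PASSWORD --elastic-cacrt /etc/ca.crt --elastic-https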
Training mode is intended to be run as a cron job, but can also be run manually. It loads training data in JSON format from a URL, creates a new model for every user in the dataset, and then publishes each model to an existing MLflow instance. The data used for training is then saved to an S3-compatible object storage bucket. This is usually started with the following arguments:
--tracking-uri=http://MLFLOW-INSTANCE --num-threads=10 --prometheus-port=8080 load-data-then-train --bucket-name=S3_BUCKET --training-endpoint-url=http://SERVER/SAMPLE_DATA.json --aws-endpoint-url=YOUR_S3_SERVER
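As a minimal sketch of scheduling training with cron, the entry below again uses the DFP_PIPELINE_IMAGE placeholder and assumes the container reads S3 credentials from the standard AWS_ACCESS_KEY_ID / AWS_SECRET_ACCESS_KEY environment variables; neither the image tag nor the credential mechanism is specified here:

# Retrain all user models every Sunday at 02:00
0 2 * * 0 docker run --rm -e AWS_ACCESS_KEY_ID=YOUR_KEY -e AWS_SECRET_ACCESS_KEY=YOUR_SECRET DFP_PIPELINE_IMAGE --tracking-uri=http://MLFLOW-INSTANCE --num-threads=10 --prometheus-port=8080 load-data-then-train --bucket-name=S3_BUCKET --training-endpoint-url=http://SERVER/SAMPLE_DATA.json --aws-endpoint-url=YOUR_S3_SERVER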