What is the PACC-GPU?
The Persistent Application Client Container (PACC) with GPU support is a GPU-enabled PACC image. MapR PACC supports containerization of existing and new applications by providing containers with persistent data access from anywhere. This enables building ML/AI applications by providing access to data outside of the container. Applications running in containers are no longer transient and can run anywhere MapR is running: on-premises, in the cloud, or even at the edge.
Containers have no scalable, out-of-the-box way to let applications persist state. All data written by containerized applications is typically lost when there is an application or hardware failure. Most options seeking to address this problem have proven inadequate for a variety of reasons. Storing data on local nodes, for example, forces IT departments to track down the data and move files when containers are redeployed. In addition, commercially available NAS/SAN-based solutions are not cost-effective and introduce unnecessary overhead.
With the introduction of MapR Persistent Application Client Containers (PACCs), containerized applications can easily leverage all the MapR platform services (Distributed File and Object Storage, NoSQL JSON Database, Event Store for Apache Kafka) as a persistent data store.
- Pre-built, certified container image. A Docker image is provided for connecting to all converged data platform services, including MapR XD Distributed File and Object Store, MapR Database, and MapR Event Store for Apache Kafka. The image is streamlined to include all the necessary bits - no more, no less - required to leverage MapR as a persistent data store for your containerized applications.
- Flexibility in Deployment. Application containers that leverage MapR PACCs can be deployed on MapR nodes as well as remote client nodes, including nodes in the cloud.
- Secure authentication at container level, allowing for secure connections. MapR PACCs allow for authentication at a container level to ensure containerized applications only have access to data for which they are authorized. Communications are encrypted to ensure privacy when accessing data in MapR.
Containerizing an Existing Application
Applications store state in the MapR Converged Data Platform, ensuring resilience and availability across application and infrastructure failures. In this use case:
- Containers can survive application or hardware failures by accessing persisted data upon restart.
- Containers can move across nodes or even data centers (including cloud environments), yet still retain access to persisted data.
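The restart-survival pattern above can be sketched as a checkpoint/resume loop. This is a minimal illustration, not MapR API code: in a real deployment the state file would live under the container's MapR FUSE mount (`MAPR_MOUNT_PATH`); `/tmp` is used as a stand-in so the sketch runs anywhere, and all file and field names are illustrative.

```python
import json
import os

# Illustrative: in a PACC deployment this path would sit on the MapR FUSE
# mount, so it outlives the container; /tmp is a local stand-in.
STATE_PATH = os.environ.get("MAPR_MOUNT_PATH", "/tmp") + "/app_state.json"

def load_state():
    # On (re)start, resume from the last checkpoint if one exists.
    if os.path.exists(STATE_PATH):
        with open(STATE_PATH) as f:
            return json.load(f)
    return {"processed": 0}

def checkpoint(state):
    # Persist progress outside the container so a crash loses nothing.
    with open(STATE_PATH, "w") as f:
        json.dump(state, f)

state = load_state()
for _ in range(10):
    state["processed"] += 1  # stand-in for real work
checkpoint(state)
print(state["processed"])
```

Because the checkpoint lives on shared storage rather than the container's writable layer, a restarted container (even on another node) picks up where the previous one left off.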
Shared Storage for Applications and Analytics
Here, containerized applications persist data to a secure, shared location (the MapR Converged Data Platform), where other applications can access and make use of it. For instance, containerized operational applications can write data to MapR, where analytical applications leveraging Apache Drill, Spark, etc. can then make use of it.
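The write-once, analyze-anywhere pattern can be sketched with plain file I/O, since the MapR mount appears to applications as an ordinary filesystem. This is an illustration, not MapR API code: the `events.csv` layout is invented, and `/tmp` stands in for the FUSE mount (`MAPR_MOUNT_PATH`) so the sketch runs anywhere.

```python
import csv
import os

# Illustrative: in a PACC deployment this directory would be the MapR FUSE
# mount shared by operational and analytical workloads.
mount = os.environ.get("MAPR_MOUNT_PATH", "/tmp")
events_file = os.path.join(mount, "events.csv")

# Operational side: a containerized app appends an event record.
with open(events_file, "a", newline="") as f:
    csv.writer(f).writerow(["2019-01-01T00:00:00Z", "login", "user42"])

# Analytics side: a separate job (here plain Python, but equally Drill or
# Spark) reads every record back through the same shared path.
with open(events_file, newline="") as f:
    events = list(csv.reader(f))

print(events[-1][1])  # the event type of the record just written: "login"
```

No copy or export step sits between the two sides; both simply address the same path on the shared platform.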
Microservice applications leverage the complete suite of reliable, scalable data services that only the MapR Converged Data Platform can provide. By leveraging client pathways to MapR Event Store and MapR Database, for example, microservices can immediately gain advantages including statefulness and efficient inter-microservice communication, as well as the simplified deployment model that containers provide.
Running the GPU Based PACC Container
To start the PACC container with GPU support, first pull the image:

`docker pull maprtech/pacc-gpu:cuda10-6.1.0_6.0.0_ubuntu16`

This Docker image is a base image that provides both GPU support and connectivity to the MapR Data Platform.
To make the most of this container, build your solution on top of it: in your own Dockerfile, use it as the base image (FROM maprtech/pacc-gpu:cuda10-6.1.0_6.0.0_ubuntu16).
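As an example, a minimal Dockerfile layering an application onto the base image might look like the following sketch. The application files, the entry point, and the assumption that Python and pip are present in the image are all illustrative:

```dockerfile
# Build on the GPU-enabled PACC base image.
FROM maprtech/pacc-gpu:cuda10-6.1.0_6.0.0_ubuntu16

# Illustrative: copy the application and install its dependencies
# (assumes Python and pip are available in the base image).
COPY requirements.txt /app/requirements.txt
RUN pip install -r /app/requirements.txt
COPY . /app
WORKDIR /app

# Illustrative entry point for a GPU workload.
CMD ["python", "train.py"]
```

Layering your application this way keeps the MapR connectivity and GPU bits from the base image intact while your own build steps stay in your Dockerfile.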
To run the container, follow these steps:
- Select the PACC container image you want to run.
- Run the docker pull command to pull the image and confirm that the image is pulled successfully.
- The PACC container with GPU support can be run against either a secure or a nonsecure MapR cluster.
- Run the docker command with the NVIDIA required parameters and MapR required parameters.
Here is the general form of the command to run the container, passing the MapR settings below as environment variables (-e NAME=value) along with the required volume and device options:

nvidia-docker run -it <options> <pacc-gpu docker image>

The parameters are:
- MAPR_CLUSTER: Name of the MapR cluster
- MAPR_CLDB_HOSTS: CLDB host IP addresses separated by a comma
- MAPR_MOUNT_PATH: The path to the FUSE mount point in the container
- MAPR_CONTAINER_USER: The user that the application inside the Docker container will run as
- MAPR_CONTAINER_UID: The UID that the application inside the Docker container will run as
- MAPR_CONTAINER_GROUP: The group that the application inside the Docker container will run as
- MAPR_CONTAINER_GID: The GID that the application inside the Docker container will run as
- MAPR_TICKETFILE_LOCATION: The location inside the container where the ticket file resides (For secure cluster only)
- -v /tmp/user1_ticket_500:/tmp/maprticket_500:ro: The location of the ticket on the host, and the desired location of the ticket file in the container (For secure cluster only)
- --device /dev/fuse: A parameter that is required to mount the FUSE device
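Putting the parameters above together, a complete invocation against a secure cluster might look like the following. The cluster name, CLDB addresses, mount path, user name, and UID/GID values are illustrative; the ticket paths follow the example given above:

```
nvidia-docker run -it \
  -e MAPR_CLUSTER=my.cluster.com \
  -e MAPR_CLDB_HOSTS=10.10.1.10,10.10.1.11 \
  -e MAPR_MOUNT_PATH=/mapr \
  -e MAPR_CONTAINER_USER=user1 \
  -e MAPR_CONTAINER_UID=500 \
  -e MAPR_CONTAINER_GROUP=users \
  -e MAPR_CONTAINER_GID=500 \
  -e MAPR_TICKETFILE_LOCATION=/tmp/maprticket_500 \
  -v /tmp/user1_ticket_500:/tmp/maprticket_500:ro \
  --device /dev/fuse \
  maprtech/pacc-gpu:cuda10-6.1.0_6.0.0_ubuntu16
```

For a nonsecure cluster, omit the MAPR_TICKETFILE_LOCATION variable and the ticket volume mount.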
For further reference on the PACC, please see PACC Documentation.
If you do not already have MapR in your environment, you can get started by [downloading the MapR Community Edition](https://mapr.com/download/).