NVIDIA AI Enterprise includes the RAPIDS Accelerator for Apache Spark, which leverages GPUs to accelerate processing via the RAPIDS libraries. The RAPIDS Accelerator provides transparent acceleration of Spark jobs on NVIDIA GPUs through a plugin that integrates with Spark’s query planner. The plugin intercepts operations that can be accelerated by GPUs, so no code changes are required. Operations that cannot be accelerated continue to run on the CPU.

Important areas of benefit when using the RAPIDS Accelerator are:

No Code Changes Required: Transparent GPU acceleration with a plugin that works on all major Apache Spark platforms, including Google Cloud Dataproc, Amazon EMR, and Databricks.

Full Stack Acceleration: Run existing Apache Spark 3.x jobs up to 5x faster than equivalent CPU-only systems.

Enterprise Support: Mission-critical support, bug fixes, and professional services available through NVIDIA AI Enterprise.
RAPIDS Accelerator for Apache Spark is exclusively available with NVIDIA AI Enterprise. Before you start, ensure that your environment is set up by following one of the deployment guides available in the NVIDIA AI Enterprise Documentation. The RAPIDS Accelerator is delivered as a jar file that is deployed on each Spark worker node.

The Accelerated Spark stack consists of three main components, each playing a role in enabling Spark users to accelerate their ETL, DL, or ML applications.

Spark 3.0 Core Engine
Spark 3.0 core provides two critical capabilities, GPU scheduling and columnar processing, that enable the RAPIDS Accelerator to execute Spark operations on the GPU. The plugin supports SQL and DataFrame operations, which are commonly used for data processing.

RAPIDS Software
The second component is the RAPIDS Software, an open-source collection of libraries that aims to democratize data science on GPUs.

NVIDIA GPU Accelerated Infrastructure
NVIDIA AI Enterprise includes support for running the RAPIDS Accelerator for Apache Spark on three leading Spark platforms:
Google Cloud Dataproc
Databricks (Azure & AWS)
Amazon EMR
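As a rough illustration of how the pieces above fit together (a jar on the workers, a plugin hooked into the query planner, and Spark 3 GPU scheduling), the following Python sketch assembles a `spark-submit` command line. The property names (`spark.plugins`, `spark.rapids.sql.enabled`, `spark.rapids.sql.explain`, and the Spark `gpu.amount` resource settings) are standard Spark and RAPIDS Accelerator settings, but the jar path, the application name `etl_job.py`, the helper function names, and the GPU amounts are assumptions chosen for illustration. The platform-specific deployment guides may configure these differently.

```python
import shlex

# Hypothetical jar location; substitute the path used on your worker nodes.
RAPIDS_JAR = "/opt/sparkRapidsPlugin/rapids-4-spark.jar"


def rapids_spark_conf(gpus_per_executor=1, tasks_per_gpu=4):
    """Return Spark properties that load the RAPIDS Accelerator plugin.

    The amounts are illustrative defaults, not tuned recommendations.
    """
    return {
        # Load the plugin so it can hook into Spark's query planner.
        "spark.plugins": "com.nvidia.spark.SQLPlugin",
        # Enable GPU execution of supported SQL/DataFrame operations.
        "spark.rapids.sql.enabled": "true",
        # Spark 3 GPU scheduling: GPUs per executor and GPU share per task.
        "spark.executor.resource.gpu.amount": str(gpus_per_executor),
        "spark.task.resource.gpu.amount": str(1 / tasks_per_gpu),
        # Log the operations that stay on the CPU.
        "spark.rapids.sql.explain": "NOT_ON_GPU",
    }


def spark_submit_command(app, conf, jar=RAPIDS_JAR):
    """Build the spark-submit argument list for an unmodified Spark job."""
    argv = ["spark-submit", "--jars", jar]
    for key, value in conf.items():
        argv += ["--conf", f"{key}={value}"]
    argv.append(app)
    return argv


if __name__ == "__main__":
    # etl_job.py is a placeholder for an existing, unchanged Spark application.
    print(shlex.join(spark_submit_command("etl_job.py", rapids_spark_conf())))
```

The application itself is untouched in this sketch; only the submission settings change, which matches the "no code changes" claim above. The `spark.rapids.sql.explain` setting is included because it is a convenient way to see which operations fell back to the CPU.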
For an overview of the features included in the RAPIDS Accelerator for Apache Spark Production Branch - May 2024 (24h1), refer to the [Release Notes for RAPIDS Accelerator for Apache Spark].
https://nvidia.github.io/spark-rapids/docs/archive.html#release-v24020
For more information about RAPIDS Accelerator for Apache Spark, see:
https://docs.nvidia.com/spark-rapids/user-guide/24.02/index.html
For optimal performance, it is highly recommended to deploy the supported NVIDIA AI Enterprise Infrastructure software in conjunction with your AI software.
Production Branch - May 2024 (24h1) is compatible with NVIDIA AI Enterprise Infrastructure 4.0, 4.1, and 5.
Get access to knowledge base articles and support cases or submit a ticket.
Visit the NVIDIA AI Enterprise Documentation Hub for release documentation, deployment guides and more.
Go to the NVIDIA Licensing Portal to manage your software licenses.