Scientific computing has traditionally required the highest performance, yet domain experts have largely moved to slower dynamic languages for daily work. We believe there are many good reasons to prefer dynamic languages for these applications, and we do not expect their use to diminish. Fortunately, modern language design and compiler techniques make it possible to mostly eliminate the performance trade-off and provide a single environment productive enough for prototyping and efficient enough for deploying performance-intensive applications. The Julia programming language fills this role: it is a flexible dynamic language, appropriate for scientific and numerical computing, with performance comparable to traditional statically-typed languages. The main homepage for Julia can be found at julialang.org.
See here for a document describing prerequisites and setup steps for all HPC containers and instructions for pulling NGC containers.
Julia is free and open-source software, distributed under the MIT license.
Before running the Julia container, please ensure that your system meets the following requirements:
By default, Julia will automatically choose among CUDA Toolkit versions 9.2, 10.0, or 10.1/10.2 based on your installed driver.
NGC provides access to Julia containers targeting the following NVIDIA GPU architectures.
The Julia package ecosystem contains quite a few GPU-related packages and wrapper libraries targeting different levels of abstraction. The packages below are precompiled in the container to give users easy access to NVIDIA's highly parallel GPUs for accelerated computing.
The CUDA.jl package is the main programming interface for working with NVIDIA CUDA GPUs using Julia. It features a user-friendly array abstraction, a compiler for writing CUDA kernels in Julia, and wrappers for various CUDA libraries.
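As a brief illustration of these two levels of abstraction, the sketch below (an illustrative example, not one of the container's bundled scripts) uses the CuArray type for high-level array operations and lets a standard matrix-vector product dispatch to the CUBLAS wrappers:

```julia
using CUDA

# Array abstraction: operations on CuArrays execute on the GPU.
a = CUDA.rand(Float32, 1024)
b = CUDA.rand(Float32, 1024)
c = a .+ 2f0 .* b                # the broadcast fuses into a single GPU kernel

# Library wrappers: linear algebra on CuArrays dispatches to CUBLAS.
A = CUDA.rand(Float32, 512, 512)
x = CUDA.rand(Float32, 512)
y = A * x                        # matrix-vector product runs on the GPU

println(typeof(y))               # a CuArray, i.e. the result stays in GPU memory
```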
We have included example scripts in the container's /workspace/examples
directory for testing the GPU-accelerated CUDA packages when invoking the container without entering REPL mode.
test.jl: checks all CUDA-related components
vadd.jl: sums two vectors of random numbers; produces no output
versioninfo.jl: prints information about the installed Julia-related packages
julia: the primary Julia executable
An example command is:
julia /workspace/examples/test.jl
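For reference, a script like the bundled vadd.jl can be quite small. The sketch below is a hedged reconstruction (not the container's exact source) that writes the vector sum as a CUDA kernel in Julia and, matching the script's described behavior, prints nothing on success:

```julia
using CUDA

# CUDA kernel written in Julia: each thread adds one pair of elements.
function vadd(a, b, c)
    i = (blockIdx().x - 1) * blockDim().x + threadIdx().x
    c[i] = a[i] + b[i]
    return
end

a = rand(Float32, 1024)
b = rand(Float32, 1024)
d_a = CuArray(a)                 # copy inputs to GPU memory
d_b = CuArray(b)
d_c = similar(d_a)

# 4 blocks x 256 threads covers all 1024 elements exactly.
@cuda threads=256 blocks=4 vadd(d_a, d_b, d_c)

@assert Array(d_c) ≈ a .+ b      # silent when the GPU result matches the CPU sum
```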
The following examples demonstrate how to run the NGC Julia container under the supported runtimes.
Set up and invoke the Julia container via one of the methods listed below.
Use an nvidia-docker or docker run command to test a GPU-accelerated CUDA package using the built-in example scripts. The example output below shows the CUDA package resolving required package versions and dependencies, then printing a summary of multiple tests:
┌ Info: System information:
│ CUDA toolkit 10.2.89, local installation
│ CUDA driver 10.2.0
│ NVIDIA driver 440.33.1
│
│ Libraries:
│ - CUBLAS: 10.2.2
│ - CURAND: 10.1.2
│ - CUFFT: 10.1.2
│ - CUSOLVER: 10.3.0
│ - CUSPARSE: 10.3.1
│ - CUPTI: 12.0.0
│ - NVML: 10.0.0+440.33.1
│ - CUDNN: 7.60.5 (for CUDA 10.2.0)
│ - CUTENSOR: missing
│
│ Toolchain:
│ - Julia: 1.5.0
│ - LLVM: 9.0.1
│ - PTX ISA support: 3.2, 4.0, 4.1, 4.2, 4.3, 5.0, 6.0, 6.1, 6.3, 6.4
│ - Device support: sm_30, sm_32, sm_35, sm_37, sm_50, sm_52, sm_53, sm_60, sm_61, sm_62, sm_70, sm_72, sm_75
│
│ Environment:
│ - JULIA_CUDA_USE_BINARYBUILDER: false
│
│ 4 devices:
│ 0: Tesla V100-PCIE-16GB (sm_70, 15.770 GiB / 15.782 GiB available)
│ 1: Tesla V100-PCIE-16GB (sm_70, 15.770 GiB / 15.782 GiB available)
│ 2: Tesla V100-PCIE-16GB (sm_70, 15.770 GiB / 15.782 GiB available)
└ 3: Tesla V100-PCIE-16GB (sm_70, 15.770 GiB / 15.782 GiB available)
[ Info: Testing using 1 device(s): 4. Tesla V100-PCIE-16GB (UUID a62c271a-dd48-9769-991a-cd5442ba5110)
[ Info: Skipping the following tests: cutensor
The following command will start the container with all GPUs enabled and run multiple GPU-accelerated tests without entering REPL mode, using nvidia-docker or docker:
$ docker run -it --rm --gpus all nvcr.io/hpc/julia:[app_tag] julia /workspace/examples/test.jl
Additional information about the test scripts can be found here
The following command will launch a full-featured interactive command-line REPL (read-eval-print loop) in the Julia container using nvidia-docker or docker:
$ nvidia-docker run -it --rm --gpus all nvcr.io/hpc/julia:[app_tag]
Where:
-it: start the container with an interactive terminal (short for --interactive --tty)
--rm: make the container ephemeral (removes the container on exit)
--gpus: the NVIDIA runtime is integrated with the Docker CLI, so GPUs can be accessed seamlessly by the container via Docker CLI options
The REPL has four main modes of operation. The first and most common is the Julia prompt. It is the default mode of operation; each new line initially starts with julia>. Here you can enter Julia expressions. Pressing return or enter after a complete expression evaluates the entry and shows the result of the last expression:
               _
   _       _ _(_)_     |  Documentation: https://docs.julialang.org
  (_)     | (_) (_)    |
   _ _   _| |_  __ _   |  Type "?" for help, "]?" for Pkg help.
  | | | | | | |/ _` |  |
  | | |_| | | | (_| |  |  Version 1.5.0 (2020-08-01)
 _/ |\__'_|_|_|\__'_|  |  Official https://julialang.org/ release
|__/                   |
julia>
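An illustrative exchange at this prompt (not output captured from the container) looks like:

```julia
julia> x = 10; x^2 + 1
101
```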
Press the ] key to enter package mode, then type:
(@v1.5) pkg> test CUDA
Press the ; key to enter shell mode, then run the NVIDIA System Management Interface command-line utility to monitor CUDA, graphics drivers, and GPU device information by typing:
shell> nvidia-smi
Thu Sep 24 19:33:14 2020
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 440.33.01 Driver Version: 440.33.01 CUDA Version: 10.2 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
|===============================+======================+======================|
| 0 Tesla V100-PCIE... On | 00000000:08:00.0 Off | Off |
| N/A 34C P0 27W / 250W | 12MiB / 16160MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
| 1 Tesla V100-PCIE... On | 00000000:09:00.0 Off | Off |
| N/A 35C P0 26W / 250W | 12MiB / 16160MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
| 2 Tesla V100-PCIE... On | 00000000:88:00.0 Off | Off |
| N/A 32C P0 25W / 250W | 12MiB / 16160MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
| 3 Tesla V100-PCIE... On | 00000000:89:00.0 Off | Off |
| N/A 38C P0 38W / 250W | 421MiB / 16160MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: GPU Memory |
| GPU PID Type Process name Usage |
|=============================================================================|
+-----------------------------------------------------------------------------+
When the cursor is at the beginning of the line, the prompt can be changed to help mode by typing ?. Julia will attempt to print help or documentation for anything entered in help mode.
In all of the above modes, the executed lines get saved to a history file, which can be searched. To initiate an incremental search through the previous history, type ^R – the control key together with the r key. The prompt will change to (reverse-i-search)`':, and as you type the search query will appear in the quotes. The most recent result that matches the query will dynamically update to the right of the colon as more is typed. To find an older result using the same query, simply type ^R again.
Just as ^R is a reverse search, ^S is a forward search, with the prompt (i-search)`':. The two may be used in conjunction with each other to move through the previous or next matching results, respectively.
For further instructions on how to navigate in REPL mode, see Julia's documentation.
Save the NGC Julia container as a local Singularity image file:
$ singularity build julia_v1.5.0.simg docker://nvcr.io/hpc/julia:[app_tag]
The Singularity image is now saved in the current working directory as julia_v1.5.0.simg.
To pull NGC images with singularity version 2.x and earlier, NGC container registry authentication credentials are required.
To set your NGC container registry authentication credentials:
$ export SINGULARITY_DOCKER_USERNAME='$oauthtoken'
$ export SINGULARITY_DOCKER_PASSWORD=
More information describing how to obtain and use your NVIDIA NGC Cloud Services API key can be found here.
LD_LIBRARY_PATH (Singularity containers only): set this environment variable to CUDA's compat library before running the container when the host machine has an NVIDIA 418.xx graphics driver and the container's CUDA version is 10 or newer. Add the command below as a prefix to the singularity run command.
LD_LIBRARY_PATH=/usr/local/cuda/compat:$LD_LIBRARY_PATH singularity run
The Julia container will attempt to precompile packages into files and save history logs inside the container's /data
directory at runtime. Unlike Docker containers, which allow root access, Singularity containers will produce permission-denied errors when writing there. The workaround is to make a new directory on the host machine and bind mount it into the container's /data
directory.
$ mkdir data
$ singularity run -B $(pwd)/data:/data julia_v1.5.0.simg
Where:
-B: a user-bind path specification
$ singularity run --nv -B $(pwd)/data:/data julia_v1.5.0.simg /workspace/examples/test_cudadrv.jl
Where:
--nv: expose the host GPU(s) to the container
-B: a user-bind path specification
This example script loads the CUDAdrv
package, then runs multiple tests.
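A driver-level check of the kind such a script performs might look like the sketch below (a hedged illustration using CUDA.jl's driver-API wrappers; the container's actual script may differ):

```julia
using CUDA

# Enumerate GPUs through the CUDA driver API wrappers.
for dev in devices()
    println(name(dev), " (sm_", capability(dev).major, capability(dev).minor,
            ", ", round(totalmem(dev) / 2^30; digits = 2), " GiB)")
end

# Report the CUDA version seen by the driver library.
println("CUDA version: ", CUDA.version())
```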
Example of successful Julia output:
Testing CUDA
Downloading artifact: CompilerSupportLibraries
Downloading artifact: FFTW
Downloading artifact: OpenSpecFun
Downloading artifact: IntelOpenMP
Status `/tmp/jl_EmZmjf/Project.toml`
[621f4979] AbstractFFTs v0.5.0
[79e6a3ab] Adapt v2.1.0
┌ Info: System information:
│ CUDA toolkit 10.2.89, local installation
│ CUDA driver 10.2.0
│ NVIDIA driver 440.33.1
│
│ Libraries:
│ - CUBLAS: 10.2.2
│ - CURAND: 10.1.2
│ - CUFFT: 10.1.2
│ - CUSOLVER: 10.3.0
│ - CUSPARSE: 10.3.1
│ - CUPTI: 12.0.0
│ - NVML: 10.0.0+440.33.1
│ - CUDNN: 7.60.5 (for CUDA 10.2.0)
│ - CUTENSOR: missing
│
│ Toolchain:
│ - Julia: 1.5.0
│ - LLVM: 9.0.1
│ - PTX ISA support: 3.2, 4.0, 4.1, 4.2, 4.3, 5.0, 6.0, 6.1, 6.3, 6.4
│ - Device support: sm_30, sm_32, sm_35, sm_37, sm_50, sm_52, sm_53, sm_60, sm_61, sm_62, sm_70, sm_72, sm_75
│
│ Environment:
│ - JULIA_CUDA_USE_BINARYBUILDER: false
│
│ 4 devices:
│ 0: Tesla V100-PCIE-16GB (sm_70, 15.770 GiB / 15.782 GiB available)
│ 1: Tesla V100-PCIE-16GB (sm_70, 15.770 GiB / 15.782 GiB available)
│ 2: Tesla V100-PCIE-16GB (sm_70, 15.770 GiB / 15.782 GiB available)
└ 3: Tesla V100-PCIE-16GB (sm_70, 15.770 GiB / 15.782 GiB available)
[ Info: Testing using 1 device(s): 4. Tesla V100-PCIE-16GB (UUID a62c271a-dd48-9769-991a-cd5432ba5110)
[ Info: Skipping the following tests: cutensor
| | ---------------- GPU ---------------- | ---------------- CPU ---------------- |
Test (Worker) | Time (s) | GC (s) | GC % | Alloc (MB) | RSS (MB) | GC (s) | GC % | Alloc (MB) | RSS (MB) |
initialization (2) | 4.15 | 0.00 | 0.0 | 0.00 | N/A | 0.09 | 2.2 | 211.59 | 585.86 |
apiutils (2) | 0.26 | 0.00 | 0.0 | 0.00 | N/A | 0.00 | 0.0 | 5.33 | 585.86 |
array (2) | 67.98 | 0.13 | 0.2 | 5.20 | N/A | 2.42 | 3.6 | 6333.03 | 804.01 |
broadcast (2) | 24.09 | 0.00 | 0.0 | 0.00 | N/A | 0.53 | 2.2 | 1457.37 | 835.50 |
codegen (2) | 5.05 | 0.00 | 0.0 | 0.00 | N/A | 0.12 | 2.5 | 298.64 | 923.12 |
cublas (2) | 73.84 | 0.03 | 0.0 | 11.12 | N/A | 2.40 | 3.3 | 6681.29 | 1388.93 |
cudnn (2) | 66.86 | 0.01 | 0.0 | 0.62 | N/A | 1.64 | 2.5 | 4934.47 | 2555.46 |
cufft (2) | 24.10 | 0.02 | 0.1 | 144.16 | N/A | 0.74 | 3.1 | 1977.75 | 2709.52 |
curand (2) | 0.09 | 0.00 | 0.0 | 0.00 | N/A | 0.00 | 0.0 | 5.48 | 2709.52 |
cusolver (2) | 53.75 | 0.07 | 0.1 | 1128.68 | N/A | 1.78 | 3.3 | 4513.01 | 2724.78 |
cusparse (2) | 31.80 | 0.01 | 0.0 | 4.46 | N/A | 0.79 | 2.5 | 2075.20 | 2724.78 |
examples (2) | 144.55 | 0.00 | 0.0 | 0.00 | N/A | 0.00 | 0.0 | 24.31 | 2724.78 |
exceptions (2) | 87.95 | 0.00 | 0.0 | 0.00 | N/A | 0.03 | 0.0 | 24.27 | 2724.78 |
execution (2) | 34.11 | 0.00 | 0.0 | 0.15 | N/A | 0.86 | 2.5 | 2350.04 | 2724.78 |
forwarddiff (2) | 98.76 | 0.00 | 0.0 | 0.00 | N/A | 0.99 | 1.0 | 2597.81 | 2724.78 |
iterator (2) | 2.39 | 0.00 | 0.0 | 1.07 | N/A | 0.08 | 3.3 | 202.23 | 2724.78 |
nnlib (2) | 2.63 | 0.00 | 0.0 | 0.00 | N/A | 0.08 | 3.1 | 136.51 | 2724.78 |
nvml (2) | 0.58 | 0.00 | 0.0 | 0.00 | N/A | 0.00 | 0.0 | 49.31 | 2724.78 |
nvtx (2) | 1.38 | 0.00 | 0.0 | 0.00 | N/A | 0.04 | 3.0 | 106.06 | 2724.78 |
pointer (2) | 0.20 | 0.00 | 0.0 | 0.00 | N/A | 0.04 | 19.9 | 5.85 | 2724.78 |
pool (2) | 3.48 | 0.00 | 0.0 | 0.00 | N/A | 0.71 | 20.3 | 201.80 | 2724.78 |
random (2) | 8.15 | 0.00 | 0.0 | 0.02 | N/A | 0.21 | 2.6 | 583.35 | 2724.78 |
statistics (2) | 12.72 | 0.00 | 0.0 | 0.00 | N/A | 0.43 | 3.4 | 979.70 | 2724.78 |
texture (2) | 23.14 | 0.00 | 0.0 | 0.08 | N/A | 1.01 | 4.4 | 2209.33 | 2724.78 |
threading (2) | 2.75 | 0.00 | 0.2 | 10.94 | N/A | 0.05 | 1.7 | 184.11 | 2724.78 |
utils (2) | 1.11 | 0.00 | 0.0 | 0.00 | N/A | 0.05 | 4.6 | 114.40 | 2724.78 |
cudadrv/context (2) | 1.01 | 0.00 | 0.0 | 0.00 | N/A | 0.04 | 4.2 | 62.13 | 2724.78 |
cudadrv/devices (2) | 0.32 | 0.00 | 0.0 | 0.00 | N/A | 0.00 | 0.0 | 32.87 | 2724.78 |
cudadrv/errors (2) | 0.23 | 0.00 | 0.0 | 0.00 | N/A | 0.04 | 18.3 | 28.18 | 2724.78 |
cudadrv/events (2) | 0.21 | 0.00 | 0.0 | 0.00 | N/A | 0.00 | 0.0 | 30.91 | 2724.78 |
cudadrv/execution (2) | 0.90 | 0.00 | 0.1 | 0.00 | N/A | 0.04 | 4.6 | 78.40 | 2724.78 |
cudadrv/memory (2) | 1.84 | 0.00 | 0.0 | 0.00 | N/A | 0.09 | 4.7 | 171.82 | 2724.78 |
cudadrv/module (2) | 0.57 | 0.00 | 0.0 | 0.00 | N/A | 0.00 | 0.0 | 30.07 | 2724.78 |
cudadrv/occupancy (2) | 0.13 | 0.00 | 0.0 | 0.00 | N/A | 0.00 | 0.0 | 12.21 | 2724.78 |
cudadrv/profile (2) | 0.45 | 0.00 | 0.0 | 0.00 | N/A | 0.04 | 9.1 | 61.04 | 2724.78 |
cudadrv/stream (2) | 0.28 | 0.00 | 0.0 | 0.00 | N/A | 0.00 | 0.0 | 41.41 | 2724.78 |
cudadrv/version (2) | 0.01 | 0.00 | 0.0 | 0.00 | N/A | 0.00 | 0.0 | 0.07 | 2724.78 |
cusolver/cusparse (2) | 7.43 | 0.00 | 0.0 | 0.19 | N/A | 0.16 | 2.1 | 373.26 | 2724.78 |
device/array (2) | 1.95 | 0.00 | 0.0 | 0.00 | N/A | 0.04 | 2.2 | 103.34 | 2724.78 |
Additional information about the test scripts can be found here
The following command will launch an interactive shell in the Julia container using singularity shell:
$ singularity shell --nv -B $(pwd)/data:/data julia_v1.5.0.simg
Where:
--nv: expose the host GPU(s) to the container
-B: a user-bind path specification
This should produce a Singularity shell prompt within the container:
Singularity: Invoking an interactive shell within container...
Singularity julia_v1.5.0.simg:~>
Inside the container, you may start Julia in REPL mode by typing:
Singularity julia_1.5.0.simg:~> julia
               _
   _       _ _(_)_     |  Documentation: https://docs.julialang.org
  (_)     | (_) (_)    |
   _ _   _| |_  __ _   |  Type "?" for help, "]?" for Pkg help.
  | | | | | | |/ _` |  |
  | | |_| | | | (_| |  |  Version 1.5.0 (2020-08-01)
 _/ |\__'_|_|_|\__'_|  |  Official https://julialang.org/ release
|__/                   |
julia>