Microsoft Cognitive Toolkit

Description: The Microsoft Cognitive Toolkit, formerly known as CNTK, is a unified deep-learning toolkit that describes neural networks as a series of computational steps via a directed graph.
Publisher: Microsoft Research
Latest Tag: 18.08-py3
Modified: May 1, 2024
Compressed Size: 2.16 GB
Multinode Support: No
Multi-Arch Support: No

What is the Microsoft Cognitive Toolkit?

The Microsoft Cognitive Toolkit, formerly known as CNTK, is a unified deep-learning toolkit that describes neural networks as a series of computational steps via a directed graph. In this directed graph, leaf nodes represent input values or network parameters, while other nodes represent matrix operations upon their inputs.

Running the Microsoft Cognitive Toolkit

Before you can run an NGC deep learning framework container, your Docker environment must support NVIDIA GPUs. To run a container, issue the appropriate command as explained in the Running A Container chapter in the NVIDIA Containers And Frameworks User Guide and specify the registry, repository, and tags. For more information about using NGC, refer to the NGC Container User Guide.
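
As a rough preflight before pulling anything, one can check that the tools this section relies on are at least on the `PATH`. The sketch below is only an assumption about what a minimal check could look like: `have_cmd` and `preflight` are hypothetical helper names, and finding `docker` and `nvidia-smi` does not by itself prove that Docker can expose GPUs to a container.

```shell
# Hypothetical helper: succeed if the named command is on PATH.
have_cmd() {
  command -v "$1" >/dev/null 2>&1
}

# Hypothetical helper: report which of the expected host tools are present.
preflight() {
  for tool in docker nvidia-smi; do
    if have_cmd "$tool"; then
      echo "found: $tool"
    else
      echo "missing: $tool"
    fi
  done
}

preflight
```

A "missing" line points to the setup chapters referenced above rather than to a problem with the container image itself.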

The method implemented in your system depends on the DGX OS version installed (for DGX systems), the specific NGC Cloud Image provided by a Cloud Service Provider, or the software that you have installed in preparation for running NGC containers on TITAN PCs, Quadro PCs, or vGPUs.

Procedure

  1. Select the Tags tab and locate the container image release that you want to run.
  2. In the Pull Tag column, click the icon to copy the docker pull command.
  3. Open a command prompt and paste the pull command. The pulling of the container image begins. Ensure the pull completes successfully before proceeding to the next step.
  4. Run the container image. A typical command to launch the container is:
docker run --gpus all -it --rm -v local_dir:container_dir nvcr.io/nvidia/cntk:xx.xx
 Where:
   - `-it` means run in interactive mode
   - `--rm` will delete the container when finished
   - `-v` mounts a host directory or file into the container
   - `local_dir` is the directory or file from your host system (absolute path) that you want to access from inside your container.  For example, the `local_dir` in the following path is `/home/jsmith/data/mnist`.  

      ```
      -v /home/jsmith/data/mnist:/data/mnist
      ```
      
      If you run `ls /data/mnist` inside the container, you see the same files as if you had run `ls /home/jsmith/data/mnist` outside the container.
      
   - `container_dir` is the target directory when you are inside your container.  For example, `/data/mnist` is the target directory in the example:
    
      ```
      -v /home/jsmith/data/mnist:/data/mnist
      ```
    
   - `xx.xx` is the container version.  For example, `18.01`.
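
The pull and run steps above can be sketched as two small shell helpers that print the commands with the placeholders filled in. This is a minimal illustration, not part of the container: `pull_cmd` and `run_cmd` are hypothetical names, the tag `18.08-py3` is the latest tag listed on this page, and the mount reuses the `/home/jsmith/data/mnist` example. The helpers only print the commands so they can be reviewed before being run.

```shell
# Registry path taken from the run command above; helper names are hypothetical.
IMAGE_REPO="nvcr.io/nvidia/cntk"

# Print the pull command for a given tag.
pull_cmd() {
  printf 'docker pull %s:%s\n' "$IMAGE_REPO" "$1"
}

# Print the interactive run command with one host directory mounted.
# $1 = tag, $2 = absolute host path (local_dir), $3 = path in the container (container_dir)
run_cmd() {
  printf 'docker run --gpus all -it --rm -v %s:%s %s:%s\n' "$2" "$3" "$IMAGE_REPO" "$1"
}

pull_cmd 18.08-py3
run_cmd 18.08-py3 /home/jsmith/data/mnist /data/mnist
```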

a.  When running on a single GPU, the Microsoft Cognitive Toolkit can be invoked using a command similar to the following:
```
cntk configFile=myscript.cntk ...
```
b.  When running on multiple GPUs, run the Microsoft Cognitive Toolkit through MPI.  The following example uses four GPUs, numbered 0..3, for training:
```
export OMP_NUM_THREADS=10
export CUDA_DEVICE_ORDER=PCI_BUS_ID
export CUDA_VISIBLE_DEVICES=0,1,2,3
mpirun --allow-run-as-root --oversubscribe --npernode 4 \
       -x OMP_NUM_THREADS -x CUDA_DEVICE_ORDER -x CUDA_VISIBLE_DEVICES \
       cntk configFile=myscript.cntk ...
```
c.  Running on all eight GPUs of a DGX-1 together is even simpler:
```
export OMP_NUM_THREADS=10
mpirun --allow-run-as-root --oversubscribe --npernode 8 \
       -x OMP_NUM_THREADS cntk configFile=myscript.cntk ...
```
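
The four-GPU and eight-GPU cases differ only in the device list and the `--npernode` value, so they can be generated from a GPU count. The sketch below assumes GPUs are numbered from 0 as in the example above; `gpu_list` and `mpirun_cmd` are hypothetical helper names, and the trailing options that the examples elide after `configFile=` are left out. The helpers print the lines instead of running them.

```shell
# Hypothetical helper: print "0,1,...,N-1" for N GPUs.
gpu_list() {
  n="$1"
  i=0
  list=""
  while [ "$i" -lt "$n" ]; do
    list="${list:+$list,}$i"   # add a comma only between entries
    i=$((i + 1))
  done
  echo "$list"
}

# Hypothetical helper: print the mpirun line for N GPUs and a config file.
# $1 = number of GPUs, $2 = CNTK config file
mpirun_cmd() {
  printf 'mpirun --allow-run-as-root --oversubscribe --npernode %s -x OMP_NUM_THREADS -x CUDA_DEVICE_ORDER -x CUDA_VISIBLE_DEVICES cntk configFile=%s\n' "$1" "$2"
}

echo "export CUDA_VISIBLE_DEVICES=$(gpu_list 4)"
mpirun_cmd 4 myscript.cntk
```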

When running the Microsoft Cognitive Toolkit containers, it is important to include at least the following options:
```
docker run --gpus all --shm-size=1g --ulimit memlock=-1 --ulimit stack=67108864 ... nvcr.io/nvidia/cntk:17.02 ...
```
You might want to pull in data and model descriptions from locations outside the container for use by the Microsoft Cognitive Toolkit. The easiest way to do this is to mount one or more host directories as [Docker data volumes](https://docs.docker.com/engine/tutorials/dockervolumes/#/mount-a-host-directory-as-a-data-volume). At this point, you have pulled the latest files and run the container image.
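
Combining the recommended options with several mounts can be sketched as follows. This is an illustration under assumptions: `run_with_mounts` is a hypothetical helper, `/home/jsmith/models` is an invented example path alongside the `/home/jsmith/data/mnist` path used earlier, and paths containing spaces are not handled. The helper prints the command so it can be inspected before use.

```shell
# Flags copied from the recommended command above.
BASE_FLAGS="--gpus all --shm-size=1g --ulimit memlock=-1 --ulimit stack=67108864"

# Hypothetical helper: print a docker run command.
# $1 = image tag; remaining arguments are host_dir:container_dir pairs.
run_with_mounts() {
  tag="$1"
  shift
  mounts=""
  for pair in "$@"; do
    mounts="$mounts -v $pair"   # one -v option per mounted directory
  done
  echo "docker run $BASE_FLAGS -it --rm$mounts nvcr.io/nvidia/cntk:$tag"
}

run_with_mounts 18.08-py3 /home/jsmith/data/mnist:/data/mnist /home/jsmith/models:/models
```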

Note: In order to share data between ranks, NCCL may require shared system memory for IPC and pinned (page-locked) system memory resources. The operating system’s limits on these resources may need to be increased accordingly. Refer to your system’s documentation for details. In particular, Docker containers default to limited shared and pinned memory resources. When using NCCL inside a container, it is recommended that you increase these resources by issuing:

   ```    
   --shm-size=1g --ulimit memlock=-1
   ```
   
   on the command line of:
   ```
   docker run --gpus all
   ```
  5. See /workspace/README.md inside the container for information on customizing your Microsoft Cognitive Toolkit image.

Suggested Reading

For the latest Release Notes, see the Microsoft Cognitive Toolkit Release Notes Documentation website.

For more information about the Microsoft Cognitive Toolkit, including tutorials, documentation, and examples, see the CNTK wiki.