## Quick Start Guide

### Docker

To run the MXNet Docker container, run:

`nvidia-docker run --rm -it --ipc=host -v :/workspace/resnet50 -v :/data/imagenet/train-val-recordio-passthrough nvcr.io/nvidia/mxnet:18.12-py3`

This command also downloads the MXNet container automatically if you haven't downloaded it yet. You can also download it manually by running:

`nvidia-docker pull nvcr.io/nvidia/mxnet:18.12-py3`

If you haven't prepared the dataset yet, download the raw ImageNet dataset (see the Prepare Dataset section below) and run:

`nvidia-docker run --rm -it --ipc=host -v :/workspace/resnet50 -v :/data/imagenet/train-val-recordio-passthrough -v :/data/imagenet/raw nvcr.io/nvidia/mxnet:18.12-py3`

Then follow the steps in the Prepare Dataset section.

### Prepare Dataset

The MXNet ResNet50 v1.5 script operates on ImageNet 1k, a widely used image classification dataset from the ILSVRC challenge. You can download the images from http://image-net.org/download-images

The recommended data format is [RecordIO](http://mxnet.io/architecture/note_data_loading.html), which concatenates multiple examples into seekable binary files for better read efficiency. MXNet provides a tool called `im2rec.py`, located in the `/opt/mxnet/tools/` directory, which converts individual images into `.rec` files.

To prepare a RecordIO file containing the ImageNet data, first create the `.lst` files, which contain the labels and image paths. We assume that the original images were downloaded to `/data/imagenet/raw/train-jpeg` and `/data/imagenet/raw/val-jpeg`.

```bash
python /opt/mxnet/tools/im2rec.py --list --recursive train /data/imagenet/raw/train-jpeg
python /opt/mxnet/tools/im2rec.py --list --recursive val /data/imagenet/raw/val-jpeg
```

Then generate the `.rec` files (RecordIO files with data) and the `.idx` files (indexes required by DALI to speed up data loading). To obtain the best training accuracy, we do not preprocess the images when creating the RecordIO files.
```bash
python /opt/mxnet/tools/im2rec.py --pass-through --num-thread 40 train /data/imagenet/raw/train-jpeg
python /opt/mxnet/tools/im2rec.py --pass-through --num-thread 40 val /data/imagenet/raw/val-jpeg
```

### Running training

To run training for a standard configuration (1/4/8 GPUs, FP16/FP32), run one of the scripts in the `./examples` directory named `./examples/RN50_{FP16, FP32}_{1, 4, 8}GPU.sh`. By default, the training scripts run validation and save a checkpoint after each epoch. Checkpoints are stored in `model-symbol.json` and `model-.params` files.

If ImageNet is mounted in the `/data/imagenet/train-val-recordio-passthrough` directory, you don't have to specify the `--data-root` flag.

To run a non-standard configuration, use:

`./runner -n -b --data-root --dtype --model-prefix `

Checkpoints are stored in `-symbol.json` and `-.params` files.

To generate a JSON report with performance and accuracy stats, use the `--report ` flag (see `report.py` for details on the structure of the JSON report file).

Use `./runner -h` and `python ./train.py -h` to obtain the list of available options.

### Running inference

To run inference on a checkpointed model, run:

* For FP16: `./examples/SCORE_FP16.sh `
* For FP32: `./examples/SCORE_FP32.sh `
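The naming pattern for the standard training scripts, `./examples/RN50_{FP16, FP32}_{1, 4, 8}GPU.sh`, can be spelled out explicitly. The following is a minimal sketch that assumes the pattern expands to exactly six scripts, one per precision and GPU count. The helpers `standard_script` and `all_standard_scripts` are hypothetical names used for illustration and are not part of the repository.

```python
from itertools import product
from typing import List

# Precisions and GPU counts taken from the pattern quoted in this guide.
PRECISIONS = ("FP16", "FP32")
GPU_COUNTS = (1, 4, 8)


def standard_script(precision: str, gpus: int) -> str:
    """Build the example script path for one standard configuration.

    Hypothetical helper: it only formats the path and does not check
    that the script exists on disk.
    """
    if precision not in PRECISIONS:
        raise ValueError(f"unsupported precision: {precision!r}")
    if gpus not in GPU_COUNTS:
        raise ValueError(f"unsupported GPU count: {gpus!r}")
    return f"./examples/RN50_{precision}_{gpus}GPU.sh"


def all_standard_scripts() -> List[str]:
    """List every script path implied by the naming pattern."""
    return [standard_script(p, g) for p, g in product(PRECISIONS, GPU_COUNTS)]


if __name__ == "__main__":
    # Print the candidate script paths, one per line.
    for path in all_standard_scripts():
        print(path)
```

Validating the precision and GPU count up front makes a typo such as `FP8` or `2` GPUs fail with a clear message instead of producing a path to a script that this guide does not list.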