NGC | Catalog

PyTorch Lightning


Description

Lightweight framework for training models at scale, without the boilerplate. Train on any number of GPUs or nodes without changing your code, and turn on advanced training optimizations with a switch of a flag.

Publisher

GridAI

Latest Tag

v1.4.0

Modified

October 18, 2021

Compressed Size

6.38 GB

Multinode Support

No

Multi-Arch Support

No

What is PyTorch Lightning?

PyTorch Lightning is a powerful yet lightweight PyTorch wrapper, designed to make high-performance AI research simple, allowing you to focus on science, not engineering. PyTorch Lightning is just organized PyTorch: it lets you train your models on CPUs, GPUs, or multiple nodes without changing your code. Lightning makes state-of-the-art training features trivial to use with the flip of a flag, such as 16-bit precision, model sharding, pruning, and many more.

Lightning ensures that when your network becomes complex your code doesn’t.

Refactoring your models to Lightning is simple: it gets rid of a ton of boilerplate, reduces cognitive load, and gives you the flexibility to iterate on research ideas faster with all the latest deep learning best practices.


Lightning Design Philosophy

Lightning forces the following structure on your code, which makes it reusable and shareable:

  • Research code (the LightningModule).
  • Engineering code (you delete, and is handled by the Trainer).
  • Non-essential research code (logging, etc... this goes in Callbacks).
  • Data (use PyTorch Dataloaders or organize them into a LightningDataModule).

Once you do this, you can train on multiple GPUs, on CPUs, and even in 16-bit precision without changing your code!

Advantages over unstructured PyTorch

  • Models become hardware agnostic
  • Code is clear to read because engineering code is abstracted away
  • Easier to reproduce
  • Make fewer mistakes because Lightning handles the tricky engineering
  • Keeps all the flexibility (LightningModules are still PyTorch modules), but removes a ton of boilerplate
  • Dozens of integrations with popular machine learning tools
  • Tested rigorously with every new PR

Get started with our 2-step guide.


How To Use

Setup

docker pull nvcr.io/partners/gridai/pytorch-lightning:v1.3.7

Run example script on multi GPUs

# for a single GPU (note: `--gpus all` exposes the host GPUs to the container)
docker run --rm -it --gpus all nvcr.io/partners/gridai/pytorch-lightning:v1.3.7 bash home/pl_examples/run_examples-args.sh --gpus 1 --max_epochs 5 --batch_size 1024

# for 4 GPUs
docker run --rm -it --gpus all nvcr.io/partners/gridai/pytorch-lightning:v1.3.7 bash home/pl_examples/run_examples-args.sh --gpus 4 --max_epochs 5 --batch_size 1024

Examples

Hello world
Image Classification
Contrastive Learning
NLP
Reinforcement Learning
Vision
Classic ML

Support

If you have any questions, please:

  1. Read the docs.
  2. Search through existing Discussions, or ask a new question.
  3. Join our Slack.

License

Please observe the Apache 2.0 license that is listed in this repository. In addition, the Lightning framework is patent pending.