
StyleGAN2 pretrained models

Description

For use with the official StyleGAN3 implementation: https://github.com/NVlabs/stylegan3

Publisher

NVIDIA

Latest Version

1

Modified

October 13, 2023

Size

4.79 GB

Model Overview

StyleGAN2 pretrained models for FFHQ (aligned & unaligned), AFHQv2, CelebA-HQ, BreCaHAD, CIFAR-10, LSUN dogs, and MetFaces (aligned & unaligned) datasets.

We observe that despite their hierarchical convolutional nature, the synthesis process of typical generative adversarial networks depends on absolute pixel coordinates in an unhealthy manner. This manifests itself as, e.g., detail appearing to be glued to image coordinates instead of the surfaces of depicted objects. We trace the root cause to careless signal processing that causes aliasing in the generator network. Interpreting all signals in the network as continuous, we derive generally applicable, small architectural changes that guarantee that unwanted information cannot leak into the hierarchical synthesis process. The resulting networks match the FID of StyleGAN2 but differ dramatically in their internal representations, and they are fully equivariant to translation and rotation even at subpixel scales. Our results pave the way for generative models better suited for video and animation.

Model Architecture

Details of this model are available at https://nvlabs.github.io/stylegan3/, and the related paper can be found at https://arxiv.org/abs/2106.12423.

Training

You can train new networks using train.py; an example command is shown below, after the visualizer invocation. This release also contains an interactive model visualization tool that can be used to explore various characteristics of a trained model. To start it, run:

python visualizer.py
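
To train a new network with train.py, a command along the following lines can be used. The output directory and dataset path are illustrative assumptions; the configuration and hyperparameter values mirror the example run directory shown under Performance:

python train.py --outdir=~/training-runs --cfg=stylegan3-t --data=~/datasets/afhqv2-512x512.zip --gpus=8 --batch=32 --gamma=8.2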

Dataset

Pretrained StyleGAN2 models are provided for the following datasets: FFHQ (aligned & unaligned), AFHQv2, CelebA-HQ, BreCaHAD, CIFAR-10, LSUN dogs, and MetFaces (aligned & unaligned).

Performance

The result quality and training time depend heavily on the exact set of options. The most important ones (--gpus, --batch, and --gamma) must be specified explicitly, and they should be selected with care. See python train.py --help for the full list of options, and see Training configurations for general guidelines & recommendations, along with the expected training speed & memory usage in different scenarios.

The results of each training run are saved to a newly created directory, for example ~/training-runs/00000-stylegan3-t-afhqv2-512x512-gpus8-batch32-gamma8.2. The training loop exports network pickles (network-snapshot-<KIMG>.pkl) and random image grids (fakes<KIMG>.png) at regular intervals (controlled by --snap). For each exported pickle, it evaluates FID (controlled by --metrics) and logs the result in metric-fid50k_full.jsonl. It also records various statistics in training_stats.jsonl, as well as *.tfevents if TensorBoard is installed.
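
Because metric-fid50k_full.jsonl contains one JSON object per line, the logged FID values can be inspected with a few lines of Python. This is a sketch that assumes the schema written by the repository's metric logging (a "results" dict keyed by metric name, plus the snapshot filename):

import json

# Print the FID recorded for each exported snapshot.
# Assumption: each line is a JSON object with "results" and "snapshot_pkl" fields.
with open('metric-fid50k_full.jsonl') as f:
    for line in f:
        record = json.loads(line)
        print(record.get('snapshot_pkl'), record['results']['fid50k_full'])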

How to Use this Model

Requirements:

  • Linux and Windows are supported, but we recommend Linux for performance and compatibility reasons.
  • 1–8 high-end NVIDIA GPUs with at least 12 GB of memory. We have done all testing and development using Tesla V100 and A100 GPUs.
  • 64-bit Python 3.8 and PyTorch 1.9.0 (or later). See https://pytorch.org for PyTorch install instructions.
  • CUDA toolkit 11.1 or later.
  • GCC 7 or later (Linux) or Visual Studio (Windows) compilers. Recommended GCC version depends on CUDA version.
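
Once the requirements are in place, a pretrained pickle can be used directly from Python. The sketch below follows the usage documented in the StyleGAN3 repository; the pickle filename is illustrative, and the repository's torch_utils and dnnlib packages must be accessible via PYTHONPATH for unpickling to work:

import pickle
import torch

# Load a pretrained generator (filename is illustrative).
with open('stylegan2-ffhq-512x512.pkl', 'rb') as f:
    G = pickle.load(f)['G_ema'].cuda()  # exponential moving average of the generator

z = torch.randn([1, G.z_dim]).cuda()  # random latent codes
c = None                              # class labels (unused for FFHQ)
img = G(z, c)                         # NCHW float32 output in [-1, +1]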

Input

Datasets are stored as uncompressed ZIP archives containing uncompressed PNG files and a metadata file dataset.json for labels.
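
Custom image folders can be packaged into this format with the repository's dataset_tool.py; the source and destination paths below are illustrative assumptions:

python dataset_tool.py --source=~/downloads/my-images --dest=~/datasets/my-dataset-512x512.zip --resolution=512x512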

Output

Example outputs of the model are shown in the paper: https://arxiv.org/abs/2106.12423
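
To generate images locally from a downloaded pickle, the repository provides gen_images.py. A sketch, with an illustrative network path:

python gen_images.py --outdir=out --trunc=1 --seeds=0-3 --network=stylegan2-ffhq-512x512.pkl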

References

https://nvlabs.github.io/stylegan3/
https://arxiv.org/abs/2106.12423

License

Copyright (C) 2021, NVIDIA Corporation & affiliates. All rights reserved.

This work is made available under the Nvidia Source Code License.