StyleGAN3 pretrained models for FFHQ, AFHQv2 and MetFaces datasets.
We observe that despite their hierarchical convolutional nature, the synthesis process of typical generative adversarial networks depends on absolute pixel coordinates in an unhealthy manner. This manifests itself as, e.g., detail appearing to be glued to image coordinates instead of the surfaces of depicted objects. We trace the root cause to careless signal processing that causes aliasing in the generator network. Interpreting all signals in the network as continuous, we derive generally applicable, small architectural changes that guarantee that unwanted information cannot leak into the hierarchical synthesis process. The resulting networks match the FID of StyleGAN2 but differ dramatically in their internal representations, and they are fully equivariant to translation and rotation even at subpixel scales. Our results pave the way for generative models better suited for video and animation.
You can train new networks using train.py. This release contains an interactive model visualization tool that can be used to explore various characteristics of a trained model. To start it, run:
FFHQ-U and MetFaces-U: We built unaligned variants of the existing FFHQ and MetFaces datasets. The originals are available at https://github.com/NVlabs/ffhq-dataset and https://github.com/NVlabs/metfaces-dataset, respectively. The datasets were rebuilt with a modification of the original procedure based on the original code, raw uncropped images, and facial landmark metadata. The code required to reproduce the modified datasets is included in the public release.
AFHQv2: We used an updated version of the AFHQ dataset where the resampling filtering has been improved. The original dataset suffers from pixel-level artifacts caused by inadequate downsampling filters. This caused convergence problems with our models, as the sharp “stair-step” aliasing artifacts are difficult to reproduce without direct access to the pixel grid.
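The effect of an inadequate downsampling filter can be illustrated with a small one-dimensional sketch. It is only an illustration of the general principle, not the resampling procedure used for either version of AFHQ; the function names `decimate` and `box_downsample` and the test signal are hypothetical.

```python
def decimate(signal, factor):
    """Downsample by keeping every `factor`-th sample, with no low-pass filter."""
    return signal[::factor]


def box_downsample(signal, factor):
    """Downsample by averaging each group of `factor` samples (a crude low-pass filter)."""
    usable = len(signal) - len(signal) % factor  # drop an incomplete trailing group
    return [sum(signal[i:i + factor]) / factor for i in range(0, usable, factor)]


# Hypothetical test signal: the finest possible stripes, alternating 0 and 1.
stripes = [0.0, 1.0] * 4

aliased = decimate(stripes, 2)         # sees only the dark samples
filtered = box_downsample(stripes, 2)  # averages dark and bright samples

print("decimated:", aliased)
print("box-filtered:", filtered)
```

Without filtering, the stripes collapse to a constant value that misrepresents the average brightness of the signal; the box filter recovers that average. Higher-quality filters than a box average exist, and the sketch makes no claim about which one the dataset authors chose.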
The result quality and training time depend heavily on the exact set of options. The most important ones (--gpus, --batch, and --gamma) must be specified explicitly, and they should be selected with care. See python train.py --help for the full list of options and Training configurations for general guidelines and recommendations, along with the expected training speed and memory usage in different scenarios.
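As a rough sketch of how the three required options fit into a launch command, the helper below assembles an argument list for train.py. Only --gpus, --batch, and --gamma are named in the text above; the --outdir, --cfg, and --data flags, the paths, and the helper name `build_train_command` are assumptions for illustration, so check python train.py --help before relying on them. The values mirror the example run directory name quoted in this document (gpus8, batch32, gamma8.2).

```python
import shlex


def build_train_command(outdir, cfg, data, gpus, batch, gamma):
    """Assemble an argument list for train.py (flag names other than
    --gpus, --batch, and --gamma are assumptions of this sketch)."""
    # Sanity check of this sketch: split the total batch evenly across GPUs.
    if batch % gpus != 0:
        raise ValueError("batch should be a multiple of gpus")
    return [
        "python", "train.py",
        f"--outdir={outdir}",  # assumed flag: where run directories are created
        f"--cfg={cfg}",        # assumed flag: base configuration name
        f"--data={data}",      # assumed flag: dataset ZIP archive
        f"--gpus={gpus}",
        f"--batch={batch}",
        f"--gamma={gamma}",
    ]


# Hypothetical paths; the numeric values follow the example run name.
command = build_train_command(
    "training-runs", "stylegan3-t", "datasets/afhqv2-512x512.zip",
    gpus=8, batch=32, gamma=8.2,
)
print(shlex.join(command))
```

Building the command as a list keeps each option explicit and makes it easy to pass to subprocess.run later without shell quoting problems.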
The results of each training run are saved to a newly created directory, for example ~/training-runs/00000-stylegan3-t-afhqv2-512x512-gpus8-batch32-gamma8.2. The training loop exports network pickles (network-snapshot-.pkl) and random image grids (fakes.png) at regular intervals (controlled by --snap). For each exported pickle, it evaluates FID (controlled by --metrics) and logs the result in metric-fid50k_full.jsonl. It also records various statistics in training_stats.jsonl, as well as *.tfevents if TensorBoard is installed.
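The FID log can be scanned to pick the snapshot with the lowest score. The following is a minimal sketch that assumes each line of metric-fid50k_full.jsonl is a JSON object with a `results` mapping keyed by metric name and a `snapshot_pkl` field; these field names are assumptions and should be verified against a real log. The snapshot names in the sample data are hypothetical.

```python
import json


def best_snapshot(jsonl_lines, metric="fid50k_full"):
    """Return (snapshot name, score) for the lowest metric value, or None if empty.

    Assumes each line looks like:
    {"results": {"<metric>": <float>}, "snapshot_pkl": "<file name>", ...}
    """
    best = None
    for line in jsonl_lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines
        record = json.loads(line)
        score = record["results"][metric]
        if best is None or score < best[1]:  # lower FID is better
            best = (record["snapshot_pkl"], score)
    return best


# Hypothetical log contents for illustration only.
example_log = [
    '{"results": {"fid50k_full": 25.0}, "snapshot_pkl": "snapshot-a.pkl"}',
    '{"results": {"fid50k_full": 12.5}, "snapshot_pkl": "snapshot-b.pkl"}',
    '{"results": {"fid50k_full": 14.0}, "snapshot_pkl": "snapshot-c.pkl"}',
]
print(best_snapshot(example_log))
```

For a real run, an open file handle on the log in the run directory can be passed in place of `example_log`, since the function only iterates over lines.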
Datasets are stored as uncompressed ZIP archives containing uncompressed PNG files and a metadata file dataset.json for labels.
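A dataset archive can be checked against this description with the standard library alone. The sketch below counts the PNG entries, verifies that every entry is stored without ZIP compression, and reads dataset.json if present. The `labels` key and its list-of-pairs layout are assumptions about the metadata format, and the file names and placeholder bytes in the demo archive are hypothetical (they are not valid PNG data).

```python
import io
import json
import zipfile


def summarize_dataset_zip(zip_source):
    """Summarize a dataset archive given a path or a binary file-like object."""
    with zipfile.ZipFile(zip_source) as zf:
        infos = [info for info in zf.infolist() if not info.is_dir()]
        pngs = [i.filename for i in infos if i.filename.lower().endswith(".png")]
        # ZIP_STORED means the entry was written without ZIP-level compression.
        all_stored = all(i.compress_type == zipfile.ZIP_STORED for i in infos)
        labels = None
        if "dataset.json" in zf.namelist():
            # Assumed layout: {"labels": [[file name, class index], ...]}
            labels = json.loads(zf.read("dataset.json")).get("labels")
    return {
        "num_images": len(pngs),
        "all_stored": all_stored,
        "num_labels": None if labels is None else len(labels),
    }


# Build a tiny in-memory archive with placeholder contents for demonstration.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w", compression=zipfile.ZIP_STORED) as zf:
    zf.writestr("00000/img00000000.png", b"placeholder bytes, not a real PNG")
    zf.writestr("00000/img00000001.png", b"placeholder bytes, not a real PNG")
    metadata = {"labels": [["00000/img00000000.png", 0], ["00000/img00000001.png", 1]]}
    zf.writestr("dataset.json", json.dumps(metadata))
buffer.seek(0)

print(summarize_dataset_zip(buffer))
```

The check is read-only, so it can be run on an existing archive before training to catch a compressed ZIP or a label count that does not match the number of images.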
Example outputs of the model are shown in the paper: https://arxiv.org/abs/2106.12423
Copyright (C) 2021, NVIDIA Corporation & affiliates. All rights reserved.
This work is made available under the Nvidia Source Code License.
"inception-2015-12-05.pkl" is derived from the pre-trained Inception-v3 network by Christian Szegedy, Vincent Vanhoucke, Sergey Ioffe, Jonathon Shlens, and Zbigniew Wojna. The network was originally shared under Apache 2.0 license on the TensorFlow Models repository.
"vgg16.pkl" is derived from the pre-trained VGG-16 network by Karen Simonyan and Andrew Zisserman. The network was originally shared under Creative Commons BY 4.0 license on the Very Deep Convolutional Networks for Large-Scale Visual Recognition project page.
Additionally, "vgg16.pkl" incorporates the pre-trained LPIPS weights by Richard Zhang, Phillip Isola, Alexei A. Efros, Eli Shechtman, and Oliver Wang. The weights were originally shared under BSD 2-Clause "Simplified" License on the PerceptualSimilarity repository.