Posebusters DiffDock Sample Data

Sample dataset prepared by the NVIDIA BioNeMo team from the PoseBusters benchmark set.
Latest Version: January 18, 2024
Compressed Size: 228.22 MB

DiffDock Dataset

This dataset was generated by following the preparation steps for the PoseBusters benchmark set and is used for training the DiffDock score and confidence models. The PoseBusters benchmark set is a new set of 428 carefully selected, publicly available crystal complexes from the PDB. It is a diverse set of recent, high-quality protein-ligand complexes that contain drug-like molecules, and it contains only complexes released since 2021. A subset of 50 complexes from this benchmark set is used to create the train/validation/test datasets for training and testing DiffDock.
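
To illustrate how a small set of complexes can be partitioned into train/validation/test datasets, here is a minimal Python sketch. It is not the procedure the BioNeMo team used, which this page does not describe. The function name split_complexes, the 80/10/10 fractions, the seed, and the placeholder IDs are all assumptions made for illustration only.

```python
import random


def split_complexes(complex_ids, val_frac=0.1, test_frac=0.1, seed=0):
    """Shuffle IDs reproducibly and partition them into train/val/test lists.

    The fractions and seed are illustrative assumptions, not values
    taken from the dataset.
    """
    # Sort first so the result does not depend on the input order.
    ids = sorted(complex_ids)
    random.Random(seed).shuffle(ids)

    n_val = round(len(ids) * val_frac)
    n_test = round(len(ids) * test_frac)

    return {
        "val": ids[:n_val],
        "test": ids[n_val:n_val + n_test],
        "train": ids[n_val + n_test:],
    }


# Hypothetical placeholder IDs standing in for the 50 selected complexes.
ids = [f"complex_{i:02d}" for i in range(50)]
splits = split_complexes(ids)
print({name: len(members) for name, members in splits.items()})
```

Fixing the seed and sorting the IDs before shuffling keeps the split reproducible across runs, which matters when a dataset this small is reused for comparison between training runs.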

How to use the dataset?

You can use the BioNeMo Framework to run DiffDock score and confidence model training with this dataset.


This dataset is redistributed under the same license as the PoseBusters benchmark set: the Creative Commons Attribution 4.0 International (CC BY 4.0) License.