Posebusters DiffDock Pre-processing Test Data

Sample dataset prepared by the NVIDIA BioNeMo team from the PoseBusters benchmark set.
Latest Version: January 18, 2024
Compressed Size: 14.75 MB

DiffDock Dataset

This dataset was generated by following the PoseBusters benchmark set preparation steps used for training the DiffDock score and confidence models. The PoseBusters benchmark set is a new set of 428 carefully selected, publicly available crystal complexes from the PDB. It is a diverse set of recent, high-quality protein-ligand complexes that contain drug-like molecules, and it only contains complexes released since 2021. A subset of 50 complexes from this set is used to create the train/validation/test datasets for DiffDock training tests, as presented in the Posebusters DiffDock Processed Sample Data resource. In addition, 2 samples are selected for use in the data preprocessing test.

How to use the dataset?

You can use the BioNeMo Framework to run DiffDock score/confidence model training with this dataset.
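
Before starting a run, it can help to check what the extracted archive contains. The following is a minimal sketch, not part of the BioNeMo Framework. It assumes a hypothetical PoseBusters-style layout in which each complex has its own directory holding a `<name>_protein.pdb` file and a `<name>_ligand.sdf` file. The function name `list_complexes` and these file-name patterns are assumptions and may not match the actual contents of this resource.

```python
from pathlib import Path


def list_complexes(root):
    """Return (name, protein_path, ligand_path) for each complex under root.

    Assumed (hypothetical) layout, one directory per complex:
        root/<name>/<name>_protein.pdb
        root/<name>/<name>_ligand.sdf
    """
    complexes = []
    for complex_dir in sorted(Path(root).iterdir()):
        if not complex_dir.is_dir():
            continue
        name = complex_dir.name
        protein = complex_dir / f"{name}_protein.pdb"
        ligand = complex_dir / f"{name}_ligand.sdf"
        # Keep only directories that contain both expected files.
        if protein.is_file() and ligand.is_file():
            complexes.append((name, protein, ligand))
    return complexes
```

Calling `list_complexes("path/to/extracted/data")` returns one tuple per complex found. If the result is empty, the layout probably differs from the one assumed here and the patterns need adjusting.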


This dataset is redistributed under the same license as the PoseBusters benchmark set: Creative Commons Attribution 4.0 International (CC BY 4.0).