Posebusters DiffDock Pre-processing Test Data

Description
Sample dataset prepared by NVIDIA BioNeMo team from PoseBusters benchmark set.
Publisher
NVIDIA
Latest Version
1.1
Modified
January 18, 2024
Compressed Size
14.75 MB

DiffDock Dataset

This dataset was generated by following the preparation steps for the PoseBusters benchmark set, as used for training the DiffDock score and confidence models. The PoseBusters benchmark set is a new set of 428 carefully selected, publicly available crystal complexes from the PDB. It is a diverse set of recent, high-quality protein-ligand complexes that contain drug-like molecules, and it only contains complexes released since 2021. A subset of 50 complexes from this set is used to create the train/validation/test datasets for the DiffDock training test, as presented in the Posebusters DiffDock Processed Sample Data resource. Two samples are selected for use in the data preprocessing test.

How to use the dataset?

You can use the BioNeMo Framework to run DiffDock score and confidence model training with this dataset.
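
Before pointing a training or preprocessing job at the data, it can help to check what the extracted archive actually contains. The following is a minimal sketch in plain Python, not part of the BioNeMo Framework: the function name summarize_dataset and the directory path in the usage comment are hypothetical, and the file types you will see depend on the actual contents of the download.

```python
from collections import Counter
from pathlib import Path


def summarize_dataset(root):
    """Count the files under `root` (recursively), grouped by extension.

    Extensions are lowercased; files without an extension are
    counted under the key "<none>".
    """
    counts = Counter()
    for path in Path(root).rglob("*"):
        if path.is_file():
            counts[path.suffix.lower() or "<none>"] += 1
    return dict(counts)


# Hypothetical usage: replace the path with wherever you extracted the download.
# print(summarize_dataset("path/to/extracted/dataset"))
```

A quick inventory like this makes it easier to confirm that the download and extraction completed before starting a longer job.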

License

This dataset is redistributed under the same license as the PoseBusters benchmark set: the Creative Commons Attribution 4.0 International (CC BY 4.0) License.