zPoseScore model for accurate and robust protein-ligand docking pose
scoring in CASP15
Abstract
We introduce a deep learning-based ligand pose scoring model called
zPoseScore for predicting protein-ligand complexes in the 15th Critical
Assessment of Protein Structure Prediction (CASP15). Our contributions
are three-fold. First, we generate six training and evaluation
datasets by employing advanced data augmentation and sampling methods.
Second, we redesign the “zFormer” module, inspired by AlphaFold2’s
Evoformer, to efficiently describe protein-ligand interactions. This
module enables the extraction of protein-ligand paired features that
lead to accurate predictions. Lastly, we develop the zPoseScore
framework with zFormer for scoring and ranking ligand poses, allowing
for atomic-level protein-ligand feature encoding and fusion to output
refined ligand poses and ligand per-atom deviations. Our model
performs strongly on several test datasets, achieving Pearson
correlation coefficients of R = 0.783 and R = 0.659 for ranking docking
decoys generated from experimental and predicted protein structures,
respectively, of CASF-2016 protein-ligand complexes. Additionally, we
obtain an average lDDT of 0.558 for AIchemy_LIG2 in CASP15 for de novo
protein-ligand complex structure predictions. Detailed analysis shows
that accurate ligand binding site prediction and side-chain orientation
are crucial for achieving better prediction performance. Our proposed
model is one of the most accurate protein-ligand pose prediction models
and could serve as a valuable tool in small molecule drug discovery.
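
As a rough illustration of the ranking metric quoted above, the sketch below computes a Pearson correlation coefficient between predicted and reference pose deviations for a set of decoys. It assumes the correlation is taken between a predicted per-pose deviation and the true RMSD to the native pose; the abstract does not state the exact quantities, so this pairing, the function name pearson_r, and all numerical values are hypothetical and are not results from this work.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical decoy set: true RMSD to the native pose (in angstroms) and
# the deviation a scoring model might predict for the same poses.
true_rmsd = [0.5, 1.2, 2.8, 4.1, 6.3]
predicted_rmsd = [0.9, 1.0, 3.1, 3.6, 5.8]

r = pearson_r(predicted_rmsd, true_rmsd)

# Ranking: the pose with the lowest predicted deviation is selected first.
best_pose_index = min(range(len(predicted_rmsd)), key=predicted_rmsd.__getitem__)

print(f"Pearson R = {r:.3f}, top-ranked pose index = {best_pose_index}")
```

A high R indicates that predicted deviations track the true ones across the whole decoy set, whereas the top-ranked index only reflects how well the single best pose is identified; the two can diverge, which is why both are commonly reported.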