Predicting Protein-Ligand Binding Affinities: A Low Scoring Game?

Dushyanthan Puvanendrampillai*, Philip M Marsden*, John BO Mitchell and Robert C Glen
Unilever Centre for Molecular Science Informatics, Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge CB2 1EW, UK
*Corresponding authors [email protected] , [email protected]

Introduction

Docking programs search for viable conformations of a ligand within a target receptor site, and a scoring function ranks these docked conformations by the quality of the fit between the ligand and the receptor [1,2].

The scoring function should ideally also be able to predict binding affinity, allowing different candidate molecules to be ranked by their predicted binding to a given target [3].

Methods

Preparation of the Test Set.
The test set used in this study consisted of 205 protein-ligand complexes with experimentally measured Kd values, assembled from scoring function and/or docking evaluation studies.
Dissociation constants of the complexes ranged from -1.49 to -13.97 in log Kd units, spanning more than 12 orders of magnitude, and were measured by a variety of experimental methods.
All ligand molecules bind non-covalently to their target proteins.
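To give a feel for the range covered by the test set, the log Kd limits quoted for it can be converted to standard binding free energies with the textbook relation ΔG = RT ln Kd. The sketch below is illustrative only: the temperature of 298.15 K, the 1 M standard state and the function name are assumptions, not details taken from this study.

```python
import math

# Molar gas constant in kJ/(mol K).
R_KJ_PER_MOL_K = 8.314462618e-3


def log_kd_to_delta_g(log_kd, temperature_k=298.15):
    """Return the standard binding free energy in kJ/mol for a log10(Kd) value.

    Uses dG = RT ln(Kd) = RT ln(10) log10(Kd), with Kd relative to a 1 M
    standard state. The default temperature is an assumption.
    """
    return R_KJ_PER_MOL_K * temperature_k * math.log(10.0) * log_kd


# The two limits of the test set quoted in the text.
for log_kd in (-1.49, -13.97):
    delta_g = log_kd_to_delta_g(log_kd)
    print(f"log Kd = {log_kd:7.2f}  ->  dG = {delta_g:7.1f} kJ/mol")
```

Under these assumptions each log Kd unit corresponds to roughly 5.7 kJ/mol, so the test set spans a free energy range of about 70 kJ/mol.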

Analysis
We have examined the square of the product moment correlation coefficient (r²) and Spearman's rank correlation coefficient (Rs) between the binding free energies given by the five scoring functions and the experimentally measured log Kd values for the 205 protein-ligand complexes in the dataset.
This correlation between calculated binding energies and experimental values indicates the performance of a scoring function, and how well its scores can predict the binding energies of protein-ligand interactions.
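The two statistics can be sketched in a few lines of plain Python. This is a minimal illustration rather than the code used in the study: the function names are our own, tied values are given average ranks (one common convention), and the five data points are invented purely to show the calculation.

```python
import math


def pearson_r(x, y):
    """Product moment correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def average_ranks(values):
    """Rank values from 1 upwards, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def r_squared(x, y):
    """Square of the product moment correlation coefficient."""
    return pearson_r(x, y) ** 2


def spearman_rs(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    return pearson_r(average_ranks(x), average_ranks(y))


# Invented example values (not data from this study).
experimental_log_kd = [-9.2, -7.5, -6.1, -4.8, -3.0]
calculated_log_kd = [-8.0, -8.4, -5.5, -5.9, -4.1]

print("r2 =", round(r_squared(experimental_log_kd, calculated_log_kd), 2))
print("Rs =", round(spearman_rs(experimental_log_kd, calculated_log_kd), 2))
```

Because Rs depends only on the ordering of the values, it rewards a scoring function that ranks complexes correctly even when the calculated values are not linearly related to the experimental ones, whereas r² measures the strength of the linear relationship.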

We have compared the results of five different scoring functions to determine the binding energy of a protein-ligand complex with known three-dimensional structure on a diverse dataset of 205 protein-ligand complexes, and also on various subsets of mutually similar complexes.

We used the knowledge-based methods PMF [4] and BLEEP [5,6] and empirical scoring functions of GOLD [7], DOCK [8] and ChemScore [9], as implemented by Sybyl 6.9 [10].

The development of such scoring functions to accurately predict experimental binding free energies for protein-ligand complexes is currently a major challenge in structure-based drug design. We expect that improvements in the scoring functions will be crucial in addressing this problem.

Figure 1. Crystal structure of HIV-1 protease complexed with an hydroxyethylene isostere inhibitor (PDB ID 1AAQ). Protein secondary structure illustrated in cartoon representation coloured by secondary structure type. A Connolly molecular surface shown of the cavity coloured by electrostatic potential. Ligand shown in ball and stick representation.

Figure 2. Representation of the GOLD scoring function for complex 1AAQ. Based on van der Waals interactions, hydrogen bonds (yellow dashed lines), and ligand internal torsional energy. External van der Waals shown by the two complementary surfaces of the ligand (green) and the protein (brown).

Figure 3. Representation of the BLEEP scoring function for complex 1AAQ. Pairwise potentials are calculated between the ligand and protein atoms within a distance cut-off. Ligand atoms (red) and protein atoms (green) within 8 Å of the ligand are shown in ball and stick representation.

Table 1. Correlations Between Experimental and Calculated log Kd Values Given by Five Scoring Functions.

                                    BLEEP          PMF            GOLD           DOCK           ChemScore
    Dataset                  No.    Rs     r²      Rs     r²      Rs     r²      Rs     r²      Rs     r²
    All                      205    0.59   0.32    0.31   0.11    0.50   0.20    0.43   0.20    0.45   0.18
A   Serine proteinases        35    0.82   0.74    0.75   0.51    0.61   ..      ..     ..      ..     ..
B   Metalloproteinases        25    0.72   0.44    0.45   0.33    0.44   0.14    0.42   0.20    0.43   0.17
C   Carbonic anhydrase II     18    0.53   0.47    0.46   0.32    0.42   0.34    0.54   0.01    0.50   ..
D   Sugar binding proteins    30    0.76   0.58    0.45   0.09    0.02   0.00    0.05   0.00    0.29   ..
E   Aspartic proteinases      38    0.08   0.01   -0.52   0.13    0.02   0.00    0.08   0.01   -0.08   0.00

No. = number of complexes. Cells marked ".." hold values that are displaced in the source layout and cannot be assigned with certainty; the displaced values are 0.83, 0.69, 0.13, 0.00, 0.09, 0.28 and 0.47.

Figure 5. PMF calculated log Kd vs. experimental log Kd (r² = 0.11).

Figure 6. BLEEP calculated log Kd vs. experimental log Kd (r² = 0.32).

Figure 7. GOLD calculated log Kd vs. experimental log Kd (r² = 0.20).

Figure 8. DOCK calculated log Kd vs. experimental log Kd (r² = 0.20).

Figure 9. ChemScore calculated log Kd vs. experimental log Kd (r² = 0.18).

Results

For all 205 protein-ligand complexes

BLEEP gives the best agreement between its calculated binding free energy and experimental log Kd values, with Rs = 0.59.

GOLD, ChemScore and DOCK have a similar level of binding free energy agreement, with Rs values of 0.50, 0.45 and 0.43 respectively.

PMF gives an Rs value of 0.31.

All five scoring functions give modest r² values, the highest being the 0.32 given by BLEEP.

Conclusions and Discussion

The inescapable conclusion from these results is that the problem of accurately predicting the binding energies of a large and diverse set of protein-ligand complexes is a difficult one.

None of the scoring functions tested here achieved r² values above 0.32 when tested on the full 205 complex dataset. This is a disappointing level of performance, although we should note in defence of the GOLD and DOCK functions that they were designed to identify the correct geometries of bound complexes and not intended to be applied to the problem of affinity prediction.

The headline figures for the correlation coefficient given here seem less impressive than in previous work. This is partly due to the choice of dataset. Six outliers were excluded from that previous analysis of BLEEP [6], which raised the r² value from 0.40 to 0.55.
Figure 4. Representation of the DOCK scoring function for complex 1AAQ. Based on shape complementarity. Molecular surface of the ligand (yellow) and the complementary surface of the protein (green).

References

1. Stahl, M. and Rarey, M. (2001) Detailed analysis of scoring functions for virtual screening. J. Med. Chem., 44, 1035-1042.

2. Pérez, C. and Ortiz, A.R. (2001) Evaluation of docking functions for protein-ligand docking. J. Med. Chem., 44, 3768-3785.

3. Böhm, H.J. (1994) The development of a simple empirical scoring function to estimate the binding constants for a protein-ligand complex of known three-dimensional structure. J. Comput.-Aided Mol. Des., 8, 243-256.

4. Muegge, I. and Martin, Y.C. (1999) A general and fast scoring function for protein-ligand interactions: a simplified potential approach. J. Med. Chem., 42, 791-804.

5. Mitchell, J.B.O., Laskowski, R.A., Alex, A. and Thornton, J.M. (1999) BLEEP - potential of mean force describing protein-ligand interactions: I. Generating potential. J. Comp. Chem., 20, 1165-1176.

6. Mitchell, J.B.O., Laskowski, R.A., Alex, A., Forster, M.J. and Thornton, J.M. (1999) BLEEP - potential of mean force describing protein-ligand interactions: II. Calculation of binding energies and comparison with experimental data. J. Comp. Chem., 20, 1177-1185.

7. Jones, G., Willett, P., Glen, R.C., Leach, A.R. and Taylor, R. (1997) Development and validation of a genetic algorithm for flexible docking. J. Mol. Biol., 267, 727-748.

8. Kuntz, I.D., Blaney, J.M., Oatley, S.J., Langridge, R. and Ferrin, T.E. (1982) A geometric approach to macromolecule-ligand interactions. J. Mol. Biol., 161, 269-288.

9. Eldridge, M.D., Murray, C.W., Auton, T.R., Paolini, G.V. and Mee, R.P. (1997) Empirical scoring functions: I. The development of a fast empirical scoring function to estimate the binding affinity of ligands in receptor complexes. J. Comput.-Aided Mol. Des., 11, 425-445.

10. SYBYL 6.9, Tripos Inc., 1699 South Hanley Road, St. Louis, Missouri, 63144, USA.

Acknowledgements
We thank Unilever, the EPSRC and the Newton Trust for their funding, and Tripos Inc. for the use of Sybyl.
