A team of researchers, including research engineers from the Netherlands eScience Center, has developed a new open-source software package that ranks and scores protein-protein interfaces (PPIs). Called iScore, the package competes with or even outperforms state-of-the-art protein scoring functions and could be generalized to a broad range of applications that involve the ranking of graphs. The software was announced in a recent paper in the journal SoftwareX.
Interactions between proteins that lead to the formation of a three-dimensional (3D) complex are a crucial mechanism underlying major biological activities in organisms, ranging from immune defense to enzyme catalysis. The 3D structure of such complexes provides fundamental insights into protein recognition mechanisms and protein functions.
The scoring problem
One way researchers in the field of molecular modeling try to predict the 3D structures of such complexes is by using computational docking, a technique that predicts the preferred orientation of one molecule relative to a second when the two bind to form a stable complex. Despite its potential, computational docking has a major drawback, the scoring problem: how to single out the models that are likely to occur in real-life experiments from the huge pool of generated docking models. In other words, how to find a needle in a haystack.
‘The scoring problem has been a highly challenging task for decades’, says Dr Nicolas Renaud, eScience Research Coordinator and member of the project team. ‘Over the years, many methods have been developed to overcome this problem. These can largely be grouped into five types: shape complementarity-based methods, physical energy-based methods, statistical potential-based methods, machine learning-based methods and coevolution-based methods. These different scoring approaches are regularly benchmarked against each other during a community-wide challenge: the Critical Assessment of Prediction of Interactions (CAPRI).’
To address the problem, the research team developed iScore. This novel kernel-based machine learning approach represents the interface of a protein complex as an interface graph, with the nodes being the interface residues and the edges connecting the residues in contact. By comparing the graph similarity between the query graph and the training graphs, iScore predicts how close the query graph is to the near-native model.
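The idea of comparing interface graphs can be illustrated with a minimal sketch. It is not iScore's implementation or API: the 6 Å contact cutoff, the use of residue names as node labels, the restriction to cross-chain contacts, and the `steps` and `decay` parameters are all assumptions chosen for illustration, and the function names are hypothetical. The sketch builds an interface graph from residue positions and compares two graphs with a simple truncated random walk kernel, which counts label-matching walks that the two graphs share.

```python
from itertools import combinations
from math import dist, sqrt


def interface_graph(residues, cutoff=6.0):
    """Build a toy interface graph.

    residues: list of (chain, label, (x, y, z)) tuples.
    Returns (labels, edges): labels maps node index -> residue label,
    edges is a set of (i, j) index pairs for residues of different chains
    that lie within `cutoff` of each other (assumed contact definition).
    """
    edges = set()
    for i, j in combinations(range(len(residues)), 2):
        chain_i, _, pos_i = residues[i]
        chain_j, _, pos_j = residues[j]
        if chain_i != chain_j and dist(pos_i, pos_j) <= cutoff:
            edges.add((i, j))
    # Only residues that take part in at least one contact become nodes.
    nodes = sorted({n for edge in edges for n in edge})
    labels = {n: residues[n][1] for n in nodes}
    return labels, edges


def neighbours(labels, edges):
    """Undirected adjacency sets for a graph given as (labels, edges)."""
    adj = {n: set() for n in labels}
    for i, j in edges:
        adj[i].add(j)
        adj[j].add(i)
    return adj


def random_walk_kernel(g1, g2, steps=3, decay=0.5):
    """Truncated random walk kernel on the direct product graph.

    Sums, for walk lengths 0..steps, the number of walks that exist in
    both graphs with identical label sequences, weighted by decay**length.
    """
    labels1, edges1 = g1
    labels2, edges2 = g2
    adj1, adj2 = neighbours(labels1, edges1), neighbours(labels2, edges2)
    # Product-graph nodes: pairs of nodes carrying the same label.
    pairs = [(u, v) for u in labels1 for v in labels2
             if labels1[u] == labels2[v]]
    weight = {p: 1.0 for p in pairs}
    total = float(len(pairs))
    for n in range(1, steps + 1):
        # One multiplication by the product-graph adjacency matrix.
        weight = {
            (u, v): sum(weight.get((a, b), 0.0)
                        for a in adj1[u] for b in adj2[v])
            for (u, v) in pairs
        }
        total += decay ** n * sum(weight.values())
    return total


def similarity(g1, g2, **kwargs):
    """Kernel value normalized so that a graph compared with itself gives 1."""
    k12 = random_walk_kernel(g1, g2, **kwargs)
    k11 = random_walk_kernel(g1, g1, **kwargs)
    k22 = random_walk_kernel(g2, g2, **kwargs)
    return k12 / sqrt(k11 * k22)


# Toy data: two chains facing each other, plus one distant residue.
reference = [
    ("A", "ALA", (0.0, 0.0, 0.0)),
    ("A", "GLY", (3.0, 0.0, 0.0)),
    ("B", "LEU", (0.0, 4.0, 0.0)),
    ("B", "SER", (3.0, 4.0, 0.0)),
    ("B", "TRP", (50.0, 0.0, 0.0)),  # too far away to be in the interface
]
# Same geometry, but one interface residue has a different type.
decoy = [r if r[1] != "LEU" else ("B", "ASP", r[2]) for r in reference]

g_ref = interface_graph(reference)
g_decoy = interface_graph(decoy)
print(similarity(g_ref, g_ref), similarity(g_ref, g_decoy))
```

In a kernel-based setup such as the one described above, kernel values between a query graph and a set of labeled training graphs would then feed a classifier (the paper's title names support vector machines) rather than being read off directly as scores.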
‘In a recent paper we demonstrated how iScore competes with, or even outperforms, various state-of-the-art approaches on two independent test sets: the Docking Benchmark 5.0 set and the CAPRI score set’, says Renaud. ‘Using only a small number of features, iScore performs well compared with IRaPPA, the latest machine-learning-based scoring function, which exploits 91 features. This demonstrates the advantage of representing protein interfaces as graphs rather than as fixed-length feature vectors, which discard information about the interaction topology.’
According to the researchers, iScore offers a user-friendly experience thanks to dedicated workflows that fully automate the process of ranking PPIs. The software can also exploit large-scale computing architectures by distributing the calculation across a large number of CPUs and GPUs. What’s more, although iScore was developed specifically for ranking PPIs, the method is generic and could be applied to a broad range of problems that involve the ranking of graphs.
Renaud: ‘I am unbelievably proud of what the team has produced. iScore offers a user-friendly solution to ranking PPIs more efficiently and more accurately than several similar scoring functions. In addition, the software is open-source and freely available to use. I encourage researchers everywhere to give iScore a try and experience the benefits for themselves.’
N. Renaud, Y. Jung, V. Honavar, C. Geng, A. Bonvin, L. Xue, ‘iScore: An MPI supported software for ranking protein-protein docking models based on random walk graph kernel and support vector machines’ in SoftwareX (January-June 2020). DOI: 10.1016/j.softx.2020.100462