Alanine-scanning mutagenesis has proven to be a key experimental tool to identify the residues that are most important for mediating interactions between a protein and a ligand (which could be a small molecule, a nucleic acid, a different protein, or a different region within the same protein, etc.). However, mutating individual / combinations of amino acids to alanine experimentally is both time-consuming and costly. Consequently, several computational alanine scanning (CAS) programs have been developed that enable users to identify hotspot residues that make the greatest contribution to the interaction of interest in silico.
In two recent papers [1, 2], the Sessions and Woolfson groups at the University of Bristol, in collaboration with the Wilson group at the University of Leeds, introduce BUDE Alanine Scan (BAlaS), a command-line tool to run CAS within the BUDE force field [3, 4]. BAlaS enables high-throughput CAS of both individual X-ray crystal structures plus ensembles of structures obtained from NMR / MD analysis. In Ibarra et al. (2019), the authors introduce the BAlaS software, and benchmark its performance relative to a range of pre-existing CAS software, firstly on the SKEMPI database (a database of 3047 binding free energy changes upon mutation, collated from the literature, for PPIs with known structure), and secondly on newly acquired alanine-scanning mutagenesis data collected for three diverse protein-protein interactions. They find that BAlaS predicts the experimentally identified hotspot residues with comparable accuracy to the best-performing CAS software, whilst running considerably faster. However, the authors also find that averaging the results from all CAS software tested achieves better results than any of the individual programs, and hence recommend CAS users adopt such a meta-scoring approach.
Figure 1: Comparison of experimentally measured vs. computationally predicted (via 6 different CAS software packages) DDG values for individual alanine mutations of interface residues in three diverse protein-protein interactions: a) NOXA-B/MCL-1 (PDB model 1); b) SIMS/SUMO (PDB model 1); and c) GKAP/SHANK-PDZ (chains A and C).
In Wood et al. (2020), the authors introduce an interactive web application to run BAlaS (https://pragmaticproteindesign.bio.ed.ac.uk/balas/). This interface is simple and intuitive, allowing non-expert users to easily run CAS on their own protein-ligand interactions of interest. In addition to allowing mutation of individual residues to alanine, the interface also allows the user to simultaneously mutate constellations of residues (selected either manually or via all permutations of a specified constellation size within a selected subset of residues) to alanine in order to identify “hotspot patches” of residues within the ligand. The capability to perform mutations to residues other than alanine, available in the command-line tool, will be introduced in a future version of the web application.
Figure 2: The BAlaS web interface. Here DDG values are calculated for residues in the transactivation domain of p53 (the “ligand”) bound to MDM2 (the “receptor”) (PDB structure 1YCR).
A. A. Ibarra et al., Predicting and experimentally validating hot-spot residues at protein–protein interfaces. ACS Chem Biol14, 2252-2263 (2019).
C. W. Wood et al., BAlaS: fast, interactive and accessible computational alanine-scanning using BudeAlaScan. Bioinformatics (2020).
S. McIntosh-Smith, T. Wilson, A. Á. Ibarra, J. Crisp, R. B. Sessions, Benchmarking energy efficiency, power costs and carbon emissions on heterogeneous systems. The Computer Journal55, 192-205 (2012).
S. McIntosh-Smith, J. Price, R. B. Sessions, A. A. Ibarra, High performance in silico virtual drug screening on many-core processors. The International Journal of High Performance Computing Applications29, 119-134 (2015).
Intricate molecular choreography separates the bound and unbound states of proteins, resulting in slow binding kinetics that make simulating these events difficult. Interactive molecular dynamics (iMD) allows simulations to be driven to a desired state using a combination of human chemical and spatial intuition and current advances present virtual reality (VR) as a novel strategy for simulation biasing. Previous work shows VR interfaces to iMD (iMD-VR) are sufficiently performant for basic molecular manipulation tasks, but this begs the question of whether these interfaces can be used for more complex tasks, namely correctly establishing bound poses between proteins and ligands.
Our iMD-VR framework for interactively chaperoning drug unbinding and rebinding events was tested on a cohort of non-expert users (n=10), defined as being unfamiliar with the iMD-VR interface. Three example protein-ligand systems were chosen, and participants were asked to undock and redock a ligand from these proteins twice, first with the aid of a trace representation of the ligand in the correct position to guide them, and once again with no trace guide present. Below is a representation of two people in VR undocking a ligand from a protein, and also the three protein-ligand systems used in the study, with key interactions highlighted.
We show that it is possible to unbind and rebind ligands from three protein targets in a simulation timescale totalling less than 50ps, simulated at a rate of 4.5 ps/min of real time. Where non-expert users had trace atoms showing them the correct pose, all users were able to establish a docking pose within 2Å of the starting structure (1Å for two out of the three tasks). Where no trace atoms were present, binding poses understandably had higher variation, however, participants were still able to get within the same range of RMSD for all three systems. These results were achieved within a single hour-long training session with each participant and are presented below.
Our results show that, even with very limited experience, iMD-VR is performant enough to allow (a) unbinding of a ligand from a protein binding pocket and (b) re-establishment of the original binding pose, demonstrating iMD-VR as a useful tool for sampling states where a ligand is both bound and unbound from a protein.
Helen M. Deeks, Rebecca K. Walters, Stephanie R. Hare, Michael B. O’Connor, Adrian J. Mulholland and David R. Glowacki
In this recent publication included in the ‘Women in Computational Chemistry’ special issue of the Journal of Chemical Information and Modeling, we investigate the dependence of predicted reaction energetics on the choice of density functional and QM region size used in QM/MM calculations. DFT methods potentially offer a good combination of accuracy and computational cost but suffer from some well-known limitations and are not systematically improvable. The Claisen rearrangement of chorismate to prephenate, a simple unimolecular reaction catalysed by the enzyme chorismate mutase, was used as a model system to investigate how these choices for QM/MM calculations impact the predicted energetics of reaction.
There are many different density functionals available and the best density functional to choose for any application is not obvious. Commonly used density functionals (e.g. B3LYP, BH&HLYP, MO6-2X) predict barrier heights for the reaction in the enzyme that differ by as much 13 kcal/mol. When different density functionals give different results for the same system, which result should be preferred? Ab initio methods are potentially highly accurate and are systematically improvable but often restricted by their computational cost. Projector-based embedding allows the incorporation of ab initio methods into QM/MM calculations at reasonable computational cost as a small number of atoms can be selected for treatment at the highest levels. Embedding SCS-MP2 in DFT in these QM/MM calculations reduced the spread in predicted barrier height from 13 kcal/mol to 0.3 kcal/mol, essentially eliminating the dependence on the density functional, as shown below (top: standard DFT, bottom: SCS-MP2-in-DFT).
The optimum size of the QM region in QM/MM calculations is a matter of much debate in the literature. The effect of the size of the QM region was tested in solution (by adding additional water molecules to the QM region) and in the enzyme (by adding additional arginine side chains) showing that the difference in barrier heights predicted by common density functionals for any size of the QM region is significantly larger than the change in barrier caused by increasing the size of the QM region.
Ranaghan, K. E., Shchepanovska, D., Bennie, S. J., Lawan, N., Macrae, S. J., Zurek, J., Manby, F. R. and Mulholland, A. J.
J. Chem. Inf. Model., 2019
Our enzyme of interest—actinorhodin ketoreductase (actKR)—is found in the soil bacterium Streptomyces coelicolor, where it is normally implicated in the biosynthesis of the antibiotic actinorhodin. In addition to actKR’s natural scope, as is the case with other ketoreductases, some of its re-engineered variants are highly attractive to synthetic chemists by virtue of their high stereoselectivity in reducing small-molecule achiral ketones to chiral alcohols.
Experimental work on actKR featuring the small model substrate trans-1-decalone (a bicyclic aliphatic ketone) and similarly-sized chiral alcohols suggests that whereas the wild-type enzyme is mildly S-selective, some variants (e.g. the P94L mutant) were entirely S- selective, whereas others (e.g. the V151L mutant) only exhibited R- selectivity.
Featuring classical MD, MM/PBSA and hybrid QM/MM MD with umbrella sampling, our study successfully unravels the causes of such remarkable behaviour in wild-type, P94L, and V151L actKR towards trans-1-decalone. Explicitly examining both enantiomers of this naturally racemic substrate (something difficult to achieve in vitro), we conclude that changes in stereocontrol across actKR variants can be dictated by a subtle interplay of different causes (including reaction barrier height and accessibility of reactive poses). Interestingly, however, which factor is dominant in conferring stereoselectivity differs per variant.
The protocols we have used were chosen such that they (1) require input of the WT structure only; (2) use relatively limited computational resources (short simulations and semiempirical QM treatment); and (3) can be automated. Our study is thereby a good example of how computational biochemistry can become a practical, useful and efficient tool in biocatalyst engineering, offering perspectives that might otherwise be difficult to explore in vitro.
Our manuscript entitled, “The dynamical interplay between a megadalton peptide nanocage and solutes probed by microsecond atomistic MD; implications for design” has been accepted for publication in Physical Chemistry Chemical Physics (PCCP), (2018), DOI: 10.1039/c8cp06282j. Deborah K. Shoemark, Amaurys Avila Ibarra, James F. Ross, Joseph L. Beesley, Harriet E.V. Bray, Majid Mosayebi, Noah Linden, Tanniemola B. Liverpool, Simon N. McIntosh-Smith, Derek N. Woolfson and Richard B. Sessions.
Here we present the lessons learnt from performing atomistic simulations of 0.6 – 1 microseconds of three ~42 million atom peptide nanocage (SAGE) systems, with and without protein and small molecule solutes. Using GROMACS and the UK supercomputer Archer, data were collected over 2 years. Detailed analysis reveals the structural integrity/helical stability of the SAGEs themselves, how SAGEs interact with the proteins and small molecule species in the systems, whether contact with SAGE influences the native fold of solute proteins and how frequently solutes or ions pass through pores in the SAGEs. Knowing how and where to elaborate and/or modify the SAGE building blocks is important to inform the design process for this synthetic biological approach to diverse applications. As a vaccine delivery platform, SAGEs that are modular and reproducible, could be tailored to respond to emerging infectious threats more rapidly than conventional methods. For targeted drug delivery, adding peptides that bind SAGEs packed with drug, to specific cell types could reduce dosage demand and side-effects. As nanoreactors for complex chemistry, encapsulating enzyme pathways may allow for more efficient and environmentally friendly catalysis.
Constant pH Molecular Dynamics (CpHMD) is a simulation technique gaining in popularity owing to its ability to titrate ionisable residues and predict pKa’s using standard molecular mechanics force fields. State of the art CpHMD simulations involve running pH replica exchange MD (pH-REMD) simulations using an explicit solvent model to accurately sample the protein conformational and protonation state spaces. It has been successfully applied to a variety of natural proteins to estimate the pKa of key residues and provide mechanistic insights into pH-dependant phenomena.
In a recent paper published in Nature Communications, BCompB members Eric Lang and Adrian Mulholland pushed the boundaries of CpHMD simulations further by investigating the pH dependency of de novo α-helical barrels designed with a ring of 6 or 7 of strongly interacting glutamate residues whose side chains point toward the inside of the central, otherwise hydrophobic, lumen.
The CpHMD simulations were used to investigate the pH-dependant stability of the hexameric and heptameric α-helical barrels (named CC-Type2-LL-L17E and CC-Type2-IL-Sg-L17E respectively) observed experimentally by circular dichroism (CD). For each barrel, two 200 ns pH-REMD simulations in TIP3P water, were carried out with all ionisable residues allowed to titrate. A total of 16 replicas was used to cover a pH range between 3.0 and 10.5, with one replica per 0.5 pH unit, representing an aggregate sampling time of 6.4 µs for each structure.
Because the glutamate residues (Glu-17) are in close proximity and interact strongly, the authors anticipated that the individual microscopic pKa of each glutamate residue made little sense and instead considered the Glu-17 rings as single polyprotic species composed of 6 or 7 titrable groups. Using this approach, a macroscopic stepwise pKa can be calculated for the (Glu-17)6 and (Glu-17)7 species, giving results in excellent agreement with the experimental data. These calculations revealed that up to two negative charges can be accommodated inside the barrels but increasing the number of charged Glu dramatically decreases the stability of the barrels. The simulations also revealed that the two buried negative charges are accommodated due to the solvation of the rings of Glu-17 residues and the presence of a Na+ counter ion in the middle of the rings. Increasing the number of deprotonated Glu-17 above two leads to a reorientation of the Glu side chains to access bulk solvent. This, in turn, leads to disruption of the secondary structure and ultimately to the opening of the barrels.
The ring of Glu-17 residues in the central lumen of a hexameric α-helical barrel
The paper, which primarily presents the experimental and design work lead by Guto Rhys and Derek Woolfson, also provides an example of how the Isambard software, developed in the Wolfson group, was used by BCompB member Christopher Wood to rationalise the diversity of structures formed by the peptides studied in this work.
Growing resistance to antibiotics is a major threat to human health and there is an urgent need to develop new antibiotics or find new ways to keep existing antibiotics effective. β-lactamases are a class of enzymes that have evolved to break down penicillin-type antibiotics, leading to resistance. One method to avoid this is to administer a β-lactamase inhibitor as a resistance blocker. Clavulanic acid is one such example and is often combined with amoxicillin to form the drug commonly known as co-amoxiclav. In a recent paper published in Biochemistry, van der Kamp and co-workers have used combined quantum mechanics / molecular mechanics (QM/MM) methods to investigate how clavulanic acid can be broken down by different β-lactamases. TEM-1 and KPC-2 are clinically important enzymes that exhibit different susceptibilities to inhibition by clavulanic acid, with KPC-2 effectively resistant to inhibition by clavulanic acid.
QM/MM simulations were carried in AMBER with the semiempirical SCC-DFTB method used for the QM region and the MM region described by AMBER’s ff14SB force field. Complete two-dimensional free energy surfaces were then obtained by QM/MM umbrella sampling (US) simulations along two reaction coordinates: one to describe the proton transfer and one to describe the nucleophilic attack involved in deacylation step of the reaction. The weighted-histogram analysis method (WHAM) was used to generate the free energy surfaces and the minimum free energy path (MEP) on the surfaces was determined using the minimum energy pathway analysis for energy landscapes (MEPSA) software. The results show that the QM/MM screening protocol is reliable in discriminating the inhibitory activity of different covalent complexes in class A β-lactamases and can differentiate enzymes for which clavulanic acid is an effective inhibitor (TEM-1) from those for which it is not (KPC-2). Such screening techniques will prove to be useful in the development of new strategies to overcome antibiotic resistance.
In a recent publication in PNAS, Corey et al (Collinson Group, School of Biochemistry, University of Bristol) work on the bacterial form of the general secretion system – aka the SecY machinery. This complex carries out the bulk of pre-protein secretion at the bacterial plasma membrane, powered by both ATP and the proton-motive force (PMF). They were interested in the interaction of SecY with the energetically-important cardiolipin (CL) molecule. CL is thought to be involved in many different bioenergetics processes, and has been previously implicated in SecY activity.
To approach this problem, they coupled the high predictive power of coarse-grained (CG) molecular dynamics (MD) with experimental analyses. Considerable speed up on atomistic simulation can be achieved using CG force fields, such as the Martini force field for biomolecules. By reducing the degrees of freedom of a system, it is possible to achieve sampling orders of magnitude faster than atomistic simulation – driven primarily by an increase in permissible MD step size, a reduction in interactions to compute per step and a smoothing of the energy landscape.
The CG data revealed two distinct CL binding sites on the SecY surface, which they were able to validate using native mass spectrometry (nMS), with Dr Argyris Politis at King’s College London, and FRET-based analysis on carefully designed variants of SecY.
They then used these knockout variants to more deeply understand the importance of the SecY-CL interactions. Using a range of biochemical assays, they reveal that the specific interaction of CL at these sites is responsible for the previously-recorded heightened activity of SecY. Moreover, they establish a hitherto unknown role for CL in the PMF-driven activity of SecY. This is the first direct evidence of CL acting directly with the PMF in any bioenergetic system.
Tangible physical models have an important place in the history of chemistry and biochemistry. Three-dimensional (3D) molecular models have been used as conceptual and educational tools dating back to at least von Hofmann in the 1860s. Now we are able to further our understanding by interacting with molecules in real time as they move using virtual reality (VR) technology. The multi-user VR framework has been developed by a joint team of computer science and chemistry researchers at the University of Bristol, in collaboration with developers at Interactive Scientific and Oracle Corporation, have used Oracle’s public cloud infrastructure to combine real-time molecular simulations with the latest VR technology.
The movie below shows a researcher taking hold of a fully solvated benzylpenicillin ligand and interactively guiding it to dock it within the active site of the TEM-1 β-lactamase enzyme (with both molecules fully flexible and dynamic) and generate the correct binding mode. The simulations make use of a plug-in that communicates with the GPU-accelerated molecular simulation package OpenMM via PLUMED, allowing models of the type used for conventional MD simulations to be used directly within the VR framework. Generating a benzylpenicillin docking pathway would be very difficult just using standard computer algorithms, but as can be seen in the movie, the VR framework is intuitive and easy to control, enabling the researcher to generate a physically meaningful path. Anybody wishing to try out the tasks described in the paper can download the software at https://isci.itch.io/nsb-imd, and launch their own cloud-hosted session.
Through analysis of known protein structures, it is possible to gain insight into the rules that dictate how proteins fold. However, the number of experimentally determined protein structures is large and growing rapidly, which makes even the categorisation of protein structure difficult to perform. Computational tools can ameliorate this process, through automated categorisation and analysis.
A team from the University of Bristol, led by Prof Dek Woolfson, have recently published an article on AtlasCC. This computational resource automatically analyses the PDB to find an important protein substructure called the α-helical coiled coil and uses graph theory enumerate all the possible and observed structures this fold can adopt. These data are made accessible using a user-friendly, interactive web application that enables users to browse the structures. The application also identifies regions of coiled-coil structure space that has not been explored by nature, indicating possible opportunities for de novo design.