Blog Archives

Worldwide CompChem in the Fight against COVID-19

The war against COVID-19 has been waged on many fronts. The computational chemistry community has done its share during this pandemic to put forward a cure, a vaccine, or a better understanding of the molecular mechanisms behind human infection by the SARS-CoV-2 virus. A few vaccines are now rearing their heads and starting to make their way around the globe to stop the spread, amid a climate of disinformation, distrust, and political upheaval, all of which pose challenges yet to be faced aside from the technical and scientific ones.

This is by no means a comprehensive review of the literature; in fact, most of the literature cited herein was found on Twitter under the combined #CompChem and #COVID hashtags. Summarizing the research by the CompChem community on COVID-19-related topics in a single blog post would be nearly impossible; I trust a book is being written on it as I type these lines.

The structural elucidation of the proteins associated with the SARS-CoV-2 virus is probably the first step required in designing chemical compounds capable of modifying their functions and altering the virus’ life cycle without altering the biochemistry of the host. The Coronavirus Structural Taskforce has elucidated the structure of 28 proteins of SARS-CoV-2, aside from the 300+ proteins from the previous SARS-CoV virus, using the tools from the FoldIt at home game, based on the Rosetta program, to heuristically predict the structure of these proteins. Structure-based drug design relies on knowledge of the structure of the active site (hence the name), but for newly discovered proteins for which homology modeling is not entirely feasible, a ligand-based approach named D3Similarity was developed early in the pandemic by the group of Prof. Zhijian Xu to identify possible active sites. Mapping of the viral genome and proteome was also achieved early on, during the first days of lockdown in the American continent. The information was readily made available and usable for further studies, which prompts another challenge: the rapid dissemination, review, and evaluation of information in order to make scientifically sound claims and data-based decisions. In this regard, the role of preprints cannot be stressed enough. Without rapid communication, scientific results cannot generate the much-needed critical mass to turn all these data into knowledge. As evidenced by the vast majority of the links present in this post, ChemRxiv from the ACS served the much-needed function of gathering, linking, and putting the data out there for scientific evaluation, in order to accelerate the discovery of solutions targeting the various steps of the virus’ reproductive cycle through various strategies.

The role of supercomputing has been paramount worldwide to the various efforts made in CompChem (read the C&EN piece) on various fronts, from structural elucidation, such as the AI-driven structural modelling of spike proteins and their infection mechanism led by Prof. Rommie Amaro (UCSD) and Dr. Arvind Ramanathan, which was recognized with the Gordon Bell Special Prize, to the development of vaccines. Many molecular dynamics simulations have been performed on potential inhibitors of proteins such as the spike protein. In some cases these simulations, coupled with cryo-EM, allowed for the elucidation of the hinging mechanism of these spike proteins and their thermodynamic properties, and all-atom simulations assessed the rigidity of the receptor as the cause of its infectivity. Still, owning these computing resources isn’t always cost effective; that’s why calculations have been outsourced to companies such as Amazon Web Services, as Pearlman did for the QM/DFT calculations of the binding energy of several drug candidates for the inhibition of the virus’ main protease (MPro). Many other CADD studies are available (here, here, and here). Researchers from all around the world can chip in and join the effort by reaching out to the COVID-19 High Performance Computing (HPC) Consortium, which puts some of the most advanced computing systems in the hands of private and academic researchers with relevant projects aimed at the study of the virus. On the other side of the Atlantic, the Partnership for Advanced Computing in Europe (PRACE) also provides access to advanced computing services for research. In an effort to keep all the developing information curated and concentrated, the COVID-19 Molecular Structure and Therapeutics Hub was created to provide a community-driven data repository and curation service for molecular structures, models, therapeutics, and simulations related to computational research on therapeutic opportunities.

As described above, molecular dynamics simulations are crucial in assessing how drugs interact with proteins. But molecular dynamics can only do so much, as it is computationally intensive; hence the use of polarizable force fields (PFF) together with high-resolution sampling methods to obtain results in the microsecond regime, an approach that has also been applied to the modeling of the MPro protein: the phase space is sampled by different MD trajectories, which are then tested and selected. Aside from classical simulations, artificial intelligence predictions, and docking calculations, quantum mechanical calculations have also been employed in the search for the most intimate interactions governing the mechanisms of protein inhibition. On this front, a Fragment Molecular Orbital based analysis was carried out to find which residues in MPro interacted the most with a given inhibitor.

Virtual screening is at the heart of the computationally aided drug discovery process, especially high-throughput virtual screening such as the one performed by the group of Andre Fischer at Basel, in which a pool of over 600 million compounds analyzed as potential protease inhibitors was narrowed down to 11 potential drugs. Repurposing of antiviral drugs, and other entry-inhibiting compounds, is also a major avenue explored in the search for treatments; in the linked study by Shailly Tomar et al., antiviral drugs that are also anti-inflammatory are believed to take care of the lung inflammation and injury associated with the infection while at the same time disrupting the virus’ infection mechanism. The comeback of Virtual Reality can make virtual screening more cooperative, even during lockdown conditions, and more ‘tangible’, as the company Nanome has proven with their COVID-19 Town Hall meetings, which are aimed at the modeling of proteins in 3D space. Aside from the de novo and repurposing efforts, the search for peptides against infection by SARS-CoV-2 was an important topic (here and here). More recently, Skariyachan and Gopal turned to natural products of herbal origin for their virtual screening (molecular docking and dynamics). In their view, the chemical complexity achieved through biosynthesis can overcome the bottleneck of chemical discovery while at the same time drawing on the ancient practices of herbal remedies described in Ayurveda. Other researchers, like Manish Manish, have also turned to libraries of 500,000+ natural compounds to find potential drugs for MPro.

The year is coming to an end, but the pandemic is not, in any way. Now, with the advent of new strains and the widespread vaccination effort being put in place, it is more important than ever to keep the fight strong in our labs but also in our personal habits and responsibilities; the same advice given at the beginning of the year is still in effect today and will continue to be for the months to come. I want to wish everyone who reads this a happy holiday season, but above all I want to pay a small tribute to the scientists working relentlessly in one of the largest coordinated scientific efforts in modern history, one that can only be compared to the Moon landing or the Manhattan Project; to those scientists and all the healthcare personnel, may you find rest soon, may your efforts never go unnoticed: Thank you for your service.

A New Graduate Student. Raúl Márquez

We’re always happy at the lab when a student defends their dissertation thesis and now it was the turn of Raúl Márquez-Avilés to do so with flying colors.

The title of his dissertation is “Molecular Dynamics Simulations of 5 potential entry inhibitors for HIV-1”. He performed 500 ns long molecular dynamics simulations of the CD4-gp120 protein complex interacting with one or several molecules of various lead compounds with inhibitory properties. The leads were obtained previously in our group (by Durbis Castillo, now at McGill) from a massive docking library of ca. 16 million compounds, all having a central piperazine core (Fig. 1).

Figure 1. Lead compounds: Piperazine cores with heterocyclic substitutions.

The protein gp120 is a glycoprotein located on the surface of the HIV virus which couples to the CD4 protein on T lymphocytes; this coupling is the first step in the infection process of a healthy cell, so generating inhibitors of it could help stop the infection from spreading systemically. Four systems were devised: (SB) the reference state, for which only gp120 and CD4 were considered; (S2) a single ligand molecule was placed in the Phe43 cavity of gp120 to assess its inhibitory capacity; (S3) the ligand was placed right outside the Phe43 cavity to assess its entry capacity; and (S4) five ligand molecules were placed outside the Phe43 cavity of gp120 to force their entry (Fig. 2). Their binding energies were calculated using MM-PBSA, and although all five ligands show statistically similar results as inhibitors, all five exhibit a stronger binding energy than the reference, which supports their efficacy in preventing the coupling of the virus to the healthy cell. As a bonus, his research on system S4 shed light on the existence of an allosteric site on gp120 that warrants further research in our group.

Figure 2. Systems for which 500 ns MD simulations were performed.
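
For readers unfamiliar with MM-PBSA, the bookkeeping behind a binding energy can be sketched in a few lines. In the common single-trajectory flavor, each MD frame yields a free-energy estimate for the complex, the receptor, and the ligand, and the binding energy is the average of G(complex) - G(receptor) - G(ligand) over frames. The snippet below is only a minimal illustration under those assumptions: the function name, the three-frame series, and all numbers (taken here to be in kcal/mol) are made up, the entropy term is ignored, and this is not the workflow used in the thesis.

```python
from math import sqrt
from statistics import mean, stdev


def binding_energy(complex_e, receptor_e, ligand_e):
    """Return (mean, standard error) of the per-frame binding energy.

    Each argument is a sequence with one energy per MD frame, as would be
    produced by an MM-PBSA tool (hypothetical input, kcal/mol assumed).
    """
    if not (len(complex_e) == len(receptor_e) == len(ligand_e)):
        raise ValueError("all three series need one value per frame")
    # dG_bind for every frame: G(complex) - G(receptor) - G(ligand)
    per_frame = [c - r - l for c, r, l in zip(complex_e, receptor_e, ligand_e)]
    # The standard error needs at least two frames
    sem = stdev(per_frame) / sqrt(len(per_frame)) if len(per_frame) > 1 else 0.0
    return mean(per_frame), sem


# Made-up energies for three frames, for illustration only
complex_e = [-1050.0, -1048.0, -1052.0]
receptor_e = [-800.0, -799.0, -801.0]
ligand_e = [-220.0, -221.0, -219.0]

dg_mean, dg_sem = binding_energy(complex_e, receptor_e, ligand_e)
print(f"dG_bind = {dg_mean:.1f} +/- {dg_sem:.1f} kcal/mol")
```

A more negative average means stronger binding, and the standard error is what lets one say whether two ligands are statistically distinguishable.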

This work is still pending publication.

Raúl Márquez has always proven to be a hard-working and very self-sufficient student, a very cheerful labmate, and, as I just learned yesterday, an avid chess player. I’m sure he has a bright future in whichever endeavor he chooses now. Congratulations Raúl Márquez-Avilés!

Our first dabble in #MedChem through #CompChem

We’ve been expanding the scope of our research interests from quantum mechanical calculations to docking and MedChem for over a year now; it has been a very interesting ride and a very rich avenue of research to explore. Durbis Castillo has led this project on his own initiative, and today he presents us with a guest post on the nuances of his project. Bear in mind that the details of the calculations and a small, very targeted tutorial on MAESTRO will be provided in later posts, and that making all these decisions required a long process of trial and error. We can only thank Dr. Antonio Romo for his help in minimizing the time this process took.

HIV is a tricky virus, and even though many of the steps included in its lifecycle are druggable, the chemical machinery making it work has been quite elusive since research groups started studying it. Highly Active Antiretroviral Therapy (HAART) works thanks to the combination of several drugs targeting different proteins such as the HIV protease or reverse transcriptase.

In 1998 the elucidation of the gp120 envelope glycoprotein crystal structure introduced a new step in the drug discovery race: HIV entry. Since drugs targeting gp120 have not been widely explored or developed, we decided to use common methodologies like docking (rigid and induced-fit) and ADME predictions to address the following question: How can we easily discover a molecule that inhibits gp120 binding to the lymphocyte CD4 receptor without having to synthesize it first? The answer was to perform a virtual screening with a bottleneck methodology based on docking calculations.

Docking methodologies are often regarded as insufficient, careless, or even unscientific, since the algorithms they are founded upon are not as accurate or descriptive as the ones that support DFT or ab initio calculations, for example. But there is a huge advantage to simpler operations: fewer computational resources are required. Then, following Russia’s example when making tanks during WWII, why not run thousands or millions of docking calculations to quickly explore an entire chemical space and find which molecules are more likely to bind the protein?

And this is exactly what we did. We built a piperazine-based dataset of 16.3 million compounds, all of them including fragments that are reported in the medicinal chemistry literature, thus having two main characteristics: synthetic accessibility and pharmacological activity. These 16.3 million compounds were thoroughly filtered through several docking steps, each one more accurate and comprehensive than the previous one, abruptly eliminating poorly fitted molecules and leaving us with a total of 275 candidates that were redocked against a different crystal structure and with a different program (consensus docking).
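
The bottleneck idea described above can be sketched as a simple funnel: every stage rescores the survivors of the previous stage with a more expensive method and keeps only the best-ranked ones. The code below is a toy illustration, not the actual protocol; the ligand names, the two score tables, and the cut-offs are invented, and it assumes the usual convention that a more negative docking score is better.

```python
def funnel(candidates, stages):
    """Pass candidates through successive (name, score_fn, keep) stages.

    Each stage ranks the current survivors by score_fn (lower is better)
    and keeps only the top `keep` molecules for the next stage.
    """
    survivors = list(candidates)
    for name, score_fn, keep in stages:
        ranked = sorted(survivors, key=score_fn)
        survivors = ranked[:keep]
        print(f"{name}: kept {len(survivors)} of {len(ranked)}")
    return survivors


# Hypothetical precomputed scores standing in for real docking runs
coarse_scores = {"lig1": -5.1, "lig2": -7.3, "lig3": -4.0,
                 "lig4": -8.2, "lig5": -6.6, "lig6": -3.9}
fine_scores = {"lig1": -6.0, "lig2": -9.0, "lig3": -5.0,
               "lig4": -8.1, "lig5": -9.4, "lig6": -4.5}

stages = [
    ("fast docking", coarse_scores.get, 4),      # cheap, run on everything
    ("accurate docking", fine_scores.get, 2),    # costly, run on survivors
]

hits = funnel(coarse_scores.keys(), stages)
```

The point of the design is that the expensive scoring function is only ever called on the small set that survived the cheap one, which is what makes screening millions of compounds tractable.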

After analyzing the ADME properties of the candidates, with descriptors such as human oral absorption and possible metabolic reactions, as well as the Induced-Fit Docking score of these molecules, ten ligands were selected as the best ones inside the analyzed chemical space. You can see ligand 255 (figure 1) as an example of the molecules that obtained the best scores throughout the docking steps.


Figure 1
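
The final selection step described above, ADME filters followed by ranking on the Induced-Fit Docking score, could look roughly like the sketch below. The field names, thresholds, and ligand entries are all hypothetical and chosen only for illustration; they are not the cut-offs or the data used in the project, and a lower (more negative) IFD score is assumed to be better.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    name: str
    oral_absorption_pct: float   # predicted human oral absorption, percent
    n_metabolic_reactions: int   # predicted number of metabolic reactions
    ifd_score: float             # induced-fit docking score, lower is better


def select_best(candidates, min_absorption, max_metab, n_keep):
    """Drop candidates failing the ADME cut-offs, then rank by IFD score."""
    passing = [
        c for c in candidates
        if c.oral_absorption_pct >= min_absorption
        and c.n_metabolic_reactions <= max_metab
    ]
    return sorted(passing, key=lambda c: c.ifd_score)[:n_keep]


# Made-up candidates, for illustration only
pool = [
    Candidate("lig_A", 90.0, 3, -600.0),
    Candidate("lig_B", 40.0, 2, -650.0),   # fails the absorption cut-off
    Candidate("lig_C", 85.0, 9, -640.0),   # fails the metabolism cut-off
    Candidate("lig_D", 95.0, 4, -620.0),
]

best = select_best(pool, min_absorption=80.0, max_metab=7, n_keep=2)
best_names = [c.name for c in best]
```

Filtering before ranking means a molecule with an excellent docking score but poor predicted pharmacokinetics never reaches the final list.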

Many colleagues working on this kind of topic asked “Why didn’t you download a set of molecules from ZINC or Maybridge?” The answer to this question has three aspects: first, we wanted to test a combinatorial approach to drug design; second, we wanted to test whether including a piperazine as the core of the set of molecules would immediately grant them activity and high potency; and finally, a purpose-built database will always confer a higher degree of novelty on the possible hits when compared to commercially available compounds whose synthesis has already been developed. However, this last point needs to be addressed by an organic chemist, since none of the molecules from our database have ever been synthesized (any takers?).
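
To give a feel for why a combinatorial approach grows so quickly, here is a minimal sketch of enumerating a library around a fixed core. It assumes, purely for illustration, a piperazine core with two substitution sites and a handful of invented fragment names; the real fragment lists and the way the 16.3 million compounds were assembled are not reproduced here.

```python
from itertools import product


def enumerate_library(core, *site_fragments):
    """Yield one tuple (core, fragment_site1, fragment_site2, ...) per
    combination of fragments, one fragment per substitution site."""
    for combo in product(*site_fragments):
        yield (core,) + combo


# Hypothetical fragment lists for two substitution sites
site1_fragments = ["pyridyl", "thiazolyl", "indolyl"]
site2_fragments = ["benzoyl", "sulfonyl"]

library = list(enumerate_library("piperazine", site1_fragments, site2_fragments))
library_size = len(library)   # product of the fragment counts per site

print(library_size, library[0])
```

Because the library size is the product of the fragment counts at each site, a few hundred literature fragments per position are already enough to reach millions of candidates.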

Right now, we are trying to explore further through molecular dynamics simulations using Desmond and Amber. Other future goals for this project include screening large databases of commercial and novel compounds against gp120 and other proteins involved in the HIV lifecycle. Also, we remain open to collaborating with anyone interested in taking up the challenge of synthesizing our molecules, as well as performing the biochemical assays to get an idea of their activity.

More details on MD simulations and the path of our first virtual hits to follow. Anyone interested in reading my thesis work can contact me through my LinkedIn profile. An article is under preparation and will soon be submitted; stay tuned!
