Blog Archives

Worldwide CompChem in the Fight against COVID-19

The war against COVID-19 has been waged in many fronts. The computational chemistry community has done their share during this pandemic to put forward a cure, a vaccine, or a better understanding of the molecular mechanisms behind the human infection by the SARS-CoV-2 virus. As few vaccines show currently their heads and start making their way around the globe to stop the spreading, amidst a climate of disinformation, distrust and political upheaval, all of which pose several challenges yet to be faced aside from the technical and scientific ones.

This is by no means a comprehensive review of the literature, in fact, most of the cited literature herein was observed in Twitter under the #CompChem and #COVID combined hashtags; Summarizing the research by the CompChem community on COVID-19 related topics in a single blog-post would be near to impossible—I trust a book is being written on it as I type these lines.

The structural elucidation of the proteins associated to the SARS-CoV-2 virus is probably the first step required in designing chemical compounds capable of modifying their functions and altering their life-cycle without altering the biochemistry of the hosts. The Coronavirus Structural Taskforce has elucidated the structure of 28 proteins of SARS-CoV-2 aside from the 300+ proteins from the previous SARS-CoV virus using the tools from the FoldIt at home game based on the Rosetta program to heuristically predict the structure of these proteins. Structure based drug design rely on the knowledge of the structure of the active site (hence the name), but in the case of newly discovered proteins for which homology modeling is not entirely feasible, a ligand-based approach named D3Similarity was developed early in the pandemic for identifying the possible active sites by the group of Prof. Zhijian Xu. Mapping of the of the viral genome and proteome was also achieved early on during the first dates of lockdown in the American continent. The information was readily made available and usable for further studies which prompts another challenge: the rapid dissemination, review and evaluation of information to make scientifically sound claims and make data-based decisions. In this regard, the role of preprints cannot be stressed enough. Without a rapid communication, scientific results cannot generate a much needed critical mass to turn all these data into knowledge. As evidenced by the vast majority of the links present in this post, ChemRXiv from the ACS served the much needed function to gather, link and put the data for scientific evaluation out there in order to accelerate the discovery of solutions to the various steps of the virus’ reproductive cycle through various strategies.

The role of supercomputing has been paramount worldwide to the various efforts made in CompChem (read the C&EN piece) in various fronts from structural elucidation, such as the AI driven structural modelling of spike proteins and their infection mechanism led by Prof. Rommie Amaro (UCSD) and Dr. Arvind Ramanathan which was celebrated by the Bell Prize, to development of vaccines. Many Molecular Dynamics simulations have been performed on potential inhibitors of proteins such as the spike protein, in some cases these simulations coupled with cryo-EM microscopy allowed for the elucidation of the hinging mechanism of these spike proteins, their thermodynamic properties, and all atoms-simulations assessed the rigidity of the receptor as the cause of its infectivity. Still, owning these computing resources isn’t always cost effective; that’s why there have been outsourced to companies such as Amazon web services as Pearlman did for the QM/DFT calculations of the binding energy of several drug candidates for the inhibition of the virus’ main protease (MPro). Many other CADD studies are available (here, here, and here). Researchers from all around the world can chip in and join the effort by reaching out to the COVID-19 High Performance Computing Consortium (HPC) which brings together some of the most advanced computing systems to the hands of private and academic researchers with relevant projects aimed to the study of the virus. On the other side of the Atlantic, the Partnership for Advanced Computing in Europe (PRACE) also provides access to advanced computing services for research. As an effort to keep all the developing information curated and concentrated, the COVID-19 Molecular Structure and Therapeutics Hub was created to provide a community-driven data repository and curation service for molecular structures, models, therapeutics, and simulations related to computational research related to therapeutic opportunities.

As described above, molecular dynamics simulations are capital in the assessment of how drugs interact with proteins. But molecular dynamics can only do so much as they’re computing intensive so, the use of Polarizable Force Fields (PFF) algorithms to obtain results in the microseconds regime with high-resolution sampling methods which have been applied also to the modeling of the MPro protein; the phase space is sampled by different MD trajectories which are then tested and selected. Aside from classical simulations, artificial intelligence predictions and docking calculations, also quantum mechanical calculations have been employed in the search for the most intimate interactions governing the mechanisms of inhibition of proteins. In this front, a Fragment Molecular Orbital based analysis was carried out to find which residues in MPro interacted the most with a given inhibitor.

Virtual screening is at the heart of the computationally aided drug discovery process, specially high-throughput virtual screening such as the one performed by the group of Andre Fischer at Basel, in which 11 potential drugs were narrowed from a pool of over 600 million compounds that were analyzed as potential protease inhibitors. Repurposing of antiviral drugs, and other entry-inhibiting compounds, is also a major avenue explored in the search for treatments; in the linked study by Shailly Tomar et al. antiviral drugs which are also anti inflammatory are believed to take care of lung inflammation and injury associated to the infection at the same time they tend to disrupt the virus’ infection mechanism. The comeback of Virtual Reality can make virtual screening more cooperative even during lockdown conditions and more ‘tangible’ as the company Nanome has proven with their COVID-19 Town Hall meetings which aim to the modeling of proteins in 3D space. Aside from the de novo and repurposing efforts, the search for peptides against infection by SARS-CoV-2 was an important topic (here and here). More recently, Skariyachan and Gopal turn to natural products from herbal origins for their virtual screening (molecular docking and dynamics). In their perspective the chemical complexity achieved through biosynthesis can overcome the bottleneck of chemical discovery while at the same time turning to the ancient practices of herbal remedies described in Ayurveda. Other researchers like Manish Manish have also turned to libraries of 500,000+ natural compounds to find potential drugs for MPro.

The year is coming to an end but not the pandemic in any way. Now, with the advent of new strains, and the widespread vaccination effort put in place, it is more important than ever to keep the fight strong in our labs but also in our personal habits and responsibilities—the same advices that were given at the beginning of the year are still in effect today and will continue to be for the months to come. I want to wish everyone who reads this a happy holiday season, but above all I want to pay a small tribute to the scientists working relentlessly in one of the largest coordinated scientific efforts in modern history, one that can only be compared to the Moon landing or the Manhattan Project; to those scientists and all the healthcare personnel, may you find rest soon, may your efforts never go unnoticed: Thank you for your service.

A New Gradúate Student. Raúl Márquez

We’re always happy at the lab when a student defends their dissertation thesis and now it was the turn of Raúl Márquez-Avilés to do so with flying colors.

The title of his dissertation is “Molecular Dynamics Simulations of 5 potential entry inhibitors for HIV-1“. He performed 500 ns long molecular dynamics simulations of the CD4 – gp 120 proteins interacting with one or several molecules of various lead compounds with inhibitory properties. The leads were obtained previously in our group (by Durbis Castillo, now at McGill) from a massive docking library of ca. 16 million compounds, all having a central piperazine core (Fig1)

Figure 1. Lead compounds: Piperazine cores with heterocyclic substitutions.

The protein gp120 is a surface glyco-protein located at the surface of the HIV virus which couples to the CD4 protein on lymphocytes-T, being this the first step in the infection process of a healthy cell; generating inhibitors of this coupling could help stop the infection from spreading systemically. Four systems were devised: (SB) The reference state for which only gp-120 and CD4 were considered, (S2) A single ligand molecule was placed in the Phe43 cavity of gp120 to assess their inhibitory capacity, (S3) the ligand was placed right outside the Phe43 cavity to assess their entry capacity, and (S4) five ligand molecules were placed outside the Phe43 cavity of gp120 to force their entry (Fig2). Their binding energies were calculated using MM-PBSA and although all five ligands show statistically similar results as inhibitors all five exhibit a stronger binding energy than the reference proving their efficacy in preventing the coupling of the virus to the healthy cell. As a bonus, his research on system S4 shed light on the existence of an allosteric site on gp120 that will warrant further research in our group.

Figure 2. Systems for which 500 ns MD simulations were performed.

This work is still pending publication.

Raúl Márquez has always proven to be a hard working person who is also very self-sufficient student, a very cheerful labmate, and, as I just learned yesterday, an avid chess player. I’m sure he has a bright future in whichever endeavor he chooses now. Congratulations Raúl Márquez-Avilés!

Stability of Unnatural DNA – @PCCP #CompChem

As is the case of proteins, the functioning of DNA is highly dependent on its 3D structure and not just only on its sequence but the difference is that protein tertiary structure has an enormous variety whereas DNA is (almost) always a double helix with little variations. The canonical base pairs AT, CG stabilize the famous double helix but the same cannot be guaranteed when non-canonical -unnatural- base pairs (UBPs) are introduced.


Figure 1

When I first took a look at Romesberg’s UBPS, d5SICS and dNaM (throughout the study referred to as X and Y see Fig.1) it was evident that they could not form hydrogen bonds, in the end they’re substituted naphtalenes with no discernible ways of creating a synton like their natural counterparts. That’s when I called Dr. Rodrigo Galindo at Utah University who is one of the developers of the AMBER code and who is very knowledgeable on matters of DNA structure and dynamics; he immediately got on board and soon enough we were launching molecular dynamics simulations and quantum mechanical calculations. That was more than two years ago.

Our latest paper in Phys.Chem.Chem.Phys. deals with the dynamical and structural stability of a DNA strand in which Romesberg’s UBPs are introduced sequentially one pair at a time into Dickerson’s dodecamer (a palindromic sequence) from the Protein Data Bank. Therein d5SICS-dNaM pair were inserted right in the middle forming a trisdecamer; as expected, +10 microseconds molecular dynamics simulations exhibited the same stability as the control dodecamer (Fig.2 left). We didn’t need to go far enough into the substitutions to get the double helix to go awry within a couple of microseconds: Three non-consecutive inclusions of UBPs were enough to get a less regular structure (Fig. 2 right); with five, a globular structure was obtained for which is not possible to get a proper average of the most populated structures.

X and Y don’t form hydrogen bonds so the pairing is pretty much forced by the scaffold of the rest of the DNA’s double helix. There are some controversies as to how X and Y fit together, whether they overlap or just wedge between each other and according to our results, the pairing suggests that a C1-C1′ distance of 11 Å is most stable consistent with the wedging conformation. Still much work is needed to understand the pairing between X and Y and even more so to get a pair of useful UBPs. More papers on this topic in the near future.

Unnatural DNA and Synthetic Biology

Ever since I read the highly praised article by Floyd Romesberg in Nature back in 2013 I got really interested in synthetic biology. In said article, an unnatural base pair (UBP) was not only inserted into a DNA double strand in vivo  but the organism was even able to reproduce the UBPs present in subsequent generations.


Romesberg’s Nucleosides. No Hydrogen bonding is formed between them!

Inserting new unnatural base pairs in DNA works a lot like editing a computer’s code. Inserting a couple UBPs in vitro is like inserting a comment; it wont make a difference but its still there. If the DNA sequence containing the UBPs can be amplified by molecular biology techniques such as PCR it means that a polymerase enzyme is able to recognize it and place it in site, this is equivalent to inserting a ‘hello world’ section into a working code; it will compile but it’s pretty much useless. Inserting these UBPs in vivo means that the organism is able to thrive despite the large deformation in a short section of its genetic code, but having it replicated by the chemical machinery of the nucleus is an amazing feat that only a few molecules could allow.

The ultimate goal of synthetic biology would be to find a UBP which codes effectively and purposefully during translation of DNA.This last feat would be equivalent to inserting a working subroutine in a program with a specific purpose. But not only could the use of UBPs serve for the purposes of expanding the genetic code from a quaternary (base four) to a senary (base six) system: the field of DNA origami could also benefit from having an expansion in the chemical and structural possibilities of the famous double helix; marking and editing a sequence would also become easier by having distinctive sections with nucleotides other than A, T, C and G.

It is precisely in the concept of double helix that our research takes place since the available biochemical machinery for translation and replication can only work on a double helix, else, the repair mechanisms get activated or the DNA will just stop serving its purpose (i.e. the code wont compile).

My good friend, Dr. Rodrigo Galindo and I have worked on the simulation of Romesberg’s UBPs in order to understand the underlying structural, dynamical and electronic causes that made them so successful and to possibly design more efficient UBPs based on a set of general principles. A first paper has been accepted for publication in Phys.Chem.Chem.Phys. and we’re very excited for it; more on that in a future post.

New paper in Computational and Theoretical Chemistry

I always get very happy to have a new paper out there! I find it exciting but most of all liberating since it makes you feel like your work is going somewhere but most of all that it is making its way ‘out there’; there is a strong feeling of validation, I guess.

Two very different families of calix[n]arenes (Fig 1) were tested as drug carriers for a very small molecule with a huge potential as a chemotherapeutic agent against Leukemia, namely, 3-phenyl-1H-[1]benzofuro[3,2-c]pyrazole a.k.a. GTP which has proven to be an effective in vitro Tyrosine Kinase III inhibitor. Having such a low molecular weight it is expected to have a very high excretion rate therefore the use of a carrier could increase its retention time and hence its activity. This time we considered n = 4, 5, 6 and 8 for the size of the cavities and R = -SO3H and -OEt as functional groups on the upper rim as to evaluate only a polar coordinating group and a non-polar non-coordinating one since GTP offers two H-bond acceptor sites and one H-bond donor one along the π electron density that could form π – π stacking interactions between the aromatic groups on GTP and the walls of the calixarene.

Fig 1. Calixarenes under study and their complexes with GTP

Fig 1. Calixarenes under study and their complexes with GTP

Once again calculations were carried out at the B97D/6-31G(d,p) level of theory along with molecular dynamics simulations for over 100 ns of production runs. NBO Deletion interaction energies were computed in order to discern which hosts formed the most stable complexes.

NBO Del interaction energies B97D/6-31G(d,p)

NBO Del interaction energies B97D/6-31G(d,p)

You may find a link to the ScienceDirect website for downloading the paper from this link. Last, but certainly not least, I’d like to thank all coauthors for their contributions and patience in getting this study published: Dr. Rodrigo Galindo-Murillo; Alberto Olmedo-Romero; Eduardo Cruz-Flores; Dr. Petronela M. Petrar and Prof. Dr. Kunsági-Máté Sándor. Thanks a lot for everything!


Donor and acceptor H-bond sites increases the probability of keeping the drug in place for a higher retention rate

Donor and acceptor H-bond sites increases the probability of keeping the drug in place for a higher retention rate

New paper in Journal of Chemical Theory and Computation

Happy new year to all my readers!

Having a new paper published is always a matter of happiness for this computational chemist but this time I’m excedingly excited about anouncing the publishing of a paper in the Journal of Chemical Theory and Computation, which is my highest ranked publication so far! It also establishes the consolidation of our research group at CCIQS as a solid and competitive group within the field of theoretical and computational chemistry. The title of our paper is “In Silico design of monomolecular drug carriers for the tyrosine kinase inhibitor drug Imatinib based on calix- and thiacalix[n]arene host molecules. A DFT and Molecular Dynamics study“.

In this article we aimed towards finding a suitable (thia-) calix[n]arene based drug delivery agent for the drug Imatinib (Gleevec by Novartis), which is a broadly used powerful Tyrosine Kinase III inhibitor used in the treatment of Chronic Myeloid Leukaemia and, to a lesser extent, Gastrointestinal Stromal Tumors; although Imatinib (IMB) exhibits a bioavailability close to 90% most of it is excreted, becomes bound to serum proteins or gets accumulated in other tissues such as the heart causing several undesired side effects which ultimately limit its use. By using a molecular capsule we can increase the molecular weight of the drug thus increasing its retention, and at the same time we can prevent Imatinib to bind, in its active form, to undesired proteins.

We suggested 36 different calix and thia-calix[n]arenes (CX) as possible candidates; IMB-CX complexes were manually docked and then optimized at the B97D/6-31G(d,p) level of theory; Stephan Grimme’s B97D functional was selected for its inclusion of dispersion terms, so important in describing π-π interactions. Intermolecular interaction energies were calculated under the Natural Bond Order approximation; a stable complex was needed but a too stable complex would never deliver its drug payload! This brings us to the next part of the study. A monomolecular drug delivery agent must be able to form a stable complex with the drug but it must also be able to release it. Molecular Dynamics simulations (+100 ns) and umbrella sampling methods were used to analyse the release of the drug into the aqueous media.

Optimized geometries for all complexes under study (B97D/6-31G*)

Optimized geometries for the 20 most stable complexes under study (B97D/6-31G*)

Potential Mean Force profiles for the four most stable complexes for position N1 and  N2 from the QM simulations are shown below (Red, complexes in the N1 position, blue, N2 position). These plots, derived from the MD simulations  give us an idea of the final destination of the drug respect of the calixarene carrier. In the next image, the three preferred structures (rotaxane-like; inside; released) for the final outcome of the delivery process are shown. The stability of the complexes was also assessed by calculating the values of ΔG binding through the use of the Poisson equations.

PMF for the most stable compounds

PMF for the most stable compounds

General MD simulation final structures

General MD simulation final structures

Thanks to my co-authors Maria Eugenia Sandoval-Salinas and Dr. Rodrigo Galindo-Murillo for their enormous contributions to this work; without their hard work and commitment to the project this paper wouldn’t have been possible.

Webinar with Dr. Erik Lindhal – NVIDIA+GROMACS

Thanks to Devang Sachdev from NVIDIA for bringing this webinar to my attention.

The future of computational chemistry seems to be written in CUDA for GPU’s specially when it comes to Molecular Dynamics; as such, NVIDIA has gone through great lengths into introducing scientific computing methods for GPU’s. I still have a pending review of a test drive that people at NVIDIA and EXXACTCORP kindly allowed me to run but that is the topic of the next post.

Next Thursday, April 4th, 2013 from 9:00 AM – 10:00 AM Pacific Standard Time there will be a webinar in which Dr. Erik Lindhal at Stockholm University and NVIDIA will discuss latest GPU-acceleration technologies available to GROMACS users; more specifically the latest accelerated version of GROMACS 4.6, which features are supported, it’s installation and use, and how it performs with latest NVIDIA Kepler GPUs.

Register here:

Please register and check your local timezone to avoid delays. I will register as soon as I finish typing this. Thanks once again to Devang Sachdev for all his help, patience and trust in this forum.

%d bloggers like this: