For over a year now we’ve been expanding the scope of our research interests from quantum mechanical calculations to docking and MedChem; it has been a very interesting ride and a very rich avenue of research to explore. Durbis Castillo has led this project on his own initiative, and today he presents us with a guest post on its nuances. Bear in mind that the details of the calculations and a small, very targeted tutorial on MAESTRO will be provided in later posts, and that making all these decisions required a long process of trial and error; we can only thank Dr. Antonio Romo for his help in minimizing the time this process took.
HIV is a tricky virus, and even though many of the steps included in its lifecycle are druggable, the chemical machinery making it work has been quite elusive since research groups started studying it. Highly Active Antiretroviral Therapy (HAART) works thanks to the combination of several drugs targeting different proteins such as the HIV protease or reverse transcriptase.
In 1998 the elucidation of the gp120 envelope glycoprotein crystal structure introduced a new step in the drug discovery race: HIV entry. Since drugs targeting gp120 have not been widely explored or developed, we decided to use common methodologies like docking (rigid and induced-fit) and ADME predictions to address the following question: how can we easily discover a molecule that inhibits gp120 binding to the lymphocyte CD4 receptor without having to synthesize it first? The answer was to perform a virtual screening with a bottleneck methodology based on docking calculations.
Docking methodologies are often regarded as insufficient, careless or even unscientific, since the algorithms they are founded upon are not as accurate or descriptive as the ones that support DFT or ab initio calculations, for example. But there is a huge advantage to simpler operations: fewer computational resources are required. So, following Russia’s example when making tanks during WWII, why not run thousands or millions of docking calculations to quickly explore an entire chemical space and find which molecules are more likely to bind the protein?
And this is exactly what we did. We built a piperazine-based dataset of 16.3 million compounds, all of them including fragments reported in the medicinal chemistry literature, and thus having two main characteristics: synthetic accessibility and pharmacological activity. These 16.3 million compounds were thoroughly filtered through several docking steps, each one more accurate and comprehensive than the previous one and each sharply eliminating poorly fitted molecules. This left us with a total of 275 candidates, which were redocked in a different crystal structure and with a different program (consensus docking).
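The bottleneck idea can be illustrated with a minimal Python sketch. It is only a toy under stated assumptions: the stage labels, the scoring functions and the kept fractions are hypothetical placeholders, not the actual protocol or software settings used in this project. Lower scores are treated as better, as is usual for docking scores.

```python
from typing import Callable, Iterable, List, Sequence, Tuple

# Each stage is (label, scoring function, fraction of molecules to keep).
# All three are hypothetical placeholders; a real stage would call a docking program.
Stage = Tuple[str, Callable[[str], float], float]


def screening_funnel(molecules: Iterable[str], stages: Sequence[Stage]) -> List[str]:
    """Pass molecules through successive stages, keeping the best-scoring fraction each time."""
    survivors = list(molecules)
    for _label, score, keep_fraction in stages:
        # Sort so that the lowest (best) scores come first.
        ranked = sorted(survivors, key=score)
        # Keep at least one molecule so that later stages always have input.
        n_keep = max(1, int(len(ranked) * keep_fraction))
        survivors = ranked[:n_keep]
    return survivors


if __name__ == "__main__":
    # Toy library and made-up scores, for illustration only.
    library = [f"mol{i:03d}" for i in range(100)]
    cheap_score = lambda name: -(int(name[3:]) % 10)   # fast, crude stage
    costly_score = lambda name: -int(name[3:])         # slower, finer stage
    hits = screening_funnel(
        library,
        [("rigid docking", cheap_score, 0.5), ("induced-fit docking", costly_score, 0.1)],
    )
    print(hits)
```

The point of the design is that the expensive scoring function is only ever evaluated on the molecules that survived the cheaper one.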
After analyzing the ADME properties of the candidates, with descriptors such as human oral absorption and possible metabolic reactions, as well as the Induced-Fit Docking score of these molecules, ten ligands were selected as the best ones inside the analyzed chemical space. You can see ligand 255 (figure 1) as an example of the molecules that obtained the best scores throughout the docking steps.
Many colleagues working on these topics have asked, “Why didn’t you download a set of molecules from ZINC or Maybridge?” The answer has three parts: first, we wanted to test a combinatorial approach to drug design; second, we wanted to test whether including a piperazine as the core of the set of molecules would immediately grant them activity and high potency; and finally, a purpose-built database will always confer a higher degree of novelty on the possible hits when compared to commercially available compounds whose synthesis has already been developed. However, this last point needs to be addressed by an organic chemist, since none of the molecules from our database have ever been synthesized (any takers?).
Right now, we are exploring further through molecular dynamics simulations using Desmond and Amber. Other future goals for this project include screening large databases of commercial and novel compounds against gp120 and other proteins involved in the HIV lifecycle. We also remain open to collaborating with anyone interested in taking on the challenge of synthesizing our molecules, as well as performing the biochemical assays to get an idea of their activity.
More details on the MD simulations and the path of our first virtual hits will follow. Anyone interested in reading my thesis work can contact me through my LinkedIn profile at https://www.linkedin.com/in/durbisjaviercp/. An article is in preparation and will soon be submitted; stay tuned!
2017 was a complicated year for various reasons here in Mexico (and some personal health issues) but nonetheless I’m very proud of the performance of everyone at the lab whose hard work and great skills keep pushing our research forward.
Four new members joined the team and have presented their work at the national meeting for CompChem for the first time. Also, for the first time, one of my students, Gustavo Mondragón, gave a talk at this meeting with great success about his research on the Fenna Matthews Olson complex of photosynthetic bacteria.
Attending WATOC in Munich gave me the great chance to meet wonderful people from around the world, and I was even kindly and undeservingly invited to write the prologue for an introductory DFT book by Prof. Pedro Cerón from Spain. I hope to keep up with the collaborations abroad, such as the one with the Mirkin group at Northwestern and the one with my dear friend Kunsági-Máté Sándor at Pécsi Tudományegyetem (Hungary), among many others; I’m thankful for their trust in our capabilities.
Two members got their BSc degrees, Marco and Durbis. The latter also single-handedly paved the way for us to develop a new research line on the in silico drug development front; his relentless work has also been praised by the QSAR team at the Institute of Chemistry, with which he has collaborated by performing toxicity calculations for the agrochemical industry as well as by designing educational courses aimed at disseminating our work, and QSAR in general, among regulatory offices and potential clients. We’re sad to see him go next fall but at the same time we’re glad to know his scientific skills will further develop.
I cannot thank the team enough: Alejandra Barrera, Gustavo Mondragón, Durbis Castillo, Fernando Uribe, Juan Guzman, Alberto Olmedo, Eduardo Cruz, Ricardo Loaiza and Marco Garcia; may 2018 be a great year for all of you.
And to all the readers: thank you for your kind words. I’m glad this little space, which is about to become nine years old, is regarded as useful; to all of you I wish a great 2018!
One of the most popular posts in this blog has to do with calculating Fukui indexes. However, when dealing with a large number of molecules, the methodology we described can become cumbersome, since it requires manually extracting the population analysis from two or three different output files and then performing the arithmetic on them separately in a spreadsheet or similar.
Our new team member Ricardo Loaiza has written a Python script that takes the three aforementioned files and yields a .csv file with the calculated Fukui indexes; it even points out which atoms exhibit the largest values, so if you have a large molecule you don’t have to check for them manually. We also have a batch version which takes all the files in any given directory and performs the Fukui calculations for each, provided it can find file triads with the naming requirements described below.
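As a rough sketch of the arithmetic involved (this is not Ricardo’s script, and the function and variable names are made up for illustration), the condensed Fukui indexes can be obtained from the atomic charges of the three states with the usual finite-difference expressions: f+ = q(N) - q(N+1), f- = q(N-1) - q(N) and f0 as their average. The sketch assumes the charges have already been extracted from the outputs into plain lists, one value per atom and in the same atom order.

```python
from typing import Dict, List


def fukui_indexes(q_neutral: List[float], q_plus: List[float],
                  q_minus: List[float]) -> List[Dict[str, float]]:
    """Condensed Fukui indexes from atomic charges of the N, N+1 and N-1 electron states."""
    if not (len(q_neutral) == len(q_plus) == len(q_minus)):
        raise ValueError("the three charge lists must have one entry per atom")
    results = []
    for q_n, q_p, q_m in zip(q_neutral, q_plus, q_minus):
        f_plus = q_n - q_p                  # susceptibility to nucleophilic attack
        f_minus = q_m - q_n                 # susceptibility to electrophilic attack
        f_zero = 0.5 * (f_plus + f_minus)   # susceptibility to radical attack
        results.append({"f+": f_plus, "f-": f_minus, "f0": f_zero})
    return results


def most_reactive_atom(indexes: List[Dict[str, float]], key: str) -> int:
    """Return the 1-based number of the atom with the largest value of the chosen index."""
    best = max(range(len(indexes)), key=lambda i: indexes[i][key])
    return best + 1
```

For example, `most_reactive_atom(fukui_indexes(q_n, q_p, q_m), "f-")` would point to the atom most susceptible to electrophilic attack, which is the kind of flagging that saves you from scanning a long table by eye.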
Output files must be named filename.log (the N electrons reference state), filename_plus.log (the state with N+1 electrons) and filename_minus.log (the N-1 electrons state). Another restriction is that so far these scripts only work with NBO population analysis as provided by the NBO3.1 program available in the various versions of Gaussian. I imagine the listing is similar in NBO5.x and NBO6.x and so it should work if you do the population analysis with them.
The syntax for the single molecule version is:
python fukui.py filename.log filename_minus.log filename_plus.log
For the batch version it is:
(Por Lote means In Batch in Spanish.)
These scripts are available via GitHub. We hope you find them useful, and if you do, please let us know, either here in the comments section or at our GitHub site.
We’ve covered some common errors when dealing with formatted checkpoint files (*.fchk) generated from Gaussian, especially when analyzed with the associated GaussView program (see here and here for previous posts on the matter).
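If you want to check beforehand whether a directory is ready for the batch version, the naming convention above (filename.log, filename_plus.log, filename_minus.log) is easy to test for. The following is a minimal sketch written for this post, not part of the scripts on GitHub; the function name `find_triads` is an assumption, and it treats any .log file whose name ends in _plus or _minus as a partner file rather than as a reference state.

```python
from pathlib import Path
from typing import Dict, List, Tuple


def find_triads(directory: str) -> Tuple[Dict[str, Tuple[Path, Path, Path]], List[str]]:
    """Group .log files into (neutral, minus, plus) triads and list incomplete ones."""
    folder = Path(directory)
    triads: Dict[str, Tuple[Path, Path, Path]] = {}
    incomplete: List[str] = []
    for neutral in sorted(folder.glob("*.log")):
        stem = neutral.stem
        # Files ending in _plus or _minus are partners, not reference states.
        if stem.endswith("_plus") or stem.endswith("_minus"):
            continue
        plus = folder / f"{stem}_plus.log"
        minus = folder / f"{stem}_minus.log"
        if plus.is_file() and minus.is_file():
            # Same order as the single-molecule command line: N, N-1, N+1.
            triads[stem] = (neutral, minus, plus)
        else:
            incomplete.append(stem)
    return triads, incomplete
```

The second returned value lists the molecules for which one of the charged-state outputs is missing, so you know which calculations still need to be run.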
Prof. Neal Zondlo from the University of Delaware kindly shared this solution with us when the following message shows up:
CConnectionGFCHK::Parse_GFCHK() Missing or bad data: Rbond Line Number 1234
The Rbond label has to do with the connectivity displayed by the visualizer and can be overridden by close examination of the input file. In the example provided by Prof. Zondlo he found the following line in the connectivity matrix of the input file:
2 9 0.0
which indicates a zero bond order between atoms 2 and 9, possibly due to their proximity. He changed the line to simply
So editing the connectivity of your atoms in the input can help prevent the Rbond message.
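For large molecules, hunting for such entries by eye is tedious, so here is a small Python sketch that flags them. It assumes you have already copied the connectivity lines out of the input file, and that each line holds an atom number followed by (partner, bond order) pairs, as in the example above; the function name is made up for illustration. It only reports the suspicious entries and leaves the actual editing to you.

```python
from typing import List, Tuple


def zero_order_bonds(connectivity_lines: List[str]) -> List[Tuple[int, int]]:
    """Return (atom, partner) pairs whose bond order is 0.0 in a connectivity block."""
    flagged = []
    for line in connectivity_lines:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        atom = int(fields[0])
        # After the atom number, entries come in (partner, bond order) pairs.
        for partner, order in zip(fields[1::2], fields[2::2]):
            if float(order) == 0.0:
                flagged.append((atom, int(partner)))
    return flagged


if __name__ == "__main__":
    block = ["1 2 1.0 3 1.0", "2 9 0.0", "3"]
    print(zero_order_bonds(block))
```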
I hope this helps someone else.
A yearly tradition of this Comp.Chem. lab and many others throughout our nation is to attend the Mexican Meeting on Theoretical Physical Chemistry (RMFQT) to share news, progress and also a few drinks and laughs. This year the RMFQT was held in Puebla, and although I unfortunately was not able to attend, this lab was proudly represented by its current members. Gustavo Mondragón gave a talk about his progress on his photosynthesis research, which links to the previous work of María Eugenia Sandoval already presented in previous editions; kudos to Gustavo for performing remarkably, and thanks to all those who gave us their valuable feedback and criticism. Also, five posters were presented successfully. I can only thank the entire team for representing our laboratory in such an admirable way, with a special mention to the junior members: I hope this was the first of many scientific events you attend, and may you deeply enjoy each one of them.
Among the invited speakers, the RMFQT had the honor to welcome Prof. John Perdew (yes, the P in PBE); the team took the opportunity of getting a lovely picture with him.
Here is the official presentation of the newest members of our group:
Alejandra Barrera (hyperpolarizability calculations on hypothetical poly-calix[n]arenes for the search of NLO materials)
Fernando Uribe (Interaction energy calculations for non-canonical nucleotides)
Juan Guzmán (Reaction mechanisms calculations for catalyzed organic reactions)
We thank the organizing committee for giving us the opportunity to actively participate in this edition of the RMFQT; we eagerly await next year’s, as we do every year.
As we were hanging out recently, the idea came to us at the lab to create memes in order to summarize our work. We should be writing articles but hey, we needed the break, and so we shared them with each other in our last group meeting along with a good laugh. Here are some of the funniest ones.
Having doughnuts during our weekly meetings has proven a huge success in itself:
Finding transition states for organic chemical reactions can be a bit frustrating at times:
Good old photosynthesis sparked a few realizations too:
We’re dealing with docking calculations for a massive number of molecules. This has sparked a few inside jokes too:
A conversation about heterocyclic nomenclature that sparked this other post:
Try your own and share. Thanks for reading.
Last Friday we gained a new graduate when our very own Marco Antonio Diaz defended his BSc thesis on the in silico design of drug carriers based on calix[n]arenes. For his thesis he performed around 160 different calculations of the interaction energy of our host-guest inclusion complexes, using both the supramolecular (SM) method and the NBODel procedure available in NBO3.1 as provided with Gaussian 09. One of the main targets of this work was to assess how well both methods, with the proper BSSE corrections, calculate interaction energies.
We found that the NBODel method consistently generates interaction energies that are similar to those of the SM method + the BSSE correction (as opposed to SM – BSSE, which is the proper correction). Marco and I are still in the process of writing the article, so maybe it will be published in early 2018. In this case we’re using calixarenes to deliver three drugs (warfarin, furosemide and phenylbutazone) to compete with ochratoxin A (OTA) for the binding site in Human Serum Albumin (HSA).
This project is undertaken in collaboration with my good friend Dr. Sándor Kunsági-Máté at Pécsi Tudományegyetem in Hungary.
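For readers unfamiliar with the supramolecular approach, the arithmetic behind an interaction energy and its counterpoise (BSSE) correction can be sketched in a few lines of Python. This is a generic illustration, not the workflow of the thesis: the function and argument names are hypothetical, all energies are assumed to be in the same units, and the fragment energies are assumed to be computed at the geometry they have in the complex.

```python
from typing import Dict


def interaction_energies(e_complex: float, e_host: float, e_guest: float,
                         e_host_ghost: float, e_guest_ghost: float) -> Dict[str, float]:
    """
    Supramolecular interaction energy with and without counterpoise correction.

    e_complex      energy of the host-guest complex
    e_host/e_guest energy of each fragment in its own basis set
    e_*_ghost      energy of each fragment in the full basis of the complex
                   (the partner's atoms present as ghost atoms)
    """
    uncorrected = e_complex - e_host - e_guest
    # Each fragment is artificially stabilized by the partner's basis functions,
    # so with this definition the BSSE is a non-negative quantity.
    bsse = (e_host - e_host_ghost) + (e_guest - e_guest_ghost)
    # Algebraically equal to e_complex - e_host_ghost - e_guest_ghost.
    corrected = uncorrected + bsse
    return {"uncorrected": uncorrected, "bsse": bsse, "corrected": corrected}
```

Note the sign convention: here attractive interactions are negative numbers and the BSSE is positive, so the correction makes the result less negative. If interaction energies are quoted as positive magnitudes instead, the very same correction reads as a subtraction, which is the “SM – BSSE” form.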
Congratulations to Marco from all of us here at the lab!
Recently, the journal ACS Central Science asked me to write a viewpoint for their First Reactions section about a research article by Prof. Alán Aspuru-Guzik from Harvard University on the evolution of the Fenna-Matthews-Olson (FMO) complex. It was a very rewarding experience to write this piece since we are very close to having our own work on FMO published as well (stay tuned!). The FMO complex remains a great research opportunity for understanding photosynthesis and thus the origin of life itself.
In said article, Aspuru-Guzik’s team climbed their way up a computationally generated phylogenetic tree for the FMO from different green sulfur bacteria by introducing small successive mutations on the protein, one at a time, while also calculating their photochemical properties. The idea is pretty simple and brilliant: perform a series of “educated guesses” on the structure of FMO’s ancestors (there are no fossil records of FMO, so these “educated guesses” are the next best thing) and find at what point the photochemistry goes awry. In the end the question is which led the way: did the photochemistry lead the evolution of FMO, or did the evolution of FMO lead to improved photochemistry?
Since the article and the viewpoint are both published as open access by the ACS, I won’t take too much space here re-writing the whole thing and will instead exhort you to read them both.
Thanks for doing so!
The compound shown below in figure 1 is listed by Aldrich as 4,5,6,7-tetrahydroindole, but is it really?
To a hardcore organic chemist it is clear that this is not an indole but a pyrrole because the lack of aromaticity in the fused ring gives this molecule the same reactivity as 2,3-diethyl pyrrole. If you search the ChemSpider database for ‘tetrahydroindole’ the search returns the following compound with the identical chemical formula C8H11N but with a different hydrogenation pattern: 2,3,3a,4-Tetrahydro-1H-indole
The real indole, upon an electrophilic attack, behaves as a free enamine yielding the product shown in figure 3 in which the substitution occurs in position 3. This compound cannot undergo an Aromatic Electrophilic Substitution since that would imply the formation of a sigma complex which would disrupt the aromaticity.
In contrast, the corresponding pyrrole is substituted in position 2.
These differences in reactivity towards electrophiles are easily rationalized when we plot their HOMO orbitals (calculated at the M062X/def2TZVP level of theory):
If we calculate the Fukui indexes at the same level of theory we get the highest value for susceptibility towards an electrophilic attack as follows: 0.20 for C(3) in indole and 0.25 for C(2) in pyrrole, consistent with the previous reaction schemes.
So, why is it listed as an indole? Why would anyone search for it under that name? Nobody thinks about cyclohexane as hexahydrobenzene. According to my good friend and colleague Dr. Moisés Romero, most names for heterocycles are kept even after such dramatic chemical changes for historical and mnemonic reasons, even when the reactivity is entirely different. This is only a nomenclature issue that we have inherited from the times of Hantzsch more than a century ago. We’ve become used to keeping the trivial (or should I say arbitrary) names and using them further as the basis for derived names, but this could pose an epistemological problem if students cannot recognize which heterocycle presents which reactivity.
So, in a nutshell:
Chemistry makes the chemical and not the structure.
Something we all know but which is sometimes overlooked for the sake of simplicity.