Quick Post on preparing Gaussian input files from PDB files.
If you’re modeling biological systems, chances are that, more often than not, you start by retrieving a PDB file. The Protein Data Bank is a repository for all things biochemistry, from oligopeptides to full DNA sequences, with over 140,000 available files encoding the corresponding structures obtained by various experimental means, ranging from X-ray diffraction and NMR to, more recently, cryo-electron microscopy (cryo-EM).
The PDB file encodes the Cartesian coordinates for each atom present in the structure, along with the atom and residue names that molecular dynamics codes such as AMBER or GROMACS use to assign force field parameters; this makes the PDB a natural input file for MD.
There are, however, some considerations to keep in mind when you need to use these coordinates in electronic structure calculations. Personally, I give the file a pass through OpenBabel to add (or possibly just re-add) all hydrogen atoms with the following instruction:
$>obabel -ipdb filename.pdb -ogjf -Ofilename.gjf -h
Alternatively, you can select a pH value, say 7.5 with:
$>obabel -ipdb filename.pdb -ogjf -Ofilename.gjf -h -p7.5
You may also use the GUI if by any chance you’re working in Windows:
This sends all H atoms to the end of the atom list. Usually, for us, the next step is to optimize their positions with a partial optimization at a low level of theory, for which you need to use the ReadOptimize option (also written ReadOpt or RdOpt) in the route section and then add the atom list at the end of the input file.
Finally, visual inspection of your input structure is always helpful to find any meaningful errors; remember that PDB files come from experimental measurements, which are not free of problems.
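To write that atom list you first need to know which positions the hydrogens occupy. The following is only a minimal sketch, assuming the geometry is available as plain `element x y z` lines; the function name `hydrogen_indices` and the water coordinates are illustrative and are not part of OpenBabel or Gaussian.

```python
def hydrogen_indices(geometry_lines):
    """Return the 1-based positions of hydrogen atoms in 'element x y z' lines."""
    indices = []
    atom_number = 0
    for line in geometry_lines:
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        atom_number += 1
        if fields[0].upper() == "H":
            indices.append(atom_number)
    return indices


# Illustrative geometry (a water molecule), not taken from any PDB file.
example = [
    "O  0.000  0.000  0.117",
    "H  0.000  0.757 -0.469",
    "H  0.000 -0.757 -0.469",
]
print(hydrogen_indices(example))  # [2, 3]
```

How those indices are then written in the ReadOptimize section should be checked against the Gaussian manual for your version.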
As usual thanks for reading, commenting, and sharing.
We celebrate the successful thesis defense of Gustavo “Gus” Mondragón, who has now completed his Master’s degree and is on to getting a PhD in our group. Gustavo has worked on the search for multiexcitonic states and their involvement in the excitonic transference between photosynthetic pigments, specifically between bacteriochlorophyll-d molecules (BChl-d) from the bchQRU chlorosome, whose whole structure is shown in the gallery below. To this end, Gustavo has studied and implemented the Restricted Active Space method with double spin flip (RAS-2SF) using Q-Chem 5.0, a method that has required the use and understanding of states with high multiplicities. Additionally, Gustavo has investigated the influence of the environment within the chlorosome by performing ONIOM calculations for the spectroscopic properties of a BChl-d dimer, finding, albeit qualitatively, a bathochromic effect, probably an expected result but nonetheless an impressive feat for the level of theory selected.
There’s still a lot of work to do in this line of research, and although we’re eager to publish our results on this excitonic transference mechanism, we want to be completely sure that we’re taking every possibility into consideration so we don’t incur any inconsistencies.
Gustavo cultivates many research interests, from excited states of these pigments to biochemical processes that require the use of various tools; I’m sure his continued stay in our lab will bring lots of interesting results. Congratulations, Gus! Thank you for your hard work.
Calculating the pKa value for a Brønsted acid is very hard, like really hard. A full thermodynamic cycle (fig. 1) needs to be calculated along with a high-accuracy solvation free energy for each of the species under consideration, not to mention the use of expensive methods, which will be reviewed here in another post in two weeks’ time.
Finding descriptors that help us circumvent the need for such sophisticated calculations can help a great deal in estimating the pKa value of any given acid. We’ve been interested in the reactivity of σ-hole bearing groups in the past and, just like halogen, tetrel, pnicogen and chalcogen bonds, hydrogen bonds are highly directional and their strength depends on the polarization of the O-H bond. Therefore, we suggested the use of the maximum surface electrostatic potential (VS,max) on the acidic hydrogen atom of carboxylic acids as a descriptor for the strength of their interaction with water, the first step in the deprotonation process.
We selected six basis sets, five density functionals and the MP2 method, for a total of thirty-six levels of theory, to optimize and calculate VS,max on thirty carboxylic acids, for a grand total of 1,080 wavefunctions, which were later passed on to Multiwfn (all calculations were carried out with PCM = water). The calculated values correlated very well with the experimental pKa values across the levels of theory (R² > 0.9), except for B3LYP. Still, the best correlations were obtained with LC-wPBE/cc-pVDZ and wB97XD/cc-pVDZ. From this latter level of theory the linear correlation yielded the following equation:
pKa = -0.2185(VS,max) + 16.1879
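As a quick illustration, the linear fit above can be wrapped in a small function. This is only a sketch: the slope and intercept are the ones quoted in the equation, the name `predict_pka` is made up here, and the input value of 55.0 is a hypothetical VS,max, not a result from the paper. The post does not state the units of VS,max, so the input must be in whatever units the fit was made with.

```python
SLOPE = -0.2185      # coefficient of VS,max in the fitted equation
INTERCEPT = 16.1879  # intercept of the fitted equation


def predict_pka(vs_max):
    """Estimate pKa from VS,max using the linear fit quoted above."""
    return SLOPE * vs_max + INTERCEPT


# Hypothetical VS,max value, for illustration only.
print(round(predict_pka(55.0), 2))  # 4.17
```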
Differences between predicted and experimental pKa values turned out to be less than 0.5 units, which is remarkable for such a straightforward method; bear in mind that full thermodynamic cycles calculated with errors above chemical accuracy (1.0 kcal/mol) yield pKa differences above 1.0 units.
We then tested this equation on 10 different carboxylic acids and the prediction had a correlation of 98% (fig. 2).
I think this method can really catch on as a quick way to predict the pKa values of any carboxylic acid imaginable. We’re now working on extending the model to other groups (i.e. Brønsted bases) and putting together a black-box workflow so as to make it even more accessible and straightforward to use.
We’ve recently published this work in the journal Molecules, an open access publication. Thanks to Prof. Steve Scheiner for inviting us to participate in the special issue devoted to tetrel bonding. Thanks to Guillermo Caballero for the inception of this project, to Dr. Jacinto Sandoval for taking the time from his research in photosynthesis to work on this pet project of ours, and of course to the rest of the students (Gustavo Mondragón, Marco Diaz, Raúl Torres), whose hard work made this paper possible.
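To put free energy errors and pKa units on the same scale, the standard relation ΔpKa = ΔΔG / (RT ln 10) can be used. The sketch below assumes a temperature of 298.15 K, which the post does not state, and the function name `pka_shift` is made up for this example.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol K)


def pka_shift(delta_g_kcal, temperature=298.15):
    """Convert a free energy difference (kcal/mol) into pKa units."""
    return delta_g_kcal / (R_KCAL * temperature * math.log(10))


# An error of 1.0 kcal/mol at the assumed temperature of 298.15 K.
print(round(pka_shift(1.0), 2))  # 0.73
```

At that temperature one pKa unit corresponds to roughly 1.36 kcal/mol, so even small errors in the free energies of a thermodynamic cycle translate into sizeable pKa deviations.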
Just as I was thinking about the state of the Mexican scientific environment on the global scale, Prof. Dr. Gabriel Merino from CINVESTAV comes and gets this prize awarded by the International Center for Theoretical Physics (ICTP) and the Quantum ESPRESSO Foundation, showing us all that great science is possible even under pressing circumstances.
This prize is awarded biennially to a young scientist for outstanding contributions in the field of quantum-mechanical materials and molecular modeling, performed in a developing country or emerging economy, and in the case of Dr. Merino it is awarded not only for his contributions to theory and applications but also for his contributions to the prediction of novel systems that violate standard chemical paradigms, broadening the scope of concepts like aromaticity, coordination and the chemical bond. The list of his contributions is very long despite his young age, and there is barely any topic in chemistry or materials science that escapes his interest.
Gabriel is also one of the leading organizers of the Mexican Theoretical Physical Chemistry Meeting and an unstoppable mentor, with many of his former students now leading research teams of their own. He is pretty much a force of nature.
Congratulations to Dr. Gabriel Merino, his team, CINVESTAV and thanks for being such an inspiration and a good friend at the same time.
The video below is a sad recount of the scientific conditions in Mexico that have driven an enormous amount of brain power to other countries. Doing science is always a hard endeavour, but in developing countries it is also filled with so many hurdles that it makes you wonder if it is all worth the constant frustration.
That is why I think it is even more important for the Latin American community to make our science visible, and special issues like this one from the International Journal of Quantum Chemistry go a long way in doing so. This is not the first time IJQC devotes a special issue to the Comp.Chem. done south of the proverbial border; a full issue devoted to the Mexican Physical Chemistry Meetings (RMFQT) was also published six years ago.
I believe these special issues in mainstream journals are great ways of promoting our work in a collected way that stresses our particular lines of research instead of having them spread across a number of journals. Also, and I may be ostracized for this, I think coming up with a new journal for a specific geographical community represents a lot of effort that takes an enormous amount of time to take off and thus gain visibility.
For these reasons I’ve been cooking up some ideas for the next RMFQT website. I don’t pretend to say that my colleagues need any shoutouts on my part (I could only be so lucky to produce such fine pieces of research myself), but it wouldn’t hurt to have a more established online presence as a community.
¡Viva la ciencia Latinoamericana!
The RMFQT meeting is a long-standing tradition within the Mexican Comp.Chem. community; a tradition that is now transcending our borders as more and more foreign students and researchers take part in this party, for it is a festive occasion indeed. This was the first time the RMFQT was held at a private institute, The Monterrey Institute of Technology.
As in previous years, our lab contributed four posters and one talk by yours truly. The posters were presented by Raúl Torres, Raúl Márquez, Gustavo Mondragón and Dr. Jacinto Sandoval, whose pictures you can spot below in the gallery.
My talk was on the collaborative nature of Comp.Chem. and our particular interactions with the organic synthesis lab of Dr. Moisés Romero. The published papers discussed in the talk can be found in Tetrahedron (post) and PCCP (post), and some unpublished results can be read as a preprint in preprints.org.
I had the pleasure to meet and interact with old friends and make new ones like Dr. Julio Palma from Penn State, whose work on molecular rectifiers is very interesting. Also, I got to interact with many wonderful students who apparently are aware of the existence of this blog. (A big shoutout to M. Joaquina Beltrán and Plinio Cantero from Chile, whose work on DNA mismatch sensors is quite interesting; I look forward to further interacting with their research team.)
A particular reason for this meeting to be special for me is the fact that I have now been announced as part of the local organizing committee for the next edition in 2019 in Toluca. I was also asked to develop a centralized website and coordinate the social media communication related to this and other events, starting with the creation of the official Twitter account for our network and the meeting. I’m working on a few ideas, but if you have any suggestions please send them in the comments section.
See you next year in Toluca!
Two new papers on the development of chemosensors for different applications were recently published and we had the opportunity to participate in both with the calculation of electronic interactions.
A chemosensor must have a measurable response, and calculating either that response from first principles based on the electronic structure, or another physicochemical property related to the response, is a useful strategy in its molecular design. Additionally, electronic structure calculations help us unveil the molecular mechanisms underlying the response and efficiency of a sensor, as well as providing a starting point for its continuous improvement.
In the first paper, CdTe Quantum Dots (QDs) are used to visualize cell-membrane damage in real time through a Gd Schiff base sensitizer (GdQDs). This probe interacts preferentially with a specific sequence motif of the NHE-RF2 scaffold protein, which is exposed during cell damage. This interaction yields intensely fluorescent droplets which can be visualized in real time with standard instrumentation. Calculations at the M06-2X/LANL2DZ level of theory, plus an external double zeta quality basis set on Gd, were employed to characterize the electronic structure of the Gd³⁺ complex, the Quantum Dot and their mutual interactions. The first challenge was to come up with the right multiplicity for Gd³⁺ (an f⁷ ion), for which we had no experimental evidence of its magnetic properties. From searching the literature and talking to my good friend, inorganic chemist Dr. Vojtech Jancik, it was more or less clear the multiplicity had to be an octet (multiplicity 8, all seven f electrons unpaired).
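The multiplicity follows from the usual 2S + 1 rule, with each unpaired electron contributing a spin of 1/2. A minimal sketch of that arithmetic (the function name is made up for this example):

```python
def spin_multiplicity(n_unpaired):
    """Return 2S + 1 for a given number of unpaired electrons."""
    total_spin = n_unpaired / 2  # each unpaired electron contributes S = 1/2
    return int(2 * total_spin + 1)


# Seven unpaired f electrons, as in Gd(3+).
print(spin_multiplicity(7))  # 8
```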
As can be seen in figure 1a, the Gd-N interactions are mostly electrostatic in nature, a fact that is also reflected in the Wiberg bond indices, calculated as 0.16, 0.17 and 0.21 (a single bond would yield a WBI value closer to 1.0).
PM6 was employed to optimize the GdQD as a whole (figure 1f) and the UFF molecular mechanics force field to characterize its binding to a peptide sequence (figure 2), from which we observed, somewhat unsurprisingly, that Gd³⁺ interacts preferentially with the electron-rich residues.
This research was published in ACS Applied Materials and Interfaces. Thanks to Prof. Vojtech Adam from the Mendel University in Brno, Czech Republic for inviting me to collaborate with their interdisciplinary team.
The second sensor I want to write about today comes from a collaboration closer to home, with Dr. Alejandro Dorazco, who developed a fluorescent porphyrin system that becomes chiefly quenched in the presence of iodide but not with any other halide. This allows for a fast detection of iodide anions, related to some gland diseases, in aqueous samples such as urine. This probe was also granted a patent which technically lists yours truly as an inventor, cool!
The calculated interaction energy between I⁻ and the porphyrin was huge, which supports the idea of an ionic interaction through which charge transfer quenches the fluorescence of the probe. Figure 3 above shows how the HOMO largely resides on the iodide whereas the LUMO is located on the pi electron system of the porphyrin.
This research was published in Sensors and Actuators B – Chemical.
The compound shown below in figure 1 is listed by Aldrich as 4,5,6,7-tetrahydroindole, but is it really?
To a hardcore organic chemist it is clear that this is not an indole but a pyrrole, because the lack of aromaticity in the fused ring gives this molecule the same reactivity as 2,3-diethylpyrrole. If you search the ChemSpider database for ‘tetrahydroindole’, the search returns the following compound with the same chemical formula, C8H11N, but with a different hydrogenation pattern: 2,3,3a,4-Tetrahydro-1H-indole
The real indole, upon an electrophilic attack, behaves as a free enamine, yielding the product shown in figure 3, in which the substitution occurs in position 3. This compound cannot undergo an Aromatic Electrophilic Substitution since that would imply the formation of a sigma complex which would disrupt the aromaticity.
On the contrary, the corresponding pyrrole is substituted in position 2.
These differences in reactivity towards electrophiles are easily rationalized when we plot their HOMOs (calculated at the M06-2X/def2-TZVP level of theory):
If we calculate the Fukui indices at the same level of theory, we get the highest values for susceptibility towards an electrophilic attack as follows: 0.20 for C(3) in indole and 0.25 for C(2) in pyrrole, consistent with the previous reaction schemes.
So, why is it listed as an indole? Why would anyone search for it under that name? Nobody thinks about cyclohexane as 1,3,5-trihydrobenzene. According to my good friend and colleague Dr. Moisés Romero, most names for heterocycles are kept even after such dramatic chemical changes for historical and mnemonic reasons, even when the reactivity is entirely different. This is only a nomenclature issue that we have inherited from the times of Hantzsch, more than a century ago. We’ve become used to keeping the trivial (or should I say arbitrary) names and further using them in derived names, but this could pose an epistemological problem if students cannot recognize which heterocycle presents which reactivity.
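The post does not say which scheme was used to obtain these indices. One common finite-difference approach condenses the Fukui function for electrophilic attack onto atoms as f⁻ = q(N−1) − q(N), the change in each atomic charge upon removing one electron at fixed geometry. The sketch below assumes that scheme; the function names and the charges are placeholders made up for illustration, not values from these calculations.

```python
def fukui_minus(charges_neutral, charges_cation):
    """Condensed f- per atom: q(N-1) - q(N), from atomic partial charges."""
    return {
        atom: charges_cation[atom] - charges_neutral[atom]
        for atom in charges_neutral
    }


def most_reactive_site(fukui_values):
    """Return the atom label with the largest condensed Fukui index."""
    return max(fukui_values, key=fukui_values.get)


# Placeholder charges for two ring carbons (hypothetical numbers).
neutral = {"C2": -0.10, "C3": -0.25}
cation = {"C2": 0.02, "C3": 0.00}

print(most_reactive_site(fukui_minus(neutral, cation)))  # C3
```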
So, in a nutshell:
Chemistry makes the chemical and not the structure.
A thing we all know but that is sometimes overlooked for the sake of simplicity.