Estimation of pKa Values through Local Electrostatic Potential Calculations

Calculating the pKa value for a Brønsted acid is very hard, like really hard. A full thermodynamic cycle (fig. 1) needs to be calculated, along with a high-accuracy solvation free energy for each of the species under consideration, not to mention the use of expensive methods, which will be reviewed here in another post in two weeks’ time.

Fig 1. Thermodynamic cycle for the pKa calculation of any given Brønsted acid, HA
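For reference, the relations behind such a cycle can be written down in a couple of lines. This is a sketch of the common textbook "direct" cycle (gas-phase deprotonation plus solvation of each species); it may not match the figure term by term, and standard-state corrections are left out for brevity.

```latex
\begin{align}
\Delta G^{\circ}_{\mathrm{aq}} &= \Delta G^{\circ}_{\mathrm{gas}}
  + \Delta G_{\mathrm{solv}}(\mathrm{A^-})
  + \Delta G_{\mathrm{solv}}(\mathrm{H^+})
  - \Delta G_{\mathrm{solv}}(\mathrm{HA}) \\
\mathrm{p}K_a &= \frac{\Delta G^{\circ}_{\mathrm{aq}}}{RT \ln 10}
\end{align}
```

At 298.15 K, RT ln 10 is about 1.364 kcal/mol, so an error of 1 kcal/mol in the aqueous free energy shifts the computed pKa by roughly 0.73 units. That is why every term in the cycle has to be so accurate.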

Finding descriptors that help us circumvent the need for such sophisticated calculations can help a great deal in estimating the pKa value of any given acid. We’ve been interested in the reactivity of σ-hole bearing groups in the past and, just like halogen, tetrel, pnicogen and chalcogen bonds, hydrogen bonds are highly directional and their strength depends on the polarization of the O-H bond. Therefore, we suggested the use of the maximum surface electrostatic potential (VS,max) on the acidic hydrogen atom of carboxylic acids as a descriptor for the strength of their interaction with water, the first step in the deprotonation process.

We selected six basis sets, five density functionals and the MP2 method, for a total of thirty-six levels of theory (six methods times six basis sets), to optimize and calculate VS,max on thirty carboxylic acids for a grand total of 1,080 wavefunctions, which were later passed on to Multiwfn (all calculations were carried out with PCM = water). The calculated VS,max values correlated very well with the experimental pKa values across the levels of theory (R² > 0.9), except for B3LYP. Still, the best correlations were obtained with LC-wPBE/cc-pVDZ and wB97XD/cc-pVDZ. From this latter level of theory the linear correlation yielded the following equation:

pKa = -0.2185(VS,max) + 16.1879

Differences between predicted and experimental pKa values turned out to be less than 0.5 units, which is remarkable for such a straightforward method; bear in mind that full thermodynamic cycles whose errors exceed chemical accuracy (1.0 kcal/mol) yield pKa differences above 1.0 units.
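A minimal sketch of how the fitted equation can be applied once VS,max is in hand. The slope and intercept are the ones quoted above; the function name and the sample VS,max values are hypothetical, and the units of VS,max are assumed to be the same as those used in the fit (taken here to be kcal/mol, which the post does not state).

```python
# Coefficients of the linear fit quoted above (wB97XD/cc-pVDZ, PCM = water).
SLOPE = -0.2185
INTERCEPT = 16.1879


def pka_from_vs_max(vs_max: float) -> float:
    """Estimate a carboxylic-acid pKa from VS,max on the acidic hydrogen.

    Assumption: vs_max is expressed in the same units used for the fit
    (taken here to be kcal/mol; the post does not state them).
    """
    return SLOPE * vs_max + INTERCEPT


if __name__ == "__main__":
    # Hypothetical VS,max values, for illustration only (not from the paper).
    for vs_max in (50.0, 55.0, 60.0):
        print(f"VS,max = {vs_max:5.1f} -> pKa ~ {pka_from_vs_max(vs_max):.2f}")
```

Because the slope is negative, a more positive potential on the acidic hydrogen (a more polarized O-H bond) maps to a lower pKa, i.e. a stronger acid.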

We then put this equation to the test with 10 different carboxylic acids and the prediction had a correlation of 98% (fig. 2).

Fig 2. Calculated vs. experimental pKa values for a test set of 10 carboxylic acids, using the equation above

I think this method can really catch on as a quick way to predict the pKa values of any carboxylic acid imaginable. We’re now working on extending the model to other groups (e.g. Brønsted bases) and putting together a black-box workflow so as to make it even more accessible and straightforward to use.

We’ve recently published this work in the journal Molecules, an open access publication. Thanks to Prof. Steve Scheiner for inviting us to participate in the special issue devoted to tetrel bonding. Thanks to Guillermo Caballero for the inception of this project, to Dr. Jacinto Sandoval for taking the time from his research in photosynthesis to work on this pet project of ours, and of course to the rest of the students (Gustavo Mondragón, Marco Diaz, Raúl Torres), whose hard work made this paper possible.


Calculation of Intermolecular Interactions for Sensors with Biological Applications

Two new papers on the development of chemosensors for different applications were recently published and we had the opportunity to participate in both with the calculation of electronic interactions.

A chemosensor needs a measurable response, and calculating either that response from first principles based on the electronic structure, or another physicochemical property related to the response, is a useful strategy in its molecular design. Additionally, electronic structure calculations help us unveil the molecular mechanisms underlying the sensor’s response and efficiency, as well as providing a starting point for its continuous improvement.

In the first paper, CdTe Quantum Dots (QDs) are used to visualize cell-membrane damage in real time through a Gd Schiff base sensitizer (GdQDs). This probe interacts preferentially with a specific sequence motif of the NHE-RF2 scaffold protein, which is exposed during cell damage. This interaction yields intensely fluorescent droplets which can be visualized in real time with standard instrumentation. Calculations at the M06-2X/LANL2DZ level of theory, plus an external double zeta quality basis set on Gd, were employed to characterize the electronic structure of the Gd³⁺ complex, the Quantum Dot and their mutual interactions. The first challenge was to come up with the right multiplicity for Gd³⁺ (an f⁷ ion), for which we had no experimental evidence of its magnetic properties. From searching the literature and talking to my good friend, inorganic chemist Dr. Vojtech Jancik, it was more or less clear the multiplicity had to be an octet (all seven f electrons unpaired, 2S+1 = 8).

As can be seen in figure 1a the Gd-N interactions are mostly electrostatic in nature, a fact that is also reflected in the Wiberg bond indexes calculated as 0.16, 0.17 and 0.21 (a single bond would yield a WBI value closer to 1.0).

PM6 was used to optimize the GdQD as a whole (figure 1f) and the UFF molecular mechanics force field to characterize its binding to a peptide sequence (figure 2), from which we observed, somewhat unsurprisingly, that Gd³⁺ interacts preferentially with the electron rich residues.

This research was published in ACS Applied Materials and Interfaces. Thanks to Prof. Vojtech Adam from the Mendel University in Brno, Czech Republic for inviting me to collaborate with their interdisciplinary team.

The second sensor I want to write about today comes from a closer-to-home collaboration with Dr. Alejandro Dorazco, who developed a fluorescent porphyrin system that becomes strongly quenched in the presence of iodide but not with any other halide. This allows for a fast detection of iodide anions, related to some gland diseases, in aqueous samples such as urine. This probe was also granted a patent which technically lists yours truly as an inventor, cool!

The calculated interaction energy between I⁻ and the porphyrin was huge, which supports the idea of an ionic interaction through which charge transfer quenches the fluorescence of the probe. Figure 3 above shows how the HOMO largely resides on the iodide whereas the LUMO is located on the pi electron system of the porphyrin.

This research was published in Sensors and Actuators B – Chemical.

Mg²⁺ Needs a 5th Coordination in Chlorophylls – New paper in IJQC

Photosynthesis, the basis of life on Earth, is based on the capacity of a living organism to capture solar energy and transform it into chemical energy through the synthesis of macromolecules like carbohydrates. Although most of the molecular processes present in most photosynthetic organisms (plants, algae and even some bacteria) are well described, the mechanism of energy transfer from the light harvesting molecules to the reaction centers is not entirely known. Therefore, in our lab we have set out to study the possibility of some excitonic transfer mechanisms between pigments (chlorophyll and its corresponding derivatives). It is widely known that the photophysical properties of chlorophylls and their derivatives stem from the electronic structure of the porphyrin, modulated by the presence of Mg, but it’s not this ion that undergoes the main electronic transitions; we also know that Mg almost never lies in the same plane as the porphyrin macrocycle because it bears a fifth coordination, either to another pigment or to a protein that keeps it in place (Figure 1).


Figure 1. The UV-Vis spectrum of BChl-a changes with the coordination state

During our calculations of the electronic structure of the pigments (Bacteriochlorophyll-a, BChl-a) present in the Fenna-Matthews-Olson complex of sulfur dependent bacteria we found that the Mg²⁺ ion at the center of one of these pigments could in fact create an intramolecular interaction with the C=C double bond in the phytyl fragment, which lay beneath the porphyrin ring.


Figure 2. Mg points ‘downwards’ upon optimization, hinting at the interaction under study


This would be the first time that a dihapto coordination is suggested to occur in any chlorophyll, and that in itself is interesting enough, but we took it further and calculated the photophysical implications of having this fifth intramolecular dihapto coordination as opposed to a coordination to a protein, or none for that matter. Figure 3 shows the calculated UV-Vis spectra (Time-Dependent DFT with the CAM-B3LYP functional and the cc-pVDZ, 6-31G(d,p) and 6-31+G(d,p) basis sets). A red shift is observed for the planar configuration with respect to the five-coordinated species (regardless of whether the fifth coordination is to histidine or to the C=C double bond in the phytyl moiety).



Figure 3. CAM-B3LYP UV-Vis spectra. Basis sets, left to right: cc-pVDZ, 6-31G(d,p) and 6-31+G(d,p)

Before calculating the UV-Vis spectra, we had to unambiguously establish the presence of this observed interaction. To that end we calculated, as a first approximation, the C-Mg Wiberg bond indexes at the CAM-B3LYP/cc-pVDZ level of theory. The values were C(1)-Mg 0.022 and C(2)-Mg 0.032, which are indicative of weak interactions; to take it even further we performed a non-covalent interactions analysis (NCI) under the Atoms in Molecules formalism, computed on the M06-2X density, which yielded the expected critical points for the η²Mg-(C=C) interaction. As a control we performed the same calculation for magnesocene, just to unambiguously assign this kind of interaction (Fig 4, bottom).


Figure 4 (a), (b) NCI analysis for Mg-(C=C) interaction compared to Magnesocene (c)

This research is now available in the International Journal of Quantum Chemistry. A big shoutout and kudos to Gustavo “Gus” Mondragón for his work on this project during his master’s; many more things will come to him and our group in this and other research ventures.

I’m done with Computational Studies

I’ve lately reviewed a ton of papers whose titles begin with some version of “Computational studies of…“, “Theoretical studies of…” or even more subtly just subtitled “A theoretical/computational study” and even though I gotta confess this is probably something I’ve done once or twice myself, it got me thinking about the place and role of computational chemistry within chemistry itself.

As opposed to physicists, chemists are pressed to defend a utilitarian view of their work and possibly because of that view some computational chemists sometimes lose sight of their real contribution to a study, which is far from just performing a routine electronic structure calculation. I personally don’t like it when an experimental colleague comes asking for ‘some calculations’ without a clear question to be answered by them; Computational Chemistry is not an auxiliary science but a branch of physical chemistry in its own right, one that provides all the insight experiments -chemical or physical- sometimes cannot.

I’m no authority on authoring research papers but I encourage my students to think about the titles of their manuscripts in terms of what the manuscript most heavily relies on; whether it’s the phenomenon, the methodology or the object of the study, that is what should be stressed in the title. Papers titled “Computational studies of…” are usually followed by ‘the object of study’, possibly overlooking the phenomenon observed throughout such studies. That is a disservice to the science contained within the manuscript, just like experimental papers gain little from titles such as “Synthesis and Characterization of…“. It all comes down to finding a suitable narrative for our work, something that I constantly remind my students of. It’s not about losing rigor or finding a way to oversell our results but instead about actually driving a point home: what did you do, why, and how? Anna Clemens, a professional scientific writer, has a fantastic post on her blog about it and does it far better than I ever could. Also, when ranting on Twitter, the book “Houston, We Have a Narrative” was recommended to me; I will surely put it on my to-read list.

While I’m on the topic of narratives in science, I’m sure Dr. Stuart Cantrill from Nature Chemistry wouldn’t mind if I share with you his deconstruction of an abstract. Let’s play a game and give this abstract a title in the comments section based on the information vested in it.

The Evolution of Photosynthesis

Recently, the journal ACS Central Science asked me to write a viewpoint for their First Reactions section about a research article by Prof. Alán Aspuru-Guzik from Harvard University on the evolution of the Fenna-Matthews-Olson (FMO) complex. It was a very rewarding experience to write this piece since we are very close to having our own work on FMO published as well (stay tuned!). The FMO complex remains a great research opportunity for understanding photosynthesis and thus the origin of life itself.

In said article, Aspuru-Guzik’s team climbed their way up a computationally generated phylogenetic tree for the FMO from different green sulfur bacteria by introducing small successive mutations on the protein, one at a time, while also calculating their photochemical properties. The idea is pretty simple and brilliant: perform a series of “educated guesses” on the structure of FMO’s ancestors (there are no fossil records of FMO so these ‘educated guesses’ are the next best thing) and find at what point the photochemistry goes awry. In the end the question is which led the way: did the photochemistry drive the evolution of FMO, or did the evolution of FMO lead to improved photochemistry?

Since the article and the viewpoint are both published as open access by the ACS, I won’t take too much space here re-writing the whole thing and will instead exhort you to read them both.

Thanks for doing so!

Collaborations in Inorganic Chemistry

I began my path in computational chemistry while I was still an undergraduate student, working on my thesis under Professor Cea at UNAM, synthesizing main group complexes with sulfur containing ligands. Quite a mouthful, I know. Therefore my first calculations dealt with obtaining bond indexes for bidentate ligands bonded to tin, antimony and even arsenic; yes! I worked with arsenic once! Happily, I keep a tight bond (pun intended) with inorganic chemists and the two recent papers published with the group of Prof. Mónica Moya are proof of that.

In the first paper, cyclic metallaborates were formed with Ga and Al but when a cycle of a given size formed with one it didn’t with the other (fig 1), so I calculated the relative energies of both analogues while compensating for the change in the number of electrons with the following equation:

Fig 1


Under the same conditions 6-membered rings were formed with Ga but not with Al, and 8-membered rings were obtained for Al but not for Ga. Differences in their covalent radii alone couldn’t account for this fact.

ΔE = E(MnBxOy) – n·E(M) + n·E(M’) – E(M’nBxOy)    (Eq 1)

A seamless substitution would imply ΔE = 0 when changing from M to M’


Hypothetical compounds optimized at the B3LYP/6-31G(d,p) level of theory

The calculated ΔE were: ΔE(3/3′) = -81.38 kcal/mol; ΔE(4/4′) = 40.61 kcal/mol; ΔE(5/5′) = 70.98 kcal/mol

In all, the increased stability and higher covalent character of the Ga-O-Ga unit compared to that of the Al analogue favors the formation of different sized rings.

Additionally, a free energy change analysis was performed to assess the relative stability between compounds. Changes in free energy can be obtained easily from the thermochemistry section in the FREQ calculation from Gaussian.
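As a rough sketch of that last step: the thermochemistry section of a Gaussian FREQ output includes a line starting with "Sum of electronic and thermal Free Energies=", with the value in Hartree. The helper names below are hypothetical, and the sample text in the test is made up; real log files should be read from disk and passed in as strings.

```python
import re

# Line printed in the thermochemistry section of a Gaussian FREQ job.
FREE_ENERGY_RE = re.compile(
    r"Sum of electronic and thermal Free Energies=\s*(-?\d+\.\d+)"
)
HARTREE_TO_KCAL_MOL = 627.5095  # standard conversion factor


def read_free_energy(log_text: str) -> float:
    """Return the last reported free energy (Hartree) found in the log text."""
    matches = FREE_ENERGY_RE.findall(log_text)
    if not matches:
        raise ValueError("no free energy line found; was this a FREQ job?")
    return float(matches[-1])


def delta_g_kcal(log_a: str, log_b: str) -> float:
    """Free energy of B relative to A, in kcal/mol."""
    return (read_free_energy(log_b) - read_free_energy(log_a)) * HARTREE_TO_KCAL_MOL
```

Taking the last match is a deliberate choice: if a log contains more than one frequency job, the final one is usually the one of interest.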

This paper is published in Inorganic Chemistry under the following citation: Erandi Bernabé-Pablo, Vojtech Jancik, Diego Martínez-Otero, Joaquín Barroso-Flores, and Mónica Moya-Cabrera* “Molecular Group 13 Metallaborates Derived from M−O−M Cleavage Promoted by BH3” Inorg. Chem. 2017, 56, 7890−7899

The second paper deals with heavier atoms and the bonds formed around yttrium complexes with triazoles, for which we calculated a more detailed distribution of the electronic density and concluded that the coordination of Cp to Y involves a high component of ionic character.

This paper is published under the following citation: Ana Cristina García-Álvarez, Erandi Bernabé-Pablo, Joaquín Barroso-Flores, Vojtech Jancik, Diego Martínez-Otero, T. Jesús Morales-Juárez, Mónica Moya-Cabrera* “Multinuclear rare-earth metal complexes supported by chalcogen-based 1,2,3-triazole” Polyhedron 135 (2017) 10-16

We keep working on other projects and I hope we keep on doing so for the foreseeable future because those main group metals have been in my blood all this century. Thanks and a big shoutout to Dr. Mónica Moya for keeping me in her highly productive and competitive team of researchers; here is to many more years of joint work.

The Gossip Approach to Scientific Writing

Communication of scientific findings is an essential skill for any scientist, yet it’s one of those things some students are reluctant to do, partially because of the infamous blank page scare. Once they are confronted with writing their thesis or papers they make some common mistakes, for instance not thinking about who their audience is or not adhering to the main points. One of the highest forms of communication, believe it or not, is gossip, because gossip goes straight to the point, is juicy (i.e. interesting) and seldom needs contextualization, i.e. you deliver it just to the right audience (that’s why gossiping about friends to your relatives is almost never fun) and you do it at the right time (that’s the difference between gossip and anecdotes). Therefore, I tell my students to write as if they were gossiping; give your research a good narrative, because a poor narrative can cause your results to be overlooked.

I’ve read too many theses in which conclusions are about how well the methods work, and unless your thesis has to do with developing a new method, that is a terrible mistake. Methods work well, that is why they are established methods.

Take the following example for a piece of gossip: Say you are in a committed monogamous relationship and you have the feeling your significant other is cheating on you. This is your hypothesis. This hypothesis is supported by their strange behavior; that would be the evidence supporting your hypothesis. But be careful, because there could also be anecdotal evidence which isn’t significant to your own case, as in “the spouse of a friend had this behavior when cheating, ergo mine is cheating too”. The use of anecdotal evidence to support a hypothesis should be avoided like the plague. Then, you need an experimental setup to prove, or even better disprove, your hypothesis. To that end you could hack into your better half’s email, have them followed either by yourself or a third party, confront their friends, snoop on their phone, basically just about anything that might give you some information. This is the core of your research: your data. But data is meaningless without a conclusion. Some people think data should speak for itself and let each reader come up with their own conclusions so they don’t get biased by your own vision, and while there is some truth to that, your data makes sense in a context that you helped develop, so providing your own conclusions is needed or we aren’t scientists but stamp collectors.

This is when most students make a terrible mistake, because here is where gossip skills come in handy: When asked by friends (peers) what it was that they found out, most students will try to convince them that they knew the best algorithms for hacking a phone, or that they were super inconspicuous when following their partners, or even how important the new method was for installing a third party app on their phones to have a text message sent every time the phone went outside a certain area, and yeah, by the way, I found them in bed together. Ultimately the question is left unanswered and the true conclusion lies buried in a lengthy, boring description of the work performed; remember, you performed all that work to reach an ultimate goal, not just for the sake of performing it.

Writers say that every sentence in a book should either move the story forward or show character; in the same way, every section of your scientific written piece should help make the point of your research. Keep the why and the what distinct from the how, and don’t be afraid of treating your research as the best piece of gossip you’ve had in years, because if you are a science student it is.


Stability of Unnatural DNA – @PCCP #CompChem

As is the case with proteins, the functioning of DNA is highly dependent on its 3D structure and not only on its sequence; the difference is that protein tertiary structure has an enormous variety whereas DNA is (almost) always a double helix with little variation. The canonical base pairs AT and CG stabilize the famous double helix, but the same cannot be guaranteed when non-canonical (unnatural) base pairs (UBPs) are introduced.


Figure 1

When I first took a look at Romesberg’s UBPs, d5SICS and dNaM (throughout the study referred to as X and Y, see Fig.1), it was evident that they could not form hydrogen bonds; in the end they’re substituted naphthalenes with no discernible way of creating a synthon like their natural counterparts. That’s when I called Dr. Rodrigo Galindo at the University of Utah, who is one of the developers of the AMBER code and who is very knowledgeable on matters of DNA structure and dynamics; he immediately got on board and soon enough we were launching molecular dynamics simulations and quantum mechanical calculations. That was more than two years ago.

Our latest paper in Phys.Chem.Chem.Phys. deals with the dynamical and structural stability of a DNA strand in which Romesberg’s UBPs are introduced sequentially, one pair at a time, into Dickerson’s dodecamer (a palindromic sequence) from the Protein Data Bank. First, a d5SICS-dNaM pair was inserted right in the middle, forming a tridecamer; as expected, molecular dynamics simulations of more than 10 microseconds exhibited the same stability as the control dodecamer (Fig.2 left). We didn’t need to go far into the substitutions to get the double helix to go awry within a couple of microseconds: three non-consecutive inclusions of UBPs were enough to get a less regular structure (Fig. 2 right); with five, a globular structure was obtained for which it is not possible to get a proper average of the most populated structures.
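For readers less used to DNA jargon, "palindromic" here means the strand reads the same as its own reverse complement. The short sketch below checks that for d(CGCGAATTCGCG), the sequence usually quoted for the Dickerson dodecamer (it is not spelled out in this post, so take it as an assumption), and shows what happens when a single X is dropped in the middle to give a 13-mer. The X/Y pairing rule and the exact insertion position are illustrative assumptions as well.

```python
# Watson-Crick pairs plus the unnatural X:Y pair (d5SICS:dNaM),
# treated here as just another complementary pair for bookkeeping.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C", "X": "Y", "Y": "X"}


def reverse_complement(strand: str) -> str:
    """Return the partner strand, written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(strand))


def is_palindromic(strand: str) -> bool:
    """True when a strand is identical to its own reverse complement."""
    return strand == reverse_complement(strand)


if __name__ == "__main__":
    dodecamer = "CGCGAATTCGCG"  # assumed Dickerson dodecamer sequence
    modified = dodecamer[:6] + "X" + dodecamer[6:]  # hypothetical 13-mer
    print(dodecamer, is_palindromic(dodecamer))
    print(modified, reverse_complement(modified), is_palindromic(modified))
```

The control strand is self-complementary, while the 13-mer is not: its partner strand carries Y where the first strand carries X.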

X and Y don’t form hydrogen bonds, so the pairing is pretty much forced by the scaffold of the rest of the DNA double helix. There is some controversy as to how X and Y fit together, whether they overlap or just wedge between each other; according to our results, a C1-C1′ distance of 11 Å is the most stable, consistent with the wedging conformation. Much work is still needed to understand the pairing between X and Y and even more so to get a pair of useful UBPs. More papers on this topic in the near future.

I’m putting a new blog out there

As if I didn’t have enough things to do I’m launching a new blog inspired by the #365papers hashtag on Twitter and the blog. In it I’ll hopefully list, and write a femto-review of, all the papers I read. This new effort is even more daunting than the actual reading of the huge digital pile of papers I have in my Mendeley To-Be-Read folder, the fattest of them all. The papers therein won’t be a comprehensive review of Comp.Chem. must-read papers but rather papers relevant to our lab’s research or curiosity.

Maybe I’ll include some papers brought to my attention by the group and they could do the review. The whole endeavor might flop in a few weeks but I want to give it a shot; we’ll see how it mutates and if it survives or not. So far I haven’t managed to review all the papers I’ve read but maybe this post will prompt me to do so, if only to save some face. The domain of the new blog is but I think it should have included the word MY at the beginning so as to convey the idea that it is only my own biased reading list. Anyway, if you’re interested share it and subscribe; those posts will not be publicized.

Unnatural DNA and Synthetic Biology

Ever since I read the highly praised article by Floyd Romesberg in Nature back in 2013 I got really interested in synthetic biology. In said article, an unnatural base pair (UBP) was not only inserted into a DNA double strand in vivo but the organism was even able to reproduce the UBPs present in subsequent generations.


Romesberg’s Nucleosides. No Hydrogen bonding is formed between them!

Inserting new unnatural base pairs in DNA works a lot like editing a computer’s code. Inserting a couple of UBPs in vitro is like inserting a comment; it won’t make a difference but it’s still there. If the DNA sequence containing the UBPs can be amplified by molecular biology techniques such as PCR, it means that a polymerase enzyme is able to recognize it and put it in place; this is equivalent to inserting a ‘hello world’ section into a working code: it will compile but it’s pretty much useless. Inserting these UBPs in vivo means that the organism is able to thrive despite the large deformation in a short section of its genetic code, but having it replicated by the chemical machinery of the nucleus is an amazing feat that only a few molecules could allow.

The ultimate goal of synthetic biology would be to find a UBP which codes effectively and purposefully during translation of DNA. This last feat would be equivalent to inserting a working subroutine in a program with a specific purpose. But not only could the use of UBPs serve for the purposes of expanding the genetic code from a quaternary (base four) to a senary (base six) system: the field of DNA origami could also benefit from having an expansion in the chemical and structural possibilities of the famous double helix; marking and editing a sequence would also become easier by having distinctive sections with nucleotides other than A, T, C and G.
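To put a number on what going from base four to base six would buy, here is a small back-of-the-envelope sketch. It assumes codons stay three letters long, as in the natural code; the function name is hypothetical.

```python
import math


def coding_capacity(alphabet_size: int, codon_length: int = 3):
    """Return (bits of information per base, number of distinct codons)."""
    bits_per_base = math.log2(alphabet_size)
    codons = alphabet_size ** codon_length
    return bits_per_base, codons


if __name__ == "__main__":
    for label, size in (("A, T, C, G", 4), ("A, T, C, G, X, Y", 6)):
        bits, codons = coding_capacity(size)
        print(f"{label}: {bits:.3f} bits per base, {codons} possible codons")
```

Under that assumption the number of possible codons goes from 64 to 216, and each position carries about 2.585 bits instead of 2.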

It is precisely in the concept of the double helix that our research takes place, since the available biochemical machinery for translation and replication can only work on a double helix; otherwise, the repair mechanisms get activated or the DNA will just stop serving its purpose (i.e. the code won’t compile).

My good friend, Dr. Rodrigo Galindo and I have worked on the simulation of Romesberg’s UBPs in order to understand the underlying structural, dynamical and electronic causes that made them so successful and to possibly design more efficient UBPs based on a set of general principles. A first paper has been accepted for publication in Phys.Chem.Chem.Phys. and we’re very excited for it; more on that in a future post.

