Category Archives: Computational Chemistry

Dealing with Spin Contamination

Most organic chemistry deals with closed shell calculations, but every once in a while you want to calculate carbenes, free radicals or radical transition states coming from a homolytic bond break, which means your structure is now open shell.

Closed shell systems are characterized by having doubly occupied molecular orbitals, that is to say the calculation is ‘restricted’: Two electrons with opposite spin occupy the same orbital. In open shell systems, unrestricted calculations have a complete set of orbitals for the electrons with alpha spin and another set for those with beta spin. Spin contamination arises from the fact that wavefunctions obtained from unrestricted calculations are no longer eigenfunctions of the total spin operator <S^2>. In other words, one obtains an artificial mixture of spin states; up until now we’re dealing only with single reference methods. With each step of the SCF procedure the value of <S^2> is calculated and compared to s(s+1) where s is half the number of unpaired electrons (0.75 for a radical and 2.0 for triplets, and so on); if a large deviation between these two numbers is found, the then calculation stops.

Gaussian includes an annihilation step during SCF to reduce the amount of spin contamination but it’s not 100% reliable. Spin contaminated wavefunctions aren’t reliable and lead to errors in geometries, energies and population analyses.

One solution to overcome spin contamination is using Restricted Open Shell calculations (ROHF, ROMP2, etc.) for which singly occupied orbitals is used for the unpaired electrons and doubly occupied ones for the rest. These calculations are far more expensive than the unrestricted ones and energies for the unpaired electrons (the interesting ones) are unreliable, specially spin polarization is lost since dynamical correlation is hardly accounted for. The IOP(5/14=2) in Gaussian uses the annihilated wavefunction for the population analysis if acceptable but since Mulliken’s method is not reliable either I don’t advice it anyway. 

The case of DFT is different since rho.alpha and rho.beta can be separated (similarly to the case of unrestricted ab initio calculations), but the fact that both densities are built of Kohn-Sham orbitals and not true canonical orbitals, compensates the contamination somehow. That is not to say that it never shows up in DFT calculations but it is usually less severe, of course for the case of hybrid functional the more HF exchange is included the more important spin contamination may become. 

So, in short, for spin contaminated wavefunctions you want to change from restricted to unrestricted and if that doesn’t work then move to Restricted Open Shell; if using DFT you can use the same scheme and also try changing from hybrid to pure orbitals at the cost of CPU time. There is a last option which is using spin projection methods but I’ll discuss that in a following post. 

Stability of Unnatural DNA – @PCCP #CompChem

As is the case of proteins, the functioning of DNA is highly dependent on its 3D structure and not just only on its sequence but the difference is that protein tertiary structure has an enormous variety whereas DNA is (almost) always a double helix with little variations. The canonical base pairs AT, CG stabilize the famous double helix but the same cannot be guaranteed when non-canonical -unnatural- base pairs (UBPs) are introduced.


Figure 1

When I first took a look at Romesberg’s UBPS, d5SICS and dNaM (throughout the study referred to as X and Y see Fig.1) it was evident that they could not form hydrogen bonds, in the end they’re substituted naphtalenes with no discernible ways of creating a synton like their natural counterparts. That’s when I called Dr. Rodrigo Galindo at Utah University who is one of the developers of the AMBER code and who is very knowledgeable on matters of DNA structure and dynamics; he immediately got on board and soon enough we were launching molecular dynamics simulations and quantum mechanical calculations. That was more than two years ago.

Our latest paper in Phys.Chem.Chem.Phys. deals with the dynamical and structural stability of a DNA strand in which Romesberg’s UBPs are introduced sequentially one pair at a time into Dickerson’s dodecamer (a palindromic sequence) from the Protein Data Bank. Therein d5SICS-dNaM pair were inserted right in the middle forming a trisdecamer; as expected, +10 microseconds molecular dynamics simulations exhibited the same stability as the control dodecamer (Fig.2 left). We didn’t need to go far enough into the substitutions to get the double helix to go awry within a couple of microseconds: Three non-consecutive inclusions of UBPs were enough to get a less regular structure (Fig. 2 right); with five, a globular structure was obtained for which is not possible to get a proper average of the most populated structures.

X and Y don’t form hydrogen bonds so the pairing is pretty much forced by the scaffold of the rest of the DNA’s double helix. There are some controversies as to how X and Y fit together, whether they overlap or just wedge between each other and according to our results, the pairing suggests that a C1-C1′ distance of 11 Å is most stable consistent with the wedging conformation. Still much work is needed to understand the pairing between X and Y and even more so to get a pair of useful UBPs. More papers on this topic in the near future.

I’m putting a new blog out there

As if I didn’t have enough things to do I’m launching a new blog inspired by the #365papers hashtag on Twitter and the blog. In it I’ll hopefully list, write a femto-review of all the papers I read. This new effort is even more daunting than the actual reading of the huge digital pile of papers I have in my Mendeley To-Be-Read folder, the fattest of them all. The papers therein wont be a comprehensive review of Comp.Chem. must-read papers but rather papers relevant to our lab’s research or curiosity.

Maybe I’ll include some papers brought to my attention by the group and they could do the review. The whole endeavor might flop in a few weeks but I want to give it a shot; we’ll see how it mutates and if it survives or not. So far I haven’t managed to review all papers read but maybe this post will prompt to do so if only to save some face. The domain of the new blog is but I think it should have included the word MY at the beginning so as to convey the idea that it is only my own biased reading list. Anyway, if you’re interested share it and subscribe, those post will not be publicized.

Unnatural DNA and Synthetic Biology

Ever since I read the highly praised article by Floyd Romesberg in Nature back in 2013 I got really interested in synthetic biology. In said article, an unnatural base pair (UBP) was not only inserted into a DNA double strand in vivo  but the organism was even able to reproduce the UBPs present in subsequent generations.


Romesberg’s Nucleosides. No Hydrogen bonding is formed between them!

Inserting new unnatural base pairs in DNA works a lot like editing a computer’s code. Inserting a couple UBPs in vitro is like inserting a comment; it wont make a difference but its still there. If the DNA sequence containing the UBPs can be amplified by molecular biology techniques such as PCR it means that a polymerase enzyme is able to recognize it and place it in site, this is equivalent to inserting a ‘hello world’ section into a working code; it will compile but it’s pretty much useless. Inserting these UBPs in vivo means that the organism is able to thrive despite the large deformation in a short section of its genetic code, but having it replicated by the chemical machinery of the nucleus is an amazing feat that only a few molecules could allow.

The ultimate goal of synthetic biology would be to find a UBP which codes effectively and purposefully during translation of DNA.This last feat would be equivalent to inserting a working subroutine in a program with a specific purpose. But not only could the use of UBPs serve for the purposes of expanding the genetic code from a quaternary (base four) to a senary (base six) system: the field of DNA origami could also benefit from having an expansion in the chemical and structural possibilities of the famous double helix; marking and editing a sequence would also become easier by having distinctive sections with nucleotides other than A, T, C and G.

It is precisely in the concept of double helix that our research takes place since the available biochemical machinery for translation and replication can only work on a double helix, else, the repair mechanisms get activated or the DNA will just stop serving its purpose (i.e. the code wont compile).

My good friend, Dr. Rodrigo Galindo and I have worked on the simulation of Romesberg’s UBPs in order to understand the underlying structural, dynamical and electronic causes that made them so successful and to possibly design more efficient UBPs based on a set of general principles. A first paper has been accepted for publication in Phys.Chem.Chem.Phys. and we’re very excited for it; more on that in a future post.

Grimme’s Dispersion DFT-D3 in Gaussian #CompChem

I was just asked if it is possible to perform DFT-D3 calculations in Gaussian and my first answer was to use the following  keyword:


which is available in G16 and G09 only in revision D, apparently.

There are also some overlays that can be used to invoke the use dispersion in various scenarios:

IOp(3/74=x) Exchange and Correlation Potentials





DSD-PBEP86 (double hybrid, DFT-D3).


B2PLYP-D3 (double hybrid, DFT-D3).

B97-D (DFT-D3).

IOp(3/76=x) Mixing of HF and DFT.

-33 PW6B95 and PW6B95-D3 coefficients.

IOp(3/124=x) Empirical dispersion term.




Force dispersion type 3 (Grimme DFT-D3).

Force dispersion type 4 (Grimme DFT-D3(BJ)).

Force dispersion type 5 (Grimme D3, PM7 version).


The D3 correction method of Grimme defines the van der Waals energy like:

$\displaystyle E_{\rm disp} = -\frac{1}{2} \sum_{i=1}^{N_{at}} \sum_{j=1}^{N_{at...
...{6ij}} {r_{ij,{L}}^6} +f_{d,8}(r_{ij,L})\,\frac{C_{8ij}} {r_{ij,L}^8} \right ),$

where coefficients $ C_{6ij}$ are adjusted depending on the geometry of atoms i and j. The damping D3 function for is:

$\displaystyle f_{d,n}(r_{ij}) = \frac{s_n}{1+6(r_{ij}/(s_{R,n}R_{0ij}))^{-\alpha_{n}}},$

where the values of s are adjustable parameters fit for the exchange-correlation functionals used in each calculation.

Mexican Phys.Chem. Meeting XVth edition 

For the fifth year in a row my research group has participated in this traditional meeting on theoretical and computational chemistry, now at the beautiful city of Merida in southeastern Mexico.

Several distinguished international guests included Profs. Jose Luis Mendoza (Florida State University), Adrián Roitberg (University of Florida), Vincent Ortiz (Auburn University) and Paul Ayers (McMaster U. Canada); Their contributions rounded up nicely those of household names like Drs. Alberto Vela, Gabriel Merino (CINVESTAV) (the latter was also the main organizer), Jesus Hernández-Trujillo (UNAM), Jose Luis Gazquez (UAM-I), Óscar Jimenez (Guanajuato), and so many others who were also present.

My students presented four posters summarized below:

1) Maru Sandoval and Gustavo Mondragón on Photosynthesis, particularly the search for exciton transference mechanisms in both natural and theoretical arrangements of photosynthetic pigments. Some very exciting results have been observed; their publication is really near.

2) Raúl Torres and Gustavo Mondragón presented their work on arsenic removing calixarenes, published earlier this year, and the extension of said work to As(III) acids. Graphene oxide is now considered in our simulations as per the experimental work of our colleagues, Prof. Reyes Sierra and Prof. Eddie Lopez-Honorato.

3) Marco Diaz, Guillermo Caballero, Gustavo Mondragón and Raúl Torres had this poster on the calculation of sigma holes as descriptors for predicting pka values in organic acids. Their +1600 calculations project has found the best levels of theory (and ruled out some like B3LYP, of course) with some nice correlations. Yet, much work is still to be done but we’re on the right track.

4) Durbis Castillo presented his work on molecular docking and dynamics of a large library of HIV-1 entry inhibitors for which he uses the suite MAESTRO as a continuation of another project of ours. His enormous library is now in the hundredths of thousands and although we’re facing some technical difficulties, Durbis is thriving in his search. This is our first serious attempt towards a more mature drug discovery project; a manuscript should be ready in the first part of next year.

This guys and the rest of the lab who weren’t present are the ones who make our research flourish and they’ve all earned a day or two at the beach!

Here’s to fifteen more years of RMFQT!

Tribology – New paper in JPC A

Tribology isn’t exactly an area with which us chemists are most familiar, yet chemistry has a great impact on this branch of physics of high industrial importance. Tribology is basically the science which studies the causes and consequences of friction between surfaces. 

The plastic bag industry requires the use of chemical additives to reduce the electrostatic adherence between sheets of plastic. My good old friend Dr. Armando Gama has studied through Dissipative Particle Dynamics (DPD) coarse-grained simulations the friction coefficients of having two slightly different molecules: erukamide and behenamide, which only differ in the presence of a double bond between carbon atoms 12 and 13 (Fig1).


Fig 1

In order to study the electronic aspects that give rise to different tribological effects in these very similar molecules, four chains of each kind were bounded to a frozen graphene surface (four bonds apart to prevent steric crowding) and were optimized at the B97D/6-31G(d,p) level of theory.



Double bonds in erukamide pile together through pi-pi stacking interactions (Fig2) which are absent in behenamide which is why these last ones are able to slide better between each other (Fig3). Interaction energies calculated for the inner chains at the same level of theory are 44.21 and 34.46 kcal/mol for erukamide and behenamide, respectively. As per the suggestion of a referee we extended the calculations to a 2D system by placing seven molecules on graphene, which once again was kept at the optimized geometry of its isolated state, at four bonds of separation in order to prevent steric crowding (Fig 4).


Fig 4

This calculations clearly represent a limit case with a high density covering of the surface, but they nevertheless reflect the observed trend that behenamide works better than erukamide in reducing the static friction coefficient between sheets.

The paper is now available at JPC-A. Thanks to Dr. Gama for this great opportunity to work with his team, I know it wont be the last.

New paper in PCCP: CCl3 reduced to CH3 through σ-holes #CompChem

I found it surprising that the trichloromethyl group could be chemically reduced into a methyl group quite rapidly in the presence of thiophenol, but once again a failed reaction in the lab gave us the opportunity to learn some nuances about the chemical reactivity of organic compounds. Even more surprising was the fact that this reduction occured through a mechanism in which chlorine atoms behave as electrophiles and not as nucleophiles.

We proposed the mechanism shown in figure 1 to be consistent with the 1H-NMR kinetic experiment (Figure 2) which shows the presence of the intermediary sulfides and leads to the observed phenyl-disulfide as the only isolable byproduct. The proposed mechanism invokes the presence of σ-holes on chlorine atoms to justify the attack of thiophenolate towards the chlorine atom leaving a carbanion behind during the first step. The NMR spectra were recorded at 195K which implies that the energy barriers had to be very low; the first step has a ~3kcal/mol energy barrier at this temperature.



Figure 1 – Calculated mechanism BMK/6-31G(d,p) sigma holes are observed on Cl atoms


1H NMR of the chemical reduction of the trichloromethyl group. Sulfide 4 is the only observed byproduct

To calculate these energy barriers we employed the BMK functional as implemented in Gaussian09. This functional came highly recommended to this purpose and I gotta say it delivered! The optimized geometries of all transition states and intermediaries were then taken to an MP2 single point upon which the maximum electrostatic potential on each atom (Vmax) was calculated with MultiWFN. In figure 3 we can observe the position and Vmax value of σ-holes on chlorine atoms as suggested by the mapping of electrostatic potential on the electron density of various compounds.

We later ran the same MP2 calculations on other CCl3 groups and found that the binding to an electron withdrawing group is necessary for a σ-hole to be present. (This fact was already present in the literature, of course, but reproducing it served us to validate our methodology.)


Figure 3 – Sigma holes found on other CCl3 containing compounds

We are pleased to have this work published in PhysChemChemPhys. Thanks to Dr. Moisés Romero for letting us into his laboratory and to Guillermo Caballero for his hard work both in the lab and behind the computer; Guillermo is now bound to Cambridge to get his PhD, we wish him every success possible in his new job and hope to see him again in a few years, I’m sure he will make a good job at his new laboratory.

Quick note on WFN(X) files and MP2 calculations #G09 #CompChem

A few weeks back we wrote about using WFN(X) files with MultiWFN in order to find σ-holes in halogen atoms by calculating the maximum potential on a given surface. We later found out that using a chk file to generate a wfn(x) file using the guess=(read,only) keyword didn’t retrieve the MP2 wavefunction but rather the HF wavefunction! Luckily we realized this problem very quickly and were able to fix it. We tried to generate the wfn(x) file with the following keywords at the route section

#p guess=(read,only) density=current

but we kept retrieving the HF values, which we noticed by running the corresponding HF calculation and noticing that every value extracted from the WFN file was exactly the same.

So, if you want a WFN(X) file for post processing an MP2 (or any other post-HartreFock calculation for that matter) ask for it from the beginning of your calculation in the same job. I still don’t know how to work around this or but will be happy to report it whenever I do.

PS. A sincere apology to all subscribers for getting a notification to this post when it wasn’t still finished.

A New Graduate Student

With pleasure I announce that last week our very own Gustavo “Gus” Mondragón became the fifth undergraduate student from my lab to defend his BSc thesis and it has to be said that he did it admirably so.

Gus has been working with us for about a year now and during this time he not only worked on his thesis calculating excited states for bacteriochlorophyl pigments but also helped us finishing some series of calculations on calix[n]arene complexes of Arsenic (V) acids, which granted him the possibility to apear as a co-author of the manuscript recently published in JIPH. Back in that study he calculated the interaction energies between a family of calix macrocycles and arsenic acid derivatives in order to develop a suitable extracting agent.

For his BSc thesis, Gus reproduced the UV-Vis absorption spectra of bacteriochlorophyll-a pigments found in the Fenna-Matthews-Olson complex of photosynthetic purple bacteria using Time Dependent Density Functional Theory (TD-DFT) with various levels of theory, with PBEPBE yielding the best results among the tried set. These calculations were performed at the crystallographic conformation and at the optimized structure, also, in vacuo results were compared to those in implicit solvent (SMD, MeOH). He will now move towards his masters where he will further continue our research on photosynthesis.

Thank you, Gustavo, for your hard work and your sense of humor. Congratulations on this step and may many more successes come your way.


%d bloggers like this: