# Category Archives: Theoretical Chemistry

## Post Calculation Addition of Empirical Dispersion – Fixing interaction energies

Calculating interaction energies is one of those things people care about most, and it is also one of those most often done wrong. The so-called ‘*gold standard*’ for calculating supramolecular interaction energies, according to Pavel Hobza, is the CCSD(T)/CBS level of theory, which is highly impractical for most cases beyond 50 or so light atoms. Basis set extrapolation methods and inclusion of electron correlation with MP2 methods yield excellent results, but they are nonetheless almost as time consuming as CC. DFT methods in general perform terribly here, and yet they remain the most widely used tools for electronic structure calculations due to their competitive computing times and the wide availability of schemes for including terms which help describe various kinds of interactions. The most important ingredients needed to get decent-to-good interaction energies with DFT methods are correlation and dispersion. The first can be recovered by a good correlation functional, while empirical dispersion takes care of the latter shortcoming, dramatically improving the results for interaction energies even for lousy functionals such as the infamous B3LYP. The results still won’t be of benchmark quality, but the deviations from the *gold standard* shrink significantly, making them more quantitatively reliable.

There is an online tool from Grimme’s group for calculating the empirical dispersion and adding it to a calculation which originally lacked it. In the link below you can upload your calculation and select the basis set and functional originally employed in it along with the desired damping model; in return you get the corrected energy through a geometrical counterpoise (gCP) correction and Grimme’s empirical dispersion function, D3, of which I have previously written here.

The gCP-D3 Webservice is located at: http://wwwtc.thch.uni-bonn.de/

The platform is entirely straightforward to use and it works with xyz, Turbomole, ORCA and Gaussian output files. The concept is very simple: both the gCP and D3 contributions are computed for the selected basis set and added to the uncorrected DFT (or HF) energy (eq. 1).

$$E^{\text{gCP-D3}} = E_{\text{DFT/HF}} + E_{\text{gCP}} + E_{\text{D3}} \qquad (\mathbf{1})$$

If you’re trying to calculate interaction energies, remember to perform these corrections for every component in your supramolecular assembly (eq. 2)

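$$\Delta E_{\text{int}} = E^{\text{gCP-D3}}_{\text{complex}} - \sum_{i} E^{\text{gCP-D3}}_{i} \qquad (\mathbf{2})$$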
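The bookkeeping described here can be sketched in a few lines of Python. This is only an illustration: the `Species` class, the function names and every number below are made up for the example, and in practice the three energy terms would be read from the decomposed energy reported by the web service.

```python
from dataclasses import dataclass
from typing import Sequence

HARTREE_TO_KCAL_MOL = 627.509  # conversion factor, hartree -> kcal/mol


@dataclass
class Species:
    """Energy terms (in hartree) for the complex or for one of its components."""
    e_scf: float  # uncorrected DFT or HF energy
    e_gcp: float  # geometrical counterpoise correction
    e_d3: float   # D3 empirical dispersion correction

    def corrected(self) -> float:
        # Corrected energy: uncorrected energy plus both corrections
        return self.e_scf + self.e_gcp + self.e_d3


def interaction_energy(assembly: Species, components: Sequence[Species]) -> float:
    """Corrected energy of the assembly minus that of every component."""
    return assembly.corrected() - sum(c.corrected() for c in components)


# Placeholder values, NOT results from any real calculation
dimer = Species(e_scf=-2.000, e_gcp=0.004, e_d3=-0.010)
monomer_a = Species(e_scf=-1.200, e_gcp=0.001, e_d3=-0.003)
monomer_b = Species(e_scf=-0.790, e_gcp=0.001, e_d3=-0.002)

delta_e = interaction_energy(dimer, [monomer_a, monomer_b])
print(f"Interaction energy: {delta_e * HARTREE_TO_KCAL_MOL:.2f} kcal/mol")
```

The point of keeping the three terms separate is that the same correction must be applied consistently to the assembly and to each of its components before taking the difference.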

Here’s a screen capture of the outcome after uploading a G09 log file for the simplest of options, B3LYP/6-31G(*d*): a decomposed energy is shown on the left while a 3D interactive Jmol rendering of your molecule is shown on the right. Also, various links to the literature explaining the details of these calculations are available in the top menu.

I’m currently writing a book chapter on methods for calculating interaction energies, so expect many more posts like this. A special mention to Dr. Jacinto Sandoval, who is working with us as a postdoc researcher, for bringing this platform to my attention; I was apparently living under a rock.

## DFT Textbook in Spanish by Dr. José Cerón-Carrasco

Today’s science is published mostly in English, which means that non-English speakers must first tackle the language barrier before sharing their scientific ideas and results with the community; this blog is proof that non-native English speakers such as myself cannot reach a large audience in another language.

For young scientists learning English is a must nowadays, but it shouldn’t keep students from learning science in their own native tongues. To that end, the noble effort by Dr. José Cerón-Carrasco from Universidad Católica San Antonio de Murcia, in Spain, of writing a DFT textbook in Spanish constitutes a remarkable resource for Spanish-speaking computational chemistry students, not only because it is a clear and concise introduction to ab initio and DFT methods but also because it was self-published and written directly in Spanish. His book “*Introducción a los métodos DFT: Descifrando B3LYP sin morir en el intento*” is now available on Amazon. Dr. Cerón-Carrasco was very kind to invite me to write a prologue for his book; I’m very thankful to him for this opportunity.

So for Spanish-speaking students there is now a very valuable resource for learning DFT without dying in the attempt, thanks to the effort and the mind of Dr. José Pedro Cerón Carrasco, whom I thank for sharing the first look at his book with me.

Cheers and olé!

## Chemistry Makes the Chemical

The compound shown below in figure 1 is listed by Aldrich as 4,5,6,7-tetrahydroindole, but is it really?

To a hardcore organic chemist it is clear that this is not an indole but a pyrrole because the lack of aromaticity in the fused ring gives this molecule the same reactivity as 2,3-diethyl pyrrole. If you search the ChemSpider database for ‘tetrahydroindole’ the search returns the following compound with the identical chemical formula C8H11N but with a different hydrogenation pattern: 2,3,3a,4-Tetrahydro-1H-indole

The real indole, upon an electrophilic attack, behaves as a free enamine, yielding the product shown in figure 3 in which the substitution occurs at position 3. This compound cannot undergo an Aromatic Electrophilic Substitution since that would imply the formation of a sigma complex which would disrupt the aromaticity.

By contrast, the corresponding pyrrole is substituted at position 2.

These differences in reactivity towards electrophiles are easily rationalized when we plot their HOMO orbitals (calculated at the M062X/def2TZVP level of theory):

If we calculate the Fukui indexes at the same level of theory we get the highest value for susceptibility towards an electrophilic attack as follows: 0.20 for C(3) in indole and 0.25 for C(2) in pyrrole, consistent with the previous reaction schemes.
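For readers who want to reproduce this kind of number, here is a minimal sketch of one common way to obtain condensed Fukui indices for electrophilic attack: a finite difference between the atomic charges of the neutral molecule and those of its cation at the same geometry. The post does not say which scheme was actually used, so treat this as an assumption; the charges below are placeholders and not the values behind the 0.20 and 0.25 quoted above.

```python
from typing import Dict


def condensed_fukui_minus(q_neutral: Dict[str, float],
                          q_cation: Dict[str, float]) -> Dict[str, float]:
    """f-(k) = q_k(N-1 electrons) - q_k(N electrons), both at the neutral geometry.

    The larger the value, the more electron density atom k gives up on
    ionization, i.e. the more susceptible it is to electrophilic attack.
    """
    return {atom: q_cation[atom] - q_neutral[atom] for atom in q_neutral}


# Placeholder atomic charges (hypothetical, for illustration only)
q_neutral = {"C2": -0.10, "C3": -0.20}
q_cation = {"C2": 0.05, "C3": 0.00}

fukui = condensed_fukui_minus(q_neutral, q_cation)
most_reactive = max(fukui, key=fukui.get)
print(fukui, "-> most susceptible site:", most_reactive)
```

Any population analysis can feed the function, but the resulting indices depend on that choice, so values should only be compared within the same scheme and level of theory.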

So, why is it listed as an indole? Why would anyone search for it under that name? Nobody thinks about cyclohexane as 1,3,5-trihydrobenzene. According to my good friend and colleague Dr. Moisés Romero, most names for heterocycles are kept even after such dramatic chemical changes for historical and mnemonic reasons, even when the reactivity is entirely different. This is only a nomenclature issue that we have inherited from the times of Hantzsch, more than a century ago. We’ve become used to keeping the trivial (or should I say arbitrary) names and to deriving further names from them, but this could pose an epistemological problem if students cannot recognize which heterocycle presents which reactivity.

So, in a nutshell:

Chemistry makes the chemical and not the structure.

Something we all know but sometimes overlook for the sake of simplicity.

## Photosynthesis and Singlet Fission – #WATOC2017 PO1-296

If you work in the field of photovoltaics or polyacene photochemistry, then you are probably aware of the Singlet Fission (SF) phenomenon. SF can be broadly described as the process where an excited singlet state decays to a couple of degenerate coupled triplet states (via a multiexcitonic state) with roughly half the energy of the original singlet state, which in principle could be centered in two neighboring molecules; this generates two holes with a single photon, i.e. twice the current albeit at half the voltage (Fig 1).

It could also be viewed as the inverse process to triplet-triplet annihilation. An important requirement for SF is that the two triplets to which the singlet decays must be coupled in a ^{1}(*TT*) state, otherwise the process is spin-forbidden. Unfortunately (from a computational perspective) this also means that the ^{3}(*TT*) and ^{5}(*TT*) states are present and should be taken into account, and when it comes to chlorophyll derivatives the computational cost quickly scales up.
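Why exactly those three multiplicities show up follows from the usual rules of angular momentum coupling: two spins s1 and s2 couple to total spins running from |s1 − s2| to s1 + s2 in unit steps. A small sketch (the function name is mine, chosen for the example):

```python
from typing import List


def coupled_multiplicities(s1: float, s2: float) -> List[int]:
    """Multiplicities 2S+1 for every total spin S = |s1-s2|, ..., s1+s2."""
    s_min = abs(s1 - s2)
    s_max = s1 + s2
    n_states = int(round(s_max - s_min)) + 1
    return [int(round(2 * (s_min + k) + 1)) for k in range(n_states)]


# Two triplets (s = 1 each) give a singlet, a triplet and a quintet
print(coupled_multiplicities(1, 1))
```

Only the first of these, the overall singlet, can be reached from the excited singlet without a spin flip, which is why the other two are a computational burden rather than a productive channel.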

SF has been observed in polyacenes but so far the only photosynthetic pigments that have proven to exhibit SF are some carotene derivatives; so what about chlorophyll derivatives? For a -very- long time now, we have explored the possibility of finding a naturally-occurring, chlorophyll-based, photosynthetic system in which SF could be possible.

But first things first: the methodology. It soon became clear, from María Eugenia Sandoval’s MSc thesis, that TD-DFT wasn’t going to be enough to capture the whole description of the coupled states which give rise to SF. It was then that we started our collaboration with SF expert Prof. David Casanova from the Basque Country University at Donostia, who suggested the use of Restricted Active Space – Spin Flip in order to properly account for the spin change during the decay of the singlet excited state. A set of optimized bacteriochlorophyll-a (BChl-a) molecules were oriented ad hoc so that their *Qy* transition dipole moments were either parallel or perpendicular; the rates at which SF could in principle occur indicated that both molecules should have their *Qy* dipole moments in a parallel configuration.

To translate this into a naturally-occurring system we looked at two candidates: the Fenna-Matthews-Olson (FMO) complex, containing 7 BChl-a molecules, and a chlorosome from a mutant photosynthetic bacterium, made up of 600 BChl-d molecules (Fig 2). The FMO complex is a trimeric pigment-protein complex which lies between the antenna complex and the reaction center in green sulfur-dependent photosynthetic bacteria such as *P. aestuarii* or *C. tepidum*, thus serving as a molecular wire in which the excitonic transfer is known to occur with quantum coherence, i.e. with virtually no energy loss, which led us to believe SF could be an operating mechanism. So far it seems it is not present. However, for a crystallographic BChl-d dimer present in the chlorosome it could actually occur, even in competition with fluorescence.

I will keep on blogging more -numerical and computational- details about these results and hopefully about its publication but for now I will wrap this post by giving credit where credit is due: This whole project has been tackled by our former lab member María Eugenia “Maru” Sandoval and Gustavo Mondragón. Finally, after much struggle, we are presenting our results at **WATOC 2017** next week on **Monday 28th** at poster session 01 (**PO1-296**), so please stop by to say hi and comment on our work so we can improve it and bring it home!

## All you wanted to know about Hybrid Orbitals…

#### … but were afraid to ask

#### or

#### How I learned to stop worrying and not caring that much about hybridization.

The math behind orbital hybridization is fairly simple, as I’ll try to show below, but first let me give my praise once again to the formidable Linus Pauling, whose creation of this model built a bridge between quantum mechanics and chemistry; I often say Pauling was the first Quantum Chemist (Gilbert N. Lewis’ fans, please settle down). Hybrid orbitals are a way to create a basis that better suits the geometry formed by the bonds around a given atom, and not the result of a process in which atomic orbitals transform themselves for a better steric fit; or, like I’ve said before, the C atom in CH_{4} is sp^{3} hybridized because CH_{4} is tetrahedral and not the other way around. Jack Simmons put it better in his book:

The atomic orbitals we all know and love are the set of solutions to the Schrödinger equation for the Hydrogen atom and more generally they are solutions to the hydrogen-like atoms for which the value of *Z* in the potential term of the Hamiltonian changes according to each element’s atomic number.

Since the Hamiltonian, like any other quantum mechanical operator for that matter, is a linear (and Hermitian) operator, any linear combination of degenerate wave functions that are solutions to it will also be an acceptable solution. The *2s* and *2p* valence orbitals of carbon do not point towards the vertices of a tetrahedron, so they don’t offer a suitable basis for explaining the geometry of methane; even more so, in the carbon atom these atomic orbitals are not degenerate, and there is no reason to assume the C-H bonds in methane aren’t all equal. However, we can come up with a linear combination of them that does point in the right directions and that, at the same time, is a solution to the Schrödinger equation of the hydrogen-like atom.

Ok, so we need four degenerate orbitals, which we’ll name $\zeta_i$, and we formulate them as linear combinations of the C atom valence orbitals:

$$
\begin{aligned}
\zeta_1 &= a_1\,2s + b_1\,2p_x + c_1\,2p_y + d_1\,2p_z \\
\zeta_2 &= a_2\,2s + b_2\,2p_x + c_2\,2p_y + d_2\,2p_z \\
\zeta_3 &= a_3\,2s + b_3\,2p_x + c_3\,2p_y + d_3\,2p_z \\
\zeta_4 &= a_4\,2s + b_4\,2p_x + c_4\,2p_y + d_4\,2p_z
\end{aligned}
$$

To comply with equivalency let’s set $a_1 = a_2 = a_3 = a_4$ and normalize them:

$$a_1^2 + a_2^2 + a_3^2 + a_4^2 = 1 \;\therefore\; a_i = \frac{1}{\sqrt{4}}$$

Let’s take $\zeta_1$ to be directed along the $z$ axis, so $b_1 = c_1 = 0$:

$$\zeta_1 = \frac{1}{\sqrt{4}}(2s) + d_1\,2p_z$$

Since $\zeta_1$ must be normalized, the sum of the squares of the coefficients is equal to 1:

$$\frac{1}{4} + d_1^2 = 1; \qquad d_1 = \frac{\sqrt{3}}{2}$$

Therefore the first hybrid orbital looks like:

$$\zeta_1 = \frac{1}{\sqrt{4}}(2s) + \frac{\sqrt{3}}{2}(2p_z)$$

We now set the second hybrid orbital on the $xz$ plane, therefore $c_2 = 0$:

$$\zeta_2 = \frac{1}{\sqrt{4}}(2s) + b_2\,2p_x + d_2\,2p_z$$

Since these hybrid orbitals must comply with all the conditions of atomic orbitals, they should also be orthonormal:

$$\langle \zeta_1 | \zeta_2 \rangle = \delta_{1,2} = 0$$

$$\frac{1}{4} + d_2\,\frac{\sqrt{3}}{2} = 0$$

$$d_2 = -\frac{1}{2\sqrt{3}}$$

Our second hybrid orbital is almost complete; we are only missing the value of $b_2$:

$$\zeta_2 = \frac{1}{\sqrt{4}}(2s) + b_2\,2p_x - \frac{1}{2\sqrt{3}}(2p_z)$$

Again we make use of the normalization condition:

$$\frac{1}{4} + b_2^2 + \frac{1}{12} = 1; \qquad b_2 = \frac{\sqrt{2}}{\sqrt{3}}$$

Finally, our second hybrid orbital takes the following form:

$$\zeta_2 = \frac{1}{\sqrt{4}}(2s) + \frac{\sqrt{2}}{\sqrt{3}}(2p_x) - \frac{1}{\sqrt{12}}(2p_z)$$

The procedure to obtain the remaining two hybrid orbitals is the same, but I’d like to stop here and analyze the relative direction $\zeta_1$ and $\zeta_2$ take from each other. To that end, we take the angular part of the hydrogen-like atomic orbitals involved in the linear combinations we just found. Let us remember the canonical form of atomic orbitals and explicitly show the spherical harmonic functions to which the 2s, 2px, and 2pz atomic orbitals correspond:

$$\psi_{2s} = \left(\frac{1}{4\pi}\right)^{1/2} R(r)$$

$$\psi_{2p_x} = \left(\frac{3}{4\pi}\right)^{1/2} \sin\theta\cos\varphi\; R(r)$$

$$\psi_{2p_z} = \left(\frac{3}{4\pi}\right)^{1/2} \cos\theta\; R(r)$$

We substitute these in $\zeta_2$ and factor out $R(r)$ and $1/\sqrt{4\pi}$:

$$\zeta_2 = \frac{R(r)}{\sqrt{4\pi}}\left[\frac{1}{\sqrt{4}} + \sqrt{2}\,\sin\theta\cos\varphi - \frac{\sqrt{3}}{\sqrt{12}}\cos\theta\right]$$

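We differentiate $\zeta_2$ with respect to θ (on the $xz$ plane, where $\cos\varphi = 1$) and set the derivative to zero to find the value of θ, measured from the $z$ axis, at which $\zeta_2$ is maximal; this gives the angle between the first two hybrid orbitals $\zeta_1$ and $\zeta_2$ (remember that $\zeta_1$ is projected entirely over the $z$ axis):

$$\frac{d\zeta_2}{d\theta} = \frac{R(r)}{\sqrt{4\pi}}\left[\sqrt{2}\,\cos\theta + \frac{\sqrt{3}}{\sqrt{12}}\sin\theta\right] = 0$$

$$\frac{\sin\theta}{\cos\theta} = \tan\theta = -\sqrt{8}$$

$$\theta = -70.53^{\circ},$$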

but since θ is measured from the z axis and runs from 0° to 180°, this result is equivalent to the supplementary angle 180.0° – 70.53° = 109.47°, which is exactly the angle between the C-H bonds in methane we all know! And we didn’t need to invoke the unpairing of electrons in full orbitals, the promotion of any electron into empty orbitals, nor the ‘*reorganization*’ of said orbitals into new ones. Orbital hybridization is nothing but a mathematical tool to find a set of orbitals which comply with the experimental observation, and that is the important thing here!
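The whole derivation can be checked numerically with nothing more than the coefficients themselves. In the sketch below the first two hybrids are the ones obtained in this post; the third and fourth are not derived here, they are my own completion following the same procedure, so take them as an assumption of the example. The direction of each hybrid is taken to be that of its p-orbital part, since the 2s contribution is isotropic.

```python
import math
from itertools import combinations

# Coefficients over the basis (2s, 2px, 2py, 2pz).
# zeta_1 and zeta_2 come from the derivation in the post; zeta_3 and zeta_4
# are filled in by the same procedure and are an addition of this sketch.
HYBRIDS = {
    "zeta_1": (0.5, 0.0, 0.0, math.sqrt(3) / 2),
    "zeta_2": (0.5, math.sqrt(2 / 3), 0.0, -1 / math.sqrt(12)),
    "zeta_3": (0.5, -1 / math.sqrt(6), 1 / math.sqrt(2), -1 / math.sqrt(12)),
    "zeta_4": (0.5, -1 / math.sqrt(6), -1 / math.sqrt(2), -1 / math.sqrt(12)),
}


def dot(u, v):
    """Overlap of two hybrids expressed in an orthonormal atomic basis."""
    return sum(a * b for a, b in zip(u, v))


def angle_between(h1, h2):
    """Angle in degrees between the p-orbital (directional) parts of two hybrids."""
    p1, p2 = h1[1:], h2[1:]
    cos_angle = dot(p1, p2) / math.sqrt(dot(p1, p1) * dot(p2, p2))
    return math.degrees(math.acos(cos_angle))


for name, coeffs in HYBRIDS.items():
    print(f"{name}: norm = {dot(coeffs, coeffs):.6f}")

for (n1, h1), (n2, h2) in combinations(HYBRIDS.items(), 2):
    print(f"{n1}/{n2}: overlap = {abs(dot(h1, h2)):.6f}, "
          f"angle = {angle_between(h1, h2):.2f} degrees")
```

Every pair should come out orthogonal and separated by the tetrahedral angle, arccos(−1/3), which is the same 109.47° found above by differentiation.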

To summarize, you can take any number of orbitals and build any linear combination you want in order to comply with the observed geometry. Furthermore, no matter what hybridization scheme you follow, you still take the entire orbital; you cannot take half of it, because they are basis functions. That is why you should never believe that any atom exhibits something like an *sp^{2.5}* hybridization just because its bond angles lie between 109 and 120°. Take a vector $\mathbf{v} = x\,\mathbf{i} + y\,\mathbf{j} + z\,\mathbf{k}$: even if you specify it to be $\mathbf{v} = \tfrac{1}{2}\,\mathbf{i}$, that means $x = 1/2$, not that you took half of the unit vector $\mathbf{i}$, and it doesn’t mean you took nothing of $\mathbf{j}$ and $\mathbf{k}$ but rather that $y = z = 0$.

This was a very lengthy post so please let me know if you read it all the way through by commenting, liking, or sharing. Thanks for reading.

## No, seriously, why can’t orbitals be observed?

The concept of *electronic orbital* has become such a useful and ingrained tool in understanding chemical structure and reactivity that it has almost become one of those things whose original meaning has been lost and replaced by a utilitarian concept, one which is not bad in itself but which may lead to some wrong conclusions when certain fundamental facts are overlooked.

Last week I wrote -what I thought was- a humorous post on this topic because a couple of weeks ago a viewpoint in JPC-A was published by Pham and Gordon on the possibility of observing molecular orbitals through microscopy methods, which elicited a ‘*seriously? again?*’ reaction from me, since I distinctly remember the Nature article by Zuo from the year 2000, when I had just entered graduate school. The article is titled “*direct observation of d-orbital holes.*” We discussed this paper in class and the discussion it prompted was very interesting at various levels: for starters, the allegedly observed d-orbital was strikingly similar to a $d_{z^2}$, which we had learned in class (thanks, prof. Carlos Amador!) is actually a linear combination of the $d_{z^2-x^2}$ and $d_{z^2-y^2}$ orbitals, a mathematical -let’s say- trick to conform to spectroscopic observations.

Pham and Gordon are pretty clear in their first paragraph: “*The wave function amplitude $\Psi^{*}\Psi$ is interpreted as the probability density. All observable atomic or molecular properties are determined by the probability and a corresponding quantum mechanical operator, not by the wave function itself. Wave functions, even exact wave functions, are not observables.*” There is even another problem, about which I wrote a post a long time ago: orbitals are non-unique. This means that I could get a set of orbitals by solving the Schrödinger equation for any given molecule and then perform a unitary transformation on them (such as renormalizing them, re-orthonormalizing them to get a localized version, or even hybridizing them) and the electronic density derived from them would be the same! In quantum mechanical terms this means that the probability density associated with the wave function inner product, $\Psi^{*}\Psi$, is not changed upon unitary transformations; why then would a specific version be “observed” under a microscope? As Pham and Gordon state more eloquently, it has to do with the Density of States (DOS) rather than with the orbitals. Furthermore, an orbital, or more precisely a spin-orbital, is conveniently (in math terms) separated into a radial, an angular and a spin component, $R(r)\,Y_l^m(\theta,\varphi)\,\sigma(\alpha,\beta)$, with the angular part given by the spherical harmonic functions $Y_l^m(\theta,\varphi)$, which in turn -when plotted in spherical coordinates- create the famous lobes all of us chemists know and love. Zuo’s observation claim was based on the resemblance of the observed density to the angular part of an atomic orbital. Another thing: orbitals have phases, and no experimental observation claims to have resolved those.
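The non-uniqueness argument is easy to verify numerically. The toy example below is a sketch under simplifying assumptions: real coefficients, an orthonormal three-function basis, and two made-up occupied orbitals. It mixes the two orbitals through a rotation (the simplest unitary transformation) and compares the density matrices built before and after.

```python
import math


def density_matrix(orbitals):
    """P[i][j] = sum over occupied orbitals of c[i] * c[j] (orthonormal basis assumed)."""
    size = len(orbitals[0])
    return [[sum(c[i] * c[j] for c in orbitals) for j in range(size)]
            for i in range(size)]


def rotate_pair(orbitals, angle):
    """Mix two occupied orbitals with a 2x2 rotation, i.e. a unitary transformation."""
    first, second = orbitals
    cos_t, sin_t = math.cos(angle), math.sin(angle)
    mixed_a = [cos_t * x + sin_t * y for x, y in zip(first, second)]
    mixed_b = [-sin_t * x + cos_t * y for x, y in zip(first, second)]
    return [mixed_a, mixed_b]


def max_difference(m1, m2):
    """Largest absolute element-wise difference between two nested lists."""
    return max(abs(a - b) for r1, r2 in zip(m1, m2) for a, b in zip(r1, r2))


# Two made-up orthonormal occupied orbitals in a three-function basis
occupied = [
    [1 / math.sqrt(3), 1 / math.sqrt(3), 1 / math.sqrt(3)],
    [1 / math.sqrt(2), -1 / math.sqrt(2), 0.0],
]
rotated = rotate_pair(occupied, math.radians(37.0))

print("largest change in the orbitals:", max_difference(occupied, rotated))
print("largest change in the density: ",
      max_difference(density_matrix(occupied), density_matrix(rotated)))
```

The orbitals change appreciably while the density stays the same to within rounding error, so an experiment that probes the density has no way of singling out one set of orbitals over the other.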

Now, I may be entering a dangerous comparison but, can you observe a 2? If you say you just did, well, that “2” is just a symbol used to represent a quantity: two, the cardinality of a set containing two elements. You might as well depict such a quantity as “II” or “⋅⋅” but you still cannot observe “a two”. (If any mathematician is reading this, please, be gentle.) I know a number and a function are different; sorry if I’m just rambling here and overextending a metaphor.

Claiming to have observed an orbital through direct experimental methods is to neglect the Born interpretation of the wave function, Heisenberg’s uncertainty principle and even Schrödinger’s cat! (I know, I know, Schrödinger came up with this *gedankenexperiment* in order to refute the Copenhagen interpretation of quantum mechanics, but it seems like after all the cat is still not out of the box!)

So, the take-home message from the viewpoint in JPC is that molecular properties are defined by the expectation values of the quantum mechanical operator for the property under investigation over a given wave function, and not by the wave function itself. Wave functions are not observables, and although some imaging techniques seem to accomplish a formidable task, the physical impossibility hints at a misinterpretation of facts.

I think I’ll write more about this in a future post but for now, my take-home message is to keep in mind that **orbitals are wave functions** and therefore are no more observable (as in imaging) than a partition function is in statistical mechanics.

## Dealing with Spin Contamination

Most organic chemistry deals with closed shell calculations, but every once in a while you want to calculate carbenes, free radicals or radical transition states coming from a homolytic bond break, which means your structure is now open shell.

Closed shell systems are characterized by having doubly occupied molecular orbitals, that is to say the calculation is ‘restricted’: two electrons with opposite spin occupy the same orbital. In open shell systems, unrestricted calculations have a complete set of orbitals for the electrons with alpha spin and another set for those with beta spin. Spin contamination arises from the fact that wavefunctions obtained from unrestricted calculations are no longer eigenfunctions of the total spin operator *S*^2. In other words, one obtains an artificial mixture of spin states; up until now we’re dealing only with single reference methods. With each step of the SCF procedure the expectation value <*S*^2> is calculated and compared to *s*(*s*+1), where *s* is half the number of unpaired electrons (so *s*(*s*+1) is 0.75 for a radical, 2.0 for a triplet, and so on); if a large deviation between these two numbers is found, the calculation then stops.
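The comparison just described takes only a few lines to write down. The sketch below computes the ideal *s*(*s*+1) value and flags a computed <*S*^2> that strays too far from it; the 10% tolerance is a commonly quoted rule of thumb and an assumption of this example, not a threshold taken from any particular program, and the <*S*^2> values passed in are made up.

```python
def ideal_s_squared(n_unpaired: int) -> float:
    """s(s+1) with s equal to half the number of unpaired electrons."""
    s = n_unpaired / 2
    return s * (s + 1)


def looks_contaminated(s2_computed: float, n_unpaired: int,
                       tolerance: float = 0.10) -> bool:
    """True if <S^2> deviates from s(s+1) by more than the chosen tolerance.

    The tolerance is relative for open shells and absolute for singlets,
    where the ideal value is zero. Both choices are assumptions of this sketch.
    """
    ideal = ideal_s_squared(n_unpaired)
    limit = tolerance * ideal if ideal > 0 else tolerance
    return abs(s2_computed - ideal) > limit


print(ideal_s_squared(1), ideal_s_squared(2))   # doublet and triplet reference values
print(looks_contaminated(0.76, n_unpaired=1))   # made-up <S^2>, close to 0.75
print(looks_contaminated(1.10, n_unpaired=1))   # made-up <S^2>, far from 0.75
```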

Gaussian includes an annihilation step during SCF to reduce the amount of spin contamination but it’s not 100% reliable. Spin contaminated wavefunctions aren’t reliable and lead to errors in geometries, energies and population analyses.

One solution to overcome spin contamination is using Restricted Open Shell calculations (ROHF, ROMP2, etc.), in which singly occupied orbitals are used for the unpaired electrons and doubly occupied ones for the rest. These calculations are far more expensive than the unrestricted ones and the energies for the unpaired electrons (the interesting ones) are unreliable; in particular, spin polarization is lost since dynamical correlation is hardly accounted for. The IOP(5/14=2) option in Gaussian uses the annihilated wavefunction for the population analysis if acceptable, but since Mulliken’s method is not reliable either I don’t advise it anyway.

The case of DFT is different since rho.alpha and rho.beta can be separated (similarly to the case of unrestricted ab initio calculations), but the fact that both densities are built from Kohn-Sham orbitals and not true canonical orbitals somehow compensates for the contamination. That is not to say that it never shows up in DFT calculations, but it is usually less severe; of course, in the case of hybrid functionals, the more HF exchange is included the more important spin contamination may become.

So, in short, for spin contaminated wavefunctions you want to change from restricted to unrestricted and if that doesn’t work then move to Restricted Open Shell; if using DFT you can use the same scheme and also try changing from hybrid to pure functionals at the cost of CPU time. There is a last option, which is using spin projection methods, but I’ll discuss that in a following post.

## Stability of Unnatural DNA – @PCCP #CompChem

As is the case with proteins, the functioning of DNA is highly dependent on its 3D structure and not only on its sequence; the difference is that protein tertiary structure has an enormous variety whereas DNA is (almost) always a double helix with little variation. The canonical base pairs AT and CG stabilize the famous double helix, but the same cannot be guaranteed when non-canonical -unnatural- base pairs (UBPs) are introduced.

When I first took a look at Romesberg’s UBPs, d5SICS and dNaM (referred to throughout the study as X and Y, see Fig. 1), it was evident that they could not form hydrogen bonds; in the end they’re substituted naphthalenes with no discernible way of creating a synthon like their natural counterparts. That’s when I called Dr. Rodrigo Galindo at Utah University, who is one of the developers of the AMBER code and who is very knowledgeable on matters of DNA structure and dynamics; he immediately got on board and soon enough we were launching molecular dynamics simulations and quantum mechanical calculations. That was more than two years ago.

Our latest paper in Phys.Chem.Chem.Phys. deals with the dynamical and structural stability of a DNA strand in which Romesberg’s UBPs are introduced sequentially, one pair at a time, into Dickerson’s dodecamer (a palindromic sequence) from the Protein Data Bank. Therein, a d5SICS-dNaM pair was inserted right in the middle, forming a tridecamer; as expected, molecular dynamics simulations of more than 10 microseconds exhibited the same stability as the control dodecamer (Fig. 2 left). We didn’t need to go very far into the substitutions to get the double helix to go awry within a couple of microseconds: three non-consecutive inclusions of UBPs were enough to get a less regular structure (Fig. 2 right); with five, a globular structure was obtained for which it is not possible to get a proper average of the most populated structures.

X and Y don’t form hydrogen bonds, so the pairing is pretty much forced by the scaffold of the rest of the DNA’s double helix. There is some controversy as to how X and Y fit together, whether they overlap or just wedge between each other; our results suggest that a C1-C1′ distance of 11 Å is the most stable, consistent with the wedging conformation. Still, much work is needed to understand the pairing between X and Y, and even more so to get a pair of useful UBPs. More papers on this topic in the near future.

## I’m putting a new blog out there

As if I didn’t have enough things to do, I’m launching a new blog inspired by the #365papers hashtag on Twitter and the naturalproductman.wordpress.com blog. In it I’ll hopefully list, and write a femto-review of, *all* the papers I read. This new effort is even more daunting than the actual reading of the huge digital pile of papers I have in my Mendeley *To-Be-Read* folder, the fattest of them all. The papers therein won’t be a comprehensive review of Comp.Chem. must-read papers but rather papers relevant to our lab’s research or curiosity.

Maybe I’ll include some papers brought to my attention by the group, and they could do the review. The whole endeavor might flop in a few weeks but I want to give it a shot; we’ll see how it mutates and whether it survives or not. So far I haven’t managed to review all the papers I’ve read, but maybe this post will prompt me to do so, if only to save some face. The domain of the new blog is compchemdigest.wordpress.com but I think it should have included the word MY at the beginning so as to convey the idea that it is only my own biased reading list. Anyway, if you’re interested, share it and subscribe; those posts will **not** be publicized.