Blog Archives

DFT beyond academia


Density Functional Theory (DFT) is by far the most successful way of gaining access to molecular properties starting from nothing but a system's composition. Calculating the electronic structure of molecules or solid phases has become a widespread activity in computational as well as experimental labs, not only for shedding light on the properties of a system under study but also as a tool for designing systems with tailor-made properties. This level of understanding of matter brought by DFT is based on a rigorous physical and mathematical development; still, and maybe because of it, DFT (and electronic structure calculations in general, for that matter) might be thought of as something of little use outside academia.

Prof. Juan Carlos Sancho-García from the University of Alicante in Spain encouraged me to talk to his students last month about the reach of DFT in the industrial world. Having once worked in industry myself, I remembered that the simulations performed there were mostly DPD (Dissipative Particle Dynamics), a coarse-grained kind of molecular dynamics, used to investigate the interactions between polymers and surfaces, but no DFT calculations were ever in sight. It is widely known that docking, QSAR, and molecular dynamics are heavily used in the pharma industry for the development of new drugs, but I wasn't sure where DFT could fit in all this. I figured that a patent search would be a good proxy for the commercial applicability of DFT, so I took a shallow dive and searched for patents explicitly mentioning the use of DFT as part of the invention development and protection process. The first thing I noticed is that although there appear to be only a few, their number keeps growing year after year (Figure 1). Again, this was not an exhaustive search, so I'm obviously overlooking many.

Figure 1 – A non-exhaustive search in a patents database

The second thing that caught my attention was that the first hit came from 1998, nicely coinciding with the rise of B3LYP (Figure 2). This patent was awarded to Australian inventors from the University of Wollongong, New South Wales, for determining trace gas concentrations by chromatography by means of calculating the FT-IR spectra of the sample molecules (Figure 3); DFT is thus used as part of the invention, although I don't know whether this has become a widespread method in analytical labs.

Figure 2 – B3LYP cited in scientific publications

While I'm mentioning the infamous B3LYP functional: a patent search for it yields the following graph (Figure 4). Most of these patents relate to the protection of photoluminescent or thermoluminescent molecules for light-emitting devices; it appears that DFT calculations are used to provide the key features claimed for protection, such as the HOMO-LUMO gap.

Figure 4 – Patents bearing B3LYP as part of their invention

So what about software? Most of the more recent patents in Figure 1 (2018 – 2022) lie in the realm of electronics, particularly the development of semiconductors, ceramic or otherwise, so it seemed safe to assume VASP would be a popular choice to that end, right? It turns out that's not necessarily the case, since a patent search for VASP accounts for only about 10% of all the awarded patents (Figure 5).

Figure 5 – VASP in patents

I guess it's safe to say by now that DFT has a significant impact on industrial development, and one could only expect it to keep rising; however, the advent of machine learning and other artificial-intelligence-related methods promises an even faster pace of development. I went back to the patents database and this time searched for 'machine learning development materials' (the term 'development' was dropped by the search engine; I guess it found it too obvious), and its rise is quite remarkable, surpassing the frequency of DFT in patents (Figure 6), particularly in the past 5 years (2018 – 2022).

Figure 6 – The rise of the machines in materials development

I'm guessing that in many instances DFT and ML will go hand in hand in the industrial development process, but the timescales reachable by ML will only keep growing, so I'm left with the question: what are we waiting for to make ML and AI part of the chemistry curricula? As computational chemistry teachers we should start discussing these points with our students and convince department heads to help us create proper courses, or we risk our graduates becoming niche scientists at a time when new skills are sought after in the private sector.

__________________________________________________________________________________

Thanks again to Prof. Juan Carlos Sancho-García at the University of Alicante, Spain, who asked me to talk about this subject in front of his class, and to Prof. José Pedro Cerón-Carrasco from Cartagena for allowing me to talk about this and other topics at the Centro Universitario de la Defensa. Thank you, guys! I look forward to meeting you again soon.

Exciton Energy Transfer-Talk at the Virtual Winter School of Comp.Chem. 2022


I'm very honored to have been invited to this edition of a long-standing event, the Virtual Winter School of Computational Chemistry. In this talk I walk through the basics of what excitons are and how they move or transfer across matter, and of course a primer on how to calculate the energy transfer with Gaussian.

This is a very basic introduction, but I hope someone finds it useful. Thanks to Henrique Castro for inviting me to take part in this experience, and to all the professors and students involved in the organization. Don't forget to go and check all the other fantastic talks, including one by Nobel Laureate and chemistry legend Prof. Roald Hoffmann, at the Virtual Winter School's website: https://winterschool.cc/

Water splitting by proton to hydride umpolung—New paper in Chem.Sci.


The word 'umpolung' is not used often enough in my opinion, and that's a shame, since this phenomenon refers to one of the most classic tropes (or deus ex machina) of sci-fi movies, prominently in the Dr. Who lore*: 'reversing the polarity'. Reversing the polarity simply means that, for any given dipole, the positively charged part acquires a negative charge while the originally negatively charged part becomes positively charged, and thus the direction of the dipole moment is, well, reversed.

In chemistry, reversing the polarity of a bond is an even cooler matter, because it means that atoms that typically behave as positively charged become negatively charged and react with other molecules accordingly. Such is the case in this new research, conducted experimentally by Prof. Rong Shang at Hiroshima University and elucidated theoretically by Leonardo "Leo" Lugo, who currently works jointly with me and my good friend, the always amazing José Oscar Carlos Jimenez-Halla, at the University of Guanajuato, Mexico.

Producing molecular hydrogen by splitting water at room temperature is a remarkable feat that underpins fuel cells in the search for cleaner sources of energy. This process commonly requires a metallic catalyst, and it has also been achieved via Frustrated Lewis Pairs based on Si(II), but so far the use of an intramolecular electron relay had not been reported.

BPB – Figure 1

Prof. Rong Shang and her team synthesized an ortho-phenylene-linked, bisborane-functionalized phosphine (BPB, Figure 1) and showed that its stoichiometric reaction with water yields H2 and the corresponding phosphine oxide quantitatively at room temperature. Along the reaction pathway, the umpolung occurs when a proton from the captured water molecule becomes a hydride centered on one of the borane moieties of BPB. The reaction mechanism is shown in Figure 2.

According to the calculated mechanism, a water molecule first coordinates to one of the borane groups through its oxygen atom; the phosphorus atom then forms a hydrogen bond through its lone pair, splitting the water molecule into OH and H+. The latter migrates to the second borane, and it is during this migration (marked TSH2 in Figure 2) that the umpolung takes place: the natural charge of that hydrogen atom changes from positive to negative and remains so in the intermediate H3. This newly formed hydride then reacts with the hydrogen atom of the OH group to form the reduction product H2, while the final phosphine oxide shows an intramolecular P–O…B interaction that closes a five-membered ring and further stabilizes it.

These results are now available in Chemical Science, 2021, 12, 15603 DOI:10.1039/d1sc05135k. As always, I deeply thank Prof. Óscar Jiménez-Halla for inviting me to participate in this venture.


* Below there’s a cool compilation of the Reverse the Polarity trope found in Dr. Who:

XIX RMFQT – National Meeting on TheoPhysChem


The Mexican Meeting on Theoretical Physical Chemistry is a national staple of our local scientific discipline. The nineteenth edition had to be a virtual conference due to sanitary restrictions still enforced in Mexico. Nevertheless, this was a successful meeting in which we tried new things, such as a live broadcast via our new official YouTube channel and a Twitter poster session covered under the hashtag #RMFQTXIX.

Please browse the previous links (talks are in Spanish; most Tweets are also in Spanish, but some are available in English). Twitter conferences are here to stay, and the creativity of the participants will be key in moving them forward; unfortunately, most of us are still grounded in the traditional idea of a physical poster, and that notion, taken literally, translates poorly into a Tweet. I wanted to embed some of the presented posters, but I don't want to leave anyone out, and fortunately there were too many of them to fit in a single blog post. So head over to Twitter, check the hashtags #RMFQTXIX and #CompChemMX, and follow the official Twitter account of the RMFQT.

A big shout-out to the staff, PhD students Jessica Arcudia and Gustavo Mondragón for keeping up the live sessions and online broadcast. The future of Mexican CompChem is in safe hands!

Fixing the error: Bad data into FinFrg


I found this error in the calculation of two interacting fragments, both with unpaired electrons. Two radicals interact at a certain distance and the full system is deemed a singlet, so the unpaired electrons on each fragment must have opposite spins. The problem came when trying to calculate the Basis Set Superposition Error (BSSE), because in the Counterpoise method you need to assign a charge and multiplicity to each fragment, and it's not obvious how to assign opposite spins to them.

The core of the problem is related to the guess construction; normally a Counterpoise calculation would look like the following example:

#p B3LYP/6-31G(d,p) counterpoise=2

-2,1 -1,2 -1,2
C(Fragment=1)        0.00   0.00   0.00
O(Fragment=2)        1.00   1.00   1.00
...

The first pair of charge-multiplicity numbers corresponds to the whole molecule, and the following pairs to each fragment in increasing order of N (in this case, N = 2). So for this hypothetical example we have two anions (but they could easily be two cations), each with an unpaired electron, yielding a complex with charge = -2 and singlet multiplicity, which implies the two unpaired electrons have opposite spins. But if the guess (the initial trial wavefunction from which the SCF will begin) has a problem understanding this, then the title error shows up:

Bad data into FinFrg 
Error termination via Lnk1e ...

The solution to this problem is as simple as it is obscure: create a convenient guess wavefunction by placing a negative sign in front of the multiplicity of one of the fragments, as in the following example (note the added %chk line, which is what saves the guess for later use; the file name is arbitrary). This negative sign does not request a negative multiplicity; rather, it requests that same multiplicity but with spin opposite to that of the other fragment.

%chk=frags.chk
#p B3LYP/6-31G(d,p) guess=(only,fragment=2)

-2,1 -1,2 -1,-2 
C(Fragment=1)        0.00   0.00   0.00 
O(Fragment=2)        1.00   1.00   1.00 
...

This way, the second fragment will have the opposite spin (but the same multiplicity) as the first one. The only keyword tells Gaussian to just build the guess wavefunction and then exit; since that guess is now stored in the checkpoint file (frags.chk above), you may use it as the starting point for other calculations, such as my previously failed Counterpoise one.
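As a rough sketch of that follow-up Counterpoise job (reusing frags.chk from the guess-only step; the level of theory and geometry are the same placeholders as above, and I haven't tested this exact keyword combination, so treat it as a template rather than a recipe):

%chk=frags.chk
#p B3LYP/6-31G(d,p) counterpoise=2 guess=read

Counterpoise calculation starting from the pre-built open-shell fragment guess

-2,1 -1,2 -1,-2
C(Fragment=1)        0.00   0.00   0.00
O(Fragment=2)        1.00   1.00   1.00
...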

Submerged Reaction Energy Barriers


The energy of your calculated transition state (TS) is lower than that of the reagents. That’s gotta be an error right? Well, maybe not.

Typically, in classical transition state theory, we associate the reaction barrier with the energy difference between the reactant complex and the TS, in other words, with the relative energy of the TS. However, this isn't always the case: the TS isn't always located at a barrier, which may simply not exist, or the barrier may be a submerged one, i.e., the relative energy of the TS is negative with respect to the reactant complex. This leads to negative activation energies, but one must bear in mind that the activation energy is not given directly by the relative energy of the TS but rather by the slope of the Arrhenius plot, which in turn comes from the Arrhenius equation given below.

k = A exp(-Ea/RT)
or, in logarithmic form,
ln k = ln A - Ea/RT

The Arrhenius plot is then the plot of ln k vs 1/T, whose slope is -Ea/R.
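As a quick sanity check on the signs (a purely illustrative example with made-up numbers, not data from any real system), the activation energy can be estimated from rate constants measured at just two temperatures:

Ea = -R ln(k2/k1) / (1/T2 - 1/T1)

With k1 = 1.0×10^-3 s^-1 at T1 = 300 K and k2 = 1.0×10^-2 s^-1 at T2 = 320 K, ln(k2/k1) ≈ 2.303 and (1/T2 - 1/T1) ≈ -2.08×10^-4 K^-1, so Ea ≈ -(8.314 J/mol·K)(2.303)/(-2.08×10^-4 K^-1) ≈ 92 kJ/mol, or about 22 kcal/mol. A rate constant that decreases as the temperature increases would, by the same arithmetic, yield a negative Ea, which is precisely the signature of a submerged barrier.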

Caution is advised, since the apparent presence of such a barrier may be due to a computational artifact rather than to the real kinetics taking place; that's why an IRC calculation must follow any TS optimization, in order to verify that the TS actually connects the reactants and products of interest. Keep in mind that in classical transition state theory we are 'slicing' a multidimensional energy landscape along a carefully chosen reaction coordinate, and that choice might not be entirely right, or might not even exist for that matter. I also recommend changing the level of theory, reconsidering the structure of the reaction complex (because a hidden intermediate or complex may be lurking between reactants and TS, see Figure 1), and fully verifying the thermochemistry of all the components involved before asserting that any given reaction under study has one of these atypical barriers. A minimal template for such an IRC job is sketched below.
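Something along these lines (a generic sketch; the functional, basis set, checkpoint names, and IRC options are placeholders to adapt to your own system, and I'm assuming the optimized TS was previously saved to myTS.chk):

%oldchk=myTS.chk
%chk=myTS_irc.chk
#p irc=(calcfc,maxpoints=30) B3LYP/6-31G(d,p) geom=check guess=read

IRC in both directions from the optimized TS

0 1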

Basis Set Superposition Error (BSSE). A short intro


Molecular Orbitals (MOs) are linear combinations of Atomic Orbitals (AOs), which in turn are linear combinations of other functions called 'basis functions'. A basis, or more accurately a basis set, is a collection of functions obeying a set of rules (such as being normalized) from which all AOs are constructed, and although these functions are centered on each atomic nucleus, the canonical way in which they are combined yields delocalized MOs; in other words, an MO can occupy a large space spanning several atoms at once. We don't mind this expansion across a single molecule, but what about between two molecules? Calculating the interaction energy between two or more molecular fragments leads to an artificial extra stabilization term that stems from the fact that electrons in molecule 1 can occupy AOs (or the basis functions that form them) centered on atoms of molecule 2.

Fundamentally, the interaction energy of any A—B dimer, E_int, is calculated as the difference between the energy of the dimer and the sum of the separately calculated energies of each component (Equation 1).

E_int = E_AB - E_A - E_B     (1)

However, the calculation of E_int by this method is highly sensitive to the choice of basis set, due to the Basis Set Superposition Error (BSSE) described in the first paragraph. The BSSE is particularly troublesome when small basis sets are used, because of the poor description of dispersion interactions, but treating this error by simply choosing a larger basis set is seldom practical for systems of considerable size. The Counterpoise method is a nifty correction to Equation 1: there, E_A and E_B are each calculated in the basis set of A or B alone, and only E_AB is computed in the larger, combined basis set; the Counterpoise method instead calculates every component with the full AB basis set (Equation 2),

E_int^CP = E_AB^AB - E_A^AB - E_B^AB     (2)

where the superscript AB means the whole basis set is used. This is accomplished by using ‘ghost‘ atoms with no nuclei and no electrons but empty basis set functions centered on them.

In Gaussian, the BSSE is calculated with the Counterpoise method developed by Boys and Bernardi. It requires the keyword Counterpoise=N, where N is the number of fragments to be considered (for an A—B system, N=2). Each atom in the coordinates list must specify the fragment to which it belongs; additionally, the charge and multiplicity of each fragment and of the whole supermolecular ensemble must be given. Follow the example of this hydrogen fluoride dimer.

%chk=HF2.chk
#P opt wB97XD/6-31G(d,p) Counterpoise=2

HF dimer

0,1 0,1 0,1
H(Fragment=1) 0.00 0.00 0.00
F(Fragment=1) 0.00 0.00 0.70
H(Fragment=2) 0.00 0.00 1.00
F(Fragment=2) 0.00 0.00 1.70

For closed-shell fragments the first line is straightforward, but one must pay attention that the first pair of numbers in the charge-multiplicity line corresponds to the whole ensemble, whereas the following pairs correspond to each fragment in consecutive order. Fragments do not need to be specified contiguously, i.e., you don't need to define all the atoms of fragment 1 and only afterwards those of fragment 2; they can be interleaved and the program will still assign them correctly. Just as an example I typed wB97XD, but any other method, DFT or ab initio, may be used; only semiempirical methods do not admit a BSSE calculation, because they don't make use of an explicit basis set in the first place.

The output provides the corrected energy (in atomic units) for the whole system, as well as the BSSE correction (subtracting the latter from the corrected energy gives back the uncorrected energy of the system). Gaussian16 also reports these values in kcal/mol as 'complexation energies', first the raw (uncorrected) one and then the corrected one.

BSSE is always present and cannot be entirely eliminated because finite basis sets are used, but it can be properly accounted for with the Counterpoise method.

Orbital Contributions to Excited States


This is a guest post by our very own Gustavo "Gus" Mondragón, whose work centers on the excited-state chemistry of photosynthetic pigments.

When you calculate excited states (no matter which method you use: TD-DFT, CI-S(D), EOM-CCS(D)), the analysis of the orbital contributions to the electronic transitions poses a challenge. In this post, I'm gonna guide you through a CI-Singles excited-state calculation and the analysis of the resulting electronic transitions.

I'll use the adenine molecule for this post. After doing the corresponding geometry optimization with the method of your choice, you can run the excited-state calculation. For this, I'll use two methods: CI-Singles and TD-DFT.

The route section for the CI-Singles calculation looks as follows:

%chk=adenine.chk
%nprocshared=8
%mem=1Gb

#p CIS(NStates=10,singlets)/6-31G(d,p) geom=check guess=read scrf=(cpcm,solvent=water)

adenine excited states with CI-Singles method

0 1
--blank line--

I use the same geometry from the optimization step, and I request only 10 singlet excited states. The CPCM implicit solvation model (solvent=water) is requested. If you want to do TD-DFT instead, the route section should look as follows:

%chk=adenine.chk
%nprocshared=8
%mem=1Gb

#p FUNCTIONAL/6-31G(d,p) TD(NStates=10,singlets) geom=check guess=read scrf=(cpcm,solvent=water)

adenine excited states with TD-DFT

0 1
--blank line--

Where FUNCTIONAL is the DFT exchange-correlation functional of your choice. I strongly advise against using B3LYP here; CAM-B3LYP is a fine choice to start with.
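For instance, with that functional the route section would simply read as below (the same settings as above, with only the placeholder replaced):

#p CAM-B3LYP/6-31G(d,p) TD(NStates=10,singlets) geom=check guess=read scrf=(cpcm,solvent=water)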

Both calculations give us the excited-state information: excitation energy, oscillator strength (the f value), excitation wavelength, and multiplicity:

Excitation energies and oscillator strengths:

 Excited State   1:      Singlet-A      6.3258 eV  196.00 nm  f=0.4830  <S**2>=0.000
      11 -> 39        -0.00130
      11 -> 42        -0.00129
      11 -> 43         0.00104
      11 -> 44        -0.00256
      11 -> 48         0.00129
      11 -> 49         0.00307
      11 -> 52        -0.00181
      11 -> 53         0.00100
      11 -> 57        -0.00167
      11 -> 59         0.00152
      11 -> 65         0.00177

That output lists the electronic transitions contributing to each excited state; I've truncated it here because there are a lot of them for every state. If you have done excited-state calculations before, you know that the HOMO-LUMO transition is usually an important one, but not the only one to be considered. This is where Natural Transition Orbitals (NTOs) come in: with these orbitals we can analyze the electronic transitions in a much more compact way.

As an example, I'll first show you the HOMO-LUMO transition of the first excited state of adenine. It appears in the long list as follows:

35 -> 36         0.65024

The value 0.65024 is the transition amplitude (the CI coefficient), but by itself it is not straightforward to interpret for an excited-state analysis. To obtain the NTOs of a given excited state we set up a new Gaussian input file that reads everything from the checkpoint file used to calculate the excited states. The file looks as follows:

%Oldchk=adenine.chk
%chk=adNTO1.chk
%nproc=8
%mem=1Gb

#p SP geom=allcheck guess=(read,only) density=(Check,Transition=1) pop=(minimal,NTO,SaveNTO)

A few important remarks about this last file: no level of theory or basis set needs to be specified, since all the data is read from the checkpoint file "adenine.chk" and saved into the new checkpoint file "adNTO1.chk"; we must use the previously calculated density and specify the transition of interest, that is, the excited state we want to analyze. Since geom=allcheck also retrieves the charge, multiplicity, and title from the checkpoint, none of them appear in the input, and this calculation finishes really fast.

After doing this last calculation, we use the new checkpoint file “adNTO1.chk” and we format it:

formchk -3 adNTO1.chk adNTO1.fchk

If we open this formatted checkpoint file with GaussView, Chemcraft, or the visualizer of your choice, we will see something interesting in the MO diagram, as follows:

Notice that the frontier orbitals now show the same eigenvalue, 0.88135, which corresponds to the actual contribution of this hole-electron pair to the first excited state. Since these orbitals contribute the most, we can plot them using the cubegen utility:

cubegen 0 mo=homo adNTO1.fchk adHOMO.cub 0 h

This last command line plots the equivalent of the HOMO. If we want to plot the LUMO, we just change the "homo" keyword to "lumo"; it doesn't matter whether it is written in capital letters or not.
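For completeness, the analogous command for the electron (LUMO-like) NTO would be the following, assuming the same formatted checkpoint and an output cube file named adLUMO.cub:

cubegen 0 mo=lumo adNTO1.fchk adLUMO.cub 0 h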

You must keep in mind that Natural Transition Orbitals are conceptually quite different from canonical Molecular Orbitals. For visual comparison, I've also plotted the molecular orbitals obtained from the optimization and excited-state calculations, without computing NTOs:

These are the frontier molecular orbitals, plotted with Chimera using an isovalue of 0.02 for both phases:

The frontier NTOs look qualitatively the same, but that’s not necessarily always the case:

If we analyze these NTOs within the hole-electron picture, the HOMO-like NTO corresponds to the hole space and the LUMO-like NTO to the electron space.

The NTOs may look much like the canonical orbitals here, but the two are conceptually different, and it is the NTOs that are actually involved in the first excited state of adenine. The electronic transition will be reported as follows:

If I had to make a graphical summary of this topic, it would be the following:

NTO analysis is useful no matter whether you calculate excited states with CIS(D), EOM-CCS(D), TD-DFT, CASSCF, or any other excited-state method of your choice. NTOs are also useful for population analysis in excited states, but those analyses require additional software: MultiWFN is an open-source code that allows you to do them, and another one is TheoDORE, which we'll cover in a later post.

XVIII RMFQT


It was my distinct pleasure to participate in the organization of the latest edition of the Mexican Meeting on Theoretical Physical Chemistry (RMFQT), which took place last week here in Toluca with the help of the School of Chemistry of the Universidad Autónoma del Estado de México.

This year the national committee created a Lifetime Achievement Award for Dr. Annik Vivier, Dr. Carlos Bunge, and Dr. José Luis Gázquez. This recognition from our community is awarded to these fine scientists not only for their contributions to theoretical chemistry but also for their pioneering work in the field in Mexico. The three of them were invited to talk about any topic of their choosing; in particular, Dr. Vivier stirred the imagination of the younger students by showing pictures from the times when she used to hang out with Slater, Roothaan, Löwdin, and others; it is always nice to put faces onto equations.

Continuing with a recent tradition we also had the pleasure to host three invited plenary lectures by great scientists and good friends of our community: Prof. William Tiznado (Chile), Prof. Samuel B. Trickey (USA), and Prof. Julia Contreras (France) who shared their progress on their recent work.

As I’ve abundantly pointed out in the past, the RMFQT is a joyous occasion for the Mexican theoretical community to get together with old friends and discuss very exciting research being done in our country and by our colleagues abroad. I’d like to add a big shoutout to Dr. Jacinto Sandoval-Lira for his valuable help with the organization of our event.

Useful Thermochemistry from Gaussian Calculations


Statistical mechanics is the bridge between microscopic calculations and the thermodynamics of a particle ensemble. By calculating a partition function, separated into electronic, rotational, translational, and vibrational contributions, one can obtain all the thermodynamic functions required to fully characterize a chemical reaction; of these contributions, the vibrational one, together with the electronic energy, is the key ingredient.

Calculating the Free Energy change of any given reaction is a useful way to assess its thermodynamic feasibility. A large negative change in Free Energy when going from reagents to products corresponds to a quantitative, spontaneous (exergonic) reaction; nevertheless, the rate of the reaction is a different story, one that can be calculated as well.

Using the freq option in the route section of a Gaussian calculation is mandatory to ascertain that the optimized structure corresponds to a minimum on the potential energy hypersurface, but it also yields the thermochemistry and thermodynamic values for that structure. Thermochemistry calculations are not restricted to minima, though: they can also be applied to transition states, thereby yielding a full thermodynamic characterization of a reaction mechanism. A minimal input for such a job is sketched right below.
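Something like the following (a generic sketch; the functional, basis set, and checkpoint name are placeholders, and the Cartesian coordinates are elided):

%chk=mymolecule.chk
#p opt freq wB97XD/6-31G(d,p)

optimization followed by frequencies and thermochemistry

0 1
...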

A regular freq calculation yields the following output (all values in atomic units):

Zero-point correction=                           0.176113 (Hartree/Particle)
 Thermal correction to Energy=                    0.193290
 Thermal correction to Enthalpy=                  0.194235
 Thermal correction to Gibbs Free Energy=         0.125894
 Sum of electronic and zero-point Energies=           -750.901777
 Sum of electronic and thermal Energies=              -750.884600
 Sum of electronic and thermal Enthalpies=            -750.883656
 Sum of electronic and thermal Free Energies=         -750.951996

For any given reaction, say A + B -> C, one can take the values from the last row for each of the three components (let's call them G) and perform the arithmetic ΔG = G(C) - [G(A) + G(B)], i.e., products minus reagents. A quick worked example is shown below.
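Purely as an illustration of the bookkeeping (the value for A is taken from the sample output above, while the values for B and C are hypothetical):

G(A) = -750.951996 a.u.,  G(B) = -76.420000 a.u.,  G(C) = -827.390000 a.u.
ΔG = G(C) - [G(A) + G(B)] = -827.390000 - (-827.371996) = -0.018004 a.u.
ΔG ≈ -0.018004 × 627.51 kcal/mol per a.u. ≈ -11.3 kcal/mol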

By default, Gaussian computes these values (from the aforementioned partition functions) at standard conditions, T = 298.15 K and P = 1 atm. To assess the thermochemistry at other conditions, include the keywords Temperature=x.x and Pressure=x.x in your route section, in Kelvin and atmospheres, respectively; an example route line is given below.
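For instance, to recompute the thermochemistry at the normal boiling point of water (a sketch; the method and basis set shown are just placeholders):

#p opt freq wB97XD/6-31G(d,p) Temperature=373.15 Pressure=1.0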

(Huge) Disclaimer: Although calculating the thermochemistry of a reaction by means of DFT is a good (and potentially very useful) guide to chemical reactivity, getting quantitative results requires high-accuracy composite methods such as G3 or G4, collectively known as Gn methods, which consist of a pre-defined sequence of calculations carried out automatically; no basis set should be specified for them. Other high-accuracy methods like CBS-QB3 or W1U can also be considered whenever Gn methods are too costly.
