# Category Archives: DFT

## Population Analysis in the Excited State with Gaussian

To calculate what the bonding properties of a molecule are in a particular excited state we can run any population analysis following the root of interest. This straightforward procedure takes two consecutive calculations since you don’t necessarily know before hand which excited state is the one of interest.

The regular Time Dependent Density Functional Theory (TD-DFT) calculation input with Gaussian 16 looks as follows (G09 works pretty much the same), let us assume we’ve already optimized the geometry of a given molecule:

%OldChk=filename.chk %nprocshared=16 %chk=filename_ES.chk #p TD(NStates=10,singlets) wb97xd/cc-pvtz geom=check guess=read Title Card Required 0 1 --blank line--

This input file retrieves the geometry and wavefunction from a previous calculation from filename.chk and doesn’t write anything new into it (that is what `%OldChk=filename.chk` means) and creates a new checkpoint where the excited states are calculated (`%chk=filename_ES.chk`)

In the output you search for the transition which peeks your interest; most often than not you’ll be interested in the one with the highest oscillator strength, *f*. The oscillator strength is a dimensionless number that represents the ratio of the observed, integrated, absorption coefficient to that calculated for a single electron in a three-dimensional harmonic potential [Harris & Bertolucci, Symmetry and Spectroscopy]; in other words, it is related to the probability of that transition to occur, and therefore it takes values from 0.0 to 1.0 (for single photon absorption processes.)

The output of this calculation looks as follows, the value of f for every excitation is reported together with its energy and the orbital transitions which comprise it.

Excitation energies and oscillator strengths: Excited State 1: Singlet-A 3.1085 eV 398.86 nm f=0.0043 <S**2>=0.000 56 -> 59 -0.11230 58 -> 59 0.69339 This state for optimization and/or second-order correction. Total Energy, E(TD-HF/TD-DFT) = -1187.56377917 Copying the excited state density for this state as the 1-particle RhoCI density. Excited State 2: Singlet-A 4.0827 eV 303.68 nm f=0.0016 <S**2>=0.000 52 -> 59 0.46689 52 -> 64 -0.20488 53 -> 59 0.19693 54 -> 59 0.40414 54 -> 64 -0.16261 ... ... Excited State 8: Singlet-A 5.2345 eV 236.86 nm f=0.8063 <S**2>=0.000 52 -> 60 0.17162 53 -> 59 0.47226 53 -> 60 -0.11771 54 -> 59 -0.27658 54 -> 60 -0.22006 55 -> 59 0.20496 56 -> 59 0.15029

Now we’ve selected excited state #8 because it has the largest value of f from the lot, we use the following input to read in the geometry from the old checkpoint file and we generate a new one in case we need it for something else. The input file for doing all this looks as follows (I’ve selected as usual the Natural Bond Orbital population analysis):

%oldchk=a_ES.chk %nprocshared=16 %chk=a_nbo.chk #p TD(Read,Root=8) wb97xd/cc-pvtz geom=check density=current guess=read pop=NBORead Title Card Required 0 1 $NBO BOAO BNDIDX E2PERT $END --blank line--

The flags at the bottom request the calculation of Wiberg Bond Indexes (BNDIDX) as well as Bond Order in the Atomic Orbital basis (BOAO) and a second order perturbation theory for the electronic delocalization (E2PERT). Now we can compare the population analysis between ground and the 8th excited state; check figure 1 and notice the differences in Wiberg’s bond order for this complex made of two molecules and one Na+ cation.

In this example we can observe that in the ground state we have a neutral and a negative molecule together with a Na+ cation, but when we analyze the population in the 8th excited state both molecules acquire a similar charge, ca. **0.46 e**, which means that some of the electron density has been transferred from the negative one to the neutral molecule, forming an

*Electron Donor-Acceptor*complex (EDA) in the excited state.

This procedure can be extended to any other kind of population analysis and their derived combination, e.g. one could calculate their condensed fukui functions in the Nth excited state; but beware! These calculations yield vertical excitations, should the excited state of interest have a minimum we can first optimize the ES geometry and then perform the population analysis on said geometry; just add the opt keyword to perform both jobs in one go, but bear in mind that the NBO population analysis is performed before and after the optimization process so look for the tables and values closer to the end of the output file.

In the case of open shell systems the procedure is the same but one should be extremely careful in searching for the total population analysis since the output file contains this table for the alpha and beta populations separately as well as the added values for the total number of electrons.

## DFT beyond academia

Density Functional Theory is by far the most successful way of gaining access to molecular properties starting from their composition. Calculating the electronic structure of molecules or solid phases has become a widespread activity in computational as well as in experimental labs not only for shedding light on the properties of a system under study but also as a tool to design those systems with taylor-made properties. This level of understanding of matter brought by DFT is based in a rigorous physical and mathematical development, still–and maybe because of it–DFT (and electronic structure calculations in general for that matter) might be thought of as something of little use outside academia.

Prof. Juan Carlos Sancho-García from the University of Alicante in Spain, encouraged me to talk to his students last month about the reaches of DFT in the industrial world. Having once worked in the IP myself I remembered the simulations performed there were mostly DPD (Dissipative Particle Dynamics), a coarse grained kind of molecular dynamics, for investigating the interactions between polymers and surfaces, but no DFT calculations were ever on sight. It is widely known that Docking, QSAR, and Molecular Dynamics are widely used in the pharma industry for the development of new drugs but I wasn’t sure where DFT could fit in all this. I thought patent search would be a good descriptor for the commercial applicability of DFT. So I took a shallow dive and searched for patents explicitly mentioning the use of DFT as part of the invention development process and protection. The first thing I noticed is that although they appear to be only a few, these are growing in numbers throughout the years (Figure 1). Again, this was not an exhaustive search so I’m obviously overlooking many.

The second thing that caught my attention was that the first hit came from 1998, nicely coinciding with the rise of B3LYP (Figure 2). This patent was awarded to Australian inventors from the University of Wollongong, South New Wales to determine trace gas concentrations by chromatography by means of calculating the FT-IR spectra of sample molecules (Figure 3), so DFT is used as part of the invention but I ignore if this is a widespread method in analytical labs.

While I’m mentioning the infamous B3LYP functional, a search about it in patents yields the following graph (Figure 4), most of which relate to the protection of photoluminescent or thermoluminescent molecules for light emitting devices; it appears that DFT calculations are used to provide the key features of their protection, such as HOMO-LUMO gap etc.

**Figure 4**– Patents bearing B3LYP as part of their invention

So what about software? Most of the more recent patents in Figure 1 (2018 – 2022) lie in the realm of electronics, particularly the development of semiconductors, ceramical or otherwise, so it was safe to assume VASP could be a popular choice to that end, right? turns out that’s not necessarily the case since a patent search for VASP only accounts for about the 10% of all awarded patents (Figure 5).

I guess it’s safe to say by now that DFT has a significant impact in the industrial development, one could only expect it to keep on rising, however the advent of machine learning techniques and other artificial intelligence related methods promise an accelerated development. I went again to the patents database and this time searched for ‘*machine learning *development *materials*‘ (the term ‘development’ was deleted by the search engine, guess found it too obvious) and its rise is quite notorious, surpassing the frequency of DFT in patents (Figure 6), particularly in the past 5 years (2018 – 2022).

**Figure 6**– The rise of the machines in materials development

I’m guessing in some instances DFT and ML will tend to go hand in hand in the industrial development process, but the timescales reachable by ML will only tend to grow, so I’m left with the question of what are we waiting for to make ML and AI part of the chemistry curricula? As computational chemistry teachers we should start talking about this points with our students and convince the head of departments to help us create proper courses or we risk our graduates to become niche scientists in a time when new skills are sought after in the IP.

__________________________________________________________________________________

Thanks again to Prof. Juan Carlos Sancho García at the University of Alicante, Spain, who asked me talk about the subject in front of his class, and to Prof. José Pedro Cerón-Carrasco from Cartagena for allowing me to talk about this and other topics at Centro Universitario de la Defensa. Thank you, guys! I look forward to meeting you again soon.

## Geometry Optimizations for Excited States

Electronic excitations are calculated vertically according to the Frank—Condon principle, this means that the geometry does not change upon the excitation and we merely calculate the energy required to reach the next electronic state. But for some instances, say calculating not only the absorption spectra but also the emission, it is important to know what the geometry minimum of this final state looks like, or if it even exists at all (Figure 1). Optimizing the geometry of a given excited state requires the prior calculation of the vertical excitations whether via a multireference method, quantum Monte Carlo, or the Time Dependent Density Functional Theory, TD-DFT, which due to its lower computational cost is the most widespread method.

Most single-reference treatments, ab initio or density based, yield good agreement with experiments for lower states, but not so much for the higher excitations or process that involve the excitation of two electrons. Of course, an appropriate selection of the method ensures the accuracy of the obtained results, and the more states are considered, the better their description although it becomes more computationally demanding in turn.

In Gaussian 09 and 16, the argument to the ROOT keyword selects a given excited state to be optimized. In the following example, five excited states are calculated and the optimization is requested upon the second excited state. If no ROOT is specified, then the optimization would be carried out by default on the first excited state (Where L.O.T. stands for Level of Theory).

#p opt TD=(nstates=5,root=2)L.O.T.

Gaussian16 includes now the calculation of analytic second derivatives which allows for the calculation of vibrational frequencies for IR and Raman spectra, as well as transition state optimization and IRC calculations in excited states opening thus an entire avenue for the computation of photochemistry.

If you already computed the excited states and just want to optimize one of them from a previous calculation, you can read the previous results with the following input :

#p opt TD=(Read,Root=N)L.O.T.Density=Current Guess=Read Geom=AllCheck

Common problems. The following error message is commonly observed in excited state calculations whether in TD-DFT, CIS or other methods:

No map to state XX, you need to solve for more vectors in order to follow this state.

This message usually means you need to increase the number of excited states to be calculated for a proper description of the one you’re interested in. Increase the number N for nstates=N in the route section at higher computational cost. A rule of thumb is to request at least 2 more states than the state of interest. This message can also reflect the fact that during the optimization the energy ordering changes between states, and can also mean that the ground state wave function is unstable, i.e., the energy of the excited state becomes lower than that of the ground state, in this case a single determinant approach is unviable and CAS should be used if the size of the molecule allows it. Excited state optimizations are tricky this way, in some cases the optimization may cross from one PES to another making it hard to know if the resulting geometry corresponds to the state of interest or another. Gaussian recommends changing the step size of the optimization from the default 0.3 Bohr radius to 0.1, but obviously this will make the calculation take longer.

Opt=(MaxStep=10)

If the minimum on the excited state potential energy surface (PES) doesn’t exist, then the excited state is not bound; take for example the first excited state of the H_{2} molecule which doesn’t show a minimum, and therefore the optimized geometry would correspond to both H atoms moving away from each other indefinitely (Figure 2). Nevertheless, a failed optimization doesn’t necessarily means the minimum does not exist and further analysis is required, for instance, checking the gradient is converging to zero while the forces do not.

## Density Keyword in Excited State Calculations with Gaussian

I have written about extracting information from excited state calculations but an important consideration when analyzing the results is the proper use of the keyword *density*.

This keyword let’s Gaussian know which density is to be used in calculating some results. An important property to be calculated when dealing with excited states is the change in dipole moment between the ground state and any given state. The Transition Dipole Moment is an important quantity that allows us to predict whether any given electronic transition will be allowed or not. A change in the dipole moment (i.e. non-zero) of a molecule during an electronic transition helps us characterize said transition.

Say you perform a TD-DFT calculation without the *density* keyword, the default will provide results on the lowest excited state from all the requested states, which may or may not be the state of interest to the transition of interest; you may be interested in the dipole moment of all your excited states.

Three separate calculations would be required to calculate the change of dipole moment upon an electronic transition:

1) A regular DFT for the ground state as a reference

2) TD-DFT, to calculate the electronic transitions; request as many states as you need/want, analyze it and from there you can see which transition is the most important.

3) Request the density of the Nth state of interest to be recovered from the checkpoint file with the following route section:

# TD(Read,Root=N)LOTDensity=Current Guess=Read Geom=AllCheck

replace *N* for the *N*th state which caught your eye in step number 2) and *LOT* for the *Level of Theory* you’ve been using in the previous steps. That should give you the dipole moment for the structure of the *N*th excited state and you can compare it with the one in the ground state calculated in 1). Again, if density=current is not used, only properties of *N*=1 will be printed.

## Orbital Contributions to Excited States

This is a guest post by our very own Gustavo “*Gus*” Mondragón whose work centers around the study of excited states chemistry of photosynthetic pigments.

When you’re calculating excited states (no matter the method you’re using, TD-DFT, CI-S(D), EOM-CCS(D)) the analysis of the orbital contributions to electronic transitions poses a challenge. In this post, I’m gonna guide you through the CI-singles excited states calculation and the analysis of the electronic transitions.

I’ll use adenine molecule for this post. After doing the corresponding geometry optimization by the method of your choice, you can do the excited states calculation. For this, I’ll use two methods: CI-Singles and TD-DFT.

The route section for the CI-Singles calculation looks as follows:

%chk=adenine.chk

%nprocshared=8

%mem=1Gb

#p CIS(NStates=10,singlets)/6-31G(d,p) geom=check guess=read scrf=(cpcm,solvent=water)

adenine excited states with CI-Singles method

0 1

--blank line--

I use the same geometry from the optimization step, and I request only for 10 singlet excited states. The CPCP implicit solvation model (solvent=water) is requested. If you want to do TD-DFT, the route section should look as follows:

%chk=adenine.chk

%nprocshared=8

%mem=1Gb

#p FUNCTIONAL/6-31G(d,p) TD(NStates=10,singlets) geom=check guess=read scrf=(cpcm,solvent=water)

adenine excited states with CI-Singles method

0 1

--blank line--

Where FUNCTIONAL is the DFT exchange-correlation functional of your choice. Here I strictly not recommend using B3LYP, but CAM-B3LYP is a noble choice to start.

Both calculations give to us the excited states information: excitation energy, oscillator strength (as *f* value), excitation wavelength and multiplicity:

Excitation energies and oscillator strengths:

Excited State 1: Singlet-A 6.3258 eV 196.00 nm f=0.4830 <S**2>=0.000

11 -> 39 -0.00130

11 -> 42 -0.00129

11 -> 43 0.00104

11 -> 44 -0.00256

11 -> 48 0.00129

11 -> 49 0.00307

11 -> 52 -0.00181

11 -> 53 0.00100

11 -> 57 -0.00167

11 -> 59 0.00152

11 -> 65 0.00177

The data below corresponds to all the electron transitions involved in this excited state. I have to cut all the electron transitions because there are a lot of them for all excited states. If you have done excited states calculations before, you realize that the HOMO-LUMO transition is always an important one, but not the only one to be considered. Here is when we calculate the *Natural Transition Orbitals *(NTO), by these orbitals we can analyze the electron transitions.

For the example, I’ll show you first the HOMO-LUMO transition in the first excited state of adenine. It appears in the long list as follows:

35 -> 36 0.65024

The 0.65024 value corresponds to the transition amplitude, but it doesn’t mean anything for excited state analysis. We must calculate the NTOs of an excited state from a new Gaussian input file, requesting from the checkpoint file we used to calculate excited states. The file looks as follows:

%Oldchk=adenine.chk

%chk=adNTO1.chk

%nproc=8

%mem=1Gb

#p SP geom=allcheck guess=(read,only) density=(Check,Transition=1) pop=(minimal,NTO,SaveNTO)

I want to say some important things right here for this last file. See that no level of theory is needed, all the calculation data is requested from the checkpoint file “adenine.chk”, and saved into the new checkpoint file “adNTO1.chk”, we must use the previous calculated density and specify the transition of interest, it means the excited state we want to analyze. As we don’t need to specify charge, multiplicity or even the comment line, this file finishes really fast.

After doing this last calculation, we use the new checkpoint file “adNTO1.chk” and we format it:

formchk -3 adNTO1.chk adNTO1.fchk

If we open this formatted checkpoint file with GaussView, chemcraft or the visualizer you want, we will see something interesting by watching he MOs diagram, as follows:

We can realize that frontier orbitals shows the same value of 0.88135, which means the real transition contribution to the first excited state. As these orbitals are contributing the most, we can plot them by using the cubegen routine:

cubegen 0 mo=homo adNTO1.fchk adHOMO.cub 0 h

This last command line is for plotting the equivalent as the HOMO orbital. If we want to plot he LUMO, just change the “homo” keyword for “lumo”, it doesn’t matter if it is written with capital letters or not.

You must realize that the Natural Transition Orbitals are quite different from Molecular Orbitals. For visual comparisson, I’ve printed also the molecular orbitals, given from the optimization and from excited states calculations, without calculating NTOs:

These are the molecular frontier orbitals, plotted with Chimera with 0.02 as the isovalue for both phase spaces:

The frontier NTOs look qualitatively the same, but that’s not necessarily always the case:

If we analyze these NTOs on a hole-electron model, the HOMO refers to the *hole* space and the LUMO refers to the *electron* space.

Maybe both orbitals look the same, but both frontier orbitals are quite different between them, and these last orbitals are the ones implied on first excited state of adenine. The electron transition will be reported as follows:

If I can do a graphic summary for this topic, it will be the next one:

NTOs analysis is useful no matter if you calculate excited states by using CIS(D), EOM-CCS(D), TD-DFT, CASSCF, or any of the excited states method of your election. These NTOs are useful for population analysis in excited states, but these calculations require another software, MultiWFN is an open-source code that allows you to do this analysis, and another one is called TheoDORE, which we’ll cover in a later post.

## Natural Transition Orbitals (NTOs) Gaussian

The canonical molecular orbital depiction of an electronic transition is often a messy business in terms of a ‘*chemical*‘ interpretation of ‘*which electrons*‘ go from ‘*which occupied orbitals*‘ to ‘*which virtual orbitals*‘.

**Natural Transition Orbitals** provide a more intuitive picture of the orbitals, whether mixed or not, involved in any *hole-particle *excitation. This transformation is particularly useful when working with the excited states of molecules with extensively delocalized chromophores or multiple chromophoric sites. The elegance of the NTO method relies on its simplicity: separate unitary transformations are performed on the *occupied *and on the *virtual *set of orbitals in order to get a localized picture of the *transition density matrix*.

[1] R. L. Martin, *J. Chem. Phys*., 2003, DOI:10.1063/1.1558471.

In Gaussian09:

After running a TD-DFT calculation with the keyword TD(Nstates=n) (where n = number of states to be requested) we need to take that result and launch a new calculation for the NTOs but lets take it one step at a time. As an example here’s phenylalanine which was already optimized to a minimum at the B3LYP/6-31G(*d*,*p*) level of theory. If we take that geometry and launch a new calculation with the TD(Nstates=40) in the route section we obtain the UV-Vis spectra and the output looks like this (only the first three states are shown):

Excitation energies and oscillator strengths: Excited State 1: Singlet-A 5.3875 eV 230.13 nm f=0.0015 <S**2>=0.000 42 -> 46 0.17123 42 -> 47 0.12277 43 -> 46 -0.40383 44 -> 45 0.50838 44 -> 47 0.11008 This state for optimization and/or second-order correction. Total Energy, E(TD-HF/TD-KS) = -554.614073682 Copying the excited state density for this state as the 1-particle RhoCI density. Excited State 2: Singlet-A 5.5137 eV 224.86 nm f=0.0138 <S**2>=0.000 41 -> 45 -0.20800 41 -> 47 0.24015 42 -> 45 0.32656 42 -> 46 0.10906 42 -> 47 -0.24401 43 -> 45 0.20598 43 -> 47 -0.14839 44 -> 45 -0.15344 44 -> 47 0.34182 Excited State 3: Singlet-A 5.9254 eV 209.24 nm f=0.0042 <S**2>=0.000 41 -> 45 0.11844 41 -> 47 -0.12539 42 -> 45 -0.10401 42 -> 47 0.16068 43 -> 45 -0.27532 43 -> 46 -0.11640 43 -> 47 0.16780 44 -> 45 -0.18555 44 -> 46 -0.29184 44 -> 47 0.43124

The oscillator strength is listed on each Excited State as “f” and it is a measure of the probability of that excitation to occur. If we look at the third one for this phenylalanine we see f=0.0042, a very low probability, but aside from that the following list shows what orbital transitions compose that excitation and with what energy, so the first line indicates a transition from orbital 41 (HOMO-3) to orbital 45 (LUMO); there are 10 such transitions composing that excitation, visualizing them all with canonical orbitals is not an intuitive picture, so lets try the NTO approach, we’re going to take excitation #10 for phenylalanine as an example just because it has a higher oscillation strength:

%chk=Excited State 10: Singlet-A 7.1048 eV 174.51 nm f=0.3651 <S**2>=0.000 41 -> 45 0.35347 41 -> 47 0.34685 42 -> 45 0.10215 42 -> 46 0.17248 42 -> 47 0.13523 43 -> 45 -0.26596 43 -> 47 -0.22995 44 -> 46 0.23277

Each set of NTOs for each transition must be calculated separately. First, copy you *filename.chk* file from the TD-DFT result to a new one and name it after the Nth state of interest as shown below (state 10 in this case). **NOTE**: In the route section, replace N with the number of the excitation of interest according to the results in * filename.log*. Run separately for each transition your interested in:

#chk=state10.chk #p B3LYP/6-31G(d,p) Geom=AllCheck Guess=(Read,Only) Density=(Check,Transition=N) Pop=(Minimal,NTO,SaveNTO) 0 1 --blank line--

By requesting SaveNTO, the canonical orbitals in the state10.chk file are replaced with the NTOs for the 10th excitation, this makes it easier to plot since most visualizers just plot whatever set of orbitals they read in the chk file but if they find the canonical MOs then one would need to do some re-processing of them. This is much more straightforward.

Now we format our chk files into fchk with the formchk utility:

formchk -3 filename.chk filename.fchk

formchk -3 state10.chk state10.fchk

If we open filename.fchk (the file where the original TD-DFT calculation is located) with GaussView we can plot all orbitals involved in excited state number ten, those would be seven orbitals from 41 (HOMO-3) to 47 (LUMO+2) as shown in figure 1.

If we now open state10.fchk we see that the numbers at the side of the orbitals are not their energy but their occupation number particular to this state of interest, so we only need to plot those with highest occupations, in our example those are orbitals 44 and 45 (HOMO and LUMO) which have occupations = 0.81186; you may include 43 and 46 (HOMO-1 and LUMO+1, respectively) for a much more complete description (occupations = 0.18223) but we’re still dealing with 4 orbitals instead of 7.

*particle*) and Orbital 45 (

*hole*) exhibit the largest occupations for Excited State No. 10

The NTO transition 44 -> 45 is far easier to conceptualize than all the 10 combinations given in the canonical basis from the direct TD-DFT calculation. TD-DFT provides us with the correct transitions, NTOs just paint us a picture more readily available to the chemist mindset.

**NOTE**: for G09 revC and above, the %OldChk option is available, I haven’t personally tried it but using it to specify where the excitations are located and then write the NTOs of interest into a new chk file in the following way, thus eliminating the need of copying the original chk file for each state:

%OldChk=filename.chk

%chk=stateN.chk

NTOs are based on the Natural Hybrid orbitals vision by Löwdin and others, and it is said to be so straightforward that it has been re-discovered from time to time. Be that as it may, the NTO visualization provides a much clearer vision of the excitations occurring during a TD calculation.

Thanks for reading, stay home and stay safe during these harsh days everyone. Please share, rate and comment this and other posts.

## Estimation of pKa Values through Local Electrostatic Potential Calculations

Calculating the p*K*a value for a Brønsted acid is very hard, like really hard. A full thermodynamic cycle (fig. 1) needs to be calculated along with the high-accuracy solvation free energy for each of the species under consideration, not to mention the use of expensive methods which will be reviewed here in another post in two weeks time.

Finding descriptors that help us circumvent the need for such sophisticated calculations can help great deal in estimating the p*K*a value of any given acid. We’ve been interested in the reactivity of σ-hole bearing groups in the past and just like Halogen, Tetrel, Pnicogen and Chalcogen bonds, Hydrogen bonds are highly directional and their strength depends on the polarization of the O-H bond. Therefore, we suggested the use of the maximum surface electrostatic potential (*V*_{S,max}) on the acid hydrogen atom of carboxylic acids as a descriptor for the strength of their interaction with water, the first step in the deprotonation process.

We selected six basis sets; five density functionals; the MP2 method for a total of thirty-six levels of theory to optimize and calculate *V*_{S,max} on thirty carboxylic acids for a grand total of 1,080 wavefunctions, which were later passed onto MultiWFN (all calculations were taken with PCM = *water*). Correlation with the experimental pKa values showed a great correlation across the levels of theory (R2 > 0.9), except for B3LYP. Still, the best correlations were obtained with LC-wPBE/cc-pVDZ and wB97XD/cc-pVDZ. From this latter level of theory the linear correlation yielded the following equation:

p*K*a = -0.2185(*V*_{S,max}) + 16.1879

Differences in pKa turned out to be less than 0.5 units, which is remarkable for such a straightforward method; bear in mind that calculation of full thermodynamic cycles above chemical accuracy (1.0 kcal/mol) yields pKa differences above 1.0 units.

We then took this equation for a test with 10 different carboxylic acids and the prediction had a correlation of 98% (fig. 2)

I think this method can really catch on for a quick way to predict the pKa values of any carboxylic acid imaginable. We’re now working on the model extension to other groups (i.e. Bronsted bases) and putting together a black-box workflow so as to make it even more accessible and straightforward to use.

We’ve recently published this work in the journal Molecules, an open access publication. Thanks to Prof. Steve Scheiner for inviting us to participate in the special issue devoted to *tetrel bonding*. Thanks to Guillermo Caballero for the inception of this project and to Dr. Jacinto Sandoval for taking the time from his research in photosynthesis to work on this pet project of ours and of course the rest of the students (Gustavo Mondragón, Marco Diaz, Raúl Torres) whose hard work produced this work.

## Computational Chemistry from Latin America

The video below is a sad recount of the scientific conditions in Mexico that have driven an enormous amount of brain power to other countries. Doing science is always a hard endeavour but in developing countries is also filled with so many hurdles that it makes you wonder if it is all worth the constant frustration.

That is why I think it is even more important for the Latin American community to make our science visible, and special issues like this one from the International Journal of Quantum Chemistry goes a long way in doing so. This is not the first time IJQC devotes a special issue to the Comp.Chem. done south of the proverbial border, a full issue devoted to the Mexican Physical Chemistry Meetings (RMFQT) was also published six years ago.

I believe these special issues in mainstream journals are great ways of promoting our work in a collected way that stresses our particular lines of research instead of having them spread a number of journals. Also, and I may be ostracized for this, but I think coming up with a new journal for a specific geographical community represents a lot of effort that takes an enormous amount of time to take off and thus gain visibility.

For these reasons I’ve been cooking up some ideas for the next RMFQT website. I don’t pretend to say that my colleagues need any shoutouts from my part -I could only be so lucky to produce such fine pieces of research myself- but it wouldn’t hurt to have a more established online presence as a community.

¡Viva la ciencia Latinoamericana!

## The HOMO-LUMO Gap in Open Shell Calculations. Meaningful or meaningless?

The HOMO – LUMO orbitals are central to the Frontier Molecular Orbital (FMO) Theory devised by Kenichi Fukui back in the fifties. The central tenet of the FMO theory resides on the idea that most of chemical reactivity is dominated by the interaction between these orbitals in an electron donor-acceptor pair, in which the most readily available electrons of the former arise from the HOMO and will land at the LUMO in the latter. The energy difference between the HOMO and LUMO of any chemical species, known as the HOMO-LUMO gap, is a very useful quantity for describing and understanding the photochemistry and photophysics of organic molecules since most of the electronic transitions in the UV-Vis region are dominated by the electron transfer between these two frontier orbitals.

But when we talk about Frontier Orbitals we’re usually referring to their doubly occupied version; in the case of open shell calculations the electron density with *α* spin is separate from the one with *β* spin, therefore giving rise to two separate sets of singly occupied orbitals and those in turn have a *α-*HOMO/LUMO and *β-*HOMO/LUMO, although SOMO (Singly Occupied Molecular Orbital) is the preferred nomenclature. Most people will then dismiss the HOMO/LUMO question for open shell systems as meaningless because ultimately we are dealing with two different sets of molecular orbitals. Usually the approach is to work backwards when investigating the optical transitions of a, say, organic radical, e.g. by calculating the transitions with such methods like TD-DFT (Time Dependent DFT) and look to the main orbital components of each within the set of *α* and *β* densities.

To the people who have asked me this question I strongly suggest to first try Restricted Open calculations, RODFT, which pair all electrons and treat them with identical orbitals and treat the unpaired ones independently. As a consequence, RO calculations and Unrestricted calculations vary due to variational freedom. RO calculations could yield wavefunctions with small to large values of spin contamination, so beware. Or just go straight to TDDFT calculations with hybrid orbitals which include a somewhat large percentage of HF exchange and polarized basis sets, but to always compare results to experimental values, if available, since DFT based calculations are Kohn-Sham orbitals which are defined for non-interacting electrons so the energy can be biased. Performing CI or CASSCF calculations is almost always prohibitive for systems of chemical interest but of course they would be the way to go.

## Post Calculation Addition of Empirical Dispersion – Fixing interaction energies

Calculation of interaction energies is one of those things people are more concerned with and is also something mostly done wrong. The so called ‘*gold standard*‘ according to Pavel Hobza for calculating supramolecular interaction energies is the CCSD(T)/CBS level of theory, which is highly impractical for most cases beyond 50 or so light atoms. Basis set extrapolation methods and inclusion of electronic correlation with MP2 methods yield excellent results but they are not nonetheless almost as time consuming as CC. DFT methods in general are terrible and still are the most widely used tools for electronic structure calculations due to their competitive computing times and the wide availability of schemes for including terms which help describe various kinds of interactions. The most important ingredients needed to get a decent to good interaction energies values calculated with DFT methods are correlation and dispersion. The first part can be recreated by a good correlation functional and the use of empirical dispersion takes care of the latter shortcoming, dramatically improving the results for interaction energies even for lousy functionals such as the infamous B3LYP. The results still wont be of benchmark quality but still the deviations from the *gold standard* will be shortened significantly, thus becoming more quantitatively reliable.

There is an online tool for calculating and adding the empirical dispersion from Grimme’s group to a calculation which originally lacked it. In the link below you can upload your calculation, select the basis set and functionals employed originally in it, the desired damping model and you get in return the corrected energy through a geometrical-Counterpoise correction and Grimme’s empirical dispersion function, D3, of which I have previously written here.

The gCP-D3 Webservice is located at: http://wwwtc.thch.uni-bonn.de/

The platform is entirely straightforward to use and it works with xyz, turbomole, orca and gaussian output files. The concept is very simple, a both gCP and D3 contributions are computed in the selected basis set and added to the uncorrected DFT (or HF) energy (eq. 1)

(**1**)

If you’re trying to calculate interaction energies, remember to perform these corrections for every component in your supramolecular assembly (eq. 2)

(**2**)

Here’s a screen capture of the outcome after uploading a G09 log file for the simplest of options B3LYP/6-31G(*d*), a decomposed energy is shown at the left while a 3D interactive Jmol rendering of your molecule is shown at the right. Also, various links to the literature explaining the details of these calculations are available in the top menu.

I’m currently writing a book chapter on methods for calculating ineraction energies so expect many more posts like this. A special mention to Dr. Jacinto Sandoval, who is working with us as a postdoc researcher, for bringing this platform to my attention, I was apparently living under a rock.