Category Archives: White papers
This is a guest post by our very own Gustavo “Gus” Mondragón whose work centers around the study of excited states chemistry of photosynthetic pigments.
When you’re calculating excited states (no matter the method you’re using, TD-DFT, CI-S(D), EOM-CCS(D)) the analysis of the orbital contributions to electronic transitions poses a challenge. In this post, I’m gonna guide you through the CI-singles excited states calculation and the analysis of the electronic transitions.
I’ll use adenine molecule for this post. After doing the corresponding geometry optimization by the method of your choice, you can do the excited states calculation. For this, I’ll use two methods: CI-Singles and TD-DFT.
The route section for the CI-Singles calculation looks as follows:
#p CIS(NStates=10,singlets)/6-31G(d,p) geom=check guess=read scrf=(cpcm,solvent=water)
adenine excited states with CI-Singles method
I use the same geometry from the optimization step, and I request only for 10 singlet excited states. The CPCP implicit solvation model (solvent=water) is requested. If you want to do TD-DFT, the route section should look as follows:
#p FUNCTIONAL/6-31G(d,p) TD(NStates=10,singlets) geom=check guess=read scrf=(cpcm,solvent=water)
adenine excited states with CI-Singles method
Where FUNCTIONAL is the DFT exchange-correlation functional of your choice. Here I strictly not recommend using B3LYP, but CAM-B3LYP is a noble choice to start.
Both calculations give to us the excited states information: excitation energy, oscillator strength (as f value), excitation wavelength and multiplicity:
Excitation energies and oscillator strengths:
Excited State 1: Singlet-A 6.3258 eV 196.00 nm f=0.4830 <S**2>=0.000
11 -> 39 -0.00130
11 -> 42 -0.00129
11 -> 43 0.00104
11 -> 44 -0.00256
11 -> 48 0.00129
11 -> 49 0.00307
11 -> 52 -0.00181
11 -> 53 0.00100
11 -> 57 -0.00167
11 -> 59 0.00152
11 -> 65 0.00177
The data below corresponds to all the electron transitions involved in this excited state. I have to cut all the electron transitions because there are a lot of them for all excited states. If you have done excited states calculations before, you realize that the HOMO-LUMO transition is always an important one, but not the only one to be considered. Here is when we calculate the Natural Transition Orbitals (NTO), by these orbitals we can analyze the electron transitions.
For the example, I’ll show you first the HOMO-LUMO transition in the first excited state of adenine. It appears in the long list as follows:
35 -> 36 0.65024
The 0.65024 value corresponds to the transition amplitude, but it doesn’t mean anything for excited state analysis. We must calculate the NTOs of an excited state from a new Gaussian input file, requesting from the checkpoint file we used to calculate excited states. The file looks as follows:
#p SP geom=allcheck guess=(read,only) density=(Check,Transition=1) pop=(minimal,NTO,SaveNTO)
I want to say some important things right here for this last file. See that no level of theory is needed, all the calculation data is requested from the checkpoint file “adenine.chk”, and saved into the new checkpoint file “adNTO1.chk”, we must use the previous calculated density and specify the transition of interest, it means the excited state we want to analyze. As we don’t need to specify charge, multiplicity or even the comment line, this file finishes really fast.
After doing this last calculation, we use the new checkpoint file “adNTO1.chk” and we format it:
formchk -3 adNTO1.chk adNTO1.fchk
If we open this formatted checkpoint file with GaussView, chemcraft or the visualizer you want, we will see something interesting by watching he MOs diagram, as follows:
We can realize that frontier orbitals shows the same value of 0.88135, which means the real transition contribution to the first excited state. As these orbitals are contributing the most, we can plot them by using the cubegen routine:
cubegen 0 mo=homo adNTO1.fchk adHOMO.cub 0 h
This last command line is for plotting the equivalent as the HOMO orbital. If we want to plot he LUMO, just change the “homo” keyword for “lumo”, it doesn’t matter if it is written with capital letters or not.
You must realize that the Natural Transition Orbitals are quite different from Molecular Orbitals. For visual comparisson, I’ve printed also the molecular orbitals, given from the optimization and from excited states calculations, without calculating NTOs:
These are the molecular frontier orbitals, plotted with Chimera with 0.02 as the isovalue for both phase spaces:
The frontier NTOs look qualitatively the same, but that’s not necessarily always the case:
If we analyze these NTOs on a hole-electron model, the HOMO refers to the hole space and the LUMO refers to the electron space.
Maybe both orbitals look the same, but both frontier orbitals are quite different between them, and these last orbitals are the ones implied on first excited state of adenine. The electron transition will be reported as follows:
If I can do a graphic summary for this topic, it will be the next one:
NTOs analysis is useful no matter if you calculate excited states by using CIS(D), EOM-CCS(D), TD-DFT, CASSCF, or any of the excited states method of your election. These NTOs are useful for population analysis in excited states, but these calculations require another software, MultiWFN is an open-source code that allows you to do this analysis, and another one is called TheoDORE, which we’ll cover in a later post.
The canonical molecular orbital depiction of an electronic transition is often a messy business in terms of a ‘chemical‘ interpretation of ‘which electrons‘ go from ‘which occupied orbitals‘ to ‘which virtual orbitals‘.
Natural Transition Orbitals provide a more intuitive picture of the orbitals, whether mixed or not, involved in any hole-particle excitation. This transformation is particularly useful when working with the excited states of molecules with extensively delocalized chromophores or multiple chromophoric sites. The elegance of the NTO method relies on its simplicity: separate unitary transformations are performed on the occupied and on the virtual set of orbitals in order to get a localized picture of the transition density matrix.
 R. L. Martin, J. Chem. Phys., 2003, DOI:10.1063/1.1558471.
After running a TD-DFT calculation with the keyword TD(Nstates=n) (where n = number of states to be requested) we need to take that result and launch a new calculation for the NTOs but lets take it one step at a time. As an example here’s phenylalanine which was already optimized to a minimum at the B3LYP/6-31G(d,p) level of theory. If we take that geometry and launch a new calculation with the TD(Nstates=40) in the route section we obtain the UV-Vis spectra and the output looks like this (only the first three states are shown):
Excitation energies and oscillator strengths: Excited State 1: Singlet-A 5.3875 eV 230.13 nm f=0.0015 <S**2>=0.000 42 -> 46 0.17123 42 -> 47 0.12277 43 -> 46 -0.40383 44 -> 45 0.50838 44 -> 47 0.11008 This state for optimization and/or second-order correction. Total Energy, E(TD-HF/TD-KS) = -554.614073682 Copying the excited state density for this state as the 1-particle RhoCI density. Excited State 2: Singlet-A 5.5137 eV 224.86 nm f=0.0138 <S**2>=0.000 41 -> 45 -0.20800 41 -> 47 0.24015 42 -> 45 0.32656 42 -> 46 0.10906 42 -> 47 -0.24401 43 -> 45 0.20598 43 -> 47 -0.14839 44 -> 45 -0.15344 44 -> 47 0.34182 Excited State 3: Singlet-A 5.9254 eV 209.24 nm f=0.0042 <S**2>=0.000 41 -> 45 0.11844 41 -> 47 -0.12539 42 -> 45 -0.10401 42 -> 47 0.16068 43 -> 45 -0.27532 43 -> 46 -0.11640 43 -> 47 0.16780 44 -> 45 -0.18555 44 -> 46 -0.29184 44 -> 47 0.43124
The oscillator strength is listed on each Excited State as “f” and it is a measure of the probability of that excitation to occur. If we look at the third one for this phenylalanine we see f=0.0042, a very low probability, but aside from that the following list shows what orbital transitions compose that excitation and with what energy, so the first line indicates a transition from orbital 41 (HOMO-3) to orbital 45 (LUMO); there are 10 such transitions composing that excitation, visualizing them all with canonical orbitals is not an intuitive picture, so lets try the NTO approach, we’re going to take excitation #10 for phenylalanine as an example just because it has a higher oscillation strength:
%chk=Excited State 10: Singlet-A 7.1048 eV 174.51 nm f=0.3651 <S**2>=0.000 41 -> 45 0.35347 41 -> 47 0.34685 42 -> 45 0.10215 42 -> 46 0.17248 42 -> 47 0.13523 43 -> 45 -0.26596 43 -> 47 -0.22995 44 -> 46 0.23277
Each set of NTOs for each transition must be calculated separately. First, copy you filename.chk file from the TD-DFT result to a new one and name it after the Nth state of interest as shown below (state 10 in this case). NOTE: In the route section, replace N with the number of the excitation of interest according to the results in filename.log. Run separately for each transition your interested in:
#chk=state10.chk #p B3LYP/6-31G(d,p) Geom=AllCheck Guess=(Read,Only) Density=(Check,Transition=N) Pop=(Minimal,NTO,SaveNTO) 0 1 --blank line--
By requesting SaveNTO, the canonical orbitals in the state10.chk file are replaced with the NTOs for the 10th excitation, this makes it easier to plot since most visualizers just plot whatever set of orbitals they read in the chk file but if they find the canonical MOs then one would need to do some re-processing of them. This is much more straightforward.
Now we format our chk files into fchk with the formchk utility:
formchk -3 filename.chk filename.fchk
formchk -3 state10.chk state10.fchk
If we open filename.fchk (the file where the original TD-DFT calculation is located) with GaussView we can plot all orbitals involved in excited state number ten, those would be seven orbitals from 41 (HOMO-3) to 47 (LUMO+2) as shown in figure 1.
If we now open state10.fchk we see that the numbers at the side of the orbitals are not their energy but their occupation number particular to this state of interest, so we only need to plot those with highest occupations, in our example those are orbitals 44 and 45 (HOMO and LUMO) which have occupations = 0.81186; you may include 43 and 46 (HOMO-1 and LUMO+1, respectively) for a much more complete description (occupations = 0.18223) but we’re still dealing with 4 orbitals instead of 7.
The NTO transition 44 -> 45 is far easier to conceptualize than all the 10 combinations given in the canonical basis from the direct TD-DFT calculation. TD-DFT provides us with the correct transitions, NTOs just paint us a picture more readily available to the chemist mindset.
NOTE: for G09 revC and above, the %OldChk option is available, I haven’t personally tried it but using it to specify where the excitations are located and then write the NTOs of interest into a new chk file in the following way, thus eliminating the need of copying the original chk file for each state:
NTOs are based on the Natural Hybrid orbitals vision by Löwdin and others, and it is said to be so straightforward that it has been re-discovered from time to time. Be that as it may, the NTO visualization provides a much clearer vision of the excitations occurring during a TD calculation.
Thanks for reading, stay home and stay safe during these harsh days everyone. Please share, rate and comment this and other posts.
Statistical Mechanics is the bridge between microscopic calculations and thermodynamics of a particle ensemble. By means of calculating a partition function divided in electronic, rotational, translational and vibrational functions, one can calculate all thermodynamic functions required to fully characterize a chemical reaction. From these functions, the vibrational contribution, together with the electronic contribution, is the key element to getting thermodynamic functions.
Calculating the Free Energy change of any given reaction is a useful approach to asses their thermodynamic feasibility. A large negative change in Free Energy when going from reagents to products makes up for a quantitative spontaneous (and exothermic) reaction, nevertheless the rate of the reaction is a different story, one that can be calculated as well.
Using the freq option in your route section for a Gaussian calculation is mandatory to ascertain the current wave function corresponds to a minimum on a potential energy hypersurface, but also yields the thermochemistry and thermodynamic values for the current structure. However, thermochemistry calculations are not restricted to minima but it can also be applied to transition states, therefore yielding a full thermodynamic characterization of a reaction mechanism.
A regular freq calculation yields the following output (all values in atomic units):
Zero-point correction= 0.176113 (Hartree/Particle) Thermal correction to Energy= 0.193290 Thermal correction to Enthalpy= 0.194235 Thermal correction to Gibbs Free Energy= 0.125894 Sum of electronic and zero-point Energies= -750.901777 Sum of electronic and thermal Energies= -750.884600 Sum of electronic and thermal Enthalpies= -750.883656 Sum of electronic and thermal Free Energies= -750.951996
For any given reaction say A+B -> C one could take the values from the last row (lets call it G) for all three components of the reaction and perform the arithmetic: DG = GC – [GA + GB], so products minus reagents.
By default, Gaussian calculates these values (from the previously mentioned partition function) using normal conditions, T = 298.15 K and P = 1 atm. For an assessment of the thermochemistry at other conditions you can include in your route section the corresponding keywords Temperature=x.x and Pressure=x.x, in Kelvin and atmospheres, respectively.
(Huge) Disclaimer: Although calculating the thermochemistry of any reaction by means of DFT calculations is a good (and potentially very useful) guide to chemical reactivity, getting quantitative results require of high accuracy methods like G3 or G4 methods, collectively known as Gn mehtods, which are composed of pre-defined stepwise calculations. The sequence of these calculations is carried out automatically; no basis set should be specified. Other high accuracy methods like CBS-QB3 or W1U can also be considered whenever Gn methods are too costly.
“Well, where else were they supposed to appear?”
I was sent this error along with the previous question for a failed optimization. Apparently there is no answer in the internet (I quickly checked) so here it is:
Gaussian is confused about finding atomic coordinates because there is also a geom=check instruction placed in the route section, i.e., it was told to retrieve the atomic coordinates from a checkpoint and then it was given those atomic coordinates within the input so it doesn’t know what you mean and exits.
Nuclear Magnetic Resonance is a most powerful tool for elucidating the structure of diamagnetic compounds, which makes it practically universal for the study of organic chemistry, therefore the calculation of 1H and 13C chemical shifts, as well as coupling constants, is extremely helpful in the assignment of measured signals on a spectrum to an actual functional group.
Several packages offer an additive (group contribution) empirical approach to the calculation of chemical shifts (ChemDraw, Isis, ChemSketch, etc.) but they are usually only partially accurate for the simplest molecules and no insight is provided for the more interesting effects of long distance interactions (vide infra) so quantum mechanical calculations are really the way to go.
With Gaussian the calculation is fairly simple just use the NMR keyword in the route section in order to calculate the NMR shielding tensors for relevant nuclei. Bear in mind that an optimized structure with a large basis set is required in order to get the best results, also the use of an implicit solvation model goes a long way. The output displays the value of the total isotropic magnetic shielding for each nucleus in ppm (image taken from the Gaussian website):
Magnetic shielding (ppm): 1 C Isotropic = 57.7345 Anisotropy = 194.4092 XX= 48.4143 YX= .0000 ZX= .0000 XY= .0000 YY= -62.5514 ZY= .0000 XZ= .0000 YZ= .0000 ZZ= 187.3406 2 H Isotropic = 23.9397 Anisotropy = 5.2745 XX= 27.3287 YX= .0000 ZX= .0000 XY= .0000 YY= 24.0670 ZY= .0000 XZ= .0000 YZ= .0000 ZZ= 20.4233
Now, here is why this is the long way; in order for these values to be meaningful they need to be contrasted with a reference, which experimentally for 1H and 13C is tetramethylsilane, TMS. This means you have to perform the same calculation for TMS at -preferably- the same level of theory used for the sample and substract the corresponding values for either H or C accordingly. Only then the chemical shifts will read as something we can all remember from basic analytical chemistry class.
GaussView 6.0 provides a shortcut; open the Results menu, select NMR and in the new window there is a dropdown menu for selecting the nucleus and a second menu for selecting a reference. In the case of hydrogen the available references are TMS calculated with the HF and B3LYP methods. The SCF – GIAO plot will show the assignments to each atom, the integration simulation and a reference curve if desired.
The chemical shifts obtained this far will be a good approximation and will allow you to assign any peaks in any given spectrum but still not be completely accurate though. The reasons behind the numerical deviations from calculated and experimental values are many, from the chosen method to solvent interactions or basis set limitations, scaling factors are needed; that’s when you can ask the Cheshire Cat which way to go
If you don’t know where you are going any road will get you there.
Lewis Carroll – Alice in Wonderland
Well, not really. The Chemical Shift Repository for computed NMR scaling factors, with Coupling Constants Added Too (aka CHESHIRE CCAT) provides with straight directions on how to correct your computed NMR chemical shifts according to the level of theory without the need to calculate the NMR shielding tensor for the reference compound (usually TMS as pointed out earlier). In a nutshell, the group of Prof. Dean Tantillo (UC Davis) has collected a large number of isotropic magnetic shielding values and plotted them against experimental chemical shifts. Just go to their scaling factors page and check all their linear regressions and use the values that more closely approach to your needs, there are also all kinds of scripts and spreadsheets to make your job even easier. Of course, if you make use of their website don’t forget to give the proper credit by including these references in your paper.
We’ve recently published an interesting study in which the 1H – 19F coupling constants were calculated via the long way (I was just recently made aware of CHESHIRE CCAT by Dr. Jacinto Sandoval who knows all kinds of web resources for computational chemistry calculations) as well as their conformational dependence for some substituted 2-aza-carbazoles (fig. 1).
The paper is published in the Journal of Molecular Structure. In this study we used the GIAO NMR computations to assign the peaks on an otherwise cluttered spectrum in which the signals were overlapping due to conformational variations arising from the rotation of the C-C bond which re-orients the F atoms in the fluorophenyl grou from the H atom in the carbazole. After the calculations and the scans were made assigning the peaks became a straightforward task even without the use of scaling factors. We are now expanding these calculations to more complex systems and will contrast both methods in this space. Stay tuned.
Calculation of interaction energies is one of those things people are more concerned with and is also something mostly done wrong. The so called ‘gold standard‘ according to Pavel Hobza for calculating supramolecular interaction energies is the CCSD(T)/CBS level of theory, which is highly impractical for most cases beyond 50 or so light atoms. Basis set extrapolation methods and inclusion of electronic correlation with MP2 methods yield excellent results but they are not nonetheless almost as time consuming as CC. DFT methods in general are terrible and still are the most widely used tools for electronic structure calculations due to their competitive computing times and the wide availability of schemes for including terms which help describe various kinds of interactions. The most important ingredients needed to get a decent to good interaction energies values calculated with DFT methods are correlation and dispersion. The first part can be recreated by a good correlation functional and the use of empirical dispersion takes care of the latter shortcoming, dramatically improving the results for interaction energies even for lousy functionals such as the infamous B3LYP. The results still wont be of benchmark quality but still the deviations from the gold standard will be shortened significantly, thus becoming more quantitatively reliable.
There is an online tool for calculating and adding the empirical dispersion from Grimme’s group to a calculation which originally lacked it. In the link below you can upload your calculation, select the basis set and functionals employed originally in it, the desired damping model and you get in return the corrected energy through a geometrical-Counterpoise correction and Grimme’s empirical dispersion function, D3, of which I have previously written here.
The gCP-D3 Webservice is located at: http://wwwtc.thch.uni-bonn.de/
The platform is entirely straightforward to use and it works with xyz, turbomole, orca and gaussian output files. The concept is very simple, a both gCP and D3 contributions are computed in the selected basis set and added to the uncorrected DFT (or HF) energy (eq. 1)
If you’re trying to calculate interaction energies, remember to perform these corrections for every component in your supramolecular assembly (eq. 2)
Here’s a screen capture of the outcome after uploading a G09 log file for the simplest of options B3LYP/6-31G(d), a decomposed energy is shown at the left while a 3D interactive Jmol rendering of your molecule is shown at the right. Also, various links to the literature explaining the details of these calculations are available in the top menu.
I’m currently writing a book chapter on methods for calculating ineraction energies so expect many more posts like this. A special mention to Dr. Jacinto Sandoval, who is working with us as a postdoc researcher, for bringing this platform to my attention, I was apparently living under a rock.
There’s an error message when opening some Gaussian16 output files in GaussView5 for which the message displayed is the following:
ConnectionGLOG::Parse_Gauss_Coord(). Failure reading oriented atomic coordinates. Line Number
We have shared some solutions to the GaussView handling of *chk and *.fchk files in teh past but never for *.log files, and this time Dr. Davor Šakić from the University of Zagreb in Croatia has brought to my attention a fix for this error. If “Dipole orientation” with subsequent orientation is removed, the file becomes again readable by GaussView5.
Here you can download a script to fix the file without any hassle. The usage from the command line is simply:
˜$ chmod 777 Fg16TOgv5 ˜$ ./Fg16TOgv5 name.log
The first line is to change and grant all permissions to the script (use at your discretion/own risk), which in turn will take the output file name.log and yield two more files: gv5_name.log and and name.arch; the latter archive allows for easy generation of SI files while the former is formatted for GaussView5.x.
Thanks to Dr. Šakić for his script and insight, we hope you find it useful and if indeed you do please credit him whenever its due, also, if you find this or other posts in the blog useful, please let us know by sharing, staring and commenting in all of them, your feedback is incredibly helpful in justifying to my bosses the time I spent curating this blog.
Thanks for reading.
One of the most popular posts in this blog has to do with calculating Fukui indexes, however, when dealing with a large number of molecules, our described methodology can become cumbersome since it requires to manually extract the population analysis from two or three different output files and then performing the arithmetic on them separately with a spreadsheet or something.
Our new team member Ricardo Loaiza has written a python script that takes the three aforementioned files and yields a .csv file with the calculated Fukui indexes, and it even points out which of the atoms exhibit the largest values so if you have a large molecule you don’t have to manually check for them. We have also a batch version which takes all the files in any given directory and performs the Fukui calculations for each, provided it can find file triads with the naming requirements described below.
Output files must be named filename.log (the N electrons reference state), filename_plus.log (the state with N+1 electrons) and filename_minus.log (the N-1 electrons state). Another restriction is that so far these scripts only work with NBO population analysis as provided by the NBO3.1 program available in the various versions of Gaussian. I imagine the listing is similar in NBO5.x and NBO6.x and so it should work if you do the population analysis with them.
The syntax for the single molecule version is:
python fukui.py filename.log filename_minus.log filename_plus.log
For the batch version is:
(Por Lote means In Batch in Spanish.)
These scripts are available via GitHub. We hope you find them useful, and you do please let us know whether here at the comments section or at our GitHub site.
We’ve covered some common errors when dealing with formatted checkpoint files (*.fchk) generated from Gaussian, specially when analyzed with the associated GaussView program. (see here and here for previous posts on the matter.)
Prof. Neal Zondlo from the University of Delaware kindly shared this solution with us when the following message shows up:
CConnectionGFCHK::Parse_GFCHK() Missing or bad data: Rbond Line Number 1234
The Rbond label has to do with the connectivity displayed by the visualizer and can be overridden by close examination of the input file. In the example provided by Prof. Zondlo he found the following line in the connectivity matrix of the input file:
2 9 0.0
which indicates a zero bond order between atoms 2 and 9, possibly due to their proximity. He changed the line to simply
So editing the connectivity of your atoms in the input can help preventing the Rbond message.
I hope this helps someone else.