Blog Archives

Population Analysis in the Excited State with Gaussian


To calculate what the bonding properties of a molecule are in a particular excited state we can run any population analysis following the root of interest. This straightforward procedure takes two consecutive calculations since you don’t necessarily know before hand which excited state is the one of interest.

The regular Time Dependent Density Functional Theory (TD-DFT) calculation input with Gaussian 16 looks as follows (G09 works pretty much the same), let us assume we’ve already optimized the geometry of a given molecule:

%OldChk=filename.chk
%nprocshared=16
%chk=filename_ES.chk

#p TD(NStates=10,singlets) wb97xd/cc-pvtz geom=check guess=read

Title Card Required

0 1
--blank line--

This input file retrieves the geometry and wavefunction from a previous calculation from filename.chk and doesn’t write anything new into it (that is what %OldChk=filename.chk means) and creates a new checkpoint where the excited states are calculated (%chk=filename_ES.chk)

In the output you search for the transition which peeks your interest; most often than not you’ll be interested in the one with the highest oscillator strength, f. The oscillator strength is a dimensionless number that represents the ratio of the observed, integrated, absorption coefficient to that calculated for a single electron in a three-dimensional harmonic potential [Harris & Bertolucci, Symmetry and Spectroscopy]; in other words, it is related to the probability of that transition to occur, and therefore it takes values from 0.0 to 1.0 (for single photon absorption processes.)

The output of this calculation looks as follows, the value of f for every excitation is reported together with its energy and the orbital transitions which comprise it.

 Excitation energies and oscillator strengths:

 Excited State   1:      Singlet-A      3.1085 eV  398.86 nm  f=0.0043  <S**2>=0.000
      56 -> 59        -0.11230
      58 -> 59         0.69339
 This state for optimization and/or second-order correction.
 Total Energy, E(TD-HF/TD-DFT) =  -1187.56377917
 Copying the excited state density for this state as the 1-particle RhoCI density.

 Excited State   2:      Singlet-A      4.0827 eV  303.68 nm  f=0.0016  <S**2>=0.000
      52 -> 59         0.46689
      52 -> 64        -0.20488
      53 -> 59         0.19693
      54 -> 59         0.40414
      54 -> 64        -0.16261
...
... 
Excited State   8:      Singlet-A      5.2345 eV  236.86 nm  f=0.8063  <S**2>=0.000
      52 -> 60         0.17162
      53 -> 59         0.47226
      53 -> 60        -0.11771
      54 -> 59        -0.27658
      54 -> 60        -0.22006
      55 -> 59         0.20496
      56 -> 59         0.15029

Now we’ve selected excited state #8 because it has the largest value of f from the lot, we use the following input to read in the geometry from the old checkpoint file and we generate a new one in case we need it for something else. The input file for doing all this looks as follows (I’ve selected as usual the Natural Bond Orbital population analysis):

%oldchk=a_ES.chk
%nprocshared=16
%chk=a_nbo.chk

#p TD(Read,Root=8) wb97xd/cc-pvtz geom=check density=current guess=read pop=NBORead

Title Card Required

0 1

$NBO BOAO BNDIDX E2PERT $END

--blank line--

The flags at the bottom request the calculation of Wiberg Bond Indexes (BNDIDX) as well as Bond Order in the Atomic Orbital basis (BOAO) and a second order perturbation theory for the electronic delocalization (E2PERT). Now we can compare the population analysis between ground and the 8th excited state; check figure 1 and notice the differences in Wiberg’s bond order for this complex made of two molecules and one Na+ cation.

Figure 1. Natural Population Analysis comparison for a supramolecular arrangement. Numbers next in brackets correspond to the sum of charges for each molecule. Notice the significant change in charges for each molecule when going from the ground to the 8th excited state.

In this example we can observe that in the ground state we have a neutral and a negative molecule together with a Na+ cation, but when we analyze the population in the 8th excited state both molecules acquire a similar charge, ca. 0.46e, which means that some of the electron density has been transferred from the negative one to the neutral molecule, forming an Electron Donor-Acceptor complex (EDA) in the excited state.

This procedure can be extended to any other kind of population analysis and their derived combination, e.g. one could calculate their condensed fukui functions in the Nth excited state; but beware! These calculations yield vertical excitations, should the excited state of interest have a minimum we can first optimize the ES geometry and then perform the population analysis on said geometry; just add the opt keyword to perform both jobs in one go, but bear in mind that the NBO population analysis is performed before and after the optimization process so look for the tables and values closer to the end of the output file.

In the case of open shell systems the procedure is the same but one should be extremely careful in searching for the total population analysis since the output file contains this table for the alpha and beta populations separately as well as the added values for the total number of electrons.

Photosynthesis in the near-IR. A New paper in JCTC


Photosynthetic organisms are so widespread around the globe they have adapted to various solar lighting conditions to thrive. The bacteria Blastochloris viridis absorbs light in the near infrared region of the electromagnetic spectrum, in fact, it holds the record for the longest wavelength (~1015 nm) absorbing organism whose Light Harvesting complex 1 (LHC1) has been elucidated. Despite their adaptation to a wide number of light conditions, photosynthetic organism can only make use of so many pigments or chromophores; the LHC1 (Figure 1) in B. viridis in fact is made up of Bacteriochlorophyll-b (BChl-b) molecules, one of the most abundant photosynthetic pigments on Earth, whose main absorption in solution (MeOH) is observed at 795 nm.

Figure 1. B. viridis LHC1 (PDB 6ET5)

So, how can B. viridis use BChl-b molecules to absorb near IR radiation and how does it achieve this remarkable red-shifting effect? The LHC1 structure was solved in 2018 by Qian et al. through Cryo-EM at a 2.9 Å resolution; it is comprised of 17 protein subunits surrounding the so called photosynthetic pigments special pair. Each of these subunits is made up of three α-helix structures surrounding two BChl-b and one dihydroneurosporene (DHN) molecule for a total of 34 of these photosynthetic pigments inside the LHC and 17 DHN molecules interacting between the protein structures and the
main BChl-b pigments.

It was Dr. Jacinto Sandoval and Gustavo “Gus” Mondragón who brought this facts to our attention during their survey of potential candidates for calculating exotic exciton transfer mechanisms in photosynthetic organisms, part of Gustavo’s PhD thesis. To them, it was clear from the start that some sort of cooperative effect between pigments was operating and possibly leading to the red-shifted absorption, therefore a careful dissection of all possible pigments combinations was carried out and their UV-Vis spectra were calculated at the CAMB3LYP/cc-pVDZ on PBE0/6-31G(d) optimized geometries, leading to the systems shown below in figure 2.

Figure 2. Two and Three containing pigments systems under study

System B7 reproduced the red-shifted absorption at 1026 nm, but since the original structure was fitted from the Cryo-EM with a 2.9 Å resolution, “Gus” suggested reaching out to the group of Prof. Andrés Gerardo Cisneros and Dr. Jorge Nochebuena at UT Dallas for carrying out QM/MM calculations; this optimization included the proteins surrounding the pigments in the MM layer and the interacting residues (Hys coordinated to Mg2+ ions in BChl-b) along the chromophores were incorporated into the QM layer, however the thus obtained minima for the B7 system lost the main absorption in the near-IR region, therefore, Dr. Nochebuena suggested running an MD simulation (45 ns) and took a random sampling of ten structures (Figure 3).

Figure 3. Structure sampling of the MD structures based on k-means
clustering analysis.

All structures in the sampling reproduced the red-shifted absorption (~1000 nm) successfully proving that cooperative and dynamic effects allow B. viridis to perform photosynthesis with low energy radiation (Figure 4). Therefore, close intermolecular interactions along with thermal/dynamical fluctuations allow for a regular pigment such as BChl-b to form near-IR absorbing photosystems for organisms to thrive in low conditions of solar light.

Figure 4. Calculated spectra of representative structure 7 vs the experimental spectrum. (a) Calculated spectrum (on green) including the DHN
molecule fragment. (b) Calculated spectrum (blue) excluding the DHN molecule fragment.

If you want to read further details, this work is now published in the Journal of Chemical Theory and Computation of the American Chemical Society. I’ll talk about this and other ventures in photosynthesis next week at the WATOC conference in Vancouver, swing by to talk CompChem!

Percentage of Molecular Orbital Composition – G09,G16


Canonical Molecular Orbitals are–by construction–delocalized over the various atoms making up a molecule. In some contexts it is important to know how much of any given orbital is made up by a particular atom or group of atoms, and while you could calculate it by hand given the coefficients of each MO in terms of every AO (or basis set function) centered on each atom there is a straightforward way to do it in Gaussian.

If we’re talking about ‘dividing’ a molecular orbital into atomic components, we’re most definitely talking about population analysis calculations, so we’ll resort to the pop keyword and the orbitals option in the standard syntax:

#p M052x/cc-pVDZ pop=orbitals

This will produce the following output right after the Mulliken population analysis section:

Atomic contributions to Alpha molecular orbitals:
 Alpha occ 140 OE=-0.314 is Pt1-d=0.23 C38-p=0.16 C31-p=0.16 C36-p=0.16 C33-p=0.15
 Alpha occ 141 OE=-0.313 is Pt1-d=0.41
 Alpha occ 142 OE=-0.308 is Cl2-p=0.25
 Alpha occ 143 OE=-0.302 is Cl2-p=0.72 Pt1-d=0.18
 Alpha occ 144 OE=-0.299 is Cl2-p=0.11
 Alpha occ 145 OE=-0.298 is C65-p=0.11 C58-p=0.11 C35-p=0.11 C30-p=0.11
 Alpha occ 146 OE=-0.293 is C58-p=0.10
 Alpha occ 147 OE=-0.291 is C22-p=0.09
 Alpha occ 148 OE=-0.273 is Pt1-d=0.18 C11-p=0.12 C7-p=0.11
 Alpha occ 149 OE=-0.273 is Pt1-d=0.18
 Alpha vir 150 OE=-0.042 is C9-p=0.18 C13-p=0.18
 Alpha vir 151 OE=-0.028 is C7-p=0.25 C16-p=0.11 C44-p=0.11
 Alpha vir 152 OE=0.017 is Pt1-p=0.10
 Alpha vir 153 OE=0.021 is C36-p=0.15 C31-p=0.14 C63-p=0.12 C59-p=0.12 C38-p=0.11 C33-p=0.11
 Alpha vir 154 OE=0.023 is C36-p=0.13 C31-p=0.13 C63-p=0.11 C59-p=0.11
 Alpha vir 155 OE=0.027 is C65-p=0.11 C58-p=0.10
 Alpha vir 156 OE=0.029 is C35-p=0.14 C30-p=0.14 C65-p=0.12 C58-p=0.11
 Alpha vir 157 OE=0.032 is C52-p=0.09
 Alpha vir 158 OE=0.040 is C50-p=0.14 C22-p=0.13 C45-p=0.12 C17-p=0.11
 Alpha vir 159 OE=0.044 is C20-p=0.15 C48-p=0.14 C26-p=0.12 C54-p=0.11

Alpha and Beta densities are listed separately only in unrestricted calculations, otherwise only the first is printed. Each orbital is listed sequentially (occ = occupied; vir = virtual) with their energy value (OE = orbital energy) in atomic units following and then the fraction with which each atom contributes to each MO.

By default only the ten highest occupied orbitals and ten lowest virtual orbitals will be assessed, but the number of MOs to be analyzed can be modified with orbitals=N, if you want to have all orbitals analyzed then use the option AllOrbitals instead of just orbitals. Also, the threshold used for printing the composition is set to 10% but it can be modified with the option ThreshOrbitals=N, for the same compound as before here’s the output lines for HOMO and LUMO (MOs 149, 150) with ThreshOrbitals set to N=1, i.e. 1% as occupation threshold (ThreshOrbitals=1):

Alpha occ 149 OE=-0.273 is Pt1-d=0.18 N4-p=0.08 N6-p=0.08 C20-p=0.06 C13-p=0.06 C48-p=0.06 C9-p=0.06 C24-p=0.05 C52-p=0.05 C16-p=0.04 C44-p=0.04 C8-p=0.03 C15-p=0.03 C17-p=0.03 C45-p=0.02 C46-p=0.02 C18-p=0.02 C26-p=0.02 C54-p=0.02 N5-p=0.01 N3-p=0.01
Alpha vir 150 OE=-0.042 is C9-p=0.18 C13-p=0.18 C44-p=0.08 C16-p=0.08 C15-p=0.06 C8-p=0.06 N6-p=0.04 N4-p=0.04 C52-p=0.04 C24-p=0.04 N5-p=0.03 N3-p=0.03 C46-p=0.03 C18-p=0.03 C48-p=0.02 C20-p=0.02

The fragment=n label in the coordinates can be used as in BSSE Counterpoise calculations and the output will show the orbital composition by fragments with the label "Fr", grouping all contributions to the MO by the AOs centered on the atoms in that fragment.

As always, thanks for reading, sharing, and rating. I hope someone finds this useful.

Au(I) Chemistry No.3 – New paper in Dalton Transactions


Stabilizing Gold in low oxidation states is a longstanding challenge of organometallic chemistry. To do so, a fine tuning of the electron density provided to an Au atom by a ligand via the formation of a σ bond. The group of Professor Rong Shang at the University of Nagasaki has accomplished the stabilization of an aurate complex through the use of a boron, nitrogen-containing heterocyclic carbene; DFT calculations at the wB97XD/(LANL2TZ(f),6-311G(d)) level of theory revealed that this ligand exhibits a high π-withdrawing character of the neutral 4π B,N-heterocyclic carbene (BNC) moiety and a 6π weakly aromatic character with π-donating properties, implying that this is the first cyclic carbene ligand that is able to be tuned between π-withdrawing (Fischer-type)- and π-donating (Schrock-type) kinds.

A π-withdrawing character on part of the ligand is important to allow the electron-rich gold center back donate some of its excess electron density, this way preventing its oxidation. A modification of Bertrand’s cyclic (alkyl)(amino)carbene (CAAC) has allowed Shang and co-workers to perform the two electrons Au(I) reduction to form the aurate shown in figure 1 (CCDC 2109027). This work also reports on the modular synthesis of the BNC-1 ligand and the mechanism was calculated once again by Leonardo “Leo” Lugo.

Figure 1. Compound 4a (H atoms omitted for clarity)

The ability of the BNC-1 ligand to accept gold’s back donation is reflected on the HOMO/LUMO gap as shown in Figure 2; while BNC-1 has a gap of 7.14 eV, the classic NHC carbene has a gap of 11.28 eV, furthermore, in the case of NHC the accepting orbital is not LUMO but LUMO+1. Additionally, the NBO delocalization energies show that the back donation from Au 5d orbital to the C-N antibonding π* orbital is about half that expected for a Fischer type carbene, suggesting an intermediate character between π accepting and π donating carbene. On the other hand, the largest interaction corresponds to the carbanion density donated to Au vacant p orbital (ca. 45 kcal/mol). All these observations reveal the successful tuning of the electron density on BNC-1.

Figure 2. Frontier Molecular Orbitals for the ligand BNC-1 and a comparison to similar carbenes used elsewhere

This study is available in Dalton Transactions. As usual, I’m honored to be a part of this international collaboration, and I’m deeply thankful to the amazing Prof. José Oscar Carlos Jiménez-Halla for inviting me to be a part of it.

Yoshitaka Kimura, Leonardo I. Lugo-Fuentes, Souta Saito, J. Oscar C. Jimenez-HallaJoaquín Barroso-FloresYohsuke YamamotoMasaaki Nakamoto and Rong Shang* “A boron, nitrogen-containing heterocyclic carbene (BNC) as a redox active ligand: synthesis and characterization of a lithium BNC-aurate complex”, Dalton Trans., 2022,51, 7899-7906 https://doi.org/10.1039/D2DT01083F

Exciton Energy Transfer-Talk at the Virtual Winter School of Comp.Chem. 2022


I’m very honored to have been invited to this edition of this long standing event, the Virtual Winter School of Computational Chemistry. In this talk I walk through the basics of what are excitons and how do they move or transfer across matter; and of course, a primer on how to calculate the energy transfer with Gaussian.

This is a very basic introduction but I hope someone finds it useful. Thanks to Henrique Castro for inviting me to take part of this experience and to all the professors and students involved in the organization. Don’t forget to go and check all the other fantastic talks, including one by Nobel Laureate and chemistry legend Prof. Roald Hoffmann, at the Virtual Winter School’s website: https://winterschool.cc/

Water splitting by proton to hydride umpolung—New paper in Chem.Sci.


The word ‘umpolung‘ is not used often enough in my opinion, and that’s a shame since this phenomenon refers to one of the most classic tropes or deus ex machina used in sci-fi movies—prominently in the Dr. Who lore*—and that is ‘reversing the polarity‘. Now, reversing the polarity only means that for any given dipole the positively charged part now acquires a negative charge, while the originally negatively charged part becomes positively charged, and thus the direction of the dipole moment is, well, reversed.

In chemistry, reversing the polarity of a bond is an even cooler matter because it means that atoms that typically behave as positively charged become negatively charged and react with other molecules accordingly. Such is the case of this new research conducted experimentally by Prof. Rong Shang at Hiroshima University and theoretically elucidated by Leonardo “Leo” Lugo, who currently works jointly with me and my good friend the always amazing José Oscar Carlos Jimenez-Halla at the University of Guanajuato, Mexico.

Production of molecular hydrogen from water splitting at room temperature is a remarkable feat that forms the basis of fuel cells in the search for cleaner sources of energy; this process commonly requires a metallic catalyst, and it has been achieved via Frustrated Lewis Pairs from Si(II), but so far the use of an intramolecular electron relay process has not been reported.

BPB – Figure 1

Prof. Rong Shang and her team synthesized an ortho-phenylene linked bisborane functionalized phosphine (Figure 1), and proved their stoichiometric reaction with water yielding H2 and phosphine oxide quantitatively at room temperature. During the reaction mechanism the umpolung occurs when a proton from the captured water molecule forms a hydride centered on the borane moiety of BPB. The reaction mechanism is shown in Figure 2.

According to the calculated mechanism, a water molecule coordinates to one of the borane groups via the oxygen atom, and the phosphorus atom later forms a hydrogen bond via their lone pair separating the water molecule into OH and H+, this latter migrates to the second borane and it is during this migration (marked TSH2 in Figure 2) where the umpolung process takes place; the natural charge of the hydrogen atom changes from positive to negative and stays so in the intermediate H3. This newly formed hydride reacts with the hydrogen atom on the OH group to form the reduction product H2, the final phosphine oxide shows a PO…B intramolecular forming a five membered ring which further stabilizes it.

This results are now available in Chemical Science, 2021, 12, 15603 DOI:10.1039/d1sc05135k. As always, I deeply thank Prof. Óscar Jiménez-Halla for inviting me to participate on this venture.


* Below there’s a cool compilation of the Reverse the Polarity trope found in Dr. Who:

XIX RMFQT – National Meeting on TheoPhysChem


The Mexican Meeting on Theoretical Physical Chemistry is a national staple of our local scientific discipline. The nineteenth edition had to be a virtual conference due to sanitary restrictions still enforced in Mexico. Nevertheless, this was a successful meeting in which we tried new things, such as a live broadcast via our new official YouTube channel and a Twitter poster session covered under the hashtag #RMFQTXIX.

Please browse the previous links (talks in Spanish, most Tweets are also in Spanish but some are available in English.) Twitter conferences are here to stay and the creativity from the participants will be key in moving them forward; unfortunately, most of us are still grounded in the traditional idea of a physical poster and that notion taken literally translates poorly to a Tweet. I wanted to embed some of the presented posters but I don’t want to leave people out and they were fortunately too many for them to fit in a blog post. So head on to Twitter and check the hashtags #RMFQTXIX and #CompChemMX and follow the official Twitter account for the RMFQT.

A big shout-out to the staff, PhD students Jessica Arcudia and Gustavo Mondragón for keeping up the live sessions and online broadcast. The future of Mexican CompChem is in safe hands!

Worldwide CompChem in the Fight against COVID-19


The war against COVID-19 has been waged in many fronts. The computational chemistry community has done their share during this pandemic to put forward a cure, a vaccine, or a better understanding of the molecular mechanisms behind the human infection by the SARS-CoV-2 virus. As few vaccines show currently their heads and start making their way around the globe to stop the spreading, amidst a climate of disinformation, distrust and political upheaval, all of which pose several challenges yet to be faced aside from the technical and scientific ones.

This is by no means a comprehensive review of the literature, in fact, most of the cited literature herein was observed in Twitter under the #CompChem and #COVID combined hashtags; Summarizing the research by the CompChem community on COVID-19 related topics in a single blog-post would be near to impossible—I trust a book is being written on it as I type these lines.

The structural elucidation of the proteins associated to the SARS-CoV-2 virus is probably the first step required in designing chemical compounds capable of modifying their functions and altering their life-cycle without altering the biochemistry of the hosts. The Coronavirus Structural Taskforce has elucidated the structure of 28 proteins of SARS-CoV-2 aside from the 300+ proteins from the previous SARS-CoV virus using the tools from the FoldIt at home game based on the Rosetta program to heuristically predict the structure of these proteins. Structure based drug design rely on the knowledge of the structure of the active site (hence the name), but in the case of newly discovered proteins for which homology modeling is not entirely feasible, a ligand-based approach named D3Similarity was developed early in the pandemic for identifying the possible active sites by the group of Prof. Zhijian Xu. Mapping of the of the viral genome and proteome was also achieved early on during the first dates of lockdown in the American continent. The information was readily made available and usable for further studies which prompts another challenge: the rapid dissemination, review and evaluation of information to make scientifically sound claims and make data-based decisions. In this regard, the role of preprints cannot be stressed enough. Without a rapid communication, scientific results cannot generate a much needed critical mass to turn all these data into knowledge. As evidenced by the vast majority of the links present in this post, ChemRXiv from the ACS served the much needed function to gather, link and put the data for scientific evaluation out there in order to accelerate the discovery of solutions to the various steps of the virus’ reproductive cycle through various strategies.

The role of supercomputing has been paramount worldwide to the various efforts made in CompChem (read the C&EN piece) in various fronts from structural elucidation, such as the AI driven structural modelling of spike proteins and their infection mechanism led by Prof. Rommie Amaro (UCSD) and Dr. Arvind Ramanathan which was celebrated by the Bell Prize, to development of vaccines. Many Molecular Dynamics simulations have been performed on potential inhibitors of proteins such as the spike protein, in some cases these simulations coupled with cryo-EM microscopy allowed for the elucidation of the hinging mechanism of these spike proteins, their thermodynamic properties, and all atoms-simulations assessed the rigidity of the receptor as the cause of its infectivity. Still, owning these computing resources isn’t always cost effective; that’s why there have been outsourced to companies such as Amazon web services as Pearlman did for the QM/DFT calculations of the binding energy of several drug candidates for the inhibition of the virus’ main protease (MPro). Many other CADD studies are available (here, here, and here). Researchers from all around the world can chip in and join the effort by reaching out to the COVID-19 High Performance Computing Consortium (HPC) which brings together some of the most advanced computing systems to the hands of private and academic researchers with relevant projects aimed to the study of the virus. On the other side of the Atlantic, the Partnership for Advanced Computing in Europe (PRACE) also provides access to advanced computing services for research. As an effort to keep all the developing information curated and concentrated, the COVID-19 Molecular Structure and Therapeutics Hub was created to provide a community-driven data repository and curation service for molecular structures, models, therapeutics, and simulations related to computational research related to therapeutic opportunities.

As described above, molecular dynamics simulations are capital in the assessment of how drugs interact with proteins. But molecular dynamics can only do so much as they’re computing intensive so, the use of Polarizable Force Fields (PFF) algorithms to obtain results in the microseconds regime with high-resolution sampling methods which have been applied also to the modeling of the MPro protein; the phase space is sampled by different MD trajectories which are then tested and selected. Aside from classical simulations, artificial intelligence predictions and docking calculations, also quantum mechanical calculations have been employed in the search for the most intimate interactions governing the mechanisms of inhibition of proteins. In this front, a Fragment Molecular Orbital based analysis was carried out to find which residues in MPro interacted the most with a given inhibitor.

Virtual screening is at the heart of the computationally aided drug discovery process, specially high-throughput virtual screening such as the one performed by the group of Andre Fischer at Basel, in which 11 potential drugs were narrowed from a pool of over 600 million compounds that were analyzed as potential protease inhibitors. Repurposing of antiviral drugs, and other entry-inhibiting compounds, is also a major avenue explored in the search for treatments; in the linked study by Shailly Tomar et al. antiviral drugs which are also anti inflammatory are believed to take care of lung inflammation and injury associated to the infection at the same time they tend to disrupt the virus’ infection mechanism. The comeback of Virtual Reality can make virtual screening more cooperative even during lockdown conditions and more ‘tangible’ as the company Nanome has proven with their COVID-19 Town Hall meetings which aim to the modeling of proteins in 3D space. Aside from the de novo and repurposing efforts, the search for peptides against infection by SARS-CoV-2 was an important topic (here and here). More recently, Skariyachan and Gopal turn to natural products from herbal origins for their virtual screening (molecular docking and dynamics). In their perspective the chemical complexity achieved through biosynthesis can overcome the bottleneck of chemical discovery while at the same time turning to the ancient practices of herbal remedies described in Ayurveda. Other researchers like Manish Manish have also turned to libraries of 500,000+ natural compounds to find potential drugs for MPro.

The year is coming to an end but not the pandemic in any way. Now, with the advent of new strains, and the widespread vaccination effort put in place, it is more important than ever to keep the fight strong in our labs but also in our personal habits and responsibilities—the same advices that were given at the beginning of the year are still in effect today and will continue to be for the months to come. I want to wish everyone who reads this a happy holiday season, but above all I want to pay a small tribute to the scientists working relentlessly in one of the largest coordinated scientific efforts in modern history, one that can only be compared to the Moon landing or the Manhattan Project; to those scientists and all the healthcare personnel, may you find rest soon, may your efforts never go unnoticed: Thank you for your service.

Basis Set Superposition Error (BSSE). A short intro


Molecular Orbitals (MOs) are linear combinations of Atomic Orbitals (AOs), which in turn are linear combinations of other functions called ‘basis functions’. A basis, or more accurately a basis set, is a collection of functions which obey a set of rules (such as being orthogonal to each other and possibly being normalized) with which all AOs are constructed, and although these are centered on each atomic nucleus, the canonical way in which they are combined yield delocalized MOs; in other words, an MO can occupy a large space spanning several atoms at once. We don’t mind this expansion across a molecule, but what about between two molecules? Calculating the interaction energy between two or more molecular fragments leads to an artificial extra–stabilization term that stems from the fact that electrons in molecule 1 can occupy AO’s (or the basis functions which form them) centered on atoms from molecule 2.

Fundamentally, the interaction energy of any A—B dimer, Eint, is calculated as the energy difference between the dimer and the separately calculated energies for each component (Equation 1).

Eint = EAB – EA – EB (1)

However the calculation of Eint by this method is highly sensitive to the choice of basis set due to the Basis Set Superposition Error (BSSE) described in the first paragraph. The BSSE is particularly troublesome when small basis sets are used, due to the poor description of dispersion interactions but treating this error by just choosing a larger basis set is seldom useful for systems of considerable sizes. The Counterpoise method is a nifty correction to equation 1, in which EA and EB are calculated with the basis set of A and B respectively, i.e., only in EAB a larger basis set (that of A and B simultaneously) is used. The Counterpoise method calculates each component with the AB basis set (Equation 2)

EintCP = EABAB – EAAB– EBAB (2)

where the superscript AB means the whole basis set is used. This is accomplished by using ‘ghost‘ atoms with no nuclei and no electrons but empty basis set functions centered on them.

In Gaussian, BSSE is calculated with the Counterpoise method developed by Boys and Simon. It requires the keyword Counterpoise=N where N is the number of fragments to be considered (for an A—B system, N=2). Each atom in the coordinates list must be specified to which fragment pertains; additionally, the charge and multiplicity for each fragment and the whole supermolecular ensemble must be specified. Follow the example of this hydrogen fluoride dimer.

%chk=HF2.chk
#P opt wB97XD/6-31G(d,p) Counterpoise=2

HF dimer

0,1 0,1 0,1
H(Fragment=1) 0.00 0.00 0.00
F(Fragment=1) 0.00 0.00 0.70
H(Fragment=2) 0.00 0.00 1.00
F(Fragment=2) 0.00 0.00 1.70

For closed shell fragments the first line is straightforward but one must pay attention that the first pair of numbers in the charge multiplicity line correspond to the whole ensemble, whereas the folowing pairs correspond to each fragment in consecutive order. Fragments do not need to be specified contiguously, i.e., you don’t need to define all atoms for fragment 1 and after those the atoms for fragment 2, etc. They could be mixed and the program still assigns them correctly. Just as an example I typed wB97XD but any other method, DFT or ab initio, may be used; only semiempirical methods do not admit a BSSE calculation because they don’t make use of a basis set in the first place!

The output provides the corrected energy (in atomic units) for the whole system, as well as the BSSE correction (which added to the previous term yields the un-corrected energy of the system). Gaussian16 also provides these values in kcal/mol as ‘Complexation energies’ first raw (uncorrected) and then the corrected energy.

BSSE is always present and cannot be entirely eliminated because of the use of finite basis sets but it can be correctly dealt with if the Counterpoise method is included.

Orbital Contributions to Excited States


This is a guest post by our very own Gustavo “Gus” Mondragón whose work centers around the study of excited states chemistry of photosynthetic pigments.

When you’re calculating excited states (no matter the method you’re using, TD-DFT, CI-S(D), EOM-CCS(D)) the analysis of the orbital contributions to electronic transitions poses a challenge. In this post, I’m gonna guide you through the CI-singles excited states calculation and the analysis of the electronic transitions.

I’ll use adenine molecule for this post. After doing the corresponding geometry optimization by the method of your choice, you can do the excited states calculation. For this, I’ll use two methods: CI-Singles and TD-DFT.

The route section for the CI-Singles calculation looks as follows:

%chk=adenine.chk
%nprocshared=8
%mem=1Gb

#p CIS(NStates=10,singlets)/6-31G(d,p) geom=check guess=read scrf=(cpcm,solvent=water)

adenine excited states with CI-Singles method

0 1
--blank line--

I use the same geometry from the optimization step, and I request only for 10 singlet excited states. The CPCP implicit solvation model (solvent=water) is requested. If you want to do TD-DFT, the route section should look as follows:

%chk=adenine.chk
%nprocshared=8
%mem=1Gb

#p FUNCTIONAL/6-31G(d,p) TD(NStates=10,singlets) geom=check guess=read scrf=(cpcm,solvent=water)

adenine excited states with CI-Singles method

0 1
--blank line--

Where FUNCTIONAL is the DFT exchange-correlation functional of your choice. Here I strictly not recommend using B3LYP, but CAM-B3LYP is a noble choice to start.

Both calculations give to us the excited states information: excitation energy, oscillator strength (as f value), excitation wavelength and multiplicity:

Excitation energies and oscillator strengths:

 Excited State   1:      Singlet-A      6.3258 eV  196.00 nm  f=0.4830  <S**2>=0.000
      11 -> 39        -0.00130
      11 -> 42        -0.00129
      11 -> 43         0.00104
      11 -> 44        -0.00256
      11 -> 48         0.00129
      11 -> 49         0.00307
      11 -> 52        -0.00181
      11 -> 53         0.00100
      11 -> 57        -0.00167
      11 -> 59         0.00152
      11 -> 65         0.00177

The data below corresponds to all the electron transitions involved in this excited state. I have to cut all the electron transitions because there are a lot of them for all excited states. If you have done excited states calculations before, you realize that the HOMO-LUMO transition is always an important one, but not the only one to be considered. Here is when we calculate the Natural Transition Orbitals (NTO), by these orbitals we can analyze the electron transitions.

For the example, I’ll show you first the HOMO-LUMO transition in the first excited state of adenine. It appears in the long list as follows:

35 -> 36         0.65024

The 0.65024 value corresponds to the transition amplitude, but it doesn’t mean anything for excited state analysis. We must calculate the NTOs of an excited state from a new Gaussian input file, requesting from the checkpoint file we used to calculate excited states. The file looks as follows:

%Oldchk=adenine.chk
%chk=adNTO1.chk
%nproc=8
%mem=1Gb

#p SP geom=allcheck guess=(read,only) density=(Check,Transition=1) pop=(minimal,NTO,SaveNTO)

I want to say some important things right here for this last file. See that no level of theory is needed, all the calculation data is requested from the checkpoint file “adenine.chk”, and saved into the new checkpoint file “adNTO1.chk”, we must use the previous calculated density and specify the transition of interest, it means the excited state we want to analyze. As we don’t need to specify charge, multiplicity or even the comment line, this file finishes really fast.

After doing this last calculation, we use the new checkpoint file “adNTO1.chk” and we format it:

formchk -3 adNTO1.chk adNTO1.fchk

If we open this formatted checkpoint file with GaussView, chemcraft or the visualizer you want, we will see something interesting by watching he MOs diagram, as follows:

We can realize that frontier orbitals shows the same value of 0.88135, which means the real transition contribution to the first excited state. As these orbitals are contributing the most, we can plot them by using the cubegen routine:

cubegen 0 mo=homo adNTO1.fchk adHOMO.cub 0 h

This last command line is for plotting the equivalent as the HOMO orbital. If we want to plot he LUMO, just change the “homo” keyword for “lumo”, it doesn’t matter if it is written with capital letters or not.

You must realize that the Natural Transition Orbitals are quite different from Molecular Orbitals. For visual comparisson, I’ve printed also the molecular orbitals, given from the optimization and from excited states calculations, without calculating NTOs:

These are the molecular frontier orbitals, plotted with Chimera with 0.02 as the isovalue for both phase spaces:

The frontier NTOs look qualitatively the same, but that’s not necessarily always the case:

If we analyze these NTOs on a hole-electron model, the HOMO refers to the hole space and the LUMO refers to the electron space.

Maybe both orbitals look the same, but both frontier orbitals are quite different between them, and these last orbitals are the ones implied on first excited state of adenine. The electron transition will be reported as follows:

If I can do a graphic summary for this topic, it will be the next one:

NTOs analysis is useful no matter if you calculate excited states by using CIS(D), EOM-CCS(D), TD-DFT, CASSCF, or any of the excited states method of your election. These NTOs are useful for population analysis in excited states, but these calculations require another software, MultiWFN is an open-source code that allows you to do this analysis, and another one is called TheoDORE, which we’ll cover in a later post.

%d bloggers like this: