The word ‘umpolung‘ is not used often enough in my opinion, and that’s a shame since this phenomenon refers to one of the most classic tropes or deus ex machina used in sci-fi movies—prominently in the Dr. Who lore*—and that is ‘reversing the polarity‘. Now, reversing the polarity only means that for any given dipole the positively charged part now acquires a negative charge, while the originally negatively charged part becomes positively charged, and thus the direction of the dipole moment is, well, reversed.
In chemistry, reversing the polarity of a bond is an even cooler matter because it means that atoms that typically behave as positively charged become negatively charged and react with other molecules accordingly. Such is the case of this new research conducted experimentally by Prof. Rong Shang at Hiroshima University and theoretically elucidated by Leonardo “Leo” Lugo, who currently works jointly with me and my good friend the always amazing José Oscar Carlos Jimenez-Halla at the University of Guanajuato, Mexico.
Production of molecular hydrogen from water splitting at room temperature is a remarkable feat that forms the basis of fuel cells in the search for cleaner sources of energy; this process commonly requires a metallic catalyst, and it has been achieved via Frustrated Lewis Pairs from Si(II), but so far the use of an intramolecular electron relay process has not been reported.
Prof. Rong Shang and her team synthesized an ortho-phenylene linked bisborane functionalized phosphine (Figure 1), and proved their stoichiometric reaction with water yielding H2 and phosphine oxide quantitatively at room temperature. During the reaction mechanism the umpolung occurs when a proton from the captured water molecule forms a hydride centered on the borane moiety of BPB. The reaction mechanism is shown in Figure 2.
According to the calculated mechanism, a water molecule coordinates to one of the borane groups via the oxygen atom, and the phosphorus atom later forms a hydrogen bond via their lone pair separating the water molecule into OH– and H+, this latter migrates to the second borane and it is during this migration (marked TSH2 in Figure 2) where the umpolung process takes place; the natural charge of the hydrogen atom changes from positive to negative and stays so in the intermediate H3. This newly formed hydride reacts with the hydrogen atom on the OH group to form the reduction product H2, the final phosphine oxide shows a PO…B intramolecular forming a five membered ring which further stabilizes it.
This results are now available in Chemical Science, 2021, 12, 15603 DOI:10.1039/d1sc05135k. As always, I deeply thank Prof. Óscar Jiménez-Halla for inviting me to participate on this venture.
* Below there’s a cool compilation of the Reverse the Polarity trope found in Dr. Who:
The Mexican Meeting on Theoretical Physical Chemistry is a national staple of our local scientific discipline. The nineteenth edition had to be a virtual conference due to sanitary restrictions still enforced in Mexico. Nevertheless, this was a successful meeting in which we tried new things, such as a live broadcast via our new official YouTube channel and a Twitter poster session covered under the hashtag #RMFQTXIX.
Please browse the previous links (talks in Spanish, most Tweets are also in Spanish but some are available in English.) Twitter conferences are here to stay and the creativity from the participants will be key in moving them forward; unfortunately, most of us are still grounded in the traditional idea of a physical poster and that notion taken literally translates poorly to a Tweet. I wanted to embed some of the presented posters but I don’t want to leave people out and they were fortunately too many for them to fit in a blog post. So head on to Twitter and check the hashtags #RMFQTXIX and #CompChemMX and follow the official Twitter account for the RMFQT.
I found this error in the calculation of two interacting fragments, both with unpaired electrons. So, two radicals interact at a certain distance and the full system is deemed as a singlet, therefore the unpaired electron on each fragment have opposite spins. The problem came when trying to calculate the Basis Set Superposition Error (BSSE) because in the Counterpoise method you need to assign a charge and multiplicity to each fragment, however it’s not obvious how to assign opposite spins.
The core of the problem is related to the guess construction; normally a Counterpoise calculation would look like the following example:
#p B3LYP/6-31G(d,p) counterpoise=2 -2,1 -1,2 -1,2 C(Fragment=1) 0.00 0.00 0.00 O(Fragment=2) 1.00 1.00 1.00 ...
In which the first pair of charge-multiplicity numbers correspond to the whole molecule and the following to those of each fragment in increasing order of N (in this case, N = 2). So for this hypothetical example we have two anions (but could easily be two cations) each with an unpaired electron, yielding a complex of charge = -2 and a singlet multiplicity which implies those two unpaired electrons have opposite spin. But if the guess (the initial trial wavefunction from which the SCF will begin) has a problem understanding this then the title error shows up:
Bad data into FinFrg Error termination via Lnk1e ...
The solution to this problem is as simple as it may be obscure: Create a convenient guess wavefunction by placing a negative sign to the multiplicity of one of the fragments in the following example. You may then use the guess as the starting point of other calculations since it will be stored in the checkpoint file. By using this negative sign we’re not requesting a negative multiplicity, but a given multiplicity of opposite spin to the other fragment.
#p B3LYP/6-31G(d,p) guess=(only,fragment=2) -2,1 -1,2 -1,-2 C(Fragment=1) 0.00 0.00 0.00 O(Fragment=2) 1.00 1.00 1.00 ...
This way, the second fragment will have the opposite spin (but the same multiplicity) as the first fragment. The only keyword tells gaussian to only calculate the guess wave function and then exit the program. You may then use that guess as the starting point for other calculations such as my failed Counterpoise one.
Once again, as the year before, I woke to my phone resembling a slot machine from any Vegas casino. #LatinXChem became a Trending Topic on Twitter early from it’s start on september 20th 2021, this year the Twitter poster session was divided into eleven categories (Ana, Bio, Comp, Edu, Eng, Env, Inorg, Mat, Nano, Org, Phys) with Nano and Comp being the ones with the most participants.
Last year, due to the 2020 COVID-19 pandemic, most if not all scientific conferences were cancelled and in some cases replaced by an endless stream of webinars, some with virtual spaces for hanging out and networking with colleagues around the world. Poster sessions on massive platforms like Twitter aren’t new, the yearly #RSCPoster session was already a popular event even before the many lockdowns, but only a few have amassed the number of participants #LatinXChem has. From its conception, #LatinXChem was thought not to be exclusive for the Latin community, but rather as an event created from Latin America for the world; it’s now time to move it forward into the next level, making it a brand, not just a yearly event. Through #LatinXChem many activities could be sponsored, students could be promoted, and courses could be taught, all covered under the same flag of a global identity with one core goal in mind: showcasing and promoting the chemistry work done by underrepresented groups anywhere in the planet. As this pandemic has clearly shown, science illiteracy is dangerous at various levels, from the health choices we make in our homes to the policies imposed by governments.
To check all the fantastic works presented in this edition just go to Twitter and add the name of the category as described above to the hashtag #LXChem (e.g. #LXChemComp will hash all the posters registered in the computational chemistry category), make sure you like and share your favorite pieces of research but above all make sure you engage with the people behind the research, you won’t regret making new acquaintances who share your scientific interests.
A cool new feature of this year’s edition is the video interviews 8 Minutes with, featuring very personal and inspiring interviews with Latin scientists like Adrian Roitberg, Alan Aspuru-Guzik, and my good friend Javier Vela, but also with awardees from the previous edition. These interviews are also available on YouTube where new videos are uploaded continuously. Although we know people is kinda overloaded with webinars after so many months of distancing, we wanted to keep a tradition of having lectures from Nobel Laureates within the framework of our event. Last year we had the honor of having Prof. Frances Arnold gave the last lecture and this year we’ll have Profs. Roald Hoffmann and Rudolf Marcus.
At #LatinXChem we know chemistry, and we prove it day in and day out sharing our work, our experiences, and our joy for doing exciting scientific research.
The energy of your calculated transition state (TS) is lower than that of the reagents. That’s gotta be an error right? Well, maybe not.
Typically, in classical transition state theory, we associate the reaction barrier to the energy difference between the reaction complex and the TS, in other words, we associate the reaction barrier to the relative energy of the TS. However, this isn’t always the case, since the TS isn’t always located at the barrier, which simply may not exist or may be a submerged one, i.e. the TS relative energy is negative with respect to the reaction complex. This leads to negative activation energies, but one must bear in mind that the activation energy is not equal to the relative energy of the TS but rather to the slope of the Arrhenius plot, which in turn comes from the Arrhenius equation given below.
k = Aexp(Ea/RT) or in logarithmic form Lnk = LnA + (Ea/RT) The Arrhenius plot is then the plot of Lnk vs T-1, with slope Ea
Caution is advised since the apparent presence of such a barrier may be due to a computational artifact rather than to the real kinetics taking place, that’s why an IRC calculation must follow a TS optimization in order to verify the truthfulness of the TS; keep in mind that in classical transition state theory, we’re ‘slicing‘ a multidimensional map along a carefully chosen reaction coordinate but this choice might not entirely be the right one, or even an existing one for that matter. I also recommend to change the level of theory, reconsider the reaction complex structure (because a hidden intermediate or complex may be lurking between reactants and TS, see figure 1) and fully verifying the thermochemistry of all components involved before asserting that any given reaction under study has one of these atypical barriers.
Electronic excitations are calculated vertically according to the Frank—Condon principle, this means that the geometry does not change upon the excitation and we merely calculate the energy required to reach the next electronic state. But for some instances, say calculating not only the absorption spectra but also the emission, it is important to know what the geometry minimum of this final state looks like, or if it even exists at all (Figure 1). Optimizing the geometry of a given excited state requires the prior calculation of the vertical excitations whether via a multireference method, quantum Monte Carlo, or the Time Dependent Density Functional Theory, TD-DFT, which due to its lower computational cost is the most widespread method.
Most single-reference treatments, ab initio or density based, yield good agreement with experiments for lower states, but not so much for the higher excitations or process that involve the excitation of two electrons. Of course, an appropriate selection of the method ensures the accuracy of the obtained results, and the more states are considered, the better their description although it becomes more computationally demanding in turn.
In Gaussian 09 and 16, the argument to the ROOT keyword selects a given excited state to be optimized. In the following example, five excited states are calculated and the optimization is requested upon the second excited state. If no ROOT is specified, then the optimization would be carried out by default on the first excited state (Where L.O.T. stands for Level of Theory).
#p opt TD=(nstates=5,root=2) L.O.T.
Gaussian16 includes now the calculation of analytic second derivatives which allows for the calculation of vibrational frequencies for IR and Raman spectra, as well as transition state optimization and IRC calculations in excited states opening thus an entire avenue for the computation of photochemistry.
If you already computed the excited states and just want to optimize one of them from a previous calculation, you can read the previous results with the following input :
#p opt TD=(Read,Root=N) L.O.T. Density=Current Guess=Read Geom=AllCheck
Common problems. The following error message is commonly observed in excited state calculations whether in TD-DFT, CIS or other methods:
No map to state XX, you need to solve for more vectors in order to follow this state.
This message usually means you need to increase the number of excited states to be calculated for a proper description of the one you’re interested in. Increase the number N for nstates=N in the route section at higher computational cost. A rule of thumb is to request at least 2 more states than the state of interest. This message can also reflect the fact that during the optimization the energy ordering changes between states, and can also mean that the ground state wave function is unstable, i.e., the energy of the excited state becomes lower than that of the ground state, in this case a single determinant approach is unviable and CAS should be used if the size of the molecule allows it. Excited state optimizations are tricky this way, in some cases the optimization may cross from one PES to another making it hard to know if the resulting geometry corresponds to the state of interest or another. Gaussian recommends changing the step size of the optimization from the default 0.3 Bohr radius to 0.1, but obviously this will make the calculation take longer.
If the minimum on the excited state potential energy surface (PES) doesn’t exist, then the excited state is not bound; take for example the first excited state of the H2 molecule which doesn’t show a minimum, and therefore the optimized geometry would correspond to both H atoms moving away from each other indefinitely (Figure 2). Nevertheless, a failed optimization doesn’t necessarily means the minimum does not exist and further analysis is required, for instance, checking the gradient is converging to zero while the forces do not.
The war against COVID-19 has been waged in many fronts. The computational chemistry community has done their share during this pandemic to put forward a cure, a vaccine, or a better understanding of the molecular mechanisms behind the human infection by the SARS-CoV-2 virus. As few vaccines show currently their heads and start making their way around the globe to stop the spreading, amidst a climate of disinformation, distrust and political upheaval, all of which pose several challenges yet to be faced aside from the technical and scientific ones.
This is by no means a comprehensive review of the literature, in fact, most of the cited literature herein was observed in Twitter under the #CompChem and #COVID combined hashtags; Summarizing the research by the CompChem community on COVID-19 related topics in a single blog-post would be near to impossible—I trust a book is being written on it as I type these lines.
The structural elucidation of the proteins associated to the SARS-CoV-2 virus is probably the first step required in designing chemical compounds capable of modifying their functions and altering their life-cycle without altering the biochemistry of the hosts. The Coronavirus Structural Taskforce has elucidated the structure of 28 proteins of SARS-CoV-2 aside from the 300+ proteins from the previous SARS-CoV virus using the tools from the FoldIt at home game based on the Rosetta program to heuristically predict the structure of these proteins. Structure based drug design rely on the knowledge of the structure of the active site (hence the name), but in the case of newly discovered proteins for which homology modeling is not entirely feasible, a ligand-based approach named D3Similarity was developed early in the pandemic for identifying the possible active sites by the group of Prof. Zhijian Xu. Mapping of the of the viral genome and proteome was also achieved early on during the first dates of lockdown in the American continent. The information was readily made available and usable for further studies which prompts another challenge: the rapid dissemination, review and evaluation of information to make scientifically sound claims and make data-based decisions. In this regard, the role of preprints cannot be stressed enough. Without a rapid communication, scientific results cannot generate a much needed critical mass to turn all these data into knowledge. As evidenced by the vast majority of the links present in this post, ChemRXiv from the ACS served the much needed function to gather, link and put the data for scientific evaluation out there in order to accelerate the discovery of solutions to the various steps of the virus’ reproductive cycle through various strategies.
The role of supercomputing has been paramount worldwide to the various efforts made in CompChem (read the C&EN piece) in various fronts from structural elucidation, such as the AI driven structural modelling of spike proteins and their infection mechanism led by Prof. Rommie Amaro (UCSD) and Dr. Arvind Ramanathan which was celebrated by the Bell Prize, to development of vaccines. Many Molecular Dynamics simulations have been performed on potential inhibitors of proteins such as the spike protein, in some cases these simulations coupled with cryo-EM microscopy allowed for the elucidation of the hinging mechanism of these spike proteins, their thermodynamic properties, and all atoms-simulations assessed the rigidity of the receptor as the cause of its infectivity. Still, owning these computing resources isn’t always cost effective; that’s why there have been outsourced to companies such as Amazon web services as Pearlman did for the QM/DFT calculations of the binding energy of several drug candidates for the inhibition of the virus’ main protease (MPro). Many other CADD studies are available (here, here, and here). Researchers from all around the world can chip in and join the effort by reaching out to the COVID-19 High Performance Computing Consortium (HPC) which brings together some of the most advanced computing systems to the hands of private and academic researchers with relevant projects aimed to the study of the virus. On the other side of the Atlantic, the Partnership for Advanced Computing in Europe (PRACE) also provides access to advanced computing services for research. As an effort to keep all the developing information curated and concentrated, the COVID-19 Molecular Structure and Therapeutics Hub was created to provide a community-driven data repository and curation service for molecular structures, models, therapeutics, and simulations related to computational research related to therapeutic opportunities.
As described above, molecular dynamics simulations are capital in the assessment of how drugs interact with proteins. But molecular dynamics can only do so much as they’re computing intensive so, the use of Polarizable Force Fields (PFF) algorithms to obtain results in the microseconds regime with high-resolution sampling methods which have been applied also to the modeling of the MPro protein; the phase space is sampled by different MD trajectories which are then tested and selected. Aside from classical simulations, artificial intelligence predictions and docking calculations, also quantum mechanical calculations have been employed in the search for the most intimate interactions governing the mechanisms of inhibition of proteins. In this front, a Fragment Molecular Orbital based analysis was carried out to find which residues in MPro interacted the most with a given inhibitor.
Virtual screening is at the heart of the computationally aided drug discovery process, specially high-throughput virtual screening such as the one performed by the group of Andre Fischer at Basel, in which 11 potential drugs were narrowed from a pool of over 600 million compounds that were analyzed as potential protease inhibitors. Repurposing of antiviral drugs, and other entry-inhibiting compounds, is also a major avenue explored in the search for treatments; in the linked study by Shailly Tomar et al. antiviral drugs which are also anti inflammatory are believed to take care of lung inflammation and injury associated to the infection at the same time they tend to disrupt the virus’ infection mechanism. The comeback of Virtual Reality can make virtual screening more cooperative even during lockdown conditions and more ‘tangible’ as the company Nanome has proven with their COVID-19 Town Hall meetings which aim to the modeling of proteins in 3D space. Aside from the de novo and repurposing efforts, the search for peptides against infection by SARS-CoV-2 was an important topic (here and here). More recently, Skariyachan and Gopal turn to natural products from herbal origins for their virtual screening (molecular docking and dynamics). In their perspective the chemical complexity achieved through biosynthesis can overcome the bottleneck of chemical discovery while at the same time turning to the ancient practices of herbal remedies described in Ayurveda. Other researchers like Manish Manish have also turned to libraries of 500,000+ natural compounds to find potential drugs for MPro.
The year is coming to an end but not the pandemic in any way. Now, with the advent of new strains, and the widespread vaccination effort put in place, it is more important than ever to keep the fight strong in our labs but also in our personal habits and responsibilities—the same advices that were given at the beginning of the year are still in effect today and will continue to be for the months to come. I want to wish everyone who reads this a happy holiday season, but above all I want to pay a small tribute to the scientists working relentlessly in one of the largest coordinated scientific efforts in modern history, one that can only be compared to the Moon landing or the Manhattan Project; to those scientists and all the healthcare personnel, may you find rest soon, may your efforts never go unnoticed: Thank you for your service.
Molecular Orbitals (MOs) are linear combinations of Atomic Orbitals (AOs), which in turn are linear combinations of other functions called ‘basis functions’. A basis, or more accurately a basis set, is a collection of functions which obey a set of rules (such as being orthogonal to each other and possibly being normalized) with which all AOs are constructed, and although these are centered on each atomic nucleus, the canonical way in which they are combined yield delocalized MOs; in other words, an MO can occupy a large space spanning several atoms at once. We don’t mind this expansion across a molecule, but what about between two molecules? Calculating the interaction energy between two or more molecular fragments leads to an artificial extra–stabilization term that stems from the fact that electrons in molecule 1 can occupy AO’s (or the basis functions which form them) centered on atoms from molecule 2.
Fundamentally, the interaction energy of any A—B dimer, Eint, is calculated as the energy difference between the dimer and the separately calculated energies for each component (Equation 1).
Eint = EAB – EA – EB (1)
However the calculation of Eint by this method is highly sensitive to the choice of basis set due to the Basis Set Superposition Error (BSSE) described in the first paragraph. The BSSE is particularly troublesome when small basis sets are used, due to the poor description of dispersion interactions but treating this error by just choosing a larger basis set is seldom useful for systems of considerable sizes. The Counterpoise method is a nifty correction to equation 1, in which EA and EB are calculated with the basis set of A and B respectively, i.e., only in EAB a larger basis set (that of A and B simultaneously) is used. The Counterpoise method calculates each component with the AB basis set (Equation 2)
EintCP = EABAB – EAAB– EBAB (2)
where the superscript AB means the whole basis set is used. This is accomplished by using ‘ghost‘ atoms with no nuclei and no electrons but empty basis set functions centered on them.
In Gaussian, BSSE is calculated with the Counterpoise method developed by Boys and Simon. It requires the keyword Counterpoise=N where N is the number of fragments to be considered (for an A—B system, N=2). Each atom in the coordinates list must be specified to which fragment pertains; additionally, the charge and multiplicity for each fragment and the whole supermolecular ensemble must be specified. Follow the example of this hydrogen fluoride dimer.
%chk=HF2.chk #P opt wB97XD/6-31G(d,p) Counterpoise=2 HF dimer 0,1 0,1 0,1 H(Fragment=1) 0.00 0.00 0.00 F(Fragment=1) 0.00 0.00 0.70 H(Fragment=2) 0.00 0.00 1.00 F(Fragment=2) 0.00 0.00 1.70
For closed shell fragments the first line is straightforward but one must pay attention that the first pair of numbers in the charge multiplicity line correspond to the whole ensemble, whereas the folowing pairs correspond to each fragment in consecutive order. Fragments do not need to be specified contiguously, i.e., you don’t need to define all atoms for fragment 1 and after those the atoms for fragment 2, etc. They could be mixed and the program still assigns them correctly. Just as an example I typed wB97XD but any other method, DFT or ab initio, may be used; only semiempirical methods do not admit a BSSE calculation because they don’t make use of a basis set in the first place!
The output provides the corrected energy (in atomic units) for the whole system, as well as the BSSE correction (which added to the previous term yields the un-corrected energy of the system). Gaussian16 also provides these values in kcal/mol as ‘Complexation energies’ first raw (uncorrected) and then the corrected energy.
BSSE is always present and cannot be entirely eliminated because of the use of finite basis sets but it can be correctly dealt with if the Counterpoise method is included.
I have written about extracting information from excited state calculations but an important consideration when analyzing the results is the proper use of the keyword density.
This keyword let’s Gaussian know which density is to be used in calculating some results. An important property to be calculated when dealing with excited states is the change in dipole moment between the ground state and any given state. The Transition Dipole Moment is an important quantity that allows us to predict whether any given electronic transition will be allowed or not. A change in the dipole moment (i.e. non-zero) of a molecule during an electronic transition helps us characterize said transition.
Say you perform a TD-DFT calculation without the density keyword, the default will provide results on the lowest excited state from all the requested states, which may or may not be the state of interest to the transition of interest; you may be interested in the dipole moment of all your excited states.
Three separate calculations would be required to calculate the change of dipole moment upon an electronic transition:
1) A regular DFT for the ground state as a reference
2) TD-DFT, to calculate the electronic transitions; request as many states as you need/want, analyze it and from there you can see which transition is the most important.
3) Request the density of the Nth state of interest to be recovered from the checkpoint file with the following route section:
# TD(Read,Root=N) LOT Density=Current Guess=Read Geom=AllCheck
replace N for the Nth state which caught your eye in step number 2) and LOT for the Level of Theory you’ve been using in the previous steps. That should give you the dipole moment for the structure of the Nth excited state and you can compare it with the one in the ground state calculated in 1). Again, if density=current is not used, only properties of N=1 will be printed.