Category Archives: Chemistry

Population Analysis in the Excited State with Gaussian


To find out what the bonding properties of a molecule are in a particular excited state, we can run any population analysis following the root of interest. This straightforward procedure takes two consecutive calculations, since you don’t necessarily know beforehand which excited state is the one of interest.

A regular Time-Dependent Density Functional Theory (TD-DFT) input for Gaussian 16 looks as follows (G09 works pretty much the same way); let us assume we’ve already optimized the geometry of the molecule:

%OldChk=filename.chk
%nprocshared=16
%chk=filename_ES.chk

#p TD(NStates=10,singlets) wb97xd/cc-pvtz geom=check guess=read

Title Card Required

0 1
--blank line--

This input file retrieves the geometry and wavefunction of a previous calculation from filename.chk without writing anything new into it (that is what %OldChk=filename.chk means), and creates a new checkpoint file for the excited-state calculation (%chk=filename_ES.chk).

In the output, search for the transition that piques your interest; more often than not you’ll be interested in the one with the highest oscillator strength, f. The oscillator strength is a dimensionless number that represents the ratio of the observed, integrated absorption coefficient to that calculated for a single electron in a three-dimensional harmonic potential [Harris & Bertolucci, Symmetry and Spectroscopy]; in other words, it is related to the probability of that transition occurring, and for single-photon absorption processes it typically takes values from 0.0 to 1.0.

The output of this calculation looks as follows; the value of f for every excitation is reported together with its energy and the orbital transitions that make it up.

 Excitation energies and oscillator strengths:

 Excited State   1:      Singlet-A      3.1085 eV  398.86 nm  f=0.0043  <S**2>=0.000
      56 -> 59        -0.11230
      58 -> 59         0.69339
 This state for optimization and/or second-order correction.
 Total Energy, E(TD-HF/TD-DFT) =  -1187.56377917
 Copying the excited state density for this state as the 1-particle RhoCI density.

 Excited State   2:      Singlet-A      4.0827 eV  303.68 nm  f=0.0016  <S**2>=0.000
      52 -> 59         0.46689
      52 -> 64        -0.20488
      53 -> 59         0.19693
      54 -> 59         0.40414
      54 -> 64        -0.16261
...
... 
Excited State   8:      Singlet-A      5.2345 eV  236.86 nm  f=0.8063  <S**2>=0.000
      52 -> 60         0.17162
      53 -> 59         0.47226
      53 -> 60        -0.11771
      54 -> 59        -0.27658
      54 -> 60        -0.22006
      55 -> 59         0.20496
      56 -> 59         0.15029
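
Picking the brightest state by eye works, but the search can also be scripted. What follows is a minimal Python sketch, not part of the original procedure: it assumes the "Excited State" header lines keep the layout shown above, and the names ExcitedState, parse_excited_states and brightest_state are made up for illustration.

```python
import re
from typing import List, NamedTuple


class ExcitedState(NamedTuple):
    root: int             # state number as printed by Gaussian
    label: str            # e.g. "Singlet-A"
    energy_ev: float      # excitation energy in eV
    wavelength_nm: float  # excitation wavelength in nm
    f: float              # oscillator strength


# Matches header lines such as:
#  Excited State   8:      Singlet-A      5.2345 eV  236.86 nm  f=0.8063  <S**2>=0.000
STATE_RE = re.compile(
    r"Excited State\s+(\d+):\s+(\S+)\s+([\d.]+)\s+eV\s+([\d.]+)\s+nm\s+f=([\d.]+)"
)


def parse_excited_states(text: str) -> List[ExcitedState]:
    """Collect every excited-state header line found in a TD-DFT output."""
    return [
        ExcitedState(int(m.group(1)), m.group(2), float(m.group(3)),
                     float(m.group(4)), float(m.group(5)))
        for m in STATE_RE.finditer(text)
    ]


def brightest_state(states: List[ExcitedState]) -> ExcitedState:
    """Return the state with the largest oscillator strength."""
    if not states:
        raise ValueError("no excited states found in the output")
    return max(states, key=lambda s: s.f)


# Header lines copied from the output above
SAMPLE_OUTPUT = """
 Excited State   1:      Singlet-A      3.1085 eV  398.86 nm  f=0.0043  <S**2>=0.000
 Excited State   2:      Singlet-A      4.0827 eV  303.68 nm  f=0.0016  <S**2>=0.000
 Excited State   8:      Singlet-A      5.2345 eV  236.86 nm  f=0.8063  <S**2>=0.000
"""

best = brightest_state(parse_excited_states(SAMPLE_OUTPUT))
print(f"Root {best.root}: f={best.f} at {best.wavelength_nm} nm")
```

For a real job you would pass the full text of the log file to parse_excited_states instead of the sample string.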

Having selected excited state #8 because it has the largest value of f of the lot, we use the following input to read in the geometry from the old checkpoint file, and we generate a new checkpoint in case we need it for something else. The input file for doing all this looks as follows (I’ve selected, as usual, the Natural Bond Orbital population analysis):

%oldchk=filename_ES.chk
%nprocshared=16
%chk=filename_nbo.chk

#p TD(Read,Root=8) wb97xd/cc-pvtz geom=check density=current guess=read pop=NBORead

Title Card Required

0 1

$NBO BOAO BNDIDX E2PERT $END

--blank line--

The keywords at the bottom request the calculation of Wiberg Bond Indices (BNDIDX), the Bond Order matrix in the Atomic Orbital basis (BOAO), and a second-order perturbation theory analysis of the electronic delocalization (E2PERT). Now we can compare the population analysis of the ground state and the 8th excited state; check figure 1 and notice the differences in Wiberg’s bond order for this complex made of two molecules and one Na+ cation.

Figure 1. Natural Population Analysis comparison for a supramolecular arrangement. Numbers in brackets correspond to the sum of charges for each molecule. Notice the significant change in charges for each molecule when going from the ground to the 8th excited state.

In this example we can observe that in the ground state we have a neutral and a negative molecule together with a Na+ cation, but when we analyze the population in the 8th excited state both molecules acquire a similar charge, ca. 0.46e, which means that some of the electron density has been transferred from the negative one to the neutral molecule, forming an Electron Donor-Acceptor complex (EDA) in the excited state.

This procedure can be extended to any other kind of population analysis and their derived quantities; e.g., one could calculate condensed Fukui functions in the Nth excited state. But beware! These calculations yield vertical excitations. Should the excited state of interest have a minimum, we can first optimize the excited-state geometry and then perform the population analysis on that geometry; just add the opt keyword to perform both jobs in one go. Bear in mind that the NBO population analysis is then performed both before and after the optimization, so look for the tables and values closer to the end of the output file.
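
The second-step input can also be generated from the chosen root. A hedged sketch in Python follows; build_population_input and its parameters are hypothetical helper names, and the route line simply reuses the keywords from the input shown earlier, appending opt when an excited-state optimization is wanted.

```python
def build_population_input(root, old_chk, new_chk, nproc=16, optimize=False,
                           method="wb97xd/cc-pvtz", charge=0, multiplicity=1,
                           nbo_keywords=("BOAO", "BNDIDX", "E2PERT")):
    """Return the text of a Gaussian input for an NBO analysis of one TD-DFT root.

    Hypothetical helper: it only assembles the same sections used in the
    hand-written input (Link 0 lines, route, title, charge/multiplicity,
    NBO keylist).
    """
    route = (f"#p TD(Read,Root={root}) {method} "
             "geom=check density=current guess=read pop=NBORead")
    if optimize:
        route += " opt"  # optimize the excited-state geometry in the same job

    lines = [
        f"%oldchk={old_chk}",
        f"%nprocshared={nproc}",
        f"%chk={new_chk}",
        route,
        "",                                # blank line closes the route section
        "Title Card Required",
        "",
        f"{charge} {multiplicity}",        # geometry itself comes from the checkpoint
        "",
        "$NBO " + " ".join(nbo_keywords) + " $END",
        "",                                # input must end with a blank line
    ]
    return "\n".join(lines) + "\n"


print(build_population_input(8, "filename_ES.chk", "filename_nbo.chk"))
```

Calling it with optimize=True yields the same file with opt appended to the route line.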

In the case of open shell systems the procedure is the same but one should be extremely careful in searching for the total population analysis since the output file contains this table for the alpha and beta populations separately as well as the added values for the total number of electrons.


Au(I) Chemistry No.3 – New paper in Dalton Transactions


Stabilizing gold in low oxidation states is a longstanding challenge in organometallic chemistry. To do so, fine tuning of the electron density provided to the Au atom by a ligand via the formation of a σ bond is required. The group of Professor Rong Shang at the University of Nagasaki has accomplished the stabilization of an aurate complex through the use of a boron, nitrogen-containing heterocyclic carbene. DFT calculations at the wB97XD/(LANL2TZ(f),6-311G(d)) level of theory revealed that this ligand exhibits a high π-withdrawing character in the neutral 4π B,N-heterocyclic carbene (BNC) moiety and a weakly aromatic 6π character with π-donating properties, implying that this is the first cyclic carbene ligand that can be tuned between π-withdrawing (Fischer-type) and π-donating (Schrock-type) kinds.

A π-withdrawing character on the part of the ligand is important to allow the electron-rich gold center to back-donate some of its excess electron density, thus preventing its oxidation. A modification of Bertrand’s cyclic (alkyl)(amino)carbene (CAAC) has allowed Shang and co-workers to perform the two-electron reduction of Au(I) to form the aurate shown in figure 1 (CCDC 2109027). This work also reports the modular synthesis of the BNC-1 ligand; the mechanism was calculated, once again, by Leonardo “Leo” Lugo.

Figure 1. Compound 4a (H atoms omitted for clarity)

The ability of the BNC-1 ligand to accept gold’s back donation is reflected in the HOMO/LUMO gap, as shown in Figure 2: while BNC-1 has a gap of 7.14 eV, the classic NHC carbene has a gap of 11.28 eV; furthermore, in the case of NHC the accepting orbital is not the LUMO but the LUMO+1. Additionally, the NBO delocalization energies show that the back donation from the Au 5d orbital to the C-N antibonding π* orbital is about half that expected for a Fischer-type carbene, suggesting a character intermediate between a π-accepting and a π-donating carbene. On the other hand, the largest interaction corresponds to the carbanion density donated to the vacant Au p orbital (ca. 45 kcal/mol). All these observations reveal the successful tuning of the electron density on BNC-1.

Figure 2. Frontier Molecular Orbitals for the ligand BNC-1 and a comparison to similar carbenes used elsewhere

This study is available in Dalton Transactions. As usual, I’m honored to be a part of this international collaboration, and I’m deeply thankful to the amazing Prof. José Oscar Carlos Jiménez-Halla for inviting me to be a part of it.

Yoshitaka Kimura, Leonardo I. Lugo-Fuentes, Souta Saito, J. Oscar C. Jimenez-Halla, Joaquín Barroso-Flores, Yohsuke Yamamoto, Masaaki Nakamoto and Rong Shang* “A boron, nitrogen-containing heterocyclic carbene (BNC) as a redox active ligand: synthesis and characterization of a lithium BNC-aurate complex”, Dalton Trans., 2022, 51, 7899-7906 https://doi.org/10.1039/D2DT01083F

What do we talk about when we talk about molecules?


Molecules. Atoms glued by bonds; nuclei incarcerated by electrons; electrons forming an inhomogeneous gas contained not by outer walls but by an electrostatic potential in its interior ironically named ‘external potential’. Molecules. The study object of chemists. The fundamental construct on which the chemical understanding of the universe relies.

Ten electrons, ten protons, and ten neutrons, giving rise to various electronic densities, various chemical properties: CH4, NH3, H2O, HF; which is it?

Atoms are letters, molecules are words; Chemistry, their unabashed poetry.

DFT beyond academia


Density Functional Theory is by far the most successful way of gaining access to molecular properties starting from their composition. Calculating the electronic structure of molecules or solid phases has become a widespread activity in computational as well as experimental labs, not only for shedding light on the properties of a system under study but also as a tool to design systems with tailor-made properties. This level of understanding of matter brought by DFT is based on a rigorous physical and mathematical development; still, and maybe because of it, DFT (and electronic structure calculations in general, for that matter) might be thought of as something of little use outside academia.

Prof. Juan Carlos Sancho-García, from the University of Alicante in Spain, encouraged me to talk to his students last month about the reach of DFT in the industrial world. Having once worked in the private sector myself, I remembered that the simulations performed there were mostly DPD (Dissipative Particle Dynamics), a coarse-grained kind of molecular dynamics, for investigating the interactions between polymers and surfaces, but no DFT calculations were ever in sight. It is well known that Docking, QSAR, and Molecular Dynamics are widely used in the pharma industry for the development of new drugs, but I wasn’t sure where DFT could fit in all this. I thought a patent search would be a good descriptor of the commercial applicability of DFT, so I took a shallow dive and searched for patents explicitly mentioning the use of DFT as part of the invention development process and protection. The first thing I noticed is that, although there appear to be only a few, their numbers have been growing over the years (Figure 1). Again, this was not an exhaustive search, so I’m obviously overlooking many.

Figure 1 – A non-exhaustive search in a patents database

The second thing that caught my attention was that the first hit came from 1998, nicely coinciding with the rise of B3LYP (Figure 2). This patent was awarded to Australian inventors from the University of Wollongong, New South Wales, for determining trace gas concentrations by chromatography by means of calculating the FT-IR spectra of sample molecules (Figure 3); so DFT is used as part of the invention, but I don’t know whether this is a widespread method in analytical labs.

Figure 2 – B3LYP cited in scientific publications

While I’m mentioning the infamous B3LYP functional, a search about it in patents yields the following graph (Figure 4), most of which relate to the protection of photoluminescent or thermoluminescent molecules for light emitting devices; it appears that DFT calculations are used to provide the key features of their protection, such as HOMO-LUMO gap etc.

Figure 4 – Patents bearing B3LYP as part of their invention

So what about software? Most of the more recent patents in Figure 1 (2018 – 2022) lie in the realm of electronics, particularly the development of semiconductors, ceramic or otherwise, so it was safe to assume VASP could be a popular choice to that end, right? It turns out that’s not necessarily the case, since a patent search for VASP accounts for only about 10% of all awarded patents (Figure 5).

Figure 5 – VASP in patents

I guess it’s safe to say by now that DFT has a significant impact on industrial development, and one could only expect it to keep on rising; however, the advent of machine learning techniques and other artificial intelligence related methods promises an accelerated development. I went back to the patents database and this time searched for ‘machine learning development materials‘ (the term ‘development’ was deleted by the search engine; I guess it found it too obvious), and its rise is quite remarkable, surpassing the frequency of DFT in patents (Figure 6), particularly in the past 5 years (2018 – 2022).

Figure 6 – The rise of the machines in materials development

I’m guessing that in some instances DFT and ML will tend to go hand in hand in the industrial development process, but the timescales reachable by ML will only tend to grow, so I’m left with the question: what are we waiting for to make ML and AI part of the chemistry curricula? As computational chemistry teachers we should start talking about these points with our students and convince the heads of departments to help us create proper courses, or we risk our graduates becoming niche scientists at a time when new skills are sought after in the private sector.

__________________________________________________________________________________

Thanks again to Prof. Juan Carlos Sancho García at the University of Alicante, Spain, who asked me to talk about the subject in front of his class, and to Prof. José Pedro Cerón-Carrasco from Cartagena for allowing me to talk about this and other topics at Centro Universitario de la Defensa. Thank you, guys! I look forward to meeting you again soon.

Worldwide CompChem in the Fight against COVID-19


The war against COVID-19 has been waged on many fronts. The computational chemistry community has done its share during this pandemic to put forward a cure, a vaccine, or a better understanding of the molecular mechanisms behind human infection by the SARS-CoV-2 virus. A few vaccines are now rearing their heads and starting to make their way around the globe to stop the spread, amidst a climate of disinformation, distrust and political upheaval, all of which pose several challenges yet to be faced aside from the technical and scientific ones.

This is by no means a comprehensive review of the literature; in fact, most of the literature cited herein was spotted on Twitter under the combined #CompChem and #COVID hashtags. Summarizing the research by the CompChem community on COVID-19 related topics in a single blog post would be nearly impossible; I trust a book is being written on it as I type these lines.

The structural elucidation of the proteins associated with the SARS-CoV-2 virus is probably the first step required in designing chemical compounds capable of modifying their functions and altering the viral life cycle without altering the biochemistry of the hosts. The Coronavirus Structural Taskforce has elucidated the structure of 28 proteins of SARS-CoV-2, aside from the 300+ proteins from the previous SARS-CoV virus, using the tools from the FoldIt at home game, based on the Rosetta program, to heuristically predict the structure of these proteins. Structure-based drug design relies on knowledge of the structure of the active site (hence the name), but for newly discovered proteins for which homology modeling is not entirely feasible, a ligand-based approach named D3Similarity was developed early in the pandemic by the group of Prof. Zhijian Xu to identify the possible active sites. Mapping of the viral genome and proteome was also achieved early on, during the first days of lockdown in the American continent. The information was readily made available and usable for further studies, which prompts another challenge: the rapid dissemination, review and evaluation of information to make scientifically sound claims and data-based decisions. In this regard, the role of preprints cannot be stressed enough. Without rapid communication, scientific results cannot generate the much needed critical mass to turn all these data into knowledge. As evidenced by the vast majority of the links present in this post, ChemRxiv from the ACS served the much needed function of gathering, linking and putting the data out there for scientific evaluation, in order to accelerate the discovery of solutions to the various steps of the virus’ reproductive cycle through various strategies.

The role of supercomputing has been paramount worldwide to the various efforts made in CompChem (read the C&EN piece) on various fronts, from structural elucidation, such as the AI-driven structural modelling of spike proteins and their infection mechanism led by Prof. Rommie Amaro (UCSD) and Dr. Arvind Ramanathan, which was celebrated with the Gordon Bell Prize, to the development of vaccines. Many Molecular Dynamics simulations have been performed on potential inhibitors of proteins such as the spike protein; in some cases these simulations, coupled with cryo-EM microscopy, allowed for the elucidation of the hinging mechanism of these spike proteins and their thermodynamic properties, and all-atom simulations assessed the rigidity of the receptor as the cause of its infectivity. Still, owning these computing resources isn’t always cost effective; that’s why calculations have been outsourced to companies such as Amazon Web Services, as Pearlman did for the QM/DFT calculations of the binding energy of several drug candidates for the inhibition of the virus’ main protease (MPro). Many other CADD studies are available (here, here, and here). Researchers from all around the world can chip in and join the effort by reaching out to the COVID-19 High Performance Computing Consortium (HPC), which brings some of the most advanced computing systems into the hands of private and academic researchers with relevant projects aimed at the study of the virus. On the other side of the Atlantic, the Partnership for Advanced Computing in Europe (PRACE) also provides access to advanced computing services for research. In an effort to keep all the developing information curated and concentrated, the COVID-19 Molecular Structure and Therapeutics Hub was created to provide a community-driven data repository and curation service for molecular structures, models, therapeutics, and simulations related to computational research on therapeutic opportunities.

As described above, molecular dynamics simulations are capital in assessing how drugs interact with proteins. But molecular dynamics can only do so much, as they’re computationally intensive; hence the use of Polarizable Force Field (PFF) algorithms together with high-resolution sampling methods to obtain results in the microsecond regime, which have also been applied to the modeling of the MPro protein; the phase space is sampled by different MD trajectories, which are then tested and selected. Aside from classical simulations, artificial intelligence predictions and docking calculations, quantum mechanical calculations have also been employed in the search for the most intimate interactions governing the mechanisms of protein inhibition. On this front, a Fragment Molecular Orbital based analysis was carried out to find which residues in MPro interact the most with a given inhibitor.

Virtual screening is at the heart of the computationally aided drug discovery process, especially high-throughput virtual screening such as that performed by the group of Andre Fischer at Basel, in which a pool of over 600 million compounds analyzed as potential protease inhibitors was narrowed down to 11 potential drugs. Repurposing of antiviral drugs and other entry-inhibiting compounds is also a major avenue explored in the search for treatments; in the linked study by Shailly Tomar et al., antiviral drugs which are also anti-inflammatory are believed to take care of the lung inflammation and injury associated with the infection while disrupting the virus’ infection mechanism. The comeback of Virtual Reality can make virtual screening more cooperative, even during lockdown conditions, and more ‘tangible’, as the company Nanome has proven with their COVID-19 Town Hall meetings, which aim at modeling proteins in 3D space. Aside from the de novo and repurposing efforts, the search for peptides against infection by SARS-CoV-2 was an important topic (here and here). More recently, Skariyachan and Gopal turned to natural products of herbal origin for their virtual screening (molecular docking and dynamics). In their view, the chemical complexity achieved through biosynthesis can overcome the bottleneck of chemical discovery while at the same time drawing on the ancient practices of herbal remedies described in Ayurveda. Other researchers, like Manish Manish, have also turned to libraries of 500,000+ natural compounds to find potential drugs for MPro.

The year is coming to an end, but the pandemic is not. Now, with the advent of new strains and the widespread vaccination effort put in place, it is more important than ever to keep the fight strong in our labs but also in our personal habits and responsibilities; the same advice that was given at the beginning of the year is still in effect today and will continue to be for the months to come. I want to wish everyone who reads this a happy holiday season, but above all I want to pay a small tribute to the scientists working relentlessly in one of the largest coordinated scientific efforts in modern history, one that can only be compared to the Moon landing or the Manhattan Project; to those scientists and all the healthcare personnel, may you find rest soon, may your efforts never go unnoticed: Thank you for your service.

Aurides Chemistry – New Paper in Organometallics


Compound 2 represents the first structural example of a 12 e− auride complex, with a pseudohalide/hydride nature in bonding. According to our NBO calculations, this electron-deficient gold center is stabilized by weak intramolecular interactions between Au p orbitals and σC−C and σC−H bonds of adjacent aromatic rings, together with a Ga−Au−Ga three-center, two-electron bond (I like the term ‘banana bond‘, don’t you?).

Fig. 1 Crystal structure for Compound 2. Au in the center is effectively an auride.

I was invited to participate in this wonderful venture by my good friend and colleague Dr. José Oscar Carlos Jiménez-Halla, from the University of Guanajuato, Mexico, with whom I’m now working alongside Prof. Rong Shang at Hiroshima University. Prof. Shang has synthesized this portentous auride complex and, over the last year, Leonardo “Leo” Lugo has worked with Oscar and me on calculating its electronic structure and bonding properties.

Gold catalysis is an active area of research, but low-valent Au compounds are electron deficient and therefore highly reactive and elusive; that’s why researchers prefer to synthesize these compounds in situ, to harness their catalytic properties before they’re lost. Power’s digalladeltacyclane was used as a ligand framework to bind to a Au(I) center, which became reduced after the addition and breaking of the Ga−Ga bond, while the opposite face of the metallic center became blocked by the bulky aromatic groups on the main ligand. NBO calculations at the M05-2X/[LANL2TZ(f),6-311G(d,p)] level of theory and QTAIM BCP analysis show the main features of Au bonding in 2; noteworthy features are the 3c−2e (banana) bond and the σC−C and σC−H donations (see figure 2).

Fig.2 Natural Hybrid Composition for the Ga−Au−Ga ‘banana‘ bond (left). Bond Critical Points (BCPs) for Au in 2 (right).

One of the most interesting features of this compound is the fact that Au(PPh3)Cl reacts differently with the digallane ligand than it does with analogous B−B, Si−Si, or Sn−Sn bonds. The Au−Cl bond does not undergo metathesis as with B−B, nor does it undergo an oxidative addition, so to further understand the chemistry of, and leading to, compound 2, the reaction mechanism energy profile was calculated in a rather painstaking effort (Kudos, Leo, and a big shoutout to my friend Dr. Jacinto Sandoval for his one-on-one assistance). Figure 3 shows the energy profile for the reaction mechanism for the formation of 2 from Power’s digallane reagent and Au(PPh3)Cl.

Fig. 3 Free Energy profile for the formation of 2. All values, kcal/mol

You can read more details about this research in Organometallics DOI:10.1021/acs.organomet.0c00557. Thanks again to Profs. Rong Shang and Óscar Jiménez-Halla for bringing me on board of this project and to Leo for his relentless work getting those NBO calculations done; this is certainly the beginning of a golden opportunity for us to collaborate on a remarkable field of chemistry, it has certainly made me go bananas over Aurides chemistry. OK I’ll see myself out.

Mario Molina, Nobel Laureate. Rest In Peace


Prof. Mario Molina was awarded the Nobel Prize in Chemistry in 1995, the same year I started my chemistry education at the chemistry school from the National Autonomous University of Mexico, UNAM, the same school from where he got his undergraduate diploma. To be a chemistry student in the late nineties in Mexico had Prof. Molina as a sort of mythical reference, something to aspire to, a role model, the sort of representation the Latinx and other underrepresented communities still require and seldom get.

I saw him several times at UNAM, where he’d pack any auditorium almost once a year to talk about various research topics, but I distinctly remember the first time I sort of interacted with him. It was 1997 and I attended my first congress, the 5th North America Chemistry Congress. Minutes before the official inauguration, which he was supposed to preside over, I caught a glimpse of him in the hallways near the main conference room. Being only 19 years old, I thought it’d be a good idea to chase him and ask for his autograph and a picture. He was kind enough not to brush me off and took just a minute to shake my hand, sign my book of abstracts, and get his picture taken with me. But cameras back then relied on the user to place a roll of film correctly. I did not; so the picture, although it was taken, doesn’t exist. Because of this and other anecdotes, that congress cemented my love for chemistry. I never asked for a second picture on the few subsequent occasions I had the pleasure of hearing him talk.

Prof. Molina was an advocate of green and sustainable sources of energy. His work predicted the existence of a hole in the ozone layer, and his struggle brought about the banning of CFCs and other substances which interfere with the replenishment of ozone in the sub-stratosphere. Today, his legacy remains, but so do his pending battles in the quest for new policies that favor the use of green alternative forms of energy. May he rest in peace and may we continue his example.

The #LatinxChem Twitter Poster Contest


For the past few weeks, some chemists of the worldwide Latinx community have been cooking up an online project devoted to showcasing the important contributions to chemistry made by workers, students, and researchers of Latin American origin.

The result is the #LatinXChem Twitter Poster Contest which will take place 7th September during a 24 hour span and the corresponding Twitter account @latinxchem (go follow it now! I’ll wait right here.)

All chemists of Latinx origin are called to participate by registering their posters on our website latinxchem.org before August 25th. Upon registration, each poster should be classified into one of the eleven categories available and use the corresponding hashtag during the event (e.g. #LatinxchemTheo for the readers of this blog), in which prominent Latinx chemists will serve as reviewers and cast their votes for the best one in each category. Some prizes will be available, thanks to our kind sponsors (RSC, Chemical Science, ACS, Carbomex, The Brazilian Chemical Society, and more to come), but only for registered works; anyone who wishes to present a poster without registering on the website can do so, but eligibility for prizes is reserved for those who complete the registration. Official languages for the posters are Spanish, Portuguese, and English.

Each category is organized by young prominent Latinx chemists; for the particular case of Computational Chemistry –the recurring theme of this blog– Prof. Fernanda Duarte (@fjduarteg) from Chile now working at Oxford University in the UK and yours truly (@joaquinbarroso) will be in charge of the #LatinXChemTheo section. Please check the website to learn about the other sections and the wonderful people working hard in the organizing committee (see below for the full list of the organizers and their Twitter handles).

The main goal of the event is to celebrate and showcase the spectacular research, education, and innovation brought to chemistry by a large and vibrant community of Latinx identification dispersed throughout the globe. We want to celebrate diversity by showcasing our contributions in the context of a global science interconnected with people from other groups.

So please visit our website, help us spread the word and get those posters ready, we’re eager to read, comment, Tweet and Retweet your work and show the world the drive and passion of Latinxs for chemistry, knowledge, and the betterment of the world through science.

Go follow us all and of course @LatinXchem too!

¡Gracias! Obrigado! Thank you!

Gabriel Merino Cinvestav Mérida, México @theochemmerida
Miguel A. Méndez-Rojas UDLAP, México @nanoprofe
Joaquín Barroso UNAM, México @joaquinbarroso
Javier Vela Iowa State University, USA @vela_group
Diego Solís-Ibarra UNAM, México @piketin
Braulio Rodríguez-Molina UNAM, México @MolinaGroup
Paula X. García-Reynaldos Science Communicator, México @paux_gr
Liliana Quintanar Cinvestav Zacatenco, México @lilquintanar
María Gallardo-Williams North Carolina State University, USA @Teachforaliving
Fernanda Duarte University of Oxford, UK @fjduarteg
Yadira Vega Tec de Monterrey, México @yivega
Gabriel Gomes University of Toronto, Canadá @gpassosgomes
Luciana Oliveira UNICAMP, Brasil @LuBruGonzaga
Cesar A. Urbina-Blanco Ghent University, Belgium @cesapo
Ariane Nunes HITS, Germany @anunesalves
Walter Waldman Brazil, @waldmanlab

DFT Estimation of pKb Values – New Paper in JCIM


As a continuation of our previous work on estimating pKa values from DFT calculations for carboxylic acids, we now present the complementary pKb values for amino groups by the same method, and the coupling of both methodologies for predicting the isoelectric point -pI- values of amino acids as a proof of concept.

Analogously to our work on pKa, we now used the Minimum Surface Electrostatic Potential, VS,min, as a descriptor of the availability of nitrogen’s lone pair and correlated it with the experimental basicity of a large number of amines, separated into three groups: primary, secondary and tertiary amines.

Interestingly, the correlation coefficient between experimental and calculated pKb values decreases in the following order: primary (R2 = 0.9519) > secondary (R2 = 0.9112) > tertiary (R2 = 0.8172). This could be due to steric effects, the change in s-character of the lone pair, or just plain old selection bias. Nevertheless, there is a good correlation between both sets of values, and the resulting equations can predict the pKb value of an amino group to within less than one unit, which is very good for a statistical method that does not require the calculation of a full thermodynamic cycle.

We then took thirteen amino acids (those without titratable side chains) and calculated simultaneously VS,min and VS,max for the amino and the carboxyl group (this latter with the use of equation 2 from our previous work published in Molecules MDPI) and the arithmetical average of both gave us their corresponding pI values with an agreement of less than one unit.
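
The averaging step can be written out explicitly. The sketch below is an assumption about how the two estimates combine, not a quote from the paper: the predicted pKb of the amino group is first converted to the pKa of its conjugate acid through pKa + pKb = pKw (14.0 in water at 25 °C), and the result is averaged with the predicted carboxyl pKa. The function name is illustrative, and the glycine numbers are textbook experimental values, not results from this work.

```python
PKW_25C = 14.0  # pKw of water at 25 °C


def isoelectric_point(pka_carboxyl, pkb_amino, pkw=PKW_25C):
    """Estimate pI for an amino acid without titratable side chains.

    pka_carboxyl: pKa of the carboxyl group
    pkb_amino:    pKb of the amino group
    """
    # pKa of the protonated amino group (conjugate acid of the amine)
    pka_ammonium = pkw - pkb_amino
    # pI is the arithmetic mean of the two pKa values
    return (pka_carboxyl + pka_ammonium) / 2.0


# Glycine, textbook values: pKa(COOH) = 2.34, pKa(NH3+) = 9.60, i.e. pKb = 4.40
print(round(isoelectric_point(2.34, 14.0 - 9.60), 2))  # prints 5.97
```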

This work is now available at the Journal of Chemical Information and Modeling (DOI: 10.1021/acs.jcim.9b01173); as always a shoutout is due to the people working on it: Leonardo “Leo” Lugo, Gustavo “Gus” Mondragón and leading the charge Dr. Jacinto Sandoval-Lira.

Estimation of pKa Values through Local Electrostatic Potential Calculations


Calculating the pKa value for a Brønsted acid is very hard, like really hard. A full thermodynamic cycle (fig. 1) needs to be calculated along with a high-accuracy solvation free energy for each of the species under consideration, not to mention the use of expensive methods, which will be reviewed here in another post in two weeks’ time.

Fig 1. Thermodynamic Cycle for the pKa calculation of any given Bronsted acid, HA
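
For reference, the relation behind such a cycle can be sketched in a few lines of Python. This is a generic textbook form, not the protocol of any specific paper: the aqueous deprotonation free energy is assembled from a gas-phase term and solvation free energies, and then converted with pKa = ΔG/(RT ln 10). Function and argument names are illustrative, and standard-state corrections are deliberately left out.

```python
import math

R_KCAL_PER_MOL_K = 1.987204e-3  # gas constant in kcal/(mol K)


def aqueous_deprotonation_free_energy(dg_gas, dg_solv_a_minus,
                                      dg_solv_h_plus, dg_solv_ha):
    """ΔG(aq) for HA -> A- + H+ from the cycle; all terms in kcal/mol.

    Standard-state corrections are omitted in this sketch.
    """
    return dg_gas + dg_solv_a_minus + dg_solv_h_plus - dg_solv_ha


def pka_from_free_energy(dg_aq, temperature=298.15):
    """Convert an aqueous deprotonation free energy (kcal/mol) into a pKa."""
    return dg_aq / (R_KCAL_PER_MOL_K * temperature * math.log(10))


# An error of 1 kcal/mol in ΔG(aq) shifts the pKa by about 0.73 units at 298.15 K
print(round(pka_from_free_energy(1.0), 2))  # prints 0.73
```

The steep conversion factor is why small errors in the solvation terms matter so much for the final pKa.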

Finding descriptors that help us circumvent the need for such sophisticated calculations can help a great deal in estimating the pKa value of any given acid. We’ve been interested in the reactivity of σ-hole bearing groups in the past and, just like Halogen, Tetrel, Pnicogen and Chalcogen bonds, Hydrogen bonds are highly directional and their strength depends on the polarization of the O-H bond. Therefore, we suggested the use of the maximum surface electrostatic potential (VS,max) on the acid hydrogen atom of carboxylic acids as a descriptor for the strength of their interaction with water, the first step in the deprotonation process.

We selected six basis sets and six methods (five density functionals plus MP2), for a total of thirty-six levels of theory, to optimize and calculate VS,max on thirty carboxylic acids, for a grand total of 1,080 wavefunctions, which were later passed on to Multiwfn (all calculations were carried out with PCM = water). Comparison with the experimental pKa values showed a great correlation across the levels of theory (R2 > 0.9), except for B3LYP. Still, the best correlations were obtained with LC-wPBE/cc-pVDZ and wB97XD/cc-pVDZ. From the latter level of theory the linear correlation yielded the following equation:

pKa = -0.2185(VS,max) + 16.1879

Differences in pKa turned out to be less than 0.5 units, which is remarkable for such a straightforward method; bear in mind that calculation of full thermodynamic cycles above chemical accuracy (1.0 kcal/mol) yields pKa differences above 1.0 units.
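
As a quick illustration, the equation above fits in a couple of Python functions. This is only a sketch: the function names are made up, VS,max is assumed to be expressed in kcal/mol (the units are not stated in this post), and the input value in the example is an arbitrary number chosen for the arithmetic, not a computed VS,max for any real acid.

```python
# Coefficients of the linear fit quoted above (wB97XD/cc-pVDZ)
SLOPE = -0.2185
INTERCEPT = 16.1879


def pka_from_vs_max(vs_max):
    """Estimate the pKa of a carboxylic acid from VS,max (assumed kcal/mol)."""
    return SLOPE * vs_max + INTERCEPT


def vs_max_from_pka(pka):
    """Inverse relation: the VS,max that the fit associates with a given pKa."""
    return (pka - INTERCEPT) / SLOPE


# Arbitrary illustrative input: 50.0 -> 16.1879 - 0.2185 * 50.0
print(round(pka_from_vs_max(50.0), 2))  # prints 5.26
```

The fit was obtained for carboxylic acids only, so applying it to other functional groups would be an extrapolation.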

We then put this equation to the test with 10 different carboxylic acids, and the prediction had a correlation of 98% (fig. 2).

fig 2. calculated v experimental pKa values for a test set of 10 carboxylic acids from equation above

I think this method can really catch on for a quick way to predict the pKa values of any carboxylic acid imaginable. We’re now working on the model extension to other groups (i.e. Bronsted bases) and putting together a black-box workflow so as to make it even more accessible and straightforward to use. 

We’ve recently published this work in the journal Molecules, an open access publication. Thanks to Prof. Steve Scheiner for inviting us to participate in the special issue devoted to tetrel bonding. Thanks to Guillermo Caballero for the inception of this project and to Dr. Jacinto Sandoval for taking the time from his research in photosynthesis to work on this pet project of ours and of course the rest of the students (Gustavo Mondragón, Marco Diaz, Raúl Torres) whose hard work produced this work.
