As is the case of proteins, the functioning of DNA is highly dependent on its 3D structure and not just only on its sequence but the difference is that protein tertiary structure has an enormous variety whereas DNA is (almost) always a double helix with little variations. The canonical base pairs AT, CG stabilize the famous double helix but the same cannot be guaranteed when non-canonical -unnatural- base pairs (UBPs) are introduced.
When I first took a look at Romesberg’s UBPS, d5SICS and dNaM (throughout the study referred to as X and Y see Fig.1) it was evident that they could not form hydrogen bonds, in the end they’re substituted naphtalenes with no discernible ways of creating a synton like their natural counterparts. That’s when I called Dr. Rodrigo Galindo at Utah University who is one of the developers of the AMBER code and who is very knowledgeable on matters of DNA structure and dynamics; he immediately got on board and soon enough we were launching molecular dynamics simulations and quantum mechanical calculations. That was more than two years ago.
Our latest paper in Phys.Chem.Chem.Phys. deals with the dynamical and structural stability of a DNA strand in which Romesberg’s UBPs are introduced sequentially one pair at a time into Dickerson’s dodecamer (a palindromic sequence) from the Protein Data Bank. Therein d5SICS-dNaM pair were inserted right in the middle forming a trisdecamer; as expected, +10 microseconds molecular dynamics simulations exhibited the same stability as the control dodecamer (Fig.2 left). We didn’t need to go far enough into the substitutions to get the double helix to go awry within a couple of microseconds: Three non-consecutive inclusions of UBPs were enough to get a less regular structure (Fig. 2 right); with five, a globular structure was obtained for which is not possible to get a proper average of the most populated structures.
X and Y don’t form hydrogen bonds so the pairing is pretty much forced by the scaffold of the rest of the DNA’s double helix. There are some controversies as to how X and Y fit together, whether they overlap or just wedge between each other and according to our results, the pairing suggests that a C1-C1′ distance of 11 Å is most stable consistent with the wedging conformation. Still much work is needed to understand the pairing between X and Y and even more so to get a pair of useful UBPs. More papers on this topic in the near future.
As if I didn’t have enough things to do I’m launching a new blog inspired by the #365papers hashtag on Twitter and the naturalproductman.wordpress.com blog. In it I’ll hopefully list, write a femto-review of all the papers I read. This new effort is even more daunting than the actual reading of the huge digital pile of papers I have in my Mendeley To-Be-Read folder, the fattest of them all. The papers therein wont be a comprehensive review of Comp.Chem. must-read papers but rather papers relevant to our lab’s research or curiosity.
Maybe I’ll include some papers brought to my attention by the group and they could do the review. The whole endeavor might flop in a few weeks but I want to give it a shot; we’ll see how it mutates and if it survives or not. So far I haven’t managed to review all papers read but maybe this post will prompt to do so if only to save some face. The domain of the new blog is compchemdigest.wordpress.com but I think it should have included the word MY at the beginning so as to convey the idea that it is only my own biased reading list. Anyway, if you’re interested share it and subscribe, those post will not be publicized.
I found it surprising that the trichloromethyl group could be chemically reduced into a methyl group quite rapidly in the presence of thiophenol, but once again a failed reaction in the lab gave us the opportunity to learn some nuances about the chemical reactivity of organic compounds. Even more surprising was the fact that this reduction occured through a mechanism in which chlorine atoms behave as electrophiles and not as nucleophiles.
We proposed the mechanism shown in figure 1 to be consistent with the 1H-NMR kinetic experiment (Figure 2) which shows the presence of the intermediary sulfides and leads to the observed phenyl-disulfide as the only isolable byproduct. The proposed mechanism invokes the presence of σ-holes on chlorine atoms to justify the attack of thiophenolate towards the chlorine atom leaving a carbanion behind during the first step. The NMR spectra were recorded at 195K which implies that the energy barriers had to be very low; the first step has a ~3kcal/mol energy barrier at this temperature.
To calculate these energy barriers we employed the BMK functional as implemented in Gaussian09. This functional came highly recommended to this purpose and I gotta say it delivered! The optimized geometries of all transition states and intermediaries were then taken to an MP2 single point upon which the maximum electrostatic potential on each atom (Vmax) was calculated with MultiWFN. In figure 3 we can observe the position and Vmax value of σ-holes on chlorine atoms as suggested by the mapping of electrostatic potential on the electron density of various compounds.
We later ran the same MP2 calculations on other CCl3 groups and found that the binding to an electron withdrawing group is necessary for a σ-hole to be present. (This fact was already present in the literature, of course, but reproducing it served us to validate our methodology.)
We are pleased to have this work published in PhysChemChemPhys. Thanks to Dr. Moisés Romero for letting us into his laboratory and to Guillermo Caballero for his hard work both in the lab and behind the computer; Guillermo is now bound to Cambridge to get his PhD, we wish him every success possible in his new job and hope to see him again in a few years, I’m sure he will make a good job at his new laboratory.
Literature in synthetic chemistry is full of reactions that do occur but very little or no attention is payed to those that do not proceed. The question here is what can we learn from reactions that are not taking place even when our chemical intuition tells us they’re feasible? Is there valuable knowledge that can be acquired by studying the ‘anti-driving force’ that inhibits a reaction? This is the focus of a new manuscript recently published by our research group in Tetrahedron (DOI: 10.1016/j.tet.2016.05.058) which was the basis of Guillermo Caballero’s BSc thesis.
It is well known in organic chemistry that if a molecular structure has the possibility to be aromatic it can somehow undergo an aromatization process to achieve this more stable state. During some experimental efforts Guillermo Caballero found two compounds that could be easily regarded as non-aromatic tautomers of a substituted pyridine but which were not transformed into the aromatic compound by any means explored; whether by treatment with strong bases, or through thermal or photochemical reaction conditions.
These results led us to investigate the causes that inhibits these aromatization reactions to occur and here is where computational chemistry took over. As a first approach we proposed two plausible reaction mechanisms for the aromatization process and evaluated them with DFT transition state calculations at the M05-2x/6-31+G(d,p)//B3LYP/6-31+G(d,p) levels of theory. The results showed that despite the aromatic tautomers are indeed more stable than their corresponding non-aromatic ones, a high activation free energy is needed to reach the transition states. Thus, the barrier heights are the first reason why aromatization is being inhibited; there just isn’t enough thermal energy in the environment for the transformation to occur.
But this is only the proximal cause, we went then to search for the distal causes (i.e. the reasons behind the high energy of the barriers). The second part of the work was then the calculation of the delocalization energies and frontier molecular orbitals for the non-aromatic tautomers at the HF/cc-pVQZ level of theory to get insights for the large barrier heights. The energies showed a strong electron delocalization of the nitrogen’s lone pair to the oxygen atom in the carbonyl group. Such delocalization promoted the formation of an electron corridor formed with frontier and close-to-frontier molecular orbitals, resembling an extended push-pull effect. The hydrogen atoms that could promote the aromatization process are shown to be chemically inaccessible.
Further calculations for a series of analogous compounds showed that the dimethyl amino moiety plays a crucial role avoiding the aromatization process to occur. When this group was changed for a nitro group, theoretical calculations yielded a decrease in the barrier high, enough for the reaction to proceed. Electronically, the bonding electron corridor is interrupted due to a pull-pull effect that was assessed through the delocalization energies.
The identity of the compounds under study was assessed through 1H, 13C-NMR and 2D NMR experiments HMBC, HMQC so we had to dive head long into experimental techniques to back our calculations.
Last week we posted some insights on finding Transitions States in Gaussian 09 in order to evaluate a given reaction mechanism. A stepwise methodology is tried to achieve and this time we’ll wrap the post with two flow charts trying to synthesize the information given. It must be stressed that knowledge about the chemistry of the reaction is of paramount importance since G09 cannot guess the structure connecting two minima on its own but rather needs our help from our chemical intuition. So, without further ado here is the remainder of Guillermo’s post.
METHOD 3. QST3. For this method, you provide the coordinates of your reagents, products and TS (in that order) and G09 uses the QST3 method to find the first order saddle point. As for QST2 the numbering scheme must match for all the atoms in your three sets of coordinates, again, use the connection editor to verify it. Here is an example of the input file.
link 0 --blank line-- #p b3lyp/6-31G(d,p) opt=(qst3,calcfc) geom=connectivity freq=noraman --blank line-- Charge Multiplicity Coordinates of reagents --blank line— Charge Multiplicity Coordinates of products --blank line-- Charge Multiplicity Coordinates of TS --blank line---
As I previously mentioned, it happens that you find a first order saddle point but does not correspond to the TS you want, you find an imaginary vibration that is not the one for the bond you are forming or breaking. For these cases, I suggest you to take that TS structure and manually modify the region that is causing you trouble, then use method 2.
METHOD 4. When the previous methods fail to yield your desired TS, the brute force way is to acquire the potential energy surface (PES) and visually locate your possible TS. The task is to perform a rigid PES scan, for this, the molecular structure must be defined using z-matrix. Here is an example of the input file.
link 0 --blank line-- #p b3lyp/6-31G(d,p) scan test geom=connectivity --blank line-- Charge Multiplicity Z-matrix of reagents (or products) --blank line--
In the Z-matrix section you must specify which variables (B, A or D) you want to modify. First, locate the variables you want to modify (distance B, angle A, or dihedral angle D). Then modify those lines within the Z-matrix, here is an example.
B1 1.41 3 0.05 A1 104.5 2 1.0
What you are specifying with this is that the variable B1 (a distance) is going to be stepped 3 times by 0.05. Then variable A1 (an angle) is going to be stepped 2 times by 1.0. Thus, a total of 12 energy evaluations will be performed. At the end of the calculation open the .log file in gaussview and in Results choose the Scan… option. This will open a 3D surface where you should locate the saddle point, this is an educated guess, so take the structure you think corresponds to your TS and use it for method 2.
I have not fully explored this method so I encourage you to go to Gaussian.com and thoroughly review it.
Once you have found your TS structure and via the imaginary vibration confirmed that is the one you are looking for the next step is to verify that your TS connects both your reagents and products in the potential energy surface. For this, an Intrinsic Reaction Coordinate (IRC) calculation must be performed. Here is an example of the input file for the IRC.
link 0 --blank line-- #p b3lyp/6-31G(d,p) irc=calcfc geom=connectivity --blank line-- Charge Multiplicity Coordinates of TS --blank line--
With this input, you ask for an IRC calculation, the default numbers of steps are 20 for each side of your TS in the PES; you must specify the coordinates of your TS or take them from the .chk file of your optimization. In addition, an initial force constant calculation must be made. It often occurs that the calculation fails in the correction step, thus, for complicated cases I hardly suggest to use irc=calcall, this will consume very long time (even days) but there is a 95% guaranty. If the number of points is insufficient you can put more within the route section, here is such an example for a complicated case.
link 0 --blank line-- #p b3lyp/6-31G(d,p) irc=(calcall,maxpoints=80) geom=connectivity --blank line-- Charge Multiplicity Coordinates of TS --blank line--
With this route section, you are asking to perform an IRC calculation with 80 points on each side of the PES, calculating the force constants at every point. For an even complicated case try adding the scf=qc keyword in the route section, quadratic convergence often works better for IRC calculations.
Guillermo Caballero, a graduate student from this lab, has written this two-part post on the nuances to be considered when searching for transition states in the theoretical assessment of reaction mechanisms. He’s been quite successful in getting beautiful energy profiles for organic reaction mechanisms, some of which have even explained why some reactions do not occur! A paper in Tetrahedron has just been accepted but we’ll talk about it in another post. I wanted Guillermo to share his insight into this hard practice of computational chemistry so he wrote the following post. Enjoy!
Yes, finding a transition state (TS) can be one of the most challenging tasks in computational chemistry, it requires both a good choice of keywords in your route section and all of your chemical intuition as well. Herein I give you some good tricks when you have to find a transition state using Gaussian 09 Rev. D1
METHOD 1. The first option you should try is to use the opt=qst2 keyword. With this method you provide the structures of your reagents and your products, then the program uses the quadratic synchronous transit algorithm to find a possible transition state structure and then optimize it to a first order saddle point. Here is an example of the input file.
link 0 --blank line-- #p b3lyp/6-31G(d,p) opt=qst2 geom=connectivity freq=noraman --blank line-- Charge Multiplicity Coordinates of reagents --blank line-- Charge Multiplicity Coordinates of products --blank line---
It is mandatory that the numbering must be the same in the reagents and the products otherwise the calculation will crash. To verify that the label for a given atom is the same in reagents and products you can go to Edit, then Connection. This opens a new window were you can manually modify the numbering scheme. I suggest you to work in a split window in gaussview so you can see at the same time your reagents and products.
The keyword freq=noraman is used to calculate the frequencies for your optimized structure, it is important because for a TS you must only observe one imaginary frequency, if not, then that is not a TS and you have to use another method. It also occurs that despite you find a first order saddle point, the imaginary frequency does not correspond to the bond forming or bond breaking in your TS, thus, you should use another method. I will give you advice later in the text for when this happens. When you use the noraman in this keyword you are not calculating the Raman frequencies, which for the purpose of a TS is unnecessary and saves computing time. Frequency analysis MUST be performed AT THE VERY SAME LEVEL OF THEORY at which the optimization is performed.
The main advantage for using the qst2 option is that if your calculation is going to crash, it generally crashes at the beginning, in the moment of guessing your transition state structure. Once the program have a guess, it starts the optimization. I suggest you to ask the algorithm to calculate the force constants once, this generally improves on the convergence, it will take slightly more time depending on the size of your structure but it pays off. The keyword in the route section is opt=(qst2,calcfc). Indeed, I hardly encourage you to use the calcfc keyword in any optimization you want to run.
METHOD 2. If method 1 does not work, my next advice is to use the opt=ts keyword. For this method, the coordinates in your input file are those for the TS structure. Here is an example of the input file.
link 0 --blank line-- #p b3lyp/6-31G(d,p) opt=ts geom=connectivity freq=noraman --blank line-- Charge Multiplicity Coordinates of TS --blank line--
The question that arises here is how should I get the coordinates for my TS? Well, honestly this is not a trivial task, here is where you use all the chemistry you know. For example, you can start with the coordinates of your reagents and manually get them closer. If you are forming a bond whose length is to be 1.5Å, then I suggest you to have that length in 1.6Å in your TS. Sometimes this becomes trial and error but the most accurate your TS structure is, based on your chemical knowledge, the easiest to find your TS will be. As another example, if you want to find a TS for a [1,5]-sigmatropic reaction a good TS structure will be putting the hydrogen atom that migrates in the middle point through the way. I have to insist, this method hardly depends on your imagination to elucidate a TS and on your chemistry background.
Most of the time when you use the opt=ts keyword the calculations crashes because of an error in the number of eigenvalues, you can avoid it adding noeigen to the route section; here is an example of the input file, I encourage you to use this method.
link 0 --blank line-- #p b3lyp/6-31G(d,p) opt=(ts,noeigen,calcfc) geom=connectivity freq=noraman --blank line-- Charge Multiplicity Coordinates of TS --blank line--
If you have problems in the optimization steps I suggest you to ask the algorithm to calculate the force constants in every step of the optimization opt=(ts,noeigen,calcall) this is quite a harsh method because will consume long computing time but works well for small molecules and for complicated TSs to find.
Another ‘tricky’ way to get your coordinates for your TS is to run the qst2 calculation, then if it fails, take the second- or the third-step coordinates and used them as a ‘pre-optimized’ set of coordinates for this method.
By the way, here is another useful trick. If you are evaluating a group of TSs, let’s say, if you are varying a functional group among the group, focus on finding the TS for the simplest case, then use this optimized TS as a template where you add the moieties and use this this method. This works pretty well.
For this post we’ll leave it up to here and post the rest of Guillermo’s tricks and advice on finding TS structures next week when we’ll also discuss the use of IRC calculations and some considerations on energy corrections when plotting the full energy profile. In the mean time please take the time to rate, like and share this and other posts.
Thanks for reading!
It is with great pride that I’d like to announce that for the first time we have a Masters Student graduated from this Comp.Chem. lab: María Eugenia “Maru” Sandoval-Salinas has finished her graduate studies and just last Friday defended her thesis admirably earning not only the degree of Masters of Science in Chemistry but doing so with the highest honors given by the National Autonomous University of Mexico.
Maru’s thesis is for many reasons a landmark in this lab not only because it is the first graduate thesis published from our lab but also the first document on our work about the study of Photosynthesis, a long sought after endeavor now closer to publication. It must also be said that Maru came to this lab when she was an undergraduate student five years ago when I just recently joined UNAM as a researcher fresh out of a postdoc stay. After getting her B.Sc. degree and publishing an article in JCTC (DOI: 10.1021/ct4004178) she now is about to publish more papers that I’m sure will be as highly ranked as the previous one. Thus, Maru was a pioneer in our lab giving it a vote of confidence when we had little to nothing to show for; thanks to her hard work and confidence, along with that of the students who have followed her, we managed to succeed as a consolidated research group in the field of computational chemistry.
More specifically, her thesis centered around finding a mechanism for the excitonic transference between pigments (bacteriochlorophyl-a, BChl-a) in the Fenna-Matthews-Olson (FMO) complex, a protein trimer with seven BChl-a molecules in each monomer, located between the antenna complex and the reaction center in green sulfur bacteria. Among the possible mechanisms explored were Förster’s theory, a modification to Marcus’ theory and finally we explored the possibility of Singlet Fission occurring between adjacent molecules with the help of Dr. David Casanova from the Basque Country University where Maru took a short research stay last autumn. Since nature doesn’t conform to any specific mechanism -specially in a complex arrangement such as the FMO- then it could be possible that a combination of the above might also occur but lets just wait for the papers to be published to discuss it. Calculations were performed through the TD-DFT and the C-DFT formalisms using G09 and Q-Chem; comparing experimental data in CH3OH (SMD implicit calculations with the SVWN5 functional) were undertaken previously for selection of the level of theory.
Now, after two original theses written and successfully defended, an article published in JCTC and more in process, at least five posters, a couple of oral presentations and countless hours at her desk, Maru will go pursuit a PhD abroad where I’m sure she will exceed anyone’s expectations with her work, drive, dedication and scientific curiosity. Thank you, Maru, for all your hard work and trust when this lab needed it the most, we wish you the best for you earn it. You will surely be missed.
Some atomic properties such as an atomic charge are isotropic, but every now and then some derivations of them become anisotropic, for example the plotting of the Molecular Electrostatic Potential (MEP) on the electron density surface can exhibit some anisotropic behavior; quantifying it can be a bit challenging.
It is well known that halogen atoms such as Chlorine can form so-called halogen-bonds of the type R-Cl-R in crystals with a near perfect 180° angle. This finding has lead to the discovery of σ-holes in halogens. σ-Holes are electrophilic portions of the anisotropic electrostatic potential in an otherwise nucleophilic atom. Recently, Guillermo “Memo” Caballero and I calculated the MEP for a series of trichloromethyl-containing compounds at the MP2/cc-pVQZ level of theory and the mapping shows evidence of such σ-holes as seen in Figure1. Those small blue portions on an otherwise red atom indicate that some electron density is missing in that position, which by the way is located at 180° away from the carbon atom.
But having the picture is not enough. We want to quantify just how strong are those σ-holes to effectively attract a nucleophile and perhaps perform some chemistry on the C-Cl bond. That’s when we resorted to MultiWFN, a Multifunctional Wavefunction Analyzer developed by Tian Lu (卢天) at the Beijing Kein Research Center for Natural Sciences. You can check the project leader list of publications here. Among many other capabilities, MultiWFN is able to print details about properties along a surface.
In order to work with MultiWFN you need to generate a *.wfn file, if you have a previous Gaussian calculation for which you want to analyze their surface you can run a guess=only calculation in order to extract the wavefunction from the checkpoint file. Here is a dummy of the input for such calculation
%chk=oldfile.chk # output=wfn geom=check guess=(only,read) density=current --blank-- Title Card --blank-- 0 1 --blank-- filename.wfn --blank--
In our case, having a post-Hartree-Fock calculation, the use of density=current is mandatory to get the MP2 density matrix and not just the HF one. Running this calculation will generate the file filename.wfn which is now used with MultiWFN. When starting MultiWFN you get to see a terminal window like the one below in which you are asked to input the path of your wfn file:
After loading it you will get the following window with the various options available. Type 12// (these two slashes are mandatory) to get the quantitative analysis of molecular surface option.
Then you will be asked to define some elements of that surface (we used the default options 0)
The following screenshot shows the results section in which several maxima and minima of electrostatic potential were found (7 and 11 in our case); a star is placed on the side of the global maximum. The value of the MEP at those points is given in Hartrees, eV and kcal/mol which I personally hate because there isn’t such a thing as a mole of ‘potentials’ (same argument as giving an orbital’s energy in kcal/mol, moles of what? orbitals? Personally, I don’t like it even if its valid).
Their visualizer is activated through the option 0 and although it is far from pretty it is quite good enough to find the numbers corresponding to maxima and minima of the MEP on the isodensity surface. If we look for the maxima then we find for our example (CHCl3) that a maximum is located in front of each Cl atom in a straight line from the C atom. Now we get to put a number on the mapped isosurface provided by Gaussian or even import the file into Chimera.
We are still working our way around MultiWFN, I hope we can find the batch option, it would be most useful. In the mean time, Guillermo and I will continue to search for σ-holes in chlorinated reagents. Thanks to Guillermo for his ongoing work in this and other topics within the realm of organic reactivity.
Have you any suggestions or ideas to work with MultiWFN? Please share them!