Category Archives: Tricks
Calculating interaction energies is one of the things people care about most, and also one of the things most often done wrong. The so-called ‘gold standard’ for calculating supramolecular interaction energies, according to Pavel Hobza, is the CCSD(T)/CBS level of theory, which is highly impractical for most systems beyond 50 or so light atoms. Basis set extrapolation methods and the inclusion of electron correlation through MP2 yield excellent results, but they are nonetheless almost as time-consuming as CC. DFT methods in general perform terribly for this task, yet they remain the most widely used tools for electronic structure calculations thanks to their competitive computing times and the wide availability of schemes that add terms describing various kinds of interactions. The most important ingredients needed to get decent-to-good interaction energies from DFT are correlation and dispersion. The first can be recovered with a good correlation functional, and empirical dispersion takes care of the second, dramatically improving interaction energies even for lousy functionals such as the infamous B3LYP. The results still won’t be of benchmark quality, but the deviations from the gold standard shrink significantly, making them more quantitatively reliable.
There is an online tool from Grimme’s group for calculating the empirical dispersion and adding it to a calculation which originally lacked it. In the link below you can upload your calculation and select the basis set and functional originally employed in it, along with the desired damping model. In return you get the energy corrected through a geometrical counterpoise (gCP) correction and Grimme’s empirical dispersion function, D3, of which I have previously written here.
The gCP-D3 Webservice is located at: http://wwwtc.thch.uni-bonn.de/
The platform is entirely straightforward to use and it works with xyz, Turbomole, ORCA and Gaussian output files. The concept is very simple: both the gCP and D3 contributions are computed in the selected basis set and added to the uncorrected DFT (or HF) energy (eq. 1).
If you’re trying to calculate interaction energies, remember to perform these corrections for every component in your supramolecular assembly (eq. 2)
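As a rough sketch of the bookkeeping described by eq. 1 and eq. 2, the Python snippet below adds the gCP and D3 terms to each uncorrected energy and then takes the supramolecular difference between the complex and its components. The function names and every energy value are made-up placeholders for illustration; they are not output from the webservice.

```python
HARTREE_TO_KCAL_MOL = 627.5095  # 1 hartree expressed in kcal/mol

def corrected_energy(e_scf, e_gcp, e_d3):
    """Eq. 1: uncorrected DFT (or HF) energy plus gCP and D3 terms (hartree)."""
    return e_scf + e_gcp + e_d3

def interaction_energy(complex_terms, fragment_terms):
    """Eq. 2: corrected complex energy minus the corrected fragment energies."""
    e_complex = corrected_energy(*complex_terms)
    e_fragments = sum(corrected_energy(*terms) for terms in fragment_terms)
    return e_complex - e_fragments

# Hypothetical (E_scf, E_gCP, E_D3) triples in hartree, for illustration only.
dimer = (-152.870, 0.004, -0.006)
monomers = [(-76.430, 0.001, -0.001), (-76.430, 0.001, -0.001)]

e_int = interaction_energy(dimer, monomers)
print(f"{e_int * HARTREE_TO_KCAL_MOL:.2f} kcal/mol")
```

The point of the sketch is only that the same correction is applied to the complex and to each fragment before the subtraction, so no component is left uncorrected.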
Here’s a screen capture of the outcome after uploading a G09 log file for the simplest of options B3LYP/6-31G(d), a decomposed energy is shown at the left while a 3D interactive Jmol rendering of your molecule is shown at the right. Also, various links to the literature explaining the details of these calculations are available in the top menu.
I’m currently writing a book chapter on methods for calculating interaction energies, so expect many more posts like this. A special mention to Dr. Jacinto Sandoval, who is working with us as a postdoc researcher, for bringing this platform to my attention; I was apparently living under a rock.
We’ve covered some common errors when dealing with formatted checkpoint files (*.fchk) generated from Gaussian, especially when analyzed with the associated GaussView program (see here and here for previous posts on the matter).
Prof. Neal Zondlo from the University of Delaware kindly shared this solution with us when the following message shows up:
CConnectionGFCHK::Parse_GFCHK() Missing or bad data: Rbond Line Number 1234
The Rbond label has to do with the connectivity displayed by the visualizer and can be overridden by close examination of the input file. In the example provided by Prof. Zondlo he found the following line in the connectivity matrix of the input file:
2 9 0.0
which indicates a zero bond order between atoms 2 and 9, possibly due to their proximity. He changed the line to simply
So editing the connectivity of your atoms in the input can help prevent the Rbond message.
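If the connectivity block is long, a small script can point out the suspicious entries before you edit them by hand. The sketch below assumes each connectivity line has the form "atom neighbor order neighbor order ..."; the function name and the sample block are invented for illustration, and only the "2 9 0.0" line comes from the example above.

```python
def find_zero_order_bonds(connectivity_lines):
    """Return (atom, neighbor) pairs whose listed bond order is 0.0."""
    flagged = []
    for line in connectivity_lines:
        fields = line.split()
        if len(fields) < 3:
            continue  # atom with no listed neighbors, nothing to check
        atom = int(fields[0])
        # The remaining fields come in (neighbor, bond order) pairs.
        for neighbor, order in zip(fields[1::2], fields[2::2]):
            if float(order) == 0.0:
                flagged.append((atom, int(neighbor)))
    return flagged

# Made-up connectivity block; only the "2 9 0.0" line is from the post.
example = ["1 2 1.0 3 1.0", "2 9 0.0", "3"]
print(find_zero_order_bonds(example))
```

The script only reports the pairs; deciding how to edit each line is still up to you.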
I hope this helps someone else.
A few weeks back we wrote about using WFN(X) files with MultiWFN in order to find σ-holes in halogen atoms by calculating the maximum potential on a given surface. We later found out that using a chk file to generate a wfn(x) file using the guess=(read,only) keyword didn’t retrieve the MP2 wavefunction but rather the HF wavefunction! Luckily we realized this problem very quickly and were able to fix it. We tried to generate the wfn(x) file with the following keywords at the route section
#p guess=(read,only) density=current
but we kept retrieving the HF values, which we noticed by running the corresponding HF calculation and noticing that every value extracted from the WFN file was exactly the same.
So, if you want a WFN(X) file for post-processing an MP2 calculation (or any other post-Hartree-Fock calculation, for that matter), ask for it from the beginning, in the same job. I still don’t know how to work around this, but I will be happy to report it whenever I do.
PS. A sincere apology to all subscribers for getting a notification about this post before it was finished.
Last week we posted some insights on finding transition states in Gaussian 09 in order to evaluate a given reaction mechanism. The aim is a stepwise methodology, and this time we’ll wrap up the post with two flow charts that try to synthesize the information given. It must be stressed that knowledge of the chemistry of the reaction is of paramount importance, since G09 cannot guess the structure connecting two minima on its own but rather needs help from our chemical intuition. So, without further ado, here is the remainder of Guillermo’s post.
METHOD 3. QST3. For this method, you provide the coordinates of your reagents, products and TS (in that order) and G09 uses the QST3 method to find the first order saddle point. As for QST2, the numbering scheme must match for all the atoms in your three sets of coordinates; again, use the connection editor to verify it. Here is an example of the input file.
link 0
--blank line--
#p b3lyp/6-31G(d,p) opt=(qst3,calcfc) geom=connectivity freq=noraman
--blank line--
Charge Multiplicity
Coordinates of reagents
--blank line--
Charge Multiplicity
Coordinates of products
--blank line--
Charge Multiplicity
Coordinates of TS
--blank line--
As I previously mentioned, it can happen that you find a first order saddle point that does not correspond to the TS you want: the imaginary vibration is not the one for the bond you are forming or breaking. For these cases, I suggest you take that TS structure and manually modify the region that is causing you trouble, then use method 2.
METHOD 4. When the previous methods fail to yield your desired TS, the brute force way is to acquire the potential energy surface (PES) and visually locate your possible TS. The task is to perform a rigid PES scan; for this, the molecular structure must be defined using a Z-matrix. Here is an example of the input file.
link 0
--blank line--
#p b3lyp/6-31G(d,p) scan test geom=connectivity
--blank line--
Charge Multiplicity
Z-matrix of reagents (or products)
--blank line--
In the Z-matrix section you must specify which variables (B, A or D) you want to modify. First, locate the variables you want to modify (distance B, angle A, or dihedral angle D). Then modify those lines within the Z-matrix, here is an example.
B1 1.41 3 0.05
A1 104.5 2 1.0
What you are specifying here is that the variable B1 (a distance) is going to be stepped 3 times by 0.05, and the variable A1 (an angle) is going to be stepped 2 times by 1.0. Counting the starting values, that gives 4 × 3 grid points, so a total of 12 energy evaluations will be performed. At the end of the calculation open the .log file in GaussView and in Results choose the Scan… option. This will open a 3D surface on which you should locate the saddle point. This is an educated guess, so take the structure you think corresponds to your TS and use it for method 2.
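To see where the 12 comes from, here is a minimal Python sketch that enumerates the grid of a rigid scan from the start value, number of steps and step size of each variable. The helper name scan_values is hypothetical and the snippet only reproduces the arithmetic; it does not talk to Gaussian.

```python
from itertools import product

def scan_values(start, n_steps, step_size):
    """Initial value plus n_steps increments, i.e. n_steps + 1 points."""
    return [round(start + i * step_size, 6) for i in range(n_steps + 1)]

b1 = scan_values(1.41, 3, 0.05)   # four bond lengths, starting at 1.41
a1 = scan_values(104.5, 2, 1.0)   # three angles, starting at 104.5

# Every combination of B1 and A1 is one single-point energy evaluation.
grid = list(product(b1, a1))
print(len(grid))
```

Each extra scanned variable multiplies the number of points, so the cost of a scan grows quickly with the number of variables and steps.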
I have not fully explored this method so I encourage you to go to Gaussian.com and thoroughly review it.
Once you have found your TS structure and confirmed via the imaginary vibration that it is the one you are looking for, the next step is to verify that your TS connects both your reagents and products on the potential energy surface. For this, an Intrinsic Reaction Coordinate (IRC) calculation must be performed. Here is an example of the input file for the IRC.
link 0
--blank line--
#p b3lyp/6-31G(d,p) irc=calcfc geom=connectivity
--blank line--
Charge Multiplicity
Coordinates of TS
--blank line--
With this input you ask for an IRC calculation; the default number of steps is 20 on each side of your TS on the PES. You must specify the coordinates of your TS or take them from the .chk file of your optimization. In addition, an initial force constant calculation must be made. The calculation often fails in the correction step, so for complicated cases I strongly suggest using irc=calcall. This can take a very long time (even days), but it succeeds roughly 95% of the time. If the number of points is insufficient you can request more in the route section; here is such an example for a complicated case.
link 0
--blank line--
#p b3lyp/6-31G(d,p) irc=(calcall,maxpoints=80) geom=connectivity
--blank line--
Charge Multiplicity
Coordinates of TS
--blank line--
With this route section, you are asking for an IRC calculation with 80 points on each side of the TS on the PES, calculating the force constants at every point. For an even more complicated case try adding the scf=qc keyword in the route section; quadratic convergence often works better for IRC calculations.
Guillermo Caballero, a graduate student from this lab, has written this two-part post on the nuances to be considered when searching for transition states in the theoretical assessment of reaction mechanisms. He’s been quite successful in getting beautiful energy profiles for organic reaction mechanisms, some of which have even explained why some reactions do not occur! A paper in Tetrahedron has just been accepted but we’ll talk about it in another post. I wanted Guillermo to share his insight into this hard practice of computational chemistry so he wrote the following post. Enjoy!
Yes, finding a transition state (TS) can be one of the most challenging tasks in computational chemistry; it requires both a good choice of keywords in your route section and all of your chemical intuition as well. Herein I give you some good tricks for when you have to find a transition state using Gaussian 09 Rev. D1.
METHOD 1. The first option you should try is the opt=qst2 keyword. With this method you provide the structures of your reagents and your products; the program then uses the quadratic synchronous transit algorithm to find a possible transition state structure and optimizes it to a first order saddle point. Here is an example of the input file.
link 0
--blank line--
#p b3lyp/6-31G(d,p) opt=qst2 geom=connectivity freq=noraman
--blank line--
Charge Multiplicity
Coordinates of reagents
--blank line--
Charge Multiplicity
Coordinates of products
--blank line--
The atom numbering must be the same in the reagents and the products, otherwise the calculation will crash. To verify that the label for a given atom is the same in reagents and products you can go to Edit, then Connection. This opens a new window where you can manually modify the numbering scheme. I suggest you work in a split window in GaussView so you can see your reagents and products at the same time.
The keyword freq=noraman calculates the frequencies for your optimized structure. This is important because a TS must show exactly one imaginary frequency; if it does not, the structure is not a TS and you have to use another method. It can also happen that you find a first order saddle point but the imaginary frequency does not correspond to the bond forming or breaking in your TS; in that case you should also use another method. I will give you advice later in the text for when this happens. The noraman option skips the calculation of Raman intensities, which are unnecessary for the purpose of a TS, and so saves computing time. Frequency analysis MUST be performed AT THE VERY SAME LEVEL OF THEORY at which the optimization is performed.
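Counting imaginary frequencies by eye is easy for small molecules, but a short script helps when you have many jobs. The sketch below assumes the log lists frequencies on lines starting with "Frequencies --" and reports imaginary modes as negative numbers; the function name and the sample text are invented for this example and are not taken from a real job.

```python
def imaginary_frequencies(log_text):
    """Collect the negative values found on 'Frequencies --' lines."""
    freqs = []
    for line in log_text.splitlines():
        if line.strip().startswith("Frequencies --"):
            values = line.split("--", 1)[1].split()
            freqs.extend(float(v) for v in values)
    return [f for f in freqs if f < 0.0]

# Synthetic excerpt written for this example, not real output.
sample = """
 Frequencies --  -512.3400    45.1200    88.7600
 Frequencies --   120.5000   310.2200   455.9000
"""

imag = imaginary_frequencies(sample)
# A first order saddle point should give exactly one entry here.
print(len(imag), imag)
```

Having exactly one negative value is only the first check; you still have to look at the displacement vectors to confirm that the mode is the bond forming or breaking you are after.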
The main advantage of using the qst2 option is that if your calculation is going to crash, it generally crashes at the beginning, at the moment of guessing your transition state structure. Once the program has a guess, it starts the optimization. I suggest you ask the algorithm to calculate the force constants once; this generally improves the convergence. It will take slightly more time depending on the size of your structure, but it pays off. The keyword in the route section is opt=(qst2,calcfc). Indeed, I strongly encourage you to use the calcfc keyword in any optimization you want to run.
METHOD 2. If method 1 does not work, my next advice is to use the opt=ts keyword. For this method, the coordinates in your input file are those for the TS structure. Here is an example of the input file.
link 0
--blank line--
#p b3lyp/6-31G(d,p) opt=ts geom=connectivity freq=noraman
--blank line--
Charge Multiplicity
Coordinates of TS
--blank line--
The question that arises here is: how should I get the coordinates for my TS? Well, honestly this is not a trivial task; here is where you use all the chemistry you know. For example, you can start with the coordinates of your reagents and manually bring them closer. If you are forming a bond whose final length will be 1.5 Å, then I suggest setting that length to 1.6 Å in your TS. Sometimes this becomes trial and error, but the more accurate your TS structure is, based on your chemical knowledge, the easier it will be to find your TS. As another example, if you want to find a TS for a [1,5]-sigmatropic reaction, a good TS structure will have the migrating hydrogen atom at the midpoint of its path. I have to insist: this method depends heavily on your imagination to elucidate a TS and on your chemistry background.
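One crude way to put an atom "at the midpoint of its path" is to interpolate linearly between the reagent and product coordinates. The Python sketch below does just that; it is a swapped-in illustration and not what QST2 does internally, the function name is hypothetical, and the toy coordinates are invented. It also assumes both structures share the same atom numbering.

```python
def interpolate_geometry(reagent, product, fraction=0.5):
    """Linearly mix two geometries that share the same atom ordering."""
    if len(reagent) != len(product):
        raise ValueError("both structures need the same number of atoms")
    guess = []
    for (sym_r, xyz_r), (sym_p, xyz_p) in zip(reagent, product):
        if sym_r != sym_p:
            raise ValueError("atom numbering differs between structures")
        xyz = tuple(r + fraction * (p - r) for r, p in zip(xyz_r, xyz_p))
        guess.append((sym_r, xyz))
    return guess

# Toy coordinates in angstrom, invented for illustration:
# a hydrogen atom moving from one carbon to another.
reagent = [("C", (0.0, 0.0, 0.0)), ("C", (2.6, 0.0, 0.0)), ("H", (0.0, 1.1, 0.0))]
product = [("C", (0.0, 0.0, 0.0)), ("C", (2.6, 0.0, 0.0)), ("H", (2.6, 1.1, 0.0))]

ts_guess = interpolate_geometry(reagent, product)
for symbol, (x, y, z) in ts_guess:
    print(f"{symbol} {x:10.5f} {y:10.5f} {z:10.5f}")
```

A guess like this usually still needs manual adjustment based on your chemical knowledge before it is good enough to feed to opt=ts.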
Most of the time when you use the opt=ts keyword the calculation crashes because of an error in the number of eigenvalues. You can avoid it by adding noeigen to the route section; here is an example of the input file. I encourage you to use this method.
link 0
--blank line--
#p b3lyp/6-31G(d,p) opt=(ts,noeigen,calcfc) geom=connectivity freq=noraman
--blank line--
Charge Multiplicity
Coordinates of TS
--blank line--
If you have problems in the optimization steps, I suggest you ask the algorithm to calculate the force constants at every step of the optimization with opt=(ts,noeigen,calcall). This is quite a harsh method because it consumes a lot of computing time, but it works well for small molecules and for TSs that are complicated to find.
Another ‘tricky’ way to get the coordinates for your TS is to run the qst2 calculation; then, if it fails, take the coordinates from the second or third step and use them as a ‘pre-optimized’ set of coordinates for this method.
By the way, here is another useful trick. If you are evaluating a group of TSs, say, varying a functional group across the series, focus on finding the TS for the simplest case, then use this optimized TS as a template to which you add the moieties, and use this method. This works pretty well.
For this post we’ll leave it here and post the rest of Guillermo’s tricks and advice on finding TS structures next week, when we’ll also discuss the use of IRC calculations and some considerations on energy corrections when plotting the full energy profile. In the meantime please take the time to rate, like and share this and other posts.
Thanks for reading!
Some atomic properties such as an atomic charge are isotropic, but every now and then some derivations of them become anisotropic, for example the plotting of the Molecular Electrostatic Potential (MEP) on the electron density surface can exhibit some anisotropic behavior; quantifying it can be a bit challenging.
It is well known that halogen atoms such as chlorine can form so-called halogen bonds of the type R-Cl-R in crystals with a nearly perfect 180° angle. This finding has led to the discovery of σ-holes in halogens. σ-Holes are electrophilic portions of the anisotropic electrostatic potential in an otherwise nucleophilic atom. Recently, Guillermo “Memo” Caballero and I calculated the MEP for a series of trichloromethyl-containing compounds at the MP2/cc-pVQZ level of theory, and the mapping shows evidence of such σ-holes, as seen in Figure 1. Those small blue portions on an otherwise red atom indicate that some electron density is missing in that position, which by the way is located 180° away from the carbon atom.
But having the picture is not enough. We want to quantify just how strong those σ-holes are, to see whether they can effectively attract a nucleophile and perhaps perform some chemistry on the C-Cl bond. That’s when we resorted to MultiWFN, a Multifunctional Wavefunction Analyzer developed by Tian Lu (卢天) at the Beijing Kein Research Center for Natural Sciences. You can check the project leader’s list of publications here. Among many other capabilities, MultiWFN is able to print details about properties along a surface.
In order to work with MultiWFN you need to generate a *.wfn file. If you have a previous Gaussian calculation whose surface you want to analyze, you can run a guess=only calculation in order to extract the wavefunction from the checkpoint file. Here is a dummy input for such a calculation:
%chk=oldfile.chk
# output=wfn geom=check guess=(only,read) density=current
--blank--
Title Card
--blank--
0 1
--blank--
filename.wfn
--blank--
In our case, having a post-Hartree-Fock calculation, the use of density=current is mandatory to get the MP2 density matrix and not just the HF one. Running this calculation will generate the file filename.wfn which is now used with MultiWFN. When starting MultiWFN you get to see a terminal window like the one below in which you are asked to input the path of your wfn file:
After loading it you will get the following window with the various options available. Type 12// (these two slashes are mandatory) to get the quantitative analysis of molecular surface option.
Then you will be asked to define some elements of that surface (we used the default options 0)
The following screenshot shows the results section, in which several maxima and minima of the electrostatic potential were found (7 and 11 in our case); a star is placed next to the global maximum. The value of the MEP at those points is given in Hartrees, eV and kcal/mol, which I personally hate because there isn’t such a thing as a mole of ‘potentials’ (same argument as giving an orbital’s energy in kcal/mol: moles of what? orbitals? Personally, I don’t like it even if it’s valid).
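Whatever your feelings about the units, the three columns are related by fixed conversion factors. Here is a minimal Python sketch of that conversion; the function name is hypothetical and the input value is an arbitrary placeholder, not one of our results.

```python
HARTREE_TO_EV = 27.211386        # 1 hartree in electronvolts
HARTREE_TO_KCAL_MOL = 627.5095   # 1 hartree in kcal/mol

def convert_potential(value_hartree):
    """Express a value given in hartree in the three printed units."""
    return {
        "hartree": value_hartree,
        "eV": value_hartree * HARTREE_TO_EV,
        "kcal/mol": value_hartree * HARTREE_TO_KCAL_MOL,
    }

# Arbitrary placeholder value for a surface maximum, for illustration only.
v_max = convert_potential(0.03)
for unit, value in v_max.items():
    print(f"{value:12.6f} {unit}")
```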
Their visualizer is activated through the option 0 and although it is far from pretty it is quite good enough to find the numbers corresponding to maxima and minima of the MEP on the isodensity surface. If we look for the maxima then we find for our example (CHCl3) that a maximum is located in front of each Cl atom in a straight line from the C atom. Now we get to put a number on the mapped isosurface provided by Gaussian or even import the file into Chimera.
We are still working our way around MultiWFN; I hope we can find the batch option, as it would be most useful. In the meantime, Guillermo and I will continue to search for σ-holes in chlorinated reagents. Thanks to Guillermo for his ongoing work on this and other topics within the realm of organic reactivity.
Have you any suggestions or ideas to work with MultiWFN? Please share them!
Sometimes you just need to optimize some fragment or moiety of your molecule for a number of reasons (whether because of its size, your current interest, or to skew the progress of a previous optimization), or maybe you just want certain kinds of atoms to have their positions optimized. I usually optimize hydrogen atoms when working with crystallographic files while preserving the rest of the molecule as refined, in order to keep it under a crystalline field of sorts.
Asking Gaussian to optimize only some of the atoms in your molecule requires you to make a list, although the logic behind it is not quite straightforward to me. This list is invoked by the ReadOptimize keyword in the route section and it includes all atoms by default; you can then tell G09 which atoms are to be included in or excluded from the optimization.
So, for example you want to optimize all atoms EXCEPT hydrogens, then your input should bear the ReadOptimize keyword in the route section and then, at the end of the molecule specification, the following line:
If you wish to selectively add some atoms to the list while excluding others, here’s an example:
atoms=C H S notatoms=5-8
This list adds, and therefore optimizes, all carbon, hydrogen and sulfur atoms, except atoms 5, 6, 7 and 8, should they be any of the previous elements in the C H S list.
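To make the outcome of such a list concrete, the Python sketch below mimics the selection just described: keep the atoms of the listed elements, then drop the excluded index range. It is my own illustration of the result, not how Gaussian parses the line internally; the function names and the 9-atom molecule are made up.

```python
def parse_range_token(token):
    """Expand '5-8' into {5, 6, 7, 8}; a single number gives a one-element set."""
    if "-" in token:
        first, last = token.split("-")
        return set(range(int(first), int(last) + 1))
    return {int(token)}

def select_atoms(symbols, include_elements, exclude_tokens):
    """Return the 1-based indices of atoms that stay in the optimized list."""
    excluded = set()
    for token in exclude_tokens:
        excluded |= parse_range_token(token)
    return [
        index
        for index, symbol in enumerate(symbols, start=1)
        if symbol in include_elements and index not in excluded
    ]

# Made-up 9-atom molecule, used only to illustrate the selection.
symbols = ["C", "C", "S", "H", "H", "H", "O", "C", "H"]
selected = select_atoms(symbols, {"C", "H", "S"}, ["5-8"])
print(selected)
```

In this toy case atom 7 (oxygen) is left out because its element is not listed, while atoms 5, 6 and 8 are left out because of the index range.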
The way I selectively optimize hydrogen atoms is by erasing all atoms from the list (using the noatoms instruction) and then selecting which are to be included in the list (with atoms=H), but I haven’t tried selecting only hydrogen atoms from the start, as in atoms=H.
I probably get very confused because I learned to do this with the now obsolete ReadFreeze keyword; now it sometimes seems to me like I’m using double negatives or something: "please do not optimize all atoms except if they are hydrogen atoms". You can include numbers, ranges or symbols in this list as a final line of your input file.
Common errors (by common I mean I’ve got them):
Let’s look at the end of an input I was just working with:
> AtmSel: Line="P 0"
> Maximum list size exceeded in AddBin.
> Error termination via Lnk1e in…
AtmSel is the routine which reads the atoms list. I was using a pseudopotential on the phosphorus atoms and had placed the atoms list at the end of the file, but it should be placed right after the coordinates and the connectivity matrix, should there be one, and thus before any external basis set, pseudopotential or any other specification to be read by Gaussian.
As a sort of test you can use the instruction:
%kjob l103
%chk=myfile.chk
...
at the Link0 section (where your checkpoint is defined). This will kill the job after the link 103 is finished, thus you will only get a list of what parameters were frozen and which were active. Then, if things look ok, you can run the job without the %kjob l103 instruction and get it done.
As usual I hope this helps. Thanks for reading except to those who didn’t read it except for the parts they did read.
Having a long calculation terminated just because it ran out of time in the queue is very frustrating; even more so if restarting it from the last accessible point is hard to do.
I have recently performed a particularly demanding calculation: Basis Set Superposition Error (BSSE) with the Counterpoise method at the second order Møller-Plesset perturbation theory (MP2) level. The calculation ran out of time but I was able to restart it because I had the rwf file! My input looked a bit like this:
#p mp2/GEN counterpoise=2 maxdisk=200GB
So here is how it works.
The very first lines of your output give you the process ID number, which is not necessarily the same as the PID given by the queue system (in fact, it is not the same, because the latter corresponds to the submitted script, not to the instructions in it, i.e. your calculation).
Entering Gaussian System, Link 0=g09
Initial command:
/opt/SC/aplicaciones/g09-C.01/l1.exe /tmpu/joaqbf_g/joaqbf/Gau-38954.inp -scrdir=/tmpu/joaqbf_g/joaqbf/
Entering Link 1 = /opt/SC/aplicaciones/g09-C.01/l1.exe PID= 38955.
(emphasis in red is mine)
This is the number you want to write down. You will need to find the corresponding rwf file (usually in your SCRATCH directory), named Gau-PID.rwf (in the aforementioned case, Gau-38955.rwf). If you are a bit paranoid like myself you will want to copy this file and keep it safe, but be aware that these are very large files; in my case it was 175 GB. Now you need to launch your calculation again with the following input:
%rwf=myfile.rwf %nosave %chk=myfile.chk Title Card # restart rest of input
You can add all other controls to the Link0 section such as %nprocshared or %mem according to your needs.
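If you have many jobs, fishing the PID out of each log by hand gets tedious. The Python sketch below pulls the number from the "PID=" entry and builds the expected Gau-PID.rwf name; the function name is hypothetical, and the naming pattern is simply the one described above, so check it against your own scratch directory.

```python
import re

def rwf_name_from_log(log_text):
    """Find the first 'PID=' entry and build the Gau-PID.rwf file name."""
    match = re.search(r"PID=\s*(\d+)", log_text)
    if match is None:
        raise ValueError("no PID found in the log text")
    return f"Gau-{match.group(1)}.rwf"

# Header lines taken from the log excerpt quoted in this post.
header = (
    " Entering Gaussian System, Link 0=g09\n"
    " Entering Link 1 = /opt/SC/aplicaciones/g09-C.01/l1.exe PID= 38955.\n"
)
print(rwf_name_from_log(header))
```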
I’m pretty sure it should work for other kinds of calculations in which restarting from the checkpoint file is not as easy, so if you run into this kind of problem, it’s worth a try.
As far as population analysis methods go, the Quantum Theory of Atoms in Molecules (QTAIM), a.k.a. Atoms in Molecules (AIM), has become a popular option for defining atomic properties in molecular systems; however, its calculation is a bit tricky and maybe not as straightforward as Mulliken’s or NBO.
Personally I find AIM a philosophical question. Ever since the introduction of the molecule concept by Stanislao Cannizzaro in 1860 (although it was previously developed by Amedeo Avogadro, who was dead at the time of the Karlsruhe congress), some questions have been truly interesting and far deeper than we usually consider: does an atom retain its identity when bound to others? Where does an atom end and the next begin? What are the connections between atoms in a molecule? It takes a big mental leap to think about how matter is organized to give rise to substances. I’m particularly interested in the concept of a Molecular Graph, which in turn is concerned with the way we “draw lines” to form conceptual molecules. Perhaps in a different post we can go into the details of the method, which is based on the Laplacian operator of the electron density, but today I just want to collect the basic steps for getting the most basic AIM answers for any given molecule. Recently, my good friend Pezhman Zarabadi-Poor and I have used the following procedure rather extensively. We hope to have a couple of manuscripts published later on. Therefore, I’ve asked Pezhman to write a sort of guest post on how to run AIMALL, which is our selected program for the integration algorithm.
The first thing we need is a WFN or WFX file, which contains the wavefunction in a Fortran-formatted text file on which the Laplacian integration is to be performed. This is achieved in Gaussian09 by including the keyword output=wfn or output=wfx in the route section and adding a name for this file at the bottom line of the input file, e.g.
(NOTE: WFX is an eXtended version of WFN; particularly necessary when using pseudopotentials or ECP’s)
Analyzing this file requires the use of third party software such as the AIMALL suite of programs, of which the standard version is free of charge upon registration on their website.
Open AIMStudio (the accompanying graphical interface) and select the AIMQB program from the Run menu, as shown in Figure 1.
Select your WFN/WFX file on which the calculation is to be run. (Figure 2)
You can control several options for the integration of the Laplacian of the electron density as well as other features. If your molecules are simple enough, you may get a successful and meaningful calculation using the default settings. After the calculation is finished, several result files are obtained. In this tutorial we’ll work only with *.mpgviz (which contains information about the molecular graph, MG) and *.sum (which contains all the needed numerical data).
Visualization of the MG yields different kinds of critical points, such as: 1) Nuclear Attractor Critical Points (NACP); 2) Bond Critical Points (BCP); 3) Ring CP’s (RCP); and 4) Cage CP’s (CCP).
Of the above, BCP’s are the ones that indicate the presence of a chemical bond between two atoms, although this conclusion is not without controversy, as pointed out by Foroutan-Nejad in his paper: C. Foroutan-Nejad, S. Shahbazian and R. Marek, Chemistry – A European Journal, 2014, 20, 10140-10152. However, as a first approximation, BCP’s can help us explore chemical interactions.
Now, let’s go back to visualizing those MGs (in our examples we’ve used methane, ethylene and acetylene). We open the corresponding *.mpgviz file in AIMStudio and export the image from the File menu using the Save as Picture option (Figure 3).
The labeled atoms are NACP’s while the green dots correspond to BCP’s. Multiplicity of a bond cannot be discerned within the MG; in order to find out whether a bond is a single, double or triple bond we have to look into the *.sum file, in which we’ll take a look at the bond orders between pairs of atoms in the section labeled “Diatomic Electron Pair Contributions and Delocalization Data” (Figure 4).
Delocalization indexes, DI’s, show the approximate number of electrons shared between two atoms. From the above examples we get the following DI(C,C) values: 1.93 for C2H4 and 2.87 for C2H2; on the other hand, DI(C,H) values are 0.98 for CH4, 0.97 in C2H4 and 0.96 in C2H2. These are our usual bond orders.
This is the first part of a crash tutorial on AIM; in my opinion these are the very basics anyone needs to get started with this interesting and widespread method. Thanks to all who asked about QTAIM, now you have your long answer.
Thanks a lot to my good friend Dr Pezhman Zarabadi-Poor for providing this contribution to the blog, we hope you all find it helpful. Please share and comment.