# Blog Archives

## Basis Set Superposition Error (BSSE). A short intro

Molecular Orbitals (MOs) are linear combinations of Atomic Orbitals (AOs), which in turn are linear combinations of other functions called ‘basis functions’. A basis, or more accurately a basis set, is a collection of functions that obey a set of rules (such as being orthogonal to each other and possibly being normalized) and from which all AOs are constructed. Although these functions are centered on each atomic nucleus, the canonical way in which they are combined yields delocalized MOs; in other words, an MO can occupy a large space spanning several atoms at once. We don’t mind this expansion across a molecule, but what about between two molecules? Calculating the interaction energy between two or more molecular fragments leads to an artificial extra-stabilization term, which stems from the fact that electrons in molecule 1 can occupy AOs (or the basis functions which form them) centered on atoms from molecule 2.

Fundamentally, the interaction energy of any A-B dimer, $E_{int}$, is calculated as the energy difference between the dimer and the separately calculated energies for each component (Equation 1).

$$E_{int} = E_{AB} - E_{A} - E_{B} \qquad (1)$$

However, the calculation of $E_{int}$ by this method is highly sensitive to the choice of basis set because of the Basis Set Superposition Error (BSSE) described in the first paragraph. The BSSE is particularly troublesome when small basis sets are used, since each fragment is then poorly described on its own and gains the most from borrowing its partner's functions; but treating this error by just choosing a larger basis set is seldom practical for systems of considerable size. In Equation 1, $E_{A}$ and $E_{B}$ are calculated with the basis sets of A and B respectively, i.e., only $E_{AB}$ benefits from a larger basis set (that of A and B simultaneously). The Counterpoise method is a nifty correction to Equation 1 which removes this imbalance by calculating each component with the full AB basis set (Equation 2):

$$E_{int}^{CP} = E_{AB}^{AB} - E_{A}^{AB} - E_{B}^{AB} \qquad (2)$$

where the superscript AB means the whole basis set is used. This is accomplished by using ‘*ghost*’ atoms, which carry no nuclear charge and no electrons, only empty basis functions centered on their positions.
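The arithmetic behind Equations 1 and 2 is small enough to sketch in a few lines of Python. The function names are my own, and the energies in the usage lines are invented for illustration (they are not results from a real calculation); the only physical constant is the commonly used hartree to kcal/mol conversion factor.

```python
HARTREE_TO_KCAL_MOL = 627.5095  # commonly used conversion factor


def interaction_energy(e_ab, e_a, e_b):
    """Equation 1 or 2: dimer energy minus the two fragment energies (hartree)."""
    return e_ab - e_a - e_b


def bsse(e_a_own, e_b_own, e_a_full, e_b_full):
    """BSSE estimate: how much the ghost functions lower the fragment energies.

    *_own  -> fragment energy in its own basis set (used in Equation 1)
    *_full -> fragment energy in the full AB basis set (used in Equation 2)
    """
    return (e_a_own - e_a_full) + (e_b_own - e_b_full)


def to_kcal_mol(e_hartree):
    """Convert an energy from hartree to kcal/mol."""
    return e_hartree * HARTREE_TO_KCAL_MOL


# Hypothetical energies in hartree, for illustration only.
raw = interaction_energy(-200.010, -100.000, -100.000)
corrected = interaction_energy(-200.010, -100.001, -100.002)
error = bsse(-100.000, -100.000, -100.001, -100.002)
```

With these made-up numbers the corrected interaction energy is less negative than the raw one, and the difference between the two is exactly the BSSE estimate.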

In Gaussian, BSSE is calculated with the Counterpoise method of Boys and Bernardi, as later extended by Simon and coworkers. It requires the keyword Counterpoise=N, where N is the number of fragments to be considered (for an A-B system, N=2). Each atom in the coordinates list must be labeled with the fragment it belongs to; additionally, the charge and multiplicity of the whole supermolecular ensemble and of each fragment must be specified. Follow the example of this hydrogen fluoride dimer.

    %chk=HF2.chk
    #P opt wB97XD/6-31G(d,p) Counterpoise=2

    HF dimer

    0,1 0,1 0,1
    H(Fragment=1) 0.00 0.00 0.00
    F(Fragment=1) 0.00 0.00 0.70
    H(Fragment=2) 0.00 0.00 1.00
    F(Fragment=2) 0.00 0.00 1.70

For closed-shell fragments the charge and multiplicity line is straightforward, but keep in mind that the first pair of numbers corresponds to the whole ensemble, whereas the following pairs correspond to each fragment in consecutive order. Fragments do not need to be specified contiguously, i.e., you don’t need to define all atoms for fragment 1 and after those the atoms for fragment 2, etc. They can be mixed and the program still assigns them correctly. Just as an example I typed wB97XD, but any other method, DFT or ab initio, may be used; only semiempirical methods do not admit a BSSE calculation because they don’t let you choose a basis set in the first place!
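If you build many of these inputs, the bookkeeping (ensemble pair first, then one pair per fragment, and a fragment label on every atom) can be scripted. Below is a minimal sketch; `counterpoise_input` and its arguments are hypothetical names of my own, not part of Gaussian, and the function only writes text in the layout of the example above.

```python
def counterpoise_input(route, title, total, fragments, atoms, chk=None):
    """Build a Gaussian Counterpoise input as a string.

    total     -> (charge, multiplicity) of the whole ensemble
    fragments -> list of (charge, multiplicity), one per fragment, in order
    atoms     -> list of (symbol, fragment_number, x, y, z), in any order
    """
    n_frag = len(fragments)
    used = {atom[1] for atom in atoms}
    if used != set(range(1, n_frag + 1)):
        raise ValueError("every fragment 1..N needs at least one atom")

    # Ensemble pair first, then the fragments in consecutive order.
    pairs = [total] + list(fragments)
    charge_line = " ".join(f"{q},{m}" for q, m in pairs)

    lines = [f"%chk={chk}"] if chk else []
    # Counterpoise=N is appended here so N always matches the fragment count.
    lines += [f"{route} Counterpoise={n_frag}", "", title, "", charge_line]
    for symbol, frag, x, y, z in atoms:
        lines.append(f"{symbol}(Fragment={frag}) {x:.2f} {y:.2f} {z:.2f}")
    lines.append("")  # the molecule specification ends with a blank line
    return "\n".join(lines) + "\n"


hf_dimer = counterpoise_input(
    "#P opt wB97XD/6-31G(d,p)", "HF dimer", (0, 1), [(0, 1), (0, 1)],
    [("H", 1, 0.0, 0.0, 0.0), ("F", 1, 0.0, 0.0, 0.7),
     ("H", 2, 0.0, 0.0, 1.0), ("F", 2, 0.0, 0.0, 1.7)],
    chk="HF2.chk",
)
```

Printing `hf_dimer` reproduces the hydrogen fluoride input shown above; the two-decimal coordinate format is only there to match that example and should be widened for real geometries.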

The output provides the corrected energy (in atomic units) for the whole system, as well as the BSSE correction, so the uncorrected energy of the system can be recovered from the two. Gaussian 16 also provides these values in kcal/mol as ‘Complexation energies’: first the raw (uncorrected) and then the corrected energy.

BSSE is always present and cannot be entirely eliminated because of the use of finite basis sets but it can be correctly dealt with if the Counterpoise method is included.

## Some .fchk files won’t open in GaussView5.0 (Update)

A couple of weeks ago I posted a workaround for a common problem with .fchk files, which trigger the error below when opened with GaussView5.0. As I expected, this error has to do with the use of diffuse functions in the basis set and is related to a change of format between Gaussian versions.

    CConnectionGFCHK::Parse_GFCHK() Missing or bad data: Alpha Orbital Energies Line Number 1234

Although the method described in the previous post works just fine, the following update is a better approach. The problem comes from a change of spelling between G03 and G09: the spelling was corrected in G09, but GaussView versions prior to 5.0.9 still expect the old one, so for those versions one must change “*independent*” to “*independant*” in the .fchk file.

To make the change directly from the terminal the following command is needed:

    sed -i 's/independent/independant/g' file.fchk

Alternatively you can redirect the output to a new file

    sed -e 's/independent/independant/g' file.fchk > newfile.fchk

if you want to keep the old version and work with a new one.

Of course this edit can be performed manually with any text editor available (for example if you work in Windows), but solutions from the terminal always seem easier and a lot more fun to me.
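If sed is not at hand (on Windows, for instance), the same substitution can be scripted. This is a minimal sketch of an equivalent in Python; `respell_fchk` is a name I made up, and it does nothing more than the second sed command above: it leaves the original untouched and writes the respelled copy to a new file.

```python
from pathlib import Path


def respell_fchk(src, dst):
    """Copy src to dst, replacing every 'independent' with 'independant'.

    Returns the number of replacements, so you can tell whether the file
    needed the change at all.
    """
    text = Path(src).read_text()
    count = text.count("independent")
    Path(dst).write_text(text.replace("independent", "independant"))
    return count
```

Calling `respell_fchk("file.fchk", "newfile.fchk")` keeps the old version and gives you a new one to open in GaussView.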

Thanks to Dr. Fernando Cortés for sharing his insight into this issue.

## If a .fchk file won’t open in GaussView5.0

I’ve found the following error regarding the opening of .fchk files in GaussView5.0.

    CConnectionGFCHK::Parse_GFCHK() Missing or bad data: Alpha Orbital Energies Line Number 1234

The error can be worked around to a first approximation (i.e. GV will at least open and visualize the file, but other issues may arise) by opening the file in a text editor and changing the number of basis functions to equal the number of independent functions (which is lower):

    FILE HEADER
    FOpt      RM062X                                                      6-311++G(d,p)
    Number of atoms                            I               75
    Info1-9                                    I   N=           9
             163         163           0           0           0         110
               2          18        -502
    Charge                                     I                0
    Multiplicity                               I                1
    Number of electrons                        I              314
    Number of alpha electrons                  I              157
    Number of beta electrons                   I              157
    Number of basis functions                  I             1199
    Number of independent functions            I             1199
    Number of point charges in /Mol/           I                0
    Number of translation vectors              I                0
    Atomic numbers                             I   N=          75
    ... ... ... ...

Once both numbers match you can open the file normally and work with it. My guess is this will continue to happen with highly polarized basis sets but I need to run some tests.
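The manual edit can also be scripted. The sketch below is only that, a sketch: `match_basis_to_independent` is a hypothetical helper of mine, and it assumes the usual fixed-width layout of scalar integer entries in a formatted checkpoint file (a 40-character label, three spaces, the type letter `I`, five spaces and a 12-character integer). Check that your file follows that layout before trusting the result.

```python
def match_basis_to_independent(lines):
    """Return a copy of the .fchk lines in which the number of basis
    functions is set equal to the number of independent functions.

    lines -> the file content as a list of strings without line endings,
             e.g. the result of text.splitlines()
    """
    basis_label = "Number of basis functions"
    n_indep = None
    for line in lines:
        # Matches both spellings: "independent" and "independant".
        if line.startswith("Number of independ"):
            n_indep = int(line.split()[-1])
            break
    if n_indep is None:
        raise ValueError("no 'Number of independent functions' entry found")

    fixed = []
    for line in lines:
        if line.startswith(basis_label):
            # Assumed layout: A40, 3X, A1, 5X, I12
            line = f"{basis_label:<40}   I     {n_indep:>12d}"
        fixed.append(line)
    return fixed
```

A typical use would be to read the file, call the function on `text.splitlines()`, and write the joined result to a new file so the original stays intact. As noted above, this only gets GaussView to open the file; other issues may still arise.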

## The Local Bond Order, LBO (Barroso et al. 2004)

I don’t know why I haven’t written about the Local Bond Order (LBO) before! And a few days ago when I thought about it my immediate reaction was to shy away from it since it would constitute a blatant self-promotion attempt; but hell! this is my blog! A place I’ve created for my blatant self-promotion! So without further ado, I hereby present to you one of my own original contributions to Theoretical Chemistry.

During the course of my graduate years I grew interested in weakly bonded inorganic systems, namely those with secondary interactions in bidentate ligands such as xanthates, dithiocarboxylates, dithiocarbamates and so on. Describing the resulting geometries around the central metal atom involved invoking secondary interactions defined purely by geometrical parameters (Alcock, 1972): an interaction was deemed present if the interatomic distance was longer than the sum of the covalent radii and yet shorter than the sum of the van der Waals radii. This definition is subject to a lot of limitations, such as the accuracy of the measurement, which in turn is related to the quality of the single crystal used in the X-ray diffraction experiment; the particular set of radii used (Pauling, Bondi, etc.); and most importantly, it doesn’t shed light on the roles of crystal packing, intermolecular contacts, and the energetics of the interaction.

This is why in 2004 we developed a simple yet useful definition of bond order which, for a single molecule in vacuo, could account for the strength and relevance of the secondary interaction relative to the well-defined covalent bonds.

Barroso-Flores, J. et al. *Journal of Organometallic Chemistry* 689 (2004) 2096–2102, http://dx.doi.org/10.1016/j.jorganchem.2004.03.035

Let a Molecular Orbital be defined as a wavefunction $\psi_i$, which in turn may be constructed as a linear combination of Atomic Orbitals (or atom-centered basis set functions) $\varphi_j$:

$$\psi_i = \sum_j c_{ij}\,\varphi_j$$

We define $\zeta_{LBO}$ in the following way, where we explicitly take into account a doubly occupied orbital (hence the multiplication by 2); we are therefore assuming a closed-shell configuration in the Restricted formalism.

The summation is carried over all the orbitals which belong to atom $A_1$ and those of atom $A_2$.

Simplifying, we obtain

where $S_{jk}$ is the overlap integral for the $\varphi_j$ and $\varphi_k$ functions.

By summing over all *i* MOs, this definition projects all the MOs onto the space of the functions centered on atoms A1 and A2. The definition is purely quantum mechanical in nature and is independent of any geometric requirement on the interacting atoms (i.e. interatomic distance); thus it can be used as a complement to the internuclear distance argument to assess the interaction between them. It is also very simple and easy to calculate, for all you need are the coefficients of the LCAO expansion and the respective overlap integrals.
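To show how little is needed, here is a minimal sketch in Python. Be warned that it rests on an assumption: it takes the simplified expression to be twice the sum, over the doubly occupied MOs *i*, of the products c_ij · c_ik · S_jk with *j* running over the functions on A1 and *k* over those on A2. That form follows the description in this post, but please check the paper for the exact expression before using it for anything serious. The function name and argument layout are mine.

```python
def local_bond_order(coeffs, overlap, basis_on_a1, basis_on_a2, n_occ):
    """Assumed form: 2 * sum_i sum_{j in A1} sum_{k in A2} c_ij * c_ik * S_jk.

    coeffs[i][j]   -> coefficient of basis function j in MO i
    overlap[j][k]  -> overlap integral S_jk
    basis_on_a1/a2 -> indices of the basis functions centered on each atom
    n_occ          -> number of doubly occupied MOs (restricted closed shell)
    """
    total = 0.0
    for i in range(n_occ):
        for j in basis_on_a1:
            for k in basis_on_a2:
                total += coeffs[i][j] * coeffs[i][k] * overlap[j][k]
    return 2.0 * total


# Toy two-function model (made-up overlap of 0.6, one bonding MO).
s = 0.6
c = (2.0 * (1.0 + s)) ** -0.5  # normalized bonding combination
value = local_bond_order([[c, c]], [[1.0, s], [s, 1.0]], [0], [1], 1)
```

In this toy model the result reduces analytically to S/(1+S), which is an easy sanity check on the loops.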

Unfortunately, the Local Bond Order hasn’t found much echo, partly because it is hidden in a journal that wasn’t the best fit for it. I hope someone finds it interesting and useful; if so, don’t forget to cite it appropriately 😉

## The Gen keyword in Gaussian. Adding an external basis set.

I am frequently asked how to include an extra set of basis functions in a calculation or how to use an entirely external basis set. Sometimes this question also implies the explicit declaration of an external pseudopotential or Effective Core Potential (ECP).

New basis sets and ECPs are published continuously in specialized journals. The same happens with functionals for DFT calculations. There is no standard format in which they are published; usually only a list of coefficients and exponents is shown, and one has to figure out how to introduce it in one’s calculation. The EMSL Basis Set Exchange site helps you get it right! It has a clickable periodic table and a list of many (not all) different basis sets on the left side. Below the periodic table there is a menu from which one can select the program the basis set is intended for; finally, we click on “get basis set” and a pop-up window shows the result in the selected format along with the corresponding references for citation. A multiple query can be performed by selecting more than one element on the table, which generates a list that almost surely can be used as input without further manipulation. Dr. David Feller is to be thanked for leading the creation of this repository. More on the history and mission of the EMSL can be found on their About page. Because of my experience, the rest of the post addresses the inclusion of external basis sets in Gaussian; other programs such as NWChem will be addressed in a different post soon.

The correct format for the inclusion of an external basis set is exemplified below with the 3-21G basis set for Carbon, as obtained from the EMSL Basis Set Exchange site (blank lines are marked explicitly just to emphasize their location):

    spin multiplicity
    Molecular coordinates
    - blank line -
    C 0
    S 3 1.00
     172.2560000 0.0617669
     25.9109000 0.3587940
     5.5333500 0.7007130
    SP 2 1.00
     3.6649800 -0.3958970 0.2364600
     0.7705450 1.2158400 0.8606190
    SP 1 1.00
     0.1958570 1.0000000 1.0000000
    ****
    - blank line -

The use of four stars ‘****’ is mandatory to indicate the end of the basis set specification for any given atom. If a basis set is to be declared for a second atom, it should be included after the **** line without any blank line in between.
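When the exponents and coefficients come from a paper rather than from the Basis Set Exchange, typing the block by hand is error prone. The following is a small sketch of a formatter; `gen_block` and its data layout are my own invention, and it only mirrors the layout of the Carbon example above (seven decimals, one primitive per line, terminator included).

```python
def gen_block(symbol, shells):
    """Format one atom's basis set for the Gen section.

    shells -> list of (shell_type, scale, primitives); each primitive is
              (exponent, coefficient) or, for SP shells,
              (exponent, s_coefficient, p_coefficient)
    """
    lines = [f"{symbol} 0"]
    for shell_type, scale, primitives in shells:
        lines.append(f"{shell_type} {len(primitives)} {scale:.2f}")
        for primitive in primitives:
            lines.append(" " + " ".join(f"{v:.7f}" for v in primitive))
    lines.append("****")  # mandatory terminator for every atom
    return "\n".join(lines)


carbon_321g = gen_block("C", [
    ("S", 1.0, [(172.256, 0.0617669), (25.9109, 0.358794),
                (5.53335, 0.700713)]),
    ("SP", 1.0, [(3.66498, -0.395897, 0.23646),
                 (0.770545, 1.21584, 0.860619)]),
    ("SP", 1.0, [(0.195857, 1.0, 1.0)]),
])
```

Blocks for several atoms can then be joined with a single newline, which guarantees there is no blank line between them.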

WARNING! Sometimes we can find more than one version of a basis set in a single file; this is due to different representations, spherical (pure) or cartesian. Pure functions use 5 functions for *d*-type orbitals and 7 for *f*-type orbitals (5D, 7F), whereas cartesian functions use 6 and 10, respectively (6D, 10F). For basis sets read in through Gen, Gaussian uses pure (5D, 7F) functions by default. Calculations must be consistent throughout, hence all basis functions should be either cartesian or pure.
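For reference, the number of functions per shell follows directly from the angular momentum *l*: a pure (spherical) shell has 2l+1 components and a cartesian shell has (l+1)(l+2)/2. A two-function sketch (the function names are mine) makes the D and F labels easy to check:

```python
def n_pure(l):
    """Number of pure (spherical) functions in a shell of angular momentum l."""
    return 2 * l + 1


def n_cartesian(l):
    """Number of cartesian functions in a shell of angular momentum l."""
    return (l + 1) * (l + 2) // 2


# d shells (l = 2) and f shells (l = 3)
counts = {
    "pure": (n_pure(2), n_pure(3)),
    "cartesian": (n_cartesian(2), n_cartesian(3)),
}
```

For *s* and *p* shells both representations give the same count, which is why the distinction only matters from *d* functions upwards.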

Inclusion of a pseudopotential frees computational resources for the calculation of the electronic structure of the valence shell by replacing the inner electrons with a set of functions which simulate their presence and their effect (such as shielding) on the valence electrons. There are full-core pseudopotentials, which replace the entire core (kernel). There are also medium-core pseudopotentials, which replace only the next smaller kernel, allowing the outermost core electrons to be calculated explicitly. The correct inclusion of a pseudopotential is shown below, exemplified by the LANL2DZ ECP by Hay and Wadt for the Chlorine atom.

    spin multiplicity
    Molecular coordinates
    - blank line -
    basis set for atom1
    ****
    basis set for atom2 (if there is any)
    ****
    - blank line -
    CL 0
    CL-ECP 2 10
    d potential
     5
    1 94.8130000 -10.0000000
    2 165.6440000 66.2729170
    2 30.8317000 -28.9685950
    2 10.5841000 -12.8663370
    2 3.7704000 -1.7102170
    s-d potential
     5
    0 128.8391000 3.0000000
    1 120.3786000 12.8528510
    2 63.5622000 275.6723980
    2 18.0695000 115.6777120
    2 3.8142000 35.0606090
    p-d potential
     6
    0 216.5263000 5.0000000
    1 46.5723000 7.4794860
    2 147.4685000 613.0320000
    2 48.9869000 280.8006850
    2 13.2096000 107.8788240
    2 3.1831000 15.3439560

If a second ECP is to be introduced, it should be placed right after the first one without any blank line! If a blank line is detected then the program will assume it’s done reading all ECPs and Basis Sets.

Finally, here is an example of a combination of both keywords. If a second ECP were needed, we’d place it right after the first one without a blank line. The molecule is any given chlorinated hydrocarbon (H, C and Cl atoms exclusively).

    #P B3LYP/gen pseudo=read ADDITIONAL-KEYWORDS
    -blank line-
    Title line
    -blank line-
    0 1
    Molecular Coordinates
    -blank line-
    H 0
    S 3 1.00
     19.2384000 0.0328280
     2.8987000 0.2312040
     0.6535000 0.8172260
    S 1 1.00
     0.1776000 1.0000000
    ****
    C 0
    S 7 1.00
     4233.0000000 0.0012200
     634.9000000 0.0093420
     146.1000000 0.0454520
     42.5000000 0.1546570
     14.1900000 0.3588660
     5.1480000 0.4386320
     1.9670000 0.1459180
    S 2 1.00
     5.1480000 -0.1683670
     0.4962000 1.0600910
    S 1 1.00
     0.1533000 1.0000000
    P 4 1.00
     18.1600000 0.0185390
     3.9860000 0.1154360
     1.1430000 0.3861880
     0.3594000 0.6401140
    P 1 1.00
     0.1146000 1.0000000
    ****
    Cl 0
    S 2 1.00
     2.2310000 -0.4900589
     0.4720000 1.2542684
    S 1 1.00
     0.1631000 1.0000000
    P 2 1.00
     6.2960000 -0.0635641
     0.6333000 1.0141355
    P 1 1.00
     0.1819000 1.0000000
    ****
    -blank line-
    CL 0
    CL-ECP 2 10
    d potential
     5
    1 94.8130000 -10.0000000
    2 165.6440000 66.2729170
    2 30.8317000 -28.9685950
    2 10.5841000 -12.8663370
    2 3.7704000 -1.7102170
    s-d potential
     5
    0 128.8391000 3.0000000
    1 120.3786000 12.8528510
    2 63.5622000 275.6723980
    2 18.0695000 115.6777120
    2 3.8142000 35.0606090
    p-d potential
     6
    0 216.5263000 5.0000000
    1 46.5723000 7.4794860
    2 147.4685000 613.0320000
    2 48.9869000 280.8006850
    2 13.2096000 107.8788240
    2 3.1831000 15.3439560
    -blank line-
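Since the blank lines are what trips people up, here is a minimal sketch of how the sections could be assembled programmatically. `gen_ecp_input` is a hypothetical helper of mine, not a Gaussian tool; it simply puts exactly one blank line between sections, none inside the basis set or ECP sections, and one at the very end.

```python
def gen_ecp_input(route, title, charge, multiplicity, coordinates,
                  basis_blocks, ecp_blocks):
    """Assemble a Gen + Pseudo=Read input as a string.

    coordinates  -> list of coordinate lines, one per atom
    basis_blocks -> list of strings, one per atom, each ending in ****
    ecp_blocks   -> list of strings, one per atom with an ECP
    """
    for block in list(basis_blocks) + list(ecp_blocks):
        # A blank line inside a section would end the reading prematurely.
        if any(not row.strip() for row in block.split("\n")):
            raise ValueError("blocks must not contain blank lines")
    for block in basis_blocks:
        if not block.endswith("****"):
            raise ValueError("each basis set block must end with ****")

    molecule = "\n".join([f"{charge} {multiplicity}"] + list(coordinates))
    sections = [
        route,
        title,
        molecule,
        "\n".join(basis_blocks),  # consecutive atoms, no blank line between
        "\n".join(ecp_blocks),    # consecutive ECPs, no blank line between
    ]
    # One blank line between sections and one closing blank line.
    return "\n\n".join(sections) + "\n\n"
```

The checks are deliberately strict: it is better for the script to refuse a block with a stray blank line than for the program to stop reading basis sets halfway through.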

If you like this post or found it useful please leave a comment, share it or just give it a like. It is as much fun to find out people are reading as it is finding the answer to one’s questions in someone else’s blog 🙂

Peace out!