# Blog Archives

## Basis Set Superposition Error (BSSE). A short intro

Molecular Orbitals (MOs) are linear combinations of Atomic Orbitals (AOs), which in turn are linear combinations of other functions called ‘basis functions’. A basis, or more accurately a basis set, is a collection of functions which obey a set of rules (such as being orthogonal to each other and possibly being normalized) with which all AOs are constructed, and although these are centered on each atomic nucleus, the canonical way in which they are combined yield delocalized MOs; in other words, an MO can occupy a large space spanning several atoms at once. We don’t mind this expansion across a molecule, but what about between two molecules? Calculating the interaction energy between two or more molecular fragments leads to an artificial extra–stabilization term that stems from the fact that electrons in molecule 1 can occupy AO’s (or the basis functions which form them) centered on atoms from molecule 2.

Fundamentally, the interaction energy of any A—B dimer, *E _{int}*, is calculated as the energy difference between the dimer and the separately calculated energies for each component (Equation 1).

*E _{int} = E_{AB} – E_{A} – E_{B}* (

**1**)

However the calculation of *E _{int} *by this method is highly sensitive to the choice of basis set due to the Basis Set Superposition Error (BSSE) described in the first paragraph. The BSSE is particularly troublesome when small basis sets are used, due to the poor description of dispersion interactions but treating this error by just choosing a larger basis set is seldom useful for systems of considerable sizes. The Counterpoise method is a nifty correction to equation 1, in which EA and EB are calculated with the basis set of A and B respectively, i.e., only in EAB a larger basis set (that of A and B simultaneously) is used. The Counterpoise method calculates each component with the AB basis set (Equation 2)

*E _{int}^{CP} = E_{AB}^{AB} – E_{A}^{AB}– E_{B}^{AB}* (

**2**)

where the superscript AB means the whole basis set is used. This is accomplished by using ‘*ghost*‘ atoms with no nuclei and no electrons but empty basis set functions centered on them.

In Gaussian, BSSE is calculated with the Counterpoise method developed by Boys and Simon. It requires the keyword Counterpoise=N where N is the number of fragments to be considered (for an A—B system, N=2). Each atom in the coordinates list must be specified to which fragment pertains; additionally, the charge and multiplicity for each fragment and the whole supermolecular ensemble must be specified. Follow the example of this hydrogen fluoride dimer.

%chk=HF2.chk #P opt wB97XD/6-31G(d,p) Counterpoise=2 HF dimer 0,1 0,1 0,1 H(Fragment=1) 0.00 0.00 0.00 F(Fragment=1) 0.00 0.00 0.70 H(Fragment=2) 0.00 0.00 1.00 F(Fragment=2) 0.00 0.00 1.70

For closed shell fragments the first line is straightforward but one must pay attention that the first pair of numbers in the charge multiplicity line correspond to the whole ensemble, whereas the folowing pairs correspond to each fragment in consecutive order. Fragments do not need to be specified contiguously, i.e., you don’t need to define all atoms for fragment 1 and after those the atoms for fragment 2, etc. They could be mixed and the program still assigns them correctly. Just as an example I typed wB97XD but any other method, DFT or ab initio, may be used; only semiempirical methods do not admit a BSSE calculation because they don’t make use of a basis set in the first place!

The output provides the corrected energy (in atomic units) for the whole system, as well as the BSSE correction (which added to the previous term yields the un-corrected energy of the system). Gaussian16 also provides these values in kcal/mol as ‘Complexation energies’ first raw (uncorrected) and then the corrected energy.

BSSE is always present and cannot be entirely eliminated because of the use of finite basis sets but it can be correctly dealt with if the Counterpoise method is included.

## Post Calculation Addition of Empirical Dispersion – Fixing interaction energies

Calculation of interaction energies is one of those things people are more concerned with and is also something mostly done wrong. The so called ‘*gold standard*‘ according to Pavel Hobza for calculating supramolecular interaction energies is the CCSD(T)/CBS level of theory, which is highly impractical for most cases beyond 50 or so light atoms. Basis set extrapolation methods and inclusion of electronic correlation with MP2 methods yield excellent results but they are not nonetheless almost as time consuming as CC. DFT methods in general are terrible and still are the most widely used tools for electronic structure calculations due to their competitive computing times and the wide availability of schemes for including terms which help describe various kinds of interactions. The most important ingredients needed to get a decent to good interaction energies values calculated with DFT methods are correlation and dispersion. The first part can be recreated by a good correlation functional and the use of empirical dispersion takes care of the latter shortcoming, dramatically improving the results for interaction energies even for lousy functionals such as the infamous B3LYP. The results still wont be of benchmark quality but still the deviations from the *gold standard* will be shortened significantly, thus becoming more quantitatively reliable.

There is an online tool for calculating and adding the empirical dispersion from Grimme’s group to a calculation which originally lacked it. In the link below you can upload your calculation, select the basis set and functionals employed originally in it, the desired damping model and you get in return the corrected energy through a geometrical-Counterpoise correction and Grimme’s empirical dispersion function, D3, of which I have previously written here.

The gCP-D3 Webservice is located at: http://wwwtc.thch.uni-bonn.de/

The platform is entirely straightforward to use and it works with xyz, turbomole, orca and gaussian output files. The concept is very simple, a both gCP and D3 contributions are computed in the selected basis set and added to the uncorrected DFT (or HF) energy (eq. 1)

(**1**)

If you’re trying to calculate interaction energies, remember to perform these corrections for every component in your supramolecular assembly (eq. 2)

(**2**)

Here’s a screen capture of the outcome after uploading a G09 log file for the simplest of options B3LYP/6-31G(*d*), a decomposed energy is shown at the left while a 3D interactive Jmol rendering of your molecule is shown at the right. Also, various links to the literature explaining the details of these calculations are available in the top menu.

I’m currently writing a book chapter on methods for calculating ineraction energies so expect many more posts like this. A special mention to Dr. Jacinto Sandoval, who is working with us as a postdoc researcher, for bringing this platform to my attention, I was apparently living under a rock.