# Blog Archives

## Basis Set Superposition Error (BSSE). A short intro

Molecular Orbitals (MOs) are linear combinations of Atomic Orbitals (AOs), which in turn are linear combinations of other functions called ‘basis functions’. A basis, or more accurately a basis set, is a collection of functions which obey a set of rules (such as being orthogonal to each other and possibly being normalized) with which all AOs are constructed. Although these functions are centered on each atomic nucleus, the canonical way in which they are combined yields delocalized MOs; in other words, an MO can occupy a large space spanning several atoms at once. We don’t mind this expansion across a molecule, but what about between two molecules? Calculating the interaction energy between two or more molecular fragments leads to an artificial extra-stabilization term that stems from the fact that electrons in molecule 1 can occupy AOs (or the basis functions which form them) centered on atoms from molecule 2.

Fundamentally, the interaction energy of any A–B dimer, $E_{int}$, is calculated as the energy difference between the dimer and the separately calculated energies for each component (Equation 1).

$$E_{int} = E_{AB} - E_{A} - E_{B} \tag{1}$$

However, the calculation of $E_{int}$ by this method is highly sensitive to the choice of basis set due to the Basis Set Superposition Error (BSSE) described in the first paragraph. The BSSE is particularly troublesome when small basis sets are used, since each fragment then gains the most from borrowing its partner’s functions (and weak interactions such as dispersion are poorly described to begin with), but treating this error by just choosing a larger basis set is seldom practical for systems of considerable size. In Equation 1, $E_{A}$ and $E_{B}$ are calculated with the basis sets of A and B respectively, i.e., only $E_{AB}$ benefits from a larger basis set (that of A and B simultaneously). The Counterpoise method is a nifty correction to Equation 1 in which each component is instead calculated with the full AB basis set (Equation 2)

$$E_{int}^{CP} = E_{AB}^{AB} - E_{A}^{AB} - E_{B}^{AB} \tag{2}$$

where the superscript AB means the whole basis set is used. This is accomplished by using ‘*ghost*’ atoms, which have no nuclei and no electrons but carry empty basis functions centered on them.
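To make Equations 1 and 2 concrete, here is a minimal Python sketch of the bookkeeping. The function name and the energies are made-up placeholders (not results from any real calculation), and taking the BSSE as the difference between the two interaction energies is an assumption of this sketch.

```python
HARTREE_TO_KCAL_MOL = 627.5095  # 1 hartree expressed in kcal/mol


def interaction_energies(e_ab, e_a, e_b, e_a_ghost, e_b_ghost):
    """Return (raw, counterpoise-corrected, BSSE) energies in hartree.

    e_a, e_b             -- fragments in their own basis sets (Equation 1)
    e_a_ghost, e_b_ghost -- fragments in the full AB basis (Equation 2)
    """
    e_int = e_ab - e_a - e_b                  # Equation 1
    e_int_cp = e_ab - e_a_ghost - e_b_ghost   # Equation 2
    bsse = e_int_cp - e_int                   # artificial extra stabilization
    return e_int, e_int_cp, bsse


# Hypothetical energies in hartree, chosen only to illustrate the arithmetic.
raw, cp, bsse = interaction_energies(
    e_ab=-200.010, e_a=-100.000, e_b=-100.000,
    e_a_ghost=-100.001, e_b_ghost=-100.002,
)

for label, value in (("raw", raw), ("corrected", cp), ("BSSE", bsse)):
    print(f"{label:>9}: {value: .6f} Eh = {value * HARTREE_TO_KCAL_MOL: .3f} kcal/mol")
```

Because each fragment can only be stabilized by the extra ghost functions, the corrected interaction energy in this sketch comes out less negative than the raw one.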

In Gaussian, BSSE is calculated with the Counterpoise method developed by Boys and Bernardi and later extended to geometry optimizations by Simon and co-workers. It requires the keyword Counterpoise=N, where N is the number of fragments to be considered (for an A–B system, N=2). Each atom in the coordinate list must be assigned to the fragment it belongs to; additionally, the charge and multiplicity of each fragment and of the whole supermolecular ensemble must be specified. Follow the example of this hydrogen fluoride dimer.

    %chk=HF2.chk
    #P opt wB97XD/6-31G(d,p) Counterpoise=2

    HF dimer

    0,1 0,1 0,1
    H(Fragment=1) 0.00 0.00 0.00
    F(Fragment=1) 0.00 0.00 0.70
    H(Fragment=2) 0.00 0.00 1.00
    F(Fragment=2) 0.00 0.00 1.70

For closed shell fragments the charge and multiplicity line is straightforward, but one must pay attention that the first pair of numbers in that line corresponds to the whole ensemble, whereas the following pairs correspond to each fragment in consecutive order. Fragments do not need to be specified contiguously, i.e., you don’t need to define all atoms for fragment 1 and after those the atoms for fragment 2, etc. They can be mixed and the program still assigns them correctly. Just as an example I typed wB97XD, but any other method, DFT or ab initio, may be used; only semiempirical methods do not admit a BSSE calculation because they don’t make use of a basis set in the first place!
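If you prepare many of these jobs, the layout rules above can be captured in a small script. The following is only a sketch: `counterpoise_input` is a hypothetical helper (not part of Gaussian or of any library) that writes the same input shown for the HF dimer, with the whole-system charge and multiplicity pair first and one pair per fragment after it.

```python
def counterpoise_input(title, route, total, fragments, atoms, chk=None):
    """Build the text of a Gaussian Counterpoise job (hypothetical helper).

    total     -- (charge, multiplicity) of the whole system
    fragments -- list of (charge, multiplicity), one per fragment, in order
    atoms     -- list of (symbol, fragment_number, x, y, z), in any order
    """
    n = len(fragments)
    used = {atom[1] for atom in atoms}
    if used != set(range(1, n + 1)):
        raise ValueError("every fragment from 1 to N needs at least one atom")

    # The first pair describes the whole ensemble, then one pair per fragment.
    pairs = [total] + list(fragments)
    charge_line = " ".join(f"{q},{m}" for q, m in pairs)

    lines = []
    if chk:
        lines.append(f"%chk={chk}")
    lines.append(f"{route} Counterpoise={n}")
    lines += ["", title, "", charge_line]
    for symbol, frag, x, y, z in atoms:
        lines.append(f"{symbol}(Fragment={frag}) {x:.2f} {y:.2f} {z:.2f}")
    lines.append("")  # the molecule specification ends with a blank line
    return "\n".join(lines) + "\n"


text = counterpoise_input(
    title="HF dimer",
    route="#P opt wB97XD/6-31G(d,p)",
    total=(0, 1),
    fragments=[(0, 1), (0, 1)],
    atoms=[
        ("H", 1, 0.00, 0.00, 0.00),
        ("F", 1, 0.00, 0.00, 0.70),
        ("H", 2, 0.00, 0.00, 1.00),
        ("F", 2, 0.00, 0.00, 1.70),
    ],
    chk="HF2.chk",
)
print(text)
```

The atoms are written in whatever order they are given, since the fragment label travels with each atom rather than with its position in the list.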

The output provides the corrected energy (in atomic units) for the whole system, as well as the BSSE correction (which added to the previous term yields the un-corrected energy of the system). Gaussian16 also provides these values in kcal/mol as ‘Complexation energies’ first raw (uncorrected) and then the corrected energy.

BSSE is always present and cannot be entirely eliminated because of the use of finite basis sets but it can be correctly dealt with if the Counterpoise method is included.

## A personal artistic impression of the CompChem Landscape

In a nutshell, computational chemistry models are about depicting, reproducing and predicting the electronic-based molecular reality. I had this conversation with my students last week and at some point I drew a parallel between them and art in terms of how such reality is approached.

**Semi empirical methods**

Prehistoric wall paintings depict a coarse aspect of reality without any detail, but nevertheless we can draw some conclusions from the images. In the most sophisticated of these images, the cave paintings in Altamira, we can discern a bison, or could it be a bull? But definitely not a giraffe nor a whale, much in the same way Hückel’s method provides an *ad hoc* picture of π electron density without any regard for the σ portion of the electron density or the conformational possibilities (*s-cis* and *s-trans* 1,3-butadiene have the same Hückel description).
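The butadiene remark can be checked with a few lines of Python. This is a sketch under the usual simple Hückel assumptions (one parameter α per carbon, one β per bonded pair, orbital energies E = α + xβ); the function names are hypothetical. Only the bond list enters, so both conformers produce the very same matrix.

```python
import math


def adjacency(n_atoms, bonds):
    """Hückel connectivity matrix: 1 for bonded pi centres, 0 otherwise."""
    a = [[0.0] * n_atoms for _ in range(n_atoms)]
    for i, j in bonds:
        a[i][j] = a[j][i] = 1.0
    return a


def determinant(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    m = [row[:] for row in matrix]
    n = len(m)
    det = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            return 0.0
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            det = -det
        det *= m[col][col]
        for row in range(col + 1, n):
            factor = m[row][col] / m[col][col]
            for k in range(col, n):
                m[row][k] -= factor * m[col][k]
    return det


def secular_determinant(x, a):
    """det(x*I - A); its roots x give the orbital energies alpha + x*beta."""
    n = len(a)
    return determinant(
        [[(x if i == j else 0.0) - a[i][j] for j in range(n)] for i in range(n)]
    )


# Both conformers have the same C1-C2-C3-C4 connectivity; geometry never enters.
s_cis = adjacency(4, [(0, 1), (1, 2), (2, 3)])
s_trans = adjacency(4, [(0, 1), (1, 2), (2, 3)])

# Roots of x^4 - 3x^2 + 1, written with the golden ratio.
phi = (1 + math.sqrt(5)) / 2
roots = [phi, 1 / phi, -1 / phi, -phi]

print("same Hückel matrix:", s_cis == s_trans)
for x in roots:
    print(f"x = {x: .3f}  secular determinant = {secular_determinant(x, s_cis):.1e}")
```

Telling the two conformers apart requires a method in which the 3D coordinates actually appear in the Hamiltonian.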

More sophisticated semi-empirical Hamiltonians like PM3 or PM6 have better parametrizations and hence yield better results. We are still replacing a lot of information with experimental or adjusted parameters, so we still cannot truly adopt the result as truthful. Take this early medieval painting of one of the first Kings of England, Æthelred the Unready. By today’s standards it is a good children’s drawing rather than a royal portrait, yet we now see more detail and can discern many more features, yielding a better description of a human figure than those found in Altamira or Egypt.

**Hartree-Fock**

HF is the simplest of *ab initio* methods, meaning that no experimental results or adjustable parameters are introduced. Even more so, from the HF equations for a multi-electron system that complies with Pauli’s exclusion principle, the exchange operator arises as a new quantum feature of matter with no classical analogue. Still, there are some shortcomings. Correlation energy is disregarded and most results vary according to the basis set employed. Take the impressionist movement, especially in France: in Monet’s Lady with Umbrella we have a more complicated composition, we observe many more features and, although we have a better description of color composition, some details, like her face, remain obscure. The impressionists are characterized by their broad strokes: the thicker the strokes, the harder it is to observe details, much like what happens in HF with a small basis set, whereas finer strokes correspond to a larger one.

**CI (Configuration Interaction)**

Extending HF to a multi-determinant method yields better results. In CI we take the original guess wavefunction (as expressed through a Slater Determinant) and extend it with one or many more determinants; thus a linear combination of Slater Determinants gives rise to a broader description of the ground state, because other electronic configurations are involved to include more details like the ionic and covalent pictures (configurations). The more terms we include, the more real the results feel. If we take classical figurative paintings we have a similar result; most of these paintings are made up of many elements, and the more realistically each element is captured the more real the whole composition looks, even if some elements are merely indicated.

**CCSD(T), full-CI, CASPT2**

In Edvard Munch’s The Scream, we might think we have lost some information again and gone back to impressionism, but we know this is actually an expressionist painting; we can now not only observe details of the figurative portion of the image, but Munch has also captured his subject’s fear in the form of distortions of the subjective reality. In this way, CCSD(T), full-CI and CASPT2 methods provide a description of the ground as well as the excited states which, in experimental reality, are only accessed through a perturbation of the electron density by electromagnetic radiation. Something resembling radiation has perturbed the subject in The Scream, rendering him frightened and wondering how to return to his ground state, or if such a thing will even be possible.

**Density Functional Methods**

If only because of its widespread use, DFT has risen as the preferred method. One of the reasons behind its success is the reduced computing time when compared to previous *ab initio* methods. So DFT is pretty much like photography, in which reality is captured in full, though only apparently, after selecting a given lens, an exposure, a filter, a shutter speed and the occasional Photoshop for correcting issues such as aliasing. In photography, as in DFT, all details concerning the procedure or method for capturing an uncanny reproduction of reality must be stated in every case for reproduction purposes.

Now, in the end it all comes down to Magritte’s Pipe. *Ceci n’est pas une pipe* (or, ‘this is not a pipe’) reminds us that in painting, as in modeling, we don’t get reality but rather a depiction of it. In this famous painting we look at an image that in our heads resembles that of a pipe, but we cannot grab it, fill it with tobacco and smoke it.

The image above is a digital file, which translated becomes a scaled reproduction of an image painted by Magritte in which we see the 2D projection of the image of an object that reminds us of a pipe. In fact, the real name of this work is *The Treachery of Images*, definitely quite an epistemological problem on perception and knowledge, but before I get too metaphysical I should finish this post.

Can you find where cubism or surrealism should be placed? With MPn methods, perhaps?