# Blog Archives

## Percentage of Molecular Orbital Composition – G09,G16

Canonical molecular orbitals are, by construction, delocalized over the various atoms making up a molecule. In some contexts it is important to know how much of any given orbital is made up by a particular atom or group of atoms. You could calculate it by hand from the coefficients of each MO in terms of every AO (or basis set function) centered on each atom, but there is a straightforward way to do it in Gaussian.

If we’re talking about ‘dividing’ a molecular orbital into atomic components, we’re most definitely talking about population analysis calculations, so we’ll resort to the pop keyword and the orbitals option in the standard syntax:

`#p M052x/cc-pVDZ pop=orbitals`

This will produce the following output right after the Mulliken population analysis section:

```
Atomic contributions to Alpha molecular orbitals:
Alpha occ 140 OE=-0.314 is Pt1-d=0.23 C38-p=0.16 C31-p=0.16 C36-p=0.16 C33-p=0.15
Alpha occ 141 OE=-0.313 is Pt1-d=0.41
Alpha occ 142 OE=-0.308 is Cl2-p=0.25
Alpha occ 143 OE=-0.302 is Cl2-p=0.72 Pt1-d=0.18
Alpha occ 144 OE=-0.299 is Cl2-p=0.11
Alpha occ 145 OE=-0.298 is C65-p=0.11 C58-p=0.11 C35-p=0.11 C30-p=0.11
Alpha occ 146 OE=-0.293 is C58-p=0.10
Alpha occ 147 OE=-0.291 is C22-p=0.09
Alpha occ 148 OE=-0.273 is Pt1-d=0.18 C11-p=0.12 C7-p=0.11
Alpha occ 149 OE=-0.273 is Pt1-d=0.18
Alpha vir 150 OE=-0.042 is C9-p=0.18 C13-p=0.18
Alpha vir 151 OE=-0.028 is C7-p=0.25 C16-p=0.11 C44-p=0.11
Alpha vir 152 OE=0.017 is Pt1-p=0.10
Alpha vir 153 OE=0.021 is C36-p=0.15 C31-p=0.14 C63-p=0.12 C59-p=0.12 C38-p=0.11 C33-p=0.11
Alpha vir 154 OE=0.023 is C36-p=0.13 C31-p=0.13 C63-p=0.11 C59-p=0.11
Alpha vir 155 OE=0.027 is C65-p=0.11 C58-p=0.10
Alpha vir 156 OE=0.029 is C35-p=0.14 C30-p=0.14 C65-p=0.12 C58-p=0.11
Alpha vir 157 OE=0.032 is C52-p=0.09
Alpha vir 158 OE=0.040 is C50-p=0.14 C22-p=0.13 C45-p=0.12 C17-p=0.11
Alpha vir 159 OE=0.044 is C20-p=0.15 C48-p=0.14 C26-p=0.12 C54-p=0.11
```

Alpha and Beta orbitals are listed separately only in unrestricted calculations; otherwise only the Alpha set is printed. The orbitals are listed sequentially (occ = occupied; vir = virtual), each followed by its energy in atomic units (OE = orbital energy) and then by the fraction with which each atom contributes to that MO.
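If you want to post-process these lines rather than read them by eye, a small parser is enough. The following is a minimal sketch that assumes every line has exactly the layout shown in the output above; the function name `parse_orbital_line` and the dictionary keys are my own choices, and lines with fragment labels are not handled.

```python
import re

# Layout assumed from the sample output above, e.g.
#   "Alpha occ 143 OE=-0.302 is Cl2-p=0.72 Pt1-d=0.18"
LINE_RE = re.compile(
    r"^\s*(Alpha|Beta)\s+(occ|vir)\s+(\d+)\s+OE=(-?\d+\.\d+)\s+is\s+(.*)$"
)
# One contribution such as "Pt1-d=0.18": atom label, shell type, fraction.
TERM_RE = re.compile(r"([A-Z][a-z]?\d+)-([a-z])=(\d*\.\d+)")


def parse_orbital_line(line):
    """Return a dict describing one orbital line, or None if it does not match."""
    match = LINE_RE.match(line)
    if match is None:
        return None
    spin, kind, index, energy, rest = match.groups()
    contributions = [
        (atom, shell, float(fraction))
        for atom, shell, fraction in TERM_RE.findall(rest)
    ]
    return {
        "spin": spin,
        "occupied": kind == "occ",
        "index": int(index),
        "energy": float(energy),  # atomic units, as printed
        "contributions": contributions,
    }


record = parse_orbital_line("Alpha occ 143 OE=-0.302 is Cl2-p=0.72 Pt1-d=0.18")
print(record)
```

Anything that does not look like an orbital line returns `None`, so the function can be mapped over a whole output file and the misses filtered out.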

By default only the ten highest occupied and ten lowest virtual orbitals are assessed. The number of MOs to be analyzed can be changed with orbitals=N; to have all orbitals analyzed, use the option AllOrbitals instead of orbitals. The threshold for printing a contribution is 10% by default, and it can be changed with the option ThreshOrbitals=N. For the same compound as before, here are the output lines for the HOMO and LUMO (MOs 149 and 150) with ThreshOrbitals=1, i.e. a 1% contribution threshold:

```
Alpha occ 149 OE=-0.273 is Pt1-d=0.18 N4-p=0.08 N6-p=0.08 C20-p=0.06 C13-p=0.06 C48-p=0.06 C9-p=0.06 C24-p=0.05 C52-p=0.05 C16-p=0.04 C44-p=0.04 C8-p=0.03 C15-p=0.03 C17-p=0.03 C45-p=0.02 C46-p=0.02 C18-p=0.02 C26-p=0.02 C54-p=0.02 N5-p=0.01 N3-p=0.01
Alpha vir 150 OE=-0.042 is C9-p=0.18 C13-p=0.18 C44-p=0.08 C16-p=0.08 C15-p=0.06 C8-p=0.06 N6-p=0.04 N4-p=0.04 C52-p=0.04 C24-p=0.04 N5-p=0.03 N3-p=0.03 C46-p=0.03 C18-p=0.03 C48-p=0.02 C20-p=0.02
```

The fragment=n label in the coordinates can be used as in BSSE Counterpoise calculations and the output will show the orbital composition by fragments with the label "Fr", grouping all contributions to the MO by the AOs centered on the atoms in that fragment.
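If the calculation was run without fragment labels, a similar grouping can be done after the fact on the printed fractions. Here is a minimal sketch that adds up the contributions per element for the HOMO line printed with ThreshOrbitals=1; the function name `contributions_by_element` is hypothetical, and the totals only include what was printed, so they are limited by the two-decimal rounding and by the threshold.

```python
import re
from collections import defaultdict

# Element symbol, atom number, shell type and fraction, e.g. "Pt1-d=0.18".
TERM_RE = re.compile(r"([A-Z][a-z]?)(\d+)-([a-z])=(\d*\.\d+)")


def contributions_by_element(line):
    """Sum the printed fractions of one orbital line for each element."""
    totals = defaultdict(float)
    for element, _number, _shell, fraction in TERM_RE.findall(line):
        totals[element] += float(fraction)
    # Two decimals is all the precision the output carries.
    return {element: round(value, 2) for element, value in totals.items()}


homo = (
    "Alpha occ 149 OE=-0.273 is Pt1-d=0.18 N4-p=0.08 N6-p=0.08 C20-p=0.06 "
    "C13-p=0.06 C48-p=0.06 C9-p=0.06 C24-p=0.05 C52-p=0.05 C16-p=0.04 "
    "C44-p=0.04 C8-p=0.03 C15-p=0.03 C17-p=0.03 C45-p=0.02 C46-p=0.02 "
    "C18-p=0.02 C26-p=0.02 C54-p=0.02 N5-p=0.01 N3-p=0.01"
)

by_element = contributions_by_element(homo)
accounted = round(sum(by_element.values()), 2)
print(by_element, accounted)
```

Replacing the element with a lookup from atom number to a user-defined group would give the same kind of summary the fragment labels produce, under the same caveats.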

As always, thanks for reading, sharing, and rating. I hope someone finds this useful.

## Basis Set Superposition Error (BSSE). A short intro

Molecular Orbitals (MOs) are linear combinations of Atomic Orbitals (AOs), which in turn are linear combinations of other functions called ‘basis functions’. A basis, or more accurately a basis set, is a collection of functions (usually normalized, although functions centered on different atoms are in general not orthogonal to each other) with which all AOs are constructed. Although these functions are centered on each atomic nucleus, the canonical way in which they are combined yields delocalized MOs; in other words, an MO can occupy a large space spanning several atoms at once. We don’t mind this expansion across a molecule, but what about between two molecules? Calculating the interaction energy between two or more molecular fragments leads to an artificial extra stabilization term, which stems from the fact that electrons in molecule 1 can occupy AOs (or the basis functions which form them) centered on atoms from molecule 2.

Fundamentally, the interaction energy of any A-B dimer, Eint, is calculated as the difference between the energy of the dimer and the separately calculated energies of each component (Equation 1).

$$E_{int} = E_{AB} - E_{A} - E_{B} \qquad (1)$$

However, the calculation of Eint by this method is highly sensitive to the choice of basis set because of the Basis Set Superposition Error (BSSE) described in the first paragraph. The BSSE is particularly troublesome when small basis sets are used, due to the poor description of dispersion interactions, but treating this error by just choosing a larger basis set is seldom practical for large systems. In Equation 1, EA and EB are each calculated with their own basis set (that of A and that of B, respectively), so only EAB benefits from the larger basis set of A and B together. The Counterpoise method is a nifty correction to Equation 1 that removes this imbalance by calculating every component with the full AB basis set (Equation 2):

$$E_{int}^{CP} = E_{AB}^{AB} - E_{A}^{AB} - E_{B}^{AB} \qquad (2)$$

where the superscript AB means the whole basis set is used. This is accomplished by using ‘ghost’ atoms, which have no nuclei and no electrons but keep their empty basis set functions centered on them.
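The arithmetic behind Equations 1 and 2 is simple enough to script. The sketch below is only an illustration: the five energies are made-up numbers in hartree, not results of any calculation, the function names are hypothetical, and the monomer energies are assumed to be taken at the geometry they have in the dimer.

```python
HARTREE_TO_KCAL_MOL = 627.5095  # conversion factor


def interaction_energy(e_dimer, e_a, e_b):
    """Equations 1 and 2 share this form; only the monomer basis differs."""
    return e_dimer - e_a - e_b


def counterpoise_summary(e_ab, e_a, e_b, e_a_ghost, e_b_ghost):
    """Raw and counterpoise-corrected interaction energies plus the BSSE.

    e_a, e_b             : monomers in their own basis sets (Equation 1)
    e_a_ghost, e_b_ghost : monomers in the full AB basis (Equation 2)
    """
    raw = interaction_energy(e_ab, e_a, e_b)
    corrected = interaction_energy(e_ab, e_a_ghost, e_b_ghost)
    return {"raw": raw, "corrected": corrected, "bsse": corrected - raw}


# Hypothetical energies in hartree, chosen only to show the bookkeeping.
summary = counterpoise_summary(
    e_ab=-200.0100,
    e_a=-100.0000,
    e_b=-100.0000,
    e_a_ghost=-100.0010,
    e_b_ghost=-100.0015,
)
for name, value in summary.items():
    print(f"{name:>9}: {value: .4f} Eh = {value * HARTREE_TO_KCAL_MOL: .2f} kcal/mol")
```

With these numbers the raw interaction energy is more negative than the corrected one, which is the artificial extra stabilization the BSSE introduces.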

In Gaussian, BSSE is calculated with the Counterpoise method developed by Boys and Bernardi. It requires the keyword Counterpoise=N, where N is the number of fragments to be considered (for an A-B system, N=2). Each atom in the coordinates list must be assigned to a fragment; additionally, the charge and multiplicity of the whole supermolecular ensemble and of each fragment must be specified. The following example is a hydrogen fluoride dimer.

```
%chk=HF2.chk
#P opt wB97XD/6-31G(d,p) Counterpoise=2

HF dimer

0,1 0,1 0,1
H(Fragment=1) 0.00 0.00 0.00
F(Fragment=1) 0.00 0.00 0.70
H(Fragment=2) 0.00 0.00 1.00
F(Fragment=2) 0.00 0.00 1.70
```

For closed shell fragments the charge and multiplicity line is straightforward, but note that the first pair of numbers corresponds to the whole ensemble, whereas the following pairs correspond to each fragment in consecutive order. Fragments do not need to be specified contiguously, i.e., you don’t need to define all atoms for fragment 1 and after those the atoms for fragment 2, etc. They can be mixed and the program still assigns them correctly. I typed wB97XD just as an example; any other method, DFT or ab initio, may be used. Only semiempirical methods do not admit a BSSE calculation, because they don’t make use of a basis set in the first place!
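For systems larger than a dimer it is easy to get the order of the charge and multiplicity pairs wrong, so the input can be generated instead of typed. This is a minimal sketch that reproduces the layout of the HF dimer example above; the function name and argument layout are my own, and it does nothing beyond checking that every atom points to a declared fragment.

```python
def counterpoise_input(route, title, total, fragments, atoms):
    """Build a Gaussian input string for a Counterpoise calculation.

    total     : (charge, multiplicity) of the whole ensemble
    fragments : list of (charge, multiplicity), in fragment order
    atoms     : list of (symbol, fragment_number, x, y, z), in any order
    """
    n_fragments = len(fragments)
    for symbol, fragment, _x, _y, _z in atoms:
        if not 1 <= fragment <= n_fragments:
            raise ValueError(f"{symbol} assigned to unknown fragment {fragment}")

    # The whole ensemble comes first, then each fragment consecutively.
    pairs = [total] + list(fragments)
    charge_line = " ".join(f"{charge},{mult}" for charge, mult in pairs)

    lines = [f"{route} Counterpoise={n_fragments}", "", title, "", charge_line]
    for symbol, fragment, x, y, z in atoms:
        lines.append(f"{symbol}(Fragment={fragment}) {x:.2f} {y:.2f} {z:.2f}")
    lines.append("")  # the input ends with a blank line
    return "\n".join(lines) + "\n"


hf_dimer = counterpoise_input(
    route="#P opt wB97XD/6-31G(d,p)",
    title="HF dimer",
    total=(0, 1),
    fragments=[(0, 1), (0, 1)],
    atoms=[
        ("H", 1, 0.0, 0.0, 0.0),
        ("F", 1, 0.0, 0.0, 0.7),
        ("H", 2, 0.0, 0.0, 1.0),
        ("F", 2, 0.0, 0.0, 1.7),
    ],
)
print(hf_dimer)
```

The atoms are written in whatever order they are given, since the fragments do not have to be contiguous.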

The output provides the corrected energy (in atomic units) for the whole system, as well as the BSSE correction (which added to the previous term yields the un-corrected energy of the system). Gaussian16 also provides these values in kcal/mol as ‘Complexation energies’ first raw (uncorrected) and then the corrected energy.

BSSE is always present and cannot be entirely eliminated because of the use of finite basis sets but it can be correctly dealt with if the Counterpoise method is included.

## Some .fchk files won’t open in GaussView5.0 (Update)

A couple of weeks ago I posted a solution for a common error regarding .fchk files that will display the error below when opened with GaussView5.0. As I expected, this error has to do with the use of diffuse functions in the basis set and is related to a change of format between Gaussian versions.

```
CConnectionGFCHK::Parse_GFCHK()
Missing or bad data: Alpha Orbital Energies
Line Number 1234
```

Although the method described in the previous post works just fine, the following update is a better approach. Due to a change of spelling between G03 and G09 (which has been corrected for G09 but is not available for GV versions prior to 5.0.9), one must replace “independent” with “independant” in the .fchk file.

To make the change directly from the terminal the following command is needed:

`sed -i 's/independent/independant/g' file.fchk`

Alternatively you can redirect the output to a new file

`sed -e 's/independent/independant/g' file.fchk > newfile.fchk`

if you want to keep the old version and work with a new one.

Of course this edit can be performed manually with any text editor available (for example if you work in Windows), but solutions from the terminal always seem easier and a lot more fun to me.
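For those without sed at hand, the same substitution can be sketched in a few lines of Python. This mirrors the second sed command, writing a new file and leaving the original untouched; the function name `respell_fchk` is my own, and the one-line file written in the demo is only a stand-in for a real .fchk file.

```python
import tempfile
from pathlib import Path


def respell_fchk(source, target):
    """Copy source to target with 'independent' respelled as 'independant'.

    Returns the number of occurrences that were replaced.
    """
    text = Path(source).read_text()
    Path(target).write_text(text.replace("independent", "independant"))
    return text.count("independent")


# Demo on a throwaway file so the original spelling can be compared afterwards.
with tempfile.TemporaryDirectory() as workdir:
    old = Path(workdir) / "file.fchk"
    new = Path(workdir) / "newfile.fchk"
    old.write_text("Number of independent functions            I             1199\n")
    replaced = respell_fchk(old, new)
    old_text = old.read_text()
    new_text = new.read_text()

print(replaced, new_text)
```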

Thanks to Dr. Fernando Cortés for sharing his insight into this issue.

## If a .fchk file won’t open in GaussView5.0

I’ve found the following error regarding the opening of .fchk files in GaussView5.0.

```
CConnectionGFCHK::Parse_GFCHK()
Missing or bad data: Alpha Orbital Energies
Line Number 1234
```

The error is prevented to a first approximation (i.e. GV will at least open and display the file, but other issues may arise) by opening the file and modifying the number of basis functions to equal the number of independent functions (which is lower):

```
FILE HEADER
FOpt RM062X 6-311++G(d,p)
Number of atoms I 75
Info1-9 I N= 9
163 163 0 0 0 110
2 18 -502
Charge I 0
Multiplicity I 1
Number of electrons I 314
Number of alpha electrons I 157
Number of beta electrons I 157
Number of basis functions I 1199
Number of independent functions I 1199
Number of point charges in /Mol/ I 0
Number of translation vectors I 0
Atomic numbers I N= 75
... ...
... ...
```

Once both numbers match you can open the file normally and work with it. My guess is this will continue to happen with highly polarized basis sets but I need to run some tests.
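Before editing anything by hand, it may help to check whether a given file has the mismatch at all. The sketch below reads the two counts from the header and compares them; it assumes the value is the last field on each line, as in the header shown above, and the 1150 in the sample is a made-up number used only to trigger the mismatch.

```python
def function_counts(fchk_text):
    """Return the basis and independent function counts found in an fchk header."""
    counts = {}
    for line in fchk_text.splitlines():
        if line.startswith("Number of basis functions"):
            counts["basis"] = int(line.split()[-1])
        # Either spelling may appear, depending on the Gaussian version.
        elif line.startswith(
            ("Number of independent functions", "Number of independant functions")
        ):
            counts["independent"] = int(line.split()[-1])
    return counts


def needs_patch(fchk_text):
    """True when the two counts differ, i.e. the case described in this post."""
    counts = function_counts(fchk_text)
    return counts["basis"] != counts["independent"]


# Hypothetical header fragment; only the two relevant lines are included.
sample = """Number of basis functions I 1199
Number of independent functions I 1150
"""
counts = function_counts(sample)
print(counts, needs_patch(sample))
```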

## The Local Bond Order, LBO (Barroso et al. 2004)

I don’t know why I haven’t written about the Local Bond Order (LBO) before! And a few days ago when I thought about it my immediate reaction was to shy away from it since it would constitute a blatant self-promotion attempt; but hell! this is my blog! A place I’ve created for my blatant self-promotion! So without further ado, I hereby present to you one of my own original contributions to Theoretical Chemistry.

During the course of my graduate years I grew interested in weakly bonded inorganic systems, namely those with secondary interactions in bidentate ligands such as xanthates, dithiocarboxylates, dithiocarbamates and so on. Describing the resulting geometries around the central metal atom involved invoking secondary interactions defined purely by geometrical parameters (Alcock, 1972): an interaction was considered present if the interatomic distance was longer than the sum of the covalent radii and yet shorter than the sum of the van der Waals radii. This definition is subject to many limitations, such as the accuracy of the measurement, which in turn depends on the quality of the single crystal used in the X-ray diffraction experiment, and the definition of radii used (Pauling, Bondi, etc.). Most importantly, it doesn’t shed light on the roles of crystal packing, intermolecular contacts, and the energetics of the interaction.

This is why in 2004 we developed a simple yet useful definition of bond order which could account, for a single molecule in vacuo, for the strength and relevance of the secondary interaction relative to the well defined covalent bonds.

Barroso-Flores, J. et al. Journal of Organometallic Chemistry 689 (2004) 2096–2102, http://dx.doi.org/10.1016/j.jorganchem.2004.03.035

Let a Molecular Orbital be defined as a wavefunction ψi which in turn may be constructed by a linear combination of Atomic Orbitals (or atom centered basis set functions) φj. We define ζLBO explicitly taking into account a doubly occupied orbital (hence the multiplication by 2); we are therefore assuming a closed shell configuration in the Restricted formalism. The summation is carried over all the orbitals which belong to atom A1 and those of atom A2. Simplifying yields an expression in terms of Sjk, the overlap integral for the φj and φk functions.

By summing over all i MOs, this definition projects all the MOs onto the space of the functions centered on atoms A1 and A2. The definition is purely quantum mechanical in nature and is independent of any geometric requirement on the interacting atoms (i.e. interatomic distance); thus it can be used as a complement to the internuclear distance argument to assess the interaction between them. It is also very simple and easy to calculate, for all you need are the coefficients of the LCAO expansion and the respective overlap integrals.
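To give a feel for how little is needed, here is a sketch of a two-center sum built only from LCAO coefficients and overlap integrals. Take it with care: it is a Mulliken-type overlap population between two atoms, used here as a stand-in, and it is an assumption that it resembles the LBO; the exact expression should be taken from the paper cited above. The function name, the H2-like coefficients and the overlap value of 0.66 are all illustrative.

```python
import math


def two_center_overlap_population(coeffs, overlap, on_a1, on_a2, n_occ):
    """Sum 2 * c_ij * c_ik * S_jk over occupied MOs i, j on atom A1, k on atom A2.

    coeffs  : coeffs[i][j] is the coefficient of basis function j in MO i
    overlap : overlap[j][k] is the overlap integral S_jk
    on_a1   : indices of the basis functions centered on atom A1
    on_a2   : indices of the basis functions centered on atom A2
    n_occ   : number of doubly occupied MOs (closed shell, restricted)
    """
    total = 0.0
    for i in range(n_occ):
        for j in on_a1:
            for k in on_a2:
                total += 2.0 * coeffs[i][j] * coeffs[i][k] * overlap[j][k]
    return total


# Toy model: two identical s functions with an assumed overlap of 0.66.
s = 0.66
overlap = [[1.0, s], [s, 1.0]]
norm = 1.0 / math.sqrt(2.0 * (1.0 + s))      # normalized bonding combination
coeffs = [[norm, norm], [1.0 / math.sqrt(2.0 * (1.0 - s)), -1.0 / math.sqrt(2.0 * (1.0 - s))]]

value = two_center_overlap_population(coeffs, overlap, on_a1=[0], on_a2=[1], n_occ=1)
print(round(value, 4))
```

For this toy model the sum reduces analytically to s / (1 + s), which is a handy check on the loops.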

Unfortunately, the Local Bond Order hasn’t found much echo, partly because it is hidden in an ill-suited journal. I hope someone finds it interesting and useful; if so, don’t forget to cite it appropriately 😉

## The Gen keyword in Gaussian. Adding an external basis set.

I am frequently asked how to include an extra set of basis functions in a calculation or how to use an entirely external basis set. Sometimes this question also implies the explicit declaration of an external pseudopotential or Effective Core Potential (ECP).

New basis sets and ECPs are published continuously in specialized journals. The same happens with functionals for DFT calculations. There is no fixed format for publishing them: usually only a list of coefficients and exponents is shown, and one has to figure out how to introduce it in one’s calculation. The EMSL Basis Set Exchange site helps you get it right! It has a clickable periodic table and a list of many (not all) different basis sets at the left side. Below the periodic table there is a menu from which one can select the program the basis set is for; finally, clicking on “get basis set” opens a pop-up window that shows the result in the selected format along with the corresponding references for citation. A multiple query can be performed by selecting more than one element on the table, which generates a list that can almost surely be used as input without further manipulation. Dr. David Feller is to be thanked for leading the creation of this repository. More on the history and mission of the EMSL can be found on their About page. Because of my experience, the rest of the post addresses the inclusion of external basis sets in Gaussian; other programs such as NWChem will be addressed in a different post soon.

The correct format for inclusion of an external basis set is exemplified below with the 3-21G basis set for Carbon as obtained from the EMSL Basis Set Exchange site (blank lines are marked explicitly just to emphasize their location):

```
spin multiplicity
Molecular coordinates
- blank line -
C     0
S   3   1.00
172.2560000              0.0617669
25.9109000              0.3587940
5.5333500              0.7007130
SP   2   1.00
3.6649800             -0.3958970              0.2364600
0.7705450              1.2158400              0.8606190
SP   1   1.00
0.1958570              1.0000000              1.0000000
****
- blank line -
```

The use of four stars ‘****’ is mandatory to indicate the end of the basis set specification for any given atom. If a basis set is to be declared for a second atom, it should be included after the **** line without any blank line in between.
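When a basis set is only available as a table of exponents and coefficients, the block can be written out programmatically. The following is a minimal sketch that reproduces the layout of the carbon 3-21G example above; the function name `format_gen_block` and the tuple layout are my own, and the column widths are cosmetic.

```python
def format_gen_block(element, shells):
    """Format one element's basis set in the layout used with the Gen keyword.

    shells : list of (shell_type, scale_factor, primitives), where each
             primitive is a tuple (exponent, coefficient[, p coefficient]).
    """
    lines = [f"{element}     0"]
    for shell_type, scale, primitives in shells:
        lines.append(f"{shell_type}   {len(primitives)}   {scale:.2f}")
        for row in primitives:
            lines.append("".join(f"{value:>18.7f}" for value in row))
    lines.append("****")  # mandatory terminator for each element
    return "\n".join(lines)


# 3-21G for carbon, with the numbers from the example above.
carbon_321g = [
    ("S", 1.0, [(172.2560000, 0.0617669), (25.9109000, 0.3587940), (5.5333500, 0.7007130)]),
    ("SP", 1.0, [(3.6649800, -0.3958970, 0.2364600), (0.7705450, 1.2158400, 0.8606190)]),
    ("SP", 1.0, [(0.1958570, 1.0000000, 1.0000000)]),
]

block = format_gen_block("C", carbon_321g)
print(block)
```

Blocks for several elements can then be joined with a single newline, since no blank line may appear between them.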

WARNING! Sometimes more than one version of a basis set can be found in a single file. This is due to different representations: spherical (pure) or Cartesian functions. Pure functions use 5 functions for d-type orbitals and 7 for f-type orbitals (5D, 7F), whereas Cartesian functions use 6 and 10, respectively (6D, 10F). Which one Gaussian uses by default depends on the basis set, and the keywords 5D, 6D, 7F and 10F override it. Calculations must be consistent throughout, hence all basis functions should be either Cartesian or pure.
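As a quick reference, the number of components in a shell follows directly from its angular momentum l: a pure (spherical) shell has 2l + 1 functions and a Cartesian shell has (l + 1)(l + 2)/2. A small sketch, with function names of my own choosing:

```python
def n_pure(l):
    """Number of pure (spherical harmonic) functions in a shell of angular momentum l."""
    return 2 * l + 1


def n_cartesian(l):
    """Number of Cartesian functions x^a y^b z^c with a + b + c = l."""
    return (l + 1) * (l + 2) // 2


for label, l in (("s", 0), ("p", 1), ("d", 2), ("f", 3)):
    print(f"{label}: pure = {n_pure(l)}, cartesian = {n_cartesian(l)}")
```

The two counts only differ from d shells upwards, which is why the keywords refer to D and F functions.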

Inclusion of a pseudopotential allows more computational resources to be devoted to the electronic structure of the valence shell by replacing the inner electrons with a set of functions which simulate their presence and their effect (such as shielding) on the valence electrons. There are full core pseudopotentials, which replace the entire core (kernel). There are also medium core pseudopotentials, which replace only the core of the preceding noble gas, allowing the outermost core electrons to be calculated explicitly. The correct inclusion of a pseudopotential is shown below, exemplified by the LANL2DZ ECP by Hay and Wadt for the chlorine atom.

```
spin multiplicity
Molecular coordinates
- blank line -
basis set for atom1
****
basis set for atom2 (if there is any)
****
- blank line -
CL     0
CL-ECP     2     10
d   potential
5
1     94.8130000            -10.0000000
2    165.6440000             66.2729170
2     30.8317000            -28.9685950
2     10.5841000            -12.8663370
2      3.7704000             -1.7102170
s-d potential
5
0    128.8391000              3.0000000
1    120.3786000             12.8528510
2     63.5622000            275.6723980
2     18.0695000            115.6777120
2      3.8142000             35.0606090
p-d potential
6
0    216.5263000              5.0000000
1     46.5723000              7.4794860
2    147.4685000            613.0320000
2     48.9869000            280.8006850
2     13.2096000            107.8788240
2      3.1831000             15.3439560
```

If a second ECP is to be introduced, it should be placed right after the first one without any blank line! If a blank line is detected then the program will assume it’s done reading all ECPs and Basis Sets.

Finally, here is an example of a combination of both keywords. If a second ECP was needed then we’d place it at the end of the first one without a blank line. The molecule is any given chlorinated hydrocarbon (H, C and Cl atoms exclusively)

```
#P B3LYP/gen pseudo=read ADDITIONAL-KEYWORDS
- blank line -
Title
- blank line -
0 1
Molecular Coordinates
- blank line -
H     0
S   3   1.00
19.2384000              0.0328280
2.8987000              0.2312040
0.6535000              0.8172260
S   1   1.00
0.1776000              1.0000000
****
C     0
S   7   1.00
4233.0000000              0.0012200
634.9000000              0.0093420
146.1000000              0.0454520
42.5000000              0.1546570
14.1900000              0.3588660
5.1480000              0.4386320
1.9670000              0.1459180
S   2   1.00
5.1480000             -0.1683670
0.4962000              1.0600910
S   1   1.00
0.1533000              1.0000000
P   4   1.00
18.1600000              0.0185390
3.9860000              0.1154360
1.1430000              0.3861880
0.3594000              0.6401140
P   1   1.00
0.1146000              1.0000000
****
Cl     0
S   2   1.00
2.2310000             -0.4900589
0.4720000              1.2542684
S   1   1.00
0.1631000              1.0000000
P   2   1.00
6.2960000             -0.0635641
0.6333000              1.0141355
P   1   1.00
0.1819000              1.0000000
****
- blank line -
CL     0
CL-ECP     2     10
d   potential
5
1     94.8130000            -10.0000000
2    165.6440000             66.2729170
2     30.8317000            -28.9685950
2     10.5841000            -12.8663370
2      3.7704000             -1.7102170
s-d potential
5
0    128.8391000              3.0000000
1    120.3786000             12.8528510
2     63.5622000            275.6723980
2     18.0695000            115.6777120
2      3.8142000             35.0606090
p-d potential
6
0    216.5263000              5.0000000
1     46.5723000              7.4794860
2    147.4685000            613.0320000
2     48.9869000            280.8006850
2     13.2096000            107.8788240
2      3.1831000             15.3439560
- blank line -
```
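Since misplaced blank lines are the usual way this kind of input goes wrong, the file can also be put together by a script that places them for you. This is a sketch under a few assumptions: the function names are my own, the geometry is a rough illustrative chloromethane, and the basis and ECP blocks in the demo are abbreviated stand-ins, so the printed result shows the layout but is not a usable input.

```python
def _check_no_blank_lines(blocks, kind):
    """A blank line inside a block would end the whole section prematurely."""
    for block in blocks:
        if any(not line.strip() for line in block.splitlines()):
            raise ValueError(f"{kind} block contains a blank line")


def assemble_input(route, title, charge, multiplicity, coordinates,
                   basis_blocks, ecp_blocks):
    """Join the sections of a gen + pseudo=read input with the blank lines in place."""
    _check_no_blank_lines(basis_blocks, "basis")
    _check_no_blank_lines(ecp_blocks, "ECP")
    sections = [
        route, "",
        title, "",
        f"{charge} {multiplicity}", *coordinates, "",
        *basis_blocks, "",      # each block already ends with ****
        *ecp_blocks, "",
    ]
    return "\n".join(sections) + "\n"


text = assemble_input(
    route="#P B3LYP/gen pseudo=read",
    title="Chloromethane, layout demo",
    charge=0,
    multiplicity=1,
    coordinates=[
        "C   0.00  0.00  0.00",
        "Cl  0.00  0.00  1.78",
        "H   1.03  0.00 -0.36",
        "H  -0.51  0.89 -0.36",
        "H  -0.51 -0.89 -0.36",
    ],
    # Abbreviated stand-ins: one shell or one term each, not complete sets.
    basis_blocks=[
        "H     0\nS   1   1.00\n0.1776000              1.0000000\n****",
        "C     0\nS   1   1.00\n0.1533000              1.0000000\n****",
        "Cl     0\nS   1   1.00\n0.1631000              1.0000000\n****",
    ],
    ecp_blocks=["CL     0\nCL-ECP     2     10\nd   potential\n1\n1     94.8130000            -10.0000000"],
)
print(text)
```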

If you like this post or found it useful please leave a comment, share it or just give it a like. It is as much fun to find out people are reading as it is to find the answer to one’s questions in someone else’s blog 🙂

Peace out!