Molecular Orbitals (MOs) are linear combinations of Atomic Orbitals (AOs), which in turn are linear combinations of other functions called ‘basis functions’. A basis, or more accurately a basis set, is a collection of functions which obey a set of rules (such as being orthogonal to each other and possibly being normalized) with which all AOs are constructed. Although these are centered on each atomic nucleus, the canonical way in which they are combined yields delocalized MOs; in other words, an MO can occupy a large space spanning several atoms at once. We don’t mind this expansion across a molecule, but what about between two molecules? Calculating the interaction energy between two or more molecular fragments leads to an artificial extra-stabilization term that stems from the fact that electrons in molecule 1 can occupy AOs (or the basis functions which form them) centered on atoms from molecule 2.
Fundamentally, the interaction energy of any A—B dimer, Eint, is calculated as the energy difference between the dimer and the separately calculated energies for each component (Equation 1).
$E_{int} = E_{AB} - E_{A} - E_{B}$ (1)
However, the calculation of Eint by this method is highly sensitive to the choice of basis set due to the Basis Set Superposition Error (BSSE) described in the first paragraph. The BSSE is particularly troublesome when small basis sets are used, due to the poor description of dispersion interactions, but treating this error by just choosing a larger basis set is seldom practical for systems of considerable size. In equation 1, EA and EB are calculated with the basis sets of A and B, respectively, i.e., only EAB is computed with a larger basis set (that of A and B simultaneously). The Counterpoise method is a nifty correction to equation 1 in which each component is calculated with the full AB basis set (Equation 2)
$E_{int}^{CP} = E_{AB}^{AB} - E_{A}^{AB} - E_{B}^{AB}$ (2)
where the superscript AB means the whole basis set is used. This is accomplished by using ‘ghost’ atoms with no nuclei and no electrons but empty basis set functions centered on them.
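To make the difference between Equations 1 and 2 concrete, here is a minimal Python sketch. The energies are made-up numbers in hartree chosen only for illustration (they do not come from any real calculation), and the function and variable names are my own. It just shows that both equations share the same arithmetic and differ only in which monomer energies are plugged in.

```python
def interaction_energy(e_dimer, e_a, e_b):
    """Interaction energy as dimer energy minus the two monomer energies."""
    return e_dimer - e_a - e_b

# Hypothetical energies in hartree (illustrative values only).
e_ab_in_ab = -200.9100   # dimer AB, full AB basis set
e_a_in_a = -100.4500     # monomer A, its own basis set
e_b_in_b = -100.4500     # monomer B, its own basis set
e_a_in_ab = -100.4510    # monomer A plus ghost functions of B
e_b_in_ab = -100.4512    # monomer B plus ghost functions of A

# Equation 1: monomers described with their own basis sets.
e_int_raw = interaction_energy(e_ab_in_ab, e_a_in_a, e_b_in_b)

# Equation 2: every term described with the whole AB basis set.
e_int_cp = interaction_energy(e_ab_in_ab, e_a_in_ab, e_b_in_ab)

# The BSSE is the artificial stabilization removed by the correction.
bsse = e_int_cp - e_int_raw

print(f"Eint (raw) = {e_int_raw:.4f} Eh")
print(f"Eint (CP)  = {e_int_cp:.4f} Eh")
print(f"BSSE       = {bsse:.4f} Eh")
```

Since the ghost functions can only lower each monomer's energy, the corrected interaction energy comes out less stabilizing than the raw one.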
In Gaussian, BSSE is calculated with the Counterpoise method developed by Boys and Bernardi. It requires the keyword Counterpoise=N where N is the number of fragments to be considered (for an A-B system, N=2). Each atom in the coordinate list must be labeled with the fragment it belongs to; additionally, the charge and multiplicity for each fragment and for the whole supermolecular ensemble must be specified. Follow the example of this hydrogen fluoride dimer.
%chk=HF2.chk
#P opt wB97XD/6-31G(d,p) Counterpoise=2

HF dimer

0,1 0,1 0,1
H(Fragment=1) 0.00 0.00 0.00
F(Fragment=1) 0.00 0.00 0.70
H(Fragment=2) 0.00 0.00 1.00
F(Fragment=2) 0.00 0.00 1.70

For closed-shell fragments the charge and multiplicity line is straightforward, but one must pay attention: the first pair of numbers in that line corresponds to the whole ensemble, whereas the following pairs correspond to each fragment in consecutive order. Fragments do not need to be specified contiguously, i.e., you don’t need to define all atoms for fragment 1 and after those the atoms for fragment 2, etc. They could be mixed and the program still assigns them correctly. Just as an example I typed wB97XD, but any other method, DFT or ab initio, may be used; only semiempirical methods do not admit a BSSE calculation because they don’t make use of an explicit basis set in the first place!
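If you have to prepare many of these inputs, the layout described above is easy to generate with a short script. The following Python sketch is only an illustration: the function `build_counterpoise_input` and its arguments are hypothetical names of my own and not part of Gaussian or of any library. It writes the charge and multiplicity pairs with the whole ensemble first and the fragments after it, and it deliberately lists the atoms in mixed order to show that fragments need not be contiguous.

```python
def build_counterpoise_input(route, title, total, fragments, atoms, chk=None):
    """Return the text of a Gaussian input for a Counterpoise calculation.

    total     -- (charge, multiplicity) of the whole ensemble
    fragments -- list of (charge, multiplicity), one pair per fragment
    atoms     -- list of (symbol, fragment_number, x, y, z)
    """
    n_frag = len(fragments)
    for symbol, frag, *_ in atoms:
        if not 1 <= frag <= n_frag:
            raise ValueError(f"{symbol}: fragment {frag} is not in 1..{n_frag}")

    lines = []
    if chk:
        lines.append(f"%chk={chk}")
    lines.append(f"{route} Counterpoise={n_frag}")
    lines.append("")            # blank line closes the route section
    lines.append(title)
    lines.append("")            # blank line closes the title section
    # Whole ensemble first, then each fragment in consecutive order.
    pairs = [total] + list(fragments)
    lines.append(" ".join(f"{q},{m}" for q, m in pairs))
    for symbol, frag, x, y, z in atoms:
        lines.append(f"{symbol}(Fragment={frag}) {x:.2f} {y:.2f} {z:.2f}")
    lines.append("")            # blank line closes the molecule specification
    return "\n".join(lines) + "\n"

# The HF dimer from the example, with the atoms listed in mixed order.
hf_dimer_input = build_counterpoise_input(
    route="#P opt wB97XD/6-31G(d,p)",
    title="HF dimer",
    total=(0, 1),
    fragments=[(0, 1), (0, 1)],
    atoms=[
        ("H", 1, 0.00, 0.00, 0.00),
        ("H", 2, 0.00, 0.00, 1.00),
        ("F", 1, 0.00, 0.00, 0.70),
        ("F", 2, 0.00, 0.00, 1.70),
    ],
    chk="HF2.chk",
)
print(hf_dimer_input)
```

The value of N in Counterpoise=N is taken from the number of fragment pairs, so the keyword and the charge and multiplicity line cannot get out of sync.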
The output provides the corrected energy (in atomic units) for the whole system, as well as the BSSE correction (the difference between the corrected and the uncorrected energies of the system). Gaussian16 also provides these values in kcal/mol as ‘Complexation energies’, first raw (uncorrected) and then corrected.
BSSE is always present and cannot be entirely eliminated because of the use of finite basis sets but it can be correctly dealt with if the Counterpoise method is included.
As is the case with proteins, the functioning of DNA is highly dependent on its 3D structure and not only on its sequence; the difference is that protein tertiary structure has an enormous variety whereas DNA is (almost) always a double helix with little variation. The canonical base pairs AT and CG stabilize the famous double helix, but the same cannot be guaranteed when non-canonical (unnatural) base pairs (UBPs) are introduced.
When I first took a look at Romesberg’s UBPs, d5SICS and dNaM (referred to throughout the study as X and Y, see Fig. 1), it was evident that they could not form hydrogen bonds; in the end they’re substituted naphthalenes with no discernible way of creating a synthon like their natural counterparts. That’s when I called Dr. Rodrigo Galindo at the University of Utah, who is one of the developers of the AMBER code and who is very knowledgeable on matters of DNA structure and dynamics; he immediately got on board and soon enough we were launching molecular dynamics simulations and quantum mechanical calculations. That was more than two years ago.
Our latest paper in Phys.Chem.Chem.Phys. deals with the dynamical and structural stability of a DNA strand in which Romesberg’s UBPs are introduced sequentially, one pair at a time, into Dickerson’s dodecamer (a palindromic sequence) from the Protein Data Bank. Therein, a d5SICS-dNaM pair was inserted right in the middle, forming a tridecamer; as expected, molecular dynamics simulations of more than 10 microseconds exhibited the same stability as the control dodecamer (Fig. 2 left). We didn’t need to go far into the substitutions to get the double helix to go awry within a couple of microseconds: three non-consecutive inclusions of UBPs were enough to get a less regular structure (Fig. 2 right); with five, a globular structure was obtained, for which it is not possible to get a proper average of the most populated structures.
X and Y don’t form hydrogen bonds, so the pairing is pretty much forced by the scaffold of the rest of the DNA’s double helix. There is some controversy as to how X and Y fit together, whether they overlap or just wedge between each other; according to our results, a C1-C1′ distance of 11 Å is the most stable, which is consistent with the wedging conformation. Much work is still needed to understand the pairing between X and Y, and even more so to get a pair of useful UBPs. More papers on this topic in the near future.
Well, I only contributed the theoretical section by doing electronic structure calculations, so it isn’t really a paper we can ascribe to this particular lab. However, it is really nice to see my name in JACS alongside such a prominent researcher as Prof. Chad Mirkin from Northwestern University, in a work closely related to my area of research interest, such as macrocyclic recognition agents.
In this manuscript, a calixarene is allosterically opened and closed reversibly by coordinating different kinds of ligands to a platinum center linked to the macrocycle. (This approach has been referred to as the weak link approach.) I recently visited Northwestern and had a great time with José Mendez-Arroyo, the first author, who showed me around and opened the possibility for further work between our research groups.
Closed, semi-open and fully open conformations; selectivity is modulated through cavity size. (Ligands: Green = Chloride; Blue = Cyanide)
Here at UNAM we calculated the interaction energies for the two guests that were successfully inserted into the cavity: N-methyl-pyridinium (Eint = 57.4 kcal/mol) and Pyridine-N-oxide (Eint = +200.0 kcal/mol). Below you can see the electrostatic potential mapped onto the electron density isosurface for one of the adducts. The relative orientation of the guests within the cavity follows the expected (anti-) alignment of mutual dipole moments. At this level of theory, we could easily be inclined to assert that the most stable interaction is indeed the one from the semi-open compound and that this in turn is due to the fact that host and guest are packed closer together, but there is also an orbital issue: Pyridine-N-oxide is a better electron acceptor than N-methyl-pyridinium, and when we take a closer look at the interacting (Natural Bond) orbitals it becomes evident that a closer location does not necessarily yield a stronger interaction when the electron-accepting power of the ligand is weaker (which is, in my opinion, both logical and at the same time a bit counterintuitive, yet fascinating nonetheless).
All calculations were performed at the B97D/LANL2DZ level of theory with the use of Gaussian09 and NBO3.1 as provided within the former. Computing time at UNAM’s supercomputer known as ‘Miztli’ is fully acknowledged.
The full citation follows:
A Multi-State, Allosterically-Regulated Molecular Receptor With Switchable Selectivity
Jose Mendez-Arroyo †, Joaquín Barroso-Flores §, Alejo M. Lifschitz †, Amy A. Sarjeant †, Charlotte L. Stern †, and Chad A. Mirkin *†
Thanks to José Mendez-Arroyo for contacting me and giving me the opportunity to collaborate with his research; I’m sure this is the first of many joint projects that will mutually benefit our groups.