Elucidating the pairing of non-hydrogen-bonded unnatural base pairs (UBPs) is still a controversial subject due to the lack of specificity in their mutual interactions. Experimentally, NMR is the method of choice, but the DNA strand must be affixed to a template of sorts, such as a polymerase protein. These discrepancies are well documented in a recent review that cites our previous computational work, both DFT and MD, on UBPs.
Since our last paper on synthetic DNA, my good friend Dr. Rodrigo Galindo from the University of Utah and I have had serious doubts about the real pairing mode exhibited by Romesberg’s famous hydrophobic nucleotides d5SICS – dNaM. While the authors claim a stacked pairing (within the context of the strand in the KlenTaq polymerase enzyme), our simulations showed that a Watson-Crick-like pairing was favored in the native form. To shed further light on the matter we performed converged, microsecond-long simulations, varying the force field (two recent AMBER force fields were explored: Bsc1 and OL15), the water model (TIP3P and OPC), and the ionic compensation scheme (Na+/Cl– or Mg2+/Cl–).
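To picture the set of runs, here is a minimal sketch that enumerates the simulation conditions. It assumes a full grid of every combination, which the text above does not state; the function name and dictionary keys are illustrative only and are not taken from AMBER.

```python
from itertools import product

# Options named in the post; the labels are plain strings, not AMBER keywords.
FORCE_FIELDS = ["Bsc1", "OL15"]
WATER_MODELS = ["TIP3P", "OPC"]
ION_SCHEMES = ["Na+/Cl-", "Mg2+/Cl-"]


def simulation_grid():
    """Return every force field / water model / ion scheme combination.

    Assumption: a full factorial design. The actual study may have run
    only a subset of these combinations.
    """
    return [
        {"force_field": ff, "water": water, "ions": ions}
        for ff, water, ions in product(FORCE_FIELDS, WATER_MODELS, ION_SCHEMES)
    ]


if __name__ == "__main__":
    for index, condition in enumerate(simulation_grid(), start=1):
        print(index, condition)
```

Under that assumption the grid has two times two times two entries, one per combination.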
In the image below, the pairing is consistently WC-like (C1′–C1′ distance ~10.4 Å) in the most populated clusters, regardless of the force field.
Also, a flipping experiment was performed in which both nucleotides were rotated 180° outwards and the system was left to converge inwards, in order to explore a ‘de novo’ pairing guided solely by their mutual interactions and the template formed by the rest of the strand. The most populated C1′–C1′ distances were 10.4 Å for Bsc1 (regardless of ionic compensation) and 9.8 Å for OL15 (10.4 Å when Mg2+ was used for charge compensation).
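For context, a "most populated distance" of this kind can be obtained by measuring the C1′–C1′ separation in every frame and finding the histogram bin with the most counts. The sketch below is only an illustration with made-up coordinates; it is not the analysis pipeline used in the study, and the function names and the 0.2 Å bin width are assumptions.

```python
import math
from collections import Counter


def c1_c1_distances(coords_a, coords_b):
    """Per-frame distance between two atoms.

    coords_a and coords_b are sequences of (x, y, z) tuples in angstroms,
    one entry per trajectory frame.
    """
    return [math.dist(a, b) for a, b in zip(coords_a, coords_b)]


def most_populated_distance(distances, bin_width=0.2):
    """Return the centre of the histogram bin holding the most frames."""
    bins = Counter(math.floor(d / bin_width) for d in distances)
    best_bin, _count = bins.most_common(1)[0]
    return (best_bin + 0.5) * bin_width


if __name__ == "__main__":
    # Hypothetical coordinates for illustration only, not simulation data.
    atom_a = [(0.0, 0.0, 0.0)] * 5
    atom_b = [(10.31, 0.0, 0.0), (10.35, 0.0, 0.0), (10.39, 0.0, 0.0),
              (9.85, 0.0, 0.0), (11.05, 0.0, 0.0)]
    print(most_populated_distance(c1_c1_distances(atom_a, atom_b)))
```

In practice the coordinates would come from a trajectory analysis tool rather than hand-typed tuples, and the bin width would be chosen to match the resolution of the distance distribution.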
Despite the successful replication of these two nucleotides by a living organism (which is a fantastic feat!), there is little chance they can be used for real coding applications (biological or otherwise) due to the lack of structural control of the double helix. The work of Romesberg is impressive, make no mistake about it, but my money isn’t on hydrophobic unnatural nucleotides for information applications 🙂
All credit and glory are due to the amazing Dr. Rodrigo Galindo-Murillo from the University of Utah, where he works as a developer of the AMBER code, among many other things. Go check his impressive record!
As is the case for proteins, the function of DNA depends heavily on its 3D structure and not only on its sequence; the difference is that protein tertiary structure shows enormous variety, whereas DNA is (almost) always a double helix with little variation. The canonical base pairs AT and CG stabilize the famous double helix, but the same cannot be guaranteed when non-canonical (unnatural) base pairs (UBPs) are introduced.
When I first took a look at Romesberg’s UBPs, d5SICS and dNaM (referred to throughout the study as X and Y, see Fig. 1), it was evident that they could not form hydrogen bonds; in the end they’re substituted naphthalenes with no discernible way of forming a synthon like their natural counterparts. That’s when I called Dr. Rodrigo Galindo at the University of Utah, who is one of the developers of the AMBER code and is very knowledgeable on matters of DNA structure and dynamics; he immediately got on board and soon enough we were launching molecular dynamics simulations and quantum mechanical calculations. That was more than two years ago.
Our latest paper in Phys.Chem.Chem.Phys. deals with the dynamical and structural stability of a DNA strand in which Romesberg’s UBPs are introduced sequentially, one pair at a time, into Dickerson’s dodecamer (a palindromic sequence) from the Protein Data Bank. A d5SICS-dNaM pair was first inserted right in the middle, forming a tridecamer; as expected, molecular dynamics simulations longer than 10 microseconds exhibited the same stability as the control dodecamer (Fig. 2 left). We didn’t need to go far into the substitutions to get the double helix to go awry within a couple of microseconds: three non-consecutive inclusions of UBPs were enough to get a less regular structure (Fig. 2 right); with five, a globular structure was obtained for which it is not possible to get a proper average of the most populated structures.
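As a small illustration of the first insertion step, the sketch below places X in the middle of the Dickerson dodecamer sequence, d(CGCGAATTCGCG), and derives the partner strand by treating X and Y as each other’s complement. The exact insertion position and which strand carries X are assumptions made for the example, and the helper names are hypothetical.

```python
DICKERSON = "CGCGAATTCGCG"  # Dickerson dodecamer, written 5' to 3'

# Pairing rules: the natural pairs plus the unnatural X-Y pair.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C", "X": "Y", "Y": "X"}


def insert_ubp(sequence, position, base="X"):
    """Insert one unnatural base before the given 0-based position."""
    return sequence[:position] + base + sequence[position:]


def reverse_complement(sequence):
    """Partner strand read 5' to 3', using the extended pairing rules."""
    return "".join(COMPLEMENT[base] for base in reversed(sequence))


if __name__ == "__main__":
    # Assumption: "right in the middle" means between positions 6 and 7.
    strand = insert_ubp(DICKERSON, len(DICKERSON) // 2)
    print(strand, reverse_complement(strand))
```

Because the dodecamer is palindromic, its reverse complement is the sequence itself; after the insertion the two strands differ only in carrying X or Y at the central position.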
X and Y don’t form hydrogen bonds, so the pairing is pretty much forced by the scaffold of the rest of the DNA double helix. There is some controversy as to how X and Y fit together, whether they overlap or just wedge against each other; our results suggest that a C1′–C1′ distance of 11 Å is the most stable, consistent with the wedging conformation. Much work is still needed to understand the pairing between X and Y, and even more to arrive at a pair of useful UBPs. More papers on this topic in the near future.
Ever since I read the highly praised article by Floyd Romesberg in Nature back in 2013, I have been really interested in synthetic biology. In said article, an unnatural base pair (UBP) was not only inserted into a DNA double strand in vivo, but the organism was also able to reproduce the UBP in subsequent generations.
Inserting new unnatural base pairs in DNA works a lot like editing a computer’s code. Inserting a couple of UBPs in vitro is like inserting a comment; it won’t make a difference but it’s still there. If the DNA sequence containing the UBPs can be amplified by molecular biology techniques such as PCR, it means that a polymerase enzyme is able to recognize the UBP and put it in place; this is equivalent to inserting a ‘hello world’ section into a working code: it will compile but it’s pretty much useless. Inserting these UBPs in vivo means that the organism is able to thrive despite the large deformation in a short section of its genetic code, and having them replicated by the chemical machinery of the nucleus is an amazing feat that only a few molecules could allow.
The ultimate goal of synthetic biology would be to find a UBP that codes effectively and purposefully during translation of DNA. This last feat would be equivalent to inserting a working subroutine with a specific purpose into a program. But UBPs could serve more than the purpose of expanding the genetic code from a quaternary (base four) to a senary (base six) system: the field of DNA origami could also benefit from an expansion in the chemical and structural possibilities of the famous double helix, and marking and editing a sequence would become easier with distinctive sections containing nucleotides other than A, T, C and G.
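To put numbers on the jump from base four to base six, here is a short back-of-the-envelope sketch. It computes the information carried per base and the number of possible codons, assuming codons stay three bases long in an expanded alphabet (an assumption made for the example, not a claim from Romesberg’s work); the function names are illustrative.

```python
import math


def bits_per_base(alphabet_size):
    """Information content of one position, in bits."""
    return math.log2(alphabet_size)


def codon_count(alphabet_size, codon_length=3):
    """Number of distinct codons, assuming a fixed codon length."""
    return alphabet_size ** codon_length


if __name__ == "__main__":
    for size in (4, 6):
        print(size, round(bits_per_base(size), 3), codon_count(size))
```

With four letters each base carries 2 bits and there are 64 triplet codons; with six letters each base carries about 2.585 bits and there would be 216 triplets.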
It is precisely around the concept of the double helix that our research takes place, since the available biochemical machinery for translation and replication can only work on a double helix; otherwise, the repair mechanisms get activated or the DNA just stops serving its purpose (i.e. the code won’t compile).
My good friend Dr. Rodrigo Galindo and I have worked on the simulation of Romesberg’s UBPs in order to understand the underlying structural, dynamical and electronic causes that made them so successful, and possibly to design more efficient UBPs based on a set of general principles. A first paper has been accepted for publication in Phys.Chem.Chem.Phys. and we’re very excited about it; more on that in a future post.