Out of some +1000 twitter accounts I follow about a quarter are related computational chemistry. The following public list isn’t comprehensive and prone to errors and contains researchers, programmers, students, journals, products and companies who gravitate around the use of in silico methods for the understanding and design of chemical and biochemical compounds.
As is the case of proteins, the functioning of DNA is highly dependent on its 3D structure and not just only on its sequence but the difference is that protein tertiary structure has an enormous variety whereas DNA is (almost) always a double helix with little variations. The canonical base pairs AT, CG stabilize the famous double helix but the same cannot be guaranteed when non-canonical -unnatural- base pairs (UBPs) are introduced.
When I first took a look at Romesberg’s UBPS, d5SICS and dNaM (throughout the study referred to as X and Y see Fig.1) it was evident that they could not form hydrogen bonds, in the end they’re substituted naphtalenes with no discernible ways of creating a synton like their natural counterparts. That’s when I called Dr. Rodrigo Galindo at Utah University who is one of the developers of the AMBER code and who is very knowledgeable on matters of DNA structure and dynamics; he immediately got on board and soon enough we were launching molecular dynamics simulations and quantum mechanical calculations. That was more than two years ago.
Our latest paper in Phys.Chem.Chem.Phys. deals with the dynamical and structural stability of a DNA strand in which Romesberg’s UBPs are introduced sequentially one pair at a time into Dickerson’s dodecamer (a palindromic sequence) from the Protein Data Bank. Therein d5SICS-dNaM pair were inserted right in the middle forming a trisdecamer; as expected, +10 microseconds molecular dynamics simulations exhibited the same stability as the control dodecamer (Fig.2 left). We didn’t need to go far enough into the substitutions to get the double helix to go awry within a couple of microseconds: Three non-consecutive inclusions of UBPs were enough to get a less regular structure (Fig. 2 right); with five, a globular structure was obtained for which is not possible to get a proper average of the most populated structures.
X and Y don’t form hydrogen bonds so the pairing is pretty much forced by the scaffold of the rest of the DNA’s double helix. There are some controversies as to how X and Y fit together, whether they overlap or just wedge between each other and according to our results, the pairing suggests that a C1-C1′ distance of 11 Å is most stable consistent with the wedging conformation. Still much work is needed to understand the pairing between X and Y and even more so to get a pair of useful UBPs. More papers on this topic in the near future.
As if I didn’t have enough things to do I’m launching a new blog inspired by the #365papers hashtag on Twitter and the naturalproductman.wordpress.com blog. In it I’ll hopefully list, write a femto-review of all the papers I read. This new effort is even more daunting than the actual reading of the huge digital pile of papers I have in my Mendeley To-Be-Read folder, the fattest of them all. The papers therein wont be a comprehensive review of Comp.Chem. must-read papers but rather papers relevant to our lab’s research or curiosity.
Maybe I’ll include some papers brought to my attention by the group and they could do the review. The whole endeavor might flop in a few weeks but I want to give it a shot; we’ll see how it mutates and if it survives or not. So far I haven’t managed to review all papers read but maybe this post will prompt to do so if only to save some face. The domain of the new blog is compchemdigest.wordpress.com but I think it should have included the word MY at the beginning so as to convey the idea that it is only my own biased reading list. Anyway, if you’re interested share it and subscribe, those post will not be publicized.
Sometimes you just need to optimize some fragment or moiety of your molecule for a number of reasons -whether because of its size, your current interest, or to skew the progress of a previous optimization- or maybe you want just some kind of atoms to have their positions optimized. I usually optimize hydrogen atoms when working with crystallographic files but that for some reason I want to preserve the rest of the molecule as refined, in order to keep it under a crystalline field of sorts.
Asking Gaussian to optimize some of the atoms in your molecule requires you to make a list albeit the logic behind it is not quite straightforward to me. This list is invoked by the ReadOptimize keyword in the route section and it includes all atoms by default, you can then further tell G09 which atoms are to be included or excluded from the optimization.
So, for example you want to optimize all atoms EXCEPT hydrogens, then your input should bear the ReadOptimize keyword in the route section and then, at the end of the molecule specification, the following line:
If you wish to selectively add some atoms to the list while excluding others, here’s an example:
atoms=C H S notatoms=5-8
This list adds, and therefore optimizes, all carbon, hydrogen and sulfur atoms, except atoms 5, 6, 7 and 8, should they be any of the previous elements in the C H S list.
The way I selectively optimize hydrogen atoms is by erasing all atoms from the list -using the noatoms instruction- and then selecting which are to be included in the list -with atoms=H-, but I haven’t tried it with only selecting hydrogen atoms from the start, as in atoms=H
I probably get very confused because I learned to do this with the now obsolete ReadFreeze keyword; now it sometimes may seem to me like I’m using double negatives or something – please do not optimize all atoms except if they are hydrogen atoms. You can include numbers, ranks or symbols in this list as a final line of your input file.
Common errors (by common I mean I’ve got them):
Lets look at the end of an input I just was working with:
> AtmSel: Line=”P 0″
> Maximum list size exceeded in AddBin.
> Error termination via Lnk1e in…
AtmSel is the routine which reads the atoms list and I was using a pseudopotential on phosphorous atoms, I placed the atoms list at the end of the file but it should be placed right after the coordinates and the connectivity matrix, should there be one, and thus before any external basis set or pseudopotential or any other specification to be read by Gaussian.
As a sort of test you can use the instruction:
%kjob l103 %chk=myfile.chk ...
at the Link0 section (where your checkpoint is defined). This will kill the job after the link 103 is finished, thus you will only get a list of what parameters were frozen and which were active. Then, if things look ok, you can run the job without the %kjob l103 instruction and get it done.
As usual I hope this helps. Thanks for reading except to those who didn’t read it except for the parts they did read.