Papers for the 21st Century


The format of a research paper hasn’t changed much throughout history, despite the enormous changes in the platforms available for its consumption and the near extinction of the library issue. Convenient electronic files such as PDFs still resemble printed-and-bound-in-issues papers in their layout instead of exploiting the seemingly endless capabilities of the electronic format.

For instance, why do we still need to have page numbers? A DOI is a full, traceable and unique identifier for each work, and there are so many papers nowadays that publishers have to pour them out as e-first, ASAPs, and just-accepted manuscripts before assigning them page numbers, a process which is still a concern for some researchers (and even for some of the organizations funding them or evaluating their performance). Numbers for issues, volumes and pages are library indexes needed to sort and retrieve information from physical journals, but in the e-realm, where one can browse all issues online, perform a search and download the results, these indexes are hardly of any use; only the year is helpful in establishing a chronological order for the development of ideas. This brings me to the next issue (no pun intended): if bound issues are no longer a thing then neither should covers be. Being selected for a cover is a huge honor; it means the editorial staff think your work stands out from the works published in the same period. But nowadays it is an honor that comes at a price, sometimes a high price. With the existence of covers, back covers, inner covers, inner back covers and whatnot at USD$1,500 apiece, the honor gets a bit diluted. Advertisers know this, and now they place their ads as banners, pop-ups and other online digital formats instead of, to some extent, paying for ad space in the pages of the journals.

I recently posted a quick informal poll on Twitter about the scientific reading habits of chemists and I confirmed what I expected: only one in five still prefers to read papers mostly on actual paper*; the rest rely on an electronic version such as the HTML full text or, most popular, the PDF on a suitable reader.

What came as a surprise to me was that in the follow-up poll, reference manager programs such as Mendeley, Zotero, EndNote or ReadCube were preferred by only 15%, while 80% prefer a PDF reader (I’m guessing Acrobat Reader might be the most popular). A minority seems to prefer the HTML full text version, which I think is the richest but is hardly customizable for note taking, sharing, or, uhm, hoarding.

I’m a Mendeley user because I like the integration between users, its portability between platforms and the synchronization features, but if I were to move to another reference manager it would be ReadCube. I like taking notes, highlighting text, and adding summaries and ideas onto the file, but above all I like the fact that I can conduct searches in the myriad of PDF files I’ve accumulated over the years. During my PhD studies I had piles of (physical) paper and folders with PDF files that sometimes were easier to print than to sort and organize (I even had a spreadsheet with the literature I had read, a nightmarish project in itself!)

So, here is my wish list for what I want e-papers in the 21st century to do. Some features are somewhat available in some journals and some can be achieved within the PDF itself; others would require a new format or a new platform to be carried out. Please comment on what other features you would like to have in papers.

  • Say goodbye to the two-column format. I’m zooming in to a single column anyway.
  • Pop-up charts/plots/schemes/figures. Let me take a look at any graphical object by hovering (or 3D touching in iOS, whatever) on the “see Figure X” legend instead of having to move back and forth to check it, especially when the legend is “see figure SX” and I have to go to the Supporting Information file/section.
  • Pop-up References. Currently some PDFs let you jump to the References section when you click on one, but you can’t jump back; you have to scroll and find the point where you left off.
  • Interactive objects. Structures, whether from X-ray diffraction experiments or calculations, could be deposited as raw coordinate files for people to play with and, most importantly, to download** and work with. This would increase the hosting space journals need to devote to each work, so I’m not holding my breath.
  • Audio output. This one should be trickier, but by far the most helpful. I commute long hours so having papers read out loud would be a huge time-saver, but it has to be smart. Currently I make Siri read papers by opening them in the Mendeley app, then “select all“, “voice“, but when it hits a formula or a set of equations the flow is lost (instead of reading water as ‘H-Two-O‘, it reads ‘H-subscript Two-O‘; try having the formula of a perovskite read out).
  • A compiler that outputs the ‘traditional version‘ for printing. Sure, why not.

I realize this post may come across as shallow in view of the Plan-S or FAIR initiatives; sorry for that, but comfort is not incompatible with accessibility.

What other features do you think research papers should have by now?



* It is true that our attention and, more importantly, our retention of information are not the same when we read on paper as when we read on a screen. Recently there was an interview on this matter on Science Friday.
** I absolutely hate having a Supporting Information section with long PDF lists of coordinates to copy-paste and fix into a new input file. OpenBabel, people!

Failure Reading NMR data in GaussView


The following message came up for a GIAO calculation when trying to open the output file in GaussView 5.0 (it opens successfully in ChemCraft):

CConnectionGLOG::Parse_GLOG()
Failure reading NMR data 
Line Number 2414

When you go to said line (line 2414) you find the following string:

Eigenvalues:-12345.6789 -12345.6789 -12345.6789

This line holds the eigenvalues of the SCF GIAO NMR shielding tensor. The problem lies with the missing space between the colon ‘:’ and the ‘-‘ sign of the first eigenvalue. You can fix it by hand with an editor, but GV only warns you about the first instance, so there may be others and you would need to repeat the procedure. It is probably best to fix them all in one go with the following command from the terminal (filename.log stands for the name of your own output file):

sed -i 's/Eigenvalues:-/Eigenvalues: -/g' filename.log
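For those who would rather not touch the file with sed (say, on a machine without it), the same substitution can be sketched in Python. This is only a minimal sketch: the function names are made up for illustration, and it assumes the only problem in the file is the missing space described above.

```python
import re
from pathlib import Path

# "Eigenvalues:" immediately followed by a minus sign, i.e. the missing space
PATTERN = re.compile(r"Eigenvalues:-")


def fix_eigenvalue_spacing(text: str) -> str:
    """Insert the missing space after 'Eigenvalues:' wherever a '-' follows directly."""
    return PATTERN.sub("Eigenvalues: -", text)


def fix_log_file(path: str) -> int:
    """Fix a log file in place (hypothetical helper); returns the number of lines repaired."""
    log = Path(path)
    fixed, count = PATTERN.subn("Eigenvalues: -", log.read_text())
    if count:
        log.write_text(fixed)
    return count
```

Calling fix_log_file on the output file repairs every instance at once and tells you how many there were, which is handy since GaussView only reports the first one.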

It is good to be back in Romania at the UBB writing these posts where this blog began. Thanks to my good friend Dr. Alexandru Lupan for pointing out this error.

Atom specifications unexpectedly found in input stream.


“Well, where else were they supposed to appear?”

I was sent this error along with the question above for a failed optimization. Apparently there is no answer on the internet (I quickly checked) so here it is:

Gaussian is confused about finding atomic coordinates because there is also a geom=check instruction placed in the route section, i.e., it was told to retrieve the atomic coordinates from a checkpoint and then it was given those atomic coordinates within the input so it doesn’t know what you mean and exits.
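A quick pre-flight check for this situation could look something like the sketch below. It is not part of Gaussian or any official tool; the function name and the regular expressions are my own assumptions, and it only recognizes plain Cartesian lines of the form "symbol x y z" and the geom=check, geom=checkpoint and geom=allcheck spellings.

```python
import re

# Route options that tell Gaussian to read the geometry from the checkpoint file
GEOM_CHECK = re.compile(r"geom\s*=\s*\(?\s*(all)?check", re.IGNORECASE)
# A plain Cartesian atom line: element symbol followed by three decimal numbers
ATOM_LINE = re.compile(r"^\s*[A-Za-z]{1,2}\s+(-?\d+\.\d*\s+){2}-?\d+\.\d*\s*$")


def has_geom_check_conflict(gjf_text: str) -> bool:
    """True if the route asks for the checkpoint geometry AND atoms are also typed in."""
    lines = gjf_text.splitlines()
    # The route section starts at the first line beginning with '#'
    start = next((i for i, ln in enumerate(lines) if ln.lstrip().startswith("#")), None)
    if start is None:
        return False
    # ...and ends at the first blank line after it
    end = start
    while end < len(lines) and lines[end].strip():
        end += 1
    route = " ".join(lines[start:end])
    if not GEOM_CHECK.search(route):
        return False
    # Any atom specification after the route section is what triggers the error
    return any(ATOM_LINE.match(ln) for ln in lines[end:])
```

If it returns True, either drop the geom=check option or delete the coordinates from the input, depending on which geometry you actually meant to use.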

10 Years of Blogging!


This week marks the 10th anniversary of this little blog! It’s crazy to think a pet project that I took on during my last year as a postdoc is still going on after a decade of recording the work of our group in computational chemistry. It is also a happy coincidence that this year is the centennial anniversary of IUPAC and the sesquicentennial anniversary of the Periodic Table, for which 2019 has been designated as the International Year of the Periodic Table. I will release various posts celebrating this first decade of blogging and some regarding the IYPT2019 as soon as possible; some major changes in layout and look are also coming. It has been suggested to me that setting up a patreon.com account could help me raise some funding for assisting underprivileged students but I’m not so sure yet.

By 2009 the chemistry blogosphere was already in full swing, so I got to it a bit late. (Is commenting ‘First!‘ still a thing?) At the time my job future seemed a bit uncertain: I had already spent two years as a postdoc in Romania and prior to that I had worked for a private company in their research center here in Mexico. So I started to ramble here so that in upcoming job interviews I could point to a resource which gathered my thoughts and some achievements in a more informal fashion than a CV or a resume. (Plus, I like writing about things other than chemistry just for myself; maybe someday I’ll start a blog with some fiction writings I have here and there.) Quite frankly I didn’t think I could go back into academia so I was mainly looking for jobs in the R&D departments of various chemical companies, particularly in the field of coatings, in which I already had some experience.

I never imagined this little blog would gain any attention; I think I had something like 2,000 views in the first year, and now it’s up to 1,500 views a week! One of the first posts that gained popularity quite quickly dealt with SCRF calculations and some parameters we were struggling to get right. Once I found the best parameters for running them, my boss, the late Prof. Dr. Ioan Silaghi-Dumitrescu at Babes-Bolyai University, asked me to email them to the group and post them physically in the lab so we wouldn’t lose them; I thought it would be a better idea to have them on the blog so we all could access them easily and at the same time share our findings with whoever had the same issues. It turned out that many people struggled with these parameters for SCRF calculations in Gaussian, and from that moment on that became one of the underlying principles of the blog: “any problem we face in the lab is definitely faced by someone else, so let’s share our solution”; the other principle of course was my blatant self-promotion.

There are 10 kinds of people in the world: Those who understand binary and those who don’t.

One of the most rewarding aspects of having kept this blog going for so long is knowing that it is a modest resource that some people have found helpful. Attending conferences and having people tell me they like my posts and have found help in them is extremely gratifying. Also, academically it has allowed me to meet wonderful people with whom I’ve established very interesting collaborations in various countries like Iran, the US, Slovakia, the Czech Republic, Chile, Bulgaria and many more.

Very early on I started getting direct questions about specific problems and I’ve tried to answer them to the best of my abilities, although I don’t always have the time to do so, and for that I apologize to all the readers who didn’t get an answer; up until now keeping this blog has been a spare-time endeavor which doesn’t always get the priority it deserves within my academic tasks.

Thank you for reading, commenting and sharing these posts during this past decade! I truly appreciate it and it has been very important to me; I sometimes feel the posts go into the void but every now and then I’m approached by readers who have found the blog helpful and that is very rewarding. Here’s to ten more years!

Using PDB files for Electronic Structure Calculations


Quick Post on preparing Gaussian input files from PDB files.

If you’re modeling biological systems, chances are that, more often than not, you start by retrieving a PDB file. The Protein Data Bank is a repository for all things biochemistry – from oligo-peptides to full DNA sequences – with over 140,000 available files encoding the corresponding structures obtained by various experimental means, ranging from X-ray diffraction and NMR to, more recently, Cryo Electron Microscopy (CEM).

The PDB file encodes the Cartesian coordinates for each atom present in the structure as well as their in the same way molecular dynamics codes -like AMBER or GROMACS- code the parameters for a force field; this makes the PDB a natural input file for MD.

There are however some considerations to have in mind for when you need to use these coordinates in electronic structure calculations. Personally I give it a pass with OpenBabel to add (or possibly just re-add) all Hydrogen atoms with the following instruction:

$>obabel -ipdb filename.pdb -ogjf -Ofilename.gjf -h

Alternatively, you can select a pH value, say 7.5 with:

$>obabel -ipdb filename.pdb -ogjf -Ofilename.gjf -h -p7.5
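If there are many PDB files to convert, the two commands above can be wrapped in a small Python helper. This is just a sketch that assumes obabel is installed and on the PATH; the function names are hypothetical, and the flags are exactly the ones used in the commands above.

```python
import subprocess


def obabel_command(pdb_path, gjf_path, ph=None):
    """Build the obabel call shown above as an argument list (no shell involved)."""
    cmd = ["obabel", "-ipdb", pdb_path, "-ogjf", f"-O{gjf_path}", "-h"]
    if ph is not None:
        # Same spelling as in the command above, e.g. -p7.5
        cmd.append(f"-p{ph}")
    return cmd


def pdb_to_gjf(pdb_path, gjf_path, ph=None):
    """Run the conversion; raises CalledProcessError if obabel reports a failure."""
    subprocess.run(obabel_command(pdb_path, gjf_path, ph), check=True)
```

Looping pdb_to_gjf over a folder of files then gives one .gjf per structure, each with the hydrogens added.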

You may also use the GUI if by any chance you’re working in Windows:

This sends all H atoms to the end of the atom list. Usually for us the next step is to optimize their positions with a partial optimization at a low level of theory, for which you need to use the ReadOptimize option of the Opt keyword (also written ReadOpt or RdOpt) in the route section and then add the atom list at the end of the input file:

Atomic coordinates
--blank line--
noatoms atoms=H
--blank line--

Finally, visual inspection of your input structure is always helpful to find any meaningful errors; remember that PDB files come from experimental measurements, which are not free of problems.

As usual thanks for reading, commenting, and sharing.

Gustavo “Gus” Mondragón M.Sc. – Thesis Defense


We celebrate the successful thesis defense of Gustavo “Gus” Mondragón, who has now completed his Master’s degree and is on to getting a PhD in our group. Gustavo has worked on the search for multiexcitonic states and their involvement in the excitonic transference between photosynthetic pigments, specifically between bacteriochlorophyll-d molecules (BChl-d) from the bchQRU chlorosome, whose whole structure is shown in the gallery below. To this end, Gustavo has studied and implemented the Restricted Active Space method with double spin flip (RAS-2SF) with the use of QChem5.0, a method that has required the use and understanding of states with high multiplicities. Additionally, Gustavo has investigated the influence of the environment within the chlorosome by performing ONIOM calculations for the spectroscopic properties of a BChl-d dimer, finding, albeit qualitatively, a bathochromic effect; probably an expected result but nonetheless an impressive feat for the level of theory selected.

There’s still a lot of work to do in this line of research, and although we’re eager to publish our results on this excitonic transference mechanism we want to be completely sure that we’re taking every possibility into consideration so we don’t run into any inconsistencies.

Gustavo cultivates many research interests, from excited states of these pigments to biochemical processes that require the use of various tools; I’m sure his continued stay in our lab will bring lots of interesting results. Congratulations, Gus! Thank you for your hard work.

Estimation of pKa Values through Local Electrostatic Potential Calculations


Calculating the pKa value for a Brønsted acid is very hard, like really hard. A full thermodynamic cycle (fig. 1) needs to be calculated along with the high-accuracy solvation free energy for each of the species under consideration, not to mention the use of expensive methods, which will be reviewed here in another post in two weeks’ time.

Fig 1. Thermodynamic Cycle for the pKa calculation of any given Brønsted acid, HA

Finding descriptors that help us circumvent the need for such sophisticated calculations can help a great deal in estimating the pKa value of any given acid. We’ve been interested in the reactivity of σ-hole bearing groups in the past and, just like Halogen, Tetrel, Pnicogen and Chalcogen bonds, Hydrogen bonds are highly directional and their strength depends on the polarization of the O-H bond. Therefore, we suggested the use of the maximum surface electrostatic potential (VS,max) on the acid hydrogen atom of carboxylic acids as a descriptor for the strength of their interaction with water, the first step in the deprotonation process.

We selected six basis sets, five density functionals and the MP2 method, for a total of thirty-six levels of theory, to optimize and calculate VS,max on thirty carboxylic acids for a grand total of 1,080 wavefunctions, which were later passed on to MultiWFN (all calculations were carried out with PCM = water). The calculated VS,max values correlated very well with the experimental pKa values across the levels of theory (R² > 0.9), except for B3LYP. Still, the best correlations were obtained with LC-wPBE/cc-pVDZ and wB97XD/cc-pVDZ. From this latter level of theory the linear correlation yielded the following equation:

pKa = -0.2185(VS,max) + 16.1879
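As a tiny sketch, the equation above can be wrapped in a function. The name estimate_pka is made up for illustration, and the VS,max value must be in the same units used for the fit (not restated in this post) and computed at the wB97XD/cc-pVDZ level with PCM water for the coefficients to apply.

```python
# Coefficients of the linear fit quoted above (wB97XD/cc-pVDZ, PCM = water)
SLOPE = -0.2185
INTERCEPT = 16.1879


def estimate_pka(vs_max: float) -> float:
    """Estimate the pKa of a carboxylic acid from VS,max on the acidic hydrogen."""
    return SLOPE * vs_max + INTERCEPT
```

The function only reproduces the published line; it says nothing about acids outside the family used for the fit.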

Differences between the estimated and experimental pKa values turned out to be less than 0.5 units, which is remarkable for such a straightforward method; bear in mind that calculation of full thermodynamic cycles with errors above chemical accuracy (1.0 kcal/mol) yields pKa differences above 1.0 units.
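To see how free energy errors translate into pKa errors, recall that pKa = ΔG/(RT ln 10), so at 298.15 K one pKa unit corresponds to roughly 1.36 kcal/mol. The sketch below just does that arithmetic; the function name and the default temperature are my own choices.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol K)


def pka_error(dg_error_kcal: float, temperature: float = 298.15) -> float:
    """Convert an error in the deprotonation free energy (kcal/mol) into pKa units."""
    return dg_error_kcal / (R_KCAL * temperature * math.log(10))
```

A couple of kcal/mol of accumulated error along the cycle is therefore already well over one pKa unit, which is why the full thermodynamic route demands such accurate solvation energies.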

We then put this equation to the test with 10 different carboxylic acids and the prediction had a correlation of 98% (fig. 2).

Fig 2. Calculated vs. experimental pKa values for a test set of 10 carboxylic acids from the equation above

I think this method can really catch on as a quick way to predict the pKa values of any carboxylic acid imaginable. We’re now working on extending the model to other groups (i.e. Brønsted bases) and putting together a black-box workflow so as to make it even more accessible and straightforward to use.

We’ve recently published this work in the journal Molecules, an open access publication. Thanks to Prof. Steve Scheiner for inviting us to participate in the special issue devoted to tetrel bonding. Thanks to Guillermo Caballero for the inception of this project and to Dr. Jacinto Sandoval for taking the time from his research in photosynthesis to work on this pet project of ours, and of course to the rest of the students (Gustavo Mondragón, Marco Diaz, Raúl Torres) whose hard work made this paper possible.

Dr. Gabriel Merino wins The Walter Kohn Prize 2018


Just as I was thinking about the state of the Mexican scientific environment on the global scale, Prof. Dr. Gabriel Merino from CINVESTAV comes and gets this prize, awarded by the International Center for Theoretical Physics (ICTP) and the Quantum ESPRESSO Foundation, showing us all that great science is possible even under pressing circumstances.

Prof. Dr. Gabriel Merino at CINVESTAV Mérida, Yucatán, MEXICO

This prize is awarded biennially to a young scientist for outstanding contributions in the field of quantum-mechanical materials and molecular modeling, performed in a developing country or emerging economy, and in the case of Dr. Merino it is awarded not only for his contributions to theory and applications but also for his contributions to the prediction of novel systems that violate standard chemical paradigms, broadening the scope of concepts like aromaticity, coordination and chemical bond. The list of his contributions is very long despite his young age and there is barely any topic in chemistry or materials science that escapes his interest.

Gabriel is also one of the leading organizers of the Mexican Theoretical Physical Chemistry Meeting, an unstoppable mentor with many of his former students now leading research teams of their own. He is pretty much a force of nature. 

Congratulations to Dr. Gabriel Merino, his team, CINVESTAV and thanks for being such an inspiration and a good friend at the same time.

¡Felicidades, Gabriel!

Computational Chemistry from Latin America


The video below is a sad account of the scientific conditions in Mexico that have driven an enormous amount of brain power to other countries. Doing science is always a hard endeavour but in developing countries it is also filled with so many hurdles that it makes you wonder if it is all worth the constant frustration.

That is why I think it is even more important for the Latin American community to make our science visible, and special issues like this one from the International Journal of Quantum Chemistry go a long way in doing so. This is not the first time IJQC devotes a special issue to the Comp.Chem. done south of the proverbial border; a full issue devoted to the Mexican Physical Chemistry Meetings (RMFQT) was also published six years ago.

I believe these special issues in mainstream journals are great ways of promoting our work in a collected way that stresses our particular lines of research instead of having them spread across a number of journals. Also, and I may be ostracized for this, I think coming up with a new journal for a specific geographical community represents a lot of effort that takes an enormous amount of time to take off and thus gain visibility.

For these reasons I’ve been cooking up some ideas for the next RMFQT website. I don’t pretend that my colleagues need any shoutouts on my part (I could only be so lucky to produce such fine pieces of research myself) but it wouldn’t hurt to have a more established online presence as a community.

¡Viva la ciencia Latinoamericana!

XVII Mexican Meeting on Theoretical Physical Chemistry


The RMFQT meeting is a long-standing tradition within the Mexican Comp.Chem. community; a tradition that is now transcending our borders as more and more foreign students and researchers take part in this party, for it is a festive occasion indeed. This was the first time the RMFQT was held at a private institute, The Monterrey Institute of Technology.

As in previous years, our lab contributed with four posters and one talk by yours truly. The posters were presented by Raúl Torres, Raúl Márquez, Gustavo Mondragón and Dr. Jacinto Sandoval, whose pictures you can spot below in the gallery.

My talk was on the collaborative nature of Comp.Chem. and our particular interactions with the organic synthesis lab of Dr. Moisés Romero. The published papers discussed in the talk can be found in Tetrahedron (post) and PCCP (post), and some unpublished results can be read as a preprint in preprints.org.

I had the pleasure of meeting and interacting with old friends and making new ones like Dr. Julio Palma from Penn State, whose work on molecular rectifiers is very interesting. Also, I got to interact with many wonderful students who apparently are aware of the existence of this blog. (A big shoutout to M. Joaquina Beltrán and Plinio Cantero from Chile, whose work on DNA mismatch sensors is quite interesting; I look forward to further interacting with their research team.)

A particular reason for this meeting to be special for me is the fact that I have now been announced as part of the local organizing committee for the next edition in 2019 in Toluca. I was also asked to develop a centralized website and coordinate the social media communication related to this and other events, starting with the creation of the official Twitter account for our network and the meeting. I’m working on a few ideas, but if you have any suggestions please send them in the comments section.

See you next year in Toluca!