The Gen keyword in Gaussian. Adding an external basis set.

I am frequently asked how to include an extra set of basis functions in a calculation or how to use an entirely external basis set. Sometimes this question also implies the explicit declaration of an external pseudopotential or Effective Core Potential (ECP).

New basis sets and ECPs are published continuously in specialized journals, and the same happens with functionals for DFT calculations. There is no fixed format for publishing them: usually only a list of coefficients and exponents is shown, and one has to figure out how to introduce it in one's calculation. The EMSL Basis Set Exchange site helps you get it right! It has a clickable periodic table and, on the left side, a list of many (not all) available basis sets. Below the periodic table there is a menu from which one selects the program the basis set is meant for; finally, clicking on “get basis set” opens a pop-up window showing the result in the selected format, along with the corresponding references for citation. A multiple query can be performed by selecting more than one element on the table, which generates a list that can almost surely be used as input without further manipulation.

Dr. David Feller is to be thanked for leading the creation of this repository. More on the history and mission of the EMSL can be found on their About page. Because of my experience, the rest of this post addresses the inclusion of external basis sets in Gaussian; other programs, such as NWChem, will be addressed in a different post soon.

The correct format for an external basis set is exemplified below with the 3-21G basis set for carbon, as obtained from the EMSL Basis Set Exchange site (blank lines are marked explicitly just to emphasize their location):

Charge and spin multiplicity
Molecular coordinates
- blank line -
C     0
S   3   1.00
    172.2560000              0.0617669
     25.9109000              0.3587940
      5.5333500              0.7007130
SP   2   1.00
      3.6649800             -0.3958970              0.2364600
      0.7705450              1.2158400              0.8606190
SP   1   1.00
      0.1958570              1.0000000              1.0000000
****
- blank line -

The use of four stars ‘****’ is mandatory to indicate the end of the basis set specification for any given atom. If a basis set is to be declared for a second atom, it should be included after the **** line without any blank line in between.
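The layout described above can be generated programmatically. The following is only a minimal sketch: the function name `format_gen_block` and the tuple layout of `shells` are my own assumptions for illustration and are not part of Gaussian or of the Basis Set Exchange. The numbers are the 3-21G carbon values shown above.

```python
def format_gen_block(element, shells):
    """Return one atom's basis set block laid out as in the example above.

    `shells` is a list of (shell_type, scale_factor, primitives) tuples.
    Each primitive is a tuple holding the exponent followed by one
    coefficient (S, P, D shells) or two coefficients (SP shells).
    This data layout is a hypothetical choice made for this sketch.
    """
    lines = [f"{element}     0"]
    for shell_type, scale, primitives in shells:
        # Shell header: type, number of primitives, scale factor.
        lines.append(f"{shell_type}   {len(primitives)}   {scale:.2f}")
        for primitive in primitives:
            lines.append("".join(f"{value:>18.7f}" for value in primitive))
    lines.append("****")  # mandatory terminator for every atom
    return "\n".join(lines)


carbon_321g = [
    ("S", 1.0, [(172.2560000, 0.0617669),
                (25.9109000, 0.3587940),
                (5.5333500, 0.7007130)]),
    ("SP", 1.0, [(3.6649800, -0.3958970, 0.2364600),
                 (0.7705450, 1.2158400, 0.8606190)]),
    ("SP", 1.0, [(0.1958570, 1.0000000, 1.0000000)]),
]

block = format_gen_block("C", carbon_321g)
print(block)
```

Blocks for several atoms can then be joined with a single newline, so that no blank line ends up between one **** line and the next atom.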

WARNING! Sometimes we can find more than one basis set in a single file; this is due to different representations, spherical (pure) or Cartesian. Pure functions use 5 functions for d-type orbitals and 7 for f-type orbitals (5D, 7F), whereas Cartesian functions use 6 and 10, respectively (6D, 10F). For basis sets read in through Gen, Gaussian uses pure (5D, 7F) functions by default. Calculations must be consistent throughout, hence all basis functions should be either Cartesian or pure.
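To keep the two representations apart, it helps to remember the general counting rule: for angular momentum L there are (L+1)(L+2)/2 Cartesian functions and 2L+1 pure (spherical harmonic) functions. A short sketch of that arithmetic follows; the function names are just illustrative.

```python
def n_cartesian(l):
    """Number of Cartesian functions for angular momentum l."""
    return (l + 1) * (l + 2) // 2


def n_pure(l):
    """Number of pure (spherical harmonic) functions for angular momentum l."""
    return 2 * l + 1


for label, l in (("s", 0), ("p", 1), ("d", 2), ("f", 3)):
    print(f"{label}: {n_cartesian(l)} Cartesian, {n_pure(l)} pure")
```

The two counts agree for s and p shells and only start to differ at d (6 vs 5) and f (10 vs 7), which is why the distinction is labelled 6D/10F versus 5D/7F.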

Including a pseudopotential frees computational resources for the electronic structure of the valence shell by replacing the inner electrons with a set of functions which simulate their presence and their effect (such as shielding) on the valence electrons. There are full core pseudopotentials, which replace the entire core (kernel). There are also medium core pseudopotentials, which replace only the kernel preceding the full one, so that the outermost core electrons are calculated explicitly. The correct inclusion of a pseudopotential is shown below, exemplified by the LANL2DZ ECP by Hay and Wadt for the chlorine atom.

Charge and spin multiplicity
Molecular coordinates
- blank line -
basis set for atom1
****
basis set for atom2 (if there is any)
****
- blank line -
CL     0
CL-ECP     2     10
d   potential
  5
1     94.8130000            -10.0000000
2    165.6440000             66.2729170
2     30.8317000            -28.9685950
2     10.5841000            -12.8663370
2      3.7704000             -1.7102170
s-d potential
  5
0    128.8391000              3.0000000
1    120.3786000             12.8528510
2     63.5622000            275.6723980
2     18.0695000            115.6777120
2      3.8142000             35.0606090
p-d potential
  6
0    216.5263000              5.0000000
1     46.5723000              7.4794860
2    147.4685000            613.0320000
2     48.9869000            280.8006850
2     13.2096000            107.8788240
2      3.1831000             15.3439560

If a second ECP is to be introduced, it should be placed right after the first one without any blank line! If a blank line is detected then the program will assume it’s done reading all ECPs and Basis Sets.
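As a sanity check on an ECP block, one can read its header line and work out how many electrons remain to be treated explicitly. The sketch below assumes the three header fields are the name of the potential, its maximum angular momentum and the number of core electrons it replaces; that reading matches the chlorine example above (three potential blocks, 10 core electrons), but please confirm it against the Gaussian manual. The helper name and the small atomic number table are hypothetical.

```python
# Only the elements used in this post; extend as needed.
ATOMIC_NUMBER = {"H": 1, "C": 6, "CL": 17}


def parse_ecp_header(element_line, header_line):
    """Parse the first two lines of an ECP block.

    Assumption: the header fields are name, maximum angular momentum and
    number of core electrons replaced by the potential.
    """
    element = element_line.split()[0].upper()
    name, lmax, n_core = header_line.split()
    lmax, n_core = int(lmax), int(n_core)
    return {
        "element": element,
        "name": name,
        "lmax": lmax,
        "n_core": n_core,
        # One block for the highest angular momentum plus one per lower
        # value (d, s-d and p-d in the chlorine example).
        "n_blocks": lmax + 1,
        # Electrons left for the explicit calculation in the neutral atom.
        "n_explicit": ATOMIC_NUMBER[element] - n_core,
    }


info = parse_ecp_header("CL     0", "CL-ECP     2     10")
print(info)
```

For chlorine this leaves 17 - 10 = 7 electrons, i.e. the 3s and 3p valence shell, which is consistent with the small valence basis set that accompanies this ECP.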

Finally, here is an example of a combination of both keywords. If a second ECP were needed, we would place it right after the first one without a blank line. The molecule is any given chlorinated hydrocarbon (H, C and Cl atoms exclusively).

#P B3LYP/gen pseudo=read ADDITIONAL-KEYWORDS
- blank line -
Title
- blank line -
0 1
Molecular Coordinates
- blank line -
H     0
S   3   1.00
     19.2384000              0.0328280
      2.8987000              0.2312040
      0.6535000              0.8172260
S   1   1.00
      0.1776000              1.0000000
****
C     0
S   7   1.00
   4233.0000000              0.0012200
    634.9000000              0.0093420
    146.1000000              0.0454520
     42.5000000              0.1546570
     14.1900000              0.3588660
      5.1480000              0.4386320
      1.9670000              0.1459180
S   2   1.00
      5.1480000             -0.1683670
      0.4962000              1.0600910
S   1   1.00
      0.1533000              1.0000000
P   4   1.00
     18.1600000              0.0185390
      3.9860000              0.1154360
      1.1430000              0.3861880
      0.3594000              0.6401140
P   1   1.00
      0.1146000              1.0000000
****
Cl     0
S   2   1.00
      2.2310000             -0.4900589
      0.4720000              1.2542684
S   1   1.00
      0.1631000              1.0000000
P   2   1.00
      6.2960000             -0.0635641
      0.6333000              1.0141355
P   1   1.00
      0.1819000              1.0000000
****
- blank line -
CL     0
CL-ECP     2     10
d   potential
  5
1     94.8130000            -10.0000000
2    165.6440000             66.2729170
2     30.8317000            -28.9685950
2     10.5841000            -12.8663370
2      3.7704000             -1.7102170
s-d potential
  5
0    128.8391000              3.0000000
1    120.3786000             12.8528510
2     63.5622000            275.6723980
2     18.0695000            115.6777120
2      3.8142000             35.0606090
p-d potential
  6
0    216.5263000              5.0000000
1     46.5723000              7.4794860
2    147.4685000            613.0320000
2     48.9869000            280.8006850
2     13.2096000            107.8788240
2      3.1831000             15.3439560
- blank line -
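Since most mistakes with Gen and Pseudo=Read come from misplaced blank lines, the assembly of the input can be scripted. This is a rough sketch under my own assumptions: `build_gen_input` is a hypothetical helper, not a Gaussian utility, and the placeholder strings stand in for real coordinates, basis sets and ECPs. It includes the title section that Gaussian expects between the route and the charge/multiplicity line.

```python
def build_gen_input(route, title, charge, multiplicity, coordinates,
                    basis_blocks, ecp_blocks=()):
    """Join the input sections with exactly one blank line between them."""
    for block in list(basis_blocks) + list(ecp_blocks):
        if not block.strip():
            raise ValueError("empty basis set or ECP block")
        if any(not line.strip() for line in block.splitlines()):
            raise ValueError("blank line inside a basis set or ECP block")
    for block in basis_blocks:
        if block.splitlines()[-1].strip() != "****":
            raise ValueError("basis set block does not end with ****")

    sections = [
        route,
        title,
        "\n".join([f"{charge} {multiplicity}", *coordinates]),
        "\n".join(basis_blocks),   # no blank lines between atoms
    ]
    if ecp_blocks:
        sections.append("\n".join(ecp_blocks))  # nor between ECPs
    # The trailing blank line closes the last section.
    return "\n\n".join(sections) + "\n\n"


text = build_gen_input(
    route="#P B3LYP/gen pseudo=read",
    title="Title",
    charge=0,
    multiplicity=1,
    coordinates=["Molecular Coordinates"],            # placeholder
    basis_blocks=["basis set for atom1\n****",        # placeholders
                  "basis set for atom2\n****"],
    ecp_blocks=["ECP for atom2"],                     # placeholder
)
print(text)
```

The function refuses blocks that contain a blank line or lack the closing ****, the two slips that make Gaussian stop reading earlier than intended.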

If you like this post or found it useful, please leave a comment, share it or just give it a like. It is as much fun to find out people are reading as it is to find the answer to one's questions in someone else’s blog :)

Peace out!

About joaquinbarroso

Theoretical chemist in his early thirties, in love with life and deeply in love with his woman. I love science, baseball, literature, movies (perhaps even in that order). I'm passionate about food and lately wines have become a major hobby. In a nutshell I'm filled with regrets but also with hope, and that is called "living".

Posted on November 2, 2011, in Computational Chemistry, Gaussian, Models, Software, Theoretical Chemistry, White papers. 16 Comments.

  1. Hello dear Dr. Barroso,

    Thank you for your blog; it has helped me several times.

    May I ask a question about G09?

    I would like to do a relaxed PES scan (on dihedral angle) at HF level and a single point at B3LYP level on every step.

    I have tried this route:

    # B3LYP/6-31G(d)//HF/6-31G(d) geom=modredundant

    But the single point is done only on the last step.

    I have also tried to do a PES scan and a second calculation with geom=check, again, the single point was done only on the last step.

    Could you help me?

    Thanks

    • Thank you very much for your thoughts about the blog!

      About your question: I don’t think it’s possible to do what you are asking, not automatically anyway. You would have to do it manually by performing the scan at the HF level, then extracting the intermediate geometries (with GaussView, for instance) and running the single point at each of them.
      I must say the methodology sounds very odd. May I ask what it is you are trying to achieve? Is there any reason why you won’t do both scans? If you are thinking about “correcting” the energy at each point with the use of DFT, let me tell you this is wrong. The energies and geometries obtained belong to different potential energy hypersurfaces, i.e., results from one are not necessarily valid on the other.
      If you could give me more information I could help you better.

      In the mean time I hope this helps.

      Best wishes!

  2. Dear Sir,

    Your blog is simply superb. I love the way you write things.
    Your post on Gen and ECP using EMSL made my task so easy. Waiting for more posts.
    Thanking you,
    Nijam,
    From India.

  3. Also, you can add the basis set from a file by adding a line starting with @ at the end of the geometry specification

    ie:
    %chk=blablabla…


    [coordinates, geom specification, etc]

    @./custombasis.gbs (<-- file "custombasis.gbs" in the same directory as the Gaussian input).

    The content of the extra file is then printed entirely at the beginning of the output log. Sometimes it's easier to copy a file and add a line than to recycle an input and modify the molecules, hehehe

    Mario.

  4. Hi Joaquin,
    I have been a silent reader of your blog for quite a few years, and thanks for the excellent job that you do here…!!

    Now, I have a question about ONIOM in G09. A TS search I was doing is crashing with an error that I am not familiar with.

    Here is the input.

    %nprocshared=8
    Will use up to 8 processors via shared memory.
    %mem=8000MW
    %name.chk

    # opt=(calcfc,qst3,quadmac,noeigentest,maxcycle=8000) freq=noraman oniom(b3lyp/genecp:uff) nosymm guess=save scf=(qc,maxcyc=7400) geom=connectivity

    ================================================================
    I have a Cd atom in the system, so I am supplying external basis set (LANL2DZ-ECP).
    and rest of the atoms are treated with B3LYP/6-31+G(d):uff

    now here is the end of the output.
    =================================================================

    Convergence on expansion vectors, NOT on wavefunctions!
    H-V products: 292
    Lowest eigenvalue= -0.00688
    Eigenvector required to have negative eigenvalue:
    QM components of vector:
    A5 R6 D18 D17 D104
    1 0.43598 -0.37685 -0.36031 -0.31978 -0.29053
    D16 R7 D10 A55 D106
    1 -0.21855 -0.21495 -0.16994 -0.14925 -0.13923
    Largest MM components of vector:
    Atom 353 cartesian Z -0.03045
    Atom 352 cartesian Z -0.01502
    Atom 401 cartesian Z -0.01458
    Atom 403 cartesian Z 0.01411
    Atom 400 cartesian Z -0.01197
    Atom 416 cartesian Z -0.01166
    Atom 406 cartesian Z -0.01100
    Atom 401 cartesian Y -0.01065
    Atom 353 cartesian Y 0.01007
    Atom 405 cartesian Y -0.00997
    H-V products w/o non-bonded: 30
    SchOr2 failed for MMProj.
    Error termination via Lnk1e in /opt/scyld/g09/l103.exe at Sun Mar 17 03:41:05 2013.
    Job cpu time: 1 days 12 hours 56 minutes 56.7 seconds.
    File lengths (MBytes): RWF= 1412 Int= 0 D2E= 0 Chk= 35 Scr= 1
    ================================================================

    All the atoms specified in the output are MM layer hydrogens. I looked at the geometry and couldn’t find anything peculiar with their geometry/connectivity.
    Would you please help me to figure out what might be the issue here?

    Thanks a lot.

    Roby

  5. Thanks a lot.

  6. Thank you very much for a very good blog. The manual is good for understanding the principles of each keyword, but I often find it hard to understand where to put it in the input file (as in this case). I have one extra question regarding this situation though. If I want to use the basis set cc-pvdz for most atoms, but add diffuse functions on only a few atoms (i.e. some H atoms with and some without aug), is the following correct?

    title

    0 1
    atom coordinates

    H C N F 0
    cc-pvdz
    ****
    8 9 10 2 0
    aug-cc-pvdz
    ****

    With this I intend that all H, C, N and F atoms are described by cc-pvdz, except atoms no. 8, 9, 10 and 2 in the list of atoms.

    Thank you very much for a useful blog!

    • Thank you very much for your kind words, Francesca! I’m glad to know this little blog has been helpful to you.

      About your question, I must say I’ve never tried something like that but it sounds correct. If it doesn’t work then try doing an explicit listing like:
      H 1-10,12-24
      cc-pvdz
      ****
      H 11
      aug-cc-pvdz
      ****

      This sets the first basis set to all H atoms but number 11 which is assigned the augmented basis set in the following line. As I said before I’ve never tried anything of this sort so I can’t say it will work but try both schemes and please let me know if any of them worked.

      Have a nice day!

  7. What’s up, after reading this remarkable piece
    of writing i am as well happy to share my familiarity here with colleagues.

  8. Zabiollah Mahdavifar

    thank you for your help

  9. Thank you for posting some helpful examples, they really helped me get my custom basis sets calculations up and running quickly.

    A quick nit-picky comment about the following however…
    “Gaussian by default uses cartesian (5D,7F) functions. Pure gaussian use 6 functions for d-type orbitals and 10 for f-type orbitals (6D, 10F). Calculations must be consistent throughout, hence all basis functions should be either cartesian or pure.”

    Pure (spherical harmonic) basis functions use 5 functions at the d-level and 7 functions at the f-level, whereas Cartesian representations require 6 and 10 functions respectively. More generally, for a given angular momentum, L, there are (L+1)(L+2)/2 Cartesian functions and 2*L+1 spherical harmonics, where s -> L=0, p -> L=1, d -> L=2, etc.

    All of the actually useful information in your statement is correct however, ie the Gaussian 09 default values and the need for consistency in the representation.

    Again thank you for the helpful post.

    cheers,
    -frank

  10. Hi Dr Barraso,

    Thank you for your useful blog. I want to ask you a question. I’m trying to do b3lyp/def2-tzvp calculations for first row transition metal complexes in Gaussian 03. But I had problems while using that GEN keyword. For example my input file is like that:

    – – – – – –
    chk=name.chk
    %NProcShared=8
    %mem=10GB
    #P opt B3LYP/GEN

    Multiplicity

    Cordinates

    and the basis set from emsl website
    – – – – – –

    after 129 cycles, there was an error “annihilation of the first spin contaminant, convergence failure…”

    So I ran the same calculation starting from the output coordinates of that one. This time it said “SCF is confused”; then I added the scf=qc keyword to my new calculation, and again there was an error: “The polynomial fit failed, no lower point found”.

    Do you have any suggestions about that? And if I need to use pseudopotentials, how can I find the ones that fit my complexes?

    Thanks a lot.

  11. Dr. Barroso, thank you very much for this work. You cannot imagine what a great help your blog has been to us.

