Molpro

From CCMSTWiki

Jump to: navigation, search

Contents

Adding Content to the Page

If you wish to add any content to this page, make sure you have read the Suggestions for Adding Content. Hopefully, this will allow anything added to this page to be as helpful as possible.


User Manual

Molpro User's Manual (On vergil)


Bugs

Molpro 2002 cannot be trusted for vibrational frequencies. It computes frequencies numerically, but no matter how tightly you converge the energies, a bug creates numerical noise which means that you do not get the 6 (or 5, for a linear molecule) frequencies which are supposed to be zero to actually be zero --- often they are around 50/cm or less. Because MOLPRO cannot distinguish the actual small frequencies from these "fake" small frequencies, it suppresses small frequencies in its computation of ZPVEs or finite-temperature thermal corrections. This is not a huge problem for ZPVE's, since a small frequency won't contribute much, but it IS a huge problem for thermal corrections, because small frequencies contribute a lot to the vibrational partition function. Hence, do not use MOLRPO 2002 for temperature corrections, and treat all frequencies with extreme caution. UPDATE: Molpro 2006.1 seems to have fixed the problem of reporting small numerical frequencies. It either has high energy and gradient thresholds which make these numerical frequencies very close to zero and/or knows how to project our rotations and translations and/or suppresses the printing of these annoying little frequencies.

Direct vs Semi-Direct vs non-Direct

Molpro has the ability to run any computaions (except full CI and CCSD(T)) in a fully "direct" mode, meaning that no (or little) disk is used, and the integrals are re-calculated over and over as needed. This takes more CPU time (because of the need to recalculate integrals), but removes the need to read and write very large disk files. For a sufficiently large number of integrals, the direct mode becomes preferable because the disk files for the integrals (and other intermediates) eventually become too large to store. Even if they can be stored, if they are large enough and the disks aren't fast enough, it could possibly take more time to read and write these files than to recalculate the integrals. For MP2 (and possibly other methods?) there is also a semi-direct method that is in between fully direct and the conventional (disk based method).

To tell Molpro to do a computation in direct mode, use the keyword gdirect. For MP2, it will then try to decide if direct or semi-direct is faster (although the SCF part will be performed direct). If you want to force fully direct mode for MP2, use gdirect;dmp2=0. Remember if you run conventionally or semi-direct, you need to reserve a Scratch resource. If you run fully direct, you do not. The speed of the semi-direct and direct computations will depend on the amount of memory you give the program. The more memory, the fewer passes through the integrals will be required. The cost of the computation is (approximately) directly proportional to the number of integral passes.

How can you tell if you need to run direct or semi-direct? We need to compile some data on this. Certainly if your job crashes because the disk is filled, then you need to run direct or semi-direct instead of the conventional (non-direct) method.

Table I: Timings (real time, seconds, egate) for various MP2 computations
Case Conventional Semi-direct Direct Passes
Sandwich benzene dimer, aug-cc-pVTZ disk fill 28,299 36,762 1

Parallel Computations

To run Molpro in parallel on Egate, a typical submission script would look like

#!/bin/sh

#BSUB -J 2procTemplate
#BSUB -R "rusage[mem=1500] span[ptile=2]"
#BSUB -W 200:00
#BSUB -n 2

setenv NUM_COMPUTE_THREADS 2 

runmolprop test.com

It is essential to include "span[ptile=2]" in the resource requirement and set the environmenal variable "NUM_COMPUTE_THREADS" to 2 because parallel Molpro jobs which are supposed to use one of two processors on a node have been known to create multiples threads actually using both processors on the same node. That obviously creates a lot of problems.

You should give the MOLPRO input file the amount of memory to use per thread

Below are some benchmarks of Molpro parallel jobs on Egate. Tables II shows the scaling for large single energy point calculations while Table III compares the timing for transition state searches and vibrational frequency calculations using numerical gradients and Hessians. Table IV demonstrates the disadvantage of using parallel computations for smaller calculations where the overhead of parallelization becomes comparable to or exceeds the actual computation time.

Reported are real times in seconds decomposed into their main components, namely integral computation (ints), SCF iterations (hf) and correlated wavefuction calculations [ks, mp2, ccsd(t)]. For transition state and vibrational frequency calculations where numerical gradients and Hessians are computed numerically from a series of single point energy calculations, the total time for these processes is lumped into TSsearch/opt and freq. Parallel jobs ran with 'span[ptile=2]'.

Table II: Scaling for Single Energy Point Calculations (CH4 X/cc-pVQZ= 196 SOs)
Method t(n=1) t(n=2) t(n=4) t(n=8) t1/t2 t1/t4 t1/t8
B3LYP-int 46.13 34.82 16.67 14.72 1.32 2.77 3.13
B3LYP-hf 163.20 65.68 38.51 26.77 2.48 4.24 6.10
B3LYP-ks 257.86 116.78 61.68 36.47 2.21 4.18 7.07
CCSD(T)-int 39.48 30.82 21.01 13.84 1.28 1.88 2.85
CCSD(T)-hf 167.15 62.30 41.90 25.37 2.68 3.99 6.59
CCSD(T) 424.46 198.19 129.83 87.41 2.14 3.27 4.86
MP2-int 39.27 30.27 19.42 15.01 1.30 2.02 2.62
MP2-hf 156.40 61.35 40.54 25.45 2.55 3.86 6.15
MP2 162.64 64.65 42.36 26.55 2.52 3.84 6.13


The scaling for single energy points is fairly good up to 8 processors. The integral computation part doesn't seem to parallelize very well.


Table III: Scaling for Transition State Search for H2-CCH (H2-CCH X/aug-cc-pVQZ= 330 SOs)
Method t(n=1) t(n=2) t(n=4) t1/t2 t1/t4
B3LYP-int 178.18 119.60 91.59 1.49 1.95
B3LYP-hf 259.75 161.14 113.13 1.61 2.30
B3LYP-ks 353.47 208.89 138.31 1.69 2.56
B3LYP-TSsearch 51961.88 32422.17 19184.04 1.60 2.71
CCSD(t)-int 173.68 112.62 80.07 1.54 2.17
CCSD(T)-hf 255.48 154.43 101.55 1.65 2.52
CCSD(T) 3106.21 1746.15 1137.25 1.78 2.73
CCSD(T)-TSsearch 226589.35 123904.63 102099.57 1.83 2.22
MP2-int 176.30 113.76 118.92 1.55 1.48
MP2-hf 257.84 155.21 148.08 1.66 1.74
MP2 271.20 162.88 153.93 1.67 1.76
MP2-TSsearch 15381.71 9459.53 6454.25 1.63 2.38

The scaling for this transition state search is also pretty good up to 4 processors. Going from 1 to 2 processors gives a much better scaling than going from 1 to 4 or 2 to 4 processors.

Table IV: Scaling for Geometry Optimization and Vibrational Frequency Calculations of CH4 (cc-pVDZ= 55 SOs)
Method t(n=1) t(n=2) t(n=4) t(n=8) t1/t2 t1/t4 t1/t8
B3LYP-int 0.54 0.26 0.90 0.33 2.08 0.60 1.64
B3LYP-hf 0.66 0.33 0.95 0.37 2.00 0.69 1.78
B3LYP-ks 2.09 1.09 2.10 0.83 1.92 1.00 2.52
B3LYP-opt 72.77 112.43 - - 0.65 - -
B3LYP-freq 812.84 564.85 - - 1.44 - -
CCSD(T)-int 0.23 0.27 0.34 0.26 0.85 0.68 0.88
CCSD(T)-hf 0.35 0.34 0.39 0.30 1.03 0.90 1.17
CCSD(T) 1.00 0.73 1.66 0.58 1.37 0.60 1.72
CCSD(T)-opt 27.05 19.08 17.47 14.92 1.42 1.55 1.81
CCSD(T)-freq 277.15 176.85 136.56 116.67 1.57 2.03 2.38
MP2-int 0.23 0.26 1.04 0.35 0.88 0.22 0.66
MP2-hf 0.34 0.33 1.10 0.40 1.03 0.31 0.85
MP2 0.51 0.39 1.32 0.65 1.31 0.39 0.78
MP2-opt 15.71 31.65 - - 0.50 - -
MP2-freq 53.93 104.37 - - 0.52 - -

The scaling here is not good and often sporadic. MP2 and B3LYP optimization and frequency calculations couldn't be done for more than 2 processors. This is probably an ideal case showing that using multiple processors on small jobs is not a good idea because the overhead of managing different processes and threads becomes comparable or more time-consuming than the actual computation. So, unless a Molpro job is sufficiently large, one should not run it in parallel.

Table V: Sandwich benzene dimer ghost computations, MP2/aQZ (1512 basis functions; real times in seconds on egate)
R n=1, 1 pass n=2, 1 pass n=2, 2passes
- (850MW) (400MWx2) (200MWx2)
3.7 292035 288758 526072
3.9 304145 -

Note: We haven't really tested the effect of the global array memory size (specified by the -G option) on the efficiency of MOLPRO parallel computations.

Quick Notes

Disclaimer: The comments below are pertinent to Molpro 2002.6, but not necessarily to newer versions.

  • Molpro has no analytic gradients or hessians for any method except for closed shell SCF.BT
  • Molpro has disabled UMP2 geometry optimization and transition state search. BT
  • Molpro assumes all GTOs are spherical, so make sure you use the 'cartesian' keyword when specifying GTOs in basis sets like 3-21G, 6-31G ...etc if they contain d, f, and higher order angular momentum functions. BT
  • Molpro's default convergence criteria are generally questionable. One should probably set thresholds using the GTHRESH keyword. Very conservative settings would be:
GTHRESH,ZERO=1.d-14,ONEINT=1.d-14,TWOINT=1.d-14,ENERGY=1.d-12,GRAD=1.d-6,step=1.d-6

For single-points of van der Waals complexes, one should use an energy convergence of at least 1.d-8. Somewhat less conservative settings, which should be ok if no numerical differentiation is being done, might be:

GTHRESH,ENERGY=1.e-8,ONEINT=1.e-14,TWOINT=1.e-12,GRAD=1.d-6
  • Molpro's default print levels are not great either - it doesn't even print orbital energies. One change the print level using the GPRINT keywork like
GPRINT,basis,orbital 
  • Molpro freezes core for correlated wavefunction methods be default. One must include the 'core' keyword to unfreeze core orbitals. Especially when using ECPs, it might become problematic because it could attempt to freeze core electrons of the atom for which an ECP is used. BT
  • When trying to compute include relativistic effects using Douglas-Kroll integrals, whether you include 'dkroll=1' keyword before before or after your basis set makes a difference, especially when using correlation consistent basis.
dkroll=1
basis=cc-pvdz

computes Douglas-Kroll integrals for the cc-pVDZ basis. On the other hand,

basis=cc-pvdz 
dkroll=1

sets the basis as cc-pVDZ-DK and computes Douglas-Kroll integrals on that. For the case of H2 the energies computed using the above two specifications are -1.16684858 a.u. and -1.16684111 a.u., respectively. For larger systems, the difference would probably be much more significant.

  • Whenever using Molpro for DFT calculations, one must set a very tight integration grid by using the GRID and THR keywords.
GRID,1800.2,UNKNOWN
THR,1e-10,0,0

The calculations would take longer, considering the default is THR,1e-5,0,0. For more on this, see the Molpro man pages

SCS-MP2 Singlet and Triplet Pair Energies

MOLPRO SCS calculations print out a section that looks like:

SCS-MP2 correlation energy: -2.247339312134 (PS= 1.200000 PT= 0.333333)

SCS-MP2 total energy: -555.538672906519


Norm of first-order wave function: 0.589320334537

Coefficient of reference function: 0.793221140007


MP2 singlet pair energy -1.417576015355

MP2 triplet pair energy -0.877898722031


Reference energy -553.291333594385

Correlation energy -2.295474737386


!MP2 STATE 1.1 ENERGY -555.586808331771


The SCS-MP2 correlation energy is:

 ESCS = (2/3)*PT*Et + PS*Eopp

or:

 ESCS = (2/3)*PT*Et + (1/3)*PS*Et+ PS*Es

where,

PT = scsfact, the same spin scaling factor

Et = MP2 triplet pair energy

PS = scsfacs, the opposite spin scaling factor

Eopp = ECorrelation energy - (2/3)Et

Es = MP2 singlet pair energy

How To Guides

Personal tools