FNOCC: Frozen natural orbitals for CCSD(T), QCISD(T), CEPA, and MP4

Code author: A. Eugene DePrince

Section author: A. Eugene DePrince

Module: Keywords, PSI Variables, FNOCC

Warning

There is a known bug concerning the i7-5930 series combined with the Intel 15 compilers and MKL 11.2.3. When PSI4 is compiled under these conditions, parallel runs of the FNOCC code have experienced nonsensical CCSD correlation energies (often several Hartrees lower than the starting guess). At the moment, the only confirmed solutions are running serially, using a different BLAS implementation, or upgrading to Intel 16.0.2 and MKL 11.3.2.

Frozen natural orbitals (FNO)

The computational cost of the CCSD [Purvis:1982], CCSD(T) [Raghavachari:1989], and related methods be reduced by constructing a compact representation of the virtual space based on the natural orbitals of second-order perturbation theory [Sosa:1989:148]. The most demanding steps in the CCSD and (T) algorithms scale as \({\cal{O}}(o^2v^4)\) and \({\cal{O}}(o^3v^4)\), where \(o\) and \(v\) represent the number of oribitals that are occupied and unoccupied (virtual) in the reference function, respectively. By reducing the the size of the virtual space, the cost of evaluating these terms reduces by a factor of \((v / v_{FNO})^4\), where \(v_{FNO}\) represents the number of virtual orbitals retained after the FNO truncation.

The general outline for the FNO procedure in PSI4 is:

  1. construct the virtual-virtual block of the unrelaxed MP2 one-particle density matrix (OPDM)

  2. diagonalize this block of the OPDM to obtain a set of natural virtual orbitals

  3. based on some occupancy threshold, determine which orbitals are unimportant and may be discarded

  4. project the virtual-virtual block of the Fock matrix onto the truncated space

  5. construct semicanonical orbitals by diagonalizing the virtual-virtual block of the Fock matrix

  6. proceed with the QCISD(T) / CCSD(T) / MP4 computation in the reduced virtual space

A second-order correction based upon the MP2 energies in the full and truncated spaces captures much of the missing correlation effects. More details on the implementation and numerical accuracy of FNO methods in PSI4 can be found in [DePrince:2013:293]. FNO computations are controlled through the keywords NAT_ORBS and OCC_TOLERANCE, or by prepending a valid method name with “fno” in the energy call as

energy('fno-ccsd(t)')

If you wish to specify the number of active natural orbitals manually, use the keyword ACTIVE_NAT_ORBS. This keyword will override the keyword OCC_TOLERANCE.

QCISD(T), CCSD(T), MP4, and CEPA

The FNOCC module in PSI4 supports several related many-body quantum chemistry methods, including the CCSD(T) and QCISD(T) methods, several orders of many-body perturbation theory (MP2-MP4), and a family methods related to the coupled electron pair approximation (CEPA).

Quadratic configuration interaction and coupled cluster

The quadratic configuration interaction singles doubles (QCISD) method of Pople, Head-Gordon, and Raghavachari [Pople:1987:5968] was originally presented as a size-consistent extension of configuration interaction singles doubles (CISD). The method can also be obtained as a simplified version of the coupled cluster singles doubles (CCSD) method [Purvis:1982]. Consider the set of equations defining CCSD:

(1)\[\begin{split}\langle \Psi_0 | (H - E) (1 + T_1 + T_2 + \frac{1}{2}T_1^2)|\Psi_0\rangle = 0, \\ \langle \Psi_i^a | (H - E) (1 + T_1 + T_2 + \frac{1}{2}T_1^2+T_1T_2+\frac{1}{3!}T_1^3)|\Psi_0\rangle = 0, \\ \langle \Psi_{ij}^{ab} | (H - E) (1 + T_1 + T_2 + \frac{1}{2}T_1^2 + T_1T_2+\frac{1}{3!}T_1^3+\frac{1}{2}T_2^2+\frac{1}{2}T_1^2T_2+\frac{1}{4!}T_1^4)|\Psi_0\rangle = 0, \\\end{split}\]

where we have chosen the intermediate normalization, \(\langle \Psi_0| \Psi \rangle = 1\), and the symbols \(T_1\) and \(T_2\) represent single and double excitation operators. The QCISD equations can be obtained by omitting all but two terms that are nonlinear in \(T_1\) and \(T_2\):

(2)\[\begin{split}\langle \Psi_0 | (H - E) (1 + T_1 + T_2)|\Psi_0\rangle = 0, \\ \langle \Psi_i^a | (H - E) (1 + T_1 + T_2 + T_1T_2)|\Psi_0\rangle = 0, \\ \langle \Psi_{ij}^{ab} | (H - E) (1 + T_1 + T_2 + \frac{1}{2}T_2^2)|\Psi_0\rangle = 0. \\\end{split}\]

QCISD is slightly cheaper that CCSD computationally, but it retains the \({\cal{O}}(o^2v^4)\) complexity of the original equations. Just as in the familiar CCSD(T) method, the effects of connected triple excitations may be included noniteratively to yield the QCISD(T) method. Both the QCISD(T) and CCSD(T) methods are implemented for closed-shell references in the FNOCC module.

Many-body perturbation theory

QCI and CC methods are closely related to perturbation theory, and the MP2, MP3, and MP4(SDQ) correlation energies can be obtained as a free by-product of a CCSD or QCISD computation. The following is an example of the results for a computation run with the call energy('fno-qcisd') to energy():

QCISD iterations converged!

      OS MP2 FNO correction:                -0.000819116338
      SS MP2 FNO correction:                -0.000092278158
      MP2 FNO correction:                   -0.000911394496

      OS MP2 correlation energy:            -0.166478414245
      SS MP2 correlation energy:            -0.056669079827
      MP2 correlation energy:               -0.223147494072
    * MP2 total energy:                    -76.258836941658

      OS MP2.5 correlation energy:          -0.171225850256
      SS MP2.5 correlation energy:          -0.054028401038
      MP2.5 correlation energy:             -0.225254251294
    * MP2.5 total energy:                  -76.260943698880

      OS MP3 correlation energy:            -0.175973286267
      SS MP3 correlation energy:            -0.051387722248
      MP3 correlation energy:               -0.227361008515
    * MP3 total energy:                    -76.263050456101

      OS MP4(SDQ) correlation energy:       -0.180324322304
      SS MP4(SDQ) correlation energy:       -0.048798468084
      MP4(SDQ) correlation energy:          -0.230995119324
    * MP4(SDQ) total energy:               -76.266684566910

      OS QCISD correlation energy:          -0.181578117924
      SS QCISD correlation energy:          -0.049853548145
      QCISD correlation energy:             -0.231431666069
    * QCISD total energy:                  -76.267121113654

The first set of energies printed corresponds to the second-order FNO correction mentioned previously. Results for many-body perturbation theory through partial fourth order are then provided. The notation MP4(SDQ) indicates that we have included all contributions to the correlation energy through fourth order, with the exception of that from connected triple excitations.

One need not run a full QCISD or CCSD computation to obtain these perturbation theory results. The keywords for invoking perturbation theory computations are given below in Table FNOCC Methods. Full MP4 correlation energies are also available.

Coupled electron pair approximation

Coupled-pair methods can be viewed as approximations to CCSD or as size-extensive modifications of CISD. The methods have the same complexity as CISD, and solving the CISD or coupled-pair equations requires fewer floating point operations than solving the CCSD. CISD, CCSD, and the coupled-pair methods discussed below all scale formally with the sixth power of system size, and, as with the QCISD method, CEPA methods retain \({\cal{O}}(o^2v^4)\) complexity of the CCSD equations. For a detailed discussion of the properties of various coupled-pair methods, see [Wennmohs:2008:217].

What follows is a very basic description of the practical differences in the equations that define each of the coupled-pair methods implemented in PSI4. We begin with the CISD wave function

(3)\[| \Psi \rangle = | \Psi_0 \rangle + \sum_i^{occ} \sum_a^{vir} t_i^a | \Psi_i^a\rangle + \frac{1}{4}\sum_{ij}^{occ} \sum_{ab}^{vir} t_{ij}^{ab} | \Psi_{ij}^{ab}\rangle,\]

where we have chosen the intermediate normalization, \(\langle \Psi_0 | \Psi \rangle = 1\). The CISD correlation energy is given by

(4)\[E_c = \langle \Psi_0 | \hat{H} - E_0 | \Psi \rangle,\]

and the amplitudes can be determined by the solution to the coupled set of equations:

(5)\[\begin{split}0 &= \langle \Psi_{ij}^{ab} | \hat{H} - E_0 - E_c | \Psi \rangle, \\ 0 &= \langle \Psi_{i}^{a} | \hat{H} - E_0 - E_c | \Psi \rangle.\end{split}\]

The CISD method is not size-extensive, but this problem can be overcome by making very simple modifications to the amplitude equations. We replace the correlation energy, \(E_c\), with generalized shifts for the doubles and singles equations, \(\Delta_{ij}\) and \(\Delta_i\):

(6)\[\begin{split}0 &= \langle \Psi_{ij}^{ab} | \hat{H} - E_0 - \Delta_{ij} | \Psi \rangle, \\ 0 &= \langle \Psi_{i}^{a} | \hat{H} - E_0 - \Delta_i | \Psi \rangle.\end{split}\]

These shifts approximate the effects of triple and quadruple excitations. The values for \(\Delta_{ij}\) and \(\Delta_i\) used in several coupled-pair methods are given in Table CEPA Shifts. Note that these shifts are defined in a spin-free formalism for closed-shell references only.

method

\(\Delta_{ij}\)

\(\Delta_i\)

cisd

\(E_c\)

\(E_c\)

cepa(0)

0

0

cepa(1)

\(\frac{1}{2}\sum_k(\epsilon_{ik}+\epsilon_{jk})\)

\(\sum_k \epsilon_{ik}\)

cepa(3)

\(-\epsilon_{ij}+\sum_k(\epsilon_{ik}+\epsilon_{jk})\)

\(-\epsilon_{ii}+2\sum_k \epsilon_{ik}\)

acpf

\(\frac{2}{N} E_c\)

\(\frac{2}{N} E_c\)

aqcc

\([1-\frac{(N-3)(N-2)}{N(N-1)}]E_c\)

\([1-\frac{(N-3)(N-2)}{N(N-1)}]E_c\)

The pair correlation energy, \(\epsilon_{ij}\), is simply a partial sum of the correlation energy. In a spin-free formalism, the pair energy is given by

(7)\[\epsilon_{ij} = \sum_{ab} v_{ij}^{ab} (2 t_{ij}^{ab} - t_{ij}^{ba})\]

Methods whose shifts (\(\Delta_{ij}\) and \(\Delta_i\)) do not explicitly depend on orbitals \(i\) or \(j\) (CISD, CEPA(0), ACPF, and AQCC) have solutions that render the energy stationary with respect variations in the amplitudes. This convenient property allows density matrices and 1-electron properties to be evaluated without any additional effort. Note, however, that 1-electron properties are currently unavailable when coupling these stationary CEPA-like methods with frozen natural orbitals.

Density-fitted coupled cluster

Density fitting (DF) [or the resolution of the identity (RI)] and Cholesky decomposition (CD) techniques are popular in quantum chemistry to avoid the computation and storage of the 4-index electron repulsion integral (ERI) tensor and even to reduce the computational scaling of some terms. DF/CD-CCSD(T) computations are available in PSI4, with or without the use of FNOs, through the FNOCC module. The implementation and accuracy of the DF/CD-CCSD(T) method are described in [DePrince:2013:2687].

The DF-CCSD(T) procedure uses two auxiliary basis sets. The first set is that used in the SCF procedure, defined by the DF_BASIS_SCF keyword. If this keyword is not specified, an appropriate -JKFIT set is automatically selected. This auxiliary set defines the ERIs used to build the Fock matrix used in the DF-CCSD(T) procedure. The second auxiliary set is used to approximate all other ERIs in the DF-CCSD(T) procedure. The choice of auxiliary basis is controlled by the keyword DF_BASIS_CC. By default, DF_BASIS_CC is the RI set (optimized for DFMP2) most appropriate for use with the primary basis. For example, if the primary basis is aug-cc-pVDZ, the default DF_BASIS_CC will be aug-cc-pVDZ-RI.

Alternatively, the user can request that the DF-CCSD(T) procedure use a set of vectors defined by the Cholesky decomposition of the ERI tensor as the auxiliary basis. This feature is enabled by specifying CC_TYPE CD. CD methods can be enabled in the SCF procedure as well, by specifying the SCF_TYPE as CD. The accuracy of the decomposition can be controlled through the keyword CHOLESKY_TOLERANCE.

The following input file sets up a DF-CCSD(T) computation using CD integrals

molecule h2o {
    0 1
    O
    H 1 1.0
    H 1 1.0 2 104.5
}

set {
    scf_type cd
    cc_type cd
    basis aug-cc-pvdz
    freeze_core true
}
energy('ccsd(t)')

The resulting CCSD(T) correlation energy will be equivalent to that obtained from a conventional computation if CHOLESKY_TOLERANCE is sufficiently small (e.g. 1e-9).

Gn theory

The FNOCC module contains all the components that comprise the Gn family of composite methods. Currently, only the G2 method is supported [Curtiss:1991:7221]. The G2 procedure may be called through the energy() wrapper:

energy('gaussian-2')

Supported methods

The various methods supported by the FNOCC module in PSI4 are summarized in Table FNOCC Methods and detailed (except FNO) in Table FNOCC Capabilities. Note that these methods are implemented for closed-shell reference only. Without set qc_module fnocc, these methods may default to implementations in other modules.

Theoretical methods accessible through FNOCC

name

calls method

plain

FNO

mp3

third-order MP perturbation theory

E

E

mp4(sdq)

fourth-order MP perturbation theory, less triples

E

E

mp4

full fourth-order MP perturbation theory

E

E

cisd

configuration interaction singles and doubles

E

E

qcisd

quadratic configuration interaction singles doubles

E

E

qcisd(t)

QCISD with perturbative triples

E

E

lccd

linear coupled cluster doubles

E

E

cepa(0), lccsd

coupled electron pair approximation, variant 0

E

E

cepa(1)

coupled electron pair approximation, variant 1

E

E

cepa(3)

coupled electron pair approximation, variant 3

E

E

acpf

averaged coupled-pair functional

E

E

aqcc

averaged quadratic coupled-cluster

E

E

ccsd

coupled cluster singles and doubles

E

E

ccsd(t)

coupled cluster singles and doubles with perturbative triples

E

E

Detailed capabilities of the FNOCC module. “✓” runs analytically. Single underline “✓̲” is default module when QC_MODULE unspecified. Double underline “✓̳” is default algorithm type when type selector (e.g., CC_TYPE) unspecified.

name ↓ →

REFERENCE

type[1] ↓ →

FREEZE_CORE[2]

QC_MODULE=FNOCC Capabilities

Restricted (RHF)

Unrestricted (UHF)

ROHF

energy()

energy()

energy()

CV

DF

CD

CV

DF

CD

CV

DF

CD

A

F

A

F

A

F

A

F

A

F

A

F

A

F

A

F

A

F

mp2

MP2_TYPE

mp3

MP_TYPE

mp4(sdq)

MP_TYPE

✓̳

✓̳

mp4

MP_TYPE

✓̳

✓̳

cisd

CI_TYPE

✓̳

✓̳

qcisd

CI_TYPE

✓̳

✓̳

qcisd(t)

CI_TYPE

✓̳

✓̳

lccd

CC_TYPE

✓̳

✓̳

lccsd, cepa(0)

CC_TYPE

✓̳

✓̳

cepa(1)

CC_TYPE

✓̳

✓̳

cepa(3)

CC_TYPE

✓̳

✓̳

acpf

CC_TYPE

✓̳

✓̳

aqcc

CC_TYPE

✓̳

✓̳

ccsd

CC_TYPE

✓̲

✓̲

✓̲

✓̲

ccsd(t)

CC_TYPE

✓̲

✓̲

✓̲

✓̲

Basic FNOCC Keywords

BASIS

Primary basis set. Available basis sets

  • Type: string

  • Possible Values: basis string

  • Default: No Default

FREEZE_CORE

Specifies how many core orbitals to freeze in correlated computations. TRUE or 1 will default to freezing the previous noble gas shell on each atom. In case of positive charges on fragments, an additional shell may be unfrozen, to ensure there are valence electrons in each fragment. With FALSE or 0, no electrons are frozen (with the exception of electrons treated by an ECP). With -1, -2, and -3, the user might request strict freezing of the previous first/second/third noble gas shell on every atom. In this case, when there are no valence electrons, the code raises an exception. More precise control over the number of frozen orbitals can be attained by using the keywords NUM_FROZEN_DOCC (gives the total number of orbitals to freeze, program picks the lowest-energy orbitals) or FROZEN_DOCC (gives the number of orbitals to freeze per irreducible representation) or by the option POLICY in combination with appropriate inputs to FREEZE_CORE_POLICY At present, POLICY is an experimental option and is subject to change.

  • Type: string

  • Possible Values: FALSE, TRUE, 1, 0, -1, -2, -3, POLICY

  • Default: FALSE

R_CONVERGENCE

Convergence for the CC amplitudes. Note that convergence is met only when E_CONVERGENCE and R_CONVERGENCE are satisfied.

E_CONVERGENCE

Convergence criterion for CC energy. See Table Post-SCF Convergence for default convergence criteria for different calculation types. Note that convergence is met only when E_CONVERGENCE and R_CONVERGENCE are satisfied.

MAXITER

Maximum number of CC iterations

  • Type: integer

  • Default: 100

DIIS_MAX_VECS

Desired number of DIIS vectors

  • Type: integer

  • Default: 8

NAT_ORBS

Do use MP2 NOs to truncate virtual space for QCISD/CCSD and (T)?

OCC_TOLERANCE

Cutoff for occupation of MP2 virtual NOs in FNO-QCISD/CCSD(T). Virtual NOs with occupations less than OCC_TOLERANCE will be discarded. This option is only used if NAT_ORBS = true.

TRIPLES_LOW_MEMORY

Do use low memory option for triples contribution? Note that this option is enabled automatically if the memory requirements of the conventional algorithm would exceed the available resources. The low memory algorithm is faster in general and has been turned on by default starting September 2020.

CC_TIMINGS

Do time each cc diagram?

DF_BASIS_CC

Auxilliary basis for df-ccsd(t).

  • Type: string

  • Possible Values: basis string

  • Default: No Default

CHOLESKY_TOLERANCE

tolerance for Cholesky decomposition of the ERI tensor

CEPA_NO_SINGLES

Flag to exclude singly excited configurations from a coupled-pair computation.

DIPMOM

Compute the dipole moment? Note that dipole moments are only available in the FNOCC module for the ACPF, AQCC, CISD, and CEPA(0) methods.

Advanced FNOCC Keywords

SCS_MP2

Do SCS-MP2?

MP2_SCALE_OS

Opposite-spin scaling factor for SCS-MP2

  • Type: double

  • Default: 1.20

MP2_SCALE_SS

Same-spin scaling factor for SCS-MP2

  • Type: double

  • Default: 1.0

SCS_CCSD

Do SCS-CCSD?

CC_SCALE_OS

Oppposite-spin scaling factor for SCS-CCSD

  • Type: double

  • Default: 1.27

CC_SCALE_SS

Same-spin scaling factor for SCS-CCSD

  • Type: double

  • Default: 1.13

RUN_MP2

do only evaluate mp2 energy?

RUN_MP3

do only evaluate mp3 energy?

RUN_MP4

do only evaluate mp4 energy?

RUN_CCSD

do ccsd rather than qcisd?

RUN_CEPA

Is this a CEPA job? This parameter is used internally by the pythond driver. Changing its value won’t have any effect on the procedure.

COMPUTE_TRIPLES

Do compute triples contribution?

COMPUTE_MP4_TRIPLES

Do compute MP4 triples contribution?

DFCC

Do use density fitting or cholesky decomposition in CC? This keyword is used internally by the driver. Changing its value will have no effect on the computation.

CEPA_LEVEL

Which coupled-pair method is called? This parameter is used internally by the python driver. Changing its value won’t have any effect on the procedure.

  • Type: string

  • Default: CEPA(0)