- Research article
- Open Access
A comprehensive analysis of the history of DFT based on the bibliometric method RPYS
Journal of Cheminformatics volume 11, Article number: 72 (2019)
This bibliometric study aims at providing a comprehensive analysis of the history of density functional theory (DFT) from a perspective of chemistry by using reference publication year spectroscopy (RPYS). 114,138 publications with their 4,412,152 non-distinct cited references are analyzed. The RPYS analysis revealed three different groups of seminal papers which researchers in DFT have drawn from: (i) some long-known experimental studies from the 19th century about physical and chemical phenomena were referenced rather frequently in contemporary DFT publications. (ii) Fundamental quantum-chemical papers from the time period 1900–1950 which predate DFT form another group of seminal papers. (iii) Finally, various very frequently employed DFT approximations, basis sets, and other techniques (e.g., implicit descriptions of solvents) constitute another group of seminal papers. The earliest cited reference we found was published in 1806. The references to papers published in the 19th century mainly served the purpose of referring to long-known physical and chemical phenomena which were used to test if DFT approximations deliver correct results (e.g., Van der Waals interactions). The foundational papers of DFT by Hohenberg and Kohn as well as Kohn and Sham do not seem to be affected by obliteration by incorporation as they appear as pronounced peaks in our RPYS analysis. Since the 1990s, only very few pronounced peaks occur as most years were referenced nearly equally often. Exceptions are 1993 and 1996 due to seminal papers by Axel Becke, John P. Perdew and co-workers, and Georg Kresse and co-workers.
The terms bibliometrics or scientometrics (in a broader sense) are often used synonymously and can be characterized as the discipline that provides a quantitative [1, 2] overview about science. The most basic quantities used in bibliometrics are publication and citation counts. They are used to construct bibliometric indicators for research evaluation purposes. The perspective of research evaluation looks forward in time from the individual publication and the corresponding citing papers are counted. In this study, we turn the perspective backwards from the individual publication and analyze the cited references (i.e., the number of times a specific reference is included in the reference lists). One approach of such an analysis is called reference publication year spectroscopy (RPYS). This bibliometric method can be used to locate seminal papers which are cited most frequently in a certain publication set . The question about seminal papers in a given field can be answered by researchers in the field only in a subjective way. RPYS, however, can answer this question in an objective way by asking all researchers in the field (via the cited references in their publications) with subsequent quantitative analysis. Therefore, RPYS results often provide a different perspective or complement the individual expert’s perspective on the field. For example, RPYS analyses have been performed to discover the historical roots of individuals , publications in a journal [5, 6], or research fields . Very large publication sets can be analyzed by sampling methods implemented in the CRExplorer . An overview of further studies based on RPYS can be found in Marx and Bornmann . In this paper, we were able to use a very large data set as the basis for our analysis to cover the complete research field of density functional theory (DFT) and its applications from a chemical perspective.
Kohn–Sham DFT  has become one of the most important methods to solve the Schrödinger [11, 12] and Dirac [13,14,15] equations approximately. Besides the foundational theorems by Hohenberg and Kohn , DFT stands on many other pillars. Thomas  and Fermi  proposed the very first density functional approximation without mentioning the term. The simplifications of the Hartree–Fock [19, 20] method by Slater  has enabled practical DFT calculations. Kohn–Sham DFT calculates the energy of a non-interacting reference system. Exchange and correlation functionals are used to approximate the difference to the real system. The simplest exchange and correlation functionals depend only on the electron density itself . The development of exchange and correlation functionals that also included the gradient (GGA functionals) [23,24,25,26] and second derivatives (meta-GGA functionals) [27,28,29] of the electron density permitted more accurate calculations. GGA and meta-GGA calculations provide higher accuracy in general at negligible additional computational expense. DFT calculations became even more accurate and attractive with the development of hybrid functionals [30,31,32,33,34,35]. However, hybrid functionals increased also the computational expense significantly. Admixture of a global fraction of Hartree–Fock exchange yielded higher accuracy for atoms and molecules, and applicability of hybrid functionals to solids and surfaces was enabled by range-separation with a screened Coulomb potential [36,37,38]. Long-range-corrected hybrid functionals provided more accurate calculations of reaction barrier heights [39,40,41]. The accuracy of DFT calculations was increased even more by the admixture of a variable fraction of Hartree–Fock exchange, as done by local hybrid functionals [42,43,44,45,46,47,48]. Combination of the concepts of local hybrids and range-separated hybrids [49,50,51] increased the accuracy even further. In addition, correlation from wave function methods (MP2 [52, 53], RPA [54,55,56,57], coupled-cluster [58,59,60]) was also admixed with the original correlation functional [61, 62]. Time-dependent DFT, based on the work of Runge and Gross , has become a well-established methodology for treating electronically excited states.
The history of quantum chemistry for DFT and ab initio quantum chemistry and many-body perturbation theory (MBPT) has been discussed by Kutzelnigg [64, 65]. Although such a qualitative review on DFT and its historical roots from the perspective of an individual researcher is very helpful, a quantitative overview based on large publication sets can only be obtained using bibliometric methods. There is considerable interest in the evolution of the annual publication volume in the field of DFT [66, 67]. Recently, Haunschild et al.  provided a bibliometric overview about DFT publications. We intend to extend this bibliometric effort in this study by presenting a quantitative overview about the historical roots and seminal publications of DFT for the time period from 1800 until 2012 using RPYS.
Our analysis is based on the application of the search and retrieval functions of STN® to the Chemical Abstracts Plus literature database (CAplusSM) provided by CAS (Chemical Abstracts Service), a division of the American Chemical Society (ACS). The CAplus database covers scientific publications and patents related to chemistry since around 1900 (including the references cited therein since the publication year 1996).
The CAplus publication records contain index terms (IT) which have been carefully selected and assigned by the database producer (CAS). We searched for the terms “DFT”, “density functional theory”, “d functional theory”, and “TDDFT” in the IT fields of the CAplus database. Occurrences of “TD-DFT” and “time-dependent density functional theory” are also found by our search terms. The search term “d functional theory” is not used by scientists using DFT but it is used by CAS indexers. In total, we found 114,138 documents published before the end of the year 2014 (at the date of searching the year 2015 was not completely covered by the database). Throughout this paper, we will refer to this set of 114,138 documents as “DFT publications”. Although indexing takes some time, we can expect that the publication years until 2014 are nearly complete. Searching in the IT field of CAplus has the advantage that only documents are retrieved where DFT plays a major role (e.g.: where DFT methods are employed or developed). Documents in which DFT is mentioned along the way in the abstract are not retrieved. This reduces the citation count in our study in comparison with citation counts from other databases.
We analyzed the DFT publications with respect to seminal papers and historical roots on which the DFT publications are based. Such seminal papers can be located using a bibliometric method called “reference publication year spectroscopy” (RPYS)  in combination with a recently developed tool named CRExplorer (http://www.crexplorer.net) . The analysis of the publication years of the references cited by all the papers in a specific research field shows that (especially earlier) publication years are not equally represented. Some years occur particularly frequently among the references showing up as pronounced peaks in the distribution of the reference publication years (i.e., the RPYS spectrogram). In most cases, the peaks are based on single publications, which are highly cited compared to other early publications. It is assumed that the highly cited papers are of specific significance to the research field in question (here: DFT).
In a first step, the publication set is imported into the CRExplorer and all cited references are extracted. In a second step, equivalent references are clustered and merged. References occurring less often than a certain threshold (see below) are removed to reduce the background noise and to sharpen the resulting spectrogram. In the third and final step, the reference publication years are analyzed for frequently cited publications. Older RPYs require a slightly different methodology, i.e., a lower threshold of the minimum number of cited references because the scale for the number of cited references (NCR, i.e., count of publications which cited a specific reference) differs significantly across different periods of time. The 114,138 DFT publications contain 4,412,152 non-distinct cited references. Handling (clustering, merging, and analysis) of such a large number of cited references is non-trivial. Therefore, we divided our analysis into four different time periods: (1) 1800–1899, (2) 1900–1949, (3) 1950–1989, and (4) 1990–2012. Between 1800 and 1899 the maximum peak height is 125, between 1900 and 1949 it is over 3000, between 1950 and 1989 it is over 50,000, and finally for the last period it has raised to around 60,000.
The threshold for references to be removed for the first two time periods (1800–1899 and 1900–1949) is 10. For the third time period (1950–1989), we used the threshold of 100 consistent with our earlier study in this time period . The last time period (1990–2012) contained by far the most cited references. Therefore, we applied the final threshold of a minimum of 100 also to this time period after clustering and merging of reference variants.
Reference variants can occur in high numbers. As an example, we point out the number of reference variants to the very popular computational program Gaussian: (1) to the 2003 version of the program package and (2) to all different program versions. We found 2035 different reference variants amounting to 18,397 cited references of the 2003 version. This would put this version of Gaussian between CR69 and CR72 (see “Appendix”). More than 4000 reference variants could be identified for any version which amounts to 43,736 cited references. This makes the program package Gaussian referenced more often than any other publication in our set. However, as we are interested in scientific publications, we removed references to program packages (i.e., Gaussian and SHELX).
Some typos in the publication year or permutations of publication year with page number were spotted. For example, we found a reference to “KRESSE G, 1758, PHYS REV B, V59, P1999” and the correct reference is “KRESSE G, 1999, PHYS REV B, V59, P1758” (CR79). Another example is “PERDEW J, 1092, PHYS REV B, V46, P6671”. The correct cited reference is “PERDEW J, 1992, PHYS REV B, V46, P6671” (CR68). However, these errors were not corrected because they occurred rather seldom.
In a previous paper, we have briefly discussed the history of DFT  for the period from 1950 to 1989. In this paper, we analyze the history of DFT for the time period between 1800 and 2012. Since DFT was founded in 1964, when the famous Hohenberg–Kohn theorems  were published, it is obvious that our dataset must contain many references to important precursor papers which are indirectly related to DFT. Since the number of peak papers is rather large (n = 85) we have decided to focus our analysis on the most important papers only and provide the complete list of peak papers in “Appendix”.
In our study of the history of DFT, we find that the 19th century is characterized by studies of special phenomena in physics, preparations and reactions of special chemical compounds, as well as some theoretical precursors to DFT. The first half of the 20th century is characterized by the discovery of quantum mechanics and its applications to atomic and molecular structures and their related physical and chemical phenomena. In the light of DFT, this period is dominated by the paper from Møller and Plesset  on perturbation theory. In the period from 1950 to 1989, DFT was founded by Hohenberg and Kohn , and Kohn and Sham . In the aftermath, several approximations have been developed and applied to new and old problems in chemistry and physics. In the final period from 1990 to 2012, new approximations were assessed and the results demonstrate the success of DFT, especially for the calculation of larger molecules.
Time period 1800–1899
In the period from 1800 to 1899, we find a spectrogram with several rather small peaks (see Fig. 1). The red points and curve in Fig. 1 show the number of cited references (NCR) in each reference publication year (RPY) while the blue points and curve show the 5-year median (x − 2, x − 1, x, x + 1, and x + 2) deviation from the NCR in the specific RPY. This color scheme is also used for the other RPYS figures in this paper.
As can be seen in Table 1 (in “Appendix”), the set of peak papers can be roughly divided into papers with a focus on physics, physical chemistry, and classical organic chemistry. The set of physics and physical chemical papers comprises the Grotthuss mechanism for proton transfer in water (CR1, 1806), the Petit-Dulong rule to determine molar heat capacities (CR2, 1819), Michael Faraday’s work on the nature of light in magnetic fields (CR4, 1846), van der Waals theory on capillarity (CR13, 1894), and the combining rule of Marcellin Berthelot for the calculation of the Lennard-Jones potential (CR15, 1898).
Chemical discoveries include the identification and preparation of several new compounds: benzene by Michael Faraday (CR3, 1825) whose corresponding structural formula was first proposed by August Kekulé (CR7, 1865), the preparation of α-amino acids from aldehydes/ketones, ammonia, and cyanide by Adolph Strecker (CR5, 1850), the synthesis of salicylic acid by Hermann Kolbe (CR6, 1860), the Glaser coupling of two terminal alkines (CR8, 1869), the synthesis of long carbon chains by Adolf von Baeyer (CR10, 1885), the rearrangement of an acyl azide to an isocyanate by Theodor Curtius (CR12, 1890), the discovery of nickel carbonyl and the class of metal carbonyls by Ludwig Mond (CR11, 1890), and Henry John Horstman Fenton invented a reagent which can be used to destroy certain organic compounds (Fenton’s reagents) (CR14, 1894). In addition, stereochemistry was invented by Jacobus Henricus van’t Hoff (CR9, 1874). All these discoveries and inventions have been re-examined in the light of quantum mechanics by applying various approximations of DFT, mainly to test certain density functionals regarding well-known phenomena.
Time period 1900–1949
The scientific progress in the first half of the 20th century was largely dominated by the development of the fundamental theories in physics. Quantum mechanics was discovered, and the concepts were applied to atoms and molecules, new analytical tools were invented which enabled scientists to study the atomic and sub-atomic world, and an initial understanding of the nature of the chemical bond was gained. A part of this history is reflected in Fig. 2 (and Table 2 in “Appendix”) from the perspective of DFT.
Although Table 2 (in “Appendix”) contains 35 frequently cited papers we will concentrate here on the most significant peak papers. Aside from the journal publications, we also find three important books: a textbook on crystal physics by Woldemar Voigt (1850–1919) (CR28, 1928, in German), an early introduction to quantum chemistry by Hans Hellmann (1903–1938) (CR43, 1937, in German) and a handbook on Infrared and Raman Spectra of Polyatomic Molecules by Gerhard Herzberg (1904–1999) (CR49, 1945).
In 1901, George Wulff published a paper on the growth rate and the dissolution of crystal surfaces. He also defined the so-called Wulff construction, a method which allows the determination of the equilibrium shape of a droplet or a crystal of a fixed volume (CR16, 1901). Paul Ewald calculated optical and electrostatic grid potentials in which he proposed a method to analyze dipole fields based on the theta function (CR22, 1921). In the same year, Lars Vegard published a paper on the constitution of mixed crystals and the space occupied by atoms (CR23, 1921). In 1928, only 1 year after the publication of the Schrödinger equation, Enrico Fermi calculated atomic properties using a statistical approach where he treated the electrons as a perfect gas with complete degeneration (CR27, 1928). In the same year, Douglas Hartree published the so-called Hartree equations for the calculation of many-electron systems in a self-consistent field (CR29, 1928). These equations were generalized by Vladimir Fock to include exchange phenomena between two electrons (CR33, 1930) and they are known today as the Hartree–Fock equations. The Hartree–Fock method is an integral part of many quantum chemical calculations including DFT applications . Paul Dirac published a note on the exchange phenomena in the Thomas atom (CR32, 1930) and Carl Eckart proposed a theory to explain the penetration of a potential barrier by electrons (CR34, 1930).
Tjalling Koopmans proposed an approximation for the calculation of ionization energies which is known today as Koopmans’ Theorem (CR38, 1934). In the same year, Christian Møller and Milton S. Plesset proposed a perturbative treatment of many-electron systems (CR37, 1934). This theory is often used as a reference method to benchmark new functionals for larger molecules. Furthermore, hybrid correlation functionals mix correlation from DFT with correlation from wave function methods, e.g. MP2 , RPA [52, 54, 61]. The second-order treatment (MP2, if used on top of the Hartree–Fock method) has been employed most often due to the good compromise between increased accuracy and computational demand. This approach is a central pillar of DFT approximations which combine wave function correlation with density functional correlation, and hence it is not surprising that this publication is cited very frequently, and the corresponding peak completely dominates the time period 1900–1949.
Fritz London developed a theory for the description of interatomic currents in aromatic compounds (CR42, 1937). In the same year, Hermann Arthur Jahn and Edward Teller published a new theorem which was later called the Jahn–Teller effect (CR44). This effect describes the spontaneous symmetry breaking in molecules and solids. Francis Dominic Murnaghan developed an equation of state which describes the relationship between the volume and the pressure of a body (CR48, 1944). Francis Birch formulated the so-called Birch-Murnaghan isothermal equation of state based on Murnaghan’s ideas (CR50, 1947).
Time period 1950–1989
During this time period, we find several distinguished peaks, but the spectrogram is dominated by a single peak in 1988 (see Table 3 in “Appendix”). This one is actually caused by two very highly cited papers from Lee et al.  and from Becke . In addition, there are several papers dealing with extensions of the Hartree–Fock equations as well as applications of molecular orbital (MO) theory. More important are the papers by Hohenberg and Kohn , and Kohn and Sham  with their fundamental work on DFT.
John C. Slater proposed an approximation to the Hartree–Fock exchange potential (CR51, 1951), and Clemens C. J. Roothaan (CR52, 1951) developed the concept of molecular orbitals as a linear combination of atomic orbitals (LCAO). LCAO was initially applied to Hartree–Fock theory but it is used in virtually every widespread program package for post-Hartree–Fock and DFT calculations. A few years later Robert S. Mulliken (CR53, 1955) proposed an electronic population analysis based on Roothaan’s LCAO method. Using this methodology, it became possible to calculate partial charges and dipole moments.
The foundational publications for modern DFT by Pierre Hohenberg and Walter Kohn (CR54, 1964) and Walter Kohn and L. J. Sham (CR55, 1965) were published in 1964 and 1965. Hohenberg and Kohn  postulated and proved the Hohenberg–Kohn theorems which build the foundation of DFT. Kohn and Sham  provided a practical methodology (Kohn–Sham equations) based on the ideas behind the Hartree–Fock equations to apply DFT to molecules and solids.
S. Francis Boys and Fernando Bernardi developed a new direct difference method for the computation of molecular interaction energies with reduced errors (CR56, 1970). Warren J. Hehre, Robert Ditchfield, and John A. Pople (CR57, 1972) presented new basis sets for the LCAO method. The 6-31G basis set, which became very popular, is among those basis sets presented in this cited reference. The relevance of polarization functions was pointed out by Puthugraman C. Hariharan and John A. Pople (CR58, 1973), and the popular 6-31G* and 6-31G** basis sets were introduced. Jan Evert Baerends, Donald E. Ellis, and Piet Ros (CR59, 1973) presented a computational Hartree–Fock scheme using Slater’s approximation and Roothaan’s LCAO ansatz. CR60 (1976) is the only cited reference in Table 3 (in “Appendix”) specifically concerned with the solid state. The authors propose a method for generating sets of special points in the Brillouin zone. This method provides a more efficient algorithm to integrate periodic functions of the wave vector in solid state calculations.
Between 1980 and 1988 we find four publications of new density functional approximations or improvements to existing ones (CR61, CR64-CR66) and two publications of effective core potentials (CR62, CR63). Seymour H. Vosko, L. Wilk, and Marwan Nusair proposed popular local correlation functionals (CR61, 1980). P. Jeffrey Hay and Willard R. Wadt proposed effective core potentials for the atoms K-Au and Sc–Hg which enable a cost-effective implicit treatment of inner-shell electrons for heavier elements (CR62, CR63, both 1985). John P. Perdew (CR64, 1986) proposed a gradient correction to an earlier local correlation functional developed by John P. Perdew and Alex Zunger. The references to the publications by Chengteh Lee, Weitao Yang, and Robert G. Parr on the development of the Colle-Salvetti correlation-energy formula into a functional of the electron density (NCR = 23,953, CR65) and by Axel Becke on density-functional exchange-energy approximation with correct asymptotic behavior (NCR = 14,150, CR66) completely dominate the peak in 1988. Lee et al.  proposed a correlation functional (LYP) and Becke  proposed an exchange functional (B, also known as Becke88). Both functionals were combined in the highly popular functionals BLYP, B3LYP (see below), and many others. Those functionals were implemented in very popular program packages (e.g., Gaussian) and thereby made available to the computational chemistry community. The extraordinary many citations to these publications are easily explained because they are cited each time a functional containing Becke88 exchange and/or LYP correlation is used in a study. Although there is a considerable number of functionals containing Becke88 exchange and/or LYP correlation, the very high NCR values are due to the popularity of some of these functionals, especially BLYP and B3LYP.
Time period 1990–2012
In this time period, we find several highly and very highly cited papers and sometimes several peak papers have been published in a single year (see Table 4 in “Appendix”). In this period, several sophisticated approximations to the exchange potential of the DFT equations have been developed. Although there are peaks for the years 1992/1993/1994, 1996, and 1998/1999, there are two dominating peaks in 1993 and 1996 (see Fig. 4). The peak in 1993 is caused by a paper from Becke  entitled Density-functional thermochemistry. III. The role of exact exchange with 25,970 cited references. In 1996, we find three major papers, one from Perdew et al.  and two from Kresse and Furthmüller [71, 72] with a total of 29,522 cited references. For both peaks this is roughly half of all cited references in these years.
John P. Perdew and Yue Wang developed an analytic representation of the correlation energy for a uniform electron gas (CR67, 1992) and they demonstrated the accuracy in several numerical tests (CR68, 1992). A year later, Axel Becke proposed the family of hybrid functionals where DFT exchange is mixed with Hartree–Fock exchange (CR69, 1993). The first member of this new family of functionals has become known as B3PW91 (abbreviation used for: Becke, three parameters, Perdew–Wang-91) which performed significantly better than previous functionals with gradient corrections only. Due to the high accuracy of this approximation it became very popular and the paper turned out to become the most highly cited paper in this time period. Peter E. Blöchl (CR70, 1994) developed the projector augmented-wave method (PAW) which is a generalization of both the pseudopotential and the linear augmented-plane-wave (LAPW) method. In the same year, Philip J. Stephens, Frank J. Devlin, Cary F. Chabalowski, and Michael J. Frisch (CR71, 1994) proposed to use LYP correlation instead of PW91 correlation in the B3PW91 functional without adjustment of the three parameters. This led to the highly popular B3LYP functional. Furthermore, they applied several approximations to calculate vibrational transition spectra of the chiral 4-methyl-2-oxetanone and showed that B3LYP is in excellent agreement with experiment. Therefore, this publication is usually cited when B3LYP is used in studies. Later, Reiher et al.  refitted the parameters and named the resulting functional B3LYP*. Although Reiher et al.  is none of our peak papers, it does occur in our RPYS analysis with 309 cited references. The original B3LYP functional stayed far more popular than the refitted version.
John P. Perdew, Kieron Burke, and Matthias Ernzerhof published a derivation of a generalized gradient approximation (GGA) for the electron correlation and exchange energy (1996, CR72) which is known as PBE, named after the authors of the paper. PBE is a very popular and computationally rather inexpensive density functional (exchange and correlation) for molecules and solids. In the same year Georg Kresse and Jürgen Furthmüller (CR73, CR74, both 1996) presented an efficient scheme for calculating the Kohn–Sham ground state of metallic systems using pseudopotentials and a plane-wave basis set. The algorithms were implemented within a program package called VASP (Vienna ab initio simulation package). The peak in 1998 is based on four papers (CR75–CR78). Vincenzo Barone and Maurizio Cossi (CR75) performed quantum-chemical calculations of molecular energies and energy gradients in solution by a conductor solvent model. This paper also describes the COSMO (conductor-like screening solvation model) implementation in Gaussian. It is cited in studies which use the Gaussian implementation of COSMO for modelling solvated molecules. Mark E. Casida, Christine Jamorski, Kim C. Casida, and Dennis R. Salahub (CR76) and R. E. Stratmann, Gustavo E. Scuseria, and Michael J. Frisch (CR77) applied time-dependent density functional theory (TDDFT) to the calculation of excitation energies. Carlo Adamo and Vincenzo Barone (CR78) constructed exchange functionals with an improved long-range behavior.
Georg Kresse and Daniël P. Joubert (CR79, 1999) showed the formal relationship between ultra soft pseudopotentials and the projector augmented wave (PAW) method and provided critical tests of both methods calculating both small molecules as well as bulk systems. Carlo Adamo and Vincenzo Barone (CR80, 1999) constructed the popular hybrid functional PBE0. The parameter to determine the amount of Hartree–Fock exchange is derived from an argument from perturbation theory. Therefore, PBE0 is also considered to be a non-empirical hybrid functional.
In 2001, G. te Velde and co-workers published a detailed review of the Amsterdam Density Functional (ADF) program (CR81). Jacopo Tomasi, Benedetta Mennucci, and Roberto Cammi published a review about quantum mechanical continuum solvation models for the implicit description of solvents and solvation effects (CR82, 2005). Florian Weigend and Reinhart Ahlrichs (CR83, 2005) provided basis sets of different sizes for most elements of the periodic table. These basis sets (also so called Ahlrichs or Karlsruhe basis sets) are an integral part of the program package Turbomole, and they are also often used by users of other program packages. The influence of different basis sets on the calculation of atomic and molecular properties is also analyzed in CR83. In 2006, Stefan Grimme augmented Becke’s functional B97 with empirical, damped, and atom-pairwise dispersion corrections (CR84) which performed especially well for long range van der Waals interactions. Yan Zhao and Donald G. Truhlar developed a set of highly parametrized functionals (CR85, 2008) which they applied successfully to organometallic and inorganometallic compounds.
In our analysis on the history of DFT from a chemical perspective, we have found several groups of papers which are of high relevance to the DFT research field. Of course, the most important group of papers consists of methodological papers on DFT and its approximations, starting with the famous papers by Hohenberg and Kohn (CR54) and Kohn and Sham (CR55) with a total of 17,847 cited references. This is shown as a peak in 1964/1965 in Fig. 3. The 1980s and 1990s are dominated by several publications of new density functional approximations or improvements to existing ones. The main peaks are 1988 (CR65, CR66) in Fig. 3 (NCR = 38,103), and 1993 (CR69) (NCR = 25,970) as well as 1996 (CR72–CR74) (NCR = 29,522) in Fig. 4. Some of these papers on new approximations and improvements are also dealing with new basis sets as well as tests and comparisons of DFT approximations.
The other group of papers consists of fundamental papers from quantum mechanics and quantum chemistry which are important predecessors of DFT. Although this starts with the foundation of quantum mechanics the most highly cited papers are about approximate methods and their applications. In the spectrogram of the time period 1900 to 1949 we find 6 major peaks in 1928 (3 papers, NCR = 500), 1930 (3 papers, NCR = 812), 1934 (2 papers, NCR = 2733), 1937 (3 papers, NCR = 748), 1944 (1 paper, NCR = 739), and 1947 (1 paper, NCR = 377). However, this spectrogram is clearly dominated by the famous paper of Møller and Plesset (CR37, 1934) on the perturbation theory of many-electron systems (NCR = 2243).
Finally, the third group of papers deals with a broad set of physical and chemical phenomena which have been calculated with various approximations of DFT. Some long-known experiments are revisited in the light of modern quantum chemistry, i.e. DFT. Examples from the 19th century are the Grotthus mechanism for proton transfer in water (CR1), van der Waals theory on capillarity (CR13), the nature of light in magnetic fields (CR4), and the study of several chemical compounds together with their synthesis (CR5–CR12 and CR14). In later periods, the publications on specific physical phenomena and chemical compounds are hidden in the spectrograms since they did not obtain high enough citation rates compared to progresses in theoretical research.
Often, older publications are affected by obliteration by incorporation . This leads to lower citation counts than expected. This is not the case for several publications discussed here, e.g. CR54 and CR55. Kutzelnigg  explains the high citation rates of the papers from Hohenberg and Kohn (CR54, 1964) and from Kohn and Sham (CR55, 1965) by mystification of these papers in DFT. Therefore, no obliteration by incorporation occurred. Also, the fact that “DFT” and “Kohn–Sham methods” have been used as synonyms might have led to an unusually high citation count of CR55. This unusually high citation count is not unjustified considering that the vast majority of practical applications employ the Kohn–Sham approach to DFT.
Our study is not without limitations: cited references are included for publications only since 1996 in CAS databases. Therefore, references for earlier publications could not be included in our analysis. Although CAplus has a focus on chemistry, neighboring research areas such as physics and biology are often better covered than one might think. From our experience, CAplus has a broad coverage of the DFT literature (see also a very recent RPYS analysis on the topic of DFT using different databases ). Therefore, the major limitation of our study should be missing cited references in publications before 1996. However, RPYS is a rather robust method and the main conclusions should not be affected. Furthermore, the vast majority of DFT publications appeared after 1996 .
Very recently, Haunschild and Marx  compared co-citation RPYS (RPYS-CO) results using Becke  as a marker paper with RPYS results by Haunschild et al.  which were based on CAS data. Haunschild and Marx  used Web of Science and Microsoft Academic as databases which have cited references for all indexed papers. They found a surprisingly high agreement between RPYS results based on CAS data and RPYS-CO results based on Web of Science data and Microsoft Academic data. This reassures us that the missing cited references before 1996 should not have distorted our results.
We could not mention each frequently cited reference or include it in the tables in “Appendix”. Some seminal papers are concealed by even more cited references in the same or neighboring years. One of such cases is Runge and Gross  (the theoretical foundation of TD-DFT) which has an NCR value of 1986 in our RPYS analysis. Despite this rather high NCR value, there is no peak in the spectrogram. In this case, CR62, CR63, and CR64 conceal the paper introducing the theoretical foundations of TD-DFT.
Availability of data and materials
Our bibliometric study is based on the use of the commercial database CAplus from Chemical Abstracts Service. Although these databases are not available via open access we assume that almost every chemist has access to this database either via SciFinder or via STN. The procedure can also be done using a completely free to access data set from CrossRef (https://search.crossref.org/) as CRExplorer also imports CrossRef downloads in JSON format. Similar results are obtained using the data set deposited at: https://ivs.fkf.mpg.de/dft-rpys/DFT_crossref_download.tgz. However, CrossRef data have many different cited reference variants so that extensive manual merging is necessary. Furthermore, CrossRef does not provide high-quality indexing as CAS does, and both data providers have different coverage of the DFT literature. CAS more exhaustively indexes literature related to chemistry. Therefore, some peak papers observed in RPYS with CAS data do not show up in RPYS analysis using CrossRef data.
American Chemical Society
Amsterdam Density Functional
Chemical Abstracts Service
conductor-like screening model
density functional theory
generalized gradient approximation
- KS DFT:
Kohn Sham density functional theory
many-body perturbation theory
Møller Plesset perturbation theory of 2nd order
number of cited references
Perdew Burke Ernzerhof
random phase approximation
reference publication year spectroscopy
reference publication year
Scientific Technical Information Network
time-dependent density functional theory
Vienna ab initio simulation package
Lee Yang Parr
Abbott A, Cyranoski D, Jones N, Maher B, Schiermeier Q, Van Noorden R (2010) Do metrics matter? Nature 465:860–862. https://doi.org/10.1038/465860a
Van Noorden R (2010) A profusion of measures. Nature 465:864–866. https://doi.org/10.1038/465864a
Marx W, Bornmann L, Barth A, Leydesdorff L (2014) Detecting the historical roots of research fields by reference publication year spectroscopy (RPYS). J Assoc Inform Sci Technol 65:751–764. https://doi.org/10.1002/asi.23089
Bornmann L, Haunschild R, Leydesdorff L (2018) Reference publication year spectroscopy (RPYS) of Eugene Garfield’s publications. Scientometrics 114:439–448. https://doi.org/10.1007/s11192-017-2608-3
Haunschild R, Bauer J, Bornmann L (2019) Influential cited references in FEMS microbiology letters: lessons from reference publication year spectroscopy (RPYS). FEMS Microbiol Lett 66:fnz139. https://doi.org/10.1093/femsle/fnz139
Ballandonne M (2018) The historical roots (1880–1950) of recent contributions (2000–2017) to ecological economics: insights from reference publication year spectroscopy. J Econ Methodol 26:307–326. https://doi.org/10.1080/1350178X.2018.1554227
Marx W, Haunschild R, Thor A, Bornmann L (2017) Which early works are cited most frequently in climate change research literature? A bibliometric approach based on reference publication year spectroscopy. Scientometrics 110:335–353. https://doi.org/10.1007/s11192-016-2177-x
Haunschild R, Marx W, Thor A, Bornmann L (2019) How to identify the roots of broad research topics and fields? The introduction of RPYS sampling using the example of climate change research. J Inform Sci. https://doi.org/10.1177/0165551519837175
Marx W, Bornmann L (2016) Change of perspective: bibliometrics from the point of view of cited references—a literature overview on approaches to the evaluation of cited references in bibliometrics. Scientometrics 109:1397–1415. https://doi.org/10.1007/s11192-016-2111-2
Kohn W, Sham LJ (1965) Self-consistent equations including exchange and correlation effects. Phys Rev 140:1133. https://doi.org/10.1103/PhysRev.140.A1133
Schrödinger E (1926) Quantisation as an eigen value problem. Annalen Der Physik 79:361–368
Schrodinger E (1926) An undulatory theory of the mechanics of atoms and molecules. Phys Rev 28:1049–1070. https://doi.org/10.1103/PhysRev.28.1049
Dirac PAM (1928) The quantum theory of the electron. Proc R soc Lond Ser A Contain Pap Math Phys Character 117:610–624. https://doi.org/10.1098/rspa.1928.0023
Dirac PAM (1928) The quantum theory of the electron—part II. Proc R soc Lond Ser A Contain Pap Math Phys Character 118:351–361. https://doi.org/10.1098/rspa.1928.0056
Dirac PAM (1928) On the quantum theory of electrons. Physikalische Zeitschrift 29:561–563
Hohenberg P, Kohn W (1964) Inhomogeneous electron gas. Phys Rev B 136:B864. https://doi.org/10.1103/physrev.136.b864
Thomas LH (1927) The calculation of atomic fields. Proc Camb Philos Soc 23:542–548
Fermi E (1928) A statistical method for determining some properties of the atoms and its application to the theory of the periodic table of elements. Z Angew Phys 48:73–79. https://doi.org/10.1007/bf01351576
Hartree DR, Hartree FRS, Hartree W (1935) Self-consistent field, with exchange, for beryllium. Proc R Soc Lond Ser Math Phys Sci 150:0009–0033. https://doi.org/10.1098/rspa.1935.0085
Fock V (1930) Approximation method for the solution of the quantum mechanical multibody problems. Z Angew Phys 61:126–148. https://doi.org/10.1007/bf01340294
Slater JC (1951) A simplification of the Hartree-Fock method. Phys Rev 81:385–390. https://doi.org/10.1103/PhysRev.81.385
Vosko SH, Wilk L, Nusair M (1980) Accurate spin-dependent electron liquid correlation energies for local spin-density calculations—a critical analysis. Can J Phys 58:1200–1211
Perdew JP (1986) Density-functional approximation for the correlation-energy of the inhomogeneous electron-gas. Phys Rev B 33:8822–8824. https://doi.org/10.1103/PhysRevB.33.8822
Perdew JP, Burke K, Ernzerhof M (1996) Generalized gradient approximation made simple. Phys Rev Lett 77:3865–3868. https://doi.org/10.1103/PhysRevLett.77.3865
Becke AD (1988) Density-functional exchange-energy approximation with correct asymptotic-behavior. Phys Rev A 38:3098–3100. https://doi.org/10.1103/PhysRevA.38.3098
Lee CT, Yang WT, Parr RG (1988) Development of the Colle-Salvetti correlation-energy formula into a functional of the electron-density. Phys Rev B 37:785–789. https://doi.org/10.1103/PhysRevB.37.785
Tao JM, Perdew JP, Staroverov VN, Scuseria GE (2003) Climbing the density functional ladder: nonempirical meta-generalized gradient approximation designed for molecules and solids. Phys Rev Lett. 91:146401 https://doi.org/10.1103/PhysRevLett.91.146401
Perdew JP, Kurth S, Zupan A, Blaha P (1999) Accurate density functional with correct formal properties: a step beyond the generalized gradient approximation. Phys Rev Lett 82:2544–2547. https://doi.org/10.1103/PhysRevLett.82.2544
Zhao Y, Truhlar DG (2006) A new local density functional for main-group thermochemistry, transition metal bonding, thermochemical kinetics, and noncovalent interactions. J Chem Phys 125:18. https://doi.org/10.1063/1.2370993
Becke AD (1993) Density-functional thermochemistry. 3. The role of exact exchange. J Chem Phys 98:5648–5652. https://doi.org/10.1063/1.464913
Adamo C, Barone V (1999) Toward reliable density functional methods without adjustable parameters: the PBE0 model. J Chem Phys 110:6158–6170. https://doi.org/10.1063/1.478522
Zhao Y, Schultz NE, Truhlar DG (2005) Exchange-correlation functional with broad accuracy for metallic and nonmetallic compounds, kinetics, and noncovalent interactions. J Chem Phys 123:4. https://doi.org/10.1063/1.2126975
Zhao Y, Schultz NE, Truhlar DG (2006) Design of density functionals by combining the method of constraint satisfaction with parametrization for thermochemistry, thermochemical kinetics, and noncovalent interactions. J Chem Theory Comput 2:364–382. https://doi.org/10.1021/ct0502763
Becke AD (1993) A new mixing of Hartree-Fock and local density-functional theories. J Chem Phys 98:1372–1377. https://doi.org/10.1063/1.464304
Perdew JP, Ernzerhof M, Burke K (1996) Rationale for mixing exact exchange with density functional approximations. J Chem Phys 105:9982–9985. https://doi.org/10.1063/1.472933
Heyd J, Scuseria GE, Ernzerhof M (2003) Hybrid functionals based on a screened Coulomb potential. J Chem Phys 118:8207–8215. https://doi.org/10.1063/1.1564060
Heyd J, Scuseria GE, Ernzerhof M (2006) Hybrid functionals based on a screened Coulomb potential (vol 118, pg 8207, 2003). J Chem Phys 124:219906. https://doi.org/10.1063/1.2204597
Yanai T, Tew DP, Handy NC (2004) A new hybrid exchange-correlation functional using the Coulomb-attenuating method (CAM-B3LYP). Chem Phys Lett 393:51–57. https://doi.org/10.1016/j.cplett.2004.06.011
Vydrov OA, Scuseria GE (2006) Assessment of a long-range corrected hybrid functional. J Chem Phys 125:234109. https://doi.org/10.1063/1.2409292
Chai JD, Head-Gordon M (2008) Systematic optimization of long-range corrected hybrid density functionals. J Chem Phys 128:15. https://doi.org/10.1063/1.2834918
Iikura H, Tsuneda T, Yanai T, Hirao K (2001) A long-range correction scheme for generalized-gradient-approximation exchange functionals. J Chem Phys 115:3540–3544. https://doi.org/10.1063/1.1383587
Jaramillo J, Scuseria GE, Ernzerhof M (2003) Local hybrid functionals. J Chem Phys 118:1068–1073. https://doi.org/10.1063/1.1528936
Arbuznikov AV, Kaupp M (2008) What can we learn from the adiabatic connection formalism about local hybrid functionals? J Chem Phys 128:214107. https://doi.org/10.1063/1.2920196
Bahmann H, Rodenberg A, Arbuznikov AV, Kaupp M (2007) A thermochemically competitive local hybrid functional without gradient corrections. J Chem Phys 126:011103. https://doi.org/10.1063/1.2429058
Haunschild R, Janesko BG, Scuseria GE (2009) Local hybrids as a perturbation to global hybrid functionals. J Chem Phys 131:154112. https://doi.org/10.1063/1.3247288
Janesko BG, Scuseria GE (2007) Local hybrid functionals based on density matrix products. J Chem Phys 127:164117. https://doi.org/10.1063/1.2784406
Janesko BG, Scuseria GE (2008) Parameterized local hybrid functionals from density-matrix similarity metrics. J Chem Phys 128:084111. https://doi.org/10.1063/1.2831556
Johnson ER (2014) Local-hybrid functional based on the correlation length. J Chem Phys 141:124120. https://doi.org/10.1063/1.4896302
Haunschild R, Scuseria GE (2010) Range-separated local hybrids. J Chem Phys 132:224106. https://doi.org/10.1063/1.3451078
Henderson TM, Janesko BG, Scuseria GE, Savin A (2009) Locally range-separated hybrids as linear combinations of range-separated local hybrids. Int J Quantum Chem 109:2023–2032. https://doi.org/10.1002/qua.22049
Arbuznikov AV, Kaupp M (2012) Importance of the correlation contribution for local hybrid functionals: range separation and self-interaction corrections. J Chem Phys 136:13. https://doi.org/10.1063/1.3672080
Grimme S (2006) Semiempirical hybrid density functional with perturbative second-order correlation. J Chem Phys 124:034108. https://doi.org/10.1063/1.2148954
Hedegard ED, Heiden F, Knecht S, Fromager E, Jensen HJA (2013) Assessment of charge-transfer excitations with time-dependent, range-separated density functional theory based on long-range MP2 and multiconfigurational self-consistent field wave functions. J Chem Phys 139:13. https://doi.org/10.1063/1.4826533
Janesko BG, Henderson TM, Scuseria GE (2009) Long-range-corrected hybrids including random phase approximation correlation. J Chem Phys 130:081105. https://doi.org/10.1063/1.3090814
Furche F (2008) Developing the random phase approximation into a practical post-Kohn-Sham correlation model. J Chem Phys 129:114105. https://doi.org/10.1063/1.2977789
Furche F, Van Voorhis T (2005) Fluctuation-dissipation theorem density-functional theory. J Chem Phys 122:10. https://doi.org/10.1063/1.1884112
Eshuis H, Furche F (2011) A parameter-free density functional that works for noncovalent interactions. J Phys Chem Lett 2:983–989. https://doi.org/10.1021/jz200238f
Goll E, Werner HJ, Stoll H (2005) A short-range gradient-corrected density functional in long-range coupled-cluster calculations for rare gas dimers. Phys Chem Chem Phys 7:3917–3923. https://doi.org/10.1039/b509242f
Goll E, Werner HJ, Stoll H, Leininger T, Gori-Giorgi P, Savin A (2006) A short-range gradient-corrected spin density functional in combination with long-range coupled-cluster methods: application to alkali-metal rare-gas dimers. Chem Phys 329:276–282. https://doi.org/10.1016/j.chemphys.2006.05.020
Garza AJ, Bulik IW, Henderson TM, Scuseria GE (2015) Range separated hybrids of pair coupled cluster doubles and density functionals. Phys Chem Chem Phys 17:22412–22422. https://doi.org/10.1039/c5cp02773j
Goerigk L, Grimme S (2014) Double-hybrid density functionals. Wiley Interdiscip Rev Comput Mol Sci 4:576–600. https://doi.org/10.1002/wcms.1193
Chai JD, Head-Gordon M (2009) Long-range corrected double-hybrid density functionals. J Chem Phys 131:13. https://doi.org/10.1063/1.3244209
Runge E, Gross EKU (1984) Density-functional theory for time-dependent systems. Phys Rev Lett 52:997–1000. https://doi.org/10.1103/PhysRevLett.52.997
Kutzelnigg W (2006) Density Functional Theory (DFT) and ab initio Quantum Chemistry (AIQC). Story of a difficult partnership. In: Simos GMaT (ed) Lecture series on computer and computational sciences, vol 6. Brill academic publishers, Leiden, pp 23–62
Kutzelnigg W (2009) How many-body perturbation theory (MBPT) has changed quantum chemistry. Int J Quantum Chem 109:3858–3884. https://doi.org/10.1002/qua.22384
Burke K (2012) Perspective on density functional theory. J Chem Phys 136:150901. https://doi.org/10.1063/1.4704546
Pribram-Jones A, Gross DA, Burke K (2015) DFT: a theory full of holes? In: Johnson MA, Martinez TJ (eds) Annual review of physical chemistry, vol 66. Annual Reviews, Palo Alto, pp 283–304
Haunschild R, Barth A, Marx W (2016) Evolution of DFT studies in view of a scientometric perspective. J Cheminformatics 8:12. https://doi.org/10.1186/s13321-016-0166-y
Thor A, Marx W, Leydesdorff L, Bornmann L (2016) Introducing CitedReferencesExplorer (CRExplorer): a program for reference publication year spectroscopy with cited references standardization. J Informetr 10:503–515. https://doi.org/10.1016/j.joi.2016.02.005
Møller C, Plesset MS (1934) Note on an approximation treatment for many-electron systems. Phys Rev 46:0618–0622. https://doi.org/10.1103/PhysRev.46.618
Kresse G, Furthmüller J (1996) Efficient iterative schemes for ab initio total-energy calculations using a plane-wave basis set. Phys Rev B 54:11169–11186. https://doi.org/10.1103/PhysRevB.54.11169
Kresse G, Furthmüller J (1996) Efficiency of ab initio total energy calculations for metals and semiconductors using a plane-wave basis set. Comput Mater Sci 6:15–50. https://doi.org/10.1016/0927-0256(96)00008-0
Reiher M, Salomon O, Hess BA (2001) Reparameterization of hybrid functionals based on energy differences of states of different multiplicity. Theor Chem Acc 107:48–55. https://doi.org/10.1007/s00214-001-0300-3
McCain KW (2014) Obliteration by incorporation. MIT Press, Cambridge, pp 129–149
Haunschild R, Marx W (2019) Discovering seminal works with marker papers. In: Cabanac G, Frommholz I, Mayr P (eds) 8th international workshop on bibliometric-enhanced information retrieval (BIR 2019), vol 2345. CEUR-WS.org, Cologne, pp 27–38
One author (Robin Haunschild) is very grateful to CAS for hosting him during his visit in Columbus/Ohio in the Innovation Lab of CAS. The authors thank Dr. Werner Marx for very helpful discussions.
The research visit of RH was financed partly by CAS.
Ethics approval and consent to participate
Consent for publication
The authors declare that they have no competing interests.
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
About this article
Cite this article
Haunschild, R., Barth, A. & French, B. A comprehensive analysis of the history of DFT based on the bibliometric method RPYS. J Cheminform 11, 72 (2019) doi:10.1186/s13321-019-0395-y
- Density functional theory
- Reference publication year spectroscopy
- Historical roots
- Seminal papers