CompMattSci - Extrapolating the canonical partition function

Note: Appreciating the "why" of this post may require a background in statistical mechanics, but understanding the math should not.

In statistical mechanics, the classical canonical partition function represents the volume of phase space accessible to an N-particle system occupying a box of volume V at inverse thermodynamic temperature $\beta=1/k_BT$ . In other words, it consists of an integral over all degrees of freedom of all particles in the system. There are 3N configurational degrees of freedom corresponding to each particle's position $\mathbf{s}^N=\{\mathbf{s}_1,\mathbf{s}_2,...,\mathbf{s}_N\}$ and 3N kinetic degrees of freedom corresponding to each particle's momentum $\mathbf{p}^N=\{\mathbf{p}_1,\mathbf{p}_2,...,\mathbf{p}_N\}$ . In a classical system we can analytically integrate out the kinetic degrees of freedom. What we can't easily calculate is the configurational part of the canonical partition function, which is shown below:

$Q_c(N,V,\beta)=\dfrac{1}{N!}\int d\mathbf{s}^N\exp\left[-\beta E(\Gamma_{\mathbf{s}^N})\right]$

Here $\Gamma_{\mathbf{s}^N}$ is the microstate defined by the 3N-dimensional vector of scaled coordinates, and $E(\Gamma_{\mathbf{s}^N})$ is the potential energy as a function of the particle coordinates. I won't elaborate much more on the physical importance of the partition function, but will just say that it serves as the normalization term for the microstate probabilities:

$p(\Gamma_{\mathbf{s}^N})\propto\exp\left[-\beta E(\Gamma_{\mathbf{s}^N})\right]$

Typically one uses Monte Carlo simulations to avoid dealing with this high-dimensional integral, which is far too complex to compute directly for all but the simplest toy systems. However, there are simulation techniques that yield the value of $Q_c$ without directly evaluating this integral (DOI: coming soon), even though its numerical value is impossibly large to comprehend. In these types of studies where $Q_c$ (or $\ln Q_c$ ) is determined for a given system at one value of $\beta$ , it is useful to approximate its temperature dependence via a Taylor expansion about that $\beta$ :

$\ln Q_c(N,V,\beta') \approx \ln Q_c(N,V,\beta) + \sum_{a \geq 1} \dfrac{1}{a!}\dfrac{\partial ^a \ln Q_c(N,V,\beta)}{\partial \beta ^a} (\beta' - \beta)^a$
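As a minimal sketch of how this truncated series might be evaluated numerically, the function below sums the terms of the expansion given a list of derivatives. The name `taylor_ln_q` and its argument layout are hypothetical choices made for illustration; they do not come from any particular simulation package.

```python
from math import factorial


def taylor_ln_q(ln_q, derivs, beta, beta_new):
    """Evaluate a truncated Taylor series of ln Q_c about beta.

    ln_q   : ln Q_c(N, V, beta) at the reference beta
    derivs : derivs[a - 1] is the a-th derivative of ln Q_c with respect
             to beta, evaluated at the reference beta
    """
    d_beta = beta_new - beta
    return ln_q + sum(
        d / factorial(a) * d_beta**a for a, d in enumerate(derivs, start=1)
    )
```

The series is truncated at however many derivatives are supplied, so the quality of the estimate depends on both the number of terms and the size of $\beta'-\beta$ .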

Now, let's look at what happens if we take the natural log of the partition function, followed by the derivative with respect to $\beta$ .

$\dfrac{\partial\ln Q_c(N,V,\beta)}{\partial\beta}=-\dfrac{\int d\mathbf{s}^N\,E\exp\left[-\beta E(\Gamma_{\mathbf{s}^N})\right]}{\int d\mathbf{s}^N\exp\left[-\beta E(\Gamma_{\mathbf{s}^N})\right]}=-\left<E\right>$

We can see that our expression evaluates to the negative of the expectation value of the system's potential energy, that is, the first order moment of its potential energy distribution. So why is all of this useful? We can run a Monte Carlo (MC) simulation in the canonical ensemble, obtain the average potential energy, and then we know the slope of $\ln Q_c$ as a function of $\beta$ , which gives the first order term of the expansion. Now what happens if we take the second derivative with respect to $\beta$ ? Using the fact that:

$\dfrac{\int d\mathbf{s}^N\,E^n\exp\left[-\beta E(\Gamma_{\mathbf{s}^N})\right]}{\int d\mathbf{s}^N\exp\left[-\beta E(\Gamma_{\mathbf{s}^N})\right]}=\left<E^n\right>$

you can work through the algebra and should end up with the following:

$\dfrac{\partial^2\ln Q_c(N,V,\beta)}{\partial\beta^2}=\left<E^2\right>-\left<E\right>^2$
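For readers who want the intermediate step, here is a sketch of the algebra. The symbol $Z$ is shorthand introduced only for this aside and denotes the configurational integral $Z\equiv\int d\mathbf{s}^N\exp\left[-\beta E(\Gamma_{\mathbf{s}^N})\right]$ ; the $1/N!$ prefactor does not depend on $\beta$ and drops out of every derivative of $\ln Q_c$ . Differentiating under the integral sign gives

$\dfrac{\partial Z}{\partial\beta}=-Z\left<E\right>,\qquad\dfrac{\partial\left(Z\left<E\right>\right)}{\partial\beta}=-Z\left<E^2\right>$

and applying the quotient rule to $\left<E\right>=Z\left<E\right>/Z$ yields

$\dfrac{\partial^2\ln Q_c}{\partial\beta^2}=-\dfrac{\partial\left<E\right>}{\partial\beta}=-\dfrac{\left(-Z\left<E^2\right>\right)Z-Z\left<E\right>\left(-Z\left<E\right>\right)}{Z^2}=\left<E^2\right>-\left<E\right>^2$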

So the second derivative can be related to the first and second moments of the potential energy distribution. Note that $\left<E^2\right>-\left<E\right>^2=\left<E^2-2E\left<E\right>+\left<E\right>^2\right>=\left<(E-\left<E\right>)^2\right>$ , such that the derivative can be expressed in terms of either the central or the non-central moments. With many steps of algebra, one can continue taking higher order derivatives:

$\dfrac{\partial^3\ln Q_{c}(N,V,\beta)}{\partial \beta ^3}=-\left<(E-\left<E\right>)^3 \right>$

$\dfrac{\partial^4 \ln Q_{c}(N,V,\beta)}{\partial \beta ^4} = \left\langle ( E - \left\langle E \right\rangle )^4 \right\rangle - 3 \left\langle ( E - \left\langle E \right\rangle )^2 \right\rangle ^2$

$\dfrac{\partial^5 \ln Q_{c}(N,V,\beta)}{\partial \beta ^5} = \cdots$
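The first four derivatives above can be sketched in code as follows. This assumes the non-central moments $\left<E^n\right>$ are already available and uses the standard identities that convert them to central moments; the function name `ln_q_derivatives` is a hypothetical choice for illustration.

```python
def ln_q_derivatives(m1, m2, m3, m4):
    """First four beta-derivatives of ln Q_c from raw moments <E^n>.

    m1..m4 are the non-central moments <E>, <E^2>, <E^3>, <E^4>.
    Returns [d1, d2, d3, d4], where dk is the k-th derivative.
    """
    # Central moments written in terms of the raw moments
    mu2 = m2 - m1**2
    mu3 = m3 - 3 * m1 * m2 + 2 * m1**3
    mu4 = m4 - 4 * m1 * m3 + 6 * m1**2 * m2 - 3 * m1**4
    return [-m1, mu2, -mu3, mu4 - 3 * mu2**2]
```

A convenient sanity check is a two-level toy system with energies 0 and 1, for which every raw moment equals the occupation probability $p$ of the upper level and the derivatives have simple closed forms in $p$ .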

The evaluation of these derivatives shows the natural log of the partition function is a cumulant generating function: the $a$-th derivative with respect to $\beta$ is $(-1)^a$ times the $a$-th cumulant of the potential energy distribution. See the Wikipedia entry for more math details on the relationship between cumulants, moments, central-moments, etc., but the fact that $\ln Q_c$ is cumulant generating means we can determine its slope, curvature, and higher derivatives with respect to $\beta$ by computing the moments of the potential energy distribution as sampled during an MC simulation. One can use either the central or non-central moments to evaluate these derivatives, but in practice it saves an enormous amount of disk space to use the non-central moments, since one can simply accumulate running sums of $E,E^2,E^3,...$ throughout the course of the simulation instead of storing every sampled energy. Now the partition function can be estimated at a new temperature, provided the Taylor expansion is valid over the extrapolation range and enough statistics have been accumulated to accurately estimate the moments of the energy distribution.
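The running accumulation of $E,E^2,E^3,...$ might look something like the sketch below. The class `EnergyMomentAccumulator` and its methods are hypothetical names used for illustration, not part of any existing MC code; the energies would come from whatever sampling loop the simulation already has.

```python
class EnergyMomentAccumulator:
    """Running sums of E, E^2, ..., E^order collected during a simulation."""

    def __init__(self, order=4):
        self.n_samples = 0
        self.sums = [0.0] * order  # sums[k] holds the sum of E^(k+1)

    def add(self, energy):
        """Record one sampled potential energy."""
        self.n_samples += 1
        power = 1.0
        for k in range(len(self.sums)):
            power *= energy
            self.sums[k] += power

    def raw_moments(self):
        """Return the non-central moments [<E>, <E^2>, ...]."""
        if self.n_samples == 0:
            raise ValueError("no energies have been accumulated")
        return [s / self.n_samples for s in self.sums]
```

Only `order + 1` numbers are kept in memory regardless of how long the simulation runs. One caveat is that differences of large raw moments can lose floating-point precision when the energies are large in magnitude, so subtracting a fixed reference energy before accumulating may be worth considering.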