Contents - Index

MCMC Diagnostics

MCMC is a Bayesian parameter estimation procedure that is most useful in MARK for estimating variance components.  However, it is difficult to tell if this estimation procedure has properly sampled the posterior distribution.  Two methods of ensuring that the Markov chain has properly sampled the posterior distribution are provided in MARK, plus a binary output file is produced so that the user can produce additional diagnostic tests if desired.

Typically, the GIBBSIT sample size calculation would be done first to determine the necessary sample sizes.  Then, multiple chains would be constructed with this sample size and the R-hat values checked to be sure that useful estimates have been generated.

Sample Size Calculations: GIBBSIT

Raftery and Lewis (1996) provide a program called GIBBSIT that calculates the number of iterations (sample size) required in a run the MCMC estimate.  Output is provided for the 0.025 and the 0.975 quantiles, where the burn-in and sample size needed to estimate these values to a precision of 0.0125 with probability 0.95 are given.  Note that often the sample sizes predicted by this algorithm are the largest for the confounded parmaeters (e.g., the last phi and p in a time-specific Cormack-Jolly-Seber model).  Besides the burn-in and sample size values, two additional values are provided.  The first skip value is the number of values to skip in the sample which will result in independent values in the Markov chain.   The second skip values is the number of values to skip in the sample which will be sufficient to produce a first-order Markov chain.  In a first-order Markov chain, the next value only depends on the previous value.  If the chain is completely independent, then the values do not depend on the preceding value.  So, you always expect the skip value to obtain an independent chain will be considerably larger than to obtain a first-order Markov chain.

Sometimes the sample sizes produced by the GIBBSIT procedure seem unduly large.  However, remember that you are estimating the 0.025 or the 0.975 quantile to within 0.0125 with 0.95 probability.  Thus, smaller sample sizes would suffice for a less precise estimate of these quantiles.

Multiple Chains

Gelman (1996) presents a diagnostic statistic that is used when multiple Markov chains are run.  He recommends at least 10 chains be constructed to apply the methods described below.

The following description of the procedure assumes a single parameter, with the procedure repeated for each of the estimated parameters in the model.  Define m as the number of Markov chains constructed, each of length n samples.   For each of the m chains, compute the mean (Xi) and SD (SDi) of the n estimates.  Then, using the m estimates, compute the average of the means (Xmean) and  SD values (SDmean).  Then, compute the statistics B = sum from i=1 to m (Xi - Xmean)^2 times n / (m - 1) and W = SDmean.  Then compute

R-hat = [W(n - 1)/n + B/n]/W.

If the Markov chains have converged properly, values of R-hat should be 1.0. Gelman (1996) recommends that values of R-hat for all the estimated parameters should be <1.2.  Typically, I have found that most values of R-hat for n = 10000 with 1000 burn-in samples are <1.05.