MCMC (Markov Chain Monte Carlo)
Markov chain Monte Carlo (MCMC) is a Bayesian parameter estimation procedure that is most useful in MARK for estimating variance components. The variance components analysis in MARK was previously limited to a method of moments procedure. The Metropolis-Hastings algorithm is used to obtain MCMC estimates of beta parameters. Real parameter estimates are placed in the results, but cannot be modeled as part of the MCMC procedure. That is, the MCMC procedure is limited to the beta parameters at this time, with only normally distributed hyperdistributions. Note that generally you should first run the model that you want to use for MCMC estimation as a typical MARK analysis, so that you can provide initial estimates to start the MCMC estimation. The MCMC estimation procedure is built with the logit link in mind, i.e., the normal distribution is assumed for a prior distribution in all cases except for standard deviations, where a gamma distribution is used. You can run the MCMC estimation procedure on links other than the logit link, but beware that the results will need extra checking to be sure that the procedure works appropriately with a different link function. If multiple Markov chains are run, convergence diagnostics are provided. Necessary sample sizes are also computed with the GIBBSIT procedure.
MCMC parameter estimation is requested by checking the MCMC checkbox in the Run Window. Typically, you would also check the provide initial estimates box, so that the MCMC estimation procedure can start from good starting values. To have these initial estimates available, you must already have run the model to generate the usual maximum likelihood estimates, preferably with the logit link function. After you click the Run button, a dialog window appears asking you for the following information.
Random Number Seed. The default value is zero, which means that the random number seed will be obtained as a function of the current time of day. The purpose of this entry is to allow you to specify a random number seed if you want to duplicate results from a previous analysis.
Number of 'burn in' samples. The Metropolis-Hastings algorithm randomly samples from the posterior distribution. Typically, initial samples are not completely valid because the Markov Chain has not stabilized. The burn in samples allow you to discard these initial samples.
Number of samples to store. After the initial burn in period, samples from the posterior distribution are saved to compute summary statistics describing this distribution.
Name of binary file to store samples. The samples are saved in a binary file that can be read with SAS or R code to perform more sophisticated analyses than are available from MARK.
Default SD of normal jumping distribution. For beta parameters that are not part of a hyperdistribution, the step used to generate the next value of the parameter is drawn as a normal random variable with the SD specified in this edit box. Typically, you want to accept the new parameter value about 45% of the time, so you can adjust this SD to approximately obtain this acceptance rate.
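The way the seed, burn-in, stored samples, and jumping SD interact can be illustrated with a minimal random-walk Metropolis sketch in Python. This is not MARK's implementation. It assumes a single beta parameter on the logit scale, a binomial likelihood, a normal prior, and a mean-zero normal jump; the names log_posterior and metropolis and all numeric values are hypothetical.

```python
import math
import random


def log_posterior(beta, successes, trials, prior_mean=0.0, prior_sd=10.0):
    """Binomial log-likelihood on the logit scale plus a normal log-prior.

    Assumed toy model for illustration only; constants are dropped.
    """
    log_p = -math.log1p(math.exp(-beta))   # log of inverse-logit(beta)
    log_q = -math.log1p(math.exp(beta))    # log of 1 - inverse-logit(beta)
    log_lik = successes * log_p + (trials - successes) * log_q
    log_prior = -0.5 * ((beta - prior_mean) / prior_sd) ** 2
    return log_lik + log_prior


def metropolis(successes, trials, start, jump_sd, n_burn, n_store, seed):
    """Return the stored samples and the proportion of jumps accepted."""
    rng = random.Random(seed)              # fixed seed reproduces a run
    beta = start
    current = log_posterior(beta, successes, trials)
    stored, accepted = [], 0
    for i in range(n_burn + n_store):
        # Propose a new value by a normal jump with the chosen SD.
        proposal = beta + rng.gauss(0.0, jump_sd)
        candidate = log_posterior(proposal, successes, trials)
        # Accept with probability min(1, posterior ratio).
        if rng.random() < math.exp(min(0.0, candidate - current)):
            beta, current = proposal, candidate
            if i >= n_burn:
                accepted += 1
        # Burn-in draws are discarded; later draws are stored.
        if i >= n_burn:
            stored.append(beta)
    return stored, accepted / n_store


samples, rate = metropolis(30, 40, start=1.0, jump_sd=0.5,
                           n_burn=500, n_store=2000, seed=1)
print(len(samples), round(rate, 2))
```

In a sketch like this, a larger jump_sd lowers the proportion of jumps accepted and a smaller one raises it, which is the adjustment described for reaching the target acceptance rate.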
Number of hyperdistributions. The main purpose of the MCMC algorithm in MARK is to estimate the means and variances of sets of parameters, i.e., estimate the parameters of the hyperdistribution of a set of beta parameters. Using this edit box, or the accompanying spinner, you can specify the number of hyperdistributions that you want to model.
Hyperdistribution means modeled with a design matrix. Checking this check box allows you to model the means of the hyperdistributions with linear models specified in the MCMC hyper design matrix. You must specify the number of columns you want in the design matrix, because the hyper design matrix is not as flexible as the usual design matrix used in MARK to build models.
Variance-covariance specified. Checking this check box allows you to specify the variance-covariance of the hyperdistributions. You do not need to check this box unless you want to estimate the process correlation between the beta parameters being estimated in the MCMC estimation process. If you do want to estimate this correlation, you have to construct the MCMC VCMatrix.
Prior Distribution on Parameters not in Hyperdistributions. Although the most likely use of the MCMC estimation procedure is to estimate the mean and variance of hyperdistributions, most models in MARK include other nuisance parameters in the models. These parameters also require a prior distribution. Three options are provided. The first is to ignore the prior distribution, and never use it to decide whether a new value in the Markov Chain is accepted or rejected. The second option is to specify a default prior distribution, consisting of a normal distribution with the mean and variance provided. All parameters not included in a hyperdistribution will use this normal prior. The third option is to specify the prior distribution for each parameter individually. However, only normal priors are allowed, so you can only specify a mean and standard deviation appropriate for each of the non-hyperdistribution parameters.
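The three prior options can be pictured as three ways of computing the log-prior term that enters the accept/reject decision. The Python sketch below is an illustration under assumptions, not MARK's code: the function name log_prior, the mode labels, and the use of (mean, SD) pairs are all hypothetical, and additive constants are dropped because they cancel in a Metropolis-Hastings ratio.

```python
def log_prior(values, mode, default=(0.0, 1.0), individual=None):
    """Log-prior (up to a constant) for betas outside any hyperdistribution.

    mode "none":       the prior is ignored and contributes nothing
    mode "default":    every parameter shares one normal (mean, SD)
    mode "individual": each parameter has its own normal (mean, SD)
    """
    if mode == "none":
        return 0.0
    if mode == "default":
        params = [default] * len(values)
    elif mode == "individual":
        if individual is None or len(individual) != len(values):
            raise ValueError("need one (mean, SD) pair per parameter")
        params = individual
    else:
        raise ValueError("unknown mode: " + mode)
    # Sum of normal log-density kernels, one per parameter.
    return sum(-0.5 * ((v - m) / s) ** 2 for v, (m, s) in zip(values, params))


print(log_prior([1.0, 2.0], "none"))
print(log_prior([1.0], "default", default=(0.0, 1.0)))
print(log_prior([3.0], "individual", individual=[(1.0, 2.0)]))
```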
Diagnostics. This group of controls determines what diagnostic values will be generated. Checking the GIBBSIT sample size box will produce a table of estimated sample sizes for the burn-in period and the samples to store. Selecting the multiple Markov chains option will produce a diagnostic statistic (R-hat) useful for determining if the Markov chains have adequately sampled the posterior distribution.
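As a rough illustration of what an R-hat statistic measures, the Python sketch below computes the basic Gelman-Rubin potential scale reduction factor for one parameter from several chains of equal length. The exact formula MARK uses is not stated here, so treat this as the textbook form rather than MARK's calculation; the function name r_hat is hypothetical.

```python
import statistics


def r_hat(chains):
    """Basic Gelman-Rubin statistic for chains of equal length."""
    n = len(chains[0])
    chain_means = [statistics.fmean(chain) for chain in chains]
    # W: average of the within-chain sample variances.
    within = statistics.fmean([statistics.variance(chain) for chain in chains])
    # B: between-chain variance, scaled by the chain length.
    between = n * statistics.variance(chain_means)
    # Pooled estimate of the posterior variance.
    pooled = (n - 1) / n * within + between / n
    return (pooled / within) ** 0.5


base = [float(i % 5) for i in range(100)]
shifted = [x + 10.0 for x in base]
print(r_hat([base, base]))      # chains agree: close to 1
print(r_hat([base, shifted]))   # chains disagree: well above 1
```

Values close to 1 suggest the chains are sampling the same distribution; values well above 1 suggest that more burn-in or more samples are needed.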
Once you have specified the appropriate values, or want to accept the defaults, click the OK button to continue. If you have specified that 1 or more hyperdistributions are to be modeled, then additional dialog windows will request information on these hyperdistributions.
When MCMC estimation is run in MARK, the resulting output is not stored in the Results Browser Window, or in the Results File. Rather, the output is placed in a Notepad Window, which you can then save to a file for later retrieval. Summaries provided for each of the beta parameters, hyperdistribution parameters, and real parameters are the mean, standard deviation, median, and mode, plus the percentage of trials when a new value was accepted (labeled as the proportion of jumps accepted). In addition, the percentiles of 2.5, 5, 10, 20, 80, 90, 95, and 97.5 are listed. These values can be used to create credibility intervals for the parameters. In addition, the binary output file is available to use for additional analyses with the SAS code or R code provided.
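The listed percentiles pair up into intervals: 2.5 and 97.5 bound a 95% interval, 5 and 95 a 90% interval, 10 and 90 an 80% interval, and 20 and 80 a 60% interval. A minimal Python sketch of computing these percentiles from a list of stored samples follows. It assumes the samples are already available as a plain list of numbers (reading MARK's binary file is not shown), uses linear interpolation between order statistics, which may differ from MARK's rule, and the names percentile and summarize are hypothetical.

```python
def percentile(sorted_values, pct):
    """Linearly interpolated percentile of an already sorted list."""
    position = (len(sorted_values) - 1) * pct / 100.0
    lower = int(position)
    upper = min(lower + 1, len(sorted_values) - 1)
    fraction = position - lower
    return sorted_values[lower] * (1 - fraction) + sorted_values[upper] * fraction


def summarize(samples):
    """Return the percentiles listed in the MCMC output, keyed by level."""
    ordered = sorted(samples)
    levels = [2.5, 5, 10, 20, 80, 90, 95, 97.5]
    return {level: percentile(ordered, level) for level in levels}


# Placeholder data standing in for stored posterior samples.
summary = summarize([float(i) for i in range(101)])
interval_95 = (summary[2.5], summary[97.5])
print(interval_95)
```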