Standardize Individual Covariates

When the mean value of an individual covariate is very large or very small, or the covariate ranges over several orders of magnitude, the numerical optimization algorithm may fail to find the correct parameter estimates.  To prevent this, the values of individual covariates can be scaled, or standardized.  The usual method in MARK is to compute the mean (x-bar) and standard deviation (SD) of the individual covariate.  Each value is then standardized by the transformation

(x - x-bar)/SD

so that the new variable has mean zero and standard deviation 1, and most of its values fall roughly between -3 and +3.  This transformation is commonly called a z transformation because, when the original covariate is approximately normally distributed, the transformed variable is approximately distributed as a normal variable with mean zero and variance 1, often labeled z in basic statistics.
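The transformation can be sketched outside of MARK as follows.  This is only an illustration in Python, not MARK's own code: the function name z_standardize and the covariate values are hypothetical, and the sample standard deviation (n - 1 denominator) is an assumption, since the text above does not say which form of the SD is used.

```python
from statistics import mean, stdev


def z_standardize(values):
    """Return (z_values, x_bar, sd) for a list of covariate values."""
    x_bar = mean(values)
    sd = stdev(values)  # assumption: sample SD with an n - 1 denominator
    if sd == 0:
        raise ValueError("covariate is constant; it cannot be standardized")
    z_values = [(x - x_bar) / sd for x in values]
    return z_values, x_bar, sd


# Hypothetical individual covariate values (e.g., body mass in grams).
weights = [100.0, 250.0, 400.0, 650.0, 900.0]
z, x_bar, sd = z_standardize(weights)

print("x-bar =", x_bar, " SD =", round(sd, 3))
print("standardized values:", [round(v, 3) for v in z])
```

Keeping x-bar and SD alongside the standardized values is deliberate: both are needed later to express a covariate value of interest on the standardized scale.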

One caution is in order when the z transformation is applied to one or more individual covariates that are combined in the design matrix with another temporal or group covariate to predict a single real parameter.  Situations can arise where the real parameter estimates and the model's AIC differ between runs using the standardized covariates and runs using the unstandardized covariates.  This happens because the z transformation affects both the slope and the intercept of the model.  For example, with a logit link function and the covariate x1,

logit(S) = B0 + B1(x1 - x1-bar)/SD1 = [B0 - B1 x1-bar/SD1] + [B1/SD1] x1 ,

where the quantity in the first set of brackets is the intercept and the quantity in the second set of brackets is the slope.  This result gives the conversion between the beta parameter estimates for the standardized covariate and those for the untransformed covariate: the intercept of the untransformed analysis corresponds to the first bracketed quantity, and the slope of the untransformed analysis corresponds to the second.  When an additional temporal or group covariate, x2, is added, the result is more complicated:

logit(S) = B0 + B1(x1 - x1-bar)/SD1 + B2 x2 = [B0 - B1 x1-bar/SD1] + [B1/SD1] x1 + B2 x2 .

As a result, the intercept term for x2 when x1 is standardized with the z transformation is not the same as would be obtained if x1 were not standardized.  This difference causes the AIC values for the two models to differ because the real parameter estimates differ between the two models.
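The conversion for the single-covariate case, logit(S) = B0 + B1(x1 - x1-bar)/SD1, can be checked numerically.  The sketch below is an illustration in Python rather than anything MARK provides; the function names and all numeric values (beta estimates, mean, SD, covariate value) are hypothetical.

```python
import math


def unstandardized_betas(b0_std, b1_std, x_bar, sd):
    """Convert betas fitted on the z-transformed covariate to the raw scale."""
    intercept = b0_std - b1_std * x_bar / sd  # first bracketed quantity
    slope = b1_std / sd                       # second bracketed quantity
    return intercept, slope


def inv_logit(eta):
    """Back-transform a logit-scale value to a probability."""
    return 1.0 / (1.0 + math.exp(-eta))


# Hypothetical estimates from a run that used the standardized covariate.
b0_std, b1_std = 0.8, 0.5
x_bar, sd = 460.0, 320.0

b0_raw, b1_raw = unstandardized_betas(b0_std, b1_std, x_bar, sd)

# Both parameterizations give the same S for any covariate value x1.
x1 = 650.0
s_from_std = inv_logit(b0_std + b1_std * (x1 - x_bar) / sd)
s_from_raw = inv_logit(b0_raw + b1_raw * x1)
print(s_from_std, s_from_raw)
```

The check covers only the algebraic identity for one covariate; it does not reproduce the situation described above in which an additional covariate x2 shares the design matrix.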

An alternative to the z transformation that avoids the above problem is to use the product function in the design matrix to multiply the individual covariate by a scaling value.  As an example, suppose the individual covariate Var ranges from 100 to 900.  Using the design matrix function product(Var,0.001) in the entries of the design matrix would result in values ranging from 0.1 to 0.9, and would result in 3 more significant digits being reported in the estimate of the beta parameter for this individual covariate.
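The effect of multiplying by a scaling value can be illustrated with a short Python sketch.  The function scale_covariate is a hypothetical stand-in for what product(Var,0.001) does to each covariate value, and the beta value is made up for the example; none of this is MARK code.

```python
def scale_covariate(value, scale):
    """Hypothetical stand-in: multiply a covariate value by a constant."""
    return value * scale


scale = 0.001
var_values = [100.0, 500.0, 900.0]  # hypothetical values of Var
scaled = [scale_covariate(v, scale) for v in var_values]
print(scaled)  # values now lie between roughly 0.1 and 0.9

# Scaling moves only the slope: the beta for the scaled covariate is the
# raw-scale beta divided by the scaling value, so the intercept and the
# linear predictor are unchanged.
beta_raw = 0.0015625             # hypothetical slope on the raw scale
beta_scaled = beta_raw / scale   # about 1.5625 on the scaled covariate

for v in var_values:
    eta_raw = beta_raw * v
    eta_scaled = beta_scaled * scale_covariate(v, scale)
    print(v, eta_raw, eta_scaled)
```

Because only the slope is rescaled, the intercept keeps the same meaning as in the unscaled analysis.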

Real parameter estimates can be computed from user-specified values of individual covariates (see Individual Covariates and Real Parameter Estimates).

Time-specific and group covariates in the design matrix may also have to be scaled.