MLE Software Online Course
Click here to view presentation online
16 December 2023
For distributions where variance cannot be controlled separately from mean (e.g., Poisson, multinomial)
New probability distributions have been defined this way.
Alternatively could specify observation-specific random effects. E.g., Poisson with log of rate normal (this is GLMM, with log link function) - Easy to generalize using RTMB.
standard Pearson residuals
Problems with standard Pearson residuals
One step ahead (osa) aka recursive quantile residuals
Actual residuals typically:
Not normal
Not independent
In addition, we really want to look at residuals in some sense integrated over random effects, rather than at the best estimates of random effects
Approximation depends on approximate normality of the combined vector or parameter estimates and random estimates.
This is why we generally don’t specify non-normal distributions for random effects.
RTMB includes a helper function that checks the Laplace approximation
\[ \begin{array}{c}N_{a}=R \exp (-Z(a-r)), r \leq a \leq \max \\C_{i, a}=q_{i} S_{a} N_{a} \\p_{a}=\frac{S_{a} N_{a}}{\sum_{j=r}^{\max } S_{j} N_{j}} \\\underline{n}_{i} \sim \operatorname{multinom}\left(n_{i}, \underline{p}_{a}\right), n_{i}=\sum_{a=r}^{a=\max } n_{i, a}\end{array} \]
Multinomial apps often use data as proportions and “effective sample size” (ESS)
In such applications the negative log likelihood was written in terms of proportions and ESS
In RTMB emulate by providing dmultinom product of proportions and ESS as the data (don’t set size!)
The NLL returned will depend on the ESS but ESS cannot be estimated (nothing new for multinomial here)
Works because the internal dmultinom calcs allow non-integers.
Compound distribution, p vector comes from dirichlet then used as parameter of multinomial.
Used to introduce overdispersion relative to multinomial
Using linear form where ESS proportional to sample size
\[ \begin{array}{c}\underline{n}_{i} \sim \operatorname{DirMult}\left(n_{i}, \underline{\alpha}_{i}\right), \alpha_{i, a}>0 \\E\left(n_{i, a}\right)=n_{i} \frac{\alpha_{i, a}}{\sum_{j} \alpha_{i, j}}, E S S_{i}=\sum_{a} \alpha_{j, a} \\\alpha_{i, a}=\theta n_{i} p_{a}, E S S_{i}=\theta n_{i}\end{array} \]
Parameterize so that standard deviations and correlations can be calculated and converted into a variance-covariance matrix
Estimate the standard deviations on log-scale
Not sufficient to restrict correlations to -1 to 1 as some combinations of correlations are not consistent with feasible correlation matrices
Typical element of correlation matrix P is \(\rho_{i,j} = \sigma_{i,j}^{2}/(\sigma_{i,i}\sigma_{j,j})\)
Matrix algebra to convert correlation matrix and SDs to var-cov matrix
\[ \Sigma=\operatorname{diag}\left(\sigma_{1,1}, \cdots, \sigma_{k, k}\right) \mathrm{P} \operatorname{diag}\left(\sigma_{1,1}, \cdots, \sigma_{k, k}\right) \]
You might actually want to assume some data are multivariate normal (e.g., some catch-at-age assessments assume log of catch-at-age MVN). But often we impose structure on correlations.
You might want to allow different random effects to be correlated. E.g., we might expect the vonB function parameters for a pond will be either positively or negatively correlated with one another.
Illustrate with another very simple example
We have a set of multivariate observations assumed to come from a multivariate normal distribution
We estimate the mean vector and parameters that determine the variance-covariance matrix
Illustrate how to estimate parameters on real number line that determine a “legal” correlation matrix
Illustrate how to get the variance-covariance matrix from correlation matrix and vectors of standard deviations
See “unstructured” example R script
We have already seen how to make a vonB parameter a random effect. We could have made more than one parameter random at the same time using dnorm
We can implement this by having the vonB parameters follow a multivariate normal distribution
Larger and more realistic example data set, reparameterized the vonB (use L2 rather than t0)
20 different ponds
Each observation gives the pond ID and the length and age for an individual fish
This is not changing the underlying growth function, just how it is parameterized
Standard parameterization has been criticized for t0 being hard to interpret and for correlation with other parameters
Use L2 (length at age 2) instead. Chose age within range of observed data
\[ L_a=\widetilde{L}_{2}+\left(L_{\infty}-\widetilde{L}_{2}\right) \exp (-K(a-2)) \]
\[ \begin{array}{c}{\underline{\omega}_i = \left[\log L \infty_{i}, \log K_{i}, \log \widetilde{L 2}_{i}\right]^{T} \sim \\M V N\left(\left[\overline{\log L \infty_{i}}, \overline{\log K_{i}}, \overline{\log \widetilde{L 2}}\right]^{T}, \Sigma\right)} \\L_{i, j} \sim N\left(\operatorname{vonb}\left(\underline{\omega}_{i}, a_{i, j}\right), \sigma_{L, i, j}^{2}\right) \\\sigma_{L, i, j}=\exp \left[\operatorname{int}+\operatorname{slp} * \operatorname{vonb}\left(\underline{\omega}_{i}, a_{i, j}\right)\right] \operatorname{vonb}\left(\underline{\omega}_{i}, a_{i, j}\right)\end{array} \]