The parameters estimated in a moult analysis are mean start date,
duration and standard deviation. The estimates in the moult
package are found by maximising a likelihood. One way to construct
confidence intervals for these estimates is to first obtain an estimate
for the standard error (SE) and then assuming a normal sampling
distribution for the estimate and constructing a confidence interval
as
\[\mbox{estimate} \pm 1.96 \times \mbox{SE(estimate)}.\]
In the moult package, SEs are found from the observed information matrix (inverse of the matrix of 2nd derivatives of the log-likelihood function). The above normal confidence interval only works if the sampling distributions of the statistics are close to normal.
A second reason for introducing bootstrap confidence intervals: the
standard deviation parameter in the moult
package is
estimated on the log-scale. It’s SE on the original scale can be
approximated (using the Delta method) but finding covariances is much
more complicated, e.g. it is difficult to obtain the covariance between
start date and standard deviation in start date for both parameters on
the original scale.
One solution to solve both of the above problems (normal assumption possibly not valid, and covariances for all parameters) is bootstrapping. Bootstrapping also allows us to find confidence intervals for any derived parameters, such as the mean end date of moult.
The confint.moult
function finds percentile bootstrap
confidence intervals for a range of parameters, and also returns
bootstrap estimates of the covariance matrix and standard errors.
Bootstrapping approximates the sampling distribution of any statistic. It works by repeatedly sampling from the data with replacement, and estimating the statistic of interest in each of these re-samples. The resulting bootstrap sample can be used to estimate standard errors and to obtain confidence intervals. Bootstrap percentile intervals estimate confidence intervals using the percentiles (e.g. 2.5% and 97.5% for a 95% confidence level) of the boostrap distribution.
library(moult)
data(sanderlings)
<- moult(MIndex ~ Day, data = sanderlings)
m2 summary(m2)
#>
#> Call:
#> moult(formula = MIndex ~ Day, data = sanderlings)
#>
#>
#> Duration coefficients:
#> Estimate Std. Error
#> duration 96.08 6.902
#>
#> Mean start date coefficients:
#> Estimate Std. Error
#> mean-start-day 131.4 2.096
#>
#> Coefficients for standard deviation in start date:
#> Estimate Std. Error
#> SD-start-day 19.18 1.984
#>
#> Log-likelihood: -255 on 3 Df
We want to obtain confidence intervals for the usual parameters mean start date, duration, SD in start date. In addition we might be interested in confidence intervals for the mean end date (mean start date + duration) and the mid-date (start date + 0.5 \(\times\) duration).
In the following code the moult data are bootstrapped \(B = 1000\) times (1000 bootstrap samples).
The same moult model as used for the original data (m2
above) is fitted to each bootstrap sample, and the mean start date,
standard deviation in start date, duration, mean end date, and mid-date
are estimated. From these bootstrapped values a covariance matrix can be
obtained. Standard errors are square roots of the diagonals of this
covariance matrix.
95% bootstrap percentile intervals are obtained from the quantiles of the bootstrap distribution. Percentile intervals usually work relatively well if the bootstrap distribution is symmetric.
<- confint(m2, B = 1000, pch = 19, cex = 0.3, las = 1) boot.ci
$bootstrap.percentile.ci
boot.ci#> duration mean sd half.est end.est
#> 2.5% 77.02399 127.6861 14.13938 169.3794 208.2269
#> 97.5% 113.54653 135.5024 23.12084 187.9635 244.0933
$bootstrap.vcov
boot.ci#> duration mean sd half.est end.est
#> duration 90.539142 -3.8442473 11.872820 41.425324 86.6948951
#> mean -3.844247 4.3203077 1.324626 2.398184 0.4760604
#> sd 11.872820 1.3246263 5.066125 7.261036 13.1974465
#> half.est 41.425324 2.3981840 7.261036 23.110846 43.8235079
#> end.est 86.694895 0.4760604 13.197447 43.823508 87.1709554
$bootstrap.SE
boot.ci#> duration mean sd half.est end.est
#> 9.515206 2.078535 2.250805 4.807374 9.336539