This vignette is intended to show how an analysis of mortality data would work using the BayesMoFo R package. We start by installing and loading the package.
# install package
# Recommended installation
# install.packages("BayesMoFo")
# Development version (use only if needed)
# install.packages("devtools")
# devtools::install_github("jstw1g09/Rpackage-BayesMoFo")
#load package
library(BayesMoFo)The package fits various mortality models to two types of data: age-period (AP) data and age-period-product (APP) data. AP data refers to data structured by individuals’ age and the calendar period (or year) in which events (such as deaths or exposures) are observed. Each observation corresponds to a specific age and period combination, summarising the number of events and the population at risk for that group. For example, an AP dataset might record the number of deaths and the exposure (population at risk) for individuals aged 50 in the year 2010, aged 51 in 2011, and so on.
APP data extends the AP framework by including an additional stratifying variable - the “product” - which in general can be any other stratifying variable (such as cause of death, country, deprivation level, gender/sex, geographical location/region, insurance product, marital status, socioeconomic group, smoking behaviour, etc.). Each observation in an APP dataset corresponds to a specific combination of age, period, and product, capturing the number of events and the population at risk for that stratum. For example, an APP dataset might record deaths and exposures for individuals aged 50 in 2010, by cause of death or by insurance product type.
We will separately consider each analysis.
The package accepts several types of data formats. For AP data, users can supply a data-frame, a 3-dimensional (3D) data array, or a data matrix.
The data needs to be formatted as a data.frame with
columns name Age, Year, Deaths and
Exposures. An example is provided below.
You can access a dataset in this format for comparison by running
The first few lines of the data look like the following:
head(uk_mortalitydata, n = 20)
#>    Age Year   Deaths Exposures
#> 1    0 1922 74065.19  908771.5
#> 2    1 1922 23947.07  916468.6
#> 3    2 1922 10891.03  866680.1
#> 4    3 1922  4507.01  727009.9
#> 5    4 1922  2870.01  657473.8
#> 6    5 1922  2692.41  704711.4
#> 7    6 1922  2339.34  764959.2
#> 8    7 1922  2009.12  811532.0
#> 9    8 1922  1724.79  837073.8
#> 10   9 1922  1554.36  838640.6
#> 11  10 1922  1482.40  834416.1
#> 12  11 1922  1414.43  836638.8
#> 13  12 1922  1467.02  847043.2
#> 14  13 1922  1569.20  857801.8
#> 15  14 1922  1764.97  857463.9
#> 16  15 1922  1855.20  846678.5
#> 17  16 1922  2116.68  836631.3
#> 18  17 1922  2261.27  826264.3
#> 19  18 1922  2400.99  814563.2
#> 20  19 1922  2582.88  805320.9Next, we need to format the data in the format necessary for the function runBayesMoFo to work properly. To do this, we pass the dataset (separately for death and exposure) to the function preparedata_fn, which takes the arguments ages, years, and data. In the case of deaths, we pass the data using the column Age, Year, and Claim; and in the case of exposures, we use Exposure in place of Claim.
Alternatively, users can supply the data as a 3-dimensional (3D) data
array. dxt_array_product is a 3D array containing mortality
data stratified by insurance product (see
?dxt_array_product for details), where dim one: 4 insurance
products, dim two: 83 ages, dim three: 5 years.
data("dxt_array_product")
data("Ext_array_product")
# preview of death data the 1st insurance product called "ACI"
str(dxt_array_product["ACI",,,drop = FALSE])
#>  num [1, 1:83, 1:5] 0 1.01 0 0 2 ...
#>  - attr(*, "dimnames")=List of 3
#>   ..$ : chr "ACI"
#>   ..$ : chr [1:83] "18" "19" "20" "21" ...
#>   ..$ : chr [1:5] "2016" "2017" "2018" "2019" ...Similarly, users can prepare the data by either by inputting the data
as a 3-way array, or by specifying the name of the stratum to load using
the argument strat_name:
# inputting the data as a 3-way array
death <- preparedata_fn(dxt_array_product["ACI",,,drop = FALSE], ages = 35:65, years = 2016:2020)
expo <- preparedata_fn(Ext_array_product["ACI",,,drop = FALSE], ages = 35:65, years = 2016:2020)
# specifying the name of the stratum to load using `strat_name`
death <- preparedata_fn(dxt_array_product,strat_name="ACI", ages = 35:65, years = 2016:2020)
expo <- preparedata_fn(Ext_array_product,strat_name="ACI", ages = 35:65, years = 2016:2020)Data array types are less conventional but can be useful if data has been stored as it is. Preserving this data structure is useful for JAGS implementation later (for package development purposes).
Suppose the data is provided in a 2-dimensional matrix format by age and year, commonly used in the literature. For example, the following is an illustration:
# preview of death data the 1st insurance product called "ACI"
str(dxt_array_product["ACI",,,drop = TRUE])
#>  num [1:83, 1:5] 0 1.01 0 0 2 ...
#>  - attr(*, "dimnames")=List of 2
#>   ..$ : chr [1:83] "18" "19" "20" "21" ...
#>   ..$ : chr [1:5] "2016" "2017" "2018" "2019" ...To prepare data of type matrix, users need to specify the argument
data_matrix=TRUE.
For illustrative purposes, we chose the UK mortality data.
Once the data have been prepared, they can be passed to runBayesMoFo, which is the core function in the package for estimating mortality models.
As this package is built on top of the rjags package, it
is capable of handling missing values in the death data, provided they
are coded as NA. However, if missing values are present in
the exposures data, these will be automatically replaced with a default
value of 100, and predictions will be performed using that value. If the
user wishes to perform prediction for a specific exposure value, they
can manually set the desired value of the exposure (leaving the value of
death count as NA, of course).
Users also have the option to perform model selection, depending on
their needs. If more than one model is provided in the
models argument, model selection is performed by default
using deviance information criterion (DIC). The argument
models can be set equal to:
For example, the code below fit the LC,
CMB_M3, and APCI models to the data.
Note that one can also run the individual functions rather than using the function runBayesMoFo. For example,
All other functions for analysing the output (see later) would work equally. That being said, users are highly recommend to use the function runBayesMoFo even when only one model is needed. That is,
The full list of models, with the specific names, is available by
checking ?runBayesMoFo. Alternatively, one can query the
model details through the documentation within the package,
i.e. ?fit_LC.
The argument family defines the specification for the
distribution of death. A summary is as below (note that for AP data,
just suppress the subscript \(p\)):
If family="poisson", then as proposed by Brouhns, Denuit, and Vermunt (2002), \[d_{x,t,p} \sim \text{Poisson}(E^c_{x,t,p}
m_{x,t,p}) , \] where \(d_{x,t,p}\) represents the number of deaths
at age \(x\) in year \(t\) of stratum \(p\), while \(E^c_{x,t,p}\) and \(m_{x,t,p}\) represents respectively the
corresponding central exposed to risk and central mortality rate at age
\(x\) in year \(t\) of stratum \(p\). The specification is used in
conjunction with the log link function, i.e.  \[\log(m_{x,t,p}) = \eta_{x,t,p} \] where
\(\eta_{x,t,p}\) is the predictor that
depends on the functional form of the mortality model, a full list of
which is provided in Appendices A and B.
Similarly, if family="nb" (default), then a negative
binomial distribution is fitted with the log link function, i.e. \[d_{x,t,p} \sim
\text{Negative-Binomial}(\phi,\frac{\phi}{\phi+E^c_{x,t,p} m_{x,t,p}}) ,
\] \[\log(m_{x,t,p}) = \eta_{x,t,p} ,
\] where \(\phi\) is the
overdispersion parameter. This specification of the death distribution
is standard practice for incorporating overdispersion, a phenomenon
commonly observed in mortality modelling. See Wong, Forster, and Smith (2023).
If \code{family="binomial"}, then \[d_{x,t,p} \sim \text{Binomial}(E^0_{x,t,p} ,
q_{x,t,p}) , \] where \(q_{x,t,p}\) represents the initial
mortality rate at age \(x\) in year
\(t\) of stratum \(p\), while \(E^0_{x,t,p}\approx
E^c_{x,t,p}+\frac{1}{2}d_{x,t,p}\) is the corresponding initial
exposed to risk. The binomial specification is used in conjunction with
the logit link function, i.e. \[\text{logit}(q_{x,t,p})=
\log(\frac{q_{x,t,p}}{1-q_{x,t,p}}) = \eta_{x,t,p}. \]
For example, the following fit the same set of models using the Poisson distribution for modelling number of death.
There are also other arguments for customising the MCMC sampling of the posterior distributions, as below:
n.adapt specifies the number of iterations for the
adaptation phase. See ?rjags::adapt for more details.n.chain specifies the number of parallel chains for the
posterior sampling under each model.n_iter specifies the number of iterations in the
posterior sampling.thin specifies the thinning interval for the posterior
samplingAs part of the function runBayesMoFo, forecasting
can be performed by setting the argument forecast=TRUE,
with the parameter h specifying the forecast horizon. For
example, the code below fit the LC, CMB_M3,
and APCI models to the data, and forecast the models for
h=6 time points into the future.
fitmodel_forecast <- runBayesMoFo(death, expo,
                         models = c("LC",
                                    "CBD_M3",
                                    "APCI"),
                         forecast = TRUE,
                         h = 6, 
                         n.chain = 2
                         )
#> Compiling model graph
#>    Resolving undeclared variables
#>    Allocating nodes
#> Graph information:
#>    Observed stochastic nodes: 651
#>    Unobserved stochastic nodes: 248
#>    Total graph size: 7904
#> 
#> Initializing model
#> 
#> Completed: LC (1/3)
#> Compiling model graph
#>    Resolving undeclared variables
#>    Allocating nodes
#> Graph information:
#>    Observed stochastic nodes: 651
#>    Unobserved stochastic nodes: 304
#>    Total graph size: 6383
#> 
#> Initializing model
#> 
#> NOTE: Stopping adaptation
#> 
#> 
#> Completed: CBD_M3 (2/3)
#> Compiling model graph
#>    Resolving undeclared variables
#>    Allocating nodes
#> Graph information:
#>    Observed stochastic nodes: 651
#>    Unobserved stochastic nodes: 302
#>    Total graph size: 8266
#> 
#> Initializing model
#> 
#> NOTE: Stopping adaptation
#> 
#> 
#> Completed: APCI (3/3)After running the model, users can then query the best and worst models (in terms of DIC) among the competing models.
A table showing the DIC of all models fitted can be returned too.
One can retrieve the fitted results for the best and worst performing
models, both of which are of type fit_result.
The function plot_param_fn plots all the fitted
parameters of the model specified, using posterior samples generated. If
more than one models were specified previously when running
runBayesMoFo, then only the best model will be
illustrated.
As evident in the plot, all fitted and forecasted parameters will be
included, with solid lines indicating the medians and dashed lines
representing the credible intervals generated from the posterior
samples. By default, the intervals are constructed based on 95%
credibility, but can be changed using the argument
pred_int. For instance, for 80% credible intervals,
The argument legends argument can be used to suppress
the legends for better visualisation (e.g. if visibility is blocked by
the legend boxes).
The function plot_rates_fn plots the fitted death rates
of the model specified for specific ages and years, using posterior
samples generated. Again, if more than one models were specified
previously when running runBayesMoFo, then only the
best model will be illustrated. By default, the (log) death rates will
be plotted against age for the first nine years.
As before, both fitted and forecasted death rates will be included,
with solid lines indicating the medians and dashed lines representing
the credible intervals (95% by default but can be changed using the
argument pred_int) generated from the posterior samples.
Also, observed crude death rates will also be included as coloured
dots.
plot_rates_fn(fitmodel_forecast)
#> Warning in plot_rates_fn(fitmodel_forecast): Too many years selected, only
#> printing the first 9 years.For better visualisation, one may customise the argument
plot_years to plot only selected years. Note that if more
than nine years have been specified, then only the first nine years will
be plotted.
The argument plot_type allows users to plot death rates
against year instead to better visualise temporal variations in death
rates. The argument plot_ages can be used accordingly to
specify which ages to plot.
The function summary_fn produces a summary of the model
results, including posterior means, standard deviations, medians, lower
and upper quantiles based on the credibility specified using
pred_int.
To obtain the posterior means and standard deviations of all death rates,
#posterior means
summary_fitmodel$rates_summary$mean
#posterior standard deviations
summary_fitmodel$rates_summary$stdTo obtain the posterior medians and lower/upper quantiles of all death rates,
#posterior medians
summary_fitmodel$rates_pn$median
#lower quantiles
summary_fitmodel$rates_pn$lower
#upper quantiles
summary_fitmodel$rates_pn$upperCorrespondingly, for model parameters,
Users can assess if convergence has been attained by the MCMC
posterior sampling procedure. The functions diag_rates_fn
produces (by default) trace plots, density plots, as well as effective
sample sizes of the posterior samples of death rates under the best
model. For example,
The effective sample sizes can be viewed as:
diagnostics_rates_result$ESS
#>  q[1,2,12]  q[1,2,16]  q[1,2,17] q[1,16,12] q[1,16,16] q[1,16,17] q[1,17,12] 
#>  135.41161   82.48078   43.02568  128.19355   45.68017   47.54397  129.67851 
#> q[1,17,16] q[1,17,17] 
#>   65.93268   94.86059The arguments plot_strata, plot_ages,
plot_years can be used to specify which death rates to
examine, as follows.
#> $ESS
#>  q[1,6,17]  q[1,6,21]  q[1,6,25] q[1,16,17] q[1,16,21] q[1,16,25] q[1,26,17] 
#>   29.29179   32.68255   40.86880   47.54397   40.91059   94.21674   31.55983 
#> q[1,26,21] q[1,26,25] 
#>   32.84719  237.42925The arguments trace and density can be used
to specify only plotting one of them or none according to personal
preferences.
#for only trace plots
converge_diag_rates_fn(fitmodel_forecast, plot_ages = c(35,45,55), plot_years = c(2016,2020,2024), trace = TRUE, density = FALSE)Auto-correlation plots can also displayed to check if posterior samples are too correlated.
#for only acf plots
converge_diag_rates_fn(fitmodel_forecast, plot_ages = c(35,45,55), plot_years = c(2016,2020,2024), trace = FALSE, density = FALSE, acf_plot = TRUE)#> $ESS
#>  q[1,6,17]  q[1,6,21]  q[1,6,25] q[1,16,17] q[1,16,21] q[1,16,25] q[1,26,17] 
#>   29.29179   32.68255   40.86880   47.54397   40.91059   94.21674   31.55983 
#> q[1,26,21] q[1,26,25] 
#>   32.84719  237.42925Similarly, convergence can be assessed for fitted parameters using
the function converge_diag_rates_fn.
#> NOTE: Only showing three randomly selected alpha.#> NOTE: Only showing three randomly selected beta.#> NOTE: Only showing three randomly selected kappa.#> NOTE: Only showing three randomly selected gamma.The effective sample sizes can be viewed as:
diagnostics_param_result$ESS
#>          rho    rho_gamma sigma2_kappa sigma2_gamma          phi  alpha[1,31] 
#>    969.64746     20.61668    500.96781     13.73325    569.60223   1398.90826 
#>  alpha[1,16]  alpha[1,15]   beta[1,22]   beta[1,14]    beta[1,7]  kappa[1,22] 
#>    993.10171    531.46272     11.51433     21.21420      9.20101   1707.96677 
#>  kappa[1,18]  kappa[1,13]  gamma[1,28]   gamma[1,4]  gamma[1,43] 
#>    121.54441    210.70046     13.81269     28.38499     21.55733By default, the function examines a selection of the parameters. But
the arguments plot_params can be used to specify which set
of parameters to examine, as follows.
converge_diag_param_fn(fitmodel_forecast, plot_params = c("kappa","gamma","rho","phi","sigma2_kappa"))#> NOTE: Only showing three randomly selected kappa.#> NOTE: Only showing three randomly selected gamma.
#> $ESS
#>          rho    rho_gamma          phi sigma2_kappa   kappa[1,6]   kappa[1,1] 
#>    969.64746     20.61668    569.60223    500.96781    105.06140    608.95448 
#>  kappa[1,22]  gamma[1,19]   gamma[1,1]  gamma[1,31] 
#>   1707.96677     37.41380     13.86666     12.44684To check the names of parameters available for examining:
fitmodel_forecast$result$best$param
#> [1] "alpha"        "beta"         "kappa"        "gamma"        "rho"         
#> [6] "sigma2_kappa" "rho_gamma"    "sigma2_gamma" "phi"For the rate-related parameters such as alpha, beta, kappa, gamma etc., only three of the randomly selected subset will be examined when specified. If a particular parameter is to be assessed, users need to specify clearly the indices of the parameters to be examined. For instance, the following will assess the beta parameters for the first stratum and the third age, as well as kappa for the first stratum and the fourth year.
#> $ESS
#> kappa[1,4] gamma[1,2] 
#> 168.574037   9.357196To check the full list of parameters available for examining:
colnames(fitmodel_forecast$result$best$post_sample[[1]])[!startsWith(colnames(fitmodel_forecast$result$best$post_sample[[1]]),"q[")]
#>   [1] "alpha[1,1]"   "alpha[1,2]"   "alpha[1,3]"   "alpha[1,4]"   "alpha[1,5]"  
#>   [6] "alpha[1,6]"   "alpha[1,7]"   "alpha[1,8]"   "alpha[1,9]"   "alpha[1,10]" 
#>  [11] "alpha[1,11]"  "alpha[1,12]"  "alpha[1,13]"  "alpha[1,14]"  "alpha[1,15]" 
#>  [16] "alpha[1,16]"  "alpha[1,17]"  "alpha[1,18]"  "alpha[1,19]"  "alpha[1,20]" 
#>  [21] "alpha[1,21]"  "alpha[1,22]"  "alpha[1,23]"  "alpha[1,24]"  "alpha[1,25]" 
#>  [26] "alpha[1,26]"  "alpha[1,27]"  "alpha[1,28]"  "alpha[1,29]"  "alpha[1,30]" 
#>  [31] "alpha[1,31]"  "beta[1,1]"    "beta[1,2]"    "beta[1,3]"    "beta[1,4]"   
#>  [36] "beta[1,5]"    "beta[1,6]"    "beta[1,7]"    "beta[1,8]"    "beta[1,9]"   
#>  [41] "beta[1,10]"   "beta[1,11]"   "beta[1,12]"   "beta[1,13]"   "beta[1,14]"  
#>  [46] "beta[1,15]"   "beta[1,16]"   "beta[1,17]"   "beta[1,18]"   "beta[1,19]"  
#>  [51] "beta[1,20]"   "beta[1,21]"   "beta[1,22]"   "beta[1,23]"   "beta[1,24]"  
#>  [56] "beta[1,25]"   "beta[1,26]"   "beta[1,27]"   "beta[1,28]"   "beta[1,29]"  
#>  [61] "beta[1,30]"   "beta[1,31]"   "gamma[1,1]"   "gamma[1,2]"   "gamma[1,3]"  
#>  [66] "gamma[1,4]"   "gamma[1,5]"   "gamma[1,6]"   "gamma[1,7]"   "gamma[1,8]"  
#>  [71] "gamma[1,9]"   "gamma[1,10]"  "gamma[1,11]"  "gamma[1,12]"  "gamma[1,13]" 
#>  [76] "gamma[1,14]"  "gamma[1,15]"  "gamma[1,16]"  "gamma[1,17]"  "gamma[1,18]" 
#>  [81] "gamma[1,19]"  "gamma[1,20]"  "gamma[1,21]"  "gamma[1,22]"  "gamma[1,23]" 
#>  [86] "gamma[1,24]"  "gamma[1,25]"  "gamma[1,26]"  "gamma[1,27]"  "gamma[1,28]" 
#>  [91] "gamma[1,29]"  "gamma[1,30]"  "gamma[1,31]"  "gamma[1,32]"  "gamma[1,33]" 
#>  [96] "gamma[1,34]"  "gamma[1,35]"  "gamma[1,36]"  "gamma[1,37]"  "gamma[1,38]" 
#> [101] "gamma[1,39]"  "gamma[1,40]"  "gamma[1,41]"  "gamma[1,42]"  "gamma[1,43]" 
#> [106] "gamma[1,44]"  "gamma[1,45]"  "gamma[1,46]"  "gamma[1,47]"  "gamma[1,48]" 
#> [111] "gamma[1,49]"  "gamma[1,50]"  "gamma[1,51]"  "gamma[1,52]"  "gamma[1,53]" 
#> [116] "gamma[1,54]"  "gamma[1,55]"  "gamma[1,56]"  "gamma[1,57]"  "kappa[1,1]"  
#> [121] "kappa[1,2]"   "kappa[1,3]"   "kappa[1,4]"   "kappa[1,5]"   "kappa[1,6]"  
#> [126] "kappa[1,7]"   "kappa[1,8]"   "kappa[1,9]"   "kappa[1,10]"  "kappa[1,11]" 
#> [131] "kappa[1,12]"  "kappa[1,13]"  "kappa[1,14]"  "kappa[1,15]"  "kappa[1,16]" 
#> [136] "kappa[1,17]"  "kappa[1,18]"  "kappa[1,19]"  "kappa[1,20]"  "kappa[1,21]" 
#> [141] "kappa[1,22]"  "kappa[1,23]"  "kappa[1,24]"  "kappa[1,25]"  "kappa[1,26]" 
#> [146] "kappa[1,27]"  "phi"          "rho"          "rho_gamma"    "sigma2_gamma"
#> [151] "sigma2_kappa"The arguments trace and density can be used
to specify only plotting one of them or none according to personal
preferences.
#for only trace plots
converge_diag_param_fn(fitmodel_forecast, plot_params = c("kappa[1,4]","gamma[1,2]"), trace = TRUE, density = FALSE)Auto-correlation plots can also displayed to check if posterior samples are correlated.
#for only acf plots
converge_diag_param_fn(fitmodel_forecast, plot_params = c("kappa[1,4]","gamma[1,2]"), trace = FALSE, density = FALSE, acf_plot = TRUE)#> $ESS
#> kappa[1,4] gamma[1,2] 
#> 168.574037   9.357196Several other commonly used MCMC convergence diagnostics, such as
Gelman’s R statistic (Gelman and Rubin
(1992)), Geweke’s Z scores (Geweke
(1991)), Heidel’s Stationarity and Half-width tests (see Heidelberger and Welch (1981) and Heidelberger and Welch (1983) for more details),
can be computed and illustrated using the function
converge_diag_fn.
converge_diag_result<-converge_diag_fn(fitmodel_forecast, plot_gelman = TRUE, plot_geweke = TRUE)
#> Note: no convergence issues identified.#Gelman's R
head(converge_diag_result$gelman_diag$psrf)
#>          Point est. Upper C.I.
#> q[1,1,1]   1.007521   1.022611
#> q[1,2,1]   1.122386   1.438192
#> q[1,3,1]   1.677692   2.917502
#> q[1,4,1]   1.003516   1.003721
#> q[1,5,1]   1.269089   1.861320
#> q[1,6,1]   1.543866   2.665951
#Geweke's Z
head(converge_diag_result$geweke_diag$z)
#>  q[1,1,1]  q[1,2,1]  q[1,3,1]  q[1,4,1]  q[1,5,1]  q[1,6,1] 
#>  3.606717  1.164595 -5.137911 -2.043750 -1.228851 -5.200268
#Heidel's Stationarity and Half-width tests
head(converge_diag_result$heidel_diag)
#>          stest start     pvalue htest         mean    halfwidth
#> q[1,1,1]     1     1 0.18887881     1 0.0007217432 4.272638e-06
#> q[1,2,1]     1     1 0.85854603     1 0.0007715985 5.760703e-06
#> q[1,3,1]     1   801 0.12233304     1 0.0008153714 5.008481e-06
#> q[1,4,1]     1     1 0.50954807     1 0.0008232145 6.828502e-06
#> q[1,5,1]     1   601 0.07154669     1 0.0008998373 4.979771e-06
#> q[1,6,1]     1   601 0.23798990     1 0.0009619807 8.592257e-06Similarly to before, the data needs to be formatted in a data.frame with columns name Age, Year, Deaths,Exposures and Cause. Some examples are provided below.
data(uk_deathscausedata)
head(uk_deathscausedata, n = 10)
#> # A tibble: 10 × 5
#>      Age  Year Deaths Exposures Cause
#>    <dbl> <int>  <dbl>     <dbl> <chr>
#>  1    15  2001   0     1653794. L057 
#>  2    15  2001   0.96  1653794. L108 
#>  3    15  2001   0     1653794. L110 
#>  4    15  2001   0     1653794. L115 
#>  5    15  2001   0     1653794. L132 
#>  6    20  2001   0     1577382. L057 
#>  7    20  2001   3.03  1577382. L108 
#>  8    20  2001   2.02  1577382. L110 
#>  9    20  2001   1.01  1577382. L115 
#> 10    20  2001   0     1577382. L132death <- preparedata_fn(uk_deathscausedata[,c("Age","Year","Deaths","Cause")])
expo <- preparedata_fn(uk_deathscausedata[,c("Age","Year","Exposures","Cause")])
#or if require a subset of the data
death <- preparedata_fn(uk_deathscausedata[,c("Age","Year","Deaths","Cause")], 
                                 ages = seq(45,90,by=5), years = 2001:2020)
expo <- preparedata_fn(uk_deathscausedata[,c("Age","Year","Exposures","Cause")], 
                                ages = seq(45,90,by=5), years = 2001:2020)
str(death)
#> List of 7
#>  $ data      : num [1:5, 1:10, 1:20] 272 506 588 37 39 ...
#>   ..- attr(*, "dimnames")=List of 3
#>   .. ..$ : chr [1:5] "L057" "L108" "L110" "L115" ...
#>   .. ..$ : chr [1:10] "45" "50" "55" "60" ...
#>   .. ..$ : chr [1:20] "2001" "2002" "2003" "2004" ...
#>  $ strat_name: chr [1:5] "L057" "L108" "L110" "L115" ...
#>  $ ages      : num [1:10] 45 50 55 60 65 70 75 80 85 90
#>  $ years     : int [1:20] 2001 2002 2003 2004 2005 2006 2007 2008 2009 2010 ...
#>  $ n_strat   : int 5
#>  $ n_ages    : int 10
#>  $ n_years   : int 20str(expo)
#> List of 7
#>  $ data      : num [1:5, 1:10, 1:20] 1643774 1643774 1643774 1643774 1643774 ...
#>   ..- attr(*, "dimnames")=List of 3
#>   .. ..$ : chr [1:5] "L057" "L108" "L110" "L115" ...
#>   .. ..$ : chr [1:10] "45" "50" "55" "60" ...
#>   .. ..$ : chr [1:20] "2001" "2002" "2003" "2004" ...
#>  $ strat_name: chr [1:5] "L057" "L108" "L110" "L115" ...
#>  $ ages      : num [1:10] 45 50 55 60 65 70 75 80 85 90
#>  $ years     : int [1:20] 2001 2002 2003 2004 2005 2006 2007 2008 2009 2010 ...
#>  $ n_strat   : int 5
#>  $ n_ages    : int 10
#>  $ n_years   : int 20As shown above, the cause of death data consists of numbers of death and exposures for five causes of death, spanning years 2001-2020 and 5-year age groups between 45-90.
Alternatively, users may wish to supply data which is already sorted as a 3D array (dim 1: strata, dim 2: ages, dim 3: years).
The syntax is similar to the case of the age-period data. The
function automatically recognises the structure of the data after being
processed by preparedata_fn. For illustration, we fit the
model by Li and Lee (2005) on the cause of
death data.
fitmodel_forecast <- runBayesMoFo(death, expo,
                         models = "MLiLee",
                         forecast = TRUE,
                         h = 5,
                         quiet = TRUE,
                         n.chain = 2
                         )
#> NOTE: Stopping adaptation
#> 
#> 
#> Completed: MLiLee (1/1)The argument quiet=TRUE was used to suppress messages
generated during model compilation stage.
The argument plot_type allows users to plot death rates
against year instead to better visualise temporal variations in death
rates. The argument plot_ages can be used accordingly to
specify which ages to plot.
diagnostics_param_result<-converge_diag_rates_fn(fitmodel_forecast, plot_ages = c(45,65,85), plot_years = c(2005,2020,2025))#for only acf plots
converge_diag_rates_fn(fitmodel_forecast, plot_ages = c(45,65,85), plot_years = c(2005,2020,2025), trace = FALSE, density = FALSE, acf_plot = TRUE)#> $ESS
#>   q[1,1,5]  q[1,1,20]  q[1,1,25]   q[1,5,5]  q[1,5,20]  q[1,5,25]   q[1,9,5] 
#>  287.54106  189.70343  915.79553  198.28783  149.41207  911.67527  171.50621 
#>  q[1,9,20]  q[1,9,25]   q[3,1,5]  q[3,1,20]  q[3,1,25]   q[3,5,5]  q[3,5,20] 
#>   94.41890  271.40731  279.57375  107.24266 1315.06789  194.84396  169.37361 
#>  q[3,5,25]   q[3,9,5]  q[3,9,20]  q[3,9,25]   q[5,1,5]  q[5,1,20]  q[5,1,25] 
#> 1118.82091  140.48926  110.08231  962.87939  288.64842   78.80306   61.44129 
#>   q[5,5,5]  q[5,5,20]  q[5,5,25]   q[5,9,5]  q[5,9,20]  q[5,9,25] 
#>  137.38822   54.26480  110.48131  115.85330   80.25128 1310.99110The effective sample sizes can be viewed as:
diagnostics_param_result$ESS
#>   q[2,1,5]  q[2,1,20]  q[2,1,25]   q[2,5,5]  q[2,5,20]  q[2,5,25]   q[2,9,5] 
#>  344.09052   75.05483  527.46728  177.55126  122.83274 1470.88709  108.79459 
#>  q[2,9,20]  q[2,9,25]   q[3,1,5]  q[3,1,20]  q[3,1,25]   q[3,5,5]  q[3,5,20] 
#>  218.11406 1654.60154  279.57375  107.24266 1315.06789  194.84396  169.37361 
#>  q[3,5,25]   q[3,9,5]  q[3,9,20]  q[3,9,25]   q[5,1,5]  q[5,1,20]  q[5,1,25] 
#> 1118.82091  140.48926  110.08231  962.87939  288.64842   78.80306   61.44129 
#>   q[5,5,5]  q[5,5,20]  q[5,5,25]   q[5,9,5]  q[5,9,20]  q[5,9,25] 
#>  137.38822   54.26480  110.48131  115.85330   80.25128 1310.99110By default, the function examines a selection of the parameters. But
the arguments plot_params can be used to specify which set
of parameters to examine, as follows.
converge_diag_param_fn(fitmodel_forecast, plot_params = c("beta","kappa","rho","phi","sigma2_kappa"))#> NOTE: Only showing three randomly selected beta.#> NOTE: Only showing three randomly selected kappa.
#> $ESS
#>    rho_Kappa    rho_kappa          phi sigma2_kappa    beta[2,9]    beta[3,5] 
#>    484.89142   1295.15052    132.26673    302.40896     37.17646     45.08807 
#>   beta[4,10]  kappa[3,19]   kappa[4,3]  kappa[2,18] 
#>     43.93540    141.73940    132.60227    127.77479To check the names of parameters available for examining:
fitmodel_forecast$result$best$param
#>  [1] "alpha"        "beta"         "kappa"        "Beta"         "Kappa"       
#>  [6] "eta_kappa"    "rho_kappa"    "sigma2_kappa" "eta_Kappa"    "rho_Kappa"   
#> [11] "sigma2_Kappa" "phi"For the predictor-related parameters such as alpha, beta, kappa, gamma etc., only three of the randomly selected subset will be examined when specified. If a particular parameter is to be assessed, users need to specify clearly the indices of the parameters to be examined. For instance, the following will assess the beta parameters for the first stratum and the third age, as well as kappa for the second stratum and the fourth year.
#> $ESS
#>  beta[1,3] kappa[2,4] 
#>   23.57477  151.44685To check the full list of parameters available for examining:
colnames(fitmodel_forecast$result$best$post_sample[[1]])[!startsWith(colnames(fitmodel_forecast$result$best$post_sample[[1]]),"q[")]
#>   [1] "Beta[1]"      "Beta[2]"      "Beta[3]"      "Beta[4]"      "Beta[5]"     
#>   [6] "Beta[6]"      "Beta[7]"      "Beta[8]"      "Beta[9]"      "Beta[10]"    
#>  [11] "Kappa[1]"     "Kappa[2]"     "Kappa[3]"     "Kappa[4]"     "Kappa[5]"    
#>  [16] "Kappa[6]"     "Kappa[7]"     "Kappa[8]"     "Kappa[9]"     "Kappa[10]"   
#>  [21] "Kappa[11]"    "Kappa[12]"    "Kappa[13]"    "Kappa[14]"    "Kappa[15]"   
#>  [26] "Kappa[16]"    "Kappa[17]"    "Kappa[18]"    "Kappa[19]"    "Kappa[20]"   
#>  [31] "Kappa[21]"    "Kappa[22]"    "Kappa[23]"    "Kappa[24]"    "Kappa[25]"   
#>  [36] "alpha[1,1]"   "alpha[2,1]"   "alpha[3,1]"   "alpha[4,1]"   "alpha[5,1]"  
#>  [41] "alpha[1,2]"   "alpha[2,2]"   "alpha[3,2]"   "alpha[4,2]"   "alpha[5,2]"  
#>  [46] "alpha[1,3]"   "alpha[2,3]"   "alpha[3,3]"   "alpha[4,3]"   "alpha[5,3]"  
#>  [51] "alpha[1,4]"   "alpha[2,4]"   "alpha[3,4]"   "alpha[4,4]"   "alpha[5,4]"  
#>  [56] "alpha[1,5]"   "alpha[2,5]"   "alpha[3,5]"   "alpha[4,5]"   "alpha[5,5]"  
#>  [61] "alpha[1,6]"   "alpha[2,6]"   "alpha[3,6]"   "alpha[4,6]"   "alpha[5,6]"  
#>  [66] "alpha[1,7]"   "alpha[2,7]"   "alpha[3,7]"   "alpha[4,7]"   "alpha[5,7]"  
#>  [71] "alpha[1,8]"   "alpha[2,8]"   "alpha[3,8]"   "alpha[4,8]"   "alpha[5,8]"  
#>  [76] "alpha[1,9]"   "alpha[2,9]"   "alpha[3,9]"   "alpha[4,9]"   "alpha[5,9]"  
#>  [81] "alpha[1,10]"  "alpha[2,10]"  "alpha[3,10]"  "alpha[4,10]"  "alpha[5,10]" 
#>  [86] "beta[1,1]"    "beta[2,1]"    "beta[3,1]"    "beta[4,1]"    "beta[1,2]"   
#>  [91] "beta[2,2]"    "beta[3,2]"    "beta[4,2]"    "beta[1,3]"    "beta[2,3]"   
#>  [96] "beta[3,3]"    "beta[4,3]"    "beta[1,4]"    "beta[2,4]"    "beta[3,4]"   
#> [101] "beta[4,4]"    "beta[1,5]"    "beta[2,5]"    "beta[3,5]"    "beta[4,5]"   
#> [106] "beta[1,6]"    "beta[2,6]"    "beta[3,6]"    "beta[4,6]"    "beta[1,7]"   
#> [111] "beta[2,7]"    "beta[3,7]"    "beta[4,7]"    "beta[1,8]"    "beta[2,8]"   
#> [116] "beta[3,8]"    "beta[4,8]"    "beta[1,9]"    "beta[2,9]"    "beta[3,9]"   
#> [121] "beta[4,9]"    "beta[1,10]"   "beta[2,10]"   "beta[3,10]"   "beta[4,10]"  
#> [126] "eta_Kappa[1]" "eta_Kappa[2]" "eta_kappa[1]" "eta_kappa[2]" "kappa[1,1]"  
#> [131] "kappa[2,1]"   "kappa[3,1]"   "kappa[4,1]"   "kappa[1,2]"   "kappa[2,2]"  
#> [136] "kappa[3,2]"   "kappa[4,2]"   "kappa[1,3]"   "kappa[2,3]"   "kappa[3,3]"  
#> [141] "kappa[4,3]"   "kappa[1,4]"   "kappa[2,4]"   "kappa[3,4]"   "kappa[4,4]"  
#> [146] "kappa[1,5]"   "kappa[2,5]"   "kappa[3,5]"   "kappa[4,5]"   "kappa[1,6]"  
#> [151] "kappa[2,6]"   "kappa[3,6]"   "kappa[4,6]"   "kappa[1,7]"   "kappa[2,7]"  
#> [156] "kappa[3,7]"   "kappa[4,7]"   "kappa[1,8]"   "kappa[2,8]"   "kappa[3,8]"  
#> [161] "kappa[4,8]"   "kappa[1,9]"   "kappa[2,9]"   "kappa[3,9]"   "kappa[4,9]"  
#> [166] "kappa[1,10]"  "kappa[2,10]"  "kappa[3,10]"  "kappa[4,10]"  "kappa[1,11]" 
#> [171] "kappa[2,11]"  "kappa[3,11]"  "kappa[4,11]"  "kappa[1,12]"  "kappa[2,12]" 
#> [176] "kappa[3,12]"  "kappa[4,12]"  "kappa[1,13]"  "kappa[2,13]"  "kappa[3,13]" 
#> [181] "kappa[4,13]"  "kappa[1,14]"  "kappa[2,14]"  "kappa[3,14]"  "kappa[4,14]" 
#> [186] "kappa[1,15]"  "kappa[2,15]"  "kappa[3,15]"  "kappa[4,15]"  "kappa[1,16]" 
#> [191] "kappa[2,16]"  "kappa[3,16]"  "kappa[4,16]"  "kappa[1,17]"  "kappa[2,17]" 
#> [196] "kappa[3,17]"  "kappa[4,17]"  "kappa[1,18]"  "kappa[2,18]"  "kappa[3,18]" 
#> [201] "kappa[4,18]"  "kappa[1,19]"  "kappa[2,19]"  "kappa[3,19]"  "kappa[4,19]" 
#> [206] "kappa[1,20]"  "kappa[2,20]"  "kappa[3,20]"  "kappa[4,20]"  "kappa[1,21]" 
#> [211] "kappa[2,21]"  "kappa[3,21]"  "kappa[4,21]"  "kappa[1,22]"  "kappa[2,22]" 
#> [216] "kappa[3,22]"  "kappa[4,22]"  "kappa[1,23]"  "kappa[2,23]"  "kappa[3,23]" 
#> [221] "kappa[4,23]"  "kappa[1,24]"  "kappa[2,24]"  "kappa[3,24]"  "kappa[4,24]" 
#> [226] "kappa[1,25]"  "kappa[2,25]"  "kappa[3,25]"  "kappa[4,25]"  "phi"         
#> [231] "rho_Kappa"    "rho_kappa"    "sigma2_Kappa" "sigma2_kappa"The arguments trace and density can be used
to specify only plotting one of them or none according to personal
preferences.
#for only trace plots
converge_diag_param_fn(fitmodel_forecast, plot_params = c("beta[1,3]","kappa[2,4]"), trace = TRUE, density = FALSE)Auto-correlation plots can also displayed to check if posterior samples are correlated.
#for only acf plots
converge_diag_param_fn(fitmodel_forecast, plot_params = c("beta[1,3]","kappa[2,4]"), trace = FALSE, density = FALSE, acf_plot = TRUE)#> $ESS
#>  beta[1,3] kappa[2,4] 
#>   23.57477  151.44685Convergence diagnostics can be applied as usual, but are not run in the example below.
#Gelman's R
head(converge_diag_result$gelman_diag$psrf)
#Geweke's Z
head(converge_diag_result$geweke_diag$z)
#Heidel's Stationarity and Half-width tests
head(converge_diag_result$heidel_diag)Interestingly, there is an article discussing the use of common (shared) cohort effects for modelling cause of death data as described by Cairns (2023). Thus, we can fit some of the models that incorporate shared cohort effects as below (NOT RUN).
fitmodel_forecast <- runBayesMoFo(death, expo,
                         models = c("APCI_sharegamma",
                                    "RH_sharegamma"),
                         forecast = TRUE,
                         h = 5,
                         quiet = TRUE
                         )| Model | Predictor, \(\eta_{x,t}\) | 
|---|---|
| APCI | \(a_{x}+b_{x}(t-\bar{t})+k_{t} + \gamma_{c}\) | 
| CBD_M3 | \(a_{x}+k_{t} + \gamma_{c}\) | 
| CBD_M5 | \(k^1_{t} + k^2_{t}(x-\bar{x})\) | 
| CBD_M6 | \(k^1_{t} + k^2_{t}(x-\bar{x}) +\gamma_{c}\) | 
| CBD_M7 | \(k^1_{t} + k^2_{t}(x-\bar{x}) + k^3_{t}((x-\bar{x})^2-\hat{\sigma}_x^2) +\gamma_{c}\) | 
| CBD_M8 | \(k^1_{t} + k^2_{t}(x-\bar{x}) +\gamma_{c}(constant-x)\) | 
| LC | \(a_{x}+b_{x}k_{t}\) | 
| MLiLee | \(a_{x}+B_xK_t\) | 
| PLAT | \(a_{x}+k^1_{t} + k^2_{t}(\bar{x}-x) + k^3_{t}(\bar{x}-x)^+ +\gamma_{c}\) | 
| RH | \(a_{x}+b_{x}k_{t} + \gamma_{c}\) | 
| Model | Predictor, \(\eta_{x,t,p}\) | 
|---|---|
| APCI | \(a_{x,p}+b_{x,p}(t-\bar{t})+k_{t,p} + \gamma_{c,p}\) | 
| APCI_sharealpha | \(a_{x}+b_{x,p}(t-\bar{t})+k_{t,p} + \gamma_{c,p}\) | 
| APCI_sharebeta | \(a_{x,p}+b_{x}(t-\bar{t})+k_{t,p} + \gamma_{c,p}\) | 
| APCI_sharegamma | \(a_{x,p}+b_{x,p}(t-\bar{t})+k_{t,p} + \gamma_{c}\) | 
| APCI_sharealpha_sharebeta | \(a_{x}+b_{x}(t-\bar{t})+k_{t,p} + \gamma_{c,p}\) | 
| APCI_sharealpha_sharegamma | \(a_{x}+b_{x,p}(t-\bar{t})+k_{t,p} + \gamma_{c}\) | 
| APCI_sharebeta_sharegamma | \(a_{x,p}+b_{x}(t-\bar{t})+k_{t,p} + \gamma_{c}\) | 
| APCI_shareall | \(a_{x}+b_{x}(t-\bar{t})+k_{t,p} + \gamma_{c}\) | 
| CBD_M3 | \(a_{x,p}+k_{t,p} + \gamma_{c,p}\) | 
| CBD_M3_sharealpha | \(a_{x}+k_{t,p} + \gamma_{c,p}\) | 
| CBD_M3_sharegamma | \(a_{x,p}+k_{t,p} + \gamma_{c}\) | 
| CBD_M3_shareall | \(a_{x}+k_{t,p} + \gamma_{c}\) | 
| CBD_M5 | \(k^1_{t,p} + k^2_{t,p}(x-\bar{x})\) | 
| CBD_M6 | \(k^1_{t,p} + k^2_{t,p}(x-\bar{x}) +\gamma_{c,p}\) | 
| CBD_M6_sharegamma | \(k^1_{t,p} + k^2_{t,p}(x-\bar{x}) +\gamma_{c}\) | 
| CBD_M7 | \(k^1_{t,p} + k^2_{t,p}(x-\bar{x}) + k^3_{t,p}((x-\bar{x})^2-\hat{\sigma}_x^2) +\gamma_{c,p}\) | 
| CBD_M7_sharegamma | \(k^1_{t,p} + k^2_{t,p}(x-\bar{x}) + k^3_{t,p}((x-\bar{x})^2-\hat{\sigma}_x^2) +\gamma_{c}\) | 
| CBD_M8 | \(k^1_{t,p} + k^2_{t,p}(x-\bar{x}) +\gamma_{c,p}(c_p-x)\) | 
| CBD_M8_sharegamma | \(k^1_{t,p} + k^2_{t,p}(x-\bar{x}) +\gamma_{c}(c_p-x)\) | 
| LC | \(a_{x,p}+b_{x,p}k_{t,p}\) | 
| LC_sharealpha | \(a_{x}+b_{x,p}k_{t,p}\) | 
| LC_sharebeta | \(a_{x,p}+b_{x}k_{t,p}\) | 
| LC_shareall | \(a_{x}+b_{x}k_{t,p}\) | 
| M1A | \(a_{x}+c_p+b_xk_t\) | 
| M1U | \(a_{x,p}+b_xk_t\) | 
| M1M | \(a_{x}c_p+b_xk_t\) | 
| M2A1 | \(a_{x}+(c_p+b_x)k_t\) | 
| M2A2 | \(a_{x}+b_{x,p}k_t\) | 
| M2Y1 | \(a_{x}+b_x(k_t+c_p)\) | 
| M2Y2 | \(a_{x}+b_{x}k_{t,p}\) | 
| MLiLee | \(a_{x,p}+b_{x,p}k_{t,p}+B_xK_t\) | 
| MLiLee_sharealpha | \(a_{x}+b_{x,p}k_{t,p}+B_xK_t\) | 
| PLAT | \(a_{x,p}+k^1_{t,p} + k^2_{t,p}(\bar{x}-x) + k^3_{t,p}(\bar{x}-x)^+ +\gamma_{c,p}\) | 
| PLAT_sharealpha | \(a_{x}+k^1_{t,p} + k^2_{t,p}(\bar{x}-x) + k^3_{t,p}(\bar{x}-x)^+ +\gamma_{c,p}\) | 
| PLAT_sharegamma | \(a_{x,p}+k^1_{t,p} + k^2_{t,p}(\bar{x}-x) + k^3_{t,p}(\bar{x}-x)^+ +\gamma_{c}\) | 
| PLAT_shareall | \(a_{x}+k^1_{t,p} + k^2_{t,p}(\bar{x}-x) + k^3_{t,p}(\bar{x}-x)^+ +\gamma_{c}\) | 
| RH | \(a_{x,p}+b_{x,p}k_{t,p} + \gamma_{c,p}\) | 
| RH_sharealpha | \(a_{x}+b_{x,p}k_{t,p} + \gamma_{c,p}\) | 
| RH_sharebeta | \(a_{x,p}+b_{x}k_{t,p} + \gamma_{c,p}\) | 
| RH_sharegamma | \(a_{x,p}+b_{x,p}k_{t,p} + \gamma_{c}\) | 
| RH_sharealpha_sharebeta | \(a_{x}+b_{x}k_{t,p} + \gamma_{c,p}\) | 
| RH_sharealpha_sharegamma | \(a_{x}+b_{x,p}k_{t,p} + \gamma_{c}\) | 
| RH_sharebeta_sharegamma | \(a_{x,p}+b_{x}k_{t,p} + \gamma_{c}\) | 
| RH_shareall | \(a_{x}+b_{x}k_{t,p} + \gamma_{c}\) |