Why GAMLSS?

Mikis Stasinopoulos
Bob Rigby
Gillian Heller
Fernanda De Bastiani
Niki Umlauf

Introduction

DAY 1

  • Morning

    • why GAMLSS

    • Available software

    • Practical 1

  • Afternoon

    • Distributions

    • Continuous distributions

    • Practical 2

DAY 2

  • Morning

    • Discrete Distributions

    • Mixed distributions

    • Practical 3

  • Afternoon

    • Model Fitting

    • Model Selection

    • Practical 4

DAY 3

  • Morning

    • Centile estination

    • Diagnostics and ggplots

    • Practical 5

  • Afternoon

    • Model Comparison

    • Model Interpretation

    • Discussion

Why GAMLSS

  • Statistical modelling

  • Ditributional Regression

Statistical modelling

Statistical models

“all models are wrong but some are useful”.

– George Box

  • Models should be parsimonious

  • Models should be fit for purpose and able to answer the question at hand

  • Statistical models have a stochastic component

  • All models are based on assumptions.

Assumptions

  • Assumptions are made to simplify things

  • Explicit assumptions

  • Implicit assumptions

  • it is easier to check the explicit assumptions rather the implicit ones

Model circle

Regression

  • \[ X \stackrel{\textit{M}(\theta)}{\longrightarrow} Y \]
  • \(y\): targer, the y or the dependent variable
  • \(X\): explanatory, features, x’s or independent variables or terms

Linear Model

  • standard way

\[ \begin{equation} y_i= b_0 + b_1 x_{1i} + b_2 x_{2i}, \ldots, b_p x_{pi}+ \epsilon_i \end{equation} \qquad(1)\]

Linear Model

  • different way

\[ \begin{split} y_i & \stackrel{\small{ind}}{\sim } & {N}(\mu_i, \sigma) \nonumber \\ \mu_i &=& b_0 + b_1 x_{1i} + b_2 x_{2i}, \ldots, b_p x_{pi} \end{split} \qquad(2)\]

Example: BMI data

Example: BMI fitted model

Additive Models

\[ \begin{split} y_i & \stackrel{\small{ind}}{\sim } & {N}(\mu_i, \sigma) \nonumber \\ \mu_i &=& b_0 + s_1(x_{1i}) + s_2(x_{2i}), \ldots, s_p(x_{pi}) \end{split} \qquad(3)\]

Example: additive fitted model

Machine Learning Models

\[\begin{split} y_i & \stackrel{\small{ind}}{\sim }& {N}(\mu_i, \sigma) \nonumber \\ \mu_i &=& ML(x_{1i},x_{2i}, \ldots, x_{pi}) \end{split} \qquad(4)\]

Example: neural networks

Example: regression tree

Generalised Linear Models

\[\begin{split} y_i & \stackrel{\small{ind}}{\sim }& {E}(\mu_i, \phi) \nonumber \\ g(\mu_i) &=& b_0 + b_1 x_{1i} + b_2 x_{2i}, \ldots, b_p x_{pi} \end{split} \qquad(5)\]

  • \({E}(\mu_i, \phi)\) : Exponential family

  • \(g(\mu_i)\) : the link function

Example: GLM

Example: diagnostics 1

Example: diagnostics 2

Example: diagnostics 3

Example: conclusion

  • the mean of the response is fitted fine with the linear model but the distribution is not

  • the distribution (implicit Normal) is not-adequate

  • even the explicit Gamma distribution of the GLM is not-adequate

  • therefore if we are interested on a statistic different from the mean we need something extra.

Distributional regression

Distributional regression

\[ X \stackrel{\textit{M}(\boldsymbol{\theta})}{\longrightarrow} D\left(Y|\boldsymbol{\theta}(\textbf{X})\right) \]

  • All parameters \(\boldsymbol{\theta}\) could functions of the explanatory variables \(\boldsymbol{\theta}(\textbf{X})\).

  • \(D\left(Y|\boldsymbol{\theta}(\textbf{X})\right)\) can be any \(k\) parameter distribution

Generalised Additive models for Location Scale and Shape

\[\begin{split} y_i & \stackrel{\small{ind}}{\sim }& {D}( \theta_{1i}, \ldots, \theta_{ki}) \nonumber \\ g(\theta_{1i}) &=& b_{10} + s_1({x}_{1i}) + \ldots, s_p({x}_{pi}) \nonumber\\ \ldots &=& \ldots \nonumber\\ g({\theta}_{ki}) &=& b_0 + s_1({x}_{1i}) + \ldots, s_p({x}_{pi}) \end{split} \qquad(6)\]

GAMLSS + ML

\[\begin{split} y_i & \stackrel{\small{ind}}{\sim }& {D}( \theta_{1i}, \ldots, \theta_{ki}) \nonumber \\ g({\theta}_{1i}) &=& {ML}_1({x}_{1i},{x}_{2i}, \ldots, {x}_{pi}) \nonumber \\ \ldots &=& \ldots \nonumber\\ g({\theta}_{ki}) &=& {ML}_1({x}_{1i},{x}_{2i}, \ldots, {x}_{pi}) \end{split} \qquad(7)\]

Example: GAMLSS

Example: diagnostics

Example: diagnostics 2

Example: diagnostics 3

Fitted Centiles

Figure 1: Centile-plot of the fitted m6 model

The true BMI data

The fitted model

The fitted centiles

Diagnostics 1

Diagnostics 2

Diagnostics 3

Summary

  • Distributional assumptions often needed for the response to be fitted properly

  • In the BMI example above we needed to model all the parameters of the distribution as function of the explanatory variable age.

  • Those parameters were the location parameter \(\mu\), the scale parameter, \(\sigma\), the skewness parameter, \(\nu\), and the kurtotic parameter \(\tau\)

  • Machine Learning methods are useful (especially for modelling interactions between variables) but they are not suitable if the interest do non lie in the mean.

end

back to the index

The Books