Australasian Mathematical Psychology Conference 2019

A new approach to compositional data analysis

Michael Smithson
Psychology, The Australian National University

For many years, methods for analysing compositional data have been dominated by Dirichlet distribution regression and Aitchison’s log-odds transformation method. These approaches have several limitations. The new “probability-ratio” approach presented here overcomes some of these limitations and permits any distribution whose support is (0, 1) to be applied to the analysis of compositional data. Given compositional data \(\pi_k\), where \(\sum_{k=1}^{K} \pi_k = 1\), we generate \(K - 1\) transformed variables \(\nu_k = W\left( \pi_j, j = 1, \ldots, K \right)\), such that \(0 < \nu_k < 1\) and the \(\nu_k\) are not sum-constrained. The \(\nu_k\) may then be modelled via copulas whose marginal distributions can be any distribution with support (0, 1), such as the beta or the CDF-Quantile family.
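As a minimal sketch of the transformation step, the Python code below maps a composition to \(K - 1\) values in (0, 1) and back. It assumes one particular choice of \(W\), the ratio of each component against the last one, \(\nu_k = \pi_k / (\pi_k + \pi_K)\); the function names `probability_ratios` and `inverse_probability_ratios` are hypothetical and used only for illustration.

```python
import math


def probability_ratios(pi):
    """Map a composition (pi_1, ..., pi_K) to K-1 ratios nu_k in (0, 1).

    Assumed choice of W: nu_k = pi_k / (pi_k + pi_K), with the last
    component pi_K acting as the reference part.
    """
    if any(p <= 0 for p in pi) or not math.isclose(sum(pi), 1.0):
        raise ValueError("composition must be strictly positive and sum to 1")
    ref = pi[-1]
    return [p / (p + ref) for p in pi[:-1]]


def inverse_probability_ratios(nu):
    """Recover the composition from the K-1 ratios.

    Uses nu_k / (1 - nu_k) = pi_k / pi_K together with sum(pi) = 1.
    """
    odds = [v / (1.0 - v) for v in nu]
    pi_ref = 1.0 / (1.0 + sum(odds))
    return [o * pi_ref for o in odds] + [pi_ref]


pi = [0.2, 0.3, 0.5]
nu = probability_ratios(pi)          # two values, each in (0, 1)
pi_back = inverse_probability_ratios(nu)
```

In this example the two \(\nu_k\) do not sum to 1, so each can be given its own marginal distribution on (0, 1), and the inverse map recovers the original composition.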

A typical example of a suitable \(\nu_k\) is \( \nu_k = \pi_k / \left( \pi_k + \pi_K \right) \), for \(k = 1, \ldots, K-1\). Its logit, \( \log\left( \pi_k / \pi_K \right) \), is Aitchison’s (1986) “additive log-ratio” transformation, so the two are equivalent. The probability-ratio method’s strengths are as follows: