A comparison of the frequentist and Bayesian approach to multinomial logistic regression in statistics: an application to study habits data from PRIME

Abstract

Frequentist statistics and Bayesian statistics are the two main approaches to statistical inference. The frequentist approach is commonly integrated into academic curricula, while the Bayesian approach is less frequently employed. However, a comparison of the two approaches that examines their shortcomings and advantages might give a better understanding of statistics and more insight into statistical inference. Therefore, the current study applied both the frequentist and the Bayesian approach to multinomial logistic regression.

The multinomial logistic regression model can be described both as a generalized linear model and as a random utility model, and the current study showed that these two formulations generate an equivalent probability function. Moreover, the methods of estimating the coefficients in the frequentist and in the Bayesian approach were described. The multinomial logistic regression model was subsequently applied to data from educational research conducted by PRIME. Three different R packages were used to perform the multinomial logistic regression: the VGAM package (frequentist, generalized linear model), the mlogit package (frequentist, random utility model) and the UPG package (Bayesian, random utility model). The results of the analysis of one dependent variable were subsequently compared.
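The equivalence of the two formulations can be illustrated with a small numerical sketch. It is not taken from the study and does not use VGAM, mlogit or UPG; the function names and the example utilities are hypothetical. It assumes the standard random utility setup in which each alternative's utility is a deterministic part plus independent Gumbel (type I extreme value) noise, and checks by simulation that choosing the alternative with the highest utility reproduces the multinomial logit probabilities.

```python
import math
import random


def softmax_probs(utilities):
    """Multinomial logit probabilities: P(j) = exp(v_j) / sum_k exp(v_k)."""
    m = max(utilities)  # subtract the maximum for numerical stability
    weights = [math.exp(v - m) for v in utilities]
    total = sum(weights)
    return [w / total for w in weights]


def gumbel_draw(rng):
    """One draw from the standard Gumbel distribution via inverse transform."""
    u = rng.random()
    while u == 0.0:  # avoid log(0); random() returns values in [0, 1)
        u = rng.random()
    return -math.log(-math.log(u))


def simulate_rum_probs(utilities, n_draws=100_000, seed=0):
    """Random utility model: pick the alternative maximizing v_j + Gumbel noise.

    Returns the simulated choice frequencies (assumed setup, for illustration).
    """
    rng = random.Random(seed)
    counts = [0] * len(utilities)
    for _ in range(n_draws):
        noisy = [v + gumbel_draw(rng) for v in utilities]
        counts[noisy.index(max(noisy))] += 1
    return [c / n_draws for c in counts]


if __name__ == "__main__":
    v = [0.0, 0.5, -1.0]  # hypothetical deterministic utilities
    print("logit probabilities:  ", softmax_probs(v))
    print("simulated frequencies:", simulate_rum_probs(v))
```

With a large number of draws, the simulated frequencies should agree with the logit probabilities up to Monte Carlo error.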

The results indicated that the frequentist and Bayesian approaches differ in estimation time and model fit: the Bayesian approach required more computational time but resulted in a better model fit. The frequentist 95% confidence intervals and the Bayesian 95% credible intervals are comparable, but their interpretation differs considerably because of the philosophical underpinnings of the two approaches. Moreover, in the Bayesian approach, existing knowledge and information can be incorporated through the choice of prior distribution. Furthermore, the Bayesian approach yields a posterior distribution, which is more informative than a point estimate alone. A comparison of the three R packages shows that each has a slightly different theoretical background. Since each package has its own shortcomings and advantages, combining them when conducting multinomial logistic regression could be desirable.