Evaluating students' study habits using Bayesian Multinomial Logistic Regression
Author: Ahmad, Amna (TU Delft Electrical Engineering, Mathematics and Computer Science)
Contributor: Spandaw, J.G. (mentor); Jongbloed, G. (graduation committee); Vroegrijk, T.W.C. (graduation committee)
Degree granting institution: Delft University of Technology
Corporate name: Delft University of Technology
Programme: Applied Mathematics
Project: PRIME: PRogramme of Innovation in Mathematics Education
Date: 2023-08-24
Abstract: The Bayesian approach is an important framework for tackling problems in statistics. It involves choosing a prior distribution that reflects existing knowledge, so that, in contrast to the frequentist approach, this knowledge is taken into account alongside the data. It also treats the parameters (here, the regression coefficients) as random variables rather than fixed constants; their distribution after observing the data is called the posterior distribution. A specific choice of prior needs to be justified, because the prior directly influences the posterior distribution of the regression coefficients. It is also possible to consider priors that carry little information, and such priors are compared in this project. In this thesis, the Bayesian approach is used to apply a multinomial logistic regression model to data concerning students' study habits and beliefs. The data are provided by PRIME, a research group that focuses on mathematics education at TU Delft. Multinomial logistic regression is used to predict the students' choices, expressed as probabilities. Bayesian statistics is useful not only because it allows prior knowledge to be specified, but also because the Bayesian way of thinking can be incorporated into the evaluation of results, for example by constructing credible intervals for the predicted probabilities.
Overlap between intervals can then give insight into prediction quality. The models are coded in R using two packages: UPG and BRMS. The priors that are compared are the Gaussian and Cauchy distributions. In addition, both packages provide default priors, which are compared with the Gaussian and Cauchy priors. Finally, the performance of each model is assessed on the basis of its prediction accuracy. With default priors in both packages, the BRMS package outperforms the UPG package in terms of accuracy, and overall the default priors give more accurate results than the specified priors. However, the accuracy of the BRMS model is not significantly higher than that of the UPG model, and the running time of the BRMS package is considerably longer. Among the models with a specified prior, the one with the Cauchy prior performed better.
Subject: Bayesian statistics; Multinomial logistic regression; Prior distribution
To reference this document use: http://resolver.tudelft.nl/uuid:7c81160d-baba-4122-b9f5-c7b90e4e14ae
Part of collection: Student theses
Document type: bachelor thesis
Rights: © 2023 Amna Ahmad
Files: PDF BEP_definitief.pdf (1.39 MB)