Vine Regression with Bayes Nets
A Critical Comparison with Traditional Approaches Based on a Case Study on the Effects of Breastfeeding on IQ
Abstract
Regular vine (R-vine) copulas build high-dimensional joint densities from arbitrary one-dimensional margins and (conditional) bivariate copula densities. Vine densities enable the computation of all conditional distributions, though the calculations can be numerically intensive. Saturated continuous nonparametric Bayes nets (CNPBNs) are regular vines. Computing regression functions from the vine copula density is termed vine regression. The epicycles of regression (including or excluding covariates, interactions, higher-order terms, multicollinearity, model fit, transformations, heteroscedasticity, bias) are dispelled. One simply computes the regressions from the vine copula density. Only the question of finding an adequate vine copula remains. Vine regression is applied to a data set from the National Longitudinal Study of Youth relating breastfeeding to IQ. The expected effects of breastfeeding on IQ depend on IQ, on the baseline level of breastfeeding, on the duration of additional breastfeeding and on the values of other covariates. A child given two weeks of breastfeeding can expect to increase his/her IQ by 1.5–2 IQ points by adding 10 weeks of breastfeeding, depending on the values of other covariates. A child given two years of breastfeeding can expect to gain 0.48–0.65 IQ points from 10 additional weeks. Adding 10 weeks of breastfeeding to each of the 3,179 children in this data set has a net present value of $50,700,000 according to the Bayes net, compared to $29,000,000 according to the linear regression.
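To make the idea of computing a regression function directly from a copula density concrete, the following is a minimal sketch for a single covariate. It is not the paper's implementation: a single bivariate Gaussian copula stands in for a full vine copula, and the margins, the correlation and all names (`RATE`, `RHO`, `Y_MARGIN`, `regression`) are hypothetical values chosen for illustration, not estimates from the breastfeeding data. The sketch relies on the identity E[Y | X = x] = ∫₀¹ F_Y⁻¹(v) c(F_X(x), v) dv, which follows from the conditional density f(y | x) = f_Y(y) c(F_X(x), F_Y(y)).

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()


def gaussian_copula_density(u, v, rho):
    """Density c(u, v) of the bivariate Gaussian copula with correlation rho."""
    a = STD_NORMAL.inv_cdf(u)
    b = STD_NORMAL.inv_cdf(v)
    one_minus_r2 = 1.0 - rho * rho
    exponent = -(rho * rho * (a * a + b * b) - 2.0 * rho * a * b) / (2.0 * one_minus_r2)
    return math.exp(exponent) / math.sqrt(one_minus_r2)


def conditional_mean(x, cdf_x, inv_cdf_y, copula_density, n=20000):
    """Approximate E[Y | X = x] with the midpoint rule on v in (0, 1).

    Integrand: F_Y^{-1}(v) * c(F_X(x), v). Midpoints never touch 0 or 1,
    so the quantile function is always evaluated at valid probabilities.
    """
    u = cdf_x(x)
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        v = (i + 0.5) * h
        total += inv_cdf_y(v) * copula_density(u, v)
    return total * h


# Hypothetical ingredients (assumptions for illustration only):
RATE = 0.05                                # exponential margin for the covariate X
RHO = 0.3                                  # copula correlation
Y_MARGIN = NormalDist(mu=100.0, sigma=15.0)  # normal margin for the response Y


def exp_cdf(x):
    """CDF of the exponential margin assumed for X."""
    return 1.0 - math.exp(-RATE * x)


def regression(x):
    """Regression function E[Y | X = x] implied by the margins and the copula."""
    return conditional_mean(
        x, exp_cdf, Y_MARGIN.inv_cdf,
        lambda u, v: gaussian_copula_density(u, v, RHO),
    )


if __name__ == "__main__":
    for x in (2.0, 12.0, 52.0, 104.0):
        print(f"E[Y | X = {x:5.1f}] = {regression(x):.3f}")
```

Even in this toy setting the regression function is nonlinear in x, because the covariate's margin is not normal, and the gain from a fixed increment in x depends on the baseline level of x. No covariate transformations or interaction terms are specified; the shape follows entirely from the margins and the copula.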