John M. Rose | TU Delft Repository

On the robustness of efficient experimental designs towards the underlying decision rule

Journal article (2018) - Sander van Cranenburgh, John M. Rose, Caspar G. Chorus

We present a methodology to derive efficient designs for Stated Choice (SC) experiments based on Random Regret Minimisation (RRM) behavioural assumptions. This complements earlier work on the design of efficient SC experiments based on Random Utility Maximisation (RUM) models. Capitalizing on this methodology, and using both analytical derivations and empirical data, we investigate the importance of the analyst's assumption regarding the underlying decision rule used to generate the efficient experimental design. We find that conventional RUM-efficient designs can be statistically highly inefficient in cases where RRM is the better representation of the actual choice behaviour, and vice versa. Furthermore, we present a methodology to construct efficient designs that are robust towards the uncertainty on the side of the analyst regarding the underlying decision rule. ...

On The Robustness Of Efficient Experimental Designs

Abstract (2017) - Sander van Cranenburgh, John M. Rose, Caspar Chorus

Introduction

Stated Choice (SC) experiments are widely used to gain an understanding of choice behaviour. SC experiments are increasingly based on so-called “efficient designs”. Efficient designs aim to generate stated choice tasks that maximize the information collected in the data, yielding more reliable parameter estimates with an equal or lower number of observations than traditional orthogonal designs. While earlier research on efficient experimental design mainly focussed on extending the design theory to encompass more advanced models of choice, such as Nested Logit and (Panel) Mixed Logit models, recent efforts are shifting towards understanding the robustness of modelling results with respect to the efficient experimental design. These studies explore to what extent a particular experimental design loses efficiency when the data generating process does not match the model on which the design is based. This literature has predominantly focussed on misspecification in terms of parameter priors, interaction effects and the way in which (correlations between) error terms are modelled (Ferrini and Scarpa 2007; Yu et al. 2008; Bliemer and Rose 2011; Ojeda-Cabral et al. 2016).

However, despite compelling evidence that decision-makers use a wide range of decision rules when making choices (Hess et al. 2012), and despite the rapidly growing interest of the choice modelling community in alternative decision rules (Leong and Hensher 2012; Guevara and Fukushi 2016), robustness issues concerning potential misspecification of the presumed decision rule have attracted only very limited attention in the literature. In fact, to the authors’ knowledge, research on experimental designs has exclusively been based on the (often implicit) assumption that decision-makers make choices based on a (linear-additive) Random Utility Maximization (RUM) rule. As a consequence, it is currently unclear how different assumptions regarding the decision rule influence the statistical efficiency of the design and the performance of different choice models under different design assumptions. For instance: do non-RUM designs differ much from RUM designs? Does a misspecification of the decision rule result in highly suboptimal designs? Do RUM designs favour RUM models in terms of model fit?

This paper aims to fill these knowledge gaps. To do so, we construct efficient designs based on a non-RUM model, in this case a Random Regret Minimization (RRM) model, and assess the effects of the design decision rule analytically as well as empirically. We use an RRM model for our analyses because RRM models are among the more popular non-RUM models. Specifically, we use the P-RRM model (Van Cranenburgh et al. 2015), as this model has very convenient mathematical properties for constructing efficient designs. First, we investigate the effect of decision rule misspecification on the statistical efficiency of the SC design. We consider two cases: (1) the case in which the efficient experimental designs are optimized for linear-additive RUM while the true Data Generating Process (DGP) is P-RRM, and (2) the case in which the efficient experimental designs are optimized for P-RRM while the true DGP is linear-additive RUM. After that, we use empirical data (collected specifically for this study) to investigate the influence of the design decision rule on the modelling results.

Contributions

The methodological contributions of this paper to the experimental design literature are twofold. Firstly, we show that efficient designs can be constructed relatively easily for the P-RRM model (Van Cranenburgh et al. 2015). Because the P-RRM model has a piecewise linear form, the Asymptotic Variance Covariance matrix, which is needed to construct efficient designs, can be determined analytically. Thereby, we complement the choice modeller’s toolbox for constructing efficient experimental designs. Secondly, we extend the experimental design literature by developing new insights into the effects of misspecification of the assumed design decision rule on statistical efficiency. Finally, the substantive contribution of this paper is that we develop new empirical insights into the robustness of modelling results with respect to the assumed design decision rule.
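To make the role of the piecewise linear form concrete, the following is a minimal sketch of how the efficiency of one and the same design could be evaluated under a linear-additive RUM model and under a P-RRM model. It rests on several assumptions that are not spelled out in the text above: the P-RRM attribute transformation is written as it is commonly stated for that model (pairwise differences truncated at zero, with the direction set by the sign of the prior), the D-error is taken as the efficiency criterion, and both models use a multinomial logit kernel. The function names `prrm_attributes` and `d_error`, the design matrix and the priors are all hypothetical and chosen purely for illustration.

```python
import numpy as np

def prrm_attributes(X, beta):
    """Assumed P-RRM transformation of a design X with shape (tasks, alts, attrs).

    For attribute k of alternative i: sum over j of max(0, x_jk - x_ik) when
    beta_k > 0, and of min(0, x_jk - x_ik) when beta_k < 0. Regret is then
    linear in beta, which is what makes an analytical AVC matrix possible.
    """
    diff = X[:, None, :, :] - X[:, :, None, :]   # diff[s, i, j, k] = x_jk - x_ik
    pos = np.maximum(diff, 0.0).sum(axis=2)
    neg = np.minimum(diff, 0.0).sum(axis=2)
    return np.where(beta > 0, pos, neg)

def d_error(Z, beta, sign=1.0):
    """D-error of a logit model whose systematic part is sign * (Z @ beta).

    Uses the standard MNL Fisher information for a single respondent:
    sum over tasks and alternatives of P * (z - zbar)(z - zbar)'.
    """
    V = sign * (Z @ beta)
    V = V - V.max(axis=1, keepdims=True)          # numerical stability
    P = np.exp(V)
    P /= P.sum(axis=1, keepdims=True)
    zbar = (P[:, :, None] * Z).sum(axis=1, keepdims=True)
    D = Z - zbar
    info = np.einsum("sj,sjk,sjl->kl", P, D, D)
    avc = np.linalg.inv(info)                     # asymptotic variance-covariance
    return np.linalg.det(avc) ** (1.0 / Z.shape[2])

# Hypothetical design: 4 choice tasks, 3 alternatives, 2 attributes (cost, time).
X = np.array([
    [[10, 30], [20, 20], [30, 10]],
    [[10, 20], [30, 30], [20, 10]],
    [[30, 20], [10, 30], [20, 10]],
    [[20, 30], [30, 10], [10, 20]],
], dtype=float)
beta = np.array([-0.05, -0.08])                   # illustrative priors only

d_rum = d_error(X, beta)                               # utility = X @ beta
d_rrm = d_error(prrm_attributes(X, beta), beta, -1.0)  # utility = -regret
print("D-error under RUM:", d_rum, " under P-RRM:", d_rrm)
```

Evaluating a candidate design under both decision rules in this way is one simple route to comparing how much efficiency is lost when the design rule and the data generating process differ; the two D-errors are not on a common scale with anything reported in the paper.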

Key findings

Conventional RUM-efficient designs can be statistically highly inefficient when RRM is the better representation of the actual choice behaviour.

Inferences based on empirical SC data about which decision rule (here, RUM or RRM) best explains the observed choices are found to be highly sensitive to the particular design that is used. Model fit differences can be substantial and highly significant.

To the extent that a design is more efficient for one particular decision rule (RUM or RRM), the choice modeller is more likely to conclude – based on comparison of the final Log-Likelihoods – that that particular decision rule is the better representation of the actual observed choice behaviour. ...

Detecting dominance in stated choice data and accounting for dominance-based scale differences in logit models

Journal article (2017) - Michiel C.J. Bliemer, John M. Rose, Caspar G. Chorus

Stated choice surveys have been used for several decades to estimate preferences of agents using choice models, and are widely applied in the transportation domain. Different types of experimental designs that underlie such surveys have been used in practice. In unlabelled experiments, where all alternatives are described by the same generic utility function, such designs may suffer from choice tasks containing a dominant alternative. Dominance may also occur in labelled experiments with alternative-specific attributes and constants, but to a lesser extent. We show that dominant alternatives are problematic because they affect scale and may bias parameter estimates. We propose a new measure based on minimum regret to calculate dominance and automatically detect such choice tasks in an experimental design or existing dataset. This measure is then used to define a new experimental design type that removes dominance and ensures that trade-offs are made between attributes. Finally, we propose a new regret-scaled multinomial logit model that takes the level of dominance within a choice task into account. Results using simulated and empirical data show that the presence of dominant alternatives can bias model estimates, but by making scale a function of a smooth approximation of normalised minimum regret we can properly account for scale differences without the need to remove choice tasks with dominant alternatives from the dataset. ...
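As a rough illustration of what detecting a dominant alternative involves, the sketch below flags alternatives in a choice task that incur zero regret, meaning that no other alternative outperforms them on any attribute. This is a simplified stand-in and not the minimum-regret measure proposed in the article, whose exact definition is not given in the abstract. The function name `zero_regret_alternatives`, the unweighted regret sum and the example tasks are assumptions made for illustration.

```python
import numpy as np

def zero_regret_alternatives(task, signs):
    """Return indices of alternatives that no other alternative beats on any attribute.

    task  : (alts, attrs) array of attribute levels for one choice task
    signs : (attrs,) array, +1 if more of the attribute is better, -1 if less is better

    Assumption: regret of alternative i is the unweighted sum over all other
    alternatives j and attributes k of max(0, gain_jk - gain_ik).
    """
    gains = np.asarray(task, dtype=float) * np.asarray(signs, dtype=float)
    diff = gains[None, :, :] - gains[:, None, :]   # diff[i, j, k] = gain_jk - gain_ik
    regret = np.maximum(diff, 0.0).sum(axis=(1, 2))
    return [int(i) for i in np.flatnonzero(regret == 0.0)]

signs = [-1, -1]                                   # cost and time: lower is better

# Alternative 0 is both cheapest and fastest, so it dominates the task.
dominated_task = [[10, 20], [15, 25], [12, 30]]
# Every alternative is beaten on at least one attribute: a genuine trade-off.
tradeoff_task = [[10, 30], [20, 20], [30, 10]]

print(zero_regret_alternatives(dominated_task, signs))
print(zero_regret_alternatives(tradeoff_task, signs))
```

A check of this kind could be run over every task in a candidate design or dataset to list the tasks that contain a dominant alternative; a graded measure of how close a task is to dominance would need the weighting and normalisation described in the article itself.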