Hypothesis Testing in Contingency Tables

A Discussion, and Exact Unconditional Tests for r×c Tables

Abstract

Every time one counts the number of occurrences of each pair of values for two categorical variables, one obtains a contingency table. These tables are among the simplest representations of data for statistically testing whether the two variables under consideration are associated. Although they arise naturally in many scientific disciplines, there is still considerable debate about the appropriate way to perform tests of significance on them.

Especially when one wants to use exact methods, i.e., methods that are based on the exact probabilities of observing the table of interest, there is great disagreement on which marginal totals one should treat as fixed for inference. This has led to the development of conditional tests, most famously Fisher's exact test, and unconditional tests, of which Barnard's CSM test was the first example. Mostly due to philosophical objections and computational challenges, unconditional tests have received far less attention over the years. This is especially true for contingency tables with more than 2 rows or columns. To our knowledge, no implementations of exact unconditional tests are available for these larger tables.
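To make the distinction concrete, the following minimal sketch compares the two approaches on a 2×2 table with fixed row totals. It is an illustration under stated assumptions, not an implementation from this text: the unconditional test shown here orders tables by a pooled z-statistic (in the style of Suissa and Shuster) rather than by Barnard's CSM ordering, and it approximates the supremum over the nuisance parameter by a maximum over a finite grid, which can slightly understate it. All function names are hypothetical.

```python
from math import comb, sqrt


def hypergeom_pmf(x, n1, n2, m):
    """P(top-left cell = x) given row totals n1, n2 and first-column total m."""
    return comb(n1, x) * comb(n2, m - x) / comb(n1 + n2, m)


def fisher_two_sided(x1, n1, x2, n2):
    """Conditional (Fisher) p-value: all margins are treated as fixed."""
    m = x1 + x2
    p_obs = hypergeom_pmf(x1, n1, n2, m)
    total = 0.0
    for x in range(max(0, m - n2), min(n1, m) + 1):
        p = hypergeom_pmf(x, n1, n2, m)
        if p <= p_obs * (1 + 1e-9):  # tables at least as unlikely as observed
            total += p
    return total


def pooled_z(x1, n1, x2, n2):
    """Absolute pooled z-statistic, used here to order tables (an assumption)."""
    pooled = (x1 + x2) / (n1 + n2)
    if pooled in (0.0, 1.0):
        return 0.0
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return abs(x1 / n1 - x2 / n2) / se


def unconditional_p(x1, n1, x2, n2, grid=200):
    """Unconditional p-value: only row totals are fixed.

    The common success probability pi is a nuisance parameter; the p-value is
    the largest tail probability over a grid of pi values (grid approximation).
    """
    t_obs = pooled_z(x1, n1, x2, n2)
    extreme = [
        (a, b)
        for a in range(n1 + 1)
        for b in range(n2 + 1)
        if pooled_z(a, n1, b, n2) >= t_obs - 1e-12
    ]
    worst = 0.0
    for k in range(1, grid):
        pi = k / grid
        tail = sum(
            comb(n1, a) * pi**a * (1 - pi) ** (n1 - a)
            * comb(n2, b) * pi**b * (1 - pi) ** (n2 - b)
            for a, b in extreme
        )
        worst = max(worst, tail)
    return worst


if __name__ == "__main__":
    # 3 successes out of 3 in the first row, 0 out of 3 in the second row.
    print(fisher_two_sided(3, 3, 0, 3))  # conditional p-value
    print(unconditional_p(3, 3, 0, 3))   # unconditional p-value
```

For this small table the conditional p-value is 0.1, whereas the unconditional p-value under the assumed ordering is 0.03125, which illustrates why the choice of fixed margins matters in practice.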

The aim of this text is two-fold. First, we give a historical account of the rivalry between conditional and unconditional tests, and argue that there is a case to be made for researching exact unconditional methods in greater depth. Second, we present implementations of exact unconditional tests that are applicable to general r×c contingency tables. Some of these implementations are generalisations of existing methods for the 2×2 table, such as Barnard's CSM test, with some additions to increase computational efficiency. In addition, we introduce a new approach that translates the classical Neyman-Pearson procedure of constructing a critical region for a given significance level α into a mixed integer linear programming problem. The latter can be solved efficiently with one of many existing optimisation software packages.

This eventually leads to a power study comparing 14 different tests, 12 of them unconditional, for different table dimensions and marginal totals. Although no test comes out as most powerful in every situation, the tests using a linear programming formulation have comparable, and often higher, power than the classical unconditional approaches. This comes at a cost, however: the critical regions produced via this optimisation approach are not guaranteed to be nested, i.e., the region for a smaller value of α is not necessarily contained in the region for a larger one. This limits their use and interpretability. Further research should determine whether additional requirements can be formulated that would make the critical regions nested, while still keeping the advantages of the linear programming formulation.
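The nestedness property can be stated as a simple check. The sketch below assumes, purely for illustration, that each critical region is available as a set of tables (encoded here as tuples of cell counts) together with its significance level; the function name and the data layout are hypothetical and not taken from this text.

```python
def is_nested(regions):
    """Return True if critical regions grow monotonically with alpha.

    `regions` is a list of (alpha, set_of_tables) pairs; each table is any
    hashable encoding of the cell counts (an illustrative assumption).
    """
    ordered = sorted(regions, key=lambda pair: pair[0])
    return all(
        smaller <= larger  # set inclusion
        for (_, smaller), (_, larger) in zip(ordered, ordered[1:])
    )


if __name__ == "__main__":
    nested = [(0.01, {(3, 0)}), (0.05, {(3, 0), (0, 3)})]
    not_nested = [(0.01, {(3, 0)}), (0.05, {(0, 3), (2, 0)})]
    print(is_nested(nested), is_nested(not_nested))
```

When the check fails, a table rejected at a small α may not be rejected at a larger α, so a p-value cannot be defined consistently as the smallest level at which the table is rejected.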