Automated Sample Ratio Mismatch (SRM) Detection and Analysis

Conference Paper (2022)

Author(s)

Lukas Vermeer (Vista)

Kevin Anderson (Vista, TU Delft - Software Engineering)

Mauricio Acebal (Vista)

Research Group

Software Engineering

DOI related publication

https://doi.org/10.1145/3530019.3534982

A/B Testing, Trustworthiness, Infrastructure, Data Quality, SRM, Sample Ratio Mismatch, Online Controlled Experimentation

To reference this document use:

https://resolver.tudelft.nl/uuid:31d2806f-2bf3-43c3-9e71-3985fd7368a6

Publication Year

2022

Language

English

Pages (from-to)

268–269

ISBN (print)

978-1-4503-9613-4

ISBN (electronic)

9781450396134

Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

Background: Sample Ratio Mismatch (SRM) checks can help detect data quality issues in online experimentation [3]. Not all experimentation platforms provide these checks as part of their solution. Users of these platforms must therefore check for SRM manually, or rely on additional processes, such as checklists [2], or on automation. Objective: To ensure reliable and early detection of SRM, we wanted to automate the detection and analysis of SRM in experiments running on third-party experimentation platforms. Method: A set of Looker dashboards was built to facilitate self-serve SRM detection and root cause analysis. In addition, we added email- and chat-based alerting to proactively inform experimenters of SRM and guide them towards these dashboards when needed. Results: Several cases of SRM have been detected and experimenters have been warned. Bad decisions based on flawed data were avoided. We provide one such example as an illustration. Conclusions: SRM checks are relatively straightforward to automate and can be useful for data quality monitoring even for companies that rely on third-party experimentation platforms. Proactive alerting, rather than passive reporting, can reduce time to detection and help non-experts avoid making decisions based on biased data.
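To illustrate why such checks are straightforward to automate, the following is a minimal sketch of an SRM check for a two-variant experiment, using a chi-squared goodness-of-fit test with one degree of freedom. It is not the authors' implementation: the function names, the 50/50 default split, and the alert threshold of 0.001 are all assumptions made for this example.

```python
import math


def srm_p_value(control_count, treatment_count, expected_control_share=0.5):
    """Return the p-value of a chi-squared goodness-of-fit test (1 degree of freedom).

    All names and defaults are hypothetical and chosen for this sketch only.
    """
    if not 0.0 < expected_control_share < 1.0:
        raise ValueError("expected_control_share must be strictly between 0 and 1")
    total = control_count + treatment_count
    if total <= 0:
        raise ValueError("at least one observation is required")

    # Counts we would expect if assignment followed the configured split.
    expected_control = total * expected_control_share
    expected_treatment = total - expected_control

    chi2 = (
        (control_count - expected_control) ** 2 / expected_control
        + (treatment_count - expected_treatment) ** 2 / expected_treatment
    )
    # With one degree of freedom, P(X > chi2) equals erfc(sqrt(chi2 / 2)).
    return math.erfc(math.sqrt(chi2 / 2.0))


def has_srm(control_count, treatment_count, expected_control_share=0.5, alpha=0.001):
    """Flag a mismatch when the p-value falls below alpha (an assumed threshold)."""
    return srm_p_value(control_count, treatment_count, expected_control_share) < alpha


if __name__ == "__main__":
    # A 50,000 vs 51,500 split under an intended 50/50 allocation is flagged.
    print(has_srm(50_000, 51_500))
    # A 5,000 vs 5,100 split is well within what chance would produce.
    print(has_srm(5_000, 5_100))
```

A scheduled job could run a check like this over the assignment counts of each running experiment and send an alert when it returns True. A strict threshold keeps false alarms rare when many experiments are checked repeatedly.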
