Background: Online symptom checkers are often developed and validated on data subject to self-selection and selective attrition, which can bias prediction models.

Objectives: To assess recruitment, selection, and attrition patterns in a large Dutch online symptom checker for musculoskeletal complaints, and to evaluate potential biases by comparing participant characteristics across recruitment sources and with external target populations.

Methods: Using data from the online Dutch Rheumatic? Questionnaire on musculoskeletal complaints, we compared baseline characteristics and key self-reported symptoms between responders and nonresponders to the follow-up survey. Follow-up survey responders were further compared by source of recruitment to the questionnaire: primary care clinics, secondary care clinics, or various online sources. Sex, age, and BMI distributions of the total study group were compared with external data on potential target populations of primary and secondary care patients in the Netherlands.

Results: The total study group comprised 31,457 questionnaire responders, of whom 50% (n = 15,591) responded to the follow-up survey. Study participants were predominantly female (76%), middle-aged (one-third aged 50–60 years), never-smokers (66%), and overweight. While participants recruited through healthcare settings resembled target populations, follow-up survey responders were older, had more rheumatic diagnoses (49% vs. 32%), and reported more symptoms than nonresponders. Participant characteristics varied by recruitment source: social media attracted younger females, whereas healthcare routes reached more diverse populations with varying symptom presentations.

Conclusion: Patterns of recruitment and attrition produced differences in participant characteristics. Healthcare-based recruitment yielded participants resembling the intended target populations, while follow-up survey responders differed from nonresponders in age, frequency of rheumatic diagnoses, and number of reported symptoms. Awareness of these selection processes is essential when using real-world symptom checker data for model development.