Published in

Lippincott, Williams & Wilkins, Epidemiology, 34(4), p. 462-466, 2023

DOI: 10.1097/ede.0000000000001607


Geolocation to Identify Online Study-Eligible Gay, Bisexual, and Men who have Sex with Men in Philadelphia, Pennsylvania

Journal article published in 2023 by Nguyen K. Tran, Seth L. Welles, Neal D. Goldstein
This paper was not found in any repository, but could be made available legally by the author.

Full text: Unavailable

Preprint: archiving allowed
Postprint: archiving restricted
Published version: archiving forbidden
Data provided by SHERPA/RoMEO

Abstract

Background: Data collection and cleaning procedures to exclude bot-generated responses are used to maintain the data integrity of samples from online surveys. However, these procedures may be time-consuming and difficult to implement. Thus, we aim to evaluate the validity of a single-step geolocation algorithm for recruiting eligible gay, bisexual, and men who have sex with men in Philadelphia for an online study.

Methods: We used a 4-step approach, based on common practices for evaluating bot-generated and fraudulent responses, to assess the validity of participants’ Qualtrics survey data as our referent standard. We then compared it to Qualtrics’ single-step geolocation algorithm that used the MaxMind commercial database to map participants’ Internet protocol address to their approximate location. We calculated the sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV) of the single-step geolocation approach relative to the 4-step approach.

Results: There were 826 respondents who completed the survey and 440 (53%) were eligible for enrollment based on the 4-step approach. The single-step geolocation approach yielded a sensitivity of 91% (95% CI = 88%, 93%), specificity of 79% (95% CI = 74%, 83%), PPV of 83% (95% CI = 80%, 86%), and NPV of 88% (95% CI = 85%, 91%).

Conclusions: Geolocation alone provided a moderately high level of agreement with the 4-step approach for identifying geographically eligible participants in the online sample, but both approaches may be subject to additional misclassification. Researchers may want to consider multiple procedures to ensure data integrity in online samples.
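
To make the four validity measures in the abstract concrete, here is a minimal Python sketch of how sensitivity, specificity, PPV, and NPV follow from a 2x2 cross-classification of the single-step geolocation result against the 4-step referent standard. The cell counts are hypothetical: they were chosen only so that they sum to 826 respondents with 440 referent-eligible and land near the reported percentages, and they are not the authors' actual table. The Wilson score interval is likewise an assumption, since the abstract does not say which confidence interval method was used. The names `TwoByTwo`, `wilson_ci`, and `validity` are illustrative.

```python
from dataclasses import dataclass
from math import sqrt


@dataclass(frozen=True)
class TwoByTwo:
    """Index test (geolocation) cross-classified against the referent (4-step)."""
    tp: int  # geolocation eligible, 4-step eligible
    fp: int  # geolocation eligible, 4-step ineligible
    fn: int  # geolocation ineligible, 4-step eligible
    tn: int  # geolocation ineligible, 4-step ineligible


def wilson_ci(successes, total, z=1.96):
    """Wilson score interval for a proportion (assumed method, not from the paper)."""
    p = successes / total
    denom = 1 + z**2 / total
    centre = (p + z**2 / (2 * total)) / denom
    half = z * sqrt(p * (1 - p) / total + z**2 / (4 * total**2)) / denom
    return centre - half, centre + half


def validity(t):
    """Return {measure: (estimate, ci_low, ci_high)} for the four measures."""
    pairs = {
        "sensitivity": (t.tp, t.tp + t.fn),  # among referent-eligible
        "specificity": (t.tn, t.tn + t.fp),  # among referent-ineligible
        "PPV": (t.tp, t.tp + t.fp),          # among geolocation-eligible
        "NPV": (t.tn, t.tn + t.fn),          # among geolocation-ineligible
    }
    return {name: (k / n, *wilson_ci(k, n)) for name, (k, n) in pairs.items()}


# HYPOTHETICAL counts, roughly consistent with the reported percentages.
table = TwoByTwo(tp=400, fp=81, fn=40, tn=305)
results = validity(table)

for name, (est, lo, hi) in results.items():
    print(f"{name}: {est:.0%} (95% CI = {lo:.0%}, {hi:.0%})")
```

Sensitivity and specificity condition on the referent classification, while PPV and NPV condition on the geolocation result, so the latter two depend on the share of eligible respondents in the sample (53% here) and need not carry over to samples with a different share.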