Data Quality in HIV/AIDS Web-Based Surveys: Handling Invalid and Suspicious Data

Bauermeister, Jose A.; Pingel, Emily; Zimmerman, Marc; Couper, Mick; Carballo-Diéguez, Alex; Strecher, Victor J.

Published in

SAGE Publications, Field Methods, 24(3), pp. 272-291, 2012

DOI: 10.1177/1525822x12443097

Journal article published in 2012 by Jose A. Bauermeister, Emily Pingel, Marc Zimmerman, Mick Couper, Alex Carballo-Diéguez, Victor J. Strecher

Abstract

Invalid data may compromise data quality. We examined how decisions taken to handle these data may affect the relationship between Internet use and HIV risk behaviors in a sample of young men who have sex with men (YMSM). We recorded 548 entries during a three-month period and used data quality decisions to create six analytic groups (i.e., full sample, entries initially tagged as valid, suspicious entries, valid cases mislabeled as suspicious, fraudulent data, and total valid cases). We compared these groups on the sample’s composition and their bivariate relationships. Forty-one cases were marked as invalid, affecting the statistical precision of our estimates but not the relationships between variables. Sixty-two additional cases were flagged as suspicious entries and found to contribute to the sample’s diversity and observed relationships. Using our final analytic sample (N = 447; M = 21.48 years old, SD = 1.98), we found that very conservative criteria regarding data exclusion may prevent researchers from observing true associations. We discuss the implications of data quality decisions for the design of future HIV/AIDS web surveys.
