Published in

BioMed Central, BMC Public Health, S3(16), 2016

DOI: 10.1186/s12889-016-3699-0

Links

Tools

Export citation

Search in Google Scholar

The Australian longitudinal study on male health sampling design and survey weighting: implications for analysis and interpretation of clustered data

This paper is made freely available by the publisher.
This paper is made freely available by the publisher.

Full text: Download

Green circle
Preprint: archiving allowed
Green circle
Postprint: archiving allowed
Green circle
Published version: archiving allowed
Data provided by SHERPA/RoMEO

Abstract

Abstract Background The Australian Longitudinal Study on Male Health (Ten to Men) used a complex sampling scheme to identify potential participants for the baseline survey. This raises important questions about when and how to adjust for the sampling design when analyzing data from the baseline survey. Methods We describe the sampling scheme used in Ten to Men focusing on four important elements: stratification, multi-stage sampling, clustering and sample weights. We discuss how these elements fit together when using baseline data to estimate a population parameter (e.g., population mean or prevalence) or to estimate the association between an exposure and an outcome (e.g., an odds ratio). We illustrate this with examples using a continuous outcome (weight in kilograms) and a binary outcome (smoking status). Results Estimates of a population mean or disease prevalence using Ten to Men baseline data are influenced by the extent to which the sampling design is addressed in an analysis. Estimates of mean weight and smoking prevalence are larger in unweighted analyses than weighted analyses (e.g., mean = 83.9 kg vs. 81.4 kg; prevalence = 18.0 % vs. 16.7 %, for unweighted and weighted analyses respectively) and the standard error of the mean is 1.03 times larger in an analysis that acknowledges the hierarchical (clustered) structure of the data compared with one that does not. For smoking prevalence, the corresponding standard error is 1.07 times larger. Measures of association (mean group differences, odds ratios) are generally similar in unweighted or weighted analyses and whether or not adjustment is made for clustering. Conclusions The extent to which the Ten to Men sampling design is accounted for in any analysis of the baseline data will depend on the research question. When the goals of the analysis are to estimate the prevalence of a disease or risk factor in the population or the magnitude of a population-level exposure-outcome association, our advice is to adopt an analysis that respects the sampling design.