
Published in

Lippincott, Williams & Wilkins, Epidemiology, 6(33), p. 843-853, 2022

DOI: 10.1097/ede.0000000000001534


Log-transformation of Independent Variables: Must We?

This paper was not found in any repository, but could be made available legally by the author.

Full text: Unavailable

Preprint: archiving allowed
Postprint: archiving restricted
Published version: archiving forbidden
Data provided by SHERPA/RoMEO

Abstract

Epidemiologic studies often quantify exposure using biomarkers, which commonly have skewed distributions. Although a normality assumption is not required when the biomarker is used as an independent variable in linear regression, it has become common practice to log-transform biomarker concentrations. This transformation can be motivated by concerns about a nonlinear dose-response relationship or outliers; however, such a transformation may not always reduce bias. In this study, we used simulations to evaluate the validity of the motivations underlying the decision to log-transform an independent variable, considering eight scenarios that can give rise to skewed X and normal Y. Our simulation study demonstrates that (1) if the skewness of the exposure did not arise from a biasing factor (e.g., measurement error), the analytic approach with the best overall model fit best reflected the underlying outcome-generating methods and was least biased, regardless of the skewness of X, and (2) all estimates were biased if the skewness of the exposure was a consequence of a biasing factor. We additionally illustrate, using NHANES data, a process for determining whether transformation of an independent variable is needed. Our study, and our suggestion to divorce the shape of the exposure distribution from the decision to log-transform it, may aid researchers in planning analyses that use biomarkers or other skewed independent variables.
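
The comparison described in the abstract can be illustrated with a small simulation. The sketch below is not the authors' code and does not reproduce any of their eight scenarios. It assumes a single hypothetical setting in which a right-skewed (lognormal) exposure X has a truly linear effect on Y with no biasing factor, and it compares an untransformed and a log-transformed analysis by R-squared. The function names `fit_r2` and `simulate`, the sample size, the slope, and the seed are all illustrative assumptions.

```python
import numpy as np


def fit_r2(x, y):
    """Least-squares fit of y on x with an intercept; returns (slope, R^2)."""
    design = np.column_stack([np.ones_like(x), x])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ coef
    r2 = 1.0 - resid.var() / y.var()
    return coef[1], r2


def simulate(n=5000, true_slope=0.5, seed=0):
    """Hypothetical scenario: skewed X, Y generated linearly in untransformed X."""
    rng = np.random.default_rng(seed)
    # Right-skewed exposure; the skewness here does not come from a biasing factor.
    x = rng.lognormal(mean=0.0, sigma=1.0, size=n)
    # Assumed outcome-generating model: linear in X with normal noise.
    y = 1.0 + true_slope * x + rng.normal(0.0, 1.0, size=n)

    slope_raw, r2_raw = fit_r2(x, y)          # analysis on the original scale
    slope_log, r2_log = fit_r2(np.log(x), y)  # analysis after log-transforming X
    return {
        "slope_raw": slope_raw,
        "r2_raw": r2_raw,
        "slope_log": slope_log,
        "r2_log": r2_log,
    }


if __name__ == "__main__":
    for name, value in simulate().items():
        print(f"{name}: {value:.3f}")
```

In this assumed setting the untransformed model should fit better and recover a slope close to the value used to generate the data, even though X is strongly skewed. The slope from the log-transformed model is on a different scale (change in Y per unit change in log X), so the two slopes are not directly comparable; the fit statistic is the quantity being compared.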