15 Checking assumptions and data transformations

Tutorial learning objectives

  • Learn how to check the normality assumption
    • Normal quantile plots
    • Shapiro-Wilk test for normality
  • Learn how to check the equal variance assumption
    • Levene’s test
  • Learn how to transform the response variable to help meet assumptions
    • log-transform
    • Dealing with zeroes
    • log bases
    • back-transforming log data
    • logit transform
    • back-transforming logit data
    • when to back-transform?

Most statistical tests, such as the \(\chi^2\) goodness-of-fit test, the \(\chi^2\) contingency test, the t-test, ANOVA, Pearson correlation, and least-squares regression, have assumptions that must be met. For example, the one-sample t-test requires that the variable is normally distributed in the population, and least-squares regression requires that the residuals from the regression be normally distributed. In this tutorial we’ll learn ways to check the assumption that a variable is normally distributed in the population, and the assumption that groups have equal variances.
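To make the idea of a normal quantile plot concrete, here is a minimal Python sketch (standard library only) that pairs each sorted observation with the standard-normal quantile it would be plotted against. The function name `normal_quantile_pairs`, the `sample` values, and the plotting positions \((i - 0.5)/n\) are assumptions chosen for illustration. They are not taken from this tutorial, and statistical software may use a slightly different convention.

```python
from statistics import NormalDist


def normal_quantile_pairs(values):
    """Pair each sorted observation with a theoretical standard-normal quantile.

    Hypothetical helper for illustration only.
    """
    ordered = sorted(values)
    n = len(ordered)
    std_normal = NormalDist()  # mean 0, standard deviation 1
    # Plotting positions (i - 0.5) / n are one common convention (an assumption here).
    theoretical = [std_normal.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return list(zip(theoretical, ordered))


# Made-up measurements, purely for illustration.
sample = [4.1, 5.3, 4.8, 6.0, 5.1, 4.6, 5.5]
pairs = normal_quantile_pairs(sample)

for theoretical_q, observed in pairs:
    print(f"{theoretical_q:7.3f}  {observed:5.2f}")
```

If the observations come from a normal distribution, plotting the second column against the first should give points that fall roughly along a straight line.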

We’ll also learn how transforming a variable can sometimes help satisfy assumptions, in which case the analysis is conducted on the transformed variable.
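As a rough preview of the transformations listed in the objectives, the following Python sketch shows a log transform and a logit transform, each with its back-transform. The function names are hypothetical, and the natural log (base \(e\)) is assumed; other bases work the same way with the matching inverse.

```python
import math


def log_transform(y):
    """Natural-log transform; y must be strictly positive."""
    return math.log(y)


def back_transform_log(z):
    """Inverse of the natural-log transform."""
    return math.exp(z)


def logit(p):
    """Logit transform; p must be a proportion strictly between 0 and 1."""
    return math.log(p / (1 - p))


def back_transform_logit(z):
    """Inverse of the logit transform, returning a proportion."""
    return 1 / (1 + math.exp(-z))


# Made-up values, purely for illustration.
y = 12.5
p = 0.2

print(back_transform_log(log_transform(y)))  # recovers y up to rounding error
print(back_transform_logit(logit(p)))        # recovers p up to rounding error
```

Note that `log_transform` fails for zero (and negative) values, and `logit` fails for proportions of exactly 0 or 1, so such values need special handling before transforming.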