Data summaries with the “gtsummary” package

This tutorial introduces an alternative to the skimr package for getting overviews of datasets.

The skimr package and its skim_without_charts function can cause issues when knitting to PDF.

The gtsummary package appears to have fewer such issues.

If you haven’t already, install the gtsummary package by typing this in your console (do this only once):

install.packages("gtsummary")

Let’s load the packages we need:

library(tidyverse)
library(palmerpenguins)
library(gtsummary)
library(knitr)

The key function in the gtsummary package is the tbl_summary function:

?tbl_summary

Take note of the default settings for the “statistic” argument: by default, the function returns the median and interquartile range (shown as the 25th and 75th percentiles) for numeric variables, and the count and percentage for categorical variables.
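
To make those defaults concrete, here is a minimal sketch that writes them out explicitly in the “statistic” argument. The object names default_tbl and explicit_tbl are just illustrative, and the template strings are the defaults as described in the tbl_summary help page; check ?tbl_summary for your installed version.

```r
library(dplyr)
library(palmerpenguins)
library(gtsummary)

# Summary using the function's built-in defaults
default_tbl <- penguins %>%
  tbl_summary()

# The same summary with the default statistics spelled out:
# median (25th percentile, 75th percentile) for continuous variables,
# count (percentage) for categorical variables
explicit_tbl <- penguins %>%
  tbl_summary(
    statistic = list(
      all_continuous() ~ "{median} ({p25}, {p75})",
      all_categorical() ~ "{n} ({p}%)"
    )
  )

explicit_tbl
```

Writing the defaults out like this makes it easier to see which part of the template to change when you want a different statistic.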

For details on this function, along with a tutorial, see this webpage.

We’ll get an overview of the data using the tbl_summary function.

penguins %>%
  tbl_summary()
Characteristic           N = 344¹
species
    Adelie               152 (44%)
    Chinstrap            68 (20%)
    Gentoo               124 (36%)
island
    Biscoe               168 (49%)
    Dream                124 (36%)
    Torgersen            52 (15%)
bill_length_mm           44.5 (39.2, 48.5)
    Unknown              2
bill_depth_mm            17.30 (15.60, 18.70)
    Unknown              2
flipper_length_mm        197 (190, 213)
    Unknown              2
body_mass_g              4,050 (3,550, 4,750)
    Unknown              2
sex
    female               165 (50%)
    male                 168 (50%)
    Unknown              11
year
    2007                 110 (32%)
    2008                 114 (33%)
    2009                 120 (35%)
¹ n (%); Median (IQR)
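
Since the motivation here is knitting to PDF, one option worth knowing about is gtsummary's as_kable() function, which converts the summary into a knitr::kable() table. The following is a minimal sketch rather than a required step: whether you need it depends on your gtsummary version and output format, and the object name penguin_kable is just illustrative.

```r
library(dplyr)
library(palmerpenguins)
library(gtsummary)
library(knitr)

# Build the summary, then convert it to a knitr::kable() table,
# which is plain enough to render in PDF output
penguin_kable <- penguins %>%
  tbl_summary() %>%
  as_kable()

penguin_kable
```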

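We can also select just a few numeric variables and ask for the mean and standard deviation instead. Note the syntax for the “statistic” argument: we have to provide a “list” of formulas, each with a variable selector on the left of the “~” and a template string on the right, as follows: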

penguins %>%
  select(bill_length_mm, bill_depth_mm) %>%
  tbl_summary(statistic = list(all_continuous() ~ "{mean} ({sd})"))
Characteristic           N = 344¹
bill_length_mm           43.9 (5.5)
    Unknown              2
bill_depth_mm            17.15 (1.97)
    Unknown              2
¹ Mean (SD)
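
As a quick sanity check on the table above, the same numbers can be computed directly with dplyr. This is a sketch that assumes missing values are simply dropped before computing the mean and standard deviation, which is consistent with the “Unknown” rows being reported separately; the object name bill_check is just illustrative.

```r
library(dplyr)
library(palmerpenguins)

# Compute the mean, SD, and number of missing values for bill length,
# dropping NAs before calculating the mean and SD
bill_check <- penguins %>%
  summarise(
    mean_bill_length = mean(bill_length_mm, na.rm = TRUE),
    sd_bill_length = sd(bill_length_mm, na.rm = TRUE),
    n_missing = sum(is.na(bill_length_mm))
  )

bill_check
```

Rounded to one decimal place, the mean and standard deviation should agree with the 43.9 (5.5) shown for bill_length_mm, and n_missing should match the “Unknown” count of 2.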

So, if you find yourself running into issues with the skimr package, feel free to use the gtsummary package instead!