12.1 Load packages and import data
Load the tidyverse, skimr, naniar, knitr, ggmosaic, and janitor packages.
We’ll also need a new package called epitools, so install it now if you haven’t already.
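One way to do this is sketched below. The package names come from the text above; the check-then-attach pattern and the object names `pkgs` and `missing_pkgs` are our own illustration, not necessarily how the original chunk was written.

```r
# Packages named in this section
pkgs <- c("tidyverse", "skimr", "naniar", "knitr",
          "ggmosaic", "janitor", "epitools")

# Identify any packages that are not yet installed
is_installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
missing_pkgs <- pkgs[!is_installed]

# Install only what is missing (uncomment to run; needs an internet connection)
# install.packages(missing_pkgs)

# Attach the packages that are available
for (p in setdiff(pkgs, missing_pkgs)) {
  library(p, character.only = TRUE)
}
```

If `missing_pkgs` is not empty, install those packages and run the chunk again. Attaching epitools prints a message like the one shown next when another loaded package defines functions with the same names.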
##
## Attaching package: 'epitools'
## The following objects are masked from 'package:binom':
##
## binom.exact, binom.wilson
We’ll use two datasets described in the Whitlock & Schluter text:
- the “cancer.csv” dataset (described in Example 9.2 in the text, page 238)
- the “worm.csv” dataset (described in Example 9.4 in the text, page 246)
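A minimal import sketch follows. The file locations are hypothetical (here, the working directory), and the object names `cancer` and `worm` are assumptions for illustration; adjust both to match where the files are actually stored.

```r
library(readr)

# Hypothetical locations: point these at your own copies of the files
cancer_file <- "cancer.csv"
worm_file <- "worm.csv"

# Read each file only if it is present, so the sketch runs without errors
if (file.exists(cancer_file)) {
  cancer <- read_csv(cancer_file)
}
if (file.exists(worm_file)) {
  worm <- read_csv(worm_file)
}
```

By default `read_csv()` prints a column specification for each file as it is read; adding `show_col_types = FALSE` silences that message.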
## Rows: 39876 Columns: 2
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (2): aspirinTreatment, response
##
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
## Rows: 141 Columns: 2
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (2): infection, fate
##
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
Take a look at the cancer dataset:
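A sketch of one way to produce this kind of overview, assuming the data were imported into an object named `cancer` (an assumed name) and that `skim_without_charts()` from skimr is the function used:

```r
library(dplyr)
library(skimr)

# `cancer` is an assumed object name for the imported "cancer.csv" data;
# the check keeps the sketch from failing if the data are not loaded yet
if (exists("cancer")) {
  cancer %>%
    skim_without_charts()
}
```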
| | |
|---|---|
| Name | Piped data |
| Number of rows | 39876 |
| Number of columns | 2 |
| _______________________ | |
| Column type frequency: | |
| character | 2 |
| ________________________ | |
| Group variables | None |
Variable type: character
| skim_variable | n_missing | complete_rate | min | max | empty | n_unique | whitespace |
|---|---|---|---|---|---|---|---|
| aspirinTreatment | 0 | 1 | 7 | 7 | 0 | 2 | 0 |
| response | 0 | 1 | 6 | 9 | 0 | 2 | 0 |
And the worm dataset:
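A sketch of the corresponding overview for the worm data, assuming the imported object is named `worm` (an assumed name) and using `skim_without_charts()` from skimr:

```r
library(dplyr)
library(skimr)

# `worm` is an assumed object name for the imported "worm.csv" data;
# the check keeps the sketch from failing if the data are not loaded yet
if (exists("worm")) {
  worm %>%
    skim_without_charts()
}
```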
| | |
|---|---|
| Name | Piped data |
| Number of rows | 141 |
| Number of columns | 2 |
| _______________________ | |
| Column type frequency: | |
| character | 2 |
| ________________________ | |
| Group variables | None |
Variable type: character
| skim_variable | n_missing | complete_rate | min | max | empty | n_unique | whitespace |
|---|---|---|---|---|---|---|---|
| infection | 0 | 1 | 6 | 10 | 0 | 3 | 0 |
| fate | 0 | 1 | 5 | 9 | 0 | 2 | 0 |
Both datasets are in “tidy” format. For a refresher, review the chapter on tidy data in the Biology Procedures and Guidelines document.