17.1 Load packages and import data

Load the tidyverse, skimr, broom, knitr, and janitor packages:

library(tidyverse)
library(skimr)
library(broom)
library(knitr)
library(janitor)

We’ll use the “wolf.csv” and “trick.csv” datasets (discussed in examples 16.2 and 16.5 in the text, respectively).

wolf <- read_csv("https://raw.githubusercontent.com/ubco-biology/BIOL202/main/data/wolf.csv")

## Rows: 24 Columns: 2
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## dbl (2): inbreedCoef, nPups
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.

trick <- read_csv("https://raw.githubusercontent.com/ubco-biology/BIOL202/main/data/trick.csv")

## Rows: 21 Columns: 2
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## dbl (2): years, impressivenessScore
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
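
As the message itself suggests, the column-specification output can be silenced with `show_col_types = FALSE`. Below is a minimal sketch, assuming readr 2.0 or later. It reads a small inline CSV with made-up values (not the course data) so that it runs without a network connection; the object name `toy` is hypothetical.

```r
library(readr)

# Hypothetical two-row CSV. I() tells readr to treat the string as
# literal data rather than a file path (readr >= 2.0).
# show_col_types = FALSE suppresses the column-specification message.
toy <- read_csv(
  I("inbreedCoef,nPups\n0.00,6\n0.25,3\n"),
  show_col_types = FALSE
)

# The full column specification can still be retrieved when needed.
spec(toy)
```

The same argument can be added to the `read_csv()` calls above if the message is not wanted.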

The wolf dataset includes inbreeding coefficients for wolf pairs, along with the number of each pair’s pups that survived their first winter.

Explore the data:

wolf %>% skim_without_charts()

Data summary
Name	Piped data
Number of rows	24
Number of columns	2
_______________________
Column type frequency:
numeric	2
________________________
Group variables	None

Variable type: numeric

skim_variable	n_missing	complete_rate	mean	sd	p0	p25	p50	p75	p100
inbreedCoef	0	1	0.23	0.10	0	0.19	0.24	0.30	0.4
nPups	0	1	3.96	1.88	1	3.00	3.00	5.25	8.0

We see that there are 24 observations for each of the two variables, and no missing values. If there were missing values, you would need to report the sample size actually used in the analysis rather than the total number of rows.
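
One way to find the sample size that a two-variable analysis would actually use is to count the rows with no missing value in either column. Here is a sketch on made-up data; the tibble `toy` and its values are hypothetical, not the wolf data.

```r
library(tibble)
library(tidyr)

# Hypothetical data with one missing value in each column
toy <- tibble(
  x = c(0.10, 0.25, NA, 0.30),
  y = c(5, NA, 3, 2)
)

# Total number of rows, including incomplete ones
n_total <- nrow(toy)

# Rows with both values present: the sample size to report
n_complete <- nrow(drop_na(toy))
```

For the wolf data the two counts would both be 24, because the summary shows no missing values in either variable.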

Now let’s explore the trick dataset:

trick %>% skim_without_charts()

Data summary
Name	Piped data
Number of rows	21
Number of columns	2
_______________________
Column type frequency:
numeric	2
________________________
Group variables	None

Variable type: numeric

skim_variable	n_missing	complete_rate	mean	sd	p0	p25	p50	p75	p100
years	0	1	27.29	15.21	2	17	28	39	50
impressivenessScore	0	1	3.43	1.36	1	2	4	4	5

It includes 21 observations, no missing values, and two variables that take whole-number values (although `read_csv` stores them as doubles): “years” and “impressivenessScore”. Reading example 16.5 from the text, we see that the latter is a ranked (ordinal) score.
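
To check how a column is stored and whether all of its values are whole numbers, something like the following could be used. This is a sketch on made-up values; the tibble `toy` is a hypothetical stand-in for `trick`.

```r
library(tibble)

# Hypothetical stand-in for the trick data
toy <- tibble(
  years = c(2, 17, 28),
  impressivenessScore = c(1, 4, 5)
)

# Storage type of each column. Plain numeric values like these are
# stored as "double", as are columns that read_csv() reports as dbl.
storage <- sapply(toy, typeof)

# TRUE for a column if every value in it is a whole number
whole <- sapply(toy, function(col) all(col == round(col)))
```

Both `storage` and `whole` are named vectors with one entry per column, so they can be printed side by side for a quick check.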