BIOL202 Tutorials
Preface
Welcome
Author
Acknowledgments
Copyright
UBCO Biology open materials
Getting started with R, RStudio & R Markdown
1
What are R and RStudio?
1.1
Installing R and RStudio
2
Start using R & RStudio
2.1
The RStudio Interface
2.2
Coding basics
2.3
R packages
2.4
Package installation
2.5
Package loading
2.6
Intro to R Markdown
2.7
Literate programming with R Markdown
2.8
Making sure R Markdown knits to PDF
2.9
Extra resources
2.10
R Resources Online
Reproducible Workflows
3
Reproducible Research
3.1
Computational reproducibility
3.2
An example BIOL202 workflow
3.3
Microsoft OneDrive
3.4
Directory structure
3.5
Steps to set up directories
3.6
Lecture workflow
3.7
Tutorial workflow
3.8
Create an RStudio Project
3.9
Create subdirectories
3.9.1
The here package
3.10
Edit an R markdown file
3.11
Components of an R Markdown file
3.12
Interacting with Tutorial material
3.13
Lab assignments workflow
4
Preparing and formatting assignments
4.1
Open your assignment RStudio project
4.2
Download the assignment Rmd file
4.3
Open the assignment Rmd file
4.4
What to include in your answers
4.4.1
Code chunk headers
4.4.2
Import data
4.4.3
Load packages
4.4.4
Answer the questions
4.5
Setting up R Markdown for graphing
4.6
Example question / answer
4.7
Knitting your assignment to PDF
4.8
Submit your assignment
5
Preparing and importing Tidy Data
5.1
Tidy data
5.2
Import a CSV file from a website
5.3
Create a CSV file
5.4
Import a local CSV file
5.5
Get an overview of a dataset
5.6
Tutorial practice activities
Visualizing and Describing Data
6
Visualizing a single variable
6.1
Load packages and import data
6.2
Get an overview of the data
6.3
Create a frequency table
6.4
Create a bar graph
6.5
Create a histogram
6.6
Describing a histogram
7
Describing a single variable
7.1
Load packages and import data
7.2
Describing a categorical variable
7.3
Describing a numerical variable
7.3.1
Calculating the median & IQR
7.3.2
Calculating the mean & standard deviation
7.4
Describing a numerical variable grouped by a categorical variable
8
Visualizing associations between two variables
8.1
Load packages and import data
8.2
Visualizing association between two categorical variables
8.2.1
Constructing a contingency table
8.2.2
Constructing a grouped bar graph
8.2.3
Constructing a mosaic plot
8.2.4
Interpreting mosaic plots
8.3
Visualizing association between two numeric variables
8.3.1
Interpreting and describing a scatterplot
8.4
Visualizing association between a numeric and a categorical variable
8.4.1
Create a stripchart
8.4.2
Create a violin plot
8.4.3
Creating a boxplot
8.4.4
Combining violin and boxplots
8.4.5
Interpreting stripcharts, violin plots and boxplots
Inferential Statistics
9
Sampling, Estimation, & Uncertainty
9.1
Load packages and import data
9.2
Functions for sampling
9.2.1
Setting the “seed” for random sampling
9.3
Sampling error
9.4
Sampling distribution of the mean
9.4.1
Visualize the sampling distribution
9.5
Standard error of the mean
9.6
Rule of thumb 95% confidence interval
10
Hypothesis testing
10.1
Load packages and import data
10.2
Steps to hypothesis testing
10.3
A hypothesis test example
10.3.1
Following the steps to hypothesis testing
10.3.2
Simulating a “null distribution”
10.3.3
Calculating the P-value
10.3.4
Writing a concluding statement
11
Analyzing a single categorical variable
11.1
Load packages and import data
11.2
Estimating proportions
11.2.1
Standard error for a proportion
11.2.2
Confidence interval for a proportion
11.3
Binomial distribution
11.4
Binomial test
11.5
Confidence interval approach to hypothesis testing
11.6
Goodness-of-fit tests
12
Analyzing associations between two categorical variables
12.1
Load packages and import data
12.2
Fisher’s Exact Test
12.2.1
Hypothesis statement
12.2.2
Display a contingency table
12.2.3
Display a mosaic plot
12.2.4
Conduct the Fisher’s Exact Test
12.3
Estimate the Odds of getting sick
12.4
Estimate the odds ratio
12.5
\(\chi^2\) Contingency Test
12.5.1
Hypothesis statement
12.5.2
Display the contingency table
12.5.3
Visualize a mosaic plot
12.5.4
Check the assumptions
12.5.5
Get the results of the test
13
Analyzing a single numerical variable
13.1
Load packages and import data
13.2
One-sample t-test
13.2.1
Hypothesis statement
13.2.2
Assumptions of one-sample t-test
13.2.3
A graph to accompany a one-sample t-test
13.2.4
Conduct the one-sample t-test
13.2.5
Concluding statement for the one-sample t-test
13.3
Confidence intervals for \(\mu\)
13.3.1
Confidence interval as a measure of precision for an estimate
13.3.2
Confidence interval approach to hypothesis testing
14
Comparing means among two groups
14.1
Load packages and import data
14.2
Paired t-test
14.2.1
Calculate differences
14.2.2
Hypothesis statement
14.2.3
A graph to accompany a paired t-test
14.2.4
Assumptions of the paired t-test
14.2.5
Conduct the test
14.2.6
Concluding statement
14.3
Two-sample t-test
14.3.1
Hypothesis statement
14.3.2
A table of descriptive statistics
14.3.3
A graph to accompany a 2-sample t-test
14.3.4
Assumptions of the 2-sample t-test
14.3.5
Conduct the 2-sample t-test
14.3.6
Concluding statement
14.4
When assumptions aren’t met
15
Checking assumptions and data transformations
15.1
Load packages and import data
15.2
Checking the normality assumption
15.2.1
Normal quantile plots
15.2.2
Shapiro-Wilk test for normality
15.3
Checking the equal-variance assumption
15.4
Data transformations
15.4.1
Log-transform
15.4.2
Dealing with zeroes
15.4.3
Log bases
15.4.4
Back-transforming log data
15.4.5
Logit transform
15.4.6
Back-transforming logit data
15.4.7
When to back-transform?
16
Comparing means among more than two groups
16.1
Load packages and import data
16.2
Analysis of variance
16.2.1
Hypothesis statements
16.2.2
A table of descriptive statistics
16.2.3
Visualize the data
16.2.4
Assumptions of ANOVA
16.2.5
Conduct the ANOVA test
16.2.6
Calculate \(R^2\) for the ANOVA
16.2.7
Tukey-Kramer post-hoc test
16.2.8
Visualizing post-hoc test results
16.2.9
Concluding statement
16.3
When assumptions aren’t met
17
Analyzing associations between two numerical variables
17.1
Load packages and import data
17.2
Pearson correlation analysis
17.2.1
Hypothesis statements
17.2.2
Visualize the data
17.2.3
Assumptions of correlation analysis
17.2.4
Conduct the correlation analysis
17.2.5
Concluding statement
17.3
Rank correlation (Spearman’s correlation)
17.3.1
Hypothesis statements
17.3.2
Visualize the data
17.3.3
Assumptions of Spearman rank correlation
17.3.4
Conduct the test
17.3.5
Concluding statement
18
Least-squares linear regression
18.1
Load packages and import data
18.2
Least-squares regression analysis
18.2.1
Equation of a line and “least-squares line”
18.2.2
Hypothesis testing or prediction?
18.2.3
Steps to conducting regression analysis
18.2.4
State question and set the \(\alpha\) level
18.2.5
Visualize the data
18.2.6
Interpreting a scatterplot
18.2.7
Checking assumptions of regression analysis
18.2.8
Residual plots when you have missing values
18.2.9
Transform the data
18.2.10
Conduct the regression analysis
18.2.11
Confidence interval for the slope
18.2.12
Scatterplot with regression confidence bands
18.2.13
Concluding statement
18.3
Making predictions
18.3.1
Back-transforming regression predictions
18.4
Model-I versus Model-II regression
18.4.1
Definitions
18.4.2
Which one do I use?
Load all the necessary packages
Data summaries with “gtsummary” package
Creating tables in R Markdown
18.5
Load packages and import data
18.5.1
Formatting output from the skimr package
18.5.2
A nicely formatted table of descriptive statistics
Visual Markdown Editor
A more familiar editing environment
Common errors and their solutions
Google can help
Rosetta error
Rtools required during install
Could not find function
There is no package
Trying to use CRAN without setting a mirror
PDF Latex is not found
Error in parse
No such file or directory exists
Messy output when loading packages
Unused argument
Object not found
Figure caption doesn’t show up below figure in knitted document
Figures are placed in weird spots in knitted PDF
Installing packages: there is a binary version available
Unicode knitting error
Tutorials for BIOL202: Introduction to Biostatistics
Common errors and their solutions