## 12.3 Estimate the odds of getting sick

The odds of success (O) are the probability of success (p) divided by the probability of failure (1-p):

$$O = \frac{p}{1-p}$$
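As a quick numeric illustration of this formula, here is a minimal sketch in base R. The helper name `odds_from_prob` and the example probability of 0.2 are assumptions chosen purely for illustration; neither comes from the tutorial's data.

```r
# Hypothetical helper (not part of the tutorial): convert a probability to odds
odds_from_prob <- function(p) {
  stopifnot(all(p >= 0 & p < 1))  # odds are undefined when p = 1
  p / (1 - p)
}

# Illustrative value only: p = 0.2 gives odds of 0.25, i.e. 1:4
odds_from_prob(0.2)
```

Note that odds and probabilities are close when p is small, but diverge as p grows: a probability of 0.5 corresponds to odds of 1 (or 1:1).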

Curiously, in health-related studies, a “success” is equated with getting ill!

We’ll use the data stored in the contingency table we produced before, called “cancer.aspirin.table”:

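cancer.aspirin.table <- cancer %>%
  tabyl(aspirinTreatment, response) %>%
  # add the row and column totals shown in the output below (janitor's adorn_totals)
  adorn_totals(where = c("row", "col"))
cancer.aspirin.table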
##  aspirinTreatment Cancer No cancer Total
##           Aspirin   1438     18496 19934
##           Placebo   1427     18515 19942
##             Total   2865     37011 39876

Recall that proportions are calculated from frequencies, which is exactly what we have in the table!

Thus, to estimate the “odds” of getting cancer while taking aspirin, we need to:

• first calculate the proportion (= probability) of women who got cancer while taking aspirin ($= p$)
• then calculate the proportion (= probability) of women who remained healthy while taking aspirin ($= 1-p$)
• then calculate the odds as $O = \frac{p}{1-p}$

We’ll do all of this in one go using a series of steps strung together with pipes (“%>%”).

Here’s the code, and we’ll explain each step after:

cancer.aspirin.table %>%
  filter(aspirinTreatment == "Aspirin") %>%
  select(Cancer, Total) %>%
  mutate(
    propCancer_aspirin = Cancer / Total,
    propHealthy_aspirin = 1 - propCancer_aspirin,
    oddsCancer_aspirin = propCancer_aspirin / propHealthy_aspirin
  )
##  Cancer Total propCancer_aspirin propHealthy_aspirin oddsCancer_aspirin
##    1438 19934         0.07213806           0.9278619         0.07774654

• we first filter the table to return only the row pertaining to the “Aspirin” treatment group; the frequencies that we need for the calculations are in this row
• then we select the columns from that row with names “Cancer” and “Total”, which hold the frequency of women who got cancer while on aspirin (in the “Cancer” column) and the total frequency of women in the Aspirin treatment group (in the “Total” column)
• we then use the mutate function to create three new variables:
  • “propCancer_aspirin” is calculated as the Cancer frequency divided by the Total frequency (within the Aspirin group)
  • “propHealthy_aspirin” is calculated simply as 1 minus propCancer_aspirin
  • “oddsCancer_aspirin” is calculated last as “propCancer_aspirin/propHealthy_aspirin”

Thus, the odds of getting cancer while on aspirin are about 0.08:1, or equivalently, approximately 1:13 (which you get from dividing 1 by 0.0777).

Alternatively, “the odds are 13 to 1 that a woman who took aspirin would not get cancer in the next 10 years”.
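The arithmetic behind these statements can be checked with a short base R sketch. The variable names below are hypothetical, and the counts are copied by hand from the Aspirin row of the contingency table. Because the group total cancels out of the ratio, the odds can also be computed directly from the two counts.

```r
# Counts copied from the Aspirin row of the contingency table
# (variable names are illustrative, not from the tutorial)
cancer_aspirin  <- 1438
healthy_aspirin <- 18496

# (cancer / total) / (healthy / total) simplifies to cancer / healthy
odds_cancer <- cancer_aspirin / healthy_aspirin
odds_cancer       # about 0.0777, matching oddsCancer_aspirin

# The reciprocal gives the odds of NOT getting cancer
1 / odds_cancer   # about 12.9, which rounds to 13
```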

1. Estimate odds
• Estimate the odds that a woman in the placebo group would get cancer