# Why does R plot Residuals vs Leverage instead of Residuals vs Factor Levels (ANOVA test & model with ‘aov’)?

I’m analysing a data set with the weights of newborn babies and some information about their mothers, including a categorical variable ‘smoke’ that records whether the mother is a smoker.

I fitted a one-way ANOVA with aov and wanted to draw the model’s diagnostic plots. I expected four plots, including a ‘Residuals vs Factor Levels’ plot. Instead, the fourth plot was a ‘Residuals vs Leverage’ plot, as if my categorical variable were numeric.

You can find the dataset here: https://drive.google.com/file/d/1VwiAHdYZF2BrGZZ875GGdkyamKMgxmGU/view?usp=sharing

In the raw data, ‘smoke’ takes the values 0 (non-smoker) and 1 (smoker). I used mutate to turn it into a proper factor (along with other variables, such as parity), then fitted the aov model and plotted it to check the assumptions. My code is below:

``````
library(dplyr)  # provides %>% and mutate()

babies <- read.csv("babies.csv")

# Recode the 0/1 indicators as labelled factors
babies <- babies %>%
  mutate(parity = factor(parity,
                         levels = c(0, 1),
                         labels = c("not firstborn", "firstborn"))) %>%
  mutate(smoke = factor(smoke,
                        levels = c(0, 1),
                        labels = c("non smoker", "smoker")))

# One-way ANOVA of birth weight on smoking status
model6 <- aov(bwt ~ smoke, data = babies)

par(mfrow = c(2, 2))  # 2 x 2 grid for the four diagnostic plots
plot(model6)
``````

The result I’m getting in the fourth plot is this:

[Image: fourth diagnostic plot, ‘Residuals vs Leverage’]

I checked whether ‘smoke’ really is a factor, like this:

``````
> head(babies$smoke)
[1] non smoker non smoker smoker     non smoker smoker     non smoker
Levels: non smoker smoker
``````
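A slightly fuller version of this check may be useful, because `head()` only shows how the column prints. The sketch below uses a tiny made-up data frame called `toy` (a hypothetical stand-in with the same column names, not the real babies data) to confirm the class of the column, the class the fitted model recorded for it, and the number of observations in each level.

```r
# Hypothetical stand-in for the real data: same column names, made-up values
toy <- data.frame(
  bwt   = c(120, 113, 128, 108, 136, 138),
  smoke = c(0, 0, 1, 0, 1, 0)
)
toy$smoke <- factor(toy$smoke,
                    levels = c(0, 1),
                    labels = c("non smoker", "smoker"))

fit <- aov(bwt ~ smoke, data = toy)

is.factor(toy$smoke)             # class of the column itself
attr(terms(fit), "dataClasses")  # classes as recorded by the fitted model
table(toy$smoke)                 # number of observations per level
```

Running the same three calls on `babies` and `model6` would show whether the model sees ‘smoke’ as a factor and whether the two groups have the same size.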

Since ‘smoke’ is a factor (as far as I understand) and therefore a categorical variable, why does the plot show leverage, as it would for a numeric variable? How can I fix this and get the plot I expected?
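One thing that may be worth checking is whether the groups are balanced. According to the `plot.lm` help page, the x-axis of this plot switches from leverage to factor levels only when the leverages are constant, which is typically the case for a balanced design. The sketch below is an assumption-laden illustration on synthetic data (the helper `make_fit` and the group sizes are hypothetical, and nothing here uses the babies data): it fits the same kind of model to a balanced and an unbalanced sample and compares the hat values.

```r
set.seed(1)  # synthetic data only; not the babies data set

# Hypothetical helper: one-way aov on simulated weights for two groups
make_fit <- function(n_non, n_smoker) {
  d <- data.frame(
    smoke = factor(rep(c("non smoker", "smoker"),
                       times = c(n_non, n_smoker))),
    bwt   = rnorm(n_non + n_smoker, mean = 120, sd = 18)
  )
  aov(bwt ~ smoke, data = d)
}

balanced   <- make_fit(20, 20)  # equal group sizes
unbalanced <- make_fit(30, 10)  # unequal group sizes

# Leverage of each observation; rounded to ignore floating-point noise
unique(round(hatvalues(balanced), 6))    # one distinct value
unique(round(hatvalues(unbalanced), 6))  # one distinct value per group size
```

If `hatvalues(model6)` returns more than one distinct value, the smoker and non-smoker groups differ in size, and that, rather than the variable being treated as numeric, might explain the ‘Residuals vs Leverage’ panel.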

Thanks for the help in advance!