Why do results differ for dplyr left_join() and right_join() using these two dataframes?

I am learning how to use the R dplyr ‘join’ functions by working through the exercises from a course, and got stuck on the problem described below.

First, download the example dataframes used for this question:


Load the package:


Then in R/RStudio load the dataframe files, ‘clinical2’ and ‘expression’ by typing:


The task is, firstly:
Join the expression and clinical2 tables by the patient reference, using the left_join and the right_join functions.
I did that in this way:

left_join(expression, clinical2,
          by = c("patient" = "patientID"))
right_join(expression, clinical2,
           by = c("patient" = "patientID"))

The second task is to explain why the results are different. I found that the right_join output has 3 more rows than the left_join output. This seems odd to me, given that ‘clinical2’ has only 516 rows whereas ‘expression’ has 570 rows. The 3 extra rows in the right_join output all contain multiple NA values, which presumably means they are patients found in ‘clinical2’ but not in ‘expression’. I don’t really understand what is going on here, and would be grateful for any help.
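To make the row counts easier to reason about, here is a minimal sketch on toy data. The data frames `expression_toy` and `clinical_toy` and all their values are made up for illustration; they are not the course files. It assumes the pattern is the same as in the real tables: some patients appear more than once in the larger table, and a few patients appear only in the smaller one.

```r
library(dplyr)

# Hypothetical stand-ins for 'expression' and 'clinical2' (invented values)
expression_toy <- data.frame(
  patient    = c("p1", "p1", "p2", "p2", "p3"),  # p1 and p2 appear twice
  gene_value = c(1.2, 0.8, 2.5, 2.1, 3.1)
)
clinical_toy <- data.frame(
  patientID = c("p1", "p2", "p3", "p4"),         # p4 has no expression rows
  age       = c(50, 61, 47, 39)
)

left  <- left_join(expression_toy, clinical_toy,
                   by = c("patient" = "patientID"))
right <- right_join(expression_toy, clinical_toy,
                    by = c("patient" = "patientID"))

nrow(left)   # 5: one row per row of expression_toy
nrow(right)  # 6: the 5 matched rows plus one NA-filled row for p4

# Patients present only in the clinical table (these become the NA rows)
only_clinical <- anti_join(clinical_toy, expression_toy,
                           by = c("patientID" = "patient"))

# Patients that occur more than once in the expression table
repeated <- expression_toy %>% count(patient) %>% filter(n > 1)
```

Running the same `anti_join()` and `count()` checks on the real ‘clinical2’ and ‘expression’ tables should show whether the 3 extra rows come from patients that are missing from ‘expression’, and whether repeated patient values explain why the smaller table still yields the longer join.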

