Joining with missing values
Two new data.tables have been loaded into your R session: heart and cardio. Each one contains a set of microarray probes you have found to be associated with heart disease in two separate studies*. Each probe measures the expression levels of a gene. Each gene can be measured by one or more probes, and some probes do not have any known gene annotation in the human genome reference sequence. The two studies have used different microarray platforms that use different probes to measure each gene. Your goal is to find which genes had reproducible associations with heart disease in both studies.
* Note: associations are randomly generated, not representative of any true biological finding or real dataset.
This exercise is part of the course
Joining Data with data.table in R
Exercise instructions
- Using the merge() function, inner join cardio to heart with the appropriate argument to override any errors that you encounter.
- Remove the probes from both data.tables with no gene annotation (i.e., remove rows with missing values in the gene column).
- Repeat the inner join with the new data.tables to get a data.table of reproducible associations between genes and heart disease.
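The effect of missing join keys can be sketched on toy data. The tables heart_toy and cardio_toy below, along with their probe IDs and gene names, are invented for illustration and are not the heart and cardio tables loaded in the session; the probe column name is an assumption, and only gene comes from the exercise. In data.table joins, NA in a key column matches NA, so every unannotated probe in one table pairs with every unannotated probe in the other. Here that makes the result larger than both inputs combined, which is the situation merge() refuses by default unless allow.cartesian = TRUE is passed.

```r
library(data.table)

# Toy stand-ins for the exercise tables; all values are invented
heart_toy <- data.table(
  probe = paste0("h", 1:6),
  gene  = c("GENE_A", "GENE_B", NA, NA, NA, NA)
)
cardio_toy <- data.table(
  probe = paste0("c", 1:6),
  gene  = c("GENE_A", "GENE_C", NA, NA, NA, NA)
)

# Inner join on gene. NA matches NA, so the 4 x 4 unannotated probes
# pair up; allow.cartesian = TRUE permits the oversized result.
joined_raw <- merge(heart_toy, cardio_toy, by = "gene", allow.cartesian = TRUE)
nrow(joined_raw)  # 17: 16 NA-to-NA pairs plus 1 match on GENE_A

# Drop probes without a gene annotation, then repeat the inner join
heart_toy_2  <- heart_toy[!is.na(gene)]
cardio_toy_2 <- cardio_toy[!is.na(gene)]
joined_clean <- merge(heart_toy_2, cardio_toy_2, by = "gene")
joined_clean$gene  # "GENE_A"
```

In this sketch the 16 NA-to-NA rows carry no biological meaning, since two probes sharing a missing annotation says nothing about whether they measure the same gene. Filtering on is.na(gene) before joining leaves only the matches on real gene names.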
Hands-on interactive exercise
Try this exercise by completing this sample code.
# Try an inner join
___
# Filter missing values
heart_2 <- ___
cardio_2 <- ___
# Inner join the filtered data.tables
___