Aan de slagGa gratis aan de slag

Specificeer pakketten in de eerste codechunk

We hebben de resultaten van je analyse uit het tweede hoofdstuk gekopieerd naar een ander RMarkdown-rapport, rechts in beeld. Het gebruikt je structuur uit de vorige oefeningen en heeft R-codechunks toegevoegd aan de bijbehorende secties.

Er mist echter iets: de informatie over de software die je hebt gebruikt. Het opgeven van de pakketten die nodig zijn voor een bepaalde analyse is een must als je reproduceerbaarheid wilt garanderen, zodat anderen je werk kunnen hergebruiken.

Onthoud dat je je R-codechunk als volgt specificeert!

```{r}
# Some code

```

Deze oefening maakt deel uit van de cursus

Communiceren met data in de Tidyverse

Cursus bekijken

Oefeninstructies

  • Maak je eerst vertrouwd met de structuur van het nieuwe RMarkdown-document waarmee je vanaf nu gaat werken.
  • Voeg in de "Preparations"-sectie een nieuwe R-codechunk toe en verwijder de placeholdertekst. Laad in deze codechunk de Tidyverse-libraries dplyr, ggplot2 en forcats.

Praktische interactieve oefening

Probeer deze oefening eens door deze voorbeeldcode in te vullen.

{"my_document.Rmd":"---\ntitle: \"The reduction in weekly working hours in Europe\" \nsubtitle: \"Looking at the development between 1996 and 2006\"\nauthor: \"Insert your name here\"\noutput: html_document\n---\n\n## Summary \n\nThe **International Labour Organization (ILO)** has many [data sets](http://www.ilo.org/global/statistics-and-databases/lang--en/index.htm) on working conditions. For example, one can look at how weekly working hours have been decreasing in many countries of the world, while monetary compensation has risen. In this report, *the reduction in weekly working hours* in European countries is analysed, and a comparison between 1996 and 2006 is made. All analysed countries have seen a decrease in weekly working hours since 1996 – some more than others.\n\n## Preparations \n\nThis is where you have to load the necessary R packages.\n\n## Analysis\n\n### Data\n\nThe herein used data can be found in the [statistics database of the ILO](http://www.ilo.org/ilostat/faces/wcnav_defaultSelection;ILOSTATCOOKIE=ZOm2Lqrr-OIuzxNGn2_08bNe9AmHQ1kUA6FydqyZJeIudFLb2Yz5!1845546174?_afrLoop=32158017365146&_afrWindowMode=0&_afrWindowId=null#!%40%40%3F_afrWindowId%3Dnull%26_afrLoop%3D32158017365146%26_afrWindowMode%3D0%26_adf.ctrl-state%3D4cwaylvi8_4). For the purpose of this course, it has been slightly preprocessed.\n\n```{r}\nload(url(\"http://s3.amazonaws.com/assets.datacamp.com/production/course_5807/datasets/ilo_data.RData\"))\n```\n\n```{r}\n# Some summary statistics\nilo_data %>%\n  group_by(year) %>%\n  summarize(mean_hourly_compensation = mean(hourly_compensation),\n            mean_working_hours = mean(working_hours))\n```\n\nAs can be seen from the above table, the average weekly working hours of European countries have been descreasing since 1980.\n\n### Preprocessing\n\nThe data is now filtered so it only contains the years 1996 and 2006 – a good time range for comparison. \n\n```{r}\nilo_data <- ilo_data %>%\n  filter(year == \"1996\" | year == \"2006\")\n  \n# Reorder country factor levels\nilo_data <- ilo_data %>%\n  # Arrange data frame first, so last is always 2006\n  arrange(year) %>%\n  # Use the fct_reorder function inside mutate to reorder countries by working hours in 2006\n  mutate(country = fct_reorder(country,\n                               working_hours,\n                               last))\n```  \n\n### Results\n\nIn the following, a plot that shows the reduction of weekly working hours from 1996 to 2006 in each country is produced.\n\nFirst, a custom theme is defined.\n\n```{r}\n# Better to define your own function than to always type the same stuff\ntheme_ilo <- function(){\n  theme_minimal() +\n  theme(\n    text = element_text(family = \"Bookman\", color = \"gray25\"),\n    plot.subtitle = element_text(size = 12),\n    plot.caption = element_text(color = \"gray30\"),\n    plot.background = element_rect(fill = \"gray95\"),\n    plot.margin = unit(c(5, 10, 5, 10), units = \"mm\")\n  )\n}\n```  \n\nThen, the plot is produced. \n\n```{r}\n# Compute temporary data set for optimal label placement\nmedian_working_hours <- ilo_data %>%\n  group_by(country) %>%\n  summarize(median_working_hours_per_country = median(working_hours)) %>%\n  ungroup()\n\n# Have a look at the structure of this data set\nstr(median_working_hours)\n\n# Plot\nggplot(ilo_data) +\n  geom_path(aes(x = working_hours, y = country),\n            arrow = arrow(length = unit(1.5, \"mm\"), type = \"closed\")) +\n  # Add labels for values (both 1996 and 2006)\n  geom_text(\n        aes(x = working_hours,\n            y = country,\n            label = round(working_hours, 1),\n            hjust = ifelse(year == \"2006\", 1.4, -0.4)\n          ),\n        # Change the appearance of the text\n        size = 3,\n        family = \"Bookman\",\n        color = \"gray25\"\n   ) +\n  # Add labels for country\n  geom_text(data = median_working_hours,\n            aes(y = country,\n                x = median_working_hours_per_country,\n                label = country),\n            vjust = 2,\n            family = \"Bookman\",\n            color = \"gray25\") +\n  # Add titles\n  labs(\n    title = \"People work less in 2006 compared to 1996\",\n    subtitle = \"Working hours in European countries, development since 1996\",\n    caption = \"Data source: ILO, 2017\"\n  ) +\n  # Apply your theme \n  theme_ilo() +\n  # Remove axes and grids\n  theme(\n    axis.ticks = element_blank(),\n    axis.title = element_blank(),\n    axis.text = element_blank(),\n    panel.grid = element_blank(),\n    # Also, let's reduce the font size of the subtitle\n    plot.subtitle = element_text(size = 9)\n  ) +\n  # Reset coordinate system\n  coord_cartesian(xlim = c(25, 41))\n```\n\n\n"}
Code bewerken en uitvoeren