1. Defensive R Programming
Hi. I'm Colin Gillespie and I have the pleasure of taking you through
this fabulous course.
2. The Data Science Pipeline
Unless you've been cut off entirely from the R world, you've probably
seen this tidyverse diagram of the data science process. It nicely encapsulates
the smooth transition from data to cleaning, then on to visualization
before finishing with reporting. There's only one small problem with this
diagram. My R workflow is never this smooth!
3. My Data Science Workflow
My data science process often feels a bit more like this.
There's clearly a pipeline, but at times it feels rather chaotic!
Data is messy and can change under our feet.
So it goes through a series of sometimes scrappy paths, many of which
need a lot of patching, before something useful is produced.
4. Goals
This course aims to make our data science pipeline a bit more robust.
It will still occasionally break, but hopefully in a far more controlled and
predictable manner.
We want to anticipate problems before they happen, and set up the necessary
defenses to cope with the unexpected.
5. Let's go
Let's set the scene with a few exercises.