We'll start the course by defining what data science is. We'll cover the data science workflow and how data science is applied to real-world problems. We'll finish the chapter by learning about different roles within the data science field.
Now that we understand the data science workflow, we'll dive deeper into the first step: data collection and storage. We'll learn about the different data sources you can draw from, what that data looks like, how to store the data once it's collected, and how a data pipeline can automate the process.
Data preparation is fundamental: by a commonly cited estimate, data scientists spend about 80% of their time cleaning and manipulating data, and only 20% of their time actually analyzing it. This chapter will show you how to diagnose problems in your data and deal with missing values and outliers. You will then learn about visualization, another essential tool for both exploring your data and conveying your findings.
In this final chapter, we'll discuss experimentation and prediction! Beginning with experiments, we'll cover A/B testing, then move on to time series forecasting, where we'll learn to predict future events. We'll end with machine learning, looking at supervised learning and clustering.