The web is a rich source of data from which you can extract various types of insights and findings. In this chapter, you will learn how to get data from the web, whether it is stored in files or in HTML. You'll also learn the basics of scraping and parsing web data.

Importing flat files from the web

Importing flat files from the web: your turn!

Opening and reading flat files from the web

Importing non-flat files from the web

HTTP requests to import files from the web

Performing HTTP requests in Python using urllib

Printing HTTP request results in Python using urllib

Performing HTTP requests in Python using requests

Scraping the web in Python

Parsing HTML with BeautifulSoup

Turning a webpage into data using BeautifulSoup: getting the text

Turning a webpage into data using BeautifulSoup: getting the hyperlinks
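The last lessons of this chapter turn a webpage into text and hyperlinks with BeautifulSoup. As a rough, dependency-free sketch of the same idea, the block below uses the standard library's `html.parser` in place of BeautifulSoup and parses an inline HTML string rather than a fetched page. The sample HTML, the `LinkCollector` class, and the `extract_links_and_text` helper are made up for illustration and are not taken from the course.

```python
from html.parser import HTMLParser

# Hypothetical sample page; in practice the HTML would come from an HTTP response.
SAMPLE_HTML = """
<html><head><title>Demo page</title></head>
<body>
  <p>Read the <a href="https://example.com/docs">docs</a> or
  the <a href="/about">about page</a>.</p>
</body></html>
"""


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag and the text found in the document."""

    def __init__(self):
        super().__init__()
        self.links = []
        self.text_parts = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the opening tag.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value is not None:
                    self.links.append(value)

    def handle_data(self, data):
        # Collapse runs of whitespace and skip whitespace-only chunks.
        cleaned = " ".join(data.split())
        if cleaned:
            self.text_parts.append(cleaned)


def extract_links_and_text(html):
    """Return (list of hrefs, text content) for an HTML string."""
    parser = LinkCollector()
    parser.feed(html)
    parser.close()
    return parser.links, " ".join(parser.text_parts)


links, text = extract_links_and_text(SAMPLE_HTML)
print(links)
print(text)
```

With BeautifulSoup installed, the same two results would typically come from `soup.find_all('a')` and `soup.get_text()`; the standard-library version is used here only so the sketch runs without extra packages.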

Importing data from the Internet

In this chapter, you will gain a deeper understanding of how to import data from the web. You will learn the basics of extracting data from APIs, gain insight into the importance of APIs, and practice extracting data by diving into the OMDB and Library of Congress APIs.

Introduction to APIs and JSONs

Pop quiz: What exactly is a JSON?

Loading and exploring a JSON

Pop quiz: Exploring your JSON

APIs and interacting with the world wide web

Pop quiz: What's an API?

API requests

JSON–from the web to Python

Checking out the Wikipedia API
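The lessons in this chapter revolve around sending a request to an API and reading the JSON that comes back. The block below is a minimal offline sketch of those two steps under stated assumptions: the endpoint `https://api.example.com/`, the query parameters, and the response body are invented stand-ins, not the real OMDB or Wikipedia interface, and no network request is made.

```python
import json
from urllib.parse import urlencode

# Hypothetical endpoint and parameters, for illustration only.
BASE_URL = "https://api.example.com/"
query = urlencode({"t": "the social network", "format": "json"})
url = BASE_URL + "?" + query
print(url)

# Invented response body standing in for what an API might return.
response_body = (
    '{"Title": "The Social Network", "Year": "2010", '
    '"Genres": ["Biography", "Drama"]}'
)

# json.loads turns the JSON text into Python objects (here, a dict).
record = json.loads(response_body)
for key, value in record.items():
    print(f"{key}: {value}")
```

In a live setting the response text would come from an HTTP client such as `requests`, whose response objects offer a `.json()` method that performs the same decoding.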

Interacting with APIs to import data from the web

In this chapter, you will consolidate your knowledge of interacting with APIs in a deep dive into the Twitter streaming API. You'll learn how to stream real-time Twitter data, and how to analyze and visualize it.

The Twitter API and Authentication

Streaming tweets

Load and explore your Twitter data

Twitter data to DataFrame

A little bit of Twitter text analysis

Plotting your Twitter data

Final Thoughts

Diving deep into the Twitter API

Latitudes (XLS)

Tweets

Red wine quality

Course Glossary

As a data scientist, you will need to clean data, transform and wrangle it, visualize it, build predictive models, and interpret those models. Before you can do any of that, you need to know how to get data into Python. In the prequel to this course, you learned several ways to import data into Python: from flat files such as .txt and .csv; from files native to other software, such as Excel spreadsheets and Stata, SAS, and MATLAB files; and from relational databases such as SQLite and PostgreSQL. In this course, you will build on that knowledge by learning how to import data from the web and how to pull data from Application Programming Interfaces (APIs), such as the Twitter streaming API, which lets you stream real-time tweets.

The videos include live transcripts, which you can display by clicking "Show transcript" at the bottom left of each video.
The course glossary can be found on the right, in the resources section.
To earn CPE credits, you must complete the course and score at least 70% on the qualified assessment. You can reach the assessment by clicking the CPE credits callout on the right.

Introduction to Importing Data in Python

Learn how to import data into Python from sources such as the web and APIs, like the Twitter API.

Intermediate Importing Data in Python

Improve your Python data importing skills and learn to work with web and API data.

Data Engineer in Python

Data Scientist in Python

Importing and Cleaning Data in Python
