Houston, we got a 404!
As you've seen in the video, status codes are a fundamental part of HTTP: they tell you whether everything is okay or whether there is a problem with your request.
It is good practice to always check the status code of a response before you start working with the downloaded page. For this, you can use the status_code() function from the httr package. It takes a response object, as returned by a request function such as GET(), as its argument.
Now let's assume you're trying to scrape the same page as before, but somehow you got the URL wrong (Varigott instead of Varigotti).
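To make the idea of "checking before working with the page" concrete, here is a minimal sketch of how a numeric status code could be interpreted. The helper describe_status() is hypothetical and not part of httr; it only maps a code to the broad classes defined by the HTTP specification and needs no network access.

```r
# Hypothetical helper (an assumption for illustration, not an httr function):
# map a numeric HTTP status code to its broad class.
describe_status <- function(code) {
  if (code >= 100 && code < 200) {
    "informational"
  } else if (code >= 200 && code < 300) {
    "success"
  } else if (code >= 300 && code < 400) {
    "redirection"
  } else if (code >= 400 && code < 500) {
    "client error"
  } else if (code >= 500 && code < 600) {
    "server error"
  } else {
    "unknown"
  }
}

describe_status(200)  # "success"
describe_status(404)  # "client error"
```

With a real response object, the value returned by status_code() could be passed to such a helper. httr also provides http_error(), which returns TRUE for codes of 400 and above, and stop_for_status(), which raises an R error in that case.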
This exercise is part of the course
Web Scraping in R
Instructions
- Read out the status code of the response object from the GET request.
Hands-on interactive exercise
Try this exercise by completing this sample code.
library(httr)
response <- GET('https://en.wikipedia.org/wiki/Varigott')
# Print the status code of the nonexistent page
___