Exercise

Do it the httr way

Here's some rvest code that I used to find out the elevation of a beautiful place where I recently spent my vacation.

# Load rvest, which provides read_html(), html_elements() and the %>% pipe
library(rvest)
# Get the HTML document from Wikipedia
wikipedia_page <- read_html('https://en.wikipedia.org/wiki/Varigotti')
# Parse the document and extract the elevation from it
wikipedia_page %>% 
  html_elements('table tr:nth-child(9) > td') %>% 
  html_text()

As you have learned in the video, read_html() actually issues an HTTP GET request if provided with a URL, like in this case.

The goal of this exercise is to replicate the same query without read_html(), but with httr methods instead.

Note: Usually rvest does the job, but if you want to customize requests, as you'll be shown later in this chapter, you'll need to know the httr way.

As a refresher, you'll also translate the CSS selector used in html_elements() into an XPath query.
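
To illustrate what such a translation can look like, here is a minimal sketch on a made-up two-row table (not the real Wikipedia page). The table contents, the value "123 m" and the variable names are placeholders chosen for this example. It assumes rvest 1.0 or later, which provides minimal_html() and html_elements().

```r
library(rvest)

# Hypothetical miniature table standing in for an infobox (placeholder values)
toy_page <- minimal_html('
  <table>
    <tr><th>Country</th><td>Italy</td></tr>
    <tr><th>Elevation</th><td>123 m</td></tr>
  </table>')

# CSS: the td that is a direct child of the second row
css_result <- toy_page %>%
  html_elements('table tr:nth-child(2) > td') %>%
  html_text()

# XPath: "//" mirrors the descendant combinator, "/" the child combinator
xpath_result <- toy_page %>%
  html_elements(xpath = '//table//tr[position() = 2]/td') %>%
  html_text()

# Both queries should select the same cell
identical(css_result, xpath_result)
```

Note that tr[position() = 2] counts only tr siblings, while :nth-child(2) counts all sibling elements. The two agree here because every child of the table is a tr.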

Instructions 1/2

  • Use only httr functions to replicate the behavior of read_html(), including getting the response from Wikipedia and parsing the response object into an HTML document.
  • Check the resulting HTTP status code with the appropriate httr function.
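
One possible way to approach these two steps is sketched below. It assumes network access and that the httr package is installed; the variable names are illustrative only. httr's content() parses the body according to the response's content type, so for an HTML page it should return a parsed document comparable to the one read_html() produces.

```r
library(httr)

# Issue the GET request explicitly instead of letting read_html() do it
wikipedia_response <- GET('https://en.wikipedia.org/wiki/Varigotti')

# Parse the response body; for text/html this yields an HTML document
wikipedia_page <- content(wikipedia_response)

# Check the HTTP status code (a value of 200 would indicate success)
status_code(wikipedia_response)
```

The resulting wikipedia_page can then be passed to html_elements() just like the output of read_html().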