Exercise

Using CSS to scrape nodes

As mentioned in the video, CSS is a way to add design information to HTML that instructs the browser on how to display the content. You can leverage these design instructions to identify content on the page.

You've already used html_node(), which returns only the first match. With CSS selectors it's more common to use html_nodes(), since you'll often want more than one node returned. Both functions accept a css argument for a CSS selector, as an alternative to the xpath argument.
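
As a rough illustration of the difference, the sketch below runs both functions on a small made-up page. The page variable and its HTML are assumptions for demonstration only; they are not the exercise's test_xml.

```r
library(rvest)

# A tiny hypothetical page, standing in for a real document
page <- read_html("<html><body><p>first</p><p>second</p></body></html>")

# html_node() returns only the first element matching the selector
first_p <- html_node(page, css = "p")

# html_nodes() returns every element matching the selector
all_p <- html_nodes(page, css = "p")

# The same selection expressed with the xpath argument instead
all_p_xpath <- html_nodes(page, xpath = "//p")

html_text(first_p)
html_text(all_p)
```

Here the css and xpath calls should select the same two paragraphs; the CSS form is usually shorter to write for tag, class, and id lookups.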

What do CSS selectors look like? Try these examples to see a few possibilities.

Instructions

We've read in the same HTML page from Chapter 4, the Wikipedia page for Hadley Wickham, into test_xml.

  • Use the CSS selector "table" to select all elements that are a table tag.
  • Use the CSS selector ".infobox" to select all elements whose class attribute includes "infobox".
  • Use the CSS selector "#firstHeading" to select the element that has the attribute id = "firstHeading".
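
One possible way to try the three selector types is sketched below. To stay self-contained it uses a hypothetical stand-in document, toy_xml, whose markup only loosely imitates a Wikipedia page; in the exercise you would pass test_xml instead.

```r
library(rvest)

# Hypothetical stand-in for test_xml (not the real Wikipedia page)
toy_xml <- read_html('<html><body>
  <h1 id="firstHeading">Hadley Wickham</h1>
  <table class="infobox vcard"><tr><td>Infobox row</td></tr></table>
  <table class="other"><tr><td>Another table</td></tr></table>
</body></html>')

# Tag selector: every <table> element
tables <- html_nodes(toy_xml, css = "table")

# Class selector: elements whose class attribute includes "infobox"
infobox <- html_nodes(toy_xml, css = ".infobox")

# Id selector: the element with id = "firstHeading"
heading <- html_nodes(toy_xml, css = "#firstHeading")

length(tables)      # 2 tables in this toy document
length(infobox)     # 1, matched even though its class is "infobox vcard"
html_text(heading)  # "Hadley Wickham"
```

Note that the class selector matches on any one of an element's space-separated classes, which is why the table with class "infobox vcard" is still selected.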