Introduction to CSS

1. Introduction to CSS

The second most important web technology you need to know for scraping is CSS.

2. Cascading Style Sheets

With CSS, or Cascading Style Sheets, a website can be styled according to a certain specification. CSS rules specify how the HTML is rendered in the browser, for example, how big the fonts should be. Let's look at an example. On the left side, you see both the CSS and the HTML that were specified to render the simple page on the right. You already know the HTML part. With CSS, a style can be specified for any element on that page. Here, two styles are defined: one that concerns all h1 tags on the page, and another that concerns all paragraph elements. The syntax is simple: a selector is followed by curly braces that contain the styles. A style is always a key-value pair (in CSS terms, a property and its value). There are many possible keys for all sorts of style attributes. You can find all possible keys and their values in this reference.
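Since the slide itself isn't reproduced here, the following is a minimal sketch of what such a page could look like. The style rules, property values, and text are assumptions made up for illustration. The page is held in an R string so that it can be parsed with rvest, and the stylesheet is printed back out to show the selector, curly braces, and key-value syntax.

```r
library(rvest)

# Hypothetical page: the style rules and the text are illustrative assumptions.
page <- read_html("
<html>
  <head>
    <style>
      h1 { font-size: 32px; }
      p  { color: gray; }
    </style>
  </head>
  <body>
    <h1>A simple page</h1>
    <p>Some text in a paragraph.</p>
  </body>
</html>")

# Pull the stylesheet back out of the parsed document as plain text.
css_text <- page %>% html_element("style") %>% html_text()
cat(css_text)
```

Each rule starts with a selector (h1 or p), followed by curly braces that wrap one or more key-value pairs such as font-size: 32px.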

3. CSS selectors

However, we're not so much interested in the actual style definitions but rather in the way these definitions are matched with their counterparts in the HTML document. You might have already guessed it: that's the job of the selectors in front of the style definitions. And these selectors are also the ones we normally use for scraping with rvest. Just as the selector h1, p (h1 comma p) is used to give all the text elements in this example a sans-serif font, it can be used to query these elements with the html_elements() function. This is not a coincidence: CSS needs to give web developers a way to define styles on a per-element basis throughout very complex websites. So it lends itself perfectly to scraping: there, too, you need to tell the scraper exactly what it should query and download, and what not.
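As a rough illustration of that idea, the sketch below passes the same selector, h1, p, to html_elements(). The HTML string is a made-up stand-in for the example page, not the one from the slide.

```r
library(rvest)

# Hypothetical stand-in for the example page.
page <- read_html("<html><body>
  <h1>A simple page</h1>
  <p>First paragraph.</p>
  <p>Second paragraph.</p>
</body></html>")

# The selector a stylesheet would use to style all text elements
# also works as a query for those same elements.
text_elements <- page %>% html_elements("h1, p")

# Extract the text of every matched element.
text_elements %>% html_text()
```

The query matches the heading and both paragraphs, the same elements the CSS rule would have styled.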

4. Type selectors

So far you have only seen the type selector. It is merely the name of the HTML tag that a specific style should apply to. The corresponding rvest code for selecting elements by type looks like this. It's familiar, isn't it? Earlier in this video, you saw that several types can also be listed one after another, separated by commas. The style definition is then applied to every specified HTML element. Lastly, there's the so-called universal selector, written with an asterisk. Styles defined here are applied to every HTML element on the page. Throughout the rest of this chapter, you'll get to know many different types of CSS selectors. And you'll learn how to combine them to come up with even more specific queries.
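A short sketch of the type selector and the universal selector in rvest might look as follows. The HTML is again a hypothetical example, and the variable names are chosen only for illustration.

```r
library(rvest)

# Hypothetical page with a heading, a paragraph, and a nested paragraph.
page <- read_html("<html><body>
  <h1>Title</h1>
  <p>One</p>
  <div><p>Two</p></div>
</body></html>")

# Type selector: matches every element with this tag name.
paragraphs <- page %>% html_elements("p")
paragraphs %>% html_text()

# Universal selector: matches every element in the document,
# including structural ones such as html and body.
everything <- page %>% html_elements("*")
element_names <- everything %>% html_name()
element_names
```

Note that the universal selector also returns wrapper elements, so for scraping it is usually combined with other selectors rather than used alone.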

5. Let's do this!

Okay, let's look at some CSS!