1. CSS classes and IDs
Using type selectors might be enough for very simple web sites with only basic styling.
However, modern web sites are much more complex and usually have hundreds, if not thousands, of HTML elements. Therefore, classes and IDs were introduced to better identify certain parts of a page.
2. Classes
With classes, HTML elements can be categorized into certain style groups. Every element with the same class receives the same styles, as defined in the CSS. This holds even if the HTML elements are of different types. For instance, one could define a class "alert" with a red font and apply it to whole divs, but also to individual links.
The class is just another attribute of an HTML element, analogous to the href attribute of an a element. In CSS, classes are specified with a dot, and this of course also applies to selecting them with rvest.
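To make the dot notation concrete, here is a minimal sketch with rvest. The HTML snippet, its texts, and the variable names are made up for illustration, and it assumes rvest 1.0 or later, where html_elements() is available.

```r
library(rvest)

# Hypothetical page: a div and a link share the "alert" class.
html <- read_html('
<html><body>
  <div class="alert">Disk almost full</div>
  <p>See the <a class="alert" href="/logs">logs</a> or the <a href="/help">help</a>.</p>
</body></html>')

# ".alert" matches every element with that class, regardless of its type.
alerts <- html_elements(html, ".alert")
alert_text <- html_text(alerts)
```

Here, both the div and the first link should be matched, while the link without the class is ignored.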
3. Selecting multiple classes at once
Consider that an HTML element can have more than one class. That allows you to combine different style groups.
With rvest, you query an element with multiple classes as follows: you write the selector for the first class and directly append the selector for the other class, without a space, for example .alert.emph. Note that putting a comma between the two class selectors does something else: it selects every element that has the first class or the second class, not only the elements that have both at the same time. In this example, only the element that has both classes "alert" and "emph" is selected.
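The difference between chaining two class selectors and separating them with a comma can be sketched like this. The three paragraphs are a made-up example, and html_elements() again assumes rvest 1.0 or later.

```r
library(rvest)

# Hypothetical page: one element per class, plus one element with both.
html <- read_html('
<html><body>
  <p class="alert">Only alert</p>
  <p class="emph">Only emph</p>
  <p class="alert emph">Both classes</p>
</body></html>')

# No space, no comma: the element must carry both classes.
both <- html_text(html_elements(html, ".alert.emph"))

# Comma: elements with either class are selected.
either <- html_text(html_elements(html, ".alert, .emph"))
```

With this input, both should contain only the last paragraph, while either should contain all three.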
4. IDs
IDs are something like a special form of classes. They fulfill more or less the same purpose, making HTML nodes identifiable, but in contrast to classes, they should be unique across a page. That means that there should only be one HTML element with a certain ID on a page. In CSS, IDs are written with a pound sign at the beginning.
Likewise, you select the node with a specific ID with the pound sign and the name of the ID.
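As a small sketch of selecting by ID, assuming a made-up page with two divs and rvest 1.0 or later:

```r
library(rvest)

# Hypothetical page: each div has its own unique ID.
html <- read_html('
<html><body>
  <div id="intro">Welcome</div>
  <div id="footer">Contact</div>
</body></html>')

# "#intro" matches the single element whose id attribute is "intro".
intro <- html_elements(html, "#intro")
intro_text <- html_text(intro)
```

Because IDs should be unique on a page, the result is expected to contain exactly one node.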
5. Narrowing the selection down with types
Instead of just selecting everything with a certain class or ID, you can prepend an HTML type to the class or ID selector. This narrows down the scope of the class or ID selector to elements of that type.
In the example here, only the a elements with the "alert" class are selected.
Note that it should not be necessary to do this with ID selectors, as there should only be one element with that ID. However, it certainly enhances the readability of your selector.
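Combining a type with a class or an ID might look like the following sketch. The page content and the ID "special" are invented for illustration, and rvest 1.0 or later is assumed.

```r
library(rvest)

# Hypothetical page: the "alert" class appears on a div and on a link.
html <- read_html('
<html><body>
  <div class="alert">Warning box</div>
  <a class="alert" href="/a">Alert link</a>
  <a href="/b">Plain link</a>
  <div id="special">Special box</div>
</body></html>')

# "a.alert" keeps only a elements that have the "alert" class.
alert_links <- html_text(html_elements(html, "a.alert"))

# "div#special" selects the same node as "#special" would,
# but the selector also states which type is expected.
special <- html_text(html_elements(html, "div#special"))
```

In this example, the div with the "alert" class and the plain link are both left out of alert_links.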
6. Pseudo-classes for selecting specific children
CSS also has a concept called "pseudo-classes". There are many different types, but probably the most important are first-child, last-child, and nth-child. These come in handy if you want to select child nodes at a specific position. Here the li elements are children of an ordered list. First-child and last-child select the first and the last item in the list, respectively. nth-child is a generalization of both and takes the position in the list as an argument.
For example, if you wanted to scrape only the last item in the list, you could use either the last-child pseudo-class or the nth-child pseudo-class with the number 3 as an argument, since the list has three items. Pseudo-classes are normally written directly after the type of the child nodes, separated by a colon, as in li:last-child.
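Here is a minimal sketch of the three pseudo-classes on an ordered list with three items. The list content and variable names are made up, and rvest 1.0 or later is assumed.

```r
library(rvest)

# Hypothetical ordered list with three items.
html <- read_html('
<html><body>
  <ol>
    <li>First</li>
    <li>Second</li>
    <li>Third</li>
  </ol>
</body></html>')

# Type, then a colon, then the pseudo-class.
first_item <- html_text(html_elements(html, "li:first-child"))
last_item  <- html_text(html_elements(html, "li:last-child"))

# nth-child takes the position as an argument; position 3 is the last item here.
third_item <- html_text(html_elements(html, "li:nth-child(3)"))
```

Since this list has exactly three items, last_item and third_item should refer to the same element.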
7. To sum it up...
In this video, you've been introduced to a lot of different CSS selectors. Therefore, here's a compact overview showing the basic selector types and how they can be combined. Of course, there are many more possible combinations.
8. Let's practice!
Okay, let's practice this with some examples.