Get to know the position() function
As you saw in the video, the position() function is very powerful when used within a predicate. Combined with comparison operators (for example = or >), it lets you select practically any node from those that match a certain path.
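To make this concrete, here is a minimal sketch of position() inside a predicate. It assumes the rvest package is installed; list_html and its contents are hypothetical and not part of the exercise data.

```r
library(rvest)

# Hypothetical HTML, made up for illustration only
list_html <- minimal_html("
  <ol>
    <li>one</li>
    <li>two</li>
    <li>three</li>
    <li>four</li>
  </ol>
")

# Exactly the second li among its li siblings
second <- list_html %>%
  html_elements(xpath = "//li[position() = 2]") %>%
  html_text()
second
# "two"

# Every li after the second one
after_second <- list_html %>%
  html_elements(xpath = "//li[position() > 2]") %>%
  html_text()
after_second
# "three" "four"

# Conditions can be combined with "and" / "or"
middle <- list_html %>%
  html_elements(xpath = "//li[position() != 1 and position() < 4]") %>%
  html_text()
middle
# "two" "three"
```

Note that position() counts only the nodes matched by the step it is attached to (here the li elements), not every child of the parent.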
You'll try this out with the following HTML excerpt, which is available to you via rules_html. Let's assume this is a continuously updated website that displays certain Coronavirus rules for a given day and the day after.
...
<div>
<h2>Today's rules</h2>
<p>Wear a mask</p>
<p>Wash your hands</p>
</div>
<div>
<h2>Tomorrow's rules</h2>
<p>Wear a mask</p>
<p>Wash your hands</p>
<small>Bring hand sanitizer with you</small>
</div>
...
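If you want to experiment outside the exercise environment, the visible part of the excerpt can be rebuilt locally. This is only a sketch: rules_local is a hypothetical name, it assumes rvest is installed, and it contains just the two div blocks shown above, since the content elided by "..." is unknown.

```r
library(rvest)

# Local stand-in for the visible part of the excerpt (assumption:
# the real rules_html may contain more markup around these divs)
rules_local <- minimal_html("
  <div>
    <h2>Today's rules</h2>
    <p>Wear a mask</p>
    <p>Wash your hands</p>
  </div>
  <div>
    <h2>Tomorrow's rules</h2>
    <p>Wear a mask</p>
    <p>Wash your hands</p>
    <small>Bring hand sanitizer with you</small>
  </div>
")

# First p in every div: position() is counted among the p children,
# so the preceding h2 does not shift the numbering
first_p <- rules_local %>%
  html_elements(xpath = "//div/p[position() = 1]") %>%
  html_text()
first_p
# "Wear a mask" "Wear a mask"

# Any child beyond the third one: only the second div has a fourth child
extra <- rules_local %>%
  html_elements(xpath = "//div/*[position() > 3]") %>%
  html_text()
extra
# "Bring hand sanitizer with you"
```

The second query uses * instead of p, so position() there is counted across all child elements of each div rather than only the paragraphs.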
This exercise is part of the course Web Scraping in R.
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# Select the text of the second p in every div
rules_html %>%
  html_elements(xpath = ___) %>%
  ___