
Capstone Crawler

This exercise gives you a chance to show off what you've learned! In this exercise, you will write the parse function for a spider and then fill in a few blanks to finish off the spider. On the course directory page of DataCamp, each listed course has a title and a short course description. This spider will be used to scrape the course directory to extract the course titles and short course descriptions. You will not need to follow any links this time. Everything you need to know is:

  • The course titles are defined by the text within an h4 element whose class contains the string block__title (double underscore).
  • The short course descriptions are defined by the text within a paragraph p element whose class contains the string block__description (double underscore).
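The "class contains the string" rule above is a substring match on the class attribute; in XPath terms it corresponds to a contains(@class, "block__title") predicate. The sketch below mimics that match with only the Python standard library, so it runs without Scrapy. The sample markup, the full class names (course-block__title, course-block__description), and the helper texts_where_class_contains are all hypothetical and chosen for illustration; the real page may be structured differently.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed stand-in for the course directory markup.
# The real page's class names and nesting are assumptions here.
SAMPLE_HTML = """
<div>
  <div class="course-block">
    <h4 class="course-block__title">Intro to Python</h4>
    <p class="course-block__description">Learn the basics of Python.</p>
  </div>
  <div class="course-block">
    <h4 class="course-block__title">Intro to SQL</h4>
    <p class="course-block__description">Query relational databases.</p>
  </div>
  <h4 class="footer-heading">Not a course</h4>
</div>
"""


def texts_where_class_contains(root, tag, fragment):
    """Return the text of every <tag> element whose class contains `fragment`."""
    return [
        (el.text or "").strip()
        for el in root.iter(tag)
        if fragment in el.get("class", "")  # substring test, like XPath contains()
    ]


root = ET.fromstring(SAMPLE_HTML)
crs_titles = texts_where_class_contains(root, "h4", "block__title")
crs_descrs = texts_where_class_contains(root, "p", "block__description")

print(crs_titles)
print(crs_descrs)
```

The h4 with class footer-heading is skipped because its class does not contain block__title, which is the behavior the two rules rely on.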

This exercise is part of the course

Web Scraping in Python

Hands-on interactive exercise

Try this exercise by completing the sample code below.

# parse method
def parse(self, response):
  # Extracted course titles
  crs_titles = response.xpath(____).extract()
  # Extracted course descriptions
  crs_descrs = response.xpath(____).extract()
  # Fill in the dictionary: it is the spider output
  for crs_title, crs_descr in zip(crs_titles, crs_descrs):
    dc_dict[crs_title] = crs_descr
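The loop in parse pairs titles with descriptions by position, so it implicitly assumes both lists come back in the same order and with the same length. As a small illustration with made-up data (the lists below are hypothetical, and dc_dict is assumed to be a plain dictionary defined outside parse), zip stops at the shorter list without raising an error:

```python
# Hypothetical extraction results; the lists differ in length on purpose.
crs_titles = ["Intro to Python", "Intro to SQL", "Intro to R"]
crs_descrs = ["Learn the basics of Python.", "Query relational databases."]

# Assumed to mirror the dictionary the spider fills in.
dc_dict = {}
for crs_title, crs_descr in zip(crs_titles, crs_descrs):
    dc_dict[crs_title] = crs_descr

# "Intro to R" has no matching description, so zip never reaches it.
print(dc_dict)
```

If the two extracted lists could differ in length on a real page, comparing len(crs_titles) with len(crs_descrs) before the loop is one simple way to notice the mismatch.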