Starting with Start Requests
In the last lesson, we learned about setting up the start_requests method within a scrapy spider. Here we have another toy spider which doesn't actually scrape anything, but gives you a chance to play with the start_requests method. The goal is for you to become familiar with the arguments you pass into the scrapy.Request call within start_requests.
As before, we have created the function inspect_class to examine what you are yielding in start_requests.
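If you would like to see those arguments in isolation first, note that scrapy.Request is an ordinary class you can construct directly. Below is a minimal sketch, using a placeholder URL and a stand-in callback function (both are illustrative, not part of the exercise).

import scrapy

def handle(response):
    # Stand-in for a spider's parse method (hypothetical name)
    pass

# Build a Request directly to look at the two arguments this exercise uses:
# the url to fetch and the callback that should parse the response.
request = scrapy.Request(url="https://www.example.com", callback=handle)
print(request.url)       # https://www.example.com
print(request.callback)  # the handle function defined above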
This exercise is part of the course Web Scraping in Python.
Exercise instructions
- Fill in the required scrapy object into the class YourSpider needed to create the scrapy spider.
- Fill in the blank in the yielded scrapy.Request call within the start_requests method so that this spider would start scraping "https://www.datacamp.com" and would use the parse method (within the YourSpider class) to parse the website.
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# Import scrapy library
import scrapy
# Create the spider class
class YourSpider( ____ ):
    name = "your_spider"
    # start_requests method
    def start_requests( self ):
        yield scrapy.Request( ____ )
    # parse method
    def parse( self, response ):
        pass
# Inspect Your Class
inspect_class( YourSpider )
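For reference, here is one way the blanks could be filled in, following the instructions above; treat it as a sketch of the intended solution rather than the only accepted answer. Note that inspect_class is supplied by the exercise environment.

# Import scrapy library
import scrapy

# Create the spider class by inheriting from scrapy.Spider
class YourSpider(scrapy.Spider):
    name = "your_spider"

    # start_requests method: request the DataCamp homepage and hand the
    # response to the parse method defined below
    def start_requests(self):
        yield scrapy.Request(url="https://www.datacamp.com",
                             callback=self.parse)

    # parse method: this toy spider does not extract anything
    def parse(self, response):
        pass

# Inspect Your Class (inspect_class is defined by the exercise environment)
inspect_class(YourSpider)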