Starting with Start Requests
In the last lesson we learned about setting up the start_requests method within a scrapy spider. Here we have another toy-model spider which doesn't actually scrape anything, but it gives you a chance to play with the start_requests method. What we want is for you to start becoming familiar with the arguments you pass into the scrapy.Request call within start_requests.
As before, we have created the function inspect_class to examine what you are yielding in start_requests.
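As a reminder of the argument pattern (not the solution to this exercise), a scrapy.Request call inside start_requests usually passes the url to fetch and a callback method that will parse the response. The URL below is a placeholder, not the one used in this exercise.
# Minimal sketch of the scrapy.Request argument pattern (placeholder URL)
def start_requests( self ):
  yield scrapy.Request( url = "https://www.example.com", callback = self.parse )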
This exercise is part of the course Web Scraping in Python.
Exercise instructions
- Fill in the required scrapy object in the class YourSpider needed to create the scrapy spider.
- Fill in the blank in the yielded scrapy.Request call within the start_requests method so that this spider starts scraping at "https://www.datacamp.com" and uses the parse method (within the YourSpider class) to parse the website.
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# Import scrapy library
import scrapy

# Create the spider class
class YourSpider( ____ ):
  name = "your_spider"

  # start_requests method
  def start_requests( self ):
    yield scrapy.Request( ____ )

  # parse method
  def parse( self, response ):
    pass

# Inspect Your Class
inspect_class( YourSpider )
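If you want to check your work after filling in the blanks, one possible completed version is sketched below. It assumes inspect_class is supplied by the exercise environment; outside that environment you would drop the final call.
# Import scrapy library
import scrapy

# Create the spider class, inheriting from scrapy.Spider
class YourSpider( scrapy.Spider ):
  name = "your_spider"

  # start_requests method: yield the first request the spider makes
  def start_requests( self ):
    yield scrapy.Request( url = "https://www.datacamp.com", callback = self.parse )

  # parse method: handles the response (left empty in this toy spider)
  def parse( self, response ):
    pass

# Inspect Your Class (helper assumed to be provided by the exercise)
inspect_class( YourSpider )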