
Building a data pipeline

Using factory methods not only makes code easier to read, but also easier to use. In this exercise, you'll practice creating a data pipeline that extracts data from a database. The DataPipeline class implements the factory method design pattern and is shown here. Two concrete products of the Database class, Postgres and Redshift, are also defined for you (a rough sketch of their shape follows the DataPipeline code).

class DataPipeline:
  def _get_database(self, provider):
    # Factory method: return the concrete Database product for the requested provider
    if provider == "Postgres":
      return Postgres()
    elif provider == "Redshift":
      return Redshift()

  def extract_data(self, provider, query):
    # Ask the factory method for a database, then run the query against it
    database = self._get_database(provider)
    dataset = database.query_data(query)
    print(f"Extracted dataset from {provider} database")
    return dataset
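
The Postgres and Redshift concrete products are already defined in the exercise environment. As a rough idea of their shape only, they might look something like the minimal sketch below; the query_data bodies here are stand-ins, not the course's actual implementations.

class Database:
  def query_data(self, query):
    # Each concrete product runs the query against its own engine
    raise NotImplementedError

class Postgres(Database):
  def query_data(self, query):
    # Stand-in: a real implementation would connect to Postgres and run the query
    print(f"Running '{query}' on Postgres")
    return []

class Redshift(Database):
  def query_data(self, query):
    # Stand-in: a real implementation would connect to Redshift and run the query
    print(f"Running '{query}' on Redshift")
    return []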

This exercise is part of the course

Intermediate Object-Oriented Programming in Python

Exercise instructions

  • Create an items_pipeline using the DataPipeline class, and extract a dataset from a "Redshift" database with the query SELECT * FROM items;.
  • Update the items_pipeline to pull from a "Postgres" database instead, using the same query as before.
  • Create an etl_pipeline that extracts data from "Redshift" using the query SELECT * FROM sales;.

Hands-on interactive exercise

Have a go at this exercise by completing this sample code.

# Create an items_pipeline, then extract data from Redshift
items_pipeline = ____()
____.extract_data("____", "SELECT * FROM items;")

# Now, switch the pipeline to Postgres
____

# Finally, create an etl_pipeline with Redshift
____ = ____()
____.____("____", "SELECT * FROM sales;")
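
If you want to check your work afterwards, one possible completion is sketched below; it assumes the DataPipeline, Postgres, and Redshift classes shown above are already defined in your session.

# Create an items_pipeline and extract data from Redshift
items_pipeline = DataPipeline()
items_pipeline.extract_data("Redshift", "SELECT * FROM items;")

# Switch the same pipeline to Postgres
items_pipeline.extract_data("Postgres", "SELECT * FROM items;")

# Create an etl_pipeline and extract data from Redshift
etl_pipeline = DataPipeline()
etl_pipeline.extract_data("Redshift", "SELECT * FROM sales;")

Notice that switching providers only changes the string passed to extract_data; the factory method decides which concrete Database class to instantiate.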