Building a data pipeline
Using factory methods not only makes code easier to read, it also makes it easier to use. In this exercise, you'll practice creating a data pipeline that extracts data from a database. The DataPipeline class, shown below, implements the factory method design pattern. Also defined for you are two concrete products of the Database class: Postgres and Redshift.
class DataPipeline:
    def _get_database(self, provider):
        if provider == "Postgres":
            return Postgres()
        elif provider == "Redshift":
            return Redshift()

    def extract_data(self, provider, query):
        database = self._get_database(provider)
        dataset = database.query_data(query)
        print(f"Extracted dataset from {provider} database")
        return dataset
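The Postgres and Redshift products themselves aren't shown in the listing above. A minimal sketch of what they might look like is below; the class names and the query_data method match the code above, but the bodies are hypothetical stand-ins (a real implementation would execute the query against an actual database connection):

```python
class Database:
    """Abstract product: a queryable database."""
    def query_data(self, query):
        raise NotImplementedError

class Postgres(Database):
    """Concrete product for a Postgres database."""
    def query_data(self, query):
        # Placeholder result; a real version would run the
        # query over a Postgres connection and return rows.
        return f"Postgres rows for: {query}"

class Redshift(Database):
    """Concrete product for a Redshift database."""
    def query_data(self, query):
        # Placeholder result, as above.
        return f"Redshift rows for: {query}"
```

Because both products share the query_data interface, DataPipeline.extract_data can work with either one without knowing which concrete class the factory method returned.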
This exercise is part of the course Intermediate Object-Oriented Programming in Python.
Exercise instructions
- Create an items_pipeline using the DataPipeline class; extract a dataset from a "Redshift" database with the query "SELECT * FROM items;".
- Update the items_pipeline to pull from a "Postgres" database instead, using the same query as before.
- Create an etl_pipeline that extracts data from "Redshift".
Hands-on interactive exercise
Have a go at this exercise by completing the sample code below.
# Create an ETL DataPipeline, query using Redshift
items_pipeline = ____()
____.extract_data("____", "SELECT * FROM items;")
# Now, switch the pipeline to Postgres
____
# Finally, create an etl_pipeline with Redshift
____ = ____()
____.____("____", "SELECT * FROM sales;")