Selecting columns with SQL
Datasets can contain columns that are not required for an analysis, like the weather
table in data.db
does. Some, such as elevation, are redundant, since all observations occurred at the same place, while others contain variables we are not interested in. After making a database engine, you'll write a query to SELECT
only the date and temperature columns, and pass both to read_sql()
to make a dataframe of high and low temperature readings.
pandas
has been loaded as pd
, and create_engine()
has been imported from sqlalchemy
.
Note: The SQL checker is quite picky about column positions and expects fields to be selected in the specified order.
Diese Übung ist Teil des Kurses
Streamlined Data Ingestion with pandas
Anleitung zur Übung
- Create a database engine for
data.db
. - Write a SQL query that
SELECT
s thedate
,tmax
, andtmin
columns from theweather
table. - Make a dataframe by passing the query and engine to
read_sql()
and assign the resulting dataframe totemperatures
.
Interaktive Übung
Versuche dich an dieser Übung, indem du diesen Beispielcode vervollständigst.
# Create database engine for data.db
engine = ____
# Write query to get date, tmax, and tmin from weather
query = """
SELECT ____,
____,
____
FROM ____;
"""
# Make a dataframe by passing query and engine to read_sql()
temperatures = ____
# View the resulting dataframe
print(temperatures)