Counting in groups
In previous exercises, you pulled data from tables, then summarized the resulting dataframes in pandas
to create graphs. By using COUNT
and GROUP BY
in a SQL query, we can pull those summary figures from the database directly.
The hpd311calls
table has a column, complaint_type
, that categorizes call records by issue, such as heating or plumbing. In order to graph call volumes by issue, you'll write a SQL query that COUNT
s records by complaint type.
pandas
has been imported as pd
, and the database engine for data.db
has been created as engine
.
This exercise is part of the course
Streamlined Data Ingestion with pandas
Exercise instructions
- Create a SQL query that gets the
complaint_type
column and counts of all records fromhpd311calls
, grouped bycomplaint_type
. - Create a dataframe with
read_sql()
of call counts by issue,calls_by_issue
. - Run the last section of code to graph the number of calls for each housing issue.
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# Create query to get call counts by complaint_type
query = """
____ ____,
____(*)
FROM hpd311calls
____ ____;
"""
# Create dataframe of call counts by issue
calls_by_issue = pd.read_sql(____, ____)
# Graph the number of calls for each housing issue
calls_by_issue.plot.barh(x="complaint_type")
plt.show()