Getting distinct values

Sometimes an analysis doesn't need every record, but rather unique values in one or more columns. Duplicate values can be removed after loading data into a dataframe, but it can also be done at import with SQL's DISTINCT keyword.

Since hpd311calls contains data about housing issues, we would expect most records to have a borough listed. Let's test this assumption by querying unique complaint_type/borough combinations.

pandas has been imported as pd, and the database engine has been created as engine.

Note: The SQL checker is quite picky about column positions and expects fields to be selected in the specified order.

Create a query that gets DISTINCT values for borough and complaint_type (in that order) from hpd311calls.
Use read_sql() to load the results of the query to a dataframe, issues_and_boros.
Print the dataframe to check if the assumption that all issues besides literature requests appear with boroughs listed.

Exercise

Getting distinct values

Instructions

.css-6su6fj{-webkit-flex-shrink:0;-ms-flex-negative:0;flex-shrink:0;}Exercise

Instructions

Exercise