Counting distinct data
As mentioned in the video, SQLAlchemy's func
module provides access to built-in
SQL functions that can make operations like counting and summing faster and more efficient.
In the video, Jason used func.sum()
to get a sum of the pop2008
column of census
as shown below:
select([func.sum(census.columns.pop2008)])
If instead you want to count the number of values in pop2008
, you could use func.count()
like this:
select([func.count(census.columns.pop2008)])
Furthermore, if you only want to count the distinct values of pop2008
, you can use the .distinct()
method:
select([func.count(census.columns.pop2008.distinct())])
In this exercise, you will practice using func.count()
and .distinct()
to get a count of the distinct number of states in census
.
So far, you've seen .fetchall()
, .fetchmany()
, and .first()
used on a ResultProxy to get the results. The ResultProxy also has a method called .scalar()
for getting just the value of a query that returns only one row and column.
This can be very useful when you are querying for just a count or sum.
This is a part of the course
“Introduction to Databases in Python”
Exercise instructions
- Build a
select
statement to count the distinct values in thestate
field ofcensus
. - Execute
stmt
to get the count and store the results asdistinct_state_count
. - Print the value of
distinct_state_count
.
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# Build a query to count the distinct states values: stmt
stmt = select([____])
# Execute the query and store the scalar result: distinct_state_count
distinct_state_count = connection.execute(____).scalar()
# Print the distinct_state_count
print(____)