Get Started

Counting distinct data

As mentioned in the video, SQLAlchemy's func module provides access to built-in SQL functions that can make operations like counting and summing faster and more efficient.

In the video, Jason used func.sum() to get a sum of the pop2008 column of census as shown below:

select([func.sum(census.columns.pop2008)])

If instead you want to count the number of values in pop2008, you could use func.count() like this:

select([func.count(census.columns.pop2008)])

Furthermore, if you only want to count the distinct values of pop2008, you can use the .distinct() method:

select([func.count(census.columns.pop2008.distinct())])

In this exercise, you will practice using func.count() and .distinct() to get a count of the distinct number of states in census.

So far, you've seen .fetchall(), .fetchmany(), and .first() used on a ResultProxy to get the results. The ResultProxy also has a method called .scalar() for getting just the value of a query that returns only one row and column.

This can be very useful when you are querying for just a count or sum.

This is a part of the course

“Introduction to Databases in Python”

View Course

Exercise instructions

  • Build a select statement to count the distinct values in the state field of census.
  • Execute stmt to get the count and store the results as distinct_state_count.
  • Print the value of distinct_state_count.

Hands-on interactive exercise

Have a go at this exercise by completing this sample code.

# Build a query to count the distinct states values: stmt
stmt = select([____])

# Execute the query and store the scalar result: distinct_state_count
distinct_state_count = connection.execute(____).scalar()

# Print the distinct_state_count
print(____)
Edit and Run Code