Get Started

Counting distinct data

As mentioned in the video, SQLAlchemy's func module provides access to built-in SQL functions that can make operations like counting and summing faster and more efficient.

In the video, Jason used func.sum() to get a sum of the pop2008 column of census as shown below:

select([func.sum(census.columns.pop2008)])

If instead you want to count the number of values in pop2008, you could use func.count() like this:

select([func.count(census.columns.pop2008)])

Furthermore, if you only want to count the distinct values of pop2008, you can use the .distinct() method:

select([func.count(census.columns.pop2008.distinct())])

In this exercise, you will practice using func.count() and .distinct() to get a count of the distinct number of states in census.

So far, you've seen .fetchall(), .fetchmany(), and .first() used on a ResultProxy to get the results. The ResultProxy also has a method called .scalar() for getting just the value of a query that returns only one row and column.

This can be very useful when you are querying for just a count or sum.

This is a part of the course

“Introduction to Databases in Python”

View Course

Exercise instructions

  • Build a select statement to count the distinct values in the state field of census.
  • Execute stmt to get the count and store the results as distinct_state_count.
  • Print the value of distinct_state_count.

Hands-on interactive exercise

Have a go at this exercise by completing this sample code.

# Build a query to count the distinct states values: stmt
stmt = select([____])

# Execute the query and store the scalar result: distinct_state_count
distinct_state_count = connection.execute(____).scalar()

# Print the distinct_state_count
print(____)

This exercise is part of the course

Introduction to Databases in Python

IntermediateSkill Level
4.4+
14 reviews

In this course, you'll learn the basics of relational databases and how to interact with them.

In this chapter, you will build on your database knowledge by writing more nuanced queries that allow you to filter, order, and count your data—all within the Pythonic framework provided by SQLAlchemy.

Exercise 1: Filtering and targeting dataExercise 2: Connecting to a PostgreSQL databaseExercise 3: Filter data selected from a Table - SimpleExercise 4: Filter data selected from a Table - ExpressionsExercise 5: Filter data selected from a Table - AdvancedExercise 6: Ordering query resultsExercise 7: Ordering by a single columnExercise 8: Ordering in descending order by a single columnExercise 9: Ordering by multiple columnsExercise 10: Counting, summing, and grouping dataExercise 11: Counting distinct data
Exercise 12: Count of records by stateExercise 13: Determining the population sum by stateExercise 14: SQLAlchemy and pandas for visualizationExercise 15: ResultsSets and pandas DataFramesExercise 16: From SQLAlchemy results to a plot

What is DataCamp?

Learn the data skills you need online at your own pace—from non-coding essentials to data science and machine learning.

Start Learning for Free