Counting distinct data
As mentioned in the video, SQLAlchemy's func
module provides access to built-in
SQL functions that can make operations like counting and summing faster and more efficient.
In the video, Jason used func.sum()
to get a sum of the pop2008
column of census
as shown below:
select([func.sum(census.columns.pop2008)])
If instead you want to count the number of values in pop2008
, you could use func.count()
like this:
select([func.count(census.columns.pop2008)])
Furthermore, if you only want to count the distinct values of pop2008
, you can use the .distinct()
method:
select([func.count(census.columns.pop2008.distinct())])
In this exercise, you will practice using func.count()
and .distinct()
to get a count of the distinct number of states in census
.
So far, you've seen .fetchall()
, .fetchmany()
, and .first()
used on a ResultProxy to get the results. The ResultProxy also has a method called .scalar()
for getting just the value of a query that returns only one row and column.
This can be very useful when you are querying for just a count or sum.
This is a part of the course
“Introduction to Databases in Python”
Exercise instructions
- Build a
select
statement to count the distinct values in thestate
field ofcensus
. - Execute
stmt
to get the count and store the results asdistinct_state_count
. - Print the value of
distinct_state_count
.
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# Build a query to count the distinct states values: stmt
stmt = select([____])
# Execute the query and store the scalar result: distinct_state_count
distinct_state_count = connection.execute(____).scalar()
# Print the distinct_state_count
print(____)
This exercise is part of the course
Introduction to Databases in Python
In this course, you'll learn the basics of relational databases and how to interact with them.
In this chapter, you will build on your database knowledge by writing more nuanced queries that allow you to filter, order, and count your data—all within the Pythonic framework provided by SQLAlchemy.
Exercise 1: Filtering and targeting dataExercise 2: Connecting to a PostgreSQL databaseExercise 3: Filter data selected from a Table - SimpleExercise 4: Filter data selected from a Table - ExpressionsExercise 5: Filter data selected from a Table - AdvancedExercise 6: Ordering query resultsExercise 7: Ordering by a single columnExercise 8: Ordering in descending order by a single columnExercise 9: Ordering by multiple columnsExercise 10: Counting, summing, and grouping dataExercise 11: Counting distinct dataExercise 12: Count of records by stateExercise 13: Determining the population sum by stateExercise 14: SQLAlchemy and pandas for visualizationExercise 15: ResultsSets and pandas DataFramesExercise 16: From SQLAlchemy results to a plotWhat is DataCamp?
Learn the data skills you need online at your own pace—from non-coding essentials to data science and machine learning.