1. Finding maximum levels of overlap
As we wrap up this course, I want to cover a more complicated scenario and show that the best result in SQL is not necessarily the most straightforward.
2. Start with some data
Suppose we have a store where we keep track of when people enter and leave as well as the number of products they order. We would like to make use of this data to determine how many staff we should have in our stores. The way we will determine this is to count the maximum number of people in the store at any one time.
3. Reasoning through the problem
For example, the first two customers are in the store at the same time from 3:35 PM until 4:01 PM. The first customer then leaves before the third customer arrives, so the maximum at that point is 2.
4. Reasoning through the problem
Over the entire set, the maximum number of customers is three. The first of the three customers enters at 4:35 PM and the last enters at 5:55 before that customer leaves at 5:57 PM.
5. Algorithm, step 1
To perform this operation efficiently in T-SQL, the first thing we want to do is break up our start and end times into separate rows so that we have an event per entrance and an event per exit. We will add two more columns: entry count and start ordinal. Entry count helps us keep track of the number of people in the store at a given time and decrements whenever a person leaves. Start ordinal gives us the order of entry, so it will be NULL for any exit.
6. Algorithm, step 1
And here is how the results look. To make best use of screen space, I removed the dates from these because all events are on the same day.
We'll put this into a common table expression which I'll call StartStopPoints, as it represents the starting and stopping points for each customer visit.
7. Algorithm, step 2
This next common table expression, which I will call StartStopOrder, takes each of our start and end times in the first query and adds a new ordinal value arranging when people leave and enter. We order by time of entry and then by start ordinal.
Ordering by start ordinal is important because we have exits marked as NULL values, so they will sort before the entrances. That way, if a person walks out the door exactly when another person walks in the door, we don't say there were two people in--we decrement the counter for the person leaving and then increment the counter for the person entering.
8. Algorithm, step 2
The StartEndOrdinal value gives us an ordering of the order in which people entered and left the store. Already we can piece together when there will be multiple people in the store: if we see positive entry counts start to outnumber negative entry counts, we have more people in the store.
9. Algorithm, step 2
A brute force technique might be to sum the value of EntryCount using a running total, which we have seen earlier in this chapter. For example, after row 6, we have 2 people in the store: 4 entrances and 2 exits. This works, but there is a better way.
10. Algorithm, step 3
We will have two StartEndOrdinal`rows for every StartOrdinal row, so if we double the StartOrdinal value and subtract from it the StartEndOrdinal value, that leaves us with the number of people in the store at any given time. Here are the first results.
11. Algorithm, step 3
And putting this into SQL, we get back our final total of 3 concurrent visitors.
12. Let's practice!
The following exercises will help you understand and apply the algorithm, so let's jump to it.