1. Aggregating events
Often, process data come with a long list of different activities. A first strategy to reduce that number is to subset events, as we saw in the previous lesson and exercises. Another strategy is to abstract away from overly detailed activities by aggregating them.
2. Aggregation types
In this lesson, we’ll learn about two different aggregation types: the is-a aggregation, and the part-of aggregation.
3. Is-a aggregation
The is-a aggregation occurs when there are slightly different activity labels which, at some level, all relate to the same activity. Suppose we are looking at an auditing process. We may have the activities “Normal automated check”, “Advanced Automated check”, and “Extra automated checks”. While the distinction might be useful for a certain analysis, suppose we are fairly comfortable with just calling these activities “Automated check”.
This aggregation is called an is-a aggregation because each of the detailed activities is an Automated check. Performing this aggregation can be done with the act_unite function, whose name reflects that we are about to unite several activities under a single, all-encompassing label.
4. Is-a aggregation
In the act_unite function, for each aggregation, we specify a vector of activity labels containing the detailed activities, and we name this vector with the new activity label.
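To make the effect of an is-a aggregation concrete, here is a minimal conceptual sketch in plain Python. It is not the act_unite API; it only mimics the idea of mapping a new label to a vector of detailed labels. The event tuples, the `unite` helper, and the "Manual review" label are assumptions made up for illustration.

```python
# Conceptual sketch only: plain Python, not the act_unite function itself.
# Each event is (case_id, activity_instance_id, activity_label).
events = [
    ("case1", 1, "Normal automated check"),
    ("case1", 2, "Advanced Automated check"),
    ("case2", 3, "Extra automated checks"),
    ("case2", 4, "Manual review"),  # hypothetical label, not part of the aggregation
]

# New label -> detailed labels it replaces (mirrors the named-vector notation).
unite_spec = {
    "Automated check": [
        "Normal automated check",
        "Advanced Automated check",
        "Extra automated checks",
    ],
}


def unite(events, spec):
    """Relabel each detailed activity with its new, more general label."""
    lookup = {old: new for new, olds in spec.items() for old in olds}
    return [(case, inst, lookup.get(label, label)) for case, inst, label in events]


united = unite(events, unite_spec)
for event in united:
    print(event)
```

Only the labels change in this sketch: every event keeps its own activity instance, so the list has the same length before and after.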
5. Part-of aggregation
On the other hand, the part-of aggregation occurs when there is a set of activities which are clearly different from each other, but which are nonetheless related as parts of a single, higher-level activity.
For example, in the claims management process, we have the activities “Start Investigation”, “Appoint Lawyer”, “Appoint Expert”, “Receive Conclusion” and “Decision”. While each of these is clearly a very different activity, they are all part of what we could call the “Investigation”. We can refer to this situation as a subprocess, which we want to collapse.
6. Part-of aggregation
This collapse can be done using the act_collapse function, where we use the same notation as before: a vector containing the old activities, which we name with the new activity name.
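The following plain-Python sketch illustrates what collapsing a subprocess means conceptually. It is not the act_collapse API, and it makes a simplifying assumption: within a case, all events belonging to the subprocess are merged into one instance placed where the first of them occurred. The `collapse` helper and the "Register Claim" and "Pay Claim" labels are hypothetical.

```python
# Conceptual sketch only: plain Python, not the act_collapse function itself.
# New label -> the activities that make up the subprocess.
collapse_spec = {
    "Investigation": [
        "Start Investigation",
        "Appoint Lawyer",
        "Appoint Expert",
        "Receive Conclusion",
        "Decision",
    ],
}

# (case_id, activity_label), listed in order of occurrence.
events = [
    ("claim1", "Register Claim"),  # hypothetical label
    ("claim1", "Start Investigation"),
    ("claim1", "Appoint Lawyer"),
    ("claim1", "Receive Conclusion"),
    ("claim1", "Decision"),
    ("claim1", "Pay Claim"),  # hypothetical label
]


def collapse(events, spec):
    """Merge all subprocess events of a case into one instance of the new activity."""
    lookup = {old: new for new, olds in spec.items() for old in olds}
    result = []
    seen = set()  # (case, new_label) pairs that were already emitted
    for case, label in events:
        new_label = lookup.get(label)
        if new_label is None:
            result.append((case, label))  # not part of any subprocess
        elif (case, new_label) not in seen:
            seen.add((case, new_label))
            result.append((case, new_label))  # first subprocess event of this case
        # later subprocess events of the same case are absorbed
    return result


collapsed = collapse(events, collapse_spec)
print(collapsed)
```

Unlike the relabelling of an is-a aggregation, the result here is shorter than the input, because several events end up in a single instance of the new activity.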
Note that for both act_unite and act_collapse, it is perfectly possible to provide multiple vectors, in case there are multiple aggregations to be performed. In such a situation, the aggregations will be performed in the order in which they are listed.
7. Impact on Activity Types and Instances
The is-a aggregation and the part-of aggregation both have an impact on the number of activity types, but only one of them changes the number of activity instances.
In each aggregation, the number of activity types decreases depending on how many detailed activity types are replaced by the new activity. Collapsing a subprocess of 5 activities into one activity will decrease the number of activity types by 4. The same is true for an is-a aggregation.
However, the is-a aggregation won't change the number of activity instances, which remains the same, while the part-of aggregation will lower the number of activity instances. Indeed, the multiple activity instances describing the activities in a subprocess will be collapsed into a single activity instance of the new activity.
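The counting argument can be checked with a small plain-Python sketch. It assumes a single case in which each list entry stands for one activity instance, and it applies both styles of aggregation to the same five labels purely for comparison. The helper names and the "Register Claim" label are hypothetical, and none of this is the bupaR API.

```python
# Conceptual sketch only: compares type and instance counts under both aggregations.
detailed = [
    "Start Investigation",
    "Appoint Lawyer",
    "Appoint Expert",
    "Receive Conclusion",
    "Decision",
]

# One case; each entry is one activity instance. "Register Claim" is hypothetical.
trace = ["Register Claim"] + detailed


def relabel(trace):
    """Is-a style: rename each detailed activity, keeping every instance."""
    return ["Investigation" if a in detailed else a for a in trace]


def collapse(trace):
    """Part-of style: rename, then merge the renamed instances into one."""
    out = []
    for activity in relabel(trace):
        if activity == "Investigation" and "Investigation" in out:
            continue  # absorbed into the instance that is already there
        out.append(activity)
    return out


def counts(trace):
    return {"types": len(set(trace)), "instances": len(trace)}


before = counts(trace)
after_relabel = counts(relabel(trace))
after_collapse = counts(collapse(trace))

print("before:        ", before)
print("after relabel: ", after_relabel)
print("after collapse:", after_collapse)
```

In both cases the number of types drops by 4 (five detailed labels become one), but only the collapse reduces the number of instances.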
8. Let's practice!
In the next exercises, we'll apply these techniques to our HR process.