A Distance Metric for Sets of Events

Abstract

In this work, we introduce a novel distance metric that describes the distance between sets of events, where events, in their most common form, are actions that happen at a given time. More generally, an event can be any object that stands in an ordered relation to other objects. In our case, an event is a course taken by a student during a specific semester. The distance is calculated from the differences between the positional relations of all individual events in the sets. For this, we do not use the absolute positions of events; instead, we use the sum of differences of the relations before, concurrent, and after to express distance. We describe our metric algorithmically and evaluate it both formally and by example on an existing data set of student exams. We also show that the results of the metric are intuitive for humans to interpret by comparing them to the results of a user study that we ran. The metric can be applied to a range of problems that rely on the positional relation of events, because it removes the dependency on event timestamps and replaces them with a set of ordered identifiers. We show a specific application of the metric by tackling the problem of clustering and predicting the study paths of university students.
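
The idea of comparing pairwise relations instead of absolute positions can be illustrated with a minimal sketch. This is not the algorithm from the paper, only one possible reading of the abstract. The function names `relation` and `relation_distance`, the course names, and the handling of events missing from one set (treated here as a fourth, "absent" relation) are all assumptions made for illustration.

```python
from itertools import combinations


def relation(positions, a, b):
    """Relation of event a to event b: 'before', 'concurrent', or 'after'.

    Returns None if either event is missing from this set (an assumption
    of this sketch, not something stated in the abstract).
    """
    if a not in positions or b not in positions:
        return None
    if positions[a] < positions[b]:
        return "before"
    if positions[a] > positions[b]:
        return "after"
    return "concurrent"


def relation_distance(p, q):
    """Count the event pairs whose relation differs between sets p and q.

    p and q map an event identifier to an ordinal position, for example
    a course name to a semester number. Only the order matters, not the
    absolute value of the position.
    """
    events = sorted(set(p) | set(q))
    return sum(
        1
        for a, b in combinations(events, 2)
        if relation(p, a, b) != relation(q, a, b)
    )


# Hypothetical study paths: course -> semester in which it was taken.
student_a = {"Algebra": 1, "Programming": 1, "Statistics": 2}
student_b = {"Algebra": 3, "Programming": 3, "Statistics": 4}
student_c = {"Algebra": 2, "Programming": 1, "Statistics": 1}

print(relation_distance(student_a, student_b))
print(relation_distance(student_a, student_c))
```

In this sketch, `student_a` and `student_b` take the same courses in the same relative order but two semesters apart, so every pairwise relation matches. `student_c` takes the same courses in a different order, so the relations differ. This mirrors the abstract's point that ordered identifiers can stand in for timestamps.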

Authors
  • Sahann, Raphael
  • Plant, Claudia
  • Möller, Torsten
Shortfacts
Category
Paper in Conference Proceedings or in Workshop Proceedings (Paper)
Event Title
The 7th IEEE International Conference on Data Science and Advanced Analytics 2020
Divisions
Visualization and Data Analysis
Subjects
Applied Computer Science
Event Location
Sydney, Australia
Event Type
Conference
Event Dates
6-9 October 2020
Date
9 October 2020