Methods for Multi-Objective Genetic Clustering of Time-Evolving Data

Sequential batches of time-evolving data for a set of persistent, identifiable entities (e.g. online shopping behavior by month for a customer ID, or economic figures by year for a collection of countries) can exhibit temporal shifts in their underlying clustering structure. Methods for recovering this evolutionary clustering structure exploit the natural smoothness of cluster evolution across consecutive time steps to stabilize cluster assignments as batches of updated data arrive daily, weekly, monthly, etc. In traditional evolutionary clustering, for specific choices of minimization criterion, a routine based on approximation or relaxation optimizes a user-determined trade-off between two objective functions: one reflecting goodness-of-fit of a clustering arrangement against historical data or clusters, and one reflecting goodness-of-fit against the most current data. However, little attention has been devoted to a posteriori multi-objective optimization algorithms that optimize these competing objectives simultaneously. Such algorithms naturally accommodate multiple costs of complicated form and detect a range of solutions that place differing emphasis on historical and current costs, without the need for user-determined weighting parameters. This thesis proposes a technique for evolutionary multi-objective clustering of evolving data (EMOCED) that incorporates spanning trees to improve various aspects of the genetic clustering process, all of which carry over to traditional static clustering. EMOCED is applied in a handful of experiments, including simulated flocking dynamics, stock market pricing evolution, and coronavirus spread data, where it produces data partitions superior to those of other recent evolutionary clustering algorithms.
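The contrast between a weighted trade-off and an a posteriori set of solutions can be illustrated with a small sketch. This is not the EMOCED algorithm and not the thesis's cost functions: the two costs below (within-cluster squared distance for the current batch, and the fraction of entity pairs whose co-membership changed since the previous step), the function names, and the toy data are all assumptions chosen for illustration. The sketch scores a few candidate partitions on both costs and keeps the non-dominated ones, with no weighting parameter involved.

```python
from itertools import combinations


def snapshot_cost(points, labels):
    """Assumed current-data cost: within-cluster sum of squared distances."""
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)
    total = 0.0
    for members in clusters.values():
        dim = len(members[0])
        centroid = [sum(m[d] for m in members) / len(members) for d in range(dim)]
        total += sum((m[d] - centroid[d]) ** 2 for m in members for d in range(dim))
    return total


def history_cost(labels, prev_labels):
    """Assumed historical cost: share of entity pairs whose co-membership changed."""
    pairs = list(combinations(range(len(labels)), 2))
    changed = sum(
        (labels[i] == labels[j]) != (prev_labels[i] == prev_labels[j])
        for i, j in pairs
    )
    return changed / len(pairs)


def dominates(a, b):
    """True if cost pair a is no worse than b on both costs and differs from b."""
    return all(x <= y for x, y in zip(a, b)) and a != b


def pareto_front(scores):
    """Keep every candidate whose cost pair is not dominated by another candidate."""
    return {
        name: s
        for name, s in scores.items()
        if not any(dominates(other, s) for other in scores.values())
    }


# Toy data (hypothetical): four entities; the second has drifted toward the others.
points = [(0.0, 0.0), (3.0, 0.0), (4.0, 0.0), (5.0, 0.0)]
prev_labels = [0, 0, 1, 1]

candidates = {
    "keep_history": [0, 0, 1, 1],  # identical to the previous partition
    "follow_data": [0, 1, 1, 1],   # regroups the drifted entity
    "poor": [0, 1, 0, 1],          # fits neither history nor current data
}

scores = {
    name: (snapshot_cost(points, labels), history_cost(labels, prev_labels))
    for name, labels in candidates.items()
}
front = pareto_front(scores)
print(sorted(front))
```

In this toy case the partition that preserves history and the partition that follows the current data both survive, since neither is better on both costs, while the third candidate is dominated and dropped. A weighted-sum approach would instead return whichever of the two the chosen weight happens to favor.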
