Technical University of Denmark
More and more business activities are performed using information systems. These systems produce such huge amounts of event data that existing systems are unable to store and process them. Moreover, few processes are in steady-state and due to changing circumstances processes evolve and systems need to adapt continuously. Since conventional process discovery algorithms have been defined for batch processing, it is difficult to apply them in such evolving environments. Existing algorithms cannot cope with streaming event data and tend to generate unreliable and obsolete results.
In this paper, we discuss the peculiarities of dealing with streaming event data in the context of process mining. Subsequently, we present a general framework for defining process mining algorithms in settings where it is impossible to store all events over an extended period or where processes evolve while being analyzed. We show how the Heuristics Miner, one of the most effective process discovery algorithms for practical applications, can be modified using this framework. Different stream-aware versions of the Heuristics Miner are defined and implemented in ProM. Moreover, experimental results on artificial and real logs are reported.
In CoRR abs/1212.6383, Dec. 2012.
Copyright and moral rights for the publications made accessible in the public website are retained by the authors and/or other copyright owners and it is a condition of accessing publications that users recognise and abide by the legal requirements associated with these rights.
If you believe that this document breaches copyright please contact us providing details, and we will remove access to the work immediately and investigate your claim.