CSG Data Science and Machine Learning presents

Training: Process Mining and Scientific Workflows running on the HPC cluster

Date: 18.10.2022 (9 am - 1 pm); Format: online

Short abstract:

The goal of process mining is to turn event data into insights and actions. Separately, scientific workflows are routinely executed on HPC clusters.

How can process mining and scientific workflows running on HPC clusters be combined? The first idea is “Process mining on logs of execution of scientific workflows on HPC”. Mining previous executions of a workflow on an HPC system can help deduce suitable parallel execution parameters, such as the number of tasks, the number of cores, and the dedicated memory. This yields valuable insights and optimization ideas for running tasks on the HPC cluster.
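As a rough illustration of mining execution logs, the sketch below counts how often one workflow step is directly followed by another within the same run (a directly-follows relation, a common starting point in process mining). The event tuples, step names, and the function name `directly_follows` are hypothetical and only serve as an example; a real log from the cluster would look different.

```python
from collections import Counter, defaultdict


def directly_follows(events):
    """Count how often activity a is directly followed by activity b in one case.

    `events` is an iterable of (case_id, activity, timestamp) tuples.
    """
    # Group events by case (here: one workflow run).
    traces = defaultdict(list)
    for case_id, activity, timestamp in events:
        traces[case_id].append((timestamp, activity))

    counts = Counter()
    for steps in traces.values():
        # Order each trace by timestamp, then count consecutive pairs.
        ordered = [activity for _, activity in sorted(steps)]
        counts.update(zip(ordered, ordered[1:]))
    return counts


# Hypothetical event log: two runs of the same workflow.
events = [
    ("run1", "preprocess", 1),
    ("run1", "simulate", 2),
    ("run1", "postprocess", 3),
    ("run2", "preprocess", 1),
    ("run2", "simulate", 2),
    ("run2", "simulate", 3),
    ("run2", "postprocess", 4),
]
dfg = directly_follows(events)
```

In this made-up log, the repeated "simulate" step in the second run shows up as a self-loop in the counts, which is the kind of pattern that could prompt a closer look at how that step is configured.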

An analysis of the extracted event log shows that SLURM is currently rarely used as a workflow management system: only a small fraction of accounts declare interdependencies between tasks. This motivates the second idea: implementing a workflow engine that runs workflow steps (jobs) on SLURM with correct interdependencies. The third idea is to enable efficient, distributed execution of process mining operators on HPC clusters, so that users can run process mining workflows on an HPC cluster.
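To make the second idea more concrete, here is a minimal sketch of how workflow steps could be submitted to SLURM with declared interdependencies. It is not the engine discussed in the training. It relies on the `sbatch` options `--parsable` (print the job ID) and `--dependency=afterok:<id>` (start only after the listed jobs finished successfully). The function names, step names, and script paths are hypothetical.

```python
import subprocess
from graphlib import TopologicalSorter  # Python 3.9+


def sbatch_submit(argv):
    """Run sbatch and return the job ID it prints with --parsable."""
    out = subprocess.run(argv, check=True, capture_output=True, text=True).stdout
    # With --parsable the output is "jobid" or "jobid;cluster".
    return out.strip().split(";")[0]


def submit_workflow(scripts, deps, submit=sbatch_submit):
    """Submit each step after its parents, chaining them with afterok.

    scripts: maps step name -> batch script path
    deps:    maps step name -> list of parent step names
    submit:  callable taking an argv list and returning a job ID
             (replaceable for dry runs or tests)
    """
    graph = {step: deps.get(step, ()) for step in scripts}
    job_ids = {}
    # static_order() yields every step after all of its parents.
    for step in TopologicalSorter(graph).static_order():
        argv = ["sbatch", "--parsable"]
        parents = graph[step]
        if parents:
            parent_ids = ":".join(job_ids[p] for p in parents)
            argv.append("--dependency=afterok:" + parent_ids)
        argv.append(scripts[step])
        job_ids[step] = submit(argv)
    return job_ids
```

Passing a stub as `submit` allows a dry run that only records the generated `sbatch` command lines, which is useful for checking the dependency chain before anything is queued.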

Information about CSG Data Science and Machine Learning

Agenda

8:45 – 9:00
Welcome

9:00 – 9:30
Talk 1: Introduction to Process Mining

9:30 – 10:00
Talk 2: Introduction to HPC challenges for process mining

10:00 – 10:30
Talk 3: Introduction to scientific workflows and workflow management systems

10:30 – 11:00
Break

11:00 – 11:30
Talk 4: Process mining and scientific workflows

11:30 – 12:00
Talk 5: Process mining on logs of execution of scientific workflows on HPC

12:00 – 12:30
Discussion

12:30 – 12:45
Conclusion

Contact person

Zahra Sadeghibogar

RWTH Aachen University