Journal of International Technology and Information Management

The discovery of useful or worthwhile process models must be performed with due regard to the transformation that needs to be achieved. The blend of data representation (i.e., data mining) and process modelling methods, often associated with the field of Process Mining (PM), has proven effective in analysing the event logs readily available in many organisations' information systems. Moreover, process discovery has lately been seen as the most important and most visible intellectual challenge in process mining. The method involves the automatic construction of process models from event logs about any domain process, and describes the causal dependencies between the various activities as performed within the process execution environment. In principle, one can use process discovery to obtain process models that describe reality. To this end, the work in this article presents a Fuzzy-BPMN mining approach that uses training event log data representing 10 different real-time business process executions to provide a method for the discovery of useful process models, and then cross-validates the derived models against a set of test event logs in order to measure the accuracy and performance of the approach. The method focuses on carrying out a classification task over the traces, i.e., the individual cases that make up the test event logs, in order to determine which traces can be replayed by the original model. Thus, the aim of the paper is to provide a technique for process model discovery that strikes a good balance between "overfitting" and "underfitting" and that correctly classifies traces as replayable (i.e., allowed) or non-replayable (i.e., disallowed) by the model.
In other words, the study shows, through the Fuzzy-BPMN replaying notation and a series of validation experiments, how, given any classified trace (from the test event log) and a discovered process model (from the training log), it can be unambiguously determined whether or not the trace can be replayed on the discovered model.
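
To make the replay-based classification concrete, the following is a minimal sketch and not the Fuzzy-BPMN miner described in the paper. It assumes a much simpler stand-in model, a directly-follows graph built from the training traces, and labels a test trace as replayable only if every consecutive pair of activities (including the start and end of the trace) was observed in training. The function names and the activity names in the example log are hypothetical and chosen only for illustration.

```python
from typing import Iterable, List, Sequence, Set, Tuple

# Artificial markers so that valid first and last activities are also checked.
START, END = "<start>", "<end>"

Edge = Tuple[str, str]


def discover_dfg(training_traces: Iterable[Sequence[str]]) -> Set[Edge]:
    """Collect every directly-follows pair seen in the training traces.

    Assumption: a directly-follows graph stands in for the discovered model.
    """
    edges: Set[Edge] = set()
    for trace in training_traces:
        path = [START, *trace, END]
        edges.update(zip(path, path[1:]))
    return edges


def can_replay(model: Set[Edge], trace: Sequence[str]) -> bool:
    """Return True if every step of the trace is an edge of the model."""
    path = [START, *trace, END]
    return all(step in model for step in zip(path, path[1:]))


def classify(model: Set[Edge], test_traces: Iterable[Sequence[str]]) -> List[bool]:
    """Label each test trace as replayable (True) or non-replayable (False)."""
    return [can_replay(model, trace) for trace in test_traces]


# Hypothetical training and test logs, one list of activities per case.
training = [
    ["register", "check", "approve", "notify"],
    ["register", "check", "reject", "notify"],
]
test = [
    ["register", "check", "approve", "notify"],  # seen behaviour
    ["register", "approve", "notify"],           # skips "check"
    ["register", "check", "reject", "notify"],   # seen behaviour
]

model = discover_dfg(training)
labels = classify(model, test)
print(labels)
```

A model of this kind tends to generalise beyond the training log, since any path through observed edges is accepted, which is one way the trade-off between overfitting and underfitting shows up in practice; a stricter or looser model would shift which test traces are classified as allowed.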