Process Mining: Definition, Methods & Application in Service Optimization
Process Mining analyzes real process data from IT systems. Learn the three methods, the workflow, and how to optimize service processes with data.
Process mining is a data-driven analysis method that reconstructs, visualizes, and evaluates real business processes from the event logs of operational IT systems. Unlike traditional process modeling approaches — where teams describe at a whiteboard how a process should run — process mining shows how it actually runs, with all the deviations, loops, and bottlenecks that emerge in daily operations [1].
The method was developed from the late 1990s onward by Wil van der Aalst at Eindhoven University of Technology and has since evolved from an academic research field into a multi-billion-dollar software market, with vendors including Celonis, Signavio, Minit, and IBM [1][2]. In 2011, the IEEE Task Force on Process Mining published a manifesto positioning process mining as a distinct discipline between data mining and business process management [3].
If you search for “process mining” online, you will find ten results that all do the same thing: software vendors explaining why their tool is the best. Wikipedia delivers an academic definition. Hardly any result explains how process mining is concretely applied in service optimization — beyond the same SAP procurement process examples. None connects process mining to the service development toolkit: Where does it complement value stream mapping? When is a service blueprint the better choice? And what are the honest limitations of the method?
This guide closes these gaps — with a clear classification of the three process mining types, a step-by-step workflow, a service example from the insurance industry, and an honest assessment of when process mining is the wrong method.
What Distinguishes Process Mining from Traditional Process Analysis
To position process mining, a comparison with two related approaches helps: manual process modeling (as practiced in BPMN workshops) and value stream mapping from the Lean tradition.
| Dimension | Manual Process Modeling | Value Stream Mapping | Process Mining |
|---|---|---|---|
| Data Source | Interviews, workshops, expert knowledge | On-site observation (Gemba Walk) | Event logs from IT systems |
| Perspective | Target state (“This is how it should run”) | Current state (observed) | Current state (data-based) |
| Objectivity | Subjective — depends on participants’ knowledge | Semi-objective — observer may miss things | Objective — based on system data |
| Scalability | Low — works only for individual processes | Low — requires physical presence | High — can analyze millions of process instances simultaneously |
| Time Dimension | Static — snapshot of a moment | Semi-dynamic — observed time period | Dynamic — can show historical evolution |
| Blind Spots | Workarounds, informal processes | Digital process steps without physical presence | Manual steps without system logging |
| Typical Use | Process design, compliance documentation | Lean optimization, waste analysis | Process analysis, conformance, automation |
The central insight: Process mining and value stream mapping are not alternatives — they complement each other. Value stream mapping delivers the human perspective on the process: Why does something happen? What do participants experience? What informal workarounds exist? Process mining delivers the data foundation: What actually happens? How often? How long? In what variants? The strongest approach combines both: process mining for quantitative assessment, Gemba Walk and value stream mapping for qualitative depth.
The Three Types of Process Mining
Van der Aalst distinguishes three fundamental types of process mining that build upon each other [1]:
1. Process Discovery
What it does: A process model is automatically generated from event log data — without a model needing to exist beforehand. Algorithms such as the Alpha Miner, the Heuristic Miner, or the Inductive Miner analyze the sequence of activities in the logs and construct a process flow diagram from them.
When to use it: When you do not know how a process actually runs — or when you suspect that the documented process description deviates from reality. This is the most common entry point into process mining.
What you get: A visual process model showing all observed paths — including rare variants and loops that are typically absent from manual models. Depending on the algorithm, the model also shows frequencies (how often is each path taken?) and throughput times (how long does each step take?).
Typical result: A discovery run for an insurance claims process with 50,000 cases might reveal not 8 process steps (as documented) but 23 observed variants — where the “happy path” (the documented standard process) is actually followed in only 34% of cases.
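The core of discovery — grouping events into traces and counting variants — can be sketched in a few lines of plain Python. This is a minimal illustration, not a discovery algorithm; all case IDs and activity names are invented, and a real tool would additionally derive a process model from these variants.

```python
from collections import Counter

# Hypothetical mini event log: (case_id, activity) pairs, already
# ordered by timestamp within each case.
events = [
    ("C1", "Intake"), ("C1", "Check"), ("C1", "Approve"), ("C1", "Pay"),
    ("C2", "Intake"), ("C2", "Check"), ("C2", "Approve"), ("C2", "Pay"),
    ("C3", "Intake"), ("C3", "Check"), ("C3", "Check"), ("C3", "Approve"), ("C3", "Pay"),
]

# Group events into one trace (activity sequence) per case.
traces = {}
for case_id, activity in events:
    traces.setdefault(case_id, []).append(activity)

# A "variant" is a distinct trace; discovery starts by counting variants.
variants = Counter(tuple(t) for t in traces.values())
for variant, count in variants.most_common():
    print(" -> ".join(variant), f"({count} cases)")
```

With real data, the same counting logic is what reveals that a documented 8-step process produces dozens of observed variants — here, case C3's extra "Check" loop already creates a second variant.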
2. Conformance Checking
What it does: An existing process model (target) is compared with actual process data (reality). Every deviation is identified and quantified — both cases where steps are skipped and cases where additional, unplanned steps occur.
When to use it: When a defined target process exists — such as a documented SOP, a regulatory requirement, or a certified quality process — and you want to verify whether it is being followed. Particularly relevant in regulated industries: banking, insurance (Solvency II), healthcare (GxP).
What you get: A deviation analysis with concrete numbers: “In 23% of cases, step 4 (four-eyes review) is skipped.” “In 11% of cases, an unplanned feedback loop occurs between steps 3 and 5.” “The average deviation from the target process is 2.3 steps per case.”
Typical result: A conformance check of a lending process might reveal that the regulatory KYC review (Know Your Customer) occurs after the loan approval in 7% of cases — a compliance risk that would have remained invisible without data-based analysis.
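A rudimentary version of this check — comparing each trace against an ordered target sequence and flagging skipped or out-of-order steps — can be sketched as follows. The target process and case data are illustrative; production conformance checking uses alignment-based techniques on a full process model, not a simple linear sequence.

```python
TARGET = ["Intake", "KYC check", "Approval", "Payout"]

traces = {
    "C1": ["Intake", "KYC check", "Approval", "Payout"],
    "C2": ["Intake", "Approval", "KYC check", "Payout"],  # KYC after approval
    "C3": ["Intake", "Approval", "Payout"],               # KYC skipped entirely
}

def deviations(trace, target):
    """Return (missing target activities, whether observed ones are out of order)."""
    missing = [a for a in target if a not in trace]
    # Positions of the target activities that do occur in the trace
    positions = [trace.index(a) for a in target if a in trace]
    out_of_order = positions != sorted(positions)
    return missing, out_of_order

for case_id, trace in traces.items():
    missing, out_of_order = deviations(trace, TARGET)
    print(case_id, "missing:", missing, "out of order:", out_of_order)
```

Case C2 reproduces exactly the kind of finding described above: the KYC review happens, but after the approval — a deviation that only shows up when order, not just presence, is checked.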
3. Enhancement
What it does: Existing process models are enriched with additional information from the data — typically throughput times, waiting times, resource utilization, and bottleneck analyses. While discovery and conformance analyze the process flow itself, enhancement adds the performance dimension.
When to use it: When you know how the process runs but not where the bottlenecks are. Enhancement is the natural next step after discovery or conformance checking.
What you get: A process model that shows not only which steps exist but also where time is lost. Typical enhancements: heatmaps for waiting times (which transition between two steps takes longest?), bottleneck analysis (which step slows the overall process?), resource analysis (which case worker type processes fastest/slowest?).
Typical result: An enhancement run for an onboarding process might show that 68% of total waiting time falls on a single transition — the assignment from application to responsible case worker — because it is done manually and only once per day.
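The bottleneck part of enhancement boils down to measuring the gap between consecutive timestamps per case and aggregating per transition. A minimal sketch, with invented timestamps chosen so the manual assignment step dominates:

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical event log: (case_id, activity, timestamp)
events = [
    ("C1", "Intake", "2025-01-01 09:00"),
    ("C1", "Assign", "2025-01-04 09:00"),   # 3 days waiting on assignment
    ("C1", "Assess", "2025-01-05 09:00"),
    ("C2", "Intake", "2025-01-02 09:00"),
    ("C2", "Assign", "2025-01-05 09:00"),   # again 3 days waiting
    ("C2", "Assess", "2025-01-05 12:00"),
]

fmt = "%Y-%m-%d %H:%M"
cases = defaultdict(list)
for case_id, act, ts in events:
    cases[case_id].append((act, datetime.strptime(ts, fmt)))

# Collect waiting times (hours) per transition between consecutive activities
waits = defaultdict(list)
for trace in cases.values():
    trace.sort(key=lambda e: e[1])
    for (a1, t1), (a2, t2) in zip(trace, trace[1:]):
        waits[(a1, a2)].append((t2 - t1).total_seconds() / 3600)

# Report transitions, slowest average first
for (a, b), hours in sorted(waits.items(), key=lambda kv: -sum(kv[1]) / len(kv[1])):
    print(f"{a} -> {b}: avg {sum(hours) / len(hours):.1f} h")
```

The same aggregation over thousands of real cases is what produces findings like "68% of total waiting time falls on a single transition."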
The Three Types Working Together
In practice, the three types are applied iteratively:
- Discovery as the entry point — making the actual process visible
- Conformance checking as the comparison — quantifying deviations from target
- Enhancement as the deep dive — identifying bottlenecks and optimization potential
This approach mirrors the pattern described in value stream mapping: map the current state, identify waste, design the future state. The difference: process mining works from system data and can analyze hundreds of thousands of cases simultaneously, while value stream mapping works qualitatively on individual processes.
When Is Process Mining the Right Choice?
Use process mining when:
- Your process is logged in IT systems (ERP, CRM, ticketing system, BPM suite) and you can extract event logs
- You have many process instances (hundreds to millions of cases) and manual analysis does not scale
- You want to know how the process actually runs — not how the documentation says it should run
- You need to quantify deviations — for example, for compliance, audit, or regulatory purposes
- You want to identify bottlenecks with data rather than relying on assumptions
- You are building a KPI dashboard and need baseline data on throughput times and process variants
Use a different tool when:
| Situation | Better Alternative | Why |
|---|---|---|
| The process has no digital traces (no ticket system, no logs) | Value Stream Mapping | VSM works with observation; process mining needs data |
| You want to understand the customer perspective on the service | Service Blueprint or Customer Journey Map | Process mining shows internal processes, not the customer experience |
| You want to analyze the root causes of a single problem | Ishikawa Diagram | Ishikawa explains why; process mining shows what and how long |
| You have few cases (under 50) | Manual process analysis | Process mining needs volume to identify meaningful patterns |
| You want to design a new process, not analyze an existing one | Service Design | Process mining analyzes the existing; service design creates the new |
Step by Step: Process Mining in Service Optimization
Step 1: Define Objective and Scope
Before extracting data, define precisely what you want to discover. A process mining project without a clear question produces impressive visualizations — and no decisions.
Typical questions for service processes:
| Question | Process Mining Type | Example |
|---|---|---|
| “How does the process actually run?” | Discovery | Insurance claims process |
| “Are we meeting our SLAs?” | Conformance | First-response time in IT support |
| “Where do we lose the most time?” | Enhancement | Bank onboarding process |
| “Why do some cases take 10x longer?” | Discovery + Enhancement | Credit review with high variance |
Tip: Formulate the objective as a concrete, measurable question — not a vague wish. “We want to understand our processes” is not an objective. “We want to know why 15% of our claims take longer than 21 days” is.
Step 2: Extract Event Log Data
The event log is the raw material for process mining. Each entry in an event log describes an event: what was done, when it was done, and in which case (Case ID) it occurred.
Minimum requirements for an event log:
| Field | Description | Example |
|---|---|---|
| Case ID | Unique identifier for the case | Claim number CLM-2025-004712 |
| Activity | Name of the performed activity | “Completeness check” |
| Timestamp | Time of the activity | 2025-11-14 09:23:17 |
Optional but valuable fields:
| Field | Description | Use |
|---|---|---|
| Resource | Who performed the activity? | Resource analysis, workload distribution |
| Cost | Cost of the activity | Cost-based optimization |
| Additional Attributes | Claim type, amount, customer type | Segmentation, comparative analyses |
Common data problems (and solutions):
| Problem | Impact | Solution |
|---|---|---|
| Missing timestamps | No throughput time analysis possible | Switch data source or add logging |
| Inconsistent activity names | “Review”, “Check”, “Verification” mean the same thing | Harmonize beforehand (mapping table) |
| Missing Case ID | No assignment possible | Build composite key (e.g., customer no. + date) |
| Too coarse granularity | Only start and end visible, no intermediate steps | Activate finer logging level |
Van der Aalst emphasizes: “80% of the effort in a process mining project is spent on data extraction and preparation” [1]. Plan this step not as preparation but as core work.
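One of the most common preparation tasks from the table above — harmonizing inconsistent activity names via a mapping table — looks like this in practice. The mapping and the events are illustrative; in a real project the mapping table is built together with process experts.

```python
# Hypothetical mapping table: synonymous activity names are collapsed
# to one canonical label before mining.
ACTIVITY_MAP = {"Review": "Check", "Verification": "Check"}

raw_events = [
    {"case_id": "CLM-2025-004712", "activity": "Review",       "timestamp": "2025-11-14 09:23:17"},
    {"case_id": "CLM-2025-004712", "activity": "Verification", "timestamp": "2025-11-15 10:02:00"},
    {"case_id": "CLM-2025-004713", "activity": "Check",        "timestamp": "2025-11-14 11:40:05"},
]

# Replace each activity with its canonical name; unknown names pass through.
clean = [
    {**e, "activity": ACTIVITY_MAP.get(e["activity"], e["activity"])}
    for e in raw_events
]
print({e["activity"] for e in clean})
```

Without this step, a discovery algorithm would treat "Review", "Check", and "Verification" as three different activities and produce three times as many spurious variants.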
Step 3: Generate Process Model (Discovery)
Load the cleaned event log into a process mining tool and run a discovery algorithm. The choice of algorithm affects the result:
| Algorithm | Strength | Weakness | Suitable for |
|---|---|---|---|
| Alpha Miner | Simple, fast | Cannot handle noise and complex patterns | Simple, structured processes |
| Heuristic Miner | Robust against noise | May miss rare paths | Real processes with variants |
| Inductive Miner | Guarantees “sound” models | May over-generalize | Complex processes, compliance |
| Fuzzy Miner | Simplifies complex models visually | Loses details | Initial orientation, management presentations |
Practical tip: Start with the Heuristic Miner or Inductive Miner. These algorithms produce usable results for most service processes. The Alpha Miner is primarily of academic interest; the Fuzzy Miner is suited to a first visualization, not to deep analysis.
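The Heuristic Miner's robustness against noise rests on a simple idea that can be sketched in plain Python: count directly-follows relations across all traces and drop relations below a frequency threshold. This is only the filtering idea, not the full algorithm (which also computes dependency measures); the traces and threshold are invented.

```python
from collections import Counter

traces = [
    ("A", "B", "C"),
    ("A", "B", "C"),
    ("A", "C", "B"),   # rare variant — likely noise or a logging artifact
]

# Count how often each activity is directly followed by another
df = Counter()
for t in traces:
    for a, b in zip(t, t[1:]):
        df[(a, b)] += 1

# Heuristic-style filtering: keep only relations above a frequency threshold
threshold = 2
kept = {pair: n for pair, n in df.items() if n >= threshold}
print(kept)
```

The rare relations A→C and C→B are filtered out, so the resulting model shows the dominant flow A→B→C — exactly the behavior (and the trade-off of possibly missing rare paths) noted in the table above.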
Step 4: Analyze and Interpret
The generated process model is the beginning, not the result. Now the actual analysis begins:
Variant analysis: How many different paths exist through the process? How frequently is each path taken? Which variants are “happy paths” (standard flows) and which are exceptions?
Bottleneck analysis: Where does the longest waiting time occur? Which transition between two steps slows the overall process? This analysis resembles what value stream mapping does with the timeline — only automated and for thousands of cases simultaneously.
Loop analysis: Where do loops occur — cases that bounce back and forth between two steps? Loops are a strong indicator of rework, follow-up questions, or unclear process rules.
Segmentation: Does process behavior differ by customer type, claim amount, region, or case worker? Segmentation uncovers systematic differences — for instance, that cases from case worker team A are closed on average 3 days faster than those from team B.
Step 5: Derive and Implement Actions
Process mining delivers diagnoses, not therapies. Interpreting results and deriving actions requires process knowledge — this is exactly where process mining and qualitative methods complement each other.
Typical actions after a process mining analysis:
| Finding | Possible Action | Method for Deepening |
|---|---|---|
| 68% of waiting time in one transition | Automatic assignment instead of manual batch processing | Value Stream Mapping for detail |
| 23% of cases skip review step | Tighten process rule or add mandatory fields | Conformance monitoring |
| 15% loops due to follow-up questions | Improve input form, add mandatory fields | Service Blueprint for customer perspective |
| High variance by case worker | Training, best-practice documentation | Gemba Walk for root cause investigation |
Example: Process Mining in an Insurance Claims Process
Context: A German auto insurer with 120,000 claims per year discovers that customer satisfaction (CSAT) has dropped 8 percentage points in the last 12 months. The hypothesis: throughput times are too long. But which cases are affected, and where exactly does the delay occur?
Discovery Result
The process mining team extracts event logs from the claims management system (SAP Claims Management) for the past 12 months: 118,000 cases with a total of 1.2 million events.
Surprising result:
| Metric | Expectation | Reality |
|---|---|---|
| Process variants | 8 (per SOP) | 147 observed variants |
| Happy path rate | >80% | 29% |
| Median throughput time | 7 days (SLA) | 11.3 days (median), 28 days (90th percentile) |
| Top bottleneck | Assumed: adjuster | Actual: case worker assignment (3.2 days waiting time) |
Conformance Result
The comparison with the documented target process shows:
- 12% of cases skip the completeness check
- 8% of cases go through an unplanned loop between assessment and approval (customer follow-up questions not anticipated in the target process)
- 22% of cases include a “manual data transfer” step that does not exist in the target process — a workaround for a missing system integration
Enhancement Result
The bottleneck analysis shows the distribution of waiting time:
| Transition | Average Waiting Time | Share of Total Waiting Time |
|---|---|---|
| Intake to case worker assignment | 3.2 days | 31% |
| Assessment to approval | 2.1 days | 20% |
| Completeness check to assessment | 1.8 days | 17% |
| Approval to payout | 1.4 days | 14% |
| Remaining transitions | 1.8 days | 18% |
Key insights:
- The assumed bottleneck (adjuster) was not the actual bottleneck. Case worker assignment — a seemingly trivial step — causes 31% of total waiting time because it is done manually in daily batches.
- 22% of cases contain an informal workaround (“manual data transfer”) that does not exist in the target process. This step costs 12 minutes per case and is performed 26,000 times per year — that is 5,200 labor hours that could be eliminated through system integration.
- The 90th-percentile cases (28 days) share a common pattern: they go through an average of 2.7 feedback loops because data is missing from the initial report.
Note: This example is illustratively constructed to demonstrate process mining in a service context. The numbers are based on typical industry values for auto claims processing in Germany.
Derived Actions
| Action | Expected Effect | Priority |
|---|---|---|
| Rule-based automatic assignment in CRM | -3 days for 100% of cases | High |
| Smart form validation (mandatory fields + photo upload) | -1.5 feedback loops for 22% of cases | High |
| API integration between claims system and archive | -12 min. for 22% of cases (5,200 hrs/year) | Medium |
| Raise approval threshold to EUR 2,000 | -2 days for 65% of cases | Medium |
Process Mining and Service Innovation: The Strategic Perspective
Most accounts of process mining end at operational process optimization: faster, leaner, fewer errors. That is valuable — but it is only half the story.
Process mining as a sensor for service innovation:
If you view process mining not just as an efficiency tool but as an insight tool, a strategic perspective opens up: the data reveals not only where the process is slow but also where the service fails to meet the customer need.
- 147 process variants in a claims process mean not just inefficiency — they mean the standard process does not fit 71% of cases. That is an indication of a service problem, not merely a process problem.
- 2.7 feedback loops for the slowest cases mean not just time loss — they mean the customer interface (the initial report) is not working. That is a customer journey problem.
- 22% informal workarounds mean not just compliance risk — they mean employees find the official process inadequate and develop their own solutions. These workarounds are often the seed for better service processes.
The connection to the Integrated Service Development Process (iSEP): Process mining provides the data-based foundation for the analysis phase of service development. Instead of building on assumptions (“We believe the process is too slow”), you start with facts (“The data shows that 31% of waiting time falls on a single transition”). This data-based diagnosis makes the transition from analysis to service development more precise — and reduces the risk of working on the wrong problem.
When Process Mining Does NOT Work
No tool fits every problem. You should know these limitations:
1. Processes without digital traces: Process mining needs event logs. If your service process runs predominantly analog — verbal agreements, paper forms, informal decisions — there is no data to analyze. In that case, start with value stream mapping or a Gemba Walk.
2. Too few cases: Process mining derives its strength from volume. With 50 cases per year, manual analysis is often more efficient. As a rule of thumb: below 200-300 cases per analysis period, process mining rarely delivers robust patterns.
3. Garbage in, garbage out: Poor data quality leads to misleading process models. If activity names are inconsistent, timestamps are missing, or case IDs cannot be uniquely assigned, the result shows a distorted picture of the process — not reality. Data preparation is not an optional pre-step but decisive for analysis quality [1][4].
4. The illusion of objectivity: Process mining shows what happens, not why. A bottleneck in case worker assignment could be due to insufficient capacity, a poor tool, organizational rules, or lack of knowledge. Root cause investigation requires qualitative methods — interviews, observation, Ishikawa analysis. Those who interpret process mining results without contextual knowledge “jump to conclusions that may not be relevant” [5].
5. Resistance to transparency: Process mining makes processes radically transparent — including workarounds, rule violations, and performance differences between teams. This can be politically sensitive. Without a clear mandate from leadership and a constructive error culture, process mining becomes a surveillance technology rather than an improvement tool.
6. The directly-follows graph trap: Many commercial tools generate a directly-follows graph (DFG) as the first result — a simple representation showing which activities directly follow each other. DFGs are intuitively readable but can be “misleading if you do not understand how they are generated” [6]. They can suggest causal relationships that do not exist and hide rare but important paths.
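The DFG trap can be demonstrated concretely: a directly-follows graph merges edges from all traces, so it can contain a path that no single case ever follows. In this deliberately constructed two-trace example, the graph suggests the path A→B→D even though no observed case runs it:

```python
from collections import Counter

# Two real traces; note that no case ever runs A -> B -> D contiguously
traces = [("A", "B", "C"), ("A", "C", "B", "D")]

# Build the directly-follows graph by merging edges from all traces
dfg = Counter()
for t in traces:
    for a, b in zip(t, t[1:]):
        dfg[(a, b)] += 1

# The graph contains both A->B (from trace 1) and B->D (from trace 2),
# so it visually suggests a path A -> B -> D that was never observed.
print(sorted(dfg))
```

This is what "misleading if you do not understand how they are generated" means in practice: the DFG is a faithful summary of pairwise successions, but reading end-to-end paths off it silently assumes behavior the log may not contain.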
Object-Centric Process Mining: The Next Generation
Classical process mining has a fundamental limitation: each event is assigned to exactly one case ID. In reality, processes interact with multiple objects simultaneously — an order has multiple line items, a customer case involves multiple contracts, a claim involves multiple parties.
Object-centric process mining (OCPM) — largely driven by van der Aalst and his team at RWTH Aachen — resolves this limitation [7]. Instead of a single case ID, each event can be linked to multiple objects. This enables analysis of processes where multiple objects are processed in parallel and interactively.
For service processes, this is particularly relevant: a customer case (object 1) triggers an internal review (object 2) and an external adjuster assignment (object 3) — all three objects have their own process flows that interact with each other. Classical process mining cannot capture this; OCPM can.
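The structural difference between classical and object-centric logs can be illustrated with a small data sketch. All activity and object names are invented; real OCPM uses the OCEL standard format rather than ad hoc dictionaries.

```python
# Hypothetical object-centric events: one event can reference several
# objects of different types at once.
events = [
    {"activity": "Register claim",  "objects": {"claim": ["CLM-1"], "party": ["P-1", "P-2"]}},
    {"activity": "Assign adjuster", "objects": {"claim": ["CLM-1"], "adjuster": ["ADJ-7"]}},
    {"activity": "Contact party",   "objects": {"party": ["P-2"]}},
]

# Flattening to a classical single-case-ID log (here: case = claim)
# forces one case notion and silently drops events that touch no claim.
flat = [(c, e["activity"]) for e in events for c in e["objects"].get("claim", [])]
print(flat)
```

In the flattened, claim-centric view, the "Contact party" event disappears entirely, and the links to parties and adjusters are lost — precisely the information loss that object-centric process mining avoids.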
As of 2026, OCPM is still transitioning from research to broader practical adoption. First commercial tools (Celonis, ProM) support OCPM, but data preparation is more complex than with the classical approach [7].
Process Mining Tools: An Overview
| Tool | Type | Strength | Suitable for |
|---|---|---|---|
| Celonis | Commercial | Market leader, strong SAP integration, execution management | Large enterprises with SAP landscapes |
| Signavio (SAP) | Commercial | Integration with SAP BPM, process modeling | SAP environments, BPM-focused organizations |
| UiPath Process Mining | Commercial | Integration with RPA (Robotic Process Automation) | Automation projects |
| ProM | Open Source | Academically founded, extensive algorithms | Research, evaluation, small teams |
| PM4Py | Open Source (Python) | Programmatic access, Fraunhofer FIT | Data science teams, individual analyses |
| Disco (Fluxicon) | Commercial | Ease of use, fast results | Entry-level, exploration, workshops |
Recommendation: For getting started, ProM (free, academically comprehensive) or Disco (commercial but very user-friendly) are suitable. For enterprise-wide implementation, Celonis or Signavio are the established options. For data-savvy teams, PM4Py — developed by Fraunhofer FIT — is a powerful Python library [8].
Frequently Asked Questions
What is process mining in simple terms?
Process mining is a method where software automatically analyzes how business processes actually run — based on the traces processes leave in IT systems (so-called event logs). Instead of drawing a process on a whiteboard and hoping the drawing is accurate, process mining reveals the real process with all its variants, loops, and waiting times. The method was developed by Wil van der Aalst from the 1990s onward [1].
What is the difference between process mining and data mining?
Data mining searches for patterns in data generally — such as purchasing behavior, risk patterns, or anomalies. Process mining is a specialized form of data mining that focuses exclusively on process data: temporally ordered events belonging to a case. While data mining asks “What patterns exist in the data?”, process mining asks “How does this process actually run?” [1].
What data is needed for process mining?
At minimum, three fields per event: a case ID (which case?), an activity name (what was done?), and a timestamp (when?). Additional fields such as the resource, costs, or customer type enable deeper analyses but are not strictly required. The data typically comes from ERP, CRM, or ticketing systems.
How much does process mining cost?
Costs vary widely: open-source tools like ProM or PM4Py are free. Entry-level commercial solutions (Disco, smaller vendors) cost a few thousand euros per year. Enterprise platforms (Celonis, Signavio) typically cost six-figure annual amounts — depending on data volume, user count, and modules. The biggest cost factor is not the software but data extraction and preparation (typically 60-80% of the project budget) [1].
How does process mining differ from value stream mapping?
Both methods analyze processes but in different ways: Value stream mapping is based on human observation and works qualitatively on individual processes. Process mining is based on system data and can automatically analyze hundreds of thousands of cases. Value stream mapping captures human context (workarounds, frustration, informal paths); process mining captures data-based context (frequencies, throughput times, variants). The strongest approach combines both methods.
Is process mining the same as process intelligence?
Process intelligence is the overarching term for data-driven analysis and management of business processes. Process mining is a core method within process intelligence, complemented by real-time monitoring, predictive analytics, and automated process optimization. You might say: process mining is the analytical foundation; process intelligence is the broader discipline that also includes real-time monitoring and forecasting.
Related Methods
A typical workflow in data-driven service optimization: With process mining you analyze process data and identify bottlenecks. With value stream mapping you deepen understanding of critical areas on-site. With a service blueprint you add the customer perspective. With a KPI dashboard you monitor improvements over time.
- Value Stream Mapping: Qualitative on-site process analysis — complementary to data-based analysis through process mining
- Service Blueprint: When you want to visualize the customer perspective on the service, which process mining does not capture
- Customer Journey Mapping: When you want to understand the customer experience across all touchpoints
- Ishikawa Diagram: When you want to analyze the root causes of a bottleneck identified through process mining
- KPI Dashboard: When you want to continuously monitor the metrics gained through process mining
Research Methodology
This article synthesizes findings from the foundational works of Wil van der Aalst (Process Mining: Data Science in Action, 2016), the IEEE Process Mining Manifesto (2011), Fraunhofer FIT research on open-source process mining, and the analysis of 10 German-language expert articles on process mining. Sources were selected for academic rigor, practical relevance, and timeliness. The practical example (insurance claims process) is illustratively constructed to demonstrate the method in a service context — not a documented case study.
Limitations: The academic literature on process mining predominantly originates from ERP-adjacent contexts (procurement, logistics, accounting). Empirical studies on application in service innovation — particularly in the redesign of services, not just their optimization — are limited.
Disclosure
SI Labs provides consulting services in the area of service innovation. In the Integrated Service Development Process (iSEP), we recommend process mining as a data-based foundation for the analysis phase when sufficient event log data is available. SI Labs is not a provider of process mining software and has no commercial relationships with the tool vendors mentioned in this article.
References
[1] van der Aalst, Wil M. P. Process Mining: Data Science in Action. 2nd ed. Berlin: Springer, 2016. ISBN: 978-3-662-49851-4 [Foundational Work | Process Mining | Citations: 6,000+ | Quality: 95/100]
[2] van der Aalst, Wil M. P. “Process Mining: Overview and Opportunities.” ACM Transactions on Management Information Systems 3, no. 2 (2012): Article 7. DOI: 10.1145/2229156.2229157 [Journal Article | Overview | Citations: 1,500+ | Quality: 90/100]
[3] van der Aalst, Wil M. P., et al. “Process Mining Manifesto.” Business Process Management Workshops, Lecture Notes in Business Information Processing, vol. 99. Berlin: Springer, 2012, pp. 169-194. DOI: 10.1007/978-3-642-28108-2_19 [IEEE Task Force | Manifesto | Citations: 2,000+ | Quality: 92/100]
[4] Mans, Ronny S., Wil M. P. van der Aalst, and Rob J. B. Vanwersch. Process Mining in Healthcare: Evaluating and Exploiting Operational Healthcare Processes. Cham: Springer, 2015. ISBN: 978-3-319-16071-9 [Book | Healthcare Process Mining | Citations: 300+ | Quality: 82/100]
[5] Reinkemeyer, Lars, ed. Process Mining in Action: Principles, Use Cases and Outlook. Cham: Springer, 2020. ISBN: 978-3-030-40171-9 [Practice Book | Use Cases | Citations: 150+ | Quality: 78/100]
[6] van der Aalst, Wil M. P. “A Practitioner’s Guide to Process Mining: Limitations of the Directly-Follows Graph.” Procedia Computer Science 164 (2019): 321-328. DOI: 10.1016/j.procs.2019.12.189 [Conference Paper | Methodology Critique | Citations: 200+ | Quality: 85/100]
[7] van der Aalst, Wil M. P., and Alessandro Berti. “Discovering Object-Centric Petri Nets.” Fundamenta Informaticae 175, no. 1-4 (2020): 1-40. DOI: 10.3233/FI-2020-1946 [Journal Article | OCPM | Citations: 200+ | Quality: 88/100]
[8] Berti, Alessandro, Sebastiaan J. van Zelst, and Wil M. P. van der Aalst. “Process Mining for Python (PM4Py): Bridging the Gap Between Process- and Data Science.” ICPM Demo Track (CEUR 2374), 2019. [Conference Paper | Tool | Fraunhofer FIT | Quality: 75/100]