Advancing Process Visibility with Real-Time Analytics through Kogito, Process Mining, and Kafka Streaming

By Karina Varela posted Wed February 21, 2024 04:09 PM

  

Many organizations struggle with inefficient, opaque business processes, resulting in a lack of agility and an inability to quickly identify and adjust bottlenecks. By integrating process automation with process analytics, it's possible to transform the way data is used to improve operations.

It's time to explore the wealth of untapped potential within our automated business processes. To gain end-to-end visibility into hidden inefficiencies and spot opportunities to improve, we can rely on the combination of process automation and process mining technologies communicating asynchronously through an event-driven architecture.

Extract valuable data, refine your operations, and achieve peak performance with the powerful combination of Kogito, Kafka, and Process Mining. 

In this blog, you will: 

  • Understand the potential benefits of these capabilities and how they complement each other
  • Explore the architectural solution, how it works, and the roles played by each technology
  • Learn the key implementation particularities and the aspects to keep in mind
  • Learn the foundational setup for delivering an implementation using Kogito for process automation, Kafka for event streaming, and IBM Process Mining for data analytics and insights on business data

As we delve into the topic, remember to observe how the selected technologies impact the overall complexity and maintainability of the solution:

  • Kogito serves as the heart of process automation
  • IBM Process Mining provides data analytics and insights
  • Kafka facilitates event streaming capabilities for seamless integration

Why should you consider Process Automation + Process Mining?

The driving force behind integrating these technologies is to expand visibility and enable the continuous optimization of business processes.

This combination brings advantages such as:

  • Increased agility for identifying and responding to process inefficiencies
  • Faster and more efficient workflow optimization backed by data insights
  • Easier process compliance and risk management
  • Increased productivity and cost reductions

By leveraging the strengths of Kogito and IBM Process Mining analytics, we can accelerate process excellence initiatives and increase operational maturity.

Now, let’s embark on our journey of discovery, sharing initial insights and impressions on the innovative potential and technical nuances of integrating Kogito processes with IBM Process Mining.

Overview: Integrating Kogito Process Automation and Process Mining

Technologies Highlights 

  • Kogito is a cloud-native, open-source process automation solution that delivers both performance and flexibility for developers. Some key highlights of Kogito include:
    • A fast development cycle with automatic code generation
    • Out-of-the-box integration with a powerful DMN 1.5 compliant decision engine
    • Process (BPMN) and decision (DMN) design tools for business analysts are accessible directly via browser, and for developers through VSCode
    • Good fit for microservice architectures given its fast, lightweight nature, especially when combined with the Quarkus runtime. The combination of Quarkus and Kogito gives Java developers a high-performing development flow, leveraging the goodness of the Quarkus ecosystem and the innovative Eclipse MicroProfile specification.

For those seeking an enterprise-ready open source product version, IBM Process Automation Manager Open Edition is worth checking out (Kogito processes Technology Preview planned for 2024).

  • IBM Process Mining is a sophisticated analytics platform that transforms the way organizations analyze and optimize their business processes. It employs advanced algorithms to delve into event logs from a variety of IT systems, uncovering not just how processes are assumed to function, but how they actually operate in reality.

By facilitating automated process discovery, IBM Process Mining extracts process models directly from event logs, offering invaluable insights into operational inefficiencies, bottlenecks, and deviations. Beyond identifying areas for improvement, it supports conformance checking—aligning actual process performance with the intended model—and extends to capabilities like social network mining, simulation model construction, model extension, repair, case prediction, and providing recommendations based on historical data.

IBM Process Mining champions principles of transparency, fairness, and robustness, positioning it as an indispensable tool for businesses committed to operational excellence and continuous improvement.

The architecture

The solution is based on an event-driven integration between a Kogito process service and IBM Process Mining. The integration is enabled by Kafka which, in our particular example, is available through IBM Event Streams.

Image: Integration between Kogito, the producer, and IBM Process Mining, the consumer, through Kafka events.

Here's an overview of the integration: 

  1. For every change in process instances, tasks, and events, new events are sent from the process service to Kafka
  2. The process mining tool consumes all new events on the defined topic, recurrently, based on the pre-configured timeframe (e.g. every 30 minutes)
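The two steps above can be illustrated with a small in-memory sketch. It is not Kafka and not IBM Process Mining: the `Topic` and `ScheduledConsumer` classes are hypothetical stand-ins, used only to show how a consumer that runs on a schedule picks up whatever was published since its previous run.

```python
from dataclasses import dataclass, field


@dataclass
class Topic:
    """Hypothetical in-memory stand-in for a Kafka topic (append-only log)."""
    name: str
    log: list = field(default_factory=list)

    def publish(self, event: dict) -> None:
        self.log.append(event)


@dataclass
class ScheduledConsumer:
    """Hypothetical stand-in for a data stream that reads in batches."""
    topic: Topic
    offset: int = 0  # position of the next unread event

    def fetch_new(self) -> list:
        # Return everything published since the previous run, then advance.
        batch = self.topic.log[self.offset:]
        self.offset = len(self.topic.log)
        return batch


topic = Topic("kogito-processinstances-events")
consumer = ScheduledConsumer(topic)

# Step 1: the process service emits one event per node change.
topic.publish({"processInstanceId": "p-1", "nodeName": "New Hiring"})
topic.publish({"processInstanceId": "p-1", "nodeName": "HR Interview"})

# Step 2: each scheduled run consumes only the events it has not seen yet.
first_batch = consumer.fetch_new()

topic.publish({"processInstanceId": "p-2", "nodeName": "New Hiring"})
second_batch = consumer.fetch_new()
third_batch = consumer.fetch_new()  # nothing new since the previous run

print(len(first_batch), len(second_batch), len(third_batch))  # prints: 2 1 0
```

Because the producer and consumer only share the topic, the process service never waits for the analytics side, which is the main appeal of the asynchronous design.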

How to Integrate Kogito and IBM Process Mining

Kogito: The Process Service

It all starts with Kogito, our process automation solution. In our example, we used one of the community's existing example projects as the starting point. It contains a hiring process with automated tasks, timers, and user tasks, giving us some extra room for exploration once we have data to visualize in the process mining tool.

Business Process: Hiring Process

Configuring the Process Service

To enable our Kogito process service to emit events that flow smoothly to Kafka, we must equip Kogito with an events add-on. Since we're running on Quarkus, the following dependencies can be added to the pom.xml file:

<dependency>
    <groupId>org.kie.kogito</groupId>
    <artifactId>kogito-addons-quarkus-events-process</artifactId>
</dependency>
<dependency>
    <groupId>io.quarkus</groupId>
    <artifactId>quarkus-smallrye-reactive-messaging-kafka</artifactId>
</dependency>

Next, we configure the process application by adding the Kafka connection settings to src/main/resources/application.properties. This example shows the configuration for a Kafka service that could be running on localhost, on another server, or in the cloud. Remember that each cloud service has its own authentication and authorization settings.

In this sample, the streaming service we are using is IBM Event Streams running as a service on the IBM Cloud. It offers a "copy-paste" text with all potentially needed configurations for producers and consumers, in this case, the Kogito service and IBM Process Mining solution respectively.

kafka.bootstrap.servers=kafka:9092
 
mp.messaging.outgoing.kogito-processinstances-events.connector=smallrye-kafka
mp.messaging.outgoing.kogito-processinstances-events.topic=kogito-processinstances-events
mp.messaging.outgoing.kogito-processinstances-events.value.serializer=org.apache.kafka.common.serialization.StringSerializer

Kafka: The Integration Layer

Kafka enables the choreography of our services: it receives the process events and makes them available for consumption by IBM Process Mining. Certain Kafka solutions won't allow client services to create topics, so you may need to create the topic that Kogito publishes its events to beforehand. By default, the topic name used by our process service is kogito-processinstances-events. Ensure any required ACL (access-control list) is properly configured in your Kafka instance as well.

Process Mining: Finding Meaning in the Data

Firstly, let's define and understand the key concepts associated with both Kogito and IBM Process Mining:

Kogito              | IBM Process Mining | Description
Node                | Activity           | Tasks, process events, and others
Start Time          | Enter              | Timestamp of a node activation
Process Instance ID | Process ID         | An ID of a process execution

Creating a new Process Mining Project

In order to get started on the IBM Process Mining side of this implementation, you need to create a new project.

New Project Wizard: Creating a new Project in IBM Process Mining

When creating a new project, you should input the name of the process, and you can optionally upload the BPMN model that can be later used for further analysis by IBM Process Mining.

Next, you'll probably be requested to upload a CSV file. Here are important thoughts on this particular step:

In the IBM Process Mining version used to demonstrate this solution, we observed that creating a new project requires uploading a CSV file. Even though our solution relies on an asynchronous data stream as input, we must still upload a CSV file containing at least a minimal sample of the expected data, such as:

process_instance_id                  | name         | enter
872e639e-978d-411a-866f-11544a5eb164 | HR Interview | 2024-01-17 20:45:03.529-03:00

Or, in raw CSV:

"process_instance_id","name","enter","exit"
"61eba33a-af21-471a-a9d5-59ea3a12df82","New Hiring","2024-01-17 20:45:03.529-03:00",""

This is how the user tells IBM Process Mining where to find the information it needs in the incoming data, such as Process ID, Activity Name, and Activity Start Time.

Data Mapping for CSV: Data Mapping, New Project Wizard in IBM Process Mining

Projects that do not use events typically ingest data into IBM Process Mining through CSV-based data sources. These CSV files contain the data for analysis.
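As a minimal sketch, the seed CSV described above could be generated with Python's standard csv module instead of being typed by hand. The column names and the row are taken from the raw CSV sample in this article; the helper name `build_seed_csv` is only illustrative.

```python
import csv
import io

# Column names follow the raw CSV sample shown in this article.
FIELDS = ["process_instance_id", "name", "enter", "exit"]


def build_seed_csv(rows):
    """Render rows as a fully quoted CSV string, header included."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=FIELDS,
        quoting=csv.QUOTE_ALL,  # quote every field, including empty ones
        lineterminator="\n",
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


seed = build_seed_csv([
    {
        "process_instance_id": "61eba33a-af21-471a-a9d5-59ea3a12df82",
        "name": "New Hiring",
        "enter": "2024-01-17 20:45:03.529-03:00",
        "exit": "",
    }
])
print(seed)
```

Writing the result to a file (for example with `open("seed.csv", "w").write(seed)`) gives a file that can be uploaded in the wizard.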

Finally, the last step of the project creation is "teaching" IBM Process Mining how to unmarshall incoming data. If you used a sample like the one provided in this article, it should automatically detect the format yyyy-MM-ddTHH:mm:ss.SSSXXX.

Time format settings: Formatting, New Project Wizard in IBM Process Mining

Now, with a project in hand, we can move forward and configure our data streams to enable IBM Process Mining to consume the process data that Kogito emits to Kafka.

Using a data stream to consume process events in IBM Process Mining

IBM Process Mining eagerly awaits those events, but it requires initial configuration details such as the location of the Kafka broker, the credentials for connection, the specific topic from which it should consume events, and instructions on interpreting the incoming event data. If you are using the default topic name for the process service, when creating a new data stream, configure the subscription to the kogito-processinstances-events topic and set it to fetch new data at regular intervals, such as every 30 minutes.

As mentioned before, during our exploration we used the IBM Event Streams service where the IBM Process Mining setup wizard streamlined the integration process:

New Kafka Data Source: Creation of a new data stream integration

Click "Verify connection" to check that the communication between the services is working:

Integration Settings and Verification

Next, still using the wizard, we can quickly map the incoming data in the events consumed through Kafka. These events are received in a JSON format and we need to map them to the expected variables, just like we've done during the project creation.

Note: Based on the data you wish to analyze, it may be necessary to parse the incoming data into a format compatible with IBM Process Mining, ensuring it can be accurately processed and analyzed.

Important: Here's another particularity of IBM Process Mining. The values received in event data should match the data format of the CSV file used when the process mining project was created. For instance, a date field can't be formatted as YYYY-MM-DD in one source and YYYY-MM in the other; both must use the same format.
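If a timestamp does need to be reshaped before it reaches the data stream, a small normalization step may be enough. The sketch below is an assumption about what such a step could look like, not something IBM Process Mining requires: it rewrites an ISO-8601 timestamp with a "T" separator into the space-separated, millisecond-precision form used in the seed CSV of this article. The function name `to_seed_format` is hypothetical.

```python
from datetime import datetime


def to_seed_format(iso_timestamp: str) -> str:
    """Convert an ISO-8601 timestamp to 'YYYY-MM-DD HH:MM:SS.mmm+HH:MM'."""
    parsed = datetime.fromisoformat(iso_timestamp)
    # sep=" " swaps the "T" for a space; timespec trims to milliseconds.
    return parsed.isoformat(sep=" ", timespec="milliseconds")


print(to_seed_format("2024-01-19T01:06:42.261-03:00"))
# prints: 2024-01-19 01:06:42.261-03:00
```

The UTC offset is preserved as-is, so no time zone conversion takes place; only the textual layout changes.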

If you are exploring the possibilities, this step allows for more than just mapping the three required inputs. You can enhance this by copying and pasting an example of the anticipated event data into the user interface. For example, consider this sample event data emitted by Kogito:

{
    "id":"cb17f8a2-835d-4822-bdae-5948a7779d18",
    "source":"http://localhost:8080/hiring",
    "type":"ProcessInstanceNodeDataEvent",
    "time":"2024-01-19T01:06:42.261312-03:00",
    "data":{
       "eventDate":"2024-01-19T01:06:42.261-03:00",
       "eventUser":null,
       "eventType":2,
       "processId":"hiring",
       "processVersion":"1.0",
       "processInstanceId":"fbcfb08b-d37e-4d70-9147-7899ee24f3b7",
       "connectionNodeDefinitionId":"_5334FFDC-1FCB-47E6-8085-36DC9A3D17B9",
       "nodeDefinitionId":"_834B21EF-9229-44F8-A5DB-D96EBB39A347",
       "nodeName":"Send notification HR Interview avoided",
       "nodeType":"ActionNode",
       "nodeInstanceId":"9f7bed87-ee4a-4324-ab4a-1c9fd1f42e3a",
       "workItemId":null,
       "slaDueDate":null,
       "data":{
         
       }
    },
    "specversion":"1.0",
    "datacontenttype":"application/json",
    "kogitoprocinstanceid":"fbcfb08b-d37e-4d70-9147-7899ee24f3b7",
    "kogitoprocid":"hiring",
    "kogitoaddons":"process-management,cloudevents,source-files,jdbc-persistence,jobs-management,process-svg",
    "kogitoprocversion":"1.0",
    "kogitoprocist":"2",
    "kogitoproctype":"BPMN"
 }

This mapping step also requires three non-optional values to be mapped to the Process ID, Activity name, and Start time (which is usually a timestamp). The good thing is that you only need to do this once - after the event data is mapped, all the analysis can be done automatically.

 

Data Mapping: Mapping of Event Data Fields to IBM Process Mining Expected Fields
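The mapping done in the wizard can also be expressed in code, which may help when deciding which fields to pick. The sketch below assumes that the three required values come from `processInstanceId`, `nodeName`, and `eventDate` inside the event's `data` object; that choice is an assumption based on the sample event above, and the target column names are borrowed from the seed CSV.

```python
import json

# Assumed mapping: target column -> field inside the event's "data" object.
FIELD_MAPPING = {
    "process_instance_id": "processInstanceId",
    "name": "nodeName",
    "enter": "eventDate",
}


def to_mining_row(raw_event: str) -> dict:
    """Extract the three required values from a Kogito node event."""
    payload = json.loads(raw_event)["data"]
    return {target: payload[source] for target, source in FIELD_MAPPING.items()}


# Trimmed copy of the sample event shown above.
sample_event = json.dumps({
    "type": "ProcessInstanceNodeDataEvent",
    "data": {
        "eventDate": "2024-01-19T01:06:42.261-03:00",
        "processId": "hiring",
        "processInstanceId": "fbcfb08b-d37e-4d70-9147-7899ee24f3b7",
        "nodeName": "Send notification HR Interview avoided",
        "nodeType": "ActionNode",
    },
})

row = to_mining_row(sample_event)
print(row)
```

Additional fields, such as `nodeType`, could be added to `FIELD_MAPPING` in the same way if more context is wanted in the analysis.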

Recurring Data Updates for Dashboards Based on Recent Information

To keep dashboards, process maps, and analyses as close as possible to the latest process reality, we can rely on two facts:

  1. Kogito will be constantly sending events to register every process interaction.
  2. IBM Process Mining can automatically update its data from the configured data stream.

All that needs to be done is the configuration of a regular refresh interval. With that, IBM Process Mining should regularly check and consume new events on the configured topics, and update its insights accordingly.

Scheduled updates: Refresh Rate of Data Stream data consumption

It's also possible to trigger the data update manually at any point in time, triggering a synchronization of the data stream and further data analysis.

And, it's ready!

And that's all we need to do to unlock significantly improved visibility over business process data. Below you can see examples of the insights available from the most basic data integration we could do for an exploration setup. Across 28 process instance executions, we can assess around two hundred events and observe how our process behaved. We can very clearly see the ~3-minute duration we simulated as a delayed human task execution, which results in the timer event being triggered. Another interesting insight is the clear view of the most executed tasks of a process, which supports strategic decisions on how to achieve corporate goals more efficiently.

  • Model Overview in IBM Process Mining: A visual representation of frequently executed nodes. In our example, we see “Anonymous” where we would otherwise see the user groups responsible for executing each node, had these configurations been mapped.

  • Statistics: An overview of the analyzed process execution data, including, for example, average and maximum execution time, and even how many cases we can complete in a day.

  • Insights: One of the out-of-the-box insights available to us is a visual representation of the duration of the process's tasks, giving a clear view of potential improvement points within that corporate flow, team structure, size, and so on.

  • Critical Activities: As part of the analytics, we can also see the most critical activities and their respective average wait times.

  • Timespan view: Gain a better understanding of your business, backed by near real-time data, and plan strategically ahead. Visualize, for instance, when you usually have the most ongoing processes and whether that affects the overall duration of process completion, or use abnormal spikes or drops in a specific flow to check whether a market or corporate event affected how the business functions and performs.

The examples above are some of the data insights we get out of the box for the simplest possible integration between Kogito and IBM Process Mining. There are other amazing features such as process simulation, cost and ROI evaluation, RPA integration, custom dashboards and much more to be leveraged through this powerful technology combination.
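To give a feel for where figures like activity frequency and case duration come from, here is a deliberately simplified sketch computed over the three mapped fields. It is not how IBM Process Mining works internally, and the rows are made-up illustration data, not results from our exploration.

```python
from collections import Counter
from datetime import datetime

# Made-up rows: (process instance id, activity name, start time).
events = [
    ("case-1", "New Hiring", "2024-01-17 20:45:03.529-03:00"),
    ("case-1", "HR Interview", "2024-01-17 20:48:03.529-03:00"),
    ("case-2", "New Hiring", "2024-01-17 21:00:00.000-03:00"),
    ("case-2", "Send notification HR Interview avoided", "2024-01-17 21:00:02.000-03:00"),
    ("case-3", "New Hiring", "2024-01-17 21:10:00.000-03:00"),
    ("case-3", "HR Interview", "2024-01-17 21:11:30.000-03:00"),
]

# How often each activity was executed across all cases.
frequency = Counter(name for _, name, _ in events)

# Group start times per case, then measure first-to-last elapsed seconds.
timestamps = {}
for case_id, _, enter in events:
    timestamps.setdefault(case_id, []).append(datetime.fromisoformat(enter))

durations = {
    case_id: (max(times) - min(times)).total_seconds()
    for case_id, times in timestamps.items()
}

print(frequency.most_common(1))  # prints: [('New Hiring', 3)]
print(durations)  # prints: {'case-1': 180.0, 'case-2': 2.0, 'case-3': 90.0}
```

Even this toy version shows why the three required fields are enough to start: a case identifier groups the events, an activity name labels them, and a timestamp orders and measures them.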

Conclusion

This is Just the Beginning

Our sleeves are rolled up as we explore ways to enrich and add value to this integration, so it is only fair to share our insights on the integration between Kogito and IBM Process Mining: recommendations, challenges, do's and don'ts. We plan to share more resources with practical steps for building and deploying the concepts we just explored, bringing you code snippets, configuration tips, and best practices to guide you through the process.

There's a huge potential for organizations, users, and developers to explore and benefit from cloud-native technologies like Kogito in combination with powerful solutions such as Process Mining, especially when built on top of a solid architectural design.

Stay curious, stay connected. Let's explore the exciting possibilities of technology blends!

- A shout out to Thiago Menezes for going above and beyond by collaborating on this exploration with his expertise. Such dedication and knowledge sharing exemplifies the spirit of open source. Thank you!


#process-mining
#kogito

Comments

Thu February 22, 2024 01:42 PM

@Patrick, on the Kogito side, we would simply add an add-on like we've done for processes, but one that emits events for human tasks. Then, we could configure `kogito-usertaskinstances-events` as a data stream topic in IBM Process Mining, map the fields, and voilà!

Thu February 22, 2024 04:07 AM

Very interesting article. We often want to add contextual data to process mining to help understand and discover root causes. Besides sending the instance ID, activity, and timestamp, how would you also send task/instance data?

Thu February 22, 2024 03:55 AM

Really interesting and useful content. Thank you! One point I need clarification on: what are the advantages/disadvantages of using Process Mining over Kibana? Should both be used for monitoring dashboards, or is PM a superset of Kibana?