
Fraud detection on IBM i with AI inference in a Linux logical partition (LPAR)

By Gwendolyn Nguyen

This blog showcases how easily an AI-ready software stack can be set up alongside an existing IBM i environment, and the benefit of co-locating the AI environment with the IBM i instance where the business application runs.

Background

AI inference was enabled on IBM i by deploying models externally in a co-located Linux logical partition (LPAR) and integrating them through a Representational State Transfer (REST) API endpoint. The IBM i LPAR sends relevant information to the endpoint and receives the model’s prediction in response.
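As a concrete illustration, IBM i can make this kind of REST call directly from SQL. The following is a minimal sketch using the QSYS2.HTTP_POST SQL function; the host name, port, /predict route, and payload shape are illustrative assumptions (the actual contract is defined by linux_inference_endpoint.py):

    -- Minimal sketch of calling the inference endpoint from IBM i SQL.
    -- Host, route, and JSON payload shape are illustrative assumptions.
    VALUES QSYS2.HTTP_POST(
        'http://linux-lpar.example.com:5000/predict',
        '{"transactions": []}',  -- the seven-transaction JSON document goes here
        '{"header": "Content-Type,application/json"}');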

In this solution, a Long Short-Term Memory (LSTM) model is deployed behind a Flask API in the Linux LPAR. An LSTM is a type of recurrent neural network (RNN) used to classify sequential data by learning long-term dependencies. On the IBM i side, a trigger fires before a new record (representing a credit card transaction) is inserted into the Db2 for i table. This trigger gathers the current transaction along with the six previous transactions for the same user and card combination, and sends this data to the inference endpoint. The LSTM model processes this sequence of seven transactions to make a prediction and sends the result back to IBM i. The model uses all the information provided by the seven transactions to assign a score between 0 and 1 representing the likelihood that the input transaction is fraudulent. This “is-fraud” value is then set in the corresponding column by the before trigger, and the row is inserted into the table on IBM i.
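To make the sequence-gathering step concrete, the following is a minimal sketch of the kind of query that could collect the six most recent prior transactions for a user/card pair as a JSON array. The selected columns and ordering are illustrative assumptions; the actual implementation ships as get_transactions.sql in the repository.

    -- Sketch: gather the six most recent transactions for one user/card
    -- combination as a JSON array (columns and ordering are illustrative).
    SELECT JSON_ARRAYAGG(JSON_OBJECT('amount' VALUE AMOUNT,
                                     'use_chip' VALUE USE_CHIP,
                                     'mcc' VALUE MCC))
      FROM (SELECT AMOUNT, USE_CHIP, MCC
              FROM PIM.INDEXED_TR
             WHERE USER_ID = 29 AND CARD = 3
             ORDER BY "YEAR" DESC, "MONTH" DESC, "DAY" DESC, "TIME" DESC
             FETCH FIRST 6 ROWS ONLY) AS RECENT;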

Architecture

As described earlier, the workload is run on two LPARs (Linux and IBM i).

Linux

The Linux LPAR contains the REST API endpoint, where an LSTM model performs inference on a sequence of seven credit card transactions to predict whether the current transaction is fraudulent.

IBM i

The IBM i LPAR contains a database of credit card transactions. The schema contains the following database elements:

  • User-defined function (UDF): The UDF retrieves the previous six transactions for a given user and card. It is called from the trigger.
  • Trigger: Upon insert, the trigger creates a JSON document with the current transaction as well as the previous six transactions for that user/card (by calling the UDF). The JSON document is then sent to the REST API on the Linux LPAR to determine whether the current transaction is fraudulent. The result is sent back to IBM i, which inserts the transaction into the database with the is_fraud value set. A minimal sketch of the trigger follows this list.
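The following is a minimal skeleton of what such a trigger could look like. The trigger name, endpoint route, and JSON field names are illustrative assumptions; the actual implementation ships as insert_trigger.sql in the repository.

    -- Illustrative skeleton of the BEFORE INSERT trigger; names, route,
    -- and JSON fields are assumptions (see insert_trigger.sql for the
    -- real implementation).
    CREATE OR REPLACE TRIGGER PIM.SCORE_TRANSACTION
        BEFORE INSERT ON PIM.INDEXED_TR
        REFERENCING NEW AS N
        FOR EACH ROW
    BEGIN
        DECLARE RESPONSE CLOB(64K);
        -- Build the payload: the current row plus the six prior
        -- transactions returned as a JSON array by the UDF.
        SET RESPONSE = QSYS2.HTTP_POST(
            'http://linux-lpar.example.com:5000/predict',
            JSON_OBJECT(
                'current' VALUE JSON_OBJECT('amount' VALUE N.AMOUNT,
                                            'use_chip' VALUE N.USE_CHIP),
                'history' VALUE PIM.GET_TRANSACTIONS(N.USER_ID, N.CARD)
                          FORMAT JSON),
            '{"header": "Content-Type,application/json"}');
        -- Copy the model's verdict into the row before it is inserted.
        SET N.IS_FRAUD = JSON_VALUE(RESPONSE, '$.is_fraud');
    END;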

Setup steps (RHEL environment)

You need to perform the following steps on the Linux and IBM i LPARs to initialize them for the inferencing workload.

On the Linux LPAR:

  1. Clone the GitHub repository (https://github.com/IBM/project-pim/tree/main/examples/fraud-analytics/app) to the system.
  2. Run the REST API endpoint, linux_inference_endpoint.py, inside a Podman container:
    1. Change to the image-build directory.
    2. Copy the wheel files from the following Box folder into the image-build directory: https://ibm.box.com/s/w0yl8bcf4ijvw6mdzpxyvsesv1f7uwu6
    3. Build an image using the podman build -t fraud_analytics . command (the trailing dot supplies the build context).
    4. Run the container using the podman run -p 5000:5000 localhost/fraud_analytics command.

On the IBM i LPAR:

  1. Create the schema and table using the create_table.sql file.
  2. Create the UDF that grabs the previous six transactions for a given user and card using the get_transactions.sql file.
  3. Create the BEFORE INSERT trigger using the insert_trigger.sql file.
  4. Test whether the pipeline is working by running the following SQL statements:

    INSERT INTO PIM.INDEXED_TR (USER_ID, CARD, "YEAR", "MONTH", "DAY", "TIME", AMOUNT, USE_CHIP, MERCHANT_NAME, MERCHANT_CITY, MERCHANT_STATE, ZIP, MCC, IS_ERRORS)
    VALUES (29, 3, 2019, 2, 20, '12:38', '$1.88', 'Chip Transaction', '6051395022895754231', 'Rome', 'Italy', 0, 5310, 'Example');

    SELECT * FROM PIM.INDEXED_TR WHERE IS_ERRORS = 'Example';

    If the insert is successful, the row is added to the database with the is_fraud column set to Yes.

RHEL findings

This section explains the RHEL findings related to system hardware, test parameters, CPU utilization, time breakdown, and so on.

System hardware

The workload was tested on an IBM Power E1180 system. The Linux LPAR used nine cores and the IBM i LPAR used a single core. Both the LPARs were placed on the same socket.

Test parameters

A series of tests were conducted using Apache JMeter to measure throughput and latency, varying the number of users. Each user made 100 requests.

[Figure: JMeter test results showing throughput and latency for varying numbers of users]

CPU utilization

The CPU utilization on the Linux LPAR remained very low (around 2%) regardless of the number of users.


Time breakdown

When observing the time breakdown, the majority of the time is spent in the inference phase of the workload, indicating that the IBM i side is not the bottleneck. Two areas remain constant regardless of the number of users: the model load time and the JSON data extraction time. The model is loaded globally, once at the initialization of the Podman container, taking 90 ms on average. The JSON data extraction time averages about 0.11 ms. Two areas notably increase as users are added: data frame creation and inferencing. The data frame creation time increases by about 8% to 12% per additional user, while the inferencing time increases by about 15% to 20% per additional user. Compounded at 15% per user, inference time roughly doubles for every five users added (1.15^5 ≈ 2.0), which foreshadows the scaling limitation discussed below.

Results and findings

The results above show that throughput remains relatively constant as the number of users increases, highlighting a scaling limitation. Upon investigation, it became evident that the Flask endpoint was processing requests sequentially rather than concurrently, which kept throughput flat as users were added.

Conclusion

In summary, this investigation demonstrates that IBM i can be seamlessly integrated with REST API endpoints to enable AI capabilities such as real-time credit card fraud detection. While a single Flask endpoint processes requests sequentially, deploying the solution across multiple Red Hat OpenShift pods, each running its own inference endpoint, allowed the workload to scale well with an increasing number of users.

Disclaimer: This blog is based on IBM internal testing of IBM i version 7.6. Tests were conducted under laboratory conditions and are valid as of July 1, 2025. Individual results can vary based on database size, AI model type and size, and network configuration. This blog is meant to demonstrate a repeatable pattern of calling a REST API endpoint from an IBM i LPAR during an SQL transaction.
