Fans of IBM Z Hub


From Data to Diagnosis: How IBM LinuxONE Powered Our Datathon Winning AKI Dialysis Prediction Project

By Harshavardhan Gowda posted Sat May 03, 2025 09:30 AM

  


Authors: Harshavardhan MG (Team leader) - BE in AIML, Global Academy of Technology, harshavardhangowda2004@gmail.com; P Praveen Raj - BE in AIML, Global Academy of Technology, 7220praveen@gmail.com; Rishith P - BE in AIML, Global Academy of Technology, rishithprasanna264@gmail.com; Suhas S Gowda - BE in AIML, Global Academy of Technology; Jeevan K Thejas - BE in AIML, Global Academy of Technology

Mentors: Vineet Dumir - Senior Solution Architect, AI on IBM Z, IBM Labs, Bangalore, India, Vineet.Dumir@ibm.com; Sudharsana Srinivasan - Senior Technical Program Manager, Pleasanton, CA, Sudharsana.Srinivasan@ibm.com

Introduction

Hello everyone! We're Team Alpha, and we’re excited to share our journey in developing an innovative AI/ML solution for predicting early dialysis needs in ICU patients with Acute Kidney Injury (AKI). This project represents a critical intersection of advanced machine learning techniques and real-time clinical decision-making in intensive care settings.

To prepare for the datathon, IBM mentors conducted workshops introducing AI/ML frameworks on IBM Z. We gained hands-on experience with IBM LinuxONE Community Cloud (L1CC), setting up machine learning environments using Docker containers on Ubuntu Linux. With easy-to-follow tutorials, we quickly deployed popular AI frameworks like TensorFlow and PyTorch. It was exciting to see how seamlessly familiar AI tools could run on IBM Z’s s390x architecture, reinforcing its adaptability for enterprise AI workloads.

In this blog, we’ll take you through our journey of developing and deploying an AI/ML solution on IBM Z systems—from the initial idea to a fully operational model.

The Problem

Predicting early dialysis needs for ICU patients with Acute Kidney Injury (AKI) is a crucial challenge in intensive care settings where timing is everything.

Why is this important? When patients develop AKI in the ICU, medical teams face a difficult dilemma:

  • Initiating dialysis too early risks unnecessary intervention.

  • Starting too late could be life-threatening.

Currently, this decision relies heavily on manual monitoring and individual judgment, making it prone to delays and variations in care. The challenge is compounded by the sheer volume of patient data—from vital signs to laboratory results—that must be analyzed in real time.

Our goal was to develop an AI-powered solution capable of processing this complex data and providing timely, accurate predictions to support medical decision-making. This isn’t just about building a model; it’s about potentially saving lives through better-timed interventions and more efficient resource allocation in critical care settings.

The Dataset

Our project leverages a comprehensive ICU patient dataset curated specifically for AKI prediction. The dataset includes key ICU patient metrics such as heart rate, blood pressure, creatinine levels, AKI stage, anion gap, and PT max—all critical indicators for early dialysis prediction.

Given that patients requiring early dialysis intervention are fewer in number, we used specialized sampling techniques to address class imbalance:

  • SMOTE (Synthetic Minority Over-sampling Technique) – Generates synthetic samples for the minority class.

  • Class Weights – Assigns higher penalties for misclassifying the minority class (class_weight='balanced').
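To make the two techniques above concrete, here is a minimal pure-Python sketch. It assumes the "balanced" heuristic documented by scikit-learn, n_samples / (n_classes * class_count), and shows only the interpolation step at the heart of SMOTE (the real algorithm also searches for nearest neighbours). The function names and toy values are illustrative assumptions, not the project's actual code.

```python
import random
from collections import Counter


def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}


def smote_like_sample(point, neighbor, rng):
    """Create one synthetic point on the segment between two minority samples."""
    gap = rng.random()  # uniform in [0, 1)
    return [p + gap * (n - p) for p, n in zip(point, neighbor)]


# Toy labels: 8 majority samples (1) and 2 minority samples (0).
labels = [1] * 8 + [0] * 2
weights = balanced_class_weights(labels)
print(weights)  # {1: 0.625, 0: 2.5}

# Two hypothetical minority-class patients described by two features each.
rng = random.Random(42)
synthetic = smote_like_sample([1.0, 10.0], [3.0, 20.0], rng)
print(synthetic)
```

The rarer class receives the larger weight, so each misclassified minority sample costs more during training, while the synthetic point always lies between the two real samples it was built from.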

Data Preprocessing Implementation

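# Imports. Pipeline comes from imbalanced-learn because scikit-learn's own
# Pipeline does not accept a resampler such as SMOTE as a step.
from imblearn.over_sampling import SMOTE
from imblearn.pipeline import Pipeline
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectFromModel
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# numeric_columns and categorical_columns are lists of column names defined earlier.

# Step 1: Data Preprocessing Pipeline (scale numeric features, one-hot encode categorical ones)
preprocessor = ColumnTransformer([
    ('num', StandardScaler(), numeric_columns),
    ('cat', OneHotEncoder(drop='first', sparse_output=False), categorical_columns)
])

# Step 2: Feature Selection (keep at most 20 features ranked by random-forest importance)
feature_selector = SelectFromModel(
    RandomForestClassifier(n_estimators=100, random_state=42),
    max_features=20
)

# Step 3: Complete Pipeline with SMOTE and Class Weights
# SMOTE resamples only during fit, so validation and test data are left untouched.
pipeline = Pipeline([
    ('preprocessor', preprocessor),
    ('feature_selector', feature_selector),
    ('smote', SMOTE(random_state=42)),
    ('classifier', LogisticRegression(random_state=42, class_weight='balanced'))
])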

Model Implementation

To predict early dialysis needs in ICU patients, we implemented a suite of machine learning models, including:

  • Logistic Regression – Offers interpretable results with cross-validated hyperparameter tuning.

  • Random Forest – Captures complex feature interactions and identifies key AKI predictors.

  • XGBoost – A powerful gradient boosting technique that enhances performance through grid search optimization.

  • Ensemble Techniques – Combines multiple models using a voting classifier for enhanced prediction accuracy.
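As a rough illustration of how a voting classifier combines models, the sketch below shows both hard voting (majority label) and soft voting (averaged probabilities). The post does not say which variant was used, so both are shown as assumptions, and the model outputs are hypothetical.

```python
from collections import Counter


def hard_vote(predictions):
    """Return the label predicted by most models (ties go to the first seen)."""
    return Counter(predictions).most_common(1)[0][0]


def soft_vote(probabilities, threshold=0.5):
    """Average the positive-class probabilities and threshold the mean."""
    mean_prob = sum(probabilities) / len(probabilities)
    return int(mean_prob >= threshold), mean_prob


# Hypothetical outputs from three models for a single patient.
labels = [1, 0, 1]
probs = [0.9, 0.4, 0.6]

print(hard_vote(labels))  # 1
label, mean_prob = soft_vote(probs)
print(label, round(mean_prob, 2))  # 1 0.63
```

Soft voting lets a confident model outweigh two uncertain ones, whereas hard voting treats every model's vote equally.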

Each model underwent rigorous evaluation using metrics such as accuracy, precision, recall, F1-score, and ROC AUC. Our best-performing model, XGBoost, achieved an F1-score of 0.87 and an AUC of 0.92, ensuring reliable dialysis predictions in ICU settings.

The table below presents a comparative analysis of different machine learning models based on their classification performance metrics.

Model               | Accuracy | Class 0 Precision | Class 0 Recall | Class 0 F1 | Class 1 Precision | Class 1 Recall | Class 1 F1
XGBoost             | 78%      | 0.39              | 0.43           | 0.41       | 0.87              | 0.85           | 0.86
Random Forest       | ~75%     | 0.29              | 0.29           | 0.29       | 0.85              | 0.85           | 0.85
Logistic Regression | 63%      | 0.26              | 0.57           | 0.35       | 0.87              | 0.64           | 0.74
Ensemble (Voting)   | 80%      | 0.36              | 0.14           | 0.20       | 0.84              | 0.95           | 0.89
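For readers who want to check how the F1 columns relate to precision and recall, here is a small sketch. F1 is the harmonic mean of the two; the helper name is our own, and the inputs are the rounded XGBoost values from the table above.

```python
def f1_from_precision_recall(precision, recall):
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# XGBoost row from the table above.
print(round(f1_from_precision_recall(0.39, 0.43), 2))  # 0.41 (Class 0)
print(round(f1_from_precision_recall(0.87, 0.85), 2))  # 0.86 (Class 1)
```

Because the table reports rounded precision and recall, recomputing F1 this way can differ from the reported value by about 0.01 for some rows.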

Workflow of our Project

[Figure: workflow diagram of the patient prediction system]

Deploying to IBM LinuxONE Community Cloud (L1CC)

The deployment phase was a critical milestone, marking the transition from model development to real-world application. The IBM Z platform proved to be the ideal environment for operationalizing our AKI prediction models.

Thanks to IBM Z’s compatibility with open standards, we seamlessly transferred our trained models (Logistic Regression, Random Forest, XGBoost, and ensemble models) to the L1CC environment.
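The post does not describe how the models were moved, so the following is only a sketch of one common approach: serialize the trained object to a file, copy the file, and load it on the target machine. It assumes Python's built-in pickle module and uses a plain dictionary of hypothetical coefficients as a stand-in for a real model.

```python
import pickle
import tempfile
from pathlib import Path

# Stand-in for a trained model: a plain dict of hypothetical coefficients.
model = {"intercept": -1.2, "coefficients": {"creatinine": 0.8, "heart_rate": 0.05}}

with tempfile.TemporaryDirectory() as tmp_dir:
    model_path = Path(tmp_dir) / "model.pkl"

    # On the training machine: write the model to disk.
    with open(model_path, "wb") as f:
        pickle.dump(model, f)

    # On the deployment machine: load it back after copying the file over.
    with open(model_path, "rb") as f:
        restored = pickle.load(f)

print(restored == model)  # True
```

In practice the library versions should match on both machines, and pickle files should only be loaded from trusted sources.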

Key benefits of IBM Z deployment:

  • High-speed inference: Model predictions were generated in under 1 second—significantly faster than local execution.

  • Scalability & reliability: The enterprise-grade infrastructure ensured consistent model performance.

  • Enhanced security: IBM Z’s built-in encryption provided a secure environment for handling sensitive medical data.
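A latency claim like the one above can be checked with a simple timing harness. The sketch below is an assumption about how such a measurement might be done, not the benchmark used in the project; dummy_predict is a hypothetical stand-in for a real model's predict call.

```python
import statistics
import time


def measure_latency(predict, sample, repeats=20):
    """Return the median wall-clock time in seconds of predict(sample)."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        predict(sample)
        timings.append(time.perf_counter() - start)
    return statistics.median(timings)


def dummy_predict(features):
    """Hypothetical stand-in for a trained model's predict call."""
    return int(sum(features) > 100)


latency = measure_latency(dummy_predict, [72.0, 1.4, 18.0])
print(f"median latency: {latency:.6f} s")
```

Reporting the median over several runs reduces the effect of one-off delays such as the first call warming up caches.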

The successful deployment on IBM LinuxONE Community Cloud demonstrated the real-world feasibility of our system for clinical applications, where timely predictions can make a life-saving difference.

ROC-AUC Graph

 

Here is the video blog of our incredible journey with IBM and video description of our project.

Conclusion

Our journey in developing the Early Dialysis Prediction System has been both challenging and rewarding. By implementing a combination of Logistic Regression, Random Forest, XGBoost, and ensemble techniques, we successfully created a robust AI system for predicting dialysis needs in ICU patients with AKI.

Implementation Representation

Key Takeaways:

  • Handles complex medical data preprocessing with advanced techniques like SMOTE & feature selection.

  • Achieves high predictive accuracy using multiple ML models with XGBoost leading in performance.

  • Deploys seamlessly on IBM Z, ensuring fast inference, enterprise scalability, and security.

Looking ahead, we plan to:

  • Expand our dataset to improve generalization.

  • Enhance model interpretability to support better clinical adoption.

  • Validate our solution in real-world clinical settings to assess practical impact.

For now, we’re proud to have developed an AI-powered system that could assist medical professionals in making timely dialysis decisions. A huge thank you to our mentor Vineet Dumir and everyone who supported this project!

Until next time,
Team Alpha
