Harnessing the power of IBM LinuxONE to combat electricity theft – A Datathon winning team’s journey!
Authors: Simanga Mchunu (Team leader) - Software Engineer graduate, ALX Africa, simacoder@hotmail.com, Nkosinathi Nhlapo - Data Science graduate, ALX Africa, pronkosinathi@mail.com, Kagiso Leboka - Data Analytics graduate, ALX Africa, kagisogrant@gmail.com, Bongani Baloyi - Software Engineer, ALX Africa, bonganibaloyi94@gmail.com
Mentors: Ajit-Samuel John - Software Development Manager, AI on IBM Z, IBM India Systems Development Lab, Bangalore, India, Ajit.SamuelJohn@ibm.com, Saurabh Srivastava - AI Architect, AI on IBM Z and LinuxONE, IBM India Systems Development Lab, Bangalore, India, Saurabh.srivastava4@ibm.com
Abstract
Meter fraud and electricity theft pose significant financial and operational challenges for utility providers. Fraudulent activities such as meter tampering and illegal connections lead to substantial revenue losses and an increased burden on legitimate consumers. This study leverages machine learning techniques, including Isolation Forest, Random Forest, and XGBoost, to identify anomalies in energy consumption patterns. The findings highlight that XGBoost provides superior accuracy in detecting fraudulent behavior. The study also incorporates geospatial mapping and trend analysis to enhance fraud detection efforts. This research contributes to the ongoing efforts to mitigate electricity theft using AI-driven solutions. The project was supported by IBM Z and LinuxONE machines and was made possible through collaboration with the Shooting Stars Foundation and dedicated mentors.
Figure 1: The data pipeline of our project, illustrating the end-to-end flow from raw smart meter data ingestion through preprocessing, feature engineering, model training, and real-time fraud detection deployment.
Introduction
Electricity theft remains a significant issue in South Africa, leading to billions of rands in annual losses. Fraudulent consumers manipulate their smart meters to reduce their bills, which harms both utility providers and honest customers. Traditional fraud detection methods rely on manual inspections and rule-based systems, which are inefficient and prone to human error. This study proposes an automated, machine-learning-based approach to detecting fraudulent energy consumption patterns, built on the IBM LinuxONE Community Cloud (L1CC). The high-performance computing capabilities of L1CC allow us to process large datasets, train machine learning models efficiently, and detect anomalies in real time.
L1CC offers a powerful, secure, and cost-effective solution for running complex computations, such as fraud detection algorithms. Its sustainability, scalability, and performance enhancements make it an ideal choice for enterprises dealing with vast data-processing requirements. By adopting L1CC, we not only improve operational efficiency but also contribute to a more secure and sustainable IT infrastructure.
Problem Statement
Electricity theft in South Africa poses a significant challenge to utility providers, resulting in economic losses estimated at approximately R20 billion annually (Netwerk24, 2024; Mujuzi, 2020). According to Jamil Ddamulira Mujuzi, the leading cause of blackouts in South Africa is the collapse of Eskom infrastructure as a result of illegal connections. Figure 2 shows the increase in load-shedding hours over the years, which indicates the impact of degrading Eskom infrastructure, possibly due to illegal connections.
Traditional manual detection methods are proving increasingly ineffective and time-consuming in combating this pervasive issue. This necessitates the exploration and implementation of automated, AI-driven solutions capable of real-time anomaly detection and prevention to mitigate the escalating financial burden and ensure the sustainable provision of electricity services.
Figure 2: Load-shedding trend in hours, by stage (Wikipedia, South African Energy Crisis 2025)
In South Africa, power outages are implemented in six stages, depending on the demand that the power grid has to meet. The stages of load shedding are defined below:
Stage 1: 2 hours of power cut per day (1000 MW to be shed in selected areas)
Stage 2: 4 hours of power cut per day (2000 MW to be shed across the whole country)
Stage 3: 4 hours of power cut per day (3000 MW to be shed across the whole country)
Stage 4: 8 hours of power cut per day (4000 MW to be shed across the whole country)
Stage 5: at least 8 hours of power cut per day (5000 MW to be shed across the whole country)
Stage 6: at least 8 hours of power cut per day (6000 MW to be shed across the whole country)
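The stage definitions above can be captured in a small lookup table so that a load-shedding stage can be attached to each meter reading as numeric features. The following is a minimal Python sketch, not the team's actual code: the class name LoadSheddingStage, the function stage_features, the feature names prefixed with ls_, and the convention that stage 0 means "no load shedding" are all assumptions made for illustration. The hours and MW values are copied from the list above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LoadSheddingStage:
    """One load-shedding stage, using the figures listed in the text."""
    stage: int
    hours_per_day: int  # hours of power cut per day
    is_minimum: bool    # True where the text says "at least"
    megawatts: int      # MW figure quoted for the stage
    scope: str          # area affected


# Lookup table keyed by stage number (values taken from the list above).
STAGES = {
    s.stage: s
    for s in (
        LoadSheddingStage(1, 2, False, 1000, "selected areas"),
        LoadSheddingStage(2, 4, False, 2000, "whole country"),
        LoadSheddingStage(3, 4, False, 3000, "whole country"),
        LoadSheddingStage(4, 8, False, 4000, "whole country"),
        LoadSheddingStage(5, 8, True, 5000, "whole country"),
        LoadSheddingStage(6, 8, True, 6000, "whole country"),
    )
}


def stage_features(stage: int) -> dict:
    """Return numeric features for a stage.

    Stage 0 is treated as "no load shedding"; this convention is an
    assumption of this sketch, not something stated in the article.
    Unknown stage numbers raise KeyError.
    """
    if stage == 0:
        return {"ls_stage": 0, "ls_hours": 0, "ls_megawatts": 0}
    info = STAGES[stage]
    return {
        "ls_stage": info.stage,
        "ls_hours": info.hours_per_day,
        "ls_megawatts": info.megawatts,
    }
```

Keeping the stages in one table means a reading taken during an outage can be distinguished from a suspiciously low reading taken while power was available, which is one plausible way to account for load shedding in the data.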
Data Collection & Preprocessing
Dataset Features
The dataset includes:
● Customer Data: Meter ID, Timestamp, Province, City, GPS Coordinates
● Energy Metrics: Energy Consumption (kWh), Solar Generation (kWh), Voltage, Frequency, Power Factor
● Fraud Label: Binary classification (Fraud/No Fraud)
● Load Shedding Impact Considered
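To make the feature list above concrete, one record of the dataset could be represented as follows. This is a hedged sketch using only the Python standard library: the field names, the types, and the range checks in validation_errors are assumptions for illustration, since the article lists the features but not their exact column names or units. The load-shedding item is left out of the record because the text does not say how it was encoded.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class MeterReading:
    """One smart meter record; field names are hypothetical."""
    # Customer data
    meter_id: str
    timestamp: datetime
    province: str
    city: str
    latitude: float   # GPS coordinates, decimal degrees
    longitude: float
    # Energy metrics
    energy_consumption_kwh: float
    solar_generation_kwh: float
    voltage: float
    frequency: float
    power_factor: float
    # Fraud label: True = Fraud, False = No Fraud
    is_fraud: bool


def validation_errors(reading: MeterReading) -> list[str]:
    """Return basic sanity-check failures for one reading (assumed rules)."""
    errors = []
    if not -90.0 <= reading.latitude <= 90.0:
        errors.append("latitude out of range")
    if not -180.0 <= reading.longitude <= 180.0:
        errors.append("longitude out of range")
    if reading.energy_consumption_kwh < 0:
        errors.append("negative energy consumption")
    if reading.solar_generation_kwh < 0:
        errors.append("negative solar generation")
    if not 0.0 <= reading.power_factor <= 1.0:
        errors.append("power factor outside [0, 1]")
    return errors
```

Checks like these would typically run during preprocessing, before feature engineering, so that physically impossible readings are flagged as data-quality problems and not mistaken for fraud.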