Global AI and Data Science


A Case Study On Telecom Customer Analytics

By Moloy De posted Thu December 03, 2020 08:12 PM

  
The following is a case study from my work with a telecom giant in Africa.

Business Context


Business Intelligence is no longer about daily batch jobs that pull data into the EDW at the end of the day. As the business expands rapidly, it becomes increasingly important to have pervasive, corporate-wide BI that empowers everyone in the organization, at all levels, with analytics, alerts and feedback at the right time.

Planning within the organization needs to be automated and to rest on a scientific foundation. At present, all planning, from finance to new product campaigns, relies on projected figures derived from historic statistics and guesstimates.
Most of the planning models used in business cases, budget planning and forecasting are built in Excel. The figures they produce are often subjective and lack scientific justification, and because the models are not automated it is difficult to modify them or to consider new variables that may affect planning projections.

In a competitive environment, Business Analysts (Power Users) and business users need to be empowered to access and exploit data from the EDW to support the immediate requirement for detailed, ad hoc customer-level and product-level analysis.


Churn Prediction

A churn likelihood is predicted for each active MSISDN. These likelihoods are uploaded to a table in the Mining Mart schema of the EDW. The prediction job runs once a month, and each run generates a new table, with a timestamp in its name, to hold the results. Reports are published on the churn analysis results.

PROC LOGISTIC in SAS is used to build the churn model. A churn likelihood between 0 and 1 is assigned to each MSISDN. The estimated parameters for each churn variable are also published so that their effect on churn propensity can be seen. When run on around 45 million subscribers, the model produced a churn prediction accuracy of around 84.9%.
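An accuracy figure depends on the cutoff used to turn a likelihood into a churn / no-churn call, and the text does not state which cutoff was used. The sketch below, in plain Python rather than SAS, shows one way to compare cutoffs on a small set of made-up scores. Every name and number in it is hypothetical, and Youden's J (recall + specificity - 1) is just one possible selection criterion, not necessarily the one used in the project.

```python
def confusion_counts(scores, labels, threshold):
    """Count TP, FP, TN, FN when score >= threshold is called 'churn'."""
    tp = fp = tn = fn = 0
    for s, y in zip(scores, labels):
        predicted = 1 if s >= threshold else 0
        if predicted == 1 and y == 1:
            tp += 1
        elif predicted == 1 and y == 0:
            fp += 1
        elif predicted == 0 and y == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def metrics_at(scores, labels, threshold):
    """Accuracy, recall, specificity and Youden's J at one cutoff."""
    tp, fp, tn, fn = confusion_counts(scores, labels, threshold)
    recall = tp / (tp + fn) if (tp + fn) else 0.0        # share of churners caught
    specificity = tn / (tn + fp) if (tn + fp) else 0.0   # share of stayers kept
    return {
        "accuracy": (tp + tn) / len(labels),
        "recall": recall,
        "specificity": specificity,
        "youden_j": recall + specificity - 1.0,
    }


def best_threshold(scores, labels, candidates):
    """Return the candidate cutoff with the highest Youden's J."""
    return max(candidates, key=lambda t: metrics_at(scores, labels, t)["youden_j"])


# Hypothetical churn likelihoods and outcomes (1 = churned); churners are the minority.
scores = [0.05, 0.10, 0.15, 0.20, 0.25, 0.30, 0.35, 0.40, 0.45, 0.60]
labels = [0, 0, 0, 0, 0, 0, 1, 0, 1, 1]

candidates = [0.2, 0.3, 0.35, 0.4, 0.5]
for t in candidates:
    print(t, metrics_at(scores, labels, t))
print("best cutoff by Youden's J:", best_threshold(scores, labels, candidates))
```

On this toy data a cutoff of 0.5 catches only one of the three churners. When churners are a small share of subscribers, predicted likelihoods tend to sit well below 0.5, so the cutoff is worth tuning against the business cost of missed churners versus wasted retention offers.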

Initially, churn prediction accuracy came out very low. The histogram of each explanatory variable was then studied. The histogram of the TENURE variable was found to be extremely skewed to the left, with a high accumulation near the maximum. These turned out to be records that were migrated from the old EDW and don't have an ACTIVATION DATE. The logistic regression was then rerun with these high-TENURE records kept out of the model fitting and included only in prediction. The method proved successful, producing high churn prediction accuracy.
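The fit-on-clean-records, score-everything approach described above can be sketched as follows. This is a minimal illustration in plain Python, not the SAS job itself: the sentinel value `SENTINEL_TENURE`, the record layout, the single TENURE feature and the gradient-descent fit are all assumptions made for the example, and the data is invented.

```python
import math

SENTINEL_TENURE = 999  # hypothetical: tenure in months stored when ACTIVATION DATE is missing


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic(xs, ys, lr=0.1, epochs=2000):
    """One-feature logistic regression fitted by batch gradient descent."""
    b0 = b1 = 0.0
    n = len(xs)
    for _ in range(epochs):
        g0 = g1 = 0.0
        for x, y in zip(xs, ys):
            err = sigmoid(b0 + b1 * x) - y  # gradient of the log-loss
            g0 += err
            g1 += err * x
        b0 -= lr * g0 / n
        b1 -= lr * g1 / n
    return b0, b1


# (msisdn, tenure_months, churned): all values are made up for illustration.
records = [
    ("A01", 3, 1), ("A02", 6, 1), ("A03", 12, 1), ("A04", 18, 0),
    ("A05", 24, 1), ("A06", 36, 0), ("A07", 48, 0), ("A08", 60, 0),
    ("A09", SENTINEL_TENURE, 1), ("A10", SENTINEL_TENURE, 0),  # migrated records
]

# Fit only on records whose tenure is genuine (tenure expressed in years).
train = [r for r in records if r[1] < SENTINEL_TENURE]
b0, b1 = fit_logistic([t / 12 for _, t, _ in train], [y for _, _, y in train])

# Score every record, including the migrated ones, using their stored tenure.
scores = {msisdn: sigmoid(b0 + b1 * t / 12) for msisdn, t, _ in records}
```

Keeping the migrated records out of the fit stops the artificial spike at the maximum TENURE from distorting the estimated coefficient, while every MSISDN still receives a likelihood.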



Product Propensity Calculation

Product propensities are calculated in much the same way as churn likelihoods. Here the target variable is the binary (1/0) product flag showing whether an MSISDN has the product or not. The explanatory attributes are extracted from the database for each active MSISDN. PROC LOGISTIC in SAS is run to calculate the product propensities, and the estimated betas are published as propensity parameters that show the effect of each attribute on product propensity.
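To illustrate how published betas can be read, the sketch below converts a set of coefficients into odds ratios and into a propensity for one subscriber. The attribute names and coefficient values are hypothetical and are not taken from the project.

```python
import math

# Hypothetical fitted coefficients from a logistic model of a product flag.
betas = {
    "intercept": -2.0,
    "data_usage_gb": 0.30,
    "tenure_years": 0.10,
    "is_prepaid": -0.50,
}


def odds_ratios(betas):
    """exp(beta): multiplicative change in the odds per unit increase of an attribute."""
    return {name: math.exp(b) for name, b in betas.items() if name != "intercept"}


def propensity(betas, row):
    """Probability of holding the product for one subscriber's attribute values."""
    z = betas["intercept"] + sum(betas[name] * value for name, value in row.items())
    return 1.0 / (1.0 + math.exp(-z))


subscriber = {"data_usage_gb": 4.0, "tenure_years": 2.0, "is_prepaid": 1.0}
print(odds_ratios(betas))
print(propensity(betas, subscriber))
```

An odds ratio above 1 means the attribute raises the odds of holding the product, and one below 1 means it lowers them, which is the sense in which the betas show each attribute's effect.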


Customer Lifetime Value

The CLV of a customer is derived from the customer's revenue (R) during the past six months and the customer's churn likelihood (p). The other inputs are the discount rate (D) and the number of future months (M) considered in the CLV calculation. The formula for CLV calculation is given by
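The expression itself does not appear in the text. A commonly used discounted, survival-weighted form that is consistent with the inputs listed above is shown here as an assumption, not as the exact expression used in the project. It treats p as a monthly churn likelihood, D as a monthly discount rate, and writes \hat{R}_t for the revenue predicted for future month t:

```latex
\mathrm{CLV} \;=\; \sum_{t=1}^{M} \frac{\hat{R}_t \,(1 - p)^{t}}{(1 + D)^{t}}
```

Here (1 - p)^t is the probability that the customer is still active in month t, and (1 + D)^t discounts that month's revenue back to the present.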


Given the past six months' revenue of a customer (MSISDN), PROC REG in SAS is applied to the individual customer's data to predict revenue in future months. Once the CLV is calculated for a customer, it is uploaded to the Mining Mart schema against the MSISDN.
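The two steps above, a per-customer regression on six months of revenue followed by a CLV calculation, can be sketched in plain Python. The sketch makes several assumptions that the text does not confirm: a straight-line trend over the month index as the regression, p read as a monthly churn likelihood, D as a monthly discount rate, and CLV taken as the sum over future months t = 1..M of predicted revenue times (1 - p)^t divided by (1 + D)^t.

```python
def fit_line(ys):
    """Least-squares line through (1, y1), ..., (n, yn); returns (intercept, slope)."""
    n = len(ys)
    xs = range(1, n + 1)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return y_mean - slope * x_mean, slope


def forecast_revenue(past_revenue, months_ahead):
    """Extend the fitted trend; negative forecasts are floored at zero."""
    intercept, slope = fit_line(past_revenue)
    n = len(past_revenue)
    return [max(0.0, intercept + slope * (n + t)) for t in range(1, months_ahead + 1)]


def clv(past_revenue, churn_prob, discount_rate, months_ahead):
    """Assumed form: sum of forecast revenue, survival-weighted and discounted."""
    total = 0.0
    for t, revenue in enumerate(forecast_revenue(past_revenue, months_ahead), start=1):
        survival = (1.0 - churn_prob) ** t          # still active in month t
        total += revenue * survival / (1.0 + discount_rate) ** t
    return total


# Hypothetical customer: six months of revenue, 5% monthly churn, 1% monthly discount.
print(clv([90.0, 95.0, 100.0, 98.0, 105.0, 110.0],
          churn_prob=0.05, discount_rate=0.01, months_ahead=12))
```

With flat revenue of 100 per month, no churn and no discounting, the function returns 1200 over 12 months, which is a quick sanity check on the arithmetic.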


Customer Segmentation

PROC FASTCLUS in SAS is used to apply k-means clustering to customer usage data. The number of clusters (k) is provided as input, and as the model runs it uploads the cluster ID for each active MSISDN to a Mining Mart table.
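Because k has to be supplied up front, a common way to choose it is to run the clustering for several values of k and look for the "elbow" where the within-cluster sum of squares stops dropping sharply. The sketch below is a small plain-Python k-means written for illustration; it is not the SAS procedure. The farthest-point seeding, the second-difference elbow rule and the usage data are all assumptions chosen to keep the example deterministic.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def farthest_point_init(points, k):
    """Deterministic seeding: start at the first point, then repeatedly take
    the point farthest from the centers chosen so far."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(sq_dist(p, c) for c in centers)))
    return centers


def kmeans(points, k, max_iter=100):
    """Lloyd's algorithm; returns (labels, centers, inertia)."""
    centers = farthest_point_init(points, k)
    labels = None
    for _ in range(max_iter):
        new_labels = [min(range(k), key=lambda j: sq_dist(p, centers[j])) for p in points]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old center if a cluster empties
                centers[j] = tuple(sum(col) / len(members) for col in zip(*members))
    inertia = sum(sq_dist(p, centers[lab]) for p, lab in zip(points, labels))
    return labels, centers, inertia


def elbow_k(inertia_by_k):
    """Interior k with the largest second difference of the inertia curve.
    Expects consecutive integer values of k."""
    ks = sorted(inertia_by_k)
    return max(ks[1:-1],
               key=lambda k: inertia_by_k[k - 1] - 2 * inertia_by_k[k] + inertia_by_k[k + 1])


# Hypothetical usage data per MSISDN: (voice hours, data GB) in three loose groups.
points = [(1, 1), (1, 2), (2, 1), (2, 2),
          (10, 1), (10, 2), (11, 1), (11, 2),
          (5, 10), (5, 11), (6, 10), (6, 11)]

inertia_by_k = {k: kmeans(points, k)[2] for k in range(1, 6)}
print(inertia_by_k)
print("elbow at k =", elbow_k(inertia_by_k))
```

On real usage data the elbow is rarely this clean, so it is usually read alongside business interpretability of the segments rather than applied mechanically.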


Ad-hoc Reporting

All data mining details, including the data extracted from the EDW and the results calculated using SAS, are stored in a reporting table in the Mining Mart and archived for the past two months. Users can run ad hoc queries on this table to generate their own reports.

QUESTION I: How justified is a threshold of 0.5 in churn prediction?
QUESTION II: How can the optimal k in k-means clustering be found?

#GlobalAIandDataScience
#GlobalDataScience