Vodafone is a global player in the telecoms industry, and its reputation thrives on its ability to innovate and adapt to market trends. To that end, Vodafone’s Marketing Academy was seeking a relevant, in-depth and effective professional education course to sharpen the digital skills of its workforce.




Information technology and related services


Software development, data analytics, artificial intelligence, cloud computing, computer systems, and IT consulting services.


Telco customer churn (11.1.3+) (ibm.com)


In this scenario, IBM runs a talent search competition on Kaggle (a globally renowned platform for working with data and artificial intelligence) and uses it to discover and attract talented people in technology and data analytics.

You will compete against thousands of talented participants from around the world.

The challenge data is provided by the IBM Sample Team.

“Predict behavior to retain customers. You can analyze all relevant customer data and develop focused customer retention programs.” [IBM Sample Data Sets]


Data cleaning and validation

Churn is one of the biggest challenges in the telecommunications industry. Research has shown that the average monthly churn rate for the top 4 mobile service providers in the United States ranges from 1.9% to 2%. 
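To put the monthly figures in perspective, a monthly churn rate can be compounded into an annual one. The sketch below is only an illustration: it assumes the monthly rate stays constant for twelve months and that no new customers are added, and the function name `annual_churn` is made up for this example.

```python
def annual_churn(monthly_rate: float, months: int = 12) -> float:
    """Share of an initial customer base lost after `months`,
    assuming the same monthly churn rate applies every month."""
    retained = (1.0 - monthly_rate) ** months  # fraction still subscribed
    return 1.0 - retained


# The two ends of the range quoted above (1.9% and 2% per month).
for monthly in (0.019, 0.02):
    print(f"monthly {monthly:.1%} -> annual {annual_churn(monthly):.1%}")
```

Under these assumptions, a monthly rate of about 2% compounds to roughly a fifth of the customer base per year, which is why small monthly differences matter.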

Using machine learning techniques to build a model.

In this project, the objective is to predict churn (whether a customer will leave the service or not), and we will also explore what contributes to churn. The issues we will analyze include:

  • Demographic statistics in the project: Gender distribution, percentage of elderly individuals, partner and dependent status.
  • Customer account information: Usage duration, contracts, the distribution of different services used by customers.
  • Is there any relationship between churn and whether users opt for monthly or yearly subscription packages?
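The last question in the list above can be approached by grouping customers by contract type and comparing churn rates per group. The following is a minimal sketch using only the standard library. The rows are hand-made toy data, not taken from the IBM dataset, and the column names `Contract` and `Churn` (with `Yes`/`No` values) are assumptions about how the data is laid out.

```python
from collections import defaultdict

# Toy rows for illustration only; NOT real results from the dataset.
customers = [
    {"Contract": "Month-to-month", "Churn": "Yes"},
    {"Contract": "Month-to-month", "Churn": "Yes"},
    {"Contract": "Month-to-month", "Churn": "No"},
    {"Contract": "Month-to-month", "Churn": "No"},
    {"Contract": "One year", "Churn": "No"},
    {"Contract": "One year", "Churn": "Yes"},
    {"Contract": "One year", "Churn": "No"},
    {"Contract": "One year", "Churn": "No"},
]


def churn_rate_by(rows, key):
    """Return {group value: fraction of rows in that group with Churn == 'Yes'}."""
    totals = defaultdict(int)
    churned = defaultdict(int)
    for row in rows:
        totals[row[key]] += 1
        if row["Churn"] == "Yes":
            churned[row[key]] += 1
    return {group: churned[group] / totals[group] for group in totals}


rates = churn_rate_by(customers, "Contract")
for contract, rate in sorted(rates.items()):
    print(f"{contract}: {rate:.0%} churned")
```

The same helper can be pointed at any other categorical column (for example a partner or dependents flag) by changing the `key` argument.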


We have captured users’ demographic information, habits, and their usage rates.

Furthermore, we have gained a clear understanding of how services are distributed and of the relationships between users, services, and costs. With this information, we can plan reasonable changes to services or pricing that help retain users.

Building on the exploratory data analysis (EDA) we conducted, we can build a machine learning model to predict churn with the algorithms we employed. The predictions are usable in practice, scoring approximately 81%. Additionally, we can examine how much each feature contributes to the prediction.
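The modeling step can be sketched as follows. The algorithms used and the metric behind the 81% figure are not named here, so this example makes its own assumptions: it uses a small hand-written logistic regression, plain accuracy as the score, and the learned weights as a crude indication of feature influence. The two features (scaled tenure and a month-to-month flag) and all data rows are hypothetical toy values.

```python
import math

# Toy data for illustration only. Each row: [tenure scaled to 0..1, month-to-month flag].
X = [[0.05, 1], [0.10, 1], [0.20, 1], [0.30, 1], [0.15, 0],
     [0.50, 0], [0.80, 0], [0.90, 1], [0.02, 0], [0.60, 1]]
y = [1, 1, 1, 0, 0, 0, 0, 0, 1, 0]  # 1 = churned, 0 = stayed
feature_names = ["tenure_scaled", "month_to_month"]  # hypothetical names


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic(X, y, lr=1.0, epochs=5000):
    """Fit weights and bias with full-batch gradient descent on the log loss."""
    n, n_features = len(X), len(X[0])
    w, b = [0.0] * n_features, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n_features, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict(w, b, xi):
    """Predict 1 (churn) when the estimated probability is at least 0.5."""
    return 1 if sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) >= 0.5 else 0


def accuracy(w, b, X, y):
    """Fraction of rows whose predicted label matches the true label."""
    return sum(predict(w, b, xi) == yi for xi, yi in zip(X, y)) / len(y)


w, b = fit_logistic(X, y)
print(f"training accuracy: {accuracy(w, b, X, y):.0%}")
for name, weight in zip(feature_names, w):
    print(f"{name}: weight {weight:+.2f}")
```

On real data the score should be measured on a held-out test split rather than on the training rows, and weights are only comparable as importances when the features are on similar scales.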