5.1 General Overview

The data set used in this study contains 29 variables and 7,032 customers from a telecom firm. For each client, the data includes:

  • Demographic information: CustomerID, City, Zip_Code, Latitude, Longitude, Gender, Senior_Citizen, Partner and Dependents.

  • Customer account information: Tenure_Months, Contract, Paperless_Billing, Payment_Method, Monthly_Charges, Total_Charges, Churn_Label, Churn_Value, Churn_Score, CLTV, Churn_Reason.

  • Services information: Phone_Service, Multiple_Lines, Internet_Service, Online_Security, Online_Backup, Device_Protection, Tech_Support, Streaming_TV, Streaming_Movies.
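As a quick consistency check on the grouping above, the three groups can be written down and counted. This is only a sketch: the column names are copied from the list above, while the dictionary name COLUMN_GROUPS and the group keys are illustrative and not part of the data set.

```python
# Column groups as listed in the text; the dictionary name and keys are
# hypothetical labels chosen for this sketch.
COLUMN_GROUPS = {
    "demographic": [
        "CustomerID", "City", "Zip_Code", "Latitude", "Longitude",
        "Gender", "Senior_Citizen", "Partner", "Dependents",
    ],
    "account": [
        "Tenure_Months", "Contract", "Paperless_Billing", "Payment_Method",
        "Monthly_Charges", "Total_Charges", "Churn_Label", "Churn_Value",
        "Churn_Score", "CLTV", "Churn_Reason",
    ],
    "services": [
        "Phone_Service", "Multiple_Lines", "Internet_Service",
        "Online_Security", "Online_Backup", "Device_Protection",
        "Tech_Support", "Streaming_TV", "Streaming_Movies",
    ],
}

# Flatten the groups and check that they cover 29 distinct variables.
all_columns = [name for group in COLUMN_GROUPS.values() for name in group]
n_columns = len(all_columns)
has_duplicates = len(set(all_columns)) != n_columns

print(n_columns, has_duplicates)
```

The count matches the 29 variables stated at the start of the section, and no variable appears in two groups.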

Table 5.1: The first 6 customers in the data
CustomerID | City | Zip_Code | Latitude | Longitude | Gender | Senior_Citizen | Partner | Dependents | Tenure_Months | Phone_Service | Multiple_Lines | Internet_Service | Online_Security | Online_Backup | Device_Protection | Tech_Support | Streaming_TV | Streaming_Movies | Contract | Paperless_Billing | Payment_Method | Monthly_Charges | Total_Charges | Churn_Label | Churn_Value | Churn_Score | CLTV | Churn_Reason
1 | Los Angeles | 90003 | 33.96 | -118.27 | Male | No | No | No | 2 | Yes | Yes | DSL | Yes | Yes | No | No | No | No | Month-to-month | Yes | Mailed check | 53.85 | 108.15 | Yes | 1 | 86 | 3239 | Competitor made better offer
2 | Los Angeles | 90005 | 34.06 | -118.31 | Female | No | No | Yes | 2 | Yes | Yes | Fiber optic | No | No | No | No | No | No | Month-to-month | Yes | Electronic check | 70.70 | 151.65 | Yes | 1 | 67 | 2701 | Moved
3 | Los Angeles | 90006 | 34.05 | -118.29 | Female | No | No | Yes | 8 | Yes | Yes | Fiber optic | No | No | Yes | No | Yes | Yes | Month-to-month | Yes | Electronic check | 99.65 | 820.50 | Yes | 1 | 86 | 5372 | Moved
4 | Los Angeles | 90010 | 34.06 | -118.32 | Female | No | Yes | Yes | 28 | Yes | Yes | Fiber optic | No | No | Yes | Yes | Yes | Yes | Month-to-month | Yes | Electronic check | 104.80 | 3046.05 | Yes | 1 | 84 | 5003 | Moved
5 | Los Angeles | 90015 | 34.04 | -118.27 | Male | No | No | Yes | 49 | Yes | Yes | Fiber optic | No | Yes | Yes | No | Yes | Yes | Month-to-month | Yes | Bank transfer | 103.70 | 5036.30 | Yes | 1 | 89 | 5340 | Competitor had better devices
6 | Los Angeles | 90020 | 34.07 | -118.31 | Female | No | Yes | No | 10 | Yes | Yes | DSL | No | No | Yes | Yes | No | No | Month-to-month | No | Credit card | 55.20 | 528.35 | Yes | 1 | 78 | 5925 | Competitor offered higher download speeds

As Table 5.1 shows, Churn_Value is the status variable: it indicates whether the customer left the firm’s portfolio within the last month. Tenure_Months is the duration variable.

Since the purpose of our study is to estimate the overall value of this fictional firm’s portfolio, two groups of target variables can be considered. On the one hand, Churn_Value and Tenure_Months make it possible to determine whether a customer is still active in the portfolio; they are used as response variables in the survival models. On the other hand, the Monthly_Charges variable gives the price paid by each customer every month and may be used to derive a raw customer value. Although the CLTV variable is meant to represent each customer’s lifetime value, we have no information on how it is calculated. It is therefore not used in the model developed in the next chapter.
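To make the two groups of target variables concrete, the sketch below builds a (duration, event) pair and a simple raw value for the customers of Table 5.1. It rests on two assumptions that the text does not confirm: a Churn_Value of 1 is read as an observed churn event (any other value would be treated as censored), and the raw value is taken as Monthly_Charges multiplied by Tenure_Months, which is only one possible definition. The names Customer, survival_response and raw_value are hypothetical.

```python
from typing import NamedTuple


class Customer(NamedTuple):
    customer_id: int
    tenure_months: int      # duration variable
    churn_value: int        # status variable
    monthly_charges: float  # price paid each month


# Values copied from Table 5.1.
customers = [
    Customer(1, 2, 1, 53.85),
    Customer(2, 2, 1, 70.70),
    Customer(3, 8, 1, 99.65),
    Customer(4, 28, 1, 104.80),
    Customer(5, 49, 1, 103.70),
    Customer(6, 10, 1, 55.20),
]


def survival_response(c: Customer) -> tuple[int, bool]:
    """Return the (duration, event) pair used as a survival response.

    ASSUMPTION: Churn_Value == 1 marks an observed churn; any other
    value is treated as a censored observation.
    """
    return c.tenure_months, c.churn_value == 1


def raw_value(c: Customer) -> float:
    """Return an illustrative raw customer value.

    ASSUMPTION: value = Monthly_Charges * Tenure_Months. This is a
    placeholder definition for the sketch, not the one used later.
    """
    return round(c.monthly_charges * c.tenure_months, 2)


responses = [survival_response(c) for c in customers]
raw_values = [raw_value(c) for c in customers]

for c, (duration, event), value in zip(customers, responses, raw_values):
    print(c.customer_id, duration, event, value)
```

Under this placeholder definition the raw value is close to, but not equal to, Total_Charges, which suggests that monthly prices changed over some customers' tenure.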