5.1 General Overview
The data set used in this study covers 7,032 customers of a telecom firm, each described by 29 variables. For each customer, the data include:
Demographic information: CustomerID, City, Zip_Code, Latitude, Longitude, Gender, Senior_Citizen, Partner, and Dependents.
Customer account information: Tenure_Months, Contract, Paperless_Billing, Payment_Method, Monthly_Charges, Total_Charges, Churn_Label, Churn_Value, Churn_Score, CLTV, and Churn_Reason.
Services information: Phone_Service, Multiple_Lines, Internet_Service, Online_Security, Online_Backup, Device_Protection, Tech_Support, Streaming_TV, and Streaming_Movies.
| CustomerID | City | Zip_Code | Latitude | Longitude | Gender | Senior_Citizen | Partner | Dependents | Tenure_Months | Phone_Service | Multiple_Lines | Internet_Service | Online_Security | Online_Backup | Device_Protection | Tech_Support | Streaming_TV | Streaming_Movies | Contract | Paperless_Billing | Payment_Method | Monthly_Charges | Total_Charges | Churn_Label | Churn_Value | Churn_Score | CLTV | Churn_Reason |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | Los Angeles | 90003 | 33.96 | -118.27 | Male | No | No | No | 2 | Yes | Yes | DSL | Yes | Yes | No | No | No | No | Month-to-month | Yes | Mailed check | 53.85 | 108.15 | Yes | 1 | 86 | 3239 | Competitor made better offer |
| 2 | Los Angeles | 90005 | 34.06 | -118.31 | Female | No | No | Yes | 2 | Yes | Yes | Fiber optic | No | No | No | No | No | No | Month-to-month | Yes | Electronic check | 70.70 | 151.65 | Yes | 1 | 67 | 2701 | Moved |
| 3 | Los Angeles | 90006 | 34.05 | -118.29 | Female | No | No | Yes | 8 | Yes | Yes | Fiber optic | No | No | Yes | No | Yes | Yes | Month-to-month | Yes | Electronic check | 99.65 | 820.50 | Yes | 1 | 86 | 5372 | Moved |
| 4 | Los Angeles | 90010 | 34.06 | -118.32 | Female | No | Yes | Yes | 28 | Yes | Yes | Fiber optic | No | No | Yes | Yes | Yes | Yes | Month-to-month | Yes | Electronic check | 104.80 | 3046.05 | Yes | 1 | 84 | 5003 | Moved |
| 5 | Los Angeles | 90015 | 34.04 | -118.27 | Male | No | No | Yes | 49 | Yes | Yes | Fiber optic | No | Yes | Yes | No | Yes | Yes | Month-to-month | Yes | Bank transfer | 103.70 | 5036.30 | Yes | 1 | 89 | 5340 | Competitor had better devices |
| 6 | Los Angeles | 90020 | 34.07 | -118.31 | Female | No | Yes | No | 10 | Yes | Yes | DSL | No | No | Yes | Yes | No | No | Month-to-month | No | Credit card | 55.20 | 528.35 | Yes | 1 | 78 | 5925 | Competitor offered higher download speeds |
As shown in Table 5.1, Churn_Value is the status variable, indicating whether the customer left the firm’s portfolio within the last month, and Tenure_Months is the duration variable.
Since the purpose of our study is to estimate the overall value of this fictional firm’s portfolio, two groups of target variables can be considered. On the one hand, Churn_Value and Tenure_Months make it possible to determine whether a customer is still active in the portfolio; they are used as response variables in the survival models. On the other hand, the Monthly_Charges variable gives the price paid by each customer every month and may be used to derive a raw customer value. Although the CLTV variable is meant to measure each customer’s lifetime value, we have no information on how it is calculated. It is therefore not used in the model developed in the next chapter.
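To make the split between the two groups of target variables concrete, the following minimal Python sketch extracts them from the six rows of Table 5.1. The `SurvivalRecord` type and the `to_survival_record` helper are hypothetical names introduced for illustration only; they are not part of the study's code. The sketch also assumes the usual survival-analysis convention that a Churn_Value of 0 marks a customer who is still active (a censored observation).

```python
from typing import NamedTuple


class SurvivalRecord(NamedTuple):
    """Hypothetical container for the target variables of one customer."""
    duration: int           # Tenure_Months: duration variable
    event: int              # Churn_Value: 1 = churned, 0 = assumed still active
    monthly_charges: float  # Monthly_Charges: price paid each month


# Relevant columns of the six sample rows shown in Table 5.1.
rows = [
    {"CustomerID": 1, "Tenure_Months": 2, "Churn_Value": 1, "Monthly_Charges": 53.85, "CLTV": 3239},
    {"CustomerID": 2, "Tenure_Months": 2, "Churn_Value": 1, "Monthly_Charges": 70.70, "CLTV": 2701},
    {"CustomerID": 3, "Tenure_Months": 8, "Churn_Value": 1, "Monthly_Charges": 99.65, "CLTV": 5372},
    {"CustomerID": 4, "Tenure_Months": 28, "Churn_Value": 1, "Monthly_Charges": 104.80, "CLTV": 5003},
    {"CustomerID": 5, "Tenure_Months": 49, "Churn_Value": 1, "Monthly_Charges": 103.70, "CLTV": 5340},
    {"CustomerID": 6, "Tenure_Months": 10, "Churn_Value": 1, "Monthly_Charges": 55.20, "CLTV": 5925},
]


def to_survival_record(row: dict) -> SurvivalRecord:
    """Keep the survival responses and the monthly price; CLTV is deliberately ignored."""
    return SurvivalRecord(
        duration=int(row["Tenure_Months"]),
        event=int(row["Churn_Value"]),
        monthly_charges=float(row["Monthly_Charges"]),
    )


records = [to_survival_record(r) for r in rows]
n_events = sum(r.event for r in records)          # number of observed churn events
total_months = sum(r.duration for r in records)   # total observed tenure in months

print(records[0])
print(n_events, total_months)
```

All six sample rows happen to be churners, so every record here has `event == 1`; in the full data set, customers still in the portfolio would carry `event == 0`.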