# example_dataset.csv
# This CSV file provides a sample dataset for demonstrating feature engineering techniques within the feature-engineering-toolkit plugin.
#
# Column Descriptions:
# - user_id: Unique identifier for each user (integer).
# - age: Age of the user (integer).
# - gender: Gender of the user (categorical: Male, Female, Other).
# - signup_date: Date the user signed up (YYYY-MM-DD).
# - last_login: Date of the user's last login (YYYY-MM-DD).
# - total_purchases: Total number of purchases made by the user (integer).
# - avg_purchase_value: Average value of each purchase (float).
# - country: Country of the user (categorical).
# - marketing_channel: The marketing channel through which the user signed up (categorical).
# - is_active: Indicates whether the user is currently active (boolean: True, False).
# - churned: Target variable indicating whether the user churned (boolean: True, False). This is what we want to predict.
#
# Instructions:
# - Use this dataset to experiment with feature engineering techniques.
# - Consider creating new features such as:
#   - Time since signup (calculated from signup_date).
#   - Time since last login (calculated from last_login).
#   - Purchase frequency (total_purchases / time since signup).
#   - Age groups (binning the age variable).
#   - Interactions between features (e.g., age * avg_purchase_value).
# - Use feature selection techniques to identify the most important features for predicting churn.
# - Apply feature transformations (e.g., scaling, normalization, encoding categorical variables).
# - Remember to handle missing values appropriately (if any).
# - The 'churned' column is the target variable. The goal is to build a model that accurately predicts churn.
user_id,age,gender,signup_date,last_login,total_purchases,avg_purchase_value,country,marketing_channel,is_active,churned
1,25,Male,2023-01-15,2024-01-10,10,25.50,USA,Facebook,True,False
2,30,Female,2023-02-20,2024-01-15,5,50.00,Canada,Google Ads,True,False
3,40,Other,2023-03-10,2023-12-20,2,100.00,UK,Email,False,True
4,22,Male,2023-04-05,2024-01-05,15,15.75,Germany,Facebook,True,False
5,35,Female,2023-05-01,2023-11-30,1,200.00,France,Referral,False,True
6,28,Male,2023-06-12,2024-01-20,8,30.20,USA,Google Ads,True,False
7,45,Female,2023-07-08,2023-10-25,3,75.00,Canada,Email,False,True
8,31,Other,2023-08-03,2024-01-01,12,20.00,UK,Facebook,True,False
9,24,Male,2023-09-18,2023-12-10,7,40.00,Germany,Referral,False,True
10,38,Female,2023-10-22,2024-01-25,6,60.50,France,Google Ads,True,False
11,29,Male,2023-11-05,2023-12-15,4,80.00,USA,Email,False,True
12,33,Female,2023-12-01,2024-01-08,9,28.00,Canada,Facebook,True,False
13,42,Other,2024-01-02,2024-01-28,11,22.50,UK,Google Ads,True,False
14,27,Male,2023-01-28,2024-01-12,13,18.00,Germany,Referral,True,False
15,36,Female,2023-02-15,2023-11-01,0,0.00,France,Email,False,True
16,23,Male,2023-03-22,2024-01-18,14,17.25,USA,Facebook,True,False
17,39,Female,2023-04-10,2023-10-10,2,90.00,Canada,Google Ads,False,True
18,41,Other,2023-05-05,2024-01-03,16,14.50,UK,Referral,True,False
19,26,Male,2023-06-01,2023-12-25,5,55.00,Germany,Email,False,True
20,34,Female,2023-07-15,2024-01-22,17,13.00,France,Facebook,True,False
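
The derived features suggested in the instructions (time since signup, time since last login, purchase frequency, age groups, and an age/value interaction) could be computed roughly as follows. This is a minimal standard-library sketch, not the plugin's own implementation: the snapshot date `SNAPSHOT`, the age cut points, and the names `engineer`, `age_group`, and `load_features` are all assumptions made for illustration. Three rows are copied from the dataset so the example runs on its own.

```python
import csv
import io
from datetime import date

# Three rows copied from the dataset, plus one comment line to mimic the file header.
SAMPLE = """\
# example_dataset.csv
user_id,age,gender,signup_date,last_login,total_purchases,avg_purchase_value,country,marketing_channel,is_active,churned
1,25,Male,2023-01-15,2024-01-10,10,25.50,USA,Facebook,True,False
3,40,Other,2023-03-10,2023-12-20,2,100.00,UK,Email,False,True
15,36,Female,2023-02-15,2023-11-01,0,0.00,France,Email,False,True
"""

# Assumption: "time since" features are measured against a fixed snapshot date.
SNAPSHOT = date(2024, 2, 1)


def age_group(age):
    """Bin age into coarse groups; the cut points are arbitrary (hypothetical)."""
    if age < 25:
        return "under_25"
    if age < 35:
        return "25_34"
    return "35_plus"


def engineer(row, snapshot=SNAPSHOT):
    """Turn one raw CSV row (a dict of strings) into a dict of derived features."""
    signup = date.fromisoformat(row["signup_date"])
    last_login = date.fromisoformat(row["last_login"])
    age = int(row["age"])
    purchases = int(row["total_purchases"])
    avg_value = float(row["avg_purchase_value"])
    days_since_signup = (snapshot - signup).days
    return {
        "user_id": int(row["user_id"]),
        "days_since_signup": days_since_signup,
        "days_since_last_login": (snapshot - last_login).days,
        # Purchase frequency; fall back to 0.0 if tenure is not positive.
        "purchases_per_day": purchases / days_since_signup if days_since_signup > 0 else 0.0,
        "age_group": age_group(age),
        "age_x_avg_value": age * avg_value,
        "churned": row["churned"] == "True",
    }


def load_features(text):
    """Skip '#' comment lines, parse the remaining CSV, and derive features per row."""
    lines = (line for line in io.StringIO(text) if not line.startswith("#"))
    return [engineer(row) for row in csv.DictReader(lines)]


features = load_features(SAMPLE)
for feature_row in features:
    print(feature_row)
```

The `#` header lines have to be filtered out before parsing because the `csv` module does not treat them as comments. If pandas is available, passing `comment="#"` to `pandas.read_csv` should achieve the same thing when reading the file.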