2025-11-29 18:51:19 +08:00

# example_dataset.csv
# This CSV file provides a sample dataset for demonstrating feature engineering techniques within the feature-engineering-toolkit plugin.
#
# Column Descriptions:
# - user_id: Unique identifier for each user (integer).
# - age: Age of the user (integer).
# - gender: Gender of the user (categorical: Male, Female, Other).
# - signup_date: Date the user signed up (YYYY-MM-DD).
# - last_login: Date of the user's last login (YYYY-MM-DD).
# - total_purchases: Total number of purchases made by the user (integer).
# - avg_purchase_value: Average value per purchase (float).
# - country: Country of the user (categorical).
# - marketing_channel: The marketing channel through which the user signed up (categorical).
# - is_active: Indicates whether the user is currently active (boolean: True, False).
# - churned: Target variable indicating whether the user churned (boolean: True, False). This is what we want to predict.
#
# Instructions:
# - Use this dataset to experiment with feature engineering techniques.
# - Consider creating new features such as:
#   - Time since signup (calculated from signup_date, relative to a chosen reference date).
#   - Time since last login (calculated from last_login, relative to the same reference date).
#   - Purchase frequency (total_purchases / time since signup).
#   - Age groups (binning the age variable).
#   - Interactions between features (e.g., age * avg_purchase_value).
# - Use feature selection techniques to identify the most important features for predicting churn.
# - Apply feature transformations (e.g., scaling, normalization, encoding categorical variables).
# - Remember to handle missing values appropriately (if any).
# - The goal is to build a model that accurately predicts the 'churned' target column.
user_id,age,gender,signup_date,last_login,total_purchases,avg_purchase_value,country,marketing_channel,is_active,churned
1,25,Male,2023-01-15,2024-01-10,10,25.50,USA,Facebook,True,False
2,30,Female,2023-02-20,2024-01-15,5,50.00,Canada,Google Ads,True,False
3,40,Other,2023-03-10,2023-12-20,2,100.00,UK,Email,False,True
4,22,Male,2023-04-05,2024-01-05,15,15.75,Germany,Facebook,True,False
5,35,Female,2023-05-01,2023-11-30,1,200.00,France,Referral,False,True
6,28,Male,2023-06-12,2024-01-20,8,30.20,USA,Google Ads,True,False
7,45,Female,2023-07-08,2023-10-25,3,75.00,Canada,Email,False,True
8,31,Other,2023-08-03,2024-01-01,12,20.00,UK,Facebook,True,False
9,24,Male,2023-09-18,2023-12-10,7,40.00,Germany,Referral,False,True
10,38,Female,2023-10-22,2024-01-25,6,60.50,France,Google Ads,True,False
11,29,Male,2023-11-05,2023-12-15,4,80.00,USA,Email,False,True
12,33,Female,2023-12-01,2024-01-08,9,28.00,Canada,Facebook,True,False
13,42,Other,2024-01-02,2024-01-28,11,22.50,UK,Google Ads,True,False
14,27,Male,2023-01-28,2024-01-12,13,18.00,Germany,Referral,True,False
15,36,Female,2023-02-15,2023-11-01,0,0.00,France,Email,False,True
16,23,Male,2023-03-22,2024-01-18,14,17.25,USA,Facebook,True,False
17,39,Female,2023-04-10,2023-10-10,2,90.00,Canada,Google Ads,False,True
18,41,Other,2023-05-05,2024-01-03,16,14.50,UK,Referral,True,False
19,26,Male,2023-06-01,2023-12-25,5,55.00,Germany,Email,False,True
20,34,Female,2023-07-15,2024-01-22,17,13.00,France,Facebook,True,False
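#
# Example (illustrative sketch, not part of the dataset):
# The derived features suggested in the instructions above could be computed roughly as follows,
# using only the Python standard library. The reference date, the age bin edges, the
# "purchases per 30 days" unit, and the function names engineer_features and age_group are
# assumptions made for this sketch; none of them are defined by the dataset or the plugin.
# The three sample rows are copied from the data above (user_id 1, 3, and 15).

```python
import csv
import io
from datetime import date

# Three rows copied verbatim from the dataset; in practice, read the full file instead.
SAMPLE = """\
user_id,age,gender,signup_date,last_login,total_purchases,avg_purchase_value,country,marketing_channel,is_active,churned
1,25,Male,2023-01-15,2024-01-10,10,25.50,USA,Facebook,True,False
3,40,Other,2023-03-10,2023-12-20,2,100.00,UK,Email,False,True
15,36,Female,2023-02-15,2023-11-01,0,0.00,France,Email,False,True
"""

# Assumption: a fixed snapshot date. The dataset does not specify one.
REFERENCE_DATE = date(2024, 2, 1)


def age_group(age):
    """Bin age into coarse groups (the bin edges are illustrative assumptions)."""
    if age < 25:
        return "under_25"
    if age < 35:
        return "25_34"
    if age < 45:
        return "35_44"
    return "45_plus"


def engineer_features(row, reference_date=REFERENCE_DATE):
    """Derive the features suggested in the instructions from one CSV row."""
    signup = date.fromisoformat(row["signup_date"])
    last_login = date.fromisoformat(row["last_login"])
    age = int(row["age"])
    total_purchases = int(row["total_purchases"])
    avg_value = float(row["avg_purchase_value"])

    days_since_signup = (reference_date - signup).days
    days_since_last_login = (reference_date - last_login).days

    # Purchase frequency, scaled to purchases per 30 days.
    # Guard against a zero or negative tenure to avoid division by zero.
    if days_since_signup > 0:
        purchases_per_30_days = total_purchases / days_since_signup * 30
    else:
        purchases_per_30_days = 0.0

    return {
        "user_id": int(row["user_id"]),
        "days_since_signup": days_since_signup,
        "days_since_last_login": days_since_last_login,
        "purchases_per_30_days": purchases_per_30_days,
        "age_group": age_group(age),
        "age_x_avg_purchase_value": age * avg_value,
        # Parse the boolean target from its "True"/"False" text form.
        "churned": row["churned"] == "True",
    }


rows = list(csv.DictReader(io.StringIO(SAMPLE)))
features = [engineer_features(row) for row in rows]

for feature_row in features:
    print(feature_row)
```

# When loading the real file, the '#' comment lines must be skipped before the header row is
# parsed (for example, by filtering out lines that start with '#' before passing them to csv.DictReader).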