Zindi New User Engagement Prediction Challenge
Can you predict if a new user will be active for 2 months in a row?
Months order
Data · 21 Jan 2023, 14:53 · 6

@ZINDI @amyflorida626 Please tell us the order of the months in the dataset. Since the months are encoded, we cannot tell which month comes before which.

Also, how do we know whether a user is active or inactive in a given month?

Discussion 6 answers

The months are ordered starting from month 11 (11 - 12 - 1 - 2 - 3 - 4 - 5)

You can check whether a user is active or inactive in a month by looking for their activities in UserActivity.csv. If they have an entry in UserActivity.csv with datetime Month = m, then they are active in month m.
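As a rough sketch of the two points above, the snippet below encodes the stated month order and checks whether a user has activity in a given month and in the month right after it. The helper names and the toy rows are hypothetical, and reading "active for 2 months in a row" as "active in two adjacent months of this order" is an assumption, not an official definition.

```python
# Chronological order of the encoded months, as stated in the reply above.
MONTH_ORDER = [11, 12, 1, 2, 3, 4, 5]
RANK = {month: position for position, month in enumerate(MONTH_ORDER)}


def active_months(activity_rows, user_id):
    """Return the set of encoded months in which user_id has at least one activity row."""
    return {month for uid, month in activity_rows if uid == user_id}


def active_two_months_in_a_row(activity_rows, user_id, first_month):
    """True if the user is active in first_month and in the month that follows it."""
    position = RANK[first_month]
    if position + 1 >= len(MONTH_ORDER):
        return False  # there is no following month in the data
    months = active_months(activity_rows, user_id)
    return first_month in months and MONTH_ORDER[position + 1] in months


# Hypothetical toy rows of (User_ID, datetime Month); not taken from the real data.
rows = [("ID_A", 12), ("ID_A", 1), ("ID_B", 12), ("ID_B", 3)]
print(active_two_months_in_a_row(rows, "ID_A", 12))
print(active_two_months_in_a_row(rows, "ID_B", 12))
```

With the real data, the rows could presumably be built as list(zip(user_activity['User_ID'], user_activity['datetime Month'])), assuming those column names.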

21 Jan 2023, 23:11

Thank you for the reply.

Could you please explain how you derived this order? The order we see in the user_activity data is [11, 12, 5, 4, 1, 3, 2].

There are user IDs in the comments and competition_partipation CSVs that do not appear in user_activity. Run the code below to check:

# Assumes comments, competition_partipation and user_activity are already loaded as pandas DataFrames.
ID = 'User_ID'
# Users that appear in Comments or CompetitionPartipation
comments_and_comp = set(comments[ID]) | set(competition_partipation[ID])
# ...but have no entry at all in UserActivity
activity_found_elsewhere = comments_and_comp - set(user_activity[ID])
print(len(activity_found_elsewhere))

It is certainly odd that some users appear in Comments and CompetitionPartipation but are missing from UserActivity.

As for the order of months, VariableDefinition.csv states that "Months are in chronological order but January is not neccesarily month 1" for all month columns. I confirmed this by comparing each user's Created At Month column with the datetime Month of their activities. For example, if a user created an account in month 12 and registered for a competition in month 1, then month 12 is before month 1.

Thanks for the explanations.

I understand your logic for the month order.

Still, we would like ZINDI to confirm this officially, instead of us guessing or deriving it.

I checked the month order using your logic, and it does not seem to be derivable:

def check_activity_months(month):
    # Users whose account was created in the given (encoded) month
    user_created_in_given_month = users[users['Created At Month'] == month]
    # Distinct activity months recorded for those users
    is_new_user = user_activity['User_ID'].isin(user_created_in_given_month['User_ID'].unique())
    activity = user_activity.loc[is_new_user, 'datetime Month'].unique()
    print(f"Users activity months for users created in month {month} >> {activity}")

for k in user_activity['datetime Month'].unique():
    check_activity_months(k)
Users activity months for users created in month 11 >> [11 12 5 4 1 3 2]
Users activity months for users created in month 12 >> [12 1 4 3 2 5]
Users activity months for users created in month 5 >> [ 5 11 1 4 2 3 12]
Users activity months for users created in month 4 >> [11 4 5 1 3]
Users activity months for users created in month 1 >> [ 2 4 1 11 3 5 12]
Users activity months for users created in month 3 >> [5 4 3]
Users activity months for users created in month 2 >> [5 2 3 4]
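
To make the problem in this output explicit, here is a small sketch that lists the month pairs it contradicts. It assumes the "created month comes before activity month" logic, meaning a user's activity can never precede account creation, and uses only the numbers printed above. The variable names are illustrative.

```python
from itertools import combinations

# Activity months observed per account-creation month, copied from the output above.
observed = {
    11: [11, 12, 5, 4, 1, 3, 2],
    12: [12, 1, 4, 3, 2, 5],
    5: [5, 11, 1, 4, 2, 3, 12],
    4: [11, 4, 5, 1, 3],
    1: [2, 4, 1, 11, 3, 5, 12],
    3: [5, 4, 3],
    2: [5, 2, 3, 4],
}

# Under the assumed logic, (a, b) means "month a is before month b".
before = {(a, b) for a, months in observed.items() for b in months if a != b}

# A pair is contradictory when the data implies both "a before b" and "b before a".
conflicts = sorted(
    (a, b)
    for a, b in combinations(sorted(observed), 2)
    if (a, b) in before and (b, a) in before
)
print(conflicts)
```

Any pair printed here has activity in both directions, so that pair alone cannot settle which month came first.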

@ZINDI, we are waiting for your clarification.

21 Jan 2023, 23:45