
Zindi New User Engagement Prediction Challenge

Helping Africa
$5 000 USD
Challenge completed over 2 years ago
Prediction
1269 joined
222 active
Start
Oct 14, 22
Close
Feb 12, 23
Reveal
Feb 12, 23
Months order
Data · 21 Jan 2023, 14:53 · 6

@ZINDI @amyflorida626 Please tell us the order of the months in the dataset. Because the months are encoded, we cannot tell which month comes before which.

Also, how do we know whether a user is active or inactive in a given month?

Discussion 6 answers
mouaff25
Insat

The months are in chronological order starting from month 11: 11 - 12 - 1 - 2 - 3 - 4 - 5.

You can check whether a user is active or inactive in a month by looking for their activities in UserActivity.csv. If they have an entry in UserActivity.csv with datetime Month = m, then they are active in month m; otherwise they are inactive in that month.
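As a minimal sketch of that check (not an official definition of "active"), assuming UserActivity.csv is loaded into a pandas DataFrame with the `User_ID` and `datetime Month` columns used elsewhere in this thread. The toy data and the helper names `active_users_in_month` and `is_active` are made up for illustration:

```python
import pandas as pd

# Toy stand-in for UserActivity.csv (hypothetical rows; the real file has more columns).
user_activity = pd.DataFrame({
    "User_ID": ["u1", "u1", "u2", "u3"],
    "datetime Month": [11, 12, 11, 1],
})

def active_users_in_month(activity: pd.DataFrame, month: int) -> set:
    """Return the IDs of users with at least one activity row in `month`."""
    return set(activity.loc[activity["datetime Month"] == month, "User_ID"])

def is_active(activity: pd.DataFrame, user_id: str, month: int) -> bool:
    """A user counts as active in `month` if any of their rows has that month."""
    return user_id in active_users_in_month(activity, month)

print(active_users_in_month(user_activity, 11))
print(is_active(user_activity, "u3", 12))
```

With the real data you would replace the toy frame with `pd.read_csv("UserActivity.csv")`, assuming that is where the file lives.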

21 Jan 2023, 23:11
Upvotes 0

Thank you for the reply.

Could you please explain how you derived this order? The order in which the months appear in the user_activity data is [11, 12, 5, 4, 1, 3, 2].

There are also user IDs that appear in the comments and competition_partipation CSVs but not in user_activity. Run the code below to check:

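ID = 'User_ID'
# User IDs that appear in the comments table or the competition participation table
comments_and_comp = set(comments[ID].unique()) | set(competition_partipation[ID].unique())
# Keep the IDs that never appear in user_activity (a set difference avoids rebuilding the list on every lookup)
activity_found_elsewhere = comments_and_comp - set(user_activity[ID].unique())
print(len(activity_found_elsewhere))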
mouaff25
Insat

It is certainly weird that some users appear in Comments and CompetitionPartipation but are missing from UserActivity.

As for the order of months, VariableDefinition.csv states that "Months are in chronological order but January is not neccesarily month 1" for all month columns. I confirmed this by comparing each user's Created At Month column with the datetime Month of their activities. For example, if a user created an account in month 12 and registered for a competition in month 1, then month 12 is before month 1.
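One way to run that comparison over all users at once is sketched below. It is only an illustration, not necessarily how the check above was done: the toy frames and the helper name `month_precedence_counts` are hypothetical, and the column names are the ones used elsewhere in this thread.

```python
import pandas as pd

# Toy stand-ins for the users table and UserActivity.csv (hypothetical rows).
users = pd.DataFrame({
    "User_ID": ["a", "b", "c"],
    "Created At Month": [11, 12, 12],
})
user_activity = pd.DataFrame({
    "User_ID": ["a", "a", "b", "c"],
    "datetime Month": [11, 12, 1, 1],
})

def month_precedence_counts(users: pd.DataFrame, activity: pd.DataFrame) -> pd.DataFrame:
    """Count (creation month, activity month) pairs where the two months differ.

    A row r / column c cell counts activity rows in month c that belong to
    users created in month r, i.e. evidence that r comes before c.
    """
    merged = activity.merge(users[["User_ID", "Created At Month"]], on="User_ID", how="inner")
    merged = merged[merged["Created At Month"] != merged["datetime Month"]]
    return pd.crosstab(merged["Created At Month"], merged["datetime Month"])

counts = month_precedence_counts(users, user_activity)
print(counts)
```

If cell (r, c) is well populated while cell (c, r) is empty or nearly so, that supports r coming before c. If both directions are populated, the data alone do not settle the order.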

Thanks for the explanations.

I understand the logic behind your month order.

Still, we would like ZINDI to confirm this officially, instead of us guessing or deriving it.

I checked the month order using your logic, and it does not seem to be derivable:

def check_activity_months(month):
    # Users whose account was created in the given (encoded) month
    created_ids = users.loc[users['Created At Month'] == month, 'User_ID'].unique()
    # Distinct activity months recorded for those users
    activity = user_activity.loc[user_activity['User_ID'].isin(created_ids), 'datetime Month'].unique()
    print(f"Users activity months for users created in month {month} >> {activity}")

for k in user_activity['datetime Month'].unique():
    check_activity_months(k)
Users activity months for users created in month 11 >> [11 12 5 4 1 3 2]
Users activity months for users created in month 12 >> [12 1 4 3 2 5]
Users activity months for users created in month 5 >> [ 5 11 1 4 2 3 12]
Users activity months for users created in month 4 >> [11 4 5 1 3]
Users activity months for users created in month 1 >> [ 2 4 1 11 3 5 12]
Users activity months for users created in month 3 >> [5 4 3]
Users activity months for users created in month 2 >> [5 2 3 4]

@ZINDI we are waiting for your clarification.

21 Jan 2023, 23:45
Upvotes 0