My observation is about the current data set.
As I understand it, the users created in month 04 are the ones analyzed for the current submission, that is, those included in the submission file (User_ID_Next_month_Activity).
I know that for the final submission we will need to predict for users created in months 5 and 6, but I'm asking about today's submission.
Today there are 1382 users with 'Created At Month' == 4.
However, the example submission file contains only 1340 users, that is, 42 fewer.
I found that the 42 missing users all have 'Created At Month' == 4 and 'Created At Day_of_month' == 30.
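
For anyone who wants to reproduce the comparison, here is a minimal pandas sketch on toy data. The frame names `users` and `sample_sub` and the ID column `User_ID` are assumptions for illustration, and the rows are made up; the real submission file may store its IDs differently (for example with a suffix), in which case the IDs would need to be aligned first.

```python
import pandas as pd

# Toy stand-ins for the real files; every value here is made up.
users = pd.DataFrame({
    "User_ID": ["u1", "u2", "u3", "u4", "u5"],  # hypothetical ID column
    "Created At Month": [4, 4, 4, 4, 3],
    "Created At Day_of_month": [12, 29, 30, 30, 30],
})
sample_sub = pd.DataFrame({"User_ID": ["u1", "u2"]})  # hypothetical layout

# Users created in month 4.
april = users[users["Created At Month"] == 4]

# Month-4 users that do not appear in the sample submission.
missing = april[~april["User_ID"].isin(sample_sub["User_ID"])]

# Compare the counts, then see which creation days the missing users fall on.
print(len(april), len(sample_sub), len(missing))
print(missing["Created At Day_of_month"].value_counts())
```

If every missing user lands on the same day in the `value_counts()` output, that points to a date cutoff rather than a random omission.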
Is there a reason why these users are not in the sample submission file?