First of all, I want to thank my God, and I want to thank Zindi very much for giving me this opportunity.
Now, on to my approach:
1. I divided the train and test data into three groups:
   1. train_1: users who have activity and whose created day is < 22; test_1 is likewise the test users who have activity.
   2. train_2: users who have no activity and whose created day is > 21, and the same for test.
   3. train_3: users whose created day is < 22 and who have no activity.
2. Feature engineering:
   1. I took the row-wise sum, mean, and quantiles of activity.
   2. I used the day of creation and computed how many days remain until the next month.
   3. I used how many competitions the user joined, how many of them are secret, and how many are hosted in the user's country.
   4. I also used how many competitions are active, how many are active and secret, how many continue into the next month, and so on.
   5. I took the standard deviation of the activity day, and also the last activity day, how many days had activity, the hour, and so on.
   6. From the discussions, I used the count of discussions and the type of discussion.
   7. I didn't use the comments for train_1 because test_1 users didn't make comments.
   8. I extracted some information from jobs and blogs.
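
A rough sketch of the split and a few of the features above, in pandas. The column names (`created`, `act_1` to `act_3`) and the toy rows are made up for illustration and are not the real competition schema; group 0 is only a placeholder for users the split above does not cover.

```python
import numpy as np
import pandas as pd

# Toy data. All column names are hypothetical, not the competition schema.
users = pd.DataFrame({
    "user_id": ["a", "b", "c", "d"],
    "created": pd.to_datetime(
        ["2023-01-05", "2023-01-25", "2023-01-10", "2023-02-27"]
    ),
    # One column per activity count (for example, one per period).
    "act_1": [3, 0, 0, 0],
    "act_2": [1, 0, 0, 0],
    "act_3": [2, 0, 0, 0],
})
act_cols = ["act_1", "act_2", "act_3"]

users["created_day"] = users["created"].dt.day
users["has_activity"] = users[act_cols].sum(axis=1) > 0

# Three groups: 1 = activity and created day < 22,
# 2 = no activity and created day > 21, 3 = no activity and created day < 22.
early = users["created_day"] < 22
active = users["has_activity"]
users["group"] = np.select(
    [active & early, ~active & ~early, ~active & early],
    [1, 2, 3],
    default=0,  # activity but created late: not covered by the split above
)

# Row-wise activity statistics.
acts = users[act_cols]
users["act_sum"] = acts.sum(axis=1)
users["act_mean"] = acts.mean(axis=1)
users["act_q75"] = acts.quantile(0.75, axis=1)
users["act_std"] = acts.std(axis=1)

# Days remaining until the next month, counted from the creation date.
users["days_left_in_month"] = (
    users["created"].dt.days_in_month - users["created_day"]
)

print(users[["user_id", "group", "act_sum", "days_left_in_month"]])
```

Each group can then get its own model, for example by filtering on `users["group"] == 1` before fitting.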
1. For training, I kept the users who have a higher probability of belonging to the test data.
2. I used a small StackNet-style method: a combination of rf and lg, with CatBoost as the final model.
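
The two steps above could look roughly like the sketch below. It is only one possible reading: picking test-like training rows is done here with a train-vs-test classifier, and the stack uses scikit-learn's `StackingClassifier` instead of StackNet. The post's `lg` and CatBoost models are replaced by scikit-learn stand-ins (logistic regression and gradient boosting) so the example needs only scikit-learn. The data is synthetic and the 50% cutoff is an arbitrary assumption.

```python
import numpy as np
from sklearn.ensemble import (
    GradientBoostingClassifier,
    RandomForestClassifier,
    StackingClassifier,
)
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_predict

rng = np.random.default_rng(0)

# Synthetic stand-in data; the real features are the ones listed in the post.
X_train = rng.normal(0.0, 1.0, size=(400, 5))
y_train = (X_train[:, 0] + X_train[:, 1] > 0).astype(int)
X_test = rng.normal(0.5, 1.0, size=(200, 5))  # slightly shifted distribution

# Step 1: score each training row by how test-like it looks.
# A classifier learns to tell train (0) from test (1); out-of-fold
# probabilities avoid scoring rows the model was fitted on.
X_all = np.vstack([X_train, X_test])
is_test = np.r_[np.zeros(len(X_train)), np.ones(len(X_test))].astype(int)
domain_clf = RandomForestClassifier(n_estimators=100, random_state=0)
p_all = cross_val_predict(
    domain_clf, X_all, is_test, cv=5, method="predict_proba"
)[:, 1]
p_train_rows = p_all[: len(X_train)]

# Keep the more test-like half of the training rows (assumed cutoff).
keep = p_train_rows >= np.median(p_train_rows)
X_sel, y_sel = X_train[keep], y_train[keep]

# Step 2: a small stack. Base models feed out-of-fold predictions
# to a final model (stand-in for CatBoost).
stack = StackingClassifier(
    estimators=[
        ("rf", RandomForestClassifier(n_estimators=100, random_state=0)),
        ("lr", LogisticRegression(max_iter=1000)),  # stand-in for "lg"
    ],
    final_estimator=GradientBoostingClassifier(random_state=0),
    cv=5,
)
stack.fit(X_sel, y_sel)
test_pred = stack.predict_proba(X_test)[:, 1]
print(test_pred[:5])
```

If CatBoost is installed, `CatBoostClassifier` can be passed as `final_estimator` instead of the gradient boosting stand-in.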
This is my short solution. I can't go into more detail because I am taking final exams, but I hope it gives a small idea. Thank you all. @Koleshjr, you did great work!
Amazing approach, congrats once again 👏👏
Thanks, my bro.
Thanks for sharing, @Yisakberhanu, and congrats on your win.