Zimnat Insurance Recommendation Challenge

Helping Zimbabwe
$5 000 USD
Completed (over 5 years ago)
Prediction
Collaborative Filtering
1777 joined
612 active
Start: Jul 01, 20
Close: Sep 13, 20
Reveal: Sep 13, 20
You may NOT round your predictions to 0s and 1s.
Help · 7 Sep 2020, 07:58 · edited 1 minute later · 10

@ZindiAdmins, with the above statement, is it safe to assume that we CANNOT calibrate our probabilities by any manual process? The predicted ones can be stacked / blended, but we cannot set up any manual calibration strategy?

Would appreciate your response.

Discussion 10 answers
ZINDI

Correct!

7 Sep 2020, 08:16
Upvotes 0

Thank you, and I believe even clipping would not be deemed acceptable? I read somewhere in the forum that someone was suggesting to clip your obvious cases to higher/lower values.

Hello, Zindi! In my opinion, the data show that some products cannot be held together with certain other products. We can find this from statistics on the train data. That is also a prediction, just not one from a model. In that case we should round them. For example, suppose a client has a 1 for a product with the abstract name 'YYY', and in train we see that product 'OOO' is 0 for every client who has a 1 for 'YYY'. That means these products cannot be combined, and we must round 'OOO' to ~0 for each client who has a 1 for 'YYY'.

And as for significance: by p-value, it is a statistically significant difference.
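
A minimal sketch of the check described above, assuming the train data can be read as one row of 0/1 flags per client. The names 'YYY' and 'OOO' are the abstract ones from the post; 'ZZZ', the toy rows, and the helper never_co_occur are hypothetical. It only finds the pairs; whether overriding predictions with such a rule is allowed is exactly what this thread is asking.

```python
from itertools import permutations


def never_co_occur(rows, products):
    """Return ordered pairs (a, b) where no client holding a also holds b."""
    pairs = []
    for a, b in permutations(products, 2):
        holders_of_a = [r for r in rows if r[a] == 1]
        # Require at least one holder of a, otherwise the rule is vacuous.
        if holders_of_a and all(r[b] == 0 for r in holders_of_a):
            pairs.append((a, b))
    return pairs


# Toy train data (hypothetical): one dict of 0/1 flags per client.
train = [
    {"YYY": 1, "OOO": 0, "ZZZ": 1},
    {"YYY": 1, "OOO": 0, "ZZZ": 0},
    {"YYY": 0, "OOO": 1, "ZZZ": 1},
    {"YYY": 0, "OOO": 0, "ZZZ": 1},
]

exclusive = never_co_occur(train, ["YYY", "OOO", "ZZZ"])
print(exclusive)
```

On real data a pair can look exclusive simply because few clients hold the first product, so the number of holders is worth checking before trusting the rule.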

What about the probabilities for "customer_id - product_id" pairs in test for which we know for certain that the customer has this particular product (in SampleSubmission.csv we have 1.0 for these pairs)? Do we need to make predictions for these pairs too, or can the labels for them simply be 1.0?

7 Sep 2020, 09:00
Upvotes 0

Refer to the sample submission. Products that the customer_id already has are reflected with a probability of 1.

@darrel Yes, I see it, just wanted to clarify)

OK, I just round my predictions for these values to 1e-53 or 1-(1e-53), and not to 0 or 1) Thank you!
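
A quick check of the two values mentioned above, assuming the predictions are held as ordinary 64-bit floats (Python's float). Under that assumption 1e-53 is a nonzero number, but 1-(1e-53) is stored as exactly 1.0, because the gap between 1.0 and the next float below it is far larger than 1e-53.

```python
import sys

low = 1e-53        # representable and strictly greater than 0
high = 1 - 1e-53   # rounds to exactly 1.0 in 64-bit floating point

# The largest 64-bit float strictly below 1.0 is 1 - epsilon / 2.
largest_below_one = 1.0 - sys.float_info.epsilon / 2

print(low > 0.0)
print(high == 1.0)
print(largest_below_one < 1.0)
```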

Yeah, let's wait for them.

7 Sep 2020, 09:38
Upvotes 0
ZINDI

You can clip or round for the known values; however, please do not clip or round your predictions.

7 Sep 2020, 11:13
Upvotes 0
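
One way to read the answer above, as a sketch only: set the known pairs (products the customer already holds, shown as 1 in the sample submission) to their known value and leave every model prediction untouched. The pair keys, customer names, and the helper apply_known_values are hypothetical and do not reflect the actual submission format.

```python
def apply_known_values(predictions, known):
    """Overwrite only the pairs listed in `known`; keep model output as is."""
    return {pair: known.get(pair, prob) for pair, prob in predictions.items()}


# Hypothetical model output keyed by (customer_id, product_id).
model_probs = {
    ("cust_1", "YYY"): 0.83,
    ("cust_1", "OOO"): 0.02,
    ("cust_2", "YYY"): 0.41,
}

# Pairs known from the data: the customer already holds the product.
already_held = {("cust_1", "YYY"): 1.0}

submission = apply_known_values(model_probs, already_held)
print(submission)
```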

What about statistical values from our train data? The rules give no information on how we must make predictions (I mean the method). A statistic is a prediction too, and I think that if we use a machine learning model and then statistical values, we don't break the rules. We are doing it properly, because it is our prediction from statistics.

For me, "known values" are statistical values.