@ZindiAdmins, with the above statement, is it safe to assume that we CANNOT calibrate our probabilities by any manual process? The predicted probabilities can be stacked/blended, but we cannot set up any manual calibration strategy?
Would appreciate your response.
Correct!
Thank you, and I believe even clipping would not be deemed acceptable? I read somewhere on the forum that someone suggested clipping the obvious cases to higher/lower values.
Hello, Zindi! In my opinion, in this dataset some products can never be held together with another product, and we can detect this from simple statistics on the train set. That is still a prediction; it just comes from statistics rather than from a model. In those cases we should be allowed to round. For example, suppose we see in train that product 'OOO' is 0 for every client who has a 1 for product 'YYY'. That means these products never combine, so we should round 'OOO' to ~0 for every client who has a 1 for 'YYY'.
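A minimal sketch of the check described above, assuming a hypothetical train frame with one row per customer and one 0/1 column per product (column names 'YYY' and 'OOO' are just the abstract names from the example):

```python
import pandas as pd

# Hypothetical toy train data: one 0/1 ownership column per product.
train = pd.DataFrame({
    "customer_id": [1, 2, 3, 4],
    "YYY": [1, 1, 0, 0],
    "OOO": [0, 0, 1, 1],
})

def never_cooccurs(df, a, b):
    """True if product b is 0 for every customer who holds product a."""
    holders = df[df[a] == 1]
    return len(holders) > 0 and bool((holders[b] == 0).all())

# In this toy data, no holder of 'YYY' also holds 'OOO'.
print(never_cooccurs(train, "YYY", "OOO"))  # → True
```

Whether acting on such a statistic counts as "manual calibration" under the rules is exactly the open question in this thread.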
And judging by the p-value, the difference is statistically significant.
What about the probabilities for the "customer_id - product_id" pairs in test for which we know for certain that the customer has that particular product (SampleSubmission.csv has 1.0 for these pairs)? Do we need to make predictions for these pairs too, or can their labels simply be set to 1.0?
Refer to the sample submission. Products that the customer_id already has are reflected with a probability of 1.
@darrel Yes, I see it, I just wanted to clarify :)
OK, so I'll just round my predictions for these values to 1e-53 or 1-(1e-53), and not to exactly 0 or 1 :) Thank you!
Yes, let's wait for them.
You can clip or round the known values; however, please do not clip or round your model's predictions.
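The distinction above can be sketched as follows, assuming hypothetical arrays of model probabilities and a mask for the pairs the sample submission already marks as owned (all names here are illustrative, not from the competition starter code):

```python
import numpy as np

# Hypothetical model outputs and a mask of "known" customer-product pairs
# (those with 1.0 in SampleSubmission.csv).
preds = np.array([0.03, 0.72, 0.41, 0.88])
known_mask = np.array([False, True, False, True])

# Overwrite only the known pairs with 1.0; every other model output
# is left untouched (no clipping or rounding of real predictions).
final = np.where(known_mask, 1.0, preds)
```

This keeps the submission rule intact: the only values touched by hand are the ones whose labels are already given.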
What about statistical values computed from our train data? The rules say nothing about how we must produce predictions (I mean the method). A statistic is a prediction too, and I think that if we use a machine learning model and then statistical values on top, we don't break the rules. We do it properly, because it is our prediction, derived from statistics.
For me, "known values" means these statistical values.