AutoInland Vehicle Insurance Claim Challenge
$1,000 USD
Can you predict if a client will submit a vehicle insurance claim in the next 3 months?
835 data scientists enrolled, 373 on the leaderboard
Insurance, Financial Services, Prediction, Structured
26 March—27 June
zero_rank_score
published 9 Apr 2021, 22:41

How can I get accuracy and f1_score both above 0.9 locally and still get a score of zero? What am I missing?

0.9 is not exactly a good score. Follow the tips in the starter notebook to improve your score.

You may have data leakage. I also tried uploading the CSV created by the starter notebook and got a zero score, like you 😵😕

I have tried everything I could think of, and it feels like I am running out of options.

Hi Daniel!

If you are using the starter notebook, please pay attention to the structure of the submission file!

Instead of sub_file.predictions = predictions, use sub_file.target = predictions.

Regards,

CapitainData!

Hi CaptainData,

Thank you for the comment! I noticed it too in the starter notebook. But I put my predictions in the 'target' column instead of the 'predictions' column like you pointed out. I also cross checked with the sample submission. But I'm still getting a 0 score for some reason. Do you know what the problem might be?

Thanks in advance! Happy Learning :D

Hi fakejayduler,

Sorry for the late reply!

That is really strange, because it worked on my side!

Here are the lines of code in my submission cell:

# Make predictions on the test set
test_df = test_df[main_cols]
predictions = model.predict(test_df)

# Create a submission file
sub_file = ss.copy()
sub_file.target = predictions
# sub_file.target = sub_file.target.apply(lambda x: int(x))

# Check the distribution of your predictions
sns.countplot(x=sub_file.target);

And I got a score of 0.3387....

Hey, thank you very much for the reply.

I am doing the same thing. I reached out to team Zindi and they confirmed that my submission format is correct. It seems to be a problem on the host's side. Anyway, I appreciate the reply.

Just one more question, about the command where you convert the output (predictions) to the int data type:

# sub_file.target = sub_file.target.apply(lambda x: int(x))

Is there a particular reason you used a lambda function for the operation? Because,

sub_file.target = sub_file.target.astype(int)

This command also performs the same operation, right?
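To illustrate the question above, here is a minimal sketch comparing the two conversions on a small made-up column (the values are hypothetical, not taken from the competition data). Both give the same integers; apply calls int() once per row, while astype casts the whole column at once. Note that both truncate rather than round, so a probability such as 0.8 would become 0.

```python
import pandas as pd

# Hypothetical predictions stored as floats, as a model might return them
target = pd.Series([0.0, 1.0, 1.0, 0.0], name="target")

via_apply = target.apply(lambda x: int(x))  # calls int() once per row
via_astype = target.astype(int)             # casts the whole column in one step

# Compare the resulting values
same_values = via_apply.tolist() == via_astype.tolist()
print(same_values)
```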

Anyway, really appreciate the help. Hopefully they fix it soon.

Best of luck for the competition and happy learning! :D

Hey, were you able to come up with a solution? Because I'm having the same problem.

Hi fakejaydulera!

If you are using the starter notebook, please pay attention to the structure of the submission file!

Instead of sub_file.predictions = predictions,

use sub_file.target = predictions.

Regards

Have you ensured you're doing this?

sub_file.target = predictions ### Ensure values in predictions are 0, 1, and not 0.2, 0.8
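A quick way to run these checks before uploading might look like the sketch below. It assumes the sample submission has an ID column next to target; the ID name, the check_submission helper, and the toy frames are all hypothetical and only serve the illustration.

```python
import pandas as pd

def check_submission(sub, sample):
    """Return a list of problems found; an empty list means the checks passed."""
    problems = []
    # Column names and order should match the sample submission
    if list(sub.columns) != list(sample.columns):
        problems.append(f"columns {list(sub.columns)} != {list(sample.columns)}")
    # One row per row of the sample submission
    if len(sub) != len(sample):
        problems.append(f"{len(sub)} rows, expected {len(sample)}")
    # Predictions should be class labels (0 or 1), not probabilities
    if "target" in sub.columns:
        bad = sub.loc[~sub["target"].isin([0, 1]), "target"]
        if len(bad) > 0:
            problems.append(f"{len(bad)} target values are not 0 or 1")
    return problems

# Toy frames standing in for the real files (hypothetical data)
sample = pd.DataFrame({"ID": ["a", "b", "c"], "target": [0, 0, 0]})
good = pd.DataFrame({"ID": ["a", "b", "c"], "target": [0, 1, 0]})
wrong_column = pd.DataFrame({"ID": ["a", "b", "c"], "predictions": [0, 1, 0]})
probabilities = pd.DataFrame({"ID": ["a", "b", "c"], "target": [0.2, 0.8, 0.0]})

print(check_submission(good, sample))
print(check_submission(wrong_column, sample))
print(check_submission(probabilities, sample))
```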

Also, you mentioned getting a local F1 score of 0.9. That's really large; you are most likely overfitting to a particular sample of the dataset.

Try cross-validation, and tune your model so that it does not overfit.
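As a rough sketch of what cross-validation could look like here, the example below uses scikit-learn with a synthetic imbalanced dataset and a logistic regression. Both are stand-ins chosen for illustration, not the competition data or anyone's actual model, and it assumes the score you care about is F1.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score

# Synthetic stand-in data: roughly 10% positives (an assumption, not the real data)
X, y = make_classification(
    n_samples=500, n_features=10, weights=[0.9, 0.1], random_state=0
)

model = LogisticRegression(max_iter=1000)

# Stratified folds keep the class ratio similar in every split
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
scores = cross_val_score(model, X, y, cv=cv, scoring="f1")

# A large spread across folds suggests the score depends on the particular split
print(scores.mean(), scores.std())
```

If the average across folds is far below the score from a single train/test split, the single split was probably optimistic.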

Hi Dave, thank you for the reply. The format of my submission is correct (I reached out to team Zindi and they confirmed), and my predictions are binary, not floating-point numbers, as you correctly pointed out. But the score still shows 0. To rule out the worst-case scenario where my score really was 0, I changed one value from 0 to 1 and resubmitted, and I still got a score of 0, which should not be possible. I am running out of options now and don't know what to do.

Anyway, thank you for the reply, I appreciate it! Happy Learning :D
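One thing worth checking on the "change one value" test: the thread does not state the leaderboard metric, but assuming it is F1 on the positive class, flipping a single prediction from 0 to 1 only lifts the score above zero if that row is a real claim. The sketch below uses made-up labels and a hand-written f1 helper (both hypothetical) to show this.

```python
def f1(y_true, y_pred):
    """F1 for the positive class (label 1); returns 0.0 when there are no true positives."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    return 2 * tp / (2 * tp + fp + fn)

# Made-up labels: 8 clients without a claim, 2 with a claim
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]

all_zero = [0] * 10                   # predict "no claim" for everyone
one_false_positive = [1] + [0] * 9    # flip one row that is actually 0
one_true_positive = [0] * 8 + [1, 0]  # flip one row that is actually 1

accuracy_all_zero = sum(t == p for t, p in zip(y_true, all_zero)) / len(y_true)

print(accuracy_all_zero)               # high accuracy despite finding no claims
print(f1(y_true, all_zero))            # 0.0
print(f1(y_true, one_false_positive))  # still 0.0
print(f1(y_true, one_true_positive))   # above zero
```

So a score of 0 after flipping one value is consistent with a submission that predicts almost no true claims; it does not by itself prove a scoring bug.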

Here is a link to the Python starter notebook that works on my side: https://drive.google.com/file/d/1-iWt8J_u0jb7z4yfc8saxuGR2EbIpexY/view?usp=sharing

Regards