🚜 This Week on Zindi: The current situation

Digital Green Crop Yield Estimate Challenge

Helping India

€9 400 EUR

Completed (over 2 years ago)

Skills you will learn

Prediction

1370 joined

677 active


Start: Sep 04, 23

Close: Dec 03, 23

Reveal: Dec 03, 23

yanteixeira

The current situation

Platform · 4 Dec 2023, 12:42 · 3

I would like to report the constant lack of communication that Zindi has shown towards competitors since the first day of this competition. Many problems arose, but almost nothing was done to clarify them. On top of that, the test set was evidently changed. The lack of transparent communication led several competitors to focus in the wrong direction, which would have been the right direction had the data not been changed. What is happening is not fair.

Let's review some problems that arose during the competition:

1) Ensemble rule

From the early days of the competition, participants complained about unclear resource restrictions. I myself made two posts about it:

https://zindi.africa/competitions/digital-green-crop-yield-estimate-challenge/discussions/18493

https://zindi.africa/competitions/digital-green-crop-yield-estimate-challenge/discussions/18558

The competition has just ended, and this remains an open question. We are still uncertain about what is permissible and what is not.

2) Metric

The metric was also a topic of discussion here:

https://zindi.africa/competitions/digital-green-crop-yield-estimate-challenge/discussions/18645

Although RMSE makes sense from a business perspective, it did not make sense given the characteristics of the data. Because Acre and Yield are almost perfectly correlated (near 100%), most rows are easy to predict, while the extreme values cannot be predicted from the features at all. Since RMSE squares each error, those few extreme rows dominate the score. Suddenly, the problem was no longer regression but anomaly detection.
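To make the RMSE point concrete, here is a minimal sketch on synthetic data. Nothing in it comes from the competition dataset: the yield-per-acre constant, the noise level, the number of corrupted rows, and the tenfold "data entry error" are all assumptions chosen purely for illustration.

```python
import math
import random

random.seed(0)

# Synthetic stand-in for the competition data (all numbers are made up):
# Yield is almost perfectly proportional to Acre, except for a few rows
# where a simulated data entry error inflates Yield tenfold.
N_ROWS = 1000
N_ERRORS = 10
YIELD_PER_ACRE = 400.0  # hypothetical constant, not from the real data

acres = [random.uniform(0.1, 1.0) for _ in range(N_ROWS)]
yields = [YIELD_PER_ACRE * a + random.gauss(0, 10) for a in acres]
error_rows = set(random.sample(range(N_ROWS), N_ERRORS))
for i in error_rows:
    yields[i] *= 10

# A regression that captures the clean Acre -> Yield relationship exactly.
preds = [YIELD_PER_ACRE * a for a in acres]


def rmse(y_true, y_pred):
    """Root mean squared error over two equal-length sequences."""
    return math.sqrt(
        sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
    )


squared_errors = [(t - p) ** 2 for t, p in zip(yields, preds)]
outlier_share = sum(squared_errors[i] for i in error_rows) / sum(squared_errors)

clean_idx = [i for i in range(N_ROWS) if i not in error_rows]
rmse_all = rmse(yields, preds)
rmse_clean = rmse([yields[i] for i in clean_idx], [preds[i] for i in clean_idx])

print(f"RMSE on all rows:        {rmse_all:.1f}")
print(f"RMSE without error rows: {rmse_clean:.1f}")
print(f"Share of squared error from {N_ERRORS} error rows: {outlier_share:.1%}")
```

In this toy setup, 1% of the rows account for nearly all of the squared error, so the leaderboard score depends far more on guessing which rows are corrupted than on modelling the Acre and Yield relationship.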

3) Data entry errors

The following post marked a turning point:

https://zindi.africa/competitions/digital-green-crop-yield-estimate-challenge/discussions/19492

Kudos to @VIRADUS for highlighting what many competitors were experiencing. We discovered that the outliers were actually data entry errors, which implies that a model scoring well on the leaderboard could turn out to be ineffective in real-world applications. Zindi's response added to the confusion: we had to decipher an enigmatic answer to realize that the private test set could change.

---

Although I'm glad to have learned a lot in this competition, it should be noted that everyone's experience here would have been much better with clearer communication.

Discussion 3 answers

Koleshjr

Multimedia University of Kenya

Truee

4 Dec 2023, 13:37

Upvotes 0

Abdallah_Abra

Thank you @yanteixeira, for highlighting all those valid points! You've captured all of the important feedback that many of us have been sharing throughout this competition.

You've been impressively communicative and have consistently shared your thoughts since the beginning of the competition. Should @Zindi be in need of extra data science / communication personnel, I would highly vouch for @yanteixeira!

4 Dec 2023, 13:45

Upvotes 1

yanteixeira

@Zindi I'm definitely available hahaha

replied to Abdallah_Abra · 4 Dec 2023, 13:54

Upvotes 0
