Wazihub Soil Moisture Prediction Challenge
$8,000 USD
Predict soil humidity using sensor data from low-cost DIY Internet of Things in Senegal
29 July–20 October 2019 23:59
699 data scientists enrolled, 96 on the leaderboard
"Your solution needs to use one model to predict soil humidities for all four fields."
published 15 Oct 2019, 16:05

Hi Zindi, I am still confused and need a clarification.

Which of these two strategies is correct?

Strategy 1:

- Train on field 1 then predict on field 1

- Train on field 2 then predict on field 2

- Train on field 3 then predict on field 3

- Train on field 4 then predict on field 4

Strategy 2:

- Train on all 4 fields without mixing information from different fields then predict on all 4 fields

From my point of view, you should train on the individual fields. Predictions on the test set should come from separate models, each trained on one field's data, but the hyperparameters of the four models should be the same. For instance, if you are using a tree-based algorithm and used 100 estimators to model field 1, that number should stay constant across the other fields. That's my take.
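
A minimal sketch of the "separate models, same hyperparameters" idea described above. It is an illustration only: a tiny 1-D k-nearest-neighbours regressor stands in for the tree-based model, the shared setting `k` plays the role of the 100 estimators, and the names `SHARED_PARAMS`, `fit_knn`, `train_per_field` and the toy data are all hypothetical, not taken from the competition.

```python
from collections import defaultdict

# Shared hyperparameters: identical for every field (hypothetical values).
SHARED_PARAMS = {"k": 2}

def fit_knn(rows, k):
    """'Train' a 1-D k-NN regressor: keep the rows and return a predict function."""
    def predict(x):
        # Average the targets of the k training rows closest to x.
        nearest = sorted(rows, key=lambda r: abs(r[0] - x))[:k]
        return sum(y for _, y in nearest) / len(nearest)
    return predict

def train_per_field(records, params):
    """records: iterable of (field_id, feature, soil_humidity) tuples."""
    by_field = defaultdict(list)
    for field_id, x, y in records:
        by_field[field_id].append((x, y))
    # One model per field, all built with the same hyperparameters.
    return {f: fit_knn(rows, **params) for f, rows in by_field.items()}

# Toy data, invented for illustration only.
records = [
    (1, 0.0, 10.0), (1, 1.0, 12.0), (1, 2.0, 14.0),
    (2, 0.0, 30.0), (2, 1.0, 32.0), (2, 2.0, 34.0),
]
models = train_per_field(records, SHARED_PARAMS)
pred_field1 = models[1](0.4)  # uses only field 1 rows
pred_field2 = models[2](0.4)  # uses only field 2 rows
```

Each field's prediction comes from a model that saw only that field's rows, while the configuration stays fixed across fields.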

replying to JavaBlack

Thank you @javablack for your response.

But in production, if we want to deploy our solution on a new field X, should we train our model on that new field before it can operate?

I think that in practice we should not have to retrain our model for each new farmer (user). That would be strange!

Yes, I totally agree with you. However, if you think critically about it, the aim of the competition is to find the best model, one that can generalize to any field given exactly the same features as the four fields. Maybe the people who want the solution have more data, which they will use with the competition winner's model and hyperparameters to train a big general model.
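
One way to check whether a single model generalizes to a field it has never seen is leave-one-field-out validation: train on three fields, score on the fourth, and rotate. The sketch below assumes this setup; it is not something the organizers prescribed. The one-feature least-squares line, the function names `fit_line` and `leave_one_field_out`, and the toy data are all hypothetical stand-ins.

```python
def fit_line(rows):
    """Ordinary least squares for y = a*x + b on a list of (x, y) pairs."""
    n = len(rows)
    mean_x = sum(x for x, _ in rows) / n
    mean_y = sum(y for _, y in rows) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in rows)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in rows)
    a = sxy / sxx
    b = mean_y - a * mean_x
    return lambda x: a * x + b

def leave_one_field_out(records):
    """records: list of (field_id, feature, soil_humidity).

    Returns {held_out_field: mean absolute error on that field}.
    """
    scores = {}
    for held_out in sorted({f for f, _, _ in records}):
        # Pool the other fields for training; the held-out field is never seen.
        train = [(x, y) for f, x, y in records if f != held_out]
        test = [(x, y) for f, x, y in records if f == held_out]
        model = fit_line(train)
        scores[held_out] = sum(abs(model(x) - y) for x, y in test) / len(test)
    return scores

# Toy data, invented for illustration: field 4 is shifted relative to the others.
records = [
    (1, 0.0, 10.0), (1, 1.0, 12.0),
    (2, 2.0, 14.0), (2, 3.0, 16.0),
    (3, 4.0, 18.0), (3, 5.0, 20.0),
    (4, 0.0, 15.0), (4, 1.0, 17.0),
]
scores = leave_one_field_out(records)
```

A large error on one held-out field, as with the shifted field 4 here, suggests the pooled model would need that field's own data before it could be trusted there.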