Hi everyone, I'm sharing my solution with you.
For this challenge I worked locally and didn't use a GPU. The whole pipeline (training and inference) runs in one minute.
Given the small size of the dataset, I decided to use a very light model. The only features I used (5 in total) were: Delta (End − Start) and target-mean encodings of Genotype, PlantNumber, Stage, and Side.
The model is a simple XGBoost regressor trained with folds stratified on Genotype.
The environment is Python 3.10.13.
The code snippet is provided below:
import time

import numpy as np
import pandas as pd
import xgboost as xgb
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import StratifiedKFold
start = time.time()
CATCOLS = ["Stage","Genotype","Side","PlantNumber"]
NUMCOLS = ["Start","End","Delta"]
ID = "ID"
TGT = "RootVolume"
PATH = "Your Path Here..."
tr = pd.read_csv(f"{PATH}/Train.csv")
te = pd.read_csv(f"{PATH}/Test.csv")
tr["train"] = 1
te["train"] = 0
tr["Delta"] = tr["End"] - tr["Start"]
te["Delta"] = te["End"] - te["Start"]
data = pd.concat([tr,te])
# Target-mean encodings computed on the concatenated frame
# (test rows have a NaN target, so they don't contribute to the means)
data["x_geno"] = data.groupby("Genotype")[TGT].transform("mean")
data["x_plant"] = data.groupby("PlantNumber")[TGT].transform("mean")
data["x_stage"] = data.groupby("Stage")[TGT].transform("mean")
data["x_side"] = data.groupby("Side")[TGT].transform("mean")
data["x_bl"] = 0.8 * data["x_geno"] + 0.2 * data["x_plant"]  # blend (not in the final feature list)
tr = data.loc[data["train"] == 1].copy().reset_index(drop=True)
te = data.loc[data["train"] == 0].copy().reset_index(drop=True)
FE = ["Delta"] + ["x_geno","x_plant","x_stage","x_side"]
X = tr[FE].values
Xe = te[FE].values
y = tr[TGT].values
grp = tr["Genotype"].values
NFOLDS = 10
skf = StratifiedKFold(n_splits=NFOLDS)
FOLDS = list(skf.split(X,grp))
oof = np.zeros(y.shape)
pe = 0.0
for idx in range(NFOLDS):
    tr_idx, val_idx = FOLDS[idx]
    clf = xgb.XGBRegressor(max_depth=4, n_estimators=80, learning_rate=0.025)
    clf.fit(X[tr_idx], y[tr_idx])
    oof[val_idx] = clf.predict(X[val_idx])
    pe += clf.predict(Xe) / NFOLDS  # average test predictions across folds
    print("FOLD:", idx)
oof = np.round(oof, 2)
CV = mean_squared_error(y, oof, squared=False)  # squared=False -> RMSE
print("CV RMSE:", CV)
sub = te[[ID]].copy()
sub[TGT] = pe
sub.to_csv(f"{PATH}/submission.csv", index=False)
end = time.time()
print(f"Elapsed Time: {end - start} seconds")
Wow! First of all, thanks for sharing your approach, but I'm just wondering whether a solution that doesn't use the images at all is still valuable to the client. This is mainly a computer vision challenge: "Can you estimate cassava root volume from underground scanning images?"
Anyway, I just wanted to share my opinion on this.
I think there is signal in the images, because the Delta feature uses the images indirectly. A bigger dataset would have allowed us to really assess the importance of the images.
@PUBG You took the words right out of my mouth. Anyway, let's see what the clients consider their preferred solution. The onus is on them.
This is also a concern of mine. However, so far it's been shown that predictions made using the images are inferior to those without (most definitely due to the small sample size), so any model that incorporates the images would be less valuable to the client. Also, as @wizzard stated, the tabular data contains some features linked to the images, and those seem to yield better results. In the end, though, the decision is left to Zindi and the clients.
Thanks for sharing, nice work keeping it as simple as possible. Maybe I misunderstood the challenge, as it appeared to be a computer vision or image-processing task, but you didn't use the images and your model succeeded, so very cool work.
@wizzard well played.
Amazing. One question: did you have strong conviction in the cross-validation scores you got with this approach? My teammate had such a model initially, but in the end it was difficult to justify selecting it given the CV of the models with image features.
Not really, but I chose it thinking it might be successful on the private data. The size of the dataset didn't reward big models that used the images.
Your intuition paid off. Congrats!
Congratulations on your victory! Your solution is well deserved. However, it seems the host would be happier if an image-based solution ranked high 😂
Hi @Wizzard,
First of all, congratulations on taking part in the challenge. How does your solution take the provided images into account? Recall that the objective of the challenge was primarily to use computer vision techniques to estimate the volume from the images, so it is a two-step process.
Hello @Joel. One of the challenges was to choose the appropriate depth range for the images. I tried many, but finally I stuck with the proposed start and end layers in the provided metadata. My idea is simple: the higher the delta (end − start), the higher the volume should be, since the plant is assumed to take up more space. So I only used that feature. We all experienced that models which used extensive image features (object detection, feature maps from conv nets, etc.) didn't generalize well on the private leaderboard. Though it is legitimate to enforce the use of the images, I will always stick with the more efficient model; in my case, I think it uses the images indirectly. To me, this is a predictive challenge, and we should not forget that one of the primary goals is to generalize to unseen data.
I think the flaw is in the structure of the competition: the size of the data meant that trying to increase your private score with computer vision techniques was effectively overfitting. Also, there's no way a pure computer vision model would beat the baseline of predicting the average volume per genotype. The only way it could do so is if the images were used together with the tabular data, and since that scores worse than the tabular data alone, the images were effectively noise. The question is: should the winning solution go to the team that incorporated the noise and was best at ignoring it? Plus, we were able to beat the baseline score because we added features linked to the images, so the images were used, albeit indirectly.
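For concreteness, here is a minimal sketch of that genotype-mean baseline. The column names (Genotype, RootVolume) are taken from the snippet above, but the tiny frames below are made-up stand-ins for Train.csv and Test.csv:

```python
import pandas as pd

# Hypothetical tiny frames standing in for Train.csv / Test.csv
tr = pd.DataFrame({"Genotype": ["A", "A", "B"], "RootVolume": [2.0, 4.0, 10.0]})
te = pd.DataFrame({"Genotype": ["A", "B"]})

# Baseline: predict each genotype's mean RootVolume,
# falling back to the global mean for genotypes unseen in training.
geno_mean = tr.groupby("Genotype")["RootVolume"].mean()
pred = te["Genotype"].map(geno_mean).fillna(tr["RootVolume"].mean())
print(pred.tolist())  # [3.0, 10.0]
```

Any image-based model would first have to beat this one-liner to justify its extra complexity.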
Hi @AJoel,
My solution is purely based on the radar data from the left and right side images of the plant, with a CV of 0.90 MAE (I used the MAE score for generalization purposes, since the dataset is small), and I got 1.40042721 on the LB (182nd rank).
Maybe the problem is the evaluation metric: root mean squared error is sensitive to outliers.
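To illustrate the point with made-up residuals (not competition data): a single outlier inflates RMSE far more than MAE.

```python
import numpy as np

# Toy predictions: four small errors plus one large outlier.
y_true = np.array([1.0, 1.0, 1.0, 1.0, 1.0])
y_pred = np.array([1.1, 0.9, 1.1, 0.9, 6.0])  # last point is the outlier

err = y_pred - y_true
mae = np.mean(np.abs(err))        # (4 * 0.1 + 5.0) / 5 = 1.08
rmse = np.sqrt(np.mean(err ** 2))  # sqrt((4 * 0.01 + 25) / 5) ~= 2.24
print(mae, rmse)  # RMSE is roughly double the MAE here
```

Squaring the errors means the one bad prediction dominates the score, which is exactly the sensitivity being discussed.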
I agree, it should have been MAE since the beginning.
My model is very image-based, LOL.
That's an outside-the-box solution. Appreciate you sharing it!