CGIAR Crop Damage Classification Challenge

Helping Africa

$10 000 USD

Completed (over 2 years ago)

Skills you will learn

Classification

1148 joined

346 active

Start

Oct 27, 23

Jan 28, 24

Reveal

Jan 28, 24

Koleshjr

Multimedia university of kenya

Dirty Data?

Platform · 10 Jan 2024, 08:34 · 5

The third image is definitely mislabelled, right?

Discussion 5 answers

nobody2

The dataset has many images like that (mislabelled, in my opinion), even in the Data column introduction.

10 Jan 2024, 08:41

Upvotes 1

Neo

Were you using all the data, or did you clean some images? The best I am getting using all the data is in the 0.6xx range. Thanks in advance.

replied to nobody2 · 10 Jan 2024, 10:07

Upvotes 0

nobody2

I used all the data for training. TTA and splitting the data into 5 folds may help you, or a larger model.

replied to Neo · 10 Jan 2024, 12:28

Upvotes 1
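
For readers unfamiliar with the two suggestions above, here is a minimal sketch of a stratified 5-fold split and test-time augmentation (TTA). It is not nobody2's actual code: `stratified_folds`, `tta_predict`, `predict_fn`, and the flip helpers are hypothetical names, and the image is a plain nested list standing in for real pixel data.

```python
import random
from collections import defaultdict


def stratified_folds(labels, n_folds=5, seed=0):
    """Assign each sample index to a fold, keeping class proportions roughly equal."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    fold_of = [None] * len(labels)
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal the shuffled indices of each class round-robin across folds.
        for position, idx in enumerate(indices):
            fold_of[idx] = position % n_folds
    return fold_of


def identity(image):
    return image


def horizontal_flip(image):
    """Reverse every row of a nested-list image."""
    return [list(reversed(row)) for row in image]


def tta_predict(predict_fn, image, augmentations):
    """Average class probabilities over several augmented views of one image."""
    views = [predict_fn(augment(image)) for augment in augmentations]
    n_classes = len(views[0])
    return [sum(view[c] for view in views) / len(views) for c in range(n_classes)]


# Toy usage: predict_fn is a stand-in for a trained model (assumption).
def toy_predict(image):
    p = image[0][0]
    return [p, 1.0 - p]


fold_of = stratified_folds(["a"] * 10 + ["b"] * 5, n_folds=5)
averaged = tta_predict(toy_predict, [[1.0, 0.0]], [identity, horizontal_flip])
```

In practice one model is trained per fold (each fold held out once for validation), and the per-fold TTA predictions on the test set are averaged.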

Koleshjr

Multimedia university of kenya

And is this consistent in the test set too, @Zindi? If the test set is free of these mislabelled images, then we have to clean the training data. But if the test set also contains mislabelled images, what are we supposed to do? Account for them in our training? @nobody2 @flamethrower @sinchinov @Mohamed_Salam_Jedidi, thoughts?

10 Jan 2024, 09:55

Upvotes 6

hashman

Good question. In the training set, it looks straightforward to just remove the mislabelled/dissimilar images. For the test set, I am skeptical.

replied to Koleshjr · 10 Jan 2024, 11:51

Upvotes 1
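
The thread does not say how to find the mislabelled training images. One common approach, shown here only as a sketch under assumptions, is to flag samples whose out-of-fold predicted probability for their given label is very low, then review those by eye. `flag_suspect_labels`, `oof_probs`, and the `threshold` value are hypothetical and not taken from the discussion.

```python
def flag_suspect_labels(oof_probs, labels, threshold=0.1):
    """Return indices whose given label received a low out-of-fold probability.

    oof_probs: one list of class probabilities per sample, each predicted by a
               model that did not see that sample during training.
    labels:    the given (possibly wrong) integer class label per sample.
    threshold: assumed cut-off; tune it on your own data.
    """
    suspects = []
    for idx, (probs, label) in enumerate(zip(oof_probs, labels)):
        if probs[label] < threshold:
            suspects.append(idx)
    return suspects


# Toy usage with made-up probabilities for three samples and two classes.
oof_probs = [[0.9, 0.1], [0.05, 0.95], [0.6, 0.4]]
labels = [0, 0, 1]
suspects = flag_suspect_labels(oof_probs, labels)
```

Flagged samples are only candidates for review. If the test set contains the same kind of label noise, as asked above, dropping them may not improve the leaderboard score.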
