🤖 This Week on Zindi: LB v/s CV

Fault Impact Analysis: Towards Service-Oriented Network Operation & Maintenance by ITU

8 000 CHF

Completed (almost 3 years ago)

Skills you will learn

Classification

277 joined

88 active


Start

Jul 26, 23

Aug 18, 23

Reveal

Aug 18, 23

Rakesh_Jarupula

National Institute of Technology Silchar

LB v/s CV

Connect · 4 Aug 2023, 01:41 · 24

Hello all,

I am kind of curious regarding the local f1_score and LB score. For me there is a huge gap between them. Please share your score here to get an idea of what's happening.

Simple XGB:

CV = 0.65

LB = 0.69

I think I am overfitting.
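For anyone comparing numbers in this thread, here is a minimal sketch of how a local F1 can be computed by hand for a binary target. The labels and predictions below are made up, and whether the competition scores the positive class in exactly this way is an assumption.

```python
def f1_binary(y_true, y_pred):
    """F1 for the positive class (label 1); returns 0.0 when there are no true positives."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Made-up validation labels and predictions, for illustration only.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 1, 0, 1, 0]
local_f1 = f1_binary(y_true, y_pred)
```

Averaging this value over several validation folds gives the "CV" number people quote here; the "LB" number is the same kind of metric computed by the platform on hidden test labels.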

Discussion 24 answers

Koleshjr

Multimedia university of kenya

0.69...cv - 0.68... lb

4 Aug 2023, 05:22

Upvotes 0

University of Yaoundé I

stable!

replied to Koleshjr · 5 Aug 2023, 17:47

Upvotes 0

Reacher

CV : 0.8 - LB : 0.68

4 Aug 2023, 13:43

Upvotes 0

Koleshjr

Multimedia university of kenya

0.8 wow!

replied to Reacher · 4 Aug 2023, 13:54

Upvotes 1

Reacher

UPDATE: I had a small bug in my code when creating the labels.

Now CV: 0.7, LB: 0.7

replied to Reacher · 6 Aug 2023, 05:53

Upvotes 0

Koleshjr

Multimedia university of kenya

what is your cv score / lb looking like?

4 Aug 2023, 14:01

Upvotes 0

yanteixeira

It seems that you solved the issue

4 Aug 2023, 17:35

Upvotes 0

Koleshjr

Multimedia university of kenya

what's your cv/lb looking like?

replied to yanteixeira · 5 Aug 2023, 14:43

Upvotes 0

Rakesh_Jarupula

National Institute of Technology Silchar

I tried different ways to generate the training set, and surprisingly there is no consistency in the scores; they vary a lot from one version to the next. I personally think that generating a proper training set that is representative of the objective is key here.

Things I tried:

- Include all the rows from all files

- Include only rows where the fault > 0

- Include only the first row where the fault > 0 in each file

Any new approaches are most welcome.
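The three options listed above can be sketched roughly as follows. This is only an illustration on toy data: the field names `file`, `t`, and `fault` are hypothetical stand-ins, not the competition's actual column names.

```python
# Hypothetical toy data: one dict per row, two files (network elements).
rows = [
    {"file": "ne_a", "t": 0, "fault": 0},
    {"file": "ne_a", "t": 1, "fault": 5},
    {"file": "ne_a", "t": 2, "fault": 3},
    {"file": "ne_b", "t": 0, "fault": 0},
    {"file": "ne_b", "t": 1, "fault": 0},
    {"file": "ne_b", "t": 2, "fault": 7},
]


def all_rows(rows):
    """Option 1: keep every row from every file."""
    return list(rows)


def fault_rows(rows):
    """Option 2: keep only rows where the fault value is positive."""
    return [r for r in rows if r["fault"] > 0]


def first_fault_per_file(rows):
    """Option 3: keep only the earliest positive-fault row of each file."""
    first = {}
    for r in sorted(rows, key=lambda r: (r["file"], r["t"])):
        if r["fault"] > 0 and r["file"] not in first:
            first[r["file"]] = r
    return list(first.values())


sizes = (len(all_rows(rows)), len(fault_rows(rows)), len(first_fault_per_file(rows)))
```

The three sets differ a lot in size and in how many rows each file contributes, which may be one reason the scores are so inconsistent between them.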

replied to yanteixeira · 5 Aug 2023, 16:21

Upvotes 0

yanteixeira

@koleshjr exactly like yours! It seems that we need to improve that hehe

@Rakesh_Jarupula yes, that's also what I'm experiencing. I think two points are essential: 1) the way you set up the training set, and 2) how you impute the NaNs in the test set.
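On the second point, one common way to handle it (not necessarily what anyone in this thread did) is to compute the fill values on the training set only and reuse them on the test set. A minimal sketch, assuming median imputation and a hypothetical numeric column named `data_rate`:

```python
import math
from statistics import median


def fit_medians(train_rows, columns):
    """Compute per-column medians from the training rows, ignoring NaNs."""
    medians = {}
    for col in columns:
        values = [r[col] for r in train_rows if not math.isnan(r[col])]
        medians[col] = median(values)
    return medians


def impute(rows, medians):
    """Replace NaNs in the fitted columns with the training medians."""
    return [
        {k: (medians[k] if k in medians and math.isnan(v) else v) for k, v in r.items()}
        for r in rows
    ]


# Made-up values, for illustration only.
train = [{"data_rate": 10.0}, {"data_rate": float("nan")}, {"data_rate": 30.0}]
test = [{"data_rate": float("nan")}, {"data_rate": 5.0}]

medians = fit_medians(train, ["data_rate"])
test_imputed = impute(test, medians)
```

Using the same fill values for train and test keeps the two feature distributions comparable, which is one thing to check when CV and LB disagree.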

replied to yanteixeira · 5 Aug 2023, 17:46

Upvotes 1

Koleshjr

Multimedia university of kenya

@yanteixeira Yeah, I have seen you following me closely and I think we have the same approach. I'm shocked by guys like @ff and @reacher getting 0.9.. CV and 0.8 CV respectively; I haven't encountered a CV above the 0.65 - 0.69 region. Also, what I got from @AntonioDeDomenico is that we are supposed to have one row per network element. If that is wrong, I stand to be corrected.

I am referring to this discussion here:

https://zindi.africa/competitions/fault-impact-analysis-towards-service-oriented-network-operation-maintenance/discussions/17921

@AntonioDeDomenico: "Hi Rakesh, I hope I understand your question well. In the training set you need to label the data rate change when the fault occurs, max 1 label per file, comparing the data rate in the row prior to the fault with the data rate measured when the fault appears."
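Read literally, the rule quoted above could be sketched like this. The field names `fault` and `data_rate` are hypothetical, and setting the label to 1 when the data rate drops (and skipping files whose fault is in the very first row) is an assumption on my side, not something confirmed by the organisers.

```python
def label_file(rows):
    """Return (fault_row_index, label) for one file, or None if no label can be made.

    rows: the file's records in time order, each a dict with 'fault' and 'data_rate'.
    Only the first fault occurrence is used, so each file yields at most one label.
    """
    for i, row in enumerate(rows):
        if row["fault"] > 0:
            if i == 0:
                return None  # assumption: no prior row to compare against
            prev_rate = rows[i - 1]["data_rate"]
            # Assumed convention: 1 means the data rate decreased at the fault.
            label = 1 if row["data_rate"] < prev_rate else 0
            return i, label
    return None  # the file contains no fault


# Made-up file contents, for illustration only.
file_with_fault = [
    {"fault": 0, "data_rate": 50.0},
    {"fault": 4, "data_rate": 20.0},
    {"fault": 2, "data_rate": 10.0},
]
file_without_fault = [
    {"fault": 0, "data_rate": 50.0},
    {"fault": 0, "data_rate": 55.0},
]

result = label_file(file_with_fault)
```

Applying this to every file gives one row per network element, which matches the "one row per NE" reading discussed in this thread.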

replied to yanteixeira · 5 Aug 2023, 18:08

Upvotes 0

University of Yaoundé I

I will review my approach.

You are not wrong. We are supposed to have one row per NE.

replied to Koleshjr · 5 Aug 2023, 18:16

Upvotes 0

Charrada

Same issue here! My model doesn't seem to learn anything. Most probabilities are around 0.5, except for some data_rate == 0 rows, which are easy to classify as 0.

replied to ff · 5 Aug 2023, 18:18

Upvotes 0

yanteixeira

@Charrada Right now, you have an LB of 0.69. How can you say your model isn't learning? hahaha

@Koleshjr I'm also confused about how people are getting those high scores. Suddenly, the first position has an LB of 0.71... I think we should rethink our steps and try a different approach.

replied to Charrada · 5 Aug 2023, 18:24

Upvotes 0

Koleshjr

Multimedia university of kenya

@yanteixeira that's trueee, but our CV and LB do seem to correlate. As the old competitive machine learning adage says: always trust your CV 😅

replied to yanteixeira · 5 Aug 2023, 18:27

Upvotes 1

Charrada

@yanteixeira a trick gave me a small boost. But still, my model is not performing well when I look at the distribution of its predictions. Maybe I won't use that submission for the private leaderboard :D

replied to yanteixeira · 5 Aug 2023, 18:28

Upvotes 1

University of Yaoundé I

I struggle to make my CV stable compared to my LB.

CV = 0.63 & LB = 0.54

CV = 0.91 & LB = 0.66
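One possible cause of a CV far above the LB (an assumption, not something verified on this dataset) is that rows from the same file end up in both the training and validation folds. A minimal sketch of file-grouped folds, with made-up file ids:

```python
def grouped_folds(file_ids, n_folds):
    """Build (train_idx, val_idx) pairs so that no file is split across the two sides."""
    unique_files = sorted(set(file_ids))
    # Round-robin assignment of whole files to folds.
    fold_of_file = {f: i % n_folds for i, f in enumerate(unique_files)}
    folds = []
    for k in range(n_folds):
        val_idx = [i for i, f in enumerate(file_ids) if fold_of_file[f] == k]
        train_idx = [i for i, f in enumerate(file_ids) if fold_of_file[f] != k]
        folds.append((train_idx, val_idx))
    return folds


# Hypothetical file id for each training row.
file_ids = ["a", "a", "b", "b", "c", "c", "d"]
folds = grouped_folds(file_ids, 2)
```

scikit-learn's `GroupKFold` offers the same kind of split out of the box; the hand-rolled version here is only meant to show the idea.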

5 Aug 2023, 17:46

Upvotes 0

Koleshjr

Multimedia university of kenya

@ff It seems you and @Yisakberhanu have the same approach, at least for the CV = 0.63 & LB = 0.54 result.

replied to ff · 5 Aug 2023, 18:32

Upvotes 1

Juliuss

Freelance

0.69 CV, 0.70 LB. Initially I had 0.58 CV vs 0.67 LB; I corrected a small bug too, and that brought the scores in line. The competition is interesting: we have a 0.72 now with a single submission! Maybe we will see >= 0.8 before the competition closes? That would be a great solution for @AntonioDeDomenico's problem.

6 Aug 2023, 13:59

Upvotes 1