Sasol Customer Retention Recruitment Competition

Start

Oct 05, 23

Nov 26, 23

Reveal

Nov 26, 23

skaak

Ferra Solutions

The bokke won by 1 point ...

Platform · 23 Nov 2023, 08:53 · 6

Just some trivia for your entertainment.

The difference between the top spots is about 0.000005 (yes, 5e-6!) at the moment. Given the size of the test set and the ~20% that is used for the public leaderboard (so the public score covers ~76,000 observations), it means that the top spots (probably) differ by just a single prediction among those public observations!

Talk about small margins ... 1 / 76,000 ... Rassie would be proud ...

Code to simulate this, fwiw:

import numpy as np
from sklearn.metrics import f1_score

# Configure: public-set size and the two top public scores
n = 76000
f11 = 0.699872184
f12 = 0.699867441

# Simulate: random binary labels, predictions start as a perfect copy
rng = np.random.default_rng(41)
act = rng.integers(2, size=n)
prd = act.copy()

# Zero out a prefix so F1 starts near 0.70, then keep zeroing until it reaches f11
i = 35000
prd[:i] = 0
while f1_score(act, prd) > f11:
    prd[i] = 0
    i += 1
print(f"Changed {i} values for {f1_score(act, prd)} want {f11}")

# Keep zeroing until F1 drops to f12; k counts only flips that turned a 1 into a 0
k = 0
j = i
while f1_score(act, prd) > f12:
    if prd[j]:
        k += 1
    prd[j] = 0
    j += 1
print(f"Changed {j} values for {f1_score(act, prd)} want {f12} {k} real changes")
print(f"Difference is {f11 - f12:.9f}")
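For what it's worth, the size of one "step" in F1 can also be sketched from first principles instead of by simulation. Written in confusion counts, F1 = 2TP / (2TP + FP + FN), so a single changed prediction moves it by roughly 1 / (2TP + FP + FN), give or take a small factor. The helper names and the counts below are hypothetical, chosen only to land near 0.70. They are not the real public leaderboard counts, which are not known here.

```python
def f1_from_counts(tp: int, fp: int, fn: int) -> float:
    """F1 written directly in confusion-matrix counts: 2TP / (2TP + FP + FN)."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def single_flip_drops(tp: int, fp: int, fn: int) -> dict:
    """How much F1 falls when exactly one prediction changes for the worse."""
    base = f1_from_counts(tp, fp, fn)
    return {
        # a correct positive becomes a miss: TP - 1, FN + 1
        "tp_to_fn": base - f1_from_counts(tp - 1, fp, fn + 1),
        # a correct negative becomes a false alarm: FP + 1
        "tn_to_fp": base - f1_from_counts(tp, fp + 1, fn),
    }


# Hypothetical counts (NOT the real leaderboard data), picked to give F1 near 0.70
tp, fp, fn = 20500, 0, 17571
print(f"F1 = {f1_from_counts(tp, fp, fn):.9f}")
for name, drop in single_flip_drops(tp, fp, fn).items():
    print(f"{name}: F1 drops by {drop:.9f}")
```

The step size depends on the confusion counts, so the gap f11 - f12 (about 0.0000047) can only be turned into a number of differing predictions once the real counts are plugged in.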

Discussion 6 answers

skaak

Ferra Solutions

@wuuthraad this clearly shows I'm an "applied" mathematician. If I were a real one I'd derive it from first principles ...

23 Nov 2023, 08:55

Upvotes 1

wuuthraad

My money is on everyone's CV scores. Did you cross-validate, @skaak? I was just optimizing my recall and precision scores individually, which helps me understand where the model might be going astray. Like I mentioned in an earlier post, FE has been surprisingly helpful. In all honesty my model is fairly straightforward ... it's "simple", for lack of a better word. I am just balancing the bias-variance tradeoff, you know: keeping the model as simple as possible but also as predictive as possible. That is why I am going FE, CV, then modelling (Red Bull too).

Won't be surprised if there's a shake-up in the LB. These scores are cutting it close, and nobody seems to be breaking the .70 wall (yet).

23 Nov 2023, 17:09

Upvotes 1

wuuthraad

One point is all you need, or in this case a fraction of a point.

replied to wuuthraad · 23 Nov 2023, 17:10

Upvotes 1

skaak

Ferra Solutions

Yeah, have emptied my bench (as @adamjcordy also did) and managed to join the 0.6999 club (with sub 99 fwiw) ... few hours more and will see if it is overfit or elegance.

replied to wuuthraad · 26 Nov 2023, 20:17

Upvotes 1

wuuthraad

@skaak so you did breach the .70 wall. Weirdly enough, I submitted one final csv after adding a few new features and it outperformed all my previous subs. That was around 1:58 AM. I had a feeling it was going to do well, but the moment I selected it to be part of my evaluation subs an error message showed up. Hahahaha, ironic, isn't it? BTW the score I got at the 11th hour was 0.699455367 (private LB). Churchill would be proud. Congrats on beating me on the LB yet again! Hahahaha

And in true Bokke fashion you won it by one point!

replied to skaak · 27 Nov 2023, 00:12

Upvotes 0

skaak

Ferra Solutions

Tx @wuuthraad and also congrats. Pity you could not select that final sub - that would take you from 10 to 7? Not long ago I had similar trouble - repeatedly ran into an error in the final hour ... you must talk to Amy, perhaps she can do something for you.

Often when I post or submit, I get an error, and when I reload it just goes away. Over here I also did some last minute changes and was wondering if I am not making a huge mistake, but this time at least it worked and I got the bit of LB lift I needed. fwiw I consistently had lower CV than LB and I see most private LB is also higher than public.

Wow what a comp ... I'll set up a chat, just a kind of review and open discussion around this. @MakalaMabotja also asked for something like that, but I'll post some invite here, perhaps with a bit of a review of my approach, and then we discuss, especially if you can also describe your approach a bit.

Congrats @Sivuyile_Nzimeni you got a nice lift on the LB wow.

replied to wuuthraad · 27 Nov 2023, 05:25

Upvotes 0
