🏦 This Week on Zindi: Issue with linear model

Absa Corporate Client Activity Forecasting Challenge

Helping South Africa

$5 000 USD

Completed (~3 years ago)

Skills you will learn

Forecast

151 joined

46 active


Start

Nov 01, 22

Nov 27, 22

Reveal

Nov 27, 22

Delta_E

Issue with linear model

Data · 8 Nov 2022, 17:37 · 3

I attempted to start off with a linear model and continuously adjust it. That is, start with a model that guesses the same event for all inputs and continuously refine it. Event 14 was the most frequent event in the training data, so I used this as the starting model. However, I obtained a test score of 0. I moved on to the second most frequent event and still obtained the same result. I eventually tried all 25 events, with the same result each time.

This would imply that none of the events in the Training data appear in the Test data. This would be quite odd.

I had interpreted the target variable to be the 'event' column, and to be categorical in nature (although the categories are represented by numbers). Is this interpretation correct? I'm unsure what I am missing here.

Thanks.
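For reference, the constant "guess model" described in this post can be sketched as below. This is only an illustration: the `train_events` list is made-up toy data standing in for the training 'event' column, and the helper names `majority_event` and `constant_predictions` are hypothetical, not part of any competition starter code.

```python
from collections import Counter

# Toy stand-in for the training 'event' column (values are made up).
train_events = [14, 30, 14, 7, 14, 30, 14, 2, 30, 14]


def majority_event(events):
    """Return the most frequent event and its share of all rows."""
    counts = Counter(events)
    event, n = counts.most_common(1)[0]
    return event, n / len(events)


def constant_predictions(event, n_rows):
    """Predict the same event for every row (the constant baseline)."""
    return [event] * n_rows


baseline_event, share = majority_event(train_events)
preds = constant_predictions(baseline_event, 4)
```

A baseline like this only scores above 0 if the submission file is in the format the scorer expects, so a score of 0 for every constant guess may point at the file format rather than at the data.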

Discussion 3 answers

TheNoCodeMovement

You just have to predict whether each user in the test set "does" event 14 on the given date-time pair. So your prediction is a 1 or a 0.

8 Nov 2022, 20:06

Upvotes 1
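If the reply above is right that each submitted value must be a 1 or a 0, then a file containing event labels such as 14 would not match the expected format. A minimal sketch of the difference follows; the function name `to_binary_indicator` and the sample lists are hypothetical, and the real submission layout should be checked against the competition's sample submission file.

```python
def to_binary_indicator(predicted_events, target_event):
    """Map label-style predictions to 1 (target event) or 0 (anything else)."""
    return [1 if e == target_event else 0 for e in predicted_events]


# What a label-style "guess" file might contain (made-up values).
label_style = [14, 14, 30, 14]

# The same guesses expressed as a 0/1 indicator for event 14.
binary_style = to_binary_indicator(label_style, 14)
```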

Delta_E

Sorry, I replied to my own post in the discussion by mistake. That reply was meant to go here. I can send you the file so you can see what I mean.

replied to TheNoCodeMovement · 9 Nov 2022, 16:22

Upvotes 0

Delta_E

@TheNoCodeMovement That is not what I was asking. About 40% of the training data has event = 14. I wanted to use the guess 'event = 14' as an initial model (this would be the best 'random' guess), then upgrade to a model that selects between event = 14 and event = 30, making it slightly better, and continue adding events to hopefully improve the model.

I submitted a file with this event = 14 'guess model'. The score came back as 0. So I did the same for each of the other events ('event = x') to see which was the best 'guess model' to start from. All 25 files came back with a score of 0. But that should be impossible, because it would imply that the training data has target values that do not appear in the test data at all.

I do see some players with a score on the leaderboard, and a number of others with a score of 0. But I don't get what I'm missing.

9 Nov 2022, 16:17

Upvotes 0
