
Fossil Demand Forecasting Challenge

$5 000 USD
Completed (over 3 years ago)
Forecast
1009 joined
200 active
Start
May 24, 22
Close
Aug 28, 22
Reveal
Aug 28, 22
Data duplicates
Help · 8 Aug 2022, 11:36 · 3

Hi, I see some duplicate sku_name & date pairs, and only in the train data, e.g.

Does anyone know whether they are real duplicates that we should eliminate (I mean, some kind of data entry problem)? Or is there some other reason for them to exist?

Discussion 3 answers
skaak
Ferra Solutions

Good question ... I *think* there was a price change in the middle of the month and you get the data on the two sides of that.

8 Aug 2022, 11:48
Upvotes 0

Thanks, I have just checked. Not all of the duplicates involve a price change. For the ABEMULAASHL sku there was no change, yet we have 4 records for 2019-10. And even where there was a price change, in most cases we see something like ABEAHAMASHL (2 duplicates at one price and 2 at another).

Another strange thing is that all the duplicates are for 2019-10... That's why I thought it may be some data collection problem.
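
For anyone who wants to repeat this check, here is a minimal pandas sketch. It assumes the train data has `sku_name` and `date` columns as named above. The `price` column name, the helper `summarize_duplicates`, and the toy rows are hypothetical stand-ins, not the competition data.

```python
import pandas as pd


def summarize_duplicates(df, keys=("sku_name", "date"), price_col="price"):
    """Summarize every key combination that appears more than once.

    Returns one row per duplicated (sku_name, date) group with the number of
    rows in the group and the number of distinct prices among them.
    `price_col` is an assumed column name.
    """
    keys = list(keys)
    # keep=False marks every member of a duplicated group, not just the repeats
    dup_rows = df[df.duplicated(subset=keys, keep=False)]
    return (
        dup_rows.groupby(keys)[price_col]
        .agg(n_rows="size", n_prices="nunique")
        .reset_index()
    )


# Toy data (made up) that mimics the two patterns described above
train = pd.DataFrame(
    {
        "sku_name": ["A", "A", "A", "A", "B", "B", "C"],
        "date": ["2019-10"] * 6 + ["2019-11"],
        "price": [10.0, 10.0, 12.0, 12.0, 5.0, 5.0, 7.0],
    }
)

summary = summarize_duplicates(train)
print(summary)

# How the duplicated groups are spread over months
print(summary["date"].value_counts())
```

A group with `n_prices` equal to 1 cannot be explained by a price change, and a `value_counts` result concentrated in a single month would be consistent with a one-off data collection problem rather than a recurring pattern.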

skaak
Ferra Solutions

Nice spot Ililily ... this data is a bit broken ... looks like duplicates. I'd suggest just ignoring 44893, 44895, 44575 and 44577?
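
If those four numbers are row index labels in the train file (an assumption, since the thread does not say what they refer to), ignoring them could look roughly like the sketch below. The helper name `drop_rows_by_label` and the toy frame are hypothetical.

```python
import pandas as pd

# Labels suggested above; assumed here to be row index labels of the train data
SUSPECT_ROWS = [44893, 44895, 44575, 44577]


def drop_rows_by_label(df, labels):
    """Return a copy of df without the given index labels.

    errors="ignore" skips labels that are not present instead of raising.
    """
    return df.drop(index=labels, errors="ignore")


# Toy frame (made up) whose index happens to contain the suspect labels
toy = pd.DataFrame(
    {"sku_name": ["X", "X", "X", "Y", "Y", "Y"], "date": ["2019-10"] * 6},
    index=[44575, 44576, 44577, 44893, 44894, 44895],
)

cleaned = drop_rows_by_label(toy, SUSPECT_ROWS)
print(cleaned)
```

If the numbers turn out to be something other than index labels, `df.drop_duplicates(subset=["sku_name", "date"])` is another option, though it keeps only the first row per pair and would also discard rows that differ in other columns.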