Unifi Value Frameworks PDF Lifting Competition

Helping South Africa
$5 000 USD
Challenge completed over 1 year ago
Generative AI
450 joined
73 active
Start
Dec 21, 21
Close
Mar 17, 24
Reveal
Mar 17, 24
HackP
National School Of Computer Science (ENSI) - Tunisia
Train.csv might have some wrong labeled values
Data · 10 Mar 2024, 16:48 · 6

Hello all, I hope you are having fun with this awesome competition. I would like to raise a critical point that left me unsure whether to trust Train.csv or not.

When I started my EDA to see how the values for the year 2021 had been collected, I noticed that the labeling might have some issues. For example, for the Impala company, whose PDF is ESG-spreads.pdf, I started by selecting the Train.csv rows that have this Group to see which metrics it has (as the photo below shows).

Focusing on metric 128: Total Direct CO2

I went back to the PDFs and found that the values there are not the ones given for this metric in Train.csv. Should we rely on Train.csv in that case?
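For anyone who wants to repeat the row selection described above, here is a minimal pandas sketch. The column names (`Group`, `MetricID`, `Metric`, `Value_2021`), the helper `rows_for`, and all values are placeholders I made up for illustration; the real Train.csv layout may differ, so adjust them to the actual file.

```python
import pandas as pd

# Toy stand-in for Train.csv. Column names and values are placeholders,
# NOT taken from the real competition file.
train = pd.DataFrame(
    {
        "Group": ["Impala", "Impala", "OtherCo"],
        "MetricID": [128, 130, 128],
        "Metric": ["Total Direct CO2", "Some other metric", "Total Direct CO2"],
        "Value_2021": [1.0, 2.0, 3.0],
    }
)


def rows_for(df, group, metric_id=None):
    """Return the rows for one company, optionally narrowed to one metric."""
    mask = df["Group"] == group
    if metric_id is not None:
        mask &= df["MetricID"] == metric_id
    return df.loc[mask].reset_index(drop=True)


# All metrics recorded for the company, then only metric 128.
impala = rows_for(train, "Impala")
impala_co2 = rows_for(train, "Impala", metric_id=128)
print(impala)
print(impala_co2)
```

With the real file you would replace the toy frame with `pd.read_csv("Train.csv")` and then compare the selected rows against the figures printed in the PDF by hand.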

Photo Link:

https://drive.google.com/file/d/1wG60luQtKMb_fZymvCLMBGH-wFy9OQt1/view?usp=sharing

Discussion 6 answers
Juliuss
Freelance

I was about to start this thread. Yes, Train.csv is terrible, and it's not only for Impala. Data entry issues? If the file we are scored against also has these issues, it's an even bigger problem. @Zindi ??

10 Mar 2024, 16:51
Upvotes 0
HackP
National School Of Computer Science (ENSI) - Tunisia

Sorry, there was an error with the picture; it is now uploaded correctly.

We need clarification on this issue, as it might affect our approaches.

Koleshjr
Multimedia university of kenya

Actually, it's not an error. Dive into the data and work out how they got that value, because it is in fact correct. I made the same assumption when I started, but after more analysis I found that it's not a data entry issue.

10 Mar 2024, 17:07
Upvotes 2
Juliuss
Freelance

This is interesting, let me have a look

Juliuss
Freelance

@Koleshjr, you're right! It introduces a level of complexity that I cannot manage to deal with in these final few hours 😅.

Koleshjr
Multimedia university of kenya

😂😂 told you guys the data was clean