@Zindi, kindly check that the reference file or evaluation metric is correct. The sample submission file filled with zeroes already gives >90% accuracy, while any slight deviation drops the accuracy to around 0%. (I could be mistaken.)
Hi Julius, I initially shared your confusion regarding why accuracy is the metric and why a simple submission can achieve such high accuracy. Here's my understanding, though I could be wrong:
This is a multiclass classification challenge where the classes represent the values extracted from the PDFs. The simple submission reveals that about 90% of the values are 0. This occurs because many AMKEYs are not mentioned in the PDFs, so their values are set to 0. Similarly, in the training data these values are set to null, which makes up about 91% of the data. So the imbalance is similar between the training and evaluation datasets.
The task involves extracting 511 AMKEY values from 12 companies for the year 2022. If an AMKEY is not found in a document, a 0 is assigned to its value.
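To see why the all-zeros baseline scores so high, here is a toy sketch (the 91% figure is the proportion of nulls mentioned above; the arrays are made up for illustration):

```python
# Toy illustration: if ~91% of targets are 0, an all-zeros
# submission is correct on exactly that fraction.
y_true = [0] * 91 + [1] * 9   # hypothetical split: 91% zeros, 9% non-zero
y_pred = [0] * 100            # all-zeros baseline submission

# Plain accuracy: fraction of exact matches.
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
# accuracy == 0.91
```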
Certainly, @Wajdi_Hajji... Nevertheless, when I manually modified just five values for five distinct companies associated with a specific AMKEY, values I was confident were accurate, the accuracy plummeted to 0%.
Hi @JuliusFx, please ensure the "2022_Value" column in your submission is of type float (the same as the original target variable's type), since in a challenge like this even "0" and "0.0" are treated as different "classes".
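A minimal sketch of that fix with pandas (the toy frame and filename are placeholders; the real column name "2022_Value" comes from the sample submission):

```python
import pandas as pd

# Hypothetical submission frame; in practice, load your generated submission.
sub = pd.DataFrame({"ID": ["A_1", "A_2"], "2022_Value": [0, 12]})

# Cast the target column to float so a 0 is written out as "0.0",
# matching the type expected by the scorer.
sub["2022_Value"] = sub["2022_Value"].astype(float)
sub.to_csv("submission.csv", index=False)
```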
Yea, it now scores, thanks @Nelly43. Maybe you could also check that when two submissions have the same score, only the earliest is considered. I noticed that my latest submission, which scores the same as an earlier one, is the one being counted, and I moved down the leaderboard.
I'd recommend reaching out to Zindi directly to double-check and clarify the accuracy concerns. They should be able to assist you and ensure everything aligns properly.
Hi Julius, We will look into this and get back to you by 3 January.
Yea, I think it was clarified.