
Fault Impact Analysis: Towards Service-Oriented Network Operation & Maintenance by ITU

8 000 CHF
Completed (over 2 years ago)
Classification
273 joined
89 active
Start
Jul 26, 23
Close
Aug 18, 23
Reveal
Aug 18, 23
Amy_Bray
Zindi
Updated SampleSubmission
Data · 28 Jul 2023, 12:20 · 19

Dear Zindians,

The SampleSubmission file has been updated and now has 1932 records. The reference files have also been updated and the leaderboard has been rescored.

All the best for this exciting challenge!

Discussion 19 answers

Hi, Amy.

The test data is not there. Now there is only the SampleSubmission CSV.

28 Jul 2023, 12:57
Upvotes 5
HungryLearner

And now there is no test data. Or am I missing something?

28 Jul 2023, 18:14
Upvotes 0
Elhassnaoui

Could someone please tell me where I can find the data? I have already followed all the steps to register on this website: https://challenge.aiforgood.itu.int/match/matchitem/78, but I am unable to locate the data.

Could you please tell us where the test data is?

30 Jul 2023, 16:09
Upvotes 0

Hello, I am checking whether there is a problem with data availability. Sorry for this delay.

Antonio

31 Jul 2023, 10:25
Upvotes 0

Is there any timeline for when the test data will be uploaded?

Amy_Bray
Zindi

Hello, the validation data is now available.

1 Aug 2023, 14:05
Upvotes 0
yanteixeira

@amyflorida626

Has the data changed, or is it the same as before?

Is there a reason why the team deleted the validation data?

Amy_Bray
Zindi

I would recommend downloading the new data in case one or two features changed.

HungryLearner

@amyflorida626, many of the new validation files have missing "access_success_rate" values.

To be specific, 572 files have access_success_rate as all NaN, e.g. B0017-25_27.csv.csv.

This was not the case for the old validation data provided. Is this intentional?
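A check like the one described above can be sketched as follows. This is only a minimal sketch: it assumes each validation file is a CSV with an `access_success_rate` column and that the final row is the fault row. The helper names `is_missing` and `all_nan_before_fault`, the rule that empty cells count as missing, and the sample values are assumptions for illustration, not part of the competition kit.

```python
import csv
import io
import math


def is_missing(value):
    """Treat None, empty cells, and NaN-like text as missing (an assumption)."""
    if value is None or value.strip() == "":
        return True
    try:
        return math.isnan(float(value))
    except ValueError:
        return False


def all_nan_before_fault(csv_text, column="access_success_rate"):
    """Return True if `column` is missing in every row except the last.

    The last row is assumed to be the fault row, which is expected to be blank.
    """
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    history = rows[:-1]
    return bool(history) and all(is_missing(r.get(column)) for r in history)


# Invented sample: two pre-fault rows with no value, then the fault row.
sample = "access_success_rate,fault_duration\nNaN,\n,\n,120\n"
print(all_nan_before_fault(sample))
```

To scan a folder, one could read each file's text and count how many return True; that count would be the figure to compare against the 572 files mentioned above.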

3 Aug 2023, 08:47
Upvotes 0

The data during the faults is not reported, because otherwise you would not need a model: you could just calculate the data rate trend by comparing the rate before and during the fault. The previously provided validation data is a processed version of the currently available data. It also does not contain the data measured during the fault, only the data collected before the fault plus the fault duration and the relation between the node where the fault occurs and the node where the rate is measured.
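To illustrate why the during-fault measurements are withheld: if they were published, the trend would follow from a direct comparison, with no model needed. The sketch below is hypothetical. The function name `rate_decreased`, the use of the mean as the summary statistic, and the sample numbers are all assumptions, not the organizers' definition of the target.

```python
from statistics import mean


def rate_decreased(rate_before, rate_during):
    """True if the mean rate during the fault is below the mean rate before it."""
    return mean(rate_during) < mean(rate_before)


# Invented numbers, for illustration only.
print(rate_decreased([10.0, 12.0], [6.0, 7.0]))
```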

HungryLearner

I am referring to the access_success_rate for the data collected before the fault.

I see. In this case, handling anomalies and missing values is part of the challenge.

HungryLearner

In the previously provided validation data, the values are provided for the instance just before the fault occurs. This is now missing for some files.

I wish there were an option for posting figures. I would have added a pictorial example of our observation.

You can point out the specific file to me. In general, as I said, managing missing data is part of the challenge.

HungryLearner

Examples include

'B0017-25_27.csv.csv'

'B0017-32_16.csv.csv'

'B0017-33_16.csv.csv'

For instance, B0017-25_27 below

compared to the old processed validation data

One can clearly see that all the access_success_rate values are NaN, even before the fault, in the new validation data for this particular ID.

yanteixeira

My understanding is that these lines with NaN values are your target. You want to know whether the data rate goes up or down.

HungryLearner

No, the targets are actually not any of the provided columns.

For the lines before the last, there is no fault, so there should be no NaN.

For the last line (when there is a fault), all other columns are intentionally set to NaN except the fault_duration and relation columns.
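The file layout described here (pre-fault rows followed by a single fault row) can be separated along these lines. This is a sketch under stated assumptions: the column names `fault_duration` and `relation` come from the thread, while the helper name `split_history_and_fault`, the empty cells in the pre-fault rows, and all sample values are invented for illustration.

```python
import csv
import io


def split_history_and_fault(csv_text):
    """Split a file into its pre-fault rows and the final (fault) row."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    if not rows:
        raise ValueError("file has no data rows")
    return rows[:-1], rows[-1]


# Invented sample: two pre-fault rows, then the fault row.
sample = (
    "access_success_rate,fault_duration,relation\n"
    "0.98,,\n"
    "0.97,,\n"
    ",120,1\n"
)
history, fault_row = split_history_and_fault(sample)

# Only these two columns are expected to carry values in the fault row.
fault_inputs = {key: fault_row[key] for key in ("fault_duration", "relation")}
print(len(history), fault_inputs)
```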

As I said, it is part of the job: you can try to infer the data or neglect it.
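The two options mentioned (infer or neglect) might look like the sketch below for a single feature. It is one possible approach, not the organizers' recommendation. The helper name `impute_or_drop` and the choice of forward fill with a back-fill for leading gaps are assumptions; a file whose feature is entirely NaN returns None so the caller can drop that feature.

```python
import math


def impute_or_drop(values):
    """Forward-fill gaps in a pre-fault series, or return None if nothing is observed."""
    observed = [v for v in values if not math.isnan(v)]
    if not observed:
        return None  # "neglect": the caller drops this feature for the file
    filled = []
    last = observed[0]  # back-fill any leading gap with the first observed value
    for v in values:
        if not math.isnan(v):
            last = v
        filled.append(last)
    return filled


nan = float("nan")
print(impute_or_drop([nan, 0.9, nan, 0.8]))
print(impute_or_drop([nan, nan]))
```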