🤖 Data Talk: Problem Requirement

Problem Requirement

Help · 31 Jul 2023, 06:06 · 26

I don't know if it's just me, but has anyone else noticed that, from the definition of the problem, it does not seem to require a machine learning solution? I have experimented with a rule-based approach and it's better than guessing. Can someone with a different opinion, someone who understands the problem, explain further?

Discussion 26 answers

Multimedia University of Kenya

Honestly, I think there are no restrictions; you are free to choose whatever methods work for you. If a rule-based approach works, use that; if ML works for you, use that.

31 Jul 2023, 06:46

Upvotes 0

This is exactly why an ML model is useful.

31 Jul 2023, 09:37

Upvotes 0

replied to AntonioDeDomenico · 31 Jul 2023, 09:57

National Institute of Technology Silchar

Hi Antonio,

If simple if-else statements are used to label the data, we don't need ML models here. Simple rule-based code will do.

In order to use ML here, the labeling should be based on many factors, not simply if-else statements on the values of two columns.

Please provide us the labeled dataset to apply ML.

Upvotes 0

replied to AntonioDeDomenico · 31 Jul 2023, 10:00

What people got from your presentation is that you label a node as 1 if there is a decrease in data rate whenever a fault occurs. That's why I personally think this is a hack, or the problem requirement is not properly defined.

Upvotes 0

replied to Rakesh_Jarupula · 31 Jul 2023, 10:09

Hi, this is the labeling process in the train set. The data_rate when the fault occurs is not known in the validation data. How do you label the fault impact using if-else in this case?

Upvotes 1

replied to AntonioDeDomenico · 31 Jul 2023, 10:14

This is what I was not getting. Now I understand. But the validation data is not available. I had it when I first downloaded the data, but it's no longer there.

Upvotes 0

replied to lesjones · 31 Jul 2023, 10:22

Can you please tell me which files you can currently download? There should be 1932 input files (where the data_rate when the fault occurs is not known) and a sample submission file (with 1932 values).

Upvotes 0

replied to AntonioDeDomenico · 31 Jul 2023, 10:33

National Institute of Technology Silchar

We can currently download 7256 CSV training files (`imgs/2023050915314323740.rar`) from the ITU platform and the sample submission file from Zindi.

We are not able to access the test data.

And regarding the data_rate values: they are known in the test data that was previously shared.

Here is the test sample:

| | ID | access_success_rate | resource_utilition_rate | TA | bler | cqi | mcs | data_rate | fault_duration | relation |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | B0017-25_24 | 99.357688 | 84.004 | 2.923368 | 14.209819 | 5.582824 | 5.667775 | 1.175289 | 301 | 0.654162 |
| 1 | B0017-25_25 | 99.642289 | 92.242 | 2.877206 | 15.083843 | 5.628569 | 5.051611 | 0.966620 | 145 | 0.654162 |
| 2 | B0017-25_26 | 99.546228 | 80.028 | 3.151677 | 13.437244 | 5.226969 | 4.896700 | 1.561278 | 250 | 0.654162 |
| 3 | B0017-25_27 | 100.000000 | 8.616 | 3.728730 | 8.817188 | 5.947785 | 7.884572 | 10.963935 | 1971 | 0.654162 |
| 4 | B0017-32_1 | 99.597616 | 70.445 | 2.732496 | 12.644968 | 6.445368 | 7.136024 | 4.471131 | 3461 | 0.654162 |
| 5 | B0017-32_10 | 99.781591 | 60.941 | 2.727843 | 12.841164 | 6.161731 | 6.602028 | 3.161234 | 64 | 0.654162 |
| 6 | B0017-32_11 | 99.389205 | 74.666 | 2.750890 | 13.120919 | 6.302626 | 6.807933 | 2.339437 | 93 | 0.654162 |
| 7 | B0017-32_12 | 99.773719 | 62.808 | 2.721176 | 12.314137 | 6.191431 | 7.228925 | 2.901728 | 43 | 0.654162 |
| 8 | B0017-32_13 | 99.832905 | 64.025 | 2.859035 | 12.810227 | 6.218475 | 6.776292 | 4.002578 | 37 | 0.654162 |
| 9 | B0017-32_14 | 99.709197 | 79.159 | 2.648477 | 13.542462 | 6.265120 | 6.653105 | 3.521976 | 76 | 0.654162 |

Sorry for the formatting; I don't know how to upload an image here.

Upvotes 0

replied to AntonioDeDomenico · 31 Jul 2023, 10:38

In the files we can download the data_rate and when the fault occurs are all

Upvotes 0

Multimedia University of Kenya

31 Jul 2023, 11:00

Upvotes 0

replied to Koleshjr · 31 Jul 2023, 11:08

The test set is missing. Or do you have it? I was only asking this because I thought the data was complete. I now understand the problem.

Upvotes 0

replied to lesjones · 31 Jul 2023, 11:14

Multimedia University of Kenya

Oh, sorry about that. Yes, the test set had been posted earlier; I don't know why it's missing. @amyflorida626, can you kindly solve this?

Upvotes 0

replied to lesjones · 31 Jul 2023, 11:14

Multimedia University of Kenya

But you have been scored on the leaderboard. What did you use?

Upvotes 0

replied to Koleshjr · 31 Jul 2023, 11:34

I was experimenting with the data to better understand what I was missing. I just used the sample submission files. That is how I noticed it was missing.

Upvotes 0

replied to AntonioDeDomenico · 1 Aug 2023, 15:31

Hello everyone, the required data is now available. Antonio

1 Aug 2023, 14:47

Upvotes 0

Stark

Wedoo.ai

Thanks, and correct me if I'm wrong, but I guess the folder given to us provides the different test CSV files (without the row in which the fault occurred). Based on that, I'm not really sure what to predict anymore. Can you please clarify this? Thanks :)

Upvotes 0

replied to Stark · 1 Aug 2023, 15:34

For each of these files, you need to predict whether the data_rate in the row where the fault occurs is larger or smaller than the data_rate measured just before the fault occurs. One file, one output, as you can see in the SampleSubmission.

Upvotes 0
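Since the data_rate in the fault row is hidden in these files, "one file, one output" means each file has to be summarized from its pre-fault rows only. Below is a minimal Python sketch of that idea. It assumes each input file is a CSV containing the KPI columns shown in the sample earlier in this thread (`data_rate`, `bler`, `cqi`, `mcs`). The function name `file_features` and the choice of "last value" and "mean" summaries are hypothetical illustrations, not the organizers' method.

```python
import csv
import io
from statistics import mean


def file_features(rows, kpi_cols=("data_rate", "bler", "cqi", "mcs")):
    """Collapse the pre-fault rows of one file into a single feature dict.

    `rows` is a list of dicts as produced by csv.DictReader.
    The summaries (last value, mean) are illustrative assumptions.
    """
    feats = {}
    for col in kpi_cols:
        values = [float(r[col]) for r in rows]
        feats[col + "_last"] = values[-1]    # most recent value before the fault
        feats[col + "_mean"] = mean(values)  # average level over the file
    return feats


# Toy stand-in for one input file (made-up values, not competition data).
demo_csv = """data_rate,bler,cqi,mcs
4.0,12.0,6.0,7.0
2.0,14.0,5.0,5.0
"""
demo_rows = list(csv.DictReader(io.StringIO(demo_csv)))
demo_features = file_features(demo_rows)
```

Each file then yields one feature dict, which could be passed to whatever model is chosen to produce the single prediction for that file.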

Stark

Wedoo.ai

Ok makes more sense, thanks

replied to AntonioDeDomenico · 1 Aug 2023, 15:41

Upvotes 0

Juliuss

Freelance

Hello @AntonioDeDomenico, @amyflorida626

I have a question.

- Can I choose which validation set to use at my discretion? If I decide to stick with the original validation set and disregard the new one, would that pose any issues?

- Alternatively, if I opt to use both validation sets, would there be any problems with that approach as well?

Regards

replied to AntonioDeDomenico · 2 Aug 2023, 06:37

Upvotes 0

replied to AntonioDeDomenico · 2 Aug 2023, 06:50

National Institute of Technology Silchar

Hello Antonio,

Regarding the training files: do we label them at observation level or at file level (because we will be predicting one value per file in the validation)?

- If we label at observation level, then the validation needs the KPI values (except data_rate) when the fault occurs to make a prediction.

- If we label at file level, how do we handle multiple 1 states in a single file?

Thanks

Upvotes 0

replied to Juliuss · 2 Aug 2023, 07:36

Hi Julius, the first validation dataset is just a pre-processed version of the new one, which I transferred to Amy by mistake. Using both of them would just add redundant information. The new one is in line with the training data and includes more info, and your solution will be evaluated based on it. We expect 1932 predictions, one per file in the dataset.

Upvotes 0

replied to Rakesh_Jarupula · 2 Aug 2023, 07:41

Hi Rakesh, I hope I understand your question well. In the training set you need to label the data rate change when the fault occurs, at most one label per file, by comparing the data rate in the row prior to the fault with the data rate measured when the fault appears.

Upvotes 0
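The training-set labeling described above can be sketched in a few lines of Python. This is only a sketch under stated assumptions: the `fault` indicator column is hypothetical (the thread does not say how the fault row is marked in the training files), the label convention (1 for a decrease in data rate) follows the reading given earlier in this thread, and using the first fault row when a file has several is an assumption made to keep one label per file.

```python
import csv
import io


def label_rows(rows, fault_col="fault", rate_col="data_rate"):
    """Label one training file: 1 if data_rate drops at the fault row, else 0.

    `rows` is a list of dicts as produced by csv.DictReader.
    `fault_col` is a hypothetical indicator column where "1" marks the fault row.
    Returns None when there is no fault row or no row before it.
    """
    for i, row in enumerate(rows):
        if row[fault_col] == "1":
            if i == 0:
                return None  # no earlier row to compare against
            before = float(rows[i - 1][rate_col])  # row prior to the fault
            during = float(row[rate_col])          # row where the fault appears
            return int(during < before)
    return None  # file contains no fault row


# Toy stand-in for one training file (made-up values, not competition data).
sample_csv = """data_rate,fault
4.47,0
3.16,0
1.18,1
2.90,0
"""
rows = list(csv.DictReader(io.StringIO(sample_csv)))
label = label_rows(rows)
```

This comparison only works on the training files, where the data rate in the fault row is present; in the validation files that value is withheld, which is why a model is needed there.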

Juliuss

Freelance

Noted, and thanks for the prompt reply, @AntonioDeDomenico.

replied to AntonioDeDomenico · 2 Aug 2023, 07:44

Upvotes 0

replied to Juliuss · 2 Aug 2023, 07:47

Hi Julius, if I give you that information, you do not need a model; you would just compare the two measured data rates to provide the output I asked for. Is it clear?

Upvotes 0

Juliuss

Freelance

Yeah, got it now. Thanks a lot.

replied to AntonioDeDomenico · 2 Aug 2023, 07:49

Upvotes 0