Hello everyone,
I've decided to offer a quick recap of the competition for those who may be feeling lost. I also hope to encourage more participants to join because I genuinely believe this is an exciting challenge.
The first thing to note is that this competition differs from traditional ML competitions. The data isn't ready for modeling out of the box. Both the training and test sets are spread across several CSV files, so you'll need to combine them before use. Furthermore, the target variable we're supposed to predict isn't provided directly; you'll need to engineer it yourself. The host's decision to present the data this way is commendable, as it mirrors real-world situations where data is rarely perfectly structured for modeling.
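To make the "combine several CSV files" step concrete, here is a minimal sketch with pandas. The column names (`site_id`, `timestamp`, `data_rate`) and the in-memory CSV strings are assumptions made up for illustration; the real files will have their own names and schema, and you would pass file paths instead of `io.StringIO` objects.

```python
import io

import pandas as pd

# Hypothetical stand-ins for two of the CSV files (invented columns and values).
csv_part_1 = (
    "site_id,timestamp,data_rate\n"
    "A,2023-01-01 00:00,10.0\n"
    "A,2023-01-01 01:00,8.0\n"
)
csv_part_2 = (
    "site_id,timestamp,data_rate\n"
    "B,2023-01-01 00:00,5.0\n"
)


def load_and_concat(sources):
    """Read every source with the same schema and stack the rows into one frame."""
    frames = [pd.read_csv(src, parse_dates=["timestamp"]) for src in sources]
    return pd.concat(frames, ignore_index=True)


# With real data, replace the StringIO objects with the CSV file paths.
rates = load_and_concat([io.StringIO(csv_part_1), io.StringIO(csv_part_2)])
print(rates)
```

If the files have different columns rather than the same schema, you would join them on a shared key with `DataFrame.merge` instead of stacking them.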
So, what exactly is the target? It's best described as the "status of the data rate when a fault occurs." In the telecom O&M context, it's vital to determine whether a fault will directly affect the end-user. How can we tell? A fault is likely to impact the end-user when the data rate at the moment of the fault is lower than the rate before the fault. When that happens, the network's service quality deteriorates, likely frustrating the user and perhaps even prompting them to switch to a competitor.
For this competition, our primary concern is the kind of fault that results in a decreased data rate. This is deemed an urgent fault that the company must promptly address.
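Here is one possible way to sketch that target with pandas, not necessarily how the host defines it. I'm assuming a table of data-rate measurements and a table of fault events; all column names (`site_id`, `timestamp`, `data_rate`, `fault_time`) and values are hypothetical. The label is 1 when the latest measurement at or before the fault is lower than the measurement before it.

```python
import pandas as pd

# Hypothetical data-rate measurements per site (invented values).
rates = pd.DataFrame({
    "site_id": ["A", "A", "A", "B", "B"],
    "timestamp": pd.to_datetime([
        "2023-01-01 00:00", "2023-01-01 01:00", "2023-01-01 02:00",
        "2023-01-01 00:00", "2023-01-01 01:00",
    ]),
    "data_rate": [10.0, 9.0, 4.0, 5.0, 6.0],
})

# Hypothetical fault events.
faults = pd.DataFrame({
    "site_id": ["A", "B"],
    "fault_time": pd.to_datetime(["2023-01-01 02:00", "2023-01-01 01:00"]),
})

# Previous measurement for the same site, so each row knows the "before" rate.
rates = rates.sort_values(["site_id", "timestamp"])
rates["prev_rate"] = rates.groupby("site_id")["data_rate"].shift(1)

# For each fault, pick the latest measurement at or before the fault time.
# merge_asof needs both frames sorted by their time key.
labelled = pd.merge_asof(
    faults.sort_values("fault_time"),
    rates.sort_values("timestamp"),
    left_on="fault_time",
    right_on="timestamp",
    by="site_id",
    direction="backward",
)

# 1 if the rate at the fault is lower than the rate before it, else 0.
labelled["rate_dropped"] = (labelled["data_rate"] < labelled["prev_rate"]).astype(int)
print(labelled[["site_id", "fault_time", "data_rate", "prev_rate", "rate_dropped"]])
```

Comparing against a single previous measurement is just one choice; an average over a window before the fault would be another reasonable way to define "the rate before the fault."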
Take note of the difference in row counts between the submission file and the test data. This is another unique aspect of this competition: not all of the test data needs to be used.
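A minimal sketch of how you might keep only the test rows that the submission file asks for. I'm assuming both tables share an identifier column, here called `ID`; that name, the `feature` column, and the values are hypothetical, so check the actual files.

```python
import pandas as pd

# Hypothetical test table with more rows than the submission file needs.
test = pd.DataFrame({
    "ID": ["t1", "t2", "t3", "t4"],
    "feature": [0.1, 0.2, 0.3, 0.4],
})

# Hypothetical sample submission listing only the rows to predict.
sample_submission = pd.DataFrame({"ID": ["t2", "t4"]})

# A left merge keeps the submission's rows and order, and drops unused test rows.
# validate="one_to_one" raises an error if either side has duplicate IDs.
needed = sample_submission.merge(test, on="ID", how="left", validate="one_to_one")
print(needed)
```

Doing it this way also keeps the predictions aligned with the submission file's row order.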
Why is this competition engaging?
Final thoughts: Since assembling the data is integral to this challenge and there are various ways to create the target, producing a starter notebook that doesn't bias newcomers toward one approach can be tricky. Nevertheless, preparing the data and defining the target yourself is an essential skill for a data scientist.
I'm eager to hear thoughts from other competitors. And, if the host spots any inaccuracies in my recap, kindly point them out! :D
Hi @yanteixeira. I agree with you.
OK, thank you. I appreciate it.