Hello Nasere, this challenge is a rare one that departs from the usual data science predictive analytics. We are given clean and dirty data. First, line up each column of the dirty data with the same column of the clean data, side by side, to create your labels. Where the two values are equal, there is no error, so the label is 0. Where they differ, there is an error, so the label is 1. That gives you your labels.
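In case it helps, here is a minimal sketch of that labelling step in pandas. The two toy frames, their column names, and the `make_labels` helper are made up for illustration; it also assumes the clean and dirty files have the same rows in the same order, which you should check on the real data.

```python
import pandas as pd

# Hypothetical toy data; the real challenge files and columns will differ.
clean = pd.DataFrame({"name": ["Alice", "Bob", "Carol"],
                      "city": ["Lagos", "Accra", None]})
dirty = pd.DataFrame({"name": ["Alice", "B0b", "Carol"],
                      "city": ["Lagos", "Accra", None]})


def make_labels(dirty, clean):
    """Return a 0/1 frame: 1 where a dirty cell differs from the clean cell."""
    # Assumes both frames share the same index and column names.
    # Two missing values are treated as "equal" (no error).
    both_missing = dirty.isna() & clean.isna()
    differs = dirty.ne(clean) & ~both_missing
    return differs.astype(int)


labels = make_labels(dirty, clean)
print(labels)
```

Each cell of `labels` is then the target for the matching cell in the dirty data.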
To build a classifier, you have several options: anomaly detection algorithms, classification or clustering algorithms, or a combination of them.
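As one possible starting point for the classification route, here is a rough sketch using scikit-learn. It is only an assumption about how you might set things up, not the expected solution: the toy values and labels are invented, and character n-grams with logistic regression are just one simple baseline for spotting odd-looking cell values.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Hypothetical cell values from one column: 0 = matches clean, 1 = error.
values = ["Lagos", "Accra", "Nairobi", "L@gos",
          "Accr4", "Kampala", "N41robi", "Abuja"]
labels = [0, 0, 0, 1, 1, 0, 1, 0]

# Character n-grams let the model react to unusual symbols or digits.
model = make_pipeline(
    TfidfVectorizer(analyzer="char", ngram_range=(1, 2), lowercase=False),
    LogisticRegression(class_weight="balanced", max_iter=1000),
)
model.fit(values, labels)

# Predict error flags (0 or 1) for unseen cell values.
preds = model.predict(["Kigali", "K1gali"])
print(preds)
```

On real data you would want a proper train/validation split, and probably one model per column or some column-specific features.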
Lastly, you will definitely need solid text analytics skills to make sense of the data.
WhaiWhaoWhao wonderful
Wawuuu