To Zindi: I think the competition description is not clear enough. The target variable is unknown and there is no description to go on.
Hello Nasere, the challenge is a rare one that deviates from the usual data science predictive analytics. We are presented with clean and dirty data. First, stack each column of the dirty and clean data side by side to create your labels. Where the two values are equal, there is no error, so the label is 0. Where they are not equal, there is an error, so the label is 1. With this you have your labels.
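The labelling step above can be sketched roughly like this with pandas. The two small tables and their column names are made up for illustration, and the sketch assumes the dirty and clean files have the same shape, row order and column names, which you should check against the real data.

```python
import pandas as pd

# Toy stand-ins for the competition files; columns and values are invented.
dirty = pd.DataFrame({"city": ["Lagos", "Nairobbi", "Accra"],
                      "age": ["34", "27", "2x"]})
clean = pd.DataFrame({"city": ["Lagos", "Nairobi", "Accra"],
                      "age": ["34", "27", "29"]})

# Compare as strings so a value missing on both sides counts as equal.
dirty_s = dirty.astype(str)
clean_s = clean.astype(str)

# 1 where the dirty cell differs from the clean cell (error), 0 otherwise.
labels = (dirty_s != clean_s).astype(int)

# One row per cell: row index, column name, dirty value, label.
cells = (
    pd.DataFrame({"value": dirty_s.stack(), "label": labels.stack()})
    .rename_axis(["row", "column"])
    .reset_index()
)
print(cells)
```

Here each cell of the dirty table becomes one training example, with the dirty value as input and the 0/1 flag as target.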
To build a classifier, you have various options: anomaly detection algorithms, classification or clustering algorithms, or a combination of them.
Lastly, you will definitely need strong text analytics skills to make sense of the data.
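As one possible starting point for the text analytics part, here is a minimal sketch that turns a single cell value into a few simple features. The function name cell_features and the choice of features are my own assumptions, not something given by the competition, and the shape pattern only handles ASCII letters.

```python
import re


def cell_features(value) -> dict:
    """Hypothetical per-cell features; the feature set is an assumption."""
    text = str(value)
    n_letters = sum(ch.isalpha() for ch in text)
    n_digits = sum(ch.isdigit() for ch in text)

    # Shape pattern: uppercase -> "A", lowercase -> "a", digit -> "9",
    # anything else is kept as it is.
    shape = re.sub(r"[A-Z]", "A", text)
    shape = re.sub(r"[a-z]", "a", shape)
    shape = re.sub(r"[0-9]", "9", shape)

    return {
        "length": len(text),
        "n_letters": n_letters,
        "n_digits": n_digits,
        "n_other": len(text) - n_letters - n_digits,
        "shape": shape,
    }


print(cell_features("2x"))
print(cell_features("Nairobbi"))
```

Features like these could then be fed, per column, into whichever of the model families mentioned above you decide to try. A value whose shape is rare within its column is a natural candidate for an error.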