What do you mean by a data leak?
Hello, whether there is a leak or not, you are still not allowed to use the field id as a feature. It will not be useful to the client, nor is it good data science practice.
Thank you.
This is exactly what I meant. In the real world it means nothing, but I found that using it improves the score.
Yes, this is still an interesting insight (for learning purposes, though). I noticed that fields (in both train and test) that are present in the same chip have sequential ids (1, 2, 3, 4, ...). So the model might have learned to associate clusters of field ids with the labels in the train set.
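For anyone who wants to check this on their own copy of the data, here is a rough sketch of how the "sequential ids within a chip" pattern could be measured. The function name, the (field_id, chip_id) row layout, and the toy rows are all my own assumptions for illustration, not the competition's actual schema or data.

```python
from collections import defaultdict


def share_of_chips_with_sequential_ids(rows):
    """Return the fraction of chips whose field ids form one consecutive run.

    `rows` is an iterable of (field_id, chip_id) pairs. This layout is a
    hypothetical one chosen for the sketch, not the real dataset format.
    """
    ids_by_chip = defaultdict(list)
    for field_id, chip_id in rows:
        ids_by_chip[chip_id].append(field_id)

    if not ids_by_chip:
        return 0.0

    sequential = 0
    for ids in ids_by_chip.values():
        unique_ids = sorted(set(ids))
        # A run is consecutive when it has no duplicates and max - min == count - 1.
        no_duplicates = len(unique_ids) == len(ids)
        no_gaps = unique_ids[-1] - unique_ids[0] == len(unique_ids) - 1
        if no_duplicates and no_gaps:
            sequential += 1
    return sequential / len(ids_by_chip)


# Made-up toy rows: two chips with consecutive ids, one chip with a gap.
toy_rows = [
    (1, "chip_a"), (2, "chip_a"), (3, "chip_a"),
    (4, "chip_b"), (5, "chip_b"),
    (9, "chip_c"), (12, "chip_c"),
]
share = share_of_chips_with_sequential_ids(toy_rows)
```

A share close to 1.0 would suggest the id encodes which chip a field belongs to, which is one more reason to keep it out of the feature set.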
Hmm, smart.