Thank you! Some images listed in the train CSV no longer have all of their annotations in the associated label file. For example, the train CSV clearly shows an image associated with both cssvd and healthy, but its label file no longer contains cssvd. Is this an anomaly or a misunderstanding on my part?
I believe the correct IDs are in the labels folder
Yeah, I guess so, but ... those are all numeric.
So is a label of 2 or 1 healthy? Is cssvd 0 or 1?
Thank you for pointing this error out. Here is a quick fix that you can include in your notebook:
# strip leading/trailing whitespace from the class names (the column must hold strings)
train['class'] = train['class'].str.strip()
# The correct mapping from class to class_id
class_map = {cls: i for i, cls in enumerate(sorted(train['class'].unique().tolist()))}
# This will give you
{'anthracnose': 0, 'cssvd': 1, 'healthy': 2}
# Map it
train['class_id'] = train['class'].map(class_map)
# Check
train[['class', 'class_id']].value_counts()
class        class_id  count
healthy      2         4280
cssvd        1         3241
anthracnose  0         2271
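To follow up on the original question about label files that seem to be missing a class, here is a minimal sketch of how one might check a single image. It is only an illustration: the `Image_ID` column name, the toy rows, and the helper names `classes_in_label_text` and `missing_classes` are my own assumptions, and it assumes the label files are YOLO-style text (one `class_id x y w h` row per box) using the same IDs as the mapping above.

```python
import pandas as pd

# Toy stand-in for the train CSV; the Image_ID column name is an assumption.
train = pd.DataFrame({
    "Image_ID": ["a.jpg", "a.jpg", "b.jpg"],
    "class": [" cssvd", "healthy ", "anthracnose"],
})
train["class"] = train["class"].str.strip()

# Same mapping as in the quick fix, plus its inverse (id -> class name).
class_map = {cls: i for i, cls in enumerate(sorted(train["class"].unique().tolist()))}
id_to_class = {i: cls for cls, i in class_map.items()}

def classes_in_label_text(text):
    """Class names found in YOLO-style label text (assumed format: 'id x y w h' per row)."""
    ids = {int(line.split()[0]) for line in text.splitlines() if line.strip()}
    return {id_to_class[i] for i in ids}

def missing_classes(image_id, label_text):
    """Classes the CSV lists for this image that do not appear in its label text."""
    expected = set(train.loc[train["Image_ID"] == image_id, "class"])
    return expected - classes_in_label_text(label_text)
```

Calling `missing_classes` with an image ID and the contents of its label file returns the set of classes that the CSV expects but the label file lacks; an empty set means the two agree for that image.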
Great, thanks for clarifying