It's nearly 3 weeks past the competition end. Can the top teams share their approaches, even if it's not code? At least a writeup would be considerate. Most of these competitions are not only about winning prizes but also about learning. We are supposed to help each other grow. Can we at least grow the Zindi discussions to be more like Kaggle, where people willingly share their approaches within hours of a competition ending? We should not have to beg for writeups.
Also, for the people who were disqualified, we would love to know the tricks you used. It was impressive to see people with 0.16 scores. How is that even possible? It must be a very interesting trick.
Thank you.
#7 Private Solution.
Our solution is an ensemble of 4 models trained using two different pipelines: a basic PyTorch one and one based on fastai. We trained on a V100 GPU with 32 GB of memory.
Data:
We did a basic ten-fold split, stratified on the damage column, and used folds 0 and 1 for validation.
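The split above can be sketched as follows. This is a minimal reconstruction, not the team's code: the column name `damage` comes from the writeup, while the function name, the DataFrame source, and the random seed are assumptions.

```python
import pandas as pd
from sklearn.model_selection import StratifiedKFold


def add_folds(df: pd.DataFrame, n_splits: int = 10, seed: int = 42) -> pd.DataFrame:
    """Assign a fold index to each row, stratified on the damage label."""
    skf = StratifiedKFold(n_splits=n_splits, shuffle=True, random_state=seed)
    df = df.copy()
    df["fold"] = -1
    for fold, (_, val_idx) in enumerate(skf.split(df, df["damage"])):
        df.loc[df.index[val_idx], "fold"] = fold
    return df


# Folds 0 and 1 then serve as the two validation sets:
# valid_df = df[df["fold"].isin([0, 1])]
```

Stratifying on the label keeps the class balance roughly equal across folds, so the validation log loss on folds 0 and 1 is comparable.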
PyTorch pipeline:
Here we trained two models from timm: tf_efficientnetv2_b3 and swin_base_patch4_window7_224.ms_in22k_ft_in1k. Training improved when we froze all layers except the last block and the head for 4 epochs, then unfroze everything and trained for 10 more epochs. We also used a class-weighted log loss for the first 4 epochs and a plain log loss afterwards.
* Augmentations:
```
from torchvision import transforms

transform = transforms.Compose([
    transforms.Resize((IMG_SIZE, IMG_SIZE)),   # resize to a fixed size
    transforms.RandAugment(),
    transforms.RandomPerspective(distortion_scale=0.5, p=0.5),
    transforms.RandomRotation((-10, 25)),
    transforms.ToTensor(),                     # convert PIL Image to tensor
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),  # ImageNet statistics
    transforms.RandomErasing(p=0.5),
])
```
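The freeze-then-unfreeze schedule with the loss switch can be sketched as below. This is an illustrative toy model standing in for the timm backbone, and the class weights are made up; only the epoch counts and the weighted-then-plain loss switch come from the writeup.

```python
import torch
import torch.nn as nn


class ToyNet(nn.Module):
    """Stand-in for a timm backbone: `blocks` is the body, `head` the classifier."""

    def __init__(self, num_classes: int = 3):
        super().__init__()
        self.blocks = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 16))
        self.head = nn.Linear(16, num_classes)

    def forward(self, x):
        return self.head(self.blocks(x))


def freeze_body(model: ToyNet, freeze: bool = True) -> None:
    """Freeze (or unfreeze) everything except the classifier head."""
    for p in model.blocks.parameters():
        p.requires_grad = not freeze


model = ToyNet()

# Epochs 1-4: train only the head (and, in the writeup, the last block),
# with a class-weighted log loss. Weights below are illustrative.
freeze_body(model, freeze=True)
warmup_loss = nn.CrossEntropyLoss(weight=torch.tensor([1.0, 2.0, 4.0]))

# Epochs 5-14: unfreeze everything and switch to a plain log loss.
freeze_body(model, freeze=False)
main_loss = nn.CrossEntropyLoss()
```

Warming up the head first avoids large early gradients wrecking the pretrained backbone weights.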
Fastai pipeline:
Here we trained vit_large_patch14_clip_224.openai_ft_in12k_in1k and eva02_large_patch14_448.mim_m38m_ft_in1k using mostly default fastai settings. We used the augmentations below:
```
tfm = [
    RandomErasing(p=0.5, max_count=7),
    AlbumentationsTransform(get_train_aug()),
]
tfms = aug_transforms(size=sz, min_scale=0.75, p_affine=0.5, p_lighting=0.5,
                      max_zoom=2.5, min_zoom=0.25, max_lighting=0.5,
                      max_rotate=25.0, max_warp=0.5, xtra_tfms=tfm)
```
CV:
**Fold 0 / Fold 1**
We optimised the ensemble weights using scipy's minimize function with the OOF log loss as the objective, reaching a best CV of 0.538 on fold 0 and 0.535 on fold 1.
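The weight search can be sketched as below. This is a generic reconstruction, not the team's exact code: the function name, array shapes, and the Nelder-Mead method choice are assumptions; only scipy's `minimize` with OOF log loss as the objective comes from the writeup.

```python
import numpy as np
from scipy.optimize import minimize
from sklearn.metrics import log_loss


def fit_ensemble_weights(oof_preds, y_true):
    """Find blend weights minimising log loss.

    oof_preds: list of (n_samples, n_classes) probability arrays,
               one per model, on the out-of-fold validation rows.
    """
    n = len(oof_preds)

    def objective(w):
        # Project onto non-negative weights summing to 1.
        w = np.clip(w, 0, None)
        w = w / w.sum()
        blend = sum(wi * p for wi, p in zip(w, oof_preds))
        return log_loss(y_true, blend)

    res = minimize(objective, x0=np.full(n, 1.0 / n), method="Nelder-Mead")
    w = np.clip(res.x, 0, None)
    return w / w.sum()
```

Optimising the blend on OOF predictions rather than the leaderboard keeps the weights honest with respect to the local CV.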
## Run time:
The total training + inference time of our pipelines was 8.7 hours.
## Remarks
The labels were pretty bad in this competition. We analysed a good number of images where our model was "wrong" and saw that the model was in fact correct. However, cleaning attempts seemed futile to us, as the test set was equally affected.
eva02_large_patch14_448.mim_m38m_ft_in1k is a very strong model, and we sadly didn't have enough memory to train with larger image sizes, which would likely have boosted the score significantly.
thanks @nymfree for sharing
Thanks for sharing
Yes. You should.