Simple LLM fine-tuning solution with H2O LLM Studio
In line with the latest competition rules, only the provided train CSV was used for model training. The dataset contained quite a few missing values (empty tweets and locations), so all records with missing values were dropped. An 80/20 random split was used for training and validation.
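For reference, the data preparation can be reproduced with a few lines of pandas. This is a minimal sketch; the column names ("text", "location"), file names, and random seed are illustrative assumptions, not taken from the actual competition CSV.

```python
# Sketch of the data prep described above (column names are assumptions).
import pandas as pd
from sklearn.model_selection import train_test_split

df = pd.read_csv("Train.csv")

# Drop all records with missing values (empty tweets or locations).
df = df.dropna(subset=["text", "location"]).reset_index(drop=True)

# 80/20 random train-validation split.
train_df, valid_df = train_test_split(df, test_size=0.2, random_state=42)

train_df.to_csv("train_split.csv", index=False)
valid_df.to_csv("valid_split.csv", index=False)
```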
Initially I experimented with small language models: H2O Danube2-1.8B and H2O Danube3-4B. With H2O Danube3-4B it was possible to reach 0.132 on the public leaderboard.
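The actual fine-tuning was done through H2O LLM Studio, which is driven by its UI/config files rather than code. Purely as an illustration, here is a hedged sketch of running inference with a Danube-family checkpoint via Hugging Face transformers; the model ID, prompt wording, and generation settings are assumptions based on the public h2oai model cards, not the author's exact setup.

```python
# Illustrative inference with a Danube-family chat checkpoint (assumed setup).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "h2oai/h2o-danube3-4b-chat"  # hypothetical stand-in for the fine-tuned model
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

# Hypothetical prompt; the real task prompt is not given in this write-up.
messages = [{"role": "user", "content": "Extract the location mentioned in this tweet: <tweet text>"}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=32, do_sample=False)
print(tokenizer.decode(outputs[0][inputs.shape[1]:], skip_special_tokens=True))
```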
Fine-tuning larger models (e.g. meta-llama/Meta-Llama-3-8B) resulted in slight improvements on both my validation set and the public leaderboard, achieving a WER between 0.118 and 0.121.
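Since WER (Word Error Rate) is the metric cited above, predictions can be scored locally with the jiwer library. A minimal sketch with toy data (the actual reference and prediction lists would come from the validation split):

```python
# Scoring predictions with Word Error Rate using jiwer (toy data shown).
from jiwer import wer

references = ["new york", "lagos", "nairobi"]          # ground-truth locations
predictions = ["new york", "lagos state", "nairobi"]   # model outputs

score = wer(references, predictions)
print(f"WER: {score:.3f}")
```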
You can find all the details at: https://github.com/gaborfodor/zindi-location-recognition
Thank you!