@Zindi The test data contains around 260 files that have more features than the files in the train data. Should we at least add some lengthy audio files to the train data, so that the model can be trained on them and achieve almost the same accuracy on both validation and test data? At the moment, there is a huge difference between the model's accuracy (without any overfitting) and the leaderboard results.