Could you provide the exact formula for the mean absolute error used to compute the test score? The reported score does not appear to have the same magnitude as what I get by taking the absolute difference between the counts for each image-class pair and then taking the mean. Thanks
Just use mean_absolute_error from sklearn.metrics.
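As a quick sanity check, here is a minimal sketch comparing the manual calculation described in the question with sklearn's mean_absolute_error. The counts are made-up numbers, and laying them out as one row per image and one column per class is an assumption; the thread does not say how the scorer arranges the data or whether it applies any extra weighting.

```python
import numpy as np
from sklearn.metrics import mean_absolute_error

# Hypothetical counts (not from the competition data):
# rows are images, columns are classes.
y_true = np.array([[3, 0, 1],
                   [2, 5, 0]])
y_pred = np.array([[2, 0, 4],
                   [2, 3, 1]])

# Manual version: absolute difference for every image-class pair, then the mean.
manual_mae = np.abs(y_true - y_pred).mean()

# sklearn version: for 2D input the default multioutput="uniform_average"
# averages the per-class MAEs with equal weight.
sklearn_mae = mean_absolute_error(y_true, y_pred)

# Same call on flattened arrays, i.e. one entry per image-class pair.
flat_mae = mean_absolute_error(y_true.ravel(), y_pred.ravel())

print(manual_mae, sklearn_mae, flat_mae)
```

With a full image-by-class matrix and no sample weights, all three values agree. If your own number differs in magnitude from the leaderboard score, the likely causes are a different set of image-class pairs being scored or a different data layout, not the formula itself.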