I also found the dataset too small to build a good clinical SLM. The Zindi team should redefine the challenge to raise the probability of participants producing a good clinical-reasoning SLM, since, as we have all seen, clinician responses are actually penalized relative to prompt summarizers.
Yes, this is exactly what was happening.