Is server.py part of the evaluation environment, or is it provided only as a reference interface? Are participants allowed to modify it or define new tools, either within it or externally? (SOLVED: https://www.youtube.com/live/PbS0kxmMiak)
Additionally, how similar will the Phase 2 and Phase 3 datasets be to the Phase 1 dataset? Will they follow the same schema, distributions, and difficulty level? In previous telco competitions, the data structure and difficulty changed significantly between phases, so we would like to better understand what to expect here.
Hi, server.py should not be modified. You can create your own tools to process the acquired data or to retrieve information from other sources. In this competition, the datasets in all three phases will contain questions of the same type and level of complexity.
Thanks @AntonioDeDomenico. How about utils.py and types.py? It seems that class Scenario(BaseModel) is missing a placeholder for the tag that indicates whether a question is multi-answer or single-answer. Is that intentional?
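To make the question concrete, here is a minimal sketch of the kind of placeholder I had in mind. It is only an illustration: I use a standard-library dataclass as a stand-in for the real Scenario(BaseModel), and the names ScenarioSketch, scenario_id, question, tag, and expects_multiple_answers, as well as the tag values, are my own assumptions and not taken from types.py.

```python
from dataclasses import dataclass
from typing import Literal, Optional

# Hypothetical tag values; the real labels (if any) are not known to me.
AnswerTag = Literal["single-answer", "multi-answer"]


@dataclass
class ScenarioSketch:
    """Stand-in for Scenario(BaseModel); all field names here are assumptions."""

    scenario_id: str
    question: str
    tag: Optional[AnswerTag] = None  # the placeholder that seems to be missing


def expects_multiple_answers(scenario: ScenarioSketch) -> Optional[bool]:
    """Return True/False when the tag is set, or None when it is absent."""
    if scenario.tag is None:
        return None
    return scenario.tag == "multi-answer"


untagged = ScenarioSketch(scenario_id="s1", question="Which cell is faulty?")
tagged = ScenarioSketch(
    scenario_id="s2", question="Which cells are faulty?", tag="multi-answer"
)
print(expects_multiple_answers(untagged))
print(expects_multiple_answers(tagged))
```

With an optional field like this, scenarios that carry no tag would still load, and a solution could fall back to its own heuristic when the helper returns None.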