I've created a baseline using Phi-2 with a simple RAG implementation and LoRA fine-tuning: https://github.com/progin2037/specializing_llm_for_telecom_networks . The repository includes an already fine-tuned model, in case you run into out-of-memory issues. Keep in mind that your results with this code may differ slightly from mine. My solution achieved 0.60 LB when using RAG and fine-tuning on RAG context, 0.57 LB when fine-tuning without context and then applying RAG at inference, and 0.54 LB with fine-tuning alone, without RAG. When using only a fraction of the documents for RAG (50%), the results get a little worse, by ~0.01-0.03 on the leaderboard.
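For anyone unfamiliar with the fine-tuning half of the baseline: LoRA freezes the pretrained weights and trains only a low-rank update per target layer. A NumPy sketch of the idea (the dimensions, rank `r`, and scaling `alpha` below are illustrative, not the settings used in the repository):

```python
import numpy as np

rng = np.random.default_rng(0)

# Frozen pretrained weight of one linear layer (d_out x d_in).
d_out, d_in, r, alpha = 8, 16, 4, 8
W = rng.normal(size=(d_out, d_in))

# LoRA's trainable low-rank factors. B starts at zero, so the
# adapted layer initially behaves exactly like the frozen one.
A = rng.normal(size=(r, d_in)) * 0.01
B = np.zeros((d_out, r))

def forward(x, B, A):
    # y = W x + (alpha / r) * B A x; only A and B receive gradients.
    return W @ x + (alpha / r) * (B @ (A @ x))

x = rng.normal(size=d_in)
assert np.allclose(forward(x, B, A), W @ x)  # identity at init
```

Because only `A` and `B` are trained, the number of updated parameters is `r * (d_in + d_out)` per layer instead of `d_in * d_out`, which is why fine-tuning fits on a single consumer GPU.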
Hello sir. What was the training time? Thanks.
Hello @nostml,
Fine-tuning for 3 epochs takes about 13 minutes, inference on train and test takes ~5 minutes, and vectorizing and storing all rel18 documents takes ~1 hour. Those timings are on an RTX 3090 Ti with 24 GB of VRAM.
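For reference, the vectorizing step is essentially: split every document into overlapping chunks, embed each chunk, and store the resulting vectors. A minimal, dependency-free sketch (the hashing embedder is a toy stand-in for whatever embedding model the actual pipeline uses, and the chunk sizes are illustrative):

```python
import hashlib
import numpy as np

def chunk(text, size=200, overlap=50):
    # Overlapping character windows, so an answer spanning a
    # chunk boundary still appears intact in at least one chunk.
    step = size - overlap
    return [text[i:i + size] for i in range(0, max(len(text) - overlap, 1), step)]

def embed(passage, dim=64):
    # Toy deterministic embedding: hash each token into a bucket.
    # A real pipeline would call an embedding model here instead.
    v = np.zeros(dim)
    for tok in passage.lower().split():
        v[int(hashlib.md5(tok.encode()).hexdigest(), 16) % dim] += 1.0
    n = np.linalg.norm(v)
    return v / n if n else v

docs = ["5G NR uses OFDM in both uplink and downlink. " * 10]
chunks = [c for doc in docs for c in chunk(doc)]
vectors = np.stack([embed(c) for c in chunks])  # one row per chunk
```

The ~1 hour is dominated by running the embedding model over every chunk, which is exactly the part a GPU accelerates.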
An RTX 3090 Ti with 24 GB of VRAM, is this a deep learning rig you bought? That means vectorizing the rel18 documents is accelerated because you're on a GPU environment, right?
I think so. You could get free access to a decent GPU through Colab or Kaggle.
1 hour for vectorizing all the rel18 documents is quite fast. Some of the Word documents even contain images. That's fast processing right there.
So is rel19 available? Can you share it, AdeptSchneider22?
What do you mean? I don't follow; the documents we are all using are in the rel18.rar file.
You wrote rel19 documents above. I think it was a mistake.
I have edited the message. Sorry, I meant rel18. I'm struggling to create a vector database with all the documents as embeddings, so that I have a proper RAG pipeline to iterate on. I have yet to run inference on the Falcon 7B Instruct model without RAG to see how it performs.
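Until the full vector database is in place, a plain similarity search over a matrix of embeddings is enough to start iterating on the RAG side. A minimal sketch with a toy bag-of-words embedder over a hypothetical mini-vocabulary (a real pipeline would swap in a sentence-embedding model and a proper store such as FAISS or Chroma, and the passages below are invented examples):

```python
import numpy as np

# Hypothetical mini-vocabulary; a real embedding model needs no such list.
VOCAB = ["ofdm", "subcarrier", "spacing", "handover", "lora", "adapter", "nr", "ts"]

def embed(text):
    # Toy bag-of-words vector over VOCAB, normalized to unit length.
    toks = text.lower().replace(".", " ").replace("?", " ").split()
    v = np.array([float(toks.count(w)) for w in VOCAB])
    n = np.linalg.norm(v)
    return v / n if n else v

# The "vector database" is just a matrix with one row per passage.
passages = [
    "Handover procedures are defined in TS 38.331.",
    "OFDM numerology in NR supports multiple subcarrier spacing options.",
    "LoRA fine-tuning trains small adapter matrices.",
]
index = np.stack([embed(p) for p in passages])

def retrieve(query, k=1):
    # Cosine similarity reduces to a dot product on unit-norm rows.
    scores = index @ embed(query)
    top = np.argsort(scores)[::-1][:k]
    return [passages[i] for i in top]

retrieve("What subcarrier spacing does NR OFDM support?")
# → the OFDM/subcarrier passage ranks first
```

Once this loop works end to end, swapping the toy embedder for a real model and the matrix for a persistent index changes nothing about the retrieval logic.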