
Malawi Public Health Systems LLM Challenge

Helping Malawi
$2 000 USD
Challenge completed over 1 year ago
Question Answering
Generative AI
407 joined
74 active
Start: Jan 24, 2024
Close: Mar 03, 2024
Reveal: Mar 03, 2024
Professor
Starter Notebooks 😉!
Notebooks · 11 Feb 2024, 11:06 · 22

Hi hackers, 👋 I trust you are having a great time in this competition. Gen-AI competitions are new, so you should have as much fun as possible. 🌟 For folks still wondering how to even get started, I present two baseline approaches and some tips on how to do better on the ROUGE-1 metric. I also took time to explain things at as high a level as possible. 🚀 Cheers.
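
Since the metric is ROUGE-1, it helps to sanity-check your outputs locally before submitting. A minimal sketch, assuming the rouge-score package (Zindi's exact scoring setup may differ):

    from rouge_score import rouge_scorer

    # ROUGE-1 measures unigram overlap between a prediction and its reference.
    scorer = rouge_scorer.RougeScorer(["rouge1"], use_stemmer=True)
    scores = scorer.score(
        "Refer the patient to the district hospital.",                # reference
        "The patient should be referred to the district hospital.",  # prediction
    )
    print(scores["rouge1"].fmeasure)

Because the metric rewards unigram overlap, concise answers that reuse the reference vocabulary tend to score better than long, heavily paraphrased ones.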

1. RAG approach (0.47+ public score)📈:

  • GPU only (quantization): the notebook is available on GitHub, Colab, and Kaggle.
  • CPU and GPU compatible (GGUF equivalent): the notebook is available on GitHub, Colab, and Kaggle. A minimal loading sketch follows below.
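
For illustration, here is a minimal sketch of the GGUF route using the ctransformers library, so the same code runs on CPU or GPU; the repo and file names are placeholders, not necessarily what the notebook uses:

    from ctransformers import AutoModelForCausalLM

    # gpu_layers > 0 offloads that many layers to the GPU when one is
    # available; set it to 0 to run entirely on the CPU.
    llm = AutoModelForCausalLM.from_pretrained(
        "TheBloke/Mistral-7B-Instruct-v0.1-GGUF",           # placeholder repo
        model_file="mistral-7b-instruct-v0.1.Q4_K_M.gguf",  # placeholder file
        model_type="mistral",
        gpu_layers=50,
    )
    print(llm("What is the role of a Health Surveillance Assistant?",
              max_new_tokens=64))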

2. Fine-tuning approach (~0.32 public score)📊:

  • Training: for training your model and saving it to the hub. It's available on GitHub, Colab, and Kaggle. A rough training sketch follows below.
  • Inference: inference and submission to Zindi. It's available on GitHub, Colab, and Kaggle.
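
As a rough sketch of what the training side can look like, assuming a LoRA setup with transformers/peft; the model name, file names, and column names are placeholders, and the actual notebook may differ:

    import torch
    from datasets import load_dataset
    from peft import LoraConfig, get_peft_model
    from transformers import (AutoModelForCausalLM, AutoTokenizer,
                              DataCollatorForLanguageModeling,
                              Trainer, TrainingArguments)

    model_name = "mistralai/Mistral-7B-v0.1"  # placeholder base model
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    tokenizer.pad_token = tokenizer.eos_token
    model = AutoModelForCausalLM.from_pretrained(model_name,
                                                 torch_dtype=torch.float16)

    # Train small LoRA adapters instead of the full set of weights.
    model = get_peft_model(model, LoraConfig(r=16, lora_alpha=32,
                                             task_type="CAUSAL_LM"))

    # Placeholder file and column names: a CSV of question/answer pairs.
    ds = load_dataset("csv", data_files="Train.csv")["train"]
    ds = ds.map(lambda ex: tokenizer("Question: " + ex["question"]
                                     + "\nAnswer: " + ex["answer"],
                                     truncation=True, max_length=512))

    trainer = Trainer(
        model=model,
        args=TrainingArguments("malawi-qa-lora", num_train_epochs=3,
                               per_device_train_batch_size=2, fp16=True),
        train_dataset=ds,
        data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
    )
    trainer.train()
    model.push_to_hub("your-username/malawi-qa-lora")  # save adapters on the hub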

Have fun!! 🥂 Keep learning! Keep Winning! 🏆

Discussion · 22 answers
Nayal_17

Great work @Professor, but we can't use GPUs in this competition, and this RAG notebook uses quantization, which requires a GPU.

11 Feb 2024, 11:53
Upvotes 0

Koleshjr
Multimedia University of Kenya

I don't see anything in the info that says that GPUs aren't allowed 🤷🏿‍♂️
Professor

Hi @Nayal_17, thanks for pointing that out. You are completely right; I just checked out Steve's thread. bitsandbytes quantization requires a GPU, so it may make more sense to switch to the GGUF/GGML equivalent of the model from TheBloke's HF repo. I'll implement that and edit the post. But damn, inference on the entire test set may take years on Kaggle's CPU. 😀

Koleshjr
Multimedia University of Kenya

I know, right. To be fair, you can perform inference using a GPU-enabled runtime, but make sure the model can run locally as well; my opinion, though. That way you will speed up your experimentation time. Performing inference locally may take over a day. My experience, though.

Professor

Yeah, that makes more sense. Damn! You spent over a day doing inference? 😅

Koleshjr
Multimedia University of Kenya

Yeah broooo, it's tough 😂😂😂

Not on Kaggle CPUs though

On my local laptop

But maybe others have found a way to run them faster on CPU, I don't know

KevinKibe

@Professor

No other way to run inference faster on only a CPU, unless you use smaller models like the tiny Flan-T5, BERT, or GPT-2.

Professor

Yeah @KevinKibe, you are right. But as Koleshjr has said, one idea is to use a GPU to run your experiments for submission to the leaderboard; this will help you experiment faster. Most importantly, though, ensure that your solution works on a CPU. Also, when I ran !lscpu on Kaggle, I saw that the processor there has 2 cores and 4 threads, whereas the specification limit for this competition is a Core i9 (8 cores & 16 threads). So there's a high chance a single inference will be about 4 times faster on an i9.
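
For anyone who wants to check their own runtime, a quick way to inspect the CPU from a notebook cell (os.cpu_count reports logical CPUs, i.e. threads):

    import os
    import subprocess

    print("logical CPUs:", os.cpu_count())  # 4 on Kaggle: 2 cores x 2 threads
    print(subprocess.run(["lscpu"], capture_output=True, text=True).stdout)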

Nayal_17

@Koleshjr, are you using CTransformers, llama.cpp, or the original transformers library without quantization?

Koleshjr
Multimedia University of Kenya

I am using Ollama models.
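
For context, Ollama serves local GGUF models behind a small REST API. A minimal sketch of querying it, assuming "ollama pull mistral" has already been run and the server is listening on its default port:

    import requests

    # Non-streaming generation request against the local Ollama server.
    r = requests.post(
        "http://localhost:11434/api/generate",
        json={"model": "mistral",
              "prompt": "What are the danger signs in a newborn?",
              "stream": False},
    )
    print(r.json()["response"])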

Reacher

Good job! Thanks for sharing.

11 Feb 2024, 12:27
Upvotes 1
Professor

Thanks, Reacher.🥂

GIrum
Adama Science and Technology University

I am just a starter in all this, sorry if it is a silly question, but the code in the RAG notebook gives an error when I try models like Mixtral, and with other models too.

12 Feb 2024, 20:36
Upvotes 0
Professor

Hi @GIrum, please feel free to share a screenshot of your error or create a new discussion; I'm sure you'll get help. Also, you might want to note that an edited version of the notebook is now available, which supports CPU/GPU depending on the available compute.

AdeptSchneider22
Kenyatta University

I know it is a bit late, but I think you need to replace the model_type in the CTransformers class instantiation with mistral, i.e.:

    llm = CTransformers(
        model='mistral-7b-v1.0.gguf',  # location of the downloaded GGUF model
        model_type='mistral',
        batch_size=4,
        config=config,
    )
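
For completeness, the config argument above is a plain dict of generation settings passed through to the underlying model; the values below are illustrative only:

    # Hypothetical values; tune them for your own setup.
    config = {"max_new_tokens": 256, "temperature": 0.1, "context_length": 2048}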

Professor

Edit: a CPU/GPU-compatible version has been added; the previous notebook was modified. Cheers 🥂

13 Feb 2024, 05:30
Upvotes 0

Regarding the fine-tuning approach: it is unusual, when training, to predict the answer from the question alone rather than from (question + relevant passages).

I mean: to my knowledge, you should find relevant passages before predicting the answer.

19 Feb 2024, 03:56
Upvotes 1
Professor

Yeah, depending on your use case. This is a baseline to get started. Conventionally, fine-tuning mostly uses just question/answer pairs; introducing relevant context changes it to a RAG-based approach.

To rank in the top positions, I'd assume you have to combine both RAG and fine-tuning techniques.
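
One hypothetical way to combine the two, following the point above: retrieve passages first, then fine-tune (and run inference) on prompts that include them. The template below is illustrative only:

    def build_prompt(question, passages):
        # Retrieved passages go into a context block, so the model learns to
        # answer from (question + relevant passages), not the question alone.
        context = "\n".join(passages)
        return f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"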

AdeptSchneider22
Kenyatta University

I agree. Thanks a lot @Professor for sharing your knowledge, especially the RAG notebook.

Professor

This is very educational. Thanks a lot.

4 Mar 2024, 06:14
Upvotes 1