
Hi hackers, 👋 I trust you are having a great time in this competition. Gen-AI competitions are new, so you should have as much fun as possible🌟. For folks still wondering how to even get started, I present you with two baseline approaches and some tips on how to do better on the ROUGE-1 metric. I also took the time to explain everything at as high a level as possible.🚀 Cheers.
1. RAG approach (0.47+ public score)📈 — a minimal code sketch follows this list:
2. Fine-tuning approach (~0.32 public score)📊:
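For anyone who can't open the notebook right away, here is a rough, minimal sketch of what the RAG baseline (item 1) looks like. This is not the exact notebook code: the document path, GGUF file name, chunk sizes, embedding model, and sample question below are placeholders/assumptions, and newer LangChain versions import the same classes from `langchain_community`.

```python
# Minimal RAG sketch: chunk documents, index them, retrieve, then generate.
from langchain.document_loaders import PyPDFLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.embeddings import HuggingFaceEmbeddings
from langchain.vectorstores import FAISS
from langchain.llms import CTransformers
from langchain.chains import RetrievalQA

# 1. Load and chunk the source documents (placeholder file name).
docs = PyPDFLoader("malawi_health_docs.pdf").load()
chunks = RecursiveCharacterTextSplitter(
    chunk_size=500, chunk_overlap=50).split_documents(docs)

# 2. Embed the chunks and build a vector index.
embeddings = HuggingFaceEmbeddings(
    model_name="sentence-transformers/all-MiniLM-L6-v2")
db = FAISS.from_documents(chunks, embeddings)

# 3. Load a quantized GGUF model that runs on CPU (GPU optional).
llm = CTransformers(
    model="mistral-7b-v1.0.gguf",   # local GGUF file (placeholder name)
    model_type="mistral",
    config={"max_new_tokens": 256, "temperature": 0.1, "context_length": 2048},
)

# 4. Wire retrieval + generation together and answer a question.
qa = RetrievalQA.from_chain_type(
    llm=llm, chain_type="stuff",
    retriever=db.as_retriever(search_kwargs={"k": 3}))
print(qa.run("Example question from the test set goes here"))
```

The fine-tuning baseline (item 2) follows the usual question/answer-pair setup; there is a short data-formatting sketch further down the thread where that difference comes up.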
Have fun!! 🥂 Keep learning! Keep Winning! 🏆
Great work @Professor, but we can't use GPUs in this competition, and this RAG notebook uses quantization, which requires GPUs.
I don't see anything in the competition info that says GPUs aren't allowed 🤷🏿‍♂️
https://zindi.africa/competitions/malawi-public-health-systems-llm-challenge/discussions/20000
Hi @Nayal_17, thanks for pointing that out. You are completely right; I just checked out Steve's thread. bitsandbytes requires a GPU, so it makes more sense to switch to the GGUF/GGML equivalent of the model from TheBloke's HF repo. I'll implement that and edit the post. But damn, inference on the entire test set may take years on Kaggle's CPU.😀
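For reference, loading a GGUF quant straight from one of TheBloke's repos with the ctransformers library looks roughly like this. The repo and file names are examples; check the model card for the exact quant file you want (Q4_K_M is a common pick).

```python
# Sketch: CPU inference on a quantized GGUF model pulled from the Hugging Face Hub.
from ctransformers import AutoModelForCausalLM

llm = AutoModelForCausalLM.from_pretrained(
    "TheBloke/Mistral-7B-Instruct-v0.1-GGUF",            # example repo
    model_file="mistral-7b-instruct-v0.1.Q4_K_M.gguf",   # example quant file
    model_type="mistral",
    gpu_layers=0,        # 0 = pure CPU; raise this if a GPU is available
)

print(llm("Question: <sample question>\nAnswer:", max_new_tokens=128))
```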
I know, right? Tbf, you can perform inference using a GPU-enabled runtime, but make sure the model can run locally on CPU as well; that's just my opinion. That way you speed up your experimentation time. In my experience, performing inference locally can take over a day.
Yeah, that makes more sense. Damn! You spent over a day doing inference? 😅
Yeah broooo, it's tough 😂😂😂
Not on Kaggle CPUs, though.
On my local laptop.
But maybe others have found a way to run them faster on CPU, I don't know.
@Professor
There's no other way to run inference faster on only a CPU, unless you use smaller models like the tiny Flan-T5, BERT, or GPT-2.
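To make the trade-off concrete, a small seq2seq model like Flan-T5 runs comfortably on CPU with plain transformers (no quantization needed), at the cost of answer quality. A minimal sketch, with the model size and prompt as assumptions:

```python
# CPU-only inference with a small model via the transformers pipeline.
# google/flan-t5-small trades answer quality for speed; flan-t5-base is a
# reasonable step up if the CPU budget allows.
from transformers import pipeline

qa = pipeline("text2text-generation", model="google/flan-t5-small", device=-1)  # -1 = CPU
out = qa("Answer the question: <sample question>", max_new_tokens=64)
print(out[0]["generated_text"])
```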
Yeah @KevinKibe, you are right. But as @Koleshjr said, one idea is to use a GPU to run your experiments for leaderboard submissions; that helps you iterate faster. Most importantly, though, ensure that your solution works on a CPU. Also, when I ran !lscpu on Kaggle, I saw that the processor there has 2 cores and 4 threads, whereas the specification limit for this competition is a Core i9 (8 cores & 16 threads). So there's a high chance a single inference will be about 4 times faster on an i9.
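If you want to sanity-check the compute you're on and match the inference threads to it, something like the snippet below works. The `threads` key shown is ctransformers' config option; other runtimes (e.g. llama.cpp) have their own equivalent setting.

```python
# Check the available logical CPUs and pass the count to the model config.
import os

n_cores = os.cpu_count()   # e.g. 4 logical CPUs on Kaggle, 16 on an i9 with hyper-threading
print(f"Logical CPUs available: {n_cores}")

config = {
    "max_new_tokens": 256,
    "context_length": 2048,
    "threads": n_cores,    # use every available logical core for generation
}
```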
@Koleshjr, are you using CTransformers, llama.cpp, or the original transformers library without quantization?
I am using Ollama models.
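For anyone curious, calling a locally pulled Ollama model from Python looks roughly like this. It assumes the Ollama server is running and the model was pulled beforehand (e.g. `ollama pull mistral`); older client versions return a plain dict, newer ones a response object.

```python
# Sketch of local inference through Ollama's Python client.
import ollama

resp = ollama.generate(
    model="mistral",                               # any model you have pulled locally
    prompt="Question: <sample question>\nAnswer:",
)
print(resp["response"])   # newer client versions: resp.response
```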
Good job! Thanks for sharing.
Thanks, Reacher.🥂
I'm just a starter in all this, sorry if it is a silly question, but the code in the RAG notebook gives an error when I try models like Mixtral, and with other models too.
Hi @GIrum, please feel free to share a screenshot of your error or create a new discussion; I'm sure you'll get help. Also, note that an edited version of the notebook is now available that supports CPU or GPU depending on the available compute.
I know it is a bit late, but I think you need to replace the model_type in the CTransformers class instantiation with 'mistral', i.e.:

    from langchain.llms import CTransformers  # langchain_community.llms in newer versions

    llm = CTransformers(
        model='mistral-7b-v1.0.gguf',  # location of the downloaded GGUF/GGML model
        model_type='mistral',
        batch_size=4,
        config=config)                 # config dict defined earlier in the notebook
Edit: A CPU/GPU-compatible version has been added; the previous notebook was modified. Cheers 🥂
About the fine-tuning approach: it is unusual to train the model to predict the answer from the question alone rather than from (question + relevant passages).
I mean, to my knowledge, you should retrieve the relevant passages before predicting the answer.
Yeah, it depends on your use case. This is a baseline to get started. Conventionally, fine-tuning uses just question/answer pairs; introducing relevant retrieved context turns it into a RAG-style approach.
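To make the difference concrete, here is a rough sketch of the two training-example formats being discussed. The prompt headers and placeholder fields are assumptions for illustration; the baseline uses the plain Q/A format (a).

```python
# Two ways to format a fine-tuning example. The baseline uses (a); adding
# retrieved passages as in (b) moves you toward a RAG-style setup.
def format_plain_qa(question: str, answer: str) -> str:
    # (a) question -> answer, no supporting context
    return f"### Question:\n{question}\n\n### Answer:\n{answer}"

def format_with_context(question: str, passages: list, answer: str) -> str:
    # (b) (question + retrieved passages) -> answer
    context = "\n".join(passages)
    return (f"### Context:\n{context}\n\n"
            f"### Question:\n{question}\n\n### Answer:\n{answer}")

# Placeholder record; substitute real rows from the training set.
example = {
    "question": "<question from the train set>",
    "passages": ["<retrieved passage 1>", "<retrieved passage 2>"],
    "answer": "<reference answer>",
}
print(format_plain_qa(example["question"], example["answer"]))
print(format_with_context(example["question"], example["passages"], example["answer"]))
```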
To rank in the top positions, I assume you would have to combine both RAG and fine-tuning techniques.
I agree. Thanks a lot, @Professor, for sharing your knowledge, especially the RAG notebook.
Welcome @AdeptSchneider22
This is very educational. Thanks a lot.