https://colab.research.google.com/drive/18ySAIJ-pfdlaXeLouaT19DSmMFfs3HrE?usp=sharing
Hi all, the link above is a notebook I have been working on, testing improvements to bring inference speed below 100 ms as per the constraints. You can run the whole notebook on Colab and check the results there as well.
The first inference method (321.28 ms) gives a score of 0.38, which means a little accuracy is sacrificed for fast inference. Please review and share your comments.
Also note RAM usage is below 2 GB, at 1.18 GB.
The inference speed constraint of 100 ms is per vignette, not for the entire test set. I don't know if that helps.
In the notebook it's per vignette, only that it's an average (average time per vignette).
Alright
I will check the notebook
Thank you. Great work!! That's a clear notebook.
Great man.
very nice.
No need to share how you do it, but I am curious: are you able to hit that 100 ms?
The inference time should be per vignette, not the average inference time per vignette calculated after running the samples through the model in batches (in parallel), which is what your code does. As it stands, there is no model between 100M and 200M parameters that meets the constraint of generating a full response in under 100 ms per vignette. I have tested this out with all forms of tweaks (except int1-3).
That's what I noted as well. Something that needs checking and improvement.
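To make the distinction above concrete, here is a minimal sketch of the two measurement styles. The `generate` function is a hypothetical stand-in for the model call (simulated with a sleep), not the actual notebook's model; only the timing pattern is the point. Batching amortizes fixed per-call overhead, so the batched average looks much lower than the true single-vignette latency the constraint refers to:

```python
import time

# Hypothetical stand-in for the model's generate call; in the real notebook
# this would be the actual model inference. The sleep simulates a fixed
# per-call overhead plus a per-item cost.
def generate(batch):
    time.sleep(0.002 + 0.001 * len(batch))
    return ["response"] * len(batch)

vignettes = [f"vignette {i}" for i in range(32)]

# Batched measurement: total wall time divided by sample count. This is the
# "average time per vignette" number; batching hides the fixed overhead.
start = time.perf_counter()
generate(vignettes)
batched_avg_ms = (time.perf_counter() - start) / len(vignettes) * 1000

# Per-vignette measurement: one call per sample. This is what a
# "below 100 ms per vignette" constraint actually means.
per_sample_ms = []
for v in vignettes:
    start = time.perf_counter()
    generate([v])
    per_sample_ms.append((time.perf_counter() - start) * 1000)
worst_ms = max(per_sample_ms)

print(f"batched average: {batched_avg_ms:.2f} ms/vignette")
print(f"per-vignette worst case: {worst_ms:.2f} ms")
```

With these illustrative numbers the batched average comes out several times lower than any single per-vignette call, which is exactly the gap being discussed.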