Indaba Grand Challenge: Curing Leishmaniasis by Deep Learning Indaba

Start

Jun 29, 20

May 31, 21

Reveal

May 31, 21

Kirus

Nuclear sciences center of tunisia

Word meaning and data set description

Help · 16 Jul 2020, 08:46 · edited ~1 hour later · 2

Hi,

It would be useful to clarify some of the terms used in the dataset:

For Molecules:

1- There are 5 different sources of molecules. What do you mean by Endogenous, In-Trials, and World?

1a- Does DrugBank_Leishmania mean molecules with validated anti-Leishmania activity?

1b- Does In-Trials mean molecules currently being tested by industry?

1c- Does World mean a merge of all the molecules in the remaining 4 files?

1d- Does Endogenous mean molecules produced by Leishmania?

2- What criterion (e.g. keyword == Leishmania) was used to collect these molecules from the different sources? Or was it just a raw download?

For Targets:

1- What does "I.major" mean?

2- If Preferred Targets is available, what is the purpose of All-Target? Is it for discovering unknown targets?

3- Does Preferred Targets mean the known targets of validated anti-Leishmania molecules? 30 MB seems to be a lot of proteins...

4- Are the sequences from Leishmania or from human?

From the data description:

1- Google storage is not yet available.

2- Approved drug molecules: do we have to wait for these molecules to appear on Google Storage, or is collecting them a task we have to do ourselves during prediction?

3- Safe-for-humans, drug-like molecules: what does this mean?

Thanks

Kirus

Discussion 2 answers

nlopezcarranza

Hello Kirus,

There is a Google bucket with the available target and ligand PDB structures.

You can download a list of "safe" approved drugs from the bucket: https://storage.googleapis.com/indaba-challenge/molecules.zip

The PDB files describing the possible targets for Leishmania are also stored in the Google bucket: https://storage.googleapis.com/indaba-challenge/

You can follow this example to see how we use this information through a Rosetta docking protocol: https://instadeep-public.gitlab.io/grandchallenge/IndabaChallenge.html
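For anyone who prefers to fetch the archive from a script, here is a minimal sketch using only the Python standard library. The helper names (`bucket_object_url`, `fetch_bytes`, `list_zip_members`) are made up for illustration, and the only URLs assumed are the two quoted above; whether the bucket is reachable at any given time is not guaranteed.

```python
import io
import urllib.request
import zipfile

# Base URL of the bucket as quoted in this thread.
BUCKET_URL = "https://storage.googleapis.com/indaba-challenge/"


def bucket_object_url(name, base=BUCKET_URL):
    """Build the public URL of an object in the bucket (hypothetical helper)."""
    return base.rstrip("/") + "/" + name.lstrip("/")


def fetch_bytes(url, timeout=60):
    """Download a URL and return its raw bytes."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read()


def list_zip_members(data):
    """Return the sorted member names of a zip archive held in memory."""
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        return sorted(zf.namelist())
```

Something like `list_zip_members(fetch_bytes(bucket_object_url("molecules.zip")))` should then list the archive contents without unpacking anything to disk.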

17 Jul 2020, 12:55

Kirus

Nuclear sciences center of tunisia

Thanks @nlopezcarranza,

In my case I will not use the PyRosetta package. The link to the possible targets is not working.

In the molecules.zip folder, each file appears 3 times with 3 different extensions: sdf, pdb and params.
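A quick way to check this layout is to group the archive's file names by molecule and see which extensions each one has. This is only a sketch: it assumes the three files of a molecule share the same base name, and the example names in the comments are invented, not taken from the real archive.

```python
from collections import defaultdict
from pathlib import PurePosixPath

# The three extensions observed in molecules.zip.
EXPECTED_EXTENSIONS = {"sdf", "pdb", "params"}


def group_by_molecule(names):
    """Map each base name (e.g. 'aspirin', a made-up example) to its set of extensions."""
    groups = defaultdict(set)
    for name in names:
        path = PurePosixPath(name)
        if not path.suffix:
            continue  # skip directory entries and files without an extension
        groups[path.stem].add(path.suffix.lstrip(".").lower())
    return dict(groups)


def incomplete_molecules(groups, expected=EXPECTED_EXTENSIONS):
    """Return molecules that lack one or more expected extensions, with what is missing."""
    return {
        stem: expected - exts
        for stem, exts in groups.items()
        if expected - exts
    }
```

Passing the output of `zipfile.ZipFile(...).namelist()` to `group_by_molecule` would show whether every molecule really comes as an sdf/pdb/params triple.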

Where can I find the target protein for each molecule?
What is the difference between the molecules.zip in the notebook and the molecules.zip in the dataset? The folders do not contain the same files.
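To make the mismatch concrete, the two archives can be compared by member name. The sketch below assumes both zip files have already been read into memory as bytes; the function names are hypothetical and the comparison looks only at file names, not at file contents.

```python
import io
import zipfile


def zip_names(data):
    """Return the set of file names in a zip archive, ignoring directory entries."""
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        return {n for n in zf.namelist() if not n.endswith("/")}


def compare_archives(first, second):
    """Report which file names are unique to each archive and which are shared."""
    names_first, names_second = zip_names(first), zip_names(second)
    return {
        "only_in_first": sorted(names_first - names_second),
        "only_in_second": sorted(names_second - names_first),
        "in_both": sorted(names_first & names_second),
    }
```

If the two archives nest their files under different folder names, the paths would need to be normalised (for example by keeping only the base name) before comparing.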

Thanks,

replied to nlopezcarranza · 20 Jul 2020, 15:48
