There is no data to download for this reinforcement learning challenge. Instead you will simulate an environment using the code provided here.
At the core of the challenge are epidemiological models that describe the transmission, control and prevalence of the malaria parasite. These are abstracted so that malaria control can be framed as a sequential decision-making process. You can download slides from a short presentation by Oliver Bent; the file is Indabe_Zindi.pdf.
Copyright 2019 Sekou L Remy
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at http://www.apache.org/licenses/LICENSE-2.0. Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.
You can install the environment using the pip install line below.
pip install git+https://github.com/ibm/ushiriki-policy-engine-library --user
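After running the install line, it can be useful to confirm that the package is visible to the Python interpreter you plan to use. The following is a minimal sketch using only the standard library; the helper name is_installed is hypothetical and not part of the challenge library.

```python
import importlib.util


def is_installed(package_name):
    """Return True if the named top-level package can be found on the import path."""
    # find_spec returns None when a top-level package cannot be located.
    return importlib.util.find_spec(package_name) is not None


# Prints True once the pip install above has succeeded for this interpreter.
print(is_installed("ushiriki_policy_engine_library"))
```

If this prints False, the package was probably installed for a different Python interpreter than the one running the check.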
Below is the starter code. You can also download it in the StarterCode.txt file.
import random
from sys import exc_info

import numpy as np
from ushiriki_policy_engine_library.DLI19ChallengeEnvironment import ChallengeEnvironment
from ushiriki_policy_engine_library.EvaluateSubmission import EvaluateAugmentedChallengeSubmission, EvaluateChallengeSubmission


class ChallengeEnvironment1(ChallengeEnvironment):
    def __init__(self):
        ChallengeEnvironment.__init__(self, baseuri="http://alpha-upe-challenge.eu-gb.mybluemix.net", experimentCount=2000)


class CustomAgent:
    def __init__(self, environment):
        self.environment = environment

    def generate(self):
        best_policy = None
        best_reward = -float('Inf')
        candidates = []
        try:
            # Agents should make use of 20 episodes in each training run, if making sequential decisions
            for i in range(20):
                self.environment.reset()
                policy = {}
                # One random action pair per step; policyDimension is the episode length
                for j in range(self.environment.policyDimension):
                    policy[str(j+1)] = [random.random(), random.random()]
                candidates.append(policy)

            # Score all candidates, then keep the one with the highest reward
            rewards = self.environment.evaluatePolicy(candidates)
            best_policy = candidates[np.argmax(rewards)]
            best_reward = rewards[np.argmax(rewards)]

        except (KeyboardInterrupt, SystemExit):
            print(exc_info())

        return best_policy, best_reward

eval = EvaluateAugmentedChallengeSubmission(ChallengeEnvironment1, CustomAgent, "test.csv")
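The starter agent builds 20 random candidate policies, scores them, and keeps the best one. The sketch below shows the same random-search pattern against an offline stand-in, so the agent logic can be tried without calling the remote service. Everything here is an assumption for illustration: MockEnvironment, random_policy and random_search are hypothetical names, the episode length of 5 is an assumed value, and the toy reward has no epidemiological meaning. Only the policy format (string keys "1" to N, each mapping to two values in [0, 1)) and the names policyDimension, reset and evaluatePolicy are taken from the starter code.

```python
import random


class MockEnvironment:
    """Hypothetical offline stand-in; NOT part of ushiriki_policy_engine_library."""

    def __init__(self, policy_dimension=5, seed=0):
        # policy_dimension=5 is an assumed episode length, chosen for illustration.
        self.policyDimension = policy_dimension
        self._rng = random.Random(seed)

    def reset(self):
        # The starter code calls reset() before building each candidate; no state to clear here.
        pass

    def evaluatePolicy(self, candidates):
        # Toy reward: sum of a * (1 - b) over all steps, plus a little noise.
        rewards = []
        for policy in candidates:
            total = sum(a * (1.0 - b) for a, b in policy.values())
            rewards.append(total + self._rng.gauss(0.0, 0.01))
        return rewards


def random_policy(dimension, rng):
    """Build one policy in the starter code's format: keys "1".."N", two values in [0, 1)."""
    return {str(step + 1): [rng.random(), rng.random()] for step in range(dimension)}


def random_search(environment, n_candidates=20, seed=1):
    """Generate random candidates, score them in one call, and return the best pair."""
    rng = random.Random(seed)
    candidates = []
    for _ in range(n_candidates):
        environment.reset()
        candidates.append(random_policy(environment.policyDimension, rng))
    rewards = environment.evaluatePolicy(candidates)
    # Index of the highest reward (the starter code uses np.argmax for this step).
    best = max(range(len(rewards)), key=rewards.__getitem__)
    return candidates[best], rewards[best]


env = MockEnvironment()
best_policy, best_reward = random_search(env)
print(sorted(best_policy.keys()), round(best_reward, 3))
```

Because the stand-in exposes the same three names the starter agent uses, CustomAgent(MockEnvironment()).generate() should also run against it, which makes it a cheap way to debug agent logic before evaluating against the real environment.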