AI4D Africa’s Anglophone Research Lab Tanzania Tourism Classification Challenge 🏝️


Helping Tanzania, United Republic of

$1 000 USD

Completed (~4 years ago)

Skills you will learn

Classification

509 joined

180 active


Start

Jun 01, 22

Jul 01, 22

Reveal

Jul 01, 22

About

The dataset contains 24,675 rows of up-to-date information on tourist expenditure collected by the National Bureau of Statistics (NBS) in Tanzania. It was collected to gain a better understanding of the status of the tourism sector and to provide an instrument that will enable sector growth.

Your goal is to accurately classify the expenditure range of each tourist visiting Tanzania.

The majority of visitors in the 25-44 age group came for business (18.5%) or leisure and holidays (53.2%), which is consistent with the fact that they are economically more productive. Those in the 45-64 age group were more prominent in holidaymaking and visiting friends and relatives. The results further reveal that most visitors in the 18-24 age group came for leisure and holidays (55.3%) as well as volunteering (13.7%). The majority of senior citizens (65 and above) came for leisure and holidays (80.9%) and visiting friends and relatives (9.5%).

The survey covers seven departure points, namely: Julius Nyerere International Airport, Kilimanjaro International Airport, Abeid Amani Karume International Airport, and the Namanga, Tunduma, Mtukula and Manyovu border points.

Files


The variable definitions file provides definitions of the variables found in Test.csv and Train.csv.

The sample submission file is an example of what your submission file should look like. Note that it is a table of probabilities across the six cost categories (High Cost, Higher Cost, Highest Cost, Low Cost, Lower Cost and Normal Cost).
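As a rough illustration of the probability table described above, the following sketch builds submission rows in which each tourist gets one probability per cost category and the six values sum to 1. The `ID` column name, the row identifiers and the raw scores are hypothetical placeholders, not taken from the challenge files; check the actual sample submission for the real header and column order.

```python
import csv
import io

# The six cost categories named in the challenge description.
CATEGORIES = [
    "High Cost", "Higher Cost", "Highest Cost",
    "Low Cost", "Lower Cost", "Normal Cost",
]


def to_probability_row(row_id, scores):
    """Normalise non-negative per-class scores into probabilities summing to 1."""
    total = sum(scores[c] for c in CATEGORIES)
    if total <= 0:
        # No evidence for any class: fall back to a uniform distribution.
        probs = {c: 1.0 / len(CATEGORIES) for c in CATEGORIES}
    else:
        probs = {c: scores[c] / total for c in CATEGORIES}
    # "ID" is an assumed identifier column name (hypothetical).
    return {"ID": row_id, **probs}


def write_submission(rows, handle):
    """Write the rows as CSV with one column per cost category."""
    writer = csv.DictWriter(handle, fieldnames=["ID"] + CATEGORIES)
    writer.writeheader()
    writer.writerows(rows)


# Made-up scores for two made-up tourists.
example_rows = [
    to_probability_row("tour_1", {
        "High Cost": 1, "Higher Cost": 1, "Highest Cost": 0,
        "Low Cost": 2, "Lower Cost": 0, "Normal Cost": 4,
    }),
    to_probability_row("tour_2", {c: 0 for c in CATEGORIES}),
]

buffer = io.StringIO()
write_submission(example_rows, buffer)
submission_csv = buffer.getvalue()
```

In practice the scores would come from a trained classifier's predicted class probabilities; the normalisation step is only there to guarantee each row is a valid probability distribution.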

The test file is the dataset to which you will apply your model to test how well it performs. It contains 6,169 rows of tourist information and includes the same fields as Train.csv except for the last column. Use your model and this dataset to predict which of the six categories each tourist is likely to fall into (High Cost, Higher Cost, Highest Cost, Low Cost, Lower Cost and Normal Cost).
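If a single predicted category per tourist is needed, for example to inspect a model's output, one simple option is to take the category with the highest predicted probability. This is a minimal sketch with made-up probabilities; the function name is hypothetical and the challenge itself asks for probabilities, not hard labels.

```python
# The six cost categories named in the challenge description.
CATEGORIES = [
    "High Cost", "Higher Cost", "Highest Cost",
    "Low Cost", "Lower Cost", "Normal Cost",
]


def most_likely_category(probabilities):
    """Return the category with the highest predicted probability."""
    return max(CATEGORIES, key=lambda c: probabilities[c])


# Made-up predicted probabilities for one tourist.
example = {
    "High Cost": 0.10, "Higher Cost": 0.05, "Highest Cost": 0.05,
    "Low Cost": 0.20, "Lower Cost": 0.15, "Normal Cost": 0.45,
}
predicted = most_likely_category(example)
```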

The training file contains the target. This is the dataset that you will use to train your model.
