Uber Nairobi Ambulance Perambulation Challenge
$6,000 USD
Can you use ML to create an optimised ambulance deployment strategy in Nairobi?
979 data scientists enrolled, 330 on the leaderboard
17 September 2020—24 January 2021
How to merge road segments_id to train
published 13 Jan 2021, 12:12

Greetings Zindians, better late than never. Can someone give me a hint on how to merge the road survey data with Train? Thanks!

Hi Acalo!

For each accident location in the train set, you can find the nearest road segment and take the index of that segment's row in the road data. Then, using that index, add the corresponding values from the road data to train.
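To make the "nearest segment" step concrete, here is a minimal plain-Python sketch of what such a lookup computes. It is not the code from this thread: the helper names (`point_segment_distance`, `polyline_distance`, `nearest_line_index`) and the toy coordinates are made up for illustration, and it assumes planar (x, y) distances, so on raw longitude/latitude the result is in degrees rather than metres (shapely's `distance` is planar in the same way).

```python
import math

def point_segment_distance(p, a, b):
    """Planar distance from point p to the straight segment a-b ((x, y) tuples)."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0.0:
        # Degenerate segment: both ends coincide
        return math.hypot(px - ax, py - ay)
    # Projection of p onto the segment, clamped to its two ends
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))

def polyline_distance(p, line):
    """Distance from p to a polyline given as a list of at least two vertices."""
    return min(point_segment_distance(p, a, b) for a, b in zip(line, line[1:]))

def nearest_line_index(lines, p):
    """Position in `lines` of the polyline closest to p."""
    distances = [polyline_distance(p, line) for line in lines]
    return min(range(len(distances)), key=distances.__getitem__)

# Hypothetical toy data: two "road segments" and two "accident" points
toy_lines = [
    [(0.0, 0.0), (1.0, 0.0)],
    [(0.0, 1.0), (1.0, 1.0), (2.0, 2.0)],
]
toy_points = [(0.5, 0.2), (1.6, 1.5)]

toy_indices = [nearest_line_index(toy_lines, p) for p in toy_points]
print(toy_indices)
```

The resulting list holds one row position per point, which is what lets you pull the matching road-data rows and attach them to train in the same order.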

import numpy as np
import pandas as pd  # For read_csv and merge
import geopandas as gpd  # For loading the map of road segments
from shapely.geometry import Point, LineString
from tqdm.notebook import tqdm

train = pd.read_csv('Data/Train.csv', parse_dates=['datetime'])
road_surveys = pd.read_csv('Data/Segment_info.csv')
road_segment_locs = gpd.read_file('Data/segments_geometry.geojson')

# Keep every segment geometry and attach the survey columns where available
geo_data = pd.merge(road_surveys, road_segment_locs, on="segment_id", how="right")

# One point per accident (x = longitude, y = latitude)
train["point"] = gpd.points_from_xy(train["longitude"], train["latitude"])

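def segment_finder(segments, point):
    # Position of the segment geometry closest to the point
    distances = [line.distance(point) for line in segments]
    return np.argmin(distances)

# Nearest-segment position for every accident in train
indices = [segment_finder(geo_data["geometry"], point) for point in tqdm(train["point"])]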


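# Assumed step (additional_data is not defined in the post): matched geo_data rows, in train order
additional_data = geo_data.iloc[indices].copy()
additional_data.reset_index(drop=True, inplace=True)

def add_data(col):
    # Column of matched road data, aligned row by row with train
    return additional_data[col]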

for col in tqdm(geo_data.columns):