Mine:
CV: 26.19 LB: 31.11
CV: 27.6 LB: 29.33
CV: 26 LB: 29
CV: 30.26 LB: 32.35
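You may have data from the same place in both your train and test (validation) split, which would make your CV look better than the LB.
Try splitting by Place_ID instead: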
# x = number of places to hold out for validation (pick a value yourself)
places = train['Place_ID'].unique()
# Rows from the first x places go to validation, all other places to training
tr = train[train['Place_ID'].isin(places[x:])]
te = train[train['Place_ID'].isin(places[:x])]
# Drop identifiers and target-derived columns so they are not used as features
drop_cols = ['Place_ID X Date', 'Date', 'Place_ID', 'target', 'target_min',
             'target_max', 'target_variance', 'target_count', 'target_diff']
X_train, y_train = tr.drop(columns=drop_cols), tr['target']
X_test, y_test = te.drop(columns=drop_cols), te['target']
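If you want more than one hold-out split, the same idea (no Place_ID shared between train and validation) can be sketched with scikit-learn's GroupKFold. This is only a rough sketch: the helper name grouped_cv_rmse, the toy data, the LinearRegression placeholder, and the choice of RMSE as the metric are all assumptions for illustration, not anything from the competition itself.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import GroupKFold


def grouped_cv_rmse(df, feature_cols, target_col="target",
                    group_col="Place_ID", n_splits=5):
    """Per-fold RMSE where no group (place) is in both train and validation."""
    X = df[feature_cols].to_numpy()
    y = df[target_col].to_numpy()
    groups = df[group_col].to_numpy()
    scores = []
    for train_idx, valid_idx in GroupKFold(n_splits=n_splits).split(X, y, groups):
        # Sanity check: the two sides of the split share no place.
        assert not set(groups[train_idx]) & set(groups[valid_idx])
        model = LinearRegression()  # placeholder model, swap in your own
        model.fit(X[train_idx], y[train_idx])
        pred = model.predict(X[valid_idx])
        scores.append(float(np.sqrt(mean_squared_error(y[valid_idx], pred))))
    return scores


# Toy data standing in for the real train set (made-up values).
rng = np.random.default_rng(0)
toy = pd.DataFrame({
    "Place_ID": np.repeat([f"P{i}" for i in range(20)], 10),
    "f1": rng.normal(size=200),
    "f2": rng.normal(size=200),
})
toy["target"] = 3 * toy["f1"] + rng.normal(size=200)

fold_rmse = grouped_cv_rmse(toy, ["f1", "f2"])
print(fold_rmse, np.mean(fold_rmse))
```

On the real data you would pass your own train frame and feature list; reporting the mean over folds together with the number of splits makes the CV numbers easier to compare.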
CV: 24.52 LB: 30.26
Before giving your CV score, please state your validation strategy so that everyone can compare.