Spatio-Temporal Beam-Level Traffic Forecasting Challenge by ITU

12 000 CHF
Completed (over 1 year ago)
Forecast
716 joined
171 active
Start: Jul 24, 2024
Close: Oct 11, 2024
Reveal: Oct 11, 2024
skaak
Ferra Solutions
It is what it is ...
Help · 2 Aug 2024, 09:26 · 8

Hmmmm, perhaps it simply is what it is, but ... I was expecting the sample submission file to contain values, not just 0 everywhere, for the "other" non-target variables. Actually, I was hoping there would be a "test" file with those values in it. So we have to predict the target "blindly"? There are no values for any of the other variables during those two weeks to help with the prediction?

Discussion 8 answers
TheGoodGuy

Some feature engineering will come in handy. The newly created features will really help in training and predicting for the two weeks.
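To make that a bit more concrete: since the two forecast weeks come with no covariates, any feature has to be something you can also compute for future rows. Here is a minimal sketch of that idea, assuming the rows are consecutive hourly observations starting at hour 0. The function name and feature names are made up for illustration, and "day_in_week" is only a position in a 7-day cycle, not a named weekday.

```python
import math

def time_features(hour_index):
    """Calendar-style features derived purely from a 0-based hourly row index."""
    hour_of_day = hour_index % 24
    day_index = hour_index // 24
    day_in_week = day_index % 7  # position in a 7-day cycle (assumed, not a real weekday)
    # Cyclical encoding so that hour 23 and hour 0 end up close together.
    angle = 2 * math.pi * hour_of_day / 24
    return {
        "hour_of_day": hour_of_day,
        "day_in_week": day_in_week,
        "hour_sin": math.sin(angle),
        "hour_cos": math.cos(angle),
    }

# Features for a two-week horizon: 14 days x 24 hours.
features = [time_features(h) for h in range(24 * 14)]
```

The same function works for training rows and forecast rows, as long as the forecast hours continue the same index.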

2 Aug 2024, 09:35
Upvotes 0

Great point. On top of that, a simple "starter notebook" would go a long way toward clearing up the doubts participants have.

2 Aug 2024, 09:37
Upvotes 0
skaak
Ferra Solutions

Yip ...

Here is a snippet that I found useful - if you run this, you will get the same labels as the files provided.

n_base = 30
n_cell =  3
n_beam = 32

mod_cols = []
for base in range ( n_base ) :
    for cell in range ( n_cell ) :
        for beam in range ( n_beam ) :
            mod_col = f"{ base }_{ cell }_{ beam }"
            mod_cols.append ( mod_col )
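
If you prefer, the same list can be built with itertools.product, which also makes it easy to sanity-check the count. This is just a sketch under the same 30 x 3 x 32 assumption as the loops above; `labels` is an illustrative name.

```python
from itertools import product

n_base, n_cell, n_beam = 30, 3, 32

# product varies the last index (beam) fastest, which matches the
# nested loops: base outermost, then cell, then beam.
labels = [
    f"{base}_{cell}_{beam}"
    for base, cell, beam in product(range(n_base), range(n_cell), range(n_beam))
]

print(len(labels), labels[0], labels[-1])  # 2880 0_0_0 29_2_31
```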
skaak
Ferra Solutions

Also, to pivot the sample submission (here I call it ss) I use this ... but of course, it ends up with just zeroes ...

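# Split each ID into the variable name and the coordinate string.
ss [ "var_name" ], ss [ "coords" ] = zip ( * ss [ "ID" ].str.split ( "_test_" ) )
# EDIT: this!
ss [ "week"     ], \
ss [ "hour"     ], \
ss [ "base"     ], \
ss [ "cell"     ], \
ss [ "beam"     ] = zip ( * ss [ "coords" ].str.split ( "_" ) )

# Cast the numeric keys to int: pivot sorts its keys, and as strings they
# would sort as "0", "1", "10", ... and no longer line up with mod_cols.
for col in [ "hour", "base", "cell", "beam" ] :
    ss [ col ] = ss [ col ].astype ( int )

ss_pivot = ss.pivot ( columns = [ "base", "cell", "beam" ], index = [ "var_name", "week", "hour" ], values = "Target" )
ss_pivot.columns = mod_cols
ss_pivot = ss_pivot.reset_index ()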

EDIT: I'm fighting the editing system over here ... also, important omission here that I added in.

Anyhow, this is sort of the point of my post: I was hoping to do this and end up with a bunch of explanatory variables extracted from the sample submission to model the target variables with, but it seems we just predict into the unseen without any additional variables. So I guess, it is what it is ...
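
One thing to watch in the pivot: every piece split out of the ID is a string, and pivot sorts its row and column keys, so string keys come out as "0", "1", "10", ... rather than in numeric order. A minimal sketch of the parsing in plain Python, assuming IDs look like the made-up example below (variable name, `_test_`, then week, hour, base, cell, beam). `parse_id` is a hypothetical helper, not something from the competition files.

```python
def parse_id(row_id):
    """Split a submission ID into (var_name, week, hour, base, cell, beam)."""
    var_name, coords = row_id.split("_test_")
    week, hour, base, cell, beam = coords.split("_")
    # Numeric parts become ints so that later sorting is numeric.
    return var_name, week, int(hour), int(base), int(cell), int(beam)

# Hypothetical ID, built only to match the split pattern used above.
example_id = "traffic_DLThpVol_test_5w-6w_17_12_2_31"
parsed = parse_id(example_id)

# Why the int cast matters: strings sort character by character.
string_order = sorted(["0", "1", "2", "10"])             # ['0', '1', '10', '2']
int_order = sorted(int(s) for s in ["0", "1", "2", "10"])  # [0, 1, 2, 10]
```

Casting base, cell and beam to int before pivoting keeps the pivoted columns in the same order as mod_cols.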

skaak
Ferra Solutions

And then ... to create files similar to the input files but from the sample submission

for i in [ "traffic_DLThpVol", "traffic_DLPRB", "traffic_DLThpTime", "traffic_MR_number" ] :

    for j in [ "5w-6w", "10w-11w" ] :

        # .copy () so the hour cast below does not write into a slice of ss_pivot
        x = ss_pivot.loc [ ( ss_pivot [ "var_name" ] == i ) & ( ss_pivot [ "week" ] == j ) ].copy ()
        x [ "hour" ] = x [ "hour" ].astype ( int )
        # sort numerically by hour so the rows follow time order
        x = x.sort_values ( "hour" ).reset_index ( drop = True )
        x [ mod_cols ].to_csv ( f"../output/{ i }={ j }.csv" )

Speaking as someone who does time-series predictions in the real world ... this is often what you would actually get. Sure, sometimes you can back out a day-of-week effect or add some future holidays if you have the data. But usually we don't know any of the X's or Y's for two weeks from now.
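
For what it's worth, a forecast that only uses the position in the weekly cycle can be sketched like this. It is a minimal seasonal-profile baseline, assuming hourly data with a 168-hour cycle and a forecast that starts right after the history (or a whole number of weeks later). The function name is made up for illustration.

```python
def seasonal_profile_forecast(history, horizon, period=168):
    """Forecast by averaging past values that share the same slot in the cycle."""
    sums = [0.0] * period
    counts = [0] * period
    for t, value in enumerate(history):
        sums[t % period] += value
        counts[t % period] += 1
    # Slots never seen in the history fall back to 0.0.
    profile = [s / c if c else 0.0 for s, c in zip(sums, counts)]
    start = len(history)
    return [profile[(start + k) % period] for k in range(horizon)]

# Toy data: two identical weeks with a simple daily ramp.
week = [float(h % 24) for h in range(168)]
history = week + week
forecast = seasonal_profile_forecast(history, horizon=336)
```

No X's are needed here: the only input for the future rows is the clock.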

2 Aug 2024, 12:20
Upvotes 1
Jaw22
Zindi africa

@Skaak

I tend to concur, the ds is very rough and rudimentary. Getting it together and into a train / test format is a mean task, unless you are the Optimus Prime of DS. Also, no starter webinar or starter notebook!

3 Aug 2024, 18:30
Upvotes 0
skaak
Ferra Solutions

Jaw!

So nice to see you - how are you doing my good friend? Did you play any of the fossil comps? This reminds me of those. Also a bit like Sasol. Oh well ... as CB says, it is what it is.

I think, if this was a real world one, you'd struggle to beat ARIMA ... just work the stuff through that and you have decent forecasts and a relatively simple model that scales well to this sort of a thing. But ... to win a comp, you have to dig deeper I guess.
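
To give a feel for how small such a model can be, here is a sketch of the simplest member of that family. Note that it is not a full ARIMA: it is a plain AR(1), i.e. ARIMA(1,0,0) with no differencing or moving-average terms, fitted by ordinary least squares, whereas a library would typically estimate it differently. Function names and the toy series are made up for illustration.

```python
def fit_ar1(series):
    """Least-squares fit of y[t] = c + phi * y[t-1]."""
    x = series[:-1]
    y = series[1:]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    var_x = sum((xi - mean_x) ** 2 for xi in x)
    cov_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    phi = cov_xy / var_x
    c = mean_y - phi * mean_x
    return c, phi

def forecast_ar1(last_value, c, phi, steps):
    """Iterate the fitted recursion forward from the last observed value."""
    out = []
    value = last_value
    for _ in range(steps):
        value = c + phi * value
        out.append(value)
    return out

# Toy series generated exactly by y[t] = 2 + 0.5 * y[t-1].
series = [0.0]
for _ in range(20):
    series.append(2.0 + 0.5 * series[-1])

c, phi = fit_ar1(series)
preds = forecast_ar1(series[-1], c, phi, steps=3)
```

With 2880 beam columns you would fit one such model per column, which is what keeps this kind of approach cheap to scale.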

Best wishes for this one ... you are the shark of DS of course and, afaik, time series are your thing. Hope you do really well here.