Hey everyone, just joined, and this is my first time dealing with time data :D. I want to know if it's possible to use built-in functions in pandas to easily extract time-based features, like how many hours have passed since a user's last transaction, or how many transactions a user made without committing fraud. Or do I have to do groupbys and write some for loops? Thanks
Hey Blenz!
Just joined too, I have not analyzed the data yet, but this is how it can be done using pandas:
df["hour_column"] = pd.DatetimeIndex(df["timestamp_column"]).hour
Hey, thanks for the answer, but I'm not talking about extracting the hour/minute/second from a timestamp; I already did that via pd.to_datetime() and then df.column.dt.hour. I'm referring to extracting features after grouping transactions by CustomerId. For example, for N transactions made by a user U, I want to know the difference in hours between transaction N and transaction N-1, and so on, plus other time-based features.
@Blenz: the built-in pandas groupby is well suited to this kind of task. You can also run your own custom functions on a dataframe (or on each group) using the built-in .apply() method.
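To make the groupby suggestion concrete, here is a rough sketch of the two features asked about, with no for loops. The toy data and the column names TransactionTime and IsFraud are assumptions for illustration (only CustomerId is mentioned in the thread), so adapt them to the real dataset.

```python
import pandas as pd

# Toy data; TransactionTime and IsFraud are assumed column names.
df = pd.DataFrame({
    "CustomerId": ["A", "A", "A", "B", "B"],
    "TransactionTime": pd.to_datetime([
        "2019-01-01 08:00", "2019-01-01 11:30", "2019-01-02 11:30",
        "2019-01-01 09:00", "2019-01-01 10:00",
    ]),
    "IsFraud": [0, 0, 1, 0, 0],
})

# Sort so "previous row in the group" means "previous transaction in time".
df = df.sort_values(["CustomerId", "TransactionTime"]).reset_index(drop=True)

# Hours since the same customer's previous transaction
# (NaN for each customer's first transaction).
df["hours_since_prev"] = (
    df.groupby("CustomerId")["TransactionTime"].diff().dt.total_seconds() / 3600
)

# Number of earlier non-fraud transactions by the same customer:
# running total per customer, minus the current row so it is not counted.
non_fraud = (df["IsFraud"] == 0).astype(int)
df["prior_non_fraud"] = non_fraud.groupby(df["CustomerId"]).cumsum() - non_fraud

print(df)
```

Excluding the current row from the count is a deliberate choice here: if the fraud label is the prediction target, counting the current transaction would leak it into the feature.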
Thanks a lot for the input. But I won't be submitting to this competition, as it is too late.