The constraint means you cannot use standard end-to-end deep learning models, where the raw video or image pixels are fed directly into a neural network and the entire network is then tuned with backpropagation to predict the congestion rating.
We can use computer vision/deep learning models for feature extraction from the raw videos, then train classical ML models on those features. So the flow would be: computer vision -> time series feature engineering -> classical ML prediction.
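Here is a rough sketch of that flow, in case it helps. Everything in it is an assumption for illustration: the per-frame vehicle counts stand in for whatever a CV model would output, the numbers and the "low"/"high" labels are made up, the helper names `window_features` and `knn_predict` are hypothetical, and the k-nearest-neighbours vote is just a placeholder for any classical model (a random forest or gradient-boosted trees would slot into the same place).

```python
from statistics import mean

def window_features(counts, window=5):
    """Turn per-frame vehicle counts into one feature row per sliding window."""
    rows = []
    for start in range(len(counts) - window + 1):
        w = counts[start:start + window]
        # level, peak and trend of the count inside the window
        rows.append((mean(w), max(w), w[-1] - w[0]))
    return rows

def knn_predict(train_X, train_y, x, k=3):
    """Classical k-nearest-neighbours vote using squared Euclidean distance."""
    dists = sorted(
        (sum((a - b) ** 2 for a, b in zip(row, x)), label)
        for row, label in zip(train_X, train_y)
    )
    votes = [label for _, label in dists[:k]]
    return max(set(votes), key=votes.count)

# Step 1 (computer vision): made-up vehicle counts per frame, standing in for
# the output of a pretrained detector run on two clips.
light_clip = [2, 3, 2, 4, 3, 2, 3]
heavy_clip = [18, 20, 22, 21, 19, 23, 22]

# Step 2 (time series feature engineering): sliding-window statistics.
train_X = window_features(light_clip) + window_features(heavy_clip)
train_y = ["low"] * 3 + ["high"] * 3

# Step 3 (classical ML prediction): no neural network is trained here.
new_clip = [17, 19, 21, 20, 22]
prediction = knn_predict(train_X, train_y, window_features(new_clip)[0])
print(prediction)
```

The point of the split is that the only thing being fitted to the congestion labels is the classical model in step 3. The CV model only produces numbers for it to work with.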
Ohh, alright. But they mentioned that it's also not allowed during inference. Most CV models run inference in eval mode, though, so does the constraint still exclude them as backpropagation methods?
Just to be clear, I can use deep learning-based models to extract features from the footage and then use non-deep-learning methods to learn patterns in these features, right?
During inference, we don't perform any backpropagation (no gradients are calculated).
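A minimal PyTorch sketch of what that looks like, assuming PyTorch is the framework in use. The tiny `extractor` network is a hypothetical stand-in for a pretrained backbone and the frames are random tensors. One detail worth knowing: `eval()` only switches layers such as dropout and batch norm to their inference behaviour, while `torch.no_grad()` is what actually turns off gradient tracking.

```python
import torch
import torch.nn as nn

# Hypothetical stand-in for a pretrained backbone; a real pipeline would load
# pretrained weights instead of using this randomly initialised network.
extractor = nn.Sequential(
    nn.Conv2d(3, 8, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.AdaptiveAvgPool2d(1),
    nn.Flatten(),
)

# Freeze every weight, then switch to inference behaviour.
for param in extractor.parameters():
    param.requires_grad_(False)
extractor.eval()

# Keep a copy of the weights to show that inference leaves them unchanged.
weights_before = [param.clone() for param in extractor.parameters()]

frames = torch.rand(4, 3, 64, 64)  # fake batch of 4 RGB frames

# Inside no_grad no autograd graph is built, so there is nothing to
# back-propagate through: this is a forward pass only.
with torch.no_grad():
    features = extractor(frames)

# Plain array of shape (4, 8), ready to hand to a classical ML model.
features_np = features.numpy()
```

Since neither the forward pass nor anything after it calls `backward()` or an optimizer step, the extractor's weights are the same before and after it runs.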