I know a recurrent neural network may be overkill here, but I worked out a way to show how important each input feature is to my RNN's forecast values: I compute the derivative of the forecast with respect to each input. That being said, does gradient importance equal feature importance?
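
To make the question concrete, the derivative computation described above could be sketched roughly as follows. This is only an illustration under stated assumptions, not the actual model: it assumes a toy tanh RNN with a single hidden unit and made-up weights, and the names `rnn_forecast` and `input_gradients` are hypothetical. It estimates the derivatives with central finite differences instead of backpropagation so that it runs with the standard library alone.

```python
import math


def rnn_forecast(x_seq, w_in, w_rec, w_out):
    """Toy Elman-style RNN with one hidden unit; returns a scalar forecast."""
    h = 0.0
    for x_t in x_seq:
        # Weighted sum of the current features plus the recurrent term.
        pre = sum(w * x for w, x in zip(w_in, x_t)) + w_rec * h
        h = math.tanh(pre)
    return w_out * h


def input_gradients(x_seq, w_in, w_rec, w_out, eps=1e-6):
    """Central finite-difference estimate of d(forecast) / d(x_seq[t][j])."""
    grads = []
    for t, x_t in enumerate(x_seq):
        row = []
        for j in range(len(x_t)):
            # Copy the sequence and nudge a single input value up and down.
            plus = [list(r) for r in x_seq]
            minus = [list(r) for r in x_seq]
            plus[t][j] += eps
            minus[t][j] -= eps
            diff = (rnn_forecast(plus, w_in, w_rec, w_out)
                    - rnn_forecast(minus, w_in, w_rec, w_out))
            row.append(diff / (2 * eps))
        grads.append(row)
    return grads


# Made-up example: 3 time steps, 2 features. The second feature has zero
# input weight, so the model ignores it entirely (an assumption chosen to
# make the result easy to check).
x_seq = [[0.5, 2.0], [0.1, -1.0], [0.3, 0.7]]
w_in = [0.8, 0.0]
w_rec = 0.5
w_out = 1.5

grads = input_gradients(x_seq, w_in, w_rec, w_out)
for t, row in enumerate(grads):
    print(t, row)
```

Each entry `grads[t][j]` is the local sensitivity of the forecast to feature `j` at time step `t`, evaluated at this one input sequence only.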