I am getting decent accuracy (83%) on the training set, but the moment the model is exposed to the test data, things go south (as evidenced by my rank!). In GridSearchCV, XGBClassifier gives the highest score. Any pointers in the right direction? Code snippets, maybe?
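I recommend you try cross-validation to guard against overfitting. Accuracy on held-out folds is a more realistic estimate of test performance than accuracy on the training set.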
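Here is a minimal sketch of what that could look like, not a definitive recipe. It assumes scikit-learn is installed, uses synthetic data from make_classification in place of the competition data, and uses GradientBoostingClassifier as a stand-in for XGBClassifier (which should drop in the same way, since it follows the scikit-learn estimator interface). The sample sizes and seeds are arbitrary choices for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import StratifiedKFold, cross_val_score

# Synthetic stand-in data (assumption); replace with your own X and y.
X, y = make_classification(
    n_samples=500, n_features=20, n_informative=5, random_state=0
)

# Stand-in for XGBClassifier (assumption); any scikit-learn style classifier works here.
model = GradientBoostingClassifier(random_state=0)

# 5-fold stratified cross-validation: each fold is held out once for scoring.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
cv_scores = cross_val_score(model, X, y, cv=cv, scoring="accuracy")

# Accuracy on the same data the model was fitted on, for comparison.
train_accuracy = model.fit(X, y).score(X, y)

print(f"training accuracy:        {train_accuracy:.3f}")
print(f"cross-validated accuracy: {cv_scores.mean():.3f} +/- {cv_scores.std():.3f}")
```

A large gap between the two numbers is a sign of overfitting; the cross-validated figure is the one to compare against the leaderboard.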
How many folds would give a realistic estimate of prediction accuracy?
You can't know the right number of folds in advance; you have to try different values. I recommend trying 5 to 10 folds.
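One rough way to try this is to run the same model with 5 and with 10 folds and compare the mean and spread of the scores. The sketch below assumes scikit-learn, synthetic data from make_classification, and GradientBoostingClassifier standing in for XGBClassifier; the names and settings are illustrative, not taken from the original code.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import StratifiedKFold, cross_val_score

# Synthetic stand-in data (assumption); replace with your own X and y.
X, y = make_classification(
    n_samples=500, n_features=20, n_informative=5, random_state=0
)

# Stand-in for XGBClassifier (assumption); fewer trees to keep the run short.
model = GradientBoostingClassifier(n_estimators=50, random_state=0)

results = {}
for k in (5, 10):
    # Shuffle with a fixed seed so the comparison is repeatable.
    cv = StratifiedKFold(n_splits=k, shuffle=True, random_state=0)
    scores = cross_val_score(model, X, y, cv=cv, scoring="accuracy")
    results[k] = (scores.mean(), scores.std())
    print(f"{k:>2} folds: mean={scores.mean():.3f}, std={scores.std():.3f}")
```

If the means are close, the cheaper 5-fold setting is usually good enough. With 10 folds each model trains on more data, but each validation fold is smaller, so the per-fold scores tend to vary more.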