National Engineering School of Carthage (ENICarthage)
Data Augmentation and Data Oversampling
Help · 8 Jun 2020, 19:26 · 2

The competition info includes a rule that says: 'You may use only the datasets provided for this competition.' Oversampling is therefore allowed, because you are only reusing the same dataset. But data augmentation in NLP is quite different from CV: with a train set this small, you need an external data source to augment it.

So do you think that using external sources to augment your data is against the rules here?

Discussion 2 answers
niksss
University of California, Santa Cruz

You can oversample your data as long as you're not involving any external data points. Someone correct me if I'm wrong.

8 Jun 2020, 19:31
National Engineering School of Carthage (ENICarthage)

By definition, oversampling does not involve external data. But to get good results with augmentation, you need an external data source, and that is where the problem lies.
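
To make the oversampling side concrete, here is a minimal sketch of random oversampling that only duplicates rows already in the train set. The function name `random_oversample` and the toy texts and labels are made up for illustration; they are not from the competition data.

```python
import random
from collections import Counter


def random_oversample(texts, labels, seed=0):
    """Duplicate existing minority-class rows until every class
    has as many rows as the largest class. No new text is created."""
    rng = random.Random(seed)

    # Group the existing texts by their label.
    by_class = {}
    for text, label in zip(texts, labels):
        by_class.setdefault(label, []).append(text)

    # Every class is grown to the size of the largest class.
    target = max(len(rows) for rows in by_class.values())

    out_texts, out_labels = [], []
    for label, rows in by_class.items():
        # Sample with replacement from rows that are already in the train set.
        extra = [rng.choice(rows) for _ in range(target - len(rows))]
        for text in rows + extra:
            out_texts.append(text)
            out_labels.append(label)
    return out_texts, out_labels


# Toy, made-up data (hypothetical, for illustration only).
texts = ["good", "great", "fine", "bad"]
labels = [1, 1, 1, 0]

new_texts, new_labels = random_oversample(texts, labels)
print(Counter(new_labels))
```

Every row in the output is a copy of a row from the input, so nothing outside the provided dataset is involved. Augmentation that relies on an outside resource is a different case, and that is the part I am unsure about.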