Hello,
A major hassle in this challenge is generating proper engineered features: with such a large dataset, the time complexity becomes a real problem. Can we share in this discussion how we are going about optimizing the process? I will update this discussion with ideas as I explore ways to optimize as well. Let's learn from each other about parallelism and efficiency.
Thank you.
Update-
Here are some helpful resources for accelerated workflows and data processing, with a rough sketch of each approach after the list:
Memory usage reduction - https://www.kaggle.com/code/gemartin/load-data-reduce-memory-usage/notebook
RAPIDS cuDF - GPU-accelerated workflows for tabular data on CUDA: https://docs.rapids.ai/api/cudf/stable/
RAPIDS cuML - GPU-accelerated training for traditional machine learning algorithms: https://www.analyticsvidhya.com/blog/2022/01/cuml-blazing-fast-machine-learning-model-training-with-nvidias-rapids/, https://medium.com/rapids-ai/10-minutes-to-rapids-cudf-and-dask-cudf-3d16fcb84139
Pandas + Dask - https://pandas.pydata.org/docs/user_guide/scale.html, https://www.vantage-ai.com/en/blog/4-strategies-how-to-deal-with-large-datasets-in-pandas, https://docs.dask.org/en/stable/10-minutes-to-dask.html
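For the memory reduction idea, here is a minimal sketch of the dtype-downcasting trick the linked notebook is built around (the notebook casts column by column based on min/max values; this version leans on pandas' built-in downcasting instead), not yet tested on the competition data:

```python
import numpy as np
import pandas as pd

def reduce_mem_usage(df: pd.DataFrame) -> pd.DataFrame:
    """Downcast numeric columns to the smallest dtype that fits their values."""
    start_mb = df.memory_usage(deep=True).sum() / 1024 ** 2
    for col in df.columns:
        if np.issubdtype(df[col].dtype, np.integer):
            df[col] = pd.to_numeric(df[col], downcast="integer")
        elif np.issubdtype(df[col].dtype, np.floating):
            # Note: downcasting floats to float32/float16 trades precision for memory.
            df[col] = pd.to_numeric(df[col], downcast="float")
    end_mb = df.memory_usage(deep=True).sum() / 1024 ** 2
    print(f"Memory usage: {start_mb:.1f} MB -> {end_mb:.1f} MB")
    return df
```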
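For cuDF, the appeal is that it mirrors a large part of the pandas API, so groupby-style feature engineering can run on the GPU with few code changes. A rough sketch, assuming a CUDA GPU with RAPIDS installed (the file and column names are placeholders):

```python
import cudf

# Read directly into GPU memory; most pandas-style operations then run on the GPU.
gdf = cudf.read_csv("train.csv")  # placeholder file name

# Example aggregation features per group (placeholder column names).
agg_features = gdf.groupby("customer_ID").agg({"amount": ["mean", "max", "std"]})

# Bring the much smaller result back to host memory as a pandas DataFrame.
agg_features = agg_features.to_pandas()
```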
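For cuML, training uses scikit-learn-style estimators backed by the GPU. A toy sketch with synthetic placeholder data (swap in the real features and labels), assuming the same RAPIDS environment:

```python
import numpy as np
from cuml.ensemble import RandomForestClassifier

# Synthetic placeholder data; replace with the engineered features and labels.
X = np.random.rand(10_000, 20).astype(np.float32)
y = (X[:, 0] > 0.5).astype(np.int32)

# scikit-learn-like interface, but training runs on the GPU.
model = RandomForestClassifier(n_estimators=100, max_depth=8)
model.fit(X, y)
preds = model.predict(X)
print(preds[:10])
```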
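For larger-than-memory work on CPU, Dask partitions the dataframe and parallelizes across cores with a mostly pandas-compatible API (pandas' own chunked read_csv is the simpler cousin). A sketch with placeholder names:

```python
import dask.dataframe as dd

# Dask splits the CSV into partitions and builds a lazy task graph.
ddf = dd.read_csv("train.csv", blocksize="256MB")  # placeholder file name

# Same groupby/agg style as pandas, but nothing runs yet (placeholder columns).
features = ddf.groupby("customer_ID")["amount"].agg(["mean", "std", "max"])

# .compute() executes the graph in parallel and returns a regular pandas object.
result = features.compute()
print(result.head())
```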
P.S. - I've yet to implement these myself and measure the benefits.
You're welcome to continue the conversation on our Discord: https://discord.gg/TwsnzK8k