Best Practices in Feature Engineering for Tabular Data with GPU Acceleration
NVIDIA
Many resources, such as university courses, online tutorials, and academic literature, focus mainly on models and model types and rarely discuss preprocessing or feature engineering. Feature engineering is an important component of tabular machine learning problems, and it can be easily integrated into an existing model. In this tutorial, we teach best practices for feature engineering techniques specific to tabular data, building on our teams' collective experience competing in data science competitions such as Kaggle and RecSys. The tabular data structure limits a model's ability to learn the relationships between features, and adding hand-crafted features can significantly boost its performance. For example, our team won the RecSys2020 challenge by developing hand-crafted features, which outperformed complex models without them. By adding GPU acceleration, we were able to reduce the calculation time from multiple days to less than an hour. The speedup was crucial for our high score because it let us run more experiments.
In this hands-on tutorial, we will teach you how to create features for tabular data and accelerate your code with GPUs:
- Feature Engineering: Best practices on how to create features from categorical, numerical and time series data
- GPU acceleration: Accelerate your data frame operations with RAPIDS.AI cuDF using its pandas-like API
- Examples: Get a lot of example functions for your machine learning problem
Upon completion, you'll be able to utilize RAPIDS.AI cuDF to apply the feature engineering techniques to your pipeline and accelerate it with GPUs.
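As a rough illustration of the three feature families named above (categorical, numerical, and time-based), here is a minimal sketch. It is not taken from the tutorial material: the helper names, column names, and toy data are all hypothetical, and the code runs on pandas. Because cuDF offers a pandas-like API, the usual approach would be to swap the import for `import cudf`, but that swap is an assumption here and has not been verified for every call used.

```python
import numpy as np
import pandas as pd  # assumption: with RAPIDS installed, cuDF would stand in for pandas here


def add_count_encoding(df, col):
    """Categorical: attach how often each category value occurs."""
    counts = df.groupby(col).size().rename(f"{col}_count").reset_index()
    # A left merge keeps every original row and adds the count column.
    return df.merge(counts, on=col, how="left")


def add_log_feature(df, col):
    """Numerical: compress a skewed, non-negative column with log(1 + x)."""
    out = df.copy()
    out[f"{col}_log1p"] = np.log1p(out[col])
    return out


def add_time_features(df, col):
    """Time: expose hour of day and day of week from a timestamp column."""
    out = df.copy()
    out[f"{col}_hour"] = out[col].dt.hour
    out[f"{col}_dayofweek"] = out[col].dt.dayofweek  # Monday == 0
    return out


# Hypothetical toy data, for illustration only.
df = pd.DataFrame({
    "user": ["a", "b", "a", "c", "a"],
    "price": [0.0, 9.0, 99.0, 4.0, 1.0],
    "ts": pd.to_datetime([
        "2020-06-01 08:30",
        "2020-06-02 12:00",
        "2020-06-03 23:15",
        "2020-06-06 07:00",
        "2020-06-07 18:45",
    ]),
})

features = add_count_encoding(df, "user")
features = add_log_feature(features, "price")
features = add_time_features(features, "ts")
```

Each helper returns a new data frame rather than modifying its input, so the steps can be reordered or dropped while experimenting with which features help a given model.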