Name: Best Practices in Feature Engineering for Tabular Data with GPU Acceleration T2505 | GTC Digital April 2021 | NVIDIA On-Demand
Uploaded: 2021-04-12T00:00:01Z
Duration: 8374 s
Description: Many resources, such as university courses, (online) tutorials or academic literature, focus mainly on different models and model types and rarely discuss

Many resources, such as university courses, online tutorials, and academic literature, focus mainly on models and model types and rarely discuss the steps for preprocessing or feature engineering. Feature engineering is an important component of tabular machine learning problems and can be easily integrated into an existing model. In this tutorial, we teach best practices for feature engineering techniques specific to tabular data, building on our team's collective experience competing in data science competitions such as Kaggle and RecSys. The tabular data structure limits a model's ability to learn the relationships between features, and adding handcrafted features can significantly boost performance. For example, our team won the RecSys2020 challenge by developing handcrafted features, which outperformed complex models without them. By adding GPU acceleration, we reduced the calculation time from multiple days to less than an hour. The speedup was crucial for our high score because it let us run more experiments.
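To make the point about relationships between features concrete, here is a minimal sketch of one handcrafted feature: combining two categorical columns into a single interaction column. The column names and toy values are hypothetical and do not come from the tutorial, and the sketch uses pandas on CPU rather than the GPU stack discussed in the session.

```python
import pandas as pd

# Toy data with hypothetical column names, chosen only for illustration.
df = pd.DataFrame({
    "weekday": ["Mon", "Mon", "Sat", "Sat"],
    "category": ["toys", "food", "toys", "food"],
    "purchased": [0, 1, 1, 0],
})

# Looked at one column at a time, every value has the same purchase rate,
# so neither feature carries any signal on its own.
rate_by_weekday = df.groupby("weekday")["purchased"].mean()
rate_by_category = df.groupby("category")["purchased"].mean()

# Handcrafted interaction feature: concatenate the two categorical columns.
df["weekday_category"] = df["weekday"] + "_" + df["category"]

# The combined feature separates the target perfectly on this toy data.
rate_by_pair = df.groupby("weekday_category")["purchased"].mean()
print(rate_by_pair)
```

On this toy data the individual columns are uninformative while the combined column is fully informative, which is the kind of relationship a model may otherwise have to discover by itself.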

In this hands-on tutorial, we will teach you how to create features for tabular data and accelerate your code with GPUs:
Feature Engineering: Best practices for creating features from categorical, numerical, and time series data
GPU acceleration: Accelerate your data frame operations with RAPIDS.AI cuDF using its pandas-like API
Examples: Many example functions you can adapt to your own machine learning problem
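As a rough illustration of the three feature families listed above, the following sketch builds one or two features of each kind. It is not taken from the tutorial material: the column names, the toy values, and the `smoothing` constant are all hypothetical. It is written with pandas so that it runs without a GPU; since cuDF is described as having a pandas-like API, much of it may carry over with a changed import, but that is an assumption to verify against the cuDF documentation.

```python
import numpy as np
import pandas as pd

# Toy data with hypothetical column names, chosen only for illustration.
df = pd.DataFrame({
    "user_id": ["a", "a", "b", "c", "c", "c"],
    "price": [10.0, 250.0, 40.0, 5.0, 1200.0, 75.0],
    "timestamp": pd.to_datetime([
        "2021-04-12 08:15", "2021-04-12 21:40", "2021-04-13 09:05",
        "2021-04-17 13:30", "2021-04-18 23:55", "2021-04-19 07:10",
    ]),
    "target": [1, 0, 0, 1, 1, 0],
})

# Categorical: count encoding (how often each category value occurs).
df["user_count"] = df.groupby("user_id")["user_id"].transform("count")

# Categorical: smoothed target encoding. The per-category target mean is
# pulled toward the global mean so that rare categories are less noisy.
prior = df["target"].mean()
smoothing = 2  # hypothetical value; normally tuned on validation data
stats = df.groupby("user_id")["target"].agg(["sum", "count"])
encoding = (stats["sum"] + smoothing * prior) / (stats["count"] + smoothing)
df["user_te"] = df["user_id"].map(encoding)

# Numerical: log transform to compress a long-tailed column.
df["price_log"] = np.log1p(df["price"])

# Time: calendar features extracted from the timestamp.
df["hour"] = df["timestamp"].dt.hour
df["is_weekend"] = df["timestamp"].dt.dayofweek >= 5  # Saturday=5, Sunday=6

print(df)
```

Note that this sketch computes the target encoding on the same rows it is applied to, which leaks the target into the feature. In a real pipeline the encoding would usually be computed out-of-fold or on the training split only.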

Upon completion, you'll be able to utilize RAPIDS.AI cuDF to apply the feature engineering techniques to your pipeline and accelerate it with GPUs.

Event: GTC Digital April

Date: April 2021

Industry: All Industries

Level: Beginner Technical

Topic: ETL Processing

Language: English

Location: