Scaling Datasets in Pipelines

A machine learning algorithm may benefit from data that is "standardized". If one column in a dataframe has a completely different scale than the rest, that column can dominate the model and may cause it to overfit. To mitigate this, you may consider scaling your dataset. This video gives a full introduction to what scaling does to your pipeline.
00:00 Introduction
01:01 Making the problem harder
02:54 Pipelines
04:57 Scaling Data
11:33 Prediction Surface
To learn more about scikit-learn scaling algorithms, you may appreciate this guide:
scikit-learn.org/stable/auto_...
The code for all of our videos can be found on this Github repository:
github.com/probabl-ai/youtube...
The code for this specific episode can be found here:
github.com/probabl-ai/youtube...
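
As a rough illustration of the scaling step described above, here is a minimal sketch that puts a `StandardScaler` in front of a model inside a scikit-learn pipeline. The synthetic data, the column scales, and the choice of `LogisticRegression` are assumptions made for this example; they are not taken from the video or its repository.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic data (an assumption for this sketch): two informative
# columns whose scales differ by a factor of 1000.
rng = np.random.default_rng(0)
n = 200
small = rng.normal(0, 1, n)
large = rng.normal(0, 1000, n)
X = np.column_stack([small, large])
y = (small + large / 1000 > 0).astype(int)

# The scaler is fitted as part of the pipeline, so the same
# transformation is reused automatically at prediction time.
pipe = make_pipeline(StandardScaler(), LogisticRegression())
pipe.fit(X, y)

# After standardizing, each column has mean 0 and standard deviation 1.
X_scaled = pipe.named_steps["standardscaler"].transform(X)
print(X_scaled.mean(axis=0).round(3), X_scaled.std(axis=0).round(3))
print(pipe.score(X, y))
```

Keeping the scaler inside the pipeline means its statistics are learned only from the data passed to `fit`, which avoids leaking information from held-out data during cross-validation.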


Comments: 4

@EdgarPauloVerchez
1 month ago
Hi, your videos are great, especially for beginners like me. Can you do a video on model stacking?
@Mayur7Garg
1 month ago
Good summary. The video title made me feel it was about scaling the size of the data set (number of points) and not the features. 😆
@TheDeviszont
1 month ago
same
@Sadjina
16 days ago
Nitpicking alert: The standard deviation is NOT the average distance of the samples to the mean. You would naively assume that's what a "standard deviation" would be about, but it's not. (I am sure you know this, but still wanted to point it out because it can be confusing).
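
To make the distinction in the comment above concrete, here is a small sketch comparing the mean absolute deviation (the "average distance to the mean") with the standard deviation (the square root of the average squared distance). The sample values are made up for illustration.

```python
import numpy as np

# Made-up sample with one point far from the rest.
x = np.array([1.0, 2.0, 3.0, 4.0, 10.0])
mean = x.mean()  # 4.0

# Average distance of the samples to the mean.
mad = np.abs(x - mean).mean()

# Standard deviation: root of the average *squared* distance.
std = np.sqrt(((x - mean) ** 2).mean())

print(mad)  # 2.4
print(std)  # about 3.162
```

Because the distances are squared before averaging, points far from the mean weigh more heavily, so the standard deviation is never smaller than the mean absolute deviation.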
