Train/Test split

To detect overfitting, it's common practice in machine learning to split the data into training and test datasets. Evaluating on the held-out test set checks whether the model can correctly predict new, unseen data.

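Since we're working with time-series data, we cannot use random split methods: shuffling would put later observations into the training set, letting the model learn from the future. Instead, the data is split at a point in time, with everything before it used for training and everything after it used for testing.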
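As an illustration of splitting at a point in time, here is a minimal sketch using pandas. The `readings` DataFrame, its values, and the `cutoff` date are all made up for this example and are not the course's data. It uses boolean masks on the index so that no row lands in both sets.

```python
import pandas as pd

# Hypothetical daily sensor readings (made-up values, not the course data).
readings = pd.DataFrame(
    {"temperature": [18.5, 19.1, 17.8, 18.0, 20.2, 21.0, 19.7, 18.9, 18.3, 17.5]},
    index=pd.date_range("2018-10-01", periods=10, freq="D"),
)

# Hypothetical cut-off: rows before it are for training, the rest for testing.
cutoff = pd.Timestamp("2018-10-08")

# Boolean masks on the DatetimeIndex keep the two sets disjoint.
train = readings[readings.index < cutoff]
test = readings[readings.index >= cutoff]

print(len(train), len(test))
```

Every training row is strictly earlier than every test row, so a model fitted on `train` never sees values from the test period.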

A helper function, show_start_end(), is available for inspecting the start and end of a DataFrame. It takes a DataFrame as its only argument and returns a string.
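The exercise does not show how this helper is implemented or what its output looks like. Purely as an assumption, a function with the same shape (DataFrame in, string out) might look like the sketch below. The name `show_start_end_sketch` and the output format are hypothetical.

```python
import pandas as pd


def show_start_end_sketch(df: pd.DataFrame) -> str:
    """Hypothetical stand-in: describe the first and last index value of df."""
    # Assumes the index is sortable (e.g. a DatetimeIndex).
    return f"{df.index.min()} to {df.index.max()}"


# Made-up data to try the sketch on.
example = pd.DataFrame(
    {"humidity": [55, 57, 60]},
    index=pd.date_range("2018-10-01", periods=3, freq="D"),
)
summary = show_start_end_sketch(example)
print(summary)
```

Calling such a helper on both the training and the test set is a quick way to confirm that the two date ranges fall on the intended sides of the split.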

The data is available as environment.

This exercise is part of the course

Analyzing IoT Data in Python

Hands-on interactive exercise

Try this exercise by completing the sample code below.

# Define the split day
limit_day = ____

# Split the data
train_env = ____[____]
test_env = ____[____]