Setting up your data for analysis

You will look at a version of the nycflights13 dataset, loaded as flights. It contains information on flights departing from New York City. You are interested in predicting whether or not they will arrive late at their destination, but first, you need to set up the data for analysis.

After discussing your model goals with a team of experts, you selected the following variables for your model: flight, sched_dep_time, dep_delay, sched_arr_time, carrier, origin, dest, distance, date, arrival.

You will also mutate() the date using as.Date() and convert character type variables to factors.
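
As a minimal sketch of this conversion step, the snippet below applies as.Date() and a character-to-factor conversion to a small made-up table. The toy_flights data frame and its values are hypothetical stand-ins for the course's flights data, and the use of across() with where(is.character) is one common dplyr idiom for this, not necessarily the exercise's intended answer.

```r
library(dplyr)

# Hypothetical toy data standing in for the course's `flights` table
toy_flights <- tibble(
  carrier  = c("UA", "AA", "DL"),
  origin   = c("EWR", "JFK", "LGA"),
  date     = c("2013-01-01", "2013-01-02", "2013-01-03"),
  distance = c(1400, 1089, 762)
)

toy_flights <- toy_flights %>%
  mutate(
    date = as.Date(date),                   # parse the ISO-formatted strings into Date values
    across(where(is.character), as.factor)  # convert the remaining character columns to factors
  )

# Inspect the resulting column types
str(toy_flights)
```

Converting date first matters here: once it is a Date, it is no longer a character column, so the factor conversion leaves it alone.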

Lastly, you will split the data into train and test datasets.
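
A stratified train/test split can be sketched as follows, assuming the rsample package is available. The toy data frame, its column values, and the "late"/"on_time" labels are hypothetical; only initial_split(), training(), and testing() are actual rsample functions.

```r
library(rsample)

set.seed(246)

# Hypothetical toy data: 100 late and 300 on-time flights
toy <- data.frame(
  dep_delay = rnorm(400, mean = 10, sd = 30),
  arrival   = factor(rep(c("late", "on_time"), times = c(100, 300)))
)

# Keep 3/4 of the rows for training, sampling within each arrival class
split <- initial_split(toy, prop = 3/4, strata = arrival)
train <- training(split)
test  <- testing(split)

# Class proportions should be similar in both sets
prop.table(table(train$arrival))
prop.table(table(test$arrival))
```

Stratifying on the outcome keeps the share of late flights roughly equal in both sets, which is why the proportion tables are worth comparing after the split.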

This exercise is part of the course

Feature Engineering in R

Exercise instructions

  • Transform all character-type variables to factors.
  • Split the flights data into test and train sets.

Hands-on interactive exercise

Try this exercise by completing the sample code below.

flights <- flights %>%
  select(flight, sched_dep_time, dep_delay, sched_arr_time, carrier, origin, dest, distance, date, arrival) %>%

# Transform all character-type variables to factors
  mutate(date = as.Date(date), ___(where(is.character), as.factor))

# Split the flights data into test and train sets
set.seed(246)
split <- flights %>% initial_split(prop = 3/4, strata = arrival)
test <- ___(split)
train <- ___(split)

test %>% select(arrival) %>% table() %>% prop.table()
train %>% select(arrival) %>% table() %>% prop.table()