BaşlayınÜcretsiz Başlayın

Treating missing data

In this exercise, you're working with another version of the accounts data that contains missing values for both the cust_id and acct_amount columns.

You want to figure out how many unique customers the bank has, as well as the average amount held by customers. You know that rows with missing cust_id don't really help you, and that on average, the acct_amount is usually 5 times the amount of inv_amount.

In this exercise, you will drop rows of accounts with missing cust_ids, and impute missing values of inv_amount with some domain knowledge. dplyr and assertive are loaded and accounts is available.

Bu egzersiz

Cleaning Data in R

kursunun bir parçasıdır
Kursu Görüntüle

Uygulamalı interaktif egzersiz

Bu örnek kodu tamamlayarak bu egzersizi bitirin.

# Create accounts_clean
accounts_clean <- accounts %>%
  # Filter to remove rows with missing cust_id
  ___

accounts_clean
Kodu Düzenle ve Çalıştır