
kNN tricks & tips I: weighting donors

A frequently used variation of kNN imputation is distance-weighted aggregation. When we aggregate the values from the neighbors to obtain a replacement for a missing value, we use a weighted mean instead of a simple one, with the weights being the inverted distances to each neighbor. As a result, closer neighbors have more impact on the imputed value.
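To make the idea concrete, here is a minimal base R sketch of the aggregation step alone. The donor values and distances are made up for illustration, and "inverted" is read here as the reciprocal `1 / distance`; the exact weighting formula that `kNN()` uses internally may differ.

```r
# Hypothetical values of three neighbors (donors) and their
# distances to the observation with the missing value
donor_values <- c(80, 85, 90)
donor_dists  <- c(0.1, 0.2, 0.4)

# Assumption: weights are the reciprocals of the distances,
# so the closest donor gets the largest weight
weights <- 1 / donor_dists

# Unweighted aggregation treats all donors equally
simple_mean <- mean(donor_values)

# Distance-weighted aggregation pulls the result toward closer donors
weighted_mean <- weighted.mean(donor_values, w = weights)

print(simple_mean)
print(weighted_mean)
```

Because the closest donor has the smallest value in this toy example, the weighted mean ends up below the simple mean.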

In this exercise, you will apply the distance-weighted aggregation while imputing the tao data. This will only require passing two additional arguments to the kNN() function. Let's try it out!

This exercise is part of the course

Handling Missing Data with Imputations in R


Exercise instructions

  • Load the VIM package.
  • Impute humidity with kNN, using the distance-weighted mean to aggregate the neighbors; you will need to specify the numFun and weightDist arguments.
  • The margin plot for viewing the results has already been coded for you.

Hands-on interactive exercise

Try this exercise by completing the sample code below.

# Load the VIM package
___(___)

# Impute humidity with kNN using distance-weighted mean
tao_imp <- ___(tao, 
               k = 5, 
               variable = "humidity", 
               ___ = ___,
               ___ = ___)

tao_imp %>% 
	select(sea_surface_temp, humidity, humidity_imp) %>% 
	marginplot(delimiter = "imp")