Specifying datatypes for columns

When you read data from a text or CSV file, you should specify the name and data type of each column. The read.table() function tries to determine whether the first row of the file contains the column names. R infers many data types correctly, but a categorical variable coded as 0, 1, and 2 will be read as a numeric variable, so you need to set the data type of that column yourself after reading the data.
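As a rough illustration of the point above, the sketch below reads a tiny inline CSV and converts a numerically coded column to a factor. The data values and the names `csv_text`, `sample_df`, and `typed_df` are made up for this example and are not part of the exercise; the last step uses the `colClasses` argument as one possible way to set the type while reading.

```r
# Hypothetical three-row sample with no header row (values are illustrative only).
csv_text <- "1,14.2,1.7
2,12.4,0.9
3,13.1,2.4"

# Read the text as CSV; header = FALSE because the sample has no column names.
sample_df <- read.csv(text = csv_text, header = FALSE)

# Assign column names after reading.
names(sample_df) <- c("Type", "Alcohol", "Malic")

# Type was coded with whole numbers, so R reads it as a numeric (integer) column.
str(sample_df)

# Convert the coded column to a factor after reading.
sample_df$Type <- as.factor(sample_df$Type)
str(sample_df)

# Alternative: declare the column types while reading via colClasses.
typed_df <- read.csv(text = csv_text, header = FALSE,
                     col.names = c("Type", "Alcohol", "Malic"),
                     colClasses = c("factor", "numeric", "numeric"))
```

Converting after reading keeps the import step simple, while `colClasses` may be preferable when the column types are known in advance.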

This exercise is part of the course

Multivariate Probability Distributions in R

Instructions

  • Assign the new column names to the wine dataset, then check that they have been correctly assigned.
  • Change the Type column into a factor with three levels.
  • Use the str() function to check the data type/structure before and after changing the data type.

Hands-on interactive exercise

Try this exercise by completing this sample code.

# Assign new names
___ <- c('Type', 'Alcohol', 'Malic', 'Ash', 'Alcalinity', 'Magnesium', 'Phenols', 'Flavanoids', 'Nonflavanoids', 'Proanthocyanins', 'Color', 'Hue', 'Dilution', 'Proline')
                      
# Check the new column names
___

# Check data type/structure of each variable
str(___)

# Change the Type variable data type
___

# Check data type/structure again 
___