
Specifying datatypes for columns

When you read data from a text or CSV file, you should specify the names and data types of the columns. The read.table() function tries to determine whether the first row of the file contains the column names. R infers many data types correctly, but a categorical variable coded as 0, 1, and 2 will be read as numeric, so you need to set the data type of that column after reading the data.
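As a minimal sketch of this behavior, the snippet below reads a tiny CSV held in a string. The column names `group` and `score` and the values are made up for illustration and are not part of the course data. It shows the default inference, a conversion after reading, and the alternative of declaring the type at read time with the `colClasses` argument.

```r
# Hypothetical CSV text (illustrative values only), kept inline so the
# example runs without any external file
csv_text <- "group,score
0,4.1
1,5.3
2,6.8"

# Default import: the 0/1/2 codes are inferred as a numeric (integer) column
raw <- read.csv(text = csv_text)
class(raw$group)

# Option 1: convert the column to a factor after reading
raw$group <- as.factor(raw$group)

# Option 2: declare the column types while reading
typed <- read.csv(text = csv_text,
                  colClasses = c(group = "factor", score = "numeric"))
str(typed)
```

Converting after reading matches what the exercise asks for. Declaring `colClasses` up front is handy when the same file is read repeatedly.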

This exercise is part of the course

Multivariate Probability Distributions in R


Exercise instructions

  • Assign the new column names to the wine dataset, then check that they have been correctly assigned.
  • Change the Type column into a factor with three levels.
  • Use the str() function to check the data type/structure before and after changing the data type.

Hands-on interactive exercise

Have a go at this exercise by completing this sample code.

# Assign new names
___ <- c('Type', 'Alcohol', 'Malic', 'Ash', 'Alcalinity', 'Magnesium', 'Phenols', 'Flavanoids', 'Nonflavanoids', 'Proanthocyanins', 'Color', 'Hue', 'Dilution', 'Proline')
                      
# Check the new column names
___

# Check data type/structure of each variable
str(___)

# Change the Type variable data type
___

# Check data type/structure again 
___
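
One possible way to fill in the blanks is sketched below. It assumes the course environment already provides a data frame called `wine` with 14 columns. Because that object is not available here, the sketch first builds a small stand-in with placeholder values (not real wine measurements) so that it runs on its own.

```r
# Stand-in for the course's `wine` data frame: 6 rows, 14 columns of
# placeholder values (an assumption made only so this sketch is runnable)
set.seed(1)
wine <- data.frame(V1 = rep(1:3, each = 2),
                   matrix(runif(6 * 13), nrow = 6))

# Assign new names
names(wine) <- c('Type', 'Alcohol', 'Malic', 'Ash', 'Alcalinity', 'Magnesium',
                 'Phenols', 'Flavanoids', 'Nonflavanoids', 'Proanthocyanins',
                 'Color', 'Hue', 'Dilution', 'Proline')

# Check the new column names
names(wine)

# Check data type/structure of each variable (Type is still numeric here)
str(wine)

# Change the Type variable data type to a factor
wine$Type <- as.factor(wine$Type)

# Check data type/structure again (Type is now a factor with 3 levels)
str(wine)
```

`colnames(wine) <- ...` would work equally well for a data frame, and `factor(wine$Type)` is an equivalent alternative to `as.factor()`.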