Iris redux - a more robust accuracy.
In this exercise, you will build linear SVMs for 100 distinct training/test partitions of the iris dataset. You will then evaluate the performance of your model by calculating the mean and standard deviation of the test accuracy across those partitions. This procedure, which is quite general, gives a far more robust measure of model performance than the accuracy obtained from a single partition.
This exercise is part of the course
Support Vector Machines in R
Exercise instructions
- For each trial:
  - Partition the dataset into training and test sets in a random 80/20 split.
  - Build a default cost linear SVM on the training dataset.
  - Evaluate the accuracy of your model (accuracy has been initialized in your environment).
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
for (i in 1:___) {
    # assign 80% of the data to the training set
    sample_size <- ___(___ * nrow(iris))
    train <- ___(seq_len(nrow(iris)), size = ___)
    trainset <- iris[train, ]
    testset <- iris[-train, ]
    # build model using training data
    svm_model <- svm(Species ~ ., data = ___,
                     type = "C-classification", kernel = "linear")
    # calculate accuracy on test data
    pred_test <- predict(svm_model, ___)
    accuracy[i] <- mean(pred_test == ___$Species)
}
mean(___)
sd(___)
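
For reference, here is one way the whole procedure could look once it is written out in full. This is a sketch under assumptions, not the official solution: it assumes svm() comes from the e1071 package, uses floor() and sample() for the split, fixes a seed purely for reproducibility, and initializes accuracy itself because the exercise environment is not available outside the course.

```r
# Assumption: svm() is provided by the e1071 package.
library(e1071)

set.seed(42)                          # assumption: fixed seed so the run is repeatable
n_trials <- 100                       # number of partitions, as stated in the exercise text
accuracy <- rep(NA_real_, n_trials)   # stands in for the pre-initialized accuracy vector

for (i in seq_len(n_trials)) {
  # draw a random 80% of the row indices for the training set
  sample_size <- floor(0.8 * nrow(iris))
  train <- sample(seq_len(nrow(iris)), size = sample_size)
  trainset <- iris[train, ]
  testset <- iris[-train, ]

  # linear SVM; cost is left at its default value
  svm_model <- svm(Species ~ ., data = trainset,
                   type = "C-classification", kernel = "linear")

  # fraction of test rows whose predicted species matches the true species
  pred_test <- predict(svm_model, testset)
  accuracy[i] <- mean(pred_test == testset$Species)
}

# summarize performance across all partitions
mean(accuracy)
sd(accuracy)
```

The mean estimates typical test accuracy, while the standard deviation shows how much a single 80/20 split can vary, which is why one partition alone can be misleading.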