Red wine tasting

In this exercise, you will look at the distributions of ratings for red wines from four different countries. The data are pre-loaded in a data frame called red_wine_data. Inspect the data to get a feel for it before you begin!

To obtain a histogram for each type of red wine, you will first need to split the data into subsets. Use the subset() function to do this. Given a data frame and a condition, this function returns a new data frame containing only the rows that satisfy the condition. For example, the condition red_wine_data$condition == "France" is TRUE only for French red wines, so subset(red_wine_data, red_wine_data$condition == "France") returns only the rows pertaining to French red wines.
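As a minimal sketch of how subset() behaves, the toy data frame below stands in for red_wine_data. Its values are invented for illustration, the name toy_wine is hypothetical, and the column names condition and Ratings are assumed from the exercise text.

```r
# Toy stand-in for red_wine_data (values are invented for illustration)
toy_wine <- data.frame(
  condition = c("France", "USA", "France", "Australia"),
  Ratings   = c(88, 91, 85, 79)
)

# On its own, the comparison yields one TRUE/FALSE value per row
is_french <- toy_wine$condition == "France"

# subset() keeps only the rows for which the condition is TRUE
toy_france <- subset(toy_wine, toy_wine$condition == "France")
print(toy_france)
```

The same pattern, with a different country name in the condition, gives one subset per country.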

This exercise is part of the course

Intro to Statistics with R: Introduction

Exercise instructions

Inspect the red_wine_data data frame by printing it to the console.
Provide some summary statistics for red_wine_data using the describe() function.
Split the data frame into one subset per country, as instructed above.
Make four new variables that contain the Ratings data from each of the newly created subsets. Use the $ operator.
Code is provided for you to organize your histograms into a 2x2 matrix using the par() function. Don't change this.
Plot a histogram of the ratings for each country using hist(). Display them in the same order as you defined them. Give your histograms sensible titles and label the x-axes with "score".
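The steps above can be sketched on a small made-up data set. This is only an illustration under stated assumptions: toy_wine and its values are invented, only two countries are shown, and base-R summary() is used as a stand-in because describe() is not part of base R (add-on packages such as psych provide a function of that name).

```r
# Toy stand-in for red_wine_data (invented values, two countries only)
toy_wine <- data.frame(
  condition = c("France", "France", "France", "USA", "USA", "USA"),
  Ratings   = c(88, 85, 90, 91, 79, 84)
)

# Base-R summary statistics; the exercise itself asks for describe()
summary(toy_wine$Ratings)

# One subset per country
toy_france <- subset(toy_wine, toy_wine$condition == "France")
toy_usa    <- subset(toy_wine, toy_wine$condition == "USA")

# The $ operator pulls a single column out as a plain vector
ratings_france <- toy_france$Ratings
ratings_usa    <- toy_usa$Ratings

# Arrange plots in a 1 x 2 grid, then draw one histogram per vector
old_par <- par(mfrow = c(1, 2))
hist(ratings_france, main = "Red wine ratings: France", xlab = "score")
hist(ratings_usa, main = "Red wine ratings: USA", xlab = "score")
par(old_par)  # restore the previous plotting layout
```

In the exercise itself the grid is 2 x 2, set by the provided par(mfrow = c(2, 2)) call, and there are four histograms rather than two.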

Hands-on interactive exercise

Have a go at this exercise by completing this sample code.

## The data frame `red_wine_data` is already pre-loaded

# Print red_wine_data


# Print basic statistical properties of red_wine_data


# Split the data frame into subsets for each country
red_usa <- ___
red_france <- ___
red_australia <- ___
red_argentina <- ___

# Select only the Ratings variable for each subset
red_ratings_usa <- ___
red_ratings_france <- ___
red_ratings_australia <- ___
red_ratings_argentina <- ___
  
# Create a 2 by 2 matrix of histograms
par(mfrow = c(2, 2))
    
# Plot four histograms, one for each subset


Skill level: Beginner

In this chapter, Professor Conway covers types of variables. It is very important to understand what type of variable you are dealing with when conducting a particular type of statistical analysis. You will cover nominal, ordinal, interval, and ratio variables, and you will experiment with them in interactive exercises in R.

Exercise 1: Types of variables
Exercise 2: Basketball standings
Exercise 3: Longitude and latitude
Exercise 4: Nominal variables in R
Exercise 5: Ordinal variables in R
Exercise 6: Interval and ratio variables in R
Exercise 7: On the Theory of Scales of Measurement (Stevens, 1946)
Exercise 8: Two nominal variables
Exercise 9: Quick summary

Here you will look at distributions using graphs called histograms. Histograms are among the simplest graphs used in statistics, yet they are very useful and informative. Studying histograms will help you avoid the tendency to focus too much on summary statistics.

Exercise 1: Histograms and distributions
Exercise 2: Creating histograms in R
Exercise 3: Reading histograms
Exercise 4: Looking at distributions by using histograms (1)
Exercise 5: Positive and negative skew
Exercise 6: Looking at distributions by using histograms (2)
Exercise 7: Red wine tasting (current exercise)
Exercise 8: White wine tasting
Exercise 9: A uniform distribution
Exercise 10: A negatively skewed distribution
Exercise 11: Leptokurtic distribution
Exercise 12: Quick summary

When working with data, it is very important to keep in mind what type of scale you are dealing with, which is why this chapter covers scales of measurement. It introduces the different types of scales, with a specific focus on the standard scale, the z-scale.

Exercise 1: Scales of measurement
Exercise 2: Converting a value to its Z-score
Exercise 3: Interpretation of a Z-score
Exercise 4: Converting a distribution to Z-scale
Exercise 5: Quick summary
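The conversion to the z-scale that this chapter focuses on can be sketched in a few lines of R. The scores below are invented, and the variable names are hypothetical; the sketch uses the sample standard deviation, which is what sd() computes.

```r
# Invented scores, used only to illustrate the conversion
scores <- c(70, 80, 90)

# z = (value - mean) / standard deviation
z_manual <- (scores - mean(scores)) / sd(scores)

# scale() performs the same centering and scaling; it returns a matrix,
# so as.vector() turns the result back into a plain vector
z_scaled <- as.vector(scale(scores))

print(z_manual)  # -1 0 1
```

A z-score of 1 means the value lies one standard deviation above the mean, and a z-score of -1 means one standard deviation below it.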

In the previous chapters, you looked at distributions and why they matter. In this chapter, the focus shifts to summarizing the available information by computing summary statistics. To make it a little more fun, the examples are based on a wine tasting experiment :-).

Exercise 1: Measures of central tendency
Exercise 2: The mean of a Fibonacci sequence
Exercise 3: Three measures of central tendency (1)
Exercise 4: Measures of central tendency: mode
Exercise 5: Choosing a measure of central tendency
Exercise 6: Three measures of central tendency (2)
Exercise 7: Setting up histograms
Exercise 8: Types of distribution
Exercise 9: Robustness to outliers
Exercise 10: Get intuitive!
Exercise 11: Quick summary
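The three measures of central tendency named in this chapter's exercises can be sketched as follows. The ratings are invented. Base R has no built-in function for the statistical mode, so the table()-based approach here is one possible way to get it, not the course's own method.

```r
# Invented ratings, used only for illustration
ratings <- c(80, 85, 85, 90, 100)

mean_rating   <- mean(ratings)    # arithmetic average: 88
median_rating <- median(ratings)  # middle value of the sorted data: 85

# Mode: count how often each value occurs, then take the most frequent one
counts      <- table(ratings)
mode_rating <- as.numeric(names(counts)[which.max(counts)])  # 85
```

Here the single high rating of 100 pulls the mean above the median, which hints at why the choice of measure matters when outliers are present.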

Measures of central tendency try to capture the center point of a distribution. Measures of variability capture how much spread there is, that is, how wide the distribution is. The two measures you will look at in this final chapter are the standard deviation and the variance.

Exercise 1: Measures of variability
Exercise 2: Sample variance formula
Exercise 3: Calculating variance in practice
Exercise 4: Purpose of measures of variability
Exercise 5: Michael Jordan's first NBA season
Exercise 6: Calculate the variance manually
Exercise 7: Get intuitive!
Exercise 8: Quick summary
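A minimal sketch of the two measures of variability, computed by hand and then checked against R's built-in functions. The data are invented, and the sketch assumes the sample versions of the formulas (dividing by n - 1), which is what var() and sd() use.

```r
# Invented data, used only for illustration
x <- c(2, 4, 6, 8)
n <- length(x)

# Sample variance: sum of squared deviations from the mean, divided by n - 1
var_manual <- sum((x - mean(x))^2) / (n - 1)

# Standard deviation: square root of the variance
sd_manual <- sqrt(var_manual)

# The built-in functions give the same results
print(c(var_manual, var(x)))
print(c(sd_manual, sd(x)))
```

Because the standard deviation is expressed in the same units as the data, it is usually the easier of the two to interpret.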