Introducing Survey Data Analysis

1. Introducing Survey Data Analysis

Welcome! My name is EbunOluwa Andrew and I am excited to be your instructor for this course!

2. What is survey data analysis?

Survey data analysis is the process of drawing insights from survey data. It's important that results are analyzed correctly, because a flawed analysis leads to incorrect insights. All results contribute to an increased understanding of the issue being studied. Survey data analysis also measures effects, and a high response rate is critical. For example, we can't determine if a change we made to a product is actually better if the majority of participants don't complete the follow-up survey.

3. What is survey data?

Survey data is data collected by gathering research participants' responses. These responses come in multiple formats and can be numerical or descriptive in nature. Ideally, the data is a fair representation of the opinions and perceptions of your target audience.

4. Types of survey data

There are four main types of survey data. Ordinal data comes from survey questions where the responses only make sense as an order. A sample question would be "How much do you use our product?", and a sample answer would consist of the following: Never, Rarely, Sometimes, Often, or Always. In ordinal data, order matters.

5. Types of survey data

Nominal data consists of distinct groups, which are not arranged in any particular order or sequence. Examples of this type include city of birth, gender, and ethnicity.

6. Types of survey data

Interval data is ordered too, and the distance between the values is meaningful and equal. A sample question could be, "What is your preferred home temperature in degrees Celsius?" Sample answers could include 22, 24, or 26. Ratio data questions ask for a precise measurement with a true zero point. Here, zero means the absence of the variable of interest. A sample question could be, "What is your mortgage loan amount?" and a response of $5321 is ratio data. The type of survey data informs how we analyze it.
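To see why the data type matters in practice, here is a minimal sketch of how ordinal answers might be stored in pandas. The responses are made up for illustration, and the scale is the one from the sample question about product usage; declaring it as an ordered categorical is one reasonable approach, not the only one.

```python
import pandas as pd

# Hypothetical responses to "How much do you use our product?"
responses = pd.Series(["Often", "Never", "Sometimes", "Always", "Rarely", "Often"])

# Declare the answer scale explicitly so pandas knows the order
scale = ["Never", "Rarely", "Sometimes", "Often", "Always"]
usage = pd.Categorical(responses, categories=scale, ordered=True)

# Ordered categories support min/max and comparisons against a level
print(usage.min(), usage.max())
at_least_sometimes = int((usage >= "Sometimes").sum())
print(at_least_sometimes)
```

Nominal data, by contrast, would be stored without an order, so comparisons like the one above would not be meaningful.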

7. Defining goals

When analyzing survey data, first, it's important to define our research goals. What are we trying to achieve? Second, we need to define the response rate, also called the survey participation rate. If 5000 people attended a concert and only five people answered our survey, those five answers aren't representative of the whole group. The more people who complete the survey, the more representative our results are likely to be. Third, we need to learn from the feedback and point to key areas that need attention. This helps improve whatever it is we're analyzing.
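The concert example can be worked through as a quick calculation. This is a small sketch; the function name `response_rate` is hypothetical and simply divides completed surveys by the number of people who could have answered.

```python
def response_rate(completed: int, invited: int) -> float:
    """Return the share of invited people who completed the survey."""
    if invited <= 0:
        raise ValueError("invited must be positive")
    return completed / invited

# Five responses out of 5000 concert attendees
rate = response_rate(completed=5, invited=5000)
print(f"{rate:.1%}")  # 0.1%
```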

8. Sampling for surveys

When it comes to large populations, it is often difficult to collect data from every individual. Sampling can be used to make estimates or test hypotheses about the population. Different sampling techniques are used to create unbiased or neutral samples and draw valid conclusions.

9. Sampling techniques overview

There are four main sampling techniques. Simple random sampling randomly selects a subset of the population, so that every member has the same chance of being chosen. Stratified random sampling divides the population into subgroups called strata, based on specific characteristics like race and gender identity, and then draws a random sample from each stratum.
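These first two techniques can be sketched with pandas. The population below is made up, and the column names `person_id` and `group` are assumptions for illustration; `random_state` is fixed only so the draw is repeatable.

```python
import pandas as pd

# Hypothetical population of 100 people: 60 in group "A", 40 in group "B"
population = pd.DataFrame({
    "person_id": range(100),
    "group": ["A"] * 60 + ["B"] * 40,
})

# Simple random sampling: every row has the same chance of selection
simple_sample = population.sample(n=10, random_state=42)

# Stratified random sampling: draw 10% at random from within each stratum
stratified_sample = population.groupby("group").sample(frac=0.1, random_state=42)

print(stratified_sample["group"].value_counts())
```

Sampling the same fraction from every stratum keeps the 60/40 split of the population, whereas the simple random sample may drift from it by chance.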

10. Sampling techniques overview

Weighted sampling selects a subgroup that matches the proportions of the population demographics. Cluster sampling divides the population into smaller groups or clusters based on naturally occurring groups such as schools or cities, and then randomly selects from these clusters to form a sample.
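Cluster sampling can be sketched in a similar way. The example assumes a made-up table of students grouped by a hypothetical `school` column, and it uses the simplest variant: pick whole clusters at random and keep everyone in them. Other variants sample again within the chosen clusters.

```python
import pandas as pd

# Hypothetical students spread evenly across four schools (the clusters)
students = pd.DataFrame({
    "student_id": range(12),
    "school": ["North", "South", "East", "West"] * 3,
})

# Randomly choose two of the four schools
chosen_schools = pd.Series(students["school"].unique()).sample(n=2, random_state=0)

# Keep every student who attends one of the chosen schools
cluster_sample = students[students["school"].isin(chosen_schools)]
print(cluster_sample)
```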

11. Crosstab - a common way to analyze survey data

One way to analyze survey samples is by examining the relationship between nominal variables with the pandas function crosstab. Here's a survey taken by current and former University of Texas students.

12. Crosstab function

We can use the crosstab function to display the frequency distribution of one variable in rows and another in columns. To demonstrate, the first parameter specifies what we want on the rows, which is 'Age' and the second parameter specifies how to split the columns, which is 'Gender'. Here we have a distribution of male and female respondents by age group.
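A call along these lines might look as follows. The rows below are invented stand-ins, not the University of Texas survey itself; only the column names 'Age' and 'Gender' come from the description above.

```python
import pandas as pd

# Made-up survey rows; the real survey data is not reproduced here
survey = pd.DataFrame({
    "Age": ["18-24", "18-24", "25-34", "25-34", "25-34", "35-44"],
    "Gender": ["Female", "Male", "Female", "Female", "Male", "Male"],
})

# First argument goes on the rows, second argument splits the columns
age_by_gender = pd.crosstab(survey["Age"], survey["Gender"])
print(age_by_gender)
```

Each cell counts the respondents who fall into that combination of age group and gender, and combinations with no respondents show up as zero.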

13. Let's practice!

Now, let's test what you know!