Effect size for means

Many venture capital-backed companies receive more than one round of funding. In general, the second round is bigger than the first. Just how much of an effect does the round number have on the average funding amount? You can use Cohen's d to quantify this.

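Recall that, to calculate Cohen's d, you first need to calculate the pooled standard deviation. That is given by the equation

$$s_{pooled} = \sqrt{\frac{(n_1 - 1)\,s_1^2 + (n_2 - 1)\,s_2^2}{n_1 + n_2 - 2}}$$

where $n_1$ and $n_2$ are the sample sizes of the two groups and $s_1$ and $s_2$ are their standard deviations.

Cohen's d is then given by:

$$d = \frac{\bar{x}_1 - \bar{x}_2}{s_{pooled}}$$

where $\bar{x}_1$ and $\bar{x}_2$ are the two group means.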
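As a rough illustration of the pooled standard deviation and Cohen's d calculations, here is a minimal sketch on made-up numbers. The helper name cohens_d and the two small lists are assumptions for illustration only; they are not part of the exercise or of investments_df.

```python
import numpy as np

def cohens_d(sample_a, sample_b):
    """Cohen's d for two independent samples, using the pooled standard deviation."""
    a = np.asarray(sample_a, dtype=float)
    b = np.asarray(sample_b, dtype=float)
    n_a, n_b = a.size, b.size
    # ddof=1 gives the sample standard deviation (divides by n - 1)
    sd_a, sd_b = a.std(ddof=1), b.std(ddof=1)
    # Weight each group's variance by its degrees of freedom
    pooled_sd = np.sqrt(((n_a - 1) * sd_a ** 2 + (n_b - 1) * sd_b ** 2) / (n_a + n_b - 2))
    return (a.mean() - b.mean()) / pooled_sd

# Made-up numbers for illustration only (not taken from investments_df)
group_a = [2.0, 4.0, 6.0]
group_b = [5.0, 7.0, 9.0]
print(cohens_d(group_a, group_b))  # -1.5
```

With these values both groups have a sample standard deviation of 2, so the pooled standard deviation is also 2 and d is (4 - 7) / 2 = -1.5. The sign only reflects which group's mean is subtracted from which.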

A DataFrame of venture capital investments (investments_df) has been loaded for you, as have the packages pandas as pd, NumPy as np and stats from SciPy. The column funding_total_usd shows the total funding received in that round.

This exercise is part of the course

Foundations of Inference in Python

Exercise instructions

Filter investments_df to select funding_rounds 1 and 2 separately.
Calculate the standard deviation and sample size of each round.
Calculate the pooled standard deviation between the two rounds.
Calculate Cohen's d using the terms you just calculated.

Hands-on interactive exercise

Have a go at this exercise by completing this sample code.

# Select all investments from rounds 1 and 2 separately
round1_df = investments_df[____['funding_rounds'] == ____]
round2_df = investments_df[____['funding_rounds'] == ____]

# Calculate the standard deviation of each round and the number of companies in each round
round1_sd = ____.std()
round2_sd = ____.std()
round1_n = ____.shape[0]
round2_n = ____.shape[0]

# Calculate the pooled standard deviation between the two rounds
pooled_sd = np.sqrt(((____ - 1) * ____ ** 2 + (____ - 1) * ____ ** 2) / (____ + ____ - 2))

# Calculate Cohen's d
d = (____.mean() - ____.mean()) / ____
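One detail worth checking when computing the standard deviations: the .std() method in pandas uses the sample formula (ddof=1) by default, while np.std in NumPy uses the population formula (ddof=0) unless told otherwise. The pooled standard deviation, with its n - 1 weights, is built from sample standard deviations. The sketch below shows the difference on a toy DataFrame; the values are hypothetical, and the column names are reused from the exercise only for familiarity.

```python
import numpy as np
import pandas as pd

# Toy data for illustration only; the real investments_df is not reproduced here
toy_df = pd.DataFrame({
    "funding_rounds": [1, 1, 1, 2, 2, 2],
    "funding_total_usd": [1.0, 2.0, 3.0, 4.0, 6.0, 8.0],
})

# Funding amounts for round 1 only
round1 = toy_df[toy_df["funding_rounds"] == 1]["funding_total_usd"]

pandas_sd = round1.std()                             # pandas default: ddof=1 (sample SD)
numpy_default_sd = np.std(round1.to_numpy())         # NumPy default: ddof=0 (population SD)
numpy_sample_sd = np.std(round1.to_numpy(), ddof=1)  # matches the pandas default

print(pandas_sd, numpy_default_sd, numpy_sample_sd)
```

The pandas result and the ddof=1 NumPy result agree, while the NumPy default is smaller. If you mix the two libraries, passing ddof=1 explicitly keeps the calculation consistent.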


In this chapter, we'll explore the relationship between samples and statistically justifiable conclusions. Choosing a sample is the basis of making sound statistical decisions, and we’ll explore how the choice of a sample affects the outcome of your inference.

Exercise 1: Statistical inference and random sampling Exercise 2: Sampling and point estimates Exercise 3: Repeated sampling, point estimates and inference Exercise 4: Sampling and bias Exercise 5: Visualizing samples Exercise 6: Inference and bias Exercise 7: Confidence intervals and sampling Exercise 8: Normal sampling distributions Exercise 9: Calculating confidence intervals Exercise 10: Drawing conclusions from samples

Learn all about applying normality tests, correlation tests, and parametric and non-parametric tests for sound inference. Hypothesis tests are tools, and choosing the right tool for the job is critical for statistical decision-making. While you may be familiar with some of these tests from introductory courses, you'll go deeper to enhance your inferential toolkit in this chapter.

Exercise 1: Normality tests Exercise 2: Testing for normality Exercise 3: Distribution of errors Exercise 4: Fitting a normal distribution Exercise 5: Correlation tests Exercise 6: Testing for correlation Exercise 7: Autocorrelation Exercise 8: Explained variance Exercise 9: Parametric tests Exercise 10: Equal variance Exercise 11: Normality of groups Exercise 12: ANOVA Exercise 13: Non-parametric tests Exercise 14: Comparing rankings Exercise 15: Comparing medians

In this chapter, you'll measure and interpret effect size in various situations, encounter the multiple comparisons problem, and explore the power of a test in depth. While p-values tell you if a significant effect is present, they don't tell you how strong that effect is. Effect size measures how strong an effect a treatment has. Master the factors underpinning effect size in this chapter.

Exercise 1: Effect size Exercise 2: Effect size for means (current exercise) Exercise 3: Effect size for correlations Exercise 4: Effect size for categorical variables Exercise 5: Multiple comparisons and corrections Exercise 6: Multiple comparisons problem Exercise 7: Bonferroni-Holm correction Exercise 8: Power of a test Exercise 9: What is power anyway? Exercise 10: Power for experimental design Exercise 11: Computing power and sample sizes

You’ll expand your inferential statistics toolkit further with a look at bootstrapping, permutation tests, and methods of combining evidence from p-values. Bootstrapping will provide you with a first look at statistical simulation. In the lesson on meta-analysis, you’ll learn all about combining results from multiple studies. You’ll end with a look at permutation tests, a powerful and flexible non-parametric statistical tool.

Exercise 1: Bootstrapping Exercise 2: Bootstrap confidence intervals Exercise 3: Bootstrapping vs. normality Exercise 4: Combining evidence from p-values Exercise 5: Fisher's method in SciPy Exercise 6: Inference using Fisher's method Exercise 7: Summarizing Fisher's method Exercise 8: Permutation tests Exercise 9: Permutation tests for correlations Exercise 10: Permutation tests and bootstrapping Exercise 11: Analyzing skewed data with a permutation test Exercise 12: Course wrap-up video