Performing calculations with pandas
Now, you've been provided with a CSV file called sales.csv containing sales data with three columns: "user_id", "date", and "order_value".
Using pandas, you'll read in the file and calculate statistics about sales values.
Just as you can look up a dictionary value by its key, e.g., dictionary["key_name"], you can use the same square-bracket syntax in pandas to select a column from a DataFrame. The package also provides methods for performing calculations on DataFrames or on subsets of them, such as individual columns.
Examples of this syntax include df["column_name"].mean() and df["column_name"].sum(), which calculate the average and the total for a given column, respectively.
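To make the column syntax concrete, here is a minimal sketch on a small DataFrame built in memory. The names df, "item", and "price" and the values are made up for illustration and are not the contents of sales.csv.

```python
import pandas as pd

# Hypothetical data for illustration only; not taken from sales.csv
df = pd.DataFrame({"item": ["a", "b", "c"], "price": [10.0, 20.0, 30.0]})

# Select one column with square brackets, much like a dictionary key lookup
prices = df["price"]

# Methods called on the selected column operate on its values
average_price = prices.mean()
total_price = prices.sum()

print(average_price)  # 20.0
print(total_price)    # 60.0
```

Selecting a single column this way returns a pandas Series, which is why methods such as .mean() and .sum() can be chained directly onto the square brackets.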
This exercise is part of the course
Intermediate Python for Developers
Exercise instructions
- Read in "sales.csv", saving as a pandas DataFrame called sales_df.
- Subset sales_df on the "order_value" column, then call the .mean() method to find the average order value.
- Subset sales_df on the "order_value" column, then call the .sum() method to find the total value of all orders.
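The same read-then-aggregate pattern can be sketched on an unrelated dataset. This example assumes a hypothetical CSV with "product" and "quantity" columns, supplied as in-memory text so it runs without a file on disk. pd.read_csv accepts a file path in the same position as the in-memory buffer used here.

```python
import io

import pandas as pd

# Hypothetical CSV content; stands in for a file on disk
csv_text = "product,quantity\npen,3\nbook,5\nbag,4\n"

# read_csv accepts a file path or a file-like object such as StringIO
stock_df = pd.read_csv(io.StringIO(csv_text))

# Select the column, then aggregate it
mean_quantity = stock_df["quantity"].mean()
total_quantity = stock_df["quantity"].sum()

print(mean_quantity)   # 4.0
print(total_quantity)  # 12
```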
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# Read in sales.csv
sales_df = ____.____("____")
# Print the mean order_value
print(sales_df["____"].____())
# Print the total value of sales
print(sales_df["____"].____())