Exercise

Finding sources of weather delays

The final step in this case study is to use the persisted_weather_delays Dask DataFrame to determine the percentage of delayed flights per weather event.

Your job is to count the delayed flights for each weather event and divide by the total number of delayed flights. You'll then use .nlargest(5) to retrieve the five events that contribute most to the number of delayed flights. Finally, you'll compute the average delay length per weather event and retrieve the five largest averages.

Instructions
  • Count the by_event['WEATHER_DELAY'] column, divide by the total number of delayed flights in persisted_weather_delays, multiply by 100, and assign the result to pct_delayed.
  • Compute & print the five largest values of pct_delayed with .nlargest(5).
  • Calculate the mean of the by_event['WEATHER_DELAY'] column, keep the 5 largest entries with .nlargest(5), and assign the result to avg_delay_time.
  • Compute & print avg_delay_time.