Exercise

Sweet pickle

"It was the best of times, it was the worst of times...", wrote Charles Dickens in A Tale of Two Cities. He could just as well have been talking about your startup. Initially things were amazing, and you and your co-workers laughed in delight as the CTO churned out machine learning models by the dozen every day. Often this happened at 2 AM, and you would arrive in the morning to find the serialized sklearn models waiting for the Data Science team to deploy to production.

Unfortunately, this was in fact too good to be true. Many of the models had serious flaws, and this ultimately led to the CTO stepping down. IT auditors now want to determine how flawed these ML models were and backtest their predictions for accuracy.

Use the os.walk() function from the os module to find the serialized models and test them for accuracy.

Instructions
  • Walk the file system path using os.walk().
  • Look for files with the .joblib extension and load each model into clf using joblib's load() function.
  • Use the unpickled model to make predictions by calling clf.predict() with the input data X_digits (X_digits is already in memory).
  • Print your predictions.
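
The steps above can be sketched roughly as follows. This is one possible approach, not the exercise's official solution: the helper names find_models and predict_all are hypothetical, and the choice of root directory is an assumption, since the exercise does not say where the models live. joblib is imported inside predict_all so that the file search on its own depends only on the standard library.

```python
import os


def find_models(root, extension=".joblib"):
    """Yield the full path of every file under root ending in extension."""
    # os.walk yields a (dirpath, dirnames, filenames) tuple per directory.
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.endswith(extension):
                yield os.path.join(dirpath, name)


def predict_all(root, X):
    """Load each serialized model under root and return {path: predictions}.

    Hypothetical helper: assumes every .joblib file holds a fitted
    estimator exposing predict(), as the exercise describes.
    """
    from joblib import load  # imported here so find_models needs no joblib

    results = {}
    for path in find_models(root):
        clf = load(path)  # unpickle the serialized model
        results[path] = clf.predict(X)
    return results
```

In the exercise environment, where X_digits is already in memory, a call such as predict_all(".", X_digits) would return the predictions for each model found, ready to print. Only load .joblib files from sources you trust, because unpickling can execute arbitrary code.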