Exercise

Double trouble

The CEO at your startup has been very happy with the previous data engineering solution you created, which eliminates duplicates in a directory tree holding terabytes of data. You have now been given a similar task: finding all of the .csv files in your company's data lake. These files will later need to be moved to a specific directory for a machine learning task. Your code could save hours of time if it performs as expected.

In this exercise, you will search the directory test_dir for files that match a specific pattern. The os module has already been imported for you.

Instructions
100 XP
  • Walk the file system starting at test_dir.
  • Create the full path to each file by using os.path.join().
  • Match the .csv extension using the os.path.splitext() function and append matches to a list.
  • Print the matches you find.
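
The steps above can be sketched roughly as follows. This is one possible approach, not the exercise's official solution: the function name find_csv_files is hypothetical, and lowercasing the extension before comparing (so that .CSV also matches) is an assumption the exercise does not state.

```python
import os


def find_csv_files(root):
    """Return the full paths of all .csv files found under root.

    Hypothetical helper for illustration; the name is not part of the exercise.
    """
    matches = []
    # os.walk yields (directory path, subdirectory names, file names)
    # for root and every directory beneath it.
    for dirpath, dirnames, filenames in os.walk(root):
        for name in filenames:
            # Build the full path to the file.
            full_path = os.path.join(dirpath, name)
            # splitext returns (path without extension, extension incl. the dot).
            _, ext = os.path.splitext(full_path)
            # Assumption: compare case-insensitively so ".CSV" also counts.
            if ext.lower() == ".csv":
                matches.append(full_path)
    return matches


if __name__ == "__main__":
    # test_dir is the directory named in the exercise.
    for match in find_csv_files("test_dir"):
        print(match)
```

Note that os.walk() silently yields nothing if the starting directory does not exist, so an empty result may mean either "no .csv files" or "wrong starting path".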