Parsing PDF files

You now need to work on another small project you have been putting off. Your company gave you some PDF files of signed contracts. The goal of the project is to build a database from the information you parse out of them. Three of its columns should hold the day, month, and year when each contract was signed.
The dates appear as Signed on 05/24/2016 (05 is the month, 24 the day, and 2016 the year). You decide to use capturing groups to extract each part, so that you can store the month, day, and year in separate variables.
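To illustrate how capturing groups split a date into parts, here is a minimal sketch on a single hard-coded string. The variable name sample and the use of \d{2} and \d{4} are assumptions made for illustration (they presume a two-digit month and day and a four-digit year); the exercise itself does not prescribe them.

```python
import re

# Illustrative input only; not taken from a real contract.
sample = "Signed on 05/24/2016"

# Each pair of parentheses is a capturing group: month, day, year.
match = re.search(r"Signed\son\s(\d{2})/(\d{2})/(\d{4})", sample)

# groups() returns the captured substrings in the order the groups appear.
month, day, year = match.groups()
print(month, day, year)
```

Because the groups are returned in pattern order, the first value is the month and the second is the day, matching the MM/DD/YYYY layout described above.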

You decide to do a proof of concept.

The variable contract, which contains the text of one contract, and the re module are already loaded in your session. You can use print() to view the data in the IPython Shell.

This exercise is part of the course

Regular Expressions in Python

Hands-on interactive exercise

Have a go at this exercise by completing this sample code.

# Write regex and scan contract to capture the dates described
regex_dates = r"____\s____\s(____)/(____)/(____)"
dates = re.____(____, ____)
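One possible way to complete the proof of concept is sketched below. It is not the official solution: the contract text is a made-up stand-in for the preloaded contract variable, the names signature_month, signature_day, and signature_year are hypothetical, and the pattern assumes a two-digit month and day followed by a four-digit year.

```python
import re

# Hypothetical stand-in for the preloaded `contract` variable.
contract = (
    "Provider will invoice Client for Services performed within 30 days. "
    "Signed on 05/24/2016. Renewal terms apply after 12 months."
)

# Three capturing groups: month, day, year (MM/DD/YYYY is assumed).
regex_dates = r"Signed\son\s(\d{2})/(\d{2})/(\d{4})"

# With several groups, findall returns a list of tuples, one per match.
dates = re.findall(regex_dates, contract)

# Store each part of the first match separately, converted to integers.
signature_month, signature_day, signature_year = (int(part) for part in dates[0])
print(signature_month, signature_day, signature_year)
```

Using re.findall rather than re.search keeps the sketch usable if a contract contains more than one signature line, since every match comes back as its own tuple.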