Document VQA with LayoutLM
From VQA with images to VQA with documents! In this exercise, you will use the layoutlm-document-qa model to determine how much formal training was provided to employees in 2012-2013 from a document image.
The dataset (dataset) has been loaded and the pipeline function (pipeline) has been imported.
This exercise is part of the course Multi-Modal Models with Hugging Face.
Exercise instructions
- Load the pipeline using the 'document-question-answering' task and the 'impira/layoutlm-document-qa' checkpoint.
- Process the document in data point 61 of the test set from dataset with an appropriate prompt to find how many days of formal training were provided to employees in 2012-2013.
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# Load the document-question-answering pipeline with the pretrained model
pipe = ____
# Process datapoint 61 to find the amount of training days
result = ____(dataset["____"][61]["____"], "____")
print(result)
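
One possible way to fill in the blanks is sketched below; it is not the official solution. The task name, checkpoint, "test" split, and index 61 come from the instructions above. Several things are assumptions: the "image" field name, the wording of the question, and the shape of the pipeline's return value (a dict, or a list of dicts, with "answer" and "score" keys). The helper names load_document_qa, best_answer, and training_days_answer are hypothetical. Without precomputed word boxes, this pipeline typically also needs pytesseract and a Tesseract install for OCR.

```python
from typing import Any, Callable


def load_document_qa() -> Callable[..., Any]:
    """Build the pipeline with the task and checkpoint named in the instructions."""
    # Imported lazily so the helpers below can be used without transformers installed.
    from transformers import pipeline

    return pipeline(
        task="document-question-answering",
        model="impira/layoutlm-document-qa",
    )


def best_answer(pipe: Callable[..., Any], image: Any, question: str) -> str:
    """Ask one question about one document image and return the top-scoring answer."""
    result = pipe(image=image, question=question)
    # Assumption: the result is a dict or a list of dicts with "answer" and "score".
    candidates = [result] if isinstance(result, dict) else list(result)
    return max(candidates, key=lambda c: c["score"])["answer"]


def training_days_answer(
    pipe: Callable[..., Any],
    dataset: Any,
    index: int = 61,
    image_key: str = "image",  # assumed field name; check dataset["test"].features
) -> str:
    """Query data point `index` of the test split about formal training days."""
    image = dataset["test"][index][image_key]
    question = (
        "How many days of formal training were provided to employees in 2012-2013?"
    )
    return best_answer(pipe, image, question)
```

In the exercise environment, where dataset is already loaded, this could be called as print(training_days_answer(load_document_qa(), dataset)). Passing the pipeline in as an argument keeps the model download separate from the query logic, so the helpers can be checked with a stand-in callable first.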