Document question and answering
Document question and answering is a multimodal ML task that analyzes an image of a document, such as a contract, extracts its text, and answers a question asked about that text. This is useful when many scanned documents, such as financial records, need to be searched.
Build a pipeline for document question and answering, then ask the pre-loaded question "Which meeting is this document about?".
`pipeline` from the `transformers` library and the `question` are already loaded for you. Note that we are using our own `pipeline` and `dqa` functions to enable you to learn how to use these functions without some of the extra setup. Please visit the Hugging Face documentation to dive deeper.
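Outside the course environment, where nothing is preloaded, the setup might look roughly like the sketch below. It assumes the standard `pipeline` function from `transformers` and the task name and checkpoint used in this exercise. The helper names `build_dqa` and `ask` are hypothetical, introduced only for illustration. Depending on the checkpoint, extra packages (for example `sentencepiece`, or an OCR backend for non-Donut models) may also be needed.

```python
# Task name and checkpoint come from the exercise; the helpers are illustrative.
TASK = "document-question-answering"
MODEL_NAME = "naver-clova-ix/donut-base-finetuned-docvqa"


def build_dqa(model_name: str = MODEL_NAME):
    """Create a document question answering pipeline (hypothetical helper)."""
    # Imported lazily so that defining this helper does not require transformers.
    from transformers import pipeline

    # The first call downloads the checkpoint from the Hugging Face Hub.
    return pipeline(task=TASK, model=model_name)


def ask(dqa, image_path: str, question: str):
    """Run the pipeline on one document image and return its raw output."""
    return dqa(image=image_path, question=question)
```

With these helpers, `ask(build_dqa(), "document.png", question)` would run the model on the exercise image. Building the pipeline once and reusing it avoids reloading the model for every question.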
This is a part of the course “Working with Hugging Face”.
Exercise instructions
- Create a pipeline for `document-question-answering` and save as `dqa`.
- Save the path to the image, `document.png`, as `image`.
- Get the answer for the `question` of the `image` using the `dqa` pipeline and save as `results`.
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# Create the pipeline
____ = ____(____="____", model="naver-clova-ix/donut-base-finetuned-docvqa")
# Set the image and question
____ = "____"
question = "Which meeting is this document about?"
# Get the answer
____ = ____(image=____, question=____)
print(results)
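Printing `results` shows the raw pipeline output. As a minimal sketch of how the answer text could be pulled out, the hypothetical helper below assumes the output is a list of dictionaries with an `answer` key and an optional `score` key, which is the shape the standard `transformers` pipeline typically returns. The course's own `dqa` wrapper may differ, so treat the shape as an assumption.

```python
from typing import Any, Optional


def best_answer(results: Any) -> Optional[str]:
    """Return the top answer string from a pipeline result, or None if absent."""
    # Accept either a single dict or a list of dicts (assumed output shapes).
    if isinstance(results, dict):
        results = [results]
    if not results:
        return None
    # Entries without a score (assumed possible for generative models) rank as 0.
    top = max(results, key=lambda r: r.get("score", 0.0))
    return top.get("answer")
```

Calling `best_answer(results)` after the exercise code would then give just the answer string rather than the full structure.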