Question answering
1. Question answering
Let's explore question answering, where machines read and understand a passage, then answer questions about it.

2. Extractive versus abstractive QA
There are two main types of QA: extractive and abstractive. In extractive QA, the answer is a span of text copied directly from the passage, while in abstractive QA, the model generates a new, natural-sounding answer that may paraphrase or summarize the information. For instance, given a context about a library schedule and a question asking when the library closes on weekdays, an extractive answer would select the exact span from the text: 6 PM, in this case. An abstractive answer would generate a complete sentence that includes this information.

3. Extractive versus abstractive QA
Extractive QA is commonly used in search engines, document retrieval systems, and reading comprehension apps, where the goal is to locate precise information in large bodies of text. Since it extracts exact spans from the source, it's generally more accurate and less prone to errors. Abstractive QA is useful for conversational agents, virtual assistants, and customer support bots, where users expect clear and natural answers instead of raw text fragments. However, because it generates answers in its own words, it can sometimes introduce errors, especially when handling long documents.

4. Extractive QA
To do extractive QA in code, we import the pipeline function and create a pipeline with the task set to "question-answering" along with a suitable model. We define a context, in this case a text about the Amazon rainforest in which the model is expected to search for answers, and a question like "Which countries does the Amazon rainforest cover?" We then pass both the question and context to the pipeline. The model returns a dictionary with a score showing the model's confidence in its answer, the start and end positions indicating where the answer appears in the original text, and the predicted answer: a span of text pulled directly from the context.
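The steps above can be sketched as follows. The checkpoint name and the exact wording of the context are assumptions for illustration; any extractive QA model fine-tuned on SQuAD-style data would work.

```python
from transformers import pipeline

# Extractive QA pipeline; "deepset/roberta-base-squad2" is an assumed
# example checkpoint, not the one used in the video.
qa_pipeline = pipeline(
    task="question-answering",
    model="deepset/roberta-base-squad2",
)

# Hypothetical context the model will search for answers
context = (
    "The Amazon rainforest spans nine countries, including Brazil, "
    "Peru, and Colombia, and hosts millions of plant and animal species."
)
question = "Which countries does the Amazon rainforest cover?"

# Pass both question and context to the pipeline
result = qa_pipeline(question=question, context=context)

# result is a dictionary with the confidence score, the start/end
# character positions, and the answer span copied from the context
print(result["score"], result["start"], result["end"])
print(result["answer"])
```

The returned answer is always a substring of the context, which is what makes this approach extractive.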
5. Abstractive QA

To perform abstractive QA, we use the pipeline function again, but this time with the task set to "text2text-generation" and a suitable model. We define the same context and question as before, but now we wrap them into a single input string using the format "question:" followed by the actual question, and "context:" followed by the actual context. We build this string using an f-string, a Python feature that makes it easy to insert variables into text. We pass this input_text to the qa_pipeline and receive a list of one dictionary containing the generated text. Notice that this time, the answer is rephrased as a full sentence. This makes it easier for users to understand, especially in scenarios like voice assistants, chatbots, or educational tools where fluency matters.
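A minimal sketch of the abstractive approach, assuming a T5-style checkpoint such as "google/flan-t5-base" (the video's actual model is not specified):

```python
from transformers import pipeline

# Abstractive QA via text-to-text generation; the checkpoint is an
# assumed example, not necessarily the one from the video.
qa_pipeline = pipeline(
    task="text2text-generation",
    model="google/flan-t5-base",
)

context = (
    "The Amazon rainforest spans nine countries, including Brazil, "
    "Peru, and Colombia, and hosts millions of plant and animal species."
)
question = "Which countries does the Amazon rainforest cover?"

# Wrap question and context into one prompt using an f-string
input_text = f"question: {question} context: {context}"

# The pipeline returns a list with one dictionary of generated text
result = qa_pipeline(input_text)
print(result[0]["generated_text"])
```

Unlike the extractive pipeline, the output here is newly generated text, so it may be phrased as a full sentence rather than a raw span from the context.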
Let's practice both methods to see when each one fits best!