HTML document loaders
LangChain document loaders can read many different formats, including complex ones like HTML.
In this exercise, you'll load an HTML file containing a White House executive order.
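To make "loading" an HTML file concrete, here is a rough standard-library sketch of the idea: strip the markup, keep the visible text, and record where the text came from. It is an illustration only and not how UnstructuredHTMLLoader is implemented. The names TextExtractor and html_to_document and the sample HTML string are hypothetical; the page_content and metadata keys are chosen to mirror the attributes of a LangChain Document.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect visible text, skipping the contents of <script> and <style>."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0  # > 0 while inside a <script> or <style> element
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and self._skip_depth == 0:
            self.chunks.append(text)


def html_to_document(html, source):
    """Return a dict holding the extracted text and its origin (hypothetical helper)."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return {
        "page_content": "\n".join(parser.chunks),
        "metadata": {"source": source},
    }


# Made-up sample input, standing in for a real HTML file
sample = (
    "<html><head><style>p {color: red}</style></head>"
    "<body><h1>Executive Order</h1><p>Section 1. Purpose.</p></body></html>"
)
doc = html_to_document(sample, source="sample.html")
print(doc["page_content"])
print(doc["metadata"])
```

A real loader handles far more (malformed markup, encodings, element types), which is why the exercise relies on a library class rather than hand-written parsing.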
This exercise is part of the course Developing LLM Applications with LangChain.
Exercise instructions
- Use the UnstructuredHTMLLoader class to load the white_house_executive_order_nov_2023.html file in the current directory.
- Load the documents into memory.
- Print the first document.
- Print the first document's metadata.
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
from langchain_community.document_loaders import UnstructuredHTMLLoader
# Create a document loader for unstructured HTML
loader = ____
# Load the document
data = ____
# Print the first document
print(____)
# Print the first document's metadata
print(____)
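One possible way to fill in the blanks is sketched below, wrapped in functions so the steps can be reused. It assumes the langchain_community and unstructured packages are installed and that the HTML file sits in the current directory. The function names load_html_documents and show_first_document are hypothetical and not part of the exercise.

```python
def load_html_documents(file_path):
    """Load an HTML file with UnstructuredHTMLLoader and return its documents."""
    # Imported inside the function so the dependency is only needed when called
    from langchain_community.document_loaders import UnstructuredHTMLLoader

    # Create a document loader for unstructured HTML
    loader = UnstructuredHTMLLoader(file_path)
    # Load the document(s) into memory as a list
    return loader.load()


def show_first_document(file_path):
    """Print the first loaded document and its metadata, then return all documents."""
    data = load_html_documents(file_path)
    # Print the first document
    print(data[0])
    # Print the first document's metadata
    print(data[0].metadata)
    return data
```

Calling show_first_document("white_house_executive_order_nov_2023.html") should print the document's text followed by its metadata dictionary, which typically records the source file path.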