This document explains the PDF Reader node, which extracts text content from PDF files.

Node Inputs

Required Fields

  • PDF File Name: Upload a PDF or select an existing file from storage

Optional Fields

  • Use Link: Enable to read the PDF from a URL
  • Use Advanced PDF Reading: If enabled, reads the content from a PDF file in a structured way useful for LLMs.

Note: Advanced PDF Reading will add 5 credits to the cost of this node.

  • Specify Pages: Select specific pages to read
  • Split PDF Content by Page: Get page-by-page output in a list format. Each item in the list is a single page from the PDF.

Node Output

  • PDF Contents: Extracted content, either as a single text output or, when Split PDF Content by Page is enabled, as a list with one item per page
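
To make the two output shapes concrete, here is a minimal Python sketch of how Specify Pages and Split PDF Content by Page could affect PDF Contents. It is an illustration only, not the node's implementation: the function name pdf_contents, its parameters, the 1-based page numbering, and the newline used to join pages are all assumptions.

```python
from typing import List, Optional, Union


def pdf_contents(
    pages: List[str],
    specify_pages: Optional[List[int]] = None,
    split_by_page: bool = False,
) -> Union[str, List[str]]:
    """Hypothetical model of the node's output shape (not the real node)."""
    # Assumption: page numbers are 1-based; out-of-range numbers are skipped.
    if specify_pages is not None:
        selected = [pages[n - 1] for n in specify_pages if 1 <= n <= len(pages)]
    else:
        selected = list(pages)

    # With splitting enabled, each list item is the text of a single page.
    if split_by_page:
        return selected

    # Otherwise the selected pages are returned as one block of text.
    # Assumption: pages are joined with a newline.
    return "\n".join(selected)


# Already-extracted page text stands in for a real PDF in this sketch.
example_pages = ["Intro", "Methods", "Results"]
whole_document = pdf_contents(example_pages)
two_pages = pdf_contents(example_pages, specify_pages=[1, 3], split_by_page=True)
```

In this sketch, whole_document is a single string, while two_pages is a list containing only the first and third pages. Downstream steps that expect a list should therefore only be connected when Split PDF Content by Page is enabled.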

Node Functionality

The PDF Reader node:

  • Extracts text from PDFs
  • Preserves formatting
  • Handles page selection
  • Works with URLs
  • Supports batch processing

Important Considerations

  1. URLs must be accessible if the ‘Use Link’ option is enabled
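
One way to check a link before passing it to the node is a quick reachability test. The sketch below uses only the Python standard library; the helper names has_http_scheme and is_reachable are hypothetical, and the check is an assumption about what "accessible" means here (a public http or https URL that answers without an error). Some servers reject HEAD requests, so a False result is a hint rather than proof that the node would fail.

```python
import urllib.error
import urllib.request
from urllib.parse import urlparse


def has_http_scheme(url: str) -> bool:
    """Return True if the URL is an http(s) URL with a host part."""
    parts = urlparse(url)
    return parts.scheme in ("http", "https") and bool(parts.netloc)


def is_reachable(url: str, timeout: float = 10.0) -> bool:
    """Hypothetical pre-check: send a HEAD request and accept 2xx/3xx."""
    if not has_http_scheme(url):
        return False
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return 200 <= response.status < 400
    except (urllib.error.URLError, OSError, ValueError):
        # Covers HTTP errors (4xx/5xx), DNS failures, timeouts, bad URLs.
        return False
```

A link that requires a login or sits behind a private network would typically fail this kind of check, which is a common reason a URL is not accessible to an external service.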

In summary, the PDF Reader node helps extract and organize text content from PDF files with flexible options for page selection and output format.