PDF Read

Reads a PDF file provided as a base64-encoded string, validates its size, and outputs the base64 content for use in other workflow nodes.

Node Type: Action
Category: File Upload
Icon: FileText

Overview

The PDF Read node is an action node that accepts a base64-encoded PDF file, validates its size against the 5MB limit, and outputs the base64 content for further processing in the workflow. Typical uses include document analysis, content extraction, and workflow-based document management.

Key Features

  • File Validation: Validates PDF file size against the 5MB limit
  • Base64 Processing: Accepts and outputs base64-encoded PDF content
  • Blob Management: Integrates with the blob service for file storage
  • Error Handling: Reports errors for file processing failures
  • Metadata Extraction: Extracts the filename and blob ID for reference
  • Workflow Integration: Output can be consumed by other workflow nodes

Prerequisites

File Upload Requirements

Must have a valid PDF file to process

  • The PDF file must be uploaded through the file upload interface
  • The file must be a valid PDF (.pdf)
  • The file size must not exceed the 5MB limit
  • The file must be accessible and not corrupted
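
The size requirement above can be checked before a file is ever encoded. The following is a minimal sketch, not the node's own implementation: the function name `encode_pdf_for_node` is hypothetical, and it assumes that "5MB" means 5 * 1024 * 1024 bytes, which this page does not specify.

```python
import base64

# Assumption: "5MB" means 5 * 1024 * 1024 bytes; the node may count differently.
MAX_PDF_BYTES = 5 * 1024 * 1024


def encode_pdf_for_node(data: bytes, max_bytes: int = MAX_PDF_BYTES) -> str:
    """Return base64 text for the PDF bytes, or raise ValueError if too large.

    Hypothetical helper for preparing input; not part of the PDF Read node.
    """
    if len(data) > max_bytes:
        raise ValueError(f"PDF is {len(data)} bytes; limit is {max_bytes} bytes")
    # b64encode returns bytes; decode to str so the result is a plain string.
    return base64.b64encode(data).decode("ascii")
```

Checking the raw byte count first avoids encoding a file that would be rejected anyway.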

PDF Document Requirements

Understanding of PDF processing capabilities

  • The PDF must be readable and not password-protected
  • The document should contain extractable text content
  • Complex layouts may require additional processing
  • Scanned PDFs may need OCR before text can be extracted
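
A password-protected PDF can often be spotted before upload. The sketch below is a heuristic only, under the assumption that an encrypted PDF declares an `/Encrypt` entry in its trailer or cross-reference stream dictionary. The function name `may_be_encrypted` is hypothetical, and the check can produce false positives if the same byte sequence appears elsewhere in the file.

```python
def may_be_encrypted(data: bytes) -> bool:
    """Heuristic check for PDF encryption.

    Encrypted PDFs reference an encryption dictionary through an /Encrypt
    entry. This scans the raw bytes for that name; it does not parse the
    file, so treat a True result as "inspect further", not as proof.
    """
    return b"/Encrypt" in data
```

For a definitive answer, open the file with a PDF library and inspect its encryption status instead.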

Technical Requirements

System capabilities needed for PDF processing

  • Blob service access for file storage and retrieval
  • File size validation and processing capabilities
  • Base64 encoding/decoding support
  • Error handling for file processing failures

Node Configuration

Required Fields

PDF File

Type: file_upload_pdf
Required: Yes
Value Type: number (blob ID)

The PDF file to be processed. Upload a PDF file through the file upload interface. The file will be validated for size and format before processing.

Examples & Use Cases

Document Analysis Workflow

Extract and analyze PDF content with AI

Workflow Structure

📄 PDF Read → 🤖 LLM Analysis → 📊 Generate Summary → 📧 Email Report

Read PDF document, analyze content with AI for key insights, generate a summary, and email to stakeholders.

Document Processing Pipeline

Process PDFs for content extraction and storage

Workflow Structure

📄 PDF Read → 🔍 Extract Text → 📝 Store in Database → ✅ Confirm Processing

Read PDF content, extract text data, store in database or knowledge base, and confirm successful processing.

Automated Document Review

Automatically review and categorize PDF documents

Workflow Structure

📄 PDF Read → 🤖 AI Classification → 📁 Move to Folder → 📧 Notify Team

Read PDF, use AI to classify document type, automatically organize into appropriate folders, and notify relevant team members.

Best Practices

Do's

  • Validate file size before processing to avoid errors
  • Use the blob ID for referencing files in subsequent nodes
  • Handle file processing errors gracefully
  • Store original filename for reference and organization
  • Consider file size limits when designing workflows
  • Use appropriate error handling for file access issues

Don'ts

  • Don't process files larger than 5MB without splitting
  • Avoid processing password-protected PDFs
  • Don't forget to handle blob service connection errors
  • Avoid hardcoding file paths or blob IDs
  • Don't process sensitive files without proper security measures
  • Avoid exceeding system memory limits with large files
💡 Pro Tip: Use the blob ID output to reference the file in other nodes such as LLM analysis or document conversion. For large documents, check the file size before processing.
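
When the PDF is already held as a base64 string, its decoded size can be computed from the string length alone, without decoding. This is a minimal sketch with hypothetical helper names (`decoded_size`, `fits_limit`); it assumes standard padded base64 with no embedded whitespace, and that "5MB" means 5 * 1024 * 1024 bytes.

```python
import base64

# Assumption: "5MB" means 5 * 1024 * 1024 bytes.
MAX_PDF_BYTES = 5 * 1024 * 1024


def decoded_size(b64_text: str) -> int:
    """Bytes that a padded, whitespace-free base64 string decodes to."""
    if len(b64_text) % 4 != 0:
        raise ValueError("expected padded base64 with a length divisible by 4")
    padding = len(b64_text) - len(b64_text.rstrip("="))
    if padding > 2:
        raise ValueError("base64 text has more than two padding characters")
    # Every 4 characters carry 3 bytes; each '=' removes one byte.
    return len(b64_text) // 4 * 3 - padding


def fits_limit(b64_text: str, max_bytes: int = MAX_PDF_BYTES) -> bool:
    """True if the decoded payload would be within max_bytes."""
    return decoded_size(b64_text) <= max_bytes


# Usage: check a small payload without decoding it.
sample = base64.b64encode(b"%PDF-1.7 example").decode("ascii")
print(decoded_size(sample), fits_limit(sample))
```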

Troubleshooting

Common Issues

File Size Exceeded

Symptoms: The node fails with a file-size-exceeded error

Solution: Ensure the PDF file is under 5MB. For larger files, consider splitting the document or using a different processing approach.

Invalid PDF Format

Symptoms: The node fails with an invalid-file-format error

Solution: Verify the uploaded file is a valid PDF format. Check that the file is not corrupted and can be opened in a PDF viewer.
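
A quick structural check can catch files that are not PDFs at all, or that were truncated in transit. The sketch below uses a hypothetical helper, `looks_like_pdf`, and relies on two PDF conventions: the file starts with a `%PDF-` header and carries an `%%EOF` marker near its end. It is a sanity check only and does not prove the document is well-formed.

```python
def looks_like_pdf(data: bytes) -> bool:
    """Cheap sanity check: PDF header at the start, %%EOF marker near the end."""
    has_header = data.startswith(b"%PDF-")
    # The end-of-file marker normally sits in the last few bytes; scanning the
    # final 1024 bytes tolerates trailing newlines or padding.
    has_eof_marker = b"%%EOF" in data[-1024:]
    return has_header and has_eof_marker
```

A file that fails this check is worth opening in a PDF viewer before uploading it again.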

Blob Service Connection Error

Symptoms: The node fails with a blob service connection error

Solution: Check that the blob service is properly configured and accessible. Verify network connectivity and service credentials.
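
Transient connection failures are often handled by retrying with exponential backoff. The following is a generic sketch, not the node's behavior: `with_retries` is a hypothetical helper, and it assumes the failing call raises Python's built-in `ConnectionError`. A real blob client may raise a different exception type.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def with_retries(
    operation: Callable[[], T],
    attempts: int = 3,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call operation, retrying on ConnectionError with exponential backoff."""
    for attempt in range(attempts):
        try:
            return operation()
        except ConnectionError:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the original error
            # Wait base_delay, then 2x, 4x, ... before the next attempt.
            sleep(base_delay * 2 ** attempt)
    raise ValueError("attempts must be at least 1")
```

The `sleep` parameter is injectable so the backoff schedule can be tested without real delays.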

File Not Found

Symptoms: The node fails with a file-not-found error

Solution: Ensure the PDF file was properly uploaded and the blob ID is valid. Check that the file hasn't been deleted or moved.

Memory Issues with Large Files

Symptoms: Node times out or fails with memory errors

Solution: Large PDF files can exhaust available memory. Process smaller files, or implement chunked processing for very large documents.
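
One way to keep memory use flat is to decode base64 content in fixed-size chunks instead of all at once. This is a sketch under stated assumptions: `decode_base64_stream` is a hypothetical helper, the input is continuous base64 text with no line breaks, and the source returns full chunks until the end of input (true for in-memory text and ordinary files).

```python
import base64
import io
from typing import BinaryIO, TextIO


def decode_base64_stream(
    source: TextIO, sink: BinaryIO, chunk_chars: int = 64 * 1024
) -> int:
    """Decode base64 text from source into sink chunk by chunk.

    Returns the number of bytes written. chunk_chars must be a multiple of 4
    so that every chunk is a complete group of base64 characters.
    """
    if chunk_chars <= 0 or chunk_chars % 4 != 0:
        raise ValueError("chunk_chars must be a positive multiple of 4")
    written = 0
    while True:
        chunk = source.read(chunk_chars)
        if not chunk:
            break
        # validate=True rejects characters outside the base64 alphabet.
        data = base64.b64decode(chunk, validate=True)
        sink.write(data)
        written += len(data)
    return written


# Usage: decode from an in-memory string into an in-memory buffer.
encoded = base64.b64encode(b"%PDF-1.7 example").decode("ascii")
buffer = io.BytesIO()
decode_base64_stream(io.StringIO(encoded), buffer, chunk_chars=8)
```

Only one chunk is held in memory at a time, so peak usage depends on `chunk_chars` and not on the size of the document.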
