PDF Read
Read and process a PDF file provided as a base64-encoded string. Validates the file size and outputs the base64 content for use in other workflow nodes.
Node Type
Action
Category
File Upload
Icon
FileText
Overview
The PDF Read node is an action node that accepts a base64-encoded PDF file, validates its size against the configured limit, and outputs the base64 content for further processing in the workflow. Use it as the entry point for workflows that analyze documents, extract content, or manage documents.
Key Features
- File Validation: Validates PDF file size against the 5MB limit
- Base64 Processing: Handles base64-encoded PDF files
- Blob Management: Integrates with the blob service for file storage
- Error Handling: Reports errors for file processing failures
- Metadata Extraction: Extracts the filename and blob ID for reference
- Workflow Integration: Passes its output to other workflow nodes
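The file validation feature can be approximated outside the node, for example to check a payload before it enters the workflow. The following is a minimal sketch, not the node's actual implementation. It assumes the 5MB limit means 5 * 1024 * 1024 bytes of decoded PDF data; the function name `validate_pdf_base64` and the constant `MAX_PDF_BYTES` are hypothetical.

```python
import base64

# Assumption: the 5MB limit applies to the decoded file and means 5 * 1024 * 1024 bytes.
MAX_PDF_BYTES = 5 * 1024 * 1024


def validate_pdf_base64(pdf_base64: str, max_bytes: int = MAX_PDF_BYTES) -> bytes:
    """Decode a base64 PDF payload and reject it if it exceeds max_bytes."""
    try:
        # validate=True rejects characters outside the base64 alphabet
        # instead of silently discarding them.
        pdf_bytes = base64.b64decode(pdf_base64, validate=True)
    except ValueError as exc:  # binascii.Error is a subclass of ValueError
        raise ValueError("input is not valid base64") from exc

    if len(pdf_bytes) > max_bytes:
        raise ValueError(
            f"PDF is {len(pdf_bytes)} bytes; limit is {max_bytes} bytes"
        )
    return pdf_bytes
```

Raising a `ValueError` for both failure modes keeps the caller's error handling in one place; a real workflow might prefer distinct error types for invalid encoding and oversized files.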
Prerequisites
File Upload Requirements
A valid PDF file, within the 5MB size limit, to process
PDF Document Requirements
A PDF that is not corrupted or password-protected
Technical Requirements
A configured and reachable blob service for file storage
Node Configuration
Required Fields
PDF File
The PDF file to be processed. Upload a PDF file through the file upload interface. The file will be validated for size and format before processing.
Examples & Use Cases
Document Analysis Workflow
Extract and analyze PDF content with AI
Workflow Structure
Read PDF document, analyze content with AI for key insights, generate a summary, and email to stakeholders.
Document Processing Pipeline
Process PDFs for content extraction and storage
Workflow Structure
Read PDF content, extract text data, store in database or knowledge base, and confirm successful processing.
Automated Document Review
Automatically review and categorize PDF documents
Workflow Structure
Read PDF, use AI to classify document type, automatically organize into appropriate folders, and notify relevant team members.
Best Practices
Do's
- Validate file size before processing to avoid errors
- Use the blob ID for referencing files in subsequent nodes
- Handle file processing errors gracefully
- Store original filename for reference and organization
- Consider file size limits when designing workflows
- Use appropriate error handling for file access issues
Don'ts
- Don't process files larger than 5MB without splitting
- Avoid processing password-protected PDFs
- Don't forget to handle blob service connection errors
- Avoid hardcoding file paths or blob IDs
- Don't process sensitive files without proper security measures
- Avoid exceeding system memory limits with large files
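The advice to validate file size before processing can be followed without decoding the payload at all, because the decoded size follows from the length of the base64 string. The sketch below shows that arithmetic. It assumes standard padded base64 with no whitespace or line breaks, and it assumes the limit applies to the decoded file rather than to the base64 text (which is roughly one third larger); the function name `decoded_size` is hypothetical.

```python
def decoded_size(b64: str) -> int:
    """Return the number of bytes a padded base64 string decodes to.

    Assumes standard padded base64 with no whitespace or line breaks.
    """
    if len(b64) % 4 != 0:
        raise ValueError("padded base64 length must be a multiple of 4")

    # Each '=' at the end stands for one byte that is absent from the last group.
    padding = len(b64) - len(b64.rstrip("="))
    if padding > 2:
        raise ValueError("base64 has at most two padding characters")

    # Every 4 characters encode 3 bytes, minus the padded-out bytes.
    return len(b64) // 4 * 3 - padding
```

Comparing `decoded_size(payload)` with the limit before submitting a file avoids allocating a second, decoded copy of a document that would be rejected anyway.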
Troubleshooting
Common Issues
File Size Exceeded
Symptoms: Node fails with file size exceeded error
Solution: Ensure the PDF file is under 5MB. For larger files, consider splitting the document or using a different processing approach.
Invalid PDF Format
Symptoms: Node fails with invalid file format error
Solution: Verify the uploaded file is a valid PDF format. Check that the file is not corrupted and can be opened in a PDF viewer.
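A quick way to rule out an obviously wrong file type is to check for the `%PDF-` marker that PDF files normally begin with. The sketch below decodes only the first few base64 characters to do so. It is a heuristic under stated assumptions, not the node's own format check: it does not detect corruption later in the file, and the function name `looks_like_pdf` is hypothetical.

```python
import base64


def looks_like_pdf(pdf_base64: str) -> bool:
    """Heuristically check that a base64 payload starts with the PDF header."""
    # 8 base64 characters decode to at most 6 bytes, enough to hold b"%PDF-".
    head = pdf_base64[:8]
    try:
        prefix = base64.b64decode(head, validate=True)
    except ValueError:
        # Not valid base64 (wrong characters or bad padding in the prefix).
        return False
    return prefix.startswith(b"%PDF-")
```

A file that passes this check can still be damaged, so opening it in a PDF viewer remains the more reliable test.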
Blob Service Connection Error
Symptoms: Node fails with blob service connection error
Solution: Check that the blob service is properly configured and accessible. Verify network connectivity and service credentials.
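If you call the blob service from your own code around the workflow, transient connection failures can often be absorbed by retrying with a growing delay. The sketch below illustrates that pattern in general terms. It is an assumption that the failure surfaces as Python's built-in `ConnectionError`; the helper `with_retries` and its parameters are hypothetical and not part of the node or the blob service.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def with_retries(
    operation: Callable[[], T],
    attempts: int = 3,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run operation, retrying on ConnectionError with exponential backoff."""
    for attempt in range(attempts):
        try:
            return operation()
        except ConnectionError:
            if attempt == attempts - 1:
                # Out of attempts: let the caller see the last error.
                raise
            # Delays grow as base_delay, 2 * base_delay, 4 * base_delay, ...
            sleep(base_delay * 2 ** attempt)
    raise ValueError("attempts must be at least 1")
```

The `sleep` parameter exists only so the delay can be replaced in tests. Errors other than `ConnectionError`, such as rejected credentials, are not retried because repeating the call would not help.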
File Not Found
Symptoms: Node fails with file not found error
Solution: Ensure the PDF file was properly uploaded and the blob ID is valid. Check that the file hasn't been deleted or moved.
Memory Issues with Large Files
Symptoms: Node times out or fails with memory errors
Solution: Large PDF files may cause memory issues. Consider processing smaller files or implementing chunked processing for very large documents.
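One form of chunked processing is to decode the base64 payload piece by piece instead of all at once, so that only a small decoded buffer exists at any time. The sketch below shows the idea. It assumes standard padded base64 with no whitespace or line breaks, and the function name `iter_decoded_chunks` is hypothetical. Note that the base64 string itself is still held in memory; this only bounds the size of the decoded data.

```python
import base64
from typing import Iterator


def iter_decoded_chunks(b64: str, chunk_chars: int = 64 * 1024) -> Iterator[bytes]:
    """Yield the decoded bytes of a base64 string one chunk at a time."""
    # Base64 encodes 3 bytes as 4 characters, so every chunk boundary must
    # fall on a multiple of 4 characters for each piece to decode on its own.
    if chunk_chars <= 0 or chunk_chars % 4 != 0:
        raise ValueError("chunk_chars must be a positive multiple of 4")

    for start in range(0, len(b64), chunk_chars):
        yield base64.b64decode(b64[start:start + chunk_chars], validate=True)
```

Each yielded chunk can be written to a file or passed to a streaming consumer, for example `for chunk in iter_decoded_chunks(payload): out.write(chunk)`.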