Get Started with Document Processing

Learn how to quickly integrate our document processing API into your application.

Prerequisites

Supported Document Types

The API supports the following document types, each with specialized processing:

  • Bank Statements (BANK_STATEMENT)

    • Extracts account details, transactions, and balances
    • Supports US bank statements
    • Includes validation for transaction totals
  • Invoices (INVOICE)

    • Processes vendor and customer information
    • Handles multiple currencies
    • Supports multiple locales including US, EU, and Asia
  • Receipts (RECEIPT)

    • Applies specialized handling to different receipt types
    • Includes hotel receipts with extended fields
    • Supports multiple languages
  • Credit Cards (CREDIT_CARD)

    • Validates card numbers using the Luhn algorithm
    • Supports major payment networks
    • Includes security features for sensitive data
  • ID Documents (ID_DOCUMENT)

    • Processes passports, driver's licenses, and national IDs
    • Includes MRZ (Machine Readable Zone) parsing
    • Supports multiple regions
  • Tax Forms

    • W-2 Forms (W2): Processes wage and tax statements
    • W-4 Forms (W4): Processes employee withholding certificates
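
The Luhn check mentioned under Credit Cards can be sketched in a few lines. This is a minimal client-side illustration, assuming card numbers arrive as strings that may contain spaces or hyphens; it is not the API's internal implementation, and the function name `luhn_is_valid` is an assumption made for this example.

```python
def luhn_is_valid(card_number: str) -> bool:
    """Return True if card_number passes the Luhn checksum."""
    # Drop common separators, then require plain ASCII digits only.
    cleaned = card_number.replace(" ", "").replace("-", "")
    if not (cleaned.isascii() and cleaned.isdigit()):
        return False

    total = 0
    # Walk from the rightmost digit and double every second digit.
    for index, ch in enumerate(reversed(cleaned)):
        digit = int(ch)
        if index % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9  # same as summing the two digits of the product
        total += digit
    return total % 10 == 0
```

A check like this catches most single-digit typos before a document is submitted, but it says nothing about whether the card actually exists.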

Basic Implementation

Code Examples

Process a Bank Statement

import requests

# Upload the statement; the context manager closes the file after the request
with open('statement.pdf', 'rb') as f:
    response = requests.post(
        'https://api.ledgerbox.io/jobs/upload',
        headers={'x-api-key': 'your_api_key'},
        files={'files': f},
        data={'model': 'BANK_STATEMENT'}
    )

# Example response structure
{
    "content": {
        "bankName": "Contoso Bank",
        "accountHolderName": "John Doe",
        "accounts": [{
            "accountNumber": "987-654-3210",
            "accountType": "Checking",
            "transactions": [/* ... */]
        }]
    }
}
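
The supported-types list notes that bank statement processing validates transaction totals. How the API performs that validation is not described here, so the following is only a client-side sketch of the same idea: the opening balance plus the net of all transactions should equal the closing balance. The function name `totals_reconcile` and its parameters are hypothetical, and it takes plain values rather than response fields because the transaction fields are elided in the example above.

```python
from decimal import Decimal


def totals_reconcile(opening_balance, amounts, closing_balance, tolerance="0.00"):
    """Check that opening balance + sum of signed amounts equals closing balance.

    All inputs are hypothetical: amounts is a list of signed values
    (negative for debits), passed as strings or numbers.
    """
    # Convert through str() so floats such as 0.1 become exact decimals.
    opening = Decimal(str(opening_balance))
    closing = Decimal(str(closing_balance))
    net = sum((Decimal(str(a)) for a in amounts), Decimal("0"))
    return abs(opening + net - closing) <= Decimal(tolerance)
```

Using `Decimal` instead of floats avoids rounding drift when many small amounts are summed.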

Process Tax Forms

import requests

# Process a W-2 form
with open('w2.pdf', 'rb') as f:
    response = requests.post(
        'https://api.ledgerbox.io/jobs/upload',
        headers={'x-api-key': 'your_api_key'},
        files={'files': f},
        data={'model': 'W2'}
    )

# Example W2 response
{
    "content": {
        "taxYear": "2024",
        "employee": {
            "name": "John Doe",
            "ssn": "XXX-XX-1234"
        },
        "wagesTipsAndOtherCompensation": 50000.00,
        "federalIncomeTaxWithheld": 7500.00
    }
}

Process Identity Documents

import requests

# Process a passport
with open('passport.pdf', 'rb') as f:
    response = requests.post(
        'https://api.ledgerbox.io/jobs/upload',
        headers={'x-api-key': 'your_api_key'},
        files={'files': f},
        data={'model': 'ID_DOCUMENT'}
    )

# Example passport response
{
    "content": {
        "documentType": "passport",
        "firstName": "John",
        "lastName": "Doe",
        "dateOfBirth": "1990-01-01",
        "nationality": "USA",
        "machineReadableZone": {/* ... */}
    }
}
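
MRZ fields carry check digits that can be verified independently of the API. The sketch below follows the check-digit rule from ICAO Doc 9303 (weights 7, 3, 1 repeating; letters A to Z count as 10 to 35; the filler `<` counts as 0). The function name `mrz_check_digit` is an assumption for this example, and it works on raw MRZ strings because the contents of `machineReadableZone` are elided in the response above.

```python
def mrz_check_digit(field: str) -> int:
    """Compute the ICAO 9303 check digit for one MRZ field."""
    weights = (7, 3, 1)
    total = 0
    for position, ch in enumerate(field):
        if "0" <= ch <= "9":
            value = int(ch)
        elif "A" <= ch <= "Z":
            value = ord(ch) - ord("A") + 10  # A=10 ... Z=35
        elif ch == "<":
            value = 0  # filler character
        else:
            raise ValueError(f"invalid MRZ character: {ch!r}")
        total += value * weights[position % 3]
    return total % 10
```

Comparing the computed digit with the one printed in the MRZ is a cheap way to flag a misread character before trusting extracted values such as the date of birth.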

Common Use Cases

Financial Documents

Process bank statements, invoices, and receipts for accounting automation.

Identity Verification

Automate KYC processes with ID document processing.

Tax Processing

Streamline tax form processing with W-2 and W-4 extraction.

Receipt Management

Automate expense management with receipt processing.

Best Practices

  1. Document Quality

    • Submit clear, high-resolution scans
    • Ensure documents are not skewed or rotated
    • Verify all important fields are visible
  2. Security

    • Always use HTTPS for API calls
    • Store API keys in environment variables or a secrets manager, never in source code
    • Follow data privacy guidelines for sensitive documents
  3. Error Handling

    • Implement retry logic with exponential backoff
    • Handle document-specific validation errors
    • Monitor confidence scores in responses
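
The retry advice under Error Handling can be sketched as a small helper. This is one possible shape, not a prescribed implementation: the name `retry_with_backoff`, its parameters, and the default limits are all assumptions for this example. The delay doubles after each failed attempt, is capped at `max_delay`, and gets up to 10% random jitter so that many clients do not retry in lockstep.

```python
import random
import time


def retry_with_backoff(operation, max_attempts=5, base_delay=1.0, max_delay=30.0,
                       retry_on=(Exception,), sleep=time.sleep):
    """Call operation() until it succeeds or max_attempts is reached."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        try:
            return operation()
        except retry_on:
            # Out of attempts: let the last error propagate to the caller.
            if attempt == max_attempts - 1:
                raise
            # Exponential growth: base, 2x base, 4x base, ... capped at max_delay.
            delay = min(max_delay, base_delay * (2 ** attempt))
            sleep(delay + random.uniform(0, delay * 0.1))
```

In practice, `operation` would wrap the upload request and call `response.raise_for_status()`, and `retry_on` would be narrowed to transient failures such as `requests.exceptions.ConnectionError` or `requests.exceptions.Timeout`, so that validation errors are not retried.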

Next Steps