Integration Guide

This guide covers integrating Readur with external systems, services, and tools to extend its functionality and fit into your existing infrastructure.

Webhook Integration

Configuring Webhooks

Readur can send webhooks for various events to integrate with external systems.

Setup

# Environment configuration
WEBHOOK_ENABLED: true
WEBHOOK_URL: https://your-system.com/webhook
WEBHOOK_SECRET: your-webhook-secret
WEBHOOK_RETRY_ATTEMPTS: 3
WEBHOOK_TIMEOUT: 30

Webhook Events

Event                    Description                 Payload
document.created         New document uploaded       Document details
document.updated         Document modified           Changes made
document.deleted         Document removed            Document ID
ocr.completed            OCR processing finished     OCR results
ocr.failed               OCR processing failed       Error details
source.sync.started      Source sync began           Source info
source.sync.completed    Source sync finished        Sync statistics
user.created             New user registered         User details
user.login               User logged in              Login info

Webhook Payload Structure

{
  "event": "document.created",
  "timestamp": "2025-01-15T10:30:00Z",
  "webhook_id": "whk_123456",
  "data": {
    "document_id": "doc_abc123",
    "title": "Invoice.pdf",
    "user_id": "usr_xyz789",
    "file_size": 1048576,
    "mime_type": "application/pdf"
  },
  "signature": "sha256=abcdef..."
}

Webhook Security

Verify webhook signatures:

import hashlib
import hmac

from flask import Flask, request

app = Flask(__name__)

def verify_webhook(payload: bytes, signature: str, secret: str) -> bool:
    # HMAC-SHA256 over the raw request body, compared in constant time
    expected = hmac.new(
        secret.encode(),
        payload,
        hashlib.sha256
    ).hexdigest()

    return hmac.compare_digest(
        f"sha256={expected}",
        signature or ""
    )

# Usage
@app.route('/webhook', methods=['POST'])
def handle_webhook():
    signature = request.headers.get('X-Readur-Signature', '')
    if not verify_webhook(request.get_data(), signature, WEBHOOK_SECRET):
        return 'Unauthorized', 401

    # Process webhook
    data = request.json
    if data['event'] == 'document.created':
        process_new_document(data['data'])

    return 'OK', 200

Webhook Examples

Slack Integration

// Slack webhook handler
const express = require('express');
const axios = require('axios');

const app = express();
app.use(express.json());

app.post('/readur-webhook', async (req, res) => {
  const event = req.body;

  if (event.event === 'ocr.completed') {
    await axios.post(process.env.SLACK_WEBHOOK_URL, {
      text: `Document processed: ${event.data.title}`,
      attachments: [{
        color: 'good',
        fields: [
          { title: 'Pages', value: event.data.pages, short: true },
          { title: 'Confidence', value: event.data.confidence, short: true }
        ]
      }]
    });
  }

  res.status(200).send('OK');
});

app.listen(3000); // any free port

Zapier Integration

# Zapier webhook transformer
def transform_for_zapier(readur_event):
    return {
        'id': readur_event['data']['document_id'],
        'title': readur_event['data']['title'],
        'created_at': readur_event['timestamp'],
        'file_url': f"https://readur.app/api/documents/{readur_event['data']['document_id']}/download",
        'metadata': readur_event['data'].get('metadata', {})
    }

Storage Provider Integration

S3-Compatible Storage

MinIO

# MinIO configuration
S3_ENABLED: true
S3_ENDPOINT_URL: https://minio.internal:9000
S3_BUCKET_NAME: readur-documents
S3_ACCESS_KEY_ID: minioadmin
S3_SECRET_ACCESS_KEY: minioadmin
S3_USE_SSL: true
S3_VERIFY_SSL: false  # For self-signed certificates

Backblaze B2

# Backblaze B2 configuration
S3_ENABLED: true
S3_ENDPOINT_URL: https://s3.us-west-002.backblazeb2.com
S3_BUCKET_NAME: your-bucket-name
S3_ACCESS_KEY_ID: your-key-id
S3_SECRET_ACCESS_KEY: your-application-key

DigitalOcean Spaces

# DigitalOcean Spaces configuration
S3_ENABLED: true
S3_ENDPOINT_URL: https://nyc3.digitaloceanspaces.com
S3_BUCKET_NAME: your-space-name
S3_REGION: nyc3
S3_ACCESS_KEY_ID: your-access-key
S3_SECRET_ACCESS_KEY: your-secret-key
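
If a sync or upload fails against one of these S3-compatible providers, it helps to rule out credentials and connectivity outside of Readur first. Below is a minimal sketch using boto3; the endpoint, bucket, and credentials mirror the MinIO example above and are placeholders for your own values.

# S3-compatible connectivity check (boto3 sketch)
import boto3

s3 = boto3.client(
    "s3",
    endpoint_url="https://minio.internal:9000",   # S3_ENDPOINT_URL
    aws_access_key_id="minioadmin",               # S3_ACCESS_KEY_ID
    aws_secret_access_key="minioadmin",           # S3_SECRET_ACCESS_KEY
    verify=False,                                 # S3_VERIFY_SSL: false (self-signed certs)
)

# Fails fast if the endpoint, credentials, or bucket name are wrong
s3.head_bucket(Bucket="readur-documents")

# Confirm read access by listing a few objects
for obj in s3.list_objects_v2(Bucket="readur-documents", MaxKeys=5).get("Contents", []):
    print(obj["Key"], obj["Size"])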

Azure Blob Storage

// Future implementation example
pub struct AzureStorage {
    container_client: ContainerClient,
}

impl StorageBackend for AzureStorage {
    async fn store(&self, key: &str, data: &[u8]) -> Result<()> {
        let blob_client = self.container_client.blob_client(key);
        blob_client.put_block_blob(data).await?;
        Ok(())
    }
}

Google Cloud Storage

// Future implementation example
pub struct GcsStorage {
    bucket: Bucket,
}

impl StorageBackend for GcsStorage {
    async fn store(&self, key: &str, data: &[u8]) -> Result<()> {
        self.bucket.create_object(key, data, "application/octet-stream").await?;
        Ok(())
    }
}

Authentication Provider Integration

OIDC/SSO Providers

Keycloak

# Keycloak configuration
OIDC_ENABLED: true
OIDC_ISSUER_URL: https://keycloak.example.com/auth/realms/readur
OIDC_CLIENT_ID: readur-client
OIDC_CLIENT_SECRET: your-client-secret
OIDC_REDIRECT_URI: https://readur.example.com/auth/oidc/callback
OIDC_SCOPES: openid profile email

Keycloak client configuration:

{
  "clientId": "readur-client",
  "standardFlowEnabled": true,
  "implicitFlowEnabled": false,
  "directAccessGrantsEnabled": false,
  "serviceAccountsEnabled": false,
  "publicClient": false,
  "frontchannelLogout": true,
  "protocol": "openid-connect",
  "redirectUris": [
    "https://readur.example.com/auth/oidc/callback"
  ],
  "webOrigins": [
    "https://readur.example.com"
  ]
}

Auth0

# Auth0 configuration
OIDC_ENABLED: true
OIDC_ISSUER_URL: https://your-tenant.auth0.com/
OIDC_CLIENT_ID: your-client-id
OIDC_CLIENT_SECRET: your-client-secret
OIDC_REDIRECT_URI: https://readur.example.com/auth/oidc/callback
OIDC_SCOPES: openid profile email

Okta

# Okta configuration
OIDC_ENABLED: true
OIDC_ISSUER_URL: https://your-org.okta.com/oauth2/default
OIDC_CLIENT_ID: your-client-id
OIDC_CLIENT_SECRET: your-client-secret
OIDC_REDIRECT_URI: https://readur.example.com/auth/oidc/callback

Azure AD

# Azure AD configuration
OIDC_ENABLED: true
OIDC_ISSUER_URL: https://login.microsoftonline.com/{tenant-id}/v2.0
OIDC_CLIENT_ID: your-application-id
OIDC_CLIENT_SECRET: your-client-secret
OIDC_REDIRECT_URI: https://readur.example.com/auth/oidc/callback
OIDC_SCOPES: openid profile email User.Read

Google Workspace

# Google Workspace configuration
OIDC_ENABLED: true
OIDC_ISSUER_URL: https://accounts.google.com
OIDC_CLIENT_ID: your-client-id.apps.googleusercontent.com
OIDC_CLIENT_SECRET: your-client-secret
OIDC_REDIRECT_URI: https://readur.example.com/auth/oidc/callback
OIDC_SCOPES: openid profile email
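
Whichever provider you use, the issuer must serve a standard OIDC discovery document at /.well-known/openid-configuration, which is how standard OIDC clients, Readur included, resolve the authorization and token endpoints. A quick sanity check, sketched here with the requests library (the issuer URL is a placeholder for your OIDC_ISSUER_URL):

# OIDC issuer sanity check (sketch)
import requests

issuer = "https://accounts.google.com"  # OIDC_ISSUER_URL
response = requests.get(f"{issuer.rstrip('/')}/.well-known/openid-configuration", timeout=10)
response.raise_for_status()

discovery = response.json()
print("authorization_endpoint:", discovery["authorization_endpoint"])
print("token_endpoint:", discovery["token_endpoint"])
print("scopes_supported:", discovery.get("scopes_supported"))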

LDAP Integration

# Future LDAP support configuration
LDAP_ENABLED: true
LDAP_HOST: ldap.example.com
LDAP_PORT: 389
LDAP_USE_TLS: true
LDAP_BIND_DN: cn=readur,ou=services,dc=example,dc=com
LDAP_BIND_PASSWORD: password
LDAP_USER_BASE_DN: ou=users,dc=example,dc=com
LDAP_USER_FILTER: (uid={username})
LDAP_EMAIL_ATTRIBUTE: mail
LDAP_NAME_ATTRIBUTE: cn

Monitoring Integration

Prometheus

# Enable Prometheus metrics
PROMETHEUS_ENABLED: true
PROMETHEUS_PORT: 9090

Prometheus configuration:

# prometheus.yml
scrape_configs:
  - job_name: 'readur'
    static_configs:
      - targets: ['readur:9090']
    metrics_path: '/metrics'

Key metrics to monitor:

# Document processing rate
rate(documents_processed_total[5m])

# OCR queue depth
ocr_queue_pending_count

# API response time
histogram_quantile(0.95, rate(http_request_duration_seconds_bucket[5m]))

# Error rate
rate(http_requests_total{status=~"5.."}[5m])
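
The same PromQL expressions can be run programmatically against the standard Prometheus HTTP API, for example from a reporting or alert-testing script. A short sketch using the requests library (the Prometheus URL is a placeholder):

# Query Prometheus for Readur metrics (sketch)
import requests

PROMETHEUS_URL = "http://prometheus:9090"

def prom_query(expr: str):
    resp = requests.get(f"{PROMETHEUS_URL}/api/v1/query", params={"query": expr}, timeout=10)
    resp.raise_for_status()
    return resp.json()["data"]["result"]

# Current OCR queue depth
for sample in prom_query("ocr_queue_pending_count"):
    print(sample["metric"], sample["value"])

# Document processing rate over the last 5 minutes
print(prom_query("rate(documents_processed_total[5m])"))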

Grafana

Import the Readur dashboard:

{
  "dashboard": {
    "title": "Readur Monitoring",
    "panels": [
      {
        "title": "Documents Processed",
        "targets": [
          {
            "expr": "rate(documents_processed_total[5m])"
          }
        ]
      },
      {
        "title": "OCR Queue",
        "targets": [
          {
            "expr": "ocr_queue_pending_count"
          }
        ]
      },
      {
        "title": "API Latency",
        "targets": [
          {
            "expr": "histogram_quantile(0.95, rate(http_request_duration_seconds_bucket[5m]))"
          }
        ]
      }
    ]
  }
}

ELK Stack

Logstash Configuration

# logstash.conf
input {
  tcp {
    port => 5000
    codec => json
  }
}

filter {
  if [app] == "readur" {
    grok {
      match => { "message" => "%{TIMESTAMP_ISO8601:timestamp} %{LOGLEVEL:level} %{GREEDYDATA:msg}" }
    }

    mutate {
      add_field => { "service" => "readur" }
    }
  }
}

output {
  elasticsearch {
    hosts => ["elasticsearch:9200"]
    index => "readur-%{+YYYY.MM.dd}"
  }
}
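
The tcp input above expects one JSON object per line on port 5000. Before wiring up Readur's log shipping, you can confirm the pipeline end to end by pushing a synthetic event by hand; a minimal sketch using only the Python standard library (the hostname and message text are placeholders):

# Send a test event into the Logstash tcp/json input (sketch)
import json
import socket

event = {
    "app": "readur",  # matches the conditional in the filter block above
    "message": "2025-01-15T10:30:00Z INFO OCR completed for doc_abc123",
}

with socket.create_connection(("logstash", 5000), timeout=5) as sock:
    sock.sendall((json.dumps(event) + "\n").encode("utf-8"))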

Filebeat Configuration

# filebeat.yml
filebeat.inputs:
  - type: log
    enabled: true
    paths:
      - /var/log/readur/*.log
    json.keys_under_root: true
    json.add_error_key: true
    fields:
      service: readur

output.elasticsearch:
  hosts: ["elasticsearch:9200"]
  index: "readur-%{+yyyy.MM.dd}"

Datadog

# Datadog integration
DATADOG_ENABLED: true
DATADOG_API_KEY: your-api-key
DATADOG_APP_KEY: your-app-key
DATADOG_HOST: https://api.datadoghq.com

Reverse Proxy Configuration

Nginx

# nginx.conf
upstream readur {
    server readur1:8080;
    server readur2:8080;
    server readur3:8080;
}

server {
    listen 443 ssl http2;
    server_name readur.example.com;

    ssl_certificate /etc/ssl/certs/readur.crt;
    ssl_certificate_key /etc/ssl/private/readur.key;

    # Security headers
    add_header Strict-Transport-Security "max-age=31536000" always;
    add_header X-Content-Type-Options "nosniff" always;
    add_header X-Frame-Options "SAMEORIGIN" always;

    # Main application
    location / {
        proxy_pass http://readur;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
    }

    # WebSocket support
    location /ws {
        proxy_pass http://readur;
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "upgrade";
        proxy_read_timeout 86400;
    }

    # File uploads
    location /api/documents/upload {
        proxy_pass http://readur;
        client_max_body_size 500M;
        proxy_request_buffering off;
    }
}

Apache

# httpd.conf
<VirtualHost *:443>
    ServerName readur.example.com

    SSLEngine on
    SSLCertificateFile /etc/ssl/certs/readur.crt
    SSLCertificateKeyFile /etc/ssl/private/readur.key

    # Proxy configuration
    ProxyPreserveHost On
    ProxyPass / http://localhost:8080/
    ProxyPassReverse / http://localhost:8080/

    # WebSocket support
    RewriteEngine On
    RewriteCond %{HTTP:Upgrade} websocket [NC]
    RewriteCond %{HTTP:Connection} upgrade [NC]
    RewriteRule ^/?(.*) "ws://localhost:8080/$1" [P,L]

    # Security headers
    Header always set Strict-Transport-Security "max-age=31536000"
    Header always set X-Content-Type-Options "nosniff"
    Header always set X-Frame-Options "SAMEORIGIN"
</VirtualHost>

Traefik

# docker-compose.yml with Traefik
services:
  traefik:
    image: traefik:v2.9
    command:
      - "--providers.docker=true"
      - "--entrypoints.websecure.address=:443"
    ports:
      - "443:443"
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock:ro
      - ./certs:/certs

  readur:
    image: readur:latest
    labels:
      - "traefik.enable=true"
      - "traefik.http.routers.readur.rule=Host(`readur.example.com`)"
      - "traefik.http.routers.readur.entrypoints=websecure"
      - "traefik.http.routers.readur.tls=true"
      - "traefik.http.services.readur.loadbalancer.server.port=8080"

Caddy

# Caddyfile
readur.example.com {
    reverse_proxy localhost:8080 {
        header_up X-Real-IP {remote_host}
        header_up X-Forwarded-Proto {scheme}
    }

    # File upload size
    request_body {
        max_size 500MB
    }

    # WebSocket support (automatic in Caddy)
}

CI/CD Integration

GitHub Actions

# .github/workflows/readur-deploy.yml
name: Deploy Readur

on:
  push:
    branches: [main]

jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3

      - name: Build Docker image
        run: docker build -t readur:${{ github.sha }} .

      - name: Push to registry
        run: |
          echo ${{ secrets.DOCKER_PASSWORD }} | docker login -u ${{ secrets.DOCKER_USERNAME }} --password-stdin
          docker push readur:${{ github.sha }}

      - name: Deploy to Kubernetes
        uses: azure/k8s-deploy@v4
        with:
          manifests: |
            k8s/deployment.yaml
            k8s/service.yaml
          images: |
            readur:${{ github.sha }}

GitLab CI

# .gitlab-ci.yml
stages:
  - build
  - test
  - deploy

build:
  stage: build
  script:
    - docker build -t $CI_REGISTRY_IMAGE:$CI_COMMIT_SHA .
    - docker push $CI_REGISTRY_IMAGE:$CI_COMMIT_SHA

test:
  stage: test
  script:
    - cargo test
    - cargo clippy

deploy:
  stage: deploy
  script:
    - kubectl set image deployment/readur readur=$CI_REGISTRY_IMAGE:$CI_COMMIT_SHA
  only:
    - main

Jenkins

// Jenkinsfile
pipeline {
    agent any

    stages {
        stage('Build') {
            steps {
                sh 'docker build -t readur:${BUILD_NUMBER} .'
            }
        }

        stage('Test') {
            steps {
                sh 'cargo test'
            }
        }

        stage('Deploy') {
            when {
                branch 'main'
            }
            steps {
                sh 'kubectl apply -f k8s/'
                sh 'kubectl set image deployment/readur readur=readur:${BUILD_NUMBER}'
            }
        }
    }
}

API Client Libraries

Python SDK

# readur-python-sdk
from readur import ReadurClient

client = ReadurClient(
    base_url="https://readur.example.com",
    api_key="your-api-key"
)

# Upload document
document = client.documents.upload(
    file_path="/path/to/document.pdf",
    metadata={"category": "invoice"}
)

# Search documents
results = client.search.query(
    q="invoice 2024",
    filters={"mime_type": "application/pdf"}
)

# WebSocket for real-time updates
@client.on('ocr.completed')
def handle_ocr_complete(event):
    print(f"OCR completed for {event['document_id']}")

client.connect_websocket()

JavaScript/TypeScript SDK

// @readur/sdk
import { ReadurClient } from '@readur/sdk';

const client = new ReadurClient({
  baseUrl: 'https://readur.example.com',
  apiKey: 'your-api-key'
});

// Upload document
const document = await client.documents.upload({
  file: fileInput.files[0],
  metadata: { category: 'invoice' }
});

// Search documents
const results = await client.search.query({
  q: 'invoice 2024',
  filters: { mimeType: 'application/pdf' }
});

// WebSocket subscription
client.subscribe('ocr.completed', (event) => {
  console.log(`OCR completed for ${event.documentId}`);
});

Go SDK

// github.com/readur/readur-go-sdk
package main

import (
    "github.com/readur/readur-go-sdk"
)

func main() {
    client := readur.NewClient(
        readur.WithBaseURL("https://readur.example.com"),
        readur.WithAPIKey("your-api-key"),
    )

    // Upload document
    doc, err := client.Documents.Upload(
        "document.pdf",
        readur.WithMetadata(map[string]interface{}{
            "category": "invoice",
        }),
    )

    // Search documents
    results, err := client.Search.Query(
        "invoice 2024",
        readur.WithFilter("mime_type", "application/pdf"),
    )
}

Database Integration

PostgreSQL Extensions

-- Enable useful extensions
CREATE EXTENSION IF NOT EXISTS pg_trgm;  -- Trigram similarity search
CREATE EXTENSION IF NOT EXISTS unaccent;  -- Remove accents
CREATE EXTENSION IF NOT EXISTS pgcrypto;  -- Encryption functions

-- Custom search function with fuzzy matching
CREATE OR REPLACE FUNCTION fuzzy_search(
    query_text TEXT,
    threshold FLOAT DEFAULT 0.3
) RETURNS TABLE (
    id UUID,
    title TEXT,
    score FLOAT  -- named "score" so it does not clash with the similarity() function inside PL/pgSQL
) AS $$
BEGIN
    RETURN QUERY
    SELECT
        d.id,
        d.title,
        similarity(d.title, query_text) AS score
    FROM documents d
    WHERE similarity(d.title, query_text) > threshold
    ORDER BY similarity(d.title, query_text) DESC;
END;
$$ LANGUAGE plpgsql;
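
Application code can then call fuzzy_search() like any other set-returning function. A sketch using psycopg2 (the connection string is a placeholder, and the deliberately misspelled query shows the fuzzy matching at work):

# Call the fuzzy_search() function from Python (psycopg2 sketch)
import psycopg2

conn = psycopg2.connect("postgresql://readur:password@localhost:5432/readur")
with conn, conn.cursor() as cur:
    # "invoce" is misspelled on purpose; trigram similarity still finds "Invoice..."
    cur.execute("SELECT id, title, score FROM fuzzy_search(%s, %s)", ("invoce 2024", 0.3))
    for doc_id, title, score in cur.fetchall():
        print(f"{score:.2f}  {title}  ({doc_id})")
conn.close()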

Redis Caching

# Redis configuration
REDIS_ENABLED: true
REDIS_URL: redis://localhost:6379
REDIS_PASSWORD: your-password
REDIS_DB: 0
REDIS_KEY_PREFIX: readur:

Usage example:

// Cache search results
let cache_key = format!("search:{}", query_hash);
if let Some(cached) = redis.get(&cache_key).await? {
    return Ok(cached);
}

let results = perform_search(query).await?;
redis.set_ex(&cache_key, &results, 300).await?; // Cache for 5 minutes

Notification Services

Email (SMTP)

# SMTP configuration
EMAIL_ENABLED: true
SMTP_HOST: smtp.gmail.com
SMTP_PORT: 587
SMTP_USERNAME: your-email@gmail.com
SMTP_PASSWORD: your-app-password
SMTP_FROM_ADDRESS: noreply@example.com
SMTP_USE_TLS: true
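
To confirm these settings before relying on notifications, send a one-off test message; a sketch using Python's standard library, with addresses and credentials as placeholders that mirror the variables above:

# SMTP settings smoke test (sketch)
import smtplib
from email.message import EmailMessage

msg = EmailMessage()
msg["Subject"] = "Readur SMTP test"
msg["From"] = "noreply@example.com"   # SMTP_FROM_ADDRESS
msg["To"] = "you@example.com"         # any mailbox you can check
msg.set_content("If you can read this, the SMTP integration settings work.")

with smtplib.SMTP("smtp.gmail.com", 587, timeout=10) as server:       # SMTP_HOST / SMTP_PORT
    server.starttls()                                                 # SMTP_USE_TLS
    server.login("your-email@gmail.com", "your-app-password")         # SMTP_USERNAME / SMTP_PASSWORD
    server.send_message(msg)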

SendGrid

# SendGrid configuration
EMAIL_PROVIDER: sendgrid
SENDGRID_API_KEY: your-api-key
SENDGRID_FROM_EMAIL: noreply@example.com
SENDGRID_FROM_NAME: Readur

Amazon SES

# AWS SES configuration
EMAIL_PROVIDER: ses
AWS_REGION: us-east-1
AWS_ACCESS_KEY_ID: your-access-key
AWS_SECRET_ACCESS_KEY: your-secret-key
SES_FROM_EMAIL: noreply@example.com

Message Queue Integration

RabbitMQ

# RabbitMQ configuration
AMQP_ENABLED: true
AMQP_URL: amqp://user:pass@localhost:5672/
AMQP_EXCHANGE: readur-events
AMQP_QUEUE: readur-processing
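
A downstream worker can then consume Readur events from the queue. A minimal sketch using the pika client; the URL and queue name mirror the configuration above, and the handler is illustrative:

# Consume Readur events from RabbitMQ (pika sketch)
import json
import pika

connection = pika.BlockingConnection(pika.URLParameters("amqp://user:pass@localhost:5672/"))
channel = connection.channel()
# Idempotent if the queue already exists with the same settings
channel.queue_declare(queue="readur-processing", durable=True)

def handle_event(ch, method, properties, body):
    event = json.loads(body)
    print("received", event.get("event"))
    ch.basic_ack(delivery_tag=method.delivery_tag)

channel.basic_consume(queue="readur-processing", on_message_callback=handle_event)
channel.start_consuming()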

Apache Kafka

# Kafka configuration
KAFKA_ENABLED: true
KAFKA_BROKERS: localhost:9092
KAFKA_TOPIC: readur-events
KAFKA_CONSUMER_GROUP: readur-consumers
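
The equivalent consumer for Kafka, sketched with the kafka-python client (topic and consumer group mirror the configuration above):

# Consume Readur events from Kafka (kafka-python sketch)
import json
from kafka import KafkaConsumer

consumer = KafkaConsumer(
    "readur-events",                     # KAFKA_TOPIC
    bootstrap_servers="localhost:9092",  # KAFKA_BROKERS
    group_id="readur-consumers",         # KAFKA_CONSUMER_GROUP
    value_deserializer=lambda v: json.loads(v.decode("utf-8")),
    auto_offset_reset="earliest",
)

for message in consumer:
    event = message.value
    print(f"{event.get('event')} at offset {message.offset}")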

Best Practices

Security

  1. Always use HTTPS for webhooks and API calls
  2. Verify webhook signatures to prevent spoofing
  3. Rotate API keys regularly
  4. Use least privilege for service accounts
  5. Enable audit logging for all integrations

Performance

  1. Implement retry logic with exponential backoff (see the sketch after this list)
  2. Use connection pooling for database connections
  3. Cache frequently accessed data
  4. Batch API requests when possible
  5. Monitor integration performance
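
For the retry advice above, a minimal retry-with-exponential-backoff helper might look like the sketch below; the delays and the example call are illustrative, not Readur defaults.

# Retry with exponential backoff and jitter (sketch)
import random
import time

import requests

def post_with_backoff(url, payload, attempts=3, base_delay=1.0):
    for attempt in range(attempts):
        try:
            response = requests.post(url, json=payload, timeout=30)
            response.raise_for_status()
            return response
        except requests.RequestException:
            if attempt == attempts - 1:
                raise
            # 1s, 2s, 4s, ... plus jitter to avoid synchronized retries
            time.sleep(base_delay * (2 ** attempt) + random.uniform(0, 0.5))

# e.g. forwarding a Readur webhook payload to a downstream system
# post_with_backoff("https://your-system.com/webhook", {"event": "document.created"})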

Reliability

  1. Implement health checks for all integrations
  2. Use circuit breakers for external services (a minimal sketch follows this list)
  3. Set appropriate timeouts
  4. Handle failures gracefully
  5. Maintain integration documentation
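
For circuit breakers, a deliberately small sketch of the idea: after a run of consecutive failures, stop calling the external service until a cooldown has passed. This is illustrative, not Readur's internal implementation.

# Minimal circuit breaker (sketch)
import time

class CircuitBreaker:
    def __init__(self, max_failures=5, reset_after=60):
        self.max_failures = max_failures
        self.reset_after = reset_after
        self.failures = 0
        self.opened_at = None

    def call(self, func, *args, **kwargs):
        if self.opened_at is not None:
            if time.monotonic() - self.opened_at < self.reset_after:
                raise RuntimeError("circuit open; skipping call")
            self.opened_at = None  # half-open: allow one trial call
        try:
            result = func(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = time.monotonic()
            raise
        self.failures = 0
        return result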

Troubleshooting Integrations

Common Issues

Webhook Delivery Failures

# Check webhook logs
grep "webhook" /var/log/readur/app.log | tail -50

# Test webhook endpoint
curl -X POST https://your-webhook-url \
  -H "Content-Type: application/json" \
  -d '{"test": true}'

Authentication Failures

# Test OIDC discovery
curl https://your-provider/.well-known/openid-configuration

# Verify JWT token
jwt decode your-token-here

Storage Connection Issues

# Test S3 connectivity
aws s3 ls s3://your-bucket/ --endpoint-url https://your-endpoint

# Check permissions
aws s3api get-bucket-acl --bucket your-bucket