Skip to content

API Reference

Readur provides a comprehensive REST API for integrating with external systems and building custom workflows.

Table of Contents

Base URL

http://localhost:8080/api

For production deployments, replace with your configured domain and ensure HTTPS is used.

Authentication

Readur supports multiple authentication methods:

JWT Authentication

Include the token in the Authorization header:

Authorization: Bearer <jwt_token>

Obtaining a Token

POST /api/auth/login
Content-Type: application/json

{
  "username": "admin",
  "password": "your_password"
}

Response:

{
  "token": "eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9...",
  "user": {
    "id": "550e8400-e29b-41d4-a716-446655440000",
    "username": "admin",
    "email": "[email protected]",
    "role": "admin"
  },
  "expires_at": "2025-01-16T12:00:00Z"
}

Refresh Token

POST /api/auth/refresh
Authorization: Bearer <expired_token>

OIDC Authentication

For OIDC/SSO authentication:

GET /api/auth/oidc/login

This will redirect to your configured OIDC provider. After successful authentication, the callback URL will receive the token.

Error Handling

All API errors follow a consistent format:

{
  "error": {
    "code": "DOCUMENT_NOT_FOUND",
    "message": "Document with ID 'abc123' not found",
    "details": {
      "document_id": "abc123",
      "user_id": "user456"
    },
    "timestamp": "2025-01-15T10:30:45Z",
    "request_id": "req_xyz789"
  }
}

Error Codes

Code HTTP Status Description
UNAUTHORIZED 401 Invalid or missing authentication
FORBIDDEN 403 Insufficient permissions
NOT_FOUND 404 Resource not found
VALIDATION_ERROR 400 Invalid request parameters
DUPLICATE_RESOURCE 409 Resource already exists
RATE_LIMITED 429 Too many requests
INTERNAL_ERROR 500 Server error
SERVICE_UNAVAILABLE 503 Service temporarily unavailable

Rate Limiting

API requests are rate limited per user:

  • Default limit: 100 requests per minute
  • Burst limit: 20 requests
  • Upload endpoints: 10 requests per minute

Rate limit headers are included in responses:

X-RateLimit-Limit: 100
X-RateLimit-Remaining: 95
X-RateLimit-Reset: 1642248360

Pagination

List endpoints support pagination using query parameters:

GET /api/documents?page=1&per_page=20&sort=created_at&order=desc

Pagination Parameters: - page: Page number (default: 1) - per_page: Items per page (default: 20, max: 100) - sort: Sort field - order: Sort order (asc or desc)

Response includes pagination metadata:

{
  "data": [...],
  "pagination": {
    "page": 1,
    "per_page": 20,
    "total": 150,
    "total_pages": 8,
    "has_next": true,
    "has_prev": false
  }
}

Endpoints

Authentication Endpoints

Login

POST /api/auth/login

Request Body:

{
  "username": "string",
  "password": "string"
}

Response: 200 OK

{
  "token": "string",
  "user": {
    "id": "uuid",
    "username": "string",
    "email": "string",
    "role": "admin|editor|viewer"
  }
}

Register

POST /api/auth/register

Request Body:

{
  "username": "string",
  "email": "string",
  "password": "string"
}

Response: 201 Created

Logout

POST /api/auth/logout
Authorization: Bearer <token>

Response: 200 OK

Document Endpoints

List Documents

GET /api/documents

Query Parameters: - page: Page number - per_page: Items per page - status: Filter by status (pending, processing, completed, failed) - source_id: Filter by source - label_ids: Comma-separated label IDs - date_from: Start date (ISO 8601) - date_to: End date (ISO 8601)

Response: 200 OK

{
  "data": [
    {
      "id": "uuid",
      "title": "Document Title",
      "filename": "document.pdf",
      "content": "Extracted text content...",
      "status": "completed",
      "mime_type": "application/pdf",
      "size": 1048576,
      "created_at": "2025-01-15T10:00:00Z",
      "updated_at": "2025-01-15T10:05:00Z",
      "ocr_confidence": 0.95,
      "labels": ["label1", "label2"],
      "metadata": {
        "author": "John Doe",
        "pages": 10
      }
    }
  ],
  "pagination": {...}
}

Get Document

GET /api/documents/{id}

Response: 200 OK

Upload Document

POST /api/documents/upload
Content-Type: multipart/form-data

Request Body: - file: File to upload (required) - title: Document title (optional) - labels: Comma-separated label IDs (optional) - ocr_enabled: Enable OCR (default: true) - language: OCR language code (default: "eng")

Response: 201 Created

{
  "id": "uuid",
  "message": "Document uploaded successfully",
  "ocr_queued": true
}

Bulk Upload

POST /api/documents/bulk-upload
Content-Type: multipart/form-data

Request Body: - files: Multiple files - default_labels: Labels to apply to all files

Response: 201 Created

{
  "uploaded": 5,
  "failed": 0,
  "documents": [...]
}

Update Document

PUT /api/documents/{id}

Request Body:

{
  "title": "Updated Title",
  "labels": ["label1", "label3"],
  "metadata": {
    "custom_field": "value"
  }
}

Response: 200 OK

Delete Document

DELETE /api/documents/{id}

Response: 204 No Content

Download Document

GET /api/documents/{id}/download

Response: 200 OK with file attachment

Get Document Thumbnail

GET /api/documents/{id}/thumbnail

Query Parameters: - width: Thumbnail width (default: 200) - height: Thumbnail height (default: 200)

Response: 200 OK with image

Retry OCR

POST /api/documents/{id}/retry-ocr

Request Body:

{
  "language": "eng+spa",
  "priority": "high"
}

Response: 200 OK

Search Endpoints

Search Documents

GET /api/search

Query Parameters: - q: Search query (required) - page: Page number - per_page: Items per page - filters: JSON-encoded filters - highlight: Enable highlighting (default: true) - fuzzy: Enable fuzzy search (default: false)

Response: 200 OK

{
  "results": [
    {
      "document_id": "uuid",
      "title": "Document Title",
      "score": 0.95,
      "highlights": [
        {
          "field": "content",
          "snippet": "...matched <mark>text</mark> here..."
        }
      ]
    }
  ],
  "total": 42,
  "facets": {
    "mime_types": {
      "application/pdf": 30,
      "text/plain": 12
    },
    "labels": {
      "important": 15,
      "archive": 27
    }
  }
}

POST /api/search/advanced

Request Body:

{
  "query": {
    "must": [
      {"field": "content", "value": "invoice", "type": "match"}
    ],
    "should": [
      {"field": "title", "value": "2024", "type": "contains"}
    ],
    "must_not": [
      {"field": "labels", "value": "draft", "type": "exact"}
    ]
  },
  "filters": {
    "date_range": {
      "from": "2024-01-01",
      "to": "2024-12-31"
    },
    "mime_types": ["application/pdf"],
    "min_confidence": 0.8
  },
  "sort": [
    {"field": "created_at", "order": "desc"}
  ],
  "page": 1,
  "per_page": 20
}

OCR Queue Endpoints

Get Queue Status

GET /api/ocr/queue/status

Response: 200 OK

{
  "pending": 15,
  "processing": 3,
  "completed_today": 142,
  "failed": 2,
  "average_processing_time": 5.2,
  "estimated_completion": "2025-01-15T11:30:00Z"
}

List Queue Items

GET /api/ocr/queue

Query Parameters: - status: Filter by status - priority: Filter by priority

Response: 200 OK

Update Queue Priority

PUT /api/ocr/queue/{id}/priority

Request Body:

{
  "priority": "high"
}

Cancel OCR Job

DELETE /api/ocr/queue/{id}

Settings Endpoints

Get User Settings

GET /api/settings

Response: 200 OK

{
  "theme": "dark",
  "language": "en",
  "notifications_enabled": true,
  "ocr_default_language": "eng",
  "items_per_page": 20
}

Update Settings

PUT /api/settings

Request Body:

{
  "theme": "light",
  "notifications_enabled": false
}

Sources Endpoints

List Sources

GET /api/sources

Response: 200 OK

{
  "sources": [
    {
      "id": "uuid",
      "name": "Shared Documents",
      "type": "webdav",
      "url": "https://nextcloud.example.com/remote.php/dav/files/user/",
      "status": "active",
      "last_sync": "2025-01-15T10:00:00Z",
      "next_sync": "2025-01-15T11:00:00Z",
      "document_count": 150
    }
  ]
}

Create Source

POST /api/sources

Request Body:

{
  "name": "Company Drive",
  "type": "webdav",
  "url": "https://drive.company.com/dav/",
  "username": "user",
  "password": "encrypted_password",
  "sync_interval": 3600,
  "recursive": true,
  "file_patterns": ["*.pdf", "*.docx"]
}

Update Source

PUT /api/sources/{id}

Delete Source

DELETE /api/sources/{id}

Trigger Source Sync

POST /api/sources/{id}/sync

Response: 202 Accepted

{
  "message": "Sync started",
  "job_id": "job_123"
}

Get Sync Status

GET /api/sources/{id}/sync-status

Labels Endpoints

List Labels

GET /api/labels

Response: 200 OK

{
  "labels": [
    {
      "id": "uuid",
      "name": "Important",
      "color": "#FF5733",
      "description": "High priority documents",
      "document_count": 42,
      "created_by": "admin",
      "created_at": "2025-01-01T00:00:00Z"
    }
  ]
}

Create Label

POST /api/labels

Request Body:

{
  "name": "Archive",
  "color": "#808080",
  "description": "Archived documents"
}

Update Label

PUT /api/labels/{id}

Delete Label

DELETE /api/labels/{id}

Assign Label to Documents

POST /api/labels/{id}/assign

Request Body:

{
  "document_ids": ["doc1", "doc2", "doc3"]
}

User Endpoints

List Users (Admin only)

GET /api/users

Get User Profile

GET /api/users/profile

Update Profile

PUT /api/users/profile

Request Body:

{
  "email": "[email protected]",
  "display_name": "John Doe"
}

Change Password

POST /api/users/change-password

Request Body:

{
  "current_password": "old_password",
  "new_password": "new_secure_password"
}

Notification Endpoints

List Notifications

GET /api/notifications

Query Parameters: - unread_only: Show only unread notifications

Response: 200 OK

{
  "notifications": [
    {
      "id": "uuid",
      "type": "ocr_completed",
      "title": "OCR Processing Complete",
      "message": "Document 'Invoice.pdf' has been processed",
      "read": false,
      "created_at": "2025-01-15T10:00:00Z",
      "data": {
        "document_id": "doc123"
      }
    }
  ]
}

Mark as Read

PUT /api/notifications/{id}/read

Mark All as Read

PUT /api/notifications/read-all

Metrics Endpoints

System Metrics

GET /api/metrics/system

Response: 200 OK

{
  "cpu_usage": 45.2,
  "memory_usage": 67.8,
  "disk_usage": 34.5,
  "active_connections": 23,
  "uptime_seconds": 864000
}

OCR Analytics

GET /api/metrics/ocr

Response: 200 OK

{
  "total_processed": 5432,
  "success_rate": 0.98,
  "average_processing_time": 4.5,
  "languages_used": {
    "eng": 4500,
    "spa": 700,
    "fra": 232
  },
  "daily_stats": [...]
}

WebSocket API

Connect to real-time updates:

const ws = new WebSocket('ws://localhost:8080/ws');

ws.onopen = () => {
  // Authenticate
  ws.send(JSON.stringify({
    type: 'auth',
    token: 'your_jwt_token'
  }));

  // Subscribe to events
  ws.send(JSON.stringify({
    type: 'subscribe',
    events: ['ocr_progress', 'sync_progress', 'notifications']
  }));
};

ws.onmessage = (event) => {
  const data = JSON.parse(event.data);

  switch(data.type) {
    case 'ocr_progress':
      console.log(`OCR Progress: ${data.progress}% for document ${data.document_id}`);
      break;
    case 'sync_progress':
      console.log(`Sync Progress: ${data.processed}/${data.total} files`);
      break;
    case 'notification':
      console.log(`New notification: ${data.message}`);
      break;
  }
};

WebSocket Events

Event Type Description Data Structure
ocr_progress OCR processing updates {document_id, progress, status}
sync_progress Source sync updates {source_id, processed, total, current_file}
notification Real-time notifications {id, type, title, message}
document_created New document added {document_id, title, source}
document_updated Document modified {document_id, changes}
queue_status OCR queue updates {pending, processing, completed}

Examples

Python Client Example

import requests
import json

class ReadurClient:
    def __init__(self, base_url, username, password):
        self.base_url = base_url
        self.token = self._authenticate(username, password)
        self.headers = {'Authorization': f'Bearer {self.token}'}

    def _authenticate(self, username, password):
        response = requests.post(
            f'{self.base_url}/auth/login',
            json={'username': username, 'password': password}
        )
        return response.json()['token']

    def upload_document(self, file_path):
        with open(file_path, 'rb') as f:
            files = {'file': f}
            response = requests.post(
                f'{self.base_url}/documents/upload',
                headers=self.headers,
                files=files
            )
        return response.json()

    def search(self, query):
        response = requests.get(
            f'{self.base_url}/search',
            headers=self.headers,
            params={'q': query}
        )
        return response.json()

# Usage
client = ReadurClient('http://localhost:8080/api', 'admin', 'password')
result = client.upload_document('/path/to/document.pdf')
print(f"Document uploaded: {result['id']}")

search_results = client.search('invoice 2024')
for result in search_results['results']:
    print(f"Found: {result['title']} (score: {result['score']})")

JavaScript/TypeScript Example

class ReadurAPI {
  private token: string;
  private baseURL: string;

  constructor(baseURL: string) {
    this.baseURL = baseURL;
  }

  async login(username: string, password: string): Promise<void> {
    const response = await fetch(`${this.baseURL}/auth/login`, {
      method: 'POST',
      headers: {'Content-Type': 'application/json'},
      body: JSON.stringify({username, password})
    });
    const data = await response.json();
    this.token = data.token;
  }

  async uploadDocument(file: File): Promise<any> {
    const formData = new FormData();
    formData.append('file', file);

    const response = await fetch(`${this.baseURL}/documents/upload`, {
      method: 'POST',
      headers: {'Authorization': `Bearer ${this.token}`},
      body: formData
    });
    return response.json();
  }

  async search(query: string): Promise<any> {
    const response = await fetch(
      `${this.baseURL}/search?q=${encodeURIComponent(query)}`,
      {headers: {'Authorization': `Bearer ${this.token}`}}
    );
    return response.json();
  }
}

// Usage
const api = new ReadurAPI('http://localhost:8080/api');
await api.login('admin', 'password');

const fileInput = document.getElementById('file-input') as HTMLInputElement;
if (fileInput.files?.[0]) {
  const result = await api.uploadDocument(fileInput.files[0]);
  console.log('Uploaded:', result.id);
}

cURL Examples

# Login and save token
TOKEN=$(curl -s -X POST http://localhost:8080/api/auth/login \
  -H "Content-Type: application/json" \
  -d '{"username":"admin","password":"password"}' \
  | jq -r '.token')

# Upload document
curl -X POST http://localhost:8080/api/documents/upload \
  -H "Authorization: Bearer $TOKEN" \
  -F "[email protected]" \
  -F "labels=important,invoice"

# Search documents
curl -X GET "http://localhost:8080/api/search?q=invoice%202024" \
  -H "Authorization: Bearer $TOKEN" | jq

# Get OCR queue status
curl -X GET http://localhost:8080/api/ocr/queue/status \
  -H "Authorization: Bearer $TOKEN" | jq

# Create a new source
curl -X POST http://localhost:8080/api/sources \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "Company Drive",
    "type": "webdav",
    "url": "https://drive.company.com/dav/",
    "username": "user",
    "password": "pass",
    "sync_interval": 3600
  }'

API Versioning

The API uses URL versioning. The current version is v1. Future versions will be available at:

/api/v2/...

Deprecated endpoints will include a Deprecation header with the sunset date:

Deprecation: Sun, 01 Jul 2025 00:00:00 GMT

SDK Support

Official SDKs are available for:

  • Python: pip install readur-sdk
  • JavaScript/TypeScript: npm install @readur/sdk
  • Go: go get github.com/readur/readur-go-sdk

API Limits

  • Maximum request size: 100MB (configurable)
  • Maximum file upload: 500MB
  • Maximum bulk upload: 10 files
  • Maximum search results: 1000
  • WebSocket connections per user: 5
  • API calls per minute: 100 (configurable)