Skip to content

Per-User Watch Directories Documentation

Table of Contents

  1. Overview
  2. Architecture and Components
  3. Prerequisites and Requirements
  4. Administrator Setup Guide
  5. User Guide
  6. API Reference
  7. Configuration Reference
  8. Security Considerations
  9. Troubleshooting
  10. Examples and Best Practices

Overview

The Per-User Watch Directories feature in Readur allows each user to have their own dedicated folder for automatic document ingestion. When enabled, documents placed in a user's watch directory are automatically processed, OCR'd, and associated with that specific user's account.

Key Benefits

  • User Isolation: Each user's documents remain private and separate
  • Automatic Attribution: Documents are automatically assigned to the correct user
  • Simplified Workflow: Users can drop files into their folder without manual upload
  • Batch Processing: Process multiple documents simultaneously
  • Integration Support: Works with network shares, sync tools, and automated workflows

How It Works

  1. Administrator enables per-user watch directories in configuration
  2. System creates a dedicated folder for each user (e.g., /data/user_watch/username/)
  3. Users place documents in their watch folder
  4. Readur's file watcher detects new files
  5. Documents are automatically ingested and associated with the user
  6. OCR processing extracts text for searching
  7. Documents appear in the user's library

Architecture and Components

System Components

  1. UserWatchService (src/services/user_watch_service.rs)
  2. Manages user-specific watch directories
  3. Handles directory creation, validation, and cleanup
  4. Provides secure path operations

  5. UserWatchManager (src/scheduling/user_watch_manager.rs)

  6. Coordinates between file watcher and user management
  7. Maps file paths to users
  8. Manages user cache for performance

  9. File Watcher (src/scheduling/watcher.rs)

  10. Monitors both global and per-user directories
  11. Determines file ownership based on directory location
  12. Triggers document ingestion pipeline

  13. API Endpoints (src/routes/users.rs)

  14. REST API for managing user watch directories
  15. Provides status, creation, and deletion operations

Directory Structure

user_watch_base_dir/           # Base directory (configurable)
├── alice/                     # User alice's watch directory
│   ├── document1.pdf
│   └── report.docx
├── bob/                       # User bob's watch directory
│   └── invoice.pdf
└── charlie/                   # User charlie's watch directory
    ├── presentation.pptx
    └── notes.txt

Prerequisites and Requirements

System Requirements

  • Operating System: Linux, macOS, or Windows with proper file permissions
  • Storage: Sufficient disk space for user directories and documents
  • File System: Support for directory permissions (recommended: ext4, NTFS, APFS)
  • Readur Version: 2.5.4 or later

Software Requirements

  • PostgreSQL database
  • Readur server with file watching enabled
  • Proper file system permissions for the Readur process

Network Requirements (Optional)

  • Network file system support (NFS, SMB/CIFS) for remote directories
  • Stable network connection for remote file access

Administrator Setup Guide

Step 1: Enable Per-User Watch Directories

Edit your .env file or set environment variables:

# Enable the feature
ENABLE_PER_USER_WATCH=true

# Set the base directory for user watch folders
USER_WATCH_BASE_DIR=/data/user_watch

# Configure watch interval (optional, default: 60 seconds)
WATCH_INTERVAL_SECONDS=30

# Set file stability check (optional, default: 2000ms)
FILE_STABILITY_CHECK_MS=3000

# Set maximum file age to process (optional, default: 24 hours)
MAX_FILE_AGE_HOURS=48

Step 2: Create Base Directory

Ensure the base directory exists with proper permissions:

# Create the base directory
sudo mkdir -p /data/user_watch

# Set ownership to the user running Readur
sudo chown readur:readur /data/user_watch

# Set permissions (owner: read/write/execute, group: read/execute)
sudo chmod 755 /data/user_watch

Step 3: Configure Directory Permissions

For production environments, configure appropriate permissions:

# Option 1: Shared group access
sudo groupadd readur-users
sudo usermod -a -G readur-users readur
sudo chgrp -R readur-users /data/user_watch
sudo chmod -R 2775 /data/user_watch  # SGID bit ensures new files inherit group

# Option 2: ACL-based permissions (more granular)
sudo setfacl -R -m u:readur:rwx /data/user_watch
sudo setfacl -R -d -m u:readur:rwx /data/user_watch

Step 4: Network Share Setup (Optional)

To allow users to access their watch directories via network shares:

SMB/CIFS Share Configuration

# /etc/samba/smb.conf
[readur-watch]
   path = /data/user_watch
   valid users = @readur-users
   writable = yes
   browseable = yes
   create mask = 0660
   directory mask = 0770
   force group = readur-users

NFS Export Configuration

# /etc/exports
/data/user_watch *(rw,sync,no_subtree_check,no_root_squash)

Step 5: Restart Readur

After configuration, restart the Readur service:

# Systemd
sudo systemctl restart readur

# Docker
docker-compose restart readur

# Direct execution
# Stop the current process and start with new configuration

Step 6: Verify Configuration

Check the Readur logs to confirm per-user watch is enabled:

# Check logs for confirmation
grep "Per-user watch enabled" /var/log/readur/readur.log

# Expected output:
# ✅ Per-user watch enabled: true
# 📂 User watch base directory: /data/user_watch

User Guide

Accessing Your Watch Directory

Method 1: Direct File System Access

If you have direct access to the server:

# Navigate to your watch directory
cd /data/user_watch/your-username/

# Copy files
cp ~/Documents/*.pdf /data/user_watch/your-username/

# Move files
mv ~/Downloads/report.docx /data/user_watch/your-username/

Method 2: Network Share Access

Access via SMB/CIFS on Windows:

  1. Open File Explorer
  2. Type in address bar: \\server-name\readur-watch\your-username
  3. Drag and drop files into your folder

Access via SMB/CIFS on macOS:

  1. Open Finder
  2. Press Cmd+K
  3. Enter: smb://server-name/readur-watch/your-username
  4. Drag and drop files into your folder

Method 3: Sync Tools

Use synchronization tools for automatic uploads:

# Using rsync
rsync -avz ~/Documents/*.pdf server:/data/user_watch/your-username/

# Using rclone
rclone copy ~/Documents server:user_watch/your-username/

# Using Syncthing (configure folder sync)
# Add /data/user_watch/your-username as a sync folder

Managing Your Watch Directory via Web Interface

  1. Check Directory Status
  2. Navigate to Settings → Watch Folder
  3. View your watch directory path and status
  4. See if directory exists and is enabled

  5. Create Your Directory

  6. Click "Create Watch Directory" button
  7. System will create your personal folder
  8. Confirmation message will appear

  9. View Directory Path

  10. Your directory path is displayed
  11. Copy path for reference
  12. Share with IT for network access setup

Supported File Types

Place any of these file types in your watch directory:

  • Documents: PDF, TXT, DOC, DOCX, ODT, RTF
  • Images: PNG, JPG, JPEG, TIFF, BMP
  • Presentations: PPT, PPTX, ODP
  • Spreadsheets: XLS, XLSX, ODS

File Processing Workflow

  1. File Detection: System checks for new files every 30-60 seconds
  2. Stability Check: Waits for file to stop changing (2-3 seconds)
  3. Validation: Verifies file type and size
  4. Ingestion: Creates document record in database
  5. OCR Queue: Adds to processing queue
  6. Text Extraction: OCR processes the document
  7. Search Index: Document becomes searchable

Best Practices for Users

  1. File Naming: Use descriptive names for easier identification
  2. File Size: Keep files under 50MB for optimal processing
  3. Batch Upload: Can upload multiple files simultaneously
  4. Organization: Create subfolders within your watch directory
  5. Patience: Allow 1-5 minutes for processing depending on file size

API Reference

Get User Watch Directory Information

Retrieve information about a user's watch directory.

Endpoint: GET /api/users/{user_id}/watch-directory

Headers:

Authorization: Bearer {jwt_token}

Response (200 OK):

{
  "user_id": "550e8400-e29b-41d4-a716-446655440000",
  "username": "alice",
  "watch_directory_path": "/data/user_watch/alice",
  "exists": true,
  "enabled": true
}

Error Responses: - 401 Unauthorized: Missing or invalid authentication - 403 Forbidden: Insufficient permissions - 404 Not Found: User not found - 500 Internal Server Error: Per-user watch disabled

Create User Watch Directory

Create or ensure a user's watch directory exists.

Endpoint: POST /api/users/{user_id}/watch-directory

Headers:

Authorization: Bearer {jwt_token}
Content-Type: application/json

Request Body:

{
  "ensure_created": true
}

Response (200 OK):

{
  "success": true,
  "message": "Watch directory ready for user 'alice'",
  "watch_directory_path": "/data/user_watch/alice"
}

Error Responses: - 401 Unauthorized: Missing or invalid authentication - 403 Forbidden: Insufficient permissions - 404 Not Found: User not found - 500 Internal Server Error: Creation failed or feature disabled

Delete User Watch Directory

Remove a user's watch directory and its contents.

Endpoint: DELETE /api/users/{user_id}/watch-directory

Headers:

Authorization: Bearer {jwt_token}

Note: Only administrators can delete watch directories.

Response (200 OK):

{
  "success": true,
  "message": "Watch directory removed for user 'alice'",
  "watch_directory_path": null
}

Error Responses: - 401 Unauthorized: Missing or invalid authentication - 403 Forbidden: Admin access required - 404 Not Found: User not found - 500 Internal Server Error: Deletion failed

API Usage Examples

Python Example

import requests

# Configuration
base_url = "https://readur.example.com/api"
token = "your-jwt-token"
user_id = "550e8400-e29b-41d4-a716-446655440000"

headers = {
    "Authorization": f"Bearer {token}",
    "Content-Type": "application/json"
}

# Get watch directory info
response = requests.get(
    f"{base_url}/users/{user_id}/watch-directory",
    headers=headers
)
info = response.json()
print(f"Watch directory: {info['watch_directory_path']}")
print(f"Exists: {info['exists']}")

# Create watch directory
response = requests.post(
    f"{base_url}/users/{user_id}/watch-directory",
    headers=headers,
    json={"ensure_created": True}
)
result = response.json()
if result['success']:
    print(f"Created: {result['watch_directory_path']}")

JavaScript/TypeScript Example

// Using the provided API service
import { userWatchService } from './services/api';

// Get watch directory information
const getWatchInfo = async (userId: string) => {
  try {
    const response = await userWatchService.getUserWatchDirectory(userId);
    console.log('Watch directory:', response.data.watch_directory_path);
    console.log('Exists:', response.data.exists);
    return response.data;
  } catch (error) {
    console.error('Failed to get watch directory info:', error);
  }
};

// Create watch directory
const createWatchDirectory = async (userId: string) => {
  try {
    const response = await userWatchService.createUserWatchDirectory(userId);
    if (response.data.success) {
      console.log('Created:', response.data.watch_directory_path);
    }
    return response.data;
  } catch (error) {
    console.error('Failed to create watch directory:', error);
  }
};

cURL Examples

# Get watch directory information
curl -X GET "https://readur.example.com/api/users/${USER_ID}/watch-directory" \
  -H "Authorization: Bearer ${TOKEN}"

# Create watch directory
curl -X POST "https://readur.example.com/api/users/${USER_ID}/watch-directory" \
  -H "Authorization: Bearer ${TOKEN}" \
  -H "Content-Type: application/json" \
  -d '{"ensure_created": true}'

# Delete watch directory (admin only)
curl -X DELETE "https://readur.example.com/api/users/${USER_ID}/watch-directory" \
  -H "Authorization: Bearer ${TOKEN}"

Configuration Reference

Environment Variables

Variable Type Default Description
ENABLE_PER_USER_WATCH Boolean false Enable/disable per-user watch directories
USER_WATCH_BASE_DIR String ./user_watch Base directory for all user watch folders
WATCH_INTERVAL_SECONDS Integer 60 How often to scan for new files (seconds)
FILE_STABILITY_CHECK_MS Integer 2000 Time to wait for file size stability (milliseconds)
MAX_FILE_AGE_HOURS Integer 24 Maximum age of files to process (hours)

Configuration Validation

The system performs several validation checks:

  1. Path Validation: Ensures paths are distinct and non-overlapping
  2. Directory Conflicts: Prevents USER_WATCH_BASE_DIR from being:
  3. The same as UPLOAD_PATH
  4. The same as WATCH_FOLDER
  5. Inside UPLOAD_PATH
  6. Containing UPLOAD_PATH

Docker Configuration

When using Docker, mount the user watch directory:

version: '3.8'

services:
  readur:
    image: readur:latest
    environment:
      - ENABLE_PER_USER_WATCH=true
      - USER_WATCH_BASE_DIR=/app/user_watch
      - WATCH_INTERVAL_SECONDS=30
    volumes:
      - ./user_watch:/app/user_watch
      - ./uploads:/app/uploads
      - ./watch:/app/watch
    ports:
      - "8000:8000"

Kubernetes Configuration

For Kubernetes deployments:

apiVersion: v1
kind: ConfigMap
metadata:
  name: readur-config
data:
  ENABLE_PER_USER_WATCH: "true"
  USER_WATCH_BASE_DIR: "/data/user_watch"
  WATCH_INTERVAL_SECONDS: "30"
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: readur
spec:
  template:
    spec:
      containers:
      - name: readur
        image: readur:latest
        envFrom:
        - configMapRef:
            name: readur-config
        volumeMounts:
        - name: user-watch
          mountPath: /data/user_watch
      volumes:
      - name: user-watch
        persistentVolumeClaim:
          claimName: readur-user-watch-pvc

Security Considerations

Username Validation

The system enforces strict username validation to prevent security issues:

  • Length: 1-64 characters
  • Allowed Characters: Alphanumeric, underscore (_), dash (-)
  • Prohibited Patterns:
  • Path traversal attempts (.., /)
  • Hidden directories (starting with .)
  • Null bytes or special characters

Directory Permissions

  1. User Isolation: Each user's directory is separate
  2. Permission Model: 755 (owner: rwx, group: r-x, others: r-x)
  3. Ownership: Readur process owns all directories
  4. SGID Bit: Optional for group inheritance

Path Security

  • Canonicalization: All paths are canonicalized to prevent traversal
  • Boundary Checking: Files must be within designated directories
  • Validation: Extracted usernames are validated before use

Access Control

  • API Protection: JWT authentication required
  • Permission Levels:
  • Users: Can only access their own directory
  • Admins: Can manage all directories
  • Directory Creation: Users can create their own, admins can create any
  • Directory Deletion: Admin-only operation

Audit Considerations

  1. Logging: All directory operations are logged
  2. File Attribution: Documents tracked to source user
  3. Access Tracking: API access logged with user context

Troubleshooting

Common Issues and Solutions

Issue: Per-user watch directories not working

Symptoms: Files in user directories are not processed

Solutions: 1. Verify feature is enabled:

grep ENABLE_PER_USER_WATCH .env
# Should show: ENABLE_PER_USER_WATCH=true

Check base directory exists and has correct permissions: Verify that the base watch directory has been created with proper ownership.

ls -la /data/user_watch
# Should show readur as owner with 755 permissions

Review logs for errors: Search for watch directory related error messages in the application logs.

grep -i "user watch" /var/log/readur/readur.log

Issue: "User watch service not initialized" error

Symptoms: API returns 500 error when accessing watch directories

Solutions: 1. Ensure ENABLE_PER_USER_WATCH=true in configuration 2. Restart Readur service 3. Check initialization logs for errors

Issue: Files not being detected

Symptoms: Files placed in watch directory are not processed

Solutions: 1. Check file permissions:

ls -la /data/user_watch/username/
# Files should be readable by readur user

  1. Verify file type is supported:

    echo $ALLOWED_FILE_TYPES
    # Ensure your file extension is included
    

  2. Check file age restriction:

    # Files older than MAX_FILE_AGE_HOURS are ignored
    find /data/user_watch -type f -mtime +1
    

Issue: Permission denied errors

Symptoms: Users cannot write to their watch directories

Solutions: 1. Fix directory ownership:

sudo chown -R readur:readur /data/user_watch

  1. Set correct permissions:

    sudo chmod -R 755 /data/user_watch
    

  2. For shared access, use group permissions:

    sudo chmod -R 775 /data/user_watch
    sudo chgrp -R readur-users /data/user_watch
    

Issue: Duplicate documents created

Symptoms: Same file creates multiple documents

Solutions: 1. Ensure file stability check is adequate:

# Increase if files are still being written
FILE_STABILITY_CHECK_MS=5000

  1. Check for file system issues (timestamps, inode changes)
  2. Review deduplication settings in configuration

Diagnostic Commands

# Check if user watch is enabled
curl -H "Authorization: Bearer $TOKEN" \
  https://readur.example.com/api/users/$USER_ID/watch-directory

# List all user directories
ls -la /data/user_watch/

# Check file watcher logs
journalctl -u readur | grep -i "watch"

# Monitor file processing in real-time
tail -f /var/log/readur/readur.log | grep -E "(Processing new file|watch)"

# Check directory permissions
namei -l /data/user_watch/username/

# Find recently modified files
find /data/user_watch -type f -mmin -60

# Check disk space
df -h /data/user_watch

Examples and Best Practices

Example 1: Small Team Setup

For a team of 5-10 users with local file access:

# .env configuration
ENABLE_PER_USER_WATCH=true
USER_WATCH_BASE_DIR=/srv/readur/user_watches
WATCH_INTERVAL_SECONDS=60
FILE_STABILITY_CHECK_MS=2000
MAX_FILE_AGE_HOURS=72

# Directory structure
/srv/readur/user_watches/
├── alice/
├── bob/
├── charlie/
├── diana/
└── edward/

Example 2: Enterprise Network Share Integration

For larger organizations with network shares:

# Mount network share
sudo mount -t cifs //fileserver/readur /mnt/readur \
  -o username=readur,domain=COMPANY

# .env configuration
ENABLE_PER_USER_WATCH=true
USER_WATCH_BASE_DIR=/mnt/readur/user_watches
WATCH_INTERVAL_SECONDS=120  # Slower for network
FILE_STABILITY_CHECK_MS=5000  # Higher for network delays

Example 3: Automated Document Workflow

Script for automatic document routing:

#!/usr/bin/env python3
"""
Auto-route documents to user watch directories based on metadata
"""
import os
import shutil
from pathlib import Path

def route_document(file_path, user_mapping):
    """Route document to appropriate user watch directory"""

    # Extract metadata (example: from filename)
    filename = os.path.basename(file_path)

    # Determine target user (implement your logic)
    if "invoice" in filename.lower():
        target_user = "accounting"
    elif "report" in filename.lower():
        target_user = "management"
    else:
        target_user = "general"

    # Move to user's watch directory
    user_watch_dir = Path(f"/data/user_watch/{target_user}")
    if user_watch_dir.exists():
        dest = user_watch_dir / filename
        shutil.move(file_path, dest)
        print(f"Moved {filename} to {target_user}'s watch directory")
    else:
        print(f"User {target_user} watch directory does not exist")

# Monitor incoming directory
incoming_dir = Path("/srv/incoming")
for file_path in incoming_dir.glob("*.pdf"):
    route_document(file_path, user_mapping={})

Example 4: Bulk User Setup

PowerShell script for creating multiple user directories:

# bulk-create-watch-dirs.ps1
$baseUrl = "https://readur.example.com/api"
$adminToken = "your-admin-token"

$users = @("alice", "bob", "charlie", "diana", "edward")

foreach ($username in $users) {
    # Get user ID
    $userResponse = Invoke-RestMethod `
        -Uri "$baseUrl/users" `
        -Headers @{Authorization="Bearer $adminToken"}

    $user = $userResponse | Where-Object {$_.username -eq $username}

    if ($user) {
        # Create watch directory
        $body = @{ensure_created=$true} | ConvertTo-Json

        $result = Invoke-RestMethod `
            -Method Post `
            -Uri "$baseUrl/users/$($user.id)/watch-directory" `
            -Headers @{
                Authorization="Bearer $adminToken"
                "Content-Type"="application/json"
            } `
            -Body $body

        Write-Host "Created watch directory for $username at $($result.watch_directory_path)"
    }
}

Best Practices Summary

For Administrators

  1. Capacity Planning: Allocate 1-5GB per user for watch directories
  2. Backup Strategy: Include user watch directories in backup plans
  3. Monitoring: Set up alerts for disk space and processing failures
  4. Documentation: Maintain user guide with network paths
  5. Testing: Test with various file types and sizes before deployment

For Users

  1. File Organization: Use meaningful filenames and folder structure
  2. File Formats: Prefer PDF for best OCR results
  3. Batch Processing: Group related documents for upload
  4. Size Limits: Split large documents if over 50MB
  5. Patience: Allow processing time before expecting search results

For Developers

  1. API Integration: Use provided client libraries when available
  2. Error Handling: Implement retry logic for transient failures
  3. Validation: Validate file types before placing in watch directories
  4. Monitoring: Track processing status via WebSocket updates
  5. Caching: Cache user directory paths to reduce API calls

Performance Optimization

  1. File System: Use SSD storage for watch directories
  2. Network: Minimize latency for network-mounted directories
  3. Scheduling: Adjust watch interval based on usage patterns
  4. Concurrency: Configure OCR workers based on CPU cores
  5. Cleanup: Implement retention policies for processed files

Migration from Global Watch Directory

To migrate from a single global watch directory to per-user directories:

  1. Preparation:

    # Backup existing watch directory
    tar -czf watch_backup.tar.gz /data/watch/
    

  2. Enable Feature:

    # Update configuration
    ENABLE_PER_USER_WATCH=true
    USER_WATCH_BASE_DIR=/data/user_watch
    

  3. Create User Directories:

    # Script to create directories for existing users
    for user in $(psql -d readur -c "SELECT username FROM users" -t); do
      mkdir -p "/data/user_watch/$user"
      chown readur:readur "/data/user_watch/$user"
    done
    

  4. Migrate Documents (optional):

  5. Keep existing documents in place
  6. Or reassign to appropriate users through the UI

  7. Update Documentation:

  8. Notify users of new directory locations
  9. Update any automation scripts
  10. Revise backup procedures

This completes the comprehensive documentation for the Per-User Watch Directories feature in Readur.