Per-User Watch Directories Documentation¶
Table of Contents¶
- Overview
- Architecture and Components
- Prerequisites and Requirements
- Administrator Setup Guide
- User Guide
- API Reference
- Configuration Reference
- Security Considerations
- Troubleshooting
- Examples and Best Practices
Overview¶
The Per-User Watch Directories feature in Readur allows each user to have their own dedicated folder for automatic document ingestion. When enabled, documents placed in a user's watch directory are automatically processed, OCR'd, and associated with that specific user's account.
Key Benefits¶
- User Isolation: Each user's documents remain private and separate
- Automatic Attribution: Documents are automatically assigned to the correct user
- Simplified Workflow: Users can drop files into their folder without manual upload
- Batch Processing: Process multiple documents simultaneously
- Integration Support: Works with network shares, sync tools, and automated workflows
How It Works¶
- Administrator enables per-user watch directories in configuration
- System creates a dedicated folder for each user (e.g., /data/user_watch/username/)
- Users place documents in their watch folder
- Readur's file watcher detects new files
- Documents are automatically ingested and associated with the user
- OCR processing extracts text for searching
- Documents appear in the user's library
Architecture and Components¶
System Components¶
- UserWatchService (src/services/user_watch_service.rs)
  - Manages user-specific watch directories
  - Handles directory creation, validation, and cleanup
  - Provides secure path operations
- UserWatchManager (src/scheduling/user_watch_manager.rs)
  - Coordinates between file watcher and user management
  - Maps file paths to users
  - Manages user cache for performance
- File Watcher (src/scheduling/watcher.rs)
  - Monitors both global and per-user directories
  - Determines file ownership based on directory location
  - Triggers document ingestion pipeline
- API Endpoints (src/routes/users.rs)
  - REST API for managing user watch directories
  - Provides status, creation, and deletion operations
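The mapping from a file path to its owning user can be illustrated with a short Python sketch. This is not Readur's Rust implementation; the function name `owner_from_path` is hypothetical, and the sketch assumes POSIX-style paths that have already been canonicalized (it does not resolve `..` or symlinks).

```python
from pathlib import PurePosixPath
from typing import Optional


def owner_from_path(file_path: str, base_dir: str) -> Optional[str]:
    """Return the username whose watch directory contains file_path, or None."""
    try:
        relative = PurePosixPath(file_path).relative_to(PurePosixPath(base_dir))
    except ValueError:
        # The file is not under the per-user base directory at all.
        return None
    parts = relative.parts
    # Expect at least "<username>/<file>"; a file placed directly in the
    # base directory has no owner.
    if len(parts) < 2:
        return None
    return parts[0]
```

Under these assumptions, a file at /data/user_watch/alice/document1.pdf resolves to alice, while a file outside the base directory resolves to no user and would be handled by the global watcher.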
Directory Structure¶
user_watch_base_dir/ # Base directory (configurable)
├── alice/ # User alice's watch directory
│ ├── document1.pdf
│ └── report.docx
├── bob/ # User bob's watch directory
│ └── invoice.pdf
└── charlie/ # User charlie's watch directory
    ├── presentation.pptx
    └── notes.txt
Prerequisites and Requirements¶
System Requirements¶
- Operating System: Linux, macOS, or Windows with proper file permissions
- Storage: Sufficient disk space for user directories and documents
- File System: Support for directory permissions (recommended: ext4, NTFS, APFS)
- Readur Version: 2.5.4 or later
Software Requirements¶
- PostgreSQL database
- Readur server with file watching enabled
- Proper file system permissions for the Readur process
Network Requirements (Optional)¶
- Network file system support (NFS, SMB/CIFS) for remote directories
- Stable network connection for remote file access
Administrator Setup Guide¶
Step 1: Enable Per-User Watch Directories¶
Edit your .env file or set environment variables:
# Enable the feature
ENABLE_PER_USER_WATCH=true
# Set the base directory for user watch folders
USER_WATCH_BASE_DIR=/data/user_watch
# Configure watch interval (optional, default: 60 seconds)
WATCH_INTERVAL_SECONDS=30
# Set file stability check (optional, default: 2000ms)
FILE_STABILITY_CHECK_MS=3000
# Set maximum file age to process (optional, default: 24 hours)
MAX_FILE_AGE_HOURS=48
Step 2: Create Base Directory¶
Ensure the base directory exists with proper permissions:
# Create the base directory
sudo mkdir -p /data/user_watch
# Set ownership to the user running Readur
sudo chown readur:readur /data/user_watch
# Set permissions (owner: read/write/execute; group and others: read/execute)
sudo chmod 755 /data/user_watch
Step 3: Configure Directory Permissions¶
For production environments, configure appropriate permissions:
# Option 1: Shared group access
sudo groupadd readur-users
sudo usermod -a -G readur-users readur
sudo chgrp -R readur-users /data/user_watch
sudo chmod -R 2775 /data/user_watch # SGID bit ensures new files inherit group
# Option 2: ACL-based permissions (more granular)
sudo setfacl -R -m u:readur:rwx /data/user_watch
sudo setfacl -R -d -m u:readur:rwx /data/user_watch
Step 4: Network Share Setup (Optional)¶
To allow users to access their watch directories via network shares:
SMB/CIFS Share Configuration¶
# /etc/samba/smb.conf
[readur-watch]
path = /data/user_watch
valid users = @readur-users
writable = yes
browseable = yes
create mask = 0660
directory mask = 0770
force group = readur-users
NFS Export Configuration¶
Step 5: Restart Readur¶
After configuration, restart the Readur service:
# Systemd
sudo systemctl restart readur
# Docker (recreate the container; a plain `restart` does not pick up changed environment values)
docker-compose up -d readur
# Direct execution
# Stop the current process and start with new configuration
Step 6: Verify Configuration¶
Check the Readur logs to confirm per-user watch is enabled:
# Check logs for confirmation
grep "Per-user watch enabled" /var/log/readur/readur.log
# Expected output:
# ✅ Per-user watch enabled: true
# 📂 User watch base directory: /data/user_watch
User Guide¶
Accessing Your Watch Directory¶
Method 1: Direct File System Access¶
If you have direct access to the server:
# Navigate to your watch directory
cd /data/user_watch/your-username/
# Copy files
cp ~/Documents/*.pdf /data/user_watch/your-username/
# Move files
mv ~/Downloads/report.docx /data/user_watch/your-username/
Method 2: Network Share Access¶
Access via SMB/CIFS on Windows:
- Open File Explorer
- Type in address bar: \\server-name\readur-watch\your-username
- Drag and drop files into your folder
Access via SMB/CIFS on macOS:
- Open Finder
- Press Cmd+K
- Enter: smb://server-name/readur-watch/your-username
- Drag and drop files into your folder
Method 3: Sync Tools¶
Use synchronization tools for automatic uploads:
# Using rsync
rsync -avz ~/Documents/*.pdf server:/data/user_watch/your-username/
# Using rclone (assumes an rclone remote named "server" has already been configured)
rclone copy ~/Documents server:user_watch/your-username/
# Using Syncthing (configure folder sync)
# Add /data/user_watch/your-username as a sync folder
Managing Your Watch Directory via Web Interface¶
- Check Directory Status
  - Navigate to Settings → Watch Folder
  - View your watch directory path and status
  - See if directory exists and is enabled
- Create Your Directory
  - Click "Create Watch Directory" button
  - System will create your personal folder
  - Confirmation message will appear
- View Directory Path
  - Your directory path is displayed
  - Copy path for reference
  - Share with IT for network access setup
Supported File Types¶
Place any of these file types in your watch directory:
- Documents: PDF, TXT, DOC, DOCX, ODT, RTF
- Images: PNG, JPG, JPEG, TIFF, BMP
- Presentations: PPT, PPTX, ODP
- Spreadsheets: XLS, XLSX, ODS
File Processing Workflow¶
- File Detection: System scans for new files every WATCH_INTERVAL_SECONDS (default: 60 seconds)
- Stability Check: Waits until the file size has stopped changing for FILE_STABILITY_CHECK_MS (default: 2000 ms)
- Validation: Verifies file type and size
- Ingestion: Creates document record in database
- OCR Queue: Adds to processing queue
- Text Extraction: OCR processes the document
- Search Index: Document becomes searchable
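The stability and age checks in the workflow above can be sketched in Python as follows. This is an illustration only: the function names and the `max_checks` limit are assumptions, and the actual watcher may use different criteria. The defaults mirror FILE_STABILITY_CHECK_MS and MAX_FILE_AGE_HOURS.

```python
import os
import time


def wait_until_stable(path, check_ms=2000, max_checks=10, sleep=time.sleep):
    """Return True once the file size is unchanged across one check interval."""
    last_size = os.path.getsize(path)
    for _ in range(max_checks):
        sleep(check_ms / 1000)
        size = os.path.getsize(path)
        if size == last_size:
            return True  # no growth during the interval: treat as fully written
        last_size = size
    return False  # still changing after max_checks intervals


def is_recent_enough(path, max_age_hours=24, now=None):
    """Return True if the file was modified within the last max_age_hours."""
    now = time.time() if now is None else now
    return (now - os.path.getmtime(path)) <= max_age_hours * 3600
```

A file would only move on to validation and ingestion when both checks pass. The `sleep` parameter is injectable so the wait can be skipped in tests.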
Best Practices for Users¶
- File Naming: Use descriptive names for easier identification
- File Size: Keep files under 50MB for optimal processing
- Batch Upload: You can drop multiple files into the folder at once
- Organization: Create subfolders within your watch directory
- Patience: Allow 1-5 minutes for processing depending on file size
API Reference¶
Get User Watch Directory Information¶
Retrieve information about a user's watch directory.
Endpoint: GET /api/users/{user_id}/watch-directory
Headers: Authorization: Bearer <token>
Response (200 OK):
{
"user_id": "550e8400-e29b-41d4-a716-446655440000",
"username": "alice",
"watch_directory_path": "/data/user_watch/alice",
"exists": true,
"enabled": true
}
Error Responses:
- 401 Unauthorized: Missing or invalid authentication
- 403 Forbidden: Insufficient permissions
- 404 Not Found: User not found
- 500 Internal Server Error: Per-user watch disabled
Create User Watch Directory¶
Create or ensure a user's watch directory exists.
Endpoint: POST /api/users/{user_id}/watch-directory
Headers: Authorization: Bearer <token>, Content-Type: application/json
Request Body: {"ensure_created": true}
Response (200 OK):
{
"success": true,
"message": "Watch directory ready for user 'alice'",
"watch_directory_path": "/data/user_watch/alice"
}
Error Responses:
- 401 Unauthorized: Missing or invalid authentication
- 403 Forbidden: Insufficient permissions
- 404 Not Found: User not found
- 500 Internal Server Error: Creation failed or feature disabled
Delete User Watch Directory¶
Remove a user's watch directory and its contents.
Endpoint: DELETE /api/users/{user_id}/watch-directory
Headers: Authorization: Bearer <token>
Note: Only administrators can delete watch directories.
Response (200 OK):
{
"success": true,
"message": "Watch directory removed for user 'alice'",
"watch_directory_path": null
}
Error Responses:
- 401 Unauthorized: Missing or invalid authentication
- 403 Forbidden: Admin access required
- 404 Not Found: User not found
- 500 Internal Server Error: Deletion failed
API Usage Examples¶
Python Example¶
import requests

# Configuration
base_url = "https://readur.example.com/api"
token = "your-jwt-token"
user_id = "550e8400-e29b-41d4-a716-446655440000"
headers = {
    "Authorization": f"Bearer {token}",
    "Content-Type": "application/json",
}

# Get watch directory info
response = requests.get(
    f"{base_url}/users/{user_id}/watch-directory",
    headers=headers,
)
response.raise_for_status()  # Fail early on 4xx/5xx responses
info = response.json()
print(f"Watch directory: {info['watch_directory_path']}")
print(f"Exists: {info['exists']}")

# Create watch directory
response = requests.post(
    f"{base_url}/users/{user_id}/watch-directory",
    headers=headers,
    json={"ensure_created": True},
)
response.raise_for_status()
result = response.json()
if result["success"]:
    print(f"Created: {result['watch_directory_path']}")
JavaScript/TypeScript Example¶
// Using the provided API service
import { userWatchService } from './services/api';

// Get watch directory information
const getWatchInfo = async (userId: string) => {
  try {
    const response = await userWatchService.getUserWatchDirectory(userId);
    console.log('Watch directory:', response.data.watch_directory_path);
    console.log('Exists:', response.data.exists);
    return response.data;
  } catch (error) {
    console.error('Failed to get watch directory info:', error);
  }
};

// Create watch directory
const createWatchDirectory = async (userId: string) => {
  try {
    const response = await userWatchService.createUserWatchDirectory(userId);
    if (response.data.success) {
      console.log('Created:', response.data.watch_directory_path);
    }
    return response.data;
  } catch (error) {
    console.error('Failed to create watch directory:', error);
  }
};
cURL Examples¶
# Get watch directory information
curl -X GET "https://readur.example.com/api/users/${USER_ID}/watch-directory" \
-H "Authorization: Bearer ${TOKEN}"
# Create watch directory
curl -X POST "https://readur.example.com/api/users/${USER_ID}/watch-directory" \
-H "Authorization: Bearer ${TOKEN}" \
-H "Content-Type: application/json" \
-d '{"ensure_created": true}'
# Delete watch directory (admin only)
curl -X DELETE "https://readur.example.com/api/users/${USER_ID}/watch-directory" \
-H "Authorization: Bearer ${TOKEN}"
Configuration Reference¶
Environment Variables¶
| Variable | Type | Default | Description |
|---|---|---|---|
| ENABLE_PER_USER_WATCH | Boolean | false | Enable/disable per-user watch directories |
| USER_WATCH_BASE_DIR | String | ./user_watch | Base directory for all user watch folders |
| WATCH_INTERVAL_SECONDS | Integer | 60 | How often to scan for new files (seconds) |
| FILE_STABILITY_CHECK_MS | Integer | 2000 | Time to wait for file size stability (milliseconds) |
| MAX_FILE_AGE_HOURS | Integer | 24 | Maximum age of files to process (hours) |
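As a rough illustration of how these variables and their defaults fit together, the following Python sketch reads them from an environment mapping. The `WatchConfig` class and `load_watch_config` function are hypothetical names, and the boolean parsing rule (only the string "true", case-insensitive) is an assumption; Readur's own parser may accept other values.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class WatchConfig:
    enabled: bool
    base_dir: str
    interval_seconds: int
    stability_check_ms: int
    max_file_age_hours: int


def load_watch_config(env=os.environ) -> WatchConfig:
    """Build a WatchConfig from environment variables, applying the documented defaults."""
    return WatchConfig(
        # Assumption: only "true" (any case) enables the feature.
        enabled=env.get("ENABLE_PER_USER_WATCH", "false").strip().lower() == "true",
        base_dir=env.get("USER_WATCH_BASE_DIR", "./user_watch"),
        interval_seconds=int(env.get("WATCH_INTERVAL_SECONDS", "60")),
        stability_check_ms=int(env.get("FILE_STABILITY_CHECK_MS", "2000")),
        max_file_age_hours=int(env.get("MAX_FILE_AGE_HOURS", "24")),
    )
```

Passing a plain dictionary instead of `os.environ` makes it easy to check a planned configuration before deploying it.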
Configuration Validation¶
The system performs several validation checks:
- Path Validation: Ensures paths are distinct and non-overlapping
- Directory Conflicts: Prevents USER_WATCH_BASE_DIR from being:
  - The same as UPLOAD_PATH
  - The same as WATCH_FOLDER
  - Inside UPLOAD_PATH
  - Containing UPLOAD_PATH
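The conflict rules above can be checked ahead of deployment with a small script. The Python sketch below is an approximation of those rules, not Readur's validation code; `validate_watch_paths` and its error strings are made up for illustration.

```python
from pathlib import Path


def _is_inside(child: Path, parent: Path) -> bool:
    """Return True if child equals parent or lies somewhere beneath it."""
    try:
        child.relative_to(parent)
        return True
    except ValueError:
        return False


def validate_watch_paths(user_watch_base, upload_path, watch_folder):
    """Return a list of conflict descriptions; an empty list means no conflicts."""
    # resolve() canonicalizes the paths so "a/../b" style aliases compare equal.
    base = Path(user_watch_base).resolve()
    upload = Path(upload_path).resolve()
    watch = Path(watch_folder).resolve()
    errors = []
    if base == upload:
        errors.append("USER_WATCH_BASE_DIR is the same as UPLOAD_PATH")
    elif _is_inside(base, upload):
        errors.append("USER_WATCH_BASE_DIR is inside UPLOAD_PATH")
    elif _is_inside(upload, base):
        errors.append("USER_WATCH_BASE_DIR contains UPLOAD_PATH")
    if base == watch:
        errors.append("USER_WATCH_BASE_DIR is the same as WATCH_FOLDER")
    return errors
```

The Docker layout shown in the next section (/app/user_watch, /app/uploads, /app/watch as siblings) would pass this check with no errors.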
Docker Configuration¶
When using Docker, mount the user watch directory:
version: '3.8'
services:
  readur:
    image: readur:latest
    environment:
      - ENABLE_PER_USER_WATCH=true
      - USER_WATCH_BASE_DIR=/app/user_watch
      - WATCH_INTERVAL_SECONDS=30
    volumes:
      - ./user_watch:/app/user_watch
      - ./uploads:/app/uploads
      - ./watch:/app/watch
    ports:
      - "8000:8000"
Kubernetes Configuration¶
For Kubernetes deployments:
apiVersion: v1
kind: ConfigMap
metadata:
  name: readur-config
data:
  ENABLE_PER_USER_WATCH: "true"
  USER_WATCH_BASE_DIR: "/data/user_watch"
  WATCH_INTERVAL_SECONDS: "30"
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: readur
spec:
  # apps/v1 Deployments require a selector that matches the pod template labels;
  # the "app: readur" label is an illustrative choice
  selector:
    matchLabels:
      app: readur
  template:
    metadata:
      labels:
        app: readur
    spec:
      containers:
        - name: readur
          image: readur:latest
          envFrom:
            - configMapRef:
                name: readur-config
          volumeMounts:
            - name: user-watch
              mountPath: /data/user_watch
      volumes:
        - name: user-watch
          persistentVolumeClaim:
            claimName: readur-user-watch-pvc
Security Considerations¶
Username Validation¶
The system enforces strict username validation to prevent security issues:
- Length: 1-64 characters
- Allowed Characters: Alphanumeric, underscore (_), dash (-)
- Prohibited Patterns:
  - Path traversal attempts (.., /)
  - Hidden directories (starting with .)
  - Null bytes or special characters
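These rules amount to an allowlist, which can be expressed as a single pattern. The Python sketch below is an illustration, not Readur's validator: the function name is hypothetical, and it assumes "alphanumeric" means ASCII letters and digits.

```python
import re

# Assumption: "alphanumeric" is limited to ASCII letters and digits.
_USERNAME_RE = re.compile(r"[A-Za-z0-9_-]{1,64}")


def is_valid_watch_username(username: str) -> bool:
    """Accept only 1-64 characters drawn from letters, digits, underscore, and dash."""
    # fullmatch() requires the whole string to match, so "..", "/", a leading
    # ".", and null bytes are all rejected without separate checks.
    return _USERNAME_RE.fullmatch(username) is not None
```

Because every allowed character is listed explicitly, the prohibited patterns never need to be enumerated one by one.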
Directory Permissions¶
- User Isolation: Each user's directory is separate
- Permission Model: 755 (owner: rwx, group: r-x, others: r-x)
- Ownership: Readur process owns all directories
- SGID Bit: Optional for group inheritance
Path Security¶
- Canonicalization: All paths are canonicalized to prevent traversal
- Boundary Checking: Files must be within designated directories
- Validation: Extracted usernames are validated before use
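Canonicalization followed by a boundary check might look like the following Python sketch. It illustrates the general technique rather than Readur's Rust code; `is_within_base` is a hypothetical helper and the paths are examples.

```python
import os


def is_within_base(candidate: str, base_dir: str) -> bool:
    """Return True only if candidate canonicalizes to a path strictly inside base_dir."""
    # realpath() collapses ".." segments and follows symlinks, so the comparison
    # is made on the location the path actually refers to.
    base = os.path.realpath(base_dir)
    target = os.path.realpath(candidate)
    # commonpath() compares whole path components, so a sibling directory such
    # as "user_watch_evil" does not count as being inside "user_watch".
    return target != base and os.path.commonpath([base, target]) == base
```

Comparing components instead of using a plain string prefix test is the important design choice here: a prefix test would wrongly accept /data/user_watch_evil as being inside /data/user_watch.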
Access Control¶
- API Protection: JWT authentication required
- Permission Levels:
  - Users: Can only access their own directory
  - Admins: Can manage all directories
- Directory Creation: Users can create their own, admins can create any
- Directory Deletion: Admin-only operation
Audit Considerations¶
- Logging: All directory operations are logged
- File Attribution: Documents tracked to source user
- Access Tracking: API access logged with user context
Troubleshooting¶
Common Issues and Solutions¶
Issue: Per-user watch directories not working¶
Symptoms: Files in user directories are not processed
Solutions:
1. Verify that the feature is enabled (ENABLE_PER_USER_WATCH=true).
2. Check that the base directory exists and has correct permissions: verify that the base watch directory has been created with proper ownership.
3. Review logs for errors: search for watch directory related error messages in the application logs.
Issue: "User watch service not initialized" error¶
Symptoms: API returns 500 error when accessing watch directories
Solutions:
1. Ensure ENABLE_PER_USER_WATCH=true in configuration
2. Restart Readur service
3. Check initialization logs for errors
Issue: Files not being detected¶
Symptoms: Files placed in watch directory are not processed
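Solutions:
1. Check file permissions: the Readur process must be able to read the file.
2. Verify that the file type is supported (see Supported File Types).
3. Check the file age restriction: files older than MAX_FILE_AGE_HOURS (default: 24) are not processed.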
Issue: Permission denied errors¶
Symptoms: Users cannot write to their watch directories
Solutions:
1. Fix directory ownership (see Step 2: Create Base Directory).
2. Set correct permissions.
3. For shared access, use group permissions (see Step 3: Configure Directory Permissions).
Issue: Duplicate documents created¶
Symptoms: Same file creates multiple documents
Solutions:
1. Ensure the file stability check is adequate: increase FILE_STABILITY_CHECK_MS if files are picked up while they are still being written.
2. Check for file system issues (timestamps, inode changes)
3. Review deduplication settings in configuration
Diagnostic Commands¶
# Check if user watch is enabled
curl -H "Authorization: Bearer $TOKEN" \
https://readur.example.com/api/users/$USER_ID/watch-directory
# List all user directories
ls -la /data/user_watch/
# Check file watcher logs
journalctl -u readur | grep -i "watch"
# Monitor file processing in real-time
tail -f /var/log/readur/readur.log | grep -E "(Processing new file|watch)"
# Check directory permissions
namei -l /data/user_watch/username/
# Find recently modified files
find /data/user_watch -type f -mmin -60
# Check disk space
df -h /data/user_watch
Examples and Best Practices¶
Example 1: Small Team Setup¶
For a team of 5-10 users with local file access:
# .env configuration
ENABLE_PER_USER_WATCH=true
USER_WATCH_BASE_DIR=/srv/readur/user_watches
WATCH_INTERVAL_SECONDS=60
FILE_STABILITY_CHECK_MS=2000
MAX_FILE_AGE_HOURS=72
# Directory structure
/srv/readur/user_watches/
├── alice/
├── bob/
├── charlie/
├── diana/
└── edward/
Example 2: Enterprise Network Share Integration¶
For larger organizations with network shares:
# Mount network share
sudo mount -t cifs //fileserver/readur /mnt/readur \
-o username=readur,domain=COMPANY
# .env configuration
ENABLE_PER_USER_WATCH=true
USER_WATCH_BASE_DIR=/mnt/readur/user_watches
WATCH_INTERVAL_SECONDS=120 # Slower for network
FILE_STABILITY_CHECK_MS=5000 # Higher for network delays
Example 3: Automated Document Workflow¶
Script for automatic document routing:
#!/usr/bin/env python3
"""
Auto-route documents to user watch directories based on metadata
"""
import shutil
from pathlib import Path

WATCH_BASE_DIR = Path("/data/user_watch")


def route_document(file_path, user_mapping, default_user="general"):
    """Route document to appropriate user watch directory"""
    # Extract metadata (example: from filename)
    filename = Path(file_path).name
    # Determine target user (implement your logic): first matching keyword wins
    target_user = default_user
    for keyword, user in user_mapping.items():
        if keyword in filename.lower():
            target_user = user
            break
    # Move to user's watch directory
    user_watch_dir = WATCH_BASE_DIR / target_user
    if user_watch_dir.exists():
        dest = user_watch_dir / filename
        shutil.move(str(file_path), str(dest))
        print(f"Moved {filename} to {target_user}'s watch directory")
    else:
        print(f"User {target_user} watch directory does not exist")


if __name__ == "__main__":
    # Process every PDF currently in the incoming directory (single pass)
    user_mapping = {"invoice": "accounting", "report": "management"}
    incoming_dir = Path("/srv/incoming")
    for file_path in incoming_dir.glob("*.pdf"):
        route_document(file_path, user_mapping)
Example 4: Bulk User Setup¶
PowerShell script for creating multiple user directories:
# bulk-create-watch-dirs.ps1
$baseUrl = "https://readur.example.com/api"
$adminToken = "your-admin-token"
$users = @("alice", "bob", "charlie", "diana", "edward")

# Fetch the user list once instead of on every loop iteration
$userResponse = Invoke-RestMethod `
    -Uri "$baseUrl/users" `
    -Headers @{Authorization="Bearer $adminToken"}

foreach ($username in $users) {
    # Get user ID
    $user = $userResponse | Where-Object {$_.username -eq $username}
    if ($user) {
        # Create watch directory
        $body = @{ensure_created=$true} | ConvertTo-Json
        $result = Invoke-RestMethod `
            -Method Post `
            -Uri "$baseUrl/users/$($user.id)/watch-directory" `
            -Headers @{
                Authorization="Bearer $adminToken"
                "Content-Type"="application/json"
            } `
            -Body $body
        Write-Host "Created watch directory for $username at $($result.watch_directory_path)"
    }
    else {
        Write-Warning "User $username not found; skipping"
    }
}
Best Practices Summary¶
For Administrators¶
- Capacity Planning: Allocate 1-5GB per user for watch directories
- Backup Strategy: Include user watch directories in backup plans
- Monitoring: Set up alerts for disk space and processing failures
- Documentation: Maintain user guide with network paths
- Testing: Test with various file types and sizes before deployment
For Users¶
- File Organization: Use meaningful filenames and folder structure
- File Formats: Prefer PDF for best OCR results
- Batch Processing: Group related documents for upload
- Size Limits: Split large documents if over 50MB
- Patience: Allow processing time before expecting search results
For Developers¶
- API Integration: Use provided client libraries when available
- Error Handling: Implement retry logic for transient failures
- Validation: Validate file types before placing in watch directories
- Monitoring: Track processing status via WebSocket updates
- Caching: Cache user directory paths to reduce API calls
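The retry advice above can be sketched as a small wrapper with exponential backoff. This is a generic Python illustration; `call_with_retry` is a hypothetical helper, and the default `retry_on` tuple uses Python's built-in exceptions, so it should be replaced with the exception types your HTTP client actually raises.

```python
import time


def call_with_retry(operation, attempts=3, base_delay=1.0,
                    retry_on=(ConnectionError, TimeoutError), sleep=time.sleep):
    """Call operation(), retrying on transient errors with exponential backoff."""
    for attempt in range(attempts):
        try:
            return operation()
        except retry_on:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error to the caller
            # Delays grow as base_delay, 2 * base_delay, 4 * base_delay, ...
            sleep(base_delay * (2 ** attempt))
```

A call to the watch-directory endpoints would be wrapped in a zero-argument function or lambda and passed as `operation`. Errors such as 401 or 403 are not transient and should not be retried.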
Performance Optimization¶
- File System: Use SSD storage for watch directories
- Network: Minimize latency for network-mounted directories
- Scheduling: Adjust watch interval based on usage patterns
- Concurrency: Configure OCR workers based on CPU cores
- Cleanup: Implement retention policies for processed files
Migration from Global Watch Directory¶
To migrate from a single global watch directory to per-user directories:
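- Preparation
- Enable Feature: set ENABLE_PER_USER_WATCH=true and USER_WATCH_BASE_DIR, then restart Readur (see Administrator Setup Guide)
- Create User Directories: use Settings → Watch Folder in the web interface, or POST /api/users/{user_id}/watch-directory for each user
- Migrate Documents (optional):
  - Keep existing documents in place
  - Or reassign to appropriate users through the UI
- Update Documentation:
  - Notify users of new directory locations
  - Update any automation scripts
  - Revise backup procedures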