Test Infrastructure Documentation¶
This document provides a comprehensive guide to the test infrastructure in Readur, including test patterns, utilities, common issues, and best practices.
Table of Contents¶
- Test Architecture Overview
- TestContext Pattern
- Test Utilities
- Test Isolation and Environment Variables
- Common Patterns
- Troubleshooting
- Best Practices
Test Architecture Overview¶
Readur uses a three-tier testing approach:
- Unit Tests (src/tests/) - Fast, isolated component tests
- Integration Tests (tests/) - Full system tests with database
- Frontend Tests (frontend/src/__tests__/) - React component and API tests
Test Execution Flow¶
┌─────────────────┐
│   Unit Tests    │ ← No external dependencies
│  (cargo test)   │ ← Milliseconds execution
└────────┬────────┘
         │
┌────────▼────────┐
│Integration Tests│ ← Real database (PostgreSQL)
│  (TestContext)  │ ← In-memory app instance
└────────┬────────┘
         │
┌────────▼────────┐
│ Frontend Tests  │ ← Mocked API responses
│    (Vitest)     │ ← Component isolation
└─────────────────┘
TestContext Pattern¶
The TestContext is the cornerstone of integration testing in Readur. It provides an isolated test environment with a real database.
Basic Usage¶
use readur::test_utils::{TestContext, TestAuthHelper};
#[tokio::test]
async fn test_document_workflow() {
    // Create a new test context with default configuration
    let ctx = TestContext::new().await;
    // Access the app router for making requests
    let app = ctx.app();
    // Access the application state
    let state = ctx.state();
    // Test runs with isolated database
}
How TestContext Works¶
- Database Setup: Spins up a PostgreSQL container using testcontainers
- Migrations: Runs all SQLx migrations automatically
- App Instance: Creates an in-memory Axum router with full API routes
- Isolation: Each test gets its own database container
Custom Configuration¶
use readur::test_utils::{TestContext, TestConfigBuilder};
#[tokio::test]
async fn test_with_custom_config() {
    let config = TestConfigBuilder::default()
        .with_concurrent_ocr_jobs(4)
        .with_upload_path("./test-uploads")
        .with_oidc_enabled(false);
    let ctx = TestContext::with_config(config).await;
}
Making Requests¶
use axum::http::{Request, StatusCode};
use axum::body::Body;
use tower::ServiceExt;
// Direct request to the test app
let request = Request::builder()
    .method("GET")
    .uri("/api/health")
    .body(Body::empty())
    .unwrap();
let response = ctx.app().clone().oneshot(request).await.unwrap();
assert_eq!(response.status(), StatusCode::OK);
Test Utilities¶
TestAuthHelper¶
Handles user creation and authentication in tests:
let auth_helper = TestAuthHelper::new(ctx.app().clone());
// Create a regular user
let mut test_user = auth_helper.create_test_user().await;
// Generates unique username: testuser_<pid>_<thread>_<nanos>
// Create an admin user
let admin_user = auth_helper.create_admin_user().await;
// Login and get token
let token = test_user.login(&auth_helper).await.unwrap();
// Make authenticated request
let response = auth_helper.make_authenticated_request(
    "GET",
    "/api/documents",
    None,
    &token
).await;
Document Helpers¶
Test data builders for consistent document creation:
use readur::test_utils::document_helpers::*;
// Basic test document
let doc = create_test_document(user_id);
// Document with specific hash
let doc = create_test_document_with_hash(
    user_id,
    "test.pdf",
    "abc123".to_string()
);
// Low confidence OCR document
let doc = create_low_confidence_document(user_id, 45.0);
// Document with OCR error
let doc = create_document_with_ocr_error(user_id);
Test User Pattern¶
Each test creates unique users to avoid conflicts:
// Unique username pattern: testuser_<process_id>_<thread_id>_<timestamp_nanos>
// Example: testuser_12345_2_1752870966778668050
// This prevents "Username already exists" errors in parallel tests
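The naming scheme above can be reproduced with the standard library alone. The following is a minimal sketch, not Readur's actual implementation: the function name `unique_name` is hypothetical, and taking the thread number from the `Debug` output of `ThreadId` is an assumption about how the `<thread>` component is obtained.

```rust
use std::thread;
use std::time::{SystemTime, UNIX_EPOCH};

/// Hypothetical helper: builds `<prefix>_<pid>_<thread>_<nanos>`.
fn unique_name(prefix: &str) -> String {
    let pid = std::process::id();
    // `ThreadId` has no stable numeric accessor, so keep only the digits
    // of its Debug form ("ThreadId(2)" becomes "2").
    let thread_id: String = format!("{:?}", thread::current().id())
        .chars()
        .filter(|c| c.is_ascii_digit())
        .collect();
    // Nanoseconds since the Unix epoch; the real resolution depends on the OS clock.
    let nanos = SystemTime::now()
        .duration_since(UNIX_EPOCH)
        .expect("system clock is set before the Unix epoch")
        .as_nanos();
    format!("{prefix}_{pid}_{thread_id}_{nanos}")
}
```

Two tests in different processes or threads differ in the first two components. Two calls on the same thread differ only in the timestamp, so on a platform with a coarse clock it may be worth appending a counter as well.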
Test Isolation and Environment Variables¶
The TESSDATA_PREFIX Problem¶
One of the most challenging issues in the test suite was related to OCR language validation and environment variables.
The Issue¶
- Tests set the TESSDATA_PREFIX environment variable to point to temporary directories
- Environment variables are global and shared across all threads
- When tests run in parallel, they overwrite each other's TESSDATA_PREFIX
- This caused 400 errors when validating OCR languages
The Solution¶
The OCR retry endpoint was modified to build its health checker from a custom tessdata path whenever TESSDATA_PREFIX is set:
// In src/routes/documents/ocr.rs
let health_checker = if let Ok(tessdata_path) = std::env::var("TESSDATA_PREFIX") {
    crate::ocr::health::OcrHealthChecker::new_with_path(tessdata_path)
} else {
    crate::ocr::health::OcrHealthChecker::new()
};
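To illustrate what a checker with an explicit path buys a test, here is a minimal sketch. `LanguageChecker` and `missing_languages` are hypothetical stand-ins, not the real `OcrHealthChecker` API. The sketch assumes a language counts as available when `<lang>.traineddata` exists in the given directory.

```rust
use std::path::PathBuf;

/// Hypothetical stand-in for a checker that is told where tessdata lives
/// instead of looking it up in process-global state.
struct LanguageChecker {
    tessdata_dir: PathBuf,
}

impl LanguageChecker {
    fn new_with_path(tessdata_dir: impl Into<PathBuf>) -> Self {
        Self { tessdata_dir: tessdata_dir.into() }
    }

    /// Returns the requested languages that have no `<lang>.traineddata` file.
    fn missing_languages<'a>(&self, requested: &[&'a str]) -> Vec<&'a str> {
        requested
            .iter()
            .copied()
            .filter(|lang| {
                !self
                    .tessdata_dir
                    .join(format!("{lang}.traineddata"))
                    .is_file()
            })
            .collect()
    }
}
```

Because each checker owns its directory, two tests can validate against different mock tessdata folders at the same time without touching any shared variable.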
Test Setup Example¶
use std::fs;
use tempfile::TempDir;
use readur::test_utils::TestContext;
#[tokio::test]
async fn test_retry_ocr_with_language() {
    // Create temporary directory for tessdata
    let temp_dir = TempDir::new().unwrap();
    let tessdata_path = temp_dir.path();
    // Create mock language files
    fs::write(tessdata_path.join("eng.traineddata"), "mock").unwrap();
    fs::write(tessdata_path.join("spa.traineddata"), "mock").unwrap();
    // Set environment variable (careful with parallel tests: this is
    // process-global and visible to every other test thread)
    let tessdata_str = tessdata_path.to_string_lossy().to_string();
    std::env::set_var("TESSDATA_PREFIX", &tessdata_str);
    let ctx = TestContext::new().await;
    // ... rest of test
}
Best Practices for Environment Variables¶
- Avoid Global State: Prefer passing configuration through constructors
- Use TestContext: It provides isolation for most test scenarios
- Serial Execution: Run tests that must modify environment variables serially, for example with cargo test -- --test-threads=1
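Where serializing the whole suite is too slow, one option is to serialize only the tests that touch the environment. The sketch below uses a process-wide mutex from the standard library. `env_lock` and `with_env_var` are hypothetical helpers, not part of Readur's test utilities, and the approach only works if every test that reads or writes the variable goes through the same lock.

```rust
use std::env;
use std::ffi::OsString;
use std::sync::{Mutex, MutexGuard};

/// Process-wide lock shared by every test that touches environment variables.
static ENV_LOCK: Mutex<()> = Mutex::new(());

fn env_lock() -> MutexGuard<'static, ()> {
    // A panicking test poisons the mutex; the guarded value is `()`,
    // so recovering the guard is harmless.
    ENV_LOCK.lock().unwrap_or_else(|poisoned| poisoned.into_inner())
}

/// Hypothetical helper: runs `body` with `key` set to `value`, then restores
/// the previous value. The lock is held for the whole call.
/// Note: if `body` panics, the variable is not restored.
#[allow(unused_unsafe)] // set_var/remove_var are `unsafe` only from the 2024 edition on
fn with_env_var<T>(key: &str, value: &str, body: impl FnOnce() -> T) -> T {
    let _guard = env_lock();
    let previous: Option<OsString> = env::var_os(key);
    unsafe { env::set_var(key, value) };
    let result = body();
    match previous {
        Some(old) => unsafe { env::set_var(key, old) },
        None => unsafe { env::remove_var(key) },
    }
    result
}
```

A test would wrap only the section that depends on the variable, for example `with_env_var("TESSDATA_PREFIX", &tessdata_str, || { /* ... */ })`. Holding a `std::sync::Mutex` guard across an `.await` needs extra care, so async tests may prefer to do the environment-dependent setup synchronously inside the closure.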
Common Patterns¶
Authentication Test Pattern¶
#[tokio::test]
async fn test_authenticated_endpoint() {
    let ctx = TestContext::new().await;
    let auth_helper = TestAuthHelper::new(ctx.app().clone());
    // Create and login user
    let mut user = auth_helper.create_test_user().await;
    let token = user.login(&auth_helper).await.unwrap();
    // Make authenticated request
    let request = Request::builder()
        .method("GET")
        .uri("/api/protected")
        .header("Authorization", format!("Bearer {}", token))
        .body(Body::empty())
        .unwrap();
    let response = ctx.app().clone().oneshot(request).await.unwrap();
    assert_eq!(response.status(), StatusCode::OK);
}
Document Upload Pattern¶
use reqwest::multipart;
#[tokio::test]
async fn test_document_upload() {
    let ctx = TestContext::new().await;
    let auth_helper = TestAuthHelper::new(ctx.app().clone());
    let mut user = auth_helper.create_test_user().await;
    let token = user.login(&auth_helper).await.unwrap();
    // Create multipart form
    let form = multipart::Form::new()
        .text("tags", "test,document")
        .part(
            "file",
            multipart::Part::bytes(b"test content".to_vec())
                .file_name("test.txt")
                .mime_str("text/plain")
                .unwrap(),
        );
    // Upload document. Note: reqwest sends a real HTTP request, so this
    // pattern needs a server listening on localhost:8000.
    let response = reqwest::Client::new()
        .post("http://localhost:8000/api/documents")
        .header("Authorization", format!("Bearer {}", token))
        .multipart(form)
        .send()
        .await
        .unwrap();
    assert_eq!(response.status(), 201);
}
Database Direct Access Pattern¶
#[tokio::test]
async fn test_database_operations() {
    let ctx = TestContext::new().await;
    let user_id = Uuid::new_v4();
    // Direct database access
    sqlx::query!(
        "INSERT INTO users (id, username, email, password_hash, role)
         VALUES ($1, $2, $3, $4, $5)",
        user_id,
        "testuser",
        "testuser@example.com", // placeholder address
        "hash",
        "user"
    )
    .execute(&ctx.state().db.pool)
    .await
    .unwrap();
    // Verify through API
    // ...
}
Troubleshooting¶
Common Test Failures¶
1. "Username already exists" Error¶
Cause: Parallel tests creating users with the same username
Solution: TestAuthHelper now generates unique usernames from the process ID, thread ID, and a nanosecond timestamp
// Automatic unique username generation
let username = format!(
    "testuser_{}_{}_{}",
    std::process::id(),
    thread_id,
    timestamp_nanos
);
2. "Server is not running" (Integration Tests)¶
Cause: Tests expecting external server on localhost:8000
Solution: Use TestContext instead of external HTTP requests
// ❌ Wrong - expects external server
let response = reqwest::get("http://localhost:8000/api/health").await;
// ✅ Correct - uses TestContext
let response = ctx.app().clone()
    .oneshot(Request::builder()
        .uri("/api/health")
        .body(Body::empty())
        .unwrap())
    .await
    .unwrap();
3. OCR Language Validation Failures (400 errors)¶
Cause: TESSDATA_PREFIX environment variable conflicts
Solution: Use new_with_path() for custom tessdata directories
4. Database Connection Errors¶
Cause: PostgreSQL container not ready or migrations failed
Debug Steps:
# Check if tests can connect to database
RUST_LOG=debug cargo test
# Run single test with output
cargo test test_name -- --nocapture
# Check Docker containers
docker ps
Debugging Techniques¶
Enable Detailed Logging¶
# Full debug output
RUST_LOG=debug cargo test -- --nocapture
# Specific module logging
RUST_LOG=readur::routes=debug cargo test
# With backtrace
RUST_BACKTRACE=1 cargo test
Run Tests Serially¶
# Run tests one at a time instead of in parallel
cargo test -- --test-threads=1
Inspect Test Database¶
// Add debug queries in test
let count: i64 = sqlx::query_scalar("SELECT COUNT(*) FROM users")
    .fetch_one(&ctx.state().db.pool)
    .await
    .unwrap();
println!("User count: {}", count);
Best Practices¶
1. Use Unique Identifiers¶
Always use timestamps or UUIDs for test data, as TestAuthHelper does for usernames, so that parallel tests never collide.
2. Clean Test State¶
TestContext automatically provides isolated databases, but clean up external resources:
// TempDir automatically cleans up
let temp_dir = TempDir::new().unwrap();
// Directory deleted when temp_dir drops
3. Test Both Success and Failure Cases¶
#[tokio::test]
async fn test_endpoint_success() {
    // Happy path test
}
#[tokio::test]
async fn test_endpoint_unauthorized() {
    // No auth token - expect 401
}
#[tokio::test]
async fn test_endpoint_not_found() {
    // Invalid ID - expect 404
}
4. Use Type-Safe Assertions¶
// Parse response to proper types
let body_bytes = axum::body::to_bytes(response.into_body(), usize::MAX)
    .await
    .unwrap();
let document: DocumentResponse = serde_json::from_slice(&body_bytes).unwrap();
// Now assertions are type-safe
assert_eq!(document.filename, "test.pdf");
5. Document Test Purpose¶
#[tokio::test]
async fn test_ocr_retry_with_multiple_languages() {
    // Tests that OCR retry endpoint accepts multiple language codes
    // and validates them against available tessdata files.
    // This ensures multi-language OCR support works correctly.
}
6. Avoid External Dependencies¶
- Use TestContext instead of external servers
- Mock external services when possible
- Use in-memory databases for unit tests
- Create test fixtures instead of relying on external files
7. Handle Async Properly¶
// Use tokio::test for async tests
#[tokio::test]
async fn test_async_operation() {
    // Can use .await here
}
// For timeout handling
use tokio::time::{timeout, Duration};
let result = timeout(
    Duration::from_secs(30),
    long_running_operation()
).await;
Test Organization¶
Directory Structure¶
readur/
├── src/
│   └── tests/              # Unit tests
│       ├── mod.rs
│       ├── auth_tests.rs
│       ├── db_tests.rs
│       └── ...
├── tests/                  # Integration tests
│   ├── integration_ocr_language_endpoints.rs
│   ├── integration_settings_tests.rs
│   └── ...
└── frontend/
    └── src/
        └── __tests__/      # Frontend tests
            ├── components/
            └── pages/
Naming Conventions¶
- Unit tests: test_<component>_<behavior>
- Integration tests: test_<workflow>_<scenario>
- Test files: integration_<feature>_tests.rs
Summary¶
The test infrastructure in Readur provides:
- Isolation: Each test runs in its own environment
- Realism: Integration tests use real databases and full app instances
- Speed: Parallel execution with proper isolation
- Reliability: Unique identifiers prevent conflicts
- Maintainability: Clear patterns and utilities
Key takeaways:
- Always use TestContext for integration tests
- Generate unique test data to avoid conflicts
- Be careful with environment variables in parallel tests
- Use the provided test utilities for common operations
- Test both success and failure scenarios
For more examples, see the existing test files in the tests/ directory.