Date: October 3, 2025
Status: ✅ Accepted
Context: Implementing Proposal #8 - Improve Error Context
Decision Makers: Development Team
When implementing Proposal #8 to improve error messages in the file lock module, we faced a design challenge: how to make errors actionable and helpful without overwhelming users with verbose messages or requiring external infrastructure.
- Actionability: Errors must guide users toward solutions (development principle)
- Brevity: Error messages shouldn't be overwhelming or cluttered
- Accessibility: Help must be available when users need it
- Maintainability: Solution should be easy to maintain and update
- No External Dependencies: Avoid requiring web hosting or internet access
- Runtime Availability: Help must be accessible at runtime, not just in documentation
The initial Proposal #8 suggested embedding extensive troubleshooting steps directly in error messages:
#[error("Failed to acquire lock for '{path}' within {timeout:?}
The lock is currently held by process {holder_pid}.
To resolve this issue:
1. Check if process {holder_pid} is still running:
ps -p {holder_pid}
2. If the process is running and should release the lock:
- Wait for the process to complete its operation
- Or increase the timeout duration
3. If the process is stuck or hung:
- Terminate it: kill {holder_pid}
- Or force terminate: kill -9 {holder_pid}
4. If the process doesn't exist (stale lock):
- This should be handled automatically
- If you see this error repeatedly, report a bug
Current timeout: {timeout:?}
Lock file: {path}.lock")]Problems identified:
- ❌ Too verbose - error messages become documentation
- ❌ Maintenance burden - commands may become outdated
- ❌ Not DRY - duplicates publicly available information
- ❌ Clutters logs and terminal output
Approach: Assign error codes (like Rust's E0001) and link to web documentation.
#[error("Failed to acquire lock for '{path}' (error code: FL001)
See: https://docs.torrust.com/errors/FL001 for troubleshooting")]Pros:
- ✅ Clean separation of concerns
- ✅ Detailed help without cluttering errors
- ✅ Easy to update documentation independently
- ✅ Similar to Rust's own error codes
Cons:
- ❌ Stability problem: Internal errors change frequently, making code numbering unstable
- ❌ Requires infrastructure (hosting docs, numbering scheme)
- ❌ Users need internet access to see help
- ❌ Extra cognitive load (see error → look up code → read docs)
- ❌ Version synchronization issues between binary and docs
Decision: ❌ Rejected due to stability concerns and infrastructure requirements.
Approach: Document errors in Rustdoc, accessible via cargo doc.
/// Failed to acquire lock within timeout period
///
/// ## Troubleshooting
///
/// 1. Check if process is running: ps -p <pid>
/// 2. Wait or increase timeout
/// 3. Terminate stuck processes
#[error("Failed to acquire lock for '{path}'")]
AcquisitionTimeout { /* ... */ }Pros:
- ✅ Documentation lives with the code
- ✅ Easy to maintain
- ✅ Accessible via
cargo doc
Cons:
- ❌ Target audience mismatch: Rustdoc is for developers, not end-users
- ❌ Not accessible at runtime when users need it
- ❌ Users running binaries won't have access to docs
- ❌ Doesn't help with production issues
Decision: ❌ Rejected because help isn't available when users need it most (at runtime).
Approach: Include only one-line hints in error messages.
#[error("Failed to acquire lock for '{path}' (held by process {holder_pid})
Tip: Check if process is running or increase timeout")]Pros:
- ✅ Balances brevity with actionability
- ✅ No external infrastructure
- ✅ Aligns with development principles
Cons:
- ❌ Limited guidance for complex scenarios
- ❌ Users may still struggle without detailed steps
- ❌ Platform-specific commands not easily conveyed
Decision:
Approach: Point to existing tools and generic documentation.
#[error("Failed to acquire lock for '{path}' (held by process {holder_pid})
Check process status with: ps -p {holder_pid}
See: https://docs.torrust.com/troubleshooting#file-locks")]Pros:
- ✅ Leverages existing documentation
- ✅ Can provide deep troubleshooting
- ✅ Documentation evolves independently
Cons:
- ❌ Requires maintaining external docs
- ❌ Requires internet access
- ❌ Link rot over time
- ❌ Version synchronization issues
Decision: ❌ Rejected due to external dependency and maintenance burden.
Approach: Provide concise errors with brief tips, plus an optional .help() method for detailed guidance.
#[derive(Debug, Error)]
pub enum FileLockError {
/// Failed to acquire lock within timeout period
///
/// Use `.help()` for detailed troubleshooting steps.
#[error("Failed to acquire lock for '{path}' within {timeout:?} (held by process {holder_pid})
Tip: Use 'ps -p {holder_pid}' to check if process is running")]
AcquisitionTimeout {
path: PathBuf,
holder_pid: ProcessId,
timeout: Duration,
},
}
impl FileLockError {
/// Get detailed troubleshooting guidance
pub fn help(&self) -> &'static str {
match self {
Self::AcquisitionTimeout { .. } => {
"Lock Acquisition Timeout - Detailed Troubleshooting:
1. Check if holder process is running:
Unix: ps -p <pid>
Windows: tasklist /FI \"PID eq <pid>\"
2. If running: wait or increase timeout
3. If stuck: terminate process
4. If stale: report bug
See docs for more info."
}
}
}
}Usage:
// Basic: just show error
eprintln!("Error: {e}");
// Advanced: show help when needed
if verbose {
eprintln!("\n{}", e.help());
}Pros:
- ✅ Clean separation: error vs help
- ✅ No external infrastructure needed
- ✅ Help always available at runtime
- ✅ Users get help when they need it
- ✅ No version stability concerns
- ✅ Easy to maintain (help with error definition)
- ✅ Balances brevity with actionability
- ✅ Platform-specific guidance possible
- ✅ Aligns with all development principles
Cons:
⚠️ Help strings embedded in binary (small overhead)⚠️ Still some duplication of publicly available info
Decision: ✅ ACCEPTED - Best balance of all considerations.
We adopt the Tiered Help System approach (Option 5) combining:
- Base error message: Concise with essential context
- Brief tip: One-liner actionable hint (from Option 3)
.help()method: Detailed troubleshooting on-demand- Rustdoc: Developer documentation (from Option 2)
use thiserror::Error;
#[derive(Debug, Error)]
pub enum FileLockError {
/// Brief description for developers
///
/// Longer context and when this error occurs.
/// Use `.help()` for user-facing troubleshooting.
#[error("Concise error with context
Tip: Brief actionable hint")]
VariantName {
// Error fields with #[source] where appropriate
},
}
impl FileLockError {
/// Get detailed troubleshooting guidance
///
/// Returns platform-specific steps to resolve the error.
pub fn help(&self) -> &'static str {
match self {
Self::VariantName { .. } => {
"Detailed troubleshooting with:
1. Step-by-step instructions
2. Platform-specific commands
3. Multiple resolution approaches
4. Links to report bugs"
}
}
}
}| Criterion | Error Codes | Rustdoc Only | Brief Tips | External Refs | Tiered Help |
|---|---|---|---|---|---|
| Brevity | ✅ Excellent | ✅ Excellent | ✅ Good | ✅ Good | ✅ Excellent |
| Actionability | ✅ High | ❌ Low | ✅ High | ✅ Very High | |
| Runtime Access | ❌ No | ❌ No | ✅ Yes | ❌ Partial | ✅ Yes |
| No Infrastructure | ❌ Needs web | ✅ Yes | ✅ Yes | ❌ Needs web | ✅ Yes |
| Maintainability | ✅ Easy | ✅ Easy | ✅ Easy | ||
| Stability | ❌ Numbering issues | ✅ Stable | ✅ Stable | ✅ Stable | |
| User Control | ❌ No | ❌ No | ❌ No | ❌ No | ✅ Yes (verbose) |
- Add Rustdoc comment explaining when/why error occurs
- Use
#[error]with concise message + brief tip - Include
#[source]for underlying errors - Implement
.help()method with detailed guidance
// Let users control verbosity
match operation() {
Ok(result) => { /* success */ }
Err(e) => {
eprintln!("Error: {e}");
if args.verbose {
eprintln!("\n{}", e.help());
} else {
eprintln!("\nRun with --verbose for detailed troubleshooting");
}
}
}- Test error messages contain tips
- Test
.help()returns non-empty strings for all variants - Test help content includes key troubleshooting terms
- Error Handling Guide - Full implementation guidance
- Development Principles - Actionability principle
- Proposal #8 implementation details - See git history for
docs/refactors/file-lock-improvements.md(completed October 3, 2025)
If we find the tiered help system insufficient, we could:
- Generate HTML docs: Build-time generation from
.help()content - Error catalog: Maintain a searchable error database
- Telemetry integration: Track which errors need better help
For now, the tiered help system provides the best balance of simplicity and effectiveness.
Last Updated: October 3, 2025
Review Date: After Proposal #8 implementation and user feedback