# baktainer
Easily back up databases running in Docker containers.
## Features
- **Database Support**: MySQL, PostgreSQL, MongoDB, and SQLite databases
- **Scheduled Backups**: Run on a schedule using cron expressions
- **Container Discovery**: Define which databases to back up using Docker labels
- **Health Monitoring**: Web dashboard and REST API for monitoring backup status
- **Notifications**: Multi-channel notifications (Slack, Discord, Teams, Email, Webhooks)
- **Backup Rotation**: Automatic cleanup based on age, count, or disk space
- **Encryption**: AES-256-GCM encryption for backup files
- **Compression**: Gzip compression to reduce backup file sizes
- **Performance Monitoring**: Real-time metrics and performance alerts
- **Label Validation**: Schema-based validation with helpful error messages
- **SSL/TLS Support**: Secure Docker API connections
- **High Performance**: Multi-threaded backups with dynamic scaling
## Installation
⚠️ **Security Notice**: Baktainer requires Docker socket access which grants significant privileges. Please review [SECURITY.md](SECURITY.md) for important security considerations and recommended mitigations.
```yaml
services:
  baktainer:
    image: jamez001/baktainer:latest
    container_name: baktainer
    restart: unless-stopped
    ports:
      - "8080:8080" # Health check dashboard
    volumes:
      - ./backups:/backups
      - /var/run/docker.sock:/var/run/docker.sock
    environment:
      - BT_CRON="0 0 * * *" # Backup every day at midnight
      - BT_DOCKER_URL=unix:///var/run/docker.sock
      - BT_THREADS=4
      - BT_LOG_LEVEL=info
      - BT_HEALTH_SERVER_ENABLED=true # Enable health check server
      - BT_COMPRESS=true # Enable compression
      - BT_ROTATION_ENABLED=true # Enable backup rotation
      - BT_RETENTION_DAYS=30 # Keep backups for 30 days
      # SSL Configuration (if using remote Docker API)
      #- BT_SSL=true
      #- BT_CA=/path/to/ca.pem
      #- BT_CERT=/path/to/cert.pem
      #- BT_KEY=/path/to/key.pem
      # Notification Configuration
      #- BT_NOTIFICATION_CHANNELS=slack,log
      #- BT_SLACK_WEBHOOK_URL=https://hooks.slack.com/...
      #- BT_NOTIFY_FAILURES=true
      # Encryption Configuration
      #- BT_ENCRYPTION_ENABLED=true
      #- BT_ENCRYPTION_KEY=your-hex-key
```
For enhanced security, consider using a Docker socket proxy. See [SECURITY.md](SECURITY.md) for detailed security recommendations.
## Environment Variables
### Core Configuration
| Variable | Description | Default |
| -------- | ----------- | ------- |
| BT_CRON | Cron expression for scheduling backups | `0 0 * * *` |
| BT_THREADS | Number of threads to use for backups | `4` |
| BT_LOG_LEVEL | Log level (debug, info, warn, error) | `info` |
| BT_DOCKER_URL | Docker API URL | `unix:///var/run/docker.sock` |
| BT_BACKUP_DIR | Directory to store backups | `/backups` |
### Backup Options
| Variable | Description | Default |
| -------- | ----------- | ------- |
| BT_COMPRESS | Enable gzip compression for backups | `true` |
| BT_ROTATION_ENABLED | Enable automatic backup rotation | `true` |
| BT_RETENTION_DAYS | Days to keep backups (0 = unlimited) | `30` |
| BT_RETENTION_COUNT | Max backups per container (0 = unlimited) | `0` |
| BT_MIN_FREE_SPACE_GB | Minimum free space in GB | `10` |
### Health Monitoring
| Variable | Description | Default |
| -------- | ----------- | ------- |
| BT_HEALTH_SERVER_ENABLED | Enable health check server | `false` |
| BT_HEALTH_PORT | Port for health check server | `8080` |
| BT_HEALTH_BIND | Bind address for health server | `0.0.0.0` |
### SSL/TLS Configuration
| Variable | Description | Default |
| -------- | ----------- | ------- |
| BT_SSL | Enable SSL for Docker connection | `false` |
| BT_CA | Path to CA certificate or certificate data | none |
| BT_CERT | Path to client certificate or certificate data | none |
| BT_KEY | Path to client key or key data | none |
### Encryption
| Variable | Description | Default |
| -------- | ----------- | ------- |
| BT_ENCRYPTION_ENABLED | Enable AES-256-GCM encryption | `false` |
| BT_ENCRYPTION_KEY | Encryption key (hex or base64) | none |
| BT_ENCRYPTION_KEY_FILE | Path to encryption key file | none |
| BT_ENCRYPTION_PASSPHRASE | Passphrase for key derivation | none |
### Notifications
| Variable | Description | Default |
| -------- | ----------- | ------- |
| BT_NOTIFICATION_CHANNELS | Comma-separated list of channels | `log` |
| BT_NOTIFY_SUCCESS | Notify on successful backups | `false` |
| BT_NOTIFY_FAILURES | Notify on backup failures | `true` |
| BT_NOTIFY_WARNINGS | Notify on warnings | `true` |
| BT_NOTIFY_HEALTH | Notify on health issues | `true` |
| BT_SLACK_WEBHOOK_URL | Slack webhook URL | none |
| BT_DISCORD_WEBHOOK_URL | Discord webhook URL | none |
| BT_TEAMS_WEBHOOK_URL | Microsoft Teams webhook URL | none |
| BT_WEBHOOK_URL | Generic webhook URL | none |
## Usage
Add labels to your Docker containers to specify which databases to back up.
```yaml
services:
  db:
    image: postgres:17
    container_name: my-db
    restart: unless-stopped
    volumes:
      - db:/var/lib/postgresql/data
    environment:
      POSTGRES_DB: "${DB_BASE:-database}"
      POSTGRES_USER: "${DB_USER:-user}"
      POSTGRES_PASSWORD: "${DB_PASSWORD:-StrongPassword}"
    labels:
      - baktainer.backup=true
      - baktainer.db.engine=postgres
      - baktainer.db.name=my-db
      - baktainer.db.user=user
      - baktainer.db.password=StrongPassword
      - baktainer.name="MyApp"
```
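To confirm which running containers carry the backup label, the standard Docker CLI label filter can be used. This is a minimal sketch; `baktainer_targets` is a made-up helper name for illustration and is not part of Baktainer.

```bash
# List the names of running containers labelled for backup.
# `baktainer_targets` is an illustrative helper, not part of Baktainer.
baktainer_targets() {
  docker ps --filter "label=baktainer.backup=true" --format '{{.Names}}'
}
```

With the compose example above running, calling `baktainer_targets` should list `my-db`, the container name.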
## Docker Labels for Container Configuration
### Required Labels
| Label | Description | Required For |
| ----- | ----------- | ------------ |
| baktainer.backup | Set to `true` to enable backup for this container | All |
| baktainer.db.engine | Database engine: `mysql`, `postgres`, `mongodb`, `sqlite` | All |
| baktainer.db.name | Name of the database to back up | All |
| baktainer.db.user | Username for the database | MySQL, PostgreSQL, MongoDB |
| baktainer.db.password | Password for the database | MySQL, PostgreSQL, MongoDB |
### Optional Labels
| Label | Description | Default |
| ----- | ----------- | ------- |
| baktainer.name | Application name for backup files | Database name |
| baktainer.compress | Enable compression for this container (`true`/`false`) | `BT_COMPRESS` value |
| baktainer.encrypt | Enable encryption for this container (`true`/`false`) | `BT_ENCRYPTION_ENABLED` value |
| baktainer.backup.priority | Backup priority: `low`, `normal`, `high`, `critical` | `normal` |
| baktainer.backup.retention.days | Days to keep backups for this container | `BT_RETENTION_DAYS` value |
| baktainer.backup.retention.count | Max backups to keep for this container | `BT_RETENTION_COUNT` value |
## Backup Files
The backup files will be stored in the directory specified by the `BT_BACKUP_DIR` environment variable. The files will be named according to the following format:
```
/backups/<date>/<db_name>-<timestamp>.sql.gz
```
Here `<date>` is the date of the backup (`YY-MM-DD` format), `<db_name>` is the value of the `baktainer.name` label (or the database name if the label is not set), and `<timestamp>` is the Unix timestamp of the backup.
By default, backups are compressed with gzip. To disable compression, set `BT_COMPRESS=false` globally or add the `baktainer.compress=false` label to specific containers.
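Because each file name ends in a Unix timestamp, the newest backup for a given name can be found by comparing that number. The sketch below assumes the directory layout described above and that the name itself does not end in `-<digits>`; `latest_backup` is a hypothetical helper, not something Baktainer ships.

```bash
# Print the most recent backup for NAME under DIR, using the Unix
# timestamp embedded in the file name (<name>-<timestamp>.sql...).
# Illustrative helper, not part of Baktainer.
latest_backup() {
  local dir="$1" name="$2" best="" best_ts=-1 f ts
  for f in "$dir"/*/"$name"-*.sql*; do
    [ -f "$f" ] || continue            # the glob matched nothing
    ts="${f##*-}"                      # e.g. 1744600000.sql.gz
    ts="${ts%%.*}"                     # e.g. 1744600000
    case "$ts" in ''|*[!0-9]*) continue ;; esac   # skip non-numeric
    if [ "$ts" -gt "$best_ts" ]; then
      best_ts="$ts"
      best="$f"
    fi
  done
  if [ -n "$best" ]; then
    printf '%s\n' "$best"
  else
    return 1                           # no matching backup found
  fi
}
```

For example, `gunzip -c "$(latest_backup ./backups MyApp)" | head` previews the start of the newest compressed dump for `MyApp`.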
## Health Monitoring & Dashboard
Baktainer includes a comprehensive health monitoring system with a web dashboard and REST API.
### Accessing the Dashboard
When `BT_HEALTH_SERVER_ENABLED=true`, visit **http://localhost:8080** for:
- Real-time backup status and metrics
- Container discovery and configuration
- Performance monitoring with auto-refresh
- System health checks and alerts
### Health Check Endpoints
| Endpoint | Description |
| -------- | ----------- |
| `GET /` | Interactive monitoring dashboard |
| `GET /health` | Health check (200 = healthy, 503 = unhealthy) |
| `GET /status` | Detailed system status and metrics |
| `GET /backups` | Backup history and statistics |
| `GET /containers` | Discovered containers with backup labels |
| `GET /config` | Configuration (credentials sanitized) |
| `GET /metrics` | Prometheus-format metrics |
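The `/health` status codes in the table lend themselves to a simple scripted check, for instance from a cron job or a container healthcheck. A minimal sketch using `curl`; the helper name `baktainer_health` and the default URL (taken from the dashboard address above) are assumptions for illustration.

```bash
# Query /health and translate the HTTP status code into a short message.
# `baktainer_health` is an illustrative helper, not part of Baktainer.
baktainer_health() {
  local url="${1:-http://localhost:8080}" code
  # -s: silent, -o /dev/null: discard the body, -w: print only the status code
  code="$(curl -s -o /dev/null -w '%{http_code}' "$url/health")" || true
  case "$code" in
    200) echo "healthy" ;;
    503) echo "unhealthy"; return 1 ;;
    *)   echo "unreachable or unexpected status: $code"; return 2 ;;
  esac
}
```

The non-zero return codes make the function usable directly in conditionals, for example `baktainer_health || echo "check the dashboard"`.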
### Prometheus Integration
Use the `/metrics` endpoint to integrate with monitoring systems:
```yaml
# Prometheus scrape config
scrape_configs:
  - job_name: 'baktainer'
    static_configs:
      - targets: ['baktainer:8080']
    metrics_path: '/metrics'
    scrape_interval: 30s
```
## Notifications
Configure multi-channel notifications for backup events:
### Supported Channels
- **Slack**: Set `BT_SLACK_WEBHOOK_URL`
- **Discord**: Set `BT_DISCORD_WEBHOOK_URL`
- **Microsoft Teams**: Set `BT_TEAMS_WEBHOOK_URL`
- **Generic Webhook**: Set `BT_WEBHOOK_URL`
- **Logs**: Always available
### Example Configuration
```yaml
environment:
  - BT_NOTIFICATION_CHANNELS=slack,log
  - BT_SLACK_WEBHOOK_URL=https://hooks.slack.com/services/YOUR/WEBHOOK/URL
  - BT_NOTIFY_FAILURES=true
  - BT_NOTIFY_SUCCESS=false
  - BT_NOTIFY_WARNINGS=true
```
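Before wiring a Slack webhook into Baktainer, it can help to confirm the URL works on its own. The sketch below posts a fixed test message using Slack's incoming-webhook JSON format; it says nothing about the payload Baktainer itself sends, and `check_slack_webhook` is a hypothetical helper name.

```bash
# Send a fixed test message to a Slack incoming webhook.
# Illustrative helper, not part of Baktainer.
check_slack_webhook() {
  local url="$1"
  curl -s -X POST -H 'Content-type: application/json' \
    --data '{"text": "Baktainer webhook test"}' "$url"
}
```

Call it with the same value you plan to use for `BT_SLACK_WEBHOOK_URL`; a working webhook shows the test message in the target channel.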
## Backup Encryption
Secure your backups with AES-256-GCM encryption:
```yaml
environment:
  - BT_ENCRYPTION_ENABLED=true
  - BT_ENCRYPTION_KEY=your-256-bit-hex-key
  # Or use a key file:
  # - BT_ENCRYPTION_KEY_FILE=/path/to/keyfile
  # Or derive from passphrase:
  # - BT_ENCRYPTION_PASSPHRASE=your-secure-passphrase
```
Encrypted files use the `.enc` extension and include authentication data for integrity verification.
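The example above expects a 256-bit key in hex, which is 32 random bytes written as 64 hex characters. One way to produce such a value is sketched below; `generate_bt_key` is a hypothetical helper, and how Baktainer parses the key beyond "hex or base64" is not covered here.

```bash
# Print a random 256-bit key as 64 lowercase hex characters.
# Illustrative helper, not part of Baktainer.
generate_bt_key() {
  if command -v openssl >/dev/null 2>&1; then
    openssl rand -hex 32
  else
    # Fallback without openssl: read 32 bytes from the kernel RNG
    head -c 32 /dev/urandom | od -An -tx1 | tr -d ' \n'
    echo
  fi
}
```

The output can be used as the value of `BT_ENCRYPTION_KEY`. Keep it out of version control, since anyone holding the key can decrypt the backups.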
## Backup Rotation & Cleanup
Automatic backup cleanup keeps your storage manageable:
### Configuration Options
```yaml
environment:
  - BT_ROTATION_ENABLED=true # Enable automatic cleanup
  - BT_RETENTION_DAYS=30 # Keep backups for 30 days
  - BT_RETENTION_COUNT=0 # Max backups per container (0 = unlimited)
  - BT_MIN_FREE_SPACE_GB=10 # Trigger cleanup when free space < 10GB
```
### Per-Container Overrides
```yaml
labels:
  - baktainer.backup.retention.days=7 # Keep this container's backups for 7 days
  - baktainer.backup.retention.count=5 # Keep max 5 backups for this container
```
### Cleanup Behavior
- **Time-based**: Remove backups older than specified days
- **Count-based**: Keep only N most recent backups per container
- **Space-based**: Automatic cleanup when disk space is low
- **Smart cleanup**: Removes empty date directories after cleanup
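The time-based rule can be previewed before rotation is enabled. The sketch below lists backups whose file-name timestamp is older than a given number of days. It is a read-only approximation for inspection: whether Baktainer itself compares the embedded timestamp or the file modification time is an assumption not confirmed here, and `expired_backups` is a hypothetical helper.

```bash
# List backups under DIR older than DAYS, judged by the Unix timestamp
# in the file name. An optional third argument overrides "now" (epoch
# seconds). Nothing is deleted. Illustrative helper, not part of Baktainer.
expired_backups() {
  local dir="$1" days="$2" now="${3:-$(date +%s)}" cutoff f ts
  # 0 means "unlimited" for BT_RETENTION_DAYS, so nothing expires.
  [ "$days" -gt 0 ] || return 0
  cutoff=$(( now - days * 86400 ))
  for f in "$dir"/*/*; do
    [ -f "$f" ] || continue
    ts="${f##*-}"                      # e.g. 1744600000.sql.gz
    ts="${ts%%.*}"                     # e.g. 1744600000
    case "$ts" in ''|*[!0-9]*) continue ;; esac   # skip non-numeric
    if [ "$ts" -lt "$cutoff" ]; then
      printf '%s\n' "$f"
    fi
  done
}
```

For example, `expired_backups ./backups 30` prints the files a 30-day retention policy would be expected to remove under these assumptions.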
## Testing
The project includes comprehensive test coverage with both unit and integration tests.
### Running Tests
```bash
# Run all tests
cd app && bundle exec rspec
# Run tests with coverage report
cd app && COVERAGE=true bundle exec rspec
# Run only unit tests
cd app && bundle exec rspec spec/unit/
# Run only integration tests (requires Docker)
cd app && bundle exec rspec spec/integration/
```
### Test Coverage
- **Line Coverage**: 94.94% (150/158 lines)
- **Branch Coverage**: 71.11% (32/45 branches)
- Tests cover all database engines, container discovery, error handling, and backup workflows
- Integration tests validate full backup operations with real Docker containers
### Test Commands
```bash
# Quick unit tests
bin/test
# All tests with coverage
bin/test --all --coverage
# Integration tests with setup/cleanup
bin/test --integration --setup --cleanup
```
## Development Roadmap
### ✅ Completed Features
- [x] **Database Support**: MySQL, PostgreSQL, MongoDB, and SQLite backups
- [x] **Scheduling**: Cron-based backup scheduling
- [x] **Container Discovery**: Docker label-based configuration
- [x] **Docker Integration**: Support for socket, TCP, SSL/TLS connections
- [x] **Compression**: Gzip compression for backup files
- [x] **Health Monitoring**: Web dashboard and REST API monitoring
- [x] **Notifications**: Multi-channel notifications (Slack, Discord, Teams, Webhooks)
- [x] **Backup Rotation**: Automatic cleanup based on age, count, and disk space
- [x] **Encryption**: AES-256-GCM encryption for backup files
- [x] **Performance Monitoring**: Real-time metrics and alerts
- [x] **Label Validation**: Schema-based validation with helpful error messages
- [x] **High Performance**: Multi-threaded backups with dynamic scaling
- [x] **Comprehensive Testing**: 94.94% line coverage with unit and integration tests
### 🔄 In Progress
- [ ] Backup streaming for large databases
- [ ] Advanced retry strategies with exponential backoff
### 📋 Future Enhancements
- [ ] Individual hooks for completed backups
- [ ] Hooks for fully completed backup cycles
- [ ] Configurable timeout limits for each backup
- [ ] Database-specific optimization settings
- [ ] Backup verification and integrity checking
- [ ] Multi-region backup replication
- [ ] Advanced alerting rules and thresholds