docs: complete Phase 0 architecture — spec updates, review fixes, and link portability
Update four existing specs (overview, server, napi-and-pubsub, call-protocol) to reflect Phase 0 decisions: three-layer model, IdentityProvider, ForwardingPolicy, OperationEnv, static/dynamic config split. Review all 9 Phase 0a ADRs (026-034) for consistency. Fix 4 critical issues from architecture review: missing OQ-SVC-05 in open-questions.md, deprecated hub terminology, undefined AuthService and noq terms. Replace inline OQ text with cross-references per format rules. Add ConfigServiceImpl definition to configuration.md. Port absolute workspace paths to project-relative links by copying referenced docs (feasibility, certbot, fail2ban, event_source_types) into docs/research/.
56
docs/research/ops/certbot.md
Normal file
@@ -0,0 +1,56 @@
# Certbot — dev1

## Overview

Let's Encrypt TLS certificates, managed by certbot and used by nginx for HTTPS.

## Installed

certbot, installed as a snap package on Ubuntu 24.04.

## Certificates

| Domain | Expiry | Path |
|--------|--------|------|
| git.alk.dev | 2026-06-18 | /etc/letsencrypt/live/git.alk.dev/ |

## File Locations

```
/etc/letsencrypt/live/git.alk.dev/
├── fullchain.pem    # Server cert + chain
├── privkey.pem      # Private key
├── cert.pem         # Server cert only
├── chain.pem        # Chain only
└── README
```

Renewal config: `/etc/letsencrypt/renewal/git.alk.dev.conf`

## Renewal

Certbot auto-renews via a systemd timer. Certificates are renewed once fewer than 30 days of validity remain.

```bash
# Check certificates and expiry
sudo certbot certificates

# Dry run renewal
sudo certbot renew --dry-run

# Force renewal (if needed)
sudo certbot renew --force-renewal

# Reload nginx after renewal
sudo systemctl reload nginx
```
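
The 30-day window can also be checked by hand with a little date arithmetic. The following is a minimal sketch, assuming GNU `date` (as shipped on Ubuntu). The function names `days_remaining` and `needs_renewal` are made up for illustration and are not part of certbot:

```bash
# Hypothetical helpers (not part of certbot): date arithmetic for the
# 30-day renewal window. Requires GNU date for the -d option.

# days_remaining EXPIRY [TODAY]: whole days from TODAY (default: today, UTC)
# until EXPIRY. Both arguments are YYYY-MM-DD dates.
days_remaining() {
  local expiry="$1"
  local today="${2:-$(date -u +%F)}"
  local expiry_s today_s
  expiry_s=$(date -u -d "$expiry" +%s)
  today_s=$(date -u -d "$today" +%s)
  echo $(( (expiry_s - today_s) / 86400 ))
}

# needs_renewal EXPIRY [TODAY]: succeeds when fewer than 30 days remain.
needs_renewal() {
  [ "$(days_remaining "$@")" -lt 30 ]
}
```

For example, `days_remaining 2026-06-18 2026-05-19` prints `30`, so `needs_renewal` does not yet succeed for the git.alk.dev certificate on that date; one day later it does. The expiry date itself comes from `sudo certbot certificates`.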

## Initial Certificate

To add a new domain, obtain the certificate with the standalone plugin. nginx does not need to be running: the plugin starts its own temporary web server on port 80, so nothing else may be bound to that port while it runs:

```bash
sudo certbot certonly --standalone -d <domain> --agree-tos -m <email>
```

Port 80 must be open for the ACME HTTP-01 challenge. The api.alk.dev UFW rule allows HTTP for this purpose.
106
docs/research/ops/fail2ban.md
Normal file
@@ -0,0 +1,106 @@
# Fail2ban — dev1

## Status

Active, with 7 jails. Uses the `nftables` ban action and the `systemd` journal backend.

## Active Jails

| Jail | Port | Filter | Max Retry | Find Time | Ban Time | Log Source |
|------|------|--------|-----------|-----------|----------|------------|
| sshd | ssh | sshd | default (5) | default (10m) | default (10m) | systemd journal |
| gitea | ssh | gitea | 5 | 10m | 1h | journald (CONTAINER_NAME=gitea) |
| nginx-badbots | http,https | nginx-badbots | 5 | 10m | 1h | /var/log/nginx/access.log |
| nginx-botsearch | http,https | nginx-botsearch | default | default | default | /var/log/nginx/access.log |
| nginx-limit-req | http,https | nginx-limit-req | default | default | default | /var/log/nginx/error.log |
| nginx-401 | http,https | nginx-401 | 5 | 10m | 1h | /var/log/nginx/access.log |
| nginx-403 | http,https | nginx-403 | 10 | 10m | 30m | /var/log/nginx/access.log |

## Configuration

Default settings in `/etc/fail2ban/jail.d/defaults-debian.conf`:

```ini
[DEFAULT]
banaction = nftables
banaction_allports = nftables[type=allports]
backend = systemd
```

Jail configs in `/etc/fail2ban/jail.d/`:
- `gitea.conf`: Gitea jail with Docker journald log driver
- `nginx.conf`: nginx-related jails

## Gitea Jail Details

Gitea runs in Docker with the `journald` log driver. The jail's `journalmatch` setting restricts fail2ban to the Gitea container's log entries:

```ini
[gitea]
enabled = true
port = ssh
filter = gitea
backend = systemd
journalmatch = CONTAINER_NAME=gitea
maxretry = 5
findtime = 10m
bantime = 1h
action = iptables-allports[chain="DOCKER-USER"]
```
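
Traffic to published container ports traverses the FORWARD chain rather than INPUT, so the ban rules go in `DOCKER-USER`, which Docker evaluates before its own forwarding rules. This ensures bans affect Docker traffic.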

## Custom Filters

The default install includes `gitea.conf`, `nginx-401.conf`, and `nginx-403.conf` in `/etc/fail2ban/filter.d/`. One custom filter is added:

### nginx-badbots (`/etc/fail2ban/filter.d/nginx-badbots.conf`)

Catches malicious requests that the other nginx jails miss: `.env`/`.git` probes, PROPFIND/CONNECT abuse, common exploit paths (`/actuator`, `/cgi-bin`, `/ecp`, `/SDK`), and binary/garbage requests. Matches 400/404/405/413 status codes for known-bad path patterns only — legitimate 404s (e.g. wrong Gitea repo name) are not matched.
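
The actual `failregex` lines are not reproduced in this document. As a rough illustration only, the path-probe part of the rule (a known-bad path prefix combined with one of the four status codes) can be approximated with `grep -E`. The pattern and the function name `is_badbot_line` are assumptions made for this sketch, not the contents of `nginx-badbots.conf`, and the sketch assumes nginx's combined access-log format. It does not attempt to cover PROPFIND/CONNECT abuse or binary requests:

```bash
# Hypothetical approximation of the nginx-badbots rule, NOT the real filter.
# Assumes the combined log format: ... "METHOD /path HTTP/x" STATUS ...
BADBOT_PATTERN='"[A-Z]+ /(\.env|\.git|actuator|cgi-bin|ecp|SDK)[^"]*" (400|404|405|413) '

# is_badbot_line: reads one access-log line on stdin and succeeds if the
# request path starts with a known-bad prefix AND the status is one of
# 400/404/405/413.
is_badbot_line() {
  grep -Eq "$BADBOT_PATTERN"
}
```

Under this sketch a `GET /.env` answered with 404 counts as a failure, while a 404 for an ordinary repository path such as `/alk/wrong-repo` does not, which mirrors the distinction described above. The real filter should be checked with `fail2ban-regex` rather than with this approximation.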

## Lesson Learned: Default Filters Miss Most Scanner Traffic

The default fail2ban nginx filters (`nginx-botsearch`, `nginx-401`, `nginx-403`, `nginx-limit-req`) only catch a narrow subset of malicious requests:

- **nginx-botsearch** only matches `<webmail|phpmyadmin|wordpress|cgi-bin|mysqladmin>` paths returning **404**. It misses `.env`, `.git/config`, `/actuator`, `/SDK`, `/ecp`, crypto mining RPC, PROPFIND/CONNECT abuse, and binary garbage, all of which return 400/405 instead of 404.
- **nginx-401/403** only trigger on those specific status codes. Most scanners get 400 or 405.
- **nginx-limit-req** only triggers when the rate limiter in nginx actually rejects a request.

**Result**: A site with heavy scanner traffic can show zero bans from all four default jails. The `nginx-badbots` custom filter closes this gap by matching known-bad path patterns across 400/404/405/413 responses rather than keying on a single status code.

### Verifying Jail Coverage

When setting up fail2ban on a new host:

1. Install jails and filters first
2. Let traffic flow for a few hours
3. Run `sudo fail2ban-regex /var/log/nginx/access.log /etc/fail2ban/filter.d/<filter>.conf` to verify each filter matches expected lines
4. Check `sudo fail2ban-client status <jail>` for each jail to confirm it shows `Total failed` above 0. If any jail stays at 0 for hours on a public-facing host, its filter likely has a gap
5. Inspect logs manually: `awk '$9>=400' /var/log/nginx/access.log | awk '{print $9}' | sort | uniq -c | sort -rn` shows which status codes scanners are hitting
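
The per-jail check in step 4 can be looped over all jails. This is a minimal sketch that assumes the usual `fail2ban-client` output layout (a `Jail list:` line in the overall status and a `Total failed:` line in each jail's status). The function names are hypothetical:

```bash
# Hypothetical helpers for the coverage check above. The two parsers read
# fail2ban-client output on stdin, so they can be tried on saved output.

# jail_names: extract jail names from `fail2ban-client status` output.
jail_names() {
  sed -n 's/.*Jail list:[[:space:]]*//p' | tr -d ' ' | tr ',' '\n'
}

# total_failed: extract the counter from `fail2ban-client status <jail>`.
total_failed() {
  sed -n 's/.*Total failed:[[:space:]]*//p'
}

# report_coverage: print each jail with its Total failed count and flag
# jails that have not matched any log line so far.
report_coverage() {
  local jail count
  for jail in $(sudo fail2ban-client status | jail_names); do
    count=$(sudo fail2ban-client status "$jail" | total_failed)
    if [ "${count:-0}" -eq 0 ]; then
      echo "$jail: 0 (possible filter gap)"
    else
      echo "$jail: $count"
    fi
  done
}
```

Running `report_coverage` after a few hours of traffic gives a one-line summary per jail. A zero is only a hint: a quiet jail such as `nginx-limit-req` may legitimately have nothing to match.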

### Adding the nginx-badbots Filter to a New Host

1. Copy `/etc/fail2ban/filter.d/nginx-badbots.conf` to the new host
2. Append the jail config to `/etc/fail2ban/jail.d/nginx.conf`:

```ini
[nginx-badbots]
enabled = true
port = http,https
filter = nginx-badbots
logpath = /var/log/nginx/access.log
maxretry = 5
findtime = 10m
bantime = 1h
```

3. Reload fail2ban: `sudo fail2ban-client reload`

## Commands

```bash
# List all active jails
sudo fail2ban-client status
# Show failure/ban counters and banned IPs for the gitea jail
sudo fail2ban-client status gitea
# Remove a ban from the gitea jail
sudo fail2ban-client set gitea unbanip <IP>
# Follow the fail2ban service log
sudo journalctl -u fail2ban -f
```