Compare commits


11 Commits

Author SHA1 Message Date
da0c5008df docs: gitea mis-route fix + robust WAF route propagation (#609)
2026-06-15 18:27:27 +02:00
CyberMind
994b48f39d
Merge pull request #610 from CyberMind-FR/feature/609-waf-robust-route-propagation-dir-bind-mo
fix(waf): robust route propagation — dir bind-mount + live-reload
2026-06-15 18:26:18 +02:00
e13cf925f1 fix(waf): robust route propagation — dir bind-mount + addon live-reload (closes #609)
The #603 FILE bind-mount of haproxy-routes.json binds an inode; route tools
edit via jq>tmp&&mv (new inode) so changes went stale until a container
restart (surfaced fixing git.maegia.tv mis-route). Now: wafctl uses a
DIRECTORY bind-mount (host /srv/mitmproxy -> /var/lib/secubox-waf-routes ro)
+ symlink, and the addon (both synced copies) live-reloads haproxy-routes.json
on mtime change (throttled 10s) in requestheaders -> route edits apply with
NO restart. Verified live: jq+mv add -> live-reloaded 256 routes, 0 restart.
2026-06-15 18:25:56 +02:00
6dba5a08d6 docs: WAF open-proxy fix + behind-WAF media cache (#605, #607) 2026-06-15 18:10:49 +02:00
CyberMind
211cff09b5
Merge pull request #608 from CyberMind-FR/feature/607-waf-behind-waf-media-cache-image-video-s
feat(waf): behind-WAF media cache (image/video/static) for hosted vhosts
2026-06-15 18:09:44 +02:00
3290f3b7c0 feat(waf): behind-WAF media cache for hosted vhosts (closes #607)
New media_cache.py addon (both synced copies) caches cacheable GET
media/static (image/video/audio/font/css/js) from our vhosts on disk
(URL key, 16MB/obj, 2GB LRU, TTL from max-age) and serves repeats from
cache. NOT a bypass: requests still pass secubox_waf inspection; only the
response body is served from a WAF-populated cache. Loaded in the LXC
mitmproxy.service; wafctl creates the cache dir. Toggle via
/data/mitmproxy/media-cache.json (default on). Verified live: HIT.
2026-06-15 18:09:25 +02:00
CyberMind
72b7eca12e
Merge pull request #606 from CyberMind-FR/feature/605-waf-refuse-unmapped-hosts-close-open-for
fix(waf): refuse unmapped hosts — close open forward-proxy (loops + 72% errors)
2026-06-15 17:19:25 +02:00
0d1c49307e fix(waf): refuse unmapped hosts — close open forward-proxy (closes #605)
In --mode regular the addon relayed any Host; HAProxy default_backend made
the WAF an open forward proxy abused by scanners (~72% error churn + 11
loop-508s/hr). requestheaders now serves ONLY our vhosts (routes / our
domains via routes-derived local_suffixes -> nginx 9080 / SELF_HOSTS) and
returns 421 otherwise with no upstream connect. Applied to both synced
secubox_waf.py copies. Verified live: 0 external server-connects, 0 loops,
apt/admin/kbin 200, scanners 421.
2026-06-15 17:19:06 +02:00
CyberMind
8bb546c689
Merge pull request #604 from CyberMind-FR/feature/603-waf-port-live-mitmproxy-fixes-to-source
fix(waf): port live mitmproxy fixes to source — mitmproxy-11 routing + routes bind-mount
2026-06-15 16:47:18 +02:00
05d6f97b44 fix(waf): port live mitmproxy fixes to source — mitmproxy-11 routing + routes bind-mount (closes #603)
- requestheaders hook in both synced secubox_waf.py copies: mitmproxy 11
  opens the upstream connection before the request hook, so the in-request
  redirect was too late and routed vhosts hit their public IP. Set
  flow.server_conn.address in requestheaders instead.
- wafctl: bind-mount host /srv/mitmproxy/haproxy-routes.json into the LXC at
  the addon's read path /data/mitmproxy/haproxy-routes.json (they had drifted
  → routes_count 0 → no routing); ensure the host file exists at provision.
- mitmproxy.service: warn any ExecStart drop-in must keep --set confdir
  (a dropped confdir crash-looped the WAF via ProtectHome).

Ported from the live gk2 fixes (HISTORY 2026-06-15).
2026-06-15 16:47:00 +02:00
28a73c8477 docs: WAF mitmproxy restored (confdir + mitmproxy-11 routing + routes bind-mount) 2026-06-15 16:39:01 +02:00
9 changed files with 802 additions and 8 deletions

View File

@ -3,6 +3,49 @@
---
## 2026-06-15 — gitea mis-route fix + robust WAF route propagation
- **gitea (`git.maegia.tv`) 404 → 200.** Pure routing-table error: its WAF
route pointed at `192.168.1.200:8000` (unrelated nginx) instead of the gitea
LXC `10.100.0.40:3000`. Corrected the route; gitea container was healthy
throughout. (`gitea.gk2`→nginx:9080 and `git.gk2`→gitea:3000 were already OK.)
- **Robust route propagation (#609/PR #610, mitmproxy 1.0.8 + waf 1.2.6).**
Fixing gitea surfaced that the #603 *file* bind-mount binds an inode, so route
tools (`jq > tmp && mv` = new inode) didn't reach the addon until a container
restart. Now: **directory** bind-mount (host `/srv/mitmproxy` →
`/var/lib/secubox-waf-routes`, ro) + symlink, and the addon **live-reloads**
`haproxy-routes.json` on mtime change (10 s throttle, in `requestheaders`).
Verified live: `jq+mv` add → `[routes] live-reloaded 256 routes`, **0
restart**. Ported to source (both synced `secubox_waf.py` copies + wafctl) +
rebuilt into apt.secubox.in.
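
The inode behaviour behind this fix can be reproduced outside the container. The sketch below is illustrative only and is not the shipped addon code: `replace_atomically`, `ThrottledReloader` and `demo` are hypothetical names, a temporary directory stands in for `/srv/mitmproxy`, and `new.example` is a made-up route. It shows that a write-then-rename gives the path a new inode (which is why a file bind-mount went stale) and how an mtime check throttled to 10 s would pick the change up.

```python
import json
import os
import tempfile
import time


def replace_atomically(path: str, data: dict) -> None:
    """Mimic `jq ... > tmp && mv tmp path`: write a sibling file, then rename."""
    tmp = path + ".tmp"
    with open(tmp, "w", encoding="utf-8") as f:
        json.dump(data, f)
    os.replace(tmp, path)  # `path` now names a different inode


class ThrottledReloader:
    """Hypothetical helper: re-read a JSON file when its mtime changes,
    touching the filesystem at most once per `interval` seconds."""

    def __init__(self, path: str, interval: float = 10.0, clock=time.monotonic):
        self.path = path
        self.interval = interval
        self.clock = clock  # injectable so the throttle can be tested
        self.data: dict = {}
        self._mtime = None
        self._last_check = None

    def maybe_reload(self) -> bool:
        now = self.clock()
        if self._last_check is not None and now - self._last_check < self.interval:
            return False  # throttled: no stat() at all
        self._last_check = now
        try:
            mtime = os.stat(self.path).st_mtime_ns
        except OSError:
            return False  # file missing: keep the routes already loaded
        if mtime == self._mtime:
            return False
        with open(self.path, encoding="utf-8") as f:
            self.data = json.load(f)
        self._mtime = mtime
        return True


def demo() -> dict:
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "haproxy-routes.json")
        replace_atomically(path, {"git.maegia.tv": ["10.100.0.40", 3000]})
        inode_before = os.stat(path).st_ino

        fake_now = [0.0]
        reloader = ThrottledReloader(path, clock=lambda: fake_now[0])
        first_load = reloader.maybe_reload()

        replace_atomically(path, {"git.maegia.tv": ["10.100.0.40", 3000],
                                  "new.example": ["10.100.0.41", 8080]})
        # Push the mtime forward explicitly so the change is unambiguous even
        # on filesystems with coarse timestamps.
        st = os.stat(path)
        os.utime(path, ns=(st.st_atime_ns, st.st_mtime_ns + 1_000_000_000))

        throttled_call = reloader.maybe_reload()  # 0 s elapsed: skipped
        fake_now[0] = 11.0
        second_load = reloader.maybe_reload()     # past the 10 s window

        return {"inode_changed": st.st_ino != inode_before,
                "first_load": first_load,
                "throttled_call": throttled_call,
                "second_load": second_load,
                "routes": len(reloader.data)}


if __name__ == "__main__":
    print(demo())
```

A process (or a file bind-mount) that holds the old inode never sees the second write. Resolving the path again through a directory mount does, which is what the mtime check relies on.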
## 2026-06-15 — WAF hardening + perf: close open-proxy, behind-WAF media cache
Follow-up to the WAF restoration. Three findings investigated; two fixed.
- **Open forward-proxy / loops (#605/PR #606, mitmproxy 1.0.6 + waf 1.2.4).**
`--mode regular` + HAProxy `default_backend mitmproxy_inspector` made the WAF
an open proxy: internet scanners (114.66.25.146, 211.154.17.165,
hashtagbrock.nl) drove a **72% backend-error rate** + 11 self-loop 508s/hr.
The `requestheaders` hook now serves ONLY our vhosts (routes / our domains
via routes-derived `local_suffixes` → nginx :9080 / `SELF_HOSTS`) and returns
**421 with no upstream connect** otherwise. Live: 0 external server-connects,
0 loop-508s, apt/admin/kbin 200, scanners 421.
- **Behind-WAF media cache (#607/PR #608, mitmproxy 1.0.7 + waf 1.2.5).** New
`media_cache.py` addon caches cacheable GET media/static (image/video/audio/
font/css/js) from our vhosts on disk (URL key, 16 MB/obj, 2 GB LRU, TTL from
`max-age`) and serves repeats from cache — backend-load + latency win for
hosted media. **Not a bypass**: requests still pass `secubox_waf` inspection;
only the response body is served from a WAF-populated cache. Toggle
`/data/mitmproxy/media-cache.json` (default on). Live: `X-SecuBox-Cache: HIT`.
Gate fix vs the toolbox copy: the size gate checks the actual body length, not
only `Content-Length`, because our nginx sends chunked responses.
- **WG R3 tunnel** (`wg-toolbox`, 4 peers, 4 `mitm-wg-worker@{1..4}`) is
healthy — not the bottleneck; the WAF open-proxy churn was. All fixes ported
to source (both synced `secubox_waf.py` copies) + rebuilt into apt.secubox.in.
**Still optional:** relax the forced `Connection: close` (FD-leak fix #496) to
bounded keep-alive now that scanner churn is gone — lower per-request latency.
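
The host gate from the open-proxy fix can be sketched as a pure function, independent of mitmproxy. This is a simplified illustration under stated assumptions, not the addon itself: `derive_local_suffixes` and `resolve_upstream` are hypothetical names, `self_hosts` stands in for the addon's `SELF_HOSTS` (whose contents are not shown here), and the route table is a two-entry example. A return value of `None` corresponds to answering 421 without opening an upstream connection.

```python
from typing import Optional, Tuple

Upstream = Tuple[str, int]

# nginx catch-all used for our own domains that have no explicit route
CATCH_ALL: Upstream = ("192.168.1.200", 9080)


def derive_local_suffixes(routes: dict) -> set:
    """Last two labels of every routed hostname; IP-literal keys are skipped."""
    suffixes = set()
    for host in routes:
        labels = host.split(".")
        if len(labels) >= 2 and not labels[-1].isdigit():
            suffixes.add(".".join(labels[-2:]))
    return suffixes


def resolve_upstream(host: str, routes: dict, self_hosts: set,
                     catch_all: Upstream = CATCH_ALL) -> Optional[Upstream]:
    """Return the backend for `host`, or None when the request must be refused."""
    if host in routes:
        ip, port = routes[host]
        return (ip, port)
    suffixes = derive_local_suffixes(routes)
    if host in self_hosts or any(
            host == s or host.endswith("." + s) for s in suffixes):
        return catch_all
    return None  # caller answers 421 and never connects upstream


if __name__ == "__main__":
    example_routes = {
        "git.maegia.tv": ["10.100.0.40", 3000],
        "10.100.0.60": ["10.100.0.60", 8080],  # IP key: contributes no suffix
    }
    for h in ("git.maegia.tv", "cdn.maegia.tv", "hashtagbrock.nl"):
        print(h, "->", resolve_upstream(h, example_routes, set()))
```

One design caveat of the two-label heuristic: a route under a multi-part public suffix (for example a `.co.uk` host) would make every name under that suffix count as local. With the domains in the current route table this does not appear to matter, but a public-suffix-aware split would be safer if such routes are ever added.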
## 2026-06-15 — APT repo: all packages published + signed (apt.secubox.in)
Made the apt repo at `https://admin.gk2.secubox.in/repo/` (served from
@ -25,12 +68,32 @@ Made the apt repo at `https://admin.gk2.secubox.in/repo/` (served from
secubox-core and others from every build). 1 pkg failed (sentinelle-gsm,
buildinfo artifact race — deb still produced).
**Blocker for public HTTPS (separate, pre-existing):** `apt.secubox.in` via
HAProxy returns 503 because the **WAF mitmproxy LXC is crash-looping**
(restart #45552, `PermissionError: /home/mitmproxy/.mitmproxy/config.yaml`),
which downs the `mitmproxy_inspector` backend → ALL WAF-inspected vhosts 503
(analyse.gk2 etc., not just apt). Repo is reachable internally (nginx :9080)
and via the `/repo/` WebUI; public apt URL needs the WAF restored.
**Public HTTPS now works — WAF mitmproxy restored (3 stacked bugs).** The WAF
LXC (`mitmproxy`, served via HAProxy `mitmproxy_inspector` → 10.100.0.60:8080)
was down board-wide (every inspected vhost 503/400), blocking public
`apt.secubox.in`. Three compounding faults, all fixed live on gk2:
1. **Crash-loop** (restart #45552): the `cookie-audit.conf` systemd drop-in
(added #156) overrode `ExecStart` but dropped `--set confdir=/data/mitmproxy`
→ mitmdump fell back to `~/.mitmproxy`, which `ProtectHome=true` blocks →
`PermissionError: config.yaml`. Restored the flag in the drop-in (+ copied
the existing CA into `/data/mitmproxy` to preserve identity).
2. **mitmproxy-11 routing**: the LXC addon (`secubox_waf.py`, pre-#499) only
redirected upstream in the `request` hook, but mitmproxy 11 opens the
upstream connection *before* `request` → traffic went to the public IP
(82.67.100.75). Added a `requestheaders` hook that sets
`flow.server_conn.address` (+ request host/port) before the connect.
3. **Route-file drift** (the real killer, `routes_count: 0`): the addon reads
`/data/mitmproxy/haproxy-routes.json`, but the system maintains
`/srv/mitmproxy/haproxy-routes.json` (255 routes). The addon's file was
missing. Fixed by **bind-mounting** the host file into the container at the
addon's path (mount entry added in `/var/lib/lxc/mitmproxy/config`) so they
stay in sync.
Verified: `apt-get update` against `https://apt.secubox.in` fetches a
**GPG-signed** InRelease + Packages (no signature errors), apt sees 130
secubox packages, `.deb` downloads (200). Other inspected vhosts recovered.
Live fixes are durable (container rootfs + LXC config survive restarts);
porting them into the provisioning package is a follow-up.
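
The crash-loop in fault 1 came from an `ExecStart` override that silently lost one flag, which a small lint could catch before a drop-in is deployed. The following is a sketch under assumptions, not part of the package: `exec_start_keeps_confdir` is a hypothetical helper, the `BAD` command line (including the `cookie_audit.py` path) is invented for illustration, and only the `--set confdir=/data/mitmproxy` spelling used by the unit file is recognised.

```python
import shlex

REQUIRED = "confdir=/data/mitmproxy"


def exec_start_keeps_confdir(exec_start: str, required: str = REQUIRED) -> bool:
    """True if the command line contains `--set <required>` as two tokens."""
    # systemd joins lines that end in a backslash; do the same before tokenising
    joined = exec_start.replace("\\\n", " ")
    args = shlex.split(joined)
    return any(flag == "--set" and value == required
               for flag, value in zip(args, args[1:]))


GOOD = (
    "/opt/mitmproxy/bin/mitmdump \\\n"
    "--mode regular \\\n"
    "--set confdir=/data/mitmproxy \\\n"
    "-s /data/mitmproxy/secubox_waf.py"
)

# Hypothetical override that forgot the flag
BAD = "/opt/mitmproxy/bin/mitmdump --mode regular -s /data/mitmproxy/cookie_audit.py"

if __name__ == "__main__":
    print("good:", exec_start_keeps_confdir(GOOD))
    print("bad: ", exec_start_keeps_confdir(BAD))
```

Running such a check over every `ExecStart=` line of a drop-in at package build time would turn this class of outage into a build failure.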
## 2026-06-15 — threat-analyst: global security overview (1.4.3, live on gk2)

View File

@ -0,0 +1,231 @@
# SPDX-License-Identifier: LicenseRef-CMSD-1.0
# Copyright (c) 2026 CyberMind — Gérald Kerma <devel@cybermind.fr>
#
# #607 — behind-WAF media proxy-cache for the mitmproxy inspection LXC.
# Cacheable GET media/static (image / video / audio / font / css / js) served
# by our own vhosts is stored on disk keyed by URL and served from cache on
# repeat requests — cutting backend load + latency for hosted media
# (peertube / photoprism / nextcloud …). NOT a WAF bypass: the request still
# passes secubox_waf inspection (request hook runs first); only the response
# BODY is served from a cache the WAF itself populated from inspected
# responses. Fail-open everywhere — a cache error never breaks the flow.
from __future__ import annotations
import hashlib
import json
import os
import re
import time
from mitmproxy import http
CACHE_DIR = "/data/mitmproxy/cache/media"
STATS = "/data/mitmproxy/logs/media_cache.json"
CONFIG = "/data/mitmproxy/media-cache.json" # {"enabled": true} — default on
MAX_OBJ = 16 * 1024 * 1024 # 16 MB / object
MAX_TOTAL = 2 * 1024 * 1024 * 1024 # 2 GB on disk
DEFAULT_TTL = 3600 # 1 h when upstream gives no max-age
_CACHEABLE = ("image/", "video/", "audio/", "font/", "text/css",
"javascript", "ecmascript", "application/font",
"application/vnd.ms-fontobject")
_MAXAGE = re.compile(r"max-age\s*=\s*(\d+)", re.IGNORECASE)
_index: dict = {}
_total = 0
_stats = {"hits": 0, "misses": 0, "stored": 0, "evicted": 0,
"bytes_served": 0, "since": int(time.time())}
_last_flush = 0.0
_cfg = {"enabled": True}
_cfg_mtime = 0.0


def _key(url: str) -> str:
    return hashlib.sha256(url.encode("utf-8", "ignore")).hexdigest()


def _paths(key: str):
    d = os.path.join(CACHE_DIR, key[:2])
    return os.path.join(d, key), os.path.join(d, key + ".m")


def _enabled() -> bool:
    global _cfg, _cfg_mtime
    try:
        st = os.stat(CONFIG)
        if st.st_mtime != _cfg_mtime:
            _cfg_mtime = st.st_mtime
            with open(CONFIG, encoding="utf-8") as f:
                _cfg = json.load(f)
    except FileNotFoundError:
        pass
    except Exception:
        return True
    return bool(_cfg.get("enabled", True))


def _cacheable_ct(ct: str) -> bool:
    ct = (ct or "").split(";", 1)[0].strip().lower()
    return bool(ct) and any(f in ct for f in _CACHEABLE)


def _flush_stats(force: bool = False) -> None:
    global _last_flush
    now = time.time()
    if not force and (now - _last_flush) < 5:
        return
    _last_flush = now
    try:
        os.makedirs(os.path.dirname(STATS), exist_ok=True)
        with open(STATS, "w", encoding="utf-8") as f:
            json.dump({**_stats, "objects": len(_index),
                       "bytes_cached": _total, "updated": int(now)}, f)
    except Exception:
        pass


def _load_index() -> None:
    global _total
    try:
        for sub in os.listdir(CACHE_DIR):
            d = os.path.join(CACHE_DIR, sub)
            if not os.path.isdir(d):
                continue
            for name in os.listdir(d):
                if name.endswith(".m"):
                    continue
                fp = os.path.join(d, name)
                try:
                    st = os.stat(fp)
                    meta = {}
                    mp = fp + ".m"
                    if os.path.exists(mp):
                        with open(mp, encoding="utf-8") as mf:
                            meta = json.load(mf)
                    _index[name] = {"size": st.st_size, "exp": meta.get("exp", 0),
                                    "atime": st.st_atime, "ct": meta.get("ct", "")}
                    _total += st.st_size
                except Exception:
                    pass
    except FileNotFoundError:
        pass


def _evict_if_needed() -> None:
    global _total
    if _total <= MAX_TOTAL:
        return
    for key, e in sorted(_index.items(), key=lambda kv: kv[1]["atime"]):
        if _total <= MAX_TOTAL:
            break
        body, meta = _paths(key)
        for p in (body, meta):
            try:
                os.remove(p)
            except OSError:
                pass
        _total -= e["size"]
        _index.pop(key, None)
        _stats["evicted"] += 1


class MediaCache:
    def __init__(self):
        try:
            os.makedirs(CACHE_DIR, exist_ok=True)
            _load_index()
        except Exception:
            pass

    def request(self, flow: http.HTTPFlow) -> None:
        if not _enabled():
            return
        r = flow.request
        if r.method != "GET" or "range" in r.headers or "authorization" in r.headers:
            return
        key = _key(r.pretty_url or "")
        e = _index.get(key)
        if not e:
            _stats["misses"] += 1
            return
        if e["exp"] and e["exp"] < time.time():
            return
        body_path, _m = _paths(key)
        try:
            with open(body_path, "rb") as f:
                body = f.read()
        except OSError:
            _index.pop(key, None)
            return
        e["atime"] = time.time()
        try:
            os.utime(body_path, None)
        except OSError:
            pass
        _stats["hits"] += 1
        _stats["bytes_served"] += len(body)
        _flush_stats()
        flow.response = http.Response.make(
            200, body,
            {"Content-Type": e.get("ct") or "application/octet-stream",
             "X-SecuBox-Cache": "HIT",
             "Cache-Control": "public, max-age=300"},
        )

    def response(self, flow: http.HTTPFlow) -> None:
        global _total
        if not _enabled() or not flow.response:
            return
        r = flow.request
        resp = flow.response
        if r.method != "GET" or resp.status_code != 200:
            return
        if "range" in r.headers or "authorization" in r.headers:
            return
        if resp.headers.get("x-secubox-cache") == "HIT":
            return
        cc = (resp.headers.get("cache-control", "") or "").lower()
        if "no-store" in cc or "private" in cc or "set-cookie" in resp.headers:
            return
        if not _cacheable_ct(resp.headers.get("content-type", "")):
            return
        try:
            clen = int(resp.headers.get("content-length", "0") or "0")
        except (TypeError, ValueError):
            clen = 0
        if clen > MAX_OBJ:  # header short-circuit; body-size gate below covers chunked
            return
        try:
            body = resp.content or b""
        except Exception:
            return
        if not body or len(body) > MAX_OBJ:
            return
        m = _MAXAGE.search(cc)
        ttl = int(m.group(1)) if m else DEFAULT_TTL
        if ttl <= 0:
            return
        key = _key(r.pretty_url or "")
        body_path, meta_path = _paths(key)
        ct = (resp.headers.get("content-type", "") or "").split(";")[0]
        try:
            os.makedirs(os.path.dirname(body_path), exist_ok=True)
            tmp = body_path + ".tmp"
            with open(tmp, "wb") as f:
                f.write(body)
            os.replace(tmp, body_path)
            with open(meta_path, "w", encoding="utf-8") as f:
                json.dump({"ct": ct, "exp": time.time() + ttl,
                           "url": (r.pretty_url or "")[:300]}, f)
        except Exception:
            return
        old = _index.get(key, {}).get("size", 0)
        _total += len(body) - old
        _index[key] = {"size": len(body), "exp": time.time() + ttl,
                       "atime": time.time(), "ct": ct}
        _stats["stored"] += 1
        _evict_if_needed()
        _flush_stats()


addons = [MediaCache()]

View File

@ -688,6 +688,8 @@ ERROR_503_PAGE = b"""<!DOCTYPE html>
class SecuBoxWAF:
    def __init__(self):
        self.routes = {}
        self._routes_mtime = 0.0
        self._last_route_check = 0.0
        self.compiled_patterns = {}
        self.stats = {"requests": 0, "warnings": 0, "blocked": 0, "errors": 0}
        self.threat_counts = defaultdict(list)  # IP -> list of timestamps
@ -712,6 +714,12 @@ class SecuBoxWAF:
        if ROUTES_FILE.exists():
            try:
                self.routes = json.loads(ROUTES_FILE.read_text())
                sfx = set()
                for _h in self.routes:
                    _p = _h.split('.')
                    if len(_p) >= 2 and not _p[-1].isdigit():
                        sfx.add('.'.join(_p[-2:]))
                self.local_suffixes = sfx
                ctx.log.info(f"Loaded {len(self.routes)} routes")
            except Exception as e:
                ctx.log.error(f"Failed to load routes: {e}")
@ -919,6 +927,73 @@ class SecuBoxWAF:
except Exception:
ctx.log.warn(f"BAN FAILED for {ip} ({reason}) : LAPI off + cscli unavailable")

    def _maybe_reload_routes(self):
        # #609 — live-reload haproxy-routes.json when it changes (throttled
        # 10 s) so haproxyctl route edits take effect with NO restart. Pairs
        # with the directory bind-mount that makes mv-replaced files visible.
        import os as _o, time as _t
        now = _t.time()
        if now - getattr(self, "_last_route_check", 0) < 10:
            return
        self._last_route_check = now
        try:
            m = _o.path.getmtime(str(ROUTES_FILE))
        except OSError:
            return
        if m != getattr(self, "_routes_mtime", 0):
            self._routes_mtime = m
            self.load_routes()
            try:
                ctx.log.info(f"[routes] live-reloaded {len(self.routes)} routes")
            except Exception:
                pass

    def requestheaders(self, flow: http.HTTPFlow):
        self._maybe_reload_routes()
        # #605 — mitmproxy 11 opens the upstream connection before request(),
        # so routing must happen here. ALSO: in --mode regular mitmproxy is a
        # forward proxy that would relay ANY Host, so internet scanners abused
        # it as an open proxy (~70% error churn + self-loops). Serve ONLY our
        # own vhosts: mapped (routes), our domains (-> nginx catch-all), or our
        # own IPs; refuse everything else with 421 and never open an upstream.
        try:
            host = flow.request.pretty_host
            if host in self.routes:
                bip, bport = self.routes[host]
                orig = flow.request.headers.get('Host', host)
                flow.request.host = bip
                flow.request.port = bport
                try:
                    flow.server_conn.address = (bip, bport)
                except Exception:
                    pass
                flow.request.headers['Host'] = orig
                return
            if host in SELF_HOSTS or self._is_local_host(host):
                flow.request.host = '192.168.1.200'
                flow.request.port = 9080
                try:
                    flow.server_conn.address = ('192.168.1.200', 9080)
                except Exception:
                    pass
                return
            self.stats['blocked'] = self.stats.get('blocked', 0) + 1
            flow.response = http.Response.make(
                421,
                b'<h1>421 Misdirected Request</h1><p>SecuBox WAF does not proxy this host.</p>',
                {'Content-Type': 'text/html', 'X-SecuBox-WAF': 'unmapped-host'},
            )
        except Exception as e:
            ctx.log.warn(f'[requestheaders-route] {e}')

    def _is_local_host(self, host: str) -> bool:
        # #605 — is `host` one of our own (registrable) domains? Derived from
        # the routed hosts in load_routes (self.local_suffixes).
        sfx = getattr(self, 'local_suffixes', None)
        if not sfx:
            return False
        return any(host == s or host.endswith('.' + s) for s in sfx)

    def request(self, flow: http.HTTPFlow):
        # Connection close (Phase 6.J leak fix, ref #496) — prevents mitmproxy
        # from accumulating idle keep-alive sockets to upstream backends.

View File

@ -1,3 +1,52 @@
secubox-mitmproxy (1.0.8-1~bookworm1) bookworm; urgency=medium

  * fix(waf): live-reload haproxy-routes.json on change (#609). The addon now
    re-reads the routes file when its mtime changes (throttled 10 s, in the
    requestheaders hook), so haproxyctl route edits take effect with NO
    restart. Pairs with the directory bind-mount in wafctl that replaced the
    fragile file bind-mount (a file mount binds one inode → went stale when
    route tools edit via `jq > tmp && mv`). Verified live: jq+mv add →
    addon live-reloaded, 0 restart.

 -- Gerald KERMA <devel@cybermind.fr>  Mon, 15 Jun 2026 18:00:00 +0200

secubox-mitmproxy (1.0.7-1~bookworm1) bookworm; urgency=medium

  * feat(waf): behind-WAF media cache (#607). New media_cache.py addon caches
    cacheable GET media/static (image/video/audio/font/css/js) from our vhosts
    on disk (URL key, 16 MB/obj, 2 GB LRU, TTL from max-age) and serves repeat
    requests from cache — cutting backend load + latency for hosted media. Not
    a bypass: requests still pass secubox_waf inspection; only the response
    body is served from a WAF-populated cache. Toggle via
    /data/mitmproxy/media-cache.json {"enabled": true} (default on). Verified
    live: X-SecuBox-Cache: HIT.

 -- Gerald KERMA <devel@cybermind.fr>  Mon, 15 Jun 2026 17:00:00 +0200

secubox-mitmproxy (1.0.6-1~bookworm1) bookworm; urgency=medium

  * fix(waf): refuse unmapped hosts — close the open forward-proxy (#605). In
    --mode regular the addon relayed any Host, so HAProxy's default_backend
    made the WAF an open proxy; internet scanners drove ~72% backend-error
    churn + self-loops. The `requestheaders` hook now serves ONLY our vhosts
    (mapped, our domains via routes-derived `local_suffixes` → nginx catch-all,
    or our own IPs) and returns 421 for everything else with no upstream
    connect. Verified live: 0 external server-connects, 0 loop-508s.

 -- Gerald KERMA <devel@cybermind.fr>  Mon, 15 Jun 2026 16:30:00 +0200

secubox-mitmproxy (1.0.5-1~bookworm1) bookworm; urgency=medium

  * fix(waf): mitmproxy-11 upstream routing (#603). The addon only redirected
    the upstream in the `request` hook, but mitmproxy 11 opens the upstream
    connection between `requestheaders` and `request` — so routed vhosts
    connected to their public DNS IP instead of the internal backend (the WAF
    was effectively pass-through). Added a `requestheaders` hook that sets
    `flow.server_conn.address` before the connect. Ported from the live gk2
    fix that restored apt.secubox.in + all inspected vhosts.

 -- Gerald KERMA <devel@cybermind.fr>  Mon, 15 Jun 2026 16:00:00 +0200

secubox-mitmproxy (1.0.4-1~bookworm1) bookworm; urgency=medium

  * Pre-route SELF_HOSTS guard. When a client targets the box by

View File

@ -1,3 +1,48 @@
secubox-waf (1.2.6-1~bookworm1) bookworm; urgency=medium

  * fix(waf): robust route propagation (#609). wafctl now uses a DIRECTORY
    bind-mount (host /srv/mitmproxy → /var/lib/secubox-waf-routes, ro) + a
    symlink /data/mitmproxy/haproxy-routes.json → it, replacing the #603 file
    bind-mount (which bound one inode and went stale on `mv`). With the addon
    live-reload (synced secubox_waf.py copies), haproxyctl route edits apply
    with no restart. Fixes the class of bug that left git.maegia.tv mis-routed.

 -- Gerald KERMA <devel@cybermind.fr>  Mon, 15 Jun 2026 18:00:00 +0200

secubox-waf (1.2.5-1~bookworm1) bookworm; urgency=medium

  * feat(waf): behind-WAF media cache (#607) — ship media_cache.py addon copy,
    load it in the LXC mitmproxy.service ExecStart, and create
    /data/mitmproxy/cache/media + logs in wafctl provisioning. Caches hosted
    media (image/video/static) for repeat requests; not a bypass (requests
    still inspected). Synced with secubox-mitmproxy.

 -- Gerald KERMA <devel@cybermind.fr>  Mon, 15 Jun 2026 17:00:00 +0200

secubox-waf (1.2.4-1~bookworm1) bookworm; urgency=medium

  * fix(waf): refuse unmapped hosts in the addon copy — close the open
    forward-proxy (#605, synced with secubox-mitmproxy). 421 for any host not
    in routes / our domains / our IPs; no upstream connect. Kills the ~72%
    scanner-driven error churn and the self-loop 508s.

 -- Gerald KERMA <devel@cybermind.fr>  Mon, 15 Jun 2026 16:30:00 +0200

secubox-waf (1.2.3-1~bookworm1) bookworm; urgency=medium

  * fix(waf): mitmproxy-11 upstream routing — `requestheaders` hook in the
    addon copy (#603, kept in sync with secubox-mitmproxy).
  * fix(wafctl): bind-mount the host-maintained
    /srv/mitmproxy/haproxy-routes.json into the LXC at the addon's read path
    (/data/mitmproxy/haproxy-routes.json). The two had drifted (in-LXC copy
    went stale → routes_count: 0 → no routing). Also ensures the host file
    exists at provision time.
  * doc(service): warn that any ExecStart drop-in MUST keep
    `--set confdir=/data/mitmproxy` (a cookie_audit drop-in once dropped it
    and crash-looped the WAF on ProtectHome).

 -- Gerald KERMA <devel@cybermind.fr>  Mon, 15 Jun 2026 16:00:00 +0200

secubox-waf (1.2.2-1~bookworm1) bookworm; urgency=medium

  * Phase 11+ (#509) — double-buffered cache for WAF stats consumed by

View File

@ -0,0 +1,231 @@
# SPDX-License-Identifier: LicenseRef-CMSD-1.0
# Copyright (c) 2026 CyberMind — Gérald Kerma <devel@cybermind.fr>
#
# #607 — behind-WAF media proxy-cache for the mitmproxy inspection LXC.
# Cacheable GET media/static (image / video / audio / font / css / js) served
# by our own vhosts is stored on disk keyed by URL and served from cache on
# repeat requests — cutting backend load + latency for hosted media
# (peertube / photoprism / nextcloud …). NOT a WAF bypass: the request still
# passes secubox_waf inspection (request hook runs first); only the response
# BODY is served from a cache the WAF itself populated from inspected
# responses. Fail-open everywhere — a cache error never breaks the flow.
from __future__ import annotations
import hashlib
import json
import os
import re
import time
from mitmproxy import http
CACHE_DIR = "/data/mitmproxy/cache/media"
STATS = "/data/mitmproxy/logs/media_cache.json"
CONFIG = "/data/mitmproxy/media-cache.json" # {"enabled": true} — default on
MAX_OBJ = 16 * 1024 * 1024 # 16 MB / object
MAX_TOTAL = 2 * 1024 * 1024 * 1024 # 2 GB on disk
DEFAULT_TTL = 3600 # 1 h when upstream gives no max-age
_CACHEABLE = ("image/", "video/", "audio/", "font/", "text/css",
"javascript", "ecmascript", "application/font",
"application/vnd.ms-fontobject")
_MAXAGE = re.compile(r"max-age\s*=\s*(\d+)", re.IGNORECASE)
_index: dict = {}
_total = 0
_stats = {"hits": 0, "misses": 0, "stored": 0, "evicted": 0,
"bytes_served": 0, "since": int(time.time())}
_last_flush = 0.0
_cfg = {"enabled": True}
_cfg_mtime = 0.0


def _key(url: str) -> str:
    return hashlib.sha256(url.encode("utf-8", "ignore")).hexdigest()


def _paths(key: str):
    d = os.path.join(CACHE_DIR, key[:2])
    return os.path.join(d, key), os.path.join(d, key + ".m")


def _enabled() -> bool:
    global _cfg, _cfg_mtime
    try:
        st = os.stat(CONFIG)
        if st.st_mtime != _cfg_mtime:
            _cfg_mtime = st.st_mtime
            with open(CONFIG, encoding="utf-8") as f:
                _cfg = json.load(f)
    except FileNotFoundError:
        pass
    except Exception:
        return True
    return bool(_cfg.get("enabled", True))


def _cacheable_ct(ct: str) -> bool:
    ct = (ct or "").split(";", 1)[0].strip().lower()
    return bool(ct) and any(f in ct for f in _CACHEABLE)


def _flush_stats(force: bool = False) -> None:
    global _last_flush
    now = time.time()
    if not force and (now - _last_flush) < 5:
        return
    _last_flush = now
    try:
        os.makedirs(os.path.dirname(STATS), exist_ok=True)
        with open(STATS, "w", encoding="utf-8") as f:
            json.dump({**_stats, "objects": len(_index),
                       "bytes_cached": _total, "updated": int(now)}, f)
    except Exception:
        pass


def _load_index() -> None:
    global _total
    try:
        for sub in os.listdir(CACHE_DIR):
            d = os.path.join(CACHE_DIR, sub)
            if not os.path.isdir(d):
                continue
            for name in os.listdir(d):
                if name.endswith(".m"):
                    continue
                fp = os.path.join(d, name)
                try:
                    st = os.stat(fp)
                    meta = {}
                    mp = fp + ".m"
                    if os.path.exists(mp):
                        with open(mp, encoding="utf-8") as mf:
                            meta = json.load(mf)
                    _index[name] = {"size": st.st_size, "exp": meta.get("exp", 0),
                                    "atime": st.st_atime, "ct": meta.get("ct", "")}
                    _total += st.st_size
                except Exception:
                    pass
    except FileNotFoundError:
        pass


def _evict_if_needed() -> None:
    global _total
    if _total <= MAX_TOTAL:
        return
    for key, e in sorted(_index.items(), key=lambda kv: kv[1]["atime"]):
        if _total <= MAX_TOTAL:
            break
        body, meta = _paths(key)
        for p in (body, meta):
            try:
                os.remove(p)
            except OSError:
                pass
        _total -= e["size"]
        _index.pop(key, None)
        _stats["evicted"] += 1


class MediaCache:
    def __init__(self):
        try:
            os.makedirs(CACHE_DIR, exist_ok=True)
            _load_index()
        except Exception:
            pass

    def request(self, flow: http.HTTPFlow) -> None:
        if not _enabled():
            return
        r = flow.request
        if r.method != "GET" or "range" in r.headers or "authorization" in r.headers:
            return
        key = _key(r.pretty_url or "")
        e = _index.get(key)
        if not e:
            _stats["misses"] += 1
            return
        if e["exp"] and e["exp"] < time.time():
            return
        body_path, _m = _paths(key)
        try:
            with open(body_path, "rb") as f:
                body = f.read()
        except OSError:
            _index.pop(key, None)
            return
        e["atime"] = time.time()
        try:
            os.utime(body_path, None)
        except OSError:
            pass
        _stats["hits"] += 1
        _stats["bytes_served"] += len(body)
        _flush_stats()
        flow.response = http.Response.make(
            200, body,
            {"Content-Type": e.get("ct") or "application/octet-stream",
             "X-SecuBox-Cache": "HIT",
             "Cache-Control": "public, max-age=300"},
        )

    def response(self, flow: http.HTTPFlow) -> None:
        global _total
        if not _enabled() or not flow.response:
            return
        r = flow.request
        resp = flow.response
        if r.method != "GET" or resp.status_code != 200:
            return
        if "range" in r.headers or "authorization" in r.headers:
            return
        if resp.headers.get("x-secubox-cache") == "HIT":
            return
        cc = (resp.headers.get("cache-control", "") or "").lower()
        if "no-store" in cc or "private" in cc or "set-cookie" in resp.headers:
            return
        if not _cacheable_ct(resp.headers.get("content-type", "")):
            return
        try:
            clen = int(resp.headers.get("content-length", "0") or "0")
        except (TypeError, ValueError):
            clen = 0
        if clen > MAX_OBJ:  # header short-circuit; body-size gate below covers chunked
            return
        try:
            body = resp.content or b""
        except Exception:
            return
        if not body or len(body) > MAX_OBJ:
            return
        m = _MAXAGE.search(cc)
        ttl = int(m.group(1)) if m else DEFAULT_TTL
        if ttl <= 0:
            return
        key = _key(r.pretty_url or "")
        body_path, meta_path = _paths(key)
        ct = (resp.headers.get("content-type", "") or "").split(";")[0]
        try:
            os.makedirs(os.path.dirname(body_path), exist_ok=True)
            tmp = body_path + ".tmp"
            with open(tmp, "wb") as f:
                f.write(body)
            os.replace(tmp, body_path)
            with open(meta_path, "w", encoding="utf-8") as f:
                json.dump({"ct": ct, "exp": time.time() + ttl,
                           "url": (r.pretty_url or "")[:300]}, f)
        except Exception:
            return
        old = _index.get(key, {}).get("size", 0)
        _total += len(body) - old
        _index[key] = {"size": len(body), "exp": time.time() + ttl,
                       "atime": time.time(), "ct": ct}
        _stats["stored"] += 1
        _evict_if_needed()
        _flush_stats()


addons = [MediaCache()]

View File

@ -570,6 +570,8 @@ ERROR_503_PAGE = b"""<!DOCTYPE html>
class SecuBoxWAF:
    def __init__(self):
        self.routes = {}
        self._routes_mtime = 0.0
        self._last_route_check = 0.0
        self.compiled_patterns = {}
        self.stats = {"requests": 0, "warnings": 0, "blocked": 0, "errors": 0}
        self.threat_counts = defaultdict(list)  # IP -> list of timestamps
@ -594,6 +596,12 @@ class SecuBoxWAF:
        if ROUTES_FILE.exists():
            try:
                self.routes = json.loads(ROUTES_FILE.read_text())
                sfx = set()
                for _h in self.routes:
                    _p = _h.split('.')
                    if len(_p) >= 2 and not _p[-1].isdigit():
                        sfx.add('.'.join(_p[-2:]))
                self.local_suffixes = sfx
                ctx.log.info(f"Loaded {len(self.routes)} routes")
            except Exception as e:
                ctx.log.error(f"Failed to load routes: {e}")
@ -775,6 +783,73 @@ class SecuBoxWAF:
except Exception:
ctx.log.warn(f"BAN FAILED for {ip} ({reason})")

    def _maybe_reload_routes(self):
        # #609 — live-reload haproxy-routes.json when it changes (throttled
        # 10 s) so haproxyctl route edits take effect with NO restart. Pairs
        # with the directory bind-mount that makes mv-replaced files visible.
        import os as _o, time as _t
        now = _t.time()
        if now - getattr(self, "_last_route_check", 0) < 10:
            return
        self._last_route_check = now
        try:
            m = _o.path.getmtime(str(ROUTES_FILE))
        except OSError:
            return
        if m != getattr(self, "_routes_mtime", 0):
            self._routes_mtime = m
            self.load_routes()
            try:
                ctx.log.info(f"[routes] live-reloaded {len(self.routes)} routes")
            except Exception:
                pass

    def requestheaders(self, flow: http.HTTPFlow):
        self._maybe_reload_routes()
        # #605 — mitmproxy 11 opens the upstream connection before request(),
        # so routing must happen here. ALSO: in --mode regular mitmproxy is a
        # forward proxy that would relay ANY Host, so internet scanners abused
        # it as an open proxy (~70% error churn + self-loops). Serve ONLY our
        # own vhosts: mapped (routes), our domains (-> nginx catch-all), or our
        # own IPs; refuse everything else with 421 and never open an upstream.
        try:
            host = flow.request.pretty_host
            if host in self.routes:
                bip, bport = self.routes[host]
                orig = flow.request.headers.get('Host', host)
                flow.request.host = bip
                flow.request.port = bport
                try:
                    flow.server_conn.address = (bip, bport)
                except Exception:
                    pass
                flow.request.headers['Host'] = orig
                return
            if host in SELF_HOSTS or self._is_local_host(host):
                flow.request.host = '192.168.1.200'
                flow.request.port = 9080
                try:
                    flow.server_conn.address = ('192.168.1.200', 9080)
                except Exception:
                    pass
                return
            self.stats['blocked'] = self.stats.get('blocked', 0) + 1
            flow.response = http.Response.make(
                421,
                b'<h1>421 Misdirected Request</h1><p>SecuBox WAF does not proxy this host.</p>',
                {'Content-Type': 'text/html', 'X-SecuBox-WAF': 'unmapped-host'},
            )
        except Exception as e:
            ctx.log.warn(f'[requestheaders-route] {e}')

    def _is_local_host(self, host: str) -> bool:
        # #605 — is `host` one of our own (registrable) domains? Derived from
        # the routed hosts in load_routes (self.local_suffixes).
        sfx = getattr(self, 'local_suffixes', None)
        if not sfx:
            return False
        return any(host == s or host.endswith('.' + s) for s in sfx)

    def request(self, flow: http.HTTPFlow):
        # Connection close (Phase 6.J leak fix, ref #496) — prevents mitmproxy
        # from accumulating idle keep-alive sockets to upstream backends.

View File

@ -82,6 +82,13 @@ cmd_install() {
# Create symlink for lxc-* commands
ln -sf "$LXC_PATH/$LXC_NAME" "/var/lib/lxc/$LXC_NAME"
# Ensure the host-maintained routes file exists so the bind-mount below has
# a source (#603). haproxyctl writes this file on the host; the WAF addon
# inside the LXC reads /data/mitmproxy/haproxy-routes.json — the bind-mount
# keeps them the same file instead of two copies that drift apart.
mkdir -p /srv/mitmproxy
[ -f /srv/mitmproxy/haproxy-routes.json ] || echo '{}' > /srv/mitmproxy/haproxy-routes.json
# Configure container
cat >> "$LXC_PATH/$LXC_NAME/config" << CONF
@ -92,6 +99,14 @@ lxc.net.0.flags = up
lxc.net.0.ipv4.address = $LXC_IP/24
lxc.net.0.ipv4.gateway = 10.100.0.1
# Routes: bind-mount the host-maintained haproxy routes DIRECTORY into the
# container; the addon reads /data/mitmproxy/haproxy-routes.json via a symlink
# into it (created below). A *directory* mount (not a file mount) is required
# so route tools that edit via `jq > tmp && mv` (new inode) stay visible — a
# file mount binds one inode and goes stale on mv (#609, was #603). Combined
# with the addon's mtime live-reload, route edits apply with no restart.
lxc.mount.entry = /srv/mitmproxy var/lib/secubox-waf-routes none bind,ro,create=dir 0 0
# Autostart
lxc.start.auto = 1
lxc.start.delay = 5
@ -106,7 +121,12 @@ CONF
lxc-attach -n "$LXC_NAME" -- apt-get update
lxc-attach -n "$LXC_NAME" -- apt-get install -y python3-pip python3-venv curl jq
- lxc-attach -n "$LXC_NAME" -- mkdir -p /opt/mitmproxy /data/mitmproxy /var/log/mitmproxy
+ lxc-attach -n "$LXC_NAME" -- mkdir -p /opt/mitmproxy /data/mitmproxy /var/log/mitmproxy \
+ /data/mitmproxy/cache/media /data/mitmproxy/logs # #607 media cache + stats
# #609 — addon reads /data/mitmproxy/haproxy-routes.json; point it at the
# routes dir bind-mount so mv-replaced files stay visible + live-reload.
lxc-attach -n "$LXC_NAME" -- ln -sfn /var/lib/secubox-waf-routes/haproxy-routes.json \
/data/mitmproxy/haproxy-routes.json
lxc-attach -n "$LXC_NAME" -- python3 -m venv /opt/mitmproxy
lxc-attach -n "$LXC_NAME" -- /opt/mitmproxy/bin/pip install mitmproxy

View File

@ -12,12 +12,17 @@ Type=simple
User=mitmproxy
Group=mitmproxy
WorkingDirectory=/data/mitmproxy
# WARNING (#603): any ExecStart override (drop-in) MUST keep
# `--set confdir=/data/mitmproxy`. Without it mitmdump falls back to
# ~/.mitmproxy, which ProtectHome=true makes inaccessible → PermissionError
# crash-loop. A cookie_audit drop-in once dropped this flag and downed the WAF.
ExecStart=/opt/mitmproxy/bin/mitmdump \
--mode regular \
--listen-host 0.0.0.0 \
--listen-port 8080 \
--set confdir=/data/mitmproxy \
- --scripts /data/mitmproxy/secubox_waf.py
+ -s /data/mitmproxy/secubox_waf.py \
+ -s /data/mitmproxy/media_cache.py
Restart=on-failure
RestartSec=5