Compare commits

4 Commits

Author SHA1 Message Date
89380a121a Merge feature/501-mitm-wg-multi-worker-fanout : Phase 9 mitm-wg multi-worker fanout (ref #501)
2026-06-09 06:27:40 +02:00
c17810e1f0 feat(toolbox): Phase 9 multi-worker fanout for mitm-wg (ref #501)
The single-process mitm-wg saturated one ARM core at ~90 % under just
2-3 active wg peers because the Python GIL caps real parallelism
inside one mitmproxy process.  No single-process tweak moves that
needle further than Phase 8.1 already did (CPU 65 % → 12 % at idle ;
at multi-peer load we're back to 90 %+).

Phase 9 ships 4 mitm worker instances and lets nft round-robin
distribute new TCP flows across them via `numgen inc mod 4`.
Conntrack pins each flow's DNAT translation for its lifetime, so a
given TCP connection sees exactly one worker from SYN to FIN —
sticky-per-flow without needing nftables 1.0.7+ jhash support
(Debian bookworm ships 1.0.6).
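
The interaction between the round-robin counter and conntrack can be sketched as a toy model in Python. This is an illustration only, not how the kernel implements it: `FanoutSim`, the flow tuples and the 203.0.113.10 destination are made up for the example, and the port map simply mirrors the four worker ports named above.

```python
from collections import Counter
from itertools import count

# Mirrors the 4-entry port map described above (assumed values 8081..8084).
WORKER_PORTS = {0: 8081, 1: 8082, 2: 8083, 3: 8084}


class FanoutSim:
    """Toy model: a round-robin counter plus a conntrack-like flow table."""

    def __init__(self) -> None:
        self._counter = count()      # stands in for `numgen inc`
        self._conntrack = {}         # flow tuple -> chosen worker port

    def port_for(self, flow: tuple) -> int:
        # Only the first packet of a flow consults the counter; every later
        # packet reuses the translation already recorded for that flow.
        if flow not in self._conntrack:
            self._conntrack[flow] = WORKER_PORTS[next(self._counter) % 4]
        return self._conntrack[flow]


sim = FanoutSim()
# Eight hypothetical connections from one peer, differing by source port.
flows = [("10.99.1.60", 40000 + i, "203.0.113.10", 443) for i in range(8)]
first = [sim.port_for(f) for f in flows]   # first packet of each flow
again = [sim.port_for(f) for f in flows]   # later packets of the same flows
spread = Counter(first)
```

In this model `again` equals `first`, which is the sticky-per-flow property, while `spread` shows new flows landing evenly on the four ports.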

What's shipped :

  systemd/secubox-toolbox-mitm-wg-worker@.service
    Template unit ; per-instance Environment=MITM_WG_LISTEN_PORT=
    808%i (8081..8084).  Per-worker RuntimeMaxSec=3h, MemoryMax=128M,
    TasksMax=128, User=secubox-toolbox.

  sbin/secubox-toolbox-mitm-wg-launch
    Now reads MITM_WG_LISTEN_PORT (default 8081 for the legacy
    single-worker service).

  nftables.d/secubox-toolbox-wg-fanout.nft
    Replaces the single-port DNAT rules with a numgen-inc round-robin
    map to 4 ports.  DNS + captive-portal DNAT rules stay untouched
    (small queries, no benefit from fanout).

  debian/rules
    Installs both the worker template and the fanout nft drop-in
    next to the existing single-worker artifacts.

Activation (operator-initiated) :

  systemctl disable --now secubox-toolbox-mitm-wg.service
  systemctl enable --now secubox-toolbox-mitm-wg-worker@{1,2,3,4}.service
  ln -sf /usr/share/secubox/toolbox/nftables.d/secubox-toolbox-wg-fanout.nft \
         /etc/nftables.d/secubox-toolbox-wg.nft
  nft -f /etc/nftables.d/secubox-toolbox-wg.nft

Rollback : reverse the steps above ; the legacy single-worker
service and its single-port nft drop-in remain shipped + functional.

Live numbers on gk2 with the workload active during the cut-over
(2 Linux peers + 1 iPhone, dozens of concurrent flows) :

  before  one process @ 90-95 % CPU on a single core, saturated
  after   ~55 % avg per worker × 4, 0-70 % range, all cores below saturation

The Phase 8.B addon-write fire-and-forget pattern shipped in 2.4.3
becomes load-bearing here : SQLite WAL on toolbox.db handles 4
concurrent writers cleanly because each worker's _executor serialises
its own writes, and SQLite's writer mutex handles the inter-worker
contention with no event-loop stalls.
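
A minimal sketch of the multi-writer behaviour, assuming a throwaway database and a made-up `events` table rather than the real toolbox.db schema. Four threads with separate connections stand in for the four worker processes; the 30 s busy timeout is an assumption chosen for the demo, not a value taken from the addons.

```python
import sqlite3
import tempfile
import threading
from pathlib import Path

WRITERS = 4
ROWS_PER_WRITER = 50


def writer(db_path: str, worker_id: int) -> None:
    # Each writer opens its own connection, as separate processes would.
    # `timeout` makes a blocked writer wait for the write lock instead of
    # failing immediately with "database is locked".
    conn = sqlite3.connect(db_path, timeout=30.0)
    try:
        for seq in range(ROWS_PER_WRITER):
            with conn:  # one committed transaction per insert
                conn.execute(
                    "INSERT INTO events(worker, seq) VALUES (?, ?)",
                    (worker_id, seq),
                )
    finally:
        conn.close()


def run_demo() -> int:
    with tempfile.TemporaryDirectory() as tmp:
        db_path = str(Path(tmp) / "demo.db")
        setup = sqlite3.connect(db_path)
        setup.execute("PRAGMA journal_mode=WAL")  # persists in the db file
        setup.execute("CREATE TABLE events(worker INTEGER, seq INTEGER)")
        setup.commit()
        setup.close()

        threads = [
            threading.Thread(target=writer, args=(db_path, w))
            for w in range(1, WRITERS + 1)
        ]
        for t in threads:
            t.start()
        for t in threads:
            t.join()

        check = sqlite3.connect(db_path)
        total = check.execute("SELECT COUNT(*) FROM events").fetchone()[0]
        check.close()
        return total


total_rows = run_demo()
```

WAL mode still allows only one writer at a time; the concurrent writers queue on the write lock, and no row is lost as long as each one is willing to wait.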

Known limitation : the cert-pin auto-learning dynamic bypass file is
the remaining race surface (4 writers can dupe a line under burst,
the launcher's sort -u de-dupes at next reload).  A real filelock
lands in Phase 9.1.
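
One way such a lock could look is sketched below. This is a hypothetical illustration, not the Phase 9.1 implementation: `append_unique`, the one-entry-per-line file format and the `pinned.example.com` entry are all assumptions, and `fcntl.flock` is advisory, so it only helps if every writer takes the same lock.

```python
import fcntl
import os
import tempfile


def append_unique(path: str, entry: str) -> bool:
    """Append `entry` as a line unless it is already present.

    Returns True if the line was written, False if it was a duplicate.
    """
    # "a+" creates the file if needed and always writes at the end.
    with open(path, "a+", encoding="utf-8") as fh:
        fcntl.flock(fh, fcntl.LOCK_EX)  # blocks until other holders release
        try:
            fh.seek(0)
            existing = {line.strip() for line in fh}
            if entry in existing:
                return False
            fh.write(entry + "\n")
            fh.flush()
            os.fsync(fh.fileno())
            return True
        finally:
            fcntl.flock(fh, fcntl.LOCK_UN)


with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "bypass-demo.conf")
    first_write = append_unique(demo_path, "pinned.example.com")
    second_write = append_unique(demo_path, "pinned.example.com")
    with open(demo_path, encoding="utf-8") as fh:
        lines = fh.read().splitlines()
```

Because the check and the append happen under one exclusive lock, a second writer that learns the same entry sees it already present and skips the write.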
2026-06-09 06:27:27 +02:00
06e73d39bd Merge perf/500-captive-flags-and-addon-async : captive flags + addon async writes (ref #500) 2026-06-09 06:19:30 +02:00
d3fbf174c0 perf(toolbox): captive mitm flag symmetry + addon writes off-thread (ref #500)
A — Captive mitm-toolbox (secubox-toolbox-mitm.service) gets the same
three flags that mitm-wg picked up in 2.4.1 :

  --set http2=true
  --set connection_strategy=eager
  --set keep_host_header=true

The captive idles at ~0 % CPU right now (wlan AP is down), so no
visible change today.  When the AP is reactivated the captive will
inherit the same ×4 CPU win the WG path saw — and it stays in
symmetry with mitm-wg so future tweaks land on both.

B — Addon SQLite writes are now fire-and-forget via singleton
ThreadPoolExecutor (max_workers=1, thread_name_prefix=…) :

  local_store._insert  → submitted to sbx_store_write
  utiq.record_event    → submitted to sbx_utiq_write

The hook returns instantly ; the _conn() open + INSERT + fsync chain
runs on the bg thread.  No more event-loop stalls during peer flow
processing.
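
The pattern can be sketched in isolation as follows. The names `write`, `_write_sync` and the `demo_write` thread prefix are placeholders for this example, and the list append stands in for the real open + INSERT + fsync chain.

```python
import concurrent.futures
import threading

# One background thread: writes run in submission order, off the caller.
_executor = concurrent.futures.ThreadPoolExecutor(
    max_workers=1, thread_name_prefix="demo_write"
)
written = []  # (sequence number, name of the thread that did the write)


def _write_sync(seq: int) -> None:
    # Stand-in for the blocking database work.
    written.append((seq, threading.current_thread().name))


def write(seq: int) -> None:
    """Queue the write and return at once; the Future is never awaited."""
    _executor.submit(_write_sync, seq)


for i in range(5):
    write(i)

# Demo only: drain the queue so the results can be inspected.
_executor.shutdown(wait=True)
order = [seq for seq, _ in written]
thread_names = {name for _, name in written}
```

Since the Future is discarded, an exception raised inside the background function would go unnoticed, which is why the synchronous half needs its own try/except and logging.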

Live diagnostic on gk2 surfaced the actual bottleneck while shipping
this : one Linux PC peer (10.99.1.60) was generating ~3 client
connects/sec sustained, a second Linux PC (10.99.1.47) was running in
parallel, and mitm-wg was processing ~100 concurrent TLS sessions.  At
that level the CPU is consumed by mitmproxy itself (TLS termination
+ per-flow H/2 stream parsing under the Python GIL), NOT by the
addon writes — so A and B don't move the needle today.

They remain shipped as defensive hygiene before Phase 9 multi-worker
fanout : when 4 mitm workers contend on the same SQLite file, the
fire-and-forget pattern matters.
2026-06-09 06:19:19 +02:00
8 changed files with 248 additions and 13 deletions

View File

@ -1,3 +1,55 @@
secubox-toolbox (2.5.0-1~bookworm1) bookworm; urgency=medium
* Phase 9 (#501) — multi-worker fanout for mitm-wg.
Single-process mitm-wg saturated one ARM core at ~90 % even with
just 2-3 active wg peers, limited by the Python GIL. Phase 9
spreads new TCP flows across N=4 worker instances :
- systemd : new template secubox-toolbox-mitm-wg-worker@.service ;
per-instance Environment=MITM_WG_LISTEN_PORT=808%i (8081..8084).
Per-worker RuntimeMaxSec=3h, MemoryMax=128M, TasksMax=128.
- launcher : reads MITM_WG_LISTEN_PORT (default 8081 for legacy
single-worker service).
- nft : new drop-in nftables.d/secubox-toolbox-wg-fanout.nft
replaces the prerouting chain with a numgen inc round-robin
across 4 ports. Conntrack pins each TCP flow to its initially
assigned worker for the lifetime of the connection
(sticky-per-flow ; rebalancing only at new connection).
- opt-in : single-worker secubox-toolbox-mitm-wg.service stays
shipped + functional. Activation recipe in the header comment of
the worker unit file.
Live numbers on gk2 with 2 active Linux peers + 1 iPhone :
single 90-95 % CPU on 1 core (saturated)
fanout ~55 % avg per worker × 4, 0-70 % range (headroom)
SQLite WAL on toolbox.db handles 4 concurrent writers ; the
cert-pin auto-learning dynamic bypass file is the remaining race
surface (4 writers can dupe a line, the launcher's sort -u
de-dupes at next reload). A real filelock lands in Phase 9.1.
-- Gérald Kerma <devel@cybermind.fr>  Tue, 09 Jun 2026 04:27:27 +0000
secubox-toolbox (2.4.3-1~bookworm1) bookworm; urgency=medium
* Phase 8.2 perf (#500) — defensive performance work :
- Captive mitm service flags now match the mitm-wg quick win of
2.4.1 (--set http2=true / connection_strategy=eager /
keep_host_header=true). No perceptible change today (the
captive AP is down so the service idles at ~0 % CPU) but the
moment the AP is reactivated the captive picks up the same
×4 CPU win the WG path got.
- Addon SQLite writes (local_store + utiq) are now fire-and-
forget through a singleton ThreadPoolExecutor. Each addon owns
its own bg writer thread (sbx_store_write / sbx_utiq_write).
The mitmproxy asyncio event loop never blocks on _conn() open
/ INSERT / fsync. Live diagnostic showed the actual mitm-wg
bottleneck is mitmproxy itself (TLS termination + per-flow
H/2 parsing) under multi-peer fan-in, not the addon writes ;
the change is still warranted as defensive hygiene before
shipping the Phase 9 multi-worker fanout that will benefit
from non-blocking writes when 4 workers contend on the same
SQLite file.
-- Gérald Kerma <devel@cybermind.fr>  Tue, 09 Jun 2026 04:19:18 +0000
secubox-toolbox (2.4.2-1~bookworm1) bookworm; urgency=medium
* Landing page kbin.gk2.secubox.in : the 'Démo install R3' section

View File

@ -35,6 +35,11 @@ override_dh_installsystemd:
install -d debian/secubox-toolbox/lib/systemd/system/secubox-toolbox-mitm-wg.service.d
install -m 0644 systemd/secubox-toolbox-mitm-wg.service.d/10-runtime-max.conf \
debian/secubox-toolbox/lib/systemd/system/secubox-toolbox-mitm-wg.service.d/
# Phase 9 (#501) : multi-worker fanout template — opt-in via
# systemctl enable @1..4. See the unit file's header comment for the
# activation + rollback recipe.
install -m 0644 systemd/secubox-toolbox-mitm-wg-worker@.service \
debian/secubox-toolbox/lib/systemd/system/
# Primary unit goes via dh_installsystemd which also handles the enable helpers.
cp systemd/secubox-toolbox.service debian/secubox-toolbox.service
dh_installsystemd --no-start --no-enable
@ -57,6 +62,10 @@ override_dh_strip:
install -d debian/secubox-toolbox/usr/share/secubox/toolbox/nftables.d
install -m 0644 nftables.d/secubox-toolbox-wg.nft \
debian/secubox-toolbox/usr/share/secubox/toolbox/nftables.d/
# Phase 9 (#501) : fanout DNAT drop-in (opt-in). Operator activates
# by symlinking /etc/nftables.d/secubox-toolbox-wg.nft → this file.
install -m 0644 nftables.d/secubox-toolbox-wg-fanout.nft \
debian/secubox-toolbox/usr/share/secubox/toolbox/nftables.d/
install -m 0755 sbin/secubox-toolbox-wg-restore \
debian/secubox-toolbox/usr/sbin/
install -m 0644 systemd/secubox-toolbox-wg-restore.service \

View File

@ -19,6 +19,9 @@ ExecStart=/usr/bin/mitmdump \
--set confdir=/etc/secubox/toolbox/mitm \
--set ssl_insecure=false \
--set web_open_browser=false \
--set http2=true \
--set connection_strategy=eager \
--set keep_host_header=true \
-s /usr/lib/secubox/toolbox/mitmproxy_addons/cookies.py \
-s /usr/lib/secubox/toolbox/mitmproxy_addons/dpi.py \
-s /usr/lib/secubox/toolbox/mitmproxy_addons/avatar.py \

View File

@ -145,7 +145,15 @@ def _peer_ip(flow) -> str | None:
return None
def _insert(mac_hash: str | None, source: str, payload: dict) -> None:
# Phase 8.B perf (#500) — fire-and-forget SQLite writes via a single
# background thread so the mitmproxy asyncio event loop never blocks
# on `fsync()`. Single worker keeps inserts ordered AND avoids SQLite
# write contention (the engine itself serialises writers in WAL mode).
import concurrent.futures as _futures
_executor = _futures.ThreadPoolExecutor(max_workers=1, thread_name_prefix="sbx_store_write")
def _insert_sync(mac_hash: str | None, source: str, payload: dict) -> None:
if not mac_hash:
return
try:
@ -163,6 +171,22 @@ def _insert(mac_hash: str | None, source: str, payload: dict) -> None:
log.debug("sqlite insert failed: %s", e)
def _insert(mac_hash: str | None, source: str, payload: dict) -> None:
"""Phase 8.B — submit the insert to the bg thread. Hook returns
instantly ; the mitmproxy event loop keeps churning flows while
the SQLite IO happens off-thread.
Submit may raise RuntimeError if the executor was shut down during
interpreter teardown ; we swallow that to keep the hook silent on
shutdown."""
if not mac_hash:
return
try:
_executor.submit(_insert_sync, mac_hash, source, payload)
except RuntimeError:
pass
# ──────────────── mitmproxy hooks ────────────────
class LocalStore:
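
The shutdown case that the docstring above mentions can be reproduced on its own. In this sketch the executor is shut down explicitly to stand in for interpreter teardown, and `submit_quietly` is a hypothetical helper name, not part of the addon.

```python
import concurrent.futures

executor = concurrent.futures.ThreadPoolExecutor(max_workers=1)
executor.shutdown(wait=True)  # simulate the executor being gone


def submit_quietly(fn, *args) -> bool:
    """Return True if the task was queued, False if the executor is shut down."""
    try:
        executor.submit(fn, *args)
        return True
    except RuntimeError:
        # submit() raises RuntimeError once shutdown() has been called.
        return False


queued = submit_quietly(print, "never runs")
```

Swallowing the error keeps the hook quiet on shutdown, at the cost of silently dropping whatever write was being submitted at that moment.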

View File

@ -0,0 +1,53 @@
# SPDX-License-Identifier: LicenseRef-CMSD-1.0
# Phase 9 (#501) — multi-worker fanout drop-in for the R3 wg tunnel mitm.
#
# REPLACES the prerouting rules from secubox-toolbox-wg.nft :
# iif wg-toolbox tcp dport 443 dnat ip to 10.99.1.1:8081 (single port)
# with a round-robin numgen mapping to ports 8081..8084.
#
# Why numgen inc and not jhash : nftables 1.0.6 (Debian bookworm) doesn't
# support `jhash` in numgen yet (lands in 1.0.7+). `inc` is round-robin
# per-rule-evaluation, but conntrack pins the chosen DNAT translation for
# the lifetime of the TCP flow — so each individual TCP connection sees
# exactly one worker from SYN to FIN. Re-balancing happens only between
# connections, which is exactly what we want.
#
# To apply at boot : the package installs this file next to the single-
# worker drop-in ; the operator picks which one nftables.service loads
# via a symlink at /etc/nftables.d/secubox-toolbox-wg.nft.
flush chain inet wg-toolbox prerouting
table inet wg-toolbox {
chain prerouting {
type nat hook prerouting priority dstnat; policy accept;
# Phase 9 (#501) — 4-worker round-robin DNAT. numgen returns
# 0..3 ; the map sends each to one of the 4 worker ports on
# 10.99.1.1. Conntrack pins the choice for the whole flow.
iif "wg-toolbox" tcp dport 443 dnat ip to 10.99.1.1 \
: numgen inc mod 4 map {
0 : 8081,
1 : 8082,
2 : 8083,
3 : 8084
}
iif "wg-toolbox" tcp dport 80 dnat ip to 10.99.1.1 \
: numgen inc mod 4 map {
0 : 8081,
1 : 8082,
2 : 8083,
3 : 8084
}
# Phase 7 (#498) — DNS DNAT for legacy peer configs that hand out
# DNS = 10.99.0.1. Single target — these queries are tiny and
# don't need worker fanout.
iif "wg-toolbox" ip daddr 10.99.0.1 udp dport 53 dnat ip to 10.99.1.1:53
iif "wg-toolbox" ip daddr 10.99.0.1 tcp dport 53 dnat ip to 10.99.1.1:53
# Phase 7 (#498) — captive-portal HTTP probe from the R3
# verification page.
iif "wg-toolbox" ip daddr 10.99.0.1 tcp dport 8088 dnat ip to 10.99.1.1:8088
}
}

View File

@ -45,11 +45,15 @@ fi
# Phase 7 (#498) — listen-host is overridable via env. Host (default) binds
# 10.99.1.1 (the wg-toolbox interface IP) ; LXC variant sets 0.0.0.0 so it
# accepts the DNAT'd traffic on the 10.100.0.62 br-lxc interface.
# Phase 9 (#501) — listen-port is overridable too. Each fanout worker
# instance (secubox-toolbox-mitm-wg-worker@N) sets MITM_WG_LISTEN_PORT
# to 808N. The legacy single-process service keeps the 8081 default.
MITM_WG_LISTEN_HOST="${MITM_WG_LISTEN_HOST:-10.99.1.1}"
MITM_WG_LISTEN_PORT="${MITM_WG_LISTEN_PORT:-8081}"
ARGS=(
--mode transparent
--listen-host "$MITM_WG_LISTEN_HOST"
--listen-port 8081
--listen-port "$MITM_WG_LISTEN_PORT"
--set confdir=/etc/secubox/toolbox/ca-wg
--set ssl_insecure=false
--set web_open_browser=false
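
The `${VAR:-default}` expansion used above falls back to the default when the variable is unset or empty. A rough Python equivalent is sketched below for comparison; the `listen_port` helper and its range check are additions for illustration, since the launcher hunk shown here does no validation of its own.

```python
def listen_port(env: dict, default: int = 8081) -> int:
    """Resolve the listen port the way `${MITM_WG_LISTEN_PORT:-8081}` would."""
    # `or` covers both the unset case (None) and the empty-string case,
    # matching the ":-" form of shell parameter expansion.
    raw = env.get("MITM_WG_LISTEN_PORT") or str(default)
    port = int(raw)
    if not 1 <= port <= 65535:
        raise ValueError(f"port out of range: {port}")
    return port


unset_case = listen_port({})
empty_case = listen_port({"MITM_WG_LISTEN_PORT": ""})
worker_case = listen_port({"MITM_WG_LISTEN_PORT": "8083"})
```

Note that `env.get(name, default)` alone would not match the shell behaviour, because an empty value would be returned as-is instead of falling back to 8081.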

View File

@ -79,17 +79,15 @@ def _publisher_from_host(host: str) -> str:
return h or "unknown"
def record_event(
*,
client_ip: Optional[str],
host: str,
path: Optional[str],
action: str,
level: str,
detected_mtid: Optional[str] = None,
injected_mtid: Optional[str] = None,
) -> None:
"""Insert one event. Best-effort — never raises into the addon."""
# Phase 8.B perf (#500) — fire-and-forget SQLite writes via single
# background thread (matches local_store.py pattern). Mitmproxy's
# asyncio event loop never blocks on _conn() open + INSERT + fsync.
import concurrent.futures as _futures
_executor = _futures.ThreadPoolExecutor(max_workers=1, thread_name_prefix="sbx_utiq_write")
def _record_sync(client_ip, host, path, action, level,
detected_mtid, injected_mtid) -> None:
try:
with _conn() as c:
c.execute(
@ -112,6 +110,26 @@ def record_event(
log.warning("record_event failed: %s", e)
def record_event(
*,
client_ip: Optional[str],
host: str,
path: Optional[str],
action: str,
level: str,
detected_mtid: Optional[str] = None,
injected_mtid: Optional[str] = None,
) -> None:
"""Insert one event off-thread. Best-effort — never raises into
the addon, never blocks the mitmproxy asyncio loop."""
try:
_executor.submit(_record_sync, client_ip, host, path, action,
level, detected_mtid, injected_mtid)
except RuntimeError:
# Executor shut down (interpreter teardown) — silent drop.
pass
def recent(hours: int = 24, limit: int = 200) -> List[Dict]:
"""Return the last events within the window, newest first."""
since = int(time.time()) - hours * 3600

View File

@ -0,0 +1,72 @@
# SPDX-License-Identifier: LicenseRef-CMSD-1.0
# Phase 9 (#501) — multi-worker fanout for the R3 wg tunnel mitm.
#
# Why : on gk2 the single-process mitm-wg saturates one ARM core at
# ~90 % under just 2-3 concurrently-active wg peers. The Python GIL
# caps real parallelism inside a single mitmproxy process. Phase 9
# runs N=4 worker instances (8081..8084) and lets nft DNAT spread
# new TCP connections evenly across them via `numgen inc mod 4`,
# which is sticky-per-connection (the conntrack entry locks the
# translation for the lifetime of the flow).
#
# Each %i ∈ {1..4} → listen on 808%i . Activate with :
#
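# systemctl disable --now secubox-toolbox-mitm-wg.service # retire single (frees 8081)
# systemctl enable --now secubox-toolbox-mitm-wg-worker@{1,2,3,4}.service
# ln -sf /usr/share/secubox/toolbox/nftables.d/secubox-toolbox-wg-fanout.nft \
#        /etc/nftables.d/secubox-toolbox-wg.nft
# nft -f /etc/nftables.d/secubox-toolbox-wg.nft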
#
# Rollback (single-process) :
#
# systemctl disable --now secubox-toolbox-mitm-wg-worker@{1,2,3,4}.service
# nft -f /usr/share/secubox/toolbox/nftables.d/secubox-toolbox-wg.nft
# systemctl enable --now secubox-toolbox-mitm-wg.service
#
# State coherence : all 4 workers share /var/lib/secubox/toolbox/toolbox.db
# (WAL mode, multi-writer-safe). Cert-pin auto-learning's dynamic
# bypass file is the one source of contention left (4 writers race on
# /var/lib/secubox/toolbox/mitm-bypass-dynamic.conf) ; the .path
# watcher already de-bounces 10 s before reload-restart so the worst
# case is a duplicate line added then deduped by the launcher's
# sort -u pipeline. Acceptable for Phase 9 ship ; a real filelock
# lands in 9.1.
[Unit]
Description=SecuBox ToolBoX MITM WireGuard worker %i (R3 fanout port 808%i)
After=network.target wg-quick@wg-toolbox.service
Wants=wg-quick@wg-toolbox.service
Documentation=https://github.com/CyberMind-FR/secubox-deb/issues/501
[Service]
Type=simple
User=secubox-toolbox
Group=secubox-toolbox
WorkingDirectory=/usr/lib/secubox/toolbox
# Phase 9 : per-instance port. systemd's %i is the instance name (1..4 here).
Environment="MITM_WG_LISTEN_HOST=10.99.1.1"
Environment="MITM_WG_LISTEN_PORT=808%i"
ExecStart=/usr/sbin/secubox-toolbox-mitm-wg-launch
Restart=on-failure
RestartSec=5
# Same hygiene cycle as the single-process unit : 3 h recycle per
# worker. No stagger is configured here (RuntimeRandomizedExtraSec=
# would be the cleaner way to add one), so workers started together
# all recycle at about boot+3h with a brief 5 s downtime each
# (RestartSec=5) ; only workers whose restarts do not coincide keep
# serving traffic in the meantime.
RuntimeMaxSec=3h
# Memory envelope per worker : the overall budget split evenly across
# the 4 workers is 100 MB each (MemoryHigh), and real-world per-worker
# RSS sits at ~60-80 MB, so MemoryMax=128M gives a sane hard upper bound.
MemoryHigh=100M
MemoryMax=128M
# Resource isolation between workers. Without it, one runaway
# worker can drag the others.
TasksMax=128
[Install]
WantedBy=multi-user.target
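
The `808%i` substitution is plain string concatenation, which a short Python sketch makes visible. `port_for_instance` is a hypothetical helper written for this illustration; systemd itself performs no such check.

```python
def port_for_instance(instance: str) -> int:
    """Compute the listen port that `808%i` yields for an instance name."""
    port = int(f"808{instance}")  # concatenation, not arithmetic
    if not 1 <= port <= 65535:
        raise ValueError(f"instance {instance!r} gives invalid port {port}")
    return port


ports = [port_for_instance(str(i)) for i in range(1, 5)]

try:
    port_for_instance("10")  # "808" + "10" -> 80810
    two_digit_ok = True
except ValueError:
    two_digit_ok = False
```

Instances 1..4 map to 8081..8084 as intended, while a two-digit instance such as 10 would produce 80810, which is outside the valid TCP port range. Scaling past nine workers would therefore need a different port scheme as well as a larger nft map.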