Compare commits

27 Commits

Author SHA1 Message Date
740cbd291f release: toolbox-ng 0.1.26 (#757 reval nudge) + toolbox 2.7.24 (Liveness card removed)
2026-06-27 09:46:14 +02:00
d203b9aa8f Merge branch 'feature/757-sw-revalidation-nudge' — SW revalidation nudge (#757) + remove Liveness card
#757: for sw-neuter allow-listed hosts, strip If-None-Match/If-Modified-Since on
HTML document fetches so a stale-while-revalidate SW re-fetches a full 200 (banner
injected) and caches a banner'd shell without being neutered. Reviewed APPROVED.
Also removes the redundant ♥ Liveness dashboard card (version stays in the top badge).
2026-06-27 09:45:19 +02:00
52d358c9d8 ui(toolbox): remove redundant Liveness card — version stays in the top badge (ref #757) 2026-06-27 09:44:57 +02:00
20fca011a1 feat(sbxmitm): SW revalidation nudge — strip If-None-Match/If-Modified-Since for allow-listed HTML fetches (ref #757)
Add requestWantsHTML (Sec-Fetch-Dest: document or Accept: text/html) and use
it in mitmPipeline to strip conditional headers before up.Do for sw-neuter
allow-listed hosts, forcing a full 200 so the injected banner is cached by a
stale-while-revalidate Service Worker.
2026-06-27 09:42:13 +02:00
fb90349670 docs(sbxmitm): plan — SW revalidation nudge (ref #757) 2026-06-27 09:40:32 +02:00
90a7df6f4b release: toolbox-ng 0.1.25 + toolbox 2.7.23 — #755 #ads breakdown + #756 cosmetic scroll 2026-06-27 09:19:37 +02:00
f997d1c9a9 Merge branch 'feature/755-ads-aggregate' — #ads MITM-stats breakdown (#755) + cosmetic scroll-restore (#756)
#755: the #ads card becomes an honest labeled breakdown — Pubs bloquées (204),
Trackers détectés (social_edges distinct cookie-trackers), Pages nettoyées (new
sbxmitm cosmetic counter, wg-gated), Drops réseau (blacklist nft). #756: the
cosmetic ad-hide style restores html,body overflow so paywall scroll-locks
(Bloomberg) no longer leave the page unscrollable. Final review: one wg-gate fix
applied; race-clean, contract airtight end-to-end.
2026-06-27 09:18:27 +02:00
24fa9da107 fix(sbxmitm): gate recordCosmetic() on wg to stop over-counting pages_cleaned (ref #755)
Non-WG LAN clients never receive the cosmetic ad-hide <style> (injectHTML
gates injectCosmetic behind `if wg`), yet recordCosmetic() fired on every
successful inject — inflating pages_cleaned for banner-only flows.
Gate the call with `if wg && px.ads != nil` to match injectHTML's own gate.
2026-06-27 09:17:49 +02:00
ff439d9395 feat(toolbox): cosmetic-pages counter → #ads 'Pages nettoyées' (ref #755) 2026-06-27 09:10:37 +02:00
6100a3e8ed feat(toolbox): #ads breakdown — trackers_seen + network_drops (ref #755)
Co-Authored-By: Gérald Kerma <devel@cybermind.fr>
2026-06-27 09:05:22 +02:00
3473320ad0 fix(sbxmitm): cosmetic restores scroll (overflow:auto) — paywall scroll-lock left Bloomberg dead (ref #756) 2026-06-27 09:02:43 +02:00
3e8b3e80fd docs(toolbox): implementation plan — #ads aggregate MITM stats (ref #755) 2026-06-27 08:54:57 +02:00
a76da4f783 release: toolbox-ng 0.1.24 + toolbox 2.7.22 — SW-neuter #753 2026-06-27 08:43:21 +02:00
ad4fc51d21 Merge branch 'feature/753-sw-neuter' — targeted Service-Worker neuter for the R3 banner (ref #753)
PWA news sites (leparisien, cnn, 20minutes, franceinfo) serve their main HTML
from a Service-Worker cache so the navigation never reaches the MITM and the
banner can't be injected. For an operator-curated allow-list, sbxmitm answers
the SW script fetch with a passive self-unregistering SW (next navigation reaches
the MITM → banner), with an auto-learn candidate feed to /__toolbox/sw-candidate.
Targeted-strict: empty list = no-op. Final review READY TO MERGE (race-clean).
2026-06-27 08:41:52 +02:00
bcea1ea4ac chore(toolbox): drop unused json import in sw-candidate test (ref #753) 2026-06-27 08:41:52 +02:00
f6d2e44565 feat(toolbox): SW-neuter auto-learn flush + /__toolbox/sw-candidate ingest (ref #753) 2026-06-27 08:33:17 +02:00
634a08c3ab feat(sbxmitm): wire SWNeuter into mitmPipeline + --sw-neuter-hosts (ref #753) 2026-06-27 08:27:59 +02:00
690da98510 fix(sbxmitm): SW-neuter reload throttle + test copyright header (ref #753)
- swneuter.go: use reload.DefaultReloadThrottle (15s) instead of 0 in
  newSWNeuter — avoids stat on every Maybe() call, consistent with policy.go
- swneuter_test.go: add missing copyright line to match implementation header
2026-06-27 08:25:37 +02:00
f94841e34f feat(sbxmitm): SWNeuter — allow-list + self-unregistering SW body (ref #753) 2026-06-27 08:22:43 +02:00
fc8248b854 docs(toolbox-ng): implementation plan — targeted SW-neuter (ref #753) 2026-06-27 08:19:59 +02:00
b3c1db9380 docs(toolbox-ng): design spec — targeted SW-neuter for the R3 banner (ref #753) 2026-06-27 08:13:13 +02:00
72e8cbd2db release(toolbox-ng): 0.1.23 — rebuild master (#751 nonce-CSP) + SBX_DEBUG_CSP 2026-06-27 08:02:45 +02:00
827165e6fd release(toolbox): 2.7.21 — bundle.py banner reconciliation (#754) 2026-06-27 07:57:09 +02:00
0d906b1471 Merge branch 'fix/754-bundle-reconcile' — bundle.py to #740 DOM-API banner + R4 + #752 guard (ref #754)
Brings master's bundle.py up to the working board version: the #740 mk() DOM-API
banner (Trusted-Types-proof — why x.com/news render), the R4 analyst tier (#736)
folded into its level switch, and the #752 top-frame guard (no banner in 3rd-party
iframes). Folds in fix/752. Reviewed APPROVED (6/6 named risks clean, 179 tests).
2026-06-27 07:54:25 +02:00
d1607328fd fix(toolbox): reconcile bundle.py to master — #740 DOM-API banner + R4 tier + #752 top-frame guard (ref #754)
The board ran the #740 DOM-API (mk(), Trusted-Types-proof) banner from the
unmerged feature/740 branch; master had diverged with the R4 tier (#736) on an
innerHTML banner. This brings master's bundle.py up to the working board version
(mk() rendering — TT/strict-CSP-proof, why x.com/news render), adds the R4 tier
into the #740 level switch, and folds in the #752 top-frame guard (no banner in
3rd-party iframes). Updated 2 stale inline tests: the #653 'no fetch at load'
assertion is now '/__toolbox/bundle' not fetched — #740's toggle handlers fetch
/set-* on user click, which is not the SW-hijackable load-time bundle fetch.
2026-06-27 07:52:01 +02:00
aae47c6e2e Merge branch 'fix/751-sbxmitm-csp-debug' — SBX_DEBUG_CSP banner/CSP diagnostic (ref #751)
Root cause of the x.com/news banner failures was a STALE board sbxmitm binary
lagging master's #728 nonce-borrow; redeploying a master build fixed them.
This adds a permanent opt-in CSP diagnostic (SBX_DEBUG_CSP) to pinpoint why a
banner does/doesn't render on a given site.
2026-06-27 07:21:29 +02:00
4329ab2d7b feat(sbxmitm): SBX_DEBUG_CSP diagnostic log for banner/CSP visibility (ref #751) 2026-06-27 07:15:30 +02:00
19 changed files with 1733 additions and 54 deletions

@@ -0,0 +1,307 @@
# Aggregate MITM protection stats in the #ads card — Implementation Plan
> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax.
**Goal:** Turn the #ads card's single narrow "blocked" number into an honest labeled breakdown of the full MITM protection: ad-block 204s (existing) + trackers detected (social_edges) + pages cosmetically cleaned (new sbxmitm counter) + network drops (blacklist nft).
**Architecture:** Task 1 is Python-only (trackers + network_drops added to the ad-stats payload + a WebUI breakdown) — immediate value, no engine redeploy. Task 2 adds a Go `cosmeticPages` counter in sbxmitm, flushed via the existing ad-event channel, stored, and surfaced.
**Tech Stack:** Python/FastAPI + sqlite3 (toolbox), Go (sbxmitm), vanilla JS (toolbox WebUI), pytest + `go test`.
## Global Constraints
- New Python files carry `# SPDX-License-Identifier: LicenseRef-CMSD-1.0`. New Go logic keeps the file's existing header.
- **Honest breakdown, not a sum:** the four metrics are different units (blocks / trackers / pages / drops) — label them separately; never add them into one number.
- Reuse existing helpers: `store._conn()`, the `social_edges` table (same toolbox.db), the existing ad-event flush (`adstats.go` payload + `/__toolbox/ad-event` handler), the blacklist nft drops parse (api.py `admin_blacklist`).
- `trackers_seen` = `COUNT(DISTINCT cookie_id_hash)` over `social_edges` in the window (exclude empty cookie ids).
- Commits reference `(ref #755)`. No "Claude Code"/"Generated with" strings.
- Tests: `cd packages/secubox-toolbox && python -m pytest tests/<file> -v` ; Go: `cd packages/secubox-toolbox-ng && GOFLAGS=-mod=vendor go test ./cmd/sbxmitm/ -count=1`.
## File Structure
- Modify `packages/secubox-toolbox/secubox_toolbox/store.py`: `ad_stats` adds `trackers_seen` + `pages_cleaned`; new `record_cosmetic_pages`.
- Modify `packages/secubox-toolbox/secubox_toolbox/api.py`: `admin_ad_stats` adds `network_drops`; `toolbox_ad_event` ingests `cosmetic_pages`.
- Modify `packages/secubox-toolbox/www/toolbox/index.html` — the #ads card breakdown.
- Modify `packages/secubox-toolbox-ng/cmd/sbxmitm/adstats.go` + `main.go` — the `cosmeticPages` counter + flush.
- Tests: `packages/secubox-toolbox/tests/test_ads_aggregate.py`, `packages/secubox-toolbox-ng/cmd/sbxmitm/cosmetic_count_test.go`.
---
### Task 1: Python — trackers_seen + network_drops + WebUI breakdown
**Files:**
- Modify: `packages/secubox-toolbox/secubox_toolbox/store.py` (`ad_stats`)
- Modify: `packages/secubox-toolbox/secubox_toolbox/api.py` (`admin_ad_stats`)
- Modify: `packages/secubox-toolbox/www/toolbox/index.html` (#ads card)
- Test: `packages/secubox-toolbox/tests/test_ads_aggregate.py`
**Interfaces:**
- Produces: `store.ad_stats(...)` dict gains `trackers_seen: int` (and `pages_cleaned: int`, defaulted 0 here — Task 2 fills it); `admin_ad_stats(...)` dict gains `network_drops: int`.
- [ ] **Step 1: Write the failing test**
Create `packages/secubox-toolbox/tests/test_ads_aggregate.py`:
```python
# SPDX-License-Identifier: LicenseRef-CMSD-1.0
"""Tests for the #ads aggregate breakdown (ref #755)."""
import sqlite3
import time

from secubox_toolbox import store


def _seed_db(tmp_path, monkeypatch):
    db = tmp_path / "toolbox.db"
    c = sqlite3.connect(str(db))
    c.executescript(
        "CREATE TABLE ad_block_stats(ad_host TEXT, site TEXT, action TEXT, hits INTEGER, bytes INTEGER, last_seen REAL, PRIMARY KEY(ad_host,site,action));"
        "CREATE TABLE ad_block_client_host(mac_hash TEXT, ad_host TEXT, hits INTEGER, last_seen REAL, PRIMARY KEY(mac_hash,ad_host));"
        "CREATE TABLE social_edges(ts INTEGER, client_mac_hash TEXT, src_site TEXT, tracker_domain TEXT, cookie_id_hash TEXT, ja4_hash TEXT, consent_state TEXT);"
    )
    now = int(time.time())
    # two distinct cookie-trackers in window, one duplicate, one stale (>24h)
    for cid, ts in [("A", now-60), ("A", now-30), ("B", now-60), ("C", now-90000), ("", now-10)]:
        c.execute("INSERT INTO social_edges(ts,client_mac_hash,src_site,tracker_domain,cookie_id_hash,ja4_hash,consent_state) VALUES(?,?,?,?,?,?,?)",
                  (ts, "m", "s", "t", cid, "j", "none_seen"))
    c.commit()
    c.close()
    monkeypatch.setattr(store, "DB_PATH", db)
    return db


def test_ad_stats_trackers_seen_distinct_in_window(tmp_path, monkeypatch):
    _seed_db(tmp_path, monkeypatch)
    out = store.ad_stats(hours=24)
    # distinct non-empty cookie ids in the last 24h = {A, B}; C is stale, "" excluded
    assert out["trackers_seen"] == 2
    assert out["pages_cleaned"] == 0  # no cosmetic_events table yet → 0 (Task 2 fills it)
```
- [ ] **Step 2: Run it to verify it fails**
Run: `cd packages/secubox-toolbox && python -m pytest tests/test_ads_aggregate.py -v`
Expected: FAIL — `KeyError: 'trackers_seen'`.
- [ ] **Step 3: Implement store.ad_stats additions**
In `store.py`, in `ad_stats`, inside the `with _conn() as c:` block (after the existing `top_visitors` query, before the function returns `out`), add:
```python
# #755: trackers detected/poisoned by the MITM in the window: distinct
# cross-site cookie-identifier hashes seen on social_edges. This is the
# "Trackers" half of the card (the 204 ad-block is the "pubs" half).
try:
    r = c.execute(
        "SELECT COUNT(DISTINCT cookie_id_hash) FROM social_edges "
        "WHERE last_seen IS NULL AND 0",  # placeholder replaced below
    ).fetchone()
except sqlite3.Error:
    r = None
out["trackers_seen"] = 0
try:
    out["trackers_seen"] = int(c.execute(
        "SELECT COUNT(DISTINCT cookie_id_hash) FROM social_edges "
        "WHERE ts >= ? AND cookie_id_hash IS NOT NULL AND cookie_id_hash <> ''",
        (int(cutoff),),
    ).fetchone()[0] or 0)
except sqlite3.Error:
    out["trackers_seen"] = 0
# #755: pages where the cosmetic ad-hide style was injected (Task 2 writes
# cosmetic_events; absent table → 0).
out["pages_cleaned"] = 0
try:
    out["pages_cleaned"] = int(c.execute(
        "SELECT COALESCE(SUM(pages),0) FROM cosmetic_events WHERE ts >= ?",
        (cutoff,),
    ).fetchone()[0] or 0)
except sqlite3.Error:
    out["pages_cleaned"] = 0
```
Remove the dead placeholder block (the first `try/except` with `WHERE last_seen IS NULL AND 0`) — it was only to show the shape; keep ONLY the two real queries (`trackers_seen` and `pages_cleaned`). (`cutoff` is the existing local `cutoff = time.time() - hours*3600` already computed at the top of `ad_stats`.)
- [ ] **Step 4: Run the test to verify it passes**
Run: `cd packages/secubox-toolbox && python -m pytest tests/test_ads_aggregate.py -v`
Expected: PASS.
- [ ] **Step 5: Add network_drops to the endpoint**
In `api.py`, change `admin_ad_stats` (currently `return store.ad_stats(hours=h)`) to:
```python
async def admin_ad_stats(hours: int = 24) -> dict:
    """Contextual ad-block metrics for the #ads tab (read-only, kbin-safe)."""
    h = max(1, min(int(hours if hours is not None else 24), 168))
    out = store.ad_stats(hours=h)
    # #755: network-layer drops (blacklist nft sets). Best-effort; 0 when the
    # blacklist is inert or unreadable. Reuses the admin_blacklist parse.
    try:
        bl = await admin_blacklist()
        out["network_drops"] = int(bl.get("drops", 0) or 0)
    except Exception:
        out["network_drops"] = 0
    return out
```
(`admin_blacklist` does not have to be defined above `admin_ad_stats` in api.py: the name is only referenced inside the function body, so it is looked up when the coroutine runs, not when the module is imported. Just confirm that `admin_blacklist` exists at module level under that name.)
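To illustrate the definition-order point, here is a minimal, self-contained sketch. The two `_demo` coroutines are hypothetical stand-ins, not the real api.py handlers; the callee is deliberately defined after its caller, and the fallback mirrors the best-effort `try/except` used in Step 5.

```python
import asyncio


async def admin_ad_stats_demo() -> dict:
    # admin_blacklist_demo is resolved when this body runs, not when the
    # function is defined, so it may legally appear later in the module.
    out = {"total_blocked": 0}
    try:
        bl = await admin_blacklist_demo()
        out["network_drops"] = int(bl.get("drops", 0) or 0)
    except Exception:
        out["network_drops"] = 0  # best-effort: never fail the stats call
    return out


async def admin_blacklist_demo() -> dict:
    # Hypothetical stand-in for the blacklist endpoint; the value is made up.
    return {"drops": 7}


result = asyncio.run(admin_ad_stats_demo())
```

If the callee were missing entirely, the `NameError` would be raised inside the `try` and `network_drops` would silently fall back to 0, which is why the existence check above is still worth doing by hand.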
- [ ] **Step 6: WebUI — render the breakdown**
In `packages/secubox-toolbox/www/toolbox/index.html`, find the line building the #ads KPI (around line 620, the `kpi.innerHTML = ...` that shows `Trackers &amp; pubs bloqués ${d.total_blocked}`). Replace that assignment with a labeled breakdown that keeps the existing "pubs bloquées" + bytes and ADDS the three new metrics:
```javascript
kpi.innerHTML = `<span class="k">Pubs bloquées (204)</span> <span class="v">${d.total_blocked||0}</span>`
+ ` <span class="k">Trackers détectés</span> <span class="v">${d.trackers_seen||0}</span>`
+ ` <span class="k">Pages nettoyées</span> <span class="v">${d.pages_cleaned||0}</span>`
+ ` <span class="k">Drops réseau</span> <span class="v">${d.network_drops||0}</span>`
+ ` <span class="k" title="estimation : un contenu bloqué n'est jamais téléchargé, on ne peut pas mesurer les octets réels — ~45 Ko/blocage">Ko évités <span style="opacity:.6">(est.)</span></span> <span class="v">~${Math.round((d.total_bytes||0)/1024)}</span>`;
```
(Keep the surrounding code that builds `hostRows`/`siteRows` tables unchanged.)
- [ ] **Step 7: Run the test once more + commit**
Run: `cd packages/secubox-toolbox && python -m pytest tests/test_ads_aggregate.py -v`
Expected: PASS.
```bash
git add packages/secubox-toolbox/secubox_toolbox/store.py packages/secubox-toolbox/secubox_toolbox/api.py packages/secubox-toolbox/www/toolbox/index.html packages/secubox-toolbox/tests/test_ads_aggregate.py
git commit -m "feat(toolbox): #ads breakdown — trackers_seen + network_drops (ref #755)"
```
---
### Task 2: Go cosmetic-pages counter + flush + store + surface
**Files:**
- Modify: `packages/secubox-toolbox-ng/cmd/sbxmitm/adstats.go` (counter + payload field + flush)
- Modify: `packages/secubox-toolbox-ng/cmd/sbxmitm/main.go` (increment on injection)
- Modify: `packages/secubox-toolbox/secubox_toolbox/api.py` (`toolbox_ad_event` ingest)
- Modify: `packages/secubox-toolbox/secubox_toolbox/store.py` (`record_cosmetic_pages`)
- Test: `packages/secubox-toolbox-ng/cmd/sbxmitm/cosmetic_count_test.go`, extend `packages/secubox-toolbox/tests/test_ads_aggregate.py`
**Interfaces:**
- Consumes: `store.ad_stats`'s `pages_cleaned` query (Task 1, reads `cosmetic_events`); the ad-event flush (`flushOnce`).
- Produces: `(*adStats).recordCosmetic()`, the `cosmetic_pages` JSON field on the ad-event payload; `store.record_cosmetic_pages(n: int)`.
- [ ] **Step 1: Write the failing Go test**
Create `packages/secubox-toolbox-ng/cmd/sbxmitm/cosmetic_count_test.go`:
```go
// SPDX-License-Identifier: LicenseRef-CMSD-1.0
package main

import "testing"

func TestCosmeticCounterSnapshotClears(t *testing.T) {
	a := newAdStats()
	a.recordCosmetic()
	a.recordCosmetic()
	if got := a.snapshotCosmetic(); got != 2 {
		t.Fatalf("snapshotCosmetic = %d, want 2", got)
	}
	if got := a.snapshotCosmetic(); got != 0 {
		t.Fatalf("snapshot must clear; second call = %d, want 0", got)
	}
}
```
- [ ] **Step 2: Run it to verify it fails**
Run: `cd packages/secubox-toolbox-ng && GOFLAGS=-mod=vendor go test ./cmd/sbxmitm/ -run TestCosmetic -v`
Expected: FAIL — `a.recordCosmetic undefined`.
- [ ] **Step 3: Implement the Go counter**
In `adstats.go`, add a counter to the `adStats` struct (find `type adStats struct {`). It could be guarded by the existing struct mutex, but a dedicated `sync/atomic` counter avoids touching the existing lock scope, so use that.
Add the `"sync/atomic"` import if absent (the `atomic.Int64` type requires Go 1.19 or later), then add to the `adStats` struct:
```go
cosmetic atomic.Int64 // #755 — pages where the cosmetic ad-hide style was injected
```
Add methods:
```go
// recordCosmetic tallies one R3 HTML page that received the cosmetic ad-hide style.
func (a *adStats) recordCosmetic() { a.cosmetic.Add(1) }
// snapshotCosmetic atomically reads-and-clears the cosmetic page counter.
func (a *adStats) snapshotCosmetic() int64 { return a.cosmetic.Swap(0) }
```
In the ad-event payload struct (`adEventPayload`), add the field:
```go
CosmeticPages int64 `json:"cosmetic_pages,omitempty"`
```
In `flushOnce` (where the payload `p` is assembled before marshal), set:
```go
p.CosmeticPages = a.snapshotCosmetic()
```
Note: if `flushOnce` early-returns when ad block/candidate maps are empty, ensure a non-zero cosmetic count still gets POSTed — adjust the "is the snapshot empty?" guard to also consider `p.CosmeticPages > 0` so cosmetic-only windows still flush.
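The read-and-clear counter and the adjusted flush guard can be sketched as follows. This is a Python illustration of the logic only, not the Go code: `CosmeticCounter`, `build_payload`, and the `blocks`/`candidates` keys are hypothetical names, and only the `cosmetic_pages` key comes from the plan.

```python
import threading


class CosmeticCounter:
    """Thread-safe tally with read-and-clear semantics (like Swap(0))."""

    def __init__(self):
        self._lock = threading.Lock()
        self._n = 0

    def record(self):
        with self._lock:
            self._n += 1

    def snapshot(self):
        # Return the current count and reset it in one critical section.
        with self._lock:
            n, self._n = self._n, 0
            return n


def build_payload(blocks, candidates, counter):
    """Return the dict to POST, or None when there is nothing to report."""
    cosmetic = counter.snapshot()
    # The guard also considers the cosmetic count, so a window with only
    # cosmetic injections still produces a flush.
    if not blocks and not candidates and cosmetic <= 0:
        return None
    payload = {"blocks": blocks, "candidates": candidates}
    if cosmetic > 0:
        payload["cosmetic_pages"] = cosmetic  # omitted when zero
    return payload
```

One consequence worth noting: because the snapshot clears the counter before the POST is attempted, a failed POST loses that window's count unless the caller adds it back.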
- [ ] **Step 4: Increment on injection (main.go)**
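In `main.go`, in `mitmPipeline`, in the block `if out, ok := injectIntoBody(body, resp.Header.Get("Content-Encoding"), scriptBody, cspNonce, wg); ok {`, inside the `ok` branch (after `body = out`), add the increment. Gate it on `wg`: non-WG LAN clients never receive the cosmetic ad-hide style (`injectHTML` gates `injectCosmetic` behind `if wg`), so an ungated call would over-count `pages_cleaned` for banner-only flows.
```go
if wg && px.ads != nil {
	px.ads.recordCosmetic() // #755: this R3 HTML page got the cosmetic ad-hide style
}
```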
- [ ] **Step 5: Run the Go test + build**
Run: `cd packages/secubox-toolbox-ng && GOFLAGS=-mod=vendor go test ./cmd/sbxmitm/ -run TestCosmetic -v && GOFLAGS=-mod=vendor go build ./... && GOFLAGS=-mod=vendor go vet ./cmd/sbxmitm/`
Expected: PASS, build OK, vet clean.
- [ ] **Step 6: Python — store.record_cosmetic_pages + ad-event ingest**
In `store.py`, add (near `record_ad_blocks`):
```python
def record_cosmetic_pages(pages: int) -> None:
    """#755: append one cosmetic-hide tally (pages cleaned since the last flush).
    ad_stats sums these over the window. Best-effort; never raises."""
    try:
        n = int(pages)
        if n <= 0:
            return
        with _conn() as c:
            c.execute("CREATE TABLE IF NOT EXISTS cosmetic_events(ts REAL, pages INTEGER)")
            c.execute("INSERT INTO cosmetic_events(ts, pages) VALUES(?, ?)", (time.time(), n))
    except Exception as e:
        log.debug("record_cosmetic_pages failed: %s", e)
```
In `api.py`, in `toolbox_ad_event`, after the existing `store.record_ad_blocks(...)` / `record_ad_candidates(...)` calls, add:
```python
cp = payload.get("cosmetic_pages")
if cp:
    store.record_cosmetic_pages(cp)
```
(Match the variable name the handler uses for the parsed JSON body — the brief's `payload` may be named `body`/`data` in the actual handler; use whatever it is. Guard so a missing/zero field is a no-op.)
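If the handler needs a stricter guard than the truthiness check, a small normalising helper is one option. The sketch below is an assumption, not part of the plan: `cosmetic_pages_from` is a hypothetical name, and it simply maps anything that is not a positive integer to 0 so the caller can skip the store write.

```python
def cosmetic_pages_from(payload) -> int:
    """Extract a positive page count from an ad-event body, else 0."""
    if not isinstance(payload, dict):
        return 0
    raw = payload.get("cosmetic_pages")
    try:
        n = int(raw)
    except (TypeError, ValueError):
        # Missing key (None) or a non-numeric value: treat as "nothing to record".
        return 0
    return n if n > 0 else 0
```

The caller would then write only when the result is non-zero, which keeps a missing, zero, negative, or malformed field a no-op.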
- [ ] **Step 7: Extend the Python test (pages_cleaned now populated)**
Append to `tests/test_ads_aggregate.py`:
```python
def test_record_cosmetic_pages_summed_in_window(tmp_path, monkeypatch):
    _seed_db(tmp_path, monkeypatch)
    store.record_cosmetic_pages(3)
    store.record_cosmetic_pages(2)
    out = store.ad_stats(hours=24)
    assert out["pages_cleaned"] == 5
```
- [ ] **Step 8: Run both suites**
Run: `cd packages/secubox-toolbox && python -m pytest tests/test_ads_aggregate.py -v` → PASS (2 tests: the Task 1 test plus the one appended in Step 7).
Run: `cd packages/secubox-toolbox-ng && GOFLAGS=-mod=vendor go build ./... && GOFLAGS=-mod=vendor go test ./cmd/sbxmitm/ -count=1` → build OK, all PASS.
- [ ] **Step 9: Commit**
```bash
git add packages/secubox-toolbox-ng/cmd/sbxmitm/adstats.go packages/secubox-toolbox-ng/cmd/sbxmitm/main.go packages/secubox-toolbox-ng/cmd/sbxmitm/cosmetic_count_test.go packages/secubox-toolbox/secubox_toolbox/store.py packages/secubox-toolbox/secubox_toolbox/api.py packages/secubox-toolbox/tests/test_ads_aggregate.py
git commit -m "feat(toolbox): cosmetic-pages counter → #ads 'Pages nettoyées' (ref #755)"
```
---
## Self-Review notes
- **Spec coverage:** trackers_seen (Task 1) ✓; network_drops (Task 1) ✓; pages_cleaned cosmetic counter end-to-end Go→store→ad_stats (Task 2) ✓; WebUI labeled breakdown, not a sum (Task 1 Step 6) ✓; honest units (each labeled) ✓.
- **No placeholders:** Step 3 explicitly instructs deleting the illustrative dead block; the only "match the actual var name" notes (the ad-event handler's body var) are verify-in-context, not gaps.
- **Type consistency:** `trackers_seen`/`pages_cleaned`/`network_drops`/`cosmetic_pages` keys are identical across store → api → WebUI → Go payload → store ingest.
- **Out of scope:** real DNS-sinkhole per-window counter (no endpoint exists; `network_drops` uses the blacklist nft drops, 0 until that layer reports) — flagged in the issue.

@@ -0,0 +1,543 @@
# Targeted SW-neuter for the R3 banner — Implementation Plan
> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.
**Goal:** Make the R3 transparency banner appear on Service-Worker PWA sites (leparisien, cnn…) by serving a self-unregistering SW for an operator-curated allow-list of hosts, with an auto-learn proposal feed.
**Architecture:** A new `SWNeuter` in sbxmitm intercepts the `Service-Worker: script` fetch; for allow-listed hosts it answers with a passive self-unregistering SW (next navigation reaches the MITM → banner); off the list it records the host as an auto-learn candidate flushed to the portal for operator review.
**Tech Stack:** Go (sbxmitm; standard library + `internal/reload`), Python/FastAPI (portal endpoint), pytest + `go test`.
## Global Constraints
- New Go files carry the SPDX header: `// SPDX-License-Identifier: LicenseRef-CMSD-1.0` + the CyberMind copyright line (copy from any sibling, e.g. `cmd/sbxmitm/csp.go`).
- **Targeted-strict:** ONLY hosts on the allow-list are neutered. An empty/missing list (`reload.LoadLines` → empty set) is a complete no-op. Nothing global.
- **Passive:** the neuter SW must NOT call `client.navigate()` / force a reload. It unregisters + clears caches only; the banner returns on the next navigation.
- Reuse existing package helpers: `hostMatches(host, patterns)` (policy.go), `reload.LoadLines`/`reload.Target`/`reload.NewWatcher`/`reload.StatMtime`, `writeRaw`, `portalTargetURL`, `adEventClient`.
- Allow-list path default: `/var/lib/secubox/toolbox/sw-neuter-hosts.txt`. Candidates file: `/var/lib/secubox/toolbox/sw-neuter-candidates.txt`.
- Detection signal: the spec-mandated `Service-Worker: script` request header — never trigger on normal traffic.
- Commits reference `(ref #753)`. No "Claude Code"/"Generated with" strings.
- Build: `cd packages/secubox-toolbox-ng && GOFLAGS=-mod=vendor go build ./...` ; test: `GOFLAGS=-mod=vendor go test ./cmd/sbxmitm/`.
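The targeted-strict matching rule can be illustrated in a few lines. `hostMatches` in policy.go is not shown in this plan, so the following is a hypothetical Python restatement of the behaviour the Task 1 tests expect (exact or dotted-suffix match, case-insensitive, empty list matches nothing), not the actual Go helper.

```python
def host_matches(host: str, patterns: set) -> bool:
    """True if host equals a pattern or is a dotted subdomain of one."""
    h = host.strip().lower().strip(".")
    if not h:
        return False
    while True:
        if h in patterns:
            return True
        dot = h.find(".")
        if dot < 0:
            return False
        # Drop the leftmost label and retry: "www.cnn.com" -> "cnn.com".
        h = h[dot + 1:]


allow = {"leparisien.fr", "cnn.com"}
```

Stripping whole labels, rather than using a plain string suffix test, is what keeps `notleparisien.fr` and `leparisien.fr.evil.com` off the list.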
## File Structure
- Create `packages/secubox-toolbox-ng/cmd/sbxmitm/swneuter.go` — the SWNeuter unit (allow-list, match, detection, neuter body, candidate feed, flush).
- Create `packages/secubox-toolbox-ng/cmd/sbxmitm/swneuter_test.go` — unit tests.
- Modify `packages/secubox-toolbox-ng/cmd/sbxmitm/main.go` — flag, Proxy field, construction, flusher launch, mitmPipeline insertion.
- Modify `packages/secubox-toolbox/secubox_toolbox/api.py` — the `/__toolbox/sw-candidate` portal endpoint.
- Test `packages/secubox-toolbox/tests/test_sw_candidate_api.py` — the portal endpoint.
---
### Task 1: SWNeuter core (allow-list, match, detection, neuter body, candidates)
**Files:**
- Create: `packages/secubox-toolbox-ng/cmd/sbxmitm/swneuter.go`
- Test: `packages/secubox-toolbox-ng/cmd/sbxmitm/swneuter_test.go`
**Interfaces:**
- Consumes: `hostMatches(host string, patterns map[string]bool) bool` (policy.go); `reload.LoadLines/Target/NewWatcher/StatMtime` (internal/reload).
- Produces:
- `type SWNeuter struct{...}` with `newSWNeuter(path string) *SWNeuter`, `(*SWNeuter) Maybe()`, `(*SWNeuter) Match(host string) bool`, `(*SWNeuter) RecordCandidate(host string)`, `(*SWNeuter) snapshotCandidates() []string`.
- `isSWScriptRequest(req *http.Request) bool`.
- `const NeuterSW string`.
- [ ] **Step 1: Write the failing tests**
Create `packages/secubox-toolbox-ng/cmd/sbxmitm/swneuter_test.go`:
```go
// SPDX-License-Identifier: LicenseRef-CMSD-1.0
package main

import (
	"net/http"
	"strings"
	"testing"
)

func TestSWMatchSuffix(t *testing.T) {
	s := &SWNeuter{hosts: map[string]bool{"leparisien.fr": true, "cnn.com": true}}
	for _, h := range []string{"leparisien.fr", "www.leparisien.fr", "m.cnn.com", "CNN.COM"} {
		if !s.Match(h) {
			t.Fatalf("%q should match the allow-list", h)
		}
	}
	for _, h := range []string{"notleparisien.fr", "evil.com", "leparisien.fr.evil.com", ""} {
		if s.Match(h) {
			t.Fatalf("%q must NOT match", h)
		}
	}
}

func TestSWEmptyListNoOp(t *testing.T) {
	s := &SWNeuter{hosts: map[string]bool{}}
	if s.Match("www.leparisien.fr") {
		t.Fatal("empty allow-list must match nothing (targeted-strict no-op)")
	}
}

func TestSWIsScriptRequest(t *testing.T) {
	r1, _ := http.NewRequest("GET", "https://x/sw.js", nil)
	r1.Header.Set("Service-Worker", "script")
	if !isSWScriptRequest(r1) {
		t.Fatal("Service-Worker: script must be detected")
	}
	r2, _ := http.NewRequest("GET", "https://x/sw.js", nil)
	if isSWScriptRequest(r2) {
		t.Fatal("no Service-Worker header → not a SW script request")
	}
	if isSWScriptRequest(nil) {
		t.Fatal("nil request → false")
	}
}

func TestNeuterSWPassiveAndCorrect(t *testing.T) {
	if !strings.Contains(NeuterSW, "self.registration.unregister()") {
		t.Fatal("neuter SW must unregister itself")
	}
	if !strings.Contains(NeuterSW, "caches.delete") {
		t.Fatal("neuter SW must clear caches")
	}
	if strings.Contains(NeuterSW, "navigate(") {
		t.Fatal("neuter SW must be PASSIVE: no client.navigate / force reload")
	}
}

func TestSWCandidateRecordSnapshot(t *testing.T) {
	s := &SWNeuter{cand: map[string]int64{}}
	s.RecordCandidate("www.cnn.com")
	s.RecordCandidate("www.cnn.com")
	s.RecordCandidate("") // ignored
	got := s.snapshotCandidates()
	if len(got) != 1 || got[0] != "www.cnn.com" {
		t.Fatalf("snapshot = %v, want [www.cnn.com]", got)
	}
	if s.snapshotCandidates() != nil {
		t.Fatal("snapshot must read-and-CLEAR (second call → nil)")
	}
}
```
- [ ] **Step 2: Run the tests to verify they fail**
Run: `cd packages/secubox-toolbox-ng && GOFLAGS=-mod=vendor go test ./cmd/sbxmitm/ -run 'TestSW|TestNeuter' -v`
Expected: FAIL — `undefined: SWNeuter`, `undefined: isSWScriptRequest`, `undefined: NeuterSW`.
- [ ] **Step 3: Implement swneuter.go**
Create `packages/secubox-toolbox-ng/cmd/sbxmitm/swneuter.go`:
```go
// SPDX-License-Identifier: LicenseRef-CMSD-1.0
// Copyright (c) 2026 CyberMind — Gérald Kerma <devel@cybermind.fr>
//
// SecuBox-Deb :: toolbox-ng :: sbxmitm — targeted Service-Worker neuter (#753)
//
// PWA news sites (leparisien, cnn…) serve their main HTML document from a
// Service-Worker cache, so the navigation never reaches the MITM and the
// transparency banner can't be injected. For an operator-curated allow-list of
// hosts, we answer the SW SCRIPT fetch with a self-unregistering SW: the browser
// updates to it, it unregisters + drops caches, and the NEXT navigation is a
// fresh network fetch the MITM injects the banner into. PASSIVE (no forced
// reload). Targeted-strict: an empty list neuters nothing.
package main

import (
	"net/http"
	"strings"
	"sync"

	"github.com/CyberMind-FR/secubox-deb/secubox-toolbox-ng/internal/reload"
)

// NeuterSW is the self-unregistering SW body served for allow-listed hosts.
// It unregisters itself and clears all caches on activate; it NEVER calls
// client.navigate(), so the current page is not force-reloaded; the banner
// returns on the next navigation.
const NeuterSW = `self.addEventListener('install', function(e){ self.skipWaiting(); });
self.addEventListener('activate', function(e){
  e.waitUntil((async function(){
    try { var ks = await caches.keys(); await Promise.all(ks.map(function(k){ return caches.delete(k); })); } catch (_) {}
    try { await self.registration.unregister(); } catch (_) {}
  })());
});
`

// swCandMapCap bounds the candidate buffer (mirrors adCandMapCap).
const swCandMapCap = 4096

// SWNeuter holds the hot-reloadable allow-list + the auto-learn candidate buffer.
type SWNeuter struct {
	mu      sync.RWMutex
	hosts   map[string]bool // allow-list (lowercased; suffix-matched via hostMatches)
	watcher *reload.Watcher

	cmu  sync.Mutex
	cand map[string]int64 // host -> hits (SW hosts NOT yet on the allow-list)
}

// newSWNeuter loads the allow-list file and registers a hot-reload watcher.
// A missing/unreadable file yields an empty (no-op) list.
func newSWNeuter(path string) *SWNeuter {
	s := &SWNeuter{
		hosts: reload.LoadLines(path, true),
		cand:  map[string]int64{},
	}
	target := reload.Target{
		Path:      path,
		LastMtime: reload.StatMtime(path),
		Load:      func(p string) any { return reload.LoadLines(p, true) },
		Apply: func(v any) {
			m := v.(map[string]bool)
			s.mu.Lock()
			s.hosts = m
			s.mu.Unlock()
		},
	}
	// Use the default throttle (15s), consistent with policy.go; a throttle
	// of 0 would stat the file on every Maybe() call.
	s.watcher = reload.NewWatcher(reload.DefaultReloadThrottle, target)
	return s
}

// Maybe triggers a throttled hot-reload check (one stat + mtime compare when due).
func (s *SWNeuter) Maybe() {
	if s != nil && s.watcher != nil {
		s.watcher.Maybe()
	}
}

// Match reports whether host is on the allow-list (exact or dotted-suffix).
func (s *SWNeuter) Match(host string) bool {
	s.mu.RLock()
	defer s.mu.RUnlock()
	return hostMatches(host, s.hosts)
}

// RecordCandidate tallies a SW host not on the allow-list (auto-learn proposal).
func (s *SWNeuter) RecordCandidate(host string) {
	h := strings.Trim(strings.ToLower(host), ".")
	if h == "" {
		return
	}
	s.cmu.Lock()
	defer s.cmu.Unlock()
	if _, ok := s.cand[h]; ok {
		s.cand[h]++
	} else if len(s.cand) < swCandMapCap {
		s.cand[h] = 1
	}
}

// snapshotCandidates atomically reads-and-clears the candidate buffer.
func (s *SWNeuter) snapshotCandidates() []string {
	s.cmu.Lock()
	defer s.cmu.Unlock()
	if len(s.cand) == 0 {
		return nil
	}
	out := make([]string, 0, len(s.cand))
	for h := range s.cand {
		out = append(out, h)
	}
	s.cand = map[string]int64{}
	return out
}

// isSWScriptRequest reports whether req is a Service-Worker SCRIPT fetch.
// Browsers send the spec-mandated `Service-Worker: script` header on the
// register() fetch and every update check; it is reliable and host-agnostic.
func isSWScriptRequest(req *http.Request) bool {
	return req != nil && strings.EqualFold(req.Header.Get("Service-Worker"), "script")
}
```
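The candidate buffer's two properties, a hard cap on distinct hosts and read-and-clear snapshots, are easy to check in isolation. The sketch below restates that logic in Python for illustration; `CandidateBuffer` is a hypothetical name, and unlike the Go version it returns the hosts sorted so the output is deterministic.

```python
import threading


class CandidateBuffer:
    """Bounded host -> hits tally with read-and-clear snapshots."""

    def __init__(self, cap: int = 4096):
        self._lock = threading.Lock()
        self._cap = cap
        self._hits = {}

    def record(self, host: str) -> None:
        h = host.lower().strip(".")
        if not h:
            return
        with self._lock:
            if h in self._hits:
                self._hits[h] += 1  # known hosts always count
            elif len(self._hits) < self._cap:
                self._hits[h] = 1   # new hosts only while under the cap

    def snapshot(self):
        with self._lock:
            if not self._hits:
                return None
            out = sorted(self._hits)  # sorted here for determinism (assumption)
            self._hits = {}
            return out
```

With the cap reached, new hosts are dropped rather than evicting old ones, so a burst of distinct SW hosts cannot grow the map without bound between flushes.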
- [ ] **Step 4: Run the tests to verify they pass**
Run: `cd packages/secubox-toolbox-ng && GOFLAGS=-mod=vendor go test ./cmd/sbxmitm/ -run 'TestSW|TestNeuter' -v`
Expected: PASS (5 tests).
- [ ] **Step 5: Commit**
```bash
git add packages/secubox-toolbox-ng/cmd/sbxmitm/swneuter.go packages/secubox-toolbox-ng/cmd/sbxmitm/swneuter_test.go
git commit -m "feat(sbxmitm): SWNeuter — allow-list + self-unregistering SW body (ref #753)"
```
---
### Task 2: Wire SWNeuter into the engine (flag, Proxy, mitmPipeline)
**Files:**
- Modify: `packages/secubox-toolbox-ng/cmd/sbxmitm/main.go`
**Interfaces:**
- Consumes: `newSWNeuter`, `(*SWNeuter).Maybe/Match/RecordCandidate`, `isSWScriptRequest`, `NeuterSW` (Task 1); `writeRaw` (util.go).
- Produces: `Proxy.swNeuter *SWNeuter` field; `--sw-neuter-hosts` flag.
- [ ] **Step 1: Add the struct field**
In `main.go`, in `type Proxy struct`, after the `media *mediaCatcher` field, add:
```go
// swNeuter (#753) is the targeted Service-Worker neuter: for allow-listed
// hosts it answers the SW script fetch with a self-unregistering SW so PWA
// shells stop being SW-cached and the banner can be injected on the next nav.
swNeuter *SWNeuter
```
- [ ] **Step 2: Add the flag + construction**
In `main()`, next to the other `flag.String` defs (~line 530, after `mediaCatch`), add:
```go
swNeuterHosts := flag.String("sw-neuter-hosts", "/var/lib/secubox/toolbox/sw-neuter-hosts.txt",
"#753 allow-list of PWA hosts whose Service Worker is neutered (served a self-unregistering SW) so the banner can be injected; empty/missing file = no-op")
```
In the `px := &Proxy{ ... }` literal (~line 551), after `media: newMediaCatcher(*mediaCatch),`, add:
```go
swNeuter: newSWNeuter(*swNeuterHosts),
```
- [ ] **Step 3: Insert the neuter short-circuit in mitmPipeline**
In `mitmPipeline`, immediately AFTER the `isToolboxAssetPath` short-circuit block (the `if isToolboxAssetPath(req.URL.RequestURI()) { servePortalAsset(...); return }`) and BEFORE the `if dialHost != "" && host != ""` block, insert:
```go
// #753 — targeted SW-neuter. For an allow-listed host, answer the
// Service-Worker script fetch with a self-unregistering SW (the next
// navigation bypasses the now-gone SW → reaches the MITM → banner). Off the
// list, record the host as an auto-learn candidate. Only ever fires on the
// `Service-Worker: script` request — normal traffic is untouched.
if px.swNeuter != nil && isSWScriptRequest(req) {
px.swNeuter.Maybe()
if px.swNeuter.Match(host) {
writeRaw(tconn, 200, "OK", map[string]string{
"Content-Type": "application/javascript",
"Cache-Control": "no-store",
"X-SecuBox-Ng": "sw-neutered",
}, []byte(NeuterSW))
return
}
px.swNeuter.RecordCandidate(host)
}
```
- [ ] **Step 4: Build + vet + run the package tests**
Run: `cd packages/secubox-toolbox-ng && GOFLAGS=-mod=vendor go build ./... && GOFLAGS=-mod=vendor go vet ./cmd/sbxmitm/ && GOFLAGS=-mod=vendor go test ./cmd/sbxmitm/ -count=1`
Expected: build OK, vet clean, all tests PASS (Task 1 tests + the rest of the package).
- [ ] **Step 5: Commit**
```bash
git add packages/secubox-toolbox-ng/cmd/sbxmitm/main.go
git commit -m "feat(sbxmitm): wire SWNeuter into mitmPipeline + --sw-neuter-hosts (ref #753)"
```
---
### Task 3: Auto-learn flush + portal `/__toolbox/sw-candidate` endpoint
**Files:**
- Modify: `packages/secubox-toolbox-ng/cmd/sbxmitm/swneuter.go` (add the flusher)
- Modify: `packages/secubox-toolbox-ng/cmd/sbxmitm/main.go` (launch the flusher)
- Modify: `packages/secubox-toolbox/secubox_toolbox/api.py` (the endpoint)
- Test: `packages/secubox-toolbox-ng/cmd/sbxmitm/swneuter_test.go` (flush shape) + `packages/secubox-toolbox/tests/test_sw_candidate_api.py`
**Interfaces:**
- Consumes: `snapshotCandidates` (Task 1); `portalTargetURL`, `adEventClient` (adstats.go); FastAPI `router`, `Request`, `Response` (api.py).
- Produces: `(*SWNeuter) flushCandidatesOnce(portal string) []string`, `(*SWNeuter) runCandidateFlusher(portal string)`; `POST /__toolbox/sw-candidate`.
- [ ] **Step 1: Write the failing Go flush test**
Append to `swneuter_test.go`:
```go
func TestSWFlushCandidatesClears(t *testing.T) {
s := &SWNeuter{cand: map[string]int64{}}
s.RecordCandidate("www.cnn.com")
// 127.0.0.1:0 is unreachable → Post fails fast (best-effort); the snapshot must still drain.
got := s.flushCandidatesOnce("http://127.0.0.1:0")
if len(got) != 1 || got[0] != "www.cnn.com" {
t.Fatalf("flush returned %v, want [www.cnn.com]", got)
}
if s.snapshotCandidates() != nil {
t.Fatal("flush must have drained the buffer")
}
if s.flushCandidatesOnce("http://127.0.0.1:0") != nil {
t.Fatal("empty buffer → flush returns nil, no POST")
}
}
```
- [ ] **Step 2: Run it to verify it fails**
Run: `cd packages/secubox-toolbox-ng && GOFLAGS=-mod=vendor go test ./cmd/sbxmitm/ -run TestSWFlush -v`
Expected: FAIL — `s.flushCandidatesOnce undefined`.
- [ ] **Step 3: Implement the flusher in swneuter.go**
Add these imports to `swneuter.go`'s import block: `"bytes"`, `"encoding/json"`, `"time"`. Then append:
```go
// swFlushInterval is how often pending candidates are POSTed to the portal.
const swFlushInterval = 30 * time.Second
// flushCandidatesOnce drains the candidate buffer and best-effort POSTs the host
// list to the portal's /__toolbox/sw-candidate ingest. Returns the drained hosts
// (so a test can assert the snapshot/clear); a dead/slow portal is swallowed.
func (s *SWNeuter) flushCandidatesOnce(portal string) []string {
hosts := s.snapshotCandidates()
if len(hosts) == 0 {
return nil
}
buf, err := json.Marshal(map[string][]string{"hosts": hosts})
if err != nil {
return hosts
}
url := portalTargetURL(portal, "/__toolbox/sw-candidate")
if resp, err := adEventClient.Post(url, "application/json", bytes.NewReader(buf)); err == nil && resp != nil {
resp.Body.Close()
}
return hosts
}
// runCandidateFlusher drains the candidate buffer to the portal every
// swFlushInterval. Launched as a background goroutine from main().
func (s *SWNeuter) runCandidateFlusher(portal string) {
for {
time.Sleep(swFlushInterval)
s.flushCandidatesOnce(portal)
}
}
```
- [ ] **Step 4: Run the Go flush test (pass)**
Run: `cd packages/secubox-toolbox-ng && GOFLAGS=-mod=vendor go test ./cmd/sbxmitm/ -run TestSWFlush -v`
Expected: PASS.
- [ ] **Step 5: Launch the flusher in main()**
In `main.go`, next to `go px.ads.runAdStatsFlusher(*portal, px.cand)` (~line 579), add:
```go
go px.swNeuter.runCandidateFlusher(*portal)
```
- [ ] **Step 6: Write the failing portal endpoint test**
Create `packages/secubox-toolbox/tests/test_sw_candidate_api.py`:
```python
# SPDX-License-Identifier: LicenseRef-CMSD-1.0
"""Tests for POST /__toolbox/sw-candidate (ref #753)."""
import asyncio
import json
from secubox_toolbox import api
class _Req:
def __init__(self, payload):
self._payload = payload
async def json(self):
return self._payload
def test_sw_candidate_appends_and_dedupes(tmp_path, monkeypatch):
f = tmp_path / "sw-neuter-candidates.txt"
monkeypatch.setattr(api, "SW_CANDIDATES_FILE", f)
r1 = asyncio.run(api.toolbox_sw_candidate(_Req({"hosts": ["www.cnn.com", "leparisien.fr"]})))
assert r1.status_code == 204
asyncio.run(api.toolbox_sw_candidate(_Req({"hosts": ["www.cnn.com", "20minutes.fr"]})))
lines = [l.strip() for l in f.read_text().splitlines() if l.strip()]
assert sorted(lines) == ["20minutes.fr", "leparisien.fr", "www.cnn.com"] # deduped
def test_sw_candidate_ignores_bad_payload(tmp_path, monkeypatch):
f = tmp_path / "sw-neuter-candidates.txt"
monkeypatch.setattr(api, "SW_CANDIDATES_FILE", f)
r = asyncio.run(api.toolbox_sw_candidate(_Req({"hosts": [None, 123, ""]})))
assert r.status_code == 204
assert not f.exists() or f.read_text().strip() == ""
```
- [ ] **Step 7: Run it to verify it fails**
Run: `cd packages/secubox-toolbox && python -m pytest tests/test_sw_candidate_api.py -v`
Expected: FAIL — `module 'secubox_toolbox.api' has no attribute 'toolbox_sw_candidate'`.
- [ ] **Step 8: Implement the portal endpoint**
In `packages/secubox-toolbox/secubox_toolbox/api.py`, near the other `/__toolbox/*` routes (e.g. after `toolbox_inline`), add (and add `from pathlib import Path` to the imports if absent):
```python
SW_CANDIDATES_FILE = Path("/var/lib/secubox/toolbox/sw-neuter-candidates.txt")
def _append_sw_candidates(hosts: list[str]) -> None:
"""Append new hosts to the sw-neuter candidates file, deduped against what is
already there. Best-effort; never raises into the request path."""
try:
existing: set[str] = set()
if SW_CANDIDATES_FILE.exists():
existing = {l.strip() for l in SW_CANDIDATES_FILE.read_text().splitlines() if l.strip()}
fresh = [h for h in hosts if h not in existing]
if not fresh:
return
SW_CANDIDATES_FILE.parent.mkdir(parents=True, exist_ok=True)
with SW_CANDIDATES_FILE.open("a", encoding="utf-8") as fh:
for h in fresh:
fh.write(h + "\n")
except OSError as e:
log.debug("sw-candidate append failed: %s", e)
@router.post("/__toolbox/sw-candidate")
async def toolbox_sw_candidate(request: Request) -> Response:
"""#753 — record SW-PWA hosts proposed for the sw-neuter allow-list. sbxmitm
POSTs hosts it saw fetching a Service Worker that are NOT yet allow-listed.
Deduped-appends to the candidates file for operator review; the operator
promotes wanted hosts to sw-neuter-hosts.txt."""
try:
body = await request.json()
hosts = [h for h in (body.get("hosts") or []) if isinstance(h, str) and h]
except Exception:
hosts = []
if hosts:
_append_sw_candidates(hosts)
return Response(status_code=204)
```
Note: confirm `log` is the module logger name used elsewhere in api.py; if it differs, match the existing name. Confirm `Request`/`Response` are already imported (they are — other routes use them).
- [ ] **Step 9: Run both test suites**
Run: `cd packages/secubox-toolbox && python -m pytest tests/test_sw_candidate_api.py -v`
Expected: PASS (2 tests).
Run: `cd packages/secubox-toolbox-ng && GOFLAGS=-mod=vendor go build ./... && GOFLAGS=-mod=vendor go test ./cmd/sbxmitm/ -count=1`
Expected: build OK, all PASS.
- [ ] **Step 10: Commit**
```bash
git add packages/secubox-toolbox-ng/cmd/sbxmitm/swneuter.go packages/secubox-toolbox-ng/cmd/sbxmitm/swneuter_test.go packages/secubox-toolbox-ng/cmd/sbxmitm/main.go packages/secubox-toolbox/secubox_toolbox/api.py packages/secubox-toolbox/tests/test_sw_candidate_api.py
git commit -m "feat(toolbox): SW-neuter auto-learn flush + /__toolbox/sw-candidate ingest (ref #753)"
```
---
## Manual validation (after deploy — not part of the TDD loop)
Rebuild + install the `secubox-toolbox-ng` .deb (bump changelog), then:
1. `echo leparisien.fr >> /var/lib/secubox/toolbox/sw-neuter-hosts.txt`
2. Through the tunnel, hard-reload leparisien.fr → DevTools → Application → Service Workers shows it unregistered; the banner appears on the next navigation.
3. A host NOT on the list keeps its SW; after a few minutes it appears in `/var/lib/secubox/toolbox/sw-neuter-candidates.txt`.
4. Confirm a non-PWA site (lemonde, x.com) is completely unaffected.
## Self-Review notes
- **Spec coverage:** SW-script detection + neuter serve (Task 1+2) ✓; targeted allow-list, hot-reload, suffix-match (Task 1) ✓; passive neuter SW (Task 1, asserted) ✓; auto-learn candidate record + flush + operator-review file (Task 1+3) ✓; targeted-strict/empty-list no-op + fail-open (Task 1, asserted) ✓; durability via .deb (manual section + #754 flow) ✓.
- **Type consistency:** `SWNeuter` methods (`Maybe`, `Match`, `RecordCandidate`, `snapshotCandidates`, `flushCandidatesOnce`, `runCandidateFlusher`) and `isSWScriptRequest`/`NeuterSW` are referenced identically across Tasks 1-3 and the main.go wiring. The portal `SW_CANDIDATES_FILE`/`toolbox_sw_candidate`/`_append_sw_candidates` names match between the test and the endpoint.
- **Out of scope (per spec):** WebUI to review candidates / manage the list (v1 = the two files); forced reload; injecting into SW revalidation fetches.
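
The best-effort flush from Task 3 (drain first, then POST, swallow any portal error) can be sketched in isolation. This is a minimal illustration, not the sbxmitm code: `flushOnce`, `deadPortalClient`, `failingTransport` and the `portal.invalid` URL are hypothetical names, and a failing `http.RoundTripper` stands in for an unreachable portal so no socket is opened.

```go
package main

import (
	"bytes"
	"encoding/json"
	"errors"
	"fmt"
	"net/http"
	"time"
)

// failingTransport simulates a dead portal without touching the network.
type failingTransport struct{}

func (failingTransport) RoundTrip(*http.Request) (*http.Response, error) {
	return nil, errors.New("portal unreachable")
}

// deadPortalClient returns a short-timeout client whose every request fails.
func deadPortalClient() *http.Client {
	return &http.Client{Timeout: 2 * time.Second, Transport: failingTransport{}}
}

// flushOnce drains *pending BEFORE posting, so a failed POST never leaves
// hosts queued; it returns the drained hosts so callers can assert on them.
func flushOnce(client *http.Client, url string, pending *[]string) []string {
	hosts := *pending
	*pending = nil
	if len(hosts) == 0 {
		return nil // nothing to send, no POST
	}
	buf, err := json.Marshal(map[string][]string{"hosts": hosts})
	if err != nil {
		return hosts
	}
	// Best-effort: a POST error is deliberately ignored.
	if resp, err := client.Post(url, "application/json", bytes.NewReader(buf)); err == nil && resp != nil {
		resp.Body.Close()
	}
	return hosts
}

func main() {
	pending := []string{"www.cnn.com"}
	got := flushOnce(deadPortalClient(), "http://portal.invalid/__toolbox/sw-candidate", &pending)
	fmt.Println("drained:", got, "still pending:", len(pending))
}
```

Draining before the POST means candidates seen during a portal outage are dropped rather than retried, which matches the fire-and-forget intent described in the plan.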

@ -0,0 +1,21 @@
# #757 — SW revalidation nudge (strip conditional headers for allow-listed hosts)
**Goal:** stale-while-revalidate SWs on the sw-neuter allow-list update their cache
with a banner'd shell WITHOUT being neutered — by forcing a full 200 (not a 304)
on their HTML revalidation fetch so the MITM's existing injection lands.
**Design (approved):** in `mitmPipeline`, BEFORE the upstream proxy (`up.Do(req)`),
for an allow-listed host (`swNeuter.Match`) on an HTML document request, strip
`If-None-Match` + `If-Modified-Since` → upstream returns 200 with body → MITM
injects the banner → the SW caches the banner'd version on next background
revalidation. Gated to the allow-list (bounds the full-200 bandwidth cost).
Limitation: only helps SWs that revalidate; cache-first still needs #753's neuter.
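
Why stripping the validators yields a full body can be shown with a stand-in origin. This is a hedged sketch: `origin`, `fetch`, the `"v1"` ETag and `example.test` are illustrative assumptions, and the handler is driven through `httptest` rather than a real upstream.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

const etag = `"v1"`

// origin is a stand-in upstream: it answers 304 when the validator matches,
// otherwise 200 with an HTML body the MITM could inject into.
func origin(w http.ResponseWriter, r *http.Request) {
	w.Header().Set("ETag", etag)
	if r.Header.Get("If-None-Match") == etag {
		w.WriteHeader(http.StatusNotModified)
		return
	}
	w.Header().Set("Content-Type", "text/html")
	fmt.Fprint(w, "<html><body>shell</body></html>")
}

// fetch sends one revalidation request to origin, optionally stripping the
// conditional headers first, and returns the status and body length.
func fetch(strip bool) (int, int) {
	req := httptest.NewRequest(http.MethodGet, "https://example.test/", nil)
	req.Header.Set("If-None-Match", etag)
	req.Header.Set("If-Modified-Since", "Sat, 27 Jun 2026 07:00:00 GMT")
	if strip {
		req.Header.Del("If-None-Match")
		req.Header.Del("If-Modified-Since")
	}
	rec := httptest.NewRecorder()
	origin(rec, req)
	return rec.Code, rec.Body.Len()
}

func main() {
	for _, strip := range []bool{false, true} {
		code, n := fetch(strip)
		fmt.Printf("strip=%v status=%d bodyBytes=%d\n", strip, code, n)
	}
}
```

With the validators intact the response carries no body, so there is nothing to inject into; without them the origin has to send the document again.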
## Task 1 (single)
**Files:** modify `cmd/sbxmitm/swneuter.go` (add `requestWantsHTML`), `cmd/sbxmitm/main.go` (the strip); test `cmd/sbxmitm/swneuter_test.go`.
- `requestWantsHTML(req)`: true if `Sec-Fetch-Dest: document` (case-insensitive) OR `Accept` contains `text/html`; false for a nil request.
- Strip in mitmPipeline before `resp, err := up.Do(req)`:
`if px.swNeuter != nil && requestWantsHTML(req) && px.swNeuter.Match(host) { req.Header.Del("If-None-Match"); req.Header.Del("If-Modified-Since") }`
- Test: requestWantsHTML true/false cases (Sec-Fetch-Dest document, Accept text/html, neither, nil).
- Verify: go build + vet + go test ./cmd/sbxmitm/.
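
The detection rule and the strip listed above can be sketched as follows. This is an illustration under assumptions, not the shipped function: the allow-list gate (`swNeuter.Match`) is left out for brevity, and `stripConditionals` and `newReq` are hypothetical helpers.

```go
package main

import (
	"fmt"
	"net/http"
	"strings"
)

// requestWantsHTML follows the rule in the task list: Sec-Fetch-Dest is
// "document" (case-insensitive) OR Accept mentions text/html; nil is false.
func requestWantsHTML(req *http.Request) bool {
	if req == nil {
		return false
	}
	if strings.EqualFold(strings.TrimSpace(req.Header.Get("Sec-Fetch-Dest")), "document") {
		return true
	}
	return strings.Contains(strings.ToLower(req.Header.Get("Accept")), "text/html")
}

// stripConditionals removes both validators so upstream cannot answer 304.
func stripConditionals(req *http.Request) {
	req.Header.Del("If-None-Match")
	req.Header.Del("If-Modified-Since")
}

// newReq builds a GET request carrying the given headers (hypothetical helper).
func newReq(headers map[string]string) *http.Request {
	req, err := http.NewRequest(http.MethodGet, "https://www.leparisien.fr/", nil)
	if err != nil {
		panic(err) // static URL, not expected to fail
	}
	for k, v := range headers {
		req.Header.Set(k, v)
	}
	return req
}

func main() {
	req := newReq(map[string]string{
		"Sec-Fetch-Dest": "document",
		"If-None-Match":  `"abc"`,
	})
	if requestWantsHTML(req) {
		stripConditionals(req)
	}
	fmt.Println("validator removed:", req.Header.Get("If-None-Match") == "")
}
```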

@ -0,0 +1,141 @@
# Design — Targeted Service-Worker neuter for the R3 banner (#753)
- **Issue:** #753
- **Date:** 2026-06-27
- **Status:** Approved (brainstorm), pending implementation plan
- **Author:** Gérald Kerma / CyberMind
## Problem
The R3 transparency banner is absent on Service-Worker PWA sites (leparisien.fr,
cnn.com, 20minutes.fr, franceinfo). Their SW serves the **main HTML document from
its cache**, so the navigation request never reaches the MITM → nothing to inject
into. Confirmed via `SBX_DEBUG_CSP`: `www.leparisien.fr` produced **0**
`[csp-debug]` lines (vs lemonde/x.com which do). The inline #662 banner defeats
SW hijack of the *loader src*, but not a fully cached HTML shell.
## Decided scope (from brainstorm)
- **Targeted + auto-learn.** Neuter the SW only on an editable allow-list of
hosts; nothing global (a global SW-kill would break offline/push for every
tunnel site). Auto-detection proposes candidate hosts; the operator promotes.
- **Passive re-appearance.** The neuter SW unregisters silently and clears its
caches; it does NOT force-reload clients. The banner returns on the **next
navigation** (which bypasses the now-gone SW → fresh fetch → MITM injects).
- **Accepted tradeoff:** neutering a listed site's SW breaks its offline mode /
web-push / background-sync for tunnel clients. This is the cost of coverage,
scoped to the curated list.
## Approach (chosen)
Intercept the **Service-Worker script fetch** in sbxmitm and, for allow-listed
hosts, serve a self-unregistering SW instead of proxying the real one. The
browser updates to it → unregisters → caches cleared → next navigation is fresh.
Why this over alternatives:
- **vs. injecting a SW-unregister script into pages:** chicken-and-egg — the main
doc is SW-served, so our injected script never reaches it. Intercepting the SW
*script fetch* works because the browser re-fetches the SW script over the
network (the `Service-Worker: script` request DOES traverse the MITM), even
for cache-first PWAs.
- **vs. blocking sw.js with a 204:** a 204 stops SW *updates* but does not remove
an already-installed controlling SW. Serving an unregistering SW actively
removes it.
## Components
Each is small and follows an existing sbxmitm pattern.
### 1. `cmd/sbxmitm/swneuter.go` (new)
- **Allow-list loader:** wraps `reload.LoadLines("/var/lib/secubox/toolbox/sw-neuter-hosts.txt", true)` with a `reload.Watcher` for hot-reload — identical to the splice-whitelist / learned-trackers loaders. Exposes `Match(host) bool` doing the same suffix-match used by `policy`/splice (`host == p || strings.HasSuffix(host, "."+p)`), lowercased + port-stripped.
- **`isSWScriptRequest(req) bool`:** true when the request carries the spec-mandated `Service-Worker: script` header (browsers send it on every SW script fetch).
- **`NeuterSW` constant:** the self-unregistering SW body (see below).
- Construction wired in `main()` from a flag `--sw-neuter-hosts` (default `/var/lib/secubox/toolbox/sw-neuter-hosts.txt`); nil-safe (a nil neuter = feature off).
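
The suffix-match described for `Match` can be sketched as below. The real `hostMatches` helper is not shown in this document, so `normalizeHost` and `matchAllowList` are hypothetical stand-ins that only assume the stated rule (`host == p || strings.HasSuffix(host, "."+p)`, lowercased and port-stripped).

```go
package main

import (
	"fmt"
	"net"
	"strings"
)

// normalizeHost lowercases host, strips an optional :port and trims dots.
func normalizeHost(host string) string {
	h := strings.ToLower(strings.TrimSpace(host))
	if hostOnly, _, err := net.SplitHostPort(h); err == nil {
		h = hostOnly // a port was present
	}
	return strings.Trim(h, ".")
}

// matchAllowList reports whether host equals an allow-list entry or is a
// dotted subdomain of one. The "."+p prefix is what stops
// "notleparisien.fr" from matching "leparisien.fr".
func matchAllowList(host string, allow map[string]bool) bool {
	h := normalizeHost(host)
	if h == "" {
		return false
	}
	for p := range allow {
		if h == p || strings.HasSuffix(h, "."+p) {
			return true
		}
	}
	return false
}

func main() {
	allow := map[string]bool{"leparisien.fr": true}
	for _, h := range []string{"www.leparisien.fr:443", "LeParisien.fr", "notleparisien.fr"} {
		fmt.Printf("%-24s %v\n", h, matchAllowList(h, allow))
	}
}
```

An empty or nil map makes every lookup false, which is the no-op behaviour the design relies on for a missing allow-list file.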
### 2. Insertion in `mitmPipeline` (main.go)
After the decrypted request is read and BEFORE the normal proxy, at the same
layer as the `verdict == "block"` → 204 short-circuit: if
`neuter != nil && isSWScriptRequest(req) && neuter.Match(host)`, then call
`writeRaw(tconn, 200, "OK", {"Content-Type":"application/javascript","Cache-Control":"no-store","X-SecuBox-Ng":"sw-neutered"}, []byte(NeuterSW))` and return. The real SW script is never fetched.
### 3. Autolearn candidate feed
When sbxmitm sees `isSWScriptRequest(req)` for a host that is NOT on the allow-list,
record it as a sw-neuter candidate (lock-guarded, capped map, mirroring
`adstats.go`'s ad-candidate aggregator). Drained by the existing stats flusher
into a portal POST (a new `sw_candidates` field on the existing ad-event payload,
or a sibling `/__toolbox/sw-candidate` endpoint — decide at plan time to reuse the
existing channel where cleanest). The portal stores candidates; the existing
`secubox-toolbox-autolearn` proposes them; the operator promotes a host by adding
it to `sw-neuter-hosts.txt` (de-whitelist = remove the line — same UX as
splice-whitelist).
Precision note: candidate proposal is intentionally broad (any SW-script host not
already listed). It is SAFE because nothing is neutered until the operator
promotes a host to the allow-list — proposals never auto-neuter.
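
A lock-guarded, capped candidate buffer with read-and-clear semantics might look like the sketch below. The names (`candBuffer`, `record`, `drain`) and the tiny limit are illustrative assumptions, not the `adstats.go` aggregator itself.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// candBuffer counts hits per host but refuses NEW hosts once limit is reached,
// so a flood of distinct hosts cannot grow memory without bound.
type candBuffer struct {
	mu    sync.Mutex
	limit int
	hits  map[string]int64
}

func newCandBuffer(limit int) *candBuffer {
	return &candBuffer{limit: limit, hits: map[string]int64{}}
}

// record bumps an existing host, or admits a new one only below the cap.
func (b *candBuffer) record(host string) {
	b.mu.Lock()
	defer b.mu.Unlock()
	if _, ok := b.hits[host]; ok {
		b.hits[host]++
		return
	}
	if len(b.hits) < b.limit {
		b.hits[host] = 1
	}
}

// drain returns the buffered hosts (sorted for stable output) and resets the
// buffer under the same lock, so no hit is lost or reported twice.
func (b *candBuffer) drain() []string {
	b.mu.Lock()
	defer b.mu.Unlock()
	if len(b.hits) == 0 {
		return nil
	}
	out := make([]string, 0, len(b.hits))
	for h := range b.hits {
		out = append(out, h)
	}
	sort.Strings(out)
	b.hits = map[string]int64{}
	return out
}

func main() {
	b := newCandBuffer(2)
	for _, h := range []string{"a.example", "a.example", "b.example", "c.example"} {
		b.record(h)
	}
	fmt.Println(b.drain(), b.drain())
}
```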
### The neuter SW body (`NeuterSW`)
```js
// SecuBox SW-neuter (#753): self-unregister + drop caches so the next
// navigation is a fresh network fetch the MITM can inject the banner into.
// Passive — no client.navigate(), so the current page is not force-reloaded.
self.addEventListener('install', function(e){ self.skipWaiting(); });
self.addEventListener('activate', function(e){
e.waitUntil((async function(){
try { var ks = await caches.keys(); await Promise.all(ks.map(function(k){ return caches.delete(k); })); } catch (_) {}
try { await self.registration.unregister(); } catch (_) {}
})());
});
```
## Data flow
```
SW script fetch (Service-Worker: script) → sbxmitm mitmPipeline
├─ host ∈ allow-list → writeRaw(200, NeuterSW) → browser unregisters SW → next nav fresh → banner
└─ host ∉ allow-list → record sw-neuter candidate → flush → portal store → autolearn proposes → operator promotes
```
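
The two branches of the data flow can be sketched with a plain `net/http` handler. This is an assumption-laden illustration: sbxmitm writes to the TLS connection via `writeRaw`, whereas this sketch uses an `http.ResponseWriter`; `dispatch`, `simulate` and `neuterStub` are hypothetical, and the allow-list lookup is exact-match only to keep the example short.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
)

// neuterStub stands in for the real self-unregistering SW body.
const neuterStub = "/* self-unregistering SW body */"

// isSWScript mirrors the header check: only SW script fetches qualify.
func isSWScript(r *http.Request) bool {
	return r != nil && strings.EqualFold(r.Header.Get("Service-Worker"), "script")
}

// dispatch serves the neuter body for allow-listed hosts and records any
// other SW-script host as a candidate. It returns true when it answered.
func dispatch(w http.ResponseWriter, r *http.Request, allow map[string]bool, cands map[string]int) bool {
	if !isSWScript(r) {
		return false // normal traffic is untouched
	}
	host := strings.ToLower(r.Host)
	if allow[host] {
		w.Header().Set("Content-Type", "application/javascript")
		w.Header().Set("Cache-Control", "no-store")
		w.Header().Set("X-SecuBox-Ng", "sw-neutered")
		w.WriteHeader(http.StatusOK)
		fmt.Fprint(w, neuterStub)
		return true
	}
	cands[host]++
	return false
}

// simulate runs one request through dispatch and reports the outcome.
func simulate(host string, swScript bool, allow map[string]bool, cands map[string]int) (bool, string) {
	req := httptest.NewRequest(http.MethodGet, "https://"+host+"/sw.js", nil)
	if swScript {
		req.Header.Set("Service-Worker", "script")
	}
	rec := httptest.NewRecorder()
	handled := dispatch(rec, req, allow, cands)
	return handled, rec.Header().Get("X-SecuBox-Ng")
}

func main() {
	allow := map[string]bool{"www.leparisien.fr": true}
	cands := map[string]int{}
	fmt.Println(simulate("www.leparisien.fr", true, allow, cands))
	fmt.Println(simulate("www.cnn.com", true, allow, cands))
	fmt.Println("candidates:", cands)
}
```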
## Error handling / safety
- **Targeted-strict:** only allow-listed hosts are neutered; an empty/missing
list is a complete no-op (fail-safe via `LoadLines` → empty set).
- **Off-switch:** a nil neuter (flag pointing at a non-existent file, or feature
disabled) means the SW-script path is untouched — normal proxy.
- **Scoped trigger:** the neuter is served ONLY on requests carrying the
`Service-Worker: script` header, never on normal navigation/subresource
traffic.
- **Idempotent / loop-safe:** re-serving the neuter SW is harmless (it just
unregisters again); passive mode means no reload loop.
- **Candidate cap:** the autolearn buffer is bounded (mirrors `adCandMapCap`) so
a flood of SW hosts cannot grow memory unbounded.
## Testing
- **Unit (Go, `cmd/sbxmitm/swneuter_test.go`):**
- `Match`: suffix-match positives (`leparisien.fr` matches `www.leparisien.fr`)
+ negatives (`notleparisien.fr` must NOT match); exact host; port-stripped.
- `isSWScriptRequest`: true with `Service-Worker: script`, false without.
- `NeuterSW` body: contains `self.registration.unregister()` and clears caches,
and does NOT contain `client.navigate`/`clients.matchAll(...).navigate`
(passive guarantee).
- empty/missing allow-list file → `Match` always false (no-op).
- **Manual:** add `leparisien.fr` to `sw-neuter-hosts.txt`; reload leparisien
through the tunnel; confirm the SW is unregistered (DevTools → Application →
Service Workers) and the banner appears on the next navigation. Confirm a host
NOT on the list keeps its SW.
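
The passive-guarantee assertion on the SW body could be written as a small string check. This is a sketch only: `checkPassive` is a hypothetical helper, and the body below is copied from the design so the example runs on its own.

```go
package main

import (
	"fmt"
	"strings"
)

// neuterSW is the self-unregistering SW body from the design above.
const neuterSW = `self.addEventListener('install', function(e){ self.skipWaiting(); });
self.addEventListener('activate', function(e){
  e.waitUntil((async function(){
    try { var ks = await caches.keys(); await Promise.all(ks.map(function(k){ return caches.delete(k); })); } catch (_) {}
    try { await self.registration.unregister(); } catch (_) {}
  })());
});
`

// checkPassive lists every violated expectation: the body must unregister
// and clear caches, and must not navigate (force-reload) any client.
func checkPassive(body string) []string {
	var problems []string
	for _, must := range []string{"self.registration.unregister()", "caches.delete("} {
		if !strings.Contains(body, must) {
			problems = append(problems, "missing "+must)
		}
	}
	for _, banned := range []string{"client.navigate", ".navigate("} {
		if strings.Contains(body, banned) {
			problems = append(problems, "forbidden "+banned)
		}
	}
	return problems
}

func main() {
	fmt.Println("problems:", checkPassive(neuterSW))
}
```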
## Out of scope (this iteration)
- A WebUI panel to manage the allow-list / review candidates (v2 — the text file
+ the autolearn proposal channel are the v1 surface, mirroring splice-whitelist).
- Forced/immediate reload (the brainstorm chose passive).
- Injecting into the SW's own revalidation fetches (approach 2 in the issue) —
the neuter approach supersedes it for cache-first PWAs; revisit only if a
network-first PWA proves the neuter too aggressive.
## Durability
The new flag + allow-list default ship in the `secubox-toolbox-ng` package; the
allow-list file is operator state under `/var/lib/secubox/toolbox/` (not shipped,
created empty by postinst/tmpfiles if needed). A `.deb` bump + reinstall makes
the engine change durable (same flow as #754).

@ -28,6 +28,7 @@ import (
"net/url"
"regexp"
"sync"
"sync/atomic"
"time"
)
@ -132,10 +133,12 @@ type adCounter struct {
// adStats is the lock-guarded in-memory aggregator. blocks is keyed by
// (adHost,site); clients by (macHash,adHost). The keys are small structs so the
// maps stay allocation-light and comparable without string concatenation.
+// cosmetic counts HTML pages where the cosmetic ad-hide style was injected (#755).
type adStats struct {
-mu sync.Mutex
-blocks map[adKey]*adCounter
-clients map[cliKey]*adCounter
+mu sync.Mutex
+blocks map[adKey]*adCounter
+clients map[cliKey]*adCounter
+cosmetic atomic.Int64 // #755 — pages where the cosmetic ad-hide style was injected
}
type adKey struct{ adHost, site string }
@ -181,6 +184,12 @@ func (a *adStats) recordAdBlock(adHost, site, macHash string) {
}
}
// recordCosmetic tallies one R3 HTML page that received the cosmetic ad-hide style.
func (a *adStats) recordCosmetic() { a.cosmetic.Add(1) }
// snapshotCosmetic atomically reads-and-clears the cosmetic page counter.
func (a *adStats) snapshotCosmetic() int64 { return a.cosmetic.Swap(0) }
// ── wire payload (mirrors the portal /__toolbox/ad-event JSON contract) ──────
type adBlockRow struct {
@ -207,9 +216,10 @@ type adCandidateRow struct {
}
type adEventPayload struct {
-Blocks []adBlockRow `json:"blocks"`
-Clients []adClientRow `json:"clients"`
-Candidates []adCandidateRow `json:"candidates,omitempty"`
+Blocks []adBlockRow `json:"blocks"`
+Clients []adClientRow `json:"clients"`
+Candidates []adCandidateRow `json:"candidates,omitempty"`
+CosmeticPages int64 `json:"cosmetic_pages,omitempty"` // #755 — pages cleaned since the last flush
}
// snapshot atomically reads-and-clears both maps, returning the accumulated rows.
@ -234,7 +244,7 @@ func (a *adStats) snapshot() adEventPayload {
// empty reports whether a payload carries no rows (nothing to POST).
func (p adEventPayload) empty() bool {
-return len(p.Blocks) == 0 && len(p.Clients) == 0 && len(p.Candidates) == 0
+return len(p.Blocks) == 0 && len(p.Clients) == 0 && len(p.Candidates) == 0 && p.CosmeticPages == 0
}
// adEventClient is a short-timeout fire-and-forget client for the ad-event POST.
@ -260,6 +270,7 @@ func (a *adStats) flushOnce(portal string, cand *adCandidates) adEventPayload {
if cand != nil {
p.Candidates = cand.snapshot()
}
p.CosmeticPages = a.snapshotCosmetic() // #755 — pages cleaned in this window
if p.empty() {
return p
}

@ -39,6 +39,13 @@ const cosmeticGuard = "sbx-ghost-style"
// CONSERVATISM note above). The rule mirrors the Python _style_for:
// display:none + visibility:hidden, both !important, collapsing the slot.
const cosmeticStyle = `<style id="sbx-ghost-style">` +
// #756 — restore scroll. When we display:none a paywall/consent overlay, the
// site's JS has often already scroll-locked the page (document.body.style.
// overflow='hidden', no inline !important). A stylesheet !important overrides
// that, so scroll returns (Bloomberg etc.). Tradeoff: a legitimate modal that
// locks body scroll will let the page scroll behind it — acceptable for a
// page-cleaning MITM whose purpose includes defeating paywall/consent locks.
`html,body{overflow:auto!important}` +
// ── ads (ported from _COSMETIC["ads"]) ──────────────────────────────────
`[id^="google_ads"],` +
`[id^="div-gpt-ad"],` +

@ -0,0 +1,16 @@
// SPDX-License-Identifier: LicenseRef-CMSD-1.0
package main
import "testing"
func TestCosmeticCounterSnapshotClears(t *testing.T) {
a := newAdStats()
a.recordCosmetic()
a.recordCosmetic()
if got := a.snapshotCosmetic(); got != 2 {
t.Fatalf("snapshotCosmetic = %d, want 2", got)
}
if got := a.snapshotCosmetic(); got != 0 {
t.Fatalf("snapshot must clear; second call = %d, want 0", got)
}
}

@ -29,6 +29,7 @@ import (
"log"
"net"
"net/http"
"os"
"strconv"
"strings"
"time"
@ -110,6 +111,11 @@ type Proxy struct {
// (manifests / direct audio-video) seen on MITM'd flows to a JSONL log the
// mediaflow "Discovered Media" view reads. nil/disabled → no-op.
media *mediaCatcher
// swNeuter (#753) is the targeted Service-Worker neuter: for allow-listed
// hosts it answers the SW script fetch with a self-unregistering SW so PWA
// shells stop being SW-cached and the banner can be injected on the next nav.
swNeuter *SWNeuter
}
// recordAdBlock forwards a 204'd ad/tracker block to the engine's metrics
@ -290,6 +296,25 @@ func (px *Proxy) mitmPipeline(tconn *tls.Conn, rawClient net.Conn, host, verdict
servePortalAsset(tconn, px.portal, req.URL.RequestURI())
return
}
// #753 — targeted SW-neuter. For an allow-listed host, answer the
// Service-Worker script fetch with a self-unregistering SW (the next
// navigation bypasses the now-gone SW → reaches the MITM → banner). Off the
// list, record the host as an auto-learn candidate. Only ever fires on the
// `Service-Worker: script` request — normal traffic is untouched.
if px.swNeuter != nil && isSWScriptRequest(req) {
px.swNeuter.Maybe()
if px.swNeuter.Match(host) {
writeRaw(tconn, 200, "OK", map[string]string{
"Content-Type": "application/javascript",
"Cache-Control": "no-store",
"X-SecuBox-Ng": "sw-neutered",
}, []byte(NeuterSW))
return
}
px.swNeuter.RecordCandidate(host)
}
// Transparent: the upstream request must carry the SNI host (for Host header,
// SNI, and cert verification); the actual TCP dial is pinned to the captured
// original-dst by the uchromeTransport. We do NOT put the bare ip:port in
@ -375,6 +400,15 @@ func (px *Proxy) mitmPipeline(tconn *tls.Conn, rawClient net.Conn, host, verdict
Transport: newUchromeTransport(dialHost, host),
}
req.RequestURI = ""
// #757 — SW revalidation nudge: for an allow-listed (sw-neuter) host, strip the
// conditional headers off an HTML navigation / SW-revalidation request so the
// upstream returns a full 200 (not a 304) → the MITM injects the banner →
// a stale-while-revalidate SW caches a banner'd shell WITHOUT being neutered.
// Cache-first SWs that never revalidate still need the #753 neuter.
if px.swNeuter != nil && requestWantsHTML(req) && px.swNeuter.Match(host) {
req.Header.Del("If-None-Match")
req.Header.Del("If-Modified-Since")
}
resp, err := up.Do(req)
if err != nil {
writeRaw(tconn, 502, "Bad Gateway", nil, nil)
@ -471,6 +505,26 @@ func (px *Proxy) mitmPipeline(tconn *tls.Conn, rawClient net.Conn, host, verdict
if px.cspDemo {
cspNonce, cspBypassed = relaxCSPForLoader(resp.Header)
}
// CSP diagnostic (#751) — opt-in via SBX_DEBUG_CSP, off by default (zero cost
// when unset). For every injected HTML response it logs what relaxCSPForLoader
// actually saw — the proto, the count of CSP / CSP-Report-Only headers visible
// in resp.Header, the borrowed nonce and the bypass decision. Kept as a
// permanent operator tool: it pinpoints why a banner does/doesn't render on a
// given site (header present? nonce-source? hash-only? strict-dynamic?), the
// class of problem that took an x.com-shaped CSP to surface.
if os.Getenv("SBX_DEBUG_CSP") != "" {
csps := resp.Header.Values("Content-Security-Policy")
cspRO := resp.Header.Values("Content-Security-Policy-Report-Only")
head := ""
if len(csps) > 0 {
head = csps[0]
if len(head) > 220 {
head = head[:220]
}
}
log.Printf("[csp-debug] host=%s proto=%s status=%d cspHdrs=%d cspRO=%d nonce=%q bypassed=%v head=%q",
host, resp.Proto, resp.StatusCode, len(csps), len(cspRO), cspNonce, cspBypassed, head)
}
// #662 — INLINE the banner (supersedes the <script src="/__toolbox/loader.js">
// tag): sites with a SERVICE WORKER hijack the same-origin src before it
// reaches this engine. We fetch the COMPLETE script body from the portal
@ -481,6 +535,9 @@ func (px *Proxy) mitmPipeline(tconn *tls.Conn, rawClient net.Conn, host, verdict
scriptBody, _ := fetchInlineBanner(px.portal, clientHash, wg, cspBypassed)
if out, ok := injectIntoBody(body, resp.Header.Get("Content-Encoding"), scriptBody, cspNonce, wg); ok {
body = out
if wg && px.ads != nil {
px.ads.recordCosmetic() // #755 — cosmetic style is wg-only (injectHTML gates it)
}
// Keep framing consistent with the served bytes (only the length changed).
resp.Header.Set("Content-Length", strconv.Itoa(len(body)))
resp.ContentLength = int64(len(body))
@ -508,6 +565,8 @@ func main() {
"compute cross-site cookie-tracker edges and POST them to the portal /__toolbox/social-event ingest so the kbin /social graph refills (#662; replaces the decommissioned Python social_graph addon). Hash-only (never raw cookie values); WG-peer flows only; batched + fire-and-forget — a dead/slow portal never affects the proxy. Set false to emit nothing.")
mediaCatch := flag.Bool("media-catch", true,
"R4 media reverse-catcher (#736): record cloneable media URLs (HLS/DASH manifests + direct audio/video) seen on MITM'd flows to "+mediaCatchPath+" for the mediaflow \"Discovered Media\" clone view. URLs only, never bodies; deduped. Set false to disable.")
swNeuterHosts := flag.String("sw-neuter-hosts", "/var/lib/secubox/toolbox/sw-neuter-hosts.txt",
"#753 allow-list of PWA hosts whose Service Worker is neutered (served a self-unregistering SW) so the banner can be injected; empty/missing file = no-op")
flag.Parse()
ca, err := forge.LoadCA(*caCert, *caKey)
if err != nil {
@ -544,6 +603,7 @@ func main() {
social: newSocialRelay(),
consent: newConsentLog(),
media: newMediaCatcher(*mediaCatch),
swNeuter: newSWNeuter(*swNeuterHosts),
}
// #662 — start the social-edge flusher: the MITM path buffers cross-site
// tracker edges into px.social, drained every 10s to the portal's
@ -556,6 +616,7 @@ func main() {
// #662 — the candidate feed (px.cand) is drained in the SAME flush so the
// learning candidates ride the existing ad-event channel (one POST / 10s).
go px.ads.runAdStatsFlusher(*portal, px.cand)
go px.swNeuter.runCandidateFlusher(*portal)
if *transparent {
// Transparent R3 mode: raw accept loop, each conn carries its pre-DNAT
// destination via SO_ORIGINAL_DST (recovered in handleTransparent). The

@ -0,0 +1,168 @@
// SPDX-License-Identifier: LicenseRef-CMSD-1.0
// Copyright (c) 2026 CyberMind — Gérald Kerma <devel@cybermind.fr>
//
// SecuBox-Deb :: toolbox-ng :: sbxmitm — targeted Service-Worker neuter (#753)
//
// PWA news sites (leparisien, cnn…) serve their main HTML document from a
// Service-Worker cache, so the navigation never reaches the MITM and the
// transparency banner can't be injected. For an operator-curated allow-list of
// hosts, we answer the SW SCRIPT fetch with a self-unregistering SW: the browser
// updates to it, it unregisters + drops caches, and the NEXT navigation is a
// fresh network fetch the MITM injects the banner into. PASSIVE (no forced
// reload). Targeted-strict: an empty list neuters nothing.
package main
import (
"bytes"
"encoding/json"
"net/http"
"strings"
"sync"
"time"
"github.com/CyberMind-FR/secubox-deb/secubox-toolbox-ng/internal/reload"
)
// NeuterSW is the self-unregistering SW body served for allow-listed hosts.
// It unregisters itself and clears all caches on activate; it NEVER calls
// client.navigate(), so the current page is not force-reloaded — the banner
// returns on the next navigation.
const NeuterSW = `self.addEventListener('install', function(e){ self.skipWaiting(); });
self.addEventListener('activate', function(e){
e.waitUntil((async function(){
try { var ks = await caches.keys(); await Promise.all(ks.map(function(k){ return caches.delete(k); })); } catch (_) {}
try { await self.registration.unregister(); } catch (_) {}
})());
});
`
// swCandMapCap bounds the candidate buffer (mirrors adCandMapCap).
const swCandMapCap = 4096
// SWNeuter holds the hot-reloadable allow-list + the auto-learn candidate buffer.
type SWNeuter struct {
mu sync.RWMutex
hosts map[string]bool // allow-list (lowercased; suffix-matched via hostMatches)
watcher *reload.Watcher
cmu sync.Mutex
cand map[string]int64 // host -> hits (SW hosts NOT yet on the allow-list)
}
// newSWNeuter loads the allow-list file and registers a hot-reload watcher.
// A missing/unreadable file yields an empty (no-op) list.
func newSWNeuter(path string) *SWNeuter {
s := &SWNeuter{
hosts: reload.LoadLines(path, true),
cand: map[string]int64{},
}
target := reload.Target{
Path: path,
LastMtime: reload.StatMtime(path),
Load: func(p string) any { return reload.LoadLines(p, true) },
Apply: func(v any) {
m := v.(map[string]bool)
s.mu.Lock()
s.hosts = m
s.mu.Unlock()
},
}
s.watcher = reload.NewWatcher(reload.DefaultReloadThrottle, target)
return s
}
// Maybe triggers a hot-reload check (cheap: one stat + mtime compare).
func (s *SWNeuter) Maybe() {
if s != nil && s.watcher != nil {
s.watcher.Maybe()
}
}
// Match reports whether host is on the allow-list (exact or dotted-suffix).
func (s *SWNeuter) Match(host string) bool {
s.mu.RLock()
defer s.mu.RUnlock()
return hostMatches(host, s.hosts)
}
// RecordCandidate tallies a SW host not on the allow-list (auto-learn proposal).
func (s *SWNeuter) RecordCandidate(host string) {
h := strings.Trim(strings.ToLower(host), ".")
if h == "" {
return
}
s.cmu.Lock()
defer s.cmu.Unlock()
if _, ok := s.cand[h]; ok {
s.cand[h]++
} else if len(s.cand) < swCandMapCap {
s.cand[h] = 1
}
}
// snapshotCandidates atomically reads-and-clears the candidate buffer.
func (s *SWNeuter) snapshotCandidates() []string {
s.cmu.Lock()
defer s.cmu.Unlock()
if len(s.cand) == 0 {
return nil
}
out := make([]string, 0, len(s.cand))
for h := range s.cand {
out = append(out, h)
}
s.cand = map[string]int64{}
return out
}
// requestWantsHTML reports whether req is for an HTML document (a navigation or a
// Service-Worker document fetch) — Sec-Fetch-Dest: document, or an Accept that
// advertises text/html. Used by the #757 revalidation nudge so we only force a
// full 200 on document fetches, never on subresources.
func requestWantsHTML(req *http.Request) bool {
if req == nil {
return false
}
if strings.EqualFold(req.Header.Get("Sec-Fetch-Dest"), "document") {
return true
}
return strings.Contains(req.Header.Get("Accept"), "text/html")
}
// isSWScriptRequest reports whether req is a Service-Worker SCRIPT fetch.
// Browsers send the spec-mandated `Service-Worker: script` header on the
// register() fetch and every update check — reliable and host-agnostic.
func isSWScriptRequest(req *http.Request) bool {
return req != nil && strings.EqualFold(req.Header.Get("Service-Worker"), "script")
}
// swFlushInterval is how often pending candidates are POSTed to the portal.
const swFlushInterval = 30 * time.Second
// flushCandidatesOnce drains the candidate buffer and best-effort POSTs the host
// list to the portal's /__toolbox/sw-candidate ingest. Returns the drained hosts
// (so a test can assert the snapshot/clear); a dead/slow portal is swallowed.
func (s *SWNeuter) flushCandidatesOnce(portal string) []string {
hosts := s.snapshotCandidates()
if len(hosts) == 0 {
return nil
}
buf, err := json.Marshal(map[string][]string{"hosts": hosts})
if err != nil {
return hosts
}
url := portalTargetURL(portal, "/__toolbox/sw-candidate")
if resp, err := adEventClient.Post(url, "application/json", bytes.NewReader(buf)); err == nil && resp != nil {
resp.Body.Close()
}
return hosts
}
// runCandidateFlusher drains the candidate buffer to the portal every
// swFlushInterval. Launched as a background goroutine from main().
func (s *SWNeuter) runCandidateFlusher(portal string) {
for {
time.Sleep(swFlushInterval)
s.flushCandidatesOnce(portal)
}
}
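
The #757 revalidation nudge that requestWantsHTML supports can be sketched in Python as follows. This is only an illustration of the idea, not the Go code in mitmPipeline (which this diff does not show): the names request_wants_html and strip_conditionals, and the plain-dict header model, are assumptions made for the example.

```python
# Hypothetical Python rendering of the #757 nudge; the real logic lives in Go.
CONDITIONAL_HEADERS = ("If-None-Match", "If-Modified-Since")


def request_wants_html(headers: dict) -> bool:
    """True for a document fetch: Sec-Fetch-Dest: document, or Accept has text/html."""
    lowered = {k.lower(): v for k, v in headers.items()}
    if lowered.get("sec-fetch-dest", "").lower() == "document":
        return True
    return "text/html" in lowered.get("accept", "")


def strip_conditionals(headers: dict, allow_listed: bool) -> dict:
    """Return a copy of headers, minus validators when the nudge applies.

    Without If-None-Match / If-Modified-Since the origin cannot answer 304,
    so the response is a full 200 that the banner can be injected into.
    """
    if not (allow_listed and request_wants_html(headers)):
        return dict(headers)  # subresource or non-listed host: untouched
    drop = {h.lower() for h in CONDITIONAL_HEADERS}
    return {k: v for k, v in headers.items() if k.lower() not in drop}
```

Subresources and hosts outside the allow-list keep their validators, so ordinary 304 caching is unaffected.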


@@ -0,0 +1,109 @@
// SPDX-License-Identifier: LicenseRef-CMSD-1.0
// Copyright (c) 2026 CyberMind — Gérald Kerma <devel@cybermind.fr>
package main
import (
"net/http"
"strings"
"testing"
)
func TestSWMatchSuffix(t *testing.T) {
s := &SWNeuter{hosts: map[string]bool{"leparisien.fr": true, "cnn.com": true}}
for _, h := range []string{"leparisien.fr", "www.leparisien.fr", "m.cnn.com", "CNN.COM"} {
if !s.Match(h) {
t.Fatalf("%q should match the allow-list", h)
}
}
for _, h := range []string{"notleparisien.fr", "evil.com", "leparisien.fr.evil.com", ""} {
if s.Match(h) {
t.Fatalf("%q must NOT match", h)
}
}
}
func TestSWEmptyListNoOp(t *testing.T) {
s := &SWNeuter{hosts: map[string]bool{}}
if s.Match("www.leparisien.fr") {
t.Fatal("empty allow-list must match nothing (targeted-strict no-op)")
}
}
func TestSWIsScriptRequest(t *testing.T) {
r1, _ := http.NewRequest("GET", "https://x/sw.js", nil)
r1.Header.Set("Service-Worker", "script")
if !isSWScriptRequest(r1) {
t.Fatal("Service-Worker: script must be detected")
}
r2, _ := http.NewRequest("GET", "https://x/sw.js", nil)
if isSWScriptRequest(r2) {
t.Fatal("no Service-Worker header → not a SW script request")
}
if isSWScriptRequest(nil) {
t.Fatal("nil request → false")
}
}
func TestNeuterSWPassiveAndCorrect(t *testing.T) {
if !strings.Contains(NeuterSW, "self.registration.unregister()") {
t.Fatal("neuter SW must unregister itself")
}
if !strings.Contains(NeuterSW, "caches.delete") {
t.Fatal("neuter SW must clear caches")
}
if strings.Contains(NeuterSW, "navigate(") {
t.Fatal("neuter SW must be PASSIVE — no client.navigate / force reload")
}
}
func TestSWCandidateRecordSnapshot(t *testing.T) {
s := &SWNeuter{cand: map[string]int64{}}
s.RecordCandidate("www.cnn.com")
s.RecordCandidate("www.cnn.com")
s.RecordCandidate("") // ignored
got := s.snapshotCandidates()
if len(got) != 1 || got[0] != "www.cnn.com" {
t.Fatalf("snapshot = %v, want [www.cnn.com]", got)
}
if s.snapshotCandidates() != nil {
t.Fatal("snapshot must read-and-CLEAR (second call → nil)")
}
}
func TestSWFlushCandidatesClears(t *testing.T) {
s := &SWNeuter{cand: map[string]int64{}}
s.RecordCandidate("www.cnn.com")
// port 0 is unreachable → Post fails fast (best-effort); the snapshot must still drain.
got := s.flushCandidatesOnce("http://127.0.0.1:0")
if len(got) != 1 || got[0] != "www.cnn.com" {
t.Fatalf("flush returned %v, want [www.cnn.com]", got)
}
if s.snapshotCandidates() != nil {
t.Fatal("flush must have drained the buffer")
}
if s.flushCandidatesOnce("http://127.0.0.1:0") != nil {
t.Fatal("empty buffer → flush returns nil, no POST")
}
}
func TestRequestWantsHTML(t *testing.T) {
mk := func(setter func(h http.Header)) *http.Request {
r, _ := http.NewRequest("GET", "https://x/", nil)
if setter != nil {
setter(r.Header)
}
return r
}
if !requestWantsHTML(mk(func(h http.Header) { h.Set("Sec-Fetch-Dest", "document") })) {
t.Fatal("Sec-Fetch-Dest: document → true")
}
if !requestWantsHTML(mk(func(h http.Header) { h.Set("Accept", "text/html,application/xhtml+xml") })) {
t.Fatal("Accept text/html → true")
}
if requestWantsHTML(mk(func(h http.Header) { h.Set("Accept", "image/png") })) {
t.Fatal("non-html Accept → false")
}
if requestWantsHTML(nil) {
t.Fatal("nil → false")
}
}
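
hostMatches itself is not part of this diff, so the following Python sketch only reconstructs the "exact or dotted-suffix" behaviour that TestSWMatchSuffix asserts. The function name host_matches and the set-based allow-list are assumptions for illustration.

```python
def host_matches(host: str, allow: set) -> bool:
    """Exact or dotted-suffix match against a lowercased allow-list."""
    h = host.strip().lower().strip(".")
    # Walk "www.a.b" -> "a.b" -> "b", testing each dotted suffix in turn.
    while h:
        if h in allow:
            return True
        _, sep, h = h.partition(".")
        if not sep:  # no more labels to peel off
            return False
    return False
```

Peeling whole labels is what keeps notleparisien.fr and leparisien.fr.evil.com from matching leparisien.fr: a plain string-suffix test would accept the first, and a substring test would accept both.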


@@ -1,3 +1,51 @@
secubox-toolbox-ng (0.1.26-1~bookworm1) bookworm; urgency=medium
* #757 SW revalidation nudge: for sw-neuter allow-listed hosts, strip
If-None-Match/If-Modified-Since on HTML document fetches so a
stale-while-revalidate Service Worker re-fetches a full 200 (banner injected)
and caches a banner'd shell WITHOUT being neutered. Cache-first SWs still use
the #753 neuter.
-- Gerald KERMA <devel@cybermind.fr> Sat, 27 Jun 2026 10:00:00 +0000
secubox-toolbox-ng (0.1.25-1~bookworm1) bookworm; urgency=medium
* #756 cosmetic restores scroll: the cosmetic ad-hide <style> now prepends
html,body{overflow:auto!important} so a paywall's JS scroll-lock
(body.style.overflow='hidden') is overridden and the page scrolls again
(Bloomberg etc.). No-op on normal pages; modal-scroll-behind is the accepted
tradeoff.
* #755 cosmetic-pages counter: sbxmitm tallies each WG/R3 HTML page that got
the cosmetic style (wg-gated, atomic) and flushes it via the ad-event channel
(cosmetic_pages) → portal → the #ads "Pages nettoyées" metric.
-- Gerald KERMA <devel@cybermind.fr> Sat, 27 Jun 2026 09:30:00 +0000
secubox-toolbox-ng (0.1.24-1~bookworm1) bookworm; urgency=medium
* #753 targeted Service-Worker neuter: PWA news sites (leparisien, cnn,
20minutes, franceinfo) serve their main HTML from a Service-Worker cache, so
the navigation never reaches the MITM and the transparency banner can't be
injected. For an operator-curated allow-list (--sw-neuter-hosts, default
/var/lib/secubox/toolbox/sw-neuter-hosts.txt) sbxmitm answers the SW script
fetch with a passive self-unregistering SW so the next navigation reaches the
MITM (→ banner). Auto-learn: non-listed SW hosts are flushed to the portal's
/__toolbox/sw-candidate for operator review. Targeted-strict: empty list =
no-op (ships inert until the operator adds a host).
-- Gerald KERMA <devel@cybermind.fr> Sat, 27 Jun 2026 08:45:00 +0000
secubox-toolbox-ng (0.1.23-1~bookworm1) bookworm; urgency=medium
* #751 rebuild from master: the deployed 0.1.22 binary was STALE and lacked
the working #728 nonce-CSP borrow, so the transparency banner was blocked on
nonce-CSP sites (x.com) — the inline <script> got no nonce. A fresh master
build restores the nonce-borrow (banner renders on x.com + news). Adds the
SBX_DEBUG_CSP opt-in diagnostic (logs per-injected-response CSP visibility +
borrow decision; off by default, zero cost when unset).
-- Gerald KERMA <devel@cybermind.fr> Sat, 27 Jun 2026 06:30:00 +0000
secubox-toolbox-ng (0.1.22-1~bookworm1) bookworm; urgency=medium
* media catcher: ship a tmpfiles.d entry so /run/secubox/media-catch.jsonl is


@@ -1,3 +1,40 @@
secubox-toolbox (2.7.24-1~bookworm1) bookworm; urgency=medium
* ui: remove the redundant ♥ Liveness dashboard card (generic status/version);
the version stays in the top version-badge (loadHealth still drives it).
-- Gerald KERMA <devel@cybermind.fr> Sat, 27 Jun 2026 10:00:00 +0000
secubox-toolbox (2.7.23-1~bookworm1) bookworm; urgency=medium
* #755 #ads card → honest labeled MITM-protection breakdown: Pubs bloquées
(204), Trackers détectés (distinct cross-site cookie-trackers from
social_edges), Pages nettoyées (cosmetic-hide pages, fed by sbxmitm via the
new cosmetic_events table), Drops réseau (blacklist nft). store.ad_stats gains
trackers_seen + pages_cleaned; admin_ad_stats gains network_drops;
record_cosmetic_pages ingests the sbxmitm flush. No metric is summed across
units.
-- Gerald KERMA <devel@cybermind.fr> Sat, 27 Jun 2026 09:30:00 +0000
secubox-toolbox (2.7.22-1~bookworm1) bookworm; urgency=medium
* #753 portal ingest for the SW-neuter auto-learn: POST /__toolbox/sw-candidate
dedup-appends SW-PWA hosts (proposed by sbxmitm) to
/var/lib/secubox/toolbox/sw-neuter-candidates.txt for operator review.
-- Gerald KERMA <devel@cybermind.fr> Sat, 27 Jun 2026 08:45:00 +0000
secubox-toolbox (2.7.21-1~bookworm1) bookworm; urgency=medium
* #754 reconcile bundle.py to the working #740 DOM-API banner (mk() builder,
Trusted-Types-proof — renders on x.com/news/strict-CSP) + the R4 tier folded
into its level switch + the #752 top-frame guard (banner never renders in a
3rd-party iframe, e.g. the Dailymotion player on leparisien). Closes the
deployed-but-unmerged feature/740 banner drift. Reviewed; 179 toolbox tests.
-- Gerald KERMA <devel@cybermind.fr> Sat, 27 Jun 2026 06:00:00 +0000
secubox-toolbox (2.7.20-1~bookworm1) bookworm; urgency=medium
* R4 analyst tier (#736): add R4 to the banner topbar level switch


@@ -137,6 +137,48 @@ async def toolbox_inline(
)
# #753 — SW-neuter auto-learn ingest: sbxmitm records every host it sees
# fetching a Service Worker that is NOT on the sw-neuter allow-list, and POSTs
# them here every 30 s. We dedup-append to a candidates file for operator review.
# The operator promotes wanted hosts to sw-neuter-hosts.txt to activate neuter.
# UNAUTHENTICATED — same trust perimeter as /__toolbox/ad-event (loopback / WG).
SW_CANDIDATES_FILE = Path("/var/lib/secubox/toolbox/sw-neuter-candidates.txt")
def _append_sw_candidates(hosts: list[str]) -> None:
    """Append new hosts to the sw-neuter candidates file, deduped against what is
    already there and within the batch itself. Best-effort; never raises into the
    request path."""
    try:
        existing: set[str] = set()
        if SW_CANDIDATES_FILE.exists():
            existing = {l.strip() for l in SW_CANDIDATES_FILE.read_text().splitlines() if l.strip()}
        # dict.fromkeys drops repeats inside one POST while keeping arrival order.
        fresh = [h for h in dict.fromkeys(hosts) if h not in existing]
        if not fresh:
            return
        SW_CANDIDATES_FILE.parent.mkdir(parents=True, exist_ok=True)
        with SW_CANDIDATES_FILE.open("a", encoding="utf-8") as fh:
            for h in fresh:
                fh.write(h + "\n")
    except OSError as e:
        log.debug("sw-candidate append failed: %s", e)
@router.post("/__toolbox/sw-candidate")
async def toolbox_sw_candidate(request: Request) -> Response:
    """#753: record SW-PWA hosts proposed for the sw-neuter allow-list. sbxmitm
    POSTs hosts it saw fetching a Service Worker that are NOT yet allow-listed.
    Deduped-appends to the candidates file for operator review; the operator
    promotes wanted hosts to sw-neuter-hosts.txt."""
    try:
        body = await request.json()
        hosts = [h for h in (body.get("hosts") or []) if isinstance(h, str) and h]
    except Exception:
        hosts = []
    if hosts:
        _append_sw_candidates(hosts)
    return Response(status_code=204)
# #662 — ad-block metrics ingest from the Go MITM engine (sbxmitm). The #662
# cutover moved the BLOCK decision (204 on ad/tracker hosts) into the Go engine
# but left the METRICS unported, so the #ads dashboard froze. The engine now
@@ -208,6 +250,11 @@ async def toolbox_ad_event(request: Request) -> Response:
store.record_ad_client_blocks(client_rows)
if cand_rows:
store.record_ad_candidates(cand_rows)
# #755 — cosmetic-pages counter: Go engine reports how many R3 HTML pages
# received the cosmetic ad-hide style in this flush window.
cp = body.get("cosmetic_pages")
if cp:
store.record_cosmetic_pages(cp)
except Exception as e: # never raise into the engine's fire-and-forget POST
log.debug("ad-event ingest failed: %s", e)
return Response(status_code=204)
@@ -3048,7 +3095,15 @@ async def admin_protective() -> dict:
async def admin_ad_stats(hours: int = 24) -> dict:
"""Contextual ad-block metrics for the #ads tab (read-only, kbin-safe)."""
h = max(1, min(int(hours if hours is not None else 24), 168))
return store.ad_stats(hours=h)
out = store.ad_stats(hours=h)
# #755 — network-layer drops (blacklist nft sets). Best-effort; 0 when the
# blacklist is inert or unreadable. Reuses the admin_blacklist parse.
try:
bl = await admin_blacklist()
out["network_drops"] = int(bl.get("drops", 0) or 0)
except Exception:
out["network_drops"] = 0
return out
@router.get("/admin/ad-stats/client/{mac_hash}")


@@ -81,6 +81,15 @@ def _tor_mode() -> bool:
return False
def _ad_guard() -> bool:
    """Master ad-block switch on? (#740) Read from filters; default on."""
    try:
        from .filters import get_filters
        return bool(get_filters().get("ad_guard", True))
    except Exception:
        return True
def build_bundle(client_id: str, is_wg: bool = False) -> dict:
"""Build the per-client cosmetic decision bundle (pure given inputs + pin file)."""
return {
@@ -91,6 +100,7 @@ def build_bundle(client_id: str, is_wg: bool = False) -> dict:
"report_url": _report_url(client_id, is_wg),
"tracker_patterns": TRACKER_PATTERNS,
"tor_mode": _tor_mode(),
"ad_guard": _ad_guard(),
"ts": int(time.time()),
}
@@ -103,6 +113,12 @@ def invalidate(client_id: str) -> None:
_cache.pop(k, None)
def invalidate_all() -> None:
    """#740: drop ALL cached bundles after a GLOBAL filter change (ad_guard /
    tor_mode toggled from the banner) so every client picks up the new state."""
    _cache.clear()
def get_bundle(client_id: str, is_wg: bool = False) -> dict:
"""Return the cached bundle for a client, rebuilding past the TTL. Fail-open."""
try:
@@ -146,6 +162,13 @@ def get_bundle(client_id: str, is_wg: bool = False) -> dict:
# + 2s poll; the prelude calls ensure() (inline) or sets `bundle` then ensure()s
# (src-loader).
_BANNER_CORE = r"""
// #752 — top-frame ONLY: never render inside an embedded sub-frame (3rd-party
// video players e.g. geo.dailymotion.com, ad/consent iframes). A same-origin
// iframe trips window.top !== window.self; a cross-origin one throws on the
// access. Both bail. The transparency banner belongs on the top document,
// once. Returns from the IIFE before any function runs (ensure() is appended
// after this block by both the inline and src-loader preludes).
try { if (window.top !== window.self) return; } catch (_) { return; }
function ready(fn){ if (document.body) { fn(); } else { setTimeout(function(){ready(fn);}, 30); } }
function esc(t){ return String(t).replace(/[&<>"]/g, function(c){
return {"&":"&amp;","<":"&lt;",">":"&gt;","\"":"&quot;"}[c]; }); }
@@ -174,19 +197,34 @@ _BANNER_CORE = r"""
// #724 — inline R0..R3 level switch. Shows the real current level (highlighted)
// and lets the client change it: GET /__toolbox/set-level (same-origin, the Go
// engine reverse-proxies it to the portal), then reload so the new tier applies.
// #740 — mk(): build an element via DOM API only. textContent / setAttribute are
// NOT Trusted Types sinks (unlike innerHTML), so the banner renders on EVERY site
// incl. strict-CSP/Trusted-Types ones (franceinfo, leparisien, 20minutes, cnn,
// x/twitter) with ZERO CSP/TT bypass. This is the robust, permanent fix.
function mk(tag, opts){
var e = document.createElement(tag);
opts = opts || {};
if (opts.id) e.id = opts.id;
if (opts.cls) e.className = opts.cls;
if (opts.title) e.title = opts.title;
if (opts.text != null) e.textContent = opts.text;
if (opts.style) e.setAttribute("style", opts.style);
if (opts.attrs) for (var k in opts.attrs) if (Object.prototype.hasOwnProperty.call(opts.attrs, k)) e.setAttribute(k, opts.attrs[k]);
return e;
}
var BTN = "border-radius:3px;padding:0 5px;margin:0 2px;font:inherit;font-size:11px;cursor:pointer";
function lvlSwitch(b){
var cur = String(b.level || "r1").toLowerCase();
// #736 — R4 = analyst / reverse-catcher tier (deepest): everything is MITM'd
// and media URLs are caught for cloning. Selectable here; functionally the
// box already runs MITM-everything by default.
var lv = ["r0","r1","r2","r3","r4"], out = "<span id=\"sbx-lvl\" title=\"Niveau d'analyse — R4 = analyste/capteur média, clique pour changer\">";
// #736 — R4 = analyst / reverse-catcher tier (deepest): everything MITM'd +
// media URLs caught for cloning. Reconciled into the #740 DOM-API switch.
var lv = ["r0","r1","r2","r3","r4"];
var span = mk("span", {id:"sbx-lvl", title:"Niveau d'analyse — R4 = analyste/capteur média, clique pour changer"});
for (var i=0;i<lv.length;i++){ var on = lv[i]===cur;
out += "<button data-lvl=\"" + lv[i] + "\" class=\"sbx-lvl\" style=\"background:"
+ (on?"#148C66":"transparent") + ";color:" + (on?"#0A0E14":"#8A9AA8")
+ ";border:1px solid #148C66;border-radius:3px;padding:0 5px;margin:0 1px;"
+ "font:inherit;font-size:11px;cursor:pointer\">" + lv[i].toUpperCase() + "</button>";
span.appendChild(mk("button", {cls:"sbx-lvl", text:lv[i].toUpperCase(), attrs:{"data-lvl":lv[i]},
style:"background:"+(on?"#148C66":"transparent")+";color:"+(on?"#0A0E14":"#8A9AA8")
+";border:1px solid #148C66;"+BTN}));
}
return out + "</span>";
return span;
}
function wireLevels(bar, b){
var els = bar.querySelectorAll(".sbx-lvl");
@@ -210,29 +248,46 @@ _BANNER_CORE = r"""
var ck = countCookies();
var bar = document.createElement("div");
bar.id = "sbx-banner";
bar.setAttribute("style", "position:fixed;left:0;right:0;top:0;z-index:2147483647;"
// #740 — !important on the visibility-critical props makes the banner IMMUNE
// to any stylesheet cosmetic hide (inline !important outranks author
// stylesheet !important): it stays ABOVE everything and visible, always.
bar.setAttribute("style", "position:fixed!important;left:0;right:0;top:0;z-index:2147483647!important;"
+ "display:flex!important;visibility:visible!important;opacity:1!important;"
+ "font:12px/1.4 system-ui,-apple-system,sans-serif;background:#0A0E14;color:#E8E6E0;"
+ "border-bottom:2px solid #148C66;padding:6px 12px;display:flex;gap:14px;align-items:center;"
+ "border-bottom:2px solid #148C66;padding:6px 12px;gap:14px;align-items:center;"
+ "box-shadow:0 2px 12px rgba(0,0,0,.4)");
var pin = b.pin ? "<span title=\"pinned\">📌 " + esc(b.pin) + "</span>" : "";
// #662 — 🔓 proof: the engine relaxed this page's CSP to inject this banner.
var cspProof = (csp === "1")
? "<span title=\"CSP contourné par SecuBox (démonstration)\">🔓</span>" : "";
// #683 — 🧅 kbin Tor mode: this session's exit is anonymised via Tor.
var tor = b.tor_mode
? "<span title=\"Sortie anonymisée via Tor\" style=\"color:#9E76FF;font-weight:bold\">🧅 Tor</span>" : "";
bar.innerHTML = "<b style=\"color:#148C66\">SecuBox</b>"
+ cspProof
+ tor
+ lvlSwitch(b)
+ "<span id=\"sbx-trk\">🛰️ " + trk + " trackers</span>"
+ "<span id=\"sbx-ck\">🍪 " + ck + " cookies</span>"
+ pin
+ "<a href=\"" + esc(b.report_url || "#") + "\" style=\"margin-left:auto;color:#2C70C0;text-decoration:none\">report ▸</a>"
+ "<button aria-label=\"dismiss\" style=\"background:none;border:0;color:#8A9AA8;cursor:pointer;font-size:14px\">✕</button>";
// #740 — built entirely with DOM API (no innerHTML → not a Trusted Types
// sink), so the banner renders identically on every site, strict-CSP/TT
// included, without touching the page's CSP.
bar.appendChild(mk("b", {text:"SecuBox", style:"color:#148C66"}));
if (csp === "1") bar.appendChild(mk("span", {text:"🔓", title:"CSP contourné par SecuBox (démonstration)"}));
bar.appendChild(mk("button", {id:"sbx-tor", text:"🧅 " + (b.tor_mode?"ON":"OFF"),
title:"Tor du tunnel toolbox — clique pour basculer",
style:"background:"+(b.tor_mode?"#3D2A6B":"transparent")+";color:"+(b.tor_mode?"#C9B8FF":"#8A9AA8")+";border:1px solid #6E40C9;"+BTN}));
bar.appendChild(lvlSwitch(b));
bar.appendChild(mk("button", {id:"sbx-adg", text:"🛡️ " + (b.ad_guard===false?"OFF":"ON"),
title:"Ad-Guard (blocage pub) — clique pour basculer",
style:"background:"+(b.ad_guard===false?"transparent":"#148C66")+";color:"+(b.ad_guard===false?"#8A9AA8":"#0A0E14")+";border:1px solid #148C66;"+BTN}));
bar.appendChild(mk("span", {id:"sbx-trk", text:"🛰️ " + trk + " trackers"}));
bar.appendChild(mk("span", {id:"sbx-ck", text:"🍪 " + ck + " cookies"}));
if (b.pin) bar.appendChild(mk("span", {text:"📌 " + b.pin, title:"pinned"}));
bar.appendChild(mk("a", {text:"report ▸", style:"margin-left:auto;color:#2C70C0;text-decoration:none", attrs:{href: b.report_url || "#"}}));
bar.appendChild(mk("button", {text:"✕", title:"dismiss",
style:"background:none;border:0;color:#8A9AA8;cursor:pointer;font-size:14px", attrs:{"aria-label":"dismiss"}}));
document.body.appendChild(bar);
try { document.body.style.paddingTop = (bar.offsetHeight || 34) + "px"; } catch (_) {}
wireLevels(bar, b);
// #740 — 🛡️ Ad-Guard + 🧅 Tor quick-toggles (mirror the level switch wiring).
var adg = bar.querySelector("#sbx-adg");
if (adg) adg.onclick = function(){ var on = b.ad_guard!==false; adg.textContent="";
fetch("/__toolbox/set-adguard?on=" + (on?"0":"1"), {credentials:"omit",cache:"no-store"})
.then(function(r){ if(r&&r.ok) location.reload(); else adg.textContent="🛡️ "+(on?"ON":"OFF"); })
.catch(function(){ adg.textContent="🛡️ "+(on?"ON":"OFF"); }); };
var tg = bar.querySelector("#sbx-tor");
if (tg) tg.onclick = function(){ var on = !!b.tor_mode; tg.textContent="";
fetch("/__toolbox/set-tor?on=" + (on?"0":"1"), {credentials:"omit",cache:"no-store"})
.then(function(r){ if(r&&r.ok) location.reload(); else tg.textContent="🧅 "+(on?"ON":"OFF"); })
.catch(function(){ tg.textContent="🧅 "+(on?"ON":"OFF"); }); };
var btn = bar.querySelector("button[aria-label=\"dismiss\"]");
if (btn) btn.onclick = function(){ dismissed = true; try { document.body.style.paddingTop = ""; } catch (_) {} bar.remove(); };
}
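
The get_bundle / invalidate_all pair above describes a per-client cache that rebuilds past a TTL and is flushed wholesale on a global filter change. The body of get_bundle is outside this hunk, so the Python sketch below is an assumption about the pattern rather than the actual code: BundleCache, the 30-second TTL and the (client_id, is_wg) key are all hypothetical.

```python
import time
from typing import Callable


class BundleCache:
    """Tiny TTL cache: rebuild an entry once it is older than ttl seconds."""

    def __init__(self, build: Callable[[str, bool], dict], ttl: float = 30.0,
                 clock: Callable[[], float] = time.monotonic):
        self._build = build
        self._ttl = ttl
        self._clock = clock  # injectable so tests can control time
        self._entries = {}   # (client_id, is_wg) -> (built_at, bundle)

    def get(self, client_id: str, is_wg: bool = False) -> dict:
        key = (client_id, is_wg)
        now = self._clock()
        hit = self._entries.get(key)
        if hit is not None and now - hit[0] < self._ttl:
            return hit[1]  # still fresh
        value = self._build(client_id, is_wg)
        self._entries[key] = (now, value)
        return value

    def invalidate_all(self) -> None:
        """Drop every entry, e.g. after a global ad_guard / tor_mode toggle."""
        self._entries.clear()
```

Flushing everything on a global toggle trades a burst of rebuilds for the guarantee that no client keeps showing the old ad_guard or tor_mode state until its own TTL expires.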


@@ -127,6 +127,20 @@ def ad_client_stats(mac_hash: str, hours: int = 24, top: int = 25) -> dict:
return out
def record_cosmetic_pages(pages: int) -> None:
    """#755: append one cosmetic-hide tally (pages cleaned since the last flush).
    ad_stats sums these over the window. Best-effort; never raises."""
    try:
        n = int(pages)
        if n <= 0:
            return
        with _conn() as c:
            c.execute("CREATE TABLE IF NOT EXISTS cosmetic_events(ts REAL, pages INTEGER)")
            c.execute("INSERT INTO cosmetic_events(ts, pages) VALUES(?, ?)", (time.time(), n))
    except Exception as e:
        log.debug("record_cosmetic_pages failed: %s", e)
def record_ad_candidates(rows) -> None:
"""rows: iterable of (host, site, hits)."""
rows = [r for r in rows if r and r[0]]
@@ -180,6 +194,28 @@ def ad_stats(hours: int = 24, top: int = 25) -> dict:
"SELECT mac_hash, SUM(hits) FROM ad_block_client_host "
"WHERE last_seen>=? AND mac_hash<>'' GROUP BY mac_hash "
"ORDER BY SUM(hits) DESC LIMIT ?", (cutoff, top))]
# #755 — trackers detected/poisoned by the MITM in the window: distinct
# cross-site cookie-identifier hashes seen on social_edges. This is the
# "Trackers" half of the card (the 204 ad-block is the "pubs" half).
out["trackers_seen"] = 0
try:
out["trackers_seen"] = int(c.execute(
"SELECT COUNT(DISTINCT cookie_id_hash) FROM social_edges "
"WHERE ts >= ? AND cookie_id_hash IS NOT NULL AND cookie_id_hash <> ''",
(int(cutoff),),
).fetchone()[0] or 0)
except sqlite3.Error:
out["trackers_seen"] = 0
# #755 — pages where the cosmetic ad-hide style was injected (Task 2 writes
# cosmetic_events; absent table → 0).
out["pages_cleaned"] = 0
try:
out["pages_cleaned"] = int(c.execute(
"SELECT COALESCE(SUM(pages),0) FROM cosmetic_events WHERE ts >= ?",
(cutoff,),
).fetchone()[0] or 0)
except sqlite3.Error:
out["pages_cleaned"] = 0
except Exception as e:
log.debug("ad_stats failed: %s", e)
return out
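
The two #755 queries can be tried in isolation against an in-memory SQLite database. The sketch below reuses the SQL from the hunk above but with a simplified two-column social_edges schema and a hypothetical breakdown() helper, so it should be read as a demonstration of the windowing and of the missing-table fallback, not as store.ad_stats itself.

```python
import sqlite3
import time


def breakdown(conn: sqlite3.Connection, hours: int = 24) -> dict:
    """Distinct trackers and summed cleaned pages inside the time window."""
    cutoff = time.time() - hours * 3600
    trackers = conn.execute(
        "SELECT COUNT(DISTINCT cookie_id_hash) FROM social_edges "
        "WHERE ts >= ? AND cookie_id_hash IS NOT NULL AND cookie_id_hash <> ''",
        (int(cutoff),),
    ).fetchone()[0]
    try:
        pages = conn.execute(
            "SELECT COALESCE(SUM(pages), 0) FROM cosmetic_events WHERE ts >= ?",
            (cutoff,),
        ).fetchone()[0]
    except sqlite3.Error:
        pages = 0  # cosmetic_events not created yet: report 0 instead of failing
    return {"trackers_seen": int(trackers or 0), "pages_cleaned": int(pages or 0)}


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE social_edges(ts INTEGER, cookie_id_hash TEXT)")
now = int(time.time())
conn.executemany(
    "INSERT INTO social_edges VALUES(?, ?)",
    # A twice, B once, C older than 24h, plus one empty id
    [(now - 60, "A"), (now - 30, "A"), (now - 60, "B"), (now - 90000, "C"), (now - 10, "")],
)
before = breakdown(conn)  # no cosmetic_events table yet

conn.execute("CREATE TABLE cosmetic_events(ts REAL, pages INTEGER)")
conn.executemany("INSERT INTO cosmetic_events VALUES(?, ?)",
                 [(time.time(), 3), (time.time(), 2)])
after = breakdown(conn)
```

Each metric is computed by its own query and reported under its own key, which matches the changelog's statement that no metric is summed across units.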


@@ -0,0 +1,39 @@
# SPDX-License-Identifier: LicenseRef-CMSD-1.0
"""Tests for the #ads aggregate breakdown (ref #755)."""
import sqlite3
import time
from secubox_toolbox import store
def _seed_db(tmp_path, monkeypatch):
    db = tmp_path / "toolbox.db"
    c = sqlite3.connect(str(db))
    c.executescript(
        "CREATE TABLE ad_block_stats(ad_host TEXT, site TEXT, action TEXT, hits INTEGER, bytes INTEGER, last_seen REAL, PRIMARY KEY(ad_host,site,action));"
        "CREATE TABLE ad_block_client_host(mac_hash TEXT, ad_host TEXT, hits INTEGER, last_seen REAL, PRIMARY KEY(mac_hash,ad_host));"
        "CREATE TABLE social_edges(ts INTEGER, client_mac_hash TEXT, src_site TEXT, tracker_domain TEXT, cookie_id_hash TEXT, ja4_hash TEXT, consent_state TEXT);"
    )
    now = int(time.time())
    # two distinct cookie-trackers in window, one duplicate, one stale (>24h), one empty id
    for cid, ts in [("A", now-60), ("A", now-30), ("B", now-60), ("C", now-90000), ("", now-10)]:
        c.execute("INSERT INTO social_edges(ts,client_mac_hash,src_site,tracker_domain,cookie_id_hash,ja4_hash,consent_state) VALUES(?,?,?,?,?,?,?)",
                  (ts, "m", "s", "t", cid, "j", "none_seen"))
    c.commit(); c.close()
    monkeypatch.setattr(store, "DB_PATH", db)
    return db

def test_ad_stats_trackers_seen_distinct_in_window(tmp_path, monkeypatch):
    _seed_db(tmp_path, monkeypatch)
    out = store.ad_stats(hours=24)
    # distinct non-empty cookie ids in the last 24h = {A, B}; C is stale, "" excluded
    assert out["trackers_seen"] == 2
    assert out["pages_cleaned"] == 0  # no cosmetic_events table yet → 0 (Task 2 fills it)

def test_record_cosmetic_pages_summed_in_window(tmp_path, monkeypatch):
    _seed_db(tmp_path, monkeypatch)
    store.record_cosmetic_pages(3)
    store.record_cosmetic_pages(2)
    out = store.ad_stats(hours=24)
    assert out["pages_cleaned"] == 5


@@ -66,10 +66,13 @@ def test_inline_csp_literal_and_proof_logic():
def test_inline_has_no_currentscript_no_fetch():
# #653 root cause: document.currentScript is null in an async context. The
# inline script MUST NOT read it, and MUST NOT fetch() (SW would hijack it).
# inline script MUST NOT read it, and MUST NOT fetch the BUNDLE at load time
# (a SW would hijack a same-origin /__toolbox/bundle fetch → no banner). The
# bundle is baked inline instead. (#740 toggle handlers DO fetch /set-level
# etc., but only on a user click — not a load-time SW-hijackable resource.)
s = bundle.inline_script("x", wg=True, csp=True)
assert "currentScript" not in s
assert "fetch(" not in s
assert "/__toolbox/bundle" not in s
def test_inline_keeps_guards_and_spa_hooks():
@@ -133,6 +136,6 @@ def test_inline_route_returns_javascript_body():
body = resp.body.decode("utf-8")
assert "window.__SBX_LOADER__" in body
assert "currentScript" not in body
assert "fetch(" not in body
assert "/__toolbox/bundle" not in body # bundle baked inline, not fetched (#740 toggles fetch /set-* on click only)
assert 'var mh = "abc";' in body
assert 'var csp = "1";' in body


@@ -0,0 +1,30 @@
# SPDX-License-Identifier: LicenseRef-CMSD-1.0
"""Tests for POST /__toolbox/sw-candidate (ref #753)."""
import asyncio
from secubox_toolbox import api
class _Req:
    def __init__(self, payload):
        self._payload = payload

    async def json(self):
        return self._payload

def test_sw_candidate_appends_and_dedupes(tmp_path, monkeypatch):
    f = tmp_path / "sw-neuter-candidates.txt"
    monkeypatch.setattr(api, "SW_CANDIDATES_FILE", f)
    r1 = asyncio.run(api.toolbox_sw_candidate(_Req({"hosts": ["www.cnn.com", "leparisien.fr"]})))
    assert r1.status_code == 204
    asyncio.run(api.toolbox_sw_candidate(_Req({"hosts": ["www.cnn.com", "20minutes.fr"]})))
    lines = [l.strip() for l in f.read_text().splitlines() if l.strip()]
    assert sorted(lines) == ["20minutes.fr", "leparisien.fr", "www.cnn.com"]  # deduped

def test_sw_candidate_ignores_bad_payload(tmp_path, monkeypatch):
    f = tmp_path / "sw-neuter-candidates.txt"
    monkeypatch.setattr(api, "SW_CANDIDATES_FILE", f)
    r = asyncio.run(api.toolbox_sw_candidate(_Req({"hosts": [None, 123, ""]})))
    assert r.status_code == 204
    assert not f.exists() or f.read_text().strip() == ""
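
For context on what these tests receive, here is a Python sketch of the sender side: a bounded tally that is read and cleared in one locked step, then serialised to the {"hosts": [...]} body. It mirrors the Go RecordCandidate / snapshotCandidates / flushCandidatesOnce trio in spirit only; CandidateBuffer and its method names are hypothetical, and no HTTP call is made.

```python
import json
import threading
from typing import Optional


class CandidateBuffer:
    """Bounded host tally with an atomic read-and-clear snapshot."""

    def __init__(self, cap: int = 4096):
        self._lock = threading.Lock()
        self._hits = {}  # host -> hit count
        self._cap = cap

    def record(self, host: str) -> None:
        h = host.lower().strip(".")
        if not h:
            return
        with self._lock:
            if h in self._hits:
                self._hits[h] += 1
            elif len(self._hits) < self._cap:  # full buffer: new hosts are dropped
                self._hits[h] = 1

    def snapshot(self) -> list:
        # Swap the dict under the lock so no hit lands between read and clear.
        with self._lock:
            hosts, self._hits = list(self._hits), {}
        return hosts

    def payload(self) -> Optional[bytes]:
        """JSON body for POST /__toolbox/sw-candidate, or None when empty."""
        hosts = self.snapshot()
        return json.dumps({"hosts": hosts}).encode() if hosts else None
```

As in the Go version, only host names are sent; the hit counts serve to bound and deduplicate the buffer between flushes.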


@@ -97,10 +97,6 @@
<h2>📊 Live metrics (24h)</h2>
<div class="kv" id="metrics"><span class="k">loading…</span><span class="v"></span></div>
</div>
<div class="card">
<h2>♥ Liveness</h2>
<div class="kv" id="health"><span class="k">loading…</span><span class="v"></span></div>
</div>
</div>
</section>
@@ -435,15 +431,10 @@ async function loadClientDetail(macHash) {
}
async function loadHealth() {
// Liveness card removed (redundant generic status); /health still drives the
// top version-badge.
const h = await J('/health');
const el = document.getElementById('health');
if (h.__error) { el.innerHTML = `<span class="k">err</span><span class="v">${h.__error}</span>`; return; }
el.innerHTML = `
<span class="k">status</span> <span class="v">${h.status}</span>
<span class="k">module</span> <span class="v">${h.module}</span>
<span class="k">version</span> <span class="v">${h.version}</span>
`;
document.getElementById('version-badge').textContent = `v${h.version}`;
if (h && h.version) document.getElementById('version-badge').textContent = `v${h.version}`;
}
async function loadFilters() {
@@ -617,10 +608,11 @@ async function loadAds() {
const kpi = document.getElementById('ads-kpi');
const esc = s => String(s).replace(/&/g,'&amp;').replace(/</g,'&lt;').replace(/>/g,'&gt;');
if (!d || d.__error) { kpi.innerHTML = `<span class="k">err</span><span class="v">${(d&&d.__error)||'no data'}</span>`; return; }
kpi.innerHTML = `<span class="k">Trackers &amp; pubs bloqués</span> <span class="v">${d.total_blocked||0}</span>`
+ ` <span class="k" title="estimation : un contenu bloqué n'est jamais téléchargé, on ne peut pas mesurer les octets réels — ~45 Ko/blocage">Ko évités <span style="opacity:.6">(est.)</span></span> <span class="v">~${Math.round((d.total_bytes||0)/1024)}</span>`
+ ` <span class="k">Silenced</span> <span class="v">${(d.by_action&&d.by_action.silent)||0}</span>`
+ ` <span class="k">Fenêtre</span> <span class="v">${d.window_hours||24}h</span>`;
kpi.innerHTML = `<span class="k">Pubs bloquées (204)</span> <span class="v">${d.total_blocked||0}</span>`
+ ` <span class="k">Trackers détectés</span> <span class="v">${d.trackers_seen||0}</span>`
+ ` <span class="k">Pages nettoyées</span> <span class="v">${d.pages_cleaned||0}</span>`
+ ` <span class="k">Drops réseau</span> <span class="v">${d.network_drops||0}</span>`
+ ` <span class="k" title="estimation : un contenu bloqué n'est jamais téléchargé, on ne peut pas mesurer les octets réels — ~45 Ko/blocage">Ko évités <span style="opacity:.6">(est.)</span></span> <span class="v">~${Math.round((d.total_bytes||0)/1024)}</span>`;
const hostRows = (d.top_hosts||[]).slice(0, 5).map(r=>`<tr><td><code>${esc(r.host)}</code></td><td>${r.hits}</td><td>~${Math.round((r.bytes||0)/1024)}</td></tr>`).join('');
const siteRows = (d.top_sites||[]).slice(0, 5).map(r=>`<tr><td><code>${esc(r.site)}</code></td><td>${r.hits}</td></tr>`).join('');
document.getElementById('ads-hosts').innerHTML = hostRows