fix(toolbox): content-aware binary streaming — restore banner on heavy sites

The #685 stream_large_bodies=1m setting also streamed large HTML, and streamed bodies can't be banner-injected → "plus de banner sur leparisien.fr" ("no more banner on leparisien.fr"). It is replaced by the stream_binaries addon, which streams known binary content types (apk/xpi/video/octet-stream) and any other non-HTML body of 1 MB or more verbatim so the R3 forging path doesn't corrupt them, while HTML is always buffered so inject_banner + ad_ghost work. toolbox 2.7.16.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
fix(toolbox): report reads LIVE social graph, not frozen events (ref #686)
2026-07-01 17:17:14 +00:00 · 2026-06-20 08:24:17 +02:00 · 2026-06-20 08:20:49 +02:00 · 2026-06-20 08:08:50 +02:00 · 2026-06-20 07:28:21 +02:00 · 2026-06-20 07:18:53 +02:00
8 changed files with 232 additions and 54 deletions
--- a/.claude/HISTORY.md
+++ b/.claude/HISTORY.md
@@ -3,6 +3,25 @@

 ---

+## 2026-06-20 — kbin Tor shipped + client releases + ad-block/mitm hardening
+
+- **#683 MERGED (PR #684)** — kbin Tor egress quick-switch (switch + nft owner-match
+  tunnel, own-services exemption, reconciler+timer), dashboard/landing/banner metrics
+  fixes, 🧅 indicators (banner/webext/APK), APK persistent WG identity, landing+report
+  **redesign** (verdict gauge + donut/bars + collapsible details). Live on gk2; Tor armed.
+- **Client releases served from kbin**: `android-v0.4.0` (Latest) + `webext-v0.1.5`
+  published by CI; pinned webext tag bumped; board fetch-helpers pull them →
+  /wg/toolbox.apk (0.4.0) + /wg/toolbox.xpi (0.1.5). toolbox 2.7.12.
+- **#685 ad-learner hardened (2.7.13)** — NEVER_LEARN guard (Google/CDN/fonts/captcha/
+  auth/payment), AD_MIN_SITES 1→2, prune existing. Root cause of euronews breakage:
+  the learner had 204'd `www.google.com` → broke reCAPTCHA/consent. Also allowlisted
+  www.google.com/.fr live.
+- **mitm-wg stream_large_bodies=1m (2.7.14)** — large binary downloads (APK, CA) were
+  corrupted ONLY through the R3 tunnel (HTTP/2 buffer/reframe); now passed verbatim.
+- **OPEN [#686]** — android-toolbox non-root flow broken (CA auto-install needs root,
+  WG handoff → Play Store, tunnel not detected). Needs on-device dev/testing; rooted-vs-
+  non-rooted decision pending. #685 signing was a red herring (corrupt = mitm buffering).
+
 ## 2026-06-19 — kbin Tor egress quick-switch implemented DARK (#683, ToolBoX 2.7.1)

 - **Switch + tunnel** for routing kbin surfing through Tor, shipped **default-OFF /
--- a/packages/secubox-toolbox/conf/report-live.html.j2
+++ b/packages/secubox-toolbox/conf/report-live.html.j2
@@ -75,17 +75,18 @@ ul{list-style:none;padding-left:.2rem}li{padding:.12rem 0;font-size:.82rem}li::b
 <body>

 {% set m = metrics or {} %}
-{% set sc = risk_score|default(0) %}
-{% set rl = risk_label|default('LOW') %}
+{# #686 — summary + graphs come from the LIVE social graph (graph_stats), not the
+   frozen events table. #}
+{% set gst = graph_stats or {} %}
+{% set sc = exposure_score|default(0) %}
 {% set ch = charts or {} %}
 {% set gcol = 'var(--phos-hot)' if sc < 30 else ('var(--amber)' if sc < 70 else 'var(--red)') %}
 {% set palette = ['#00dd44','#9e76ff','#ff8866','#66bbff','#ffb347','#ff4466'] %}
-{% set dpi_cls = dpi_classified or {} %}
-{% set cookies_p = cookies_providers or [] %}
-{% set geo_h = geo_top_hosts or [] %}
-{% set n_apps = (dpi_cls.top_apps|default([])|selectattr('app','ne','?')|list|length) %}
-{% set n_trackers = (cookies_p|map(attribute='count')|sum) %}
-{% set n_countries = (geo_h|map(attribute='country')|reject('equalto','')|list|unique|list|length) %}
+{% set n_trackers = gst.total_trackers|default(0) %}
+{% set n_sites = gst.total_sites|default(0) %}
+{% set n_countries = gst.total_countries|default(0) %}
+{% set n_antibot = gst.antibot_sites|default(0) %}
+{% set n_opgrade = gst.opgrade_sites|default(0) %}
 {% set _avatar = avatar_analysis or {} %}

 <h1>👁️ VILLAGE3B <span style="font-size:.8rem;color:var(--dim);font-weight:400">· mon rapport</span></h1>
@@ -109,22 +110,21 @@ ul{list-style:none;padding-left:.2rem}li{padding:.12rem 0;font-size:.82rem}li::b
    </div>
  </div>
  <div class="verdict" style="color:{{ gcol }}">
-    {% if sc < 30 %}🟢 Tout va bien — {{ rl }}{% elif sc < 70 %}🟡 À surveiller — {{ rl }}{% else %}🔴 Attention — {{ rl }}{% endif %}
+    {% if sc < 30 %}🟢 Exposition faible{% elif sc < 70 %}🟡 Exposition modérée{% else %}🔴 Exposition élevée{% endif %}
  </div>
-  <p class="help">Score de risque de ton appareil. Plus il est <b>bas</b>, mieux tu es protégé.</p>
-  {% if risk_explanation %}<p style="font-size:.85rem;margin-top:.5rem">{{ risk_explanation }}</p>{% endif %}
+  <p class="help">Niveau d'exposition au pistage (traceurs croisés + acteurs opérateur/anti-bot). Plus c'est <b>bas</b>, mieux c'est.</p>
 </div>

-{# ── KPI row ── #}
+{# ── KPI row (LIVE social graph) ── #}
 <div class="kpis">
-  <div class="kpi"><div class="e">🌐</div><div class="n">{{ m.connections|default(0) }}</div><div class="l">connexions</div></div>
-  <div class="kpi"><div class="e">📡</div><div class="n">{{ m.unique_hosts|default(0) }}</div><div class="l">hôtes</div></div>
-  <div class="kpi"><div class="e">🍪</div><div class="n">{{ n_trackers }}</div><div class="l">trackers</div></div>
+  <div class="kpi"><div class="e">🍪</div><div class="n">{{ n_trackers }}</div><div class="l">traceurs</div></div>
+  <div class="kpi"><div class="e">🌐</div><div class="n">{{ n_sites }}</div><div class="l">sites</div></div>
  <div class="kpi"><div class="e">🌍</div><div class="n">{{ n_countries }}</div><div class="l">pays</div></div>
-  <div class="kpi"><div class="e">📺</div><div class="n">{{ n_apps }}</div><div class="l">apps</div></div>
-  <div class="kpi"><div class="e">🔒</div><div class="n">{{ m.tls_pinned|default(0) }}</div><div class="l">cert-pin</div></div>
+  <div class="kpi"><div class="e">🤖</div><div class="n">{{ n_antibot }}</div><div class="l">anti-bot</div></div>
+  <div class="kpi"><div class="e">📡</div><div class="n">{{ n_opgrade }}</div><div class="l">opérateur</div></div>
+  <div class="kpi"><div class="e">🔗</div><div class="n">{{ (graph.edges|default([]))|length }}</div><div class="l">liens</div></div>
 </div>
-<p class="help" style="text-align:center;margin-bottom:1rem">Ton appareil a contacté {{ m.unique_hosts|default(0) }} serveurs dans {{ n_countries }} pays, avec {{ n_trackers }} traceurs repérés.</p>
+<p class="help" style="text-align:center;margin-bottom:1rem">{{ n_trackers }} traceurs te suivent à travers {{ n_sites }} sites, depuis {{ n_countries }} pays.{% if n_opgrade %} Dont {{ n_opgrade }} de qualité opérateur.{% endif %}</p>

 {# ── GRAPHS ── #}
 <div class="card">
@@ -158,18 +158,18 @@ ul{list-style:none;padding-left:.2rem}li{padding:.12rem 0;font-size:.82rem}li::b
      {% else %}<div class="empty">Pas encore de données géo</div>{% endif %}
    </div>

-    {# apps bars #}
+    {# top tracked sites bars #}
    <div style="grid-column:1/-1">
-      <div style="font-size:.82rem;color:var(--dim);margin-bottom:.4rem">📺 Quelles apps / services</div>
-      {% if ch.apps %}
-        {% for a in ch.apps %}
-        <div class="bar-row"><span class="bar-lbl">{{ a.emoji }} {{ a.label[:16] }}</span><span class="bar-track"><span class="bar-fill" style="width:{{ a.pct }}%;background:linear-gradient(90deg,var(--violet),#c9b6ff)"></span></span><span class="bar-val" style="color:var(--violet)">{{ a.count }}</span></div>
+      <div style="font-size:.82rem;color:var(--dim);margin-bottom:.4rem">🌐 Où tu es le plus pisté (traceurs par site)</div>
+      {% if ch.sites %}
+        {% for a in ch.sites %}
+        <div class="bar-row"><span class="bar-lbl">{{ a.label[:22] }}</span><span class="bar-track"><span class="bar-fill" style="width:{{ a.pct }}%;background:linear-gradient(90deg,var(--violet),#c9b6ff)"></span></span><span class="bar-val" style="color:var(--violet)">{{ a.count }}</span></div>
        {% endfor %}
-      {% else %}<div class="empty">Aucune app classifiée</div>{% endif %}
+      {% else %}<div class="empty">Pas encore de sites pistés</div>{% endif %}
    </div>

  </div>
-  <p class="help">Les traceurs suivent ta navigation entre sites. Les apps cert-pinning (🔒) refusent l'analyse — c'est bon signe.</p>
+  <p class="help">Les traceurs suivent ta navigation entre sites. « opérateur » = traceurs de niveau opérateur télécom (les plus intrusifs).</p>
 </div>

 {# ── LEVEL SWITCHER (action) ── #}
--- a/packages/secubox-toolbox/debian/changelog
+++ b/packages/secubox-toolbox/debian/changelog
@@ -1,3 +1,53 @@
+secubox-toolbox (2.7.16-1~bookworm1) bookworm; urgency=medium
+
+  * fix: restore banner on heavy sites (leparisien.fr). The #685 stream_large_bodies=1m
+    streamed large HTML too (streamed bodies cannot be banner-injected). Replaced
+    by the content-aware stream_binaries addon: streams only large NON-HTML
+    (APK/XPI/video/octet-stream/big downloads) verbatim, HTML always buffered so
+    inject_banner + ad_ghost work.
+
+ -- Gerald KERMA <devel@cybermind.fr>  Sat, 20 Jun 2026 13:40:00 +0200
+
+secubox-toolbox (2.7.15-1~bookworm1) bookworm; urgency=medium
+
+  * fix(#686): /report/me/html reads the LIVE social graph (social.fetch_graph)
+    instead of the frozen events table (#662 cutover) — report was all-zeros even
+    when /social + webext showed data. Summary gauge = exposure score; KPIs
+    (traceurs/sites/pays/anti-bot/opérateur/liens) + graphs (trackers donut,
+    countries bars, top-pisté-sites bars) all from the live graph.
+
+ -- Gerald KERMA <devel@cybermind.fr>  Sat, 20 Jun 2026 13:00:00 +0200
+
+secubox-toolbox (2.7.14-1~bookworm1) bookworm; urgency=medium
+
+  * fix(#685): mitm-wg now streams large bodies (stream_large_bodies=1m) so big
+    binary downloads (APK, CA cert) pass through the R3 forging path verbatim
+    instead of being buffered/reframed over HTTP/2 — fixes "apk corrupt" /
+    "certificat vide" seen ONLY with the WG tunnel up. No addon touches non-HTML
+    bodies, so streaming is byte-transparent.
+
+ -- Gerald KERMA <devel@cybermind.fr>  Sat, 20 Jun 2026 12:00:00 +0200
+
+secubox-toolbox (2.7.13-1~bookworm1) bookworm; urgency=medium
+
+  * fix(#685): harden the ad-learner so it never 204s functional infra. The
+    aggressive promotion (AD_MIN_SITES=1) had hard-blocked www.google.com →
+    broke reCAPTCHA/consent on news sites (euronews). Now: a NEVER_LEARN guard
+    (Google/CDN/fonts/captcha/auth/payment registrables, matched on host +
+    registrable, env-extendable via SECUBOX_NEVER_LEARN), AD_MIN_SITES default
+    1 → 2, and existing never-learn/allowlisted entries are PRUNED from
+    learned-trackers.txt on each run.
+
+ -- Gerald KERMA <devel@cybermind.fr>  Sat, 20 Jun 2026 11:00:00 +0200
+
+secubox-toolbox (2.7.12-1~bookworm1) bookworm; urgency=medium
+
+  * chore: serve the new clients from kbin — bump pinned webext release tag
+    v0.1.4 → v0.1.5 (/wg/toolbox.xpi fallback + secubox-toolbox-fetch-xpi). The
+    APK serve path already pulls /releases/latest (now android-v0.4.0).
+
+ -- Gerald KERMA <devel@cybermind.fr>  Sat, 20 Jun 2026 09:00:00 +0200
+
 secubox-toolbox (2.7.11-1~bookworm1) bookworm; urgency=medium

  * feat: landing (kbin.gk2) restyled to match the new report — system font,
--- a/packages/secubox-toolbox/mitmproxy_addons/stream_binaries.py
+++ b/packages/secubox-toolbox/mitmproxy_addons/stream_binaries.py
@@ -0,0 +1,53 @@
+# SPDX-License-Identifier: LicenseRef-CMSD-1.0
+# Copyright (c) 2026 CyberMind — Gérald Kerma <devel@cybermind.fr>
+# Source-Disclosed License — All rights reserved except as expressly granted.
+# See LICENCE-CMSD-1.0.md for terms.
+#
+# SecuBox-Deb :: toolbox :: stream large BINARY responses (#686)
+#
+# Replaces the content-agnostic `--set stream_large_bodies=1m`, which streamed
+# EVERY body >1MB — including large HTML (leparisien.fr) — and streamed bodies
+# can't be banner-injected → "plus de banner". Here we stream only large
+# NON-HTML responses (APK / XPI / video / octet-stream / big downloads) so they
+# pass through the HTTP/2 forging path VERBATIM (the buffer+reframe corrupted the
+# 14MB APK over the R3 tunnel), while HTML is always buffered so inject_banner /
+# ad_ghost still work.
+from mitmproxy import http
+
+_THRESHOLD = 1_000_000  # 1 MB
+# Always-stream binary content-types regardless of declared length (covers
+# chunked downloads with no Content-Length).
+_BIN_CT = (
+    "application/vnd.android.package-archive",  # .apk
+    "application/x-xpinstall",                  # .xpi
+    "application/octet-stream",
+    "application/zip",
+    "application/pdf",
+    "video/",
+    "audio/",
+)
+
+
+class StreamBinaries:
+    def responseheaders(self, flow: http.HTTPFlow) -> None:
+        try:
+            r = flow.response
+            if r is None:
+                return
+            ct = (r.headers.get("content-type", "") or "").lower()
+            if "text/html" in ct:
+                return  # NEVER stream HTML — banner + ad_ghost need the body
+            if any(b in ct for b in _BIN_CT):
+                r.stream = True
+                return
+            try:
+                cl = int(r.headers.get("content-length", "0") or "0")
+            except (TypeError, ValueError):
+                cl = 0
+            if cl >= _THRESHOLD:
+                r.stream = True
+        except Exception:
+            pass
+
+
+addons = [StreamBinaries()]
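
The addon's decision can be read as a pure function of two response headers. A minimal sketch, assuming nothing beyond what the diff shows: `should_stream` is a hypothetical helper name (it is not part of the addon), and the content-type list and threshold simply mirror `_BIN_CT` and `_THRESHOLD` so the rule can be exercised without mitmproxy installed.

```python
from typing import Optional

# Mirrors _THRESHOLD and _BIN_CT from stream_binaries.py (copied for illustration).
THRESHOLD = 1_000_000  # bytes
BINARY_TYPES = (
    "application/vnd.android.package-archive",  # .apk
    "application/x-xpinstall",                  # .xpi
    "application/octet-stream",
    "application/zip",
    "application/pdf",
    "video/",
    "audio/",
)


def should_stream(content_type: Optional[str], content_length: Optional[str]) -> bool:
    """Hypothetical helper: True when the body should bypass buffering."""
    ct = (content_type or "").lower()
    if "text/html" in ct:
        return False  # HTML stays buffered so banner injection can edit it
    if any(marker in ct for marker in BINARY_TYPES):
        return True   # known binary type, streamed even without Content-Length
    try:
        size = int(content_length or "0")
    except (TypeError, ValueError):
        size = 0      # unparsable length is treated as "small"
    return size >= THRESHOLD


if __name__ == "__main__":
    samples = [
        ("text/html; charset=utf-8", "5000000"),
        ("application/vnd.android.package-archive", None),
        ("application/json", "999999"),
        ("text/css", "2000000"),
    ]
    for ct, cl in samples:
        print(f"{ct!r:50} len={cl!r:10} stream={should_stream(ct, cl)}")
```

The order of the checks matters: the HTML test runs first, so a 5 MB HTML page is still buffered, while a chunked APK with no declared length is streamed on content type alone.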
--- a/packages/secubox-toolbox/sbin/secubox-toolbox-autolearn
+++ b/packages/secubox-toolbox/sbin/secubox-toolbox-autolearn
@@ -32,10 +32,35 @@ SPLICE_MIN_HITS = int(os.environ.get("SECUBOX_SPLICE_MIN_HITS", "20"))
 SPLICE_MAX = 2000
 MIN_SITES = 2          # cross-site threshold for operator-grade trackers
 MAX_ENTRIES = 8000
-# #656 — ad-candidate promotion (aggressive: 1 distinct site by default).
-AD_MIN_SITES = int(os.environ.get("SECUBOX_AD_MIN_SITES", "1"))
+# #656 — ad-candidate promotion. #685 hardening: require >= 2 distinct sites
+# (was 1 — a single-site host got hard-blocked, e.g. www.google.com → broke
+# reCAPTCHA/consent on euronews). Env-overridable.
+AD_MIN_SITES = int(os.environ.get("SECUBOX_AD_MIN_SITES", "2"))
 AD_ALLOWLIST = os.environ.get("SECUBOX_AD_ALLOWLIST",
                              "/var/lib/secubox/toolbox/ad-allowlist.txt")
+
+# #685 — NEVER-LEARN guard: registrables that host FUNCTIONAL content (CDNs,
+# fonts, captcha, auth, OS/payment services). The learner must NEVER 204 these —
+# blocking them breaks sites (www.google.com reCAPTCHA/consent broke euronews).
+# Checked against the host AND its registrable; existing entries are also pruned.
+_NEVER_LEARN_SEED = {
+    "google.com", "gstatic.com", "googleapis.com", "googleusercontent.com",
+    "googlevideo.com", "ytimg.com", "ggpht.com", "youtube.com", "recaptcha.net",
+    "apple.com", "icloud.com", "mzstatic.com", "cdn-apple.com", "cloudflare.com",
+    "jsdelivr.net", "jquery.com", "bootstrapcdn.com", "unpkg.com", "cdnjs.com",
+    "akamaihd.net", "akamai.net", "fastly.net", "edgekey.net", "edgesuite.net",
+    "microsoft.com", "office.com", "live.com", "windows.net", "azureedge.net",
+    "msftauth.net", "paypal.com", "paypalobjects.com", "stripe.com",
+}
+NEVER_LEARN = _NEVER_LEARN_SEED | {
+    d.strip().lower()
+    for d in os.environ.get("SECUBOX_NEVER_LEARN", "").split(",") if d.strip()
+}
+
+
+def _never_learn(host: str) -> bool:
+    h = (host or "").lower().strip(".")
+    return bool(h) and (h in NEVER_LEARN or (registrable(h) or h) in NEVER_LEARN)
 COOKIE_XSITE_TOP_N = int(os.environ.get("SECUBOX_COOKIE_XSITE_TOP_N", "5"))

 sys.path.insert(0, os.environ.get("SECUBOX_TOOLBOX_LIB", "/usr/lib/secubox/toolbox"))
@@ -191,13 +216,16 @@ def _ad_feed() -> int:
            continue
        if reg in self_doms or any(h == d or h.endswith("." + d) for d in self_doms):
            continue
+        # #685 — never hard-block functional infra (CDN/fonts/captcha/auth).
+        if _never_learn(h):
+            continue
        # #658 — promote the EXACT host, NOT the registrable: blocking a tracker
        # subdomain (analytics.tiktok.com) must never block the parent site
        # (tiktok.com). Dedicated ad hosts are already registrable-level.
        promoted.add(h)
-    if not promoted:
-        return 0
-    # MERGE with existing learned-trackers.txt (union, dedup, cap).
+    # MERGE with existing learned-trackers.txt (union, dedup, cap). #685: also
+    # PRUNE any existing never-learn / allowlisted entries already on disk, so a
+    # previously mis-learned host (e.g. www.google.com) is cleaned on the next run.
    existing: set = set()
    try:
        if os.path.exists(OUT):
@@ -208,7 +236,11 @@ def _ad_feed() -> int:
                        existing.add(ln)
    except Exception as e:
        sys.stderr.write(f"autolearn: ad merge read failed: {e}\n")
-    merged = sorted(existing | promoted)[:MAX_ENTRIES]
+    pruned = {e for e in existing
+              if _never_learn(e) or e in allow or (registrable(e) or e) in allow}
+    if not promoted and not pruned:
+        return 0
+    merged = sorted((existing - pruned) | promoted)[:MAX_ENTRIES]
    try:
        os.makedirs(os.path.dirname(OUT), exist_ok=True)
        tmp = OUT + ".tmp"
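
The guard-then-prune flow in this hunk can be sketched in isolation. This is an illustrative reduction, not the script itself: `merge_learned` is a hypothetical name, `NEVER_LEARN` is a three-entry subset of the seed list, and `registrable` is a naive last-two-labels stand-in for the real helper, which the diff imports from the toolbox library and does not show.

```python
from typing import Iterable, List, Set

NEVER_LEARN: Set[str] = {"google.com", "gstatic.com", "paypal.com"}  # subset of the seed


def registrable(host: str) -> str:
    """Naive stand-in (assumption): keep the last two labels of the host."""
    parts = host.split(".")
    return ".".join(parts[-2:]) if len(parts) >= 2 else host


def never_learn(host: str) -> bool:
    """True when the host or its registrable domain is protected."""
    h = (host or "").lower().strip(".")
    return bool(h) and (h in NEVER_LEARN or registrable(h) in NEVER_LEARN)


def merge_learned(existing: Iterable[str], candidates: Iterable[str],
                  allow: Set[str], max_entries: int = 8000) -> List[str]:
    """Hypothetical helper: union of old and new entries minus protected ones."""
    def protected(h: str) -> bool:
        return never_learn(h) or h in allow or registrable(h) in allow

    # New candidates are filtered before promotion...
    promoted = {h for h in candidates if not protected(h)}
    # ...and entries already on disk are pruned by the same rule.
    kept = {h for h in existing if not protected(h)}
    return sorted(kept | promoted)[:max_entries]


if __name__ == "__main__":
    on_disk = {"www.google.com", "ads.example.net", "cdn.allowed.org"}
    seen = {"fonts.gstatic.com", "analytics.tiktok.com"}
    print(merge_learned(on_disk, seen, allow={"allowed.org"}))
```

Applying the same predicate to entries already on disk is what lets a previously mis-learned host such as `www.google.com` disappear on the next run, even when no new candidate is promoted.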
--- a/packages/secubox-toolbox/sbin/secubox-toolbox-fetch-xpi
+++ b/packages/secubox-toolbox/sbin/secubox-toolbox-fetch-xpi
@@ -16,7 +16,7 @@ DEST_DIR="/var/lib/secubox/toolbox/webext"
 DEST="${DEST_DIR}/secubox-toolbox-webext.xpi"
 # Tag-pinned (not /latest/): the webext release is make_latest:false so it
 # doesn't steal "latest" from the Android APK release. Bump on new webext-v*.
-RELEASE_URL="https://github.com/CyberMind-FR/secubox-deb/releases/download/webext-v0.1.4/secubox-toolbox-webext.xpi"
+RELEASE_URL="https://github.com/CyberMind-FR/secubox-deb/releases/download/webext-v0.1.5/secubox-toolbox-webext.xpi"

 log() { logger -t "$MODULE" -- "$*" 2>/dev/null || echo "[$MODULE] $*" >&2; }

--- a/packages/secubox-toolbox/sbin/secubox-toolbox-mitm-wg-launch
+++ b/packages/secubox-toolbox/sbin/secubox-toolbox-mitm-wg-launch
@@ -87,6 +87,10 @@ ARGS=(
    # upstream in mitmproxy 10.4 ; with mitmproxy 11+ we can safely
    # re-enable keep-alive.  Halves TCP handshakes towards busy CDNs.
    --set keep_host_header=true
+    # #686 — large-binary streaming is now content-aware via the stream_binaries
+    # addon (streams APK/XPI/video/large NON-HTML verbatim) instead of the blunt
+    # `stream_large_bodies=1m`, which also streamed large HTML and killed banner
+    # injection on heavy sites (leparisien.fr).
 )

 if [ -n "$IGNORE_REGEX" ]; then
@@ -115,7 +119,7 @@ fi
 # ad_ghost (#566) runs right after protective_mode: for R3+/R4 it 204s known
 # ad/tracker hosts (bandwidth save) at request time and injects ad-hiding CSS
 # on HTML responses. Gated by the modular filter config (toolbox WebUI).
-for addon in tls_splice inject_xff utiq_defense protective_mode privacy_guard ad_ghost media_cache local_store social_graph inject_banner dpi cookies avatar ja4 soc_relay cert_pin_detect media_stats; do
+for addon in stream_binaries tls_splice inject_xff utiq_defense protective_mode privacy_guard ad_ghost media_cache local_store social_graph inject_banner dpi cookies avatar ja4 soc_relay cert_pin_detect media_stats; do
    ARGS+=(-s "$ADDON_DIR/${addon}.py")
 done

--- a/packages/secubox-toolbox/secubox_toolbox/api.py
+++ b/packages/secubox-toolbox/secubox_toolbox/api.py
@@ -1632,7 +1632,7 @@ async def wg_toolbox_apk() -> Response:
 _WEBEXT_XPI = Path("/var/lib/secubox/toolbox/webext/secubox-toolbox-webext.xpi")
 _WEBEXT_XPI_RELEASE = (
    "https://github.com/CyberMind-FR/secubox-deb/releases/download/"
-    "webext-v0.1.4/secubox-toolbox-webext.xpi"
+    "webext-v0.1.5/secubox-toolbox-webext.xpi"
 )


@@ -2342,11 +2342,12 @@ def _classify_apps(hosts: set[str]) -> list[str]:
    return apps


-def _build_report_charts(session: dict) -> dict:
-    """Graph-ready aggregates for the simplified report (trackers donut,
-    countries bars, apps bars). Defensive / fail-empty. Each list item has
-    {label, emoji/flag, count, pct}; trackers also carry cumulative start/end
-    for a CSS conic-gradient donut."""
+def _build_report_charts(graph: dict) -> dict:
+    """Graph-ready aggregates for the report, from the LIVE social graph
+    (social.fetch_graph). The events table froze at the #662 cutover, so the
+    report reads the SAME source as /social + the webext (was the bug: it read
+    the dead events → all zeros). Returns trackers donut + countries bars + sites
+    bars; trackers also carry cumulative start/end for the CSS conic-gradient."""
    def _top_pct(items: list, n: int = 6) -> list:
        items = [it for it in items if it.get("count")]
        items.sort(key=lambda x: x["count"], reverse=True)
@@ -2356,30 +2357,33 @@ def _build_report_charts(session: dict) -> dict:
            it["pct"] = round(100 * it["count"] / total)
        return items

-    cp = session.get("cookies_providers") or []
+    g = graph or {}
+    nodes = g.get("nodes") or []
+
    trackers = _top_pct([
-        {"label": p.get("provider", "?"), "emoji": p.get("emoji", "🍪"),
-         "count": int(p.get("count", 0) or 0)} for p in cp])
+        {"label": (n.get("domain") or n.get("id") or "?"), "emoji": "🍪",
+         "count": int(n.get("hits", 0) or 0)} for n in nodes])
    cum = 0
    for it in trackers:
        it["start"] = cum
        cum += it["pct"]
        it["end"] = cum

-    by_country: dict = {}
-    for h in (session.get("geo_top_hosts") or []):
-        key = (h.get("flag") or "🏴", h.get("country") or "?")
-        by_country[key] = by_country.get(key, 0) + int(h.get("count", 0) or 0)
    countries = _top_pct([
-        {"flag": k[0], "label": k[1], "count": v} for k, v in by_country.items()])
+        {"flag": c.get("flag") or "🏴", "label": (c.get("country_iso") or "?"),
+         "count": int(c.get("hits", 0) or 0)} for c in (g.get("by_country") or [])])

-    dc = session.get("dpi_classified") or {}
-    apps = _top_pct([
-        {"label": a.get("app", "?"), "emoji": a.get("emoji", "📦"),
-         "count": int(a.get("count", 0) or 0)}
-        for a in (dc.get("top_apps") or []) if a.get("app") not in (None, "", "?")])
+    # top tracked sites = number of DISTINCT trackers reaching each first-party
+    # site (from each node's sites list) — "where you're tracked most".
+    site_trk: dict = {}
+    for n in nodes:
+        for s in (n.get("sites") or []):
+            if s:
+                site_trk[s] = site_trk.get(s, 0) + 1
+    sites = _top_pct([{"label": s, "emoji": "🌐", "count": c}
+                      for s, c in site_trk.items()])

-    return {"trackers": trackers, "countries": countries, "apps": apps}
+    return {"trackers": trackers, "countries": countries, "sites": sites}


 # NOTE: route order matters in FastAPI — specific routes (/report/me,
@@ -2408,6 +2412,21 @@ async def report_me_html(request: Request) -> HTMLResponse:
        )
    ip = _client_ip(request) or (request.client.host if request.client else "?")
    session = _aggregate_session(mac_hash)
+    # #686 — the events table froze at the #662 cutover, so the report's numbers
+    # came out all-zero. The LIVE per-client data is the social graph (same source
+    # /social + the webext use). Pull it (7d) and drive the summary + graphs off it.
+    try:
+        from . import social as _social
+        graph = _social.fetch_graph(mac_hash, since_seconds=7 * 86400)
+    except Exception:
+        graph = {"stats": {}, "nodes": [], "by_country": []}
+    gs = graph.get("stats") or {}
+    # Honest exposure indicator (0-100) from the live graph: tracker breadth +
+    # operator-grade / anti-bot presence. Not a "compromise" score (events dead).
+    exposure_score = min(100, int(
+        (gs.get("total_trackers", 0) or 0) * 1.5
+        + (gs.get("opgrade_sites", 0) or 0) * 12
+        + (gs.get("antibot_sites", 0) or 0) * 8))
    # Phase 3 (#492) : pass query args + force no-cache so iPhone Safari
    # actually fetches the new template.
    # Phase 6 (#496) : also pass wg_enabled so dashboard R3 link renders
@@ -2420,7 +2439,8 @@ async def report_me_html(request: Request) -> HTMLResponse:
        current_level=store.get_client_level(mac_hash) if mac_hash else "r1",
        wg_enabled=wg_enabled,
        cumulative=cumulative,
-        charts=_build_report_charts(session),
+        graph=graph, graph_stats=gs, exposure_score=exposure_score,
+        charts=_build_report_charts(graph),
        **session,
    )
    return HTMLResponse(html, headers={
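
The numbers the report now derives from the live graph can be worked through by hand. A minimal sketch, assuming only the formula and thresholds visible in this diff: the function names are hypothetical, the weights (1.5, 12, 8) are copied from `report_me_html`, and the 30/70 bands from the template's `gcol` expression.

```python
from typing import Dict, List


def exposure_score(stats: Dict[str, int]) -> int:
    """Weighted sum of graph stats, truncated to int and capped at 100."""
    raw = ((stats.get("total_trackers", 0) or 0) * 1.5
           + (stats.get("opgrade_sites", 0) or 0) * 12
           + (stats.get("antibot_sites", 0) or 0) * 8)
    return min(100, int(raw))


def verdict_band(score: int) -> str:
    """Same cut-offs as the template: below 30, below 70, otherwise high."""
    if score < 30:
        return "low"
    if score < 70:
        return "moderate"
    return "high"


def trackers_per_site(nodes: List[dict]) -> Dict[str, int]:
    """Count how many tracker nodes list each first-party site."""
    counts: Dict[str, int] = {}
    for node in nodes:
        for site in node.get("sites") or []:
            if site:  # skip empty site names
                counts[site] = counts.get(site, 0) + 1
    return counts


if __name__ == "__main__":
    stats = {"total_trackers": 10, "opgrade_sites": 1, "antibot_sites": 0}
    score = exposure_score(stats)  # 10 * 1.5 + 1 * 12 = 27
    print(score, verdict_band(score))
    nodes = [
        {"domain": "t1.example", "sites": ["a.fr", "b.fr"]},
        {"domain": "t2.example", "sites": ["a.fr", ""]},
        {"domain": "t3.example"},
    ]
    print(trackers_per_site(nodes))
```

Under these weights the cap is reached quickly: 67 trackers alone give 100, so the live figure quoted in the commit message (113 traceurs) would saturate the gauge regardless of the operator-grade and anti-bot terms.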
Author	SHA1	Message	Date
CyberMind-FR	9eb2d68b92	fix(toolbox): content-aware binary streaming — restore banner on heavy sites The #685 stream_large_bodies=1m streamed large HTML too (streamed bodies can't be banner-injected) → "plus de banner sur leparisien.fr". Replaced by the stream_binaries addon: streams only large NON-HTML (apk/xpi/video/octet-stream/ big downloads) verbatim so the R3 forging path doesn't corrupt them, while HTML is always buffered so inject_banner + ad_ghost work. toolbox 2.7.16. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-20 08:24:17 +02:00
CyberMind-FR	cd3bbcadf3	fix(toolbox): report reads LIVE social graph, not frozen events (ref #686) /report/me/html showed all-zeros while /social + the webext showed data — it read the events table (frozen since the #662 cutover). Now it pulls social.fetch_graph (7d) and drives the gauge (exposure score), KPIs (traceurs/sites/pays/anti-bot/opérateur/liens) and graphs (trackers donut, countries bars, top-pisté-sites bars) off the live graph. Verified live: 9433ceb9 → 113 traceurs/183 sites/13 pays (was 0). toolbox 2.7.15. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-20 08:20:49 +02:00
CyberMind-FR	00184bdbec	docs: session checkpoint 2026-06-20 — Tor shipped, client releases, ad-block/mitm hardening, #686 open Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-20 08:08:50 +02:00
CyberMind-FR	c7cc7bbe33	fix(toolbox): mitm-wg stream large bodies — stop corrupting APK/cert via R3 (ref #685) Large binary downloads (14MB APK, CA cert) were corrupted/truncated ONLY through the R3 WG tunnel (user: "disabling the wg and the apk is okay"). The HTTP/2 forging path buffered+reframed them. Set stream_large_bodies=1m so big responses pass through verbatim. No addon touches non-HTML bodies (all text/html-gated), so streaming is byte-transparent. toolbox 2.7.14. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-20 07:28:21 +02:00
CyberMind-FR	b936b2dbf7	fix(toolbox): harden ad-learner — never 204 functional infra (ref #685) The learner promoted ad candidates seen on a single site (AD_MIN_SITES=1) with no functional-infra guard, so it hard-blocked www.google.com → broke reCAPTCHA/consent on news sites (euronews). Now: - NEVER_LEARN guard: Google/CDN/fonts/captcha/auth/payment registrables matched on host + registrable (env-extendable via SECUBOX_NEVER_LEARN). - AD_MIN_SITES default 1 → 2 (a one-site host no longer auto-blocks globally). - Existing never-learn / allowlisted entries PRUNED from learned-trackers.txt each run (cleans previously mis-learned hosts). Verified live on gk2: www.google.com/.fr pruned; real trackers (GA) stay blocked. toolbox 2.7.13. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-20 07:18:53 +02:00
CyberMind-FR	5bd107cb46	chore(toolbox): serve new clients from kbin — pin webext v0.1.5 (ref #683) Bump the pinned webext release tag v0.1.4 → v0.1.5 in the /wg/toolbox.xpi fallback (api.py) and secubox-toolbox-fetch-xpi. APK serve path already pulls /releases/latest (android-v0.4.0). toolbox 2.7.12. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-20 06:53:07 +02:00