GemmaForge — public-repo vulnerability scan
Paste any public GitHub repo. We shallow-clone it, sweep the source files with the spanmax-trained linear + per-CWE probes on Gemma 4 E2B layer-8 activations, and rank the regions that look most CVE-shaped.
Calibrated thresholding. Each lead has a confidence in [0, 1]; the F1-max threshold τ on the heldout calibration set is 0.929 (P=0.49, R=0.35, F1=0.41, AUC=0.70). Sub-τ leads are kept in the table for inspection but are not painted on the file viewer, because the probe's score there falls below the calibrated threshold.
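As a rough illustration of how an F1-max threshold can be chosen on a calibration set, the sketch below sweeps every observed score as a candidate τ and keeps the one with the best F1. The function names and the toy scores and labels are hypothetical, not taken from the GemmaForge code; the only figures from this page are the P and R values checked in the last line.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


def f1_max_threshold(scores, labels):
    """Return (tau, precision, recall, f1) for the threshold that maximizes F1.

    A lead counts as predicted-positive when its score is >= tau
    (an assumption; the real comparison may be strict).
    """
    total_pos = sum(labels)
    best = (None, 0.0, 0.0, -1.0)
    for tau in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if s >= tau and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= tau and y == 0)
        if tp == 0:
            continue  # F1 is undefined/zero without any true positive
        precision = tp / (tp + fp)
        recall = tp / total_pos
        f1 = f1_score(precision, recall)
        if f1 > best[3]:
            best = (tau, precision, recall, f1)
    return best


# Toy calibration data, made up for illustration only.
scores = [0.10, 0.40, 0.55, 0.80, 0.93, 0.97]
labels = [0, 0, 1, 0, 1, 1]
tau, p, r, f1 = f1_max_threshold(scores, labels)

# Consistency check on the figures quoted above: P=0.49, R=0.35 gives F1 of about 0.41.
reported_f1 = round(f1_score(0.49, 0.35), 2)
```

The F1-max criterion trades precision against recall symmetrically, which is why the quoted operating point has neither metric close to 1.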
Runs on ZeroGPU (H200-backed). Caps: 400 files / 800 chunks default (max 2000) / 100 MB / 105s wall. Full sweep: clone the repo and run python -m src.scan.
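One way to picture the caps is as a small immutable config plus a clamp on the requested chunk budget. This is only a sketch: the class, its field names, and the reading that "max 2000" applies to chunks are assumptions, not the app's actual configuration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScanCaps:
    """Hypothetical container for the limits quoted on this page."""
    max_files: int = 400
    default_chunks: int = 800
    hard_max_chunks: int = 2000  # assumed to be the chunk ceiling
    max_repo_mb: int = 100
    wall_seconds: int = 105


def chunk_budget(requested=None, caps=ScanCaps()):
    """Clamp a user-requested chunk count into [1, hard_max_chunks]."""
    if requested is None:
        return caps.default_chunks
    return max(1, min(requested, caps.hard_max_chunks))
```

For example, `chunk_budget(5000)` would be clamped to 2000 under these assumptions, while `chunk_budget()` falls back to the 800-chunk default.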
Ranked leads (gemmaforge.leads/v1) — sub-τ rows marked '·'
File-level viewer over the most recent scan. Click a file to see its lead overlays. Below-τ leads are listed, greyed out, in the bottom table but are not painted on the code.
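The split between painted and listed-only leads can be sketched as a simple partition on confidence. The lead fields used here (`file`, `start_line`, `end_line`, `confidence`) are hypothetical stand-ins, not the actual gemmaforge.leads/v1 schema, and treating a score exactly equal to τ as painted is an assumption.

```python
TAU = 0.929  # F1-max threshold quoted on this page


def split_leads(leads, tau=TAU):
    """Rank leads by confidence, then separate painted from listed-only.

    Returns (painted, listed_only): leads at or above tau get an overlay
    in the viewer; the rest appear only in the table.
    """
    ranked = sorted(leads, key=lambda lead: lead["confidence"], reverse=True)
    painted = [lead for lead in ranked if lead["confidence"] >= tau]
    listed_only = [lead for lead in ranked if lead["confidence"] < tau]
    return painted, listed_only


# Made-up leads for illustration.
leads = [
    {"file": "b.c", "start_line": 5, "end_line": 9, "confidence": 0.40},
    {"file": "a.c", "start_line": 10, "end_line": 20, "confidence": 0.95},
    {"file": "c.c", "start_line": 1, "end_line": 4, "confidence": 0.929},
]
painted, listed_only = split_leads(leads)
```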
Pre-computed scan of 136 SVEN heldout microrepos. Each repo has one vulnerable file + a fixed counterpart + a few safe decoys with line-level ground truth. Same probe, same τ.
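Caveats: probes classify representation, not exploitability. Spanmax probe at F1-max τ: P=49%, R=35% on heldout. Leads above τ are the most reliable, but about half of them are still false positives, and roughly two thirds of true vulnerabilities score below τ. Full eval suite: RESULTS.md.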