GemmaForge — public-repo vulnerability scan

Paste any public GitHub repo. We shallow-clone it, sweep the source files with the spanmax-trained linear + per-CWE probes on Gemma 4 E2B layer-8 activations, and rank the regions that look most CVE-shaped.

Calibrated thresholding. Each lead carries a confidence in [0, 1]. The F1-max threshold τ on the held-out calibration set is 0.929 (P=0.49, R=0.35, F1=0.41, AUC=0.70). Sub-τ leads stay in the table for inspection but are not highlighted in the file viewer, because the probe did not fire strongly enough there.
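The reported F1 follows from the stated precision and recall, and the sub-τ rule amounts to a simple partition. The sketch below is illustrative only: it assumes each lead is a dict with a `confidence` key (a hypothetical field name; the gemmaforge.leads/v1 schema is not shown here) and that a lead exactly at τ counts as above it.

```python
TAU = 0.929  # F1-max threshold quoted in the text


def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def split_by_threshold(leads, tau=TAU):
    """Partition leads into (highlighted, table_only).

    Assumption: "confidence" is a hypothetical key name, and the
    comparison is inclusive at tau.
    """
    highlighted = [lead for lead in leads if lead["confidence"] >= tau]
    table_only = [lead for lead in leads if lead["confidence"] < tau]
    return highlighted, table_only


# Consistency check on the quoted metrics: P=0.49, R=0.35.
print(round(f1_score(0.49, 0.35), 2))  # 0.41
```

Under these assumptions, `split_by_threshold` returns the leads that would be highlighted in the viewer first and the table-only leads second.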

Runs on ZeroGPU (H200-backed). Caps: 400 files, 800 chunks by default (max 2000), 100 MB, 105 s wall-clock. For a full sweep, clone the repo and run python -m src.scan.

Public GitHub repo

Languages (comma-sep)

Subset of: py, js, ts, c, cpp
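As a rough illustration of what the comma-separated field accepts, here is a minimal parsing sketch. The function name `parse_languages` and the choice to reject unknown entries (rather than silently drop them) are assumptions, not the app's confirmed behavior.

```python
ALLOWED = ("py", "js", "ts", "c", "cpp")  # the subset listed above


def parse_languages(text: str) -> list[str]:
    """Parse a comma-separated language list, keeping first-seen order.

    Hypothetical helper: tolerates spaces, upper case, and a leading dot,
    and raises ValueError on anything outside ALLOWED.
    """
    selected = []
    for token in text.split(","):
        lang = token.strip().lower().lstrip(".")
        if not lang:
            continue  # skip empty entries such as a trailing comma
        if lang not in ALLOWED:
            raise ValueError(f"unsupported language: {lang!r}")
        if lang not in selected:
            selected.append(lang)
    return selected
```

For example, `parse_languages("py, JS, cpp")` yields `["py", "js", "cpp"]` under these assumptions.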

Top-K leads

Range: 1 to 50

Min probe confidence

Range: 0 to 0.95

Max chunks

Probe forward-pass budget. Higher → deeper coverage, longer scan.

Range: 50 to 2000

Try a small open-source repo

Ranked leads (gemmaforge.leads/v1) — sub-τ rows marked '·'

Download leads.jsonl
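Once downloaded, the file can be post-processed offline. The sketch below assumes one JSON object per line and a numeric `confidence` field; both the field name and the helper `load_leads` are hypothetical, since the gemmaforge.leads/v1 schema is not documented on this page.

```python
import json


def load_leads(path, min_confidence=0.0, top_k=50):
    """Read a JSONL file and return the top_k leads by confidence.

    Assumption: each line is a JSON object with a numeric "confidence"
    key; rows without one are treated as confidence 0.0.
    """
    leads = []
    with open(path, encoding="utf-8") as fh:
        for line in fh:
            line = line.strip()
            if line:  # ignore blank lines
                leads.append(json.loads(line))

    def confidence(lead):
        return lead.get("confidence", 0.0)

    kept = [lead for lead in leads if confidence(lead) >= min_confidence]
    kept.sort(key=confidence, reverse=True)
    return kept[:top_k]
```

Calling `load_leads("leads.jsonl", min_confidence=0.929)` would then keep only rows at or above the F1-max τ, mirroring what the file viewer highlights.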

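Caveats: probes classify internal representations, not exploitability. At the F1-max τ the spanmax probe reaches P=49%, R=35% on the held-out set, so roughly half of the leads above τ are false positives and about two thirds of true positives fall below it. Full eval suite: RESULTS.md.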