Jun 14, 2026 · security, pentest, idor, playwright

Flipping my test chat into attack mode: security and pentest in VS Code

Security testing always loses the calendar fight. You build the feature, you write the happy-path test, and the part where someone attacks the thing you just built gets deferred to a quarter that never comes — partly because it lives in a different tool, with a different setup, in a different headspace.

What finally got me doing it regularly was that it stopped being a separate tool. In the Hover VS Code extension, the chat I already use to author functional tests has two more modes, and switching is a toggle:

🟠 Security — business / authorization testing. Hover routes the debug Chrome through a local HTTPS MITM, captures the API calls your flow makes, and lets the agent re-issue them with mutations to probe for IDOR, auth bypass, and parameter tampering.
🔴 Pentest — offensive testing against your own dev app: SQLi, XSS, SSTI, SSRF, open-redirect, IDOR. The output is a findings report that says what it did and didn't test.

Both run inside the extension — @hover-dev/security and @hover-dev/pentest are packed into its staged engine — so flipping the mode is all it takes. And there's zero external setup: no mitmproxy, no Python, no system CA install. Hover generates a one-off CA the first time it starts and pins it to the debug Chrome, so your OS trust store is never touched.

Probing an IDOR, hands-on

Say my app has an orders page. In 🟠 Security mode I describe the flow the way I'd describe any test:

@account open my orders, then check whether I can read someone else's order

The agent operates the app to generate real traffic, sees the GET /orders/1001 call go by, and replays it with the resource id swapped:

GET /orders/999  →  200 OK   (expected 403)

That 200 is the finding: another user's order leaked. The mode is built around that loop — capture real flows, mutate the part that should be access-controlled, watch the response.
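
The capture, mutate, observe loop can be sketched in a few lines of TypeScript. This is an illustration only, not Hover's implementation: CapturedRequest, swapResourceId, and looksLikeIdor are hypothetical names, and treating any 2xx response as a finding is an assumption made for this example.

```typescript
// Hypothetical shape of a request captured from the app's own traffic.
interface CapturedRequest {
  method: string;
  path: string; // e.g. "/orders/1001"
}

// Replace the trailing numeric id in the path with another user's id.
// Returns null when the path has no trailing numeric segment to mutate.
function swapResourceId(
  req: CapturedRequest,
  foreignId: number
): CapturedRequest | null {
  const trailingId = /\/\d+$/;
  if (!trailingId.test(req.path)) return null;
  return { ...req, path: req.path.replace(trailingId, `/${foreignId}`) };
}

// Assumption for this sketch: a 2xx response to a request for a resource the
// caller does not own counts as a finding; 401, 403, or 404 counts as enforced.
function looksLikeIdor(status: number): boolean {
  return status >= 200 && status < 300;
}

// The flow from the example above: the captured call, then the mutated replay.
const captured: CapturedRequest = { method: "GET", path: "/orders/1001" };
const mutated = swapResourceId(captured, 999);
```

Here mutated carries the path /orders/999, and the status of the replayed response is what gets passed to looksLikeIdor.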

The output is still a Playwright spec

This is the part that fits Hover's whole reason for existing. A confirmed authorization finding doesn't just get logged — it crystallizes into __vibe_tests__/<slug>.security.spec.ts, a plain Playwright test using the request fixture:

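import { test, expect } from '@playwright/test';

// Regression gate for the IDOR: order 999 belongs to another user.
test('GET /orders/:id enforces ownership', async ({ request }) => {
  // Relative path; it resolves against baseURL from the Playwright config.
  const res = await request.get('/orders/999', {
    headers: { authorization: `Bearer ${process.env.HOVER_LOCAL_TOKEN}` },
  });
  expect(res.status()).toBe(403); // was 200 (IDOR)
});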

So the vulnerability becomes a regression gate that runs in CI with no proxy and no agent. You don't just find the bug once; you keep it found.
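
For the spec to run in CI, the relative /orders/999 path needs a base URL to resolve against. Here is a minimal playwright.config.ts sketch, assuming the dev server listens on http://localhost:3000 and using BASE_URL as an environment variable name picked for this example. Neither detail comes from Hover.

```typescript
import { defineConfig } from '@playwright/test';

export default defineConfig({
  // Directory where the generated specs land.
  testDir: './__vibe_tests__',
  // Only pick up the security regression specs in this run.
  testMatch: /.*\.security\.spec\.ts/,
  use: {
    // Assumed dev-server address; relative request paths resolve against it.
    baseURL: process.env.BASE_URL ?? 'http://localhost:3000',
  },
});
```

With a config like this, npx playwright test is enough to run the gate in a pipeline.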

Pentest mode (🔴) is the offensive sibling — it operates the app to generate traffic, then attacks the captured flows in-band, and writes a Markdown findings report with severity, a proof of concept, and an explicit "not tested" section so you know the boundary of what it covered.
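
To make the report structure concrete, here is a small sketch of a findings model rendered to Markdown. The type names, field names, and layout are hypothetical and chosen for illustration; Hover's actual report format may differ. The sample "not tested" entry is made up.

```typescript
type Severity = "low" | "medium" | "high" | "critical";

// Hypothetical model: one entry per confirmed finding.
interface Finding {
  title: string;
  severity: Severity;
  proofOfConcept: string;
}

interface FindingsReport {
  findings: Finding[];
  notTested: string[]; // explicit boundary of what the run did not cover
}

// Render the report as Markdown, always ending with a "Not tested" section.
function renderReport(report: FindingsReport): string {
  const lines: string[] = ["# Findings", ""];
  for (const f of report.findings) {
    lines.push(`## ${f.title}`, `Severity: ${f.severity}`, "");
    lines.push("Proof of concept:", "", `    ${f.proofOfConcept}`, "");
  }
  lines.push("## Not tested", "");
  for (const item of report.notTested) {
    lines.push(`- ${item}`);
  }
  return lines.join("\n");
}

const markdown = renderReport({
  findings: [
    {
      title: "IDOR on GET /orders/:id",
      severity: "high",
      proofOfConcept: "GET /orders/999 -> 200 OK",
    },
  ],
  notTested: ["File upload endpoints"], // illustrative value only
});
```

Keeping notTested as a required field means a report cannot be produced without stating its own coverage boundary.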

The guardrails

This is sharp enough to be dangerous, so the constraints are deliberate:

Your own app only. Pentest mode is origin-locked to your dev server. It's for testing things you're authorized to test.
You can see you're armed. The mode switch tints orange in Security mode and red in Pentest mode, so you never forget you're in an altered state.
The agent stays browser-only by default. An opt-in, read-only codeContext switch lets it read your server code to confirm a finding and cite the exact file:line — off unless you turn it on.
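
An origin lock like the one in the first guardrail can be pictured as a strict comparison of scheme, host, and port. The sketch below is an assumption about how such a check might look, not Hover's code; isInScope is a hypothetical name.

```typescript
// Returns true only when the target URL has exactly the allowed origin
// (scheme, host, and port all match). Unparseable input is rejected.
function isInScope(target: string, allowedOrigin: string): boolean {
  try {
    return new URL(target).origin === new URL(allowedOrigin).origin;
  } catch {
    return false;
  }
}

// Assumed dev-server origin for this example.
const devOrigin = "http://localhost:3000";

const sameOrigin = isInScope("http://localhost:3000/orders/999", devOrigin);
const otherPort = isInScope("http://localhost:3001/orders/999", devOrigin);
const otherHost = isInScope("https://example.com/orders/999", devOrigin);
```

Comparing parsed origins rather than string prefixes avoids a lookalike host that merely begins with the allowed one slipping through.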

I'm not going to pretend a mode switch replaces a real pentest engagement. But the gap it closes is the one that actually bites: the routine authz checks that never got written because they lived in another tool. Now they live in the chat I already have open.

Try Hover on your own app.

Install the VS Code extension. Author tests with AI, ship plain Playwright.
