Hover is an open-source VS Code extension that turns plain-English chat into end-to-end tests. AI drives your real Chrome once to explore a flow, then Hover crystallizes the verified run into a standard @playwright/test spec that runs in CI with no AI in the loop.

How is Hover different from other AI testing tools?

Many other AI test tools keep a model in the loop at runtime and regenerate the test on every run, so CI keeps paying for tokens and results can drift. Hover spends the model once, at authoring time, and the artifact it leaves behind is deterministic, human-readable @playwright/test code. Green builds never pay a recurring AI tax.

What does Hover cost to run?

Hover is free and open source. It bundles no model SDK and no API keys — it spawns the coding-agent CLI (Claude Code or OpenAI Codex) already on your PATH, running on your own subscription or API key. There is no per-token resale.

Can Hover do security testing?

Yes. The same chat flips into an API-testing mode (IDOR / authz probing that crystallizes confirmed findings into .api-test.spec.ts CI gates) and a pentest mode (offensive, white-box, own-app-only — SQLi / XSS / SSTI / SSRF — writing a findings report).

API & security testing

Hover covers API and application-security testing in the same chat as functional testing — flip a mode switch. There are two modes:

🟠 API testing — functional + business/authorization testing of your API surface. Hover routes the debug Chrome through a local HTTPS MITM and lets the agent re-issue captured API calls with mutations to verify auth / status codes / access control and probe for IDOR / authentication bypass / parameter tampering. Confirmed findings crystallize into .api-test.spec.ts regression specs that run in CI without the proxy.
🔴 Pentest — offensive, white-box testing against your own dev app: SQLi / XSS / SSTI / SSRF / open-redirect / IDOR. The output is a findings report that says what it did and didn't test. Authorized own-app testing only.

Both are taught by seeds — small probe recipes (8 access-control classes + 9 vulnerability classes). The full catalogue ships built-in.

Both modes run inside the VS Code extension: @hover-dev/api-test and @hover-dev/pentest are packed into the extension's staged engine, so flipping the mode switch is all it takes — no separate install.

Why a separate mode

Hover's normal mode is for building features. Security testing is for attacking what you just built: the agent's prompt is different ("look for authz bypass", not "test the happy path"), the browser's traffic is recorded by the local MITM proxy, and the mode switch tints orange (red for pentest) so you can never forget you're in an altered state.

How the MITM works

Zero external dependencies — no mitmproxy, no Python, no system CA install. Hover uses mockttp (the engine behind HTTP Toolkit) for HTTPS MITM, generates a one-off CA the first time it starts, and pins it to the debug Chrome via --ignore-certificate-errors-spki-list — your OS trust store stays untouched.
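
To make the pinning step concrete, here is a minimal sketch of how an SPKI fingerprint and the matching Chrome flags could be derived using only Node's built-in crypto module. It is not Hover's actual code: the helper names `spkiFingerprint` and `debugChromeArgs`, the port, and the profile path are hypothetical, and a throwaway key pair stands in for the generated CA key.

```typescript
import { createHash, generateKeyPairSync } from 'node:crypto';
import type { KeyObject } from 'node:crypto';

// Base64 SHA-256 of the DER-encoded SubjectPublicKeyInfo, which is the value
// format that --ignore-certificate-errors-spki-list expects.
function spkiFingerprint(publicKey: KeyObject): string {
  const spkiDer = publicKey.export({ type: 'spki', format: 'der' });
  return createHash('sha256').update(spkiDer).digest('base64');
}

// Hypothetical helper: flags for a Chrome instance that is routed through a
// local proxy and trusts only the pinned key.
function debugChromeArgs(proxyPort: number, fingerprint: string, userDataDir: string): string[] {
  return [
    `--proxy-server=http://127.0.0.1:${proxyPort}`,
    `--ignore-certificate-errors-spki-list=${fingerprint}`,
    // Chromium documents the SPKI switch as having no effect without a user data dir.
    `--user-data-dir=${userDataDir}`,
  ];
}

// Demo: a throwaway EC key pair stands in for the one-off CA key.
const { publicKey } = generateKeyPairSync('ec', { namedCurve: 'prime256v1' });
const fingerprint = spkiFingerprint(publicKey);
const chromeArgs = debugChromeArgs(8000, fingerprint, '/tmp/hover-debug-profile');
console.log(chromeArgs.join(' '));
```

Because the trust decision travels on the command line of one Chrome process, nothing is written to the OS trust store, which matches the behavior described above.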

The proxy is resident: it starts with the engine (transparent passthrough by default) and the single debug Chrome is launched pointed through it. There is no second browser — entering API-testing mode just flips the proxy from passthrough to recording.

Don't commit the CA

The CA private key persists under <your-project>/.hover/ca/ca.key. The shipped .gitignore includes .hover/ already; if you removed it, add it back.

Usage

In the Hover chat, flip the mode switch to 🟠 API testing.
The chat's running border turns orange to signal the altered state. No new browser opens: the same debug Chrome (already routed through the resident proxy) simply starts having its traffic recorded.
Drive the page as a normal user would — log in (mention @account to have the agent sign in), navigate, submit forms. Every HTTPS request is captured.
Ask the agent to probe, for example:
```
list_flows, then look for IDOR vulnerabilities in the order endpoints
```
The agent uses mcp__hover_dev_api_test_flows__list_flows to enumerate the API surface, get_flow to inspect specific requests, and replay_flow to test mutations.
When findings show up in the Result + Findings cards, click Save as spec to crystallise the recorded replay_flow checks into a __vibe_tests__/<slug>.api-test.spec.ts regression test that runs in CI with vanilla @playwright/test. See Save as an API-test spec.

What the agent looks for

The system prompt restricts the agent to browser-reachable vulnerability classes, in this priority order:

1. Authorisation / authentication (highest signal)

IDOR — change a resource id in a captured URL and replay. A 200 OK is the vulnerability.
Authentication bypass — drop or swap the auth header in a replay.
Parameter tampering — mutate request body fields (user_id, role, price, isAdmin) and replay.
Mass assignment — add fields the form didn't expose (admin: true, email: "victim@…") and check if they take effect.
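
The four probes above can be sketched as small builders that produce a replay_flow mutation in the shape documented in the tools section. This is an illustration only: the helper names, the example URL, and the lower-case header names are assumptions, not part of Hover's API.

```typescript
// Mirrors the documented replay_flow mutation shape.
type Mutation = {
  method?: string;
  url?: string;
  headers?: Record<string, string | null>;
  bodyText?: string;
};

// IDOR: swap one resource id in the captured URL path for another user's id.
function idorProbe(capturedUrl: string, ownId: string, otherId: string): Mutation {
  const url = new URL(capturedUrl);
  url.pathname = url.pathname
    .split('/')
    .map((segment) => (segment === ownId ? otherId : segment))
    .join('/');
  return { url: url.toString() };
}

// Authentication bypass: a null value deletes the header in the replay.
function dropAuthProbe(): Mutation {
  return { headers: { authorization: null, cookie: null } };
}

// Parameter tampering and mass assignment: alter or add fields in a JSON body.
function bodyProbe(capturedBody: string, overrides: Record<string, unknown>): Mutation {
  const original = JSON.parse(capturedBody) as Record<string, unknown>;
  return { bodyText: JSON.stringify({ ...original, ...overrides }) };
}

const idor = idorProbe('https://app.test/api/orders/1001?expand=items', '1001', '1002');
const noAuth = dropAuthProbe();
const massAssign = bodyProbe('{"name":"Ada"}', { isAdmin: true });
console.log(idor.url, noAuth.headers, massAssign.bodyText);
```

Each builder changes exactly one thing, so a surprising response can be attributed to that single change.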

2. Frontend / browser-side issues

XSS — inject <script>, javascript:, or onerror= into URL params, form inputs, and postMessage handlers.
Open redirects — find URL params that control redirect targets.
DOM clobbering / prototype pollution — only flagged when the agent can demonstrate concrete impact, not theoretical surface.
Missing security headers — CSP, X-Frame-Options, HSTS, SameSite cookies.
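
As an illustration of the security-header check, the sketch below reports which of the headers named above are absent from a response. The `HeaderMap` type and the function name are hypothetical, and treating a CSP `frame-ancestors` directive as a substitute for X-Frame-Options is a design choice of this sketch, not something the source specifies.

```typescript
// Response headers as a simple name-to-value map (an assumed shape).
type HeaderMap = Record<string, string>;

function missingSecurityHeaders(headers: HeaderMap): string[] {
  // Header names are case-insensitive, so normalize before looking them up.
  const normalized = new Map(
    Object.entries(headers).map(([name, value]) => [name.toLowerCase(), value] as const),
  );
  const csp = normalized.get('content-security-policy') ?? '';
  const missing: string[] = [];
  if (!csp) missing.push('content-security-policy');
  // frame-ancestors in the CSP covers the same clickjacking concern.
  if (!normalized.has('x-frame-options') && !/\bframe-ancestors\b/i.test(csp)) {
    missing.push('x-frame-options');
  }
  if (!normalized.has('strict-transport-security')) missing.push('strict-transport-security');
  return missing;
}

const bare = missingSecurityHeaders({ 'Content-Type': 'text/html' });
const partial = missingSecurityHeaders({
  'Content-Security-Policy': "default-src 'self'; frame-ancestors 'none'",
});
console.log(bare, partial);
```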

3. Privacy / data exposure

PII in URL query strings (email, name, phone in GET params).
Cookies without Secure / HttpOnly / SameSite when carrying session data.
Third-party requests carrying user data before consent was granted.
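
The cookie and query-string checks lend themselves to a short sketch. The function names and the list of PII-like parameter names are assumptions chosen for illustration; a real check would be tuned to the application.

```typescript
// Returns the hardening attributes absent from one Set-Cookie header value.
function missingCookieFlags(setCookie: string): string[] {
  const attributes = setCookie
    .split(';')
    .slice(1) // the first segment is name=value, not an attribute
    .map((part) => part.trim().split('=')[0].toLowerCase());
  return ['secure', 'httponly', 'samesite'].filter((flag) => !attributes.includes(flag));
}

// Assumed list of parameter names that suggest PII.
const PII_PARAM_NAMES = ['email', 'name', 'phone'];

// Returns the query parameters whose names look like PII.
function piiQueryParams(requestUrl: string): string[] {
  const params = new URL(requestUrl).searchParams;
  return Array.from(params.keys()).filter((key) => PII_PARAM_NAMES.includes(key.toLowerCase()));
}

const weakCookie = missingCookieFlags('sid=abc123; Path=/; HttpOnly');
const strongCookie = missingCookieFlags('sid=abc123; Secure; HttpOnly; SameSite=Lax');
const leaked = piiQueryParams('https://app.test/search?email=a%40b.c&page=2');
console.log(weakCookie, strongCookie, leaked);
```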

Scope boundaries

In 🟠 API testing mode (the MITM-replay functional + business/authz mode described above), the agent will refuse to attempt:

SQL injection, SSRF, command injection, deserialisation attacks — out of scope for this authz-focused MITM-replay mode. (These offensive classes live in 🔴 Pentest mode instead — run against your own dev app.)
Automated fuzzing loops — API-testing mode stays surgical: one hypothesis, one targeted replay, one observation.
Modifying CSP / cookie settings before testing — the application is probed as deployed.
Real-user-data exfiltration — this is a dev environment; the agent uses placeholder ids when demonstrating an issue.

For offensive SQLi / XSS / SSTI / SSRF / open-redirect / IDOR testing, switch to 🔴 Pentest mode — it runs those checks against your own dev app and produces a findings report. For deep server-side fuzzing, an actual server-side scanner (sqlmap, ZAP active scan, etc.) remains the right tool.

Tools available to the agent in API-testing mode

| Tool | Purpose |
| --- | --- |
| `list_flows()` | Enumerate captured HTTP flows (no bodies, just method / url / status / mutation marker). |
| `get_flow(id)` | Full request + response headers + body for one flow. |
| `replay_flow(id, mutation?)` | Re-issue a captured flow with optional method / url / headers / body overrides. The new flow is added to the store with its own id. |
| `clear_flows()` | Drop captured flows between probe rounds. |
| `mcp__playwright__*` | Standard browser-driving tools: navigate, click, fill, screenshot, evaluate. |

Mutations to replay_flow use a small JSON shape:

```ts
{
  method?: string;                              // override HTTP method
  url?: string;                                 // override URL (typical IDOR test)
  headers?: Record<string, string | null>;      // overrides; null deletes
  bodyText?: string;                            // replace UTF-8 body
}
```

The shape mirrors the agent-facing MCP schema, so what you see in the docs is what the agent receives in its tool catalogue.
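
To show how such a mutation might be applied to a captured request, here is a minimal sketch. The `CapturedRequest` type and `applyMutation` function are hypothetical and do not describe Hover's internal replay code; the one rule taken from the shape above is that a `null` header value deletes the header.

```typescript
type Mutation = {
  method?: string;
  url?: string;
  headers?: Record<string, string | null>;
  bodyText?: string;
};

// Assumed minimal view of a captured request.
type CapturedRequest = {
  method: string;
  url: string;
  headers: Record<string, string>;
  bodyText?: string;
};

function applyMutation(captured: CapturedRequest, mutation: Mutation = {}): CapturedRequest {
  // Copy with lower-cased names so overrides match regardless of original casing.
  const headers: Record<string, string> = {};
  for (const [name, value] of Object.entries(captured.headers)) {
    headers[name.toLowerCase()] = value;
  }
  for (const [name, value] of Object.entries(mutation.headers ?? {})) {
    const key = name.toLowerCase();
    if (value === null) delete headers[key]; // null deletes
    else headers[key] = value; // anything else overrides
  }
  return {
    method: mutation.method ?? captured.method,
    url: mutation.url ?? captured.url,
    headers,
    bodyText: mutation.bodyText ?? captured.bodyText,
  };
}

const captured: CapturedRequest = {
  method: 'GET',
  url: 'https://app.test/api/orders/1001',
  headers: { Authorization: 'Bearer user-a-token', Accept: 'application/json' },
};
const replayed = applyMutation(captured, {
  url: 'https://app.test/api/orders/1002',
  headers: { authorization: null },
});
console.log(replayed);
```

The captured request is left untouched, which fits the documented behavior that a replay is stored as a new flow with its own id.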

Reporting style

When the agent finishes, findings render in a colour-coded Findings card next to the Result card. The agent uses these markers in its ## Findings block:

Bug — concrete vulnerability with reproducible impact. Red.
Minor — weak hardening, no immediate exploit (e.g. missing header). Amber.
(no marker) — informational observation. Neutral.
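
The mapping from marker to colour can be sketched as a small classifier. How the markers are actually written in the ## Findings block is not specified here, so this sketch assumes each finding is one line that may begin with the literal word Bug or Minor; the type and function names are hypothetical.

```typescript
type Severity = 'bug' | 'minor' | 'info';

// Colours as described for the Findings card.
const SEVERITY_COLOUR: Record<Severity, string> = {
  bug: 'red',
  minor: 'amber',
  info: 'neutral',
};

// Assumption: the marker, if present, is the first word after any list bullet.
function classifyFinding(line: string): Severity {
  const text = line.replace(/^[-*\s]+/, '');
  if (/^bug\b/i.test(text)) return 'bug';
  if (/^minor\b/i.test(text)) return 'minor';
  return 'info';
}

const findings = [
  '- Bug: GET /api/orders/1002 returned 200 for another user',
  '- Minor: Strict-Transport-Security header is missing',
  '- Session tokens are JWTs',
];
const colours = findings.map((line) => SEVERITY_COLOUR[classifyFinding(line)]);
console.log(colours);
```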

Crystallized output

Spec output looks like:

```ts
// __vibe_tests__/orders-idor-victim-can-view-their-own.api-test.spec.ts
import { test, expect } from '@playwright/test';

test('User A cannot read User B order', async ({ page, request }) => {
  // Log in as User A
  await page.goto('/login');
  await page.getByLabel('Email').fill('userA@example.com');
  await page.getByLabel('Password').fill('test-password');
  await page.getByRole('button', { name: 'Sign in' }).click();

  // Attempt to read User B's order
  const res = await request.get('/api/orders/userB-1', {
    headers: { cookie: (await page.context().cookies()).map(c => `${c.name}=${c.value}`).join('; ') },
  });

  expect(res.status()).toBe(403);
});
```

The MITM proxy is not part of this spec. The replay primitive lives in Playwright's own request fixture — CI runs this with vanilla @playwright/test, no Hover, no @hover-dev/api-test, no mockttp.

Implementation primer

For contributors who want to extend the plugin or write a similar one (@hover-dev/api-test is built on the same plugin API as any optional Hover mode):

packages/api-test/src/mitm/ — mockttp lifecycle (CA generation, FlowStore, proxy wrapper, replay primitives).
packages/api-test/src/control-plane.ts — loopback HTTP API the MCP server talks to (Bearer-token auth on a process-random secret).
packages/api-test/src/mcp/server.ts — the stdio MCP server using @modelcontextprotocol/sdk. Tool descriptions explicitly mention IDOR / authz-bypass / parameter-tampering use cases so the agent picks the right one.
packages/api-test/src/index.ts — the plugin manifest itself.
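
For readers unfamiliar with the control-plane pattern, the following is a minimal sketch of Bearer-token checking against a process-random secret, using only Node's crypto module. It is an assumption about how such a check could look, not the code in control-plane.ts; `isAuthorized` and the secret length are hypothetical.

```typescript
import { createHash, randomBytes, timingSafeEqual } from 'node:crypto';

// A fresh secret per process, so a token from an earlier run is useless.
const secret = randomBytes(32).toString('hex');

function isAuthorized(authorizationHeader: string | undefined, expected: string): boolean {
  if (authorizationHeader === undefined || !authorizationHeader.startsWith('Bearer ')) {
    return false;
  }
  const presented = authorizationHeader.slice('Bearer '.length);
  // Hash both sides so timingSafeEqual always compares equal-length buffers.
  const presentedDigest = createHash('sha256').update(presented).digest();
  const expectedDigest = createHash('sha256').update(expected).digest();
  return timingSafeEqual(presentedDigest, expectedDigest);
}

console.log(isAuthorized(`Bearer ${secret}`, secret));
```

A server built around this check would also bind to 127.0.0.1 only, so that the API is reachable from the local MCP server and nowhere else.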

Limitations (honest)

Service workers — Playwright's page.route() historically can't see SW-mediated requests. The MITM proxy bypasses this (it's at the network layer, not the renderer layer), so capture works fine. But if your saved spec relies on observing a SW-routed request, you'll need to express it as page.request.fetch() (which goes around the SW) rather than page.route().
HTTP/3 / QUIC — Chrome will quietly downgrade through the proxy. Not visible as h3 in the captured flow list.
Cross-origin iframes — captured, but the flow list is currently flattened; correlating which iframe a flow came from is future work.
Session recording for API-test sessions — click→spec semantics don't apply to a network-probing session. Instead, the agent calls replay_flow({ intent, expectStatus }) to record API/authz checks, and Save as spec crystallises them into a Playwright regression spec. See Save as an API-test spec.