Architecture

A condensed extract of the design constraints. For implementation depth, see CLAUDE.md at the repo root.

The flow

Hover is MCP-first. The authoring engine is an MCP server you add to your own coding agent; that agent is the intelligence, and Hover guarantees fidelity at the output. The full path of one test_app run:

your coding agent (Claude Code / Cursor / …)
  ↕ MCP (stdio)
@hover-dev/mcp  ──  grounded actuation tools + crystallize_spec
  │   (a thin frontend over the engine)
@hover-dev/core  ──  the engine: launch/connect debug Chrome, grounded
  │                   locate + actuate, deterministic crystallize, .hover/ memory
  ↕ CDP
isolated debug Chrome
  ↕ DOM
your dev server / app
       │
       └─ crystallize_spec ──▶ __vibe_tests__/<flow>.spec.ts   (plain Playwright, no AI)

The roles split cleanly:

@hover-dev/mcp — the thin MCP frontend. Registers the grounded tools (click_control / fill_control / …), browser_navigate / browser_snapshot, crystallize_spec, the recall / record_fact memory tools, and the test_app prompt. Spawns no agent of its own.
@hover-dev/core — the engine. Owns the Playwright CDP preflight, the grounded locate-and-actuate over CDP, the deterministic writeSpec crystallizer, and the .hover/ memory loaders.
The VS Code extension — an engine-free review cockpit: a Business Map graph + a Dashboard. It reviews; it drives no agent.

Boundary constraints

These are load-bearing — several are non-obvious:

The agent never launches its own Chromium. The engine connects to a debug Chrome on the CDP port via connectOverCDP and picks the existing context / page whose URL matches the target origin.
The engine is allowed to spawn one specific Chrome: the isolated debug Chrome under <tmpdir>/hover-chrome, launched on demand. It is not the user's primary Chrome profile.
Grounded actuation is the fidelity mechanism. The agent acts through click_control / fill_control / select_control / check_control, which take a grounded target (role+name → testId → text) read off the snapshot and run it via page.getByRole(...) over CDP. The selector that drives the action is the one crystallized — so record == replay.
Crystallization is deterministic. crystallize_spec translates the recorded grounded steps to Playwright with no LLM at code-emit time. The agent never freehand-writes the spec.
No upload path. Hover bundles no model and no keys (BYO-CLI). Your agent talks to its own provider; the engine has no LLM SDK code and no telemetry.
Generated Playwright code prefers page.getByRole / getByText over CSS / XPath selectors.
Cookies / localStorage never transit the engine; auth state stays inside the browser and is handled by Playwright in-process.

Why isolated debug Chrome, not the user's normal browser?

Hover deliberately does not attach to the user's primary Chrome profile. Doing so would require relaunching the everyday browser with --remote-debugging-port and would expose every tab, cookie, and extension to whatever the agent does. The trade-off is honest: you log into the app once inside the debug Chrome, but the profile dir at <tmpdir>/hover-chrome persists across runs.

Why grounded tools instead of a free-form "click the button"?

A loose element description doesn't round-trip to a replayable selector — it crystallizes as a confabulated getByText that may not match next time. The grounded tools take a target read straight off the snapshot, so the saved selector is the one that actually drove the run. This is the whole reason the MCP exposes its own actuation tools rather than handing the agent a generic browser MCP.