From 9d63844f3a28fde70b19500422f17379e99e588a Mon Sep 17 00:00:00 2001
From: main
Date: Fri, 20 Mar 2026 16:00:30 -0400
Subject: Refound Spinner as an austere frontier ledger

---
 docs/architecture.md | 578 ++++++++++-----------------------------------------
 1 file changed, 106 insertions(+), 472 deletions(-)

(limited to 'docs/architecture.md')

diff --git a/docs/architecture.md b/docs/architecture.md
index e274ad5..2882c72 100644
--- a/docs/architecture.md
+++ b/docs/architecture.md
@@ -1,9 +1,8 @@
 # Fidget Spinner Architecture
 
-## Current Shape
+## Runtime Shape
 
-The current MVP implementation is intentionally narrower than the eventual full
-product:
+The current runtime is intentionally simple and hardened:
 
 ```text
 agent host
@@ -22,21 +21,19 @@ spinner MCP host
  +-- disposable MCP worker
  | |
  | +-- per-project SQLite store
- | +-- per-project blob directory
- | +-- git/worktree introspection
- | +-- atomic experiment closure
+ | +-- frontier / hypothesis / experiment / artifact services
+ | +-- navigator projections
  |
  v
 /.fidget_spinner/
 ```
 
-There is no long-lived daemon yet. The first usable slice runs MCP from the CLI
-binary, but it already follows the hardened host/worker split required for
-long-lived sessions and safe replay behavior.
+There is no long-lived daemon yet. The CLI binary owns the stdio host and the
+local navigator.
 
 ## Package Boundary
 
-The package currently contains three coupled layers:
+The package contains three coupled crates:
 
 - `fidget-spinner-core`
 - `fidget-spinner-store-sqlite`
@@ -47,7 +44,7 @@ And two bundled agent assets:
 - `assets/codex-skills/fidget-spinner/SKILL.md`
 - `assets/codex-skills/frontier-loop/SKILL.md`
 
-Those parts should be treated as one release unit.
+These are one release unit.
 
 ## Storage Topology
 
@@ -56,524 +53,161 @@ Every initialized project owns a private state root:
 ```text
 /.fidget_spinner/
  project.json
- schema.json
  state.sqlite
- blobs/
 ```
 
 Why this shape:
 
-- schema freedom stays per project
 - migrations stay local
 - backup and portability stay simple
-- we avoid premature pressure toward a single global schema
+- no global store is required
+- git remains the code substrate instead of being mirrored into Spinner
 
-Cross-project search can come later as an additive index.
+## Canonical Types
 
-## State Layers
+### Frontier
 
-### 1. Global engine spine
+Frontier is a scope and grounding object, not a graph vertex.
 
-The engine depends on a stable, typed spine stored in SQLite:
+It owns:
 
-- nodes
-- node annotations
-- node edges
-- frontiers
-- runs
-- metrics
-- experiments
-- event log
-
-This layer powers traversal, indexing, archiving, and frontier projection.
-
-### 2. Project payload layer
-
-Each node stores a project payload as JSON, namespaced and versioned by the
-project schema in `.fidget_spinner/schema.json`.
-
-This is where domain-specific richness lives.
-
-Project field specs may optionally declare a light-touch `value_type` of:
-
-- `string`
-- `numeric`
-- `boolean`
-- `timestamp`
-
-These are intentionally soft hints for validation and rendering, not rigid
-engine-schema commitments.
-
-### 3. Annotation sidecar
-
-Annotations are stored separately from payload and are default-hidden unless
-explicitly surfaced.
-
-That separation is important. It prevents free-form scratch text from silently
-mutating into a shadow schema.
-
-## Validation Model
-
-Validation has three tiers.
-
-### Storage validity
-
-Hard-fail conditions:
-
-- malformed engine envelope
-- broken ids
-- invalid enum values
-- broken relational integrity
-
-### Semantic quality
-
-Project field expectations are warning-heavy:
-
-- missing recommended fields emit diagnostics
-- missing projection-gated fields remain storable
-- mistyped typed fields emit diagnostics
-- ingest usually succeeds
-
-### Operational eligibility
-
-Specific actions may refuse incomplete records.
-
-Examples:
-
-- core-path experiment closure requires complete run/result/note/verdict state
-- future promotion helpers may require a projection-ready hypothesis payload
+- label
+- objective
+- status
+- brief
 
-## SQLite Schema
+And it partitions hypotheses and experiments.
 
-### `nodes`
+### Hypothesis
 
-Stores the global node envelope:
+Hypothesis is a true graph vertex. It carries:
 
-- id
-- class
-- track
-- frontier id
-- archived flag
 - title
 - summary
-- schema namespace
-- schema version
-- payload JSON
-- diagnostics JSON
-- agent session id
-- timestamps
-
-### `node_annotations`
-
-Stores sidecar free-form annotations:
-
-- annotation id
-- owning node id
-- visibility
-- optional label
-- body
-- created timestamp
-
-### `node_edges`
-
-Stores typed DAG edges:
-
-- source node id
-- target node id
-- edge kind
+- exactly one paragraph of body
+- tags
+- influence parents
 
-The current edge kinds are enough for the MVP:
+### Experiment
 
-- `lineage`
-- `evidence`
-- `comparison`
-- `supersedes`
-- `annotation`
+Experiment is also a true graph vertex. It carries:
 
-### `frontiers`
-
-Stores derived operational frontier records:
-
-- frontier id
-- label
-- root contract node id
+- one mandatory owning hypothesis
+- optional influence parents
+- title
+- summary
+- tags
 - status
-- timestamps
-
-Important constraint:
-
-- the root contract node itself also carries the same frontier id
-
-That keeps frontier filtering honest.
-
-### `runs`
+- outcome when closed
 
-Stores run envelopes:
+The outcome contains:
 
-- run id
-- run node id
-- frontier id
 - backend
-- status
-- run dimensions
 - command envelope
-- started and finished timestamps
-
-### `metrics`
-
-Stores primary and supporting run metrics:
-
-- run id
-- metric key
-- value
-- unit
-- optimization objective
-
-### `experiments`
-
-Stores the atomic closure object for core-path work:
-
-- experiment id
-- frontier id
-- hypothesis node id
-- run node id and run id
-- optional analysis node id
-- decision node id
-- title
-- summary
+- run dimensions
+- primary metric
+- supporting metrics
 - verdict
-- note payload
-- created timestamp
-
-This table is the enforcement layer for frontier discipline.
-
-### `events`
-
-Stores durable audit events:
-
-- event id
-- entity kind
-- entity id
-- event kind
-- payload
-- created timestamp
-
-## Core Types
-
-### Node classes
-
-Core path:
-
-- `contract`
-- `hypothesis`
-- `run`
-- `analysis`
-- `decision`
-
-Off path:
-
-- `source`
-- `source`
-- `note`
-
-### Node tracks
-
-- `core_path`
-- `off_path`
-
-Track is derived from class, not operator whim.
-
-### Frontier projection
-
-The frontier projection currently exposes:
-
-- frontier record
-- open experiment count
-- completed experiment count
-- verdict counts
-
-This projection is derived from canonical state and intentionally rebuildable.
-
-## Write Surfaces
-
-### Low-ceremony off-path writes
-
-These are intentionally cheap:
-
-- `note.quick`, but only with explicit tags from the repo-local registry
-- `source.record`, optionally tagged into the same repo-local taxonomy
-- generic `node.create` for escape-hatch use
-- `node.annotate`
-
-### Low-ceremony core-path entry
-
-`hypothesis.record` exists to capture intent before worktree state becomes muddy.
-
-### Atomic core-path closure
-
-`experiment.close` is the important write path.
-
-It persists, in one transaction:
-
-- run node
-- run record
-- decision node
-- experiment record
-- lineage and evidence edges
-- frontier touch and verdict accounting inputs
-
-That atomic boundary is the answer to the ceremony/atomicity pre-mortem.
+- rationale
+- optional analysis
 
-## MCP Surface
+### Artifact
 
-The MVP MCP server is stdio-only and follows newline-delimited JSON-RPC message
-framing. The public server is a stable host. It owns initialization state,
-replay policy, telemetry, and host rollout. Execution happens in a disposable
-worker subprocess.
+Artifact is metadata plus a locator for an external thing. It attaches to
+frontiers, hypotheses, and experiments. Spinner never reads or stores the
+artifact body.
 
-Presentation is orthogonal to payload detail:
+## Graph Semantics
 
-- `render=porcelain|json`
-- `detail=concise|full`
+Two relations matter:
 
-Porcelain is the terse model-facing surface, not a pretty-printed JSON dump.
+### Ownership
 
-### Host responsibilities
+Every experiment has exactly one owning hypothesis.
 
-- own the public JSON-RPC session
-- enforce initialize-before-use
-- classify tools and resources by replay contract
-- retry only explicitly safe operations after retryable worker faults
-- expose health and telemetry
-- re-exec the host binary while preserving initialization seed and counters
+This is the canonical tree spine.
 
-### Worker responsibilities
+### Influence
 
-- open the per-project store
-- execute tool logic and resource reads
-- return typed success or typed fault records
-- remain disposable without losing canonical state
+Hypotheses and experiments may both cite later hypotheses or experiments as
+influence parents.
 
-## Minimal Navigator
+This is the sparse DAG over the canonical tree.
 
-The CLI also exposes a minimal localhost navigator through `ui serve`.
+The product should make the ownership spine easy to read and the influence
+network available without flooding the hot path.
 
-Current shape:
+## SQLite Shape
 
-- left rail of repo-local tags
-- single linear node feed in reverse chronological order
-- full entry rendering in the main pane
-- lightweight hyperlinking for text fields
-- typed field badges for `string`, `numeric`, `boolean`, and `timestamp`
+The store is normalized around the new ontology:
 
-This is intentionally not a full DAG canvas. It is a text-first operator window
-over the canonical store.
+- `frontiers`
+- `frontier_briefs`
+- `hypotheses`
+- `experiments`
+- `vertex_influences`
+- `artifacts`
+- `artifact_attachments`
+- `metric_definitions`
+- `run_dimension_definitions`
+- `experiment_metrics`
+- `events`
 
-## Binding Bootstrap
+The important boundary is this:
 
-`project.bind` may bootstrap a project store when the requested target root is
-an existing empty directory.
+- hypotheses and experiments are the scientific ledger
+- artifacts are reference sidecars
+- frontier projections are derived
 
-That is intentionally narrow:
+## Presentation Model
 
-- empty root: initialize and bind
-- non-empty uninitialized root: fail
-- existing store anywhere above the requested path: bind to that discovered root
+The system is designed to be hostile to accidental context burn.
 
-### Fault model
+`frontier.open` is the only sanctioned overview dump. It should be enough to
+answer:
 
-Faults are typed by:
+- where the frontier stands
+- which tags are active
+- which metrics are live
+- which hypotheses are active
+- which experiments are open
 
-- kind: `invalid_input`, `not_initialized`, `transient`, `internal`
-- stage: `host`, `worker`, `store`, `transport`, `protocol`, `rollout`
+Everything after that should require deliberate traversal:
 
-Those faults are surfaced both as JSON-RPC errors and as structured tool
-errors, depending on call type.
+- `hypothesis.read`
+- `experiment.read`
+- `artifact.read`
 
-### Replay contracts
+Artifact reads stay metadata-only by design.
 
-The tool catalog explicitly marks each operation as one of:
+## Replay Model
 
-- `safe_replay`
-- `never_replay`
+The MCP host owns:
 
-Current policy:
+- the public JSON-RPC session
+- initialize-before-use semantics
+- replay contracts
+- health and telemetry
+- host rollout
 
-- reads such as `project.status`, `project.schema`, `tag.list`, `frontier.list`,
- `frontier.status`, `node.list`, `node.read`, `skill.list`, `skill.show`, and
- resource reads
- are safe to replay once after a retryable worker fault
-- mutating tools such as `tag.add`, `frontier.init`, `node.create`, `hypothesis.record`,
- `node.annotate`, `node.archive`, `note.quick`, `source.record`, and
- `experiment.close` are never auto-replayed
+The worker owns:
 
-This is the hardening answer to side-effect safety.
+- project-store access
+- tool execution
+- typed success and fault results
 
-Implemented server features:
+Reads and safe operational surfaces may be replayed after retryable worker
+faults. Mutating operations are never auto-replayed unless they are explicitly
+designed to be safe.
 
-- tools
-- resources
+## Navigator
 
-### Tools
+The local navigator mirrors the same philosophy:
 
-Implemented tools:
+- root page lists frontiers
+- frontier page is the only overview page
+- hypothesis and experiment pages are detail reads
+- artifacts are discoverable but never expanded into body dumps
 
-- `system.health`
-- `system.telemetry`
-- `project.bind`
-- `project.status`
-- `project.schema`
-- `schema.field.upsert`
-- `schema.field.remove`
-- `tag.add`
-- `tag.list`
-- `frontier.list`
-- `frontier.status`
-- `frontier.init`
-- `node.create`
-- `hypothesis.record`
-- `node.list`
-- `node.read`
-- `node.annotate`
-- `node.archive`
-- `note.quick`
-- `source.record`
-- `metric.define`
-- `metric.keys`
-- `metric.best`
-- `metric.migrate`
-- `run.dimension.define`
-- `run.dimension.list`
-- `experiment.close`
-- `skill.list`
-- `skill.show`
-
-### Resources
-
-Implemented resources:
-
-- `fidget-spinner://project/config`
-- `fidget-spinner://project/schema`
-- `fidget-spinner://skill/fidget-spinner`
-- `fidget-spinner://skill/frontier-loop`
-
-### Operational tools
-
-`system.health` returns a typed operational snapshot. Concise/default output
-stays on immediate session state; full detail widens to the entire health
-object:
-
-- initialization state
-- binding state
-- worker generation and liveness
-- current executable path
-- launch-path stability
-- rollout-pending state
-- last recorded fault in full detail
-
-`system.telemetry` returns cumulative counters:
-
-- requests
-- successes
-- errors
-- retries
-- worker restarts
-- host rollouts
-- last recorded fault
-- per-operation counts and last latencies
-
-### Rollout model
-
-The host fingerprints its executable at startup. If the binary changes on disk,
-or if a rollout is explicitly requested, the host re-execs itself after sending
-the current response. The re-exec carries forward:
-
-- initialization seed
-- project binding
-- telemetry counters
-- request id sequence
-- worker generation
-- one-shot rollout and crash-test markers
-
-This keeps the public session stable while still allowing hot binary replacement.
-
-## CLI Surface
-
-The CLI remains thin and operational.
-
-Current commands:
-
-- `init`
-- `schema show`
-- `schema upsert-field`
-- `schema remove-field`
-- `frontier init`
-- `frontier status`
-- `node add`
-- `node list`
-- `node show`
-- `node annotate`
-- `node archive`
-- `note quick`
-- `tag add`
-- `tag list`
-- `source add`
-- `metric define`
-- `metric keys`
-- `metric best`
-- `metric migrate`
-- `dimension define`
-- `dimension list`
-- `experiment close`
-- `mcp serve`
-- `ui serve`
-- hidden internal `mcp worker`
-- `skill list`
-- `skill install`
-- `skill show`
-
-The CLI is not the strategic write plane, but it is the easiest repair and
-bootstrap surface. Its naming is intentionally parallel but not identical to
-the MCP surface:
-
-- CLI subcommands use spaces such as `schema upsert-field` and `dimension define`
-- MCP tools use dotted names such as `schema.field.upsert` and `run.dimension.define`
-
-## Bundled Skill
-
-The bundled `fidget-spinner` and `frontier-loop` skills should
-be treated as part of the product, not stray prompts.
-
-Their job is to teach agents:
-
-- DAG first
-- schema first
-- cheap off-path pushes
-- disciplined core-path closure
-- archive rather than delete
-- and, for the frontier-loop specialization, how to run an indefinite push
-
-The asset lives in-tree so it can drift only via an explicit code change.
-
-## Full-Product Trajectory
-
-The full product should add, not replace, the MVP implementation.
-
-Planned next layers:
-
-- `spinnerd` as a long-lived local daemon
-- HTTP and SSE
-- read-mostly local UI
-- runner orchestration beyond direct process execution
-- interruption recovery and resumable long loops
-- archive and pruning passes
-- optional cross-project indexing
-
-The invariant for that future work is strict:
-
-- keep the DAG canonical
-- keep frontier state derived
-- keep project payloads local and flexible
-- keep off-path writes cheap
-- keep core-path closure atomic
-- keep host-owned replay contracts explicit and auditable
+The UI should help a model or operator walk the graph conservatively, not tempt
+it into giant all-history feeds.
-- 
cgit v1.2.3
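
Reviewer note on the Graph Semantics section added by this patch: the two relations (one owning hypothesis per experiment, plus a sparse influence DAG over hypotheses and experiments) can be pictured with a minimal in-memory sketch. Everything below is hypothetical and illustrative only: `Ledger`, `add_hypothesis`, `add_experiment`, and `add_influence` are invented names, not Spinner's API, and the cycle check is an assumption inferred from the patch describing influence as a DAG.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Ledger:
    """Hypothetical model of the ownership tree and influence DAG."""

    hypotheses: set[str] = field(default_factory=set)
    # experiment id -> its single owning hypothesis (the "tree spine")
    owner: dict[str, str] = field(default_factory=dict)
    # vertex id -> set of influence parents (the sparse DAG)
    influences: dict[str, set[str]] = field(default_factory=dict)

    def add_hypothesis(self, hid: str) -> None:
        self.hypotheses.add(hid)

    def add_experiment(self, eid: str, owning_hypothesis: str) -> None:
        # Ownership: exactly one owning hypothesis, fixed at creation.
        if owning_hypothesis not in self.hypotheses:
            raise ValueError(f"unknown hypothesis: {owning_hypothesis}")
        if eid in self.owner:
            raise ValueError(f"experiment already exists: {eid}")
        self.owner[eid] = owning_hypothesis

    def _reaches(self, start: str, target: str) -> bool:
        # Depth-first walk along influence-parent edges.
        stack, seen = [start], set()
        while stack:
            vertex = stack.pop()
            if vertex == target:
                return True
            if vertex in seen:
                continue
            seen.add(vertex)
            stack.extend(self.influences.get(vertex, ()))
        return False

    def add_influence(self, child: str, parent: str) -> None:
        known = self.hypotheses | set(self.owner)
        if child not in known or parent not in known:
            raise ValueError("both endpoints must be hypotheses or experiments")
        # Assumed rule: reject any edge that would close a cycle.
        if self._reaches(parent, child):
            raise ValueError("influence edge would create a cycle")
        self.influences.setdefault(child, set()).add(parent)


ledger = Ledger()
ledger.add_hypothesis("h1")
ledger.add_hypothesis("h2")
ledger.add_experiment("e1", "h1")   # e1 is owned by h1
ledger.add_influence("h2", "e1")    # h2 cites e1 as an influence parent
```

Keeping ownership in a plain one-to-one map and influence in a separate adjacency map mirrors the patch's distinction between the canonical tree spine and the optional influence network; a real store would presumably enforce the same rules in SQLite rather than in memory.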