| field | value | date |
|---|---|---|
| author | main <main@swarm.moe> | 2026-03-20 01:11:39 -0400 |
| committer | main <main@swarm.moe> | 2026-03-20 01:11:39 -0400 |
| commit | 22fe3d2ce7478450a1d7443c4ecbd85fd4c46716 | |
| tree | d534d4585a804081b53fcf2f3bbb3a8fc5d29190 /docs | |
| parent | ce41a229dcd57f9a2c35359fe77d9f54f603e985 | |
| download | fidget_spinner-22fe3d2ce7478450a1d7443c4ecbd85fd4c46716.zip | |

Excise git provenance from experiment ledger
Diffstat (limited to 'docs')

| mode | file | lines changed |
|---|---|---|
| -rw-r--r-- | docs/architecture.md | 34 |
| -rw-r--r-- | docs/libgrid-dogfood.md | 43 |
| -rw-r--r-- | docs/product-spec.md | 21 |

3 files changed, 31 insertions, 67 deletions
diff --git a/docs/architecture.md b/docs/architecture.md
index 30d01fc..e274ad5 100644
--- a/docs/architecture.md
+++ b/docs/architecture.md
@@ -80,7 +80,6 @@ The engine depends on a stable, typed spine stored in SQLite:
 - node annotations
 - node edges
 - frontiers
-- checkpoints
 - runs
 - metrics
 - experiments
@@ -207,23 +206,6 @@ Important constraint:

 That keeps frontier filtering honest.

-### `checkpoints`
-
-Stores committed candidate or champion checkpoints:
-
-- checkpoint id
-- frontier id
-- anchoring node id
-- repo/worktree metadata
-- commit hash
-- disposition
-- summary
-- created timestamp
-
-In the current codebase, a frontier may temporarily exist without a champion if
-it was initialized outside a git repo. Core-path experimentation is only fully
-available once git-backed checkpoints exist.
-
 ### `runs`

 Stores run envelopes:
@@ -233,8 +215,7 @@ Stores run envelopes:
 - frontier id
 - backend
 - status
-- code snapshot metadata
-- benchmark suite
+- run dimensions
 - command envelope
 - started and finished timestamps

@@ -254,12 +235,12 @@ Stores the atomic closure object for core-path work:

 - experiment id
 - frontier id
-- base checkpoint id
-- candidate checkpoint id
 - hypothesis node id
 - run node id and run id
 - optional analysis node id
 - decision node id
+- title
+- summary
 - verdict
 - note payload
 - created timestamp
@@ -307,9 +288,9 @@ Track is derived from class, not operator whim.
 The frontier projection currently exposes:

 - frontier record
-- champion checkpoint id
-- active candidate checkpoint ids
-- experiment count
+- open experiment count
+- completed experiment count
+- verdict counts

 This projection is derived from canonical state and intentionally rebuildable.

@@ -336,11 +317,10 @@ It persists, in one transaction:

 - run node
 - run record
-- candidate checkpoint
 - decision node
 - experiment record
 - lineage and evidence edges
-- frontier touch and champion demotion when needed
+- frontier touch and verdict accounting inputs

 That atomic boundary is the answer to the ceremony/atomicity pre-mortem.

diff --git a/docs/libgrid-dogfood.md b/docs/libgrid-dogfood.md
index 59e214e..206c4d7 100644
--- a/docs/libgrid-dogfood.md
+++ b/docs/libgrid-dogfood.md
@@ -19,8 +19,8 @@ The MVP does not need to solve all of `libgrid`.
 It needs to solve this specific problem:

 replace the giant freeform experiment log with a machine in which the active
-frontier, the current champion, the candidate evidence, and the dead ends are
-all explicit and queryable.
+frontier, the accepted lines, the live evidence, and the dead ends are all
+explicit and queryable.

 When using a global unbound MCP session from a `libgrid` worktree, the first
 project-local action should be `project.bind` against the `libgrid` worktree
@@ -54,7 +54,6 @@ The root contract should state:
 Use `hypothesis.record` to capture:

 - what hypothesis is being tested
-- what base checkpoint it starts from
 - what benchmark suite matters
 - any terse sketch of the intended delta

@@ -65,19 +64,17 @@ The run node should capture:
 - exact command
 - cwd
 - backend kind
-- benchmark suite
-- code snapshot
+- run dimensions
 - resulting metrics

 ### Decision node

 The decision should make the verdict explicit:

-- promote to champion
-- keep on frontier
-- revert to champion
-- archive dead end
-- needs more evidence
+- accepted
+- kept
+- parked
+- rejected

 ### Off-path nodes

@@ -100,7 +97,6 @@ The MVP does not need hard rejection. It does need meaningful warnings.
 Good first project fields:

 - `hypothesis` on `hypothesis`
-- `base_checkpoint_id` on `hypothesis`
 - `benchmark_suite` on `hypothesis` and `run`
 - `body` on `hypothesis`, `source`, and `note`
 - `comparison_claim` on `analysis`
@@ -121,7 +117,6 @@ Good first metric vocabulary:

 1. Initialize the project store.
 2. Create a frontier contract.
-3. Capture the incumbent git checkpoint if available.

 ### 2. Start a line of attack

@@ -132,13 +127,12 @@ Good first metric vocabulary:
 ### 3. Execute one experiment

 1. Modify the worktree.
-2. Commit the candidate checkpoint.
-3. Run the benchmark protocol.
-4. Close the experiment atomically.
+2. Run the benchmark protocol.
+3. Close the experiment atomically.

 ### 4. Judge and continue

-1. Promote the checkpoint or keep it alive.
+1. Mark the line accepted, kept, parked, or rejected.
 2. Archive dead ends instead of leaving them noisy and active.
 3. Repeat.

@@ -148,12 +142,10 @@ For `libgrid`, the benchmark evidence needs to be structurally trustworthy.

 The MVP should always preserve at least:

-- benchmark suite identity
+- run dimensions
 - primary metric
 - supporting metrics
 - command envelope
-- host/worktree metadata
-- git commit identity

 This is the minimum needed to prevent "I think this was faster" folklore.

@@ -176,27 +168,28 @@ The right sequence is:
 ## Repo-Local Dogfood Before Libgrid

 This repository itself is a valid off-path dogfood target even though it is not
-currently a git repo.
+a benchmark-heavy repo.

 That means we can already use it to test:

 - project initialization
 - schema visibility
-- frontier creation without a champion
+- frontier creation and status projection
 - off-path source recording
 - hidden annotations
 - MCP read and write flows

-What it cannot honestly test is full git-backed core-path experiment closure.
-That still belongs in a real repo such as the `libgrid` worktree.
+What it cannot honestly test is heavy benchmark ingestion and the retrieval
+pressure that comes with it. That still belongs in a real optimization corpus
+such as the `libgrid` worktree.

 ## Acceptance Bar For Libgrid

 Fidget Spinner is ready for serious `libgrid` use when:

 - an agent can run for hours without generating a giant markdown graveyard
-- the operator can identify the champion checkpoint mechanically
-- each completed experiment has checkpoint, result, note, and verdict
+- the operator can identify accepted, kept, parked, and rejected lines mechanically
+- each completed experiment has result, note, and verdict
 - off-path side investigations stay preserved but do not pollute the core path
 - the system feels like a machine for evidence rather than a diary with better
   typography
diff --git a/docs/product-spec.md b/docs/product-spec.md
index efa57df..85561ad 100644
--- a/docs/product-spec.md
+++ b/docs/product-spec.md
@@ -50,7 +50,7 @@ These are the load-bearing decisions to hold fixed through the MVP push.
 The canonical record is the DAG plus its normalized supporting tables.

 Frontier state is not a rival authority. It is a derived, rebuildable
-projection over the DAG and related run/checkpoint/experiment records.
+projection over the DAG and related run/experiment records.

 ### 2. Storage is per-project

@@ -103,8 +103,6 @@ benchmark/decision bureaucracy while still preserving it in the DAG.

 A completed experiment exists only when all of these exist together:

-- base checkpoint
-- candidate checkpoint
 - measured result
 - terse note
 - explicit verdict
@@ -112,13 +110,6 @@ A completed experiment exists only when all of these exist together:
 The write surface should make that one atomic mutation, not a loose sequence of
 low-level calls.

-### 7. Checkpoints are git-backed
-
-Dirty worktree snapshots are useful as descriptive context, but a completed
-core-path experiment should anchor to a committed candidate checkpoint.
-
-Off-path notes and source captures can remain lightweight and non-committal.
-
 ## Node Model

 ### Global envelope
@@ -203,9 +194,9 @@ The frontier is a derived operational view over the canonical DAG.
 It answers:

 - what objective is active
-- what the current champion checkpoint is
-- which candidate checkpoints are still alive
-- how many completed experiments exist
+- how many experiments are open
+- how many experiments are completed
+- how the verdict mix currently breaks down

 The DAG answers:

@@ -316,10 +307,10 @@ The MVP is successful when:
 - an agent can record off-path sources and notes without bureaucratic pain
 - the project schema can softly declare whether payload fields are strings, numbers, booleans, or timestamps
 - an operator can inspect recent nodes through a minimal localhost web navigator filtered by tag
-- a git-backed project can close a real core-path experiment atomically
+- a project can close a real core-path experiment atomically
 - retryable worker faults do not duplicate side effects
 - stale nodes can be archived instead of polluting normal enumeration
-- a human can answer "what changed, what ran, what is the current champion,
+- a human can answer "what was tried, what ran, what was accepted or parked,
   and why?" without doing markdown archaeology

 ## Full Product
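
The docs changed by this commit describe two mechanisms: an experiment close that persists its records in one transaction, and a frontier projection (open count, completed count, verdict counts) that is derived from canonical state and can be rebuilt. The sketch below illustrates both with Python's standard `sqlite3` module. It is a minimal sketch, not the project's implementation. Every table name, column name, and function name is hypothetical, and treating a `NULL` verdict as "open" is an assumption. Only the four verdict values come from the docs.

```python
import sqlite3

# Hypothetical schema: table and column names are illustrative only and are
# not taken from the Fidget Spinner codebase. Assumption: an experiment with
# a NULL verdict counts as "open".
SCHEMA = """
CREATE TABLE runs (
    run_id      TEXT PRIMARY KEY,
    frontier_id TEXT NOT NULL
);
CREATE TABLE experiments (
    experiment_id TEXT PRIMARY KEY,
    frontier_id   TEXT NOT NULL,
    run_id        TEXT REFERENCES runs(run_id),
    title         TEXT NOT NULL,
    summary       TEXT NOT NULL,
    verdict       TEXT CHECK (verdict IN ('accepted', 'kept', 'parked', 'rejected'))
);
"""


def open_experiment(conn, experiment_id, frontier_id, title, summary):
    """Record an experiment that has no run or verdict yet."""
    with conn:
        conn.execute(
            "INSERT INTO experiments (experiment_id, frontier_id, title, summary) "
            "VALUES (?, ?, ?, ?)",
            (experiment_id, frontier_id, title, summary),
        )


def close_experiment(conn, experiment_id, run_id, verdict):
    """Persist the run record and the verdict together, or neither."""
    with conn:  # commits on success, rolls back if any statement raises
        row = conn.execute(
            "SELECT frontier_id FROM experiments "
            "WHERE experiment_id = ? AND verdict IS NULL",
            (experiment_id,),
        ).fetchone()
        if row is None:
            raise LookupError(f"no open experiment {experiment_id!r}")
        conn.execute(
            "INSERT INTO runs (run_id, frontier_id) VALUES (?, ?)",
            (run_id, row[0]),
        )
        # An invalid verdict violates the CHECK constraint here, which also
        # undoes the run insert above.
        conn.execute(
            "UPDATE experiments SET run_id = ?, verdict = ? WHERE experiment_id = ?",
            (run_id, verdict, experiment_id),
        )


def frontier_projection(conn, frontier_id):
    """Rebuild the frontier view from the experiment rows alone."""
    rows = conn.execute(
        "SELECT verdict, COUNT(*) FROM experiments "
        "WHERE frontier_id = ? GROUP BY verdict",
        (frontier_id,),
    ).fetchall()
    counts = {verdict: n for verdict, n in rows}
    open_count = counts.pop(None, 0)  # NULL verdicts group under None
    return {
        "open": open_count,
        "completed": sum(counts.values()),
        "verdicts": counts,
    }
```

Because the projection is computed by a query rather than stored, it cannot drift from the experiment records, which is one way to read "not a rival authority" in the product spec.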