Diffstat (limited to 'docs')
-rw-r--r-- docs/architecture.md | 34
-rw-r--r-- docs/libgrid-dogfood.md | 43
-rw-r--r-- docs/product-spec.md | 21
3 files changed, 31 insertions, 67 deletions
diff --git a/docs/architecture.md b/docs/architecture.md
index 30d01fc..e274ad5 100644
--- a/docs/architecture.md
+++ b/docs/architecture.md
@@ -80,7 +80,6 @@ The engine depends on a stable, typed spine stored in SQLite:
- node annotations
- node edges
- frontiers
-- checkpoints
- runs
- metrics
- experiments
@@ -207,23 +206,6 @@ Important constraint:
That keeps frontier filtering honest.
-### `checkpoints`
-
-Stores committed candidate or champion checkpoints:
-
-- checkpoint id
-- frontier id
-- anchoring node id
-- repo/worktree metadata
-- commit hash
-- disposition
-- summary
-- created timestamp
-
-In the current codebase, a frontier may temporarily exist without a champion if
-it was initialized outside a git repo. Core-path experimentation is only fully
-available once git-backed checkpoints exist.
-
### `runs`
Stores run envelopes:
@@ -233,8 +215,7 @@ Stores run envelopes:
- frontier id
- backend
- status
-- code snapshot metadata
-- benchmark suite
+- run dimensions
- command envelope
- started and finished timestamps
@@ -254,12 +235,12 @@ Stores the atomic closure object for core-path work:
- experiment id
- frontier id
-- base checkpoint id
-- candidate checkpoint id
- hypothesis node id
- run node id and run id
- optional analysis node id
- decision node id
+- title
+- summary
- verdict
- note payload
- created timestamp
@@ -307,9 +288,9 @@ Track is derived from class, not operator whim.
The frontier projection currently exposes:
- frontier record
-- champion checkpoint id
-- active candidate checkpoint ids
-- experiment count
+- open experiment count
+- completed experiment count
+- verdict counts
This projection is derived from canonical state and intentionally rebuildable.
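Reviewer sketch (not part of the patch): the reworked projection can be read as a pure function over experiment records, which is what makes it rebuildable. The `Experiment` dataclass, the `project_frontier` name, and the convention that an open experiment has `verdict is None` are all assumptions made for illustration; the docs do not say how open experiments are represented.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Experiment:
    # Hypothetical record shape; field names are illustrative only.
    id: str
    frontier_id: str
    verdict: Optional[str]  # None means "still open" (assumption)


def project_frontier(frontier_id, experiments):
    """Rebuild the frontier projection from canonical experiment records."""
    mine = [e for e in experiments if e.frontier_id == frontier_id]
    completed = [e for e in mine if e.verdict is not None]
    return {
        "frontier_id": frontier_id,
        "open_experiments": len(mine) - len(completed),
        "completed_experiments": len(completed),
        "verdict_counts": dict(Counter(e.verdict for e in completed)),
    }
```

Because nothing here is stored, dropping the projection and recomputing it from the records yields the same answer.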
@@ -336,11 +317,10 @@ It persists, in one transaction:
- run node
- run record
-- candidate checkpoint
- decision node
- experiment record
- lineage and evidence edges
-- frontier touch and champion demotion when needed
+- frontier touch and verdict accounting inputs
That atomic boundary is the answer to the ceremony/atomicity pre-mortem.
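Reviewer sketch (not part of the patch): with checkpoints removed, the atomic close reduces to a handful of inserts inside one SQLite transaction. The table and column names below are hypothetical and much smaller than the real spine; only the four verdict values come from the updated docs. The point shown is that a failure on any row rolls back the whole closure.

```python
import sqlite3

# Hypothetical, heavily reduced schema; the real tables are not shown in this diff.
SCHEMA = """
CREATE TABLE nodes (id TEXT PRIMARY KEY, class TEXT NOT NULL);
CREATE TABLE runs (id TEXT PRIMARY KEY, node_id TEXT NOT NULL REFERENCES nodes(id));
CREATE TABLE experiments (
    id TEXT PRIMARY KEY,
    run_id TEXT NOT NULL REFERENCES runs(id),
    decision_node_id TEXT NOT NULL REFERENCES nodes(id),
    verdict TEXT NOT NULL
        CHECK (verdict IN ('accepted', 'kept', 'parked', 'rejected'))
);
"""


def open_store():
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def close_experiment(conn, exp_id, verdict):
    """Persist run node, run record, decision node, and experiment together."""
    run_node, run_id, decision = f"{exp_id}-run-node", f"{exp_id}-run", f"{exp_id}-decision"
    with conn:  # commits on success, rolls back if any statement raises
        conn.execute("INSERT INTO nodes VALUES (?, 'run')", (run_node,))
        conn.execute("INSERT INTO runs VALUES (?, ?)", (run_id, run_node))
        conn.execute("INSERT INTO nodes VALUES (?, 'decision')", (decision,))
        conn.execute(
            "INSERT INTO experiments VALUES (?, ?, ?, ?)",
            (exp_id, run_id, decision, verdict),
        )
```

An invalid verdict fails on the last insert, and the earlier node and run rows disappear with it, so no half-closed experiment is ever visible.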
diff --git a/docs/libgrid-dogfood.md b/docs/libgrid-dogfood.md
index 59e214e..206c4d7 100644
--- a/docs/libgrid-dogfood.md
+++ b/docs/libgrid-dogfood.md
@@ -19,8 +19,8 @@ The MVP does not need to solve all of `libgrid`.
It needs to solve this specific problem:
replace the giant freeform experiment log with a machine in which the active
-frontier, the current champion, the candidate evidence, and the dead ends are
-all explicit and queryable.
+frontier, the accepted lines, the live evidence, and the dead ends are all
+explicit and queryable.
When using a global unbound MCP session from a `libgrid` worktree, the first
project-local action should be `project.bind` against the `libgrid` worktree
@@ -54,7 +54,6 @@ The root contract should state:
Use `hypothesis.record` to capture:
- what hypothesis is being tested
-- what base checkpoint it starts from
- what benchmark suite matters
- any terse sketch of the intended delta
@@ -65,19 +64,17 @@ The run node should capture:
- exact command
- cwd
- backend kind
-- benchmark suite
-- code snapshot
+- run dimensions
- resulting metrics
### Decision node
The decision should make the verdict explicit:
-- promote to champion
-- keep on frontier
-- revert to champion
-- archive dead end
-- needs more evidence
+- accepted
+- kept
+- parked
+- rejected
### Off-path nodes
@@ -100,7 +97,6 @@ The MVP does not need hard rejection. It does need meaningful warnings.
Good first project fields:
- `hypothesis` on `hypothesis`
-- `base_checkpoint_id` on `hypothesis`
- `benchmark_suite` on `hypothesis` and `run`
- `body` on `hypothesis`, `source`, and `note`
- `comparison_claim` on `analysis`
@@ -121,7 +117,6 @@ Good first metric vocabulary:
1. Initialize the project store.
2. Create a frontier contract.
-3. Capture the incumbent git checkpoint if available.
### 2. Start a line of attack
@@ -132,13 +127,12 @@ Good first metric vocabulary:
### 3. Execute one experiment
1. Modify the worktree.
-2. Commit the candidate checkpoint.
-3. Run the benchmark protocol.
-4. Close the experiment atomically.
+2. Run the benchmark protocol.
+3. Close the experiment atomically.
### 4. Judge and continue
-1. Promote the checkpoint or keep it alive.
+1. Mark the line accepted, kept, parked, or rejected.
2. Archive dead ends instead of leaving them noisy and active.
3. Repeat.
@@ -148,12 +142,10 @@ For `libgrid`, the benchmark evidence needs to be structurally trustworthy.
The MVP should always preserve at least:
-- benchmark suite identity
+- run dimensions
- primary metric
- supporting metrics
- command envelope
-- host/worktree metadata
-- git commit identity
This is the minimum needed to prevent "I think this was faster" folklore.
@@ -176,27 +168,28 @@ The right sequence is:
## Repo-Local Dogfood Before Libgrid
This repository itself is a valid off-path dogfood target even though it is not
-currently a git repo.
+a benchmark-heavy repo.
That means we can already use it to test:
- project initialization
- schema visibility
-- frontier creation without a champion
+- frontier creation and status projection
- off-path source recording
- hidden annotations
- MCP read and write flows
-What it cannot honestly test is full git-backed core-path experiment closure.
-That still belongs in a real repo such as the `libgrid` worktree.
+What it cannot honestly test is heavy benchmark ingestion and the retrieval
+pressure that comes with it. That still belongs in a real optimization corpus
+such as the `libgrid` worktree.
## Acceptance Bar For Libgrid
Fidget Spinner is ready for serious `libgrid` use when:
- an agent can run for hours without generating a giant markdown graveyard
-- the operator can identify the champion checkpoint mechanically
-- each completed experiment has checkpoint, result, note, and verdict
+- the operator can identify accepted, kept, parked, and rejected lines mechanically
+- each completed experiment has result, note, and verdict
- off-path side investigations stay preserved but do not pollute the core path
- the system feels like a machine for evidence rather than a diary with better
typography
diff --git a/docs/product-spec.md b/docs/product-spec.md
index efa57df..85561ad 100644
--- a/docs/product-spec.md
+++ b/docs/product-spec.md
@@ -50,7 +50,7 @@ These are the load-bearing decisions to hold fixed through the MVP push.
The canonical record is the DAG plus its normalized supporting tables.
Frontier state is not a rival authority. It is a derived, rebuildable
-projection over the DAG and related run/checkpoint/experiment records.
+projection over the DAG and related run/experiment records.
### 2. Storage is per-project
@@ -103,8 +103,6 @@ benchmark/decision bureaucracy while still preserving it in the DAG.
A completed experiment exists only when all of these exist together:
-- base checkpoint
-- candidate checkpoint
- measured result
- terse note
- explicit verdict
@@ -112,13 +110,6 @@ A completed experiment exists only when all of these exist together:
The write surface should make that one atomic mutation, not a loose sequence of
low-level calls.
-### 7. Checkpoints are git-backed
-
-Dirty worktree snapshots are useful as descriptive context, but a completed
-core-path experiment should anchor to a committed candidate checkpoint.
-
-Off-path notes and source captures can remain lightweight and non-committal.
-
## Node Model
### Global envelope
@@ -203,9 +194,9 @@ The frontier is a derived operational view over the canonical DAG.
It answers:
- what objective is active
-- what the current champion checkpoint is
-- which candidate checkpoints are still alive
-- how many completed experiments exist
+- how many experiments are open
+- how many experiments are completed
+- how the verdict mix currently breaks down
The DAG answers:
@@ -316,10 +307,10 @@ The MVP is successful when:
- an agent can record off-path sources and notes without bureaucratic pain
- the project schema can softly declare whether payload fields are strings, numbers, booleans, or timestamps
- an operator can inspect recent nodes through a minimal localhost web navigator filtered by tag
-- a git-backed project can close a real core-path experiment atomically
+- a project can close a real core-path experiment atomically
- retryable worker faults do not duplicate side effects
- stale nodes can be archived instead of polluting normal enumeration
-- a human can answer "what changed, what ran, what is the current champion,
+- a human can answer "what was tried, what ran, what was accepted or parked,
and why?" without doing markdown archaeology
## Full Product