docs/product-spec.md



# Fidget Spinner Product Spec

## Thesis

Fidget Spinner is a local-first, agent-first frontier machine for autonomous
program optimization, source capture, and experiment adjudication.

The immediate target is brutally practical: replace gigantic freeform
experiment markdown with a machine that preserves evidence as structure.

The package is deliberately two things at once:

- a local MCP-backed DAG substrate
- bundled skills that teach agents how to drive that substrate

Those two halves should be versioned together and treated as one product.

## Product Position

This is not a hosted lab notebook.

This is not a cloud compute marketplace.

This is not a collaboration shell with experiments bolted on.

This is a local machine for indefinite frontier pushes, with agents as primary
writers and humans as auditors, reviewers, and occasional editors.

## Non-Goals

These are explicitly out of scope for the core product:

- OAuth
- hosted identity
- cloud tenancy
- billing, credits, and subscriptions
- managed provider brokerage
- chat as the system of record
- mandatory remote control planes
- replacing git

Git remains the code substrate. Fidget Spinner is the evidence substrate.

## Locked Design Decisions

These are the load-bearing decisions to hold fixed through the MVP push.

### 1. The DAG is canonical truth

The canonical record is the DAG plus its normalized supporting tables.

Frontier state is not a rival authority. It is a derived, rebuildable
projection over the DAG and related run/checkpoint/experiment records.

### 2. Storage is per-project

Each project owns its own local store under:

```text
<project root>/.fidget_spinner/
    state.sqlite
    project.json
    schema.json
    blobs/
```

There is no mandatory global database in the MVP.
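
As a minimal sketch of the layout above, the store paths can be resolved from the project root alone. The `StoreLayout` type and `for_project` function are hypothetical names introduced for illustration; only the file and directory names come from this spec.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical helper: where one project's local store lives on disk.
/// Only the file names are taken from the spec; the type is illustrative.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct StoreLayout {
    pub root: PathBuf,
    pub state_db: PathBuf,
    pub project_config: PathBuf,
    pub schema: PathBuf,
    pub blobs: PathBuf,
}

impl StoreLayout {
    /// Resolve every store path relative to the given project root.
    /// Nothing is read from or written to disk here.
    pub fn for_project(project_root: &Path) -> Self {
        let root = project_root.join(".fidget_spinner");
        Self {
            state_db: root.join("state.sqlite"),
            project_config: root.join("project.json"),
            schema: root.join("schema.json"),
            blobs: root.join("blobs"),
            root,
        }
    }
}
```

Because every path hangs off the project root, two projects on the same machine never share state, which is the point of having no global database.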

### 3. Node structure is layered

Every node has three layers:

- a hard global envelope for indexing and traversal
- a project-local structured payload
- free-form sidecar annotations as an escape hatch

The engine only hard-depends on the envelope. Project payloads remain flexible.
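
One way to picture the three layers is the sketch below. All type and field names (`Envelope`, `Node`, `default_listing`) are assumptions made for illustration, the envelope is cut down to four fields, and payload values are plain strings to keep the example short.

```rust
use std::collections::BTreeMap;

/// Hard global envelope (abbreviated): the only layer the engine depends on.
#[derive(Debug, Clone)]
pub struct Envelope {
    pub id: String,
    pub class: String,
    pub title: String,
    pub archived: bool,
}

/// Illustrative node carrying all three layers.
#[derive(Debug, Clone)]
pub struct Node {
    pub envelope: Envelope,
    /// Project-local payload: field name to raw value.
    pub payload: BTreeMap<String, String>,
    /// Free-form sidecar annotations.
    pub annotations: Vec<String>,
}

/// Default enumeration touches only the envelope, so it works no matter
/// what a project puts in the payload or the annotations.
pub fn default_listing(nodes: &[Node]) -> Vec<String> {
    nodes
        .iter()
        .filter(|n| !n.envelope.archived)
        .map(|n| format!("{} [{}] {}", n.envelope.id, n.envelope.class, n.envelope.title))
        .collect()
}
```

The listing function never inspects `payload` or `annotations`, which is what "only hard-depends on the envelope" means in practice.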

### 4. Validation is warning-heavy

Engine integrity is hard-validated.

Project semantics are diagnostically validated.

Workflow eligibility is action-gated.

In other words:

- bad engine state is rejected
- incomplete project payloads are usually admitted with diagnostics
- projections and frontier actions may refuse incomplete nodes later
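
The three tiers above could be sketched roughly as follows. The `Draft`, `Admission`, `admit`, and `may_project` names are hypothetical, and treating "any outstanding diagnostic" as the projection gate is a simplifying assumption, not a rule stated in this spec.

```rust
/// Outcome of ingest-time validation (illustrative).
#[derive(Debug, PartialEq, Eq)]
pub enum Admission {
    /// Engine integrity failed: the write is refused.
    Rejected(String),
    /// The node is stored, possibly with diagnostics attached.
    Admitted { diagnostics: Vec<String> },
}

/// A node about to be written; `missing_fields` stands in for the result
/// of checking the payload against the project schema.
pub struct Draft {
    pub id: String,
    pub class: String,
    pub missing_fields: Vec<String>,
}

/// Hard-validate the envelope, diagnostically validate the payload.
pub fn admit(draft: &Draft) -> Admission {
    if draft.id.is_empty() || draft.class.is_empty() {
        return Admission::Rejected("envelope is missing id or class".to_string());
    }
    let diagnostics: Vec<String> = draft
        .missing_fields
        .iter()
        .map(|f| format!("payload field `{}` is missing", f))
        .collect();
    Admission::Admitted { diagnostics }
}

/// Action gate (assumed rule): only diagnostic-free nodes may feed projections.
pub fn may_project(admission: &Admission) -> bool {
    matches!(admission, Admission::Admitted { diagnostics } if diagnostics.is_empty())
}
```

An incomplete node is still admitted and stays in the DAG; it is only the later action that refuses it.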

### 5. Core-path and off-path work must diverge

Core-path work is disciplined and atomic.

Off-path work is cheap and permissive.

The point is to avoid forcing every scrap of source digestion or note-taking through the full
benchmark/decision bureaucracy while still preserving it in the DAG.

### 6. Completed core-path experiments are atomic

A completed experiment exists only when all of these exist together:

- base checkpoint
- candidate checkpoint
- measured result
- terse note
- explicit verdict

The write surface should make that one atomic mutation, not a loose sequence of
low-level calls.
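
A minimal sketch of that atomicity at the type level: a draft may be partial, but a `CompletedExperiment` can only be constructed when all five parts are present. The type names, the `close` method, and the `Verdict` variants are hypothetical; the spec does not fix a verdict vocabulary.

```rust
/// Hypothetical verdict vocabulary, for illustration only.
#[derive(Debug, Clone, PartialEq)]
pub enum Verdict {
    Accepted,
    Rejected,
}

/// A draft may be partially filled in while work is in flight.
#[derive(Debug, Default)]
pub struct ExperimentDraft {
    pub base_checkpoint: Option<String>,
    pub candidate_checkpoint: Option<String>,
    pub measured_result: Option<f64>,
    pub note: Option<String>,
    pub verdict: Option<Verdict>,
}

/// A completed experiment has no optional parts.
#[derive(Debug, PartialEq)]
pub struct CompletedExperiment {
    pub base_checkpoint: String,
    pub candidate_checkpoint: String,
    pub measured_result: f64,
    pub note: String,
    pub verdict: Verdict,
}

impl ExperimentDraft {
    /// Either every part is present and one complete record comes back,
    /// or nothing is produced and the missing parts are listed.
    pub fn close(self) -> Result<CompletedExperiment, Vec<&'static str>> {
        let mut missing = Vec::new();
        if self.base_checkpoint.is_none() { missing.push("base checkpoint"); }
        if self.candidate_checkpoint.is_none() { missing.push("candidate checkpoint"); }
        if self.measured_result.is_none() { missing.push("measured result"); }
        if self.note.is_none() { missing.push("terse note"); }
        if self.verdict.is_none() { missing.push("explicit verdict"); }
        match (
            self.base_checkpoint,
            self.candidate_checkpoint,
            self.measured_result,
            self.note,
            self.verdict,
        ) {
            (Some(base), Some(candidate), Some(result), Some(note), Some(verdict)) => {
                Ok(CompletedExperiment {
                    base_checkpoint: base,
                    candidate_checkpoint: candidate,
                    measured_result: result,
                    note,
                    verdict,
                })
            }
            _ => Err(missing),
        }
    }
}
```

In a real store the successful branch would presumably map to a single transaction, so a crash cannot leave a half-written experiment behind.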

### 7. Checkpoints are git-backed

Dirty worktree snapshots are useful as descriptive context, but a completed
core-path experiment should anchor to a committed candidate checkpoint.

Off-path notes and source captures can remain lightweight and non-committal.

## Node Model

### Global envelope

The hard spine should be stable across projects. It includes at least:

- node id
- node class
- node track
- frontier id if any
- archived flag
- title
- summary
- schema namespace and version
- timestamps
- diagnostics
- hidden or visible annotations

This is the engine layer: the part that powers indexing, traversal, archiving,
default enumeration, and model-facing summaries.

### Project-local payload

Every project may define richer payload fields in:

`<project root>/.fidget_spinner/schema.json`

That file is a model-facing contract. It defines field names and soft
validation tiers without forcing global schema churn.

Per-field settings should express at least:

- presence: `required`, `recommended`, `optional`
- severity: `error`, `warning`, `info`
- role: `index`, `projection_gate`, `render_only`, `opaque`
- inference policy: whether the model may infer the field

These settings are advisory at ingest time and stricter at projection/action
time.
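
The per-field settings map naturally onto a few small enums. In the sketch below, the enum variants mirror the values listed above, while `FieldSpec`, `ingest_diagnostics`, and `blocks_projection` are hypothetical names, and the specific gating rule (a missing `required` field with role `projection_gate` blocks projection) is an assumption about what "stricter" might mean.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Presence { Required, Recommended, Optional }

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Severity { Error, Warning, Info }

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Role { Index, ProjectionGate, RenderOnly, Opaque }

/// One field declaration, as it might be loaded from `schema.json`.
#[derive(Debug, Clone)]
pub struct FieldSpec {
    pub name: &'static str,
    pub presence: Presence,
    pub severity: Severity,
    pub role: Role,
    /// Inference policy: whether the model may infer this field.
    pub inferable: bool,
}

/// Ingest time: advisory only. Missing non-optional fields produce
/// diagnostics at the declared severity; nothing is rejected here.
pub fn ingest_diagnostics(
    specs: &[FieldSpec],
    present: &[&str],
) -> Vec<(&'static str, Severity)> {
    specs
        .iter()
        .filter(|s| s.presence != Presence::Optional && !present.contains(&s.name))
        .map(|s| (s.name, s.severity))
        .collect()
}

/// Projection time (assumed rule): a missing required field whose role is
/// `ProjectionGate` blocks the node from projections.
pub fn blocks_projection(specs: &[FieldSpec], present: &[&str]) -> bool {
    specs.iter().any(|s| {
        s.role == Role::ProjectionGate
            && s.presence == Presence::Required
            && !present.contains(&s.name)
    })
}
```

The same declaration drives both functions, so a project tunes strictness by editing its schema rather than by changing engine code.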

### Free-form annotations

Any node may carry free-form annotations.

These are explicitly sidecar, not primary payload. They are:

- allowed everywhere
- hidden from default enumeration
- useful as a scratchpad or escape hatch
- not allowed to become the only home of critical operational truth

If a fact matters to automation, comparison, or promotion, it must migrate into
the spine or project payload.

## Node Taxonomy

### Core-path node classes

These are the disciplined frontier-loop classes:

- `contract`
- `hypothesis`
- `run`
- `analysis`
- `decision`

### Off-path node classes

These are deliberately low-ceremony:

- `source`
- `note`

They exist so the product can absorb real thinking instead of forcing users and
agents back into sprawling markdown.
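
The taxonomy is small enough to express as a closed enum whose track is derived from the class. This is a sketch; the `NodeClass` and `Track` type names and the `parse` and `track` methods are illustrative, while the class names are the ones listed above.

```rust
/// The two tracks a node can live on.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Track {
    CorePath,
    OffPath,
}

/// Node classes named in this spec.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum NodeClass {
    Contract,
    Hypothesis,
    Run,
    Analysis,
    Decision,
    Source,
    Note,
}

impl NodeClass {
    /// The track follows from the class, so the two cannot disagree.
    pub fn track(self) -> Track {
        match self {
            NodeClass::Contract
            | NodeClass::Hypothesis
            | NodeClass::Run
            | NodeClass::Analysis
            | NodeClass::Decision => Track::CorePath,
            NodeClass::Source | NodeClass::Note => Track::OffPath,
        }
    }

    /// Parse the lowercase class name used in the lists above.
    pub fn parse(name: &str) -> Option<NodeClass> {
        match name {
            "contract" => Some(NodeClass::Contract),
            "hypothesis" => Some(NodeClass::Hypothesis),
            "run" => Some(NodeClass::Run),
            "analysis" => Some(NodeClass::Analysis),
            "decision" => Some(NodeClass::Decision),
            "source" => Some(NodeClass::Source),
            "note" => Some(NodeClass::Note),
            _ => None,
        }
    }
}
```

Deriving the track from the class is one possible design; the envelope could equally store both and validate that they agree.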

## Frontier Model

The frontier is a derived operational view over the canonical DAG.

It answers:

- what objective is active
- what the current champion checkpoint is
- which candidate checkpoints are still alive
- how many completed experiments exist

The DAG answers:

- what changed
- what ran
- what evidence was collected
- what was concluded
- what dead ends and side investigations exist

That split is deliberate. It prevents "frontier state" from turning into a
second unofficial database.
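
To illustrate "derived and rebuildable", the frontier view can be written as a pure function over experiment records. Everything in this sketch is an assumption: the record shape, the rule that the champion is the accepted experiment with the highest metric (higher taken as better), and the rule that a candidate is "alive" while its experiment is still open.

```rust
/// Simplified stand-in for experiment records stored in the DAG.
#[derive(Debug, Clone)]
pub struct ExperimentRecord {
    pub candidate_checkpoint: String,
    pub metric: f64,
    pub completed: bool,
    pub accepted: bool,
}

/// Derived view: holds no state of its own.
#[derive(Debug, PartialEq)]
pub struct FrontierView {
    pub champion: Option<String>,
    pub live_candidates: Vec<String>,
    pub completed_experiments: usize,
}

/// Rebuild the frontier from scratch. Calling this twice on the same
/// records yields the same view, so the result can be cached or discarded.
pub fn project_frontier(records: &[ExperimentRecord]) -> FrontierView {
    let mut champion: Option<&ExperimentRecord> = None;
    for r in records.iter().filter(|r| r.completed && r.accepted) {
        // Assumed rule: higher metric wins; ties keep the earlier record.
        if champion.map_or(true, |c| r.metric > c.metric) {
            champion = Some(r);
        }
    }
    FrontierView {
        champion: champion.map(|c| c.candidate_checkpoint.clone()),
        live_candidates: records
            .iter()
            .filter(|r| !r.completed)
            .map(|r| r.candidate_checkpoint.clone())
            .collect(),
        completed_experiments: records.iter().filter(|r| r.completed).count(),
    }
}
```

Since the view is recomputed from records, deleting it loses nothing, which is what keeps it from becoming a second database.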

## First Usable MVP

The first usable MVP is the first cut that can already replace a meaningful
slice of the markdown habit without pretending the full-product vision is
done.

### MVP deliverables

- per-project `.fidget_spinner/` state
- local SQLite backing store
- local blob directory
- typed Rust core model
- optional light-touch project field types: `string`, `numeric`, `boolean`, `timestamp`
- thin CLI for bootstrap and repair
- hardened stdio MCP host exposed from the CLI
- minimal read-only web navigator with tag filtering and linear node rendering
- disposable MCP worker execution runtime
- bundled `fidget-spinner` base skill
- bundled `frontier-loop` skill
- low-ceremony off-path note and source recording
- explicit experiment open/close lifecycle for the core path

### Explicitly deferred from the MVP

- long-lived `spinnerd`
- full web UI beyond the minimal read-only navigator
- remote runners
- multi-agent hardening
- aggressive pruning and vacuuming
- strong markdown migration tooling
- cross-project indexing

### MVP model-facing surface

The model-facing surface is a local MCP server oriented around frontier work.

The initial tools should be:

- `system.health`
- `system.telemetry`
- `project.bind`
- `project.status`
- `project.schema`
- `tag.add`
- `tag.list`
- `frontier.list`
- `frontier.status`
- `frontier.init`
- `node.create`
- `hypothesis.record`
- `node.list`
- `node.read`
- `node.annotate`
- `node.archive`
- `note.quick`
- `source.record`
- `experiment.open`
- `experiment.list`
- `experiment.read`
- `experiment.close`
- `skill.list`
- `skill.show`

The important point is not the exact names. The important point is the shape:

- cheap read access to project and frontier context
- cheap off-path writes
- low-ceremony hypothesis capture
- one explicit experiment-open step plus one experiment-close step
- explicit operational introspection for long-lived agent sessions
- explicit replay boundaries so side effects are never duplicated by accident
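
One common way to get a replay boundary is an idempotency key per write request: a retried request returns the recorded result instead of applying the side effect again. The sketch below shows that technique in isolation. It is not a description of how the MCP host handles replay, and the `Ledger` type and `apply_once` method are hypothetical.

```rust
use std::collections::HashMap;

/// In-memory stand-in for a write log keyed by request id.
pub struct Ledger {
    /// Request key to the node id that request produced.
    applied: HashMap<String, u64>,
    next_node_id: u64,
    /// Number of side effects actually performed.
    pub writes: usize,
}

impl Ledger {
    pub fn new() -> Self {
        Ledger { applied: HashMap::new(), next_node_id: 1, writes: 0 }
    }

    /// Apply a write at most once per request key. A replay of the same
    /// key returns the original node id and performs no new write.
    pub fn apply_once(&mut self, request_key: &str) -> u64 {
        if let Some(&id) = self.applied.get(request_key) {
            return id;
        }
        let id = self.next_node_id;
        self.next_node_id += 1;
        self.writes += 1;
        self.applied.insert(request_key.to_string(), id);
        id
    }
}
```

With this shape, a worker that crashes after writing but before replying can be retried safely, because the second attempt is answered from the ledger.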

### MVP skill posture

The bundled skills should instruct agents to:

1. inspect `system.health` first
2. bind the MCP session to the target project before project-local reads or writes
3. read project schema, tag registry, and frontier state
4. pull context from the DAG instead of giant prose dumps
5. use `note.quick` and `source.record` freely off path, but always pass an explicit tag list for notes
6. use `hypothesis.record` before worktree thrash becomes ambiguous
7. use `experiment.open` before running a live hypothesis-owned line
8. use `experiment.close` to seal that line with measured evidence
9. archive detritus instead of deleting it
10. use the base `fidget-spinner` skill for ordinary DAG work and add
   `frontier-loop` only when the task becomes a true autonomous frontier push

### MVP acceptance bar

The MVP is successful when:

- a project can be initialized locally with no hosted dependencies
- an agent can inspect frontier state through MCP
- an agent can inspect MCP health and telemetry through MCP
- an agent can record off-path sources and notes without bureaucratic pain
- the project schema can softly declare whether payload fields are strings, numbers, booleans, or timestamps
- an operator can inspect recent nodes through a minimal localhost web navigator filtered by tag
- a git-backed project can close a real core-path experiment atomically
- retryable worker faults do not duplicate side effects
- stale nodes can be archived instead of polluting normal enumeration
- a human can answer "what changed, what ran, what is the current champion,
  and why?" without doing markdown archaeology

## Full Product

The full product grows outward from the MVP rather than replacing it.

### Planned additions

- `spinnerd` as a long-lived local daemon
- local HTTP and SSE
- read-mostly graph and run inspection UI
- richer artifact handling
- model-driven pruning and archive passes
- stronger interruption recovery
- local runner backends beyond direct process execution
- optional global indexing across projects
- import/export and subgraph packaging

### Invariant for all later stages

No future layer should invalidate the MVP spine:

- DAG canonical
- frontier derived
- project-local store
- layered node model
- warning-heavy schema validation
- cheap off-path writes
- atomic core-path closure