# Fidget Spinner Product Spec
## Thesis
Fidget Spinner is a local-first, agent-first frontier ledger for autonomous
optimization work.
It is not a notebook. It is not a generic DAG memory. It is not an inner
platform for git. It is a hard experimental spine whose job is to preserve
scientific truth with enough structure that agents can resume work without
reconstructing everything from prose.
The package is deliberately two things at once:
- a local MCP-backed frontier ledger
- bundled skills that teach agents how to drive that ledger
Those two halves are one product and should be versioned together.
## Product Position
This is a machine for long-running frontier work in local repos.
Humans and agents should be able to answer:
- what frontier is active
- which hypotheses are live
- which experiments are still open
- what the latest accepted, kept, parked, and rejected outcomes are
- which metrics matter right now
without opening a markdown graveyard.
## Non-Goals
These are explicitly out of scope for the core product:
- hosted identity
- cloud tenancy
- billing or credits
- chat as the system of record
- mandatory remote control planes
- replacing git
- storing or rendering large artifact bodies
Git remains the code substrate. Fidget Spinner is the experimental ledger.
## Locked Design Decisions
### 1. The ledger is austere
The only freeform overview surface is the frontier brief, read through
`frontier.open`.
Everything else should require deliberate traversal one selector at a time.
Slow is better than burning tokens on giant feeds.
### 2. The ontology is small
The canonical object families are:
- `frontier`
- `hypothesis`
- `experiment`
- `artifact`
There are no canonical `note` or `source` ledger nodes.
### 3. Frontier is scope, not a graph vertex
A frontier is a named scope and grounding object. It owns:
- objective
- status
- brief
It also partitions hypotheses and experiments.
### 4. Hypothesis and experiment are the true graph vertices
A hypothesis is a terse intervention claim.
An experiment is a stateful scientific record. Every experiment has:
- one mandatory owning hypothesis
- optional influence parents drawn from hypotheses or experiments
This gives the product a canonical tree spine plus a sparse influence network.
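As a minimal sketch of that shape, assume hypothetical Python record types. The names `Hypothesis`, `Experiment`, `owner`, and `influences` are illustrative and not Spinner's actual schema. The mandatory owner yields the tree spine, and the optional influence parents yield the sparse network:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hypothesis:
    """A terse intervention claim (hypothetical record shape)."""
    id: str
    title: str
    influences: tuple[str, ...] = ()  # ids of hypotheses or experiments


@dataclass(frozen=True)
class Experiment:
    """A scientific record with exactly one owning hypothesis."""
    id: str
    owner: str                        # mandatory owning hypothesis id
    influences: tuple[str, ...] = ()  # optional influence parents


def spine(experiments: list[Experiment]) -> dict[str, list[str]]:
    """Group experiment ids under their owning hypothesis (the tree spine)."""
    tree: dict[str, list[str]] = {}
    for exp in experiments:
        tree.setdefault(exp.owner, []).append(exp.id)
    return tree


def influence_edges(experiments: list[Experiment]) -> list[tuple[str, str]]:
    """List (parent, child) pairs for the sparse influence network."""
    return [(parent, exp.id) for exp in experiments for parent in exp.influences]
```

Keeping `owner` a required field while `influences` defaults to empty mirrors the rule that ownership is mandatory and influence is optional.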
### 5. Artifacts are references only
Artifacts are metadata plus locators for external material:
- files
- links
- logs
- tables
- plots
- dumps
- bibliographies
Spinner never reads artifact bodies. If a wall of text matters, attach it as an
artifact and summarize the operational truth elsewhere.
### 6. Experiment closure is atomic
A closed experiment exists only when its required fields are recorded together:
- dimensions
- primary metric
- verdict
- rationale
Two further fields are optional:
- supporting metrics
- analysis
Closing an experiment is one atomic mutation, not a loose pile of lower-level
writes.
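One way to picture atomic closure is a single SQLite transaction. The sketch below is an assumption-laden illustration: the table layout, column names, and the `close_experiment` helper are hypothetical and are not taken from Spinner's real schema. It relies only on the standard `sqlite3` behavior that using a connection as a context manager commits on success and rolls back on an exception.

```python
import sqlite3

# Hypothetical schema for illustration only; not Spinner's real tables.
SCHEMA = """
CREATE TABLE experiment (
    id TEXT PRIMARY KEY,
    status TEXT NOT NULL,
    verdict TEXT,
    rationale TEXT
);
CREATE TABLE metric_value (
    experiment_id TEXT NOT NULL,
    key TEXT NOT NULL,
    value REAL NOT NULL,
    is_primary INTEGER NOT NULL
);
CREATE TABLE dimension_value (
    experiment_id TEXT NOT NULL,
    key TEXT NOT NULL,
    value TEXT NOT NULL
);
"""

VERDICTS = {"accepted", "kept", "parked", "rejected"}


def close_experiment(conn, experiment_id, dimensions, primary_metric, verdict, rationale):
    """Close an open experiment so that every write lands or none do."""
    if verdict not in VERDICTS:
        raise ValueError(f"unknown verdict: {verdict!r}")
    if not rationale.strip():
        raise ValueError("rationale is required")
    metric_key, metric_value = primary_metric
    with conn:  # commits on success, rolls back if anything inside raises
        cur = conn.execute(
            "UPDATE experiment SET status = 'closed', verdict = ?, rationale = ? "
            "WHERE id = ? AND status = 'open'",
            (verdict, rationale, experiment_id),
        )
        if cur.rowcount != 1:
            raise LookupError(f"no open experiment with id {experiment_id!r}")
        conn.execute(
            "INSERT INTO metric_value (experiment_id, key, value, is_primary) "
            "VALUES (?, ?, ?, 1)",
            (experiment_id, metric_key, metric_value),
        )
        conn.executemany(
            "INSERT INTO dimension_value (experiment_id, key, value) VALUES (?, ?, ?)",
            [(experiment_id, k, v) for k, v in dimensions.items()],
        )
```

If the metric insert fails, the status update is rolled back with it, so the experiment stays open and no half-closed record is left behind.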
### 7. Live metrics are derived
The hot-path metric surface is not “all metrics that have ever existed.”
The hot-path metric surface is the derived live set for the active frontier.
That set should stay small, frontier-relevant, and queryable.
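This section does not state the derivation rule. Purely as an illustration, assume a metric key is live when at least one experiment owned by a still-active hypothesis in the frontier references it. The `hypothesis` and `metric_keys` fields and the `live_metric_keys` helper are hypothetical names. The point of the sketch is only that the set is computed on demand, not stored as a global list:

```python
def live_metric_keys(experiments, active_hypothesis_ids):
    """Derive the live metric keys on demand (assumed rule, see lead-in)."""
    live = set()
    for exp in experiments:
        # Only experiments under a still-active hypothesis contribute keys.
        if exp["hypothesis"] in active_hypothesis_ids:
            live.update(exp["metric_keys"])
    return sorted(live)
```

Under this assumed rule, retiring a hypothesis shrinks the live set automatically, with no separate cleanup write.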
## Canonical Data Model
### Frontier
Frontier is a scope/partition object with one mutable brief.
The brief is the sanctioned grounding object. It should stay short and answer:
- situation
- roadmap
- unknowns
### Hypothesis
A hypothesis is a disciplined claim:
- title
- summary
- exactly one paragraph of body
- tags
- influence parents
It is not a design doc and not a catch-all prose bucket.
### Experiment
An experiment is a stateful object:
- open while the work is live
- closed when the result is in
A closed experiment stores:
- dimensions
- primary metric
- optional supporting metrics
- verdict: `accepted | kept | parked | rejected`
- rationale
- optional analysis
- attached artifacts
### Artifact
Artifacts preserve external material by reference. They are deliberately off the
token hot path. Artifact metadata should be enough to discover the thing; the
body lives elsewhere.
## Token Discipline
`frontier.open` is the only sanctioned overview dump. It should return:
- frontier brief
- active tags
- scoreboard metric keys
- live metric keys
- active hypotheses with deduped current state
- open experiments
After that, the model should walk explicitly:
- `hypothesis.read`
- `experiment.read`
- `artifact.read`
No broad list surface should dump large prose. Artifact bodies are never in the
MCP path.
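The overview payload described above can be sketched as a projection that keeps keys and terse state and drops prose bodies. The input record shapes, the field names, and the `frontier_open` function below are assumptions made for illustration. They are not the real `frontier.open` response format, and "deduped current state" is modeled here simply as one terse row per active hypothesis.

```python
def frontier_open(frontier: dict, hypotheses: list[dict], experiments: list[dict]) -> dict:
    """Build the single overview payload: keys and terse state, no prose bodies."""
    active = [h for h in hypotheses if h["active"]]
    return {
        "brief": frontier["brief"],
        "active_tags": sorted({tag for h in active for tag in h["tags"]}),
        "scoreboard_metric_keys": list(frontier["scoreboard_metric_keys"]),
        "live_metric_keys": list(frontier["live_metric_keys"]),
        # One terse row per hypothesis; the body stays behind `hypothesis.read`.
        "active_hypotheses": [
            {"id": h["id"], "title": h["title"], "state": h["state"]} for h in active
        ],
        # Open experiments are listed by reference; details need `experiment.read`.
        "open_experiments": [
            {"id": e["id"], "hypothesis": e["hypothesis"]}
            for e in experiments
            if e["status"] == "open"
        ],
    }
```

Anything omitted from this payload, such as a hypothesis body, is reachable only through an explicit read, which keeps the per-call token cost bounded.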
## Storage
Every project owns a private state root:
```text
<project root>/.fidget_spinner/
  project.json
  state.sqlite
```
There is no required global database.
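A small sketch of resolving that layout from Python follows. The file names come from the layout above. The helper names and the upward directory search are assumptions, since the spec does not say how a project root is located (binding may instead go through `project.bind`).

```python
from pathlib import Path
from typing import Optional

STATE_DIR = ".fidget_spinner"


def state_paths(project_root: Path) -> dict[str, Path]:
    """Return the per-project state locations under the private state root."""
    root = project_root / STATE_DIR
    return {
        "root": root,
        "project": root / "project.json",
        "state": root / "state.sqlite",
    }


def find_project_root(start: Path) -> Optional[Path]:
    """Walk upward from `start` to the nearest directory holding a state root.

    The upward search is an assumed convenience, not documented behavior.
    """
    for candidate in (start, *start.parents):
        if (candidate / STATE_DIR).is_dir():
            return candidate
    return None
```

Because every path is derived from the project root, nothing in this sketch needs a global database or a shared configuration directory.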
## MVP Surface
The current model-facing surface is:
- `system.health`
- `system.telemetry`
- `project.bind`
- `project.status`
- `tag.add`
- `tag.list`
- `frontier.create`
- `frontier.list`
- `frontier.read`
- `frontier.open`
- `frontier.update`
- `frontier.history`
- `hypothesis.record`
- `hypothesis.list`
- `hypothesis.read`
- `hypothesis.update`
- `hypothesis.history`
- `experiment.open`
- `experiment.list`
- `experiment.read`
- `experiment.update`
- `experiment.close`
- `experiment.nearest`
- `experiment.history`
- `artifact.record`
- `artifact.list`
- `artifact.read`
- `artifact.update`
- `artifact.history`
- `metric.define`
- `metric.keys`
- `metric.best`
- `run.dimension.define`
- `run.dimension.list`
## Explicitly Deferred
Still out of scope:
- remote runners
- hosted multi-user control planes
- broad artifact ingestion
- reading artifact bodies through Spinner
- giant auto-generated context dumps
- replacing git or reconstructing git inside the ledger