From ce41a229dcd57f9a2c35359fe77d9f54f603e985 Mon Sep 17 00:00:00 2001
From: main
Date: Fri, 20 Mar 2026 00:33:08 -0400
Subject: Refound ontology around hypotheses and experiments

---
 assets/codex-skills/fidget-spinner/SKILL.md | 40 +++++++++++++++++++++++------
 assets/codex-skills/frontier-loop/SKILL.md  |  7 ++---
 2 files changed, 36 insertions(+), 11 deletions(-)

(limited to 'assets/codex-skills')

diff --git a/assets/codex-skills/fidget-spinner/SKILL.md b/assets/codex-skills/fidget-spinner/SKILL.md
index 1e4c2a3..cfa3521 100644
--- a/assets/codex-skills/fidget-spinner/SKILL.md
+++ b/assets/codex-skills/fidget-spinner/SKILL.md
@@ -1,6 +1,6 @@
 ---
 name: fidget-spinner
-description: Use Fidget Spinner as the local system of record for structured research and optimization work. Read health, schema, and frontier state first; prefer cheap off-path DAG writes; reserve atomic experiment closure for benchmarked core-path work.
+description: Use Fidget Spinner as the local system of record for source capture, hypothesis tracking, and experiment adjudication. Read health, schema, and frontier state first; keep off-path prose cheap; drive core-path work through hypothesis-owned experiments.
 ---
 
 # Fidget Spinner
@@ -22,6 +22,7 @@ Then read:
 - `tag.list`
 - `frontier.list`
 - `frontier.status` for the active frontier
+- `experiment.list` if you may be resuming in-flight core-path work
 
 Read `project.schema` only when payload authoring, validation rules, or local
 field vocabulary are actually relevant. When in doubt, start with
@@ -31,6 +32,7 @@ If you need more context, pull it from:
 
 - `node.list`
 - `node.read`
+- `experiment.read`
 
 ## Posture
 
@@ -38,6 +40,8 @@ If you need more context, pull it from:
 - frontier state is a derived projection
 - project payload validation is warning-heavy at ingest
 - annotations are sidecar and hidden by default
+- `source` and `note` are off-path memory
+- `hypothesis` and `experiment` are the disciplined core path
 
 ## Choose The Cheapest Tool
 
@@ -45,19 +49,35 @@ If you need more context, pull it from:
 - `tag.list` before inventing note tags by memory
 - `schema.field.upsert` when one project payload field needs to become canonical without hand-editing `schema.json`
 - `schema.field.remove` when one project payload field definition should be purged cleanly
-- `research.record` for exploratory work, design notes, dead ends, and enabling ideas; always pass `title`, `summary`, and `body`, and pass `tags` when the research belongs in a campaign/subsystem index
-- `note.quick` for terse state pushes, always with an explicit `tags` list plus `title`, `summary`, and `body`; use `[]` only when no registered tag applies
+- `source.record` for imported source material, documentary context, or one substantial source digest; always pass `title`, `summary`, and `body`, and pass `tags` when the source belongs in a campaign/subsystem index
+- `note.quick` for atomic reusable takeaways, always with an explicit `tags` list plus `title`, `summary`, and `body`; use `[]` only when no registered tag applies
+- `hypothesis.record` before core-path work; every experiment must hang off exactly one hypothesis
+- `experiment.open` once a hypothesis has a concrete base checkpoint and is ready to be tested
+- `experiment.list` or `experiment.read` when resuming a session and you need to recover open experimental state
 - `metric.define` when a project-level metric key needs a canonical unit, objective, or human description
 - `run.dimension.define` when a new experiment slicer such as `scenario` or `duration_s` becomes query-worthy
 - `run.dimension.list` before guessing which run dimensions actually exist in the store
 - `metric.keys` before guessing which numeric signals are actually rankable; pass exact run-dimension filters when narrowing to one workload slice
 - `metric.best` when you need the best closed experiments by one numeric key; pass `order` for noncanonical payload fields and exact run-dimension filters when comparing one slice
 - `node.annotate` for scratch text that should stay off the main path
-- `change.record` before core-path work
-- `experiment.close` only when you have checkpoint, measured result, note, and verdict
+- `experiment.close` only for an already-open experiment and only when you have checkpoint, measured result, note, and verdict; attach `analysis` when the result needs explicit interpretation
 - `node.archive` to hide stale detritus without deleting evidence
 - `node.create` only as a true escape hatch
 
+## Workflow
+
+1. Preserve source texture with `source.record` only when keeping the source itself matters.
+2. Extract reusable claims into `note.quick`.
+3. State the intended intervention with `hypothesis.record`.
+4. Open a live experiment with `experiment.open`.
+5. Do the work.
+6. Close the experiment with `experiment.close`, including metrics, verdict, and optional analysis.
+
+Do not dump a whole markdown tranche into one giant prose node and call that progress.
+If a later agent should enumerate it by tag or node list, it should usually be a `note.quick`.
+If the point is to preserve or digest a source document, it should be `source.record`.
+If the point is to test a claim, it should become a hypothesis plus an experiment.
+
 ## Discipline
 
 1. Pull context from the DAG, not from sprawling prompt prose.
@@ -70,7 +90,11 @@ If you need more context, pull it from:
 6. Treat metric keys as project-level registry entries and run dimensions as the
    first-class slice surface for experiment comparison; do not encode scenario
    context into the metric key itself.
-7. Porcelain is the terse triage surface. Use `detail=full` only when concise
+7. A source node is not a dumping ground for every thought spawned by that source.
+   Preserve one source digest if needed, then extract reusable claims into notes.
+8. A hypothesis is not an experiment. Open the experiment explicitly; do not
+   smuggle “planned work” into off-path prose.
+9. Porcelain is the terse triage surface. Use `detail=full` only when concise
    output stops being decision-sufficient.
-8. When the task becomes a true indefinite optimization push, pair this skill
-   with `frontier-loop`.
+10. When the task becomes a true indefinite optimization push, pair this skill
+    with `frontier-loop`.
diff --git a/assets/codex-skills/frontier-loop/SKILL.md b/assets/codex-skills/frontier-loop/SKILL.md
index b3ea44d..e8f1b94 100644
--- a/assets/codex-skills/frontier-loop/SKILL.md
+++ b/assets/codex-skills/frontier-loop/SKILL.md
@@ -62,7 +62,7 @@ ASSUME YOU ARE RUNNING OVERNIGHT.
 2. Study existing evidence from `fidget-spinner`.
 3. Search outward if the local frontier looks exhausted or you are starting to take unambitious strides.
 4. Form a strong, falsifiable hypothesis.
-5. Make the change.
+5. Record the hypothesis and open the experiment.
 6. Measure it.
 7. If the result is surprising, noisy, or broken, debug the implementation and
    rerun only enough to understand the outcome.
@@ -99,11 +99,12 @@ Every real experiment must leave an auditable record in `fidget-spinner`.
 
 If something matters to the frontier, put it in the DAG.
 
-Use off-path records liberally for enabling work, side investigations, and dead
+Use off-path records liberally for source capture, side investigations, and dead
 ends.
 
 When a line becomes a real measured experiment, close it through the proper
-`fidget-spinner` path instead of improvising a chain of half-recorded steps.
+`fidget-spinner` path instead of improvising a chain of half-recorded steps:
+`hypothesis.record` -> `experiment.open` -> `experiment.close`.
 
 ## Resume Discipline
 
-- 
cgit v1.2.3