 README.md                                        |  19
 assets/codex-skills/fidget-spinner/SKILL.md      |   9
 assets/codex-skills/frontier-loop/SKILL.md       |   4
 crates/fidget-spinner-cli/src/main.rs            |  80
 crates/fidget-spinner-cli/src/mcp/catalog.rs     |  24
 crates/fidget-spinner-cli/src/mcp/service.rs     | 124
 crates/fidget-spinner-cli/tests/mcp_hardening.rs | 161
 crates/fidget-spinner-core/src/id.rs             |   1
 crates/fidget-spinner-core/src/lib.rs            |  21
 crates/fidget-spinner-core/src/model.rs          |  89
 crates/fidget-spinner-store-sqlite/src/lib.rs    | 545
 docs/architecture.md                             |  34
 docs/libgrid-dogfood.md                          |  43
 docs/product-spec.md                             |  21
 14 files changed, 248 insertions(+), 927 deletions(-)
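The dominant theme of the patch below is replacing the five checkpoint-oriented verdicts (`promote_to_champion`, `keep_on_frontier`, `revert_to_champion`, `archive_dead_end`, `needs_more_evidence`) with a four-value vocabulary. As a minimal standalone sketch (these are illustrative stand-ins, not the crate's actual `FrontierVerdict` or `parse_verdict_name` items), the new names round-trip like this:

```rust
// Sketch of the four-value verdict vocabulary this diff introduces.
// The enum and helpers mirror the patch's `FrontierVerdict`,
// `metric_verdict_name`, and `parse_verdict_name`, but are standalone here.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum FrontierVerdict {
    Accepted,
    Kept,
    Parked,
    Rejected,
}

// Canonical wire name for a verdict, matching the new JSON schema enum.
fn verdict_name(verdict: FrontierVerdict) -> &'static str {
    match verdict {
        FrontierVerdict::Accepted => "accepted",
        FrontierVerdict::Kept => "kept",
        FrontierVerdict::Parked => "parked",
        FrontierVerdict::Rejected => "rejected",
    }
}

// Parse a wire name back into a verdict; old five-value names now fail.
fn parse_verdict_name(raw: &str) -> Result<FrontierVerdict, String> {
    match raw {
        "accepted" => Ok(FrontierVerdict::Accepted),
        "kept" => Ok(FrontierVerdict::Kept),
        "parked" => Ok(FrontierVerdict::Parked),
        "rejected" => Ok(FrontierVerdict::Rejected),
        other => Err(format!("unknown verdict `{other}`")),
    }
}

fn main() {
    println!("{}", verdict_name(FrontierVerdict::Kept));
    println!("{:?}", parse_verdict_name("promote_to_champion"));
}
```

Note the asymmetry this implies for stored data: ledgers closed under the old vocabulary would need a migration or a lenient parser, since the new `parse_verdict_name` rejects the retired names outright.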
diff --git a/README.md b/README.md
--- a/README.md
+++ b/README.md
@@ -161,7 +161,6 @@ cargo run -p fidget-spinner-cli -- hypothesis add \
 cargo run -p fidget-spinner-cli -- experiment open \
   --project . \
   --frontier <frontier-id> \
-  --base-checkpoint <checkpoint-id> \
   --hypothesis-node <hypothesis-node-id> \
   --title "navigator metric card pass" \
   --summary "Evaluate inline metrics on experiment-bearing cards."
@@ -324,20 +323,14 @@ The intended flow is:
 9. open the live experiment explicitly with `experiment.open`
 10. seal core-path work with `experiment.close`
 
-## Git-Backed Vs Plain Local Projects
+## Git And The Ledger
 
-Off-path work does not require git. You can initialize a local project and use:
+Git remains useful for code history, bisect, and sensible commit messages, but
+the Fidget Spinner ledger is about the science rather than about reproducing git
+inside the experiment record.
 
-- `source add`
-- `tag add`
-- `note quick`
-- `metric keys`
-- `node annotate`
-- `mcp serve`
-
-Full core-path experiment closure needs a real git-backed project, such as the
-target `libgrid` worktree, because checkpoints and champion capture are git
-backed.
+Core-path closure does not require a git-backed project. The canonical record is
+the hypothesis, run slice, parsed metrics, verdict, and rationale.
 
 ## Workspace Layout
diff --git a/assets/codex-skills/fidget-spinner/SKILL.md b/assets/codex-skills/fidget-spinner/SKILL.md
index cfa3521..f187be3 100644
--- a/assets/codex-skills/fidget-spinner/SKILL.md
+++ b/assets/codex-skills/fidget-spinner/SKILL.md
@@ -52,7 +52,7 @@ If you need more context, pull it from:
 - `source.record` for imported source material, documentary context, or one substantial source digest; always pass `title`, `summary`, and `body`, and pass `tags` when the source belongs in a campaign/subsystem index
 - `note.quick` for atomic reusable takeaways, always with an explicit `tags` list plus `title`, `summary`, and `body`; use `[]` only when no registered tag applies
 - `hypothesis.record` before core-path work; every experiment must hang off exactly one hypothesis
-- `experiment.open` once a hypothesis has a concrete base checkpoint and is ready to be tested
+- `experiment.open` once a hypothesis has a concrete slice and is ready to be tested
 - `experiment.list` or `experiment.read` when resuming a session and you need to recover open experimental state
 - `metric.define` when a project-level metric key needs a canonical unit, objective, or human description
 - `run.dimension.define` when a new experiment slicer such as `scenario` or `duration_s` becomes query-worthy
@@ -60,7 +60,7 @@ If you need more context, pull it from:
 - `metric.keys` before guessing which numeric signals are actually rankable; pass exact run-dimension filters when narrowing to one workload slice
 - `metric.best` when you need the best closed experiments by one numeric key; pass `order` for noncanonical payload fields and exact run-dimension filters when comparing one slice
 - `node.annotate` for scratch text that should stay off the main path
-- `experiment.close` only for an already-open experiment and only when you have checkpoint, measured result, note, and verdict; attach `analysis` when the result needs explicit interpretation
+- `experiment.close` only for an already-open experiment and only when you have measured result, note, and verdict; attach `analysis` when the result needs explicit interpretation
 - `node.archive` to hide stale detritus without deleting evidence
 - `node.create` only as a true escape hatch
@@ -94,7 +94,8 @@ If the point is to test a claim, it should become a hypothesis plus an experimen
    Preserve one source digest if needed, then extract reusable claims into notes.
 8. A hypothesis is not an experiment. Open the experiment explicitly; do not
    smuggle “planned work” into off-path prose.
-9. Porcelain is the terse triage surface. Use `detail=full` only when concise
+9. The ledger is scientific, not git-forensic. Do not treat commit hashes as experiment identity.
+10. Porcelain is the terse triage surface. Use `detail=full` only when concise
    output stops being decision-sufficient.
-10. When the task becomes a true indefinite optimization push, pair this skill
+11. When the task becomes a true indefinite optimization push, pair this skill
    with `frontier-loop`.
diff --git a/assets/codex-skills/frontier-loop/SKILL.md b/assets/codex-skills/frontier-loop/SKILL.md
index e8f1b94..45235f4 100644
--- a/assets/codex-skills/frontier-loop/SKILL.md
+++ b/assets/codex-skills/frontier-loop/SKILL.md
@@ -58,7 +58,7 @@ DO NOT EMIT A FINAL TURN UNLESS YOU ARE ACTUALLY BLOCKED OR EXPLICITLY TOLD TO S
 ASSUME YOU ARE RUNNING OVERNIGHT.
 
-1. Start from the current best checkpoint or most credible live branch.
+1. Start from the current best accepted line or most credible live branch.
 2. Study existing evidence from `fidget-spinner`.
 3. Search outward if the local frontier looks exhausted or you are starting to take unambitious strides.
 4. Form a strong, falsifiable hypothesis.
@@ -68,7 +68,7 @@ ASSUME YOU ARE RUNNING OVERNIGHT.
    rerun only enough to understand the outcome.
 8. Record the outcome through `fidget-spinner`.
 9. Keep the line if it advances the objective or opens a genuinely strong new avenue.
-10. If the line is dead, record that too, re-anchor to the best known checkpoint,
+10. If the line is dead, record that too, re-anchor to the best known accepted line,
    and try a different attack.
 11. Repeat.
diff --git a/crates/fidget-spinner-cli/src/main.rs b/crates/fidget-spinner-cli/src/main.rs
index 491e30d..f56e751 100644
--- a/crates/fidget-spinner-cli/src/main.rs
+++ b/crates/fidget-spinner-cli/src/main.rs
@@ -10,10 +10,10 @@ use std::path::{Path, PathBuf};
 use camino::{Utf8Path, Utf8PathBuf};
 use clap::{Args, Parser, Subcommand, ValueEnum};
 use fidget_spinner_core::{
-    AnnotationVisibility, CodeSnapshotRef, CommandRecipe, DiagnosticSeverity, ExecutionBackend,
-    FieldPresence, FieldRole, FieldValueType, FrontierContract, FrontierNote, FrontierVerdict,
-    GitCommitHash, InferencePolicy, MetricSpec, MetricUnit, MetricValue, NodeAnnotation, NodeClass,
-    NodePayload, NonEmptyText, OptimizationObjective, ProjectFieldSpec, TagName,
+    AnnotationVisibility, CommandRecipe, DiagnosticSeverity, ExecutionBackend, FieldPresence,
+    FieldRole, FieldValueType, FrontierContract, FrontierNote, FrontierVerdict, InferencePolicy,
+    MetricSpec, MetricUnit, MetricValue, NodeAnnotation, NodeClass, NodePayload, NonEmptyText,
+    OptimizationObjective, ProjectFieldSpec, TagName,
 };
 use fidget_spinner_store_sqlite::{
     CloseExperimentRequest, CreateFrontierRequest, CreateNodeRequest, DefineMetricRequest,
@@ -152,8 +152,6 @@ struct FrontierInitArgs {
     primary_metric_unit: CliMetricUnit,
     #[arg(long = "primary-metric-objective", value_enum)]
     primary_metric_objective: CliOptimizationObjective,
-    #[arg(long = "seed-summary", default_value = "initial champion checkpoint")]
-    seed_summary: String,
 }
 
 #[derive(Args)]
@@ -490,11 +488,11 @@ struct MetricBestArgs {
 
 #[derive(Subcommand)]
 enum ExperimentCommand {
-    /// Open a stateful experiment against one hypothesis and base checkpoint.
+    /// Open a stateful experiment against one hypothesis.
     Open(ExperimentOpenArgs),
     /// List open experiments, optionally narrowed to one frontier.
     List(ExperimentListArgs),
-    /// Close a core-path experiment with checkpoint, run, note, and verdict.
+    /// Close a core-path experiment with run data, note, and verdict.
     Close(Box<ExperimentCloseArgs>),
 }
 
@@ -518,8 +516,6 @@ struct ExperimentCloseArgs {
     project: ProjectArg,
     #[arg(long = "experiment")]
     experiment_id: String,
-    #[arg(long = "candidate-summary")]
-    candidate_summary: String,
     #[arg(long = "run-title")]
     run_title: String,
     #[arg(long = "run-summary")]
@@ -567,8 +563,6 @@ struct ExperimentOpenArgs {
     project: ProjectArg,
     #[arg(long)]
     frontier: String,
-    #[arg(long = "base-checkpoint")]
-    base_checkpoint: String,
     #[arg(long = "hypothesis-node")]
     hypothesis_node: String,
     #[arg(long)]
@@ -733,11 +727,10 @@ enum CliInferencePolicy {
 
 #[derive(Clone, Copy, Debug, Eq, PartialEq, ValueEnum)]
 enum CliFrontierVerdict {
-    PromoteToChampion,
-    KeepOnFrontier,
-    RevertToChampion,
-    ArchiveDeadEnd,
-    NeedsMoreEvidence,
+    Accepted,
+    Kept,
+    Parked,
+    Rejected,
 }
 
 fn main() {
@@ -834,8 +827,6 @@ fn run_init(args: InitArgs) -> Result<(), StoreError> {
 
 fn run_frontier_init(args: FrontierInitArgs) -> Result<(), StoreError> {
     let mut store = open_store(&args.project.project)?;
-    let initial_checkpoint =
-        store.auto_capture_checkpoint(NonEmptyText::new(args.seed_summary)?)?;
     let projection = store.create_frontier(CreateFrontierRequest {
         label: NonEmptyText::new(args.label)?,
         contract_title: NonEmptyText::new(args.contract_title)?,
@@ -853,7 +844,6 @@ fn run_frontier_init(args: FrontierInitArgs) -> Result<(), StoreError> {
             },
             promotion_criteria: to_text_vec(args.promotion_criteria)?,
         },
-        initial_checkpoint,
     })?;
     print_json(&projection)
 }
@@ -1131,7 +1121,6 @@ fn run_experiment_open(args: ExperimentOpenArgs) -> Result<(), StoreError> {
     let summary = args.summary.map(NonEmptyText::new).transpose()?;
     let experiment = store.open_experiment(OpenExperimentRequest {
         frontier_id: parse_frontier_id(&args.frontier)?,
-        base_checkpoint_id: parse_checkpoint_id(&args.base_checkpoint)?,
         hypothesis_node_id: parse_node_id(&args.hypothesis_node)?,
         title: NonEmptyText::new(args.title)?,
         summary,
@@ -1151,12 +1140,6 @@ fn run_experiment_list(args: ExperimentListArgs) -> Result<(), StoreError> {
 
 fn run_experiment_close(args: ExperimentCloseArgs) -> Result<(), StoreError> {
     let mut store = open_store(&args.project.project)?;
-    let snapshot = store
-        .auto_capture_checkpoint(NonEmptyText::new(args.candidate_summary.clone())?)?
-        .map(|seed| seed.snapshot)
-        .ok_or(StoreError::GitInspectionFailed(
-            store.project_root().to_path_buf(),
-        ))?;
     let command = CommandRecipe::new(
         args.working_directory
             .map(utf8_path)
@@ -1186,14 +1169,11 @@ fn run_experiment_close(args: ExperimentCloseArgs) -> Result<(), StoreError> {
     };
     let receipt = store.close_experiment(CloseExperimentRequest {
         experiment_id: parse_experiment_id(&args.experiment_id)?,
-        candidate_summary: NonEmptyText::new(args.candidate_summary)?,
-        candidate_snapshot: snapshot,
         run_title: NonEmptyText::new(args.run_title)?,
         run_summary: args.run_summary.map(NonEmptyText::new).transpose()?,
         backend: args.backend.into(),
         dimensions: coerce_cli_dimension_filters(&store, args.dimensions)?,
         command,
-        code_snapshot: Some(capture_code_snapshot(store.project_root())?),
         primary_metric: parse_metric_value(args.primary_metric)?,
         supporting_metrics: args
             .metrics
@@ -1539,31 +1519,6 @@ fn parse_node_class_set(classes: Vec<CliNodeClass>) -> BTreeSet<NodeClass> {
     classes.into_iter().map(Into::into).collect()
 }
 
-fn capture_code_snapshot(project_root: &Utf8Path) -> Result<CodeSnapshotRef, StoreError> {
-    let head_commit = run_git(project_root, &["rev-parse", "HEAD"])?;
-    let dirty_paths = run_git(project_root, &["status", "--porcelain"])?
-        .map(|status| {
-            status
-                .lines()
-                .filter_map(|line| line.get(3..).map(str::trim))
-                .filter(|line| !line.is_empty())
-                .map(Utf8PathBuf::from)
-                .collect::<BTreeSet<_>>()
-        })
-        .unwrap_or_default();
-    Ok(CodeSnapshotRef {
-        repo_root: run_git(project_root, &["rev-parse", "--show-toplevel"])?
-            .map(Utf8PathBuf::from)
-            .unwrap_or_else(|| project_root.to_path_buf()),
-        worktree_root: project_root.to_path_buf(),
-        worktree_name: run_git(project_root, &["rev-parse", "--abbrev-ref", "HEAD"])?
-            .map(NonEmptyText::new)
-            .transpose()?,
-        head_commit: head_commit.map(GitCommitHash::new).transpose()?,
-        dirty_paths,
-    })
-}
-
 fn run_git(project_root: &Utf8Path, args: &[&str]) -> Result<Option<String>, StoreError> {
     let output = std::process::Command::new("git")
         .arg("-C")
@@ -1702,12 +1657,6 @@ fn parse_frontier_id(raw: &str) -> Result<fidget_spinner_core::FrontierId, Store
     )?))
 }
 
-fn parse_checkpoint_id(raw: &str) -> Result<fidget_spinner_core::CheckpointId, StoreError> {
-    Ok(fidget_spinner_core::CheckpointId::from_uuid(
-        Uuid::parse_str(raw)?,
-    ))
-}
-
 fn parse_experiment_id(raw: &str) -> Result<fidget_spinner_core::ExperimentId, StoreError> {
     Ok(fidget_spinner_core::ExperimentId::from_uuid(
         Uuid::parse_str(raw)?,
@@ -1851,11 +1800,10 @@ impl From<CliInferencePolicy> for InferencePolicy {
 impl From<CliFrontierVerdict> for FrontierVerdict {
     fn from(value: CliFrontierVerdict) -> Self {
         match value {
-            CliFrontierVerdict::PromoteToChampion => Self::PromoteToChampion,
-            CliFrontierVerdict::KeepOnFrontier => Self::KeepOnFrontier,
-            CliFrontierVerdict::RevertToChampion => Self::RevertToChampion,
-            CliFrontierVerdict::ArchiveDeadEnd => Self::ArchiveDeadEnd,
-            CliFrontierVerdict::NeedsMoreEvidence => Self::NeedsMoreEvidence,
+            CliFrontierVerdict::Accepted => Self::Accepted,
+            CliFrontierVerdict::Kept => Self::Kept,
+            CliFrontierVerdict::Parked => Self::Parked,
+            CliFrontierVerdict::Rejected => Self::Rejected,
         }
     }
 }
diff --git a/crates/fidget-spinner-cli/src/mcp/catalog.rs b/crates/fidget-spinner-cli/src/mcp/catalog.rs
index 3b8abcc..ae3ca78 100644
--- a/crates/fidget-spinner-cli/src/mcp/catalog.rs
+++ b/crates/fidget-spinner-cli/src/mcp/catalog.rs
@@ -99,13 +99,13 @@ pub(crate) fn tool_spec(name: &str) -> Option<ToolSpec> {
         }),
         "frontier.status" => Some(ToolSpec {
             name: "frontier.status",
-            description: "Read one frontier projection, including champion and active candidates.",
+            description: "Read one frontier projection, including open/completed experiment counts and verdict totals.",
             dispatch: DispatchTarget::Worker,
             replay: ReplayContract::Convergent,
         }),
         "frontier.init" => Some(ToolSpec {
             name: "frontier.init",
-            description: "Create a new frontier rooted in a contract node. If the project is a git repo, the current HEAD becomes the initial champion when possible.",
+            description: "Create a new frontier rooted in a contract node.",
             dispatch: DispatchTarget::Worker,
             replay: ReplayContract::NeverReplay,
         }),
@@ -183,7 +183,7 @@ pub(crate) fn tool_spec(name: &str) -> Option<ToolSpec> {
         }),
         "metric.best" => Some(ToolSpec {
             name: "metric.best",
-            description: "Rank completed experiments by one numeric key, with optional run-dimension filters and candidate commit surfacing.",
+            description: "Rank completed experiments by one numeric key, with optional run-dimension filters.",
             dispatch: DispatchTarget::Worker,
             replay: ReplayContract::Convergent,
         }),
@@ -195,7 +195,7 @@ pub(crate) fn tool_spec(name: &str) -> Option<ToolSpec> {
         }),
         "experiment.open" => Some(ToolSpec {
             name: "experiment.open",
-            description: "Open a stateful experiment against one hypothesis and one base checkpoint.",
+            description: "Open a stateful experiment against one hypothesis.",
             dispatch: DispatchTarget::Worker,
             replay: ReplayContract::NeverReplay,
         }),
@@ -213,7 +213,7 @@ pub(crate) fn tool_spec(name: &str) -> Option<ToolSpec> {
         }),
         "experiment.close" => Some(ToolSpec {
             name: "experiment.close",
-            description: "Close one open experiment with typed run dimensions, preregistered metric observations, candidate checkpoint capture, optional analysis, note, and verdict.",
+            description: "Close one open experiment with typed run dimensions, preregistered metric observations, optional analysis, note, and verdict.",
             dispatch: DispatchTarget::Worker,
             replay: ReplayContract::NeverReplay,
         }),
@@ -562,12 +562,11 @@ fn input_schema(name: &str) -> Value {
            "type": "object",
            "properties": {
                "frontier_id": { "type": "string" },
-                "base_checkpoint_id": { "type": "string" },
                "hypothesis_node_id": { "type": "string" },
                "title": { "type": "string" },
                "summary": { "type": "string" }
            },
-            "required": ["frontier_id", "base_checkpoint_id", "hypothesis_node_id", "title"],
+            "required": ["frontier_id", "hypothesis_node_id", "title"],
            "additionalProperties": false
        }),
        "experiment.list" => json!({
@@ -589,7 +588,6 @@ fn input_schema(name: &str) -> Value {
            "type": "object",
            "properties": {
                "experiment_id": { "type": "string" },
-                "candidate_summary": { "type": "string" },
                "run": run_schema(),
                "primary_metric": metric_value_schema(),
                "supporting_metrics": { "type": "array", "items": metric_value_schema() },
@@ -601,7 +599,6 @@ fn input_schema(name: &str) -> Value {
            },
            "required": [
                "experiment_id",
-                "candidate_summary",
                "run",
                "primary_metric",
                "note",
@@ -753,11 +750,10 @@ fn verdict_schema() -> Value {
     json!({
        "type": "string",
        "enum": [
-            "promote_to_champion",
-            "keep_on_frontier",
-            "revert_to_champion",
-            "archive_dead_end",
-            "needs_more_evidence"
+            "accepted",
+            "kept",
+            "parked",
+            "rejected"
        ]
     })
 }
diff --git a/crates/fidget-spinner-cli/src/mcp/service.rs b/crates/fidget-spinner-cli/src/mcp/service.rs
index 05f2382..f0cca1e 100644
--- a/crates/fidget-spinner-cli/src/mcp/service.rs
+++ b/crates/fidget-spinner-cli/src/mcp/service.rs
@@ -3,11 +3,11 @@ use std::fs;
 
 use camino::{Utf8Path, Utf8PathBuf};
 use fidget_spinner_core::{
-    AdmissionState, AnnotationVisibility, CodeSnapshotRef, CommandRecipe, DiagnosticSeverity,
-    ExecutionBackend, FieldPresence, FieldRole, FieldValueType, FrontierContract, FrontierNote,
-    FrontierProjection, FrontierRecord, FrontierVerdict, InferencePolicy, MetricSpec, MetricUnit,
-    MetricValue, NodeAnnotation, NodeClass, NodePayload, NonEmptyText, ProjectFieldSpec,
-    ProjectSchema, RunDimensionValue, TagName, TagRecord,
+    AdmissionState, AnnotationVisibility, CommandRecipe, DiagnosticSeverity, ExecutionBackend,
+    FieldPresence, FieldRole, FieldValueType, FrontierContract, FrontierNote, FrontierProjection,
+    FrontierRecord, FrontierVerdict, InferencePolicy, MetricSpec, MetricUnit, MetricValue,
+    NodeAnnotation, NodeClass, NodePayload, NonEmptyText, ProjectFieldSpec, ProjectSchema,
+    RunDimensionValue, TagName, TagRecord,
 };
 use fidget_spinner_store_sqlite::{
     CloseExperimentRequest, CreateFrontierRequest, CreateNodeRequest, DefineMetricRequest,
@@ -203,16 +203,6 @@ impl WorkerService {
             }
             "frontier.init" => {
                 let args = deserialize::<FrontierInitToolArgs>(arguments)?;
-                let initial_checkpoint = self
-                    .store
-                    .auto_capture_checkpoint(
-                        NonEmptyText::new(
-                            args.seed_summary
-                                .unwrap_or_else(|| "initial champion checkpoint".to_owned()),
-                        )
-                        .map_err(store_fault("tools/call:frontier.init"))?,
-                    )
-                    .map_err(store_fault("tools/call:frontier.init"))?;
                 let projection = self
                     .store
                     .create_frontier(CreateFrontierRequest {
@@ -251,7 +241,6 @@ impl WorkerService {
                         promotion_criteria: crate::to_text_vec(args.promotion_criteria)
                             .map_err(store_fault("tools/call:frontier.init"))?,
                     },
-                    initial_checkpoint,
                 })
                 .map_err(store_fault("tools/call:frontier.init"))?;
                 tool_success(
@@ -702,8 +691,6 @@ impl WorkerService {
                     .open_experiment(OpenExperimentRequest {
                         frontier_id: crate::parse_frontier_id(&args.frontier_id)
                             .map_err(store_fault("tools/call:experiment.open"))?,
-                        base_checkpoint_id: crate::parse_checkpoint_id(&args.base_checkpoint_id)
-                            .map_err(store_fault("tools/call:experiment.open"))?,
                         hypothesis_node_id: crate::parse_node_id(&args.hypothesis_node_id)
                             .map_err(store_fault("tools/call:experiment.open"))?,
                         title: NonEmptyText::new(args.title)
@@ -763,33 +750,11 @@ impl WorkerService {
             }
             "experiment.close" => {
                 let args = deserialize::<ExperimentCloseToolArgs>(arguments)?;
-                let snapshot = self
-                    .store
-                    .auto_capture_checkpoint(
-                        NonEmptyText::new(args.candidate_summary.clone())
-                            .map_err(store_fault("tools/call:experiment.close"))?,
-                    )
-                    .map_err(store_fault("tools/call:experiment.close"))?
-                    .map(|seed| seed.snapshot)
-                    .ok_or_else(|| {
-                        FaultRecord::new(
-                            FaultKind::Internal,
-                            FaultStage::Store,
-                            "tools/call:experiment.close",
-                            format!(
-                                "git repository inspection failed for {}",
-                                self.store.project_root()
-                            ),
-                        )
-                    })?;
                 let receipt = self
                     .store
                     .close_experiment(CloseExperimentRequest {
                         experiment_id: crate::parse_experiment_id(&args.experiment_id)
                             .map_err(store_fault("tools/call:experiment.close"))?,
-                        candidate_summary: NonEmptyText::new(args.candidate_summary)
-                            .map_err(store_fault("tools/call:experiment.close"))?,
-                        candidate_snapshot: snapshot,
                         run_title: NonEmptyText::new(args.run.title)
                             .map_err(store_fault("tools/call:experiment.close"))?,
                         run_summary: args
@@ -810,10 +775,6 @@ impl WorkerService {
                             self.store.project_root(),
                         )
                         .map_err(store_fault("tools/call:experiment.close"))?,
-                        code_snapshot: Some(
-                            capture_code_snapshot(self.store.project_root())
-                                .map_err(store_fault("tools/call:experiment.close"))?,
-                        ),
                         primary_metric: metric_value_from_wire(args.primary_metric)
                             .map_err(store_fault("tools/call:experiment.close"))?,
                         supporting_metrics: args
@@ -1346,8 +1307,8 @@ fn experiment_close_output(
     let concise = json!({
        "experiment_id": receipt.experiment.id,
        "frontier_id": receipt.experiment.frontier_id,
-        "candidate_checkpoint_id": receipt.experiment.candidate_checkpoint_id,
-        "verdict": format!("{:?}", receipt.experiment.verdict).to_ascii_lowercase(),
+        "experiment_title": receipt.experiment.title,
+        "verdict": metric_verdict_name(receipt.experiment.verdict),
        "run_id": receipt.run.run_id,
        "hypothesis_node_id": receipt.experiment.hypothesis_node_id,
        "decision_node_id": receipt.decision_node.id,
@@ -1362,11 +1323,11 @@ fn experiment_close_output(
         format!(
             "closed experiment {} on frontier {}",
             receipt.experiment.id, receipt.experiment.frontier_id
         ),
+        format!("title: {}", receipt.experiment.title),
         format!("hypothesis: {}", receipt.experiment.hypothesis_node_id),
-        format!("candidate: {}", receipt.experiment.candidate_checkpoint_id),
         format!(
             "verdict: {}",
-            format!("{:?}", receipt.experiment.verdict).to_ascii_lowercase()
+            metric_verdict_name(receipt.experiment.verdict)
         ),
         format!(
             "primary metric: {}",
@@ -1393,7 +1354,6 @@ fn experiment_open_output(
     let concise = json!({
        "experiment_id": item.id,
        "frontier_id": item.frontier_id,
-        "base_checkpoint_id": item.base_checkpoint_id,
        "hypothesis_node_id": item.hypothesis_node_id,
        "title": item.title,
        "summary": item.summary,
@@ -1405,7 +1365,6 @@ fn experiment_open_output(
         format!("{action} {}", item.id),
         format!("frontier: {}", item.frontier_id),
         format!("hypothesis: {}", item.hypothesis_node_id),
-        format!("base checkpoint: {}", item.base_checkpoint_id),
         format!("title: {}", item.title),
         item.summary
             .as_ref()
@@ -1426,7 +1385,6 @@ fn experiment_list_output(items: &[OpenExperimentSummary]) -> Result<ToolOutput,
             json!({
                "experiment_id": item.id,
                "frontier_id": item.frontier_id,
-                "base_checkpoint_id": item.base_checkpoint_id,
                "hypothesis_node_id": item.hypothesis_node_id,
                "title": item.title,
                "summary": item.summary,
@@ -1436,8 +1394,8 @@ fn experiment_list_output(items: &[OpenExperimentSummary]) -> Result<ToolOutput,
     let mut lines = vec![format!("{} open experiment(s)", items.len())];
     lines.extend(items.iter().map(|item| {
         format!(
-            "{} {} | hypothesis={} | checkpoint={}",
-            item.id, item.title, item.hypothesis_node_id, item.base_checkpoint_id,
+            "{} {} | hypothesis={}",
+            item.id, item.title, item.hypothesis_node_id,
         )
     }));
     detailed_tool_output(
@@ -1511,12 +1469,11 @@ fn metric_best_output(
                "value": item.value,
                "order": item.order.as_str(),
                "experiment_id": item.experiment_id,
+                "experiment_title": item.experiment_title,
                "frontier_id": item.frontier_id,
                "hypothesis_node_id": item.hypothesis_node_id,
                "hypothesis_title": item.hypothesis_title,
                "verdict": metric_verdict_name(item.verdict),
-                "candidate_checkpoint_id": item.candidate_checkpoint_id,
-                "candidate_commit_hash": item.candidate_commit_hash,
                "run_id": item.run_id,
                "unit": item.unit.map(metric_unit_name),
                "objective": item.objective.map(metric_objective_name),
@@ -1527,15 +1484,14 @@ fn metric_best_output(
     let mut lines = vec![format!("{} ranked experiment(s)", items.len())];
     lines.extend(items.iter().enumerate().map(|(index, item)| {
         format!(
-            "{}. {}={} [{}] {} | verdict={} | commit={} | checkpoint={}",
+            "{}. {}={} [{}] {} | verdict={} | hypothesis={}",
             index + 1,
             item.key,
             item.value,
             item.source.as_str(),
-            item.hypothesis_title,
+            item.experiment_title,
             metric_verdict_name(item.verdict),
-            item.candidate_commit_hash,
-            item.candidate_checkpoint_id,
+            item.hypothesis_title,
         )
     }));
     lines.extend(
@@ -1668,17 +1624,13 @@ fn frontier_projection_summary_value(projection: &FrontierProjection) -> Value {
        "frontier_id": projection.frontier.id,
        "label": projection.frontier.label,
        "status": format!("{:?}", projection.frontier.status).to_ascii_lowercase(),
-        "champion_checkpoint_id": projection.champion_checkpoint_id,
-        "candidate_checkpoint_ids": projection.candidate_checkpoint_ids,
-        "experiment_count": projection.experiment_count,
+        "open_experiment_count": projection.open_experiment_count,
+        "completed_experiment_count": projection.completed_experiment_count,
+        "verdict_counts": projection.verdict_counts,
     })
 }
 
 fn frontier_projection_text(prefix: &str, projection: &FrontierProjection) -> String {
-    let champion = projection
-        .champion_checkpoint_id
-        .map(|value| value.to_string())
-        .unwrap_or_else(|| "none".to_owned());
     [
         format!(
             "{prefix} {} {}",
@@ -1688,9 +1640,18 @@ fn frontier_projection_text(prefix: &str, projection: &FrontierProjection) -> St
             "status: {}",
             format!("{:?}", projection.frontier.status).to_ascii_lowercase()
         ),
-        format!("champion: {champion}"),
-        format!("candidates: {}", projection.candidate_checkpoint_ids.len()),
-        format!("experiments: {}", projection.experiment_count),
+        format!("open experiments: {}", projection.open_experiment_count),
+        format!(
+            "completed experiments: {}",
+            projection.completed_experiment_count
+        ),
+        format!(
+            "verdicts: accepted={} kept={} parked={} rejected={}",
+            projection.verdict_counts.accepted,
+            projection.verdict_counts.kept,
+            projection.verdict_counts.parked,
+            projection.verdict_counts.rejected,
+        ),
     ]
     .join("\n")
 }
@@ -1991,11 +1952,10 @@ fn metric_objective_name(objective: fidget_spinner_core::OptimizationObjective)
 
 fn metric_verdict_name(verdict: FrontierVerdict) -> &'static str {
     match verdict {
-        FrontierVerdict::PromoteToChampion => "promote_to_champion",
-        FrontierVerdict::KeepOnFrontier => "keep_on_frontier",
-        FrontierVerdict::RevertToChampion => "revert_to_champion",
-        FrontierVerdict::ArchiveDeadEnd => "archive_dead_end",
-        FrontierVerdict::NeedsMoreEvidence => "needs_more_evidence",
+        FrontierVerdict::Accepted => "accepted",
+        FrontierVerdict::Kept => "kept",
+        FrontierVerdict::Parked => "parked",
+        FrontierVerdict::Rejected => "rejected",
     }
 }
 
@@ -2192,10 +2152,6 @@ fn command_recipe_from_wire(
         .map_err(StoreError::from)
 }
 
-fn capture_code_snapshot(project_root: &Utf8Path) -> Result<CodeSnapshotRef, StoreError> {
-    crate::capture_code_snapshot(project_root)
-}
-
 fn parse_node_class_name(raw: &str) -> Result<NodeClass, StoreError> {
     match raw {
         "contract" => Ok(NodeClass::Contract),
@@ -2311,11 +2267,10 @@ fn parse_backend_name(raw: &str) -> Result<ExecutionBackend, StoreError> {
 
 fn parse_verdict_name(raw: &str) -> Result<FrontierVerdict, StoreError> {
     match raw {
-        "promote_to_champion" => Ok(FrontierVerdict::PromoteToChampion),
-        "keep_on_frontier" => Ok(FrontierVerdict::KeepOnFrontier),
-        "revert_to_champion" => Ok(FrontierVerdict::RevertToChampion),
-        "archive_dead_end" => Ok(FrontierVerdict::ArchiveDeadEnd),
-        "needs_more_evidence" => Ok(FrontierVerdict::NeedsMoreEvidence),
+        "accepted" => Ok(FrontierVerdict::Accepted),
+        "kept" => Ok(FrontierVerdict::Kept),
+        "parked" => Ok(FrontierVerdict::Parked),
+        "rejected" => Ok(FrontierVerdict::Rejected),
         other => Err(crate::invalid_input(format!("unknown verdict `{other}`"))),
     }
 }
@@ -2342,7 +2297,6 @@ struct FrontierInitToolArgs {
     primary_metric: WireMetricSpec,
     #[serde(default)]
     supporting_metrics: Vec<WireMetricSpec>,
-    seed_summary: Option<String>,
 }
 
 #[derive(Debug, Deserialize)]
@@ -2480,7 +2434,6 @@ struct MetricBestToolArgs {
 
 #[derive(Debug, Deserialize)]
 struct ExperimentOpenToolArgs {
     frontier_id: String,
-    base_checkpoint_id: String,
     hypothesis_node_id: String,
     title: String,
     summary: Option<String>,
@@ -2499,7 +2452,6 @@ struct ExperimentReadToolArgs {
 
 #[derive(Debug, Deserialize)]
 struct ExperimentCloseToolArgs {
     experiment_id: String,
-    candidate_summary: String,
     run: WireRun,
     primary_metric: WireMetricValue,
     #[serde(default)]
diff --git a/crates/fidget-spinner-cli/tests/mcp_hardening.rs b/crates/fidget-spinner-cli/tests/mcp_hardening.rs
index 0142b77..21a3d04 100644
--- a/crates/fidget-spinner-cli/tests/mcp_hardening.rs
+++ b/crates/fidget-spinner-cli/tests/mcp_hardening.rs
@@ -57,48 +57,6 @@ fn init_project(root: &Utf8PathBuf) -> TestResult {
     Ok(())
 }
 
-fn run_command(root: &Utf8PathBuf, program: &str, args: &[&str]) -> TestResult<String> {
-    let output = must(
-        Command::new(program)
-            .current_dir(root.as_std_path())
-            .args(args)
-            .output(),
-        format!("{program} spawn"),
-    )?;
-    if !output.status.success() {
-        return Err(io::Error::other(format!(
-            "{program} {:?} failed: {}",
-            args,
-            String::from_utf8_lossy(&output.stderr)
-        ))
-        .into());
-    }
-    Ok(String::from_utf8_lossy(&output.stdout).trim().to_owned())
-}
-
-fn run_git(root: &Utf8PathBuf, args: &[&str]) -> TestResult<String> {
-    run_command(root, "git", args)
-}
-
-fn init_git_project(root: &Utf8PathBuf) -> TestResult<String> {
-    let _ = run_git(root, &["init", "-b", "main"])?;
-    let _ = run_git(root, &["config", "user.name", "main"])?;
-    let _ = run_git(root, &["config", "user.email", "main@swarm.moe"])?;
-    let _ = run_git(root, &["add", "-A"])?;
-    let _ = run_git(root, &["commit", "-m", "initial state"])?;
-    run_git(root, &["rev-parse", "HEAD"])
-}
-
-fn commit_project_state(root: &Utf8PathBuf, marker: &str, message: &str) -> TestResult<String> {
-    must(
-        fs::write(root.join(marker).as_std_path(), message),
-        format!("write marker {marker}"),
-    )?;
-    let _ = run_git(root, &["add", "-A"])?;
-    let _ = run_git(root, &["commit", "-m", message])?;
-    run_git(root, &["rev-parse", "HEAD"])
-}
-
 fn binary_path() -> PathBuf {
     PathBuf::from(env!("CARGO_BIN_EXE_fidget-spinner-cli"))
 }
@@ -688,7 +646,7 @@ fn tag_registry_drives_note_creation_and_lookup() -> TestResult {
 }
 
 #[test]
-fn research_record_accepts_tags_and_filtering() -> TestResult {
+fn source_record_accepts_tags_and_filtering() -> TestResult {
     let project_root = temp_project_root("research_tags")?;
     init_project(&project_root)?;
@@ -721,10 +679,7 @@ fn source_record_accepts_tags_and_filtering() -> TestResult {
     assert_eq!(research["result"]["isError"].as_bool(), Some(false));
 
     let filtered = harness.call_tool(454, "node.list", json!({"tags": ["campaign/libgrid"]}))?;
-    let nodes = must_some(
-        tool_content(&filtered).as_array(),
-        "filtered research nodes",
-    )?;
+    let nodes = must_some(tool_content(&filtered).as_array(), "filtered source nodes")?;
     assert_eq!(nodes.len(), 1);
     assert_eq!(nodes[0]["class"].as_str(), Some("source"));
     assert_eq!(nodes[0]["tags"][0].as_str(), Some("campaign/libgrid"));
@@ -760,20 +715,20 @@ fn prose_tools_reject_invalid_shapes_over_mcp() -> TestResult {
         .is_some_and(|message| message.contains("summary") || message.contains("missing field"))
     );
 
-    let missing_research_summary = harness.call_tool(
+    let missing_source_summary = harness.call_tool(
         48,
         "source.record",
         json!({
-            "title": "research only",
+            "title": "source only",
            "body": "body only",
         }),
     )?;
     assert_eq!(
-        missing_research_summary["result"]["isError"].as_bool(),
+        missing_source_summary["result"]["isError"].as_bool(),
         Some(true)
     );
     assert!(
-        fault_message(&missing_research_summary)
+        fault_message(&missing_source_summary)
             .is_some_and(|message| message.contains("summary") || message.contains("missing field"))
     );
@@ -794,7 +749,7 @@ fn prose_tools_reject_invalid_shapes_over_mcp() -> TestResult {
             .is_some_and(|message| message.contains("payload field `body`"))
     );
 
-    let research_without_summary = harness.call_tool(
+    let source_without_summary = harness.call_tool(
         50,
         "node.create",
         json!({
 ...
         }),
     )?;
     assert_eq!(
-        research_without_summary["result"]["isError"].as_bool(),
+        source_without_summary["result"]["isError"].as_bool(),
         Some(true)
     );
     assert!(
-        fault_message(&research_without_summary)
+        fault_message(&source_without_summary)
             .is_some_and(|message| message.contains("non-empty summary"))
     );
     Ok(())
@@ -885,11 +840,8 @@ fn concise_prose_reads_only_surface_payload_field_names() -> TestResult {
         }),
     )?;
     assert_eq!(research["result"]["isError"].as_bool(), Some(false));
-    let node_id = must_some(
-        tool_content(&research)["id"].as_str(),
-        "created research id",
-    )?
- .to_owned(); + let node_id = + must_some(tool_content(&research)["id"].as_str(), "created source id")?.to_owned(); let concise = harness.call_tool(533, "node.read", json!({ "node_id": node_id }))?; let concise_structured = tool_content(&concise); @@ -1043,7 +995,7 @@ fn bind_open_backfills_legacy_missing_summary() -> TestResult { store.add_node(fidget_spinner_store_sqlite::CreateNodeRequest { class: fidget_spinner_core::NodeClass::Source, frontier_id: None, - title: must(NonEmptyText::new("legacy research"), "legacy title")?, + title: must(NonEmptyText::new("legacy source"), "legacy title")?, summary: Some(must( NonEmptyText::new("temporary summary"), "temporary summary", @@ -1059,7 +1011,7 @@ fn bind_open_backfills_legacy_missing_summary() -> TestResult { annotations: Vec::new(), attachments: Vec::new(), }), - "create legacy research node", + "create legacy source node", )?; node.id.to_string() }; @@ -1097,7 +1049,7 @@ fn bind_open_backfills_legacy_missing_summary() -> TestResult { ); let listed = harness.call_tool(62, "node.list", json!({ "class": "source" }))?; - let items = must_some(tool_content(&listed).as_array(), "research node list")?; + let items = must_some(tool_content(&listed).as_array(), "source node list")?; assert_eq!(items.len(), 1); assert_eq!( items[0]["summary"].as_str(), @@ -1110,7 +1062,6 @@ fn bind_open_backfills_legacy_missing_summary() -> TestResult { fn metric_tools_rank_closed_experiments_and_enforce_disambiguation() -> TestResult { let project_root = temp_project_root("metric_rank_e2e")?; init_project(&project_root)?; - let _initial_head = init_git_project(&project_root)?; let mut harness = McpHarness::spawn(Some(&project_root), &[])?; let _ = harness.initialize()?; @@ -1138,11 +1089,6 @@ fn metric_tools_rank_closed_experiments_and_enforce_disambiguation() -> TestResu "frontier id", )? .to_owned(); - let base_checkpoint_id = must_some( - tool_content(&frontier)["champion_checkpoint_id"].as_str(), - "base checkpoint id", - )? 
- .to_owned(); let metric_define = harness.call_tool( 701, "metric.define", @@ -1222,7 +1168,6 @@ fn metric_tools_rank_closed_experiments_and_enforce_disambiguation() -> TestResu "experiment.open", json!({ "frontier_id": frontier_id, - "base_checkpoint_id": base_checkpoint_id, "hypothesis_node_id": first_change_id, "title": "first experiment", "summary": "first experiment summary" @@ -1233,14 +1178,12 @@ fn metric_tools_rank_closed_experiments_and_enforce_disambiguation() -> TestResu tool_content(&first_experiment)["experiment_id"].as_str(), "first experiment id", )?; - let _first_commit = commit_project_state(&project_root, "candidate-one.txt", "candidate one")?; let first_close = harness.call_tool( 72, "experiment.close", json!({ "experiment_id": first_experiment_id, - "candidate_summary": "candidate one", "run": { "title": "first run", "summary": "first run summary", @@ -1262,19 +1205,13 @@ fn metric_tools_rank_closed_experiments_and_enforce_disambiguation() -> TestResu "note": { "summary": "first run note" }, - "verdict": "keep_on_frontier", + "verdict": "kept", "decision_title": "first decision", "decision_rationale": "keep first candidate around" }), )?; assert_eq!(first_close["result"]["isError"].as_bool(), Some(false)); - let first_candidate_checkpoint_id = must_some( - tool_content(&first_close)["candidate_checkpoint_id"].as_str(), - "first candidate checkpoint id", - )? 
- .to_owned(); - let second_change = harness.call_tool( 73, "node.create", @@ -1299,7 +1236,6 @@ fn metric_tools_rank_closed_experiments_and_enforce_disambiguation() -> TestResu "experiment.open", json!({ "frontier_id": frontier_id, - "base_checkpoint_id": base_checkpoint_id, "hypothesis_node_id": second_change_id, "title": "second experiment", "summary": "second experiment summary" @@ -1313,14 +1249,12 @@ fn metric_tools_rank_closed_experiments_and_enforce_disambiguation() -> TestResu tool_content(&second_experiment)["experiment_id"].as_str(), "second experiment id", )?; - let second_commit = commit_project_state(&project_root, "candidate-two.txt", "candidate two")?; let second_close = harness.call_tool( 74, "experiment.close", json!({ "experiment_id": second_experiment_id, - "candidate_summary": "candidate two", "run": { "title": "second run", "summary": "second run summary", @@ -1342,17 +1276,12 @@ fn metric_tools_rank_closed_experiments_and_enforce_disambiguation() -> TestResu "note": { "summary": "second run note" }, - "verdict": "keep_on_frontier", + "verdict": "kept", "decision_title": "second decision", "decision_rationale": "second candidate looks stronger" }), )?; assert_eq!(second_close["result"]["isError"].as_bool(), Some(false)); - let second_candidate_checkpoint_id = must_some( - tool_content(&second_close)["candidate_checkpoint_id"].as_str(), - "second candidate checkpoint id", - )? - .to_owned(); let second_frontier = harness.call_tool( 80, @@ -1376,11 +1305,6 @@ fn metric_tools_rank_closed_experiments_and_enforce_disambiguation() -> TestResu "second frontier id", )? .to_owned(); - let second_base_checkpoint_id = must_some( - tool_content(&second_frontier)["champion_checkpoint_id"].as_str(), - "second frontier base checkpoint id", - )? 
- .to_owned(); let third_change = harness.call_tool( 81, @@ -1406,7 +1330,6 @@ fn metric_tools_rank_closed_experiments_and_enforce_disambiguation() -> TestResu "experiment.open", json!({ "frontier_id": second_frontier_id, - "base_checkpoint_id": second_base_checkpoint_id, "hypothesis_node_id": third_change_id, "title": "third experiment", "summary": "third experiment summary" @@ -1417,15 +1340,12 @@ fn metric_tools_rank_closed_experiments_and_enforce_disambiguation() -> TestResu tool_content(&third_experiment)["experiment_id"].as_str(), "third experiment id", )?; - let third_commit = - commit_project_state(&project_root, "candidate-three.txt", "candidate three")?; let third_close = harness.call_tool( 82, "experiment.close", json!({ "experiment_id": third_experiment_id, - "candidate_summary": "candidate three", "run": { "title": "third run", "summary": "third run summary", @@ -1447,17 +1367,12 @@ fn metric_tools_rank_closed_experiments_and_enforce_disambiguation() -> TestResu "note": { "summary": "third run note" }, - "verdict": "keep_on_frontier", + "verdict": "kept", "decision_title": "third decision", "decision_rationale": "third candidate is best overall but not in the first frontier" }), )?; assert_eq!(third_close["result"]["isError"].as_bool(), Some(false)); - let third_candidate_checkpoint_id = must_some( - tool_content(&third_close)["candidate_checkpoint_id"].as_str(), - "third candidate checkpoint id", - )? 
- .to_owned(); let keys = harness.call_tool(75, "metric.keys", json!({}))?; assert_eq!(keys["result"]["isError"].as_bool(), Some(false)); @@ -1524,13 +1439,10 @@ fn metric_tools_rank_closed_experiments_and_enforce_disambiguation() -> TestResu assert_eq!(run_best_rows[0]["value"].as_f64(), Some(5.0)); assert_eq!(run_best_rows.len(), 1); assert_eq!( - run_best_rows[0]["candidate_checkpoint_id"].as_str(), - Some(second_candidate_checkpoint_id.as_str()) - ); - assert_eq!( - run_best_rows[0]["candidate_commit_hash"].as_str(), - Some(second_commit.as_str()) + run_best_rows[0]["experiment_title"].as_str(), + Some("second experiment") ); + assert_eq!(run_best_rows[0]["verdict"].as_str(), Some("kept")); assert_eq!( run_best_rows[0]["dimensions"]["scenario"].as_str(), Some("belt_4x5") @@ -1539,7 +1451,9 @@ fn metric_tools_rank_closed_experiments_and_enforce_disambiguation() -> TestResu run_best_rows[0]["dimensions"]["duration_s"].as_f64(), Some(60.0) ); - assert!(must_some(tool_text(&run_metric_best), "run metric best text")?.contains("commit=")); + assert!( + must_some(tool_text(&run_metric_best), "run metric best text")?.contains("hypothesis=") + ); assert!(must_some(tool_text(&run_metric_best), "run metric best text")?.contains("dims:")); let payload_requires_order = harness.call_tool( @@ -1580,12 +1494,8 @@ fn metric_tools_rank_closed_experiments_and_enforce_disambiguation() -> TestResu assert_eq!(payload_best_rows[0]["value"].as_f64(), Some(7.0)); assert_eq!(payload_best_rows.len(), 1); assert_eq!( - payload_best_rows[0]["candidate_checkpoint_id"].as_str(), - Some(second_candidate_checkpoint_id.as_str()) - ); - assert_eq!( - payload_best_rows[0]["candidate_commit_hash"].as_str(), - Some(second_commit.as_str()) + payload_best_rows[0]["experiment_title"].as_str(), + Some("second experiment") ); let filtered_best = harness.call_tool( @@ -1608,8 +1518,8 @@ fn metric_tools_rank_closed_experiments_and_enforce_disambiguation() -> TestResu )?; assert_eq!(filtered_rows.len(), 
2); assert_eq!( - filtered_rows[0]["candidate_checkpoint_id"].as_str(), - Some(second_candidate_checkpoint_id.as_str()) + filtered_rows[0]["experiment_title"].as_str(), + Some("second experiment") ); assert!( filtered_rows @@ -1632,12 +1542,12 @@ fn metric_tools_rank_closed_experiments_and_enforce_disambiguation() -> TestResu "global metric best array", )?; assert_eq!( - global_rows[0]["candidate_checkpoint_id"].as_str(), - Some(third_candidate_checkpoint_id.as_str()) + global_rows[0]["experiment_title"].as_str(), + Some("third experiment") ); assert_eq!( - global_rows[0]["candidate_commit_hash"].as_str(), - Some(third_commit.as_str()) + global_rows[0]["frontier_id"].as_str(), + Some(second_frontier_id.as_str()) ); let migrate = harness.call_tool(85, "metric.migrate", json!({}))?; @@ -1654,14 +1564,5 @@ fn metric_tools_rank_closed_experiments_and_enforce_disambiguation() -> TestResu tool_content(&migrate)["inserted_dimension_values"].as_u64(), Some(0) ); - - assert_ne!( - first_candidate_checkpoint_id, - second_candidate_checkpoint_id - ); - assert_ne!( - second_candidate_checkpoint_id, - third_candidate_checkpoint_id - ); Ok(()) } diff --git a/crates/fidget-spinner-core/src/id.rs b/crates/fidget-spinner-core/src/id.rs index ea2cd5a..7f696a3 100644 --- a/crates/fidget-spinner-core/src/id.rs +++ b/crates/fidget-spinner-core/src/id.rs @@ -39,7 +39,6 @@ macro_rules! 
define_id { define_id!(AgentSessionId); define_id!(AnnotationId); define_id!(ArtifactId); -define_id!(CheckpointId); define_id!(ExperimentId); define_id!(FrontierId); define_id!(NodeId); diff --git a/crates/fidget-spinner-core/src/lib.rs b/crates/fidget-spinner-core/src/lib.rs index 3c9aaac..1c4108a 100644 --- a/crates/fidget-spinner-core/src/lib.rs +++ b/crates/fidget-spinner-core/src/lib.rs @@ -12,17 +12,16 @@ mod model; pub use crate::error::CoreError; pub use crate::id::{ - AgentSessionId, AnnotationId, ArtifactId, CheckpointId, ExperimentId, FrontierId, NodeId, RunId, + AgentSessionId, AnnotationId, ArtifactId, ExperimentId, FrontierId, NodeId, RunId, }; pub use crate::model::{ - AdmissionState, AnnotationVisibility, ArtifactKind, ArtifactRef, CheckpointDisposition, - CheckpointRecord, CheckpointSnapshotRef, CodeSnapshotRef, CommandRecipe, CompletedExperiment, - DagEdge, DagNode, DiagnosticSeverity, EdgeKind, EvaluationProtocol, ExecutionBackend, - ExperimentResult, FieldPresence, FieldRole, FieldValueType, FrontierContract, FrontierNote, - FrontierProjection, FrontierRecord, FrontierStatus, FrontierVerdict, GitCommitHash, - InferencePolicy, JsonObject, MetricDefinition, MetricObservation, MetricSpec, MetricUnit, - MetricValue, NodeAnnotation, NodeClass, NodeDiagnostics, NodePayload, NodeTrack, NonEmptyText, - OpenExperiment, OptimizationObjective, PayloadSchemaRef, ProjectFieldSpec, ProjectSchema, - RunDimensionDefinition, RunDimensionValue, RunRecord, RunStatus, TagName, TagRecord, - ValidationDiagnostic, + AdmissionState, AnnotationVisibility, ArtifactKind, ArtifactRef, CommandRecipe, + CompletedExperiment, DagEdge, DagNode, DiagnosticSeverity, EdgeKind, EvaluationProtocol, + ExecutionBackend, ExperimentResult, FieldPresence, FieldRole, FieldValueType, FrontierContract, + FrontierNote, FrontierProjection, FrontierRecord, FrontierStatus, FrontierVerdict, + FrontierVerdictCounts, InferencePolicy, JsonObject, MetricDefinition, MetricObservation, + MetricSpec, 
MetricUnit, MetricValue, NodeAnnotation, NodeClass, NodeDiagnostics, NodePayload, + NodeTrack, NonEmptyText, OpenExperiment, OptimizationObjective, PayloadSchemaRef, + ProjectFieldSpec, ProjectSchema, RunDimensionDefinition, RunDimensionValue, RunRecord, + RunStatus, TagName, TagRecord, ValidationDiagnostic, }; diff --git a/crates/fidget-spinner-core/src/model.rs b/crates/fidget-spinner-core/src/model.rs index 170f49c..88050a2 100644 --- a/crates/fidget-spinner-core/src/model.rs +++ b/crates/fidget-spinner-core/src/model.rs @@ -8,8 +8,7 @@ use time::OffsetDateTime; use time::format_description::well_known::Rfc3339; use crate::{ - AgentSessionId, AnnotationId, ArtifactId, CheckpointId, CoreError, ExperimentId, FrontierId, - NodeId, RunId, + AgentSessionId, AnnotationId, ArtifactId, CoreError, ExperimentId, FrontierId, NodeId, RunId, }; #[derive(Clone, Debug, Deserialize, Eq, Ord, PartialEq, PartialOrd, Serialize)] @@ -37,27 +36,6 @@ impl Display for NonEmptyText { } } -#[derive(Clone, Debug, Deserialize, Eq, Ord, PartialEq, PartialOrd, Serialize)] -#[serde(transparent)] -pub struct GitCommitHash(NonEmptyText); - -impl GitCommitHash { - pub fn new(value: impl Into<String>) -> Result<Self, CoreError> { - NonEmptyText::new(value).map(Self) - } - - #[must_use] - pub fn as_str(&self) -> &str { - self.0.as_str() - } -} - -impl Display for GitCommitHash { - fn fmt(&self, formatter: &mut Formatter<'_>) -> fmt::Result { - Display::fmt(&self.0, formatter) - } -} - #[derive(Clone, Debug, Eq, Ord, PartialEq, PartialOrd, Serialize, Deserialize)] #[serde(try_from = "String", into = "String")] pub struct TagName(String); @@ -286,15 +264,6 @@ pub enum FrontierStatus { Archived, } -#[derive(Clone, Copy, Debug, Deserialize, Eq, PartialEq, Serialize)] -pub enum CheckpointDisposition { - Champion, - FrontierCandidate, - Baseline, - DeadEnd, - Archived, -} - #[derive(Clone, Copy, Debug, Deserialize, Eq, Ord, PartialEq, PartialOrd, Serialize)] pub enum MetricUnit { Seconds, @@ -418,12 
+387,12 @@ pub enum ExecutionBackend { } #[derive(Clone, Copy, Debug, Deserialize, Eq, PartialEq, Serialize)] +#[serde(rename_all = "snake_case")] pub enum FrontierVerdict { - PromoteToChampion, - KeepOnFrontier, - RevertToChampion, - ArchiveDeadEnd, - NeedsMoreEvidence, + Accepted, + Kept, + Parked, + Rejected, } #[derive(Clone, Copy, Debug, Deserialize, Eq, PartialEq, Serialize)] @@ -732,23 +701,6 @@ pub struct ArtifactRef { } #[derive(Clone, Debug, Deserialize, Eq, PartialEq, Serialize)] -pub struct CodeSnapshotRef { - pub repo_root: Utf8PathBuf, - pub worktree_root: Utf8PathBuf, - pub worktree_name: Option<NonEmptyText>, - pub head_commit: Option<GitCommitHash>, - pub dirty_paths: BTreeSet<Utf8PathBuf>, -} - -#[derive(Clone, Debug, Deserialize, Eq, PartialEq, Serialize)] -pub struct CheckpointSnapshotRef { - pub repo_root: Utf8PathBuf, - pub worktree_root: Utf8PathBuf, - pub worktree_name: Option<NonEmptyText>, - pub commit_hash: GitCommitHash, -} - -#[derive(Clone, Debug, Deserialize, Eq, PartialEq, Serialize)] pub struct CommandRecipe { pub working_directory: Utf8PathBuf, pub argv: Vec<NonEmptyText>, @@ -831,17 +783,6 @@ impl FrontierRecord { } } -#[derive(Clone, Debug, Deserialize, Eq, PartialEq, Serialize)] -pub struct CheckpointRecord { - pub id: CheckpointId, - pub frontier_id: FrontierId, - pub node_id: NodeId, - pub snapshot: CheckpointSnapshotRef, - pub disposition: CheckpointDisposition, - pub summary: NonEmptyText, - pub created_at: OffsetDateTime, -} - #[derive(Clone, Debug, Deserialize, PartialEq, Serialize)] pub struct RunRecord { pub node_id: NodeId, @@ -849,7 +790,6 @@ pub struct RunRecord { pub frontier_id: Option<FrontierId>, pub status: RunStatus, pub backend: ExecutionBackend, - pub code_snapshot: Option<CodeSnapshotRef>, pub dimensions: BTreeMap<NonEmptyText, RunDimensionValue>, pub command: CommandRecipe, pub started_at: Option<OffsetDateTime>, @@ -868,7 +808,6 @@ pub struct ExperimentResult { pub struct OpenExperiment { pub id: 
ExperimentId, pub frontier_id: FrontierId, - pub base_checkpoint_id: CheckpointId, pub hypothesis_node_id: NodeId, pub title: NonEmptyText, pub summary: Option<NonEmptyText>, @@ -885,8 +824,6 @@ pub struct FrontierNote { pub struct CompletedExperiment { pub id: ExperimentId, pub frontier_id: FrontierId, - pub base_checkpoint_id: CheckpointId, - pub candidate_checkpoint_id: CheckpointId, pub hypothesis_node_id: NodeId, pub run_node_id: NodeId, pub run_id: RunId, @@ -900,12 +837,20 @@ pub struct CompletedExperiment { pub created_at: OffsetDateTime, } +#[derive(Clone, Debug, Default, Deserialize, Eq, PartialEq, Serialize)] +pub struct FrontierVerdictCounts { + pub accepted: u64, + pub kept: u64, + pub parked: u64, + pub rejected: u64, +} + #[derive(Clone, Debug, Deserialize, Eq, PartialEq, Serialize)] pub struct FrontierProjection { pub frontier: FrontierRecord, - pub champion_checkpoint_id: Option<CheckpointId>, - pub candidate_checkpoint_ids: BTreeSet<CheckpointId>, - pub experiment_count: u64, + pub open_experiment_count: u64, + pub completed_experiment_count: u64, + pub verdict_counts: FrontierVerdictCounts, } #[cfg(test)] diff --git a/crates/fidget-spinner-store-sqlite/src/lib.rs b/crates/fidget-spinner-store-sqlite/src/lib.rs index 1862590..bbe7038 100644 --- a/crates/fidget-spinner-store-sqlite/src/lib.rs +++ b/crates/fidget-spinner-store-sqlite/src/lib.rs @@ -3,18 +3,16 @@ use std::collections::{BTreeMap, BTreeSet}; use std::fmt::Write as _; use std::fs; use std::io; -use std::process::Command; use camino::{Utf8Path, Utf8PathBuf}; use fidget_spinner_core::{ - AnnotationVisibility, CheckpointDisposition, CheckpointRecord, CheckpointSnapshotRef, - CodeSnapshotRef, CommandRecipe, CompletedExperiment, DagEdge, DagNode, DiagnosticSeverity, + AnnotationVisibility, CommandRecipe, CompletedExperiment, DagEdge, DagNode, DiagnosticSeverity, EdgeKind, ExecutionBackend, ExperimentResult, FieldPresence, FieldRole, FieldValueType, FrontierContract, FrontierNote, 
FrontierProjection, FrontierRecord, FrontierStatus, - FrontierVerdict, GitCommitHash, InferencePolicy, JsonObject, MetricDefinition, MetricSpec, - MetricUnit, MetricValue, NodeAnnotation, NodeClass, NodeDiagnostics, NodePayload, NonEmptyText, - OpenExperiment, OptimizationObjective, ProjectFieldSpec, ProjectSchema, RunDimensionDefinition, - RunDimensionValue, RunRecord, RunStatus, TagName, TagRecord, + FrontierVerdict, FrontierVerdictCounts, InferencePolicy, JsonObject, MetricDefinition, + MetricSpec, MetricUnit, MetricValue, NodeAnnotation, NodeClass, NodeDiagnostics, NodePayload, + NonEmptyText, OpenExperiment, OptimizationObjective, ProjectFieldSpec, ProjectSchema, + RunDimensionDefinition, RunDimensionValue, RunRecord, RunStatus, TagName, TagRecord, }; use rusqlite::types::Value as SqlValue; use rusqlite::{Connection, OptionalExtension, Transaction, params, params_from_iter}; @@ -29,7 +27,7 @@ pub const STORE_DIR_NAME: &str = ".fidget_spinner"; pub const STATE_DB_NAME: &str = "state.sqlite"; pub const PROJECT_CONFIG_NAME: &str = "project.json"; pub const PROJECT_SCHEMA_NAME: &str = "schema.json"; -pub const CURRENT_STORE_FORMAT_VERSION: u32 = 2; +pub const CURRENT_STORE_FORMAT_VERSION: u32 = 3; #[derive(Debug, Error)] pub enum StoreError { @@ -58,8 +56,6 @@ pub enum StoreError { NodeNotFound(fidget_spinner_core::NodeId), #[error("frontier {0} was not found")] FrontierNotFound(fidget_spinner_core::FrontierId), - #[error("checkpoint {0} was not found")] - CheckpointNotFound(fidget_spinner_core::CheckpointId), #[error("experiment {0} was not found")] ExperimentNotFound(fidget_spinner_core::ExperimentId), #[error("node {0} is not a hypothesis node")] @@ -68,10 +64,6 @@ pub enum StoreError { "project store format {observed} is incompatible with this binary (expected {expected}); reinitialize the store" )] IncompatibleStoreFormatVersion { observed: u32, expected: u32 }, - #[error("frontier {frontier_id} has no champion checkpoint")] - MissingChampionCheckpoint { - 
frontier_id: fidget_spinner_core::FrontierId, - }, #[error("unknown tag `{0}`")] UnknownTag(TagName), #[error("tag `{0}` already exists")] @@ -82,8 +74,6 @@ pub enum StoreError { ProseSummaryRequired(NodeClass), #[error("{0} nodes require a non-empty string payload field `body`")] ProseBodyRequired(NodeClass), - #[error("git repository inspection failed for {0}")] - GitInspectionFailed(Utf8PathBuf), #[error("metric `{0}` is not registered")] UnknownMetricDefinition(NonEmptyText), #[error( @@ -301,13 +291,12 @@ pub struct MetricBestEntry { pub value: f64, pub order: MetricRankOrder, pub experiment_id: fidget_spinner_core::ExperimentId, + pub experiment_title: NonEmptyText, pub frontier_id: fidget_spinner_core::FrontierId, pub hypothesis_node_id: fidget_spinner_core::NodeId, pub hypothesis_title: NonEmptyText, pub run_id: fidget_spinner_core::RunId, pub verdict: FrontierVerdict, - pub candidate_checkpoint_id: fidget_spinner_core::CheckpointId, - pub candidate_commit_hash: GitCommitHash, pub unit: Option<MetricUnit>, pub objective: Option<OptimizationObjective>, pub dimensions: BTreeMap<NonEmptyText, RunDimensionValue>, @@ -375,19 +364,11 @@ pub struct CreateFrontierRequest { pub contract_title: NonEmptyText, pub contract_summary: Option<NonEmptyText>, pub contract: FrontierContract, - pub initial_checkpoint: Option<CheckpointSeed>, -} - -#[derive(Clone, Debug)] -pub struct CheckpointSeed { - pub summary: NonEmptyText, - pub snapshot: CheckpointSnapshotRef, } #[derive(Clone, Debug)] pub struct OpenExperimentRequest { pub frontier_id: fidget_spinner_core::FrontierId, - pub base_checkpoint_id: fidget_spinner_core::CheckpointId, pub hypothesis_node_id: fidget_spinner_core::NodeId, pub title: NonEmptyText, pub summary: Option<NonEmptyText>, @@ -397,7 +378,6 @@ pub struct OpenExperimentRequest { pub struct OpenExperimentSummary { pub id: fidget_spinner_core::ExperimentId, pub frontier_id: fidget_spinner_core::FrontierId, - pub base_checkpoint_id: 
fidget_spinner_core::CheckpointId, pub hypothesis_node_id: fidget_spinner_core::NodeId, pub title: NonEmptyText, pub summary: Option<NonEmptyText>, @@ -414,14 +394,11 @@ pub struct ExperimentAnalysisDraft { #[derive(Clone, Debug)] pub struct CloseExperimentRequest { pub experiment_id: fidget_spinner_core::ExperimentId, - pub candidate_summary: NonEmptyText, - pub candidate_snapshot: CheckpointSnapshotRef, pub run_title: NonEmptyText, pub run_summary: Option<NonEmptyText>, pub backend: ExecutionBackend, pub dimensions: BTreeMap<NonEmptyText, RunDimensionValue>, pub command: CommandRecipe, - pub code_snapshot: Option<CodeSnapshotRef>, pub primary_metric: MetricValue, pub supporting_metrics: Vec<MetricValue>, pub note: FrontierNote, @@ -434,7 +411,6 @@ pub struct CloseExperimentRequest { #[derive(Clone, Debug, Deserialize, PartialEq, Serialize)] pub struct ExperimentReceipt { pub open_experiment: OpenExperiment, - pub checkpoint: CheckpointRecord, pub run_node: DagNode, pub run: RunRecord, pub analysis_node: Option<DagNode>, @@ -629,18 +605,6 @@ impl ProjectStore { } insert_node(&tx, &contract_node)?; insert_frontier(&tx, &frontier)?; - if let Some(seed) = request.initial_checkpoint { - let checkpoint = CheckpointRecord { - id: fidget_spinner_core::CheckpointId::fresh(), - frontier_id: frontier.id, - node_id: contract_node.id, - snapshot: seed.snapshot, - disposition: CheckpointDisposition::Champion, - summary: seed.summary, - created_at: OffsetDateTime::now_utc(), - }; - insert_checkpoint(&tx, &checkpoint)?; - } insert_event( &tx, "frontier", @@ -1074,68 +1038,43 @@ impl ProjectStore { frontier_id: fidget_spinner_core::FrontierId, ) -> Result<FrontierProjection, StoreError> { let frontier = self.load_frontier(frontier_id)?; - let mut champion_checkpoint_id = None; - let mut candidate_checkpoint_ids = BTreeSet::new(); - - let mut statement = self.connection.prepare( - "SELECT id, disposition - FROM checkpoints - WHERE frontier_id = ?1", - )?; - let mut rows = 
statement.query(params![frontier_id.to_string()])?; - while let Some(row) = rows.next()? { - let checkpoint_id = parse_checkpoint_id(&row.get::<_, String>(0)?)?; - match parse_checkpoint_disposition(&row.get::<_, String>(1)?)? { - CheckpointDisposition::Champion => champion_checkpoint_id = Some(checkpoint_id), - CheckpointDisposition::FrontierCandidate => { - let _ = candidate_checkpoint_ids.insert(checkpoint_id); - } - CheckpointDisposition::Baseline - | CheckpointDisposition::DeadEnd - | CheckpointDisposition::Archived => {} - } - } - let experiment_count = self.connection.query_row( + let open_experiment_count = self.connection.query_row( + "SELECT COUNT(*) FROM open_experiments WHERE frontier_id = ?1", + params![frontier_id.to_string()], + |row| row.get::<_, i64>(0), + )? as u64; + let completed_experiment_count = self.connection.query_row( "SELECT COUNT(*) FROM experiments WHERE frontier_id = ?1", params![frontier_id.to_string()], |row| row.get::<_, i64>(0), )? as u64; + let verdict_counts = self.connection.query_row( + "SELECT + SUM(CASE WHEN verdict = 'accepted' THEN 1 ELSE 0 END), + SUM(CASE WHEN verdict = 'kept' THEN 1 ELSE 0 END), + SUM(CASE WHEN verdict = 'parked' THEN 1 ELSE 0 END), + SUM(CASE WHEN verdict = 'rejected' THEN 1 ELSE 0 END) + FROM experiments + WHERE frontier_id = ?1", + params![frontier_id.to_string()], + |row| { + Ok(FrontierVerdictCounts { + accepted: row.get::<_, Option<i64>>(0)?.unwrap_or(0) as u64, + kept: row.get::<_, Option<i64>>(1)?.unwrap_or(0) as u64, + parked: row.get::<_, Option<i64>>(2)?.unwrap_or(0) as u64, + rejected: row.get::<_, Option<i64>>(3)?.unwrap_or(0) as u64, + }) + }, + )?; Ok(FrontierProjection { frontier, - champion_checkpoint_id, - candidate_checkpoint_ids, - experiment_count, + open_experiment_count, + completed_experiment_count, + verdict_counts, }) } - pub fn load_checkpoint( - &self, - checkpoint_id: fidget_spinner_core::CheckpointId, - ) -> Result<Option<CheckpointRecord>, StoreError> { - let mut statement 
= self.connection.prepare( - "SELECT - id, - frontier_id, - node_id, - repo_root, - worktree_root, - worktree_name, - commit_hash, - disposition, - summary, - created_at - FROM checkpoints - WHERE id = ?1", - )?; - statement - .query_row(params![checkpoint_id.to_string()], |row| { - read_checkpoint_row(row) - }) - .optional() - .map_err(StoreError::from) - } - pub fn open_experiment( &mut self, request: OpenExperimentRequest, @@ -1149,16 +1088,9 @@ impl ProjectStore { if hypothesis_node.frontier_id != Some(request.frontier_id) { return Err(StoreError::FrontierNotFound(request.frontier_id)); } - let base_checkpoint = self - .load_checkpoint(request.base_checkpoint_id)? - .ok_or(StoreError::CheckpointNotFound(request.base_checkpoint_id))?; - if base_checkpoint.frontier_id != request.frontier_id { - return Err(StoreError::CheckpointNotFound(request.base_checkpoint_id)); - } let experiment = OpenExperiment { id: fidget_spinner_core::ExperimentId::fresh(), frontier_id: request.frontier_id, - base_checkpoint_id: request.base_checkpoint_id, hypothesis_node_id: request.hypothesis_node_id, title: request.title, summary: request.summary, @@ -1175,7 +1107,6 @@ impl ProjectStore { json!({ "frontier_id": experiment.frontier_id, "hypothesis_node_id": experiment.hypothesis_node_id, - "base_checkpoint_id": experiment.base_checkpoint_id, }), )?; tx.commit()?; @@ -1190,7 +1121,6 @@ impl ProjectStore { "SELECT id, frontier_id, - base_checkpoint_id, hypothesis_node_id, title, summary, @@ -1205,14 +1135,13 @@ impl ProjectStore { items.push(OpenExperimentSummary { id: parse_experiment_id(&row.get::<_, String>(0)?)?, frontier_id: parse_frontier_id(&row.get::<_, String>(1)?)?, - base_checkpoint_id: parse_checkpoint_id(&row.get::<_, String>(2)?)?, - hypothesis_node_id: parse_node_id(&row.get::<_, String>(3)?)?, - title: NonEmptyText::new(row.get::<_, String>(4)?)?, + hypothesis_node_id: parse_node_id(&row.get::<_, String>(2)?)?, + title: NonEmptyText::new(row.get::<_, String>(3)?)?, 
summary: row - .get::<_, Option<String>>(5)? + .get::<_, Option<String>>(4)? .map(NonEmptyText::new) .transpose()?, - created_at: decode_timestamp(&row.get::<_, String>(6)?)?, + created_at: decode_timestamp(&row.get::<_, String>(5)?)?, }); } Ok(items) @@ -1241,16 +1170,6 @@ impl ProjectStore { open_experiment.hypothesis_node_id, )); } - let base_checkpoint = self - .load_checkpoint(open_experiment.base_checkpoint_id)? - .ok_or(StoreError::CheckpointNotFound( - open_experiment.base_checkpoint_id, - ))?; - if base_checkpoint.frontier_id != open_experiment.frontier_id { - return Err(StoreError::CheckpointNotFound( - open_experiment.base_checkpoint_id, - )); - } let tx = self.connection.transaction()?; let dimensions = validate_run_dimensions_tx(&tx, &request.dimensions)?; let primary_metric_definition = @@ -1292,7 +1211,6 @@ impl ProjectStore { frontier_id: Some(open_experiment.frontier_id), status: RunStatus::Succeeded, backend: request.backend, - code_snapshot: request.code_snapshot, dimensions: dimensions.clone(), command: request.command, started_at: Some(now), @@ -1339,28 +1257,9 @@ impl ProjectStore { decision_diagnostics, ); - let checkpoint = CheckpointRecord { - id: fidget_spinner_core::CheckpointId::fresh(), - frontier_id: open_experiment.frontier_id, - node_id: run_node.id, - snapshot: request.candidate_snapshot, - disposition: match request.verdict { - FrontierVerdict::PromoteToChampion => CheckpointDisposition::Champion, - FrontierVerdict::KeepOnFrontier | FrontierVerdict::NeedsMoreEvidence => { - CheckpointDisposition::FrontierCandidate - } - FrontierVerdict::RevertToChampion => CheckpointDisposition::DeadEnd, - FrontierVerdict::ArchiveDeadEnd => CheckpointDisposition::Archived, - }, - summary: request.candidate_summary, - created_at: now, - }; - let experiment = CompletedExperiment { id: open_experiment.id, frontier_id: open_experiment.frontier_id, - base_checkpoint_id: open_experiment.base_checkpoint_id, - candidate_checkpoint_id: checkpoint.id, 
hypothesis_node_id: open_experiment.hypothesis_node_id, run_node_id: run_node.id, run_id, @@ -1428,16 +1327,6 @@ impl ProjectStore { supporting_metric_definitions.as_slice(), )?; insert_run_dimensions(&tx, run.run_id, &dimensions)?; - match request.verdict { - FrontierVerdict::PromoteToChampion => { - demote_previous_champion(&tx, open_experiment.frontier_id)?; - } - FrontierVerdict::KeepOnFrontier - | FrontierVerdict::NeedsMoreEvidence - | FrontierVerdict::RevertToChampion - | FrontierVerdict::ArchiveDeadEnd => {} - } - insert_checkpoint(&tx, &checkpoint)?; insert_experiment(&tx, &experiment)?; delete_open_experiment(&tx, open_experiment.id)?; touch_frontier(&tx, open_experiment.frontier_id)?; @@ -1450,14 +1339,12 @@ impl ProjectStore { "frontier_id": open_experiment.frontier_id, "hypothesis_node_id": open_experiment.hypothesis_node_id, "verdict": format!("{:?}", request.verdict), - "candidate_checkpoint_id": checkpoint.id, }), )?; tx.commit()?; Ok(ExperimentReceipt { open_experiment, - checkpoint, run_node, run, analysis_node, @@ -1466,13 +1353,6 @@ impl ProjectStore { }) } - pub fn auto_capture_checkpoint( - &self, - summary: NonEmptyText, - ) -> Result<Option<CheckpointSeed>, StoreError> { - auto_capture_checkpoint_seed(&self.project_root, summary) - } - fn load_annotations( &self, node_id: fidget_spinner_core::NodeId, @@ -1565,12 +1445,11 @@ struct MetricSample { value: f64, frontier_id: fidget_spinner_core::FrontierId, experiment_id: fidget_spinner_core::ExperimentId, + experiment_title: NonEmptyText, hypothesis_node_id: fidget_spinner_core::NodeId, hypothesis_title: NonEmptyText, run_id: fidget_spinner_core::RunId, verdict: FrontierVerdict, - candidate_checkpoint_id: fidget_spinner_core::CheckpointId, - candidate_commit_hash: GitCommitHash, unit: Option<MetricUnit>, objective: Option<OptimizationObjective>, dimensions: BTreeMap<NonEmptyText, RunDimensionValue>, @@ -1584,13 +1463,12 @@ impl MetricSample { value: self.value, order, experiment_id: 
self.experiment_id, + experiment_title: self.experiment_title, frontier_id: self.frontier_id, hypothesis_node_id: self.hypothesis_node_id, hypothesis_title: self.hypothesis_title, run_id: self.run_id, verdict: self.verdict, - candidate_checkpoint_id: self.candidate_checkpoint_id, - candidate_commit_hash: self.candidate_commit_hash, unit: self.unit, objective: self.objective, dimensions: self.dimensions, @@ -1737,10 +1615,10 @@ fn compare_metric_samples( #[derive(Clone, Debug)] struct ExperimentMetricRow { experiment_id: fidget_spinner_core::ExperimentId, + experiment_title: NonEmptyText, frontier_id: fidget_spinner_core::FrontierId, run_id: fidget_spinner_core::RunId, verdict: FrontierVerdict, - candidate_checkpoint: CheckpointRecord, hypothesis_node: DagNode, run_node: DagNode, analysis_node: Option<DagNode>, @@ -1755,13 +1633,13 @@ fn load_experiment_rows(store: &ProjectStore) -> Result<Vec<ExperimentMetricRow> let mut statement = store.connection.prepare( "SELECT id, + title, frontier_id, run_id, hypothesis_node_id, run_node_id, analysis_node_id, decision_node_id, - candidate_checkpoint_id, primary_metric_json, supporting_metrics_json, verdict @@ -1770,23 +1648,20 @@ fn load_experiment_rows(store: &ProjectStore) -> Result<Vec<ExperimentMetricRow> let mut rows = statement.query([])?; let mut items = Vec::new(); while let Some(row) = rows.next()? { - let hypothesis_node_id = parse_node_id(&row.get::<_, String>(3)?)?; - let run_id = parse_run_id(&row.get::<_, String>(2)?)?; - let run_node_id = parse_node_id(&row.get::<_, String>(4)?)?; + let hypothesis_node_id = parse_node_id(&row.get::<_, String>(4)?)?; + let run_id = parse_run_id(&row.get::<_, String>(3)?)?; + let run_node_id = parse_node_id(&row.get::<_, String>(5)?)?; let analysis_node_id = row - .get::<_, Option<String>>(5)? + .get::<_, Option<String>>(6)? 
             .map(|raw| parse_node_id(&raw))
             .transpose()?;
-        let decision_node_id = parse_node_id(&row.get::<_, String>(6)?)?;
-        let candidate_checkpoint_id = parse_checkpoint_id(&row.get::<_, String>(7)?)?;
+        let decision_node_id = parse_node_id(&row.get::<_, String>(7)?)?;
         items.push(ExperimentMetricRow {
             experiment_id: parse_experiment_id(&row.get::<_, String>(0)?)?,
-            frontier_id: parse_frontier_id(&row.get::<_, String>(1)?)?,
+            experiment_title: NonEmptyText::new(row.get::<_, String>(1)?)?,
+            frontier_id: parse_frontier_id(&row.get::<_, String>(2)?)?,
             run_id,
             verdict: parse_frontier_verdict(&row.get::<_, String>(10)?)?,
-            candidate_checkpoint: store
-                .load_checkpoint(candidate_checkpoint_id)?
-                .ok_or(StoreError::CheckpointNotFound(candidate_checkpoint_id))?,
             hypothesis_node: store
                 .get_node(hypothesis_node_id)?
                 .ok_or(StoreError::NodeNotFound(hypothesis_node_id))?,
@@ -1856,12 +1731,11 @@ fn metric_sample_from_observation(
         value: metric.value,
         frontier_id: row.frontier_id,
         experiment_id: row.experiment_id,
+        experiment_title: row.experiment_title.clone(),
         hypothesis_node_id: row.hypothesis_node.id,
         hypothesis_title: row.hypothesis_node.title.clone(),
         run_id: row.run_id,
         verdict: row.verdict,
-        candidate_checkpoint_id: row.candidate_checkpoint.id,
-        candidate_commit_hash: row.candidate_checkpoint.snapshot.commit_hash.clone(),
         unit: registry.map(|definition| definition.unit),
         objective: registry.map(|definition| definition.objective),
         dimensions: row.dimensions.clone(),
@@ -1895,12 +1769,11 @@ fn metric_samples_from_payload(
         value,
         frontier_id: row.frontier_id,
         experiment_id: row.experiment_id,
+        experiment_title: row.experiment_title.clone(),
         hypothesis_node_id: row.hypothesis_node.id,
         hypothesis_title: row.hypothesis_node.title.clone(),
         run_id: row.run_id,
         verdict: row.verdict,
-        candidate_checkpoint_id: row.candidate_checkpoint.id,
-        candidate_commit_hash: row.candidate_checkpoint.snapshot.commit_hash.clone(),
         unit: None,
         objective: None,
         dimensions: row.dimensions.clone(),
@@ -1968,30 +1841,12 @@ fn migrate(connection: &Connection) -> Result<(), StoreError> {
             updated_at TEXT NOT NULL
         );

-        CREATE TABLE IF NOT EXISTS checkpoints (
-            id TEXT PRIMARY KEY,
-            frontier_id TEXT NOT NULL REFERENCES frontiers(id) ON DELETE CASCADE,
-            node_id TEXT NOT NULL REFERENCES nodes(id) ON DELETE RESTRICT,
-            repo_root TEXT NOT NULL,
-            worktree_root TEXT NOT NULL,
-            worktree_name TEXT,
-            commit_hash TEXT NOT NULL,
-            disposition TEXT NOT NULL,
-            summary TEXT NOT NULL,
-            created_at TEXT NOT NULL
-        );
-
         CREATE TABLE IF NOT EXISTS runs (
             run_id TEXT PRIMARY KEY,
             node_id TEXT NOT NULL REFERENCES nodes(id) ON DELETE CASCADE,
             frontier_id TEXT REFERENCES frontiers(id) ON DELETE SET NULL,
             status TEXT NOT NULL,
             backend TEXT NOT NULL,
-            repo_root TEXT,
-            worktree_root TEXT,
-            worktree_name TEXT,
-            head_commit TEXT,
-            dirty_paths_json TEXT,
             benchmark_suite TEXT,
             working_directory TEXT NOT NULL,
             argv_json TEXT NOT NULL,
@@ -2037,7 +1892,6 @@ fn migrate(connection: &Connection) -> Result<(), StoreError> {
         CREATE TABLE IF NOT EXISTS open_experiments (
             id TEXT PRIMARY KEY,
             frontier_id TEXT NOT NULL REFERENCES frontiers(id) ON DELETE CASCADE,
-            base_checkpoint_id TEXT NOT NULL REFERENCES checkpoints(id) ON DELETE RESTRICT,
             hypothesis_node_id TEXT NOT NULL REFERENCES nodes(id) ON DELETE RESTRICT,
             title TEXT NOT NULL,
             summary TEXT,
@@ -2047,8 +1901,6 @@ fn migrate(connection: &Connection) -> Result<(), StoreError> {
         CREATE TABLE IF NOT EXISTS experiments (
             id TEXT PRIMARY KEY,
             frontier_id TEXT NOT NULL REFERENCES frontiers(id) ON DELETE CASCADE,
-            base_checkpoint_id TEXT NOT NULL REFERENCES checkpoints(id) ON DELETE RESTRICT,
-            candidate_checkpoint_id TEXT NOT NULL REFERENCES checkpoints(id) ON DELETE RESTRICT,
             hypothesis_node_id TEXT NOT NULL REFERENCES nodes(id) ON DELETE RESTRICT,
             run_node_id TEXT NOT NULL REFERENCES nodes(id) ON DELETE RESTRICT,
             run_id TEXT NOT NULL REFERENCES runs(run_id) ON DELETE RESTRICT,
@@ -2903,43 +2755,6 @@ fn insert_frontier(tx: &Transaction<'_>, frontier: &FrontierRecord) -> Result<()
     Ok(())
 }

-fn insert_checkpoint(
-    tx: &Transaction<'_>,
-    checkpoint: &CheckpointRecord,
-) -> Result<(), StoreError> {
-    let _ = tx.execute(
-        "INSERT INTO checkpoints (
-            id,
-            frontier_id,
-            node_id,
-            repo_root,
-            worktree_root,
-            worktree_name,
-            commit_hash,
-            disposition,
-            summary,
-            created_at
-        ) VALUES (?1, ?2, ?3, ?4, ?5, ?6, ?7, ?8, ?9, ?10)",
-        params![
-            checkpoint.id.to_string(),
-            checkpoint.frontier_id.to_string(),
-            checkpoint.node_id.to_string(),
-            checkpoint.snapshot.repo_root.as_str(),
-            checkpoint.snapshot.worktree_root.as_str(),
-            checkpoint
-                .snapshot
-                .worktree_name
-                .as_ref()
-                .map(NonEmptyText::as_str),
-            checkpoint.snapshot.commit_hash.as_str(),
-            encode_checkpoint_disposition(checkpoint.disposition),
-            checkpoint.summary.as_str(),
-            encode_timestamp(checkpoint.created_at)?,
-        ],
-    )?;
-    Ok(())
-}
-
 fn insert_run(
     tx: &Transaction<'_>,
     run: &RunRecord,
@@ -2949,28 +2764,6 @@ fn insert_run(
     supporting_metrics: &[MetricValue],
     supporting_metric_definitions: &[MetricDefinition],
 ) -> Result<(), StoreError> {
-    let (repo_root, worktree_root, worktree_name, head_commit, dirty_paths) = run
-        .code_snapshot
-        .as_ref()
-        .map_or((None, None, None, None, None), |snapshot| {
-            (
-                Some(snapshot.repo_root.as_str().to_owned()),
-                Some(snapshot.worktree_root.as_str().to_owned()),
-                snapshot.worktree_name.as_ref().map(ToOwned::to_owned),
-                snapshot.head_commit.as_ref().map(ToOwned::to_owned),
-                Some(
-                    snapshot
-                        .dirty_paths
-                        .iter()
-                        .map(ToOwned::to_owned)
-                        .collect::<Vec<_>>(),
-                ),
-            )
-        });
-    let dirty_paths_json = match dirty_paths.as_ref() {
-        Some(paths) => Some(encode_json(paths)?),
-        None => None,
-    };
     let started_at = match run.started_at {
         Some(timestamp) => Some(encode_timestamp(timestamp)?),
         None => None,
@@ -2986,29 +2779,19 @@ fn insert_run(
             frontier_id,
             status,
             backend,
-            repo_root,
-            worktree_root,
-            worktree_name,
-            head_commit,
-            dirty_paths_json,
             benchmark_suite,
             working_directory,
             argv_json,
             env_json,
             started_at,
             finished_at
-        ) VALUES (?1, ?2, ?3, ?4, ?5, ?6, ?7, ?8, ?9, ?10, ?11, ?12, ?13, ?14, ?15, ?16)",
+        ) VALUES (?1, ?2, ?3, ?4, ?5, ?6, ?7, ?8, ?9, ?10, ?11)",
         params![
             run.run_id.to_string(),
             run.node_id.to_string(),
             run.frontier_id.map(|id| id.to_string()),
             encode_run_status(run.status),
             encode_backend(run.backend),
-            repo_root,
-            worktree_root,
-            worktree_name.map(|item| item.to_string()),
-            head_commit.map(|item| item.to_string()),
-            dirty_paths_json,
             benchmark_suite,
             run.command.working_directory.as_str(),
             encode_json(&run.command.argv)?,
@@ -3046,16 +2829,14 @@ fn insert_open_experiment(
         "INSERT INTO open_experiments (
             id,
             frontier_id,
-            base_checkpoint_id,
             hypothesis_node_id,
             title,
             summary,
             created_at
-        ) VALUES (?1, ?2, ?3, ?4, ?5, ?6, ?7)",
+        ) VALUES (?1, ?2, ?3, ?4, ?5, ?6)",
         params![
             experiment.id.to_string(),
             experiment.frontier_id.to_string(),
-            experiment.base_checkpoint_id.to_string(),
             experiment.hypothesis_node_id.to_string(),
             experiment.title.as_str(),
             experiment.summary.as_ref().map(NonEmptyText::as_str),
@@ -3084,8 +2865,6 @@ fn insert_experiment(
         "INSERT INTO experiments (
             id,
             frontier_id,
-            base_checkpoint_id,
-            candidate_checkpoint_id,
             hypothesis_node_id,
             run_node_id,
             run_id,
@@ -3100,12 +2879,10 @@ fn insert_experiment(
             note_next_json,
             verdict,
             created_at
-        ) VALUES (?1, ?2, ?3, ?4, ?5, ?6, ?7, ?8, ?9, ?10, ?11, ?12, ?13, ?14, ?15, ?16, ?17, ?18)",
+        ) VALUES (?1, ?2, ?3, ?4, ?5, ?6, ?7, ?8, ?9, ?10, ?11, ?12, ?13, ?14, ?15, ?16)",
         params![
             experiment.id.to_string(),
             experiment.frontier_id.to_string(),
-            experiment.base_checkpoint_id.to_string(),
-            experiment.candidate_checkpoint_id.to_string(),
             experiment.hypothesis_node_id.to_string(),
             experiment.run_node_id.to_string(),
             experiment.run_id.to_string(),
@@ -3154,7 +2931,6 @@ fn load_open_experiment(
         "SELECT
             id,
             frontier_id,
-            base_checkpoint_id,
             hypothesis_node_id,
             title,
             summary,
@@ -3169,18 +2945,16 @@ fn load_open_experiment(
                 .map_err(to_sql_conversion_error)?,
             frontier_id: parse_frontier_id(&row.get::<_, String>(1)?)
                 .map_err(to_sql_conversion_error)?,
-            base_checkpoint_id: parse_checkpoint_id(&row.get::<_, String>(2)?)
-                .map_err(to_sql_conversion_error)?,
-            hypothesis_node_id: parse_node_id(&row.get::<_, String>(3)?)
+            hypothesis_node_id: parse_node_id(&row.get::<_, String>(2)?)
                 .map_err(to_sql_conversion_error)?,
-            title: NonEmptyText::new(row.get::<_, String>(4)?)
+            title: NonEmptyText::new(row.get::<_, String>(3)?)
                 .map_err(core_to_sql_conversion_error)?,
             summary: row
-                .get::<_, Option<String>>(5)?
+                .get::<_, Option<String>>(4)?
                 .map(NonEmptyText::new)
                 .transpose()
                 .map_err(core_to_sql_conversion_error)?,
-            created_at: decode_timestamp(&row.get::<_, String>(6)?)
+            created_at: decode_timestamp(&row.get::<_, String>(5)?)
                 .map_err(to_sql_conversion_error)?,
         })
     })
@@ -3192,7 +2966,6 @@ fn summarize_open_experiment(experiment: &OpenExperiment) -> OpenExperimentSumma
     OpenExperimentSummary {
         id: experiment.id,
         frontier_id: experiment.frontier_id,
-        base_checkpoint_id: experiment.base_checkpoint_id,
         hypothesis_node_id: experiment.hypothesis_node_id,
         title: experiment.title.clone(),
         summary: experiment.summary.clone(),
@@ -3214,19 +2987,6 @@ fn touch_frontier(
     Ok(())
 }

-fn demote_previous_champion(
-    tx: &Transaction<'_>,
-    frontier_id: fidget_spinner_core::FrontierId,
-) -> Result<(), StoreError> {
-    let _ = tx.execute(
-        "UPDATE checkpoints
-         SET disposition = 'baseline'
-         WHERE frontier_id = ?1 AND disposition = 'champion'",
-        params![frontier_id.to_string()],
-    )?;
-    Ok(())
-}
-
 fn read_node_row(row: &rusqlite::Row<'_>) -> Result<DagNode, rusqlite::Error> {
     let payload_json = row.get::<_, String>(9)?;
     let diagnostics_json = row.get::<_, String>(10)?;
@@ -3276,31 +3036,6 @@ fn read_frontier_row(row: &rusqlite::Row<'_>) -> Result<FrontierRecord, StoreErr
     })
 }

-fn read_checkpoint_row(row: &rusqlite::Row<'_>) -> Result<CheckpointRecord, rusqlite::Error> {
-    Ok(CheckpointRecord {
-        id: parse_checkpoint_id(&row.get::<_, String>(0)?).map_err(to_sql_conversion_error)?,
-        frontier_id: parse_frontier_id(&row.get::<_, String>(1)?)
-            .map_err(to_sql_conversion_error)?,
-        node_id: parse_node_id(&row.get::<_, String>(2)?).map_err(to_sql_conversion_error)?,
-        snapshot: CheckpointSnapshotRef {
-            repo_root: Utf8PathBuf::from(row.get::<_, String>(3)?),
-            worktree_root: Utf8PathBuf::from(row.get::<_, String>(4)?),
-            worktree_name: row
-                .get::<_, Option<String>>(5)?
-                .map(NonEmptyText::new)
-                .transpose()
-                .map_err(core_to_sql_conversion_error)?,
-            commit_hash: GitCommitHash::new(row.get::<_, String>(6)?)
-                .map_err(core_to_sql_conversion_error)?,
-        },
-        disposition: parse_checkpoint_disposition(&row.get::<_, String>(7)?)
-            .map_err(to_sql_conversion_error)?,
-        summary: NonEmptyText::new(row.get::<_, String>(8)?)
-            .map_err(core_to_sql_conversion_error)?,
-        created_at: decode_timestamp(&row.get::<_, String>(9)?).map_err(to_sql_conversion_error)?,
-    })
-}
-
 fn frontier_contract_payload(contract: &FrontierContract) -> Result<JsonObject, StoreError> {
     json_object(json!({
         "objective": contract.objective.as_str(),
@@ -3395,44 +3130,6 @@ fn discovery_start(path: &Utf8Path) -> Utf8PathBuf {
     }
 }

-fn auto_capture_checkpoint_seed(
-    project_root: &Utf8Path,
-    summary: NonEmptyText,
-) -> Result<Option<CheckpointSeed>, StoreError> {
-    let top_level = git_output(project_root, &["rev-parse", "--show-toplevel"])?;
-    let Some(repo_root) = top_level else {
-        return Ok(None);
-    };
-    let commit_hash = git_output(project_root, &["rev-parse", "HEAD"])?
-        .ok_or_else(|| StoreError::GitInspectionFailed(project_root.to_path_buf()))?;
-    let worktree_name = git_output(project_root, &["rev-parse", "--abbrev-ref", "HEAD"])?;
-    Ok(Some(CheckpointSeed {
-        summary,
-        snapshot: CheckpointSnapshotRef {
-            repo_root: Utf8PathBuf::from(repo_root),
-            worktree_root: project_root.to_path_buf(),
-            worktree_name: worktree_name.map(NonEmptyText::new).transpose()?,
-            commit_hash: GitCommitHash::new(commit_hash)?,
-        },
-    }))
-}
-
-fn git_output(project_root: &Utf8Path, args: &[&str]) -> Result<Option<String>, StoreError> {
-    let output = Command::new("git")
-        .arg("-C")
-        .arg(project_root.as_str())
-        .args(args)
-        .output()?;
-    if !output.status.success() {
-        return Ok(None);
-    }
-    let text = String::from_utf8_lossy(&output.stdout).trim().to_owned();
-    if text.is_empty() {
-        return Ok(None);
-    }
-    Ok(Some(text))
-}
-
 fn to_sql_conversion_error(error: StoreError) -> rusqlite::Error {
     rusqlite::Error::FromSqlConversionFailure(0, rusqlite::types::Type::Text, Box::new(error))
 }
@@ -3453,12 +3150,6 @@ fn parse_frontier_id(raw: &str) -> Result<fidget_spinner_core::FrontierId, Store
     Ok(fidget_spinner_core::FrontierId::from_uuid(parse_uuid(raw)?))
 }

-fn parse_checkpoint_id(raw: &str) -> Result<fidget_spinner_core::CheckpointId, StoreError> {
-    Ok(fidget_spinner_core::CheckpointId::from_uuid(parse_uuid(
-        raw,
-    )?))
-}
-
 fn parse_experiment_id(raw: &str) -> Result<fidget_spinner_core::ExperimentId, StoreError> {
     Ok(fidget_spinner_core::ExperimentId::from_uuid(parse_uuid(
         raw,
@@ -3565,30 +3256,6 @@ fn parse_frontier_status(raw: &str) -> Result<FrontierStatus, StoreError> {
     }
 }

-fn encode_checkpoint_disposition(disposition: CheckpointDisposition) -> &'static str {
-    match disposition {
-        CheckpointDisposition::Champion => "champion",
-        CheckpointDisposition::FrontierCandidate => "frontier-candidate",
-        CheckpointDisposition::Baseline => "baseline",
-        CheckpointDisposition::DeadEnd => "dead-end",
-        CheckpointDisposition::Archived => "archived",
-    }
-}
-
-fn parse_checkpoint_disposition(raw: &str) -> Result<CheckpointDisposition, StoreError> {
-    match raw {
-        "champion" => Ok(CheckpointDisposition::Champion),
-        "frontier-candidate" => Ok(CheckpointDisposition::FrontierCandidate),
-        "baseline" => Ok(CheckpointDisposition::Baseline),
-        "dead-end" => Ok(CheckpointDisposition::DeadEnd),
-        "archived" => Ok(CheckpointDisposition::Archived),
-        other => Err(StoreError::Json(serde_json::Error::io(io::Error::new(
-            io::ErrorKind::InvalidData,
-            format!("unknown checkpoint disposition `{other}`"),
-        )))),
-    }
-}
-
 fn encode_run_status(status: RunStatus) -> &'static str {
     match status {
         RunStatus::Queued => "queued",
@@ -3670,21 +3337,19 @@ fn decode_optimization_objective(raw: &str) -> Result<OptimizationObjective, Sto

 fn encode_frontier_verdict(verdict: FrontierVerdict) -> &'static str {
     match verdict {
-        FrontierVerdict::PromoteToChampion => "promote-to-champion",
-        FrontierVerdict::KeepOnFrontier => "keep-on-frontier",
-        FrontierVerdict::RevertToChampion => "revert-to-champion",
-        FrontierVerdict::ArchiveDeadEnd => "archive-dead-end",
-        FrontierVerdict::NeedsMoreEvidence => "needs-more-evidence",
+        FrontierVerdict::Accepted => "accepted",
+        FrontierVerdict::Kept => "kept",
+        FrontierVerdict::Parked => "parked",
+        FrontierVerdict::Rejected => "rejected",
     }
 }

 fn parse_frontier_verdict(raw: &str) -> Result<FrontierVerdict, StoreError> {
     match raw {
-        "promote-to-champion" => Ok(FrontierVerdict::PromoteToChampion),
-        "keep-on-frontier" => Ok(FrontierVerdict::KeepOnFrontier),
-        "revert-to-champion" => Ok(FrontierVerdict::RevertToChampion),
-        "archive-dead-end" => Ok(FrontierVerdict::ArchiveDeadEnd),
-        "needs-more-evidence" => Ok(FrontierVerdict::NeedsMoreEvidence),
+        "accepted" => Ok(FrontierVerdict::Accepted),
+        "kept" => Ok(FrontierVerdict::Kept),
+        "parked" => Ok(FrontierVerdict::Parked),
+        "rejected" => Ok(FrontierVerdict::Rejected),
         other => Err(StoreError::Json(serde_json::Error::io(io::Error::new(
             io::ErrorKind::InvalidData,
             format!("unknown frontier verdict `{other}`"),
@@ -3785,10 +3450,10 @@ mod tests {
         RemoveSchemaFieldRequest, UpsertSchemaFieldRequest,
     };
     use fidget_spinner_core::{
-        CheckpointSnapshotRef, CommandRecipe, DiagnosticSeverity, EvaluationProtocol,
-        FieldPresence, FieldRole, FieldValueType, FrontierContract, FrontierNote, FrontierVerdict,
-        GitCommitHash, InferencePolicy, MetricSpec, MetricUnit, MetricValue, NodeAnnotation,
-        NodeClass, NodePayload, NonEmptyText, OptimizationObjective, RunDimensionValue, TagName,
+        CommandRecipe, DiagnosticSeverity, EvaluationProtocol, FieldPresence, FieldRole,
+        FieldValueType, FrontierContract, FrontierNote, FrontierVerdict, InferencePolicy,
+        MetricSpec, MetricUnit, MetricValue, NodeAnnotation, NodeClass, NodePayload, NonEmptyText,
+        OptimizationObjective, RunDimensionValue, TagName,
     };

     fn temp_project_root(label: &str) -> camino::Utf8PathBuf {
@@ -3850,7 +3515,7 @@ mod tests {
     }

     #[test]
-    fn frontier_projection_tracks_initial_champion() -> Result<(), super::StoreError> {
+    fn frontier_projection_tracks_experiment_counts() -> Result<(), super::StoreError> {
         let root = temp_project_root("frontier");
         let mut store = ProjectStore::init(
             &root,
@@ -3874,18 +3539,14 @@ mod tests {
                 },
                 promotion_criteria: vec![NonEmptyText::new("strict speedup")?],
             },
-            initial_checkpoint: Some(super::CheckpointSeed {
-                summary: NonEmptyText::new("seed")?,
-                snapshot: CheckpointSnapshotRef {
-                    repo_root: root.clone(),
-                    worktree_root: root,
-                    worktree_name: Some(NonEmptyText::new("main")?),
-                    commit_hash: GitCommitHash::new("0123456789abcdef")?,
-                },
-            }),
         })?;

-        assert!(projection.champion_checkpoint_id.is_some());
+        assert_eq!(projection.open_experiment_count, 0);
+        assert_eq!(projection.completed_experiment_count, 0);
+        assert_eq!(projection.verdict_counts.accepted, 0);
+        assert_eq!(projection.verdict_counts.kept, 0);
+        assert_eq!(projection.verdict_counts.parked, 0);
+        assert_eq!(projection.verdict_counts.rejected, 0);
         Ok(())
     }

@@ -3948,7 +3609,6 @@ mod tests {
                 },
                 promotion_criteria: vec![NonEmptyText::new("faster")?],
             },
-            initial_checkpoint: None,
         })?;

         let nodes = store.list_nodes(ListNodesQuery {
@@ -4203,15 +3863,8 @@ mod tests {
                 },
                 promotion_criteria: vec![NonEmptyText::new("strict speedup")?],
             },
-            initial_checkpoint: Some(super::CheckpointSeed {
-                summary: NonEmptyText::new("seed")?,
-                snapshot: checkpoint_snapshot(&root, "aaaaaaaaaaaaaaaa")?,
-            }),
         })?;
         let frontier_id = projection.frontier.id;
-        let base_checkpoint_id = projection
-            .champion_checkpoint_id
-            .ok_or_else(|| super::StoreError::MissingChampionCheckpoint { frontier_id })?;
         let _ = store.define_metric(DefineMetricRequest {
             key: NonEmptyText::new("wall_clock_s")?,
             unit: MetricUnit::Seconds,
@@ -4257,21 +3910,18 @@ mod tests {
         })?;
         let first_experiment = store.open_experiment(open_experiment_request(
             frontier_id,
-            base_checkpoint_id,
             first_hypothesis.id,
             "first experiment",
         )?)?;
         let second_experiment = store.open_experiment(open_experiment_request(
             frontier_id,
-            base_checkpoint_id,
             second_hypothesis.id,
             "second experiment",
         )?)?;

-        let first_receipt = store.close_experiment(experiment_request(
+        let _first_receipt = store.close_experiment(experiment_request(
             &root,
             first_experiment.id,
-            "bbbbbbbbbbbbbbbb",
             "first run",
             10.0,
             run_dimensions("belt_4x5", 20.0)?,
@@ -4279,7 +3929,6 @@ mod tests {
         let second_receipt = store.close_experiment(experiment_request(
             &root,
             second_experiment.id,
-            "cccccccccccccccc",
             "second run",
             5.0,
             run_dimensions("belt_4x5", 60.0)?,
@@ -4334,9 +3983,10 @@ mod tests {
         assert_eq!(canonical_best.len(), 1);
         assert_eq!(canonical_best[0].value, 5.0);
         assert_eq!(
-            canonical_best[0].candidate_checkpoint_id,
-            second_receipt.checkpoint.id
+            canonical_best[0].experiment_title.as_str(),
+            "second experiment"
         );
+        assert_eq!(canonical_best[0].verdict, FrontierVerdict::Kept);
         assert_eq!(
             canonical_best[0]
                 .dimensions
@@ -4369,8 +4019,8 @@ mod tests {
             Err(super::StoreError::MetricOrderRequired { .. })
         ));
         assert_eq!(
-            first_receipt.checkpoint.snapshot.commit_hash.as_str(),
-            "bbbbbbbbbbbbbbbb"
+            second_receipt.experiment.title.as_str(),
+            "second experiment"
         );
         Ok(())
     }

@@ -4401,15 +4051,8 @@ mod tests {
                 },
                 promotion_criteria: vec![NonEmptyText::new("keep the metric plane queryable")?],
             },
-            initial_checkpoint: Some(super::CheckpointSeed {
-                summary: NonEmptyText::new("seed")?,
-                snapshot: checkpoint_snapshot(&root, "aaaaaaaaaaaaaaaa")?,
-            }),
         })?;
         let frontier_id = projection.frontier.id;
-        let base_checkpoint_id = projection
-            .champion_checkpoint_id
-            .ok_or_else(|| super::StoreError::MissingChampionCheckpoint { frontier_id })?;
         let hypothesis = store.add_node(CreateNodeRequest {
             class: NodeClass::Hypothesis,
             frontier_id: Some(frontier_id),
@@ -4425,14 +4068,12 @@ mod tests {
         })?;
         let experiment = store.open_experiment(open_experiment_request(
             frontier_id,
-            base_checkpoint_id,
             hypothesis.id,
             "migration experiment",
         )?)?;
         let _ = store.close_experiment(experiment_request(
             &root,
             experiment.id,
-            "bbbbbbbbbbbbbbbb",
             "migration run",
             11.0,
             BTreeMap::from([(
@@ -4472,27 +4113,13 @@ mod tests {
         Ok(())
     }

-    fn checkpoint_snapshot(
-        root: &camino::Utf8Path,
-        commit: &str,
-    ) -> Result<CheckpointSnapshotRef, super::StoreError> {
-        Ok(CheckpointSnapshotRef {
-            repo_root: root.to_path_buf(),
-            worktree_root: root.to_path_buf(),
-            worktree_name: Some(NonEmptyText::new("main")?),
-            commit_hash: GitCommitHash::new(commit)?,
-        })
-    }
-
     fn open_experiment_request(
         frontier_id: fidget_spinner_core::FrontierId,
-        base_checkpoint_id: fidget_spinner_core::CheckpointId,
         hypothesis_node_id: fidget_spinner_core::NodeId,
         title: &str,
     ) -> Result<OpenExperimentRequest, super::StoreError> {
         Ok(OpenExperimentRequest {
             frontier_id,
-            base_checkpoint_id,
             hypothesis_node_id,
             title: NonEmptyText::new(title)?,
             summary: Some(NonEmptyText::new(format!("{title} summary"))?),
@@ -4502,15 +4129,12 @@ mod tests {
     fn experiment_request(
         root: &camino::Utf8Path,
         experiment_id: fidget_spinner_core::ExperimentId,
-        candidate_commit: &str,
         run_title: &str,
         wall_clock_s: f64,
         dimensions: BTreeMap<NonEmptyText, RunDimensionValue>,
     ) -> Result<CloseExperimentRequest, super::StoreError> {
         Ok(CloseExperimentRequest {
             experiment_id,
-            candidate_summary: NonEmptyText::new(format!("candidate {candidate_commit}"))?,
-            candidate_snapshot: checkpoint_snapshot(root, candidate_commit)?,
             run_title: NonEmptyText::new(run_title)?,
             run_summary: Some(NonEmptyText::new("run summary")?),
             backend: fidget_spinner_core::ExecutionBackend::WorktreeProcess,
@@ -4520,7 +4144,6 @@ mod tests {
                 vec![NonEmptyText::new("true")?],
                 BTreeMap::new(),
             )?,
-            code_snapshot: None,
             primary_metric: MetricValue {
                 key: NonEmptyText::new("wall_clock_s")?,
                 value: wall_clock_s,
@@ -4530,7 +4153,7 @@ mod tests {
                 summary: NonEmptyText::new("note summary")?,
                 next_hypotheses: Vec::new(),
             },
-            verdict: FrontierVerdict::KeepOnFrontier,
+            verdict: FrontierVerdict::Kept,
             analysis: None,
             decision_title: NonEmptyText::new("decision")?,
             decision_rationale: NonEmptyText::new("decision rationale")?,

diff --git a/docs/architecture.md b/docs/architecture.md
index 30d01fc..e274ad5 100644
--- a/docs/architecture.md
+++ b/docs/architecture.md
@@ -80,7 +80,6 @@ The engine depends on a stable, typed spine stored in SQLite:
 - node annotations
 - node edges
 - frontiers
-- checkpoints
 - runs
 - metrics
 - experiments
@@ -207,23 +206,6 @@ Important constraint:

 That keeps frontier filtering honest.

-### `checkpoints`
-
-Stores committed candidate or champion checkpoints:
-
-- checkpoint id
-- frontier id
-- anchoring node id
-- repo/worktree metadata
-- commit hash
-- disposition
-- summary
-- created timestamp
-
-In the current codebase, a frontier may temporarily exist without a champion if
-it was initialized outside a git repo. Core-path experimentation is only fully
-available once git-backed checkpoints exist.
-
 ### `runs`

 Stores run envelopes:
@@ -233,8 +215,7 @@ Stores run envelopes:
 - node id
 - frontier id
 - backend
 - status
-- code snapshot metadata
-- benchmark suite
+- run dimensions
 - command envelope
 - started and finished timestamps
@@ -254,12 +235,12 @@ Stores the atomic closure object for core-path work:

 - experiment id
 - frontier id
-- base checkpoint id
-- candidate checkpoint id
 - hypothesis node id
 - run node id and run id
 - optional analysis node id
 - decision node id
+- title
+- summary
 - verdict
 - note payload
 - created timestamp
@@ -307,9 +288,9 @@ Track is derived from class, not operator whim.
 The frontier projection currently exposes:

 - frontier record
-- champion checkpoint id
-- active candidate checkpoint ids
-- experiment count
+- open experiment count
+- completed experiment count
+- verdict counts

 This projection is derived from canonical state and intentionally rebuildable.
@@ -336,11 +317,10 @@ It persists, in one transaction:

 - run node
 - run record
-- candidate checkpoint
 - decision node
 - experiment record
 - lineage and evidence edges
-- frontier touch and champion demotion when needed
+- frontier touch and verdict accounting inputs

 That atomic boundary is the answer to the ceremony/atomicity pre-mortem.

diff --git a/docs/libgrid-dogfood.md b/docs/libgrid-dogfood.md
index 59e214e..206c4d7 100644
--- a/docs/libgrid-dogfood.md
+++ b/docs/libgrid-dogfood.md
@@ -19,8 +19,8 @@ The MVP does not need to solve all of `libgrid`. It needs to solve this
 specific problem:

 replace the giant freeform experiment log with a machine in which the active
-frontier, the current champion, the candidate evidence, and the dead ends are
-all explicit and queryable.
+frontier, the accepted lines, the live evidence, and the dead ends are all
+explicit and queryable.

 When using a global unbound MCP session from a `libgrid` worktree, the first
 project-local action should be `project.bind` against the `libgrid` worktree
@@ -54,7 +54,6 @@ The root contract should state:
 Use `hypothesis.record` to capture:

 - what hypothesis is being tested
-- what base checkpoint it starts from
 - what benchmark suite matters
 - any terse sketch of the intended delta

@@ -65,19 +64,17 @@ The run node should capture:

 - exact command
 - cwd
 - backend kind
-- benchmark suite
-- code snapshot
+- run dimensions
 - resulting metrics

 ### Decision node

 The decision should make the verdict explicit:

-- promote to champion
-- keep on frontier
-- revert to champion
-- archive dead end
-- needs more evidence
+- accepted
+- kept
+- parked
+- rejected

 ### Off-path nodes

@@ -100,7 +97,6 @@ The MVP does not need hard rejection. It does need meaningful warnings.
 Good first project fields:

 - `hypothesis` on `hypothesis`
-- `base_checkpoint_id` on `hypothesis`
 - `benchmark_suite` on `hypothesis` and `run`
 - `body` on `hypothesis`, `source`, and `note`
 - `comparison_claim` on `analysis`
@@ -121,7 +117,6 @@ Good first metric vocabulary:

 1. Initialize the project store.
 2. Create a frontier contract.
-3. Capture the incumbent git checkpoint if available.

 ### 2. Start a line of attack

@@ -132,13 +127,12 @@ Good first metric vocabulary:
 ### 3. Execute one experiment

 1. Modify the worktree.
-2. Commit the candidate checkpoint.
-3. Run the benchmark protocol.
-4. Close the experiment atomically.
+2. Run the benchmark protocol.
+3. Close the experiment atomically.

 ### 4. Judge and continue

-1. Promote the checkpoint or keep it alive.
+1. Mark the line accepted, kept, parked, or rejected.
 2. Archive dead ends instead of leaving them noisy and active.
 3. Repeat.

@@ -148,12 +142,10 @@ For `libgrid`, the benchmark evidence needs to be structurally trustworthy.

 The MVP should always preserve at least:

-- benchmark suite identity
+- run dimensions
 - primary metric
 - supporting metrics
 - command envelope
-- host/worktree metadata
-- git commit identity

 This is the minimum needed to prevent "I think this was faster" folklore.
@@ -176,27 +168,28 @@ The right sequence is:

 ## Repo-Local Dogfood Before Libgrid

 This repository itself is a valid off-path dogfood target even though it is not
-currently a git repo.
+a benchmark-heavy repo.

 That means we can already use it to test:

 - project initialization
 - schema visibility
-- frontier creation without a champion
+- frontier creation and status projection
 - off-path source recording
 - hidden annotations
 - MCP read and write flows

-What it cannot honestly test is full git-backed core-path experiment closure.
-That still belongs in a real repo such as the `libgrid` worktree.
+What it cannot honestly test is heavy benchmark ingestion and the retrieval
+pressure that comes with it. That still belongs in a real optimization corpus
+such as the `libgrid` worktree.

 ## Acceptance Bar For Libgrid

 Fidget Spinner is ready for serious `libgrid` use when:

 - an agent can run for hours without generating a giant markdown graveyard
-- the operator can identify the champion checkpoint mechanically
-- each completed experiment has checkpoint, result, note, and verdict
+- the operator can identify accepted, kept, parked, and rejected lines
+  mechanically
+- each completed experiment has result, note, and verdict
 - off-path side investigations stay preserved but do not pollute the core path
 - the system feels like a machine for evidence rather than a diary with better
   typography

diff --git a/docs/product-spec.md b/docs/product-spec.md
index efa57df..85561ad 100644
--- a/docs/product-spec.md
+++ b/docs/product-spec.md
@@ -50,7 +50,7 @@ These are the load-bearing decisions to hold fixed through the MVP push.
 The canonical record is the DAG plus its normalized supporting tables.

 Frontier state is not a rival authority. It is a derived, rebuildable
-projection over the DAG and related run/checkpoint/experiment records.
+projection over the DAG and related run/experiment records.

 ### 2. Storage is per-project

@@ -103,8 +103,6 @@ benchmark/decision bureaucracy while still preserving it in the DAG.
 A completed experiment exists only when all of these exist together:

-- base checkpoint
-- candidate checkpoint
 - measured result
 - terse note
 - explicit verdict

@@ -112,13 +110,6 @@ A completed experiment exists only when all of these exist together:
 The write surface should make that one atomic mutation, not a loose sequence of
 low-level calls.

-### 7. Checkpoints are git-backed
-
-Dirty worktree snapshots are useful as descriptive context, but a completed
-core-path experiment should anchor to a committed candidate checkpoint.
-
-Off-path notes and source captures can remain lightweight and non-committal.
-
 ## Node Model

 ### Global envelope
@@ -203,9 +194,9 @@ The frontier is a derived operational view over the canonical DAG.
 It answers:

 - what objective is active
-- what the current champion checkpoint is
-- which candidate checkpoints are still alive
-- how many completed experiments exist
+- how many experiments are open
+- how many experiments are completed
+- how the verdict mix currently breaks down

 The DAG answers:

@@ -316,10 +307,10 @@ The MVP is successful when:
 - an agent can record off-path sources and notes without bureaucratic pain
 - the project schema can softly declare whether payload fields are strings,
   numbers, booleans, or timestamps
 - an operator can inspect recent nodes through a minimal localhost web
   navigator filtered by tag
-- a git-backed project can close a real core-path experiment atomically
+- a project can close a real core-path experiment atomically
 - retryable worker faults do not duplicate side effects
 - stale nodes can be archived instead of polluting normal enumeration
-- a human can answer "what changed, what ran, what is the current champion,
+- a human can answer "what was tried, what ran, what was accepted or parked,
   and why?" without doing markdown archaeology

 ## Full Product