Lab #70: AI agent discoverability: reduce trial-and-error for CLI syntax

Problem

AI agents (e.g. Claude Code) repeatedly fail to remember swamp CLI syntax across sessions and waste time stepping through --help output for every command. Common mistakes include:

--arg instead of --input for model method run
model run <name> <method> instead of model method run <name> <method>
Guessing subcommand structure (extension describe instead of extension pull/extension search)

Each session, the agent has to re-discover the correct syntax through trial-and-error, which adds latency and noise to every interaction.

Possible Solutions

swamp help --agent or swamp schema — Emit a structured JSON summary of all commands, subcommands, and their arguments in a single call. Agents could fetch this once per session and cache it in context, eliminating repeated --help calls.
MCP tool server — Expose swamp operations as MCP tools (e.g. swamp mcp serve). This would let agents discover available operations through the standard MCP tool listing protocol, with typed argument schemas, instead of parsing CLI help text.
Command suggestion in error output — The CLI already does "Did you mean ...?" for unknown subcommands. Extending this to flag-level suggestions (e.g. Unknown option "--arg". Did you mean "--input"?) would help agents self-correct faster. (Note: flag-level suggestions may already be partially implemented.)

Alternatives Considered

Saving CLI syntax in agent memory files: Works but is fragile — memory drifts as the CLI evolves, and every user's agent has to independently learn the same syntax.
Wrapper scripts: Adds maintenance burden and doesn't scale across the full CLI surface.

Additional Context

This was observed during a session where the agent tried --arg, then model run, then --help on three different subcommands before arriving at the correct swamp model method run <name> <method> --input key=value invocation. The full discovery took 4 tool calls that could have been 1.

Quality harness issue 1 by qualityfounder

Quality harness issue 11 by qualitymaven

Quality harness issue 10 by qualitymaven

Quality harness issue 9 by qualitymaven

Quality harness issue 8 by qualitymaven

Quality harness issue 7 by qualitymaven

Quality harness issue 6 by qualitymaven

Quality harness issue 5 by qualitymaven

Quality harness issue 4 by qualitymaven

Quality harness issue 3 by qualitymaven

Quality harness issue 2 by qualitymaven

Quality harness issue 1 by qualitymaven

Quality harness issue 1 by qualityrook

env var overrides for globalArguments silently change which environment a model targets

deno lint no-import-prefix conflicts with swamp's required npm:/jsr: inline specifiers

AI agents should search community extensions before offering to create custom models

feat: add --content-type filter to extension search

Add a 'reports' feature

Port remaining CLI commands to libswamp + renderer pattern

Persistent runner / server mode to eliminate per-invocation CLI startup overhead

AI agent should prefer fan-out model methods over shell loops for fleet operations

Support unbundled helper scripts in extension packages

Add progress indicator for long-running model methods

Automatic trace context propagation across workflow steps

Native OpenTelemetry tracing for swamp CLI internals

Claude Code should prefer extending swamp extensions over using external CLI tools

Framework log lines written to stdout pollute model method output

Swamp skill should guide AI agents to create extension models instead of inline shell scripts

Shared team collectives for collaborative extension publishing

Delete methods should return last known state data

Add smoke-test skill for extension models

Workflow run output is unreadable — resource data and errors are buried in log noise

Recommend or manage lockfile strategy for extension model npm dependencies

Add macOS Keychain vault type

bug: Vault secret shell escaping incomplete — $(), semicolons, and pipes are interpreted

Swamp should support "Scheduled"-type status for execution

feat: add support for Nix via a flake

ModelType.normalize replaces all dots with slashes, corrupting dotted type names

fix: collective validator error message omits drivers and datastores

Skill docs: document configSchema inline requirement and .describe() constraints for vault extensions

It takes 42 seconds to run swamp --help

After today upgrade

Auto-resolver fails when extensions/models/ directory does not exist for vault-only extensions

Azure Key Vault extension vault bundle fails to load in compiled binary

RangeError: Maximum call stack size exceeded when loading extension models with npm dependencies after upgrade to 20260317

Extension Zod schema incompatibility after upgrade to 20260312

Data garbage collection never fully removes expired data from disk

extension push --dry-run requires authentication unnecessarily

Typed context APIs to eliminate no-explicit-any in extension models

AWS extensions fail to load: missing _lib/aws.ts shared dependency

getContent() should resolve vault expressions for sensitive fields

getContent() should return parsed objects, not raw Uint8Array

Add context.readResource() to MethodContext for user extension models

Datastore lock: allow concurrent reads across different models

extension install should be an alias of extension pull

feat: Execution Driver Abstraction — Pluggable Isolation for Model Execution

Support --global-arg on model create

consider a libswamp

Running workflow should not block read-only operations

Update from 20260227 to 20260311 breaks existing model definitions (symlink + path-based type)

loadUserModels ignores --repo-dir, uses cwd instead

datastore lock release --force deadlocks on the stale lock it's trying to release

Datastore lock not released when interactive command fails in non-TTY context

vault put to 1Password stores empty password field

Leaderboard shows 0-event users whose profiles 404

Data instance name uniqueness is global across specs instead of per-spec

Audit timeline misses today's entries in UTC+ timezones

AI agent discoverability: reduce trial-and-error for CLI syntax

GlobalArguments CEL expressions fail when factory inputs are omitted — selective evaluation not working

Document factory pattern for model reuse in skills

Delete method uses identifier as data name instead of instance name

CLI method run ignores required arguments without .default()

Workflow run output mixes data, errors, and logs without clear separation — hard to find what actually executed

data.latest() in workflow step inputs resolved at workflow start, not at step execution time

Update method fails when globalArguments contain cross-model CEL expressions

fix: extension npm packages fail to resolve in compiled binary

swamp repo upgrade replaces entire CLAUDE.md with generic template instead of merging

Add drift detection to reconcile stored state against live cloud resources

Model inputs schema validated on all methods — delete fails requiring create-time inputs

Support wait-until-ready semantics for async resource provisioning