Add smoke-test skill for extension models
Opened by swampadmin · 1/17/2026
Problem statement
When developing extension models, AI agents frequently believe the model is correct after unit tests pass, but subtle corner-case bugs survive into pushes and extension publishes. These bugs are typically discovered only during manual smoke testing against live APIs — after the code has already been committed or published. Examples from real development sessions:
- Content-Type mismatches: v2 API required `application/vnd.api+json` but the model used `application/json` — unit tests with stubbed fetch didn't catch this
- Stale bundle caching: Source fixes weren't reflected at runtime because `.swamp/bundles/*.js` wasn't cleared — agents didn't know about this caching layer
- API validation quirks: Honeycomb boards require `type: "flexible"` in the body, not just a `name` — only discovered during live create
- `delete_protected` defaults: Honeycomb creates environments with `delete_protected: true`, making delete fail unless update is called first
- Read-only resource guards: Attempting create/update/delete on read-only resources like `dataset-definitions` or `auth` should be rejected before making API calls
These are the kinds of issues that unit tests with mocked responses can't catch, but a structured smoke-test protocol would.
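The Content-Type example above shows why mocked tests pass while live calls fail: a stub that never inspects request headers can't flag a wrong media type. A minimal TypeScript sketch of that failure mode — the endpoint URL and function names here are hypothetical, not from the actual extension:

```typescript
// A fetch-like signature reduced to what this illustration needs.
type FetchLike = (
  url: string,
  init: { headers: Record<string, string> },
) => Promise<{ status: number }>;

// Unit-test stub: returns 200 no matter what headers were sent.
const stubbedFetch: FetchLike = async () => ({ status: 200 });

// Stand-in for live v2 behavior: rejects anything but the JSON:API media type.
const liveLikeFetch: FetchLike = async (_url, init) =>
  init.headers["Content-Type"] === "application/vnd.api+json"
    ? { status: 200 }
    : { status: 415 }; // Unsupported Media Type

// Hypothetical model method containing the bug: wrong Content-Type.
async function createBoard(fetchImpl: FetchLike): Promise<number> {
  const res = await fetchImpl("https://api.example.com/2/boards", {
    headers: { "Content-Type": "application/json" },
  });
  return res.status;
}
```

Against `stubbedFetch` the call returns 200 and the unit test passes; against `liveLikeFetch` the same code gets a 415, which is exactly the class of bug only a live smoke test surfaces.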
Proposed solution
A `swamp-smoke-test` skill that agents can invoke (or that hooks trigger automatically) before git push, swamp extension push, or similar publish actions. The skill would:
- Discover the extension's method surface: Parse the model to enumerate all methods × resource types × argument combinations
- Generate a smoke-test plan: For each method, identify:
  - Safe read-only operations (GET/list) that can run against live APIs without side effects
  - CRUD cycle candidates: resources that can be safely created, updated, and deleted (with unique test names to avoid collisions)
  - Error-path tests: missing required args, read-only resource rejection, invalid auth
  - Corner cases specific to the API: required fields beyond `name`, default flags that block deletion, etc.
- Execute the plan: Run each test via `swamp model method run`, verify success/failure matches expectations
- Report results: Produce a structured summary table (method × resource × result) suitable for PR descriptions
- Clean up: Ensure all created test resources are deleted, even if intermediate steps fail
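The discovery and planning steps above could be sketched roughly as follows. The `ModelSurface` shape and method names are assumptions for illustration; the real skill would derive them from the extension's method schemas and resource registry:

```typescript
type Method = "list" | "get" | "create" | "update" | "delete";

// Hypothetical shape of what parsing the model would yield.
interface ModelSurface {
  resources: Record<string, { methods: Method[]; readOnly: boolean }>;
}

interface PlanEntry {
  resource: string;
  method: Method;
  // "rejected" means the model itself should refuse, before any HTTP call
  // (e.g. a mutation on a read-only resource).
  expect: "success" | "rejected";
}

function buildPlan(surface: ModelSurface): PlanEntry[] {
  const plan: PlanEntry[] = [];
  for (const [resource, info] of Object.entries(surface.resources)) {
    for (const method of info.methods) {
      const isMutation =
        method === "create" || method === "update" || method === "delete";
      plan.push({
        resource,
        method,
        expect: info.readOnly && isMutation ? "rejected" : "success",
      });
    }
  }
  // Order safe reads first, then CRUD steps, so failures surface
  // before any side effects are attempted.
  const order: Record<Method, number> = { list: 0, get: 1, create: 2, update: 3, delete: 4 };
  return plan.sort((a, b) => order[a.method] - order[b.method]);
}
```

Each `PlanEntry` would then map to one `swamp model method run` invocation, with the runner comparing the actual outcome against `expect`.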
Key design considerations
- The skill should be API-aware but not API-specific — it reads the model's method schemas and resource registry to generate tests, rather than hard-coding per-service knowledge
- It should never touch pre-existing resources — all created resources use unique names (e.g. `smoke-test-{resource}-{timestamp}`)
- It should handle permission errors gracefully — a 401 on `slos` because the key lacks permission is not a test failure, it's an expected constraint
- Bundle cache clearing (`.swamp/bundles/`) should be part of the pre-test setup
- The skill could optionally integrate with git hooks to block pushes when smoke tests fail
Alternatives considered
- Manual smoke testing: Current approach — works but is tedious, error-prone, and depends on the agent remembering to do it
- Enhanced unit tests: Better mocks could catch some issues, but can't catch Content-Type mismatches, bundle caching, or API validation quirks that only surface with real HTTP calls
- CI-based integration tests: Would require live API credentials in CI, which adds secret management complexity
Additional context
This was motivated by developing the @bixu/honeycomb extension, where multiple bugs survived unit tests and were only caught during manual smoke testing sessions. The pattern of "agent thinks it's done → smoke test reveals bugs → fix → re-test" repeated across several sessions. A skill that codifies this testing protocol would catch these issues earlier and more consistently.