Bug · Closed · Swamp Club
Assignees: None

Workflow reports succeeded when step outputs indicate failure

Opened by swampadmin · 5/8/2025

Summary

Workflows report "succeeded" even when the data produced by steps clearly indicates failure (e.g., command not found, missing kernel modules). The workflow engine determines step success entirely based on whether the model's execute function throws an exception — there is no mechanism for a model to signal "I completed execution but the result indicates a problem."

Reproduction

A user ran a proxmox-gpu-driver workflow with a verify step that checks for NVIDIA driver installation. The workflow reported "succeeded", but the produced data showed:

bash: line 1: nvidia-smi: command not found
NVIDIA_SMI_FAILED

--- Loaded NVIDIA Kernel Modules ---
(none)

Root Cause

MethodResult (in src/domain/models/model.ts) only carries dataHandles and followUpActions — it has no way to express failure. The entire success/failure chain relies solely on exceptions:

  1. Model execute function — returns { dataHandles: [...] } regardless of whether the underlying operation succeeded or failed
  2. DefaultMethodExecutionService.execute() — treats any non-throwing return as success
  3. DefaultStepExecutor.executeModelMethod() — calls output.markSucceeded() if no exception
  4. Step execution — calls stepRun.succeed() if no exception, stepRun.fail() only on catch
  5. Workflow completion — marks workflow as "succeeded" if all steps succeeded
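
The chain above can be sketched in a few lines. This is a simplified illustration, not the engine's code: the DataHandle shape and the runStep and verifyDriver names are assumptions, and only the dataHandles and followUpActions fields come from this issue.

```typescript
// Hypothetical, simplified shapes; the real types in src/domain/models/model.ts may differ.
interface DataHandle {
  name: string;
  content: string;
}

interface MethodResult {
  dataHandles: DataHandle[];
  followUpActions?: unknown[];
}

type StepStatus = "succeeded" | "failed";

// Stand-in for the step executor: the only failure path is the catch block.
function runStep(execute: () => MethodResult): StepStatus {
  try {
    execute(); // the returned data is never inspected
    return "succeeded";
  } catch {
    return "failed";
  }
}

// Stand-in for the verify step: the check failed, but the model only records the output.
function verifyDriver(): MethodResult {
  return {
    dataHandles: [
      {
        name: "verify",
        content: "bash: line 1: nvidia-smi: command not found\nNVIDIA_SMI_FAILED",
      },
    ],
  };
}

const stepStatuses: StepStatus[] = [runStep(verifyDriver)];
const workflowStatus: StepStatus = stepStatuses.every((s) => s === "succeeded")
  ? "succeeded"
  : "failed";

console.log(workflowStatus); // "succeeded"
```

Because nothing in the chain reads the returned data, the failure text in the data handle has no effect on the reported status.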

This affects:

  • command/shell model — captures exit codes as data attributes but never throws on non-zero exit codes
  • All user-defined extension models — no guidance or mechanism to signal failure through the return value
  • Any model that captures errors as data rather than throwing
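
As a rough illustration of the first bullet, a shell-style model might record the exit code like this. The CommandOutcome and ShellDataHandle shapes and the toShellHandle helper are hypothetical and are not taken from shell_model.ts.

```typescript
// Hypothetical shapes for illustration; not the actual shell_model.ts code.
interface CommandOutcome {
  exitCode: number;
  stdout: string;
  stderr: string;
}

interface ShellDataHandle {
  name: string;
  attributes: CommandOutcome;
}

// Mirrors the reported behaviour: the exit code is stored as data and nothing is thrown.
function toShellHandle(outcome: CommandOutcome): ShellDataHandle {
  return { name: "command-output", attributes: { ...outcome } };
}

// 127 is the conventional shell exit status for "command not found".
const shellHandle = toShellHandle({
  exitCode: 127,
  stdout: "",
  stderr: "bash: line 1: nvidia-smi: command not found",
});

// The model returns normally, so the engine sees a non-throwing result.
const shellResult = { dataHandles: [shellHandle] };

console.log(shellHandle.attributes.exitCode); // 127
```

The information needed to detect the failure is present in the data, but it never reaches the step status.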

Key Files

  • src/domain/models/model.ts: MethodResult interface (no success/failure field)
  • src/domain/models/method_execution_service.ts: executes methods, no failure check on result
  • src/domain/models/command/shell/shell_model.ts: always returns success regardless of exit code
  • src/domain/models/user_model_loader.ts: wrapUserExecute passes through result without checking
  • src/domain/workflows/execution_service.ts: step/job/workflow success determined only by exceptions

Suggested Approach

Add a way for models to signal failure through their return value rather than relying solely on exceptions. For example:

  • Add an optional failed / error field to MethodResult that the engine checks after execution
  • Make command/shell throw (or set the failure signal) on non-zero exit codes, with an opt-out like ignoreExitCode
  • Document the convention for extension model authors
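
One possible shape for this proposal is sketched below. The failed, error and ignoreExitCode names follow the suggestions above, while ProposedMethodResult, shellResultFor and stepStatusFor are hypothetical names used only for this illustration.

```typescript
// Hypothetical sketch of the proposal; this is not existing engine code.
interface ProposedMethodResult {
  dataHandles: unknown[];
  followUpActions?: unknown[];
  failed?: boolean; // proposed: execution completed, but the result indicates a problem
  error?: string; // proposed: human-readable reason
}

interface ShellOptions {
  ignoreExitCode?: boolean; // opt-out suggested in this issue
}

// Shell-style model: flag non-zero exit codes unless the caller opts out.
function shellResultFor(
  exitCode: number,
  output: string,
  options: ShellOptions = {},
): ProposedMethodResult {
  const result: ProposedMethodResult = { dataHandles: [{ exitCode, output }] };
  if (exitCode !== 0 && !options.ignoreExitCode) {
    result.failed = true;
    result.error = `command exited with code ${exitCode}`;
  }
  return result;
}

// Engine side: exceptions still fail the step, and so does an explicit failure signal.
function stepStatusFor(execute: () => ProposedMethodResult): "succeeded" | "failed" {
  try {
    const result = execute();
    return result.failed ? "failed" : "succeeded";
  } catch {
    return "failed";
  }
}

console.log(stepStatusFor(() => shellResultFor(127, "nvidia-smi: command not found"))); // "failed"
console.log(stepStatusFor(() => shellResultFor(127, "", { ignoreExitCode: true }))); // "succeeded"
console.log(stepStatusFor(() => shellResultFor(0, "ok"))); // "succeeded"
```

Keeping both fields optional means existing models that return only dataHandles would continue to be treated as successful, so the change could be adopted incrementally.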