Workflow reports succeeded when step outputs indicate failure
Opened by swampadmin · 5/8/2025
Summary
Workflows report "succeeded" even when the data produced by steps clearly indicates failure (e.g., command not found, missing kernel modules). The workflow engine determines step success entirely based on whether the model's execute function throws an exception — there is no mechanism for a model to signal "I completed execution but the result indicates a problem."
Reproduction
A user ran a proxmox-gpu-driver workflow with a verify step that checks for NVIDIA driver installation. The workflow reported "succeeded", but the produced data showed:
bash: line 1: nvidia-smi: command not found
NVIDIA_SMI_FAILED
--- Loaded NVIDIA Kernel Modules ---
(none)
Root Cause
MethodResult (in src/domain/models/model.ts) only carries dataHandles and followUpActions — it has no way to express failure. The entire success/failure chain relies solely on exceptions:
- Model execute function: returns { dataHandles: [...] } regardless of whether the underlying operation succeeded or failed
- DefaultMethodExecutionService.execute(): treats any non-throwing return as success
- DefaultStepExecutor.executeModelMethod(): calls output.markSucceeded() if no exception
- Step execution: calls stepRun.succeed() if no exception, stepRun.fail() only on catch
- Workflow completion: marks workflow as "succeeded" if all steps succeeded
This affects:
- command/shell model: captures exit codes as data attributes but never throws on non-zero exit codes
- All user-defined extension models: no guidance or mechanism to signal failure through the return value
- Any model that captures errors as data rather than throwing
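The chain described above can be illustrated with a minimal sketch. Every name and shape here (the simplified MethodResult, runShellLikeModel, runStep) is a hypothetical stand-in rather than the project's actual code, and the example is synchronous for brevity. The exit code 127 is illustrative (it is what bash reports for "command not found"); the issue does not state the exit code from the actual run.

```typescript
// Hypothetical, simplified shapes. The real definitions live in
// src/domain/models/model.ts and may differ.
interface DataHandle {
  name: string;
  attributes: Record<string, unknown>;
}

interface MethodResult {
  dataHandles: DataHandle[];
  followUpActions?: unknown[];
}

type StepStatus = "succeeded" | "failed";

// Stand-in for a command/shell style model: the exit code is recorded
// as data, but nothing is thrown when it is non-zero.
function runShellLikeModel(exitCode: number, stdout: string): MethodResult {
  return {
    dataHandles: [{ name: "result", attributes: { exitCode, stdout } }],
  };
}

// Stand-in for the engine: the only failure signal it sees is an exception.
function runStep(execute: () => MethodResult): StepStatus {
  try {
    execute();
    return "succeeded";
  } catch {
    return "failed";
  }
}

const status = runStep(() =>
  runShellLikeModel(127, "bash: line 1: nvidia-smi: command not found"),
);
console.log(status); // "succeeded", despite the non-zero exit code
```

Because the returned data is never inspected, the non-zero exit code has no effect on the step status in this sketch.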
Key Files
- src/domain/models/model.ts: MethodResult interface (no success/failure field)
- src/domain/models/method_execution_service.ts: executes methods, no failure check on result
- src/domain/models/command/shell/shell_model.ts: always returns success regardless of exit code
- src/domain/models/user_model_loader.ts: wrapUserExecute passes through result without checking
- src/domain/workflows/execution_service.ts: step/job/workflow success determined only by exceptions
Suggested Approach
Add a way for models to signal failure through their return value rather than relying solely on exceptions. For example:
- Add an optional failed/error field to MethodResult that the engine checks after execution
- Make command/shell throw (or set the failure signal) on non-zero exit codes, with an opt-out like ignoreExitCode
- Document the convention for extension model authors
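One possible shape for this, as a rough sketch only: the type ProposedMethodResult, the field name error, and the helpers toShellResult and resolveStepStatus are all assumptions made for illustration. The issue names only a "failed/error" field and an ignoreExitCode opt-out without fixing their exact form. The example is synchronous for brevity.

```typescript
// Hypothetical extension of MethodResult. The field name `error` is an
// assumption; the issue suggests "failed/error" without settling on one.
interface ProposedMethodResult {
  dataHandles: { name: string; attributes: Record<string, unknown> }[];
  followUpActions?: unknown[];
  error?: string; // set when execution completed but the outcome is bad
}

// Hypothetical options for a command/shell style model.
interface ShellOptions {
  ignoreExitCode?: boolean; // opt-out named in the issue
}

// Builds the result for a finished command. Sets `error` on a non-zero
// exit code unless the caller opted out.
function toShellResult(
  exitCode: number,
  stdout: string,
  options: ShellOptions = {},
): ProposedMethodResult {
  const result: ProposedMethodResult = {
    dataHandles: [{ name: "result", attributes: { exitCode, stdout } }],
  };
  if (exitCode !== 0 && !options.ignoreExitCode) {
    result.error = `command exited with code ${exitCode}`;
  }
  return result;
}

// Engine-side check: an exception or a populated `error` both fail the step.
function resolveStepStatus(
  execute: () => ProposedMethodResult,
): "succeeded" | "failed" {
  try {
    const result = execute();
    return result.error === undefined ? "succeeded" : "failed";
  } catch {
    return "failed";
  }
}

const strict = resolveStepStatus(() => toShellResult(127, "command not found"));
const lenient = resolveStepStatus(() =>
  toShellResult(127, "command not found", { ignoreExitCode: true }),
);
console.log(strict, lenient); // failed succeeded
```

Signaling failure through the return value, rather than throwing, lets the model still hand back its data handles, so the captured output stays available for inspection even when the step is marked failed.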
Closed