# Tutorial: File Input — CLI Document Processor
This tutorial walks through building a workflow that processes file content piped via stdin — or read from a path — and returns an LLM-generated summary. This pattern is useful for document analysis, batch processing, ETL pipelines, and any scenario where you want to drive a KDeps workflow from the command line.
## Prerequisites
- kdeps CLI installed (`kdeps version`)
- Ollama installed locally (or `installOllama: true` in `agentSettings`)
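Before starting, a quick preflight check can save a confusing failure later. This is a generic shell sketch (the `check_tool` helper is ours, not part of kdeps):

```shell
#!/bin/sh
# Report whether a required tool is on PATH.
check_tool() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "$1: ok"
  else
    echo "$1: missing"
  fi
}

check_tool kdeps   # prints "kdeps: ok" once the CLI is installed
check_tool ollama  # optional if installOllama: true handles it for you
```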
## How the File Source Works
When `sources: [file]` is configured, KDeps resolves the input in priority order:

1. Uses the `--file` CLI argument if provided (highest priority).
2. Reads content from stdin (raw text or JSON `{"path":"…","content":"…"}`).
3. Falls back to the `KDEPS_FILE_PATH` environment variable.
4. Falls back to the configured `input.file.path` field.

If only a path is provided, the file is read from disk. The workflow executes once and exits.
Resources access the content via `input('fileContent')` and the path via `input('filePath')`.
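When scripting the JSON stdin form, hand-rolling the escaping is error-prone. Here is a sketch using `jq` (assuming jq ≥ 1.6 for `--rawfile`; the payload shape matches the `{"path":…,"content":…}` format described above):

```shell
# Write a sample document, then build the JSON payload with jq so quotes
# and newlines in the content are escaped correctly.
printf 'Q1 revenue exceeded targets by 12%%.\nSee "appendix A".\n' > /tmp/report.txt

payload=$(jq -n \
  --arg path /tmp/report.txt \
  --rawfile content /tmp/report.txt \
  '{path: $path, content: $content}')

printf '%s\n' "$payload"
# Pipe it into the workflow:
# printf '%s' "$payload" | ./kdeps run workflow.yaml
```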
## Step 1 — Create the Workflow File
```yaml
# workflow.yaml
apiVersion: kdeps.io/v1
kind: Workflow
metadata:
  name: doc-summarizer
  description: Summarize a document piped via stdin
  version: "1.0.0"
targetActionId: summarize
settings:
  agentSettings:
    timezone: Etc/UTC
    installOllama: true
    models:
      - llama3.2:3b
  input:
    sources: [file]
    # Optional: default file path when stdin and KDEPS_FILE_PATH are not set
    # file:
    #   path: /tmp/default-document.txt
```

Key points:

- `sources: [file]` enables the file input subsystem.
- `targetActionId: summarize` — the workflow ends by executing the `summarize` resource.
- No API server is started; the process reads input, runs once, and exits.
## Step 2 — Create the LLM Resource
```yaml
# resources/summarize.yaml
apiVersion: kdeps.io/v1
kind: Resource
metadata:
  actionId: summarize
  name: Summarize Document
run:
  chat:
    model: llama3.2:3b
    prompt: |
      You are a concise document summarizer.
      Summarize the following document in 3–5 bullet points:

      {{ input('fileContent') }}
```

The `input('fileContent')` expression injects the file's text content into the LLM prompt. You can also access `input('filePath')` if you need to reference the source path.
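To make the substitution concrete, here is a plain-shell illustration of what rendering does conceptually (this is not kdeps internals, just the same effect reproduced with a variable):

```shell
# Stand-in for input('fileContent'):
doc='Q1 revenue exceeded targets by 12%.
Headcount grew from 40 to 52.'

# The rendered prompt is the template with the placeholder replaced by the text:
prompt="You are a concise document summarizer.
Summarize the following document in 3-5 bullet points:

${doc}"

printf '%s\n' "$prompt"
```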
## Step 3 — Run the Workflow
### Option A — Pass the file path as a CLI argument (highest priority)

```shell
./kdeps run workflow.yaml --file /path/to/report.txt
```

This is the simplest and most explicit option — no stdin redirection, no environment variables.
### Option B — Pipe raw text from stdin

```shell
cat report.txt | ./kdeps run workflow.yaml
```

### Option C — Pipe a JSON object with a file path
The file is read from disk automatically:

```shell
echo '{"path":"/tmp/report.txt"}' | ./kdeps run workflow.yaml
```

### Option D — Pipe a JSON object with inline content
```shell
echo '{"path":"/tmp/report.txt","content":"Q1 revenue exceeded targets by 12%..."}' \
  | ./kdeps run workflow.yaml
```

### Option E — Use an environment variable
```shell
KDEPS_FILE_PATH=/tmp/report.txt ./kdeps run workflow.yaml
```

### Option F — Use the configured default path
Set `input.file.path` in `workflow.yaml` and run without stdin:

```yaml
settings:
  input:
    sources: [file]
    file:
      path: /tmp/report.txt
```

```shell
./kdeps run workflow.yaml
```

## Step 4 — Using the File Path in Resources
If you need to reference the source file path (for example, to log it or pass it to another resource):
```yaml
run:
  exec:
    command: echo
    args:
      - "Processing file: {{ input('filePath') }}"
```

## Step 5 — Chaining Resources
You can chain multiple resources: each can read the file via `input('fileContent')`, and downstream resources retrieve earlier outputs with `get()`:
```yaml
# resources/extract.yaml
metadata:
  actionId: extract
run:
  exec:
    command: bash
    args:
      - "-c"
      - "echo '{{ input('fileContent') }}' | wc -w"
```

```yaml
# resources/summarize.yaml
metadata:
  actionId: summarize
dependencies: [extract]
run:
  chat:
    model: llama3.2:3b
    prompt: |
      Document word count: {{ get('extract') }}

      Summarize this document:

      {{ input('fileContent') }}
```

## Full Working Example
`workflow.yaml`:

```yaml
apiVersion: kdeps.io/v1
kind: Workflow
metadata:
  name: doc-summarizer
  version: "1.0.0"
targetActionId: summarize
settings:
  agentSettings:
    installOllama: true
    models:
      - llama3.2:3b
  input:
    sources: [file]
```

`resources/summarize.yaml`:
```yaml
apiVersion: kdeps.io/v1
kind: Resource
metadata:
  actionId: summarize
run:
  chat:
    model: llama3.2:3b
    prompt: |
      Summarize this document in 3 bullet points:

      {{ input('fileContent') }}
```

Run it:

```shell
cat /path/to/document.txt | ./kdeps run workflow.yaml
```

## Integration with Shell Scripts
The file source is designed for scripting. Here is a shell script that summarizes every `.txt` file in a directory:

```shell
#!/bin/bash
for f in /docs/*.txt; do
  echo "=== Summarizing $f ==="
  ./kdeps run workflow.yaml --file "$f"
done
```

## Testing
The file input runner is tested at multiple levels:
| Test level | File | What it covers |
|---|---|---|
| Unit (black-box) | `pkg/input/file/runner_test.go` | `readFileInput` — all resolution paths, priority ordering, error cases |
| Unit (white-box) | `pkg/input/file/runner_internal_test.go` | `runWithReader`, `Run`, `RunWithArg` — success and error paths; 100% statement coverage |
| E2E integration | `tests/integration/executor/file_e2e_integration_test.go` | Full engine pipeline: raw stdin, `--file` arg, env var, config path, multi-resource, large file, request body keys |
| Cmd integration | `tests/integration/cmd/file_cmd_integration_test.go` | `StartFileRunner` dispatch, `RunFlags.FileArg` field wiring, error propagation |
Run all file-input tests:

```shell
# Unit tests (100% coverage)
go test -v -coverprofile=cover.out ./pkg/input/file/...
go tool cover -func=cover.out

# E2E integration tests
go test -v -timeout 60s ./tests/integration/executor/ -run TestE2E_FileInput

# Cmd integration tests
go test -v -timeout 60s ./tests/integration/cmd/ -run "TestStartFileRunner|TestRunFlags"
```

## See Also
- Input Sources — Full reference for all input source types
- LLM Resource — Language model configuration
- Exec Resource — Shell command execution
- Bot Tutorial — Chat bot with stateless stdin input (similar single-shot pattern)