Docker Deployment
kdeps bundle build packages your workflow into a Docker image that starts an API server when run. No Dockerfile needed -- kdeps generates one from your workflow.yaml.
Overview
# Package workflow into .kdeps file
kdeps bundle package workflow.yaml
# Build Docker image
kdeps bundle build myagent-1.0.0.kdeps --tag myregistry/myagent:latest
# Or with GPU support
kdeps bundle build myagent-1.0.0.kdeps --gpu cuda --tag myregistry/myagent:latest-gpu
Packaging
The package command creates a .kdeps archive containing your workflow and resources:
kdeps bundle package path/to/workflow.yaml
This creates myagent-1.0.0.kdeps (the name and version come from the workflow metadata).
What's Included
myagent-1.0.0.kdeps
├── workflow.yaml # workflow entry point
├── resources/ # all resource YAML files
├── data/ # data files and scripts
├── requirements.txt # Python dependencies (if present)
└── public/ # static files (if present)
Building Docker Images
Basic Build
kdeps bundle build myagent-1.0.0.kdeps
Creates image: kdeps-myagent:1.0.0
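The naming pattern visible in these examples (archive myagent-1.0.0.kdeps, default image kdeps-myagent:1.0.0) can be sketched in Python. This is an illustration inferred from the examples on this page, not kdeps source code: `artifact_names` is a hypothetical helper, and it assumes the name and version are taken from the workflow metadata without any normalization.

```python
from typing import NamedTuple


class ArtifactNames(NamedTuple):
    archive: str  # file written by `kdeps bundle package`
    image: str    # default image tag used by `kdeps bundle build`


def artifact_names(name: str, version: str) -> ArtifactNames:
    """Derive artifact names from workflow metadata.

    Assumption: the pattern is inferred from the documented examples only.
    """
    if not name or not version:
        raise ValueError("workflow metadata must provide both name and version")
    return ArtifactNames(
        archive=f"{name}-{version}.kdeps",
        image=f"kdeps-{name}:{version}",
    )
```

Calling `artifact_names("myagent", "1.0.0")` reproduces the two names used throughout this page, which can be handy in CI scripts that need to locate the archive or image after a build.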
Custom Tag
kdeps bundle build myagent-1.0.0.kdeps --tag myregistry/myagent:latest
Show Dockerfile
View the generated Dockerfile without building:
kdeps bundle build myagent-1.0.0.kdeps --show-dockerfile
GPU Support
Build images with GPU acceleration:
# NVIDIA CUDA
kdeps bundle build myagent-1.0.0.kdeps --gpu cuda
# AMD ROCm
kdeps bundle build myagent-1.0.0.kdeps --gpu rocm
# Intel oneAPI
kdeps bundle build myagent-1.0.0.kdeps --gpu intel
# Vulkan (cross-platform)
kdeps bundle build myagent-1.0.0.kdeps --gpu vulkan
GPU Runtime
When running GPU-enabled images:
# NVIDIA
docker run --gpus all myregistry/myagent:latest
# AMD
docker run --device=/dev/kfd --device=/dev/dri myregistry/myagent:latest
Base OS Auto-Selection
KDeps automatically selects the base OS based on GPU requirements:
- No --gpu flag → Alpine (CPU-only, smallest images, ~300MB)
- --gpu specified → Ubuntu (GPU support, glibc-based)
The OS is automatically chosen to ensure compatibility:
# CPU-only: Uses Alpine (smallest)
kdeps bundle build myagent-1.0.0.kdeps
# GPU: Uses Ubuntu (required for GPU drivers)
kdeps bundle build myagent-1.0.0.kdeps --gpu cuda
Why Auto-Selection?
- Alpine uses musl libc and cannot run GPU workloads (NVIDIA CUDA, AMD ROCm require glibc)
- Ubuntu uses glibc and supports all GPU types
- Auto-selection prevents invalid combinations (e.g., Alpine + CUDA)
| Configuration | Base OS | Image Size | Use Case |
|---|---|---|---|
| kdeps bundle build . | Alpine | ~300MB | CPU-only, edge deployment |
| kdeps bundle build . --gpu cuda | Ubuntu | ~800MB+ | NVIDIA GPU inference |
| kdeps bundle build . --gpu rocm | Ubuntu | ~800MB+ | AMD GPU inference |
| kdeps bundle build . --gpu intel | Ubuntu | ~600MB+ | Intel GPU inference |
| kdeps bundle build . --gpu vulkan | Ubuntu | ~600MB+ | Cross-platform GPU |
Override the auto-selected distro in workflow.yaml:
settings:
  agentSettings:
    baseOS: ubuntu # alpine (default) or ubuntu
--gpu always forces Ubuntu. debian is not supported.
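The selection rules in this section can be summarized in a few lines of Python. This is a sketch of the documented behavior only, not the kdeps implementation; `select_base_os` and its parameter names are hypothetical.

```python
from typing import Optional

SUPPORTED_GPUS = {"cuda", "rocm", "intel", "vulkan"}
SUPPORTED_BASE_OS = {"alpine", "ubuntu"}  # debian is not supported


def select_base_os(gpu: Optional[str] = None, base_os: Optional[str] = None) -> str:
    """Return the base OS implied by the documented rules.

    gpu     -- value of the --gpu flag, or None when the flag is absent
    base_os -- value of settings.agentSettings.baseOS, or None when unset
    """
    if gpu is not None and gpu not in SUPPORTED_GPUS:
        raise ValueError(f"unsupported GPU type: {gpu!r}")
    if base_os is not None and base_os not in SUPPORTED_BASE_OS:
        raise ValueError(f"unsupported base OS: {base_os!r}")
    if gpu is not None:
        return "ubuntu"  # --gpu always forces Ubuntu, even if baseOS says alpine
    return base_os or "alpine"  # CPU-only builds default to Alpine
```

Rejecting unknown values up front mirrors the purpose of auto-selection: an invalid combination such as Alpine with CUDA never reaches the build step.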
LLM Backend in Images
By default no LLM server is installed: chat resources run on the file backend, and the llamafile model for each chat resource is downloaded into the image at build time (/app/.kdeps/models/). The container is a vanilla alpine:latest or ubuntu:latest base plus kdeps and the baked models. The llamafile serves itself inside the container, so there is no extra process to manage.
default (file backend) → FROM alpine:latest or ubuntu:latest
+ pre-baked llamafile per chat model (~1.1 GB for llama3.2:1b)
no chat resources → FROM alpine:latest or ubuntu:latest (vanilla)
Ollama Docker Images (opt-in)
Ollama is bundled only when one of the following holds: installOllama: true is set, KDEPS_DEFAULT_BACKEND=ollama is set and the workflow has chat resources, or KDEPS_LLM_ROUTER routes to Ollama.
needs Ollama + alpine CPU → FROM alpine/ollama:<tag> (~70MB third-party CPU image)
needs Ollama + ubuntu CPU → FROM ollama/ollama:<tag> (official image)
needs Ollama + --gpu cuda → FROM ollama/ollama:<tag> (runtime: docker run --gpus all)
needs Ollama + --gpu rocm → FROM ollama/ollama:rocm (runtime: --device /dev/kfd --device /dev/dri)
no Ollama → FROM alpine:latest or ubuntu:latest
| Image | Size | GPU | Source |
|---|---|---|---|
| alpine/ollama | ~70MB | CPU only | alpine-docker/ollama |
| ollama/ollama | ~4GB | CPU + NVIDIA | Official Ollama Docker |
| ollama/ollama:rocm | varies | AMD | Official :rocm variant |
When the base image already includes Ollama, kdeps does not COPY --from a second Ollama layer.
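The mapping from build options to the FROM line can be expressed as a small decision function. The sketch below only restates the cases listed in this section; `base_image` is a hypothetical helper, and because the section does not say which image is used for Ollama with --gpu intel or --gpu vulkan, those combinations raise an error here rather than guess.

```python
from typing import Optional


def base_image(needs_ollama: bool, base_os: str,
               gpu: Optional[str], ollama_tag: str) -> str:
    """Return the base image for the documented combinations.

    base_os    -- "alpine" or "ubuntu" (already resolved; --gpu implies ubuntu)
    ollama_tag -- pinned Ollama release used as the Docker image tag
    """
    if not needs_ollama:
        return f"{base_os}:latest"  # vanilla base, no Ollama layer
    if gpu == "rocm":
        return "ollama/ollama:rocm"  # fixed tag for AMD
    if gpu is None and base_os == "alpine":
        return f"alpine/ollama:{ollama_tag}"  # third-party CPU image
    if gpu is None or gpu == "cuda":
        return f"ollama/ollama:{ollama_tag}"  # official image
    # Not covered by this page: Ollama combined with intel or vulkan.
    raise ValueError(f"no documented Ollama base image for gpu={gpu!r}")
```

For example, `base_image(True, "alpine", None, "0.5.4")` yields alpine/ollama:0.5.4, while the same call with `needs_ollama=False` yields alpine:latest.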
Offline Mode
Bake models into the image for air-gapped deployments:
# workflow.yaml
settings:
  agentSettings:
    offlineMode: true
    models:
      - llama3.2:1b
      - llama3.2-vision
Build with models included:
kdeps bundle build myagent-1.0.0.kdeps
The resulting image contains all models and doesn't require internet access.
Python Dependencies
Using requirements.txt
# workflow.yaml
settings:
  agentSettings:
    requirementsFile: "requirements.txt"
KDeps uses uv for fast Python package management (97% smaller than Anaconda).
Inline Packages
# workflow.yaml
settings:
  agentSettings:
    pythonVersion: "3.12"
    pythonPackages:
      - pandas>=2.0
      - numpy
      - scikit-learn
System Packages
Install OS-level packages:
# workflow.yaml
settings:
  agentSettings:
    osPackages:
      - ffmpeg
      - imagemagick
      - tesseract-ocr
      - poppler-utils
    repositories:
      - ppa:alex-p/tesseract-ocr-devel
Package Version Pinning
On every bundle build, kdeps resolves package versions before generating the Dockerfile:
- kdeps: latest GitHub release; install.sh is fetched from that tag and the same tag is passed to the installer (never main + floating latest)
- ollama: latest ollama/ollama release as the Docker image tag for ollama/ollama and alpine/ollama (never :latest). --gpu rocm uses the fixed ollama/ollama:rocm tag instead.
- uv: latest astral-sh/uv release as the ghcr.io/astral-sh/uv tag
Override any field with an explicit semver (v1.2.3 or 1.2.3). Use latest or omit a field to accept the resolved value at build time.
# workflow.yaml
settings:
  agentSettings:
    versions:
      kdeps: v2.0.0 # optional; default: newest GitHub release when bundle build runs
      ollama: 0.5.4 # optional; default: newest ollama/ollama release
      uv: 0.6.3 # optional; default: newest astral-sh/uv release
Preview resolved pins:
kdeps bundle build myagent-1.0.0.kdeps --show-dockerfile
When no Ollama base image is selected, kdeps uses alpine:latest or ubuntu:latest. Python defaults to 3.12 when pythonVersion is omitted.
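The override rule for version pins (an explicit semver wins, while latest or an omitted field falls back to the release resolved at build time) can be sketched as follows. `resolve_pin` is a hypothetical helper, not part of kdeps, and it assumes that only plain three-part versions with an optional leading v are accepted, since those are the only forms this page mentions.

```python
import re
from typing import Optional

# Accepts "1.2.3" or "v1.2.3"; pre-release suffixes are outside this sketch.
_SEMVER = re.compile(r"v?\d+\.\d+\.\d+")


def resolve_pin(configured: Optional[str], resolved_latest: str) -> str:
    """Pick the version to pin for one component (kdeps, ollama, or uv).

    configured      -- value from settings.agentSettings.versions, or None
    resolved_latest -- newest release looked up when the build runs
    """
    if configured is None or configured == "latest":
        return resolved_latest
    if not _SEMVER.fullmatch(configured):
        raise ValueError(f"not an explicit semver: {configured!r}")
    return configured
```

Validating the configured value before use means a typo such as 0.5 fails the build early instead of producing an image tag that does not exist.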
Environment Variables
Build-time Args
# workflow.yaml
settings:
  agentSettings:
    args:
      BUILD_VERSION: ""
Pass during build:
docker build --build-arg BUILD_VERSION=1.0.0 ...
Runtime Environment
# workflow.yaml
settings:
  agentSettings:
    env:
      LOG_LEVEL: "info"
      API_TIMEOUT: "30"
Override at runtime:
docker run -e LOG_LEVEL=debug myregistry/myagent:latest
Docker Compose
KDeps generates a docker-compose.yml:
# docker-compose.yml
version: '3.8'
services:
  myagent:
    image: kdeps-myagent:1.0.0
    ports:
      - "16395:16395" # API server and web server (if enabled)
    environment:
      - LOG_LEVEL=info
    volumes:
      - ollama:/root/.ollama
      - kdeps_data:/agent/volume
    restart: on-failure
    # For GPU:
    # deploy:
    #   resources:
    #     reservations:
    #       devices:
    #         - driver: nvidia
    #           count: 1
    #           capabilities: [gpu]
volumes:
  ollama:
  kdeps_data:
Run with:
docker-compose up -d
Optimized Build Process
KDeps uses a streamlined build process that leverages the official installation script. This ensures the smallest possible image size and maximum compatibility.
# Example of generated Dockerfile logic
FROM alpine:3.18
# Upgrade base packages (security patches), then install dependencies
RUN apk upgrade --no-cache && \
apk add --no-cache curl bash python3 py3-pip
# Install kdeps via official install script. Released CLIs pin both the
# script ref and the binary tag to their own version; dev builds use main.
RUN curl -LsSf https://raw.githubusercontent.com/kdeps/kdeps/v2.1.0/install.sh | sh -s -- -b /usr/local/bin v2.1.0
# Copy agent files
COPY workflow.yaml /app/workflow.yaml
COPY resources/ /app/resources/
WORKDIR /app
ENTRYPOINT ["kdeps"]
CMD ["run", "workflow.yaml"]
The build process also automatically handles:
- Python environments: Integrated uv for 97% smaller virtual environments.
- Model management: Pre-pulling models for offline readiness.
- Service orchestration: Lightweight supervisor to manage API and LLM processes.
Health Checks
Add a health endpoint:
# workflow.yaml
settings:
  apiServer:
    routes:
      - path: /health
        methods: [GET]
In Docker Compose:
# docker-compose.yml
services:
  myagent:
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:16395/health"]
      interval: 30s
      timeout: 10s
      retries: 3
Kubernetes Deployment
KDeps generates Kubernetes manifests directly from your workflow.yaml using kdeps export k8s. No manual YAML authoring needed.
# Build and push the Docker image first
kdeps bundle build . --tag myregistry/myagent:1.0.0
docker push myregistry/myagent:1.0.0
# Generate manifests
kdeps export k8s . \
  --image myregistry/myagent:1.0.0 \
  --output k8s.yaml
# Apply to cluster
kubectl apply -f k8s.yaml
Configure Kubernetes settings in workflow.yaml:
# workflow.yaml
settings:
  portNum: 16395
  agentSettings:
    replicas: 3
    resources:
      cpuLimit: "2000m"
      memoryLimit: "4Gi"
      cpuRequest: "500m"
      memoryRequest: "1Gi"
    env:
      LOG_LEVEL: info
The generated manifest includes a Deployment with readiness/liveness probes and a ClusterIP Service, both derived from your workflow settings.
See the Kubernetes Deployment guide for the full reference.
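Readiness probes and the Compose health check both come down to polling an HTTP endpoint until it answers. For smoke tests outside Docker or Kubernetes (for example in CI after starting a container), the same idea can be sketched in Python with only the standard library. `wait_for_health` is a hypothetical helper; the URL in the usage note assumes the /health route and port 16395 from the examples on this page, and that the route returns a 2xx status when the agent is ready.

```python
import http.client
import time
import urllib.error
import urllib.request


def wait_for_health(url: str, timeout: float = 60.0, interval: float = 2.0) -> bool:
    """Poll `url` until it returns a 2xx status or `timeout` seconds elapse."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            with urllib.request.urlopen(url, timeout=5) as response:
                if 200 <= response.status < 300:
                    return True
        except (urllib.error.URLError, http.client.HTTPException, OSError):
            # Connection refused, timeout, or a 4xx/5xx answer: not ready yet.
            pass
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

A CI step might call `wait_for_health("http://localhost:16395/health")` after `docker run -d -p 16395:16395 ...` and fail the job when it returns False.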
See Also
- Docker Reference - Production best practices, security hardening, troubleshooting
- Workflow Configuration - Agent settings
- WebServer Mode - Serve frontends
- LLM Backends - Backend configuration
- Management API - Live workflow updates without rebuilding
