
Embedding Resource

The embedding executor is a native capability compiled into the kdeps binary. It provides a SQLite-backed keyword store for indexing, searching, upserting, and deleting text documents. Use it as the storage layer for RAG pipelines that run fully on-prem.

Where it runs

The executor runs in both workflow mode and agent mode. In workflow mode it executes as a DAG step. In agent mode, the workflow containing this resource runs as a single callable tool.

Configuration

```yaml
# resources/store.yaml
embedding:
  operation: "index"                    # required: index | search | upsert | delete
  text: "document content here"         # required for index/search/upsert (optional for delete-all)
  collection: "default"                 # optional namespace (default: "default")
  dbPath: "/data/kdeps-store.db"       # optional path (default: "kdeps-embedding.db")
  limit: 10                             # optional max search results (default: 10)
```

| Field | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| operation | string | yes | | index, search, upsert, or delete |
| text | string | yes* | | Text to index, query, or delete. *Optional for delete (omit to delete entire collection) |
| collection | string | no | "default" | Namespace for documents |
| dbPath | string | no | "kdeps-embedding.db" | SQLite database file path |
| limit | integer | no | 10 | Max results for search |
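
The defaults and the "text is required except for delete" rule in the table can be sketched as a small validator. This is an illustration only: `EmbeddingConfig` and `parse_config` are hypothetical names, not kdeps types, and the page does not say when or how kdeps itself validates these fields.

```python
from dataclasses import dataclass

VALID_OPERATIONS = {"index", "search", "upsert", "delete"}


@dataclass
class EmbeddingConfig:
    """Hypothetical mirror of the `embedding:` block; not a kdeps type."""
    operation: str
    text: str = ""
    collection: str = "default"
    db_path: str = "kdeps-embedding.db"
    limit: int = 10

    def validate(self) -> None:
        if self.operation not in VALID_OPERATIONS:
            raise ValueError(f"unknown operation: {self.operation!r}")
        # Only delete may omit text (that form clears the whole collection).
        if self.operation != "delete" and not self.text:
            raise ValueError(f"text is required for {self.operation}")


def parse_config(raw: dict) -> EmbeddingConfig:
    """Build a config from a parsed YAML mapping, applying the documented defaults."""
    cfg = EmbeddingConfig(
        operation=raw.get("operation", ""),
        text=raw.get("text", ""),
        collection=raw.get("collection", "default"),
        db_path=raw.get("dbPath", "kdeps-embedding.db"),
        limit=raw.get("limit", 10),
    )
    cfg.validate()
    return cfg
```

For example, `parse_config({"operation": "delete"})` passes validation, while `parse_config({"operation": "index"})` raises `ValueError` because no text was supplied.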

Operations

| Operation | Description |
| --- | --- |
| index | Insert text (ignored if duplicate) |
| upsert | Insert or replace text |
| search | Case-insensitive keyword search via LIKE |
| delete | Delete by text, or whole collection if text is empty |
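
To make the four operations concrete, here is a minimal sketch of how they could map onto SQLite statements using Python's standard `sqlite3` module. The `docs` table, its `(collection, text)` primary key, and the function names are assumptions made for illustration; the page does not describe the actual schema or SQL that kdeps uses.

```python
import sqlite3


def open_store(db_path: str = ":memory:") -> sqlite3.Connection:
    """Open the store and create the (assumed) table if it is missing."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS docs ("
        " collection TEXT NOT NULL,"
        " text TEXT NOT NULL,"
        " PRIMARY KEY (collection, text))"
    )
    return conn


def index(conn, text, collection="default"):
    # INSERT OR IGNORE: a duplicate (collection, text) pair is silently skipped.
    conn.execute("INSERT OR IGNORE INTO docs VALUES (?, ?)", (collection, text))
    conn.commit()


def upsert(conn, text, collection="default"):
    # INSERT OR REPLACE: an existing row with the same key is replaced.
    conn.execute("INSERT OR REPLACE INTO docs VALUES (?, ?)", (collection, text))
    conn.commit()


def search(conn, query, collection="default", limit=10):
    # SQLite's LIKE is case-insensitive for ASCII characters by default.
    # Note that % and _ inside the query act as wildcards in this sketch.
    rows = conn.execute(
        "SELECT text FROM docs WHERE collection = ? AND text LIKE ? LIMIT ?",
        (collection, f"%{query}%", limit),
    ).fetchall()
    return [row[0] for row in rows]


def delete(conn, text="", collection="default"):
    # Empty text clears the whole collection; otherwise delete the exact text.
    if text:
        cur = conn.execute(
            "DELETE FROM docs WHERE collection = ? AND text = ?", (collection, text)
        )
    else:
        cur = conn.execute("DELETE FROM docs WHERE collection = ?", (collection,))
    conn.commit()
    return cur.rowcount  # number of rows removed
```

Because this sketch keys rows on the text itself, `index` and `upsert` end up with the same visible effect. How kdeps decides which row an upsert replaces is not stated here.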

Output

| Key | Type | Description |
| --- | --- | --- |
| operation | string | The operation that was performed |
| collection | string | The collection used |
| success | bool | true on success |
| results | []string | Matching texts (search only) |
| count | integer | Number of results (search only) |
| affected | integer | Rows deleted (delete only) |
| json | string | Full result as JSON string |
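
As a rough picture of the result shape, the sketch below assembles a mapping with the keys listed above. `build_result` is a hypothetical helper, and two details are assumptions: that keys marked "search only" or "delete only" are simply absent for other operations, and that `count` equals the length of `results`.

```python
import json


def build_result(operation, collection, results=None, affected=None):
    """Assemble a result mapping with the documented keys (hypothetical helper)."""
    out = {"operation": operation, "collection": collection, "success": True}
    if operation == "search":
        out["results"] = list(results or [])
        out["count"] = len(out["results"])
    if operation == "delete":
        out["affected"] = affected or 0
    # `json` carries the keys gathered so far, serialized as a string.
    out["json"] = json.dumps(out)
    return out
```

A downstream step that only receives the `json` string could recover the structured form with `json.loads(result["json"])`.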

Examples

Build a RAG pipeline

```yaml
# Step 1: Scrape content
actionId: fetch
scraper:
  url: "{{ get('url') }}"

# Step 2: Index it
actionId: storeDoc
requires: [fetch]
embedding:
  operation: "index"
  text: "{{ output('fetch').content }}"
  collection: "knowledge"
  dbPath: "/data/store.db"

# Step 3: Search on user query
actionId: findDocs
embedding:
  operation: "search"
  text: "{{ get('query') }}"
  collection: "knowledge"
  dbPath: "/data/store.db"
  limit: 5

# Step 4: Answer with context
actionId: answer
requires: [findDocs]
chat:
  model: llama3.2:1b
  prompt: |
    Context: {{ output('findDocs').results }}
    Question: {{ get('query') }}
apiResponse:
  response: "{{ output('answer') }}"
```
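
One practical consequence of LIKE matching is worth seeing before passing raw user questions to the search step. The sketch below assumes the query is wrapped in `%` wildcards and matched as a single substring. That is an assumption about the implementation, and the single-column table is purely illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE docs (text TEXT)")
conn.execute(
    "INSERT INTO docs VALUES (?)",
    ("The contract has a termination clause.",),
)


def like_search(query):
    """Return stored texts that contain the query as one substring."""
    rows = conn.execute("SELECT text FROM docs WHERE text LIKE ?", (f"%{query}%",))
    return [row[0] for row in rows]


# A short keyword phrase matches, regardless of letter case.
keyword_hits = like_search("Termination Clause")

# A full natural-language question does not appear verbatim in the document,
# so a plain substring match finds nothing.
question_hits = like_search("What is the termination clause?")
```

If the store behaves this way, the `query` passed to the search step works best as a short keyword or phrase rather than a complete question.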

Collections

Use the collection field to namespace documents, which is useful for multi-tenant or multi-topic stores:

```yaml
# Index into separate collections
embedding:
  operation: "index"
  text: "..."
  collection: "contracts"

# Search only within one collection
embedding:
  operation: "search"
  text: "termination clause"
  collection: "contracts"
```
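
The isolation that collections provide can be pictured as an extra filter column in the same SQLite file. The table layout, the `handbook` collection, and the sample texts below are assumptions for illustration, not the documented kdeps schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE docs (collection TEXT, text TEXT, PRIMARY KEY (collection, text))"
)
conn.executemany(
    "INSERT INTO docs VALUES (?, ?)",
    [
        ("contracts", "Either party may invoke the termination clause."),
        ("handbook", "See HR for the termination clause policy."),
    ],
)


def search_in(collection, query, limit=10):
    """Keyword search restricted to a single collection."""
    rows = conn.execute(
        "SELECT text FROM docs WHERE collection = ? AND text LIKE ? LIMIT ?",
        (collection, f"%{query}%", limit),
    )
    return [row[0] for row in rows]


# Both documents mention the phrase, but only the contracts one is returned.
contracts_hits = search_in("contracts", "termination clause")
```

Searching a collection that holds no documents, such as `search_in("default", "termination clause")` here, returns an empty list.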

Note: This uses keyword (LIKE) matching, not vector similarity. For OpenAI vector embeddings, install the component:

```bash
kdeps registry install embedding
```


Released under the Apache 2.0 License.