Scraper Resource

The Scraper resource extracts text content from 15 source types — web pages, documents, spreadsheets, images, and structured data files — without requiring external services for most formats. It can be used as a primary resource or as an inline resource inside before / after blocks.

Basic Usage

```yaml
apiVersion: kdeps.io/v1
kind: Resource

metadata:
  actionId: scrapeWebPage
  name: Scrape Web Page

run:
  scraper:
    type: url
    source: "https://example.com"
    timeoutDuration: 30s
```

Configuration Options

| Option | Type | Description |
| --- | --- | --- |
| `type` | string | Required. Content type to scrape (see Supported Types). |
| `source` | string | Required. URL or file path to scrape. Supports expressions. |
| `timeoutDuration` | string | Maximum time for URL fetching (e.g. `30s`, `1m`). Default: `30s`. |
| `timeout` | string | Alias for `timeoutDuration`. |
| `ocr` | object | OCR options (only for `type: image`). |
| `ocr.language` | string | Tesseract language code (e.g. `eng`, `deu`). Default: `eng`. |

Supported Types

| Type | Description | External Dependency |
| --- | --- | --- |
| `url` | Fetches a web page and extracts its visible text | None |
| `pdf` | Extracts text from a PDF file | `pdftotext` (poppler-utils) preferred; falls back to raw scan |
| `word` | Extracts text from a `.docx` file | None |
| `excel` | Extracts cell values from a `.xlsx` file | None |
| `image` | Runs OCR on an image file | `tesseract` CLI required |
| `text` | Reads a plain-text file as-is | None |
| `html` | Reads a local HTML file and extracts visible text | None |
| `csv` | Reads a CSV file and returns rows as tab-separated text | None |
| `markdown` | Reads a Markdown file and returns plain text (markup stripped) | None |
| `pptx` | Extracts text from a PowerPoint `.pptx` file | None |
| `json` | Reads a JSON file and returns its pretty-printed content | None |
| `xml` | Reads a local XML file and extracts all text nodes | None |
| `odt` | Extracts text from an OpenDocument Text `.odt` file | None |
| `ods` | Extracts text from an OpenDocument Spreadsheet `.ods` file | None |
| `odp` | Extracts text from an OpenDocument Presentation `.odp` file | None |

Examples by Type

URL

Fetches a web page and strips HTML tags, scripts, and styles, returning plain visible text.

```yaml
run:
  scraper:
    type: url
    source: "https://example.com/page"
    timeoutDuration: 15s
```

PDF

Extracts text from a PDF file. Uses pdftotext (from poppler-utils) when available; otherwise falls back to a raw ASCII scan of the PDF binary.

```yaml
run:
  scraper:
    type: pdf
    source: /data/report.pdf
```

Word Document

Extracts text from a Word .docx file by parsing its internal XML.

```yaml
run:
  scraper:
    type: word
    source: /data/contract.docx
```

Excel Spreadsheet

Extracts cell values from an Excel .xlsx file. Each row is returned as a tab-separated line, with rows separated by newlines (tabs and newlines are preserved in the output).

```yaml
run:
  scraper:
    type: excel
    source: /data/budget.xlsx
```

Image OCR

Runs Tesseract OCR on an image to extract text. Requires the tesseract CLI to be installed.

```yaml
run:
  scraper:
    type: image
    source: /data/scanned-invoice.png
    ocr:
      language: eng     # Tesseract language code; default: eng
```

Supported image formats: PNG, JPEG, TIFF, BMP, and any other format that Tesseract accepts.
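
For non-English scans, set `ocr.language` to the matching Tesseract language code. A minimal sketch, assuming a German-language scan at a hypothetical path (the corresponding Tesseract language data, e.g. `tesseract-ocr-deu`, must be installed alongside the CLI):

```yaml
run:
  scraper:
    type: image
    source: /data/rechnung-scan.png   # hypothetical path to a German invoice scan
    ocr:
      language: deu                   # German; requires the deu traineddata pack
```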

Plain Text

Reads a plain-text file and returns its content as-is.

```yaml
run:
  scraper:
    type: text
    source: /data/notes.txt
```

HTML File

Reads a local HTML file and returns its visible text (scripts, styles, and tags removed).

```yaml
run:
  scraper:
    type: html
    source: /data/page.html
```

CSV

Reads a CSV file and returns each row as a tab-separated line.

```yaml
run:
  scraper:
    type: csv
    source: /data/records.csv
```

Markdown

Reads a Markdown file and returns plain text with most markup (headers, bold, links) stripped.

```yaml
run:
  scraper:
    type: markdown
    source: /data/README.md
```

PowerPoint

Extracts text from the slides of a .pptx file.

```yaml
run:
  scraper:
    type: pptx
    source: /data/presentation.pptx
```

JSON

Reads a JSON file, validates it, and returns its pretty-printed content.

```yaml
run:
  scraper:
    type: json
    source: /data/config.json
```

XML

Reads a local XML file and concatenates all text node content.

```yaml
run:
  scraper:
    type: xml
    source: /data/feed.xml
```

OpenDocument Text (ODT)

Extracts text from a LibreOffice/OpenOffice Writer .odt file.

```yaml
run:
  scraper:
    type: odt
    source: /data/document.odt
```

OpenDocument Spreadsheet (ODS)

Extracts cell text from a LibreOffice/OpenOffice Calc .ods file.

```yaml
run:
  scraper:
    type: ods
    source: /data/spreadsheet.ods
```

OpenDocument Presentation (ODP)

Extracts slide text from a LibreOffice/OpenOffice Impress .odp file.

```yaml
run:
  scraper:
    type: odp
    source: /data/slides.odp
```

Accessing the Result

The scraper stores its result under the resource's actionId. Use get() in downstream resources to access the extracted content.

```yaml
# Scrape the page
metadata:
  actionId: fetchPage
run:
  scraper:
    type: url
    source: "https://example.com"

---

# Use the content in an LLM prompt
metadata:
  actionId: summarize
  requires:
    - fetchPage
run:
  chat:
    model: llama3.2:1b
    prompt: "Summarize this page: {{ get('fetchPage') }}"
```

The result map returned by the scraper contains:

| Key | Type | Description |
| --- | --- | --- |
| `content` | string | The extracted text. |
| `source` | string | The evaluated source URL or path. |
| `type` | string | The scraper type used. |
| `success` | bool | `true` if extraction succeeded. |

Access individual fields with get('actionId', 'content') or the full map with get('actionId').

```yaml
run:
  expr:
    - set('pageText', get('fetchPage', 'content'))
    - set('didSucceed', get('fetchPage', 'success'))
```
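
Individual fields can also be interpolated straight into a downstream prompt. A minimal sketch, assuming the `fetchPage` resource from above:

```yaml
run:
  chat:
    model: llama3.2:1b
    prompt: "From {{ get('fetchPage', 'source') }}: {{ get('fetchPage', 'content') }}"
```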

Dynamic Sources with Expressions

The source field supports expressions, so you can build file paths or URLs at runtime.

```yaml
run:
  scraper:
    type: url
    source: "{{ get('baseUrl') }}/page/{{ get('pageId') }}"
```
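
The same applies to file paths. A sketch that assumes an upstream resource has set a `reportName` value (a hypothetical name used here for illustration):

```yaml
run:
  scraper:
    type: pdf
    source: "/data/{{ get('reportName') }}.pdf"
```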

Using Scraper as an Inline Resource

The scraper can be embedded inside before / after blocks of any resource:

```yaml
run:
  before:
    - scraper:
        type: text
        source: /data/prompt_prefix.txt
  chat:
    model: llama3.2:1b
    prompt: "Context loaded. Answer the query."
```

External Dependencies

| Type | Requirement |
| --- | --- |
| `image` | `tesseract` CLI (install: `apt install tesseract-ocr` or `brew install tesseract`) |
| `pdf` | `pdftotext` from poppler-utils (optional, but improves quality; install: `apt install poppler-utils` or `brew install poppler`) |

All other types use Go standard library only and have no external dependencies.


Error Handling

When scraping fails, the error is propagated to the engine: the run stops early and no output is stored unless `onError.action: continue` is configured. Use `onError` to control this behavior:

```yaml
run:
  scraper:
    type: url
    source: "https://example.com"
  onError:
    action: continue     # continue, fail (default), or retry
    fallback: ""         # Value to use when action is "continue"
```
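
With `action: continue`, the `fallback` value stands in for the missing result, so downstream resources can still proceed. A sketch using a placeholder string (the URL and wording are hypothetical):

```yaml
run:
  scraper:
    type: url
    source: "https://example.com/flaky-endpoint"
  onError:
    action: continue
    fallback: "No page content was available."
```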
