Document Processing

This page describes all commands in the Nestbox AI CLI for managing document processing pipelines. The doc-proc command group lets you upload documents, run processing jobs, manage profiles, run evaluations, execute batch queries, and configure webhooks.

Global options (available on all doc-proc subcommands):

  • --project <projectId> – Project ID or name (defaults to current project)
  • --instance <instanceId> – Document processing instance ID
  • --json – Output raw JSON instead of formatted tables
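If you pass the same global options on every call, a small shell wrapper can append them for you. The sketch below is illustrative only: the function name docproc and the environment variables NESTBOX_PROJECT and NESTBOX_INSTANCE are hypothetical names chosen for this example, not something the CLI is known to read by itself.

```shell
# Hypothetical wrapper around `nestbox doc-proc`.
# NESTBOX_PROJECT and NESTBOX_INSTANCE are illustrative variable names
# chosen for this sketch; the CLI itself is not known to read them.
docproc() {
  if [ -z "${NESTBOX_PROJECT:-}" ] || [ -z "${NESTBOX_INSTANCE:-}" ]; then
    echo "NESTBOX_PROJECT and NESTBOX_INSTANCE must be set" >&2
    return 1
  fi
  # Forward the subcommand and its flags, then append the global options.
  nestbox doc-proc "$@" \
    --project "$NESTBOX_PROJECT" \
    --instance "$NESTBOX_INSTANCE" \
    --json
}
```

With both variables exported, docproc profile list --limit 5 would then run nestbox doc-proc profile list --limit 5 with --project, --instance, and --json appended.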

Profile Management

A profile is a YAML configuration file that controls how documents are processed (OCR settings, chunking strategy, GraphRAG indexing, etc.).

Initialize a Profile Template

Scaffold a profile YAML template to the local filesystem:

nestbox doc-proc profile init -o ./my-profile.yaml

Options:

  • -o, --output <path> – Output file path (default: ./profile.yaml)
  • -f, --force – Overwrite existing file

Create a Profile

Register a profile from a YAML file with the processing instance:

nestbox doc-proc profile create --file ./my-profile.yaml

With a custom name and tags:

nestbox doc-proc profile create --file ./my-profile.yaml --name "OCR + GraphRAG" --tags "finance,contracts,2024"

Options:

  • -f, --file <path> – Path to profile YAML file (required)
  • -n, --name <name> – Override the profile name from the file
  • --tags <tags> – Comma-separated list of tags to associate with the profile
  • --project <projectId> – Project ID or name
  • --instance <instanceId> – Processing instance ID
  • --json – Output raw JSON

List Profiles

List all profiles registered with the instance. Displays a table with Profile ID, Name, Tags, and Created At:

nestbox doc-proc profile list

Filter by tags:

nestbox doc-proc profile list --tags "finance,2024"

Options:

  • --page <page> – Page number (default: 1)
  • --limit <limit> – Page size (default: 20)
  • --tags <tags> – Filter by comma-separated tags; only profiles matching any of the given tags are returned

Show Profile Details

Show full details of a profile by ID:

nestbox doc-proc profile show --profile <profileId>

Options:

  • --profile <profileId> – Profile ID (required)

Validate a Profile

Validate a profile YAML file against the schema without registering it:

nestbox doc-proc profile validate --file <path>

Options:

  • -f, --file <path> – Path to profile YAML file (required)
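Validation and registration combine naturally into a single step that refuses to register a file that fails validation. The following is a minimal sketch, assuming that profile validate exits with a non-zero status when the file is invalid (this page does not state the exit codes). The function name register_profile is hypothetical.

```shell
# register_profile FILE [NAME]
# Validates FILE and registers it only if validation succeeds.
# Assumption: `profile validate` exits non-zero for an invalid file.
register_profile() {
  local file name
  file=$1
  name=${2:-}
  [ -f "$file" ] || { echo "no such file: $file" >&2; return 1; }

  nestbox doc-proc profile validate --file "$file" || {
    echo "validation failed, not registering $file" >&2
    return 1
  }

  # Only pass --name when the caller supplied one.
  if [ -n "$name" ]; then
    nestbox doc-proc profile create --file "$file" --name "$name"
  else
    nestbox doc-proc profile create --file "$file"
  fi
}
```

For example, register_profile ./my-profile.yaml "OCR + GraphRAG" validates the file and then creates the profile under that name.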

Profile Schema

Print the full profile JSON Schema for reference:

nestbox doc-proc profile schema

Document Management

Create a Document

Upload a file and create a document processing job:

nestbox doc-proc document create --input ./contract.pdf --profile prof-abc123

With tags:

nestbox doc-proc document create --input ./invoice.pdf --profile prof-abc123 --tags "invoice,2024,acme-corp"

Options:

  • --input <path> – Document file path (required)
  • --profile <profileId> – Processing profile ID
  • --stages <stages> – Comma-separated stage override (e.g. ocr,chunking)
  • --priority <priority> – Job priority: low, normal, or high
  • --tags <tags> – Comma-separated list of tags to associate with the document
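To submit many files with the same profile and tags, you can loop over a directory and call document create once per file. This is a sketch only: the function name upload_dir and the restriction to *.pdf files are choices made for the example, and it assumes that document create exits non-zero when an upload fails.

```shell
# upload_dir DIR PROFILE_ID [TAGS]
# Creates one processing job per PDF found directly inside DIR.
# Assumption: `document create` exits non-zero when the upload fails.
upload_dir() {
  local dir profile tags f failed
  dir=$1
  profile=$2
  tags=${3:-}
  failed=0

  for f in "$dir"/*.pdf; do
    [ -e "$f" ] || continue   # the glob matched nothing
    if [ -n "$tags" ]; then
      nestbox doc-proc document create --input "$f" --profile "$profile" --tags "$tags" \
        || failed=$((failed + 1))
    else
      nestbox doc-proc document create --input "$f" --profile "$profile" \
        || failed=$((failed + 1))
    fi
  done

  # Succeed only if every upload succeeded.
  [ "$failed" -eq 0 ]
}
```

For example, upload_dir ./invoices prof-abc123 "invoice,2024" submits every PDF in ./invoices and returns a non-zero status if any upload failed.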

List Documents

List all processed documents. Displays a table with Document ID, Name, Tags, Profile ID, and Processed At:

nestbox doc-proc document list

Filter by profile and tags:

nestbox doc-proc document list --profile prof-abc123 --tags "invoice,2024"

Options:

  • --page <page> – Page number (default: 1)
  • --limit <limit> – Page size (default: 20)
  • --profile <profileId> – Filter by profile ID
  • --tags <tags> – Filter by comma-separated tags; only documents matching any of the given tags are returned

Show Document Details

Show details of a specific processed document:

nestbox doc-proc document show --document <documentId>

Options:

  • --document <documentId> – Document ID (required)

Download Document Artifacts

Download all artifacts for a document as a zip file (GraphRAG output, chunks, etc.):

nestbox doc-proc document artifacts --document doc-abc123 -o ./artifacts.zip

Options:

  • --document <documentId> – Document ID (required)
  • -o, --output <path> – Output zip path (default: ./document-artifacts.zip)

Job Monitoring

List Jobs

List document processing jobs:

nestbox doc-proc job list

Options:

  • --state <state> – Filter by job state (e.g. pending, running, completed, failed)
  • --page <page> – Page number (default: 1)
  • --limit <limit> – Page size (default: 20)

Job Status

Get the status of a specific job:

nestbox doc-proc job status --job job-xyz789

With full details:

nestbox doc-proc job status --job job-xyz789 --full

Options:

  • --job <jobId> – Job ID (required)
  • --full – Fetch full job details instead of lightweight status
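Scripts often need to block until a job finishes. The sketch below polls the lightweight status until the job reaches a terminal state. It rests on an assumption this page does not confirm: that the --json output contains the state name as a quoted string such as "completed" or "failed". Adjust the patterns to the actual payload. The function name wait_for_job and its return codes are hypothetical.

```shell
# wait_for_job JOB_ID [INTERVAL_SECONDS] [MAX_ATTEMPTS]
# Returns 0 on completed, 1 on failed, 2 if the CLI call itself fails,
# and 3 if the job is still not finished after MAX_ATTEMPTS polls.
# Assumption: the JSON output contains the state as a quoted string.
wait_for_job() {
  local job interval max attempt out
  job=$1
  interval=${2:-5}
  max=${3:-60}
  attempt=0

  while [ "$attempt" -lt "$max" ]; do
    out=$(nestbox doc-proc job status --job "$job" --json) || return 2
    case $out in
      *'"completed"'*) return 0 ;;
      *'"failed"'*)    return 1 ;;
    esac
    attempt=$((attempt + 1))
    sleep "$interval"
  done

  echo "timed out waiting for $job" >&2
  return 3
}
```

For example, wait_for_job job-xyz789 10 30 polls every 10 seconds and gives up after 30 attempts.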

Evaluations

Evaluations run a set of Q&A test cases against a processed document to measure extraction quality.

Initialize an Eval Template

Scaffold an eval YAML template:

nestbox doc-proc eval init -o ./eval.yaml

Options:

  • -o, --output <path> – Output file path (default: ./eval.yaml)
  • -f, --force – Overwrite existing file

Example eval.yaml:

testCases:
  - id: q1
    question: "What are the payment terms?"
    expectedAnswer: "Net 30"

Run an Evaluation

Run an evaluation against a document:

nestbox doc-proc eval run --document doc-abc123 --file ./eval.yaml

Options:

  • --document <documentId> – Document ID (required)
  • -f, --file <path> – Path to eval YAML file (required)

Validate an Eval File

Validate an eval YAML file against the schema without running it:

nestbox doc-proc eval validate --document <documentId> --file <path>
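Validating before running avoids starting an evaluation with a malformed file. A minimal sketch, assuming eval validate exits non-zero on a schema error (not stated on this page). The function name run_eval is hypothetical.

```shell
# run_eval DOCUMENT_ID FILE
# Validates the eval file, then runs it against the document.
# Assumption: `eval validate` exits non-zero when the file is invalid.
run_eval() {
  local doc file
  doc=$1
  file=$2

  nestbox doc-proc eval validate --document "$doc" --file "$file" || {
    echo "eval file $file failed validation" >&2
    return 1
  }
  nestbox doc-proc eval run --document "$doc" --file "$file"
}
```

For example, run_eval doc-abc123 ./eval.yaml runs the evaluation only if the file passes validation.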

List Evaluations

List all evaluations for a document:

nestbox doc-proc eval list --document <documentId>

Options:

  • --document <documentId> – Document ID (required)
  • --page <page> – Page number (default: 1)
  • --limit <limit> – Page size (default: 20)

Show Evaluation Details

Get full details of a specific evaluation:

nestbox doc-proc eval show --document <documentId> --eval <evalId>

Options:

  • --document <documentId> – Document ID (required)
  • --eval <evalId> – Evaluation ID (required)

Batch Queries

Batch queries let you run multiple questions against a processed document in one request.

Initialize a Query Template

Scaffold a batch query YAML template:

nestbox doc-proc query init -o ./query.yaml

Options:

  • -o, --output <path> – Output file path (default: ./query.yaml)
  • -f, --force – Overwrite existing file

Example query.yaml:

queries:
  - id: payment_terms
    question: "What are the payment terms?"
    mode: local

Submit a Batch Query

Submit a batch query from a YAML file:

nestbox doc-proc query create --file <path>

Options:

  • -f, --file <path> – YAML file path (required)

Validate a Query File

Validate a query YAML file without submitting it:

nestbox doc-proc query validate --file <path>
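When the questions come from another script, you can generate the query file rather than edit it by hand. The sketch below writes only the fields shown in the example query.yaml above (id, question, mode); the template produced by query init may contain further fields that this page does not list, so treat the output as a starting point. The function names write_query_file and submit_queries, and the generated ids q1, q2, and so on, are hypothetical.

```shell
# write_query_file OUT QUESTION...
# Writes one entry per question with ids q1, q2, ... and mode "local".
# Backslashes and double quotes are escaped so that each question stays
# a valid double-quoted YAML string.
write_query_file() {
  local out i q
  out=$1
  shift
  i=0
  echo "queries:" > "$out"
  for q in "$@"; do
    i=$((i + 1))
    q=$(printf '%s' "$q" | sed 's/\\/\\\\/g; s/"/\\"/g')
    printf '  - id: q%d\n    question: "%s"\n    mode: local\n' "$i" "$q" >> "$out"
  done
}

# submit_queries FILE
# Submits FILE only if it validates.
# Assumption: `query validate` exits non-zero when the file is invalid.
submit_queries() {
  nestbox doc-proc query validate --file "$1" \
    && nestbox doc-proc query create --file "$1"
}
```

For example, write_query_file ./query.yaml "What are the payment terms?" "Who are the parties?" followed by submit_queries ./query.yaml generates, validates, and submits a two-question batch.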

List Batch Queries

List all batch queries:

nestbox doc-proc query list

Options:

  • --page <page> – Page number (default: 1)
  • --limit <limit> – Page size (default: 20)

Show Query Details

Get details of a specific batch query:

nestbox doc-proc query show --query <queryId>

Options:

  • --query <queryId> – Query ID (required)

Webhooks

Create a Webhook

Register a webhook to receive processing event notifications:

nestbox doc-proc webhook create --url https://my-app.com/hooks/nestbox --event job.completed job.failed

Options:

  • --url <url> – Webhook URL (required)
  • --secret <secret> – HMAC signing secret for payload verification
  • --event <event...> – One or more event names to subscribe to
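On the receiving side, the signing secret lets you check that a payload really came from the processing instance. This page does not specify the hash algorithm, the encoding, or the header that carries the signature, so the sketch below is an illustration under stated assumptions: HMAC-SHA256 with a hex-encoded digest, computed over the raw request body with the openssl command-line tool. The function names compute_signature and verify_signature are hypothetical.

```shell
# compute_signature SECRET < payload
# Prints the hex HMAC of the payload read from standard input.
# Assumptions (not confirmed by this page): HMAC-SHA256, hex encoding.
compute_signature() {
  # openssl prints a label followed by the digest; keep the last field.
  openssl dgst -sha256 -hmac "$1" | awk '{print $NF}'
}

# verify_signature SECRET EXPECTED_HEX < payload
# Succeeds when the computed digest equals EXPECTED_HEX.
verify_signature() {
  local actual
  actual=$(compute_signature "$1")
  [ "$actual" = "$2" ]
}
```

The string comparison here is not constant-time and the secret is visible in the process list, so use this for local debugging of received payloads; a production receiver would normally verify signatures inside the application with a constant-time comparison.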

List Webhooks

List all registered webhooks:

nestbox doc-proc webhook list

Show Webhook Details

Get details of a specific webhook:

nestbox doc-proc webhook show --webhook <webhookId>

Update a Webhook

Update a webhook's configuration:

nestbox doc-proc webhook update --webhook <webhookId> --url <url>

Options:

  • --webhook <webhookId> – Webhook ID (required)
  • --url <url> – New webhook URL
  • --secret <secret> – New signing secret
  • --event <event...> – New event subscriptions
  • --active <true|false> – Enable or disable the webhook

Delete a Webhook

Delete a webhook:

nestbox doc-proc webhook delete --webhook <webhookId>

Options:

  • --webhook <webhookId> – Webhook ID (required)

Health Check

Check the health of the document processing API:

nestbox doc-proc health
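In automation, it can help to confirm the API is reachable before submitting work. The sketch below retries the health check a few times, assuming that nestbox doc-proc health exits non-zero when the API is unhealthy or unreachable (this page does not state the exit codes). The function name require_healthy is hypothetical.

```shell
# require_healthy [RETRIES] [DELAY_SECONDS]
# Succeeds as soon as one health check passes; fails after RETRIES attempts.
# Assumption: `nestbox doc-proc health` exits non-zero when unhealthy.
require_healthy() {
  local retries delay i
  retries=${1:-3}
  delay=${2:-2}
  i=1

  while [ "$i" -le "$retries" ]; do
    if nestbox doc-proc health >/dev/null 2>&1; then
      return 0
    fi
    i=$((i + 1))
    # Wait between attempts, but not after the last one.
    if [ "$i" -le "$retries" ]; then
      sleep "$delay"
    fi
  done

  echo "document processing API not healthy after $retries attempts" >&2
  return 1
}
```

For example, require_healthy 5 3 makes up to five checks, three seconds apart, and returns a non-zero status if none of them succeeds.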