📄DocParse Docs

🖥 Document Classification — API Endpoints

Use the Classification API when you don't yet know what type of document each file is. You define a set of categories, and we sort each uploaded file into one of them (or none) and report a confidence score.

All endpoints share the following base URL:

plaintext
https://api.docparse-labs.vercel.app/v1

POST Create a Classification

Creates a classification and defines its set of categories. You can optionally link each category to an existing extraction template; files classified into that category are then routed to that extraction automatically.

plaintext
POST /classifications

Body

json
{
  "name": "Inbox triage",
  "description": "Sort everything in our document inbox",
  "categories": [
    {
      "name": "Invoice",
      "description": "Vendor bills and invoices",
      "keywords": ["invoice", "bill to", "total due"],
      "linked_extraction_id": "ext_01HQX..."
    },
    {
      "name": "Contract",
      "description": "Service agreements, NDAs",
      "keywords": ["agreement", "party", "whereas"]
    },
    {
      "name": "Receipt",
      "description": "Point-of-sale receipts"
    }
  ]
}
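As a rough illustration, the request above could be assembled in Python with only the standard library. This is a sketch, not an official client: the function name is hypothetical, and the `Authorization` header is an assumption, because this page does not describe authentication. The function only builds the request; sending it is left to the caller.

```python
import json
import urllib.request

BASE_URL = "https://api.docparse-labs.vercel.app/v1"


def build_create_classification_request(name, description, categories, api_key):
    """Build (but do not send) a POST /classifications request."""
    payload = {"name": name, "description": description, "categories": categories}
    return urllib.request.Request(
        url=f"{BASE_URL}/classifications",
        data=json.dumps(payload).encode("utf-8"),
        method="POST",
        headers={
            "Content-Type": "application/json",
            # Assumption: bearer-token auth. Authentication is not documented here.
            "Authorization": f"Bearer {api_key}",
        },
    )


request = build_create_classification_request(
    name="Inbox triage",
    description="Sort everything in our document inbox",
    categories=[
        {
            "name": "Contract",
            "description": "Service agreements, NDAs",
            "keywords": ["agreement", "party", "whereas"],
        },
        # "keywords" is omitted here, as in the Receipt category above.
        {"name": "Receipt", "description": "Point-of-sale receipts"},
    ],
    api_key="YOUR_API_KEY",
)
# To send it: urllib.request.urlopen(request)
```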

Response — 201 Created

json
{
  "classification": {
    "id": "cls_01HQX...",
    "name": "Inbox triage",
    "categories": [
      { "id": "cat_01...", "name": "Invoice" },
      { "id": "cat_02...", "name": "Contract" },
      { "id": "cat_03...", "name": "Receipt" }
    ]
  }
}
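Later calls, such as overriding a file's category, refer to categories by ID, so it can help to keep a name-to-ID lookup from this response. A minimal sketch, assuming the response has already been decoded into a Python dict (the helper name is hypothetical):

```python
def category_ids_by_name(response):
    """Map each category name to its ID from a create-classification response."""
    categories = response["classification"]["categories"]
    return {category["name"]: category["id"] for category in categories}


# Sample response copied from the example above (IDs are elided in the docs).
sample_response = {
    "classification": {
        "id": "cls_01HQX...",
        "name": "Inbox triage",
        "categories": [
            {"id": "cat_01...", "name": "Invoice"},
            {"id": "cat_02...", "name": "Contract"},
            {"id": "cat_03...", "name": "Receipt"},
        ],
    }
}

ids = category_ids_by_name(sample_response)
```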

POST Add Files to a Classification Batch

plaintext
POST /classifications/{classification_id}/batches
Content-Type: multipart/form-data

The request uses the same multipart/form-data shape as the Data Extraction API. A batch accepts up to 30 files of at most 25 MB each.
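Checking these limits on the client before uploading avoids a wasted round trip. The sketch below is one way to do that; the helper name is hypothetical, and it reads "25 MB" as 25,000,000 bytes, which is an assumption (the stricter of the two common interpretations).

```python
MAX_FILES_PER_BATCH = 30
# Assumption: "25 MB" means 25,000,000 bytes. The docs do not say whether MB
# is decimal or binary, so this uses the stricter reading.
MAX_FILE_BYTES = 25_000_000


def check_batch(sizes_by_name):
    """Return a list of problems; an empty list means the batch fits the limits.

    sizes_by_name maps each file name to its size in bytes.
    """
    problems = []
    if len(sizes_by_name) > MAX_FILES_PER_BATCH:
        problems.append(
            f"{len(sizes_by_name)} files exceeds the limit of {MAX_FILES_PER_BATCH}"
        )
    for name, size in sizes_by_name.items():
        if size > MAX_FILE_BYTES:
            problems.append(f"{name} is {size} bytes, over the per-file limit")
    return problems


problems = check_batch({"ACME-bill.pdf": 120_000, "scan.pdf": 26_000_000})
```

File sizes can be gathered with `os.path.getsize` before calling the helper.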

Response — 202 Accepted

json
{
  "batch": {
    "id": "cbtc_01HQX...",
    "classification_id": "cls_01HQX...",
    "status": "queued",
    "file_count": 5
  }
}

GET Get Batch Status

plaintext
GET /classifications/{classification_id}/batches/{batch_id}

Response — 200 OK

json
{
  "batch": { "status": "processed", "file_count": 5 },
  "files": [
    {
      "id": "file_01HQX...",
      "file_name": "ACME-bill.pdf",
      "status": "processed",
      "classified_category_id": "cat_01...",
      "classified_category_name": "Invoice",
      "confidence": 0.96,
      "classification_reasoning": "Document contains 'invoice' header, line items, and a billing address."
    }
  ]
}
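Because a batch starts out as `queued`, a client typically polls this endpoint until the status becomes `processed` and then inspects the per-file results. The sketch below shows one possible shape. `fetch_status` is a caller-supplied, hypothetical function that performs the GET and returns the decoded JSON. The 0.8 review threshold is arbitrary, and treating a missing category ID as "classified into none" is an assumption. Only the two statuses shown on this page are handled; any failure status the API may define is not.

```python
import time


def wait_for_batch(fetch_status, interval_seconds=2.0, max_attempts=30):
    """Poll until the batch status is "processed", then return the response."""
    for _ in range(max_attempts):
        response = fetch_status()
        if response["batch"]["status"] == "processed":
            return response
        time.sleep(interval_seconds)
    raise TimeoutError("batch was not processed within the polling budget")


def files_needing_review(response, min_confidence=0.8):
    """Return files with no category or with confidence below the threshold."""
    flagged = []
    for file_info in response.get("files", []):
        # Assumption: an unclassified file has a null or absent category ID.
        unclassified = file_info.get("classified_category_id") is None
        confidence = file_info.get("confidence") or 0.0
        if unclassified or confidence < min_confidence:
            flagged.append(file_info)
    return flagged
```

Files returned by `files_needing_review` are natural candidates for a manual check and, where needed, an override.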

PATCH Override Classification

When the model gets it wrong, override the category manually. The override sticks: we don't reclassify the file.

plaintext
PATCH /classifications/{classification_id}/files/{file_id}

Body

json
{ "classified_category_id": "cat_02..." }
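For illustration, the override request could be built like this in Python. The function name and placeholder IDs are hypothetical, and the `Authorization` header is an assumption, since this page does not cover authentication. The request is only constructed, not sent.

```python
import json
import urllib.request

BASE_URL = "https://api.docparse-labs.vercel.app/v1"


def build_override_request(classification_id, file_id, category_id, api_key):
    """Build (but do not send) a PATCH request that overrides a file's category."""
    body = json.dumps({"classified_category_id": category_id}).encode("utf-8")
    return urllib.request.Request(
        url=f"{BASE_URL}/classifications/{classification_id}/files/{file_id}",
        data=body,
        method="PATCH",
        headers={
            "Content-Type": "application/json",
            # Assumption: bearer-token auth. Authentication is not documented here.
            "Authorization": f"Bearer {api_key}",
        },
    )


override = build_override_request(
    classification_id="CLASSIFICATION_ID",  # placeholder
    file_id="FILE_ID",                      # placeholder
    category_id="CATEGORY_ID",              # placeholder
    api_key="YOUR_API_KEY",
)
# To send it: urllib.request.urlopen(override)
```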

POST Re-classify a File

Forces a fresh classification run on a single file (useful if you've updated the category definitions).

plaintext
POST /classifications/{classification_id}/files/{file_id}/redo

Chaining classification → extraction

When a category has a linked_extraction_id, every file classified into that category is automatically queued into the linked extraction. You'll receive two webhook events per file:

  1. file.classified — classifier verdict.
  2. file.processed — extraction result, after the linked extraction completes.
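On the receiving side, a webhook handler may want to know when both events have arrived for a given file. The sketch below tracks that per file without relying on arrival order. The event names come from the list above, but the payload fields `type` and `file_id` are assumptions, because the webhook schema is not shown on this page.

```python
from collections import defaultdict

EXPECTED_EVENTS = ("file.classified", "file.processed")


class ChainTracker:
    """Track which chained webhook events have arrived for each file."""

    def __init__(self):
        self._seen = defaultdict(set)

    def record(self, event):
        """Record one event; return True once both events are in for its file."""
        # Assumption: the event name is under "type" and the file under "file_id".
        event_type = event["type"]
        if event_type not in EXPECTED_EVENTS:
            return False  # ignore unrelated events
        file_id = event["file_id"]
        self._seen[file_id].add(event_type)
        return self.is_complete(file_id)

    def is_complete(self, file_id):
        """True when both the classified and processed events have been seen."""
        return self._seen.get(file_id, set()) == set(EXPECTED_EVENTS)


tracker = ChainTracker()
tracker.record({"type": "file.classified", "file_id": "FILE_ID"})
done = tracker.record({"type": "file.processed", "file_id": "FILE_ID"})
```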

See Classification Details for the chaining mental model.