📄DocParse Docs

Extraction Details

Deeper notes on how extractions are configured and what fields you get back.

Field types

| Type | Returned as JSON | Notes |
| --- | --- | --- |
| string | string | Default for text-like fields. |
| number | number | Decimals included. |
| integer | number | Rounded to the nearest integer. |
| boolean | boolean | true / false only. |
| date | ISO 8601 string | E.g. "2026-05-24". |
| datetime | ISO 8601 string | E.g. "2026-05-24T10:00:00Z". |
| object | nested object | Define fields recursively. |
| list<string> | string[] | Useful for tags, tracking numbers. |
| list<object> | object[] | For repeating sections: line items, party blocks. |
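
As a rough illustration of the mapping above, the following Python sketch checks whether a value parsed from the JSON response is consistent with its declared field type. The helper name `matches_type` and the `CHECKS` table are assumptions made for this example and are not part of DocParse; the integer check also assumes whole-number values arrive with no fractional part.

```python
from datetime import date, datetime


def _is_number(value):
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def _is_iso_date(value):
    # Accept only the plain calendar form shown in the table, e.g. "2026-05-24".
    if not isinstance(value, str) or len(value) != 10:
        return False
    try:
        date.fromisoformat(value)
        return True
    except ValueError:
        return False


def _is_iso_datetime(value):
    if not isinstance(value, str) or "T" not in value:
        return False
    # Older Python versions reject a trailing "Z", so normalize it first.
    if value.endswith("Z"):
        value = value[:-1] + "+00:00"
    try:
        datetime.fromisoformat(value)
        return True
    except ValueError:
        return False


# Hypothetical lookup table: declared field type -> check on the parsed value.
CHECKS = {
    "string": lambda v: isinstance(v, str),
    "number": _is_number,
    "integer": lambda v: _is_number(v) and float(v).is_integer(),
    "boolean": lambda v: isinstance(v, bool),
    "date": _is_iso_date,
    "datetime": _is_iso_datetime,
    "object": lambda v: isinstance(v, dict),
    "list<string>": lambda v: isinstance(v, list) and all(isinstance(x, str) for x in v),
    "list<object>": lambda v: isinstance(v, list) and all(isinstance(x, dict) for x in v),
}


def matches_type(field_type, value):
    """Return True if value has the JSON shape documented for field_type."""
    return CHECKS[field_type](value)
```

A check like this can run on your side before results are written to a database, so that a malformed date or a stray string in a numeric column is caught early.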

Field examples

Adding a one-line example to each field dramatically improves accuracy. The example is shown to the model as a hint, not enforced.

```json
{
  "key": "invoice_number",
  "type": "string",
  "description": "The unique invoice ID from the seller.",
  "example": "INV-2026-00128"
}
```
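
Because examples are optional, it is easy to leave one out. The Python sketch below builds field definitions with the four keys shown above and lists the fields that still lack an example. The helper names `make_field` and `fields_without_example` are assumptions for illustration, not part of DocParse, and the second field is invented sample data.

```python
# Field types as listed in the "Field types" table.
FIELD_TYPES = {
    "string", "number", "integer", "boolean", "date", "datetime",
    "object", "list<string>", "list<object>",
}


def make_field(key, field_type, description, example=None):
    """Build one field definition, rejecting undocumented types."""
    if field_type not in FIELD_TYPES:
        raise ValueError(f"unknown field type: {field_type!r}")
    field = {"key": key, "type": field_type, "description": description}
    if example is not None:
        field["example"] = example
    return field


def fields_without_example(fields):
    """Return the keys of fields that have no example hint yet."""
    return [f["key"] for f in fields if "example" not in f]


fields = [
    make_field("invoice_number", "string",
               "The unique invoice ID from the seller.", "INV-2026-00128"),
    # Hypothetical second field with the example deliberately omitted.
    make_field("total_amount", "number", "Grand total including tax."),
]
missing = fields_without_example(fields)
```

Here `missing` contains only `total_amount`, which is the field to revisit before running the extraction.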

Language

Set language at the extraction level. Options:

  • "Multi-Lingual" (default) — auto-detect per document.
  • "English", "French", "German", "Spanish", "Hindi", "Japanese", "Chinese (Simplified)", and more.

Forcing a language gives a small accuracy bump on borderline-quality scans.

Document Options

Per-extraction toggles that influence parsing behavior:

| Option | Effect |
| --- | --- |
| ocr_priority | When true, treat the document as scanned (skip text-layer extraction). Useful for low-quality PDFs. |
| infer_missing | Allow the model to infer fields that aren't explicitly written (e.g. total = subtotal + tax). |
| strict_format | Reject documents that don't match the expected layout. Returns status: needs_review instead. |
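
When infer_missing is on, a derived value such as the total never appeared on the page, so it can be worth re-checking the arithmetic yourself. The Python sketch below does this for the total = subtotal + tax example. The field names `subtotal` and `tax`, the function name, and the sample amounts are assumptions for illustration.

```python
from decimal import Decimal


def total_is_consistent(result, tolerance="0.01"):
    """Check that total_amount equals subtotal + tax within a small tolerance."""
    # Convert through str() so binary floats such as 408.26 become exact decimals.
    subtotal = Decimal(str(result["subtotal"]))
    tax = Decimal(str(result["tax"]))
    total = Decimal(str(result["total_amount"]))
    return abs(subtotal + tax - total) <= Decimal(tolerance)


# Invented sample values, not taken from a real extraction.
ok = total_is_consistent({"subtotal": 3700.00, "tax": 408.26, "total_amount": 4108.26})
bad = total_is_consistent({"subtotal": 3700.00, "tax": 408.26, "total_amount": 4100.00})
```

Decimal is used rather than float so that cent-level comparisons are not thrown off by binary rounding.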

Confidence scores

Every extracted field comes back with a confidence value in [0, 1]. Treat anything below 0.7 as needing human review.

```json
{
  "result": { "total_amount": 4108.26 },
  "confidence": { "total_amount": 0.98 }
}
```
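
Since values and scores arrive in two parallel objects, a review screen usually needs them joined per field. A minimal Python sketch, assuming the response has exactly the `result` and `confidence` keys shown above; the function name and the second field in the sample are hypothetical.

```python
def fields_by_confidence(response):
    """Pair each extracted value with its score, least confident first."""
    result = response["result"]
    confidence = response["confidence"]
    rows = [(key, result[key], confidence.get(key)) for key in result]
    # A field with no score sorts first so it is not silently trusted.
    return sorted(rows, key=lambda row: -1.0 if row[2] is None else row[2])


# Sample response; the invoice_number entry and its score are invented.
response = {
    "result": {"total_amount": 4108.26, "invoice_number": "INV-2026-00128"},
    "confidence": {"total_amount": 0.98, "invoice_number": 0.64},
}
rows = fields_by_confidence(response)
```

With this sample, `rows` lists `invoice_number` (0.64) ahead of `total_amount` (0.98), so a reviewer sees the weakest field first.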

needs_review status

A file lands in needs_review when at least one field has confidence below your configured threshold (defaults to 0.7). Documents rejected by strict_format receive the same status. The result is still available, but you should surface it in your UI for a human to double-check.
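
The threshold rule can also be mirrored on the client, for instance to apply a stricter cutoff to some document types than the one configured on the service. The Python sketch below assumes only the per-field `confidence` object shown earlier. The function names and the `vendor_name` field are hypothetical, and the service's own status remains the authoritative one.

```python
DEFAULT_THRESHOLD = 0.7  # documented default


def fields_needing_review(confidence, threshold=DEFAULT_THRESHOLD):
    """Return the keys whose score is strictly below the threshold."""
    return sorted(key for key, score in confidence.items() if score < threshold)


def needs_review(confidence, threshold=DEFAULT_THRESHOLD):
    """True when at least one field falls below the threshold."""
    return bool(fields_needing_review(confidence, threshold))


# Invented scores for illustration.
scores = {"total_amount": 0.98, "vendor_name": 0.55}
flagged = fields_needing_review(scores)
strict = fields_needing_review(scores, threshold=0.99)
```

With these scores, `flagged` holds only `vendor_name`, while the stricter 0.99 cutoff flags both fields.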

Resubmitting

To re-process a single file (e.g. after editing the schema), DELETE it and re-upload. Pages are charged again.