Introduction to Dex

Dex is Scale’s document understanding service that transforms unstructured documents into actionable, structured data. It is a comprehensive platform that combines advanced OCR, natural language processing, and machine learning to extract meaningful information from PDFs, images, spreadsheets, and more.

Why Use Dex?

Around 80-90% of enterprise data lives within unstructured formats such as PDFs and DOCX files. Dex solves the most common challenges of programmatic document processing:

Format Diversity: Process any document type with a single API—business reports, financial documents, legal contracts, healthcare records, and more.
Unstructured Data: Convert complex layouts into structured JSON with semantic understanding, including text, tables, charts, and infographics.
Quality Variations: Handle scanned, handwritten, and low-quality documents with high accuracy across multiple languages.
Scalability: Process thousands of documents efficiently with built-in scalable infrastructure.
Flexibility: Choose from multiple OCR engines and customize extraction with your own tools and workflows.

Core Primitives

Dex is designed as a capability rather than a standalone product, centered around composable primitives that can be used, extended, and combined:

File Management

Upload, retrieve, and securely store confidential documents. Supports persistent storage with metadata tracking, secure access patterns, and configurable data retention policies for automatic lifecycle management.

Parse

Convert documents into machine-readable formats using multiple OCR engines. Dex extracts:

Plain text in multiple languages (English, Spanish, Arabic, German, and more)
Tables, from small to large tabular data (500+ rows)
Checkboxes for form processing
Images and figures with bounding box information
Charts for data visualization analysis

Vector Stores

Vectorize and index parsed documents for semantic search and retrieval. Vector stores enable:

Semantic search over document chunks with embedding-based similarity
Context management for multi-file processing and large documents
Regex search for pattern-based extraction (dates, IDs, emails, etc.)
Document summarization for quick overview generation
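
To make the semantic search and regex search capabilities concrete, here is a minimal, self-contained sketch of both ideas. It does not use the Dex SDK: the chunk texts, the hand-made three-dimensional vectors, and the helper names `cosine_similarity`, `semantic_search`, and `regex_search` are illustrative assumptions. A real vector store would compute embeddings with a model and keep them in an index.

```python
import math
import re

# Toy chunks with hand-made 3-d "embeddings" (assumed values, for illustration only).
CHUNKS = [
    {"text": "Invoice INV-1042 issued on 2024-03-15.", "vector": [0.9, 0.1, 0.0]},
    {"text": "Contact billing@example.com with questions.", "vector": [0.2, 0.8, 0.1]},
    {"text": "Payment is due within 30 days.", "vector": [0.7, 0.2, 0.3]},
]


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def semantic_search(query_vector, chunks, top_k=2):
    """Return the top_k chunks ranked by similarity to the query vector."""
    ranked = sorted(
        chunks,
        key=lambda c: cosine_similarity(query_vector, c["vector"]),
        reverse=True,
    )
    return ranked[:top_k]


def regex_search(pattern, chunks):
    """Return every match of the pattern across all chunk texts."""
    compiled = re.compile(pattern)
    return [m for c in chunks for m in compiled.findall(c["text"])]


if __name__ == "__main__":
    for chunk in semantic_search([1.0, 0.0, 0.0], CHUNKS):
        print(chunk["text"])
    print(regex_search(r"\d{4}-\d{2}-\d{2}", CHUNKS))  # ISO-style dates
```

The two searches complement each other: similarity ranking finds chunks that are about the same topic as the query, while a regular expression finds values with a known shape, such as dates or IDs.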

Extract

Extract structured data from parse results or document collections using:

Custom schemas defined with Pydantic models
Natural language prompts to guide extraction
Citations that link extracted data to source locations
Confidence scores for quality assessment
RAG-enhanced extraction using vector store context
Agentic extraction with custom MCP tools for advanced workflows
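
As an illustration of what a custom schema might look like, the sketch below defines a small invoice schema with Pydantic and validates a dict shaped like an extraction result. The model names, field names, and sample values are hypothetical, and the snippet deliberately stops short of calling Dex: how a schema is passed to an extraction request is covered in the Extract documentation, not assumed here.

```python
from typing import List, Optional

from pydantic import BaseModel, Field


class LineItem(BaseModel):
    description: str = Field(description="What was billed")
    amount: float = Field(description="Line total in the invoice currency")


class Invoice(BaseModel):
    invoice_number: str = Field(description="Identifier printed on the invoice")
    vendor_name: str
    total: float
    # Optional fields let extraction succeed when a value is absent from the document.
    due_date: Optional[str] = Field(default=None, description="ISO 8601 date, if present")
    line_items: List[LineItem] = Field(default_factory=list)


# A dict shaped like an extraction result (sample values are made up).
sample = {
    "invoice_number": "INV-1042",
    "vendor_name": "Acme Corp",
    "total": "1250.50",  # numeric strings are coerced to float by default
    "line_items": [{"description": "Consulting", "amount": 1250.5}],
}

invoice = Invoice(**sample)
print(invoice.total, invoice.due_date, len(invoice.line_items))
```

Field descriptions double as guidance for whoever (or whatever model) fills the schema, so it is usually worth writing them even for fields whose names seem obvious.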

Ways to Interact with Dex

Dex provides multiple interfaces to support different use cases:

REST API: OpenAPI-documented endpoints for direct integration
Python SDK: High-level wrapper for rapid development with both sync and async support
MCP Server: Model Context Protocol integration for agent-based workflows (coming soon…)

Common Use Cases

Financial Services: Automate invoice processing, tax document analysis, and financial report extraction.
Healthcare: Extract patient information from medical records, insurance claims, and healthcare forms.
Legal: Analyze contracts, process discovery documents, and extract key clauses and obligations.
Business Operations: Process HR documents, supply chain orders, customer service tickets, and business reports.

Understanding Industry Document Challenges

Different industries face unique document processing challenges based on their document types and layouts. For a comprehensive overview of typical document formats and layout challenges across finance, healthcare, insurance, and legal sectors, see Industry Document Types and Layout Challenges. This guide covers:

Finance: SEC filings, research reports, and financial statements with multi-column layouts, complex footnotes, and embedded visualizations
Healthcare: Medical records and clinical documentation with handwritten elements, scanned materials, and variable form structures
Insurance: Claims forms (CMS-1500, UB-04) combining typed prompts with handwritten responses on poor-quality scans
Legal: Contracts and court filings requiring hierarchical structure preservation through complex sections and redlined annotations

Understanding these document-specific challenges can help you optimize your Dex configuration for better extraction accuracy and results.

Language Support

Dex supports multi-language document processing, with good support for Germanic languages. For non-Germanic languages, there are 35 supported languages, including but not limited to:

Afrikaans: 🇿🇦
Albanian: 🇦🇱
Arabic: 🇸🇦
Armenian: 🇦🇲
Belarusian: 🇧🇾
Bengali: 🇧🇩
Bulgarian: 🇧🇬
Catalan: 🇪🇸
Chinese: 🇨🇳
Croatian: 🇭🇷
Czech: 🇨🇿
Danish: 🇩🇰
Dutch: 🇳🇱
English: 🇬🇧
Estonian: 🇪🇪
Filipino: 🇵🇭
Finnish: 🇫🇮
French: 🇫🇷
German: 🇩🇪
Greek: 🇬🇷
Gujarati: 🇮🇳
Hebrew: 🇮🇱
Hindi: 🇮🇳
Hungarian: 🇭🇺
Icelandic: 🇮🇸
Indonesian: 🇮🇩
Italian: 🇮🇹
Japanese: 🇯🇵
Kannada: 🇮🇳
Khmer: 🇰🇭
Korean: 🇰🇷
Lao: 🇱🇦
Latvian: 🇱🇻
Lithuanian: 🇱🇹
Macedonian: 🇲🇰
Malay: 🇲🇾
Malayalam: 🇮🇳
Marathi: 🇮🇳
Nepali: 🇳🇵
Norwegian: 🇳🇴
Persian: 🇮🇷
Polish: 🇵🇱
Portuguese: 🇵🇹
Punjabi: 🇮🇳
Romanian: 🇷🇴
Russian: 🇷🇺
Serbian: 🇷🇸
Slovak: 🇸🇰
Slovenian: 🇸🇮
Spanish: 🇪🇸
Swedish: 🇸🇪
Tagalog: 🇵🇭
Tamil: 🇮🇳
Telugu: 🇮🇳
Thai: 🇹🇭
Turkish: 🇹🇷
Ukrainian: 🇺🇦
Vietnamese: 🇻🇳
Yiddish: 🇮🇱

Key Features

Citations and Traceability

Every extracted field can be associated with its source location (page number, bounding box, text snippet), enabling auditability and human review.
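
One way to picture a citation is as a small record attached to each extracted value. The sketch below is a hypothetical data shape, not the Dex response format: the class names, the `(x0, y0, x1, y1)` bounding box layout, and the sample values are assumptions chosen to mirror the three pieces of source location named above.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Citation:
    page: int                                # page number where the value was found
    bbox: Tuple[float, float, float, float]  # (x0, y0, x1, y1); coordinate system assumed
    snippet: str                             # source text that supports the value


@dataclass(frozen=True)
class ExtractedField:
    name: str
    value: str
    citation: Citation


def format_for_review(field: ExtractedField) -> str:
    """Render a field with its source location so a reviewer can check it."""
    c = field.citation
    return f'{field.name}={field.value!r} (page {c.page}, bbox {c.bbox}): "{c.snippet}"'


if __name__ == "__main__":
    total = ExtractedField(
        name="total",
        value="1250.50",
        citation=Citation(page=3, bbox=(72.0, 540.0, 180.0, 556.0), snippet="Total due: 1250.50"),
    )
    print(format_for_review(total))
```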

Confidence Scoring

Dex assigns confidence scores to extracted fields based on model outputs, helping you filter and prioritize results for downstream review.
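
A common way to use such scores is threshold-based triage: accept fields above a cutoff and queue the rest for human review, least confident first. The sketch below assumes each field is a dict with a `confidence` value between 0 and 1. That shape, the `triage` helper, and the 0.85 cutoff are illustrative assumptions, not Dex defaults.

```python
def triage(fields, threshold=0.85):
    """Split fields into auto-accepted and needs-review lists.

    The review list is sorted with the lowest confidence first so reviewers
    see the most doubtful values at the top.
    """
    accepted = [f for f in fields if f["confidence"] >= threshold]
    review = sorted(
        (f for f in fields if f["confidence"] < threshold),
        key=lambda f: f["confidence"],
    )
    return accepted, review


if __name__ == "__main__":
    fields = [
        {"name": "total", "confidence": 0.97},
        {"name": "due_date", "confidence": 0.62},
        {"name": "vendor", "confidence": 0.80},
    ]
    accepted, review = triage(fields)
    print([f["name"] for f in accepted])
    print([f["name"] for f in review])
```

The right threshold depends on the cost of an error in your workflow, so it is worth calibrating against a sample of manually checked documents.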

Flexible OCR Engine Support

Choose from multiple OCR engines:

Reducto: see the Reducto documentation.
Iris: see When to choose Iris?

Data Lifecycle Management

Configurable retention policies automatically manage the lifecycle of files and processing artifacts, helping you meet compliance requirements and optimize storage costs.
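
The core of a retention policy is a date comparison: an artifact expires once its age exceeds the retention window. The sketch below shows that logic in isolation, assuming a policy expressed as a number of days and files represented as dicts with `id` and `created_at` keys. These names and shapes are hypothetical and do not reflect how Dex stores or configures policies.

```python
from datetime import datetime, timedelta, timezone


def expires_at(created_at, retention_days):
    """Timestamp at which an artifact becomes eligible for deletion."""
    return created_at + timedelta(days=retention_days)


def is_expired(created_at, retention_days, now=None):
    """True once the retention window has fully elapsed."""
    now = now or datetime.now(timezone.utc)
    return now >= expires_at(created_at, retention_days)


def select_expired(files, retention_days, now):
    """Return the ids of files whose retention window has elapsed."""
    return [f["id"] for f in files if is_expired(f["created_at"], retention_days, now)]


if __name__ == "__main__":
    now = datetime(2024, 6, 1, tzinfo=timezone.utc)
    files = [
        {"id": "a", "created_at": datetime(2024, 1, 1, tzinfo=timezone.utc)},
        {"id": "b", "created_at": datetime(2024, 5, 20, tzinfo=timezone.utc)},
    ]
    print(select_expired(files, retention_days=30, now=now))
```

Using timezone-aware UTC timestamps throughout avoids off-by-hours errors when files are created and evaluated in different regions.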

Infrastructure and Deployment

Core Components

Dex is deployed as a Kubernetes service with the following components.

Application Pods:

API Pod: Handles REST API requests for document processing
Temporal Worker Pod: Processes document understanding workflows
Both pods use the same Docker image with different entry points

Infrastructure Services:

Temporal Server: Workflow orchestration (v1.25.0)
Postgres Database: Application data and Temporal state (Postgres 17)
Object Storage: S3, MinIO, or Azure Blob Storage for document artifacts

Deployment:

Deployed via Helm charts on Kubernetes
Supported on AWS, Azure, and GCP
Requires namespace and service configuration

GPU and Hardware Requirements

GPU requirements depend on which OCR engine and extraction models you use.

OCR Engine Options:

Reducto Local OCR:

Small (CPU): No GPU. Germanic languages. ~3s/page latency.
Medium: A10G GPU. Germanic languages.
Large: 8xH100 GPUs. 40+ languages (no Arabic RTL, degraded CJK).
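
The three tiers above can be read as a small lookup table. The sketch below encodes them and lists which tiers fit a given hardware budget and language requirement. The `TIERS` structure, the `candidate_tiers` helper, and the assumption that larger hardware can also run the smaller tiers are illustrative only; they are not part of Dex or Reducto.

```python
# Tier data transcribed from the list above; the dict layout is an assumption.
TIERS = [
    {"name": "small", "hardware": "cpu", "multilingual": False},
    {"name": "medium", "hardware": "a10g", "multilingual": False},
    {"name": "large", "hardware": "8xh100", "multilingual": True},
]

# Assumed ordering: bigger hardware can also run the smaller tiers.
HARDWARE_RANK = {"cpu": 0, "a10g": 1, "8xh100": 2}


def candidate_tiers(available_hardware, needs_multilingual):
    """List tiers that fit the available hardware and cover the languages needed."""
    limit = HARDWARE_RANK[available_hardware]
    return [
        t["name"]
        for t in TIERS
        if HARDWARE_RANK[t["hardware"]] <= limit
        and (t["multilingual"] or not needs_multilingual)
    ]


if __name__ == "__main__":
    print(candidate_tiers("a10g", needs_multilingual=False))
    print(candidate_tiers("cpu", needs_multilingual=True))
```

An empty result means no local tier fits the constraints. Note also that the Large tier's language caveats (no Arabic RTL, degraded CJK) are not modeled here and still need to be checked by hand.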

Cloud Provider OCR (No provisioned GPU):

AWS Textract
Azure Vision Read
GCP Vision API

Iris:

Suited to custom OCR needs, with 15+ configurable models. See When to choose Iris?

Extraction, Key-Value, Tables:

Azure OpenAI or Claude via Bedrock (no GPU, recommended)
Hosted H100 (slower auto-scaling)

Recommended Configuration

For minimal GPU requirements:

OCR: Reducto Small (CPU) or Azure Vision Read
Extraction: Azure OpenAI or Claude via AWS Bedrock
Infrastructure: Kubernetes cluster with Postgres and object storage

Getting Started

To begin using Dex, you’ll need a Scale account with SGP access.

Quick Links:

Getting Started Guide: Step-by-step tutorial for your first extraction
File Management: Upload, pagination, and file types
Parse: Parse engines and async jobs
Chunking: Chunking strategies
Vector Stores: Semantic search and RAG
Extract: Data extraction and batch processing
Best Practices: Quick start and optimization
API Reference: Complete SDK documentation

Dex makes document understanding accessible to developers while delivering the power and accuracy required for production applications.
