When processing documents in non-Latin scripts (such as Arabic, Hebrew, Chinese, Japanese, Korean, or Indic languages), choosing the right tool is critical. This guide explains when to use Iris versus Reducto as the OCR engine within Dex, and how Iris and Dex work together.

Quick Decision Guide

Use Iris (in Dex) when

  • You’re processing non-Germanic language documents, where Iris has shown better accuracy than Reducto
  • Latency is acceptable for your use case (asynchronous batch processing)
  • You need to experiment and evaluate which OCR models work best for your specific document types
  • You want to test different OCR engines (EasyOCR, PaddleOCR, Gemini, etc.) to find the best fit

Use Dex with Reducto when

  • You’re processing English or Germanic language documents
  • You need better stability for production workloads
  • You want the system to automatically scale with your workload

Understanding the Relationship Between Iris and Dex

Iris and Dex are not mutually exclusive; they are complementary. Dex is a document understanding platform that provides primitives for file management, parsing, vector stores, and data extraction. Iris is one of several OCR engines available within Dex.

Think of it this way: Dex is the platform, and Iris is one of the engines.

When you use Dex, you choose which OCR engine to use for parsing:
  • Reducto: Best for English and Germanic languages
  • Iris: Better accuracy for non-Germanic languages
  • Custom engines: Integrate your own OCR solution

What is Iris?

Iris is Scale’s OCR capability that provides a flexible, modular pipeline for extracting text from documents. It offers:
  • 15+ OCR models: Tesseract, EasyOCR, PaddleOCR, Surya, GPT-4o, Gemini, and more
  • Specialized support for non-Germanic scripts: Arabic Nougat models, QAARI, MBZUAI-AIN, and others
  • Complete pipeline control: Configure layout detection, OCR processing, and assembly separately
  • Inspection capabilities: Save and review intermediate results at each processing stage
  • Extensibility: Add custom OCR models or layout detectors
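
To make the idea of a staged pipeline with inspectable intermediate results more concrete, here is a minimal sketch. It is not the Iris API: `StagedPipeline`, the stage names, and the toy stage functions are all hypothetical and only illustrate how separately configured layout, OCR, and assembly steps could each record their output for later review.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List, Tuple

# Hypothetical names throughout: this is NOT the Iris API, only an
# illustration of a pipeline whose stages can be swapped and inspected.
Stage = Callable[[Any], Any]


@dataclass
class StagedPipeline:
    stages: List[Tuple[str, Stage]]
    intermediates: Dict[str, Any] = field(default_factory=dict)

    def run(self, document: Any) -> Any:
        result = document
        for name, stage in self.stages:
            result = stage(result)
            # Keep each stage's output so it can be reviewed afterwards.
            self.intermediates[name] = result
        return result


# Toy stand-ins for real layout detection, OCR, and assembly components.
def detect_layout(page: str) -> List[str]:
    return [region for region in page.split("\n\n") if region.strip()]


def run_ocr(regions: List[str]) -> List[str]:
    return [region.strip() for region in regions]


def assemble(texts: List[str]) -> str:
    return "\n".join(texts)


pipeline = StagedPipeline(stages=[
    ("layout", detect_layout),
    ("ocr", run_ocr),
    ("assembly", assemble),
])
output = pipeline.run(" Title \n\n Body text ")
```

After a run, `pipeline.intermediates["layout"]` and `pipeline.intermediates["ocr"]` hold the per-stage results. Replacing one entry in `stages` swaps a single component without touching the others.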

Current Limitations

Iris is not yet designed for production environments. We’re actively developing it into a fast, lightweight alternative to Reducto, but it is not there yet.
Current issues include:
  • Stability concerns: Less robust than Reducto for production use
  • Significant latency: Higher processing time than desired
  • No auto-scaling: Cannot scale up or down with system load
Best suited for: Asynchronous batch jobs where accuracy for non-Germanic languages is the priority, and where stability/latency trade-offs are acceptable.

What is Dex?

Dex is Scale’s document understanding platform—a service that transforms unstructured documents into actionable, structured data. It provides:
  • File Management: Secure upload, storage, and retrieval with access control
  • Document Parsing: Convert documents (PDF, DOCX, images) into structured JSON using multiple OCR engines
  • Vector Stores: Index and search parsed documents with semantic embeddings
  • Data Extraction: Extract information using custom schemas, prompts, and RAG-enhanced context
  • Project Management: Organize and isolate data with proper authorization
  • Automatic scaling with your workload (when using Reducto; Iris does not yet auto-scale)
  • Multiple OCR engine options (Reducto and Iris)

Using Iris Within Dex

Dex supports multiple OCR engines. When creating a parse job in Dex, you specify which engine to use:

Engine Options

Reducto
  • Best for English and Germanic language documents
  • Better stability for production workloads
  • Auto-scaling capabilities
  • Use for: English/Germanic documents or when stability is critical
Iris
  • Proven better accuracy for non-Germanic language documents
  • Higher latency (asynchronous processing)
  • Stability concerns for production
  • Use for: Non-Germanic languages where accuracy is the priority and asynchronous processing is acceptable
While Iris provides better accuracy for non-Germanic languages, it’s not yet ready for production environments that require high stability or auto-scaling. We’re working on improving Iris to make it production-ready, but currently recommend considering the stability/latency trade-offs carefully.

Feature Comparison

Feature          | Iris (via Dex)                  | Dex with Reducto
Best For         | Non-Germanic languages          | English & Germanic languages
Accuracy         | Better for non-Germanic scripts | Better for Germanic scripts
Latency          | Higher (asynchronous)           | Medium-high (asynchronous)
Scalability      | No auto-scaling                 | Full auto-scaling
OCR Models       | 15+ models (configurable)       | Reducto engine
Production Ready | ⚠️ Not yet                      | ✅ Yes

Decision Tree

What language are your documents?

  • English or Germanic languages (German, Dutch, Swedish, etc.) → Use Dex with Reducto
  • Non-Germanic languages (Arabic, Hebrew, CJK, Indic, etc.) → Do you need maximum stability?
    • Yes → Use Dex with Reducto (accept lower accuracy)
    • No → Use Dex with Iris (better accuracy, accept latency/stability trade-offs)
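
The decision tree above can be written as a small helper. This is a sketch only: the function name, the language set, and the returned labels `"reducto"` and `"iris"` are hypothetical and are not Dex identifiers. The language set contains just the examples this guide names and is not exhaustive.

```python
# Examples named in this guide; a real list would need to be more complete.
GERMANIC_LANGUAGES = {"english", "german", "dutch", "swedish"}


def choose_engine(language: str, needs_max_stability: bool) -> str:
    """Return a hypothetical engine label following the guide's decision tree."""
    if language.strip().lower() in GERMANIC_LANGUAGES:
        # English or Germanic documents: Reducto regardless of other factors.
        return "reducto"
    # Non-Germanic documents: trade accuracy (Iris) against stability (Reducto).
    return "reducto" if needs_max_stability else "iris"
```

For example, `choose_engine("Arabic", needs_max_stability=False)` selects Iris, while the same language with `needs_max_stability=True` falls back to Reducto.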

Language Support Guidance

For English & Germanic Languages

  • Recommended: Dex with Reducto
  • Reason: Best accuracy and production stability

For Non-Germanic Languages (Arabic, Hebrew, CJK, Indic Languages)

For better accuracy: Dex with Iris (proven better results)
  • Trade-offs: Higher latency, stability concerns
  • Use case: Asynchronous batch processing where accuracy is the priority
For better stability: Dex with Reducto
  • Trade-offs: Lower accuracy for these languages
  • Use case: Production workloads requiring high reliability

Common Misconceptions

“Dex is real-time/low latency”
Neither Reducto nor Iris in Dex provides real-time processing. Both have significant latency. The difference is that Reducto is more stable and has auto-scaling.

“Iris has better accuracy for non-Germanic languages”
This is correct. Iris (used via Dex) has been proven to work better than Reducto for non-Germanic language documents.

“Iris is production-ready”
Not yet. We’re actively working to make Iris a fast, lightweight production alternative to Reducto, but stability issues currently prevent production use in most scenarios.

For Production Applications

  1. English/Germanic languages → Use Dex with Reducto
  2. Non-Germanic languages → Choose based on priority:
    • Accuracy priority → Use Dex with Iris (accept latency/stability trade-offs)
    • Stability priority → Use Dex with Reducto (accept lower accuracy)

For Experimentation

  1. Test with Iris to evaluate different OCR models
  2. Compare results between Iris and Reducto
  3. Transition to Dex for production deployment with your chosen engine
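
One common way to compare OCR outputs in step 2 is character error rate (CER) against a hand-checked reference transcription. This guide does not prescribe a metric, so treat the following as an assumption: a minimal, dependency-free sketch in which the function names are illustrative. Both strings are Unicode-normalized first, which matters for scripts that use combining marks.

```python
import unicodedata


def levenshtein(a: str, b: str) -> int:
    """Edit distance (insertions, deletions, substitutions) between a and b."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca
                current[j - 1] + 1,      # insert cb
                previous[j - 1] + cost,  # substitute or match
            ))
        previous = current
    return previous[-1]


def character_error_rate(reference: str, hypothesis: str) -> float:
    """Edit distance divided by reference length, after NFC normalization."""
    ref = unicodedata.normalize("NFC", reference)
    hyp = unicodedata.normalize("NFC", hypothesis)
    if not ref:
        raise ValueError("reference must be non-empty")
    return levenshtein(ref, hyp) / len(ref)
```

Running `character_error_rate` on each engine's output for the same set of reference pages gives a per-engine score in which lower is better. A small sample that reflects your real documents is usually more informative than a large generic one.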

Getting Started

Dex Documentation

Iris Documentation

Summary

  • Dex is the platform for document understanding with file management, parsing, extraction, and vector stores
  • Iris is an OCR engine available within Dex (and standalone) optimized for non-Germanic languages
  • For English/Germanic languages: Use Dex with Reducto
  • For non-Germanic languages: Use Dex with Iris for better accuracy (asynchronous processing), or Reducto for better stability
  • Iris is not production-ready yet: We’re working on making it fast and lightweight, but stability issues currently exist
Need help deciding? Contact the Dex team at #dex-help on Slack.