Latest Updates
New Capability: IRIS OCR
Added: October 2024
IRIS is Scale’s OCR capability that transforms document images and PDFs into structured text through an intelligent multi-stage pipeline.
What’s New
Two new documentation pages:
Introduction to IRIS
- Overview of IRIS OCR capability
- Three-stage pipeline architecture (layout detection, OCR processing, assembly)
- 15+ supported OCR models including open-source and vision-language models
- Multi-language support with specialized Arabic models
- Common use cases and key advantages
Getting Started with IRIS
- Comprehensive guide to using IRIS through the Dex SDK
- Prerequisites and setup instructions
- Parsing PDFs and images with complete examples
- Configuration options for parse engine
- Understanding parse results and chunk structure
- File management and error handling
- Batch processing examples
- Multi-language support details
- Best practices for production use
- Performance considerations and optimization tips
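The batch processing and error handling topics listed above can be pictured with a generic pattern. The sketch below is only an illustration and does not use the Dex SDK: `parse_batch`, `fake_parse`, and the file names are hypothetical stand-ins, and a real integration would pass in whatever parse call the SDK provides.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from typing import Callable, Dict, Iterable, Tuple


def parse_batch(
    paths: Iterable[str],
    parse_one: Callable[[str], str],
    max_workers: int = 4,
) -> Tuple[Dict[str, str], Dict[str, str]]:
    """Parse many files concurrently, collecting failures instead of aborting."""
    results: Dict[str, str] = {}
    errors: Dict[str, str] = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Map each future back to its input path so failures can be reported.
        futures = {pool.submit(parse_one, path): path for path in paths}
        for future in as_completed(futures):
            path = futures[future]
            try:
                results[path] = future.result()
            except Exception as exc:  # one bad file should not stop the batch
                errors[path] = str(exc)
    return results, errors


def fake_parse(path: str) -> str:
    """Hypothetical stand-in for a real parse call (assumption, not the SDK)."""
    if not path.endswith((".pdf", ".png")):
        raise ValueError(f"unsupported file type: {path}")
    return f"parsed text of {path}"


results, errors = parse_batch(["a.pdf", "b.png", "c.docx"], fake_parse)
```

Keeping successes and failures in separate dictionaries lets a caller retry only the failed files.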
Key Features
- Layout-Aware Processing: Automatically detects text, tables, and images before OCR
- Multiple OCR Engines: Choose from Tesseract, EasyOCR, PaddleOCR, Surya, GPT-4o, Gemini, and more
- Table-Specific Processing: Specialized models optimized for accurate table extraction
- Multi-Language Support: Process documents in 35+ languages including Arabic
- Dex Integration: Seamless integration with Dex’s document understanding platform
- Async Processing: Non-blocking parse jobs with project-based organization
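To make the layout-aware, three-stage flow (layout detection, OCR processing, assembly) concrete, here is a toy sketch in plain Python. It is not IRIS code: every name in it (`Region`, `Chunk`, `detect_layout`, `run_ocr`, `assemble`, and the stub engines) is hypothetical, and the "engines" are simple string functions standing in for real OCR models.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Region:
    kind: str     # "text", "table", or "image"
    order: int    # reading order on the page
    payload: str  # stand-in for the cropped pixels of the region


@dataclass
class Chunk:
    kind: str
    order: int
    text: str


def detect_layout(page: List[dict]) -> List[Region]:
    """Stage 1, layout detection (stub): the page is already described as regions."""
    return [Region(r["kind"], i, r["payload"]) for i, r in enumerate(page)]


def run_ocr(
    regions: List[Region],
    engines: Dict[str, Callable[[str], str]],
    default: Callable[[str], str],
) -> List[Chunk]:
    """Stage 2, OCR processing: route each region to the engine for its kind."""
    chunks = []
    for region in regions:
        engine = engines.get(region.kind, default)
        chunks.append(Chunk(region.kind, region.order, engine(region.payload)))
    return chunks


def assemble(chunks: List[Chunk]) -> str:
    """Stage 3, assembly: join non-empty chunks in reading order."""
    ordered = sorted(chunks, key=lambda c: c.order)
    return "\n\n".join(c.text for c in ordered if c.text)


def text_engine(payload: str) -> str:
    return payload.strip()


def table_engine(payload: str) -> str:
    # Toy table format: rows separated by ";", cells separated by ",".
    return "\n".join(" | ".join(row.split(",")) for row in payload.split(";"))


ENGINES = {"text": text_engine, "table": table_engine, "image": lambda _: ""}

page = [
    {"kind": "text", "payload": "  Invoice 42  "},
    {"kind": "table", "payload": "item,qty;pen,3"},
    {"kind": "image", "payload": "<logo>"},
]
result = assemble(run_ocr(detect_layout(page), ENGINES, text_engine))
```

The point of the sketch is the routing step: because regions are classified before OCR, a table can go to a table-specific engine while body text goes to a general one.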
How to Access
IRIS is available through the Dex SDK as a parse engine option.
Configuration Updates
Navigation Structure Improvements
Updated: October 2024
- Added explicit V5 (beta) version tags to all Capabilities navigation groups
- Ensures proper scoping of Capabilities documentation to V5
- Improved navigation organization for better user experience
Capabilities navigation groups:
- Getting Started
- Document Understanding
- OCR
- Workflows

