This page tracks updates and additions to Scale’s Capabilities documentation.

Latest Updates

New Capability: IRIS OCR

Added: October 2024
IRIS is Scale’s OCR capability that transforms document images and PDFs into structured text through an intelligent multi-stage pipeline.

What’s New

Two new documentation pages:
  1. Introduction to IRIS
    • Overview of IRIS OCR capability
    • Three-stage pipeline architecture (layout detection, OCR processing, assembly)
    • 15+ supported OCR models including open-source and vision-language models
    • Multi-language support with specialized Arabic models
    • Common use cases and key advantages
  2. Getting Started with IRIS
    • Comprehensive guide to using IRIS through the Dex SDK
    • Prerequisites and setup instructions
    • Parsing PDFs and images with complete examples
    • Configuration options for the parse engine
    • Understanding parse results and chunk structure
    • File management and error handling
    • Batch processing examples
    • Multi-language support details
    • Best practices for production use
    • Performance considerations and optimization tips

Key Features

  • Layout-Aware Processing: Automatically detects text, tables, and images before OCR
  • Multiple OCR Engines: Choose from Tesseract, EasyOCR, PaddleOCR, Surya, GPT-4o, Gemini, and more
  • Table-Specific Processing: Specialized models optimized for accurate table extraction
  • Multi-Language Support: Process documents in 35+ languages including Arabic
  • Dex Integration: Seamless integration with Dex’s document understanding platform
  • Async Processing: Non-blocking parse jobs with project-based organization
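
To make the layout-aware flow concrete, here is a minimal sketch of a three-stage pipeline (layout detection, OCR processing, assembly). It is an illustration under stated assumptions, not IRIS's implementation: `Region`, `detect_layout`, `RECOGNIZERS`, `run_ocr`, `assemble`, and `parse_page` are all hypothetical names, and the "recognizers" are string functions standing in for real OCR models.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical types and functions for illustration only.
# None of these names come from IRIS or the Dex SDK.


@dataclass
class Region:
    kind: str     # "text", "table", or "image"
    order: int    # reading order on the page
    payload: str  # stand-in for the cropped pixels of the region


def detect_layout(page: list[tuple[str, str]]) -> list[Region]:
    """Stage 1: split a page into typed regions.

    Stubbed here: the input page is already a list of (kind, payload) pairs.
    """
    return [Region(kind, i, payload) for i, (kind, payload) in enumerate(page)]


# Stage 2: one recognizer per region type, so tables can be routed to a
# specialized model. The lambdas below only mimic that routing.
RECOGNIZERS: dict[str, Callable[[str], str]] = {
    "text": lambda p: p.strip(),
    "table": lambda p: " | ".join(cell.strip() for cell in p.split(",")),
    "image": lambda p: f"[image: {p.strip()}]",
}


def run_ocr(regions: list[Region]) -> list[tuple[int, str]]:
    """Stage 2: recognize each region with the recognizer for its kind."""
    return [(r.order, RECOGNIZERS[r.kind](r.payload)) for r in regions]


def assemble(results: list[tuple[int, str]]) -> str:
    """Stage 3: stitch recognized regions back together in reading order."""
    return "\n".join(text for _, text in sorted(results))


def parse_page(page: list[tuple[str, str]]) -> str:
    return assemble(run_ocr(detect_layout(page)))
```

The point of detecting layout first is that each region can be sent to the recognizer best suited to it, and the assembly stage only needs the reading order to rebuild the document.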

How to Access

IRIS is available through the Dex SDK as a parse engine option:
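# Assumes `dex_file` is a file handle already obtained through the Dex SDK,
# and that this code runs inside an async function (it uses `await`).
from dex_core.models.parse_job import IrisParseEngineOptions, IrisParseJobParams

# Start a parse job that uses IRIS with its default engine options.
parse_job = await dex_file.start_parse_job(
    IrisParseJobParams(options=IrisParseEngineOptions())
)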
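
Because `start_parse_job` is awaited, several jobs can be started concurrently with standard `asyncio` tools. The sketch below shows one possible pattern and is not taken from the Dex documentation: `FakeDexFile` is a hypothetical stand-in so the example runs without the SDK, and `start_all`, `max_concurrent`, and the returned job strings are assumptions made for illustration. With the real SDK you would pass actual file handles and an `IrisParseJobParams` instance instead.

```python
import asyncio


class FakeDexFile:
    """Hypothetical stand-in for a Dex file handle; not part of the SDK."""

    def __init__(self, name: str) -> None:
        self.name = name

    async def start_parse_job(self, params: object) -> str:
        # Yield to the event loop, as a real network call would.
        await asyncio.sleep(0)
        return f"job-for-{self.name}"


async def start_all(files, params, max_concurrent: int = 4) -> list:
    """Start one parse job per file, with at most `max_concurrent` in flight."""
    limit = asyncio.Semaphore(max_concurrent)

    async def start_one(f):
        async with limit:
            return await f.start_parse_job(params)

    # gather() returns results in the same order as the input files.
    return await asyncio.gather(*(start_one(f) for f in files))


def demo() -> list:
    files = [FakeDexFile("a.pdf"), FakeDexFile("b.png")]
    return asyncio.run(start_all(files, params=None))
```

The semaphore caps how many jobs are started at once, which is a common way to stay within service rate limits when processing a batch.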

Configuration Updates

Updated: October 2024
  • Added explicit V5 (beta) version tags to all Capabilities navigation groups
  • Ensures proper scoping of Capabilities documentation to V5
  • Improved navigation organization for better user experience
Affected Sections:
  • Getting Started
  • Document Understanding
  • OCR
  • Workflows

Feedback

Have suggestions for improving our Capabilities documentation? Please contact the Scale AI team or submit feedback through your account dashboard.