This guide will help you get started with IRIS, Scale’s experimental OCR capability integrated with Dex. You’ll learn how to parse documents with IRIS through the Dex SDK, work with parse results, and integrate IRIS into your document processing workflows.
IRIS is experimental: while you can use it in production, expect experimental-level support and stability. For standard production workloads, we recommend Reducto within Dex for proven reliability. IRIS is designed for teams with custom OCR needs who want to maximize accuracy, minimize latency, or build specialized models.
Overview
IRIS is Scale’s experimental OCR capability that provides complete pipeline control for custom document processing needs. When using IRIS through Dex, you get:
- 15+ OCR Models: Experiment with different OCR engines to optimize for your specific documents
- Complete Pipeline Control: Configure layout detection, OCR processing, and assembly independently
- Unified Document Management: Upload and manage files through Dex’s file system
- Async Processing: Non-blocking parse jobs that process documents in the background
- Project Organization: Group files and parse results within Dex projects
- Extensibility: Add custom OCR models or layout detectors for specialized needs
Note: If you don’t have custom OCR requirements, use Reducto within Dex for production-ready document processing.
Prerequisites
Before using IRIS, ensure you have:
- Dex SDK installed
- SGP Account ID and API Key
- Access to a Dex instance
Basic Usage
Initialize Dex Client
import os
from dex_sdk.client import DexClient
from dex_sdk.types import ProjectCredentials, SGPCredentials
# Initialize Dex client
dex_client = DexClient(
    base_url="your-dex-url",
)
# Create a project with SGP credentials
project = await dex_client.create_project(
    name="my-ocr-project",
    credentials=ProjectCredentials(
        sgp=SGPCredentials(
            account_id=os.getenv("SGP_ACCOUNT_ID"),
            api_key=os.getenv("SGP_API_KEY")
        ),
    ),
)
Parse a Document with IRIS
from dex_core.models.parse_job import IrisParseEngineOptions, IrisParseJobParams
# Upload document
dex_file = await project.upload_file("document.pdf")
# Start IRIS parse job
parse_job = await dex_file.start_parse_job(
    IrisParseJobParams(
        options=IrisParseEngineOptions(),
    )
)
# Get parse results
parse_result = await parse_job.get_parse_result()
# Access parsed content
chunks = parse_result.data.content.chunks
for chunk in chunks:
    print(chunk.content)
Supported Document Types
IRIS supports various document formats:
- PDF documents (.pdf)
- Images (.png, .jpg, .jpeg, .tiff)
- Scanned documents with printed or handwritten text
- Multi-page documents
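Before uploading, you can guard against unsupported formats with a small extension check. This is a sketch: `SUPPORTED_EXTENSIONS` and `is_supported` are our own helper names based on the list above, not part of the Dex SDK, and the actual set of accepted formats may vary by Dex version.

```python
from pathlib import Path

# Extensions from the list above; adjust to match your Dex version's support.
SUPPORTED_EXTENSIONS = {".pdf", ".png", ".jpg", ".jpeg", ".tiff"}

def is_supported(path: str) -> bool:
    # Compare the lowercased file extension against the supported set
    return Path(path).suffix.lower() in SUPPORTED_EXTENSIONS

print(is_supported("scan.PDF"))    # True
print(is_supported("notes.docx"))  # False
```

Filtering up front avoids starting parse jobs that will fail on unsupported input.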
Parsing PDFs
Basic PDF Parsing
from dex_core.models.parse_job import IrisParseEngineOptions, IrisParseJobParams
# Upload PDF file
dex_file = await project.upload_file("path/to/document.pdf")
# Start IRIS parse job
parse_job = await dex_file.start_parse_job(
    IrisParseJobParams(
        options=IrisParseEngineOptions(),
    )
)
# Wait for and retrieve parse results
parse_result = await parse_job.get_parse_result()
# Access parsed content
chunks = parse_result.data.content.chunks
for chunk in chunks:
    print(chunk.content)
Complete PDF Example
import os
from dex_sdk.client import DexClient
from dex_sdk.file import DexFile
from dex_sdk.types import ProjectCredentials, SGPCredentials
from dex_core.models.parse_job import IrisParseEngineOptions, IrisParseJobParams
from dex_core.models.files import DUCFileEntity
async def parse_pdf_example():
    # Initialize client
    dex_client = DexClient(
        base_url="your-dex-url",
    )
    # Create project
    project = await dex_client.create_project(
        name="ocr-project",
        credentials=ProjectCredentials(
            sgp=SGPCredentials(
                account_id=os.getenv("SGP_ACCOUNT_ID"),
                api_key=os.getenv("SGP_API_KEY")
            ),
        ),
    )
    # Upload file
    dex_file = await project.upload_file("document.pdf")
    # Verify file upload
    assert isinstance(dex_file, DexFile)
    assert isinstance(dex_file.data, DUCFileEntity)
    # Verify file is in project
    files = await project.list_files()
    assert len(files) > 0, "Project should contain at least one file"
    # Get download URL if needed
    download_url = await dex_file.get_download_url()
    assert download_url.startswith("http")
    # Start IRIS parse job
    parse_job = await dex_file.start_parse_job(
        IrisParseJobParams(
            options=IrisParseEngineOptions(),
        )
    )
    # Get parse results
    parse_result = await parse_job.get_parse_result()
    # Access parsed content
    content_length = len(parse_result.data.content.chunks[0].content)
    print(f"Parsed content length: {content_length}")
Parsing Images
IRIS supports various image formats including PNG, JPG, and TIFF.
Basic Image Parsing
# Upload image file
dex_file = await project.upload_file("path/to/image.png")
# Start IRIS parse job
parse_job = await dex_file.start_parse_job(
    IrisParseJobParams(
        options=IrisParseEngineOptions(),
    )
)
# Get parse results
parse_result = await parse_job.get_parse_result()
# Access chunks
chunks = parse_result.data.content.chunks
print(f"Number of chunks: {len(chunks)}")
Complete Image Example
async def parse_image_example():
    dex_client = DexClient(
        base_url="your-dex-url",
    )
    project = await dex_client.create_project(
        name="ocr-project",
        credentials=ProjectCredentials(
            sgp=SGPCredentials(
                account_id=os.getenv("SGP_ACCOUNT_ID"),
                api_key=os.getenv("SGP_API_KEY")
            ),
        ),
    )
    # Upload PNG file
    dex_file = await project.upload_file("image.png")
    assert isinstance(dex_file, DexFile)
    assert isinstance(dex_file.data, DUCFileEntity)
    # Verify file in project
    assert len(await project.list_files()) > 0
    # Get download URL
    download_url = await dex_file.get_download_url()
    assert download_url.startswith("http")
    # Start parse job
    parse_job = await dex_file.start_parse_job(
        IrisParseJobParams(
            options=IrisParseEngineOptions(),
        )
    )
    # Get results
    parse_result = await parse_job.get_parse_result()
    # Access chunks
    print(f"Number of chunks: {len(parse_result.data.content.chunks)}")
Uploading Files from Memory
You can also upload files directly from memory streams:
import io
# Create file from bytes
file_bytes = b"your file content here"
file_stream = io.BytesIO(file_bytes)
# Upload from stream
dex_file = await project.upload_file(stream=file_stream, filename="document.pdf")
# Parse as usual
parse_job = await dex_file.start_parse_job(
    IrisParseJobParams(
        options=IrisParseEngineOptions(),
    )
)
Configuration Options
IrisParseEngineOptions
Customize IRIS behavior using parse engine options:
from dex_core.models.parse_job import IrisParseEngineOptions
# Configure IRIS options
options = IrisParseEngineOptions(
    # Configuration parameters available based on your needs
    # (specific options depend on your Dex version)
)
# Use custom options
parse_job = await dex_file.start_parse_job(
    IrisParseJobParams(options=options)
)
Understanding Parse Results
Result Structure
IRIS returns parsed content organized into chunks, where each chunk represents a section of the document:
parse_result = await parse_job.get_parse_result()
# Access all chunks
chunks = parse_result.data.content.chunks
for i, chunk in enumerate(chunks):
    print(f"Chunk {i}:")
    print(f"  Content: {chunk.content[:100]}...")  # First 100 chars
    # Additional chunk metadata may be available
    if hasattr(chunk, 'page_number'):
        print(f"  Page: {chunk.page_number}")
    if hasattr(chunk, 'region_type'):
        print(f"  Type: {chunk.region_type}")
Working with Different Content Types
IRIS detects and processes different content regions:
- Text regions: Paragraphs, headings, and body text
- Table regions: Structured tabular data
- Image regions: Figures and diagrams
# Filter for text chunks
text_chunks = [
    chunk for chunk in chunks
    if hasattr(chunk, 'region_type') and chunk.region_type == 'text'
]
# Filter for table chunks
table_chunks = [
    chunk for chunk in chunks
    if hasattr(chunk, 'region_type') and chunk.region_type == 'table'
]
File Management
List Files in Project
# List all files
files = await project.list_files()
for file in files:
    print(f"File: {file.data.name}")
    print(f"  ID: {file.data.id}")
    print(f"  Size: {file.data.size_bytes} bytes")
Get File Download URL
# Get pre-signed S3 URL for downloading
download_url = await dex_file.get_download_url()
# Use URL with requests or other HTTP client
import requests
response = requests.get(download_url)
file_content = response.content
Check File Details
# Access file metadata
print(f"File name: {dex_file.data.name}")
print(f"File ID: {dex_file.data.id}")
print(f"File size: {dex_file.data.size_bytes} bytes")
# Check if file is a DUCFileEntity
from dex_core.models.files import DUCFileEntity
assert isinstance(dex_file.data, DUCFileEntity)
Error Handling
Handle Upload Errors
try:
    dex_file = await project.upload_file("path/to/file.pdf")
except FileNotFoundError:
    print("File not found")
except Exception as e:
    print(f"Upload failed: {e}")
Handle Parse Errors
try:
    parse_job = await dex_file.start_parse_job(
        IrisParseJobParams(
            options=IrisParseEngineOptions(),
        )
    )
    parse_result = await parse_job.get_parse_result()
except Exception as e:
    print(f"Parse failed: {e}")
Processing Multiple Documents
Batch Processing
import asyncio
async def process_documents(file_paths):
    # Upload all files
    dex_files = []
    for path in file_paths:
        dex_file = await project.upload_file(path)
        dex_files.append(dex_file)
    # Start all parse jobs
    parse_jobs = []
    for dex_file in dex_files:
        job = await dex_file.start_parse_job(
            IrisParseJobParams(options=IrisParseEngineOptions())
        )
        parse_jobs.append(job)
    # Wait for all to complete
    results = []
    for job in parse_jobs:
        result = await job.get_parse_result()
        results.append(result)
    return results
# Usage
file_paths = ["doc1.pdf", "doc2.pdf", "doc3.pdf"]
results = await process_documents(file_paths)
Multi-Language Support
IRIS supports OCR in multiple languages, including:
- Latin-based languages: English, Spanish, French, German, Italian, Portuguese, etc.
- Arabic: With specialized model support
- Asian languages: Chinese, Japanese, Korean
- Other languages: Russian, Hebrew, Hindi, Thai, and more
The appropriate language models are selected automatically based on the document content.
Best Practices
Use Environment Variables for Credentials
import os
# Load credentials from environment
account_id = os.getenv("SGP_ACCOUNT_ID")
api_key = os.getenv("SGP_API_KEY")
if not account_id or not api_key:
    raise ValueError("SGP credentials not found in environment")
credentials = SGPCredentials(
    account_id=account_id,
    api_key=api_key
)
Organize Files by Project
# Create separate projects for different document types
invoices_project = await dex_client.create_project(
    name="invoices",
    credentials=ProjectCredentials(sgp=sgp_credentials),
)
contracts_project = await dex_client.create_project(
    name="contracts",
    credentials=ProjectCredentials(sgp=sgp_credentials),
)
Wait for Parse Completion
# Parse jobs are async - ensure completion before accessing results
parse_job = await dex_file.start_parse_job(
    IrisParseJobParams(options=IrisParseEngineOptions())
)
# This waits for completion
parse_result = await parse_job.get_parse_result()
# Now safe to access results
chunks = parse_result.data.content.chunks
Common Use Cases
def extract_all_text(parse_result):
    """Extract all text content from a parse result."""
    all_text = []
    for chunk in parse_result.data.content.chunks:
        all_text.append(chunk.content)
    return "\n\n".join(all_text)
# Usage
parse_result = await parse_job.get_parse_result()
full_text = extract_all_text(parse_result)
print(full_text)
# IRIS automatically detects and extracts structured content
parse_result = await parse_job.get_parse_result()
# Access structured chunks
for chunk in parse_result.data.content.chunks:
    # Process based on content type
    if hasattr(chunk, 'region_type'):
        if chunk.region_type == 'table':
            # Handle table data
            process_table(chunk.content)
        elif chunk.region_type == 'text':
            # Handle text data
            process_text(chunk.content)
Processing Time
Parse job duration depends on:
- Document length (number of pages)
- Image resolution and quality
- Content complexity (tables, mixed layouts)
- Selected OCR models
Optimization Tips
- Batch processing: Process multiple documents concurrently when possible
- Pre-processing: Ensure documents are properly oriented and of good quality
- Project organization: Group related documents in the same project for better management
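The batch-processing tip above can be taken further with `asyncio.gather`, which starts all jobs at once instead of awaiting them one by one. This is a sketch of the pattern only: `parse_one` is a placeholder coroutine, and in the real workflow its body would be the `upload_file` / `start_parse_job` / `get_parse_result` sequence shown earlier.

```python
import asyncio

async def parse_one(path: str) -> str:
    # Placeholder for the real per-document work (upload_file,
    # start_parse_job, get_parse_result); here it just simulates
    # an async step so the pattern is runnable on its own.
    await asyncio.sleep(0)
    return f"parsed:{path}"

async def process_concurrently(paths):
    # gather schedules every coroutine concurrently and returns
    # results in the same order as the input paths
    return await asyncio.gather(*(parse_one(p) for p in paths))

results = asyncio.run(process_concurrently(["doc1.pdf", "doc2.pdf", "doc3.pdf"]))
print(results)  # ['parsed:doc1.pdf', 'parsed:doc2.pdf', 'parsed:doc3.pdf']
```

Because parse jobs are I/O-bound, running them concurrently can cut total wall-clock time roughly to that of the slowest single document, subject to any rate limits on your Dex instance.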
Next Steps
Now that you understand how to use IRIS:
- Review the Introduction to IRIS to learn about the underlying OCR pipeline
- Explore the Dex documentation for additional capabilities
- Integrate IRIS into your document processing workflows
- Use parsed results with Dex’s extraction and vector store features
For questions or support, please contact the Scale AI team.