30-Second Quick Start
Synchronous Client
For synchronous workflows, use DexSyncClient from dex_sdk:
Data Retention Policies
Configure automatic data lifecycle management for compliance and cost optimization.
Setting Retention Policies
Updating Retention Policies
Retention Policy Use Cases
Compliance (GDPR, HIPAA):
Optimizing Extraction Accuracy
- Write Clear Prompts
- Design Good Schemas
- Enable Citations and Confidence
  - Always set generate_citations=True for debugging
  - Use generate_confidence=True to filter low-confidence results
- Use Vector Stores for Large Documents
  - Documents > 50 pages benefit from RAG-enhanced extraction
  - Vector stores improve accuracy for cross-document queries
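The confidence filtering mentioned above can be sketched as follows. This is a minimal illustration, not the SDK's API: it assumes each extracted field has already been copied into a plain dict with value and confidence keys, and both the split_by_confidence helper and the 0.8 threshold are hypothetical choices.

```python
from typing import Any

# Hypothetical shape: field name -> {"value": ..., "confidence": float in [0, 1]}.
# This is an assumption for illustration, not the SDK's actual result format.
ExtractedFields = dict[str, dict[str, Any]]


def split_by_confidence(
    fields: ExtractedFields, threshold: float = 0.8
) -> tuple[ExtractedFields, ExtractedFields]:
    """Split fields into (accepted, needs_review) by confidence score."""
    accepted: ExtractedFields = {}
    needs_review: ExtractedFields = {}
    for name, field in fields.items():
        # Treat a missing score as zero confidence so it is reviewed, not trusted.
        score = field.get("confidence") or 0.0
        target = accepted if score >= threshold else needs_review
        target[name] = field
    return accepted, needs_review


fields = {
    "invoice_number": {"value": "INV-001", "confidence": 0.97},
    "total_amount": {"value": "1,204.50", "confidence": 0.41},
}
accepted, needs_review = split_by_confidence(fields)
print(sorted(accepted))      # fields at or above the threshold
print(sorted(needs_review))  # fields to check against their citations
```

Routing low-confidence fields to a review queue, rather than dropping them, keeps the citations available for a manual check.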
Getting SGP Traces for Extraction
Use start_extract_job instead of extract to get async jobs that are linked to SGP traces. Search for traces by job ID to debug extraction latency, token usage, or failures.
Common Workflows
Single Document Processing:
Performance Optimization
Reduce Latency
- Use appropriate chunking: Smaller chunks = faster parsing
- Limit OCR scope: Only process pages you need
- Batch operations: Process multiple files in parallel
- Cache parse results: Reuse parsed documents for multiple extractions
Reduce Costs
- Set retention policies: Auto-delete old data
- Choose right model: Use smaller models when possible
- Optimize prompts: Shorter prompts = lower token costs
- Filter before extraction: Use vector search to find relevant chunks first
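To illustrate the filter-before-extraction tip, here is a toy sketch of ranking chunks by cosine similarity and keeping only the top matches. The embeddings are made-up three-dimensional vectors and top_k_chunks is a hypothetical helper; in practice a vector store performs this search.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k_chunks(query_vec, chunks, k=2):
    """Return the text of the k chunks most similar to the query vector."""
    ranked = sorted(
        chunks, key=lambda c: cosine_similarity(query_vec, c[1]), reverse=True
    )
    return [text for text, _ in ranked[:k]]


# Toy 3-dimensional embeddings, made up for illustration.
chunks = [
    ("Payment terms: net 30 days.", [0.9, 0.1, 0.0]),
    ("Company history and mission.", [0.0, 0.2, 0.9]),
    ("Late fees apply after the due date.", [0.8, 0.3, 0.1]),
]
query_vec = [1.0, 0.2, 0.0]  # stands in for an embedded question about payment
relevant = top_k_chunks(query_vec, chunks, k=2)
print(relevant)
```

Only the selected chunks would then be sent for extraction, so the prompt carries fewer tokens.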
Common Errors & Quick Fixes
| Error | Cause | Quick Fix |
|---|---|---|
| AuthenticationError | Missing/invalid credentials | Check SGP_API_KEY and SGP_ACCOUNT_ID env vars |
| FileUploadError | Unsupported format or too large | Check file type, reduce size |
| ParsingError | OCR failure | Try different engine or check document quality |
| ExtractionError | Invalid schema | Validate Pydantic model, check field types |
| ConnectionError | Network issues | Check internet connection, verify base URL |
| RateLimitError | Too many requests | Implement backoff/retry, reduce concurrency |
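The backoff/retry fix for RateLimitError can be sketched as below. The RateLimitError class here is a local stand-in, since this page does not say where the SDK's exception is imported from, and with_backoff and its default delays are hypothetical.

```python
import random
import time


class RateLimitError(Exception):
    """Local stand-in; in real code, import the SDK's own exception instead."""


def with_backoff(call, max_attempts=5, base_delay=0.5, max_delay=8.0, sleep=time.sleep):
    """Run call() and retry on RateLimitError with exponential backoff and jitter."""
    for attempt in range(max_attempts):
        try:
            return call()
        except RateLimitError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the error to the caller
            delay = min(max_delay, base_delay * 2 ** attempt)
            sleep(delay + random.uniform(0, delay / 2))  # jitter spreads out retries


attempts = {"count": 0}


def flaky_request():
    """Fails twice with a rate limit, then succeeds."""
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise RateLimitError("too many requests")
    return "ok"


result = with_backoff(flaky_request, base_delay=0.01)
print(result, attempts["count"])
```

Doubling the delay on each attempt and capping it at max_delay gives the service time to recover without stalling the caller indefinitely.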
Access Response Data
Remember: SDK methods return wrapper objects; access the underlying data via .data:
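The wrapper pattern looks roughly like this. The classes below are hypothetical stand-ins written for illustration, not the SDK's real types, and the field names are assumptions.

```python
from dataclasses import dataclass
from typing import Generic, TypeVar

T = TypeVar("T")


@dataclass
class FileData:
    """Hypothetical payload; the real entity fields may differ."""
    id: str
    filename: str


@dataclass
class Wrapper(Generic[T]):
    """Hypothetical stand-in for an SDK wrapper object."""
    data: T


uploaded = Wrapper(data=FileData(id="file_123", filename="invoice.pdf"))

# The payload fields live on .data, not on the wrapper itself.
print(uploaded.data.id)
print(hasattr(uploaded, "id"))
```

If an attribute lookup fails on a returned object, checking .data first is a quick way to rule out this mistake.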
Type Import Quick Lookup
| Type | Import From | Used For |
|---|---|---|
| DexClient | dex_sdk | Client initialization |
| DexSyncClient | dex_sdk | Sync client |
| ParseEngine | dex_sdk.types | OCR engine selection |
| ReductoParseJobParams | dex_sdk.types | Reducto configuration |
| IrisParseJobParams | dex_sdk.types | Iris configuration |
| ReductoChunkingMethod | dex_sdk.types | Chunking method enum |
| ReductoChunkingOptions | dex_sdk.types | Chunking config |
| ReductoParseEngineOptions | dex_sdk.types | Parser options |
| IrisParseEngineOptions | dex_sdk.types | Iris parser options |
| ExtractionParameters | dex_sdk.types | Extraction config |
| VectorStoreEngines | dex_sdk.types | Vector store engines |
| VectorStoreSearchResult | dex_sdk.types | Search results |
| ProjectConfiguration | dex_sdk.types | Project config |
| RetentionPolicy | dex_sdk.types | Data retention |
| PaginationParams | dex_sdk.types | Pagination config |
| FileListFilter | dex_sdk.types | File filtering |
| JobListFilter | dex_sdk.types | Job filtering |
| ParseResultListFilter | dex_sdk.types | Parse result filtering |
| ExtractionListFilter | dex_sdk.types | Extraction filtering |
| VectorStoreListFilter | dex_sdk.types | Vector store filtering |
| JobStatus | dex_sdk.types | Job status enum |
Entity types (accessed via .data, no import needed):
ProjectEntity, FileEntity, ParseResultEntity, ExtractionEntity, VectorStoreEntity
Next Steps
- Getting Started: Step-by-step tutorial
- File Management: Upload and manage files
- API Reference: Complete SDK documentation
- Troubleshooting: Detailed error solutions

