Installation Issues
Installing the Dex SDK
The Dex SDK is distributed via Scale’s private CodeArtifact repository and requires AWS authentication.
Prerequisites:
- AWS CLI installed and configured
- Access to Scale’s production AWS account
- Authenticate with AWS SSO:
- Configure pip for CodeArtifact:
- Install the SDK: The exact command can be found on GitHub, in the introduction notebooks.
- Verify installation:
Common Installation Errors
Token Expired (401 Unauthorized)
Re-run the CodeArtifact login command to refresh your authentication token.
AWS CLI Not Configured
Ensure the AWS CLI is installed (aws --version) and configure AWS SSO for the production-developer profile.
Package Not Found
Verify your AWS credentials are valid: aws sts get-caller-identity
Permission Denied
Use a virtual environment instead of sudo: python -m venv venv && source venv/bin/activate
Runtime Issues
Connection Problems
Symptoms: Cannot connect to Dex service
Test connection:
- Check network connectivity
- Verify credentials (see Authentication Errors below)
- Confirm Dex service is available
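The network connectivity check above can be done without the SDK at all. The sketch below only tests whether a TCP connection to a host and port can be opened; the host name is a placeholder you must supply, since the Dex service address is not given in this guide, and the default port 443 is an assumption.

```python
import socket


def can_reach(host: str, port: int = 443, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port opens within the timeout."""
    try:
        # create_connection resolves the host name and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections, and timeouts.
        return False
```

A False result points at DNS, VPN, or firewall problems rather than at credentials. A True result only shows that the port is open, not that the service behind it is healthy.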
Authentication Errors
Symptoms: Invalid credentials, permission denied
Verify credentials:
- Set environment variables: SGP_ACCOUNT_ID and SGP_API_KEY
- Verify your account has SGP access
- Check API key hasn’t expired or been revoked
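A quick way to rule out the first item in the list above is to check that both variables are actually visible to the Python process. This is a minimal sketch using only the standard library. It confirms the variables are set and non-blank, not that the values are valid.

```python
import os

# Variable names taken from this guide.
REQUIRED_VARS = ("SGP_ACCOUNT_ID", "SGP_API_KEY")


def missing_credentials(env=None):
    """Return the names of required variables that are unset or blank."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not (env.get(name) or "").strip()]
```

Calling missing_credentials() with no arguments inspects the current environment; an empty list means both variables are present.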
File Upload Errors
Symptoms: Upload fails, file too large, unsupported format
Supported formats:
- Images: PNG, JPEG, TIFF, HEIC, and more
- PDFs: PDF
- Spreadsheets: CSV, XLSX, XLS
- Documents: PPTX, DOCX, TXT, RTF
- Verify file path is correct
- Check file size is under 100MB
- Ensure file format is supported
- Verify file isn’t corrupted
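Most of the checks above can be run locally before an upload is attempted. The sketch below is not part of the Dex SDK. The extension set is derived from the formats listed in this section and is not exhaustive (the guide says "and more" for images), and reading 100MB as 100 MiB is an assumption.

```python
from pathlib import Path

# Assumption: the 100MB limit is interpreted as 100 MiB.
MAX_BYTES = 100 * 1024 * 1024

# Derived from the formats listed in this guide; the real list may be longer.
SUPPORTED_EXTENSIONS = {
    ".png", ".jpg", ".jpeg", ".tif", ".tiff", ".heic",
    ".pdf", ".csv", ".xlsx", ".xls",
    ".pptx", ".docx", ".txt", ".rtf",
}


def preflight(path):
    """Return a list of problems found with the file; empty means none found."""
    p = Path(path)
    if not p.is_file():
        return [f"file not found: {p}"]
    problems = []
    if p.suffix.lower() not in SUPPORTED_EXTENSIONS:
        problems.append(f"extension {p.suffix!r} is not in the known supported set")
    size = p.stat().st_size
    if size == 0:
        problems.append("file is empty")
    elif size >= MAX_BYTES:
        problems.append(f"file is {size} bytes, limit is {MAX_BYTES}")
    return problems
```

This does not detect corruption. Opening the file in its native viewer remains the simplest test for that.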
Parsing Errors
Symptoms: OCR fails, document can’t be processed
Common causes:
- Low quality scans (use at least 300 DPI)
- Password-protected documents
- Extremely complex layouts
- Very large documents timing out
- Check document quality and resolution
- Remove password protection before uploading
- Try a different OCR engine:
- Split large documents into smaller chunks
- See complex layouts guide: Industry Document Types and Layout Challenges
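For the suggestion above to split large documents, the first step is deciding which pages go into which chunk. The helper below is a hypothetical sketch that only computes 1-based, inclusive page ranges. Writing the chunks out is left to whichever PDF tool you already use.

```python
def page_ranges(total_pages: int, chunk_size: int):
    """Return (first_page, last_page) tuples covering pages 1..total_pages."""
    if total_pages < 0 or chunk_size < 1:
        raise ValueError("total_pages must be >= 0 and chunk_size must be >= 1")
    return [
        (start, min(start + chunk_size - 1, total_pages))
        for start in range(1, total_pages + 1, chunk_size)
    ]
```

For example, a 10-page document with a chunk size of 4 yields the ranges (1, 4), (5, 8), and (9, 10).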
Extraction Errors
Symptoms: Extraction fails, schema errors, timeout
Common causes:
- Invalid Pydantic schema
- Model timeout or rate limiting
- Insufficient context for extraction
- Schema-data mismatch
- Validate your schema:
- Use clear prompts with detailed instructions
- Enable debugging:
- Use vector stores for large documents:
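For the schema validation step in the list above, one option is to check the schema locally against a hand-written sample of the data you expect, before any extraction call is made. The sketch below uses only Pydantic itself. The Invoice model and its fields are hypothetical examples, not part of the Dex SDK.

```python
from typing import Optional

from pydantic import BaseModel, ValidationError


class Invoice(BaseModel):
    """Hypothetical example schema for illustration only."""
    invoice_number: str
    total: float
    currency: Optional[str] = None


def check_sample(model_cls, sample: dict):
    """Return 'field: message' strings; an empty list means the sample fits."""
    try:
        model_cls(**sample)
    except ValidationError as exc:
        return [
            f"{'.'.join(str(part) for part in err['loc'])}: {err['msg']}"
            for err in exc.errors()
        ]
    return []
```

If a realistic sample fails here, the problem is a schema-data mismatch rather than the model or the service.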
Performance Issues
Slow Parsing
Causes: Large documents, complex layouts, high OCR load
Solutions:
- Process documents asynchronously in batches
- Cache parse results for frequently accessed documents
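The caching suggestion above can be sketched as a small in-memory cache keyed by file content, so that a renamed or copied file still hits the cache. The parse function is passed in by the caller. No Dex SDK call is assumed here, and ParseCache is a hypothetical helper.

```python
import hashlib
from pathlib import Path


class ParseCache:
    """In-memory cache of parse results keyed by the SHA-256 of the file bytes."""

    def __init__(self, parse_fn):
        # parse_fn is any callable taking a path and returning a parse result.
        self._parse_fn = parse_fn
        self._results = {}

    def get(self, path):
        key = hashlib.sha256(Path(path).read_bytes()).hexdigest()
        if key not in self._results:
            # Only call the (slow) parser the first time this content is seen.
            self._results[key] = self._parse_fn(path)
        return self._results[key]
```

The cache lives only as long as the process. For reuse across runs, the same key could index results stored on disk.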
Slow Extraction
Causes: Large context, complex schema, model performance
Solutions:
- Use vector stores to reduce context size
- Simplify extraction schema
- Choose faster models for time-sensitive applications
- Use batch processing
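Batch processing, as suggested above, usually means running several operations concurrently while capping how many are in flight, so the service is not flooded. The sketch below assumes your per-item operation is available as an async callable. run_in_batches and its parameters are hypothetical names, not SDK functions.

```python
import asyncio


async def run_in_batches(items, worker, max_concurrency: int = 4):
    """Run worker(item) for every item, at most max_concurrency at a time.

    Results come back in input order; a failed item yields its exception
    object instead of aborting the whole batch.
    """
    semaphore = asyncio.Semaphore(max_concurrency)

    async def guarded(item):
        async with semaphore:
            return await worker(item)

    return await asyncio.gather(
        *(guarded(item) for item in items), return_exceptions=True
    )
```

Returning exceptions in place of results lets one bad document fail without discarding the work done on the others.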
Slow Vector Store Search
Causes: Large document collections, complex queries
Solutions:
- Use appropriate top_k values
- Add filters to narrow search scope
- Use vector_store.search_in_file() for file-specific searches
- Create separate vector stores for different categories
Error Handling
Exception Types
Always wrap Dex operations in try-except blocks:
Common Exceptions
| Exception | Cause | Solution |
|---|---|---|
| AuthenticationError | Invalid credentials | Verify SGP_ACCOUNT_ID and SGP_API_KEY |
| FileUploadError | File format/size issues | Check format and size limits |
| ParsingError | OCR failure | Try different OCR engine |
| ExtractionError | Schema or model error | Validate schema |
| ConnectionError | Network issues | Check connectivity |
| RateLimitError | Too many requests | Implement backoff/retry |
| PermissionError | Insufficient access | Check permissions |
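The backoff/retry advice in the table can be sketched as exponential backoff with jitter. The import path of the SDK's exceptions is not given in this guide, so the example defines a local stand-in RateLimitError class. Replace it with the real exception when adapting the code. with_backoff and its parameters are hypothetical names.

```python
import random
import time


class RateLimitError(Exception):
    """Local stand-in for illustration; substitute the SDK's own exception."""


def with_backoff(operation, retry_on=(RateLimitError,), max_attempts=5,
                 base_delay=1.0, max_delay=30.0, sleep=time.sleep):
    """Call operation(); on a retryable error wait and retry, doubling the delay."""
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except retry_on:
            if attempt == max_attempts:
                raise  # Out of attempts: surface the last error to the caller.
            delay = min(max_delay, base_delay * 2 ** (attempt - 1))
            # Jitter spreads out retries from clients that failed together.
            sleep(delay + random.uniform(0, delay / 2))
```

Only retry errors that are transient, such as rate limits or timeouts. Retrying an authentication or schema error repeats the same failure more slowly.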
Debugging
Enable Logging
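A minimal sketch using Python's standard logging module follows. It assumes the SDK emits records through the standard library under a logger named after the package (dex_sdk, the name used with pip show later in this guide). If the SDK uses a different logger name, pass that name instead.

```python
import logging


def enable_debug_logging(logger_name: str = "dex_sdk"):
    """Send log records to stderr and raise one logger's verbosity to DEBUG."""
    # Root stays at INFO so unrelated libraries remain reasonably quiet.
    logging.basicConfig(
        level=logging.INFO,
        format="%(asctime)s %(levelname)s %(name)s: %(message)s",
    )
    logger = logging.getLogger(logger_name)
    logger.setLevel(logging.DEBUG)
    return logger
```

Call it once at program start, before the first SDK operation, so that early connection and authentication messages are captured.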
Inspect Parse Results
Validate Extraction Results
Test Components Individually
Getting Help
Support Channels
- Slack: #dex-help for Dex team support
- Documentation: Dex Documentation
When Reporting Issues
Include:
- Error message - Full error text and stack trace
- Code snippet - Minimal reproducible example
- Document type - File format and characteristics
- SDK version - Output of pip show dex_sdk
- Environment - Python version, OS
- Expected vs actual behavior
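The SDK version and environment details in the list above can be collected in one step with the standard library. This sketch assumes the installed distribution is named dex_sdk, as in the pip show command above. environment_report is a hypothetical helper.

```python
import platform
from importlib import metadata


def environment_report(package: str = "dex_sdk") -> dict:
    """Collect the installed package version, Python version, and OS string."""
    try:
        sdk_version = metadata.version(package)
    except metadata.PackageNotFoundError:
        sdk_version = "not installed"
    return {
        "sdk_version": sdk_version,
        "python": platform.python_version(),
        "os": platform.platform(),
    }
```

Pasting the resulting dictionary into a report alongside the stack trace covers the version and environment items.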
Additional Resources
- Getting Started Guide - Quick start tutorial
- File Management - Upload and pagination
- Parse - Parse engines and async jobs
- Vector Stores - Semantic search and RAG
- Extract - Data extraction and batch processing
- Best Practices - Common patterns and optimization
- API Reference - Complete SDK documentation
- Introduction to Dex - Core concepts

