Installation Issues
Installing the Dex SDK
The Dex SDK is distributed via Scale’s private CodeArtifact repository and requires AWS authentication.
Prerequisites:
- AWS CLI installed and configured
- Access to Scale’s production AWS account
- VPN connection to Scale’s internal network
The CodeArtifact login command:
- Authenticates pip with CodeArtifact
- Configures your pip to use Scale’s private repository
- Sets up a temporary token (typically valid for 12 hours)
Common Installation Problems
1. Token Expired Error
If you see authentication errors like “401 Unauthorized” when installing packages, your CodeArtifact token has likely expired.
- Ensure AWS CLI is installed: aws --version
- Configure AWS SSO for the production-developer profile
- Contact IT support for AWS access if you don’t have it
- Verify you’re on the Scale VPN
- Re-run the CodeArtifact login command
- Check that you have access to Scale’s CodeArtifact repository
- Verify your AWS credentials are valid: aws sts get-caller-identity
- Don’t use sudo with pip; this can cause issues with CodeArtifact authentication
- Use a virtual environment: python -m venv venv && source venv/bin/activate
- Check that your AWS user has the necessary CodeArtifact permissions
- Ensure you’re connected to the Scale VPN
- Update your CA certificates
- Check your network proxy settings
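The two AWS CLI checks in the list above (aws --version and aws sts get-caller-identity) can be run together from Python. This is a minimal sketch using only the standard library; the helper name `command_succeeds` and the `AWS_CHECKS` mapping are illustrative and not part of the Dex SDK.

```python
import shutil
import subprocess

# The two AWS CLI commands mentioned in the checklist above.
AWS_CHECKS = {
    "AWS CLI installed": ["aws", "--version"],
    "AWS credentials valid": ["aws", "sts", "get-caller-identity"],
}


def command_succeeds(cmd, timeout=30):
    """Return True if the command exists and exits with status 0."""
    if shutil.which(cmd[0]) is None:
        return False  # executable is not on PATH
    try:
        result = subprocess.run(
            cmd, capture_output=True, text=True, timeout=timeout
        )
    except (OSError, subprocess.TimeoutExpired):
        return False
    return result.returncode == 0


def run_checks(checks=AWS_CHECKS):
    """Run every check and map its label to a pass/fail boolean."""
    return {label: command_succeeds(cmd) for label, cmd in checks.items()}
```

Calling `run_checks()` returns a dictionary of labels to booleans; a failed credentials check usually means the SSO session or CodeArtifact token needs to be refreshed.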
Verifying Installation
After installation, verify the SDK is installed correctly, for example by checking the output of pip show dex_sdk.
Common Issues
1. VPN Connection Problems
If you cannot connect to Dex, ensure you’re connected to Scale’s internal network via the all-traffic VPN (not eng-split-prod). Test your connection, then work through these checks:
- Verify you’re connected to the correct VPN (all-traffic)
- Check your network connectivity
- Ensure the Dex service is available and not under maintenance
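A basic connectivity test can be sketched with the standard library as below. The hostname in the usage note is a placeholder, not the real Dex endpoint; substitute the host and port your team uses.

```python
import socket


def can_connect(host, port=443, timeout=5.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections, and timeouts.
        return False
```

For example, `can_connect("dex.example.internal")` (hypothetical hostname) returning False while general internet access works suggests a VPN or routing problem rather than an SDK problem.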
2. Authentication Errors
Authentication issues are usually caused by missing or incorrect credentials. Verify your credentials first.
Common causes:
- Missing or incorrect SGP_ACCOUNT_ID and SGP_API_KEY environment variables
- Insufficient permissions on your Scale account
- Account doesn’t have SGP access
- API key has expired or been revoked
Solutions:
- Set your SGP_ACCOUNT_ID and SGP_API_KEY environment variables
- Verify your account has SGP access
- Ensure your API key is still valid
- Check that you have the necessary permissions for the operations you’re trying to perform
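The environment-variable check can be sketched as follows. Only the variable names SGP_ACCOUNT_ID and SGP_API_KEY come from this guide; the helper `missing_credentials` is illustrative, and it only confirms the variables are set, not that the key is valid.

```python
import os

REQUIRED_VARS = ("SGP_ACCOUNT_ID", "SGP_API_KEY")


def missing_credentials(env=None):
    """Return the required variable names that are unset or blank."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name, "").strip()]


missing = missing_credentials()
if missing:
    print("Missing environment variables:", ", ".join(missing))
```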
3. File Upload Issues
File upload failures can occur due to format, size, or accessibility issues. Check the file before uploading.
Supported file formats:
- Images: PNG, JPEG/JPG, GIF, BMP, TIFF, PCX, PPM, APNG, PSD, CUR, DCX, FTEX, PIXAR, HEIC
- PDFs: PDF (Portable Document Format)
- Spreadsheets: CSV, XLSX, XLSM, XLS, XLTX, XLTM, QPW
- Documents: PPTX, PPT, DOCX, DOC, DOTX, WPD, TXT, RTF
Common causes:
- File doesn’t exist at the specified path
- File exceeds the size limit (100MB)
- File format is not supported
- File is corrupted or malformed
- Insufficient storage quota in your project
Solutions:
- Verify the file path is correct
- Check that the file size is under 100MB
- Ensure the file format is in the supported list
- Try opening the file locally to verify it’s not corrupted
- Contact support if you need a higher size limit
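A pre-upload check covering path, size, and format could look like the sketch below. It assumes that file extensions match the format names listed above and that the 100MB limit means 100 * 1024 * 1024 bytes; neither assumption is confirmed by this guide, and the function name `upload_problems` is illustrative.

```python
from pathlib import Path

MAX_BYTES = 100 * 1024 * 1024  # assumed interpretation of the 100MB limit

# Assumption: extensions mirror the format names in the supported list.
SUPPORTED_EXTENSIONS = {
    "png", "jpeg", "jpg", "gif", "bmp", "tiff", "pcx", "ppm", "apng",
    "psd", "cur", "dcx", "ftex", "pixar", "heic",
    "pdf",
    "csv", "xlsx", "xlsm", "xls", "xltx", "xltm", "qpw",
    "pptx", "ppt", "docx", "doc", "dotx", "wpd", "txt", "rtf",
}


def upload_problems(path, max_bytes=MAX_BYTES):
    """Return a list of human-readable problems; empty means no issue found."""
    file_path = Path(path)
    if not file_path.is_file():
        return [f"File not found: {file_path}"]
    problems = []
    size = file_path.stat().st_size
    if size == 0:
        problems.append("File is empty")
    if size > max_bytes:
        problems.append(f"File is {size} bytes, over the {max_bytes}-byte limit")
    extension = file_path.suffix.lower().lstrip(".")
    if extension not in SUPPORTED_EXTENSIONS:
        problems.append(f"Unsupported extension: {extension or '(none)'}")
    return problems
```

An empty list does not guarantee a successful upload: corruption and storage quota can only be detected by opening the file or by the service itself.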
4. Parsing Errors
Parsing can fail due to unsupported content, OCR issues, or corrupted documents.
Common parsing issues:
- Document contains unsupported elements
- OCR engine cannot process the content (e.g., extremely low quality scans)
- Document is password-protected or encrypted
- Document structure is too complex
- Parsing timeout for very large documents
Solutions:
- Check document quality: Ensure scanned documents have sufficient resolution (at least 300 DPI recommended)
- Remove password protection: Unlock encrypted documents before uploading
- Try a different OCR engine: Switch between the Reducto and Scale OCR engines
- Split large documents: For very large documents, consider splitting them into smaller chunks
- Check document validity: Open the document in a native viewer to ensure it’s not corrupted
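For the splitting step, the page ranges can be planned before any file is touched. The sketch below only computes 1-based, inclusive page ranges; the actual splitting depends on whichever PDF tool you use, and the chunk size is an arbitrary example rather than a documented Dex limit.

```python
def page_ranges(total_pages, pages_per_chunk):
    """Split pages 1..total_pages into inclusive (start, end) ranges."""
    if total_pages < 0 or pages_per_chunk < 1:
        raise ValueError("total_pages must be >= 0 and pages_per_chunk >= 1")
    return [
        (start, min(start + pages_per_chunk - 1, total_pages))
        for start in range(1, total_pages + 1, pages_per_chunk)
    ]


print(page_ranges(10, 4))  # [(1, 4), (5, 8), (9, 10)]
```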
5. Extraction Errors
Extraction can fail due to schema issues, model errors, or invalid parameters.
Common extraction issues:
- Invalid extraction schema
- Model timeout or rate limiting
- Insufficient context for extraction
- Schema-data mismatch
- Model doesn’t support requested features
Solutions:
- Validate your schema
- Use clear prompts: Provide detailed, specific instructions in your user_prompt
- Enable debugging features
- Use vector stores for large documents: If extraction fails due to context length, use RAG-enhanced extraction
- Check model availability: Ensure the model you’re using is available and not deprecated
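A basic schema sanity check might look like the sketch below. It assumes the extraction schema is a JSON-Schema-style dictionary with a top-level object type; this guide does not state the schema format, so treat that as an assumption and adapt the checks to the format your SDK version expects. The function name `schema_problems` is illustrative.

```python
import json


def schema_problems(schema):
    """Return structural problems found in a JSON-Schema-style dict."""
    if not isinstance(schema, dict):
        return ["Schema must be a dict"]
    problems = []
    try:
        json.dumps(schema)  # the schema must survive JSON serialization
    except (TypeError, ValueError) as exc:
        problems.append(f"Schema is not JSON-serializable: {exc}")
    if schema.get("type") != "object":
        problems.append('Top-level "type" should be "object"')
    properties = schema.get("properties")
    if not isinstance(properties, dict) or not properties:
        problems.append('"properties" must be a non-empty dict')
        properties = {}
    for name in schema.get("required", []):
        if name not in properties:
            problems.append(f'Required field "{name}" is not in "properties"')
    return problems
```

Running this before an extraction call catches typos in required field names early, which is one common source of schema-data mismatch.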
Error Handling
The Dex SDK raises exceptions for various error conditions. Always wrap your code in try-except blocks for production use. Common exception types:
| Exception Type | Cause | Solution |
|---|---|---|
| AuthenticationError | Invalid credentials | Verify SGP_ACCOUNT_ID and SGP_API_KEY |
| FileUploadError | File format/size issues | Check file format and size limits |
| ParsingError | OCR failure | Try different OCR engine or check document quality |
| ExtractionError | Schema or model error | Validate schema and check model availability |
| ConnectionError | Network/VPN issues | Verify VPN connection and network |
| RateLimitError | Too many requests | Implement backoff and retry logic |
| PermissionError | Insufficient access | Check account permissions |
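The backoff and retry logic suggested for RateLimitError can be sketched as below. The `RateLimitError` class here is a local stand-in, because this guide does not show the SDK’s import path; replace it with the SDK’s own exception class. The helper `call_with_backoff` is illustrative.

```python
import random
import time


class RateLimitError(Exception):
    """Local stand-in for the SDK exception of the same name."""


def call_with_backoff(func, *, retry_on=(RateLimitError,), max_attempts=5,
                      base_delay=1.0, sleep=time.sleep):
    """Call func(), retrying on the given exceptions with exponential backoff."""
    for attempt in range(1, max_attempts + 1):
        try:
            return func()
        except retry_on:
            if attempt == max_attempts:
                raise  # out of attempts: surface the original error
            delay = base_delay * 2 ** (attempt - 1)
            # Jitter spreads out retries from concurrent clients.
            sleep(delay + random.uniform(0, delay / 2))
```

Only errors listed in `retry_on` are retried; authentication or schema errors propagate immediately, since retrying them would not help.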
Performance Issues
Slow Parsing
Causes:
- Large document size
- Complex document structure
- High OCR processing load
Solutions:
- Process documents asynchronously in batches
- Consider caching parse results for frequently accessed documents
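Caching parse results can be sketched by keying the cache on a hash of the file contents, so a renamed copy of the same document still hits the cache. The `parse_fn` argument is a placeholder for whatever parse call you use; no Dex SDK function is assumed here, and the in-memory dictionary would need replacing with persistent storage for use across processes.

```python
import hashlib

_parse_cache = {}


def file_digest(path):
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def cached_parse(path, parse_fn, cache=_parse_cache):
    """Parse the file once per distinct content; reuse the result afterwards."""
    key = file_digest(path)
    if key not in cache:
        cache[key] = parse_fn(path)
    return cache[key]
```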
Slow Extraction
Causes:
- Large context size
- Complex extraction schema
- Model performance
Solutions:
- Use vector stores for large documents to reduce context size
- Simplify extraction schema if possible
- Choose faster models for time-sensitive applications
- Use batch processing for multiple extractions
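Batch processing can be sketched with a thread pool, which suits network-bound calls. The `worker` argument stands in for your own extraction call; the concurrency level is an arbitrary example, and any rate limits enforced by the service still apply.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def run_batch(items, worker, max_workers=4):
    """Run worker(item) concurrently; return (results, errors) keyed by item."""
    results, errors = {}, {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(worker, item): item for item in items}
        for future in as_completed(futures):
            item = futures[future]
            try:
                results[item] = future.result()
            except Exception as exc:  # keep going if one item fails
                errors[item] = exc
    return results, errors
```

Items must be hashable (for example file paths or IDs) because they are used as dictionary keys. Collecting errors separately lets one failed document be retried without repeating the whole batch.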
Vector Store Search Performance
Causes:
- Large number of indexed documents
- Complex search queries
- Embedding model latency
Solutions:
- Use appropriate top_k values (avoid retrieving too many results)
- Add filters to narrow down search scope
- Use file-specific search when possible: vector_store.search_in_file()
- Consider creating separate vector stores for different document categories
Debugging Tips
Enable Verbose Logging
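A minimal sketch using Python’s standard logging module is shown below. It assumes the SDK emits logs through the standard logging module under a logger named dex_sdk; that logger name is a guess based on the package name and is not confirmed by this guide.

```python
import logging


def enable_verbose_logging(logger_name="dex_sdk"):
    """Turn on DEBUG output for the named logger (name is an assumption)."""
    # basicConfig does nothing if the root logger already has handlers.
    logging.basicConfig(
        level=logging.DEBUG,
        format="%(asctime)s %(levelname)s %(name)s: %(message)s",
    )
    logger = logging.getLogger(logger_name)
    logger.setLevel(logging.DEBUG)
    return logger
```

If no extra output appears, the SDK may use a different logger name; setting the root logger to DEBUG shows output from every library that logs through the standard module.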
Inspect Parse Results
Validate Extraction Results
Test Components Individually
Getting Help
If you continue to experience issues after trying these troubleshooting steps, use the channels below.
Internal Support Channels
- Slack: Contact the Dex team at #sgp-document-understanding-capability
- Documentation: Review the full documentation at Dex Documentation
When Reporting Issues
Please include:
- Error message: Full error text and stack trace
- Code snippet: Minimal reproducible example
- Document type: File format and characteristics (if relevant)
- SDK version: Output of pip show dex_sdk
- Environment: Python version, OS, VPN status
- Expected vs actual behavior: What you expected to happen vs what actually happened
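The SDK version, Python version, and OS items in the list above can be gathered in one step. This sketch uses only the standard library and the package name dex_sdk from the pip show command; the function name `environment_report` is illustrative, and VPN status still has to be added by hand.

```python
import platform
from importlib import metadata


def environment_report(package="dex_sdk"):
    """Collect the version details requested in an issue report."""
    try:
        sdk_version = metadata.version(package)
    except metadata.PackageNotFoundError:
        sdk_version = "not installed"
    return {
        "sdk_version": sdk_version,
        "python_version": platform.python_version(),
        "os": f"{platform.system()} {platform.release()}",
    }


for key, value in environment_report().items():
    print(f"{key}: {value}")
```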
Useful Resources
- Getting Started Guide: Learn Dex basics with examples
- API Reference: Complete SDK documentation
- Introduction to Dex: Core concepts and architecture
- SGP Platform Docs: Scale General Platform documentation

