Manage files, projects, and collections in Dex. The same pagination and filtering patterns apply to jobs, parse results, extractions, and vector stores.

Create Project

import os
from dex_sdk import DexClient

dex_client = DexClient(
    base_url="https://dex.sgp.scale.com",
    api_key=os.getenv("SGP_API_KEY"),
    account_id=os.getenv("SGP_ACCOUNT_ID"),
)

project = await dex_client.create_project(name="My Project")

Create Project with Retention

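from datetime import timedelta
from dex_sdk.types import ProjectConfiguration, RetentionPolicy

# Retention is set per artifact type: 30 days for files, 7 days for result artifacts
project = await dex_client.create_project(
    name="My Project",
    configuration=ProjectConfiguration(
        retention=RetentionPolicy(
            files=timedelta(days=30),
            result_artifacts=timedelta(days=7),
        )
    )
)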

Upload File

dex_file = await project.upload_file("document.pdf")

Process Multiple Files

import asyncio

files_to_upload = ["doc1.pdf", "doc2.pdf", "doc3.pdf"]
dex_files = await asyncio.gather(*[project.upload_file(f) for f in files_to_upload])
print(f"Uploaded {len(dex_files)} files")
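
Launching every upload at once with asyncio.gather works for a handful of files, but a large batch may be easier on the network if the number of in-flight uploads is capped. The following is a minimal sketch of that pattern using asyncio.Semaphore. The helper name upload_all, the max_concurrent parameter, and the fake_upload stand-in are illustrative assumptions and not part of the Dex SDK; with the SDK, project.upload_file would be passed in place of fake_upload.

```python
import asyncio


async def upload_all(upload, paths, max_concurrent=4):
    """Run `upload(path)` for every path with at most `max_concurrent` in flight."""
    semaphore = asyncio.Semaphore(max_concurrent)

    async def upload_one(path):
        # Each task waits here until one of the `max_concurrent` slots is free.
        async with semaphore:
            return await upload(path)

    # gather preserves input order, so results[i] corresponds to paths[i].
    return await asyncio.gather(*(upload_one(p) for p in paths))


# Hypothetical stand-in for project.upload_file so the sketch runs without the SDK.
async def fake_upload(path):
    await asyncio.sleep(0.01)
    return f"uploaded:{path}"


results = asyncio.run(
    upload_all(fake_upload, ["doc1.pdf", "doc2.pdf", "doc3.pdf"], max_concurrent=2)
)
print(results)
```

Because gather returns results in input order, each result can still be matched to its source path even though uploads finish at different times.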

Pagination and Filtering

New in v0.4.0: Efficiently manage large collections with pagination and filtering.

Listing Files with Pagination

from dex_sdk.types import PaginationParams

pagination_params = PaginationParams(
    page_size=50,
    sort_by="created_at",
    sort_order="desc",
    continuation_token=None
)

result = await project.list_files(pagination_params=pagination_params)

# Access items
for file in result.items:
    print(f"File: {file.filename} ({file.created_at})")

# Paginate through remaining results
while result.next_token:
    pagination_params.continuation_token = result.next_token
    result = await project.list_files(pagination_params=pagination_params)
    # Process result.items
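
The loop above handles the first page and the remaining pages in two separate places. One way to avoid that split is to wrap the token handling in an async generator so callers see a single stream of items. The sketch below shows the idea against a fake in-memory listing. The Page class, iter_items, and fake_list_page are hypothetical names introduced for illustration; only the items and next_token fields mirror what the example above uses.

```python
import asyncio
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Page:
    # Hypothetical stand-in mirroring the `items` / `next_token` fields used above.
    items: list = field(default_factory=list)
    next_token: Optional[str] = None


async def iter_items(list_page):
    """Yield every item from a token-paginated listing, fetching pages lazily."""
    token = None  # None requests the first page
    while True:
        page = await list_page(token)
        for item in page.items:
            yield item
        token = page.next_token
        if not token:
            break


# Fake three-page listing keyed by continuation token.
_PAGES = {
    None: Page(["a.pdf", "b.pdf"], "t1"),
    "t1": Page(["c.pdf"], "t2"),
    "t2": Page(["d.pdf"], None),
}


async def fake_list_page(token):
    return _PAGES[token]


async def collect():
    return [item async for item in iter_items(fake_list_page)]


all_items = asyncio.run(collect())
print(all_items)  # ['a.pdf', 'b.pdf', 'c.pdf', 'd.pdf']
```

With the SDK, list_page would presumably be a small coroutine that sets pagination_params.continuation_token to the given token and returns the result of project.list_files; that wiring is an assumption based on the example above.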

Filtering by Creation Time

from datetime import datetime, timedelta
from dex_sdk.types import FileListFilter

# Filter files created in the last 24 hours
file_filter = FileListFilter(
    created_at_start=datetime.now() - timedelta(days=1)
)
recent_files = await project.list_files(filter=file_filter)

# Combine pagination and filtering for files
pagination_params = PaginationParams(page_size=20, sort_by="created_at", sort_order="desc")
file_filter = FileListFilter(created_at_start=datetime.now() - timedelta(days=30))
recent_files = await project.list_files(
    pagination_params=pagination_params,
    filter=file_filter
)
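
Note that datetime.now() returns a naive local time, and this page does not say how Dex interprets a naive value. A timezone-aware UTC cutoff removes that ambiguity. The helper below is a minimal sketch using only the standard library; utc_cutoff is a hypothetical name, and whether FileListFilter accepts timezone-aware datetimes is an assumption to verify against the SDK.

```python
from datetime import datetime, timedelta, timezone


def utc_cutoff(days, now=None):
    """Return a timezone-aware UTC datetime `days` before `now` (default: current time)."""
    now = now or datetime.now(timezone.utc)
    return now - timedelta(days=days)


# A fixed reference time makes the result reproducible.
fixed_now = datetime(2024, 6, 15, 12, 0, tzinfo=timezone.utc)
cutoff = utc_cutoff(30, now=fixed_now)
print(cutoff.isoformat())  # 2024-05-16T12:00:00+00:00
```

The resulting value could then be passed as created_at_start in place of datetime.now() - timedelta(days=30).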

Available List Operations

All entity types support unified list_* operations with pagination and filtering:
  • list_files(pagination_params, filter) - List uploaded files
  • list_jobs(pagination_params, filter) - List all jobs
  • list_parse_results(pagination_params, filter) - List parse results
  • list_extractions(pagination_params, filter) - List extractions
  • list_vector_stores(pagination_params, filter) - List vector stores

Supported File Types

Dex supports a wide variety of document formats:

Images

PNG, JPEG/JPG, GIF, BMP, TIFF, PCX, PPM, APNG, PSD, CUR, DCX, FTEX, PIXAR, HEIC

Documents

PDF, DOCX, DOC, DOTX, WPD, TXT, RTF, PPTX, PPT

Spreadsheets

CSV, XLSX, XLSM, XLS, XLTX, XLTM, QPW

For best results with spreadsheets, use the XLSX format. CSV files are processed as-is, without layout analysis.
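
A quick client-side check before uploading can catch unsupported files early. The sketch below builds a lookup from the format lists above and classifies a path by its extension. The classify function and SUPPORTED_EXTENSIONS table are illustrative, not part of the SDK, and treating each format name as a literal file extension is an assumption (some names, such as PIXAR or FTEX, may not correspond one-to-one to extensions).

```python
from pathlib import Path

# Format names taken from the lists above, lowercased and treated as extensions.
SUPPORTED_EXTENSIONS = {
    "images": {
        "png", "jpeg", "jpg", "gif", "bmp", "tiff", "pcx", "ppm",
        "apng", "psd", "cur", "dcx", "ftex", "pixar", "heic",
    },
    "documents": {"pdf", "docx", "doc", "dotx", "wpd", "txt", "rtf", "pptx", "ppt"},
    "spreadsheets": {"csv", "xlsx", "xlsm", "xls", "xltx", "xltm", "qpw"},
}


def classify(path):
    """Return the category for a path's extension, or None if it is not listed."""
    ext = Path(path).suffix.lower().lstrip(".")
    for category, extensions in SUPPORTED_EXTENSIONS.items():
        if ext in extensions:
            return category
    return None


print(classify("report.PDF"))   # documents
print(classify("budget.xlsx"))  # spreadsheets
print(classify("archive.zip"))  # None
```

The check is case-insensitive and looks only at the file name, so it does not confirm that the file contents actually match the extension.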

Access Response Data

SDK methods return wrapper objects; read fields through the wrapper's .data attribute:

# Project fields
project_id = project.data.id
project_name = project.data.name

# File fields
file_id = dex_file.data.id
filename = dex_file.data.filename
size_bytes = dex_file.data.size_bytes

Appendix: Essential Imports

from datetime import datetime, timedelta

from dex_sdk.types import (
    PaginationParams,
    FileListFilter,
    ProjectConfiguration,
    RetentionPolicy,
)
