Indexing

An index organizes your project’s files into a searchable tree. Leaf nodes (level 0) represent individual files, and higher-level nodes (level 1+) hold generated summaries that group related files, much like folders. You can run fast lexical (BM25) search over the files, or navigate the hierarchy level by level.

Indexing is a newly released Dex capability. The API may change in future SDK versions.

The index lifecycle is: create → add/remove files (async) → search/navigate (sync) → delete.

Index vs. Vector Store

Indexes and vector stores are both retrieval primitives, but they serve different access patterns:

Aspect	Index	Vector Store
Search	Lexical BM25 over file summaries	Semantic, lexical, or hybrid over chunks
Structure	Hierarchical tree you can navigate	Flat collection of chunks
Best for	Finding and browsing relevant files in a corpus; agentic navigation	Retrieving passages for RAG-enhanced extraction

Both can be combined as inputs to a research job.

Creating an Index

Indexing currently supports only the file-system index engine (FileSystemIndexCreateRequest). This engine builds the tree from the folder structure of your Dex project: folders are inferred from each file’s name, treating everything before the last / as the folder path. For example, a file named finance/2016/report.pdf becomes a leaf under a finance/2016 folder, with a parent finance folder above it. Folder nodes are summarized using the configured model and prompts.

from dex_sdk import FileSystemIndexCreateRequest

index = await project.create_index(
    FileSystemIndexCreateRequest(
        name="financial-reports",
        model="openai/gpt-5.2",  # used to generate folder/file summaries
        file_summarization_prompt="<prompt for summarizing each file>",
        folder_summarization_prompt="<prompt for summarizing each folder>",
    )
)

print("Status:", index.status)   # PENDING, READY, UPDATING, or FAILED
print("Nodes:", len(index.nodes))
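The folder inference described above can be illustrated with a small standalone sketch. It does not call the SDK: folder_chain is a hypothetical helper, and the handling of names without a / is an assumption, since the behavior for such files is not described here.

```python
def folder_chain(file_name: str) -> list[str]:
    """Return the folder paths implied by a file name, outermost first.

    Everything before the last "/" is treated as the folder path, and each
    prefix of that path is an ancestor folder.
    """
    folder, sep, _leaf = file_name.rpartition("/")
    if not sep:
        return []  # assumption: a name without "/" implies no folder
    parts = [part for part in folder.split("/") if part]
    return ["/".join(parts[: i + 1]) for i in range(len(parts))]


print(folder_chain("finance/2016/report.pdf"))  # ['finance', 'finance/2016']
print(folder_chain("notes.txt"))                # []
```

Naming files with a consistent prefix scheme before uploading them is therefore what determines the shape of the resulting tree.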

Adding Files

add_files starts an async job and blocks until it completes. Adding files is best-effort per file, so the result reports which files succeeded and which failed.

result = await index.add_files(
    file_ids=["file_1", "file_2", "file_3"]
)

print("Succeeded:", result.succeeded_file_ids)
for failed in result.failed_files:
    print(f"Failed {failed.file_id}: {failed.error}")

# Retry only the files that failed (skip the call when nothing failed)
failed_ids = [failed.file_id for failed in result.failed_files]
if failed_ids:
    result = await index.add_files(file_ids=failed_ids)
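Because adding is best-effort per file, a single retry may still leave failures behind. The sketch below wraps the retry in a bounded loop. It relies only on add_files, succeeded_file_ids, and failed_files as shown above; add_files_with_retries, max_attempts, and the FlakyIndex stand-in are hypothetical names introduced so the example runs without the SDK.

```python
import asyncio
from types import SimpleNamespace


async def add_files_with_retries(index, file_ids, max_attempts=3):
    """Call index.add_files until every file succeeds or attempts run out.

    Returns (succeeded_ids, still_failed_ids).
    """
    succeeded = []
    pending = list(file_ids)
    for _ in range(max_attempts):
        if not pending:
            break
        result = await index.add_files(file_ids=pending)
        succeeded.extend(result.succeeded_file_ids)
        pending = [failed.file_id for failed in result.failed_files]
    return succeeded, pending


class FlakyIndex:
    """Stand-in for a Dex index (not part of the SDK): "file_2" fails once."""

    def __init__(self):
        self.calls = 0

    async def add_files(self, file_ids):
        self.calls += 1
        fail = {"file_2"} if self.calls == 1 else set()
        return SimpleNamespace(
            succeeded_file_ids=[f for f in file_ids if f not in fail],
            failed_files=[
                SimpleNamespace(file_id=f, error="transient error")
                for f in file_ids
                if f in fail
            ],
        )


succeeded, still_failed = asyncio.run(
    add_files_with_retries(FlakyIndex(), ["file_1", "file_2", "file_3"])
)
print("Succeeded:", succeeded)
print("Still failing:", still_failed)
```

Files that keep failing after the last attempt are returned rather than raised, so the caller can log each error or skip those files.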

To start the job without blocking, use start_add_files_job, which returns immediately so you can poll the job yourself:

job = await index.start_add_files_job(file_ids=["file_1", "file_2"])
print("Job started:", job.id)
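How the job itself is polled is not shown here. One possible approach, sketched below, is to poll the index instead, using refresh() and the status values listed earlier. This assumes the index status leaves PENDING or UPDATING once the job finishes and that the status is either a plain string or an enum whose member names match those values. wait_until_settled, interval_s, timeout_s, and the SlowIndex stand-in are hypothetical.

```python
import asyncio


async def wait_until_settled(index, interval_s=2.0, timeout_s=300.0):
    """Refresh the index until its status is READY or FAILED, or time runs out."""
    loop = asyncio.get_running_loop()
    deadline = loop.time() + timeout_s
    while True:
        await index.refresh()
        # Accept either a plain string or an enum member for the status.
        status = getattr(index.status, "name", index.status)
        if status in ("READY", "FAILED"):
            return status
        if loop.time() >= deadline:
            raise TimeoutError(f"index still {status} after {timeout_s}s")
        await asyncio.sleep(interval_s)


class SlowIndex:
    """Stand-in for a Dex index (not part of the SDK): READY on the 3rd refresh."""

    def __init__(self):
        self.status = "UPDATING"
        self.refreshes = 0

    async def refresh(self):
        self.refreshes += 1
        if self.refreshes >= 3:
            self.status = "READY"


final_status = asyncio.run(wait_until_settled(SlowIndex(), interval_s=0.01))
print("Final status:", final_status)
```

The timeout keeps a stuck job from blocking the caller indefinitely; the interval and timeout values are placeholders to tune for your corpus size.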

Searching

search runs BM25 over the leaf nodes (files) and returns the most relevant matches.

response = await index.search(query="quarterly revenue", top_k=5)

for node in response.nodes:
    print(f"[level {node.level}] {node.id}: {node.text[:80]}")
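For intuition about what lexical BM25 ranking rewards, here is a minimal standalone sketch of the standard Okapi BM25 formula. It is not Dex's implementation: the whitespace tokenization, the k1 and b values (common defaults), and the sample texts are all assumptions chosen for illustration.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score each document in `docs` (non-empty list) against `query`."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(toks) for toks in tokenized) / n_docs
    # Document frequency: number of documents containing each term.
    df = Counter(term for toks in tokenized for term in set(toks))

    scores = []
    for toks in tokenized:
        tf = Counter(toks)
        score = 0.0
        for term in query.lower().split():
            if term not in tf:
                continue
            # Rare terms get a higher weight than common ones.
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # Term frequency saturates and is normalized by document length.
            norm = tf[term] + k1 * (1 - b + b * len(toks) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


summaries = [
    "quarterly revenue grew in the third quarter",
    "employee handbook and vacation policy",
    "annual revenue summary for 2016",
]
scores = bm25_scores("quarterly revenue", summaries)
ranked = sorted(range(len(summaries)), key=lambda i: scores[i], reverse=True)
print(ranked)  # [0, 2, 1]
```

Because matching is on exact terms, a query for "quarterly" does not match "quarter". Queries that reuse the vocabulary of the indexed text tend to rank better than paraphrases.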

Restrict the search to a subtree by passing current_node_id:

response = await index.search(
    query="quarterly revenue",
    top_k=5,
    current_node_id="<node_id>",  # search only within this node's subtree
)

Navigating the Hierarchy

navigate ranks the children of a node so you can walk the tree from broad summaries down to individual files. Pass current_node_id=None to start from the root.

response = await index.navigate(query="finance", current_node_id=None, top_k=5)

for node in response.nodes:
    print(f"[level {node.level}] {node.id}: {node.text[:80]}")
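Walking the tree from the root down to a file can be sketched as a loop over navigate. The example uses only the navigate parameters and node fields shown above, and assumes results are ordered best-first. descend_to_file and the TreeIndex stand-in (with made-up node IDs and text) are hypothetical and exist so the sketch runs without the SDK.

```python
import asyncio
from types import SimpleNamespace


async def descend_to_file(index, query, top_k=5):
    """Follow the top-ranked child at each level until a leaf (level 0)."""
    path = []
    current_node_id = None  # None starts at the root
    while True:
        response = await index.navigate(
            query=query, current_node_id=current_node_id, top_k=top_k
        )
        if not response.nodes:
            return path  # nothing below this node
        best = response.nodes[0]  # assumes results are ordered best-first
        path.append(best)
        if best.level == 0:
            return path
        current_node_id = best.id


class TreeIndex:
    """Stand-in for a Dex index (not part of the SDK) with a fixed tree."""

    children = {
        None: [SimpleNamespace(id="finance", level=2, text="Finance documents")],
        "finance": [SimpleNamespace(id="finance/2016", level=1, text="2016 reports")],
        "finance/2016": [SimpleNamespace(id="file_1", level=0, text="Annual report")],
    }

    async def navigate(self, query, current_node_id, top_k):
        return SimpleNamespace(nodes=self.children.get(current_node_id, [])[:top_k])


path = asyncio.run(descend_to_file(TreeIndex(), "finance"))
print(" -> ".join(node.id for node in path))  # finance -> finance/2016 -> file_1
```

Always taking the top-ranked child is a greedy strategy. An agent that inspects several children per level before choosing may find relevant files that this walk misses.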

Removing Files and Deleting

Removing files is all-or-nothing: on success, every requested file is removed from the index.

# Remove files (blocks until the job finishes)
await index.remove_files(file_ids=["file_1", "file_2"])

# Re-fetch the latest index state from the API
await index.refresh()

# Permanently delete the index and its storage
await index.delete()

Listing and Retrieving Indexes

# List all indexes in the project
indexes = await project.list_indexes()

# Fetch a single index by ID (with hydrated nodes)
index = await project.get_index_by_id("index_123")

Appendix: Essential Imports

from dex_sdk import DexClient, FileSystemIndexCreateRequest

Next Steps

Research: Run agentic research jobs that use indexes as input
Vector Stores: Semantic and hybrid search for RAG-enhanced extraction
Extract: Extract structured data from parse results or vector stores
