An index organizes your project’s files into a searchable tree. Leaf nodes (level 0) represent individual files, and higher-level nodes (level 1+) hold generated summaries that group related files, much like folders. You can run fast lexical (BM25) search over the files, or navigate the hierarchy level by level.
Indexing is a newly released Dex capability. The API may change in future SDK versions.
The index lifecycle is: create → add/remove files (async) → search/navigate (sync) → delete.
Index vs. Vector Store
Indexes and vector stores are both retrieval primitives, but they serve different access patterns:
| | Index | Vector Store |
|---|---|---|
| Search | Lexical BM25 over file summaries | Semantic, lexical, or hybrid over chunks |
| Structure | Hierarchical tree you can navigate | Flat collection of chunks |
| Best for | Finding and browsing relevant files in a corpus; agentic navigation | Retrieving passages for RAG-enhanced extraction |
Both can be combined as inputs to a research job.
Creating an Index
Indexing currently supports only the file-system index engine (FileSystemIndexCreateRequest). This engine builds the tree from the folder structure of your Dex project: folders are inferred from each file’s name, treating everything before the last / as the folder path. For example, a file named finance/2016/report.pdf becomes a leaf under a finance/2016 folder, with a parent finance folder above it. Folder nodes are summarized using the configured model and prompts.
from dex_sdk import FileSystemIndexCreateRequest
index = await project.create_index(
    FileSystemIndexCreateRequest(
        name="financial-reports",
        model="openai/gpt-5.2",  # used to generate folder/file summaries
        file_summarization_prompt="<prompt for summarizing each file>",
        folder_summarization_prompt="<prompt for summarizing each folder>",
    )
)
print("Status:", index.status) # PENDING, READY, UPDATING, or FAILED
print("Nodes:", len(index.nodes))
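The folder-inference rule described above can be illustrated with a few lines of plain Python. This is only a sketch of the stated rule (everything before the last / is the folder path), not the engine's actual implementation. The helper name folder_chain is hypothetical, and returning an empty list for a file name with no / is an assumption.

```python
def folder_chain(file_name: str) -> list[str]:
    """Return the folder paths implied by a file name, outermost first."""
    # Everything before the last "/" is the folder path.
    folder_path, sep, _leaf = file_name.rpartition("/")
    if not sep:
        # Assumption: a name with no "/" implies no folder at all.
        return []
    parts = folder_path.split("/")
    # Build each ancestor prefix: "finance", then "finance/2016", and so on.
    return ["/".join(parts[: i + 1]) for i in range(len(parts))]


print(folder_chain("finance/2016/report.pdf"))  # ['finance', 'finance/2016']
print(folder_chain("notes.txt"))                # []
```

Under this reading, naming files with consistent path prefixes before uploading is what determines the shape of the resulting tree.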
Adding Files
add_files starts an asynchronous indexing job and waits for it to finish before returning. Adding files is best-effort per file, so the result reports which files succeeded and which failed.
result = await index.add_files(
    file_ids=["file_1", "file_2", "file_3"]
)
print("Succeeded:", result.succeeded_file_ids)
for failed in result.failed_files:
    print(f"Failed {failed.file_id}: {failed.error}")
# Retry only the files that failed, if any
if result.failed_files:
    result = await index.add_files(
        file_ids=[failed.file_id for failed in result.failed_files]
    )
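Because a single retry may still leave some files unindexed, the retry step above could be wrapped in a bounded loop. The sketch below relies only on the add_files result fields shown in this section (succeeded_file_ids, failed_files, file_id). The helper add_files_with_retries and the FlakyIndex stand-in are hypothetical and exist only so the example runs without a real index.

```python
import asyncio
from types import SimpleNamespace


async def add_files_with_retries(index, file_ids, max_attempts=3):
    """Call index.add_files until every file succeeds or attempts run out.

    Returns (succeeded_ids, still_failed_ids).
    """
    succeeded = []
    pending = list(file_ids)
    for _ in range(max_attempts):
        if not pending:
            break
        result = await index.add_files(file_ids=pending)
        succeeded.extend(result.succeeded_file_ids)
        # Only the files that failed this round are retried next round.
        pending = [failed.file_id for failed in result.failed_files]
    return succeeded, pending


class FlakyIndex:
    """Hypothetical stand-in for an index: "file_2" fails on the first call only."""

    def __init__(self):
        self.calls = 0

    async def add_files(self, file_ids):
        self.calls += 1
        failed = [
            SimpleNamespace(file_id=fid, error="transient")
            for fid in file_ids
            if fid == "file_2" and self.calls == 1
        ]
        failed_ids = {f.file_id for f in failed}
        ok = [fid for fid in file_ids if fid not in failed_ids]
        return SimpleNamespace(succeeded_file_ids=ok, failed_files=failed)


ok, still_failed = asyncio.run(
    add_files_with_retries(FlakyIndex(), ["file_1", "file_2", "file_3"])
)
print(ok, still_failed)
```

Capping the number of attempts keeps a permanently failing file from looping forever; whatever remains in the second return value can be logged or surfaced to the user.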
To start the job without blocking, use start_add_files_job, which returns immediately so you can poll the job yourself:
job = await index.start_add_files_job(file_ids=["file_1", "file_2"])
print("Job started:", job.id)
Searching
search runs BM25 over the leaf nodes (files) and returns the most relevant matches.
response = await index.search(query="quarterly revenue", top_k=5)
for node in response.nodes:
    print(f"[level {node.level}] {node.id}: {node.text[:80]}")
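To build intuition for how lexical BM25 ranking behaves, here is a small standalone scorer using the textbook Okapi BM25 formula. It is a sketch only: the whitespace tokenizer, the k1 and b values, and the smoothed IDF variant are assumptions, and the service's actual tokenization and parameters may differ. The file names and texts are made up for illustration.

```python
import math
from collections import Counter


def bm25_rank(query, docs, k1=1.5, b=0.75):
    """Rank docs (id -> text) against query with Okapi BM25; best first."""
    tokenized = {doc_id: text.lower().split() for doc_id, text in docs.items()}
    n_docs = len(tokenized)
    avg_len = sum(len(toks) for toks in tokenized.values()) / n_docs
    # Document frequency: in how many docs each term appears.
    df = Counter(term for toks in tokenized.values() for term in set(toks))

    scores = {}
    for doc_id, toks in tokenized.items():
        tf = Counter(toks)
        score = 0.0
        for term in query.lower().split():
            if tf[term] == 0:
                continue
            # Rare terms get a higher weight than common ones.
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # Length normalization: long docs need more hits to score as well.
            norm = tf[term] + k1 * (1 - b + b * len(toks) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores[doc_id] = score
    return sorted(scores, key=scores.get, reverse=True)


docs = {
    "q3.pdf": "quarterly revenue grew in the third quarter",
    "hr.pdf": "employee handbook and leave policy",
    "q4.pdf": "annual revenue summary",
}
print(bm25_rank("quarterly revenue", docs))  # ['q3.pdf', 'q4.pdf', 'hr.pdf']
```

Because BM25 matches exact terms, a query for "earnings" would not match text that only says "revenue"; that gap is where semantic search in a vector store can help.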
Restrict the search to a subtree by passing current_node_id:
response = await index.search(
    query="quarterly revenue",
    top_k=5,
    current_node_id="<node_id>",  # search only within this node's subtree
)
Navigating the Hierarchy
navigate ranks the children of a node so you can walk the tree from broad summaries down to individual files. Pass current_node_id=None to start from the root.
response = await index.navigate(query="finance", current_node_id=None, top_k=5)
for node in response.nodes:
    print(f"[level {node.level}] {node.id}: {node.text[:80]}")
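Walking from the root down to a single file can be sketched as a loop that repeatedly calls navigate and follows the best child. The example assumes response.nodes is ordered best match first and greedily takes the top result at each level, which is one possible strategy rather than a recommended one. The helper descend_to_file and the FakeIndex stand-in (with its node IDs) are hypothetical and only let the example run without a real index.

```python
import asyncio
from types import SimpleNamespace


async def descend_to_file(index, query, max_depth=10):
    """Follow the top-ranked child at each step until a level-0 node is reached.

    Returns the list of nodes visited (empty if the root has no children).
    """
    path = []
    current_node_id = None  # None starts from the root
    for _ in range(max_depth):
        response = await index.navigate(
            query=query, current_node_id=current_node_id, top_k=1
        )
        if not response.nodes:
            break
        best = response.nodes[0]  # assumption: results are ordered best first
        path.append(best)
        if best.level == 0:  # leaf node: an individual file
            break
        current_node_id = best.id
    return path


class FakeIndex:
    """Hypothetical stand-in tree: finance -> finance/2016 -> report.pdf."""

    children = {
        None: [SimpleNamespace(id="n_finance", level=2, text="finance")],
        "n_finance": [SimpleNamespace(id="n_2016", level=1, text="finance/2016")],
        "n_2016": [SimpleNamespace(id="n_report", level=0, text="report.pdf")],
    }

    async def navigate(self, query, current_node_id, top_k):
        nodes = self.children.get(current_node_id, [])[:top_k]
        return SimpleNamespace(nodes=nodes)


path = asyncio.run(descend_to_file(FakeIndex(), "finance"))
print([node.id for node in path])  # ['n_finance', 'n_2016', 'n_report']
```

An agent would typically request more than one child per step and decide which branches to explore, but the control flow stays the same: pass the chosen node's ID back as current_node_id.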
Removing Files and Deleting
Removing files is all-or-nothing: on success, every requested file is removed from the index.
# Remove files (blocks until the job finishes)
await index.remove_files(file_ids=["file_1", "file_2"])
# Re-fetch the latest index state from the API
await index.refresh()
# Permanently delete the index and its storage
await index.delete()
Listing and Retrieving Indexes
# List all indexes in the project
indexes = await project.list_indexes()
# Fetch a single index by ID (with hydrated nodes)
index = await project.get_index_by_id("index_123")
Appendix: Essential Imports
from dex_sdk import DexClient, FileSystemIndexCreateRequest
Next Steps
- Research: Run agentic research jobs that use indexes as input
- Vector Stores: Semantic and hybrid search for RAG-enhanced extraction
- Extract: Extract structured data from parse results or vector stores