Dex SDK).
You’ll learn how to connect to cloud file storage, filter and parse documents, store parsed data to Knowledge Base, and use the results in your workflows.
Step 1: Connect to cloud file storage
Start by navigating to Workflows in SGP. Click on the Cloud Storage Browser card to connect to your cloud file storage. The following storage providers are supported:

- Amazon S3
- Azure Blob Storage
- Google Cloud Storage
To add an S3 bucket for use with the Cloud Storage Browser card, you must first create a bucket using Terraform and add the bucket to the existing policy within the Compass role. Please reach out to the Compass team for example PRs that illustrate how to take those steps.
Step 2: Filter files (optional)
Before parsing, you can use a Filter card to select only the files you want to process. This helps you narrow down your dataset and avoid parsing unnecessary documents.
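To make the idea of filtering concrete, here is a minimal Python sketch of the kind of selection a Filter card performs. It is not the card's implementation: the file records, their field names (`name`, `size_bytes`), and the `filter_files` helper are all hypothetical and only illustrate matching by path pattern and skipping empty files.

```python
from fnmatch import fnmatch

# Hypothetical file listing; the real Cloud Storage Browser output may use different fields.
files = [
    {"name": "resumes/alice.pdf", "size_bytes": 120_000},
    {"name": "resumes/bob.docx", "size_bytes": 80_000},
    {"name": "logs/run.txt", "size_bytes": 2_000},
    {"name": "resumes/carol.pdf", "size_bytes": 0},
]


def filter_files(files, pattern, min_size_bytes=1):
    """Keep files whose name matches the glob pattern and whose size meets the minimum."""
    return [
        f for f in files
        if fnmatch(f["name"], pattern) and f["size_bytes"] >= min_size_bytes
    ]


# Select only non-empty PDF resumes for parsing.
selected = filter_files(files, "resumes/*.pdf")
print([f["name"] for f in selected])
```

Filtering before parsing keeps the empty `carol.pdf` and the non-PDF files out of the parse step, so only one document would be sent on in this example.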

Step 3: Parse documents with the Dex Parse Card
Add a Dex Parse card to extract structured data from your documents. This card uses our Document Understanding capability, the Dex SDK, on the backend and automatically creates a Dex project that you can reference later in the Dex UI.

Dex Project Configuration
- Project Name: Give your Dex project a descriptive name (e.g., “Resume Parsing Project”)
- Project ID (Optional):
  - Enter an existing project ID to reuse a project
  - Leave empty to create a new project each time the workflow runs
Parse Engine
Choose the OCR engine based on your document language and layout:
- Engine: Select from available parsing engines (e.g., Reducto)
  - Reducto: Best for English and Latin-script documents with tables, figures, and complex layouts
Engine Configuration
For Reducto, configure the following settings:
- Chunking Method: Select how documents should be split into chunks
  - Variable (recommended): Automatically determines optimal chunk boundaries
- Chunk Size (Optional): Set to Auto to let the engine determine the best size
Output Options
Control the visibility of the output data:
- Include parsed data in output: Toggle on to output parsed text content for each chunk
  - When enabled: Outputs parsed text content with full details
  - When disabled: Outputs one row per file with metadata (project ID, parsed result ID) only
Knowledge Base
Optionally push parsed results into a vector store for semantic search and RAG:
- Push to Vector Store: Toggle on to enable Knowledge Base integration
- Vector Store Name: Provide a name for your Knowledge Base (e.g., “Resume Vector Store”)
- Embedding Model: Specify the embedding model (e.g., openai/text-embedding-3-large)
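The card settings described in this step can be summarized as a single configuration record. The sketch below is only an illustration in Python: `DexParseCardConfig`, its field names, its default values, and the validation rules are assumptions made for this example and are not part of the Dex SDK or the card itself.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class DexParseCardConfig:
    """Hypothetical container mirroring the Dex Parse card's form fields."""

    project_name: str
    project_id: Optional[str] = None        # None: a new project is created on each run
    engine: str = "Reducto"
    chunking_method: str = "Variable"       # the recommended chunking method
    chunk_size: str = "Auto"
    include_parsed_data: bool = True
    push_to_vector_store: bool = False
    vector_store_name: Optional[str] = None
    embedding_model: Optional[str] = None

    def validate(self) -> List[str]:
        """Return a list of problems; an empty list means the settings look consistent."""
        problems = []
        if not self.project_name.strip():
            problems.append("project_name must not be empty")
        # Assumed rule: the Knowledge Base fields only matter when the toggle is on.
        if self.push_to_vector_store:
            if not self.vector_store_name:
                problems.append("vector_store_name is required when pushing to a vector store")
            if not self.embedding_model:
                problems.append("embedding_model is required when pushing to a vector store")
        return problems


config = DexParseCardConfig(
    project_name="Resume Parsing Project",
    push_to_vector_store=True,
    vector_store_name="Resume Vector Store",
    embedding_model="openai/text-embedding-3-large",
)
problems = config.validate()
print(problems)
```

Writing the settings down this way makes the dependency between the fields explicit: the vector store name and embedding model are only meaningful once “Push to Vector Store” is enabled.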
Step 4: View parsed output
After the Dex Parse card processes your documents, the output is a dataframe with the following columns:
- parseResultId: Unique identifier for the parsing result
- projectId: The Dex project ID where results are stored
- engine: The parsing engine used (e.g., “Reducto”)
- pagesProcessed: Number of pages processed in the document
- chunkCount: Number of chunks created from the document
- status: Parsing status (e.g., “completed”, “failed”)
- parsedContent: The extracted text content from the document
- vectorStoreId: ID of the Knowledge Base vector store (if enabled)
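As an illustration of how these columns might be used downstream, the following Python sketch separates completed rows from failed ones and totals the pages and chunks. It assumes each dataframe row can be represented as a dictionary keyed by the column names above; the sample values and the `summarize` helper are invented for this example.

```python
# Sample rows using the column names listed above; all values are invented for illustration.
rows = [
    {"parseResultId": "pr-001", "projectId": "proj-1", "engine": "Reducto",
     "pagesProcessed": 2, "chunkCount": 5, "status": "completed",
     "parsedContent": "Senior Python developer with 8 years of experience.",
     "vectorStoreId": "vs-1"},
    {"parseResultId": "pr-002", "projectId": "proj-1", "engine": "Reducto",
     "pagesProcessed": 0, "chunkCount": 0, "status": "failed",
     "parsedContent": None, "vectorStoreId": None},
    {"parseResultId": "pr-003", "projectId": "proj-1", "engine": "Reducto",
     "pagesProcessed": 3, "chunkCount": 7, "status": "completed",
     "parsedContent": "Registered nurse with ICU experience.",
     "vectorStoreId": "vs-1"},
]


def summarize(rows):
    """Count completed rows, collect failed result IDs, and total pages and chunks."""
    completed = [r for r in rows if r["status"] == "completed"]
    failed_ids = [r["parseResultId"] for r in rows if r["status"] == "failed"]
    return {
        "completed": len(completed),
        "failed_ids": failed_ids,
        "pages": sum(r["pagesProcessed"] for r in completed),
        "chunks": sum(r["chunkCount"] for r in completed),
    }


summary = summarize(rows)
print(summary)
```

Checking the status column first is a reasonable habit before passing parsedContent to later cards, since failed rows may carry no text.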
Step 5: Access Dex project and SGP Knowledge Base
The Dex Parse card automatically creates a Dex project that you can access later for review and management. If you enabled the “Push to Vector Store” option, navigate to the Knowledge Base section in SGP to access your vector store, where you can:
- Search parsed documents semantically
- Test retrieval quality
- Use for RAG applications
- Manage embeddings and vector store settings
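For readers new to semantic search, the sketch below shows the general idea behind retrieving chunks from a vector store: rank stored embeddings by cosine similarity to a query embedding. This is a generic illustration in Python, not how SGP Knowledge Base is implemented. The three-dimensional vectors are toy values made up for the example; real embeddings come from the configured embedding model and have far more dimensions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query_vec, chunks, k=2):
    """Return the k (score, text) pairs most similar to the query, best first."""
    scored = [(cosine_similarity(query_vec, c["embedding"]), c["text"]) for c in chunks]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


# Toy chunks with hand-written embeddings (hypothetical values).
chunks = [
    {"text": "Senior Python developer, backend services", "embedding": [0.9, 0.1, 0.0]},
    {"text": "Registered nurse, ICU and emergency care", "embedding": [0.0, 0.2, 0.9]},
    {"text": "Data engineer, Spark and SQL pipelines", "embedding": [0.7, 0.6, 0.1]},
]

query = [1.0, 0.2, 0.0]  # stands in for the embedding of a software-related query
results = top_k(query, chunks, k=2)
for score, text in results:
    print(f"{score:.3f}  {text}")
```

With these toy vectors the two software-related chunks rank above the nursing chunk, which is the behavior you would look for when testing retrieval quality.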
Step 6: Use parsed data in your workflow
Now you can use the parsed content in subsequent workflow cards. For this example, we’ll call an agent to categorize resume content into suitable job roles.
Call Agent for Analysis
Add a Call Agent card to process the parsed content:
- Select Agent: Choose your agent from the dropdown
- Output Column Name: Specify where agent responses will be stored (e.g., job_categories)
- Prompt Template: Write your prompt using {{parsedContent}} to reference the parsed text
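The prompt template works by substituting a column value into the {{parsedContent}} placeholder for each row. The Python sketch below shows one way such substitution could work; it is an assumption about the behavior, not the Call Agent card's actual code, and `render_prompt` is a hypothetical helper.

```python
import re

# Matches placeholders such as {{parsedContent}}, allowing optional inner whitespace.
PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def render_prompt(template, row):
    """Replace each {{column}} placeholder with the matching value from the row."""
    def substitute(match):
        column = match.group(1)
        if column not in row:
            raise KeyError(f"Template references unknown column: {column}")
        return str(row[column])

    return PLACEHOLDER.sub(substitute, template)


template = "Categorize this resume into suitable job roles:\n\n{{parsedContent}}"
row = {"parsedContent": "Senior Python developer with 8 years of backend experience."}

prompt = render_prompt(template, row)
print(prompt)
```

Raising an error on an unknown column, rather than leaving the placeholder in place, is a design choice in this sketch that surfaces typos in the template early.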

Additional Processing
You can add more cards to:
- Filter or transform the categorized results
- Join with job posting data
- Run evaluations on categorization quality
- Generate reports or summaries
Step 7: Save your results
Finally, save the workflow output for future reference and analysis. You can save the results as:
- SGP Dataset: Store the categorized resumes as a dataset for future use in training, evaluation, or analysis
- SGP Evaluation: Create an evaluation to track the quality of resume categorization over time


