# List Extractions

> List all extractions for a project with token-based pagination and filtering.

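Only extraction jobs with status SUCCEEDED are returned. Results can be filtered
by `ids`, by creation time (`created_at_start`, `created_at_end`), by completion
time (`completed_at_start`, `completed_at_end`), or by `source_id`.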

Args:

- `project_id` (path): ID of the project that the extractions belong to.
- Filter parameters (query): `ids`, `created_at_start`, `created_at_end`, `completed_at_start`, `completed_at_end`, `source_id`.
- Pagination parameters (query): `page_size` (1 to 100), `continuation_token`, `sort_by`, `sort_order` (`asc` or `desc`).

Returns:

- A `PaginatedItems` object, or a plain list of `ExtractionEntity` instances.

Raises:

- `HTTPException`: if the project is not found.
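
The request and the two possible response shapes can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not an official client: the helper names `build_list_request` and `iter_extractions` are hypothetical, the base URL must be supplied by the caller (the spec declares `servers: []`), and it is assumed that the `next_token` of one page is sent back as `continuation_token` to fetch the next. The header names `x-api-key` and `x-selected-account-id` are taken from the spec's security schemes. The HTTP call itself is left to a caller-supplied `fetch` function so the sketch does not depend on any particular HTTP library.

```python
from datetime import datetime
from typing import Any, Callable, Dict, Iterator, List, Optional, Tuple
from urllib.parse import quote, urlencode


def build_list_request(
    base_url: str,
    project_id: str,
    api_key: str,
    account_id: str,
    *,
    ids: Optional[List[str]] = None,
    created_at_start: Optional[datetime] = None,
    created_at_end: Optional[datetime] = None,
    source_id: Optional[str] = None,
    page_size: Optional[int] = None,
    sort_order: Optional[str] = None,
    continuation_token: Optional[str] = None,
) -> Tuple[str, Dict[str, str]]:
    """Return (url, headers) for GET /v1/projects/{project_id}/extract/results."""
    # Mirror the constraints stated in the spec before sending anything.
    if page_size is not None and not 1 <= page_size <= 100:
        raise ValueError("page_size must be between 1 and 100")
    if sort_order is not None and sort_order not in ("asc", "desc"):
        raise ValueError("sort_order must be 'asc' or 'desc'")

    # Assumption: array parameters are sent as repeated keys (ids=a&ids=b),
    # which is the OpenAPI default serialization for query arrays.
    params: List[Tuple[str, str]] = [("ids", i) for i in ids or []]
    scalars = {
        "created_at_start": created_at_start.isoformat() if created_at_start else None,
        "created_at_end": created_at_end.isoformat() if created_at_end else None,
        "source_id": source_id,
        "page_size": page_size,
        "sort_order": sort_order,
        "continuation_token": continuation_token,
    }
    params += [(k, str(v)) for k, v in scalars.items() if v is not None]

    path = f"/v1/projects/{quote(project_id, safe='')}/extract/results"
    url = base_url.rstrip("/") + path
    if params:
        url += "?" + urlencode(params)
    headers = {"x-api-key": api_key, "x-selected-account-id": account_id}
    return url, headers


def iter_extractions(fetch: Callable[[Optional[str]], Any]) -> Iterator[Dict[str, Any]]:
    """Yield every extraction, following pagination tokens when present.

    `fetch(token)` is a hypothetical caller-supplied function that performs the
    GET with the given continuation token (None for the first page) and returns
    the decoded JSON body.
    """
    token: Optional[str] = None
    while True:
        body = fetch(token)
        if isinstance(body, list):
            # Unpaginated variant: a bare array of ExtractionEntity objects.
            yield from body
            return
        yield from body["items"]
        # Assumption: a missing or null next_token means there are no more pages.
        token = body.get("next_token")
        if not token:
            return
```

With an HTTP library of your choice, `fetch` would call `build_list_request(..., continuation_token=token)`, send the GET with the returned headers, and return the parsed JSON.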



## OpenAPI

````yaml https://dex.sgp.scale.com/openapi.json get /v1/projects/{project_id}/extract/results
openapi: 3.1.0
info:
  title: Document Understanding API
  description: API for uploading and processing documents
  version: 0.4.5
servers: []
security:
  - ApiKey: []
    AccountId: []
tags:
  - name: Projects
    description: Operations related to project creation and management
  - name: Files
    description: Operations related to file upload and access
  - name: Parse
    description: Operations related to starting parse jobs and accessing their results
  - name: Vector Stores
    description: Operations related to vector store creation and management
  - name: Extract
    description: Operations related to starting extract jobs and accessing their results
  - name: Research
    description: Dex Research agent kickoff and results.
  - name: Jobs
    description: Operations related to monitoring jobs and their status
paths:
  /v1/projects/{project_id}/extract/results:
    get:
      tags:
        - Extract
      summary: List Extractions
      description: >-
        List all extractions for a project with token-based pagination and
        filtering.


        Only SUCCEEDED extraction jobs are returned. Filter by ids, created_at,
        completed_at, or source_id.


        Args:
            project_id: Project ID that the extractions belong to
            filter_params: Filter parameters (ids, created_at, completed_at, source_id)
            pagination_params: Pagination parameters (page_size, continuation_token, sort_by, sort_order)

        Returns:
            PaginatedItems or list of ExtractionEntity instances

        Raises:
            HTTPException: If project not found
      operationId: list_extractions_v1_projects__project_id__extract_results_get
      parameters:
        - name: project_id
          in: path
          required: true
          schema:
            type: string
            title: Project Id
        - name: ids
          in: query
          required: false
          schema:
            anyOf:
              - items:
                  type: string
                type: array
              - type: 'null'
            description: Filter by extraction IDs (job IDs)
            title: Ids
          description: Filter by extraction IDs (job IDs)
        - name: created_at_start
          in: query
          required: false
          schema:
            anyOf:
              - type: string
                format: date-time
              - type: 'null'
            description: Filter by creation time (start)
            title: Created At Start
          description: Filter by creation time (start)
        - name: created_at_end
          in: query
          required: false
          schema:
            anyOf:
              - type: string
                format: date-time
              - type: 'null'
            description: Filter by creation time (end)
            title: Created At End
          description: Filter by creation time (end)
        - name: completed_at_start
          in: query
          required: false
          schema:
            anyOf:
              - type: string
                format: date-time
              - type: 'null'
            description: Filter by completion time (start)
            title: Completed At Start
          description: Filter by completion time (start)
        - name: completed_at_end
          in: query
          required: false
          schema:
            anyOf:
              - type: string
                format: date-time
              - type: 'null'
            description: Filter by completion time (end)
            title: Completed At End
          description: Filter by completion time (end)
        - name: source_id
          in: query
          required: false
          schema:
            anyOf:
              - type: string
              - type: 'null'
            description: Filter by source ID (parse result ID or vector store ID)
            title: Source Id
          description: Filter by source ID (parse result ID or vector store ID)
        - name: page_size
          in: query
          required: false
          schema:
            anyOf:
              - type: integer
                maximum: 100
                minimum: 1
              - type: 'null'
            description: Number of items per page
            title: Page Size
          description: Number of items per page
        - name: sort_by
          in: query
          required: false
          schema:
            anyOf:
              - type: string
              - type: 'null'
            description: Field to sort by
            title: Sort By
          description: Field to sort by
        - name: sort_order
          in: query
          required: false
          schema:
            anyOf:
              - $ref: '#/components/schemas/SortOrder'
              - type: 'null'
            description: Sort order (asc or desc)
            title: Sort Order
          description: Sort order (asc or desc)
        - name: continuation_token
          in: query
          required: false
          schema:
            anyOf:
              - type: string
              - type: 'null'
            description: Pagination token
            title: Continuation Token
          description: Pagination token
      responses:
        '200':
          description: Successful Response
          content:
            application/json:
              schema:
                $ref: >-
                  #/components/schemas/MaybePaginatedListResponse_ExtractionEntity_
        '422':
          description: Validation Error
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/HTTPValidationError'
components:
  schemas:
    SortOrder:
      type: string
      enum:
        - asc
        - desc
      title: SortOrder
      description: >-
        Sort direction: ascending (asc) or descending (desc).

        Note that only non-nullable fields can be sorted by. For details, see
        https://github.com/djrobstep/sqlakeyset?tab=readme-ov-file#limitations
    MaybePaginatedListResponse_ExtractionEntity_:
      anyOf:
        - $ref: '#/components/schemas/PaginatedItems_ExtractionEntity_'
        - items:
            $ref: '#/components/schemas/ExtractionEntity'
          type: array
      title: MaybePaginatedListResponse[ExtractionEntity]
    HTTPValidationError:
      properties:
        detail:
          items:
            $ref: '#/components/schemas/ValidationError'
          type: array
          title: Detail
      type: object
      title: HTTPValidationError
    PaginatedItems_ExtractionEntity_:
      properties:
        items:
          items:
            $ref: '#/components/schemas/ExtractionEntity'
          type: array
          title: Items
        total_count:
          type: integer
          title: Total Count
        next_token:
          anyOf:
            - type: string
            - type: 'null'
          title: Next Token
        prev_token:
          anyOf:
            - type: string
            - type: 'null'
          title: Prev Token
      type: object
      required:
        - items
        - total_count
      title: PaginatedItems[ExtractionEntity]
    ExtractionEntity:
      properties:
        id:
          type: string
          title: Id
          description: ID of the entity
        project_id:
          type: string
          title: Project Id
          description: ID of the project
        object:
          type: string
          const: extraction
          title: Object
          default: extraction
        source_id:
          type: string
          title: Source Id
          description: Source ID that was extracted from
        result:
          $ref: '#/components/schemas/ExtractionResult'
          description: The extraction result
        parameters:
          $ref: '#/components/schemas/ExtractionParameters'
          description: Parameters used for extraction
        created_at:
          type: string
          format: date-time
          title: Created At
          description: When the extraction was completed
        processing_time_ms:
          anyOf:
            - type: integer
            - type: 'null'
          title: Processing Time Ms
          description: Processing time in milliseconds
      type: object
      required:
        - id
        - project_id
        - source_id
        - result
        - parameters
        - created_at
      title: ExtractionEntity
      description: Extraction result response model.
    ValidationError:
      properties:
        loc:
          items:
            anyOf:
              - type: string
              - type: integer
          type: array
          title: Location
        msg:
          type: string
          title: Message
        type:
          type: string
          title: Error Type
        input:
          title: Input
        ctx:
          type: object
          title: Context
      type: object
      required:
        - loc
        - msg
        - type
      title: ValidationError
    ExtractionResult:
      properties:
        data:
          additionalProperties:
            $ref: '#/components/schemas/ExtractedField'
          type: object
          title: Data
          description: Extracted data matching the user's schema structure
        usage_info:
          anyOf:
            - $ref: '#/components/schemas/UsageInfo'
            - type: 'null'
          description: Usage information for the extraction
      type: object
      required:
        - data
      title: ExtractionResult
      description: Complete extraction result structure.
      example:
        data:
          invoice_id:
            citations:
              - bbox:
                  height: 0.2058080808080808
                  left: 0.07761437908496732
                  top: 0.42045454545454547
                  width: 0.8415032679738562
                content: INV-2023-001
                page: 1
            confidence: 0.95
            value: INV-2023-001
          total_amount:
            citations:
              - bbox:
                  height: 0.2058080808080808
                  left: 0.07761437908496732
                  top: 0.42045454545454547
                  width: 0.8415032679738562
                content: '119.99'
                page: 1
            confidence: 0.95
            value: 119.99
    ExtractionParameters:
      properties:
        model:
          type: string
          title: Model
          description: >-
            LLM model to use for extraction (e.g., 'openai/gpt-4',
            'anthropic/claude-3-sonnet')
        model_kwargs:
          anyOf:
            - additionalProperties: true
              type: object
            - type: 'null'
          title: Model Kwargs
          description: Additional kwargs for the LLM model
        extraction_schema:
          additionalProperties: true
          type: object
          title: Extraction Schema
          description: JSON schema defining the desired output structure
        system_prompt:
          anyOf:
            - type: string
            - type: 'null'
          title: System Prompt
          description: High-level instructions or context for the extraction model
        user_prompt:
          anyOf:
            - type: string
            - type: 'null'
          title: User Prompt
          description: Specific hints about the current document
        generate_citations:
          type: boolean
          title: Generate Citations
          description: Whether to return bounding boxes for extracted values
          default: true
        generate_confidence:
          type: boolean
          title: Generate Confidence
          description: Whether to return confidence scores for extracted values
          default: true
        use_agentic:
          type: boolean
          title: Use Agentic
          description: >-
            Must stay False on POST …/extract (400 if True). The extract
            endpoint only runs single-pass LLM extraction. For agentic
            (decompose / ReAct) extraction, use POST …/research with
            ResearchJobRequest.
          default: false
        agentic_tools:
          anyOf:
            - additionalProperties:
                $ref: '#/components/schemas/ResearchToolConfigEntry'
              type: object
            - type: 'null'
          title: Agentic Tools
          description: >-
            Research-shaped tool map; ignored by the extract HTTP path
            (single-pass only). Use POST …/research for per-tool config. Same
            key shape as ResearchJobRequest.tools.
        execute_dependent_subtask_tool_hints:
          type: boolean
          title: Execute Dependent Subtask Tool Hints
          description: >-
            Ignored by POST …/extract (single-pass only). When True on a
            Research job, dependent subtasks execute their warm-start tool hints
            as real searches instead of converting them to intent-only guidance.
          default: false
        agent_settings:
          anyOf:
            - $ref: '#/components/schemas/AgenticSettings'
            - type: 'null'
          description: >-
            Ignored by POST …/extract. If ``agent_settings.use_agentic`` is
            True, the extract request is rejected (400); configure agentic
            behaviour via POST …/research instead.
      type: object
      required:
        - model
        - extraction_schema
      title: ExtractionParameters
      description: Parameters for extraction operation.
    ExtractedField:
      properties:
        value:
          title: Value
          description: The extracted value
        citations:
          anyOf:
            - items:
                $ref: '#/components/schemas/ExtractionCitation'
              type: array
            - type: 'null'
          title: Citations
          description: Citation context for the extracted value
        confidence:
          anyOf:
            - type: number
            - type: 'null'
          title: Confidence
          description: Confidence score for the extracted value
      type: object
      required:
        - value
      title: ExtractedField
      description: A single extracted field with its value and citations.
    UsageInfo:
      properties:
        prompt_tokens:
          type: integer
          title: Prompt Tokens
          description: Number of tokens in the prompt.
          default: 0
        completion_tokens:
          type: integer
          title: Completion Tokens
          description: Number of tokens in the generated completion.
          default: 0
        reasoning_tokens:
          type: integer
          title: Reasoning Tokens
          description: >-
            Number of tokens the model generated for reasoning (already
            included in the completion and total token counts).
          default: 0
        total_tokens:
          type: integer
          title: Total Tokens
          description: Total number of tokens used in the request (prompt + completion).
          default: 0
      type: object
      title: UsageInfo
      description: Usage information.
    ResearchToolConfigEntry:
      properties:
        enabled:
          type: boolean
          title: Enabled
          description: When False, the tool is not offered to the agent
          default: true
        config:
          additionalProperties: true
          type: object
          title: Config
          description: Tool-specific configuration (e.g. top_k, filters, rerank_config)
      type: object
      title: ResearchToolConfigEntry
      description: >-
        Per-tool enablement and opaque config (e.g. search_config for
        similarity_search).
    AgenticSettings:
      properties:
        use_agentic:
          type: boolean
          title: Use Agentic
          description: >-
            Legacy mirror of ExtractionParameters.use_agentic. On POST
            …/extract, True causes a 400 rejection (extract is single-pass
            only). For agentic runs use POST …/research with ResearchJobRequest.
          default: false
        planning_model:
          anyOf:
            - type: string
            - type: 'null'
          title: Planning Model
          description: >-
            Model for planning steps (deciding what to search, decomposing
            tasks). Defaults to the main extraction model.
        acting_model:
          anyOf:
            - type: string
            - type: 'null'
          title: Acting Model
          description: >-
            Model for acting steps (producing the final extracted output).
            Defaults to the main extraction model.
        reflection_model:
          anyOf:
            - type: string
            - type: 'null'
          title: Reflection Model
          description: >-
            Model for reflection steps (synthesizing search results, aggregating
            subtasks). Defaults to the main extraction model.
        max_turns:
          type: integer
          title: Max Turns
          description: >-
            Maximum number of agent turns (decompose→run→aggregate cycles)
            before forcing completion.
          default: 2
        max_context_tokens:
          type: integer
          title: Max Context Tokens
          description: Max tokens for LLM context (system + messages). Trims when exceeded.
          default: 100000
        max_reasoning_hops:
          anyOf:
            - type: integer
            - type: 'null'
          title: Max Reasoning Hops
          description: >-
            Max reasoning hops (search or extract actions) per subtask in the
            ReAct loop. Default is 10.
        max_waves:
          anyOf:
            - type: integer
            - type: 'null'
          title: Max Waves
          description: >-
            Max dependency waves per turn. Subtasks beyond this depth are
            promoted to the last wave. Default is 3.
        max_validation_fix_attempts:
          type: integer
          title: Max Validation Fix Attempts
          description: Max attempts to fix schema validation errors via LLM reflection.
          default: 2
        decompose_prompt_template:
          anyOf:
            - type: string
            - type: 'null'
          title: Decompose Prompt Template
          description: Custom decompose prompt template. When None, uses the default.
        decompose_rules:
          anyOf:
            - type: string
            - type: 'null'
          title: Decompose Rules
          description: >-
            Extra rules appended to the decompose prompt. When None, uses
            built-in rules.
        search_rules:
          anyOf:
            - type: string
            - type: 'null'
          title: Search Rules
          description: >-
            Extra rules appended to the reasoning step search action. When None,
            uses built-in rules.
        extraction_rules:
          anyOf:
            - type: string
            - type: 'null'
          title: Extraction Rules
          description: >-
            Extra rules appended to the reasoning step extract action. When
            None, uses built-in rules.
      type: object
      title: AgenticSettings
      description: >-
        Configuration for the agentic extractor (models, turns, hop limits,
        prompt templates).


        Tool enablement and search-summary scope for vector-store agentic
        extract live on `ExtractionParameters` as `agentic_tools`
        (Research-shaped map); dependent-hint execution uses
        `ExtractionParameters.execute_dependent_subtask_tool_hints`.
    ExtractionCitation:
      properties:
        block_ref:
          anyOf:
            - type: string
            - type: 'null'
          title: Block Ref
          description: >-
            Reference to the block from tool results (e.g. '0_0_1' for tool 0,
            chunk 0, block 1). When provided,
            content/file_id/parse_result_id/page/bbox are resolved from the
            block.
        file_id:
          anyOf:
            - type: string
            - type: 'null'
          title: File Id
          description: This is the file id where the content was found
        parse_result_id:
          anyOf:
            - type: string
            - type: 'null'
          title: Parse Result Id
          description: This is the parse result id where the content was found
        content:
          type: string
          title: Content
          description: This is the content of the block
        page:
          type: integer
          title: Page
          description: This is the page number where the block was found
        bbox:
          anyOf:
            - $ref: '#/components/schemas/BoundingBox'
            - type: 'null'
          description: >-
            The block's bounding box in normalized [0, 1] distances, as an
            object with left, top, width, and height.
      type: object
      required:
        - content
        - page
      title: ExtractionCitation
      description: Citation information for an extracted value.
    BoundingBox:
      properties:
        left:
          type: number
          title: Left
          description: Left coordinate
        top:
          type: number
          title: Top
          description: Top coordinate
        width:
          type: number
          title: Width
          description: Width
        height:
          type: number
          title: Height
          description: Height
      type: object
      required:
        - left
        - top
        - width
        - height
      title: BoundingBox
      description: Bounding box information for location in document.
  securitySchemes:
    ApiKey:
      type: apiKey
      in: header
      name: x-api-key
      description: API key for authentication
    AccountId:
      type: apiKey
      in: header
      name: x-selected-account-id
      description: Selected Account ID

````
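
To illustrate the `ExtractionEntity` schema above, the following sketch flattens `result.data` into plain values and converts a citation's normalized bounding box into pixel coordinates. The function names and page dimensions are hypothetical. It is also assumed that `left` and `width` are fractions of the page width while `top` and `height` are fractions of the page height, which the spec implies ("normalized [0,1] distances") but does not spell out.

```python
from typing import Any, Dict, Optional, Tuple


def field_values(
    entity: Dict[str, Any], min_confidence: Optional[float] = None
) -> Dict[str, Any]:
    """Map each extracted field name to its value.

    Fields whose confidence is below `min_confidence` are dropped. Fields with
    no confidence (null or absent, both allowed by the schema) are kept.
    """
    values: Dict[str, Any] = {}
    for name, field in entity["result"]["data"].items():
        confidence = field.get("confidence")
        if (
            min_confidence is not None
            and confidence is not None
            and confidence < min_confidence
        ):
            continue
        values[name] = field["value"]
    return values


def bbox_to_pixels(
    bbox: Dict[str, float], page_width: int, page_height: int
) -> Tuple[int, int, int, int]:
    """Convert a normalized BoundingBox to (left, top, right, bottom) pixels.

    Assumption: horizontal values scale with page width and vertical values
    with page height.
    """
    left = bbox["left"] * page_width
    top = bbox["top"] * page_height
    right = left + bbox["width"] * page_width
    bottom = top + bbox["height"] * page_height
    return (round(left), round(top), round(right), round(bottom))
```

Because `citations`, `confidence`, and `bbox` are all nullable in the schema, callers should check for `None` before passing a citation's `bbox` to a helper like `bbox_to_pixels`.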