Span Hierarchy & Best Practices

Well-structured traces make the difference between useful observability and noisy data. This page covers how to name spans consistently, structure your trace tree, and populate span input/output so the Profiler and Trace Detail View give you actionable insights.

Naming Conventions

Use consistent, parseable span names. The SGP Profiler aggregates performance metrics by span name. Inconsistent naming splits your data into separate rows and makes bottleneck analysis unreliable.

Span Type	Name Pattern	Example
Turn	`turn:<N>`	`turn:1`, `turn:2`
LLM Call	`llm:<model>`	`llm:gpt-4o`, `llm:claude-sonnet-4-20250514`
Tool Call	`tool:<name>`	`tool:web_search`, `tool:code_interpreter`
Reasoning	`reasoning:<step>`	`reasoning:plan`, `reasoning:reflect`
State Update	`state:<operation>`	`state:update`, `state:create`
Sub-task	`subtask:<name>`	`subtask:market_analysis`
Research Pass	`research_pass_<N>`	`research_pass_1`, `research_pass_2`

For example, mixing formats like Turn 1 and turn:1 produces two separate Profiler rows for the same kind of span, so per-turn latency can no longer be compared in one place. Pick a convention and stick with it.
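One way to keep names consistent is to build every span name through a single helper instead of formatting strings at each call site. The sketch below is illustrative only: `span_name`, `SPAN_PREFIXES`, and the validation regex are hypothetical names and rules, not part of the Agentex ADK or the SGP SDK. It covers only the colon-separated patterns from the table above.

```python
import re

# Prefixes taken from the colon-separated rows of the naming table.
SPAN_PREFIXES = {"turn", "llm", "tool", "reasoning", "state", "subtask"}

# Assumed rule: lowercase letters and digits, plus '.', '_' and '-' as separators.
_VALUE_RE = re.compile(r"^[a-z0-9][a-z0-9._-]*$")


def span_name(prefix: str, value: object) -> str:
    """Build a '<prefix>:<value>' span name, normalizing case and spaces."""
    if prefix not in SPAN_PREFIXES:
        raise ValueError(f"unknown span prefix: {prefix!r}")
    text = str(value).strip().lower().replace(" ", "_")
    if not _VALUE_RE.match(text):
        raise ValueError(f"span value is not parseable: {value!r}")
    return f"{prefix}:{text}"
```

With a helper like this, `span_name("turn", 1)` and `span_name("tool", "Web Search")` yield `turn:1` and `tool:web_search`, while a stray prefix such as `Turn` raises immediately instead of silently creating a new Profiler row.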

The Ideal Trace Tree

A well-structured multi-turn agent trace looks like this:

[Diagram: ideal trace tree. ✦ = auto-traced by built-in providers]

Key principles:

Turn spans are the primary grouping unit: every operation in a conversation turn nests under its turn span
Auto-traced spans nest automatically when you pass parent_span_id to built-in providers
Sub-tasks and research passes create intermediate grouping for complex multi-step operations
State updates capture what changed at the end of a turn
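To make the nesting principles above concrete, the following sketch rebuilds a tree from a flat list of spans using only their parent links. `SpanRecord` and `render_tree` are hypothetical stand-ins written for this illustration; they are not SGP or Agentex types, and the span names are examples drawn from the naming table.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SpanRecord:
    span_id: str
    name: str
    parent_id: Optional[str] = None  # None marks a root span


def render_tree(spans: list[SpanRecord]) -> list[str]:
    """Return indented lines showing how parent_id links form the trace tree."""
    children: dict[Optional[str], list[SpanRecord]] = {}
    for span in spans:
        children.setdefault(span.parent_id, []).append(span)

    lines: list[str] = []

    def walk(parent_id: Optional[str], depth: int) -> None:
        for span in children.get(parent_id, []):
            lines.append("  " * depth + span.name)
            walk(span.span_id, depth + 1)

    walk(None, 0)
    return lines


spans = [
    SpanRecord("s1", "turn:1"),
    SpanRecord("s2", "llm:gpt-4o", parent_id="s1"),
    SpanRecord("s3", "tool:web_search", parent_id="s1"),
    SpanRecord("s4", "state:update", parent_id="s1"),
    SpanRecord("s5", "turn:2"),
    SpanRecord("s6", "llm:gpt-4o", parent_id="s5"),
]
print("\n".join(render_tree(spans)))
```

If every span in this list had `parent_id=None`, the same function would print six unindented lines, which is the flat trace the checklist warns against.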

Span Input/Output Best Practices

Input and output appear in two places in the SGP UI: the Trace Detail View span inspector panel and the Profiler span inspector. Well-structured I/O makes debugging dramatically faster.

The Traces table preview column extracts text from specific preferred keys in your span input and output (e.g., title, query, content). If the preview shows an unhelpful value, add a title key with the value you want displayed.
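The effect of adding a title key can be reasoned about with a small model of the preview logic. This is a sketch under stated assumptions, not the SGP implementation: the key list and its priority order come only from the examples named above (title, query, content), and `preview_text` and `with_title` are hypothetical helpers.

```python
from typing import Any, Optional

# Assumed priority order; the text above only names these keys as examples.
PREFERRED_KEYS = ("title", "query", "content")


def preview_text(payload: dict[str, Any], max_len: int = 80) -> Optional[str]:
    """Return the first non-empty string found under a preferred key, shortened to max_len."""
    for key in PREFERRED_KEYS:
        value = payload.get(key)
        if isinstance(value, str) and value.strip():
            text = value.strip()
            return text if len(text) <= max_len else text[: max_len - 3] + "..."
    return None


def with_title(payload: dict[str, Any], title: str) -> dict[str, Any]:
    """Return a copy of the payload with an explicit title for the preview column."""
    return {**payload, "title": title}
```

Under this model, a turn input such as `{"state": {...}}` has no usable preview, whereas `with_title(payload, "Turn 3: refund request")` gives the table something readable to show.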

[Figure: Agentex Profiler view showing aggregated span metrics]

Turn spans

Input: The user message or event that triggered this turn, plus current state
Output: The updated state after processing

async with adk.tracing.span(
    trace_id=task_id,
    name=f"turn:{turn_number}",
    input={"message": user_message, "state": state.model_dump()},
) as span:
    # ... process turn ...
    if span:
        span.output = state.model_dump()

Comparing input and output on a turn span lets you quickly see what changed during that turn.

LLM spans (auto-traced)

These are populated automatically by the built-in providers:

Input: Model name, tools count, model settings, whether system instructions exist
Output: new_items (serialized response items), final_output (extracted text)

Tool spans

Input: The tool arguments as a dict
Output: The tool result (string or structured data)

async with adk.tracing.span(
    trace_id=task_id,
    name=f"tool:{tool_name}",
    parent_id=turn_span_id,
    input={"query": search_query, "max_results": 10},
) as span:
    result = await execute_tool(tool_name, search_query)
    if span:
        span.output = {"results": result, "count": len(result)}

Reasoning spans

Input: The context or question being reasoned about
Output: The reasoning result, confidence score, and planned next steps

async with adk.tracing.span(
    trace_id=task_id,
    name="reasoning:plan",
    parent_id=turn_span_id,
    input={"context": context, "question": user_question},
) as span:
    plan = await generate_plan(context, user_question)
    if span:
        span.output = {"plan": plan, "confidence": 0.95, "steps": 3}

State update spans

Input: Previous state (or the delta being applied)
Output: The new state

These are most useful when your state changes are complex and you want to track exactly what was modified.
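When the full state is large, recording only the delta keeps a state update span readable. The helper below is a minimal sketch that compares top-level keys of two state dicts (for example, two `model_dump()` results); `state_delta` is a hypothetical name, and nested changes are reported at the top-level key only.

```python
from typing import Any

_MISSING = object()  # sentinel so that a stored None is not confused with "key absent"


def state_delta(before: dict[str, Any], after: dict[str, Any]) -> dict[str, dict[str, Any]]:
    """Return {key: {"before": ..., "after": ...}} for every top-level key that changed."""
    delta: dict[str, dict[str, Any]] = {}
    for key in sorted(before.keys() | after.keys()):
        old = before.get(key, _MISSING)
        new = after.get(key, _MISSING)
        if old != new:
            delta[key] = {
                "before": None if old is _MISSING else old,
                "after": None if new is _MISSING else new,
            }
    return delta


previous = {"turn": 1, "notes": []}
current = {"turn": 2, "notes": [], "summary": "user asked about pricing"}
print(state_delta(previous, current))
```

The resulting dict could be used as the input of a `state:update` span, with the new state as its output.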

Manual Spans for Custom Logic

When built-in providers don’t cover your use case (agent reasoning, data processing, custom tool execution), create manual spans.

Using Agentex ADK (Temporal-aware)

Use adk.tracing.span() inside Temporal workflows and activities. It handles Temporal activity routing automatically.

from agentex.lib import adk

async with adk.tracing.span(
    trace_id=task_id,
    name="reasoning:reflect",
    parent_id=turn_span_id,
    input={"context": "...", "question": "..."},
    data={"__span_type__": "CUSTOM"},
) as span:
    result = await my_reasoning_function(...)
    if span:
        span.output = {"reasoning": result, "confidence": 0.95}

Always check if span: before setting attributes. The span is None when trace_id is falsy, which prevents errors when tracing is disabled.
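The reason for the guard is easier to see with a stand-in context manager that mimics the described behaviour. The code below is not the ADK implementation; `stub_span` and `StubSpan` are hypothetical and exist only to show that the same function body works whether tracing is enabled or not.

```python
import asyncio
from contextlib import asynccontextmanager
from dataclasses import dataclass
from typing import Any, AsyncIterator, Optional


@dataclass
class StubSpan:
    name: str
    output: Any = None


@asynccontextmanager
async def stub_span(trace_id: Optional[str], name: str) -> AsyncIterator[Optional[StubSpan]]:
    """Stand-in for a tracing span: yields None when trace_id is falsy."""
    if not trace_id:
        yield None
        return
    yield StubSpan(name=name)


async def reflect(trace_id: Optional[str]) -> str:
    async with stub_span(trace_id, "reasoning:reflect") as span:
        result = "looks consistent"
        if span:  # skipped when tracing is disabled, so no AttributeError on None
            span.output = {"reasoning": result}
        return result


print(asyncio.run(reflect(None)))        # tracing disabled
print(asyncio.run(reflect("task-123")))  # tracing enabled
```

Without the `if span:` check, the first call would fail with an `AttributeError` when assigning `output` on `None`.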

Using SGP SDK Directly (non-Temporal)

For standalone scripts or non-Temporal agents, use the SGP tracing SDK directly:

import scale_gp_beta.lib.tracing as tracing

with tracing.create_span(
    name="research_task_activity_market_analysis",
    trace_id=task_id,
    parent_id=parent_span_id,
    input={"system_prompt": "...", "user_input": "...", "num_passes": 2},
) as span:
    result = await run_research(...)
    span.output = {"output_length": len(result), "success": True}

Span Type Mapping to SGP

Set __span_type__ in span data to control how the SGP UI categorizes the span:

`__span_type__`	SGP `operation_type`	Use For
`STANDALONE`	STANDALONE	Default, generic operations
`COMPLETION`	COMPLETION	LLM / model calls
`CUSTOM`	CUSTOM	Tool calls, custom logic
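If span names follow the conventions from this page, the span type can be derived from the name prefix rather than set by hand. The mapping below is a sketch: treating `reasoning:` spans as `CUSTOM` follows the `reasoning:reflect` example above, and falling back to `STANDALONE` for everything else is an assumption based on the "default" row of the table. `span_type_for` and `span_data` are hypothetical helpers.

```python
def span_type_for(name: str) -> str:
    """Pick a __span_type__ value from a '<prefix>:<value>' span name."""
    prefix = name.split(":", 1)[0]
    if prefix == "llm":
        return "COMPLETION"   # LLM / model calls
    if prefix in {"tool", "reasoning"}:
        return "CUSTOM"       # tool calls and custom logic
    return "STANDALONE"       # default for generic operations


def span_data(name: str) -> dict[str, str]:
    """Build the dict to pass as the span's data argument."""
    return {"__span_type__": span_type_for(name)}
```

A call such as `span_data("tool:web_search")` returns `{"__span_type__": "CUSTOM"}`, which can be passed as the `data` argument shown in the ADK example.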

Real-World Example: Multi-Level Research Agent

This example from a due diligence research agent shows effective multi-level span nesting:

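# Imports assumed for this excerpt: Temporal's activity decorator, the SGP tracing
# module used earlier on this page, and a Runner that provides run_streamed()
# (presumably the OpenAI Agents SDK). TaskInfo, ResearchOutput, and
# AgentsStreamProcessor are project code. The setup of system_prompt, user_input,
# messages, and agent is not shown in this excerpt.
from temporalio import activity
import scale_gp_beta.lib.tracing as tracing
from agents import Runner

@activity.defn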
async def research_task_activity(
    task_info: TaskInfo,
    model: str,
    task_id: str,
    parent_span_id: str,
    num_passes: int = 2,
) -> ResearchOutput:
    span_name = f"research_task_activity_{task_info.section_title.lower().replace(' ', '_')}"

    # Level 1: Activity-level span
    with tracing.create_span(
        name=span_name,
        trace_id=task_id,
        parent_id=parent_span_id,
        input={
            "system_prompt": system_prompt,
            "user_input": user_input,
            "num_passes": num_passes,
        },
    ) as span:

        # Level 2: Per-pass spans
        for pass_num in range(num_passes):
            with tracing.create_span(
                name=f"research_pass_{pass_num + 1}",
                trace_id=task_id,
                parent_id=span.span_id,
                input={
                    "pass_num": pass_num + 1,
                    "is_refinement": pass_num > 0,
                    "message_count": len(messages),
                },
            ) as pass_span:

                # Level 3: Auto-traced LLM/tool spans via stream processor
                processor = AgentsStreamProcessor(
                    task_id,
                    pass_span.span_id,
                    f"{task_info.section_title} (pass {pass_num + 1})",
                )

                result = Runner.run_streamed(agent, messages)
                async for event in result.stream_events():
                    await processor.process_event(event)

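                # pass_output is assumed to hold the text produced by this pass;
                # how it is built is not shown in this excerpt.
                # Set the output while the span is still open.
                pass_span.output = {
                    "output_length": len(pass_output),
                    "success": True,
                }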

This produces a clean, nested trace:

research_task_activity_market_analysis
├── research_pass_1
│   ├── llm:o3 (auto-traced)
│   ├── tool:web_search (auto-traced)
│   └── llm:o3 (auto-traced)
└── research_pass_2
    ├── llm:o3 (auto-traced)
    └── llm:o3 (auto-traced)

Checklist

Must-Do

Set trace_id = task_id consistently across all spans
Create turn-level spans wrapping each conversation turn
Pass parent_span_id to all ADK provider methods, and parent_id to manual child spans
Set span.output before the span closes (inside the context manager)
Use descriptive, parseable span names (turn:1, tool:search, llm:gpt-4o)

Should-Do

Include relevant context in span input, not just raw data but structured metadata
Set __span_type__ in data for proper SGP categorization
Handle errors gracefully by setting error output before the span closes
Use nested spans for multi-step operations (research passes, sub-tasks)
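The error-handling item can be sketched as follows: record the failure on the span inside the context manager, then re-raise so the caller still sees the exception. `traced_call` and `RecordedSpan` are hypothetical stand-ins used so the example runs on its own; with a real span the same try/except would sit inside the `with` or `async with` block.

```python
from contextlib import contextmanager
from dataclasses import dataclass
from typing import Any, Callable, Iterator


@dataclass
class RecordedSpan:
    name: str
    output: Any = None


closed_spans: list[RecordedSpan] = []  # stands in for spans flushed to the backend


@contextmanager
def recorded_span(name: str) -> Iterator[RecordedSpan]:
    span = RecordedSpan(name)
    try:
        yield span
    finally:
        closed_spans.append(span)  # whatever output is set by now is what gets stored


def traced_call(name: str, fn: Callable[..., Any], **kwargs: Any) -> Any:
    with recorded_span(f"tool:{name}") as span:
        try:
            result = fn(**kwargs)
        except Exception as exc:
            # Set the error output before the span closes, then propagate.
            span.output = {"error": type(exc).__name__, "message": str(exc)}
            raise
        span.output = {"result": result}
        return result
```

Because the assignment happens before the `with` block exits, the stored span carries the error details even though the exception continues up the stack.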

Avoid

Don’t create flat traces where everything sits at the root level. Use parent_id to build a hierarchy
Don’t put massive payloads in input/output. Summarize or truncate large data
Don’t forget to check if span: before setting attributes (span is None when trace_id is falsy)
Don’t create spans without trace_id. They’ll be orphaned and invisible in the UI
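For the payload-size item above, one simple approach is to pass span input and output through a size guard. The sketch below is illustrative: `summarize_payload` is a hypothetical helper, and the 500-character default is an arbitrary threshold, not an SGP limit.

```python
import json
from typing import Any


def summarize_payload(value: Any, max_chars: int = 500) -> Any:
    """Return the value unchanged if small, else a truncated preview with size metadata."""
    text = value if isinstance(value, str) else json.dumps(value, default=str)
    if len(text) <= max_chars:
        return value
    return {
        "truncated": True,
        "original_chars": len(text),
        "preview": text[:max_chars],
    }
```

Keeping `original_chars` alongside the preview preserves a useful signal (how large the real payload was) without storing the payload itself.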
