PageIndex Integration

PageIndex is a vectorless, reasoning-based RAG system that navigates a document tree to find relevant content. Unlike embedding-based retrieval, it performs a tree traversal in which each node selection is a distinct decision point.
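
To make the idea of "each node selection is a decision point" concrete, here is a toy sketch of tree navigation. The `Node` class, the keyword-overlap `score` heuristic, and `navigate` are hypothetical stand-ins for illustration only. PageIndex selects nodes by reasoning on the server side, and nothing below reflects its actual implementation.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """Hypothetical document-tree node (not a PageIndex type)."""
    title: str
    children: List["Node"] = field(default_factory=list)


def score(query: str, node: Node) -> int:
    """Toy relevance: number of query words that appear in the node title."""
    title_words = set(node.title.lower().split())
    return sum(1 for word in query.lower().split() if word in title_words)


def navigate(root: Node, query: str) -> List[str]:
    """Walk from the root to a leaf, recording every node selection."""
    path = [root.title]
    current = root
    while current.children:
        # One loop iteration is one decision point: pick the best-scoring child.
        current = max(current.children, key=lambda child: score(query, child))
        path.append(current.title)
    return path


doc = Node("Annual Report", [
    Node("Financial Results", [Node("Revenue by Region"), Node("Operating Costs")]),
    Node("Risk Factors", [Node("Market Risk"), Node("Regulatory Risk")]),
])

path = navigate(doc, "regulatory risk exposure")
print(" > ".join(path))
```

Each entry in `path` corresponds to one selection, which is the kind of step an observability layer would want to record.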

Briefcase provides two components for PageIndex observability:

PageIndexTracer — wraps PageIndexClient directly; captures chat_completions calls and enriches records with tree structure metadata
PageIndexMCPObserver — post-processes MCP tool call records when PageIndex is accessed via Model Context Protocol

Installation

pip install "briefcase-ai[pageindex]"   # quotes keep shells such as zsh from expanding the brackets

PageIndexMCPObserver does not require the pageindex package.

PageIndexTracer

Constructor

from briefcase.integrations.frameworks import PageIndexTracer

tracer = PageIndexTracer(
    api_key="your-pageindex-api-key",   # used if client not provided
    context_version=None,               # version tag for all records
    async_capture=True,                 # export in background thread
    client=None,                        # existing PageIndexClient (overrides api_key)
    fetch_tree_metadata=True,           # call get_tree() after each chat_completions
)

| Parameter | Default | Description |
| --- | --- | --- |
| `api_key` | `None` | PageIndex API key. Creates a `PageIndexClient` internally. |
| `context_version` | `None` | Version tag added to every decision record. |
| `async_capture` | `True` | Export to `BriefcaseConfig.exporter` in a daemon thread. |
| `client` | `None` | Supply an existing `PageIndexClient` (takes precedence over `api_key`). |
| `fetch_tree_metadata` | `True` | Call `get_tree()` after `chat_completions` to compute depth/path. |
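
As a rough illustration of what exporting "in a daemon thread" can look like, the sketch below hands records to a background worker through a queue. `BackgroundExporter`, `submit`, and `flush` are hypothetical names invented for this example. How Briefcase implements `async_capture` internally is not described here, so treat this as a generic pattern rather than the library's code.

```python
import queue
import threading
from typing import Any, Callable, Dict, List


class BackgroundExporter:
    """Hypothetical helper: exports records on a daemon worker thread."""

    def __init__(self, export: Callable[[Dict[str, Any]], None]) -> None:
        self._export = export
        self._queue: "queue.Queue[Dict[str, Any]]" = queue.Queue()
        # daemon=True means the worker never blocks interpreter shutdown.
        threading.Thread(target=self._run, daemon=True).start()

    def submit(self, record: Dict[str, Any]) -> None:
        """Queue a record and return immediately."""
        self._queue.put(record)

    def flush(self) -> None:
        """Block until every queued record has been exported."""
        self._queue.join()

    def _run(self) -> None:
        while True:
            record = self._queue.get()
            try:
                self._export(record)
            finally:
                self._queue.task_done()


exported: List[Dict[str, Any]] = []
exporter = BackgroundExporter(exported.append)
exporter.submit({"decision_id": "demo-1"})
exporter.flush()
print(exported)
```

One general property of daemon threads is worth keeping in mind: work still queued when the interpreter exits is dropped, so short-lived scripts using any background export pattern may need an explicit flush or synchronous mode.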

Basic Usage

from briefcase.integrations.frameworks import PageIndexTracer

tracer = PageIndexTracer(api_key="pi-key-abc123", context_version="v2.1")

response = tracer.chat_completions(
    messages=[{"role": "user", "content": "What is the capital of France?"}],
    doc_id="pi-abc123",
)

print(response["choices"][0]["message"]["content"])

# Inspect captured records
for record in tracer.get_records():
    print(f"doc_id: {record['pageindex.doc_id']}")
    print(f"depth: {record['pageindex.tree.depth']}")
    print(f"nodes_visited: {record['pageindex.tree.nodes_visited']}")
    print(f"path: {record['pageindex.tree.path']}")
    print(f"execution_time_ms: {record['execution_time_ms']:.1f}")
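
Once records are captured, they are plain dictionaries and can be aggregated with standard tooling. The sketch below groups records by `pageindex.doc_id` and reports call counts and mean latency. The `summarize` helper and the sample data are assumptions for illustration. Only the two record keys it reads come from the record format documented on this page.

```python
from collections import defaultdict
from statistics import mean
from typing import Any, Dict, List


def summarize(records: List[Dict[str, Any]]) -> Dict[str, Dict[str, float]]:
    """Group records by pageindex.doc_id; report call count and mean latency."""
    by_doc: Dict[str, List[float]] = defaultdict(list)
    for record in records:
        by_doc[record["pageindex.doc_id"]].append(record["execution_time_ms"])
    return {
        doc_id: {"calls": len(times), "mean_ms": mean(times)}
        for doc_id, times in by_doc.items()
    }


# Hand-written sample data standing in for tracer.get_records().
sample_records = [
    {"pageindex.doc_id": "pi-abc123", "execution_time_ms": 800.0},
    {"pageindex.doc_id": "pi-abc123", "execution_time_ms": 900.0},
    {"pageindex.doc_id": "doc-789", "execution_time_ms": 500.0},
]

summary = summarize(sample_records)
print(summary)
```

In real use, the list passed to `summarize` would be the return value of `tracer.get_records()`.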

Using an Existing Client

from briefcase.integrations.frameworks import PageIndexTracer
from pageindex import PageIndexClient

client = PageIndexClient(api_key="pi-key-abc123")
tracer = PageIndexTracer(client=client, context_version="prod-v1")

response = tracer.chat_completions(
    messages=[{"role": "user", "content": "Explain section 3.2"}],
    doc_id="doc-789",
)

get_tree() Pass-Through

tree = tracer.get_tree("doc-789")
print(tree)  # {"tree": {"title": "...", "nodes": [...]}}

Public API

tracer.get_records() -> List[Dict[str, Any]]   # all captured records
tracer.clear()                                  # reset captured records

PageIndex Decision Record

Every chat_completions call produces a record with these fields:

{
    "decision_id": "uuid-...",
    "decision_type": "pageindex_retrieval",
    "function_name": "PageIndexTracer.chat_completions",
    "inputs": {
        "messages": [...],
        "doc_id": "pi-abc123"
    },
    "outputs": {
        "content": "Paris is the capital of France."
    },
    "started_at": "2026-02-26T10:00:00Z",
    "ended_at": "2026-02-26T10:00:00.843Z",
    "execution_time_ms": 843.2,
    "context_version": "v2.1",

    # PageIndex-specific attributes
    "pageindex.doc_id": "pi-abc123",
    "pageindex.retrieval_method": "tree_search",
    "pageindex.tree.depth": 3,
    "pageindex.tree.nodes_visited": 12,
    "pageindex.tree.path": "root > Chapter 1 > Section 1.2 > ... (4 more)",
    "pageindex.tree.backtrack_count": 0
}

Note on nodes_visited: This is the total node count of the fetched tree, used as an upper-bound proxy for traversal. PageIndex does not expose per-query traversal paths via the API.
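
The following sketch shows why a whole-tree node count is only an upper bound. It assumes the tree returned by `get_tree()` nests children under a `nodes` key at every level, matching the `{"tree": {"title": ..., "nodes": [...]}}` shape shown earlier. The helper names and the convention that the root sits at depth 0 are assumptions, not the tracer's documented algorithm.

```python
from typing import Any, Dict


def count_nodes(node: Dict[str, Any]) -> int:
    """Count this node plus all of its descendants."""
    return 1 + sum(count_nodes(child) for child in node.get("nodes", []))


def max_depth(node: Dict[str, Any]) -> int:
    """Depth of the deepest descendant, with the root at depth 0 (assumed)."""
    children = node.get("nodes", [])
    if not children:
        return 0
    return 1 + max(max_depth(child) for child in children)


# Hand-written sample in the assumed get_tree() shape.
sample = {
    "tree": {
        "title": "root",
        "nodes": [
            {"title": "Chapter 1", "nodes": [
                {"title": "Section 1.1", "nodes": []},
                {"title": "Section 1.2", "nodes": []},
            ]},
            {"title": "Chapter 2", "nodes": []},
        ],
    }
}

root = sample["tree"]
print(count_nodes(root), max_depth(root))
```

A query answered entirely from "Chapter 2" would still be reported with a count of 5 under this scheme, because the count describes the tree and not the query.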

Note on backtrack_count: Always 0, because backtracking happens server-side and is not exposed in the API response.

PageIndexMCPObserver

When a LangChain agent or OpenAI agent uses PageIndex as an MCP tool, the tool call appears in handler records with an opaque JSON output string. PageIndexMCPObserver parses that output and adds pageindex.* attributes in-place.
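
The sketch below illustrates the general idea of in-place enrichment: parse the JSON string, then copy recognized fields onto the same dictionary under `pageindex.*` keys. The `enrich_in_place` function, the `output` and `tool_name` record keys, and the sample payload are all assumptions made for this example. The real observer's record layout and the full set of attributes it adds are not specified here.

```python
import json
from typing import Any, Dict


def enrich_in_place(record: Dict[str, Any]) -> bool:
    """Parse record["output"] (assumed key) and add pageindex.doc_id if present."""
    try:
        payload = json.loads(record.get("output", ""))
    except (TypeError, ValueError):
        # Output is missing, not a string, or not valid JSON: leave record as-is.
        return False
    if not isinstance(payload, dict) or "doc_id" not in payload:
        return False
    record["pageindex.doc_id"] = payload["doc_id"]  # mutate the caller's dict
    return True


# Hypothetical handler record with an opaque JSON output string.
record = {
    "tool_name": "pageindex_chat",
    "output": '{"doc_id": "pi-abc123", "answer": "..."}',
}

changed = enrich_in_place(record)
print(changed, record["pageindex.doc_id"])
```

Returning a boolean while mutating the argument mirrors the `observe()` contract described on this page: callers can branch on the result and read the new keys from the same object.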

Constructor

from briefcase.integrations.frameworks import PageIndexMCPObserver

observer = PageIndexMCPObserver()
# No configuration needed. No pageindex package required.

Detection Logic

The observer identifies PageIndex MCP records by checking, in order:

1. Tool/function name contains any of: pageindex, page_index, pi_search, pi_chat, pi_retrieve (case-insensitive)
2. The output JSON contains doc_id or retrieval_id keys
3. The output JSON contains a nodes array at root level
4. The output JSON contains a tree key with a dict value
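
The checks above can be sketched as a single function that returns on the first match. `looks_like_pageindex` and its two-argument signature are hypothetical. The real observer works on whole records through `is_pageindex_mcp_response(record)`, and its handling of malformed output is not documented here, so the fallback to `False` is an assumption.

```python
import json

# Name fragments taken from the detection list; matching is case-insensitive.
NAME_MARKERS = ("pageindex", "page_index", "pi_search", "pi_chat", "pi_retrieve")


def looks_like_pageindex(tool_name: str, output: str) -> bool:
    """Hypothetical re-statement of the four checks, applied in order."""
    name = tool_name.lower()
    if any(marker in name for marker in NAME_MARKERS):
        return True  # check 1: tool/function name
    try:
        payload = json.loads(output)
    except (TypeError, ValueError):
        return False  # assumed behavior for non-JSON output
    if not isinstance(payload, dict):
        return False
    if "doc_id" in payload or "retrieval_id" in payload:
        return True  # check 2: identifying keys
    if isinstance(payload.get("nodes"), list):
        return True  # check 3: root-level nodes array
    return isinstance(payload.get("tree"), dict)  # check 4: tree dict


print(looks_like_pageindex("PageIndex_Search", "not json"))
print(looks_like_pageindex("web_search", '{"tree": {"title": "x"}}'))
print(looks_like_pageindex("web_search", '{"tree": "x"}'))
```

In this sketch, substring matching means a tool named `api_search` also matches the `pi_search` marker. Whether the real observer shares that behavior is not stated above, so it is worth verifying if unrelated tools have similar names.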

Usage with LangChain Handler

from briefcase.integrations.frameworks import (
    BriefcaseLangChainHandler,
    PageIndexMCPObserver,
)

handler = BriefcaseLangChainHandler(engagement_id="my-project")
observer = PageIndexMCPObserver()

# After running the chain...
for record in handler.get_decisions_as_dicts():
    enriched = observer.observe(record)  # mutates record in-place
    if enriched:
        print(f"PageIndex call: doc={record['pageindex.doc_id']}")

print(f"Observed: {observer.observed_count}, Enriched: {observer.enriched_count}")

Usage with OpenAI Agents Tracer

from briefcase.integrations.frameworks import OpenAIAgentsTracer, PageIndexMCPObserver

tracer = OpenAIAgentsTracer()
observer = PageIndexMCPObserver()

# After the agent run...
for trace in tracer.get_records():
    for span in trace.get("spans", []):
        if observer.observe(span):
            print(f"PageIndex span: doc={span['pageindex.doc_id']}")

Public API

observer.observe(record: Dict) -> bool      # True if enriched
observer.is_pageindex_mcp_response(record)  # check without mutating
observer.observed_count                     # total records seen
observer.enriched_count                     # total records enriched

Choosing PageIndexTracer vs PageIndexMCPObserver

| Scenario | Use |
| --- | --- |
| Direct PageIndex SDK calls | `PageIndexTracer` |
| PageIndex accessed as LangChain tool via MCP | `PageIndexMCPObserver` on LangChain records |
| PageIndex accessed by an OpenAI agent via MCP | `PageIndexMCPObserver` on tracer spans |
| Both (mixed architecture) | Both |
