PageIndex Integration
PageIndex is a vectorless reasoning-based RAG system that navigates document trees to find relevant content. Unlike embedding-based retrieval, PageIndex performs tree traversal — each node selection is a distinct decision point.
Briefcase provides two components for PageIndex observability:
- PageIndexTracer: wraps PageIndexClient directly; captures chat_completions calls and enriches records with tree structure metadata
- PageIndexMCPObserver: post-processes MCP tool call records when PageIndex is accessed via Model Context Protocol
Installation
pip install briefcase-ai[pageindex]
PageIndexMCPObserver does not require the pageindex package.
PageIndexTracer
Constructor
from briefcase.integrations.frameworks import PageIndexTracer
tracer = PageIndexTracer(
    api_key="your-pageindex-api-key",  # used if client not provided
    context_version=None,              # version tag for all records
    async_capture=True,                # export in background thread
    client=None,                       # existing PageIndexClient (overrides api_key)
    fetch_tree_metadata=True,          # call get_tree() after each chat_completions
)
| Parameter | Default | Description |
|---|---|---|
| api_key | None | PageIndex API key. Creates a PageIndexClient internally. |
| context_version | None | Version tag added to every decision record. |
| async_capture | True | Export to BriefcaseConfig.exporter in a daemon thread. |
| client | None | Supply an existing PageIndexClient (takes precedence over api_key). |
| fetch_tree_metadata | True | Call get_tree() after chat_completions to compute depth/path. |
Basic Usage
tracer = PageIndexTracer(api_key="pi-key-abc123", context_version="v2.1")
response = tracer.chat_completions(
    messages=[{"role": "user", "content": "What is the capital of France?"}],
    doc_id="pi-abc123",
)
print(response["choices"][0]["message"]["content"])
# Inspect captured records
for record in tracer.get_records():
    print(f"doc_id: {record['pageindex.doc_id']}")
    print(f"depth: {record['pageindex.tree.depth']}")
    print(f"nodes_visited: {record['pageindex.tree.nodes_visited']}")
    print(f"path: {record['pageindex.tree.path']}")
    print(f"execution_time_ms: {record['execution_time_ms']:.1f}")
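Since every record carries both context_version and execution_time_ms, captured records can be compared across versions. The following is a minimal sketch, not part of Briefcase: the helper name `mean_latency_by_version` and the "untagged" fallback label are assumptions for illustration, and the sample records are hand-written rather than real output.

```python
from collections import defaultdict
from statistics import mean
from typing import Any, Dict, Iterable


def mean_latency_by_version(records: Iterable[Dict[str, Any]]) -> Dict[str, float]:
    """Group records by context_version and average execution_time_ms."""
    buckets = defaultdict(list)
    for record in records:
        # Records created without a version tag fall into one shared bucket.
        version = record.get("context_version") or "untagged"
        buckets[version].append(record["execution_time_ms"])
    return {version: mean(times) for version, times in buckets.items()}


# Hand-written sample records; in practice pass tracer.get_records().
sample = [
    {"context_version": "v2.0", "execution_time_ms": 900.0},
    {"context_version": "v2.1", "execution_time_ms": 800.0},
    {"context_version": "v2.1", "execution_time_ms": 600.0},
]
summary = mean_latency_by_version(sample)
print(summary)  # {'v2.0': 900.0, 'v2.1': 700.0}
```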
Using an Existing Client
from pageindex import PageIndexClient
client = PageIndexClient(api_key="pi-key-abc123")
tracer = PageIndexTracer(client=client, context_version="prod-v1")
response = tracer.chat_completions(
    messages=[{"role": "user", "content": "Explain section 3.2"}],
    doc_id="doc-789",
)
get_tree() Pass-Through
tree = tracer.get_tree("doc-789")
print(tree) # {"tree": {"title": "...", "nodes": [...]}}
Public API
tracer.get_records() -> List[Dict[str, Any]] # all captured records
tracer.clear() # reset captured records
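Because get_records() returns everything captured so far and clear() resets the buffer, a long-running process may want to drain the tracer periodically. The helper below is a minimal sketch under that assumption: `drain_records` is a hypothetical name, not a Briefcase function, and it relies only on the two methods listed above. The `_FakeTracer` class is a stand-in so the snippet runs without the pageindex package.

```python
from typing import Any, Dict, List


def drain_records(tracer: Any) -> List[Dict[str, Any]]:
    """Return a copy of the captured records, then reset the tracer.

    Hypothetical helper: relies only on get_records() and clear().
    """
    records = list(tracer.get_records())  # copy before clearing
    tracer.clear()
    return records


class _FakeTracer:
    """Stand-in exposing the same two methods, for illustration only."""

    def __init__(self, records):
        self._records = list(records)

    def get_records(self):
        return self._records

    def clear(self):
        self._records = []


fake = _FakeTracer([{"pageindex.doc_id": "doc-789", "execution_time_ms": 843.2}])
batch = drain_records(fake)
```

Records captured between the two calls inside the helper would be lost, so it is best run when no chat_completions calls are in flight.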
PageIndex Decision Record
Every chat_completions call produces a record with these fields:
{
    "decision_id": "uuid-...",
    "decision_type": "pageindex_retrieval",
    "function_name": "PageIndexTracer.chat_completions",
    "inputs": {
        "messages": [...],
        "doc_id": "pi-abc123"
    },
    "outputs": {
        "content": "Paris is the capital of France."
    },
    "started_at": "2026-02-26T10:00:00Z",
    "ended_at": "2026-02-26T10:00:00.843Z",
    "execution_time_ms": 843.2,
    "context_version": "v2.1",
    # PageIndex-specific attributes
    "pageindex.doc_id": "pi-abc123",
    "pageindex.retrieval_method": "tree_search",
    "pageindex.tree.depth": 3,
    "pageindex.tree.nodes_visited": 12,
    "pageindex.tree.path": "root > Chapter 1 > Section 1.2 > ... (4 more)",
    "pageindex.tree.backtrack_count": 0
}
Note on nodes_visited: This is the total node count of the fetched tree,
used as an upper-bound proxy for traversal. PageIndex does not expose
per-query traversal paths via the API.
Note on backtrack_count: Always 0 — backtracking is server-side only
and not exposed in the API response.
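The notes above imply that the tree attributes are derived from the tree returned by get_tree() rather than from the query itself. A rough sketch of such a computation follows. It assumes the {"tree": {"title": ..., "nodes": [...]}} shape shown earlier, with child nodes nested under "nodes"; the function name `tree_stats` and the convention that the root sits at depth 0 are assumptions for illustration, not Briefcase's actual implementation.

```python
from typing import Any, Dict, Tuple


def tree_stats(node: Dict[str, Any]) -> Tuple[int, int]:
    """Return (max_depth, node_count) for a nested tree node.

    Assumes children live under the "nodes" key; the root is depth 0.
    """
    children = node.get("nodes") or []
    max_depth, count = 0, 1  # count this node itself
    for child in children:
        child_depth, child_count = tree_stats(child)
        max_depth = max(max_depth, child_depth + 1)
        count += child_count
    return max_depth, count


sample = {
    "tree": {
        "title": "root",
        "nodes": [
            {"title": "Chapter 1", "nodes": [{"title": "Section 1.1", "nodes": []}]},
            {"title": "Chapter 2", "nodes": []},
        ],
    }
}
depth, total = tree_stats(sample["tree"])
print(depth, total)  # 2 4
```

The node count here covers the whole document tree, which is why it can only serve as an upper bound on what a single query touched.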
PageIndexMCPObserver
When a LangChain agent or OpenAI agent uses PageIndex as an MCP tool, the
tool call appears in handler records with an opaque JSON output string.
PageIndexMCPObserver parses that output and adds pageindex.* attributes
in-place.
Constructor
from briefcase.integrations.frameworks import PageIndexMCPObserver
observer = PageIndexMCPObserver()
# No configuration needed. No pageindex package required.
Detection Logic
The observer identifies PageIndex MCP records by (in order):
- Tool/function name contains any of: pageindex, page_index, pi_search, pi_chat, pi_retrieve (case-insensitive)
- The output JSON contains doc_id or retrieval_id keys
- The output JSON contains a nodes array at root level
- The output JSON contains a tree key with a dict value
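The checks above can be sketched as a single function. This is an illustration of the described heuristics, not the observer's source: the name `looks_like_pageindex`, its two-argument signature, and the choice to stop at the first matching check are assumptions.

```python
import json
from typing import Any

NAME_HINTS = ("pageindex", "page_index", "pi_search", "pi_chat", "pi_retrieve")


def looks_like_pageindex(tool_name: str, output: Any) -> bool:
    """Apply the four checks in order; return True on the first match."""
    # 1. Tool/function name contains a known hint (case-insensitive).
    lowered = (tool_name or "").lower()
    if any(hint in lowered for hint in NAME_HINTS):
        return True

    # The remaining checks need the output parsed as a JSON object.
    if isinstance(output, str):
        try:
            output = json.loads(output)
        except ValueError:  # json.JSONDecodeError is a ValueError subclass
            return False
    if not isinstance(output, dict):
        return False

    # 2. doc_id or retrieval_id key present.
    if "doc_id" in output or "retrieval_id" in output:
        return True
    # 3. Root-level "nodes" array.
    if isinstance(output.get("nodes"), list):
        return True
    # 4. "tree" key holding a dict value.
    return isinstance(output.get("tree"), dict)
```

For example, `looks_like_pageindex("web_search", '{"doc_id": "pi-abc123"}')` matches on the second check even though the tool name gives no hint.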
Usage with LangChain Handler
from briefcase.integrations.frameworks import (
    BriefcaseLangChainHandler,
    PageIndexMCPObserver,
)
handler = BriefcaseLangChainHandler(engagement_id="my-project")
observer = PageIndexMCPObserver()
# After running the chain...
for record in handler.get_decisions_as_dicts():
    enriched = observer.observe(record)  # mutates record in-place
    if enriched:
        print(f"PageIndex call: doc={record['pageindex.doc_id']}")
print(f"Observed: {observer.observed_count}, Enriched: {observer.enriched_count}")
Usage with OpenAI Agents Tracer
from briefcase.integrations.frameworks import OpenAIAgentsTracer, PageIndexMCPObserver
tracer = OpenAIAgentsTracer()
observer = PageIndexMCPObserver()
# After the agent run...
for trace in tracer.get_records():
    for span in trace.get("spans", []):
        if observer.observe(span):
            print(f"PageIndex span: doc={span['pageindex.doc_id']}")
Public API
observer.observe(record: Dict) -> bool # True if enriched
observer.is_pageindex_mcp_response(record) # check without mutating
observer.observed_count # total records seen
observer.enriched_count # total records enriched
Choosing PageIndexTracer vs PageIndexMCPObserver
| Scenario | Use |
|---|---|
| Direct PageIndex SDK calls | PageIndexTracer |
| PageIndex accessed as LangChain tool via MCP | PageIndexMCPObserver on LangChain records |
| PageIndex accessed by an OpenAI agent via MCP | PageIndexMCPObserver on tracer spans |
| Both (mixed architecture) | Both |
See Also
- Integrations Overview
- LangChain Integration — used together with MCP observer
- OpenAI Agents Integration — used together with MCP observer