Getting Started with the ipto.ai API
A practical guide to integrating the ipto.ai retrieval API with your AI agent. Authentication, first query, parsing retrieval units, and handling structured responses.
By ipto.ai Research
Overview
The ipto.ai retrieval API gives your AI agent programmatic access to private data — structured, permissioned, and priced for machine consumption. This guide walks through authentication, your first query, and how to parse the structured retrieval units that come back.
If you are unfamiliar with retrieval units as a concept, read What Are Agent-Consumable Retrieval Units? first. This guide assumes you are building or integrating an agent that needs to consume private data at runtime.
Step 1: Get your API key
API keys are managed through the admin portal at admin.ipto.ai. Each key is scoped to one or more data tenants and carries the access policies configured for your organization.
- Sign in to admin.ipto.ai
- Navigate to Settings > API Keys
- Click Generate Key and select the tenant scopes your agent requires
- Copy the key immediately — it is displayed only once
Store the key in your environment variables. Never hard-code it in source files.
export IPTO_API_KEY="ik_live_a1b2c3d4e5f6..."
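One way to read the key at startup is sketched below. It assumes the IPTO_API_KEY variable name from the export above; the helper name load_api_key is hypothetical and not part of any ipto.ai SDK.

```python
import os


def load_api_key(env=os.environ, var_name="IPTO_API_KEY"):
    """Return the API key from the environment, failing fast if it is absent.

    Illustrative helper only; the name and behavior are assumptions.
    """
    key = env.get(var_name, "").strip()
    if not key:
        raise RuntimeError(
            f"{var_name} is not set; export it before starting the agent."
        )
    return key
```

Failing at startup, rather than on the first request, tends to surface a missing key before the agent is partway through a workflow.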
Step 2: Make your first retrieval query
The retrieval endpoint accepts a natural language query and returns structured retrieval units from the private data sources your key has access to. The base URL is https://api.ipto.ai.
Using curl:
curl -X POST https://api.ipto.ai/v1/retrieve \
  -H "Authorization: Bearer $IPTO_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "query": "What are the quarterly disclosure obligations for vendor contracts?",
    "top_k": 5,
    "min_confidence": 0.8,
    "rerank": true
  }'
Using Python:
import os
import requests
response = requests.post(
    "https://api.ipto.ai/v1/retrieve",
    headers={
        "Authorization": f"Bearer {os.environ['IPTO_API_KEY']}",
        "Content-Type": "application/json",
    },
    json={
        "query": "What are the quarterly disclosure obligations for vendor contracts?",
        "top_k": 5,
        "min_confidence": 0.8,
        "rerank": True,
    },
)
data = response.json()
The top_k parameter controls how many retrieval units are returned. The min_confidence threshold filters out low-confidence extractions. Setting rerank to true enables a second-pass ranking model for higher relevance at the cost of slight additional latency.
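The three parameters can be assembled by a small helper, sketched here. The function name build_retrieve_payload and its range checks are assumptions made for illustration; this guide does not state the API's actual limits.

```python
def build_retrieve_payload(query, top_k=5, min_confidence=0.8, rerank=True):
    """Build the JSON body for POST /v1/retrieve.

    The range checks below are assumptions for this sketch, not documented
    API limits.
    """
    if not query or not query.strip():
        raise ValueError("query must be a non-empty string")
    if top_k < 1:
        raise ValueError("top_k must be at least 1")
    if not 0.0 <= min_confidence <= 1.0:
        raise ValueError("min_confidence must be between 0 and 1")
    return {
        "query": query.strip(),
        "top_k": int(top_k),
        "min_confidence": float(min_confidence),
        "rerank": bool(rerank),
    }


payload = build_retrieve_payload(
    "What are the quarterly disclosure obligations for vendor contracts?",
    top_k=3,
    rerank=False,  # skip the second-pass ranker when latency matters more
)
```

The resulting dictionary can be passed as the json argument of requests.post in place of the inline literal.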
Step 3: Understand the retrieval unit response
The API returns a JSON response containing an array of retrieval units. Each unit is a self-contained data object with content, structured facts, provenance, and economic metadata.
{
  "request_id": "req_7f3a9b2e",
  "latency_ms": 342,
  "units": [
    {
      "chunk_id": "ru_4a8c1e2f",
      "tenant_id": "tenant_acme_corp",
      "modality": "document",
      "text": "Vendor partners are required to submit quarterly disclosure reports within 30 calendar days of each quarter close, per section 4.2 of the compliance handbook.",
      "structured_facts": [
        {
          "entity": "quarterly_disclosure_report",
          "type": "obligation",
          "frequency": "quarterly",
          "deadline_days": 30,
          "reference": "section 4.2",
          "confidence": 0.94
        }
      ],
      "provenance": {
        "source": "compliance_handbook_v3.pdf",
        "page": 42,
        "section": "4.2",
        "last_updated": "2026-02-15T00:00:00Z",
        "hash": "sha256:9f86d081..."
      },
      "confidence": 0.94,
      "freshness": "current",
      "access_policy": {
        "permitted_uses": ["internal_analysis", "agent_workflow"],
        "citation_allowed": true
      },
      "price_per_retrieval": 0.003,
      "citation_premium": 0.001
    }
  ],
  "total_units": 1,
  "billing": {
    "total_cost": 0.003,
    "currency": "USD"
  }
}
Every retrieval unit includes the fields your agent needs to evaluate quality, verify the source, respect permissions, and account for cost — all in a single response.
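A minimal sketch of flattening that response into the fields an agent typically checks first is shown below. The UnitSummary class and summarize_units function are hypothetical names introduced for illustration; the field names come from the sample response above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UnitSummary:
    chunk_id: str
    text: str
    confidence: float
    source: str
    citation_allowed: bool
    price: float


def summarize_units(data):
    """Flatten each retrieval unit into quality, source, permission, and cost."""
    summaries = []
    for unit in data.get("units", []):
        summaries.append(
            UnitSummary(
                chunk_id=unit["chunk_id"],
                text=unit["text"],
                confidence=unit["confidence"],
                source=unit["provenance"]["source"],
                citation_allowed=unit["access_policy"]["citation_allowed"],
                price=unit["price_per_retrieval"],
            )
        )
    return summaries


# Trimmed copy of the sample response above (most fields omitted for brevity)
sample = {
    "units": [
        {
            "chunk_id": "ru_4a8c1e2f",
            "text": "Vendor partners are required to submit quarterly disclosure reports.",
            "provenance": {"source": "compliance_handbook_v3.pdf", "page": 42},
            "confidence": 0.94,
            "access_policy": {"citation_allowed": True},
            "price_per_retrieval": 0.003,
        }
    ]
}

summaries = summarize_units(sample)
retrieval_cost = sum(s.price for s in summaries)
```

Summing price_per_retrieval across units gives a local estimate that can be compared with billing.total_cost; how citation_premium is applied is not specified in this guide, so it is left out of the sum.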
Step 4: Work with structured facts and provenance
The real value of retrieval units is in the structured_facts array. Instead of parsing natural language, your agent consumes typed fields directly.
for unit in data["units"]:
    for fact in unit["structured_facts"]:
        if fact["type"] == "obligation" and fact["confidence"] >= 0.9:
            print(f"Obligation: {fact['entity']}")
            print(f"Frequency: {fact['frequency']}")
            print(f"Deadline: {fact['deadline_days']} days")
            print(f"Source: {unit['provenance']['source']}:{unit['provenance']['page']}")
Provenance fields allow your agent to cite specific sources and verify integrity. The hash field can be used to confirm that the underlying document has not changed since extraction. The last_updated timestamp helps your agent assess data freshness without a separate lookup.
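The two checks described here might look like the sketch below. It rests on an assumption this guide does not confirm: that the hash field is "sha256:" followed by the hex digest of the raw source file bytes. Both function names are hypothetical.

```python
import hashlib
from datetime import datetime, timezone


def matches_provenance_hash(document_bytes, provenance_hash):
    """Compare a local copy of the source against the unit's hash field.

    Assumption: the field is "sha256:" plus the hex digest of the raw source
    bytes. What the hash actually covers is not specified in this guide.
    """
    algorithm, _, expected = provenance_hash.partition(":")
    if algorithm != "sha256" or not expected:
        raise ValueError(f"unsupported hash format: {provenance_hash!r}")
    actual = hashlib.sha256(document_bytes).hexdigest()
    return actual == expected.lower()


def age_in_days(last_updated, now=None):
    """Whole days elapsed since an ISO 8601 UTC last_updated timestamp."""
    # datetime.fromisoformat() rejects a trailing "Z" before Python 3.11
    updated = datetime.fromisoformat(last_updated.replace("Z", "+00:00"))
    now = now or datetime.now(timezone.utc)
    return (now - updated).days


# Usage with locally generated bytes, since the sample hash above is truncated
local_copy = b"example document bytes"
tag = "sha256:" + hashlib.sha256(local_copy).hexdigest()
unchanged = matches_provenance_hash(local_copy, tag)
```

An agent could treat a failed hash match or an age above some threshold as a signal to re-query before relying on the fact.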
When citation is allowed under the access policy, include the provenance in your agent’s output so downstream consumers can trace every fact to its origin document.
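A short sketch of gating citations on the access policy follows. The format_citation helper and the citation layout are illustrative choices, not a format prescribed by the API.

```python
def format_citation(unit):
    """Return a citation string, or None when the access policy forbids citing."""
    policy = unit.get("access_policy", {})
    if not policy.get("citation_allowed", False):
        return None
    prov = unit["provenance"]
    return (
        f"{prov['source']}, p. {prov['page']}, section {prov['section']} "
        f"(updated {prov['last_updated'][:10]})"
    )


# Provenance and policy fields copied from the sample response
unit = {
    "provenance": {
        "source": "compliance_handbook_v3.pdf",
        "page": 42,
        "section": "4.2",
        "last_updated": "2026-02-15T00:00:00Z",
    },
    "access_policy": {"citation_allowed": True},
}
citation = format_citation(unit)
# citation == "compliance_handbook_v3.pdf, p. 42, section 4.2 (updated 2026-02-15)"
```

Returning None when citation is not permitted, and defaulting to not permitted when the policy is missing, keeps the agent from citing a source by accident.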
Step 5: Handle errors
The API uses standard HTTP status codes. Your integration should handle at least the following common error responses:
| Status | Meaning | Action |
|---|---|---|
| 401 | Invalid or expired API key | Regenerate key at admin.ipto.ai |
| 403 | Key lacks access to requested tenant | Check tenant scopes in admin portal |
| 429 | Rate limit exceeded | Back off and retry with exponential delay |
| 422 | Malformed request body | Validate JSON payload against the schema |
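import time

class APIError(Exception):
    """Illustrative error type, defined here so the snippet runs."""

class AuthenticationError(APIError):
    """Raised when the API rejects the key (HTTP 401)."""

if response.status_code == 429:
    # Wait for the server-suggested interval, defaulting to 5 seconds
    retry_after = int(response.headers.get("Retry-After", 5))
    time.sleep(retry_after)
    # Retry the request
elif response.status_code == 401:
    raise AuthenticationError("API key is invalid or expired. Regenerate at admin.ipto.ai.")
elif not response.ok:
    error = response.json()
    raise APIError(f"{error['code']}: {error['message']}")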
All error responses return a JSON body with code and message fields describing the issue. Log these for debugging — they are designed to be informative without exposing internal system details.
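The exponential delay recommended for 429 responses can be sketched as below. The function names, the default delays, and the choice to prefer a numeric Retry-After header over the computed delay are all assumptions for illustration, not documented API behavior.

```python
import time


def backoff_delays(max_retries=4, base=0.5, cap=8.0):
    """Yield exponentially growing delays: base, 2*base, 4*base, ... up to cap."""
    for attempt in range(max_retries):
        yield min(cap, base * (2 ** attempt))


def call_with_retry(send, max_retries=4, sleep=time.sleep):
    """Call send() until it returns a non-429 response or retries run out.

    `send` is any zero-argument callable returning an object with
    `status_code` and `headers`, for example a lambda wrapping requests.post.
    """
    response = send()
    for delay in backoff_delays(max_retries):
        if response.status_code != 429:
            break
        # Prefer the server's hint when it is a plain number of seconds
        hint = response.headers.get("Retry-After", "")
        wait = float(hint) if hint.isdigit() else delay
        sleep(wait)
        response = send()
    return response
```

Passing sleep as a parameter lets tests substitute a recorder for time.sleep. If the last attempt still returns 429, that response is handed back so the caller can raise or log it.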
Next steps
This guide covers the basics. The full API surface includes endpoints for batch retrieval, tenant discovery, usage analytics, and webhook-based data freshness notifications. Explore them in the complete documentation at docs.ipto.ai.
If you are designing your agent’s data layer from scratch, The Data Stack Your AI Agent Actually Needs provides architectural context for where the retrieval API fits in your system.
Key takeaways
- API keys are scoped to tenants and managed at admin.ipto.ai — never hard-code them
- The /v1/retrieve endpoint accepts natural language queries and returns structured retrieval units, not raw text
- Each retrieval unit contains structured facts, provenance, confidence scores, and pricing metadata for direct agent consumption
- Use min_confidence to filter low-quality extractions and rerank for higher relevance in production workflows
- Provenance fields enable source citation and integrity verification without additional lookups
- Handle 401, 403, 429, and 422 errors explicitly in your agent’s integration layer
- Full API documentation is available at docs.ipto.ai
Frequently Asked Questions
How do I authenticate with the ipto.ai API?
Authentication uses API keys passed via the Authorization header as a Bearer token. Generate keys from the admin portal at admin.ipto.ai. Each key is scoped to specific data tenants and access policies.
What format does the ipto.ai retrieval API return?
The API returns structured retrieval units — not raw text chunks. Each unit contains structured_facts (entities, dates, obligations), provenance metadata (source, page, hash), confidence scores, and pricing information. Responses are JSON with typed fields for direct agent consumption.
What is the latency of the ipto.ai retrieval API?
The API targets sub-400ms retrieval latency at p50 and under 1.5s at p95 with reranking enabled. These targets are set with multi-step agent workflows in mind, where per-call latency adds up across steps.