How do I authenticate with the ipto.ai API?

Authentication uses API keys passed via the Authorization header as a Bearer token. Generate keys from the admin portal at admin.ipto.ai. Each key is scoped to specific data tenants and access policies.

What format does the ipto.ai retrieval API return?

The API returns structured retrieval units — not raw text chunks. Each unit contains structured_facts (entities, dates, obligations), provenance metadata (source, page, hash), confidence scores, and pricing information. Responses are JSON with typed fields for direct agent consumption.

What is the latency of the ipto.ai retrieval API?

The API targets sub-400ms retrieval latency at p50 and under 1.5s at p95 with reranking enabled. These targets are set for multi-step agent workflows, where the latency of each retrieval call adds up across the workflow.

Getting Started with the ipto.ai API

Overview

The ipto.ai retrieval API gives your AI agent programmatic access to private data — structured, permissioned, and priced for machine consumption. This guide walks through authentication, your first query, and how to parse the structured retrieval units that come back.

If you are unfamiliar with retrieval units as a concept, read "What Are Agent-Consumable Retrieval Units?" first. This guide assumes you are building or integrating an agent that needs to consume private data at runtime.

Step 1: Get your API key

API keys are managed through the admin portal at admin.ipto.ai. Each key is scoped to one or more data tenants and carries the access policies configured for your organization.

1. Sign in to admin.ipto.ai
2. Navigate to Settings > API Keys
3. Click Generate Key and select the tenant scopes your agent requires
4. Copy the key immediately, because it is displayed only once

Store the key in your environment variables. Never hard-code it in source files.

export IPTO_API_KEY="ik_live_a1b2c3d4e5f6..."
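
On the Python side, one way to pick the key up is to read it once at startup so that a missing variable fails early instead of at the first request. This is a minimal sketch: `load_api_key` and `auth_headers` are hypothetical helper names, not part of any official ipto.ai client.

```python
import os


def load_api_key(env=os.environ, var_name="IPTO_API_KEY"):
    """Return the API key from the environment, or raise if it is missing.

    Hypothetical helper for illustration; not part of an official SDK.
    """
    key = env.get(var_name, "").strip()
    if not key:
        raise RuntimeError(f"{var_name} is not set; export it before starting the agent")
    return key


def auth_headers(api_key):
    """Build the Bearer-token headers described in this guide."""
    return {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
```

Passing `env` explicitly keeps the helper easy to test without touching the real process environment.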

Step 2: Make your first retrieval query

The retrieval endpoint accepts a natural language query and returns structured retrieval units from the private data sources your key has access to. The base URL is https://api.ipto.ai.

Using curl:

curl -X POST https://api.ipto.ai/v1/retrieve \
  -H "Authorization: Bearer $IPTO_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "query": "What are the quarterly disclosure obligations for vendor contracts?",
    "top_k": 5,
    "min_confidence": 0.8,
    "rerank": true
  }'

Using Python:

import requests
import os

response = requests.post(
    "https://api.ipto.ai/v1/retrieve",
    headers={
        "Authorization": f"Bearer {os.environ['IPTO_API_KEY']}",
        "Content-Type": "application/json",
    },
    json={
        "query": "What are the quarterly disclosure obligations for vendor contracts?",
        "top_k": 5,
        "min_confidence": 0.8,
        "rerank": True,
    },
)

data = response.json()

The top_k parameter controls how many retrieval units are returned. The min_confidence threshold filters out low-confidence extractions. Setting rerank to true enables a second-pass ranking model for higher relevance at the cost of slight additional latency.
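
The three parameters above can be checked client-side before a request is sent, which may help avoid a 422 round trip. The sketch below uses a hypothetical helper, `build_retrieve_payload`; the bounds it enforces (a positive `top_k`, a `min_confidence` between 0 and 1) are assumptions inferred from the example values, not documented limits.

```python
def build_retrieve_payload(query, top_k=5, min_confidence=0.8, rerank=True):
    """Validate inputs and return the JSON body for /v1/retrieve.

    The bounds checked here are assumptions for illustration; the API's
    actual limits may differ.
    """
    if not isinstance(query, str) or not query.strip():
        raise ValueError("query must be a non-empty string")
    # bool is a subclass of int in Python, so reject it explicitly
    if not isinstance(top_k, int) or isinstance(top_k, bool) or top_k < 1:
        raise ValueError("top_k must be a positive integer")
    if not 0.0 <= min_confidence <= 1.0:
        raise ValueError("min_confidence must be between 0 and 1")
    return {
        "query": query.strip(),
        "top_k": top_k,
        "min_confidence": float(min_confidence),
        "rerank": bool(rerank),
    }
```

The returned dict can be passed directly as the `json=` argument of `requests.post`.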

Step 3: Understand the retrieval unit response

The API returns a JSON response containing an array of retrieval units. Each unit is a self-contained data object with content, structured facts, provenance, and economic metadata.

{
  "request_id": "req_7f3a9b2e",
  "latency_ms": 342,
  "units": [
    {
      "chunk_id": "ru_4a8c1e2f",
      "tenant_id": "tenant_acme_corp",
      "modality": "document",
      "text": "Vendor partners are required to submit quarterly disclosure reports within 30 calendar days of each quarter close, per section 4.2 of the compliance handbook.",
      "structured_facts": [
        {
          "entity": "quarterly_disclosure_report",
          "type": "obligation",
          "frequency": "quarterly",
          "deadline_days": 30,
          "reference": "section 4.2",
          "confidence": 0.94
        }
      ],
      "provenance": {
        "source": "compliance_handbook_v3.pdf",
        "page": 42,
        "section": "4.2",
        "last_updated": "2026-02-15T00:00:00Z",
        "hash": "sha256:9f86d081..."
      },
      "confidence": 0.94,
      "freshness": "current",
      "access_policy": {
        "permitted_uses": ["internal_analysis", "agent_workflow"],
        "citation_allowed": true
      },
      "price_per_retrieval": 0.003,
      "citation_premium": 0.001
    }
  ],
  "total_units": 1,
  "billing": {
    "total_cost": 0.003,
    "currency": "USD"
  }
}

Every retrieval unit includes the fields your agent needs to evaluate quality, verify the source, respect permissions, and account for cost — all in a single response.
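
If your agent prefers typed objects over nested dicts, the response can be mapped onto a small dataclass. This is a sketch under assumptions: the field names are taken from the sample response above, while `RetrievalUnit`, `parse_units`, and `total_unit_price` are hypothetical names introduced here for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RetrievalUnit:
    """Hypothetical typed view over a subset of one unit's fields."""
    chunk_id: str
    text: str
    confidence: float
    source: str
    page: int
    citation_allowed: bool
    price_per_retrieval: float


def parse_units(payload):
    """Convert the raw response dict into a list of RetrievalUnit objects."""
    units = []
    for raw in payload.get("units", []):
        provenance = raw.get("provenance", {})
        policy = raw.get("access_policy", {})
        units.append(
            RetrievalUnit(
                chunk_id=raw["chunk_id"],
                text=raw["text"],
                confidence=float(raw["confidence"]),
                source=provenance.get("source", ""),
                page=int(provenance.get("page", 0)),
                # Default to False so a missing policy never permits citation
                citation_allowed=bool(policy.get("citation_allowed", False)),
                price_per_retrieval=float(raw.get("price_per_retrieval", 0.0)),
            )
        )
    return units


def total_unit_price(units):
    """Sum per-unit prices, for example to compare with billing.total_cost."""
    return round(sum(u.price_per_retrieval for u in units), 6)
```

Calling `parse_units(response.json())` yields objects whose attributes can be checked by an editor or type checker, and `total_unit_price` gives a quick sanity check against the `billing` block.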

Step 4: Work with structured facts and provenance

The real value of retrieval units is in the structured_facts array. Instead of parsing natural language, your agent consumes typed fields directly.

for unit in data["units"]:
    for fact in unit["structured_facts"]:
        if fact["type"] == "obligation" and fact["confidence"] >= 0.9:
            print(f"Obligation: {fact['entity']}")
            print(f"Frequency: {fact['frequency']}")
            print(f"Deadline: {fact['deadline_days']} days")
            print(f"Source: {unit['provenance']['source']}:{unit['provenance']['page']}")

Provenance fields allow your agent to cite specific sources and verify integrity. The hash field can be used to confirm that the underlying document has not changed since extraction. The last_updated timestamp helps your agent assess data freshness without a separate lookup.
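
The integrity and freshness checks described above might look like the following. This sketch assumes the `hash` value is a SHA-256 digest of the raw source file bytes, written as `sha256:<hex digest>`; the exact hashing scheme is an assumption and should be confirmed against the full documentation. Both function names are hypothetical.

```python
import hashlib
from datetime import datetime, timezone


def matches_provenance_hash(document_bytes, provenance_hash):
    """Return True if the SHA-256 of document_bytes equals the recorded hash.

    Assumes the format "sha256:<hex digest>" over the raw file bytes.
    """
    algorithm, _, expected = provenance_hash.partition(":")
    if algorithm != "sha256" or not expected:
        raise ValueError(f"unsupported hash format: {provenance_hash!r}")
    actual = hashlib.sha256(document_bytes).hexdigest()
    return actual == expected.lower()


def age_in_days(last_updated, now=None):
    """Whole days elapsed since an ISO 8601 timestamp such as 2026-02-15T00:00:00Z."""
    # datetime.fromisoformat rejects a trailing "Z" before Python 3.11, so normalize it
    updated = datetime.fromisoformat(last_updated.replace("Z", "+00:00"))
    now = now or datetime.now(timezone.utc)
    return (now - updated).days
```

An agent could, for instance, skip any unit whose `age_in_days` exceeds a threshold chosen for the workflow, or flag a unit for re-retrieval when its local copy no longer matches the recorded hash.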

When citation is allowed under the access policy, include the provenance in your agent’s output so downstream consumers can trace every fact to its origin document.
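
A citation helper that respects the access policy could be sketched as follows. `format_citation` is a hypothetical name, and the output format is only one possible choice; the field names follow the sample response in Step 3.

```python
def format_citation(unit):
    """Return a citation string, or None if the access policy forbids citing."""
    policy = unit.get("access_policy", {})
    # Treat a missing flag as "not allowed" to stay on the safe side
    if not policy.get("citation_allowed", False):
        return None
    provenance = unit["provenance"]
    parts = [provenance["source"]]
    if "page" in provenance:
        parts.append(f"p. {provenance['page']}")
    if "section" in provenance:
        parts.append(f"section {provenance['section']}")
    return ", ".join(parts)
```

Returning `None` rather than an empty string makes it harder for the calling code to emit a citation by accident when the policy does not permit one.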

Step 5: Handle errors

The API uses standard HTTP status codes. The most common error responses to handle are:

Status	Meaning	Action
401	Invalid or expired API key	Regenerate key at admin.ipto.ai
403	Key lacks access to requested tenant	Check tenant scopes in admin portal
429	Rate limit exceeded	Back off and retry with exponential delay
422	Malformed request body	Validate JSON payload against the schema

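import time


class AuthenticationError(Exception):
    """Raised when the API key is rejected (HTTP 401)."""


class APIError(Exception):
    """Raised for any other unsuccessful response."""


if response.status_code == 429:
    # Wait for the interval the server suggests, defaulting to 5 seconds
    retry_after = int(response.headers.get("Retry-After", 5))
    time.sleep(retry_after)
    # Retry the request
elif response.status_code == 401:
    raise AuthenticationError("API key is invalid or expired. Regenerate at admin.ipto.ai.")
elif not response.ok:
    # Error bodies carry code and message fields
    error = response.json()
    raise APIError(f"{error['code']}: {error['message']}")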

All error responses return a JSON body with code and message fields describing the issue. Log these for debugging — they are designed to be informative without exposing internal system details.
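
The "back off and retry with exponential delay" action from the table can be sketched as a small wrapper. This is an illustration under assumptions rather than a prescribed client: `send_with_backoff` is a hypothetical helper, the default delays are arbitrary, and it assumes any `Retry-After` header holds a number of seconds.

```python
import random
import time


def send_with_backoff(send, max_attempts=5, base_delay=1.0, max_delay=30.0, sleep=time.sleep):
    """Call send() until it returns a non-429 response or attempts run out.

    send is any zero-argument callable returning an object with
    status_code and headers, such as a lambda wrapping requests.post.
    """
    for attempt in range(max_attempts):
        response = send()
        if response.status_code != 429:
            return response
        if attempt == max_attempts - 1:
            break  # No point sleeping after the final attempt
        retry_after = response.headers.get("Retry-After")
        if retry_after is not None and retry_after.isdigit():
            delay = float(retry_after)  # Server hint takes precedence
        else:
            # Exponential delay (1s, 2s, 4s, ...) capped at max_delay, plus jitter
            delay = min(max_delay, base_delay * 2 ** attempt)
            delay += random.uniform(0, delay * 0.1)
        sleep(delay)
    return response
```

A call might look like `send_with_backoff(lambda: requests.post(url, headers=headers, json=payload))`, where `url`, `headers`, and `payload` are the values from Step 2. The caller still needs to handle the case where the final response is a 429.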

Next steps

This guide covers the basics. The full API surface includes endpoints for batch retrieval, tenant discovery, usage analytics, and webhook-based data freshness notifications. Explore them in the complete documentation at docs.ipto.ai.

If you are designing your agent’s data layer from scratch, "The Data Stack Your AI Agent Actually Needs" provides architectural context for where the retrieval API fits in your system.

Key takeaways

API keys are scoped to tenants and managed at admin.ipto.ai — never hard-code them
The /v1/retrieve endpoint accepts natural language queries and returns structured retrieval units, not raw text
Each retrieval unit contains structured facts, provenance, confidence scores, and pricing metadata for direct agent consumption
Use min_confidence to filter low-quality extractions and rerank for higher relevance in production workflows
Provenance fields enable source citation and integrity verification without additional lookups
Handle 401, 403, 429, and 422 errors explicitly in your agent’s integration layer
Full API documentation is available at docs.ipto.ai

Related Articles

Infrastructure
The Agent Data Stack Explained
A conceptual breakdown of the four essential layers that make private data safely consumable by AI agents — retrieval, pricing, trust, and audit.

Infrastructure
What Are Retrieval Units? A New AI Primitive
Retrieval units are the atomic building blocks of the agent data economy — structured data objects optimized for AI agent consumption, not human search. Here's what they are and why they matter.

Infrastructure
ipto.ai vs RAG: Retrieval Units vs Text Chunks
A technical comparison of ipto.ai's retrieval unit architecture versus traditional RAG pipelines. Structured facts, provenance, pricing, and audit — the layers agents actually need.