API Contract (Hard)
This document defines hard, server-enforced contracts for Aionis Memory Graph API responses. These rules exist to prevent:
- accidental data exfiltration via embeddings
- response-size explosions (tokens, bandwidth, client memory)
- future regressions when subgraph expansion adds more nodes/edges
Guiding Rules (A/B/C)
A. Product boundary (safety/usability)
Embeddings are never returned by default. Debug embeddings are a privileged channel with strict caps.
B. API contract (stability)
All outward responses are DTO-whitelisted (no accidental new fields).
C. Query/perf boundary (stop-the-bleed)
The DB layer uses explicit SELECT lists and does not fetch embeddings unless debug mode explicitly requests a bounded preview.
Tenant Isolation (P2.5)
- All memory endpoints support tenant_id?: string (default: MEMORY_TENANT_ID, normally default).
- Header fallback is supported:
  - if request body omits tenant_id, server reads X-Tenant-Id.
- Isolation key is (tenant_id, scope) (internally encoded as a tenant-aware scope key).
- Backward compatibility: tenant_id=default keeps legacy single-tenant scope behavior.
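The resolution order described above (body first, then header, then the configured default) can be sketched roughly as follows. This is only an illustration: the helper names resolveTenantId and isolationKey are hypothetical, and the key encoding is an assumption, since the contract only states that the key is tenant-aware and that tenant_id=default keeps legacy behavior.

```typescript
// Hypothetical helper names; only tenant_id, scope, X-Tenant-Id and
// MEMORY_TENANT_ID come from the contract text.
interface TenantFields {
  tenant_id?: string;
  scope?: string;
}

function resolveTenantId(
  body: TenantFields,
  headers: Record<string, string | undefined>,
  defaultTenant: string = "default", // stands in for MEMORY_TENANT_ID
): string {
  // The body value wins; the header is consulted only when the body omits it.
  if (body.tenant_id) return body.tenant_id;
  const fromHeader = headers["x-tenant-id"]; // assumes lower-cased header names
  if (fromHeader) return fromHeader;
  return defaultTenant;
}

// Assumed encoding, not the server's actual format: the default tenant keeps
// the bare legacy scope, every other tenant gets a prefixed key.
function isolationKey(tenantId: string, scope: string): string {
  return tenantId === "default" ? scope : `tenant:${tenantId}::scope:${scope}`;
}
```

Under these assumptions, two tenants that use the same scope name never share an isolation key, while the default tenant keeps the bare scope string that legacy rows already use.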
Auth Identity Mapping (Phase C MVP)
- Optional global mode: MEMORY_AUTH_MODE=api_key
- Key registry: MEMORY_API_KEYS_JSON ({ "<api-key>": { tenant_id, agent_id?, team_id?, role? } })
- JWT mode: MEMORY_AUTH_MODE=jwt (or api_key_or_jwt) with Authorization: Bearer <jwt>
  - HS256 secret: MEMORY_JWT_HS256_SECRET
  - claims: tenant_id (or tenant) required; agent_id/sub, team_id, role optional
  - time validation: exp/nbf with MEMORY_JWT_CLOCK_SKEW_SEC
- In api_key mode:
  - all /v1/memory/* endpoints require X-Api-Key
  - request tenant must match key-bound tenant (401/403 on failure)
  - identity fields are auto-injected when omitted:
    - recall: consumer_agent_id, consumer_team_id
    - write: producer_agent_id (and owner fallback)
    - rules/tools context: context.agent.id, context.agent.team_id
  - identity mismatch in caller-provided fields returns 403 identity_mismatch
Tenant Quotas (Phase C MVP)
- Enabled by TENANT_QUOTA_ENABLED=true.
- Buckets are tenant-scoped and per-process:
  - recall: TENANT_RECALL_RATE_LIMIT_RPS/BURST
  - write-like endpoints: TENANT_WRITE_RATE_LIMIT_RPS/BURST
  - debug embeddings: TENANT_DEBUG_EMBED_RATE_LIMIT_RPS/BURST
- On exceed:
  - HTTP 429
  - retry-after header
  - error codes:
    - tenant_rate_limited_recall
    - tenant_rate_limited_write
    - tenant_rate_limited_debug_embeddings
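An RPS/BURST pair usually maps onto a token bucket, so the quota behavior above can be illustrated with the sketch below. It is not the server's implementation: TokenBucket, checkTenantQuota and the in-memory Map are hypothetical, and the retry-after computation is an assumption. Only the 429 status and the three error codes come from the contract.

```typescript
type QuotaKind = "recall" | "write" | "debug_embeddings";

// Hypothetical in-memory token bucket; one instance per (tenant, kind).
class TokenBucket {
  private tokens: number;
  private lastMs: number;
  private readonly rps: number;
  private readonly burst: number;

  constructor(rps: number, burst: number, nowMs: number) {
    this.rps = rps;
    this.burst = burst;
    this.tokens = burst; // start full so an initial burst is allowed
    this.lastMs = nowMs;
  }

  // Returns 0 when the request may proceed, else seconds until the next token.
  take(nowMs: number): number {
    const elapsedSec = Math.max(0, (nowMs - this.lastMs) / 1000);
    this.tokens = Math.min(this.burst, this.tokens + elapsedSec * this.rps);
    this.lastMs = nowMs;
    if (this.tokens >= 1) {
      this.tokens -= 1;
      return 0;
    }
    return Math.ceil((1 - this.tokens) / this.rps);
  }
}

// Per-process state: each server process keeps its own buckets.
const quotaBuckets = new Map<string, TokenBucket>();

type QuotaResult =
  | { status: 200 }
  | { status: 429; code: string; retryAfterSec: number };

function checkTenantQuota(
  tenantId: string,
  kind: QuotaKind,
  rps: number,
  burst: number,
  nowMs: number,
): QuotaResult {
  const key = `${tenantId}:${kind}`;
  let bucket = quotaBuckets.get(key);
  if (!bucket) {
    bucket = new TokenBucket(rps, burst, nowMs);
    quotaBuckets.set(key, bucket);
  }
  const retryAfterSec = bucket.take(nowMs);
  if (retryAfterSec === 0) return { status: 200 };
  // Error code names follow the list above: tenant_rate_limited_<kind>.
  return { status: 429, code: `tenant_rate_limited_${kind}`, retryAfterSec };
}
```

Because buckets are per-process, a deployment with several server processes would admit up to the configured rate on each process, which is worth keeping in mind when sizing the limits.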
Endpoints
POST /v1/memory/recall
Default recall knobs are profile-driven:
- base default: MEMORY_RECALL_PROFILE
- optional layered overrides: MEMORY_RECALL_PROFILE_POLICY_JSON
  - priority: tenant_endpoint > tenant_default > endpoint > global default
- optional adaptive downgrade on queue pressure:
  - MEMORY_RECALL_ADAPTIVE_DOWNGRADE_ENABLED=true
  - MEMORY_RECALL_ADAPTIVE_WAIT_MS=<threshold>
  - MEMORY_RECALL_ADAPTIVE_TARGET_PROFILE=<profile>
  - adaptive downgrade only applies when request did not explicitly pin recall knobs
Profiles:
- strict_edges (default): limit=24, hops=2, max_nodes=60, max_edges=80, ranked_limit=140, min_edge_weight=0.2, min_edge_confidence=0.2
- quality_first: limit=30, hops=2, max_nodes=80, max_edges=100, ranked_limit=180, min_edge_weight=0.05, min_edge_confidence=0.05
- legacy: historical defaults (30/2/50/100/100/0/0)
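The layered priority and the adaptive downgrade rule can be read as a small lookup, sketched below. Several things here are assumptions rather than documented behavior: the ProfilePolicy shape for MEMORY_RECALL_PROFILE_POLICY_JSON, the mapping of the legacy tuple onto the same field order as the other profiles, and the use of >= when comparing the queue wait against the threshold.

```typescript
type ProfileName = "strict_edges" | "quality_first" | "legacy";

interface RecallKnobs {
  limit: number;
  hops: number;
  max_nodes: number;
  max_edges: number;
  ranked_limit: number;
  min_edge_weight: number;
  min_edge_confidence: number;
}

const RECALL_PROFILES: Record<ProfileName, RecallKnobs> = {
  strict_edges: { limit: 24, hops: 2, max_nodes: 60, max_edges: 80, ranked_limit: 140, min_edge_weight: 0.2, min_edge_confidence: 0.2 },
  quality_first: { limit: 30, hops: 2, max_nodes: 80, max_edges: 100, ranked_limit: 180, min_edge_weight: 0.05, min_edge_confidence: 0.05 },
  // Assumes the 30/2/50/100/100/0/0 tuple uses the same field order as above.
  legacy: { limit: 30, hops: 2, max_nodes: 50, max_edges: 100, ranked_limit: 100, min_edge_weight: 0, min_edge_confidence: 0 },
};

// Assumed policy shape; the real JSON schema is not given in this document.
interface ProfilePolicy {
  endpoint?: Record<string, ProfileName>;
  tenant_default?: Record<string, ProfileName>;
  tenant_endpoint?: Record<string, Record<string, ProfileName>>;
}

// Priority: tenant_endpoint > tenant_default > endpoint > global default.
function resolveRecallProfile(
  policy: ProfilePolicy,
  tenantId: string,
  endpoint: string,
  globalDefault: ProfileName,
): ProfileName {
  const byTenant = policy.tenant_endpoint?.[tenantId];
  return (
    byTenant?.[endpoint] ??
    policy.tenant_default?.[tenantId] ??
    policy.endpoint?.[endpoint] ??
    globalDefault
  );
}

interface AdaptiveConfig {
  enabled: boolean; // MEMORY_RECALL_ADAPTIVE_DOWNGRADE_ENABLED
  waitMs: number; // MEMORY_RECALL_ADAPTIVE_WAIT_MS
  target: ProfileName; // MEMORY_RECALL_ADAPTIVE_TARGET_PROFILE
}

// Downgrade only when enabled, under queue pressure, and when the request
// did not pin any recall knob explicitly.
function applyAdaptiveDowngrade(
  current: ProfileName,
  queueWaitMs: number,
  cfg: AdaptiveConfig,
  requestPinnedKnobs: boolean,
): ProfileName {
  if (!cfg.enabled || requestPinnedKnobs) return current;
  return queueWaitMs >= cfg.waitMs ? cfg.target : current;
}
```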
Request
- query_embedding: number[] (required, dim=1536)
- tenant_id?: string
- scope?: string
- consumer_agent_id?: string (optional; enables lane-based visibility and read-side audit)
- consumer_team_id?: string (optional; used with team-private lane and team-scoped rules)
- limit?: number (default from recall profile, max 200)
- neighborhood_hops?: 1|2 (default from recall profile)
- max_nodes?: number (default from recall profile, max 200)
- max_edges?: number (default from recall profile, max 100 hard)
- ranked_limit?: number (default from recall profile, max 500)
- min_edge_weight?: number (default from recall profile, max 1) stage-2 edge fetch filter
- min_edge_confidence?: number (default from recall profile, max 1) stage-2 edge fetch filter
- context_token_budget?: number (optional; compacts context.text toward token budget using conservative estimation, keeps items/citations intact)
- context_char_budget?: number (optional; direct char budget override for context.text; takes precedence over token budget)
- context_compaction_profile?: "balanced"|"aggressive" (optional; section-level compaction preset, default balanced)
- include_meta?: boolean (default false)
- include_slots?: boolean (default false)
- include_slots_preview?: boolean (default false)
- slots_preview_keys?: number (default 10, max 50)
- return_debug?: boolean (default false)
- include_embeddings?: boolean (default false; debug-only, see below)
- Rules (optional planner injection):
  - rules_context?: any (normalized planner context object; see docs/PLANNER_CONTEXT.md)
  - rules_include_shadow?: boolean (default false)
  - rules_limit?: number (default 50, max 200)
Response
- tenant_id: string
- scope: string
- seeds: { id, type, title, text_summary, tier, salience, confidence, similarity }[]
- ranked: { id, activation, score }[] (bounded by ranked_limit)
- subgraph: { nodes: NodeDTO[], edges: EdgeDTO[] }
- context: { text: string, items: any[], citations: any[] }
- rules?: { scope, considered, matched, skipped_invalid_then, invalid_then_sample, applied } (only when rules_context is provided)
- debug?: { neighborhood_counts: { nodes: number, edges: number }, embeddings?: DebugEmbeddingDTO[], context_compaction?: { profile, token_budget, char_budget, applied, before_chars, after_chars, before_est_tokens, after_est_tokens, dropped_lines, dropped_by_section } } (only when return_debug=true)
NodeDTO (whitelist)
- Always: id, type, title, text_summary
- For topics: topic_state?, member_count?
- For compression summaries:
  - type="concept" with slots.summary_kind="compression_rollup" (only if include_slots=true)
- Slots:
  - include_slots=true: slots
  - include_slots_preview=true: slots_preview (top slots_preview_keys, sorted by key)
- Meta (only when include_meta=true): raw_ref, evidence_ref, embedding_status, embedding_model, memory_lane, producer_agent_id, owner_agent_id, owner_team_id, created_at, updated_at, last_activated, salience, importance, confidence, commit_id
EdgeDTO (whitelist)
- Always: from_id, to_id, type, weight
- Meta (only when include_meta=true): commit_id
POST /v1/memory/recall_text
Server embeds query_text using the configured embedding provider, then calls the same recall pipeline as /recall.
Request
- Same fields as /recall, except:
  - query_text: string (required)
  - server can apply default context_token_budget when request omits both compaction fields:
    - MEMORY_RECALL_TEXT_CONTEXT_TOKEN_BUDGET_DEFAULT (0 disables)
Response
- Same as /recall, plus:
  - query: { text: string, embedding_provider: string }
Upstream embedding failure mapping (/recall_text)
- provider throttled / rate limited: 429 upstream_embedding_rate_limited (+ retry-after)
- provider timeout / unavailable: 503 upstream_embedding_unavailable (+ retry-after)
- provider malformed response: 502 upstream_embedding_bad_response
- server should not expose generic 500 for known upstream embedding failures
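The mapping above is small enough to express as a single exhaustive switch. The sketch below is illustrative only: the UpstreamEmbeddingFailure classification and the function name are hypothetical, and how the server detects each failure kind is not covered here.

```typescript
// Hypothetical classification of provider failures; the status codes and
// error codes are the ones listed in the mapping above.
type UpstreamEmbeddingFailure = "rate_limited" | "timeout" | "unavailable" | "bad_response";

interface UpstreamErrorMapping {
  status: 429 | 502 | 503;
  code: string;
  sendRetryAfter: boolean;
}

function mapUpstreamEmbeddingFailure(kind: UpstreamEmbeddingFailure): UpstreamErrorMapping {
  switch (kind) {
    case "rate_limited":
      return { status: 429, code: "upstream_embedding_rate_limited", sendRetryAfter: true };
    case "timeout":
    case "unavailable":
      return { status: 503, code: "upstream_embedding_unavailable", sendRetryAfter: true };
    case "bad_response":
      return { status: 502, code: "upstream_embedding_bad_response", sendRetryAfter: false };
  }
}
```

Keeping the switch exhaustive over a closed union means that adding a new failure kind forces a compile-time decision about its status code instead of silently falling through to a generic 500.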
Hard Constraints: Debug Embeddings
Debug embeddings are the most likely “silent data export” channel. The server enforces all of the following:
Allowed iff all conditions are met
- return_debug=true AND include_embeddings=true
- Request is authorized via:
  - X-Admin-Token header matching ADMIN_TOKEN, OR
  - ADMIN_TOKEN is unset AND NODE_ENV != production AND the request is from loopback (127.0.0.1 / ::1)
- limit <= 20 (hard)
What is returned
- Embeddings are never included in subgraph.nodes.
- If allowed, the response includes debug.embeddings with bounded previews only:
  - max 5 seed nodes
  - preview: first 16 dims only
  - sha256 of the full vector text (integrity/debug)
  - hard size cap: max_debug_bytes = 64KB (exceeding this returns 400)
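The gate and the preview bounds can be sketched as two pure functions. This is a simplified illustration under stated assumptions: the DebugEmbeddingGate input shape and both function names are hypothetical, the token comparison is a plain equality check (a real server would likely want a constant-time comparison), and the sha256 field and the 64KB cap are left out.

```typescript
interface DebugEmbeddingGate {
  return_debug: boolean;
  include_embeddings: boolean;
  limit: number;
  adminTokenHeader?: string; // value of X-Admin-Token, if sent
  configuredAdminToken?: string; // ADMIN_TOKEN; undefined when unset
  nodeEnv?: string; // NODE_ENV
  remoteAddress: string;
}

// All three conditions must hold; any failure means no embeddings.
function debugEmbeddingsAllowed(g: DebugEmbeddingGate): boolean {
  if (!g.return_debug || !g.include_embeddings) return false;
  if (g.limit > 20) return false;
  if (g.configuredAdminToken) {
    // Token configured: only a matching header authorizes the request.
    return g.adminTokenHeader === g.configuredAdminToken;
  }
  // Token unset: allow only non-production loopback callers.
  const loopback = g.remoteAddress === "127.0.0.1" || g.remoteAddress === "::1";
  return g.nodeEnv !== "production" && loopback;
}

const MAX_DEBUG_SEEDS = 5;
const PREVIEW_DIMS = 16;

interface SeedEmbedding {
  id: string;
  embedding: number[];
}

// Bounded preview: at most 5 seeds, first 16 dimensions of each vector.
function previewEmbeddings(seeds: SeedEmbedding[]): { id: string; preview: number[] }[] {
  return seeds
    .slice(0, MAX_DEBUG_SEEDS)
    .map((s) => ({ id: s.id, preview: s.embedding.slice(0, PREVIEW_DIMS) }));
}
```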
Draft Topics and Explainability
- Draft topics (topic_state=draft) are excluded from scoring and cannot influence ranking.
- For explainability, the server may append (or, if max_nodes is already full, swap in) a small number of draft topics into the returned subgraph (bounded by max_nodes) when they are strongly connected (part_of / derived_from) to top nodes.
Compression Preference in recall_text
- When compression concept summaries are present in ranked context, context.text prefers summary-first rendering and reduces raw event fanout.
- In compression mode, events already cited by selected compression summaries are excluded from context event listing to avoid duplicate token spend.
- Compression summaries must remain evidence-backed through citations and graph edges (derived_from).
- When context_token_budget or context_char_budget is set, Aionis compacts context.text by dropping lower-priority detail lines first (evidence fanout and verbose rule lines), while preserving structured context.items and context.citations.
- context_compaction_profile presets:
  - balanced: keeps richer rule/event details when budget allows.
  - aggressive: trims event fanout and verbose rule JSON earlier, preserving topic/concept and rule summary lines first.
Embedding Readiness (Derived Artifacts)
Embeddings are treated as derived artifacts:
- /v1/memory/write may accept nodes without embeddings and backfill them asynchronously.
- /v1/memory/recall* only uses nodes with embedding_status=ready as vector seeds and as activatable content sources for ranking. Nodes not ready are not returned by default.
- Default active recall tiers are hot + warm (cold/archive are non-default memory layers).
Multi-Agent Lane Semantics (P2 MVP)
Write-time fields (optional):
- top-level: memory_lane?: "private"|"shared", producer_agent_id?, owner_agent_id?, owner_team_id?
- node-level overrides: same keys on each node
Behavior:
- New writes default to memory_lane="private" unless explicitly overridden.
- Legacy rows are backfilled as shared via migration to preserve compatibility.
- Hard guard (DB + API):
  - type="rule" with memory_lane="private" must set at least one owner: owner_agent_id OR owner_team_id
  - Violations are rejected at write/promote time (invalid_private_rule_owner).
  - DB constraint: 0014_private_rule_owner_guard.sql (NOT VALID, enforced for new rows).
- If consumer_agent_id is provided on recall:
  - visible: shared
  - visible: private + owner_agent_id == consumer_agent_id
  - visible: private + owner_team_id == consumer_team_id (when team id provided)
- Recall read-side audit is recorded in memory_recall_audit (best-effort, non-blocking).
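The visibility rules and the private-rule owner guard reduce to two small predicates, sketched below. The type and function names are hypothetical. The sketch also assumes that no lane filtering happens when consumer_agent_id is omitted, which the text implies (the field "enables" lane-based visibility) but does not state outright.

```typescript
interface LaneFields {
  memory_lane: "private" | "shared";
  owner_agent_id?: string;
  owner_team_id?: string;
}

function isVisibleToConsumer(
  node: LaneFields,
  consumerAgentId?: string,
  consumerTeamId?: string,
): boolean {
  // Assumption: without consumer_agent_id the lane filter is not applied.
  if (consumerAgentId === undefined) return true;
  if (node.memory_lane === "shared") return true;
  // Private lane: visible to the owning agent...
  if (node.owner_agent_id !== undefined && node.owner_agent_id === consumerAgentId) return true;
  // ...or to the owning team, but only when the caller supplied a team id.
  return consumerTeamId !== undefined && node.owner_team_id === consumerTeamId;
}

// Mirrors the hard guard: a private rule needs at least one owner field.
// A true result corresponds to an invalid_private_rule_owner rejection.
function violatesPrivateRuleOwnerGuard(node: LaneFields & { type: string }): boolean {
  return (
    node.type === "rule" &&
    node.memory_lane === "private" &&
    !node.owner_agent_id &&
    !node.owner_team_id
  );
}
```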
Consolidation Canonicalization (Phase 3)
- Canonicalization is implemented as offline jobs and writes only node slots metadata:
  - duplicate node: alias_of, superseded_by
  - canonical node: merged_from[], merged_count
- Consolidation apply includes contradiction guardrails for topic/concept:
  - candidates with opposing polarity/negation semantics are skipped by default (contradictory_candidate)
  - explicit override is possible only in offline job mode (--allow-contradictory), and is audit-marked in slots/commit diff
- Edge redirection is also offline-job driven: aliased node incident edges are redirected to canonical nodes with conflict-safe upsert semantics.
- Default mode is dry-run; explicit apply is required.
- API contract remains stable: these fields are visible only when include_slots=true (or via slots_preview).
Archive Rehydrate (Phase 4)
POST /v1/memory/archive/rehydrate
On-demand retrieval policy for long-horizon memory: selectively rehydrate archived/cold nodes into active tiers.
Request
- tenant_id?: string
- scope?: string
- actor?: string
- node_ids?: uuid[] (1..200)
- client_ids?: string[] (1..200)
- target_tier?: "warm"|"hot" (default warm)
- reason?: string
- input_text or input_sha256 (required)
At least one of node_ids or client_ids must be provided.
Behavior
- Only nodes whose current tier rank is below target_tier are moved.
  - e.g. archive -> warm, cold -> warm, warm -> hot
- Moved nodes get:
  - last_activated=now()
  - slots markers: last_rehydrated_at, last_rehydrated_job, last_rehydrated_from_tier, last_rehydrated_to_tier, last_rehydrated_reason, last_rehydrated_input_sha256
- Writes are commit-audited via memory_commits.
Response
- tenant_id, scope, target_tier
- commit_id, commit_hash (null when nothing moved)
- rehydrated: requested/resolved/moved/noop/missing counters + ids
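The "tier rank below target_tier" rule can be illustrated with a rank table. The sketch assumes the ordering archive < cold < warm < hot, which matches the listed examples but whose archive/cold ordering is not stated explicitly. The function names and the moved/noop split are hypothetical stand-ins for the counters in the response.

```typescript
type Tier = "archive" | "cold" | "warm" | "hot";

// Assumed ordering; consistent with archive -> warm, cold -> warm, warm -> hot.
const TIER_RANK: Record<Tier, number> = { archive: 0, cold: 1, warm: 2, hot: 3 };

function shouldRehydrate(current: Tier, target: "warm" | "hot"): boolean {
  return TIER_RANK[current] < TIER_RANK[target];
}

interface TieredNode {
  id: string;
  tier: Tier;
}

// Splits resolved nodes into those that would move and those left as-is.
function planRehydrate(
  nodes: TieredNode[],
  target: "warm" | "hot" = "warm",
): { moved: string[]; noop: string[] } {
  const moved: string[] = [];
  const noop: string[] = [];
  for (const n of nodes) {
    (shouldRehydrate(n.tier, target) ? moved : noop).push(n.id);
  }
  return { moved, noop };
}
```

With the default target of warm, a node that is already warm or hot lands in noop, which corresponds to the null commit_id case when nothing moves.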
Node Activation / Feedback (Adaptive Decay Signals)
POST /v1/memory/nodes/activate
Ingest node-level activity and optional positive/negative signal, used by adaptive decay.
Request
- tenant_id?: string
- scope?: string
- actor?: string
- node_ids?: uuid[] (1..200)
- client_ids?: string[] (1..200)
- run_id?: string
- outcome?: "positive"|"negative"|"neutral" (default neutral)
- activate?: boolean (default true)
- reason?: string
- input_text or input_sha256 (required)
At least one of node_ids or client_ids must be provided.
Behavior
- Resolves target nodes by id/client_id in scope.
- For found nodes:
  - updates feedback counters/quality in slots:
    - feedback_positive, feedback_negative, feedback_quality
    - last_feedback_outcome, last_feedback_at, last_feedback_run_id, last_feedback_reason, last_feedback_input_sha256
  - optionally updates last_activated (activate=true)
  - updates commit_id
- All writes are commit-audited via memory_commits.
Response
- tenant_id, scope
- commit_id, commit_hash (null when no node found)
- activated: requested/resolved/found/updated/missing counters + ids
POST /v1/memory/rules/evaluate
Evaluate SHADOW/ACTIVE rules against a caller-provided execution context. This is intended to be called by a planner/tool-selector before making decisions.
Request
- context: any (required)
- tenant_id?: string
- scope?: string
- include_shadow?: boolean (default true)
- limit?: number (default 50, max 200)
Response
- tenant_id: string
- scope: string
- considered: number
- matched: number
- skipped_invalid_then: number (rules skipped due to invalid then_json schema)
- agent_visibility_summary:
  - agent: { id, team_id }
  - rule_scope: { scanned, filtered_by_scope, filtered_by_lane, filtered_by_condition, skipped_invalid_then, matched_active, matched_shadow }
  - lane: { applied, reason, legacy_unowned_private_visible, legacy_unowned_private_detected }
  - reason enum:
    - missing_agent_context
    - enforced_agent_only
    - enforced_team_only
    - enforced_agent_team
- active: RuleMatchDTO[]
- shadow: RuleMatchDTO[] (omitted from matching when include_shadow=false)
- applied: { policy, sources, conflicts, shadow_policy?, shadow_sources?, shadow_conflicts? }
  - applied.conflict_explain[] (readable winner/loser explanation per conflict path)
  - applied.tool_explain (readable tool.* conflict/winner explanation; computed with special semantics)
  - applied.shadow_tool_explain (when include_shadow=true)
RuleMatchDTO (whitelist)
- rule_node_id, state, summary
- rule_scope, target_agent_id?, target_team_id?
- if_json, then_json, exceptions_json
- stats: { positive, negative }
- rank: { score, evidence_score, priority, weight, specificity }
- match_detail: { condition_paths, condition_path_count }
- commit_id
Notes:
- This endpoint never returns embeddings.
- then_json is a strict policy patch schema (minimal, stable) designed for planner/tool-selector injection.
- Rule scope filtering is applied before condition matching:
  - global: always eligible
  - team: requires context.agent.team_id (or context.team_id) to equal target_team_id
  - agent: requires context.agent.id (or context.agent_id) to equal target_agent_id
- Rule lane filtering:
  - when context.agent.id or context.agent.team_id is provided, rule node visibility is lane-enforced:
    - visible: shared
    - visible: private + owner match (owner_agent_id / owner_team_id)
    - private + no owner fields is treated as non-visible (legacy data defect); detected count is surfaced in lane.legacy_unowned_private_detected
  - when no agent/team context is provided, lane filter is not enforced (lane.applied=false, reason=missing_agent_context).
PolicyPatch (then_json)
- output?: { format?: "json"|"text"|"markdown", strict?: boolean }
- tool?: { allow?: string[], deny?: string[], prefer?: string[] }
- extensions?: object (namespaced escape hatch)
- Rule rank metadata can be set in rule node slots.rule_meta:
  - slots.rule_meta.priority (int, default 0, range -100..100)
  - slots.rule_meta.weight (number, default 1, range 0..2)
- Rule scope metadata can be set in rule node slots:
  - slots.rule_scope: "global"|"team"|"agent" (default global)
  - slots.target_team_id: required when rule_scope="team"
  - slots.target_agent_id: required when rule_scope="agent"
Recommended context shape (non-binding)
- See docs/PLANNER_CONTEXT.md and examples/planner_context.json.
POST /v1/memory/tools/select
Apply ACTIVE rule policy to a caller-provided list of tool candidates and return a deterministic selection decision. This endpoint is a convenience wrapper around rules evaluation + tool policy application, intended for tool selector integration.
Request
- context: any (required; normalized planner context recommended)
- candidates: string[] (required, 1..200)
- tenant_id?: string
- scope?: string
- run_id?: string (recommended; execution run correlation id)
- include_shadow?: boolean (default false; if true, returns a non-enforcing shadow_selection)
- rules_limit?: number (default 50, max 200)
- strict?: boolean (default true)
Response
- tenant_id: string
- scope: string
- candidates: string[] (deduped)
- selection: { candidates, allowed, denied, preferred, ordered, selected }
- rules: { considered, matched, skipped_invalid_then, invalid_then_sample, agent_visibility_summary, applied, tool_conflicts_summary, shadow_selection?, shadow_tool_conflicts_summary? }
- decision: { decision_id, run_id, selected_tool, policy_sha256, source_rule_ids, created_at }
Notes:
- This endpoint never returns embeddings.
- Only rules.applied.policy.tool is interpreted by the selector:
  - tool.allow: allowlist (hard). If multiple matched rules specify tool.allow, the effective allowlist is the intersection.
  - tool.deny: denylist (hard). Effective denylist is the union.
  - tool.prefer: ordering hint (soft). Effective prefer order is rank-aware: higher rule rank has higher priority.
    - rank = priority * 1000 + evidence_score * 100 * weight + specificity
- strict behavior:
  - strict=true (default): if allow/deny filters remove all candidates, returns 400 no_tools_allowed
  - strict=false: if tool.allow removes all, fall back to deny-only (ignore allowlist) and return selection.fallback
- tool_conflicts_summary is a short, UI/log-friendly list derived from applied.tool_explain.conflicts (bounded; not exhaustive).
- rules.agent_visibility_summary follows the same shape/semantics as /v1/memory/rules/evaluate.
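The allow/deny/prefer merge and the strict fallback can be sketched as one function. This is an illustration under assumptions, not the selector's code: MatchedRule, ToolSelection and selectTools are hypothetical names, the final ordering (preferred tools first, then the remaining allowed tools in candidate order) is an assumed reading of ordered/selected, and the fallback field is modelled as a boolean even though the contract does not give its shape. The rank formula is the one quoted above.

```typescript
interface ToolPolicy {
  allow?: string[];
  deny?: string[];
  prefer?: string[];
}

interface MatchedRule {
  tool: ToolPolicy;
  priority: number;
  evidence_score: number;
  weight: number;
  specificity: number;
}

type ToolSelection =
  | { ok: true; allowed: string[]; denied: string[]; ordered: string[]; selected: string | null; fallback: boolean }
  | { ok: false; status: 400; code: "no_tools_allowed" };

function ruleRank(r: MatchedRule): number {
  return r.priority * 1000 + r.evidence_score * 100 * r.weight + r.specificity;
}

function selectTools(candidates: string[], rules: MatchedRule[], strict: boolean = true): ToolSelection {
  const deduped = candidates.filter((t, i) => candidates.indexOf(t) === i);

  // allow: intersection over rules that specify it; deny: union.
  let allow: string[] | null = null;
  const deny: string[] = [];
  for (const r of rules) {
    const ruleAllow = r.tool.allow;
    if (ruleAllow) {
      allow = allow === null ? ruleAllow.slice() : allow.filter((t) => ruleAllow.includes(t));
    }
    for (const t of r.tool.deny ?? []) if (!deny.includes(t)) deny.push(t);
  }

  // prefer: hints from higher-ranked rules come first.
  const preferred: string[] = [];
  const byRank = rules.slice().sort((a, b) => ruleRank(b) - ruleRank(a));
  for (const r of byRank) {
    for (const t of r.tool.prefer ?? []) if (!preferred.includes(t)) preferred.push(t);
  }

  const notDenied = deduped.filter((t) => !deny.includes(t));
  const allowList = allow;
  let allowed = allowList === null ? notDenied : notDenied.filter((t) => allowList.includes(t));
  let fallback = false;
  if (allowed.length === 0) {
    if (strict) return { ok: false, status: 400, code: "no_tools_allowed" };
    allowed = notDenied; // deny-only fallback: the allowlist is ignored
    fallback = true;
  }

  // Assumed ordering: preferred tools first, then the rest in candidate order.
  const ordered = preferred
    .filter((t) => allowed.includes(t))
    .concat(allowed.filter((t) => !preferred.includes(t)));
  const denied = deduped.filter((t) => deny.includes(t));
  return { ok: true, allowed, denied, ordered, selected: ordered[0] ?? null, fallback };
}
```

Note that the denylist stays in force even in the non-strict fallback path; only the allowlist is relaxed.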
POST /v1/memory/tools/feedback
Feedback loop for tool selection decisions. This attributes an outcome to the matched rules and updates their verification stats (positive_count / negative_count) for ordering and governance.
Request
- context: any (required)
- candidates: string[] (required)
- selected_tool: string (required)
- outcome: "positive"|"negative"|"neutral" (required)
- tenant_id?: string
- scope?: string
- actor?: string
- run_id?: string
- decision_id?: string (UUID; if omitted, server attempts inference and may create a feedback-derived decision record)
- target?: "tool"|"all" (default "tool")
- include_shadow?: boolean (default false; if true, also attributes to matched SHADOW rule sources)
- rules_limit?: number (default 50, max 200)
- note?: string
- input_text or input_sha256 (required; same rule as /v1/memory/feedback)
Response
- ok: true
- tenant_id: string
- scope: string
- updated_rules: number
- rule_node_ids: string[]
- commit_id: string|null
- commit_hash: string|null
- decision_id: string
- decision_link_mode: "provided"|"inferred"|"created_from_feedback"
- decision_policy_sha256: string
Verification Stamp
- Last reviewed: 2026-02-16
- Verification commands:
  - npm run test:contract
  - npm run docs:check