0 Overview

This walkthrough explains the validation-graph design end-to-end: what it is, why each decision was made, and how it fits together. It is aimed at the engineering team, to build shared understanding before implementation begins.

The validation graph is the knowledge layer of the Aucert product. It comprises two kinds of graph (one Tenant Graph per customer plus one shared Ecosystem Graph), enforces multi-tenant isolation and ACLs, supports natural-language and structured queries, and is continuously enriched by autonomous agents (the rover swarm). It's the substrate against which everything else is validated.

How to read this: Sections 1–3 set the conceptual frame. Sections 4–11 walk through each design area with diagrams and decisions. Sections 12–14 cover tech choices, abstractions, and scale. Sections 15–16 look forward.

For full reasoning + alternatives + worked examples, see the design notes and SPEC-035.

1 Premise: a two-graph knowledge layer

Aucert's product depends on a substrate that captures everything we know — about apps, policies, bugs, workflows, people, and the patterns that emerge across customers.

Two graphs, not one

The knowledge layer comprises two graphs:

  • Tenant Graph (per-customer, private). Holds everything one customer owns or discovers — apps, bugs, policies, people, processes. Deployable in Aucert cloud OR customer-hosted on-prem. The thing the customer "takes with them."
  • Ecosystem Graph (singular, Aucert-owned, shared). Holds curated ontology, abstracted patterns, public policies (Google Play, App Store), and learnings derived across the tenant base. The "holy grail" of mobile testing knowledge.

flowchart LR
    subgraph Customers["Per-customer (cloud OR on-prem)"]
      TG1["Tenant Graph<br/>(Acme)"]
      TG2["Tenant Graph<br/>(Beta Co)"]
      TG3["Tenant Graph<br/>(...)"]
    end
    subgraph Aucert["Aucert (cloud-only)"]
      EG["Ecosystem Graph<br/>canonical ontology<br/>+ patterns + policies"]
    end
    TG1 -->|references via eco: prefix| EG
    TG2 -->|references via eco: prefix| EG
    TG3 -->|references via eco: prefix| EG
    TG1 -.promotion pipeline.-> EG
    TG2 -.promotion pipeline.-> EG
Two-graph knowledge layer. References are one-way only: tenant → ecosystem.

Why two graphs?

Tenant data is private and customer-owned. Ecosystem data is shared abstracted learnings owned by Aucert. Mixing them would either leak customer data or restrict ecosystem flexibility. Separating them gives both proper governance.

Validation substrate

At end-state, both graphs act as a validation substrate. Downstream consumers (the 5-layer testing pipeline, the rover, agents) use the graph to validate test outcomes, decisions, and app behavior. The Tenant Graph also has an internal validation pipeline because raw rover observations arrive unverified.

2 Design principles

Thirteen user-stated principles shape every decision. If a decision violates a principle, the principle wins until amended.

P1
Conditions are first-class
Not features attached to edges. Have lifecycle, ownership, ACLs; reasoned about as data.
P2
Generic enough to scale
No hardcoded domain-specific operators or assumptions. Different products have different vocabularies.
P3
Domain-agnostic
Serves mobile, web, backend, docs. No mobile-specific schema baked into the core.
P4
Multi-dimensional versions
RN + App + Native, or service + schema, or browser + API. Compose constraints across them.
P5
LLM- and agent-accessible
NL ↔ structured roundtrip. LLMs and agents are first-class consumers AND authors.
P6
Mostly-incremental versions
Forward inference allowed; contradiction-aware. Internal upgrades may regress.
P7
Documentation richness
Notes carry full reasoning, alternatives, examples, NL ↔ structured pairs. Sufficient to generate user-facing docs and visuals.
P8
Reuse over reinvention
Adopt CEL for conditions, ACL policies, filters everywhere. Don't build what already exists, battle-tested.
P9
Integrate with customer IdP
Their identity / groups / permissions are the source of truth. We sync; we don't ask them to re-author.
P10
Server-side enforcement
Feature gating with security implications must be server-enforced. UI is for UX, not security.
P11
Graceful degradation
Engine never fail-stops. Tenant downgrade keeps data; only feature evaluation changes.
P12
Open-core readiness
Enterprise features as plug-in modules — even if we never open-source. Forces clean separation.
P13
Graph-native abstraction
Storage abstraction speaks graph terms (nodes, edges, traversals). Postgres is an implementation detail.

3 Substrate: how facts are represented

Every fact about the world is a claim. The graph is the materialized current-truth view; the claim log is the append-only source of truth.

flowchart TB
    R[Rover / Agent / Human / Integration] -->|asserts claim| CL[Append-only Claim Log<br/>D3 + C1]
    CL -->|materialized into| GE[Graph Entity<br/>nodes + edges]
    GE -->|fast hot reads| Q[Query Layer]
    CL -->|cold path: provenance, audit, conflict| Q
    R2[Rover bug discovered] -->|retraction claim D10| CL
    CL -->|recompute current truth| GE
The (a′) hybrid model. Hot reads hit the entity; cold queries hit the log.

Key model decisions

D3
Hybrid storage (a′)
Entity carries current-truth properties for fast reads. Claim log holds full history beside the graph (source of truth). Best of pure-current-state and pure-claim-log.
D2
Two trust axes: validation_status + confidence
Validation status is process state ("verified yet?"). Confidence is belief measure (0–1). Orthogonal — a verified fact can have low confidence; an unverified fact can have high prior confidence.
D10
Retractions are claims
Predicate retracts; targets a prior claim. Append-only. Restoration = retract the retraction. Multi-party conflicts preserved (rover-A retracted, admin un-retracted).
D9
Subject context = open variable bindings
Each claim's subject_context is a Map<variable_id, value>. tenant_id and partition_id stay fixed envelope fields outside the bindings (storage isolation primitives).
D17
Bitemporal-lite
Each claim carries asserted_at (mandatory system time) + optional valid_from / valid_to. Engine answers both "what's true now?" and "what did we believe at time T?"
D16
Validation status: 5 states
unverified → pending_review → verified | rejected | disputed. Transitions are themselves claims; full audit of how status evolved.
D6
Symmetric nodes/edges
Both first-class — both with properties and claim history. Edges aren't degenerate "thin" objects.
D7
Decomposed ownership roles
creator (immutable audit) + owner (transferable accountable) + maintainer(s) (collaborators). Single-owner field is too coarse for an agent-driven graph.
D18
Entity identity + claim-based split/merge
Stable opaque entity_id. Identity resolution at write is writer-side. Split via (predicate: split_into, targets: [...]); merge via (predicate: merged_into, target: canonical). Both reversible via standard retraction.
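
The interplay of D3, D10, and D17 can be sketched in a few lines. This is a minimal illustration under simplifying assumptions, not the engine's API: the `retracts` field, the integer timestamps, and the `active_claims` / `materialize` helpers are hypothetical names introduced here. It shows how an append-only log can be folded into current truth, how retracting a retraction restores a claim, and how the same fold answers "what did we believe at time T?".

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class Claim:
    claim_id: str
    asserted_at: int                       # system time; an int keeps the sketch simple
    target_id: Optional[str] = None        # entity the claim is about
    value_payload: Optional[dict] = None   # asserted properties
    retracts: Optional[str] = None         # hypothetical field: claim_id being retracted (D10)

def active_claims(log: list[Claim], as_of: int) -> list[Claim]:
    """Value claims still standing at `as_of`, honoring retractions of retractions."""
    visible = [c for c in log if c.asserted_at <= as_of]
    retracted: set[str] = set()
    # Newest first: a retraction only counts if no later claim retracted it.
    for claim in sorted(visible, key=lambda c: c.asserted_at, reverse=True):
        if claim.retracts is not None and claim.claim_id not in retracted:
            retracted.add(claim.retracts)
    return [c for c in visible if c.retracts is None and c.claim_id not in retracted]

def materialize(log: list[Claim], as_of: int) -> dict[str, dict]:
    """Fold active claims, oldest first, into current-truth properties per entity."""
    entities: dict[str, dict] = {}
    for claim in sorted(active_claims(log, as_of), key=lambda c: c.asserted_at):
        entities.setdefault(claim.target_id, {}).update(claim.value_payload or {})
    return entities

log = [
    Claim("c1", 1, target_id="wallet-screen", value_payload={"exists": True}),
    Claim("r1", 2, retracts="c1"),   # rover retracts its observation
    Claim("r2", 3, retracts="r1"),   # admin retracts the retraction
]
print(materialize(log, as_of=2))   # {}
print(materialize(log, as_of=3))   # {'wallet-screen': {'exists': True}}
```

Nothing is ever deleted from `log`; restoration and historical queries both fall out of replaying the same append-only data with a different `as_of`.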

Worked example: the rover discovers a screen

{
  "claim_id": "clm_01h3v2k...",
  "tenant_id": "acme",
  "partition_id": "payments-app",
  "subject_context": {
    "env": "prod",
    "app_version": "3.5.0",
    "rn_version": "0.71.4",
    "enable_wallet": true,
    "device_type": "android"
  },
  "target_type": "NODE",
  "target_id": "wallet-screen",
  "asserted_by": "agent:rover-v3",
  "asserted_at": "2026-05-15T14:32:00Z",
  "valid_from": "2026-05-15T14:32:00Z",
  "confidence": 0.93,
  "validation_status": "unverified",
  "value_payload": { "exists": true, "screen_role": "Wallet" }
}

4 Tenancy & partitions

One logical Tenant Graph per customer, internally organized by named partitions.

flowchart TB
    subgraph TG["Tenant Graph (Acme)"]
      P1["payments-app partition"]
      P2["hr partition"]
      P3["policies partition"]
      P4["_acl reserved partition<br/>(Roles, Policies, Tags, Mappings)"]
      P5["_types reserved partition<br/>(type definitions)"]
    end
    P1 -.tagged 'project-redesign-2026'.-> T1[Tag entity]
    P3 -.tagged.-> T1
Partitions inside a Tenant Graph. Reserved partitions prefixed with _.
D4
One logical Tenant Graph per customer
Holds all their apps, bugs, policies, people. Cross-product queries are common; splitting per-app would force cross-graph joins.
D5
Named partitions inside the tenant graph
Every node and edge declares a partition. payments-app, hr, etc. Partitions are first-class organizing scopes — natural ACL boundaries, natural promotion scopes, natural blast-radius limits.
D14
Tag-based scoping for transient teams
Tags are first-class graph entities. has_tag(target, tag_id) plugs into CEL filters. Solves "working group X gets permissions on a project's entities" without graph traversal — partitions are too structural for transient grouping.

5 Identity & ACL

Identity is externalized; permissions live in the graph; ACL is hybrid (ABAC + role-derived shortcuts).

Identity: externalized via WorkOS

flowchart LR
    Okta["Customer IdP<br/>(Okta / Azure AD / Google)"] -->|SCIM/SSO| WorkOS
    WorkOS -->|unified API| IDS[Aucert Identity Service]
    IDS -->|principal_id refs + events| VG[Validation Graph]
    VG -.does NOT mirror.-x WorkOS
D13.8 — graph references HUMAN/GROUP/ORG by opaque principal_id; no mirroring.
D8
Open principal set
HUMAN, AGENT, GROUP, SYSTEM, ORG — each principal carries principal_type discriminator. AGENT principals (rover, atlas, future agents) live as graph entities; HUMAN/GROUP/ORG live in identity service.
D13.8
Identity service is source of truth
Validation graph references principals by principal_id. Identity service emits canonical events (deactivation, group membership change, deletion); graph reacts via cache invalidation + auto-create/retract of GroupRoleMapping-derived assignments.

ACL evaluation pipeline

flowchart LR
    Req["Request: principal P, action A, target T"] --> Eval[ACL Evaluator]
    Eval --> R1{D7 role-derived?<br/>creator/owner/maintainer/viewer}
    R1 -->|implies| Allow1[ALLOW]
    Eval --> R2{Custom role assignment?<br/>D13 RoleAssignment}
    R2 -->|matching| Allow2[ALLOW]
    Eval --> R3{CEL policy match?<br/>D13.7 _acl partition}
    R3 -->|effect| Combine[Combine: deny-overrides D13.4]
    R1 --> Combine
    R2 --> Combine
    Combine --> Result[ALLOW or DENY]
    Combine --> Default{No match?}
    Default -->|deny-by-default D13.5| DenyDefault[DENY]
Hybrid ACL: built-in role shortcuts + custom roles + CEL policies, combined with deny-overrides.
D13
Hybrid ACL: ABAC + role-derived shortcuts
Built-in roles (creator/owner/maintainer/viewer) imply common permissions. Custom roles are first-class graph entities (Role nodes with permission bundles); assignments scoped by (principal, role, object_scope, expires_at?).
D13.2 → D13.7: ACL sub-decisions
D13.2
Granularity via CEL object_filter
Partition / type / entity / tag — all expressed in one mechanism. Property-level reserved (F8 deferred). Hierarchical/transitive deferred (F7).
D13.3
Action vocabulary: namespaced
<resource>:<verb> — 22 actions across 9 namespaces (entity / claim / role / permission / tag / variable / condition / validation / ownership). Wildcards via CEL string ops. Action implications explicit.
D13.4
Combination semantics: deny-overrides
Any matching DENY rule wins. Standard security default.
D13.5
Default policy: deny-by-default
No matching rule = deny. Built-in role shortcuts cover common cases.
D13.6
Edge-traversal permissions
traversal_visibility ∈ {visible, hidden, inherit_target}. Default visible returns redacted placeholders for unreadable targets. New entity:traverse action separable from entity:read.
D13.7
ACL artifacts in _acl reserved partition
Roles, Policies, RoleAssignments, GroupRoleMappings, Tag definitions. Caching layer (TTL ~60s) invalidated on writes. Decision response carries rules_evaluated_at_version for honest audit.
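
The combination rules in D13.4 and D13.5 are easy to state and easy to get subtly wrong, so a small sketch may help. It assumes a simplified rule shape: the `Rule` class, the Python callable standing in for a CEL object_filter, and the trailing-`*` wildcard are illustrative choices, and action names other than entity:read and entity:traverse are made up for the example.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass(frozen=True)
class Rule:
    effect: str                             # "ALLOW" or "DENY"
    action: str                             # e.g. "entity:read" or "entity:*"
    matches: Callable[[dict, dict], bool]   # stand-in for a CEL object_filter

def action_matches(pattern: str, action: str) -> bool:
    """Exact match, or a trailing '*' wildcard (an assumed convention)."""
    if pattern.endswith("*"):
        return action.startswith(pattern[:-1])
    return pattern == action

def evaluate(rules: list[Rule], principal: dict, action: str, target: dict) -> str:
    effects = {
        rule.effect
        for rule in rules
        if action_matches(rule.action, action) and rule.matches(principal, target)
    }
    if "DENY" in effects:    # D13.4: any matching DENY wins
        return "DENY"
    if "ALLOW" in effects:
        return "ALLOW"
    return "DENY"            # D13.5: no matching rule means deny

rules = [
    Rule("ALLOW", "entity:*", lambda p, t: t["partition"] == "payments-app"),
    Rule("DENY", "entity:traverse", lambda p, t: "restricted" in t["tags"]),
]
target = {"partition": "payments-app", "tags": ["restricted"]}
print(evaluate(rules, {}, "entity:read", target))      # ALLOW
print(evaluate(rules, {}, "entity:traverse", target))  # DENY
print(evaluate(rules, {}, "claim:read", target))       # DENY
```

The order of rules never matters in this scheme, which is what makes deny-overrides straightforward to audit.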

Worked example: custom role for a transient working group

# Transient team (D8 GROUP/TEAM principal in identity service)
team "wg-payment-redesign-2026" {
  members: [user:alice, user:bob, team:design-leads]
  expires_at: "2026-12-31"
}

# Tag relevant entities (D14)
tag "project-payment-redesign-2026" applied to:
  - workflow:checkout-v2
  - workflow:wallet-integration-v2
  - bug:payment-flow-stale-state

# Custom role assignment with tag-based scope
assign role "maintainer"
  to team:wg-payment-redesign-2026
  on scope: object_filter: 'has_tag(target, "project-payment-redesign-2026")'
  expires_at: "2026-12-31"

When the project ends → tag archived → permissions auto-stop applying → team dissolves. Clean.

6 Conditions & variables

Entities carry conditions — structural validity rules expressed in CEL. Variables are first-class.

D12.1
Conditions on nodes/edges, not claims
Claims carry subject_context (observation point); entities carry condition (structural validity rule). Clean separation between observation and rule.
D12.2
Adopt CEL (Common Expression Language)
Google's safe expression language used by K8s admission, Envoy, GCP IAM. Multi-language implementations (Java/Go/TS). Conformance test suite, bounded compute, LLM-aware. Custom Aucert functions for version comparison etc.
D12.3
Variables: 3-layer registry, primitives only v1
L1 built-in / L2 ecosystem-shared / L3 tenant-specific. Variables are first-class graph entities with rich properties (type, monotonicity hint, default, indexing flags). Primitives only in v1; structured composites in F4.
D11
Inference at unobserved contexts
Engine extrapolates validity at unobserved contexts using variable-type monotonicity hints. Inferred answers carry evidence_strength: inferred + derivation chain. Forward and gap-fill on by default for monotonic variables.
D12.4
Kleene three-valued logic for unbound
Unbound variable → unknown. Per-variable optional default_value. Security guard allows_default: false on tenant_id, partition_id, etc.

Worked example: condition on a Wallet entity

# CEL text (canonical storage form per C3)
version_gte(rn_version, "0.71") &&
version_in_range(app_version, "3.5", "4.0") &&
enable_wallet == true

Variables involved:

  • rn_version — L2 ecosystem-shared, type Version, monotonicity mostly-incremental
  • app_version — L2 ecosystem-shared, type Version, monotonicity mostly-incremental
  • enable_wallet — L3 tenant-specific (acme), type Boolean, monotonicity none

Kleene example: a query with context {rn_version: "0.72", app_version: "3.7"} leaves enable_wallet unbound, and the variable has no default. Result: true && unknown → unknown. The consumer (e.g., L1 generation) sees this and automatically generates BOTH the enable_wallet=true and enable_wallet=false test paths.
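
The Kleene evaluation above can be reproduced with a short sketch. Here `None` stands for unknown; the helper names mirror the CEL functions in the worked example, but their Python implementations are assumptions. In particular, the source does not say whether version_in_range bounds are inclusive, so this sketch assumes low-inclusive and high-exclusive.

```python
from typing import Optional

Tri = Optional[bool]   # True, False, or None for "unknown"

def k_and(*values: Tri) -> Tri:
    """Kleene conjunction: any False wins; otherwise any unknown yields unknown."""
    if any(v is False for v in values):
        return False
    if any(v is None for v in values):
        return None
    return True

def parse_version(text: str) -> tuple[int, ...]:
    return tuple(int(part) for part in text.split("."))

def version_gte(actual: Optional[str], minimum: str) -> Tri:
    if actual is None:               # unbound variable
        return None
    return parse_version(actual) >= parse_version(minimum)

def version_in_range(actual: Optional[str], low: str, high: str) -> Tri:
    if actual is None:
        return None
    # Assumption: low inclusive, high exclusive.
    return parse_version(low) <= parse_version(actual) < parse_version(high)

def wallet_condition(ctx: dict) -> Tri:
    return k_and(
        version_gte(ctx.get("rn_version"), "0.71"),
        version_in_range(ctx.get("app_version"), "3.5", "4.0"),
        ctx.get("enable_wallet"),    # None when unbound and no default exists
    )

print(wallet_condition({"rn_version": "0.72", "app_version": "3.7"}))   # None
print(wallet_condition({"rn_version": "0.70", "app_version": "3.7"}))   # False
```

The second call shows why three-valued logic is more useful than failing on unbound input: a definite False from one clause settles the answer even though enable_wallet is still unknown.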

7 Type system

Three-layer hybrid type registry; required-core + open extensions; single inheritance + interfaces.

| Layer | What | Examples |
| --- | --- | --- |
| L1: Built-in | Aucert-defined; available to all tenants | Entity, Workflow, Tag, Role, Policy, Variable |
| L2: Ecosystem-shared | Declared in Ecosystem Graph; canonical cross-tenant meaning | Screen, Component, Bug, TestCase, Document, WikiPage, Spec, GoogleDoc |
| L3: Tenant-specific | Declared in tenant graph; tenant-prefixed names | acme:LoyaltyPoint, acme:RegionalPricingTier |

Q8.1
3-layer hybrid type system
Mirrors D12.3 variable registry. Cross-tenant patterns consistent (L2); tenant flexibility (L3); promotion path L3 → L2 when patterns stabilize.
Q8.3
Required-core + open extensions
Type's property_schema declares required-core properties (validated at write). Instances may carry additional properties as extensions (accepted, not validated). Promotion to required-core via deliberate schema versioning. Matches how the rover discovers properties before formal declaration.
Q8.5
Single inheritance + interfaces
One extends parent per type, transitive. Multiple implements for cross-cutting capabilities (Searchable, Embedded, Spatial, Localized, Versioned). Polymorphic queries resolve type-set membership.
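
A small sketch of how Q8.3 and Q8.5 might combine in a registry. The `TypeDef` and `TypeRegistry` classes are hypothetical, as is the `acme:LoyaltyScreen` type and the choice to have Screen extend Entity. The sketch also assumes that required-core properties are inherited from ancestors, which the text above does not state explicitly.

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class TypeDef:
    name: str
    extends: Optional[str] = None                       # single parent (Q8.5)
    implements: set[str] = field(default_factory=set)   # interfaces
    required: set[str] = field(default_factory=set)     # required-core properties

class TypeRegistry:
    def __init__(self) -> None:
        self.types: dict[str, TypeDef] = {}

    def declare(self, type_def: TypeDef) -> None:
        self.types[type_def.name] = type_def

    def lineage(self, name: str) -> list[TypeDef]:
        """The type followed by its ancestors, following `extends` transitively."""
        chain = []
        current = self.types.get(name)
        while current is not None:
            chain.append(current)
            current = self.types.get(current.extends) if current.extends else None
        return chain

    def is_a(self, name: str, type_or_interface: str) -> bool:
        """Polymorphic membership: matches a type name or an implemented interface."""
        return any(
            t.name == type_or_interface or type_or_interface in t.implements
            for t in self.lineage(name)
        )

    def missing_required(self, name: str, properties: dict) -> set[str]:
        """Required-core properties absent from an instance; extensions are ignored."""
        required = set().union(*(t.required for t in self.lineage(name)))
        return required - set(properties)

registry = TypeRegistry()
registry.declare(TypeDef("Entity", required={"name"}))
registry.declare(TypeDef("Screen", extends="Entity",
                         implements={"Searchable", "Embedded"}, required={"screen_role"}))
registry.declare(TypeDef("acme:LoyaltyScreen", extends="Screen"))

print(registry.is_a("acme:LoyaltyScreen", "Embedded"))   # True
print(registry.missing_required("acme:LoyaltyScreen", {"name": "Points", "theme": "dark"}))
# {'screen_role'}
```

The extra `theme` property is accepted without complaint, which matches the "open extensions" half of Q8.3.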

8 Embeddings & AI integration

AI consumers (rover, atlas, agents) need semantic retrieval, not just symbolic queries.

flowchart LR
    subgraph Storage["Storage (D20)"]
      V[Vector property on entity<br/>source of truth in graph]
      VI[(VectorIndex<br/>derived data)]
      V -.replicates.-> VI
    end
    Q[Query: by-vector / by-entity / by-NL] --> SS[search_semantic D22]
    SS -->|filter via CEL| VI
    VI --> Results[Top-K results + similarity scores<br/>+ ACL filtering]
Vectors stored in graph (source of truth) + replicated to pluggable VectorIndex.
D19
Embeddings via type-driven Embedded interface
Selective: content entities (Screen, Component, Workflow, Bug, TestCase, Document, etc.) embed; system types (Role, ACL Policy, Variable, Tag) do NOT. Type declares embedding_source (which properties feed the vector).
D20
Vector storage hybrid + pluggable index
Vector stored as a structured property on the entity (source of truth). Replicated to VectorIndex abstraction. Implementations: pgvector (default), Qdrant, Pinecone, Null (OSS). Vector dimension per-model; tenant isolation via tenant_id partitioning.
D21
Multi-model coexistence during transitions
Each embedding carries embedding_model_id. Vector index partitions by model. New model → both partitions live → background re-embed → old decommissioned when coverage threshold met.
D22
Unified search_semantic API
Three input modes (raw vector / by-entity / by-NL-query). Structural filtering via CEL (P8 reuse). Index-aware metadata filtering. ACL respected automatically; redacted placeholders for unreadable matches.

Worked NL → search example

# NL: "Find me Wallet-like screens in payments-app that I have access to."

search_semantic({
  entity_id: "wallet",
  filter: 'object_type(target) == "Screen"
           && entity_partition(target) == "payments-app"',
  k: 10
})

# ACL applied automatically; results ranked by similarity.
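
To make the flow of the call above concrete, here is a brute-force sketch of the three stages: structural filter, similarity ranking, ACL redaction. It is not how a VectorIndex would be implemented; the in-memory scan, the `keep` and `can_read` callables (standing in for the CEL filter and the ACL evaluator), and the result shape are all assumptions made for illustration.

```python
import math
from typing import Callable

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def search_semantic(
    entities: list[dict],
    query_vector: list[float],
    keep: Callable[[dict], bool],       # stand-in for the CEL filter
    can_read: Callable[[dict], bool],   # stand-in for the ACL check
    k: int = 10,
) -> list[dict]:
    """Rank entities passing `keep` by cosine similarity; redact unreadable hits."""
    scored = [(cosine(query_vector, e["vector"]), e) for e in entities if keep(e)]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    results = []
    for score, entity in scored[:k]:
        if can_read(entity):
            results.append({"id": entity["id"], "score": score})
        else:
            results.append({"id": None, "score": score, "redacted": True})
    return results

entities = [
    {"id": "wallet", "type": "Screen", "partition": "payments-app", "vector": [1.0, 0.0]},
    {"id": "wallet-v2", "type": "Screen", "partition": "payments-app", "vector": [0.9, 0.1]},
    {"id": "login", "type": "Screen", "partition": "payments-app", "vector": [0.0, 1.0]},
    {"id": "hr-form", "type": "Screen", "partition": "hr", "vector": [1.0, 0.0]},
]
hits = search_semantic(
    entities,
    query_vector=[1.0, 0.0],
    keep=lambda e: e["type"] == "Screen" and e["partition"] == "payments-app",
    can_read=lambda e: e["id"] != "wallet-v2",
    k=2,
)
print([h["id"] for h in hits])   # ['wallet', None]
```

The second hit is returned as a redacted placeholder rather than dropped, in line with the D22 behavior described above.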

9 Cross-graph references

Tenant entities can reference ecosystem entities. Compact prefix scheme; one-way only.

D23
Compact prefix references
Bare IDs for intra-tenant (default scope). eco:<entity_id>[@version] for ecosystem references. Direction restricted to one-way: tenant → ecosystem only. No tenant→tenant or ecosystem→tenant; engine rejects at write.

| Reference | Resolves to |
| --- | --- |
| scr_01h3... | Intra-tenant entity (current tenant graph) |
| eco:wallet-pattern-v3 | Ecosystem entity, current floating version |
| eco:gdpr-policy-v5@5 | Ecosystem entity, pinned to schema_version 5 |

Lifecycle states (deprecated / sunsetting / superseded / merged_into) returned in resolution metadata; consumer interprets. No "broken" refs.
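
The prefix scheme in D23 is simple enough to parse with one regular expression. The sketch below is an assumption about how a parser could look: the `Ref` class, the "local" scope label for bare IDs, and the rule that any other prefix is rejected are illustrative, and the pinned version is assumed to be an integer as in the eco:gdpr-policy-v5@5 example.

```python
import re
from dataclasses import dataclass
from typing import Optional

# eco:<entity_id>[@version]; the version is assumed to be an integer.
ECO_REF = re.compile(r"^eco:(?P<entity_id>[^@\s]+)(?:@(?P<version>\d+))?$")

@dataclass(frozen=True)
class Ref:
    scope: str                           # "local" (bare ID) or "ecosystem"
    entity_id: str
    pinned_version: Optional[int] = None

def parse_ref(text: str) -> Ref:
    match = ECO_REF.match(text)
    if match:
        version = match.group("version")
        return Ref("ecosystem", match.group("entity_id"), int(version) if version else None)
    if ":" in text:
        # No syntax exists for tenant-to-tenant references, so reject other prefixes.
        raise ValueError(f"unsupported reference: {text!r}")
    return Ref("local", text)

print(parse_ref("eco:wallet-pattern-v3"))
# Ref(scope='ecosystem', entity_id='wallet-pattern-v3', pinned_version=None)
print(parse_ref("eco:gdpr-policy-v5@5"))
# Ref(scope='ecosystem', entity_id='gdpr-policy-v5', pinned_version=5)
```

A floating reference leaves `pinned_version` as `None`, so the resolver can decide at read time which version is current.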

10 API shape

gRPC + protobuf as canonical core; thin adapters for other surfaces.

flowchart TB
    subgraph Core["Core API: gRPC + protobuf"]
      Ops["Entity CRUD · Claims · Filter queries (CEL)<br/>Traversal · Path queries · search_semantic<br/>Subscriptions · Registry CRUD · Introspection"]
    end
    Core -.adapter.-> REST[REST adapter<br/>auto-gen from proto]
    Core -.adapter.-> NL[NL adapter<br/>LLM-translated]
    Core -.adapter.-> GQL["GraphQL adapter<br/>F-graphql-api (deferred)"]
    Core -->|direct| Internal[Internal services<br/>L1-L5 pipeline, billing, identity]
    REST --> Humans[Humans / scripts]
    NL --> Agents[AI agents]
    GQL --> UIs[Admin console / dashboards]
D24 — gRPC core, adapters for REST / NL / GraphQL.
D24
gRPC + protobuf core; adapter layers
Single canonical core. Adapters: REST (auto-gen), NL (LLM-translated), GraphQL (deferred). All adapters use CEL for filtering. Subscriptions via gRPC streams with CEL filters. ACL + entitlements enforced uniformly.

11 Deployment & tenancy tiers

Same code path across cloud, on-prem, and OSS — only the entitlement source differs.

flowchart LR
    subgraph Cloud["Cloud"]
      BS[Billing Service] -->|entitlements.changed events| EE1[Engine: BillingServiceEntitlementSource]
      EE1 --> CFG1[ACL + feature gates]
    end
    subgraph OnPrem["On-prem (commercial license)"]
      LK[Cryptographic License Key<br/>RSA/ECDSA signed, verified locally] --> EE2[Engine: LicenseKeyEntitlementSource]
      EE2 --> CFG2[ACL + feature gates]
    end
    subgraph OSS["OSS community (potential future)"]
      NoLK[No license; basic features only] --> EE3[Engine: NullEntitlementSource]
      EE3 --> CFG3[ACL + basic features only]
    end
Same enforcement code path; entitlement source pluggable.
D15
Multi-deployment entitlement enforcement
Cloud → billing service. On-prem → cryptographic license keys (works air-gapped). OSS → community binary built without enterprise modules. Engine layer is the consistent gate.
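
The "same code path, different entitlement source" idea in D15 maps naturally onto a small interface. The three class names below come from the diagram above; everything else (the `features` method, the `Engine.is_enabled` gate, the feature names) is hypothetical. Signature verification for license keys is deliberately left out and assumed to have happened before the source is constructed.

```python
from typing import Protocol

BASIC = frozenset({"graph.core"})   # hypothetical feature name

class EntitlementSource(Protocol):
    def features(self, tenant_id: str) -> frozenset[str]: ...

class BillingServiceEntitlementSource:
    def __init__(self, plans: dict[str, frozenset[str]]) -> None:
        self._plans = plans          # would be refreshed by entitlements.changed events

    def features(self, tenant_id: str) -> frozenset[str]:
        # Unknown or downgraded tenants fall back to basic features (P11).
        return BASIC | self._plans.get(tenant_id, frozenset())

class LicenseKeyEntitlementSource:
    def __init__(self, licensed: frozenset[str]) -> None:
        self._licensed = licensed    # assumed already verified against the signed key

    def features(self, tenant_id: str) -> frozenset[str]:
        return BASIC | self._licensed

class NullEntitlementSource:
    def features(self, tenant_id: str) -> frozenset[str]:
        return BASIC

class Engine:
    """The single enforcement point; it never knows which source it was given."""
    def __init__(self, source: EntitlementSource) -> None:
        self._source = source

    def is_enabled(self, tenant_id: str, feature: str) -> bool:
        return feature in self._source.features(tenant_id)

cloud = Engine(BillingServiceEntitlementSource({"acme": frozenset({"search.semantic"})}))
oss = Engine(NullEntitlementSource())
print(cloud.is_enabled("acme", "search.semantic"))   # True
print(oss.is_enabled("acme", "search.semantic"))     # False
```

Because `Engine` depends only on the protocol, swapping deployment modes is a constructor argument rather than a code fork.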

Tiered tenancy + horizontal pod scaling

| Tier | Tenants | Isolation |
| --- | --- | --- |
| Small / hobbyist | 100K+ | RLS-pooled across multiple Postgres pods (5K–10K per pod) |
| Mid-market | 1K–10K | Schema-per-tenant in shared cluster |
| Enterprise | 100–1K | Database-per-tenant or dedicated cluster |

flowchart TB
    Router["Routing Layer<br/>(tenant_id → pod_id)"]
    Router --> P1["Pod 1<br/>5-10K tenants<br/>RLS"]
    Router --> P2["Pod 2<br/>5-10K tenants<br/>RLS"]
    Router --> P3["Pod 3<br/>5-10K tenants<br/>RLS"]
    Router --> Pn["Pod N<br/>5-10K tenants<br/>RLS"]
D15.1 — Salesforce-style horizontal pod scaling. Each pod = one Postgres cluster.
D15.1
Horizontal pod scaling for pooled tier
100K+ small tenants across multiple Postgres pods. Lookup-based routing layer (cached) maps tenant_id → pod via global metadata service. Salesforce-pod pattern.
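
A cached lookup router of the kind D15.1 describes can be sketched as follows. The `PodRouter` class and its method names are hypothetical, and a plain dictionary stands in for the global metadata service; a real cache would also need a TTL and an eviction policy, which are omitted here.

```python
from typing import Callable

class PodRouter:
    """Maps tenant_id to pod_id, consulting the metadata service only on a cache miss."""

    def __init__(self, metadata_lookup: Callable[[str], str]) -> None:
        self._lookup = metadata_lookup
        self._cache: dict[str, str] = {}
        self.lookups = 0                 # counts calls to the metadata service

    def pod_for(self, tenant_id: str) -> str:
        if tenant_id not in self._cache:
            self.lookups += 1
            self._cache[tenant_id] = self._lookup(tenant_id)
        return self._cache[tenant_id]

    def invalidate(self, tenant_id: str) -> None:
        """Call when a tenant is moved to another pod."""
        self._cache.pop(tenant_id, None)

assignments = {"acme": "pod-1", "beta-co": "pod-2"}   # stand-in metadata service
router = PodRouter(lambda tenant_id: assignments[tenant_id])

print(router.pod_for("acme"))   # pod-1
print(router.pod_for("acme"))   # pod-1 (served from cache)
print(router.lookups)           # 1
```

Lookup-based routing, unlike hashing on tenant_id, lets a single tenant be moved between pods by updating one metadata row and invalidating one cache entry.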

12 Tech choice & migration paths

Postgres + pgvector for both cloud and on-prem on day 1; a pluggable abstraction keeps migration paths open.

flowchart LR
    subgraph Day1["Day 1 (cloud + on-prem)"]
      PG[PostgresGraphStorageEngine<br/>Postgres 18 + pgvector + JSONB + CTEs]
    end
    subgraph CloudFuture["Cloud (when triggered)"]
      Mem[MemgraphGraphStorageEngine]
      N4[Neo4jGraphStorageEngine]
      Citus[Sharded Postgres / Citus]
    end
    subgraph OnPrem["On-prem (permanent)"]
      PGOnPrem[PostgresGraphStorageEngine<br/>same code]
    end
    PG -->|migration triggers fire| Mem
    PG -->|or| N4
    PG -->|or| Citus
    PG -.permanent.-> PGOnPrem
D26 — single tech day-1; pluggable abstraction enables cloud migration when measurements demand.
D26
Postgres + pgvector + JSONB
Single shared PostgresGraphStorageEngine for both cloud and on-prem day-1. Cloud migration triggers documented; on-prem stays permanently. Multi-SQL-backend support is custom-engagement only — Postgres bundle ships standalone for "we don't run Postgres" customers.

Migration triggers (any one fires)

| Trigger | Migration target |
| --- | --- |
| Cloud aggregate ≥ 100M entities | Memgraph / Neo4j / Citus (TBD by measurement) |
| p95 traversal latency >500ms at depth ≥3 sustained | Same |
| Shortest-path latency at depth ≥5 consistently >500ms p95 | Same (likely first to fire) |
| Cloud vector index ≥50M embeddings OR pgvector p95 search >500ms | Qdrant or Pinecone |
| Single-shard write throughput saturated | Citus OR distributed graph engine |

Note on deep traversal: Postgres CTE handles 1–3 hop traversals comfortably. 5–10 hop shortest-path queries on large apps are the most likely first migration trigger. Mitigation: bidirectional BFS at engine layer + pre-computed shortest-path cache + embedding-guided pruning. All built behind P13 abstraction so the application doesn't change when we migrate.
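
The trigger table reads as a set of independent checks over measured metrics, any one of which fires a migration review. A sketch, with the caveat that the `Metrics` fields are invented names and that "sustained" and "consistently" are not modeled: the inputs are assumed to be already-aggregated measurements.

```python
from dataclasses import dataclass

@dataclass
class Metrics:
    cloud_entities: int
    traversal_p95_ms_depth3: float
    shortest_path_p95_ms_depth5: float
    vector_embeddings: int
    pgvector_search_p95_ms: float
    write_shard_saturated: bool

def fired_triggers(m: Metrics) -> list[str]:
    """Names of the migration triggers whose thresholds are met."""
    checks = [
        ("entity count", m.cloud_entities >= 100_000_000),
        ("traversal latency", m.traversal_p95_ms_depth3 > 500),
        ("shortest-path latency", m.shortest_path_p95_ms_depth5 > 500),
        ("vector index", m.vector_embeddings >= 50_000_000 or m.pgvector_search_p95_ms > 500),
        ("write throughput", m.write_shard_saturated),
    ]
    return [name for name, fired in checks if fired]

snapshot = Metrics(
    cloud_entities=20_000_000,
    traversal_p95_ms_depth3=180.0,
    shortest_path_p95_ms_depth5=650.0,
    vector_embeddings=5_000_000,
    pgvector_search_p95_ms=120.0,
    write_shard_saturated=False,
)
print(fired_triggers(snapshot))   # ['shortest-path latency']
```

The sample values are made up; they simply illustrate the expectation stated above that deep shortest-path latency is the likeliest trigger to fire first.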

Replay-based per-tenant migration playbook

  1. Snapshot tenant X at time T from Postgres.
  2. Migrate snapshot to new engine.
  3. Replay claims since T from Postgres claim log into new engine.
  4. Run shadow-reads (queries hit both; results compared).
  5. Cut over reads to new engine when shadow-read confidence threshold met.
  6. Keep Postgres write path active for N days; can roll back.
  7. Decommission Postgres for that tenant.

Per-tenant migration means: small tenants migrate fast; large ones carefully; never "stuck" mid-migration.
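
Steps 2 to 5 of the playbook can be sketched with plain dictionaries standing in for both engines. The function names, the (entity_id, properties) claim shape, and the 0.999 threshold are all assumptions for illustration; the source leaves the confidence threshold unspecified.

```python
from typing import Callable

def migrate_tenant(snapshot: dict[str, dict], claims_since_t: list[tuple[str, dict]]) -> dict[str, dict]:
    """Steps 2-3: copy the snapshot, then replay later claims in log order."""
    new_state = {entity_id: dict(props) for entity_id, props in snapshot.items()}
    for entity_id, props in claims_since_t:
        new_state.setdefault(entity_id, {}).update(props)
    return new_state

def shadow_read_agreement(
    old_read: Callable[[str], object],
    new_read: Callable[[str], object],
    queries: list[str],
) -> float:
    """Step 4: fraction of queries for which both engines return the same result."""
    if not queries:
        return 0.0
    matches = sum(1 for q in queries if old_read(q) == new_read(q))
    return matches / len(queries)

def ready_to_cut_over(agreement: float, threshold: float = 0.999) -> bool:
    """Step 5: the threshold value here is a placeholder."""
    return agreement >= threshold

snapshot = {"wallet-screen": {"exists": True}}
later_claims = [("wallet-screen", {"screen_role": "Wallet"}), ("login-screen", {"exists": True})]
postgres_now = {
    "wallet-screen": {"exists": True, "screen_role": "Wallet"},
    "login-screen": {"exists": True},
}
new_engine = migrate_tenant(snapshot, later_claims)

agreement = shadow_read_agreement(postgres_now.get, new_engine.get,
                                  ["wallet-screen", "login-screen", "missing"])
print(agreement, ready_to_cut_over(agreement))   # 1.0 True
```

Replay works here for the same reason the hybrid model does: the claim log is the source of truth, so any engine that can apply claims in order converges on the same state.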

13 The five plug-in abstractions

Locked from day 1. Migration paths flow through these.

| Abstraction | Day-1 implementation | Future implementations |
| --- | --- | --- |
| GraphStorageEngine | PostgresGraphStorageEngine | Memgraph / Neo4j / Sharded Postgres |
| VectorIndex | PgvectorVectorIndex | Qdrant / Pinecone / Null |
| EntitlementSource | Billing (cloud) / License (on-prem) / Null (OSS) | (covered) |
| IdentityServiceClient | gRPC to Aucert identity service | (single; identity service swappable) |
| EventBus | Postgres LISTEN/NOTIFY | Kafka / Redpanda |

P13 in practice: Application code calls graph.findShortestPath(from, to, maxDepth=10). The Postgres adapter implements this with CTEs internally; the eventual Neo4j adapter uses Cypher's shortestPath. Same call site; different backend. No SQL or Cypher above the abstraction.
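
The call-site stability described above can be illustrated with an abstract base class and a toy backend. The `InMemoryGraphStorageEngine` below is purely hypothetical (it is not one of the planned engines) and uses plain breadth-first search; the method is spelled `find_shortest_path` to follow Python naming, corresponding to graph.findShortestPath in the text.

```python
from abc import ABC, abstractmethod
from collections import deque
from typing import Optional

class GraphStorageEngine(ABC):
    """Graph-native contract: callers never see SQL or Cypher."""

    @abstractmethod
    def find_shortest_path(self, start: str, goal: str, max_depth: int = 10) -> Optional[list[str]]:
        ...

class InMemoryGraphStorageEngine(GraphStorageEngine):
    """Toy backend for illustration; a Postgres adapter would use CTEs instead."""

    def __init__(self, edges: dict[str, list[str]]) -> None:
        self._edges = edges

    def find_shortest_path(self, start: str, goal: str, max_depth: int = 10) -> Optional[list[str]]:
        queue = deque([[start]])
        seen = {start}
        while queue:
            path = queue.popleft()
            if path[-1] == goal:
                return path
            if len(path) - 1 >= max_depth:   # hop budget exhausted on this branch
                continue
            for neighbor in self._edges.get(path[-1], []):
                if neighbor not in seen:
                    seen.add(neighbor)
                    queue.append(path + [neighbor])
        return None

graph: GraphStorageEngine = InMemoryGraphStorageEngine({
    "login": ["home"],
    "home": ["wallet", "settings"],
    "wallet": ["checkout"],
})
print(graph.find_shortest_path("login", "checkout"))
# ['login', 'home', 'wallet', 'checkout']
print(graph.find_shortest_path("login", "checkout", max_depth=2))   # None
```

Application code holds a `GraphStorageEngine` reference only, so replacing the backend changes the constructor call and nothing else.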

14 Scale envelope

D25: illustrative ranges, to be tuned as we measure year-1 customer behavior.

Per-tenant ranges

| Dimension | Small early | Mid-market | Large enterprise |
| --- | --- | --- | --- |
| Apps in scope | 1–3 mobile | 5–10 mobile + web | 10–50 across mobile/web/backend |
| Entities | 3K–15K | 50K–200K | 500K–2M |
| Claims (full history) | 30K–750K | 5M–50M | 50M–500M |
| Storage | 100MB–2GB | 5GB–50GB | 50GB–500GB |

Cloud aggregate by year 3 (pessimistic)

~2B entities · ~500B claims · ~1PB storage · 100K–1M reads/sec · 10K–100K writes/sec

Latency targets

| Operation | p50 | p95 | p99 |
| --- | --- | --- | --- |
| Get entity by ID | <10ms | <50ms | <200ms |
| Filter query (CEL) | <50ms | <200ms | <1s |
| Semantic search | <100ms | <500ms | <2s |
| Traversal (3 hops) | <100ms | <500ms | <2s |

ACL evaluation (per request): <5ms cached / <50ms uncached

15 Future enhancements

End-state goals with explicit migration paths. Today's design preserves them all.

F1
Claims-as-primary (b)
Drop materialized current-truth; derive from log. Cheap migration thanks to C1 (claim-shaped log).
F2
Capability-graph ACL (Zanzibar)
All permissions become graph edges. Cheap migration thanks to C2 (principal IDs in role fields).
F3
CEL grammar v2 extensions
Tenant-defined functions or richer built-in operators. Versioned grammar field preserves compatibility.
F4
Structured composite variable types
Records, nested types, parametric. When primitives + multiple variables prove insufficient.
F5
Entity-property refs in CEL
Conditions can reference other entity properties. Engine extension; v2 grammar.
F6
Materialized inference claims
Inference engine writes derivations to log when scale demands it.
F7
Hierarchical / transitive ACL
Subgraph reachability scopes. Engine traversal at ACL-check time.
F8
Property-level ACL
Sensitivity tags + property_filter (already reserved in rule shape). Triggered by PII / regulated data.
F9
External authorization adapter
Pluggable adapter for advanced enterprise auth services (OPA, BeyondTrust, custom).
F10
Denormalized principal display cache
Small in-graph cache (NOT mirror) for query performance, when fan-out becomes bottleneck.
F-multi-facet
Multi-facet embeddings
Visual + text + combined per entity. Triggered by visual screen similarity use case.
F-graphql-api
GraphQL adapter for UIs
Powerful query composition for reactive UIs. Deferred unless UI complexity demands.
F-graph-vector-fusion
Engine-side combined queries
Single-step semantic + traversal. Two-step composition is day-1 default.
F-per-tenant-embedding-model
Per-tenant embedding model
For enterprise customers with compliance-approved model requirements.
F-deep-merge-semantics
Richer split/merge
Partial merges, attribute reconciliation, multi-way merge. When basic claim-based mechanism (D18) proves insufficient.

16 What's next

From design to implementation.

Already done

  • Design notes (`.tasks/drafts/validation-graph-design-notes.md`) — full reasoning, alternatives, worked examples, NL ↔ structured pairs. ~1500 lines.
  • Architecture summary (`.tasks/drafts/validation-graph-architecture-summary.md`) — readable distillation with mermaid diagrams.
  • SPEC-035 (`docs/specs/drafts/SPEC-035-validation-graph.md`) — formal spec ready for review.
  • ADR-005 amended with supersession note pointing to SPEC-035.
  • This walkthrough — for team explanation.

Coming up

  1. Team review of SPEC-035 before approval.
  2. Day-1 POC — small Postgres + pgvector + CEL POC validating the design before full implementation.
  3. Implementation roadmap — module structure, sequencing, team allocation.
  4. Operational sub-specs — Q-* backlog items get their own specs as their time comes (validation workflows, pod architecture, embedding rollout, migration playbook, etc.).
  5. Promotion pipeline design (Q1) — once Tenant Graph is built, design how facts flow tenant → ecosystem.

Open operational backlog (parked)

Not graph-shaping; will be designed when their time comes. See design notes' "Operational" section for the full list — Q-validation-workflows, Q-merge-split-deep-dive, Q-vector-replication, Q-embedding-rollout, Q-postgres-adapter-optimizations, Q-pod-architecture, Q-migration-playbook, Q-tenancy-tiering, Q-identity-resolution, Q-debug, Q-license-*, Q-entitlement-*, Q-hook-ip-framework, plus Q5.2–Q5.5 (rover internals).

17 References

Links to companion documents and external standards.

Aucert documents

External standards

Generated from the design notes 2026-05-15. Last updated as the SPEC and design notes evolve.