
Shared KB bifurcation — wiki-indexed now, vector-RAG deferred (2026-05-08)

This document captures the 2026-05-08 decision to bifurcate the per-agent memory work into two layers with different storage strategies. It exists so the deferred path (vector-RAG seeding of shared_kb_db) can be picked up later without re-doing the brainstorming, and so the rationale survives the agent-context window of any single session.

Context

The original handover at per-agent-memory-handover-2026-05-06.md proposed a two-layer memory system:

  1. Shared memory — shared_kb_db seeded with canonical terms, approved specs, ADRs, and system topology. Used by all agents. Served via vector-RAG (pgvector + embedding model).
  2. Per-agent memory — agent_memory_db. Per-agent state written at runtime by the agent. Vector-RAG over private memories.

PR-1 (merged 2026-05-07, aucert#115) shipped the infrastructure for both: VECTOR allow-list on the internal PG server, Flyway block for shared_kb_db, and SHARED_KB_DB_* keys on astra-db-credentials. After PR-1, shared_kb_db has 8 tables but is empty.

Phase 2 (PR-2) was originally scoped as a Kubernetes Job that walks docs/specs/approved/*.md, .context/decisions/*.md, .context/ARCHITECTURE.md, and .context/GLOSSARY.md, embeds chunks via EmbeddingClient (AWS Bedrock + Titan v2, 1536 dims), and writes rows to approved_specs + approved_spec_embeddings, adrs + adr_embeddings, canonical_terms, and system_topology. Estimated ~3-5 days of work. This is the seed work that, on the original plan, would precede SPEC-NNN drafting and the per-agent memory build.

Alternative considered: LLM Wiki

Mid-flight on 2026-05-08, an alternative pattern surfaced: instead of seeding a database with embeddings, treat the repo itself as the wiki and let agents navigate it via Read + Glob + auto-generated INDEX files. This is the pattern Cognition/DeepWiki popularized for codebase navigation, and it generalizes to any markdown corpus.

The central observation: the Aucert repo already is a wiki. .context/, docs/specs/approved/, .context/decisions/ are markdown trees that every agent already has Read access to. Building shared_kb_db adds a database layer over content the agent can read directly.

Per-axis comparison

| Axis | Wiki-indexed (chosen) | Vector-RAG (deferred) |
| --- | --- | --- |
| Source of truth | git (markdown files) | shared_kb_db (replicated from git) |
| Update mechanism | git commit; auto-regenerate INDEX via pre-commit hook | seed-job re-embeds and upserts on doc change |
| Per-lookup tokens | ~10K (INDEX read + 1-2 file Reads) | ~1.5K (top-3 chunks) |
| Per-lookup API calls | 0 | 1 embed call (~$0.000001) |
| Build cost | $0 (50-line bash script) | $0.01-0.03 per re-seed (Titan v2 @ $0.02/1M tokens, ~250K tokens) |
| Schema lock-in | None (markdown is portable) | vector(1536) hard-coded — model swap requires schema migration |
| Provider lock-in | None | AWS Bedrock (current EmbeddingClient wiring) |
| Failure mode | "Wrong file Read" — debuggable | "Cosine 0.78 ranked above 0.74" — opaque |
| Atomic updates | Yes (git commit) | No (partial re-embeds possible) |
| Crossover point | Wins until ~250 specs (INDEX.md exceeds context budget) | Wins past ~1000 specs (vector retrieval mandatory) |
| Aucert today | ~10 approved specs, ~14 ADRs, ~17 drafts → ~5-10 years of growth before crossover | Mandatory only at scale we will not reach soon |

Conclusion

Wiki-indexed wins on every axis except per-query token count at the current corpus size, and the gap there (roughly 6-7×: ~10K vs ~1.5K tokens per lookup) does not justify the operational complexity of an embedding pipeline.

Decision

Bifurcate. Wiki-indexed for shared KB; vector-RAG only for per-agent memory.

| Layer | Storage | Why |
| --- | --- | --- |
| Canonical terms (vocabulary) | Wiki — .context/GLOSSARY.md | Already exists; PR is the proposal workflow. |
| Approved specs | Wiki — docs/specs/approved/*.md indexed by docs/specs/INDEX.md | Already exists. Frontmatter is structured. |
| ADRs | Wiki — .context/decisions/*.md indexed by .context/decisions/INDEX.md | Already exists. |
| System topology | Wiki — .context/ARCHITECTURE.md | Already exists. |
| Per-agent memory | DB — agent_memory_db + pgvector | Runtime writes by the agent, semantic recall over thousands of memories. Wiki pattern does not apply — content is generated, not curated. |

What ships now (PR-A — wiki-indexed shared KB)

  • tools/scripts/build-kb-index.sh — generates docs/specs/INDEX.md and .context/decisions/INDEX.md from frontmatter.
  • Pre-commit hook build-kb-index regenerates on changes to spec/ADR files.
  • Updates to AGENTS.md (Hard Rule 15) and .context/AI_RULES.md (Hard Rule 13 + new "Knowledge base lookups" section) teaching agents to consult INDEX first.
  • This decision document.
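The index generator is described above as a short bash script; the Python below is only an illustrative sketch of the same idea, not the contents of build-kb-index.sh. It assumes each file opens with a `---`-delimited frontmatter block of simple `key: value` lines. The field names `id`, `title`, and `status`, and the output line format, are hypothetical.

```python
from pathlib import Path


def parse_frontmatter(text: str) -> dict[str, str]:
    """Parse a leading '---' delimited block of simple 'key: value' lines."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    fields: dict[str, str] = {}
    for line in lines[1:]:
        if line.strip() == "---":
            break
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def build_index(docs: dict[str, str]) -> str:
    """Render one INDEX line per document, sorted by path for stable diffs."""
    rows = []
    for path in sorted(docs):
        fm = parse_frontmatter(docs[path])
        rows.append(
            f"- {fm.get('id', '?')}: {fm.get('title', '(untitled)')} "
            f"[{fm.get('status', 'unknown')}] ({path})"
        )
    return "\n".join(rows) + "\n"


def build_index_for_dir(root: Path) -> str:
    """Collect every markdown file under root except the INDEX itself."""
    docs = {
        p.as_posix(): p.read_text(encoding="utf-8")
        for p in root.glob("*.md")
        if p.name != "INDEX.md"
    }
    return build_index(docs)
```

Sorting by path keeps the generated file deterministic, so a pre-commit hook that regenerates it only produces a diff when a spec or ADR actually changed.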

What is deferred (re-evaluate when triggers below fire)

The vector-RAG path remains technically unblocked. PR-1's infrastructure is intact:

  • shared_kb_db schema is migrated and live (8 tables — verifiable via flyway_schema_history).
  • VECTOR extension is allow-listed on the PG server.
  • SHARED_KB_DB_* keys are populated on astra-db-credentials.
  • The Flyway block in deploy-astra.yml continues to apply future migrations.
  • EmbeddingClient (Bedrock + Titan v2) is wired and ready.
  • SharedKbSearchTool, ApprovedSpecsSearchTool, AdrsSearchTool, SystemTopologySearchTool, CanonicalTermsLookupTool exist as ~100-line implementations that currently return [] (tables empty). They will start returning real results the moment seed data lands.

The deferred work to reactivate this path:

  1. Seed job (~3-5 days) — Kubernetes Job (or scheduled workflow) that walks repo content, embeds via EmbeddingClient, and upserts to shared_kb_db tables. Idempotent — content-hash UPSERT for spec/ADR/topology rows; wholesale TRUNCATE+INSERT for canonical_terms (small dictionary). Use ON DELETE CASCADE on the embedding rows' foreign keys so embeddings follow their parent rows.
  2. Trigger workflow — separate seed-shared-kb.yml watching docs/specs/approved/**, .context/decisions/**, .context/ARCHITECTURE.md, .context/GLOSSARY.md. Manual trigger via workflow_dispatch for one-shot rebuilds.
  3. Force-rebuild flag — when embedding model changes (e.g., Titan v2 → v3, or a Foundry-hosted swap), set a flag that ignores hashes and re-embeds everything. Cheap, prevents stale-vector haunting.
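The idempotency and force-rebuild behaviour in steps 1 and 3 can be sketched as follows. This is a minimal in-memory illustration, not the seed job: `plan_seed`, the `stored_hashes` mapping, and the `force_rebuild` parameter are hypothetical names. A real job would read the hashes from shared_kb_db and call EmbeddingClient for each path returned. Deleting rows for files that no longer exist is also an assumption; the steps above do not spell it out.

```python
import hashlib


def content_hash(text: str) -> str:
    """Stable fingerprint of a document's content."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def plan_seed(
    docs: dict[str, str],
    stored_hashes: dict[str, str],
    force_rebuild: bool = False,
) -> tuple[list[str], list[str]]:
    """Return (paths to embed and upsert, paths whose rows should be deleted)."""
    to_embed = [
        path
        for path in sorted(docs)
        # Re-embed when forced, when the path is new, or when content changed.
        if force_rebuild or stored_hashes.get(path) != content_hash(docs[path])
    ]
    # Rows for files that left the repo (assumed behaviour, see lead-in).
    to_delete = sorted(set(stored_hashes) - set(docs))
    return to_embed, to_delete
```

Running the plan twice over unchanged content yields an empty embed list, which is what makes the job safe to trigger on every push; the force flag bypasses only the hash comparison, not the delete detection.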

Triggers to re-evaluate

| Signal | Action |
| --- | --- |
| Approved spec count exceeds ~250 | INDEX.md exceeds ~50K tokens → loading it costs more than vector retrieval. Activate seeding. |
| Agents repeatedly fail to find relevant context via INDEX-driven lookup | Quality signal. Investigate — could be index design, prompt design, or genuine RAG need. |
| Shared corpus grows to include non-markdown sources (code symbols, diagrams, video transcripts) | Markdown-only navigation breaks down. Activate seeding with multi-modal embeddings. |
| Memory recall in agent_memory_db motivates blending with shared canonical knowledge | The shared_kb_search tool was designed to be blended with agent_memory_recall. If the blending pattern proves valuable, seed shared KB to enable it. |
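The first trigger implies a rough per-entry budget: ~50K tokens across ~250 specs works out to about 200 tokens per INDEX entry. The sketch below makes that arithmetic explicit so the threshold can be re-derived if entries grow or shrink. The function names are hypothetical, and the 200-token figure is inferred from the two numbers in the first trigger, not measured.

```python
def tokens_per_entry(index_token_budget: int, spec_count: int) -> float:
    """Average INDEX tokens consumed by one spec entry."""
    return index_token_budget / spec_count


def crossover_spec_count(index_token_budget: int, per_entry_tokens: float) -> int:
    """Largest spec count whose INDEX still fits inside the token budget."""
    return int(index_token_budget // per_entry_tokens)
```

If richer frontmatter doubled the average entry to 400 tokens, the same 50K budget would be reached at about 125 specs, so the trigger is better tracked as an INDEX token count than as a fixed spec count.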

If any of these fire, this document is the entry point. Re-read the original handover §Phase 2 for the seed-job design and proceed from there.

Side notes captured during the 2026-05-08 design conversation

These do not change the decision but are worth preserving so future readers do not re-discover them.

astra-db-credentials is script-managed, not Terraform-managed

The astra-db-credentials K8s secret is created by tools/scripts/setup-astra-secrets.sh (reads from Key Vault, applies via kubectl). PR-1 extended that script to add SHARED_KB_DB_* keys. The "proper" Terraform-managed-infra path (per feedback_infra_proper_approach.md user preference) would be a kubernetes_secret_v1 resource or Key Vault CSI mount. This is its own infra refactor and was deliberately kept out of PR-1's scope.

setup-astra-secrets.sh has a destructive footgun

The script's confirm_overwrite function returns "proceed" when the secret does not exist, which silently regenerates ENCRYPTION_MASTER_KEY (the script itself warns this is permanent and breaks all encrypted agent tokens). On 2026-05-08 the user nearly hit this path — escaped by Ctrl+C at the CF_AUDIENCE prompt. Worth a follow-up: gate ENCRYPTION_MASTER_KEY regeneration behind an explicit --regenerate-encryption-key flag or a separate "this is destructive" prompt that fires regardless of secret existence.

deploy-astra.yml path filter excludes infra-only changes

The paths-filter@v3 filter in deploy-astra.yml only watches internal/backend/**, internal/frontend/**, frontend/packages/ui/**, and infra/migrations/**. Changes to Terraform, the deploy workflow itself, the secrets script, or context docs do not auto-trigger the workflow. PR-1 needed a manual gh workflow run deploy-astra.yml --ref main to deploy. Worth noting for future infra-only PRs.
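To check quickly whether a change set would trigger the workflow, the four watched patterns can be approximated in a few lines. This sketch uses Python's `fnmatch`, whose `*` also matches `/`, as a stand-in for the glob matching the filter action performs; the two are not guaranteed to agree on edge cases. `triggers_deploy` is a hypothetical helper.

```python
from fnmatch import fnmatchcase

# Patterns as listed in the deploy-astra.yml filter described above.
WATCHED = [
    "internal/backend/**",
    "internal/frontend/**",
    "frontend/packages/ui/**",
    "infra/migrations/**",
]


def triggers_deploy(changed_paths: list[str]) -> bool:
    """True if any changed path falls under a watched pattern."""
    return any(
        fnmatchcase(path, pattern)
        for path in changed_paths
        for pattern in WATCHED
    )
```

A change limited to tools/scripts/setup-astra-secrets.sh returns False here, which matches the behaviour that forced the manual workflow run for PR-1.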

SPEC id collisions

The frontmatter validator checks filename↔frontmatter id match per file but does not enforce cross-file uniqueness. Auto-generating INDEX.md exposed three files all claiming SPEC-021 (identity-domain, implementation-plan, wave5-dispatcher-design) and two claiming SPEC-020 (unified-platform-management, user-management-system). Worth a separate cleanup PR. The next free spec id is SPEC-025 — when the per-agent memory work needs a real spec, that is the id to use, not SPEC-023 (which is finance-catalog-domain).
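A cross-file uniqueness check of the kind the validator lacks could look like the sketch below. It is not the existing validator: `find_duplicate_ids` is a hypothetical helper, and the input is assumed to be a mapping from file name to the id declared in that file's frontmatter.

```python
from collections import defaultdict


def find_duplicate_ids(id_by_file: dict[str, str]) -> dict[str, list[str]]:
    """Map each spec id claimed by more than one file to its sorted claimants."""
    files_by_id: dict[str, list[str]] = defaultdict(list)
    for filename, spec_id in id_by_file.items():
        files_by_id[spec_id].append(filename)
    return {
        spec_id: sorted(files)
        for spec_id, files in files_by_id.items()
        if len(files) > 1
    }


# The collisions reported above, plus one non-colliding id for contrast.
claims = {
    "identity-domain": "SPEC-021",
    "implementation-plan": "SPEC-021",
    "wave5-dispatcher-design": "SPEC-021",
    "unified-platform-management": "SPEC-020",
    "user-management-system": "SPEC-020",
    "finance-catalog-domain": "SPEC-023",
}
duplicates = find_duplicate_ids(claims)
```

Failing the pre-commit hook whenever the result is non-empty would catch a collision at the commit that introduces it, before it reaches INDEX.md.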

Subsequent PRs (per-agent memory build)

The per-agent memory work continues unchanged from the original handover, just without Phase 2/3:

  • PR-B — agent_memory_db provisioned via Terraform, infra/migrations/agent-memory/V001__create_agent_memories.sql, fourth Flyway block in deploy-astra.yml, AGENT_MEMORY_DB_* keys via setup-astra-secrets.sh, context file updates.
  • PR-C — AgentMemoryRepository + three tools (agent_memory_recall, agent_memory_remember, agent_memory_forget), tool registry registration, tests.
  • PR-D — Personality prompt updates (astra_db.personalities row for atlas), Astra UI memory listing page (read-only audit), zero-hits metric.
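To make the shape of the three tools and the zero-hits metric concrete, here is a toy in-memory stand-in. It is a sketch under stated assumptions, not the planned AgentMemoryRepository: the class name, method signatures, the `min_score` threshold, and the use of brute-force cosine similarity in place of pgvector are all hypothetical, and embeddings are passed in directly instead of being produced by EmbeddingClient.

```python
import math
from dataclasses import dataclass, field


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity; returns 0.0 when either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


@dataclass
class InMemoryAgentMemory:
    """Toy store scoped to a single agent (hypothetical, for illustration)."""

    memories: dict[int, tuple[str, list[float]]] = field(default_factory=dict)
    zero_hits: int = 0  # count of recalls that returned nothing
    _next_id: int = 1

    def remember(self, text: str, embedding: list[float]) -> int:
        memory_id = self._next_id
        self._next_id += 1
        self.memories[memory_id] = (text, embedding)
        return memory_id

    def recall(self, query: list[float], k: int = 3, min_score: float = 0.5) -> list[str]:
        scored = sorted(
            ((cosine(query, emb), text) for text, emb in self.memories.values()),
            reverse=True,
        )
        hits = [text for score, text in scored[:k] if score >= min_score]
        if not hits:
            self.zero_hits += 1
        return hits

    def forget(self, memory_id: int) -> bool:
        return self.memories.pop(memory_id, None) is not None
```

Counting empty recalls separately from errors gives a direct signal of whether agents are asking for memories that were never written, which is the question a zero-hits metric is meant to answer.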

When PR-D ships, the original handover document is fully closed out except for the deferred shared-KB seeding above.

References