Shared KB bifurcation — wiki-indexed now, vector-RAG deferred (2026-05-08)
This document captures the 2026-05-08 decision to bifurcate the per-agent memory work into two layers with different storage strategies. It exists so the deferred path (vector-RAG seeding of shared_kb_db) can be picked up later without re-doing the brainstorming, and so the rationale survives the agent-context window of any single session.
Context
The original handover at per-agent-memory-handover-2026-05-06.md proposed a two-layer memory system:
- Shared memory: shared_kb_db seeded with canonical terms, approved specs, ADRs, and system topology. Used by all agents. Served via vector-RAG (pgvector + embedding model).
- Per-agent memory: agent_memory_db. Per-agent state written at runtime by the agent. Vector-RAG over private memories.
PR-1 (merged 2026-05-07, aucert#115) shipped the infrastructure for both: VECTOR allow-list on the internal PG server, Flyway block for shared_kb_db, and SHARED_KB_DB_* keys on astra-db-credentials. After PR-1, shared_kb_db has 8 tables but is empty.
Phase 2 (PR-2) was originally scoped as a Kubernetes Job that walks docs/specs/approved/*.md, .context/decisions/*.md, .context/ARCHITECTURE.md, and .context/GLOSSARY.md, embeds chunks via EmbeddingClient (AWS Bedrock + Titan v2, 1536 dims), and writes rows to approved_specs + approved_spec_embeddings, adrs + adr_embeddings, canonical_terms, and system_topology. Estimated ~3-5 days of work. This is the seed work that, on the original plan, would precede SPEC-NNN drafting and the per-agent memory build.
Alternative considered: LLM Wiki
Mid-flight on 2026-05-08, an alternative pattern surfaced: instead of seeding a database with embeddings, treat the repo itself as the wiki and let agents navigate it via Read + Glob + auto-generated INDEX files. This is the pattern Cognition/DeepWiki popularized for code-base navigation, and it generalizes to any markdown corpus.
The central observation: the Aucert repo already is a wiki. .context/, docs/specs/approved/, .context/decisions/ are markdown trees that every agent already has Read access to. Building shared_kb_db adds a database layer over content the agent can read directly.
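As a rough illustration of INDEX-driven navigation, the sketch below parses an index table and returns the file paths an agent would Read next. The INDEX layout (id | title | path), the sample rows, and the function names are assumptions made for the example; the real INDEX.md format may differ.

```python
# Hypothetical INDEX.md content: one table row per document (id | title | path).
SAMPLE_INDEX = """\
| id | title | path |
|---|---|---|
| SPEC-900 | Example placeholder spec | docs/specs/approved/SPEC-900-example.md |
| SPEC-023 | Finance catalog domain | docs/specs/approved/SPEC-023-finance-catalog-domain.md |
"""


def parse_index(text):
    """Turn the markdown table into a list of {id, title, path} dicts."""
    rows = []
    for line in text.splitlines():
        cells = [c.strip() for c in line.strip().strip("|").split("|")]
        # Skip malformed lines, the header row, and the |---| separator row.
        if len(cells) != 3 or cells[0] == "id" or set(cells[0]) <= {"-"}:
            continue
        rows.append({"id": cells[0], "title": cells[1], "path": cells[2]})
    return rows


def lookup(rows, query):
    """Return paths whose title contains every word of the query."""
    terms = query.lower().split()
    return [r["path"] for r in rows if all(t in r["title"].lower() for t in terms)]


print(lookup(parse_index(SAMPLE_INDEX), "finance catalog"))
```

A wrong hit here is a visibly wrong path, which is the "debuggable" failure mode the wiki approach is chosen for.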
Per-axis comparison
| Axis | Wiki-indexed (chosen) | Vector-RAG (deferred) |
|---|---|---|
| Source of truth | git (markdown files) | shared_kb_db (replicated from git) |
| Update mechanism | git commit; auto-regenerate INDEX via pre-commit hook | seed-job re-embeds and upserts on doc change |
| Per-lookup tokens | ~10K (INDEX read + 1-2 file Reads) | ~1.5K (top-3 chunks) |
| Per-lookup API calls | 0 | 1 embed call (~$0.000001) |
| Build cost | $0 (50-line bash script) | $0.01-0.03 per re-seed (Titan v2 @ $0.02/1M tokens, ~250K tokens) |
| Schema lock-in | None (markdown is portable) | vector(1536) hard-coded — model swap requires schema migration |
| Provider lock-in | None | AWS Bedrock (current EmbeddingClient wiring) |
| Failure mode | "Wrong file Read" — debuggable | "Cosine 0.78 ranked above 0.74" — opaque |
| Atomic updates | Yes (git commit) | No (partial re-embeds possible) |
| Crossover point | Wins until ~250 specs (INDEX.md exceeds context budget) | Wins past ~1000 specs (vector retrieval mandatory) |
| Aucert today | ~10 approved specs, ~14 ADRs, ~17 drafts → ~5-10 years of growth before crossover | Mandatory only at scale we will not reach soon |
Conclusion
Wiki-indexed wins on every axis except per-query token count at the current corpus size, and the gap there (~6×) does not justify the operational complexity of an embedding pipeline.
Decision
Bifurcate. Wiki-indexed for shared KB; vector-RAG only for per-agent memory.
| Layer | Storage | Why |
|---|---|---|
| Canonical terms (vocabulary) | Wiki — .context/GLOSSARY.md | Already exists; PR is the proposal workflow. |
| Approved specs | Wiki — docs/specs/approved/*.md indexed by docs/specs/INDEX.md | Already exists. Frontmatter is structured. |
| ADRs | Wiki — .context/decisions/*.md indexed by .context/decisions/INDEX.md | Already exists. |
| System topology | Wiki — .context/ARCHITECTURE.md | Already exists. |
| Per-agent memory | DB — agent_memory_db + pgvector | Runtime writes by the agent, semantic recall over thousands of memories. Wiki pattern does not apply — content is generated, not curated. |
What ships now (PR-A — wiki-indexed shared KB)
- tools/scripts/build-kb-index.sh: generates docs/specs/INDEX.md and .context/decisions/INDEX.md from frontmatter.
- Pre-commit hook build-kb-index regenerates on changes to spec/ADR files.
- Updates to AGENTS.md (Hard Rule 15) and .context/AI_RULES.md (Hard Rule 13 + new "Knowledge base lookups" section) teaching agents to consult INDEX first.
- This decision document.
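The shipped indexer is a bash script. The Python sketch below only illustrates the same idea, generating an index table from frontmatter. The frontmatter keys (id, title), the output columns, and the function names are assumptions for the example, not the script's actual behaviour.

```python
from pathlib import Path


def read_frontmatter(text):
    """Parse a leading '---' fenced block of simple 'key: value' lines."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    meta = {}
    for line in lines[1:]:
        if line.strip() == "---":
            return meta
        key, sep, value = line.partition(":")
        if sep:
            meta[key.strip()] = value.strip().strip('"')
    return {}  # no closing fence: treat the file as having no frontmatter


def build_index(directory, heading):
    """Return INDEX.md content for every markdown file in `directory`."""
    rows = []
    # Sorted iteration keeps the output stable, so a pre-commit hook
    # only produces a diff when the underlying files change.
    for path in sorted(Path(directory).glob("*.md")):
        if path.name == "INDEX.md":
            continue
        meta = read_frontmatter(path.read_text(encoding="utf-8"))
        rows.append((meta.get("id", "?"), meta.get("title", path.stem), path.name))
    out = [f"# {heading}", "", "| id | title | file |", "|---|---|---|"]
    out += [f"| {i} | {t} | {f} |" for i, t, f in rows]
    return "\n".join(out) + "\n"
```

Writing the result back with something like `Path("docs/specs/INDEX.md").write_text(build_index("docs/specs/approved", "Approved specs"))` would mirror what the hook does on each commit, under the same assumptions.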
What is deferred (re-evaluate when triggers below fire)
The vector-RAG path remains technically unblocked. PR-1's infrastructure is intact:
- shared_kb_db schema is migrated and live (8 tables, verifiable via flyway_schema_history).
- VECTOR extension is allow-listed on the PG server.
- SHARED_KB_DB_* keys are populated on astra-db-credentials.
- The Flyway block in deploy-astra.yml continues to apply future migrations.
- EmbeddingClient (Bedrock + Titan v2) is wired and ready.
- SharedKbSearchTool, ApprovedSpecsSearchTool, AdrsSearchTool, SystemTopologySearchTool, CanonicalTermsLookupTool exist as ~100-line implementations that currently return [] (tables empty). They will start returning real results the moment seed data lands.
The deferred work to reactivate this path:
- Seed job (~3-5 days): Kubernetes Job (or scheduled workflow) that walks repo content, embeds via EmbeddingClient, and upserts to shared_kb_db tables. Idempotent: content-hash UPSERT for spec/ADR/topology rows; wholesale TRUNCATE+INSERT for canonical_terms (small dictionary). Use ON DELETE CASCADE on the embedding tables' foreign keys so embeddings are removed together with their parent rows.
- Trigger workflow: separate seed-shared-kb.yml watching docs/specs/approved/**, .context/decisions/**, .context/ARCHITECTURE.md, .context/GLOSSARY.md. Manual trigger via workflow_dispatch for one-shot rebuilds.
- Force-rebuild flag: when the embedding model changes (e.g., Titan v2 → v3, or a Foundry-hosted swap), set a flag that ignores hashes and re-embeds everything. Cheap, and it prevents stale vectors from lingering after a model swap.
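The idempotency and force-rebuild rules above can be sketched as a pure planning step that decides which documents need re-embedding. This is a minimal sketch, assuming the seed job keeps one stored hash per source path; `plan_seed`, `content_hash`, and the dict shapes are hypothetical names, not part of the handover design.

```python
import hashlib


def content_hash(text):
    """Stable fingerprint of a document body."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def plan_seed(docs, stored_hashes, force_rebuild=False):
    """Decide what the seed job has to do.

    docs:          {path: current text} read from the repo
    stored_hashes: {path: hash} already recorded in the database (assumed)
    Returns (paths to embed and upsert, paths to delete).
    """
    to_embed = sorted(
        path
        for path, text in docs.items()
        # force_rebuild ignores hashes entirely, e.g. after a model swap.
        if force_rebuild or stored_hashes.get(path) != content_hash(text)
    )
    # Rows whose source file disappeared; their embeddings would go with
    # them if the foreign keys cascade on delete.
    to_delete = sorted(set(stored_hashes) - set(docs))
    return to_embed, to_delete
```

Running the plan twice against unchanged content yields an empty embed list, which is what makes re-running the job safe and cheap.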
Triggers to re-evaluate
| Signal | Action |
|---|---|
| Approved spec count exceeds ~250 | INDEX.md exceeds ~50K tokens → loading it costs more than vector retrieval. Activate seeding. |
| Agents repeatedly fail to find relevant context via INDEX-driven lookup | Quality signal. Investigate — could be index design, prompt design, or genuine RAG need. |
| Shared corpus grows to include non-markdown sources (code symbols, diagrams, video transcripts) | Markdown-only navigation breaks down. Activate seeding with multi-modal embeddings. |
| Memory recall in agent_memory_db motivates blending with shared canonical knowledge | The shared_kb_search tool was designed to be blended with agent_memory_recall. If the blending pattern proves valuable, seed shared KB to enable it. |
If any of these fire, this document is the entry point. Re-read the original handover §Phase 2 for the seed-job design and proceed from there.
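The first trigger is the only one that can be checked mechanically. A minimal sketch of such a check follows, assuming a crude four-characters-per-token estimate rather than a real tokenizer; the function names and the default paths mentioned afterwards are illustrative.

```python
from pathlib import Path

SPEC_COUNT_TRIGGER = 250       # from the trigger table
INDEX_TOKEN_TRIGGER = 50_000   # from the trigger table
CHARS_PER_TOKEN = 4            # assumption: rough heuristic, not a tokenizer


def estimate_tokens(text):
    return len(text) // CHARS_PER_TOKEN


def check_triggers(spec_dir, index_path,
                   spec_limit=SPEC_COUNT_TRIGGER, token_limit=INDEX_TOKEN_TRIGGER):
    """Return a human-readable list of the size triggers that have fired."""
    spec_count = len(list(Path(spec_dir).glob("*.md")))
    index = Path(index_path)
    index_tokens = 0
    if index.exists():
        index_tokens = estimate_tokens(index.read_text(encoding="utf-8"))
    fired = []
    if spec_count > spec_limit:
        fired.append(f"spec count {spec_count} exceeds {spec_limit}")
    if index_tokens > token_limit:
        fired.append(f"INDEX is ~{index_tokens} tokens, over {token_limit}")
    return fired
```

Called as `check_triggers("docs/specs/approved", "docs/specs/INDEX.md")`, an empty list means the wiki-indexed path is still within budget.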
Side notes captured during the 2026-05-08 design conversation
These do not change the decision but are worth preserving so future readers do not re-discover them.
astra-db-credentials is script-managed, not Terraform-managed
The astra-db-credentials K8s secret is created by tools/scripts/setup-astra-secrets.sh (reads from Key Vault, applies via kubectl). PR-1 extended that script to add SHARED_KB_DB_* keys. The "proper" Terraform-managed-infra path (per feedback_infra_proper_approach.md user preference) would be a kubernetes_secret_v1 resource or Key Vault CSI mount. This is its own infra refactor and was deliberately kept out of PR-1's scope.
setup-astra-secrets.sh has a destructive footgun
The script's confirm_overwrite function returns "proceed" when the secret does not exist, which silently regenerates ENCRYPTION_MASTER_KEY (the script itself warns this is permanent and breaks all encrypted agent tokens). On 2026-05-08 the user nearly hit this path — escaped by Ctrl+C at the CF_AUDIENCE prompt. Worth a follow-up: gate ENCRYPTION_MASTER_KEY regeneration behind an explicit --regenerate-encryption-key flag or a separate "this is destructive" prompt that fires regardless of secret existence.
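One reading of the proposed guard, expressed as decision logic only: the real script is bash, and the function name, argument names, and return values here are hypothetical. The point it illustrates is that key regeneration depends on an explicit flag plus confirmation and never on whether the secret happens to exist.

```python
def plan_master_key(secret_exists, regenerate_flag, confirmed):
    """Decide what to do with ENCRYPTION_MASTER_KEY.

    secret_exists:   the K8s secret is already present
    regenerate_flag: operator passed an explicit regenerate option
    confirmed:       operator acknowledged the destructive prompt
    """
    if regenerate_flag and confirmed:
        return "regenerate"
    if secret_exists and not regenerate_flag:
        return "keep"
    # Missing secret without the flag, or flag without confirmation:
    # stop rather than silently minting a new key.
    return "refuse"


for case in [(True, False, False), (False, False, False), (False, True, True)]:
    print(case, "->", plan_master_key(*case))
```

Under this sketch a missing secret no longer implies "proceed"; a first-time setup would have to pass the flag deliberately.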
deploy-astra.yml path filter excludes infra-only changes
The paths-filter@v3 filter in deploy-astra.yml only watches internal/backend/**, internal/frontend/**, frontend/packages/ui/**, and infra/migrations/**. Changes to Terraform, the deploy workflow itself, the secrets script, or context docs do not auto-trigger the workflow. PR-1 needed a manual gh workflow run deploy-astra.yml --ref main to deploy. Worth noting for future infra-only PRs.
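To check ahead of time whether a PR will auto-deploy, the filter can be approximated with prefix matching. This is a simplification of the workflow's glob patterns (a `dir/**` pattern is treated as "path starts with `dir/`"), and the example file paths are hypothetical.

```python
# Directory prefixes corresponding to the four watched globs.
WATCHED_PREFIXES = (
    "internal/backend/",
    "internal/frontend/",
    "frontend/packages/ui/",
    "infra/migrations/",
)


def triggers_deploy(changed_paths):
    """True if any changed path falls under a watched directory."""
    return any(path.startswith(WATCHED_PREFIXES) for path in changed_paths)


# Hypothetical change sets for illustration.
infra_only = ["infra/terraform/main.tf", "tools/scripts/setup-astra-secrets.sh"]
with_migration = infra_only + ["infra/migrations/shared-kb/V002__example.sql"]

print(triggers_deploy(infra_only))
print(triggers_deploy(with_migration))
```

When the result is False, the manual `gh workflow run deploy-astra.yml --ref main` step is still needed after merge.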
SPEC id collisions
The frontmatter validator checks filename↔frontmatter id match per file but does not enforce cross-file uniqueness. Auto-generating INDEX.md exposed three files all claiming SPEC-021 (identity-domain, implementation-plan, wave5-dispatcher-design) and two claiming SPEC-020 (unified-platform-management, user-management-system). Worth a separate cleanup PR. The next free spec id is SPEC-025 — when the per-agent memory work needs a real spec, that is the id to use, not SPEC-023 (which is finance-catalog-domain).
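A cross-file uniqueness check is a small addition to the per-file validation. The sketch below assumes the validator can already produce a mapping from filename to claimed id; the filenames are made up from the slugs listed above and the function name is hypothetical.

```python
from collections import defaultdict


def find_duplicate_ids(id_by_file):
    """Return {spec_id: [files]} for every id claimed by more than one file."""
    files_by_id = defaultdict(list)
    for filename, spec_id in id_by_file.items():
        files_by_id[spec_id].append(filename)
    return {sid: sorted(files) for sid, files in files_by_id.items() if len(files) > 1}


# Hypothetical filenames built from the slugs named in this section.
CLAIMED = {
    "SPEC-020-unified-platform-management.md": "SPEC-020",
    "SPEC-020-user-management-system.md": "SPEC-020",
    "SPEC-021-identity-domain.md": "SPEC-021",
    "SPEC-021-implementation-plan.md": "SPEC-021",
    "SPEC-021-wave5-dispatcher-design.md": "SPEC-021",
    "SPEC-023-finance-catalog-domain.md": "SPEC-023",
}

for spec_id, files in sorted(find_duplicate_ids(CLAIMED).items()):
    print(spec_id, "claimed by", len(files), "files")
```

Failing the build when the returned dict is non-empty would stop new collisions from landing while the existing ones are cleaned up.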
Subsequent PRs (per-agent memory build)
The per-agent memory work continues unchanged from the original handover, just without Phase 2/3:
- PR-B: agent_memory_db provisioned via Terraform, infra/migrations/agent-memory/V001__create_agent_memories.sql, fourth Flyway block in deploy-astra.yml, AGENT_MEMORY_DB_* keys via setup-astra-secrets.sh, context file updates.
- PR-C: AgentMemoryRepository + three tools (agent_memory_recall, agent_memory_remember, agent_memory_forget), tool registry registration, tests.
- PR-D: Personality prompt updates (astra_db.personalities row for atlas), Astra UI memory listing page (read-only audit), zero-hits metric.
When PR-D ships, the original handover document is fully closed out except for the deferred shared-KB seeding above.
References
- per-agent-memory-handover-2026-05-06.md: original handover; Phase 2/3 are deferred per this decision, Phase 4 continues unchanged.
- aucert#115: PR-1 (shared_kb_db Flyway activation).
- SPEC-005 §8: the shared_kb_db schema design.
- tools/scripts/build-kb-index.sh: the wiki-indexer.
- docs/specs/INDEX.md, .context/decisions/INDEX.md: the indexes themselves.