Shared KB bifurcation — wiki-indexed now, vector-RAG deferred (2026-05-08)

This document captures the 2026-05-08 decision to bifurcate the per-agent memory work into two layers with different storage strategies. It exists so the deferred path (vector-RAG seeding of shared_kb_db) can be picked up later without re-doing the brainstorming, and so the rationale survives the agent-context window of any single session.

Context

The original handover at per-agent-memory-handover-2026-05-06.md proposed a two-layer memory system:

Shared memory — shared_kb_db seeded with canonical terms, approved specs, ADRs, and system topology. Used by all agents. Served via vector-RAG (pgvector + embedding model).
Per-agent memory — agent_memory_db. Per-agent state written at runtime by the agent. Vector-RAG over private memories.

PR-1 (merged 2026-05-07, aucert#115) shipped the infrastructure for both: VECTOR allow-list on the internal PG server, Flyway block for shared_kb_db, and SHARED_KB_DB_* keys on astra-db-credentials. After PR-1, shared_kb_db has 8 tables but is empty.

Phase 2 (PR-2) was originally scoped as a Kubernetes Job that walks docs/specs/approved/*.md, .context/decisions/*.md, .context/ARCHITECTURE.md, and .context/GLOSSARY.md, embeds chunks via EmbeddingClient (AWS Bedrock + Titan v2, 1536 dims), and writes rows to approved_specs + approved_spec_embeddings, adrs + adr_embeddings, canonical_terms, and system_topology. Estimated ~3-5 days of work. This is the seed work that, on the original plan, would precede SPEC-NNN drafting and the per-agent memory build.

Alternative considered: LLM Wiki

Mid-flight on 2026-05-08, an alternative pattern surfaced: instead of seeding a database with embeddings, treat the repo itself as the wiki and let agents navigate it via Read + Glob + auto-generated INDEX files. This is the pattern Cognition/DeepWiki popularized for codebase navigation, and it generalizes to any markdown corpus.

The central observation: the Aucert repo already is a wiki. .context/, docs/specs/approved/, .context/decisions/ are markdown trees that every agent already has Read access to. Building shared_kb_db adds a database layer over content the agent can read directly.

Per-axis comparison

Axis	Wiki-indexed (chosen)	Vector-RAG (deferred)
Source of truth	git (markdown files)	`shared_kb_db` (replicated from git)
Update mechanism	git commit; auto-regenerate INDEX via pre-commit hook	seed-job re-embeds and upserts on doc change
Per-lookup tokens	~10K (INDEX read + 1-2 file Reads)	~1.5K (top-3 chunks)
Per-lookup API calls	0	1 embed call (~$0.000001)
Build cost	$0 (50-line bash script)	$0.01-0.03 per re-seed (Titan v2 @ $0.02/1M tokens, ~250K tokens)
Schema lock-in	None (markdown is portable)	`vector(1536)` hard-coded — model swap requires schema migration
Provider lock-in	None	AWS Bedrock (current EmbeddingClient wiring)
Failure mode	"Wrong file Read" — debuggable	"Cosine 0.78 ranked above 0.74" — opaque
Atomic updates	Yes (git commit)	No (partial re-embeds possible)
Crossover point	Wins until ~250 specs (INDEX.md exceeds context budget)	Wins past ~1000 specs (vector retrieval mandatory)
Aucert today	~10 approved specs, ~14 ADRs, ~17 drafts → ~5-10 years of growth before crossover	Mandatory only at scale we will not reach soon

Conclusion

Wiki-indexed wins on every axis except per-query token count at the current corpus size, and the gap there (~10K vs ~1.5K tokens, roughly 7×) does not justify the operational complexity of an embedding pipeline.

Decision

Bifurcate. Wiki-indexed for shared KB; vector-RAG only for per-agent memory.

Layer	Storage	Why
Canonical terms (vocabulary)	Wiki — `.context/GLOSSARY.md`	Already exists; PR is the proposal workflow.
Approved specs	Wiki — `docs/specs/approved/*.md` indexed by `docs/specs/INDEX.md`	Already exists. Frontmatter is structured.
ADRs	Wiki — `.context/decisions/*.md` indexed by `.context/decisions/INDEX.md`	Already exists.
System topology	Wiki — `.context/ARCHITECTURE.md`	Already exists.
Per-agent memory	DB — `agent_memory_db` + pgvector	Runtime writes by the agent, semantic recall over thousands of memories. Wiki pattern does not apply — content is generated, not curated.
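
To make "semantic recall" concrete, here is a minimal pure-Python sketch of top-k recall by cosine similarity. It only illustrates the ranking idea; in the planned design pgvector would do this inside the database. The function names, the toy 3-dimensional vectors, and the example memory texts are all assumptions made for illustration.

```python
import math

def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def recall(query_vec, memories, k=3):
    """Return the texts of the k memories whose vectors are closest to query_vec."""
    ranked = sorted(memories, key=lambda m: cosine(query_vec, m[1]), reverse=True)
    return [text for text, _ in ranked[:k]]

# Toy 3-dimensional vectors; the real schema stores vector(1536) embeddings.
memories = [
    ("user prefers Terraform-managed infra", [0.9, 0.1, 0.0]),
    ("deploy needed a manual workflow run", [0.1, 0.9, 0.0]),
    ("SPEC ids collided in the index", [0.0, 0.2, 0.9]),
]

print(recall([1.0, 0.2, 0.0], memories, k=1))
```

Generated memories have no curated file layout to navigate, which is why similarity ranking fits this layer while the curated shared KB can rely on an index.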

What ships now (PR-A — wiki-indexed shared KB)

tools/scripts/build-kb-index.sh — generates docs/specs/INDEX.md and .context/decisions/INDEX.md from frontmatter.
Pre-commit hook build-kb-index regenerates on changes to spec/ADR files.
Updates to AGENTS.md (Hard Rule 15) and .context/AI_RULES.md (Hard Rule 13 + new "Knowledge base lookups" section) teaching agents to consult INDEX first.
This decision document.
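
The indexer idea can be sketched as follows. This is not the contents of tools/scripts/build-kb-index.sh (which is a bash script); it is a Python illustration of generating an index from frontmatter. It assumes flat `key: value` frontmatter between `---` fences. The `title` field, the table layout, and the example file name are assumptions.

```python
import tempfile
from pathlib import Path

def read_frontmatter(path):
    """Parse flat `key: value` pairs between the leading '---' fences."""
    lines = path.read_text(encoding="utf-8").splitlines()
    meta = {}
    if not lines or lines[0].strip() != "---":
        return meta  # no frontmatter block at the top of the file
    for line in lines[1:]:
        if line.strip() == "---":
            break
        key, sep, value = line.partition(":")
        if sep:
            meta[key.strip()] = value.strip().strip('"')
    return meta

def build_index(doc_dir):
    """Return INDEX.md text: one table row per markdown file, sorted by file name."""
    rows = ["| id | title | file |", "| --- | --- | --- |"]
    for path in sorted(Path(doc_dir).glob("*.md")):
        if path.name == "INDEX.md":
            continue  # never index the index itself
        meta = read_frontmatter(path)
        rows.append(f"| {meta.get('id', '?')} | {meta.get('title', '?')} | {path.name} |")
    return "\n".join(rows) + "\n"

# Demo against a throwaway directory with one hypothetical spec file.
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "SPEC-001-example.md").write_text(
        "---\nid: SPEC-001\ntitle: Example spec\n---\nBody text\n", encoding="utf-8"
    )
    print(build_index(tmp))
```

Sorting by file name keeps the output deterministic, so a pre-commit hook that regenerates the index produces a diff only when a spec or ADR actually changed.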

What is deferred (re-evaluate when triggers below fire)

The vector-RAG path remains technically unblocked. PR-1's infrastructure is intact:

shared_kb_db schema is migrated and live (8 tables — verifiable via flyway_schema_history).
VECTOR extension is allow-listed on the PG server.
SHARED_KB_DB_* keys are populated on astra-db-credentials.
The Flyway block in deploy-astra.yml continues to apply future migrations.
EmbeddingClient (Bedrock + Titan v2) is wired and ready.
SharedKbSearchTool, ApprovedSpecsSearchTool, AdrsSearchTool, SystemTopologySearchTool, CanonicalTermsLookupTool exist as ~100-line implementations that currently return [] (tables empty). They will start returning real results the moment seed data lands.

The deferred work to reactivate this path:

Seed job (~3-5 days): a Kubernetes Job (or scheduled workflow) that walks repo content, embeds it via EmbeddingClient, and upserts into the shared_kb_db tables. It must be idempotent: content-hash UPSERT for spec/ADR/topology rows, and wholesale TRUNCATE+INSERT for canonical_terms (a small dictionary). Declare the embedding tables' foreign keys with ON DELETE CASCADE so embedding rows are removed with their parent row. Because the cascade fires only on delete, an in-place update of a parent row must also replace its stale embeddings explicitly.
Trigger workflow: a separate seed-shared-kb.yml watching docs/specs/approved/**, .context/decisions/**, .context/ARCHITECTURE.md, and .context/GLOSSARY.md, with a manual workflow_dispatch trigger for one-shot rebuilds.
Force-rebuild flag: when the embedding model changes (e.g., Titan v2 → v3, or a Foundry-hosted swap), a flag that ignores content hashes and re-embeds everything. This is cheap and keeps vectors from the old model out of the index.
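
A minimal sketch of the idempotent seed step, under stated assumptions: SQLite (from the Python standard library) stands in for Postgres, the `docs` and `doc_embeddings` tables are hypothetical stand-ins for the real shared_kb_db tables, and `fake_embed` replaces the EmbeddingClient call. It deletes and reinserts the parent row rather than doing a true upsert, so that the cascade removes stale embeddings.

```python
import hashlib
import sqlite3

SCHEMA = """
CREATE TABLE docs (
    path TEXT PRIMARY KEY,
    content_hash TEXT NOT NULL,
    body TEXT NOT NULL
);
CREATE TABLE doc_embeddings (
    doc_path TEXT NOT NULL REFERENCES docs(path) ON DELETE CASCADE,
    chunk_no INTEGER NOT NULL,
    vector TEXT NOT NULL
);
"""

def fake_embed(text):
    """Placeholder for the real embedding call; returns a short digest string."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()[:8]

def seed(conn, path, body, force=False):
    """Seed one document; return True if it was (re-)embedded, False if skipped."""
    digest = hashlib.sha256(body.encode("utf-8")).hexdigest()
    row = conn.execute("SELECT content_hash FROM docs WHERE path = ?", (path,)).fetchone()
    if row and row[0] == digest and not force:
        return False  # unchanged content: skip the embed call entirely
    with conn:  # one transaction, so parent row and embeddings change together
        conn.execute("DELETE FROM docs WHERE path = ?", (path,))  # cascades to embeddings
        conn.execute("INSERT INTO docs VALUES (?, ?, ?)", (path, digest, body))
        conn.execute("INSERT INTO doc_embeddings VALUES (?, 0, ?)", (path, fake_embed(body)))
    return True

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces cascades only when enabled
conn.executescript(SCHEMA)

print(seed(conn, "a.md", "first version"))              # embedded
print(seed(conn, "a.md", "first version"))              # skipped, hash unchanged
print(seed(conn, "a.md", "first version", force=True))  # re-embedded despite same hash
```

The `force` parameter plays the role of the force-rebuild flag: it bypasses the hash comparison so every document is re-embedded after a model change.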

Triggers to re-evaluate

Signal	Action
Approved spec count exceeds ~250	INDEX.md exceeds ~50K tokens → loading it costs more than vector retrieval. Activate seeding.
Agents repeatedly fail to find relevant context via INDEX-driven lookup	Quality signal. Investigate — could be index design, prompt design, or genuine RAG need.
Shared corpus grows to include non-markdown sources (code symbols, diagrams, video transcripts)	Markdown-only navigation breaks down. Activate seeding with multi-modal embeddings.
Memory recall in `agent_memory_db` motivates blending with shared canonical knowledge	The `shared_kb_search` tool was designed to be blended with `agent_memory_recall`. If the blending pattern proves valuable, seed shared KB to enable it.

If any of these fire, this document is the entry point. Re-read the original handover §Phase 2 for the seed-job design and proceed from there.
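
The first trigger can be checked with simple arithmetic. The sketch below derives the per-entry token figure implied by the table (~50K tokens at ~250 specs, so about 200 tokens per INDEX entry) and applies it to a given spec count. The function names are hypothetical, and the linear growth model is an assumption.

```python
INDEX_BUDGET_TOKENS = 50_000  # from the trigger table
CROSSOVER_SPECS = 250         # from the trigger table
TOKENS_PER_ENTRY = INDEX_BUDGET_TOKENS // CROSSOVER_SPECS  # implied figure: 200

def index_tokens(spec_count):
    """Estimated INDEX.md size, assuming it grows linearly with spec count."""
    return spec_count * TOKENS_PER_ENTRY

def should_reevaluate(spec_count):
    """True once the estimated index no longer fits the token budget."""
    return index_tokens(spec_count) > INDEX_BUDGET_TOKENS

for count in (10, 250, 300):
    print(count, index_tokens(count), should_reevaluate(count))
```

For a real check, a token count measured from the generated INDEX.md would replace the linear estimate.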

Side notes captured during the 2026-05-08 design conversation

These do not change the decision but are worth preserving so future readers do not re-discover them.

`astra-db-credentials` is script-managed, not Terraform-managed

The astra-db-credentials K8s secret is created by tools/scripts/setup-astra-secrets.sh (reads from Key Vault, applies via kubectl). PR-1 extended that script to add SHARED_KB_DB_* keys. The "proper" Terraform-managed-infra path (per feedback_infra_proper_approach.md user preference) would be a kubernetes_secret_v1 resource or Key Vault CSI mount. This is its own infra refactor and was deliberately kept out of PR-1's scope.

`setup-astra-secrets.sh` has a destructive footgun

The script's confirm_overwrite function returns "proceed" when the secret does not exist, which silently regenerates ENCRYPTION_MASTER_KEY (the script itself warns this is permanent and breaks all encrypted agent tokens). On 2026-05-08 the user nearly hit this path — escaped by Ctrl+C at the CF_AUDIENCE prompt. Worth a follow-up: gate ENCRYPTION_MASTER_KEY regeneration behind an explicit --regenerate-encryption-key flag or a separate "this is destructive" prompt that fires regardless of secret existence.
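
The difference between the current and proposed gating can be sketched as two small predicates. This models the decision logic only, not the bash script itself. The function names are hypothetical, and the behaviour when the secret already exists (regenerate only after the user confirms the overwrite) is an assumption.

```python
def regenerates_key_today(secret_exists, user_confirmed_overwrite):
    """Behaviour as described above: a missing secret counts as 'proceed'."""
    return (not secret_exists) or user_confirmed_overwrite

def regenerates_key_proposed(regenerate_flag):
    """Proposed gate: only an explicit flag allows regeneration,
    whether or not the secret currently exists."""
    return regenerate_flag

# Secret missing, no confirmation, no flag: the key is regenerated today,
# but would not be under the proposed gate.
print(regenerates_key_today(secret_exists=False, user_confirmed_overwrite=False))
print(regenerates_key_proposed(regenerate_flag=False))
```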

`deploy-astra.yml` path filter excludes infra-only changes

The paths-filter@v3 filter in deploy-astra.yml only watches internal/backend/**, internal/frontend/**, frontend/packages/ui/**, and infra/migrations/**. Changes to Terraform, the deploy workflow itself, the secrets script, or context docs do not auto-trigger the workflow. PR-1 needed a manual gh workflow run deploy-astra.yml --ref main to deploy. Worth noting for future infra-only PRs.
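
A quick way to predict whether a PR will auto-deploy is to compare its changed paths against the watched directories. The sketch below approximates each `dir/**` glob with a prefix match, which is a simplification of the action's real glob matching. The function name and the example file names under the watched directories are hypothetical.

```python
# Directory prefixes corresponding to the four watched globs listed above.
WATCHED_PREFIXES = (
    "internal/backend/",
    "internal/frontend/",
    "frontend/packages/ui/",
    "infra/migrations/",
)

def triggers_deploy(changed_paths):
    """True if any changed path falls under a watched directory."""
    return any(path.startswith(WATCHED_PREFIXES) for path in changed_paths)

# An infra-only change set: the secrets script plus a context doc.
print(triggers_deploy(["tools/scripts/setup-astra-secrets.sh", ".context/ARCHITECTURE.md"]))
# A migration change does fall under a watched directory.
print(triggers_deploy(["infra/migrations/agent-memory/V001__create_agent_memories.sql"]))
```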

SPEC id collisions

The frontmatter validator checks filename↔frontmatter id match per file but does not enforce cross-file uniqueness. Auto-generating INDEX.md exposed three files all claiming SPEC-021 (identity-domain, implementation-plan, wave5-dispatcher-design) and two claiming SPEC-020 (unified-platform-management, user-management-system). Worth a separate cleanup PR. The next free spec id is SPEC-025 — when the per-agent memory work needs a real spec, that is the id to use, not SPEC-023 (which is finance-catalog-domain).
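
A cross-file uniqueness check is small enough to sketch here. It takes a mapping from file name to frontmatter id and reports every id claimed by more than one file. The function name is hypothetical, and the example keys are the short names quoted above rather than real file paths.

```python
from collections import defaultdict

def find_duplicate_ids(id_by_file):
    """Map each id claimed by more than one file to the sorted list of those files."""
    files_by_id = defaultdict(list)
    for filename, spec_id in id_by_file.items():
        files_by_id[spec_id].append(filename)
    return {sid: sorted(files) for sid, files in files_by_id.items() if len(files) > 1}

claimed = {
    "identity-domain": "SPEC-021",
    "implementation-plan": "SPEC-021",
    "wave5-dispatcher-design": "SPEC-021",
    "unified-platform-management": "SPEC-020",
    "user-management-system": "SPEC-020",
    "finance-catalog-domain": "SPEC-023",
}

for spec_id, files in sorted(find_duplicate_ids(claimed).items()):
    print(spec_id, files)
```

Run inside the validator or the index build, a non-empty result could fail the commit so that collisions are caught when they are introduced.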

Subsequent PRs (per-agent memory build)

The per-agent memory work continues unchanged from the original handover, just without Phase 2/3:

PR-B — agent_memory_db provisioned via Terraform, infra/migrations/agent-memory/V001__create_agent_memories.sql, fourth Flyway block in deploy-astra.yml, AGENT_MEMORY_DB_* keys via setup-astra-secrets.sh, context file updates.
PR-C — AgentMemoryRepository + three tools (agent_memory_recall, agent_memory_remember, agent_memory_forget), tool registry registration, tests.
PR-D — Personality prompt updates (astra_db.personalities row for atlas), Astra UI memory listing page (read-only audit), zero-hits metric.

When PR-D ships, the original handover document is fully closed out except for the deferred shared-KB seeding above.

References

per-agent-memory-handover-2026-05-06.md — original handover; Phase 2/3 are deferred per this decision, Phase 4 continues unchanged.
aucert#115 — PR-1 (shared_kb_db Flyway activation).
SPEC-005 §8 — the shared_kb_db schema design.
tools/scripts/build-kb-index.sh — the wiki-indexer.
docs/specs/INDEX.md, .context/decisions/INDEX.md — the indexes themselves.
