Skip to main content

PostgreSQL configuration

This page is the single reference for how Postgres is wired across the Aucert stack: where it is provisioned, what credentials exist, how pods and developers connect, how migrations and the ORM are organized, and which gaps are known (no role hierarchy, no automated rotation, no pre-merge terraform plan). Read this before making any schema or infra change that touches a Postgres instance.

info

Credentials flow Terraform random_password → Key Vault → K8s Secret → pod env for every working service. Spec-agent-worker is the one exception — it tries to read Key Vault directly via KeyVaultSecretProvider, but Workload Identity for that pod is not yet wired so every fetch fails. See drift-2026-04-29-spec-agent-keyvault-auth.md and Secrets management → Migration to Workload Identity.

Rotation today is manual at every hop; see Rotating credentials.

Quick reference

PropertyValue
Servers (as of 2026-04-23)aucert-internal-pg (foundation tier), aucertdev-product-pg (dev environment tier)
EnginePostgreSQL 18 Flexible Server, SKU B_Standard_B2s, westus
NetworkPrivate VNet only — no public endpoint, sslmode=require
Private DNS zonesaucert-internal.postgres.database.azure.com, aucertdev.postgres.database.azure.com
Admin user (internal)internaladmin (hard-coded in foundation Terraform)
Admin user (product)from var.pg_admin_username (env-dev variable)
Key Vaultaucertdev-kv-41e0x5
ORMExposed 0.58.0 (Kotlin) + HikariCP 6.2.1 pool, driver org.postgresql:postgresql:42.7.5
MigrationsFlyway 11.4.0, four migration sets under infra/migrations/
Terraform filesinfra/terraform/foundation/database.tf, infra/terraform/environments/dev/database.tf, infra/terraform/internal-platform/databases.tf

Servers and tiers

Two Flexible Server instances are provisioned across two Terraform tiers. A third instance for production is planned but not yet created.

ServerTerraform tierAdmin userExtensions allow-listDestroyableDatabases
aucert-internal-pgfoundation/internaladminBTREE_GIN (for Temporal)prevent_destroy = trueplane_db, internal_shared_db, astra_db, specs_db, shared_kb_db, temporal, temporal_visibility
aucertdev-product-pgenvironments/dev/var.pg_admin_usernameAzure defaultNo (dev can be rebuilt)aucert
Future production PGenvironments/prod/ (reserved — subnet 10.0.8.0/24)TBDTBDprevent_destroy plannedTBD
prevent_destroy on internal PG

aucert-internal-pg is tagged destroyable=false and has lifecycle { prevent_destroy = true } in infra/terraform/foundation/database.tf. Destroying a dev environment can never reach it. Any attempt to terraform destroy the foundation tier will be blocked by Terraform.

The azure.extensions server parameter is a dynamic allow-list — Flexible Server rejects CREATE EXTENSION for anything not in the list. Today only BTREE_GIN is listed for the internal PG (needed by Temporal's advanced_visibility migration). pgvector works because it is already in the Azure default allow-list for PG 16; if you need a non-default extension, add it to azurerm_postgresql_flexible_server_configuration.internal_pg_extensions and terraform apply before the migration runs.

Databases

The internal PG multi-tenants seven logical databases on one server — each with a distinct owner and migration set. Separation is by database, not by role.

DatabaseOwner pathMigrationsPopulated by
internal_shared_dbAstra backend (Exposed)infra/migrations/internal-shared/ (V001–V006)Startup Flyway in DatabaseFactory
astra_dbAstra backend (Exposed)infra/migrations/astra/ (V001–V010)Startup Flyway in DatabaseFactory
specs_dbSpec agent (raw JDBC)infra/migrations/specs/ (V001–V007)Flyway pod in deploy workflow
shared_kb_dbAll agents (raw JDBC, read-heavy)infra/migrations/shared-kb/ (V001–V009, includes pgvector)Flyway pod in deploy workflow
plane_dbPlane CE (external tool)Managed by PlanePlane's own bootstrap
temporal / temporal_visibilityTemporal serverManaged by Temporal Helm chartTemporal's own schema tool
aucert (on product PG)Product backendNot yet populatedTBD

Access control — known gap

There is no role hierarchy today. Every backend connection uses the server admin (internaladmin on the internal PG, the pg_admin_username variable on product PG). Defense lives entirely at the network layer:

  • Both servers have public_network_access_enabled = false
  • Both are delegated to dedicated subnets (internal-platform-subnet, postgres-subnet)
  • Private DNS zones are VNet-linked, so pods resolve *.postgres.database.azure.com to private IPs
  • TLS is enforced via sslmode=require in every connection string
Planned — not yet implemented

We have not carved out app_rw / app_ro / migrator roles per database. Every service has full DDL + DML as the admin. Adding a role hierarchy is tracked as future work; until then, treat any app-level bug that leaks SQL as a worst-case blast radius.

Credentials

Key Vault secrets

All passwords live in the foundation-tier Key Vault aucertdev-kv-41e0x5. Generated by Terraform random_password (32 chars, special included) and written as secrets.

Secret nameWritten byPurposeConsumers
internal-pg-admin-passwordinfra/terraform/foundation/database.tfInternal PG admin passwordsetup-astra-secrets.sh, setup-temporal-secrets.sh
internal-pg-connection-stringinfra/terraform/foundation/database.tfFull postgresql://… URL (defaults to plane_db)Convenience only — services build their own per-DB URLs
pg-admin-passwordinfra/terraform/environments/dev/database.tfProduct PG admin passwordProduct backend (future)
pg-connection-stringinfra/terraform/environments/dev/database.tfFull URL pointing at aucert DBProduct backend (future)
info

The internal-pg-connection-string secret encodes /plane_db as the default database. It is convenience metadata — every actual consumer constructs its own JDBC URL with the correct database name at secret-sync time.

Kubernetes secrets

tools/scripts/setup-astra-secrets.sh mirrors Key Vault into K8s. The script builds two JDBC URLs (one per database) and writes six keys into astra-db-credentials:

tools/scripts/setup-astra-secrets.sh (excerpt)
PG_PASS=$(az keyvault secret show --vault-name "$KV_NAME" \
--name "$KV_SECRET_NAME" --query value -o tsv)

SHARED_DB_URL="jdbc:postgresql://${PG_HOST}:${PG_PORT}/internal_shared_db?sslmode=require"
ASTRA_DB_URL="jdbc:postgresql://${PG_HOST}:${PG_PORT}/astra_db?sslmode=require"

kubectl create secret generic "$DB_SECRET_NAME" -n "$NAMESPACE" \
--from-literal=INTERNAL_SHARED_DB_URL="$SHARED_DB_URL" \
--from-literal=INTERNAL_SHARED_DB_USER="$PG_USER" \
--from-literal=INTERNAL_SHARED_DB_PASSWORD="$PG_PASS" \
--from-literal=ASTRA_DB_URL="$ASTRA_DB_URL" \
--from-literal=ASTRA_DB_USER="$PG_USER" \
--from-literal=ASTRA_DB_PASSWORD="$PG_PASS" \
--dry-run=client -o yaml | kubectl apply -f -
K8s secretNamespaceKeysSource
astra-db-credentialsinternal-platformINTERNAL_SHARED_DB_URL/USER/PASSWORD, ASTRA_DB_URL/USER/PASSWORDtools/scripts/setup-astra-secrets.sh
astra-secretsinternal-platformENCRYPTION_MASTER_KEY, CF_AUDIENCE, DISPATCHER_SERVICE_TOKENSame script (generated, not mirrored)
temporal-postgres-credentialstemporalpasswordtools/scripts/setup-temporal-secrets.sh

See Secrets management for the full Key Vault inventory.

Connecting from a pod

Deployments inject credentials via explicit secretKeyRef entries, not envFrom — this keeps each variable's source auditable in the manifest. Non-sensitive runtime config is split into a separate ConfigMap.

infra/k8s/internal-platform/astra/backend.yaml
env:
- name: INTERNAL_SHARED_DB_URL
valueFrom:
secretKeyRef:
name: astra-db-credentials
key: INTERNAL_SHARED_DB_URL
- name: INTERNAL_SHARED_DB_USER
valueFrom:
secretKeyRef:
name: astra-db-credentials
key: INTERNAL_SHARED_DB_USER
- name: INTERNAL_SHARED_DB_PASSWORD
valueFrom:
secretKeyRef:
name: astra-db-credentials
key: INTERNAL_SHARED_DB_PASSWORD

Network path from pod → server: AKS egresses from the internal-platform-subnet; the Private DNS zone aucert-internal.postgres.database.azure.com is linked to the VNet, so the FQDN resolves to a private IP. No NAT, no public traffic, no bastion.

Connecting for local development

Local dev runs a real Postgres in Docker — no port-forward or bastion needed.

infra/docker/docker-compose.dev.yml
services:
postgres:
image: postgres:18.3-alpine
environment:
POSTGRES_DB: aucert
POSTGRES_USER: aucert
POSTGRES_PASSWORD: aucert_dev
ports: ["5432:5432"]
volumes: [pgdata:/var/lib/postgresql/data]
healthcheck:
test: ["CMD-SHELL", "pg_isready -U aucert"]
interval: 5s

Tilt (Tiltfile) starts the compose stack alongside the backend. The image is pinned to postgres:18.3-alpine to match production's engine version — keep them aligned so SQL dialect behavior matches.

Connect from psql
psql postgresql://aucert:aucert_dev@localhost:5432/aucert

ORM: Exposed 0.58.0

The Kotlin backend uses Exposed as the primary persistence layer. It is wired into two databases (internal_shared_db and astra_db) through a single DatabaseFactory that also owns the HikariCP pools and runs Flyway migrations on startup.

DataSource + Exposed wiring

internal/backend/src/main/kotlin/dev/aucert/internal/shared/db/DatabaseFactory.kt
override fun init() {
val sharedDs = hikariDataSource(
name = "internal-shared",
url = requireEnv("INTERNAL_SHARED_DB_URL"),
user = requireEnv("INTERNAL_SHARED_DB_USER"),
password = requireEnv("INTERNAL_SHARED_DB_PASSWORD"),
)
internalSharedDb = Database.connect(sharedDs)
runMigrations(sharedDs, "internal-shared")

val astraDs = hikariDataSource(name = "astra", /* … */)
astraDb = Database.connect(astraDs)
runMigrations(astraDs, "astra")
}

private fun hikariDataSource(name: String, url: String, user: String, password: String) =
HikariDataSource(HikariConfig().apply {
poolName = "hikari-$name"
jdbcUrl = url; username = user; this.password = password
driverClassName = "org.postgresql.Driver"
maximumPoolSize = 10
isAutoCommit = false
transactionIsolation = "TRANSACTION_REPEATABLE_READ"
validate()
})

Conventions to honor when wiring a new database:

  • Read credentials with requireEnv(...) — a missing env var must fail-fast, not fall back.
  • Set transactionIsolation = "TRANSACTION_REPEATABLE_READ" and isAutoCommit = false. These are project defaults; do not override per-service.
  • Call runMigrations(dataSource, "<migration-dir>") immediately after Database.connect so Flyway uses the same pool.

Declaring a table

Exposed tables are Kotlin object singletons — they describe schema, not rows. Use UUIDTable("<name>") as the base for new tables; use jsonb(...) with kotlinx-serialization encoders for JSON columns and array<String>(...) for native text[] columns.

internal/backend/src/main/kotlin/dev/aucert/internal/astra/repository/Tables.kt
object AgentTable : UUIDTable("agents") {
val funName = varchar("fun_name", 32).uniqueIndex()
val email = varchar("email", 120).uniqueIndex()
val role = varchar("role", 32)
val tags = array<String>("tags", TextColumnType()).default(emptyList())
val runtimeConfig = jsonb(
"runtime_config",
{ Json.encodeToString(JsonObject.serializer(), it) },
{ Json.decodeFromString(JsonObject.serializer(), it) },
)
val status = varchar("status", 16).default("provisioning")
val createdAt = timestampWithTimeZone("created_at")
}

Writing a repository

Always use the project's suspended-transaction wrapper dbFactory.dbQuery(db) { ... } — never call Exposed's top-level transaction { } directly. The wrapper binds the IO dispatcher and the specific Database handle so the right connection pool is used.

internal/backend/src/main/kotlin/dev/aucert/internal/astra/repository/AgentRepository.kt
override suspend fun findAll(role: String?, status: String?, tech: String?, tag: String?): List<Agent> =
dbFactory.dbQuery(dbFactory.astraDb) {
val query = AgentTable.selectAll().orderBy(AgentTable.funName to SortOrder.ASC)
role?.let { r -> query.andWhere { AgentTable.role eq r } }
status?.let { s -> query.andWhere { AgentTable.status eq s } }
tech?.let { t -> query.andWhere { AgentTable.tech eq t } }
tag?.let { t -> query.andWhere { arrayContainsElement(AgentTable.tags, t) } }
query.map { it.toAgent() }
}

Inserts use insertAndGetId { it[col] = value }, returning the generated UUID in one roundtrip. Convert kotlinx.datetime.Instant to OffsetDateTime at the DB boundary (Exposed's timestampWithTimeZone expects the Java type):

internal/backend/src/main/kotlin/dev/aucert/internal/astra/repository/TokenRepository.kt
override suspend fun store(token: AgentToken): AgentToken =
dbFactory.dbQuery(dbFactory.astraDb) {
val now = OffsetDateTime.now(ZoneOffset.UTC)
val id = AgentTokenTable.insertAndGetId {
it[agentId] = UUID.fromString(token.agentId)
it[encryptedToken] = token.encryptedToken
it[issuedAt] = token.issuedAt?.let { ia ->
OffsetDateTime.ofInstant(ia.toJavaInstant(), ZoneOffset.UTC)
} ?: now
it[status] = token.status
}
token.copy(id = id.value.toString())
}

Raw JDBC: the spec-agent escape hatch

A parallel PostgresClient (raw JDBC + HikariCP) exists in internal/backend/src/main/kotlin/dev/aucert/internal/agents/shared/clients/PostgresClient.kt. It is documented as the spec-agent path because that code needs primitives Exposed does not expose cleanly:

  • pg_advisory_lock / pg_advisory_unlock for single-writer semantics on a topic
  • A transaction { tx -> ... } block that pins one physical connection so nested query / execute / advisory lock calls share it
  • Hand-tuned SQL (e.g. window functions) that is awkward to express in the Exposed DSL
Do not reach for PostgresClient in new code

Unless you genuinely need advisory locks or hand-tuned SQL, use Exposed through DatabaseFactory. Mixing JDBC and Exposed in the same bounded context will drift over time.

Migrations

Flyway 11.4.0 runs the four migration sets under infra/migrations/. Each set is an independent Flyway history (its own flyway_schema_history table in its own database), so version numbers never collide across bounded contexts.

SetPathFilesRuns where
internal-sharedinfra/migrations/internal-shared/V001–V006On backend startup via DatabaseFactory.runMigrations(...)
astrainfra/migrations/astra/V001–V010On backend startup
specsinfra/migrations/specs/V001–V007Flyway pod in .github/workflows/deploy-astra.yml
shared-kbinfra/migrations/shared-kb/V001–V009 (includes V001__enable_pgvector.sql)Flyway pod in deploy workflow

baselineOnMigrate(true) is set in DatabaseFactory.kt:74 — the first run against an empty database creates the baseline without replaying any migrations. Once baselined, migrations are strictly forward-only.

See How to create a database migration for the end-to-end task workflow (file naming, port-forward testing, CI behaviour).

Rotating credentials

Rotation today is manual end-to-end. All four steps must succeed for pods to resume serving requests; plan a short maintenance window.

Steps:

  1. Taint the Terraform password resource, then apply:

    cd infra/terraform/foundation
    terraform taint random_password.internal_pg_password
    terraform apply

    Terraform writes the new value to internal-pg-admin-password in Key Vault and calls the Azure API to update the server admin password atomically.

  2. Re-run the K8s secret sync to refresh astra-db-credentials:

    ./tools/scripts/setup-astra-secrets.sh

    Answer y when prompted to overwrite astra-db-credentials. Leave astra-secrets alone — the encryption key must never rotate (see the warning printed by the script).

  3. Rolling-restart every Deployment that consumes the secret:

    kubectl rollout restart deploy/astra-backend -n internal-platform
    kubectl rollout status deploy/astra-backend -n internal-platform
  4. Repeat for the product PG using random_password.pg_password in environments/dev/ once the product backend is deployed.

Never rotate ENCRYPTION_MASTER_KEY

The Astra Token Vault encrypts agent platform tokens with this key. Rotating it makes every existing encrypted token permanently undecryptable — setup-astra-secrets.sh will warn you. Rotation requires a data-migration plan, which does not exist yet.

Guardrails

GitHub Actions enforce part of the contract; several gaps remain.

EnforcedWhereWhat it does
Context-drift gate.github/workflows/context-drift-check.ymlBlocks PRs that change infra/terraform/** without updating infra/.context/*.md
Auto-apply Flyway on merge.github/workflows/deploy-astra.yml (lines 146–176)Creates a ConfigMap from SQL files and runs a Flyway pod in the cluster
Smoke test after deploySame workflowHTTP health check against astra.aucert.dev
Approved-spec immutabilitycontext-drift-check.ymlBlocks direct edits to docs/specs/approved/

Gaps — explicitly not enforced:

  • No terraform plan / terraform validate runs on PRs. Drift between main and deployed state is invisible until someone applies locally.
  • No schema drift detection (no pg_dump diff against Flyway history).
  • No pre-merge SQL lint or migration-naming check.
  • No approval gate on migrations — they auto-apply on push to main.

Closing any of these is a tracked future task; when you ship one, update this table.

Integration testing

Integration tests use Testcontainers with postgres:18.3-alpine — the same engine version as production. The pattern is in internal/backend/src/test/kotlin/dev/aucert/internal/agents/shared/clients/PostgresClientTest.kt:

internal/backend/src/test/.../PostgresClientTest.kt
@Testcontainers
class PostgresClientTest {
companion object {
@Container @JvmStatic
val pg = PostgreSQLContainer("postgres:18.3-alpine")
.withDatabaseName("w1c_test").withUsername("test").withPassword("test")
}

@BeforeTest fun setup() = runBlocking {
ds = HikariDataSource(HikariConfig().apply {
jdbcUrl = pg.jdbcUrl; username = pg.username; password = pg.password
driverClassName = "org.postgresql.Driver"
maximumPoolSize = 5
transactionIsolation = "TRANSACTION_REPEATABLE_READ"
})
client = PostgresClient(ds)
}
}

Conventions:

  • One container per test class (@Container on a companion object) for isolation between classes.
  • Keep postgres:18.3-alpine pinned — if prod moves again, update here at the same time.
  • TESTCONTAINERS_RYUK_DISABLED=true is set in build.gradle.kts because Ryuk's docker-java client doesn't honor our pinned API version on modern daemons. The JUnit 5 @Testcontainers extension handles container teardown without it.
  • On macOS, the test task auto-detects the Docker Desktop socket at ~/.docker/run/docker.sock — no need to set DOCKER_HOST by hand.

Troubleshooting

SymptomLikely causeFix
Connection refused from podPrivate DNS zone not linked to the VNet, or pod not in a subnet with accessaz network private-dns link vnet list -g aucert-foundation-rg -z aucert-internal.postgres.database.azure.com
FATAL: password authentication failed after a Terraform applySecret rotated in Key Vault but K8s secret not re-syncedRe-run ./tools/scripts/setup-astra-secrets.sh, then kubectl rollout restart the affected Deployment
ERROR: extension "X" is not allow-listedExtension not in azure.extensions server parameterAdd to azurerm_postgresql_flexible_server_configuration.internal_pg_extensions and terraform apply before the migration
Flyway Migration checksum mismatchSomeone edited an applied migrationRevert the edit. Never modify applied migrations — create a new V{next}__...sql instead
Backend readiness probe fails with SELECT 1 timeoutHikari pool exhausted or network partitionkubectl logs deploy/astra-backend — look for HikariPool-1 - Timeout failure

What's next