Troubleshooting

Common errors and their solutions, organized by platform. Jump to the section that matches your problem area.

Backend (Kotlin)

"Connection refused" — Database not accessible

Symptom: org.postgresql.util.PSQLException: Connection to localhost:5432 refused

Cause: PostgreSQL is not running or not port-forwarded.

Fix:

# Port-forward the product database
kubectl port-forward -n aucert-dev svc/product-pg 5432:5432

# Or the internal platform database
kubectl port-forward -n internal-platform svc/internal-pg 5433:5432

Ensure your DATABASE_URL environment variable matches the forwarded port.
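One way to confirm the match is to pull the port out of the URL and compare it with the local port used in the port-forward. The helper below is a minimal sketch: the name db_url_port is hypothetical, and it assumes DATABASE_URL has the form scheme://[user:password@]host[:port]/database (IPv6 literals are not handled).

```shell
# Print the port embedded in a database URL (hypothetical helper).
# Falls back to 5432, the PostgreSQL default, when the URL names no port.
db_url_port() (
  authority="${1#*://}"          # drop the scheme, e.g. jdbc:postgresql://
  authority="${authority%%/*}"   # drop the path (database name)
  authority="${authority%%\?*}"  # drop a query string that follows the host directly
  authority="${authority##*@}"   # drop user:password@ if present
  case "$authority" in
    *:*) printf '%s\n' "${authority##*:}" ;;
    *)   printf '5432\n' ;;
  esac
)
```

Running db_url_port "$DATABASE_URL" should print 5432 when you forwarded the product database and 5433 when you forwarded the internal platform database with the commands above.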

"Class not found" errors during build

Symptom: ClassNotFoundException or NoClassDefFoundError during the build, typically from tests or build tasks that load stale or missing classes.

Fix:

cd backend/platform
./gradlew clean build

If using generated Protobuf classes:

bazel build //proto:all
./gradlew clean build

Gradle daemon issues

Symptom: Build hangs, OOM errors, or stale cache.

Fix:

./gradlew --stop
rm -rf ~/.gradle/caches/
./gradlew clean build

Frontend (TypeScript)

pnpm install fails with peer dependency errors

Symptom: ERR_PNPM_PEER_DEP_ISSUES during install.

Fix:

# Remove unreferenced packages from the pnpm store, then reinstall
pnpm store prune
rm -rf node_modules
pnpm install

Check that your Node.js version is 22 or newer:

node --version  # Should be v22.x or later
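If you want the check to fail loudly in a setup script, the comparison can be sketched as below. The function name node_version_ok is hypothetical, and it assumes the usual vMAJOR.MINOR.PATCH output of node --version.

```shell
# Succeed when a `node --version` string (e.g. v22.3.0) is major version 22 or newer.
node_version_ok() (
  major="${1#v}"        # strip the leading "v"
  major="${major%%.*}"  # keep only the major component
  [ "$major" -ge 22 ]
)

# Usage (requires node on PATH):
#   node_version_ok "$(node --version)" || echo "Node.js 22+ required"
```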

API calls fail in development (proxy issues)

Symptom: ERR_CONNECTION_REFUSED or 502 errors when the frontend calls the API.

Cause: The backend is not running on the expected port.

Fix:

Ensure the backend is running on port 8080
Check the proxy config in next.config.ts
Verify with: curl http://localhost:8080/health
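To tell "backend not running" apart from "backend up but proxy misconfigured", you can look at the status code curl reports for the health endpoint. This is a rough sketch: explain_health_status is a hypothetical helper, and the messages are only suggestions for where to look next.

```shell
# Map the status printed by curl's %{http_code} to a likely next step (hypothetical helper).
# curl reports 000 when it never received an HTTP response, e.g. connection refused.
explain_health_status() {
  case "$1" in
    000) echo "backend unreachable: no HTTP response received" ;;
    2??) echo "backend healthy: check the proxy config instead" ;;
    *)   echo "backend responded with HTTP $1: check its logs" ;;
  esac
}

# Usage (requires curl):
#   explain_health_status "$(curl -s -o /dev/null -w '%{http_code}' http://localhost:8080/health)"
```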

TypeScript type errors after proto changes

Symptom: Type errors referencing generated types in schemas/generated/.

Fix:

# Regenerate from proto
bazel build //proto:all

# Restart the TS server in your IDE
# VS Code: Cmd+Shift+P → "TypeScript: Restart TS Server"

Kubernetes / AKS

Pod stuck in Pending

Symptom: Pod stays in Pending state indefinitely.

Diagnose:

kubectl describe pod -n <namespace> <pod-name>

Common causes:

Insufficient resources: The node pool doesn't have enough CPU or memory. Check the Events section of the describe output.
PVC not bound: Persistent volume claim waiting for storage.
Node affinity: Pod can't be scheduled to any available node.

Fix: Scale up the node pool or reduce resource requests in the Helm values file.
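The scheduler's event message usually points at one of the causes listed above. The sketch below maps a message to a cause; classify_pending is a hypothetical helper, and the matched phrases are assumptions based on typical scheduler wording, which can differ between Kubernetes versions.

```shell
# Map a FailedScheduling event message to one of the common causes (hypothetical helper).
# The matched phrases are assumed from typical scheduler messages, not an exhaustive list.
classify_pending() {
  case "$1" in
    *Insufficient*)                      echo "insufficient resources" ;;
    *PersistentVolumeClaim*)             echo "PVC not bound" ;;
    *"node affinity"*|*"node selector"*) echo "node affinity" ;;
    *)                                   echo "unknown" ;;
  esac
}
```

Pass it the message text from the Events section of the describe output for the pending pod.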

Pod in CrashLoopBackOff

Symptom: Pod starts, crashes, restarts repeatedly.

Diagnose:

# Check crash logs
kubectl logs -n <namespace> <pod-name> --previous

# Check environment variables
kubectl exec -n <namespace> <pod-name> -- env | sort

Common causes:

Missing environment variables (especially database URLs)
Database connection failure (PG not accessible from pod)
Application startup error (check logs for stack trace)
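Since missing environment variables are the first cause to rule out, it can help to check the env output against the names the service needs. A minimal sketch follows; missing_env_vars is a hypothetical helper, and the list of required names is something you supply (only DATABASE_URL is taken from this guide).

```shell
# Read `env` output (NAME=value lines) on stdin and print each required
# variable name, given as an argument, that is absent or set to an empty value.
missing_env_vars() (
  env_dump="$(cat)"
  for name in "$@"; do
    printf '%s\n' "$env_dump" | grep -q "^${name}=." || printf '%s\n' "$name"
  done
)

# Usage (requires cluster access):
#   kubectl exec -n <namespace> <pod-name> -- env | missing_env_vars DATABASE_URL
```

No output means every listed variable is present and non-empty.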

Pod in ImagePullBackOff

Symptom: Pod can't pull the Docker image from ACR.

Fix:

# Re-authenticate with ACR
az acr login --name aucertacr41e0x5

# Verify the image exists
az acr repository show-tags --name aucertacr41e0x5 --repository <image-name>

# Check AKS has ACR pull permissions
az aks check-acr --name aucert-aks --resource-group aucert-foundation-rg \
  --acr aucertacr41e0x5.azurecr.io

Helm upgrade fails

Symptom: helm upgrade returns an error about conflicting resources.

Fix:

# Check current release status
helm list -n <namespace>
helm history <release-name> -n <namespace>

# If stuck in "pending-upgrade" state
helm rollback <release-name> -n <namespace>

Terraform

State lock — "Error acquiring the state lock"

Symptom: Terraform can't acquire the state lock.

Diagnose: Another terraform apply may be running. Check with your team.

Fix (only if confirmed no other process is running):

terraform force-unlock <lock-id>

Danger: Only force-unlock if you are certain no other process is running. Forcing an unlock while another apply is in progress can corrupt state.
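The lock error normally includes a "Lock Info" block that names the lock ID and who holds it, which tells you whom to ask before unlocking. The sketch below extracts those two fields; lock_holder is a hypothetical helper, and the "ID:" and "Who:" labels are assumed from typical Terraform output.

```shell
# Print the ID and Who fields from Terraform's "Lock Info" block, read on stdin.
# Scans every field so that decorated lines (e.g. a leading border character) still match.
lock_holder() {
  awk '{ for (i = 1; i < NF; i++) if ($i == "ID:" || $i == "Who:") print $i, $(i + 1) }'
}

# Usage (requires terraform and backend access):
#   terraform plan 2>&1 | lock_holder
```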

"Resource already exists" on apply

Symptom: Terraform tries to create a resource that already exists (created manually or by another process).

Fix:

# Import the existing resource into state
terraform import <resource_type>.<name> <azure-resource-id>

# Then plan to verify no changes
terraform plan

Provider authentication errors

Fix:

# Re-authenticate with Azure
az login

# Set the correct subscription
az account set --subscription "<subscription-id>"

# Verify
az account show

Database / Flyway

Migration failed — "Detected resolved migration not applied"

Symptom: Flyway detects a migration file that should have been applied before an already-applied one.

Cause: Migration version numbers are out of sequence (e.g., V003 was applied but V002 was added later).

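Fix: Give every new migration a version number higher than any already applied, so versions stay strictly sequential. If a migration was skipped, either:

Apply it manually and record it in the flyway_schema_history table
Run Flyway with -outOfOrder=true so it applies the skipped migration. Note that -baselineOnMigrate=true (already set in our CI workflow) only affects a non-empty schema that has no history table and does not resolve this error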
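Gaps in the version sequence are easier to catch before they reach the database. The following sketch lists missing version numbers from a set of migration file names; missing_versions is a hypothetical helper, and it assumes Flyway's default V<version>__<description>.sql naming with plain integer versions.

```shell
# Read migration file names on stdin and print every integer version
# that is missing between the lowest and highest version present.
missing_versions() {
  sed -n 's/^V0*\([0-9][0-9]*\)__.*\.sql$/\1/p' |
    sort -n |
    awk 'NR > 1 { for (v = prev + 1; v < $1; v++) print v } { prev = $1 }'
}

# Usage, from the directory that holds the migrations:
#   ls | missing_versions
```

An empty result means the versions are contiguous; a printed number such as 2 means V002 is absent from the directory.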

"Connection refused" from Flyway pod

Cause: The Flyway pod can't reach PostgreSQL on the private VNet.

Check:

# Verify the PG service is accessible from within the cluster
kubectl exec -n internal-platform -it <any-pod> -- \
  pg_isready -h internal-pg -p 5432

Ensure the Flyway pod is running in the correct namespace with VNet access.

CI / GitHub Actions

Workflow not triggering on push

Check:

Verify the paths filter matches your changed files
Check that the workflow file exists on the branch you pushed to (push events use the workflow file from the pushed commit)
Look at the Actions tab for skipped runs

# View recent workflow runs
gh run list --workflow=ci.yml --limit=5

OIDC authentication failure in CI

Symptom: azure/login step fails with OIDC token error.

Check:

AZURE_CLIENT_ID, AZURE_TENANT_ID, AZURE_SUBSCRIPTION_ID secrets are set
App registration federated credentials match the repo/branch
permissions: id-token: write is set in the workflow
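The last item can be checked from the command line. This is a rough sketch: workflow_requests_oidc is a hypothetical helper that only looks for an explicit id-token: write line, so a workflow that grants it another way (for example permissions: write-all) will be reported as missing.

```shell
# Succeed when the given workflow file contains an explicit "id-token: write" permission line.
workflow_requests_oidc() {
  grep -Eq '^[[:space:]]*id-token:[[:space:]]*write' "$1"
}

# Usage, from the repository root:
#   workflow_requests_oidc .github/workflows/ci.yml || echo "id-token: write not found"
```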

What's next

How to debug AKS pods — Detailed pod debugging
How to deploy to dev — Deployment process
Secrets management — Credential issues
