Database Credential Rotation Incident
End-to-end incident response for a realistic outage: a database credential rotation occurred in Postgres while the application still used the old secret. The result was 500s on DB-backed routes. I scoped by timestamp, reproduced once, correlated logs, mitigated safely, validated recovery, and wrote a short rotation checklist to prevent repeats.
Stack
Docker • Nginx • Flask • Postgres • Linux
What I Did
- Captured baseline behavior & timestamp window
- Rotated the DB password to simulate an outage
- Correlated 500s on /api/users with FATAL auth errors in app logs
- Mitigated by restoring the secret or updating the app secret + restart
- Validated 200s and a clean log window after recovery
- Published a DB-secret rotation checklist
Incident Timeline
- Baseline: routes 200
- Rotate DB password → 500s on users API
- Logs show Postgres authentication failures
- Rollback/secret update → app restart
- Recovery validated; log window clean
Incident Response Story
1) Baseline & Scope
Confirm all services are healthy and take a quick baseline (/api/users 200). Note the Date header / timestamp window to align evidence in logs and future requests.
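
As a rough illustration of the timestamp-window idea, the sketch below turns the Date header from a baseline response into a UTC window that later log searches can be bounded by. The function name `evidence_window`, the padding values, and the sample header are assumptions made for this example, not part of the original project.

```python
from datetime import datetime, timedelta, timezone
from email.utils import parsedate_to_datetime


def evidence_window(date_header, before_s=30, after_s=300):
    """Turn an HTTP Date header into a (start, end) UTC window for log searches.

    before_s / after_s are arbitrary padding values chosen for illustration.
    """
    observed = parsedate_to_datetime(date_header).astimezone(timezone.utc)
    return (observed - timedelta(seconds=before_s),
            observed + timedelta(seconds=after_s))


# Example Date header as it would appear in `curl -i` output (made-up value).
start, end = evidence_window("Wed, 01 May 2024 12:00:00 GMT")
print(start.isoformat(), "->", end.isoformat())
```

Using the server's own Date header rather than the local clock keeps the window aligned with what the response actually reported, which helps when the host and containers disagree on time.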

2) Introduce Change → Reproduce Failure
Rotate the DB password in Postgres while the app still uses the old secret. DB-backed routes flip to 500; capture the failures in the same timestamp window as the credential change.
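
The correlation step can be sketched as a small filter over timestamped log lines. This is a minimal example under two assumptions: that each line from `docker compose logs --timestamps` carries an RFC 3339 UTC timestamp ending in `Z`, and that the failure text contains `FATAL` or `password authentication failed`. The helper names and the sample lines are hypothetical and are not output captured from the incident.

```python
import re
from datetime import datetime, timezone

# Assumed line shape: "app-1  | 2024-05-01T12:00:03.123456789Z <message>"
TS_RE = re.compile(r"(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2})(?:\.(\d+))?Z")
AUTH_RE = re.compile(r"FATAL|password authentication failed", re.IGNORECASE)


def parse_ts(line):
    """Return the first UTC timestamp found in the line, or None."""
    m = TS_RE.search(line)
    if not m:
        return None
    base = datetime.strptime(m.group(1), "%Y-%m-%dT%H:%M:%S")
    # Keep at most microsecond precision; extra fractional digits are dropped.
    micros = int((m.group(2) or "0")[:6].ljust(6, "0"))
    return base.replace(microsecond=micros, tzinfo=timezone.utc)


def auth_failures_in_window(lines, start, end):
    """Keep lines that fall inside [start, end] and look like auth failures."""
    hits = []
    for line in lines:
        ts = parse_ts(line)
        if ts is not None and start <= ts <= end and AUTH_RE.search(line):
            hits.append(line)
    return hits


# Made-up sample lines for illustration only.
sample = [
    'app-1  | 2024-05-01T11:58:00.000000000Z GET /api/users 200',
    'app-1  | 2024-05-01T12:00:03.123456789Z FATAL:  password authentication failed for user "postgres"',
    'app-1  | 2024-05-01T13:30:00.000000000Z FATAL:  password authentication failed for user "postgres"',
]
start = datetime(2024, 5, 1, 11, 59, 30, tzinfo=timezone.utc)
end = datetime(2024, 5, 1, 12, 5, 0, tzinfo=timezone.utc)
for hit in auth_failures_in_window(sample, start, end):
    print(hit)
```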


GET /api/users returns 500 and app logs show FATAL password authentication failures from Postgres.

3) Mitigation
The quickest mitigation is to restore the previous credential so the app and DB match again. The alternative is to update the app's secret to the new password and restart the app.


4) Recovery Validation
Re-test /api/users to confirm 200s, and tail logs to ensure the window is clean (no new auth failures). Document the incident and add the rotation checklist so future password changes don't cause surprise outages.
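
One way to make the recovery check explicit is a small verdict function that takes the status codes from the re-test and the log lines collected since mitigation. This is a sketch only: the name `recovery_verdict`, the failure pattern, and the sample inputs are assumptions for this example.

```python
import re

# Assumed failure markers; adjust to whatever the app actually logs.
AUTH_FAIL = re.compile(r"FATAL|password authentication failed", re.IGNORECASE)


def recovery_verdict(status_codes, log_lines):
    """Return (recovered, reasons).

    Recovered means at least one probe was made, every probe returned 200,
    and no auth-failure line appears in the supplied log window.
    """
    reasons = []
    if not status_codes:
        reasons.append("no probes recorded")
    bad = [code for code in status_codes if code != 200]
    if bad:
        reasons.append(f"non-200 responses: {bad}")
    hits = [line for line in log_lines if AUTH_FAIL.search(line)]
    if hits:
        reasons.append(f"{len(hits)} auth failure line(s) still in window")
    return (not reasons, reasons)


# Made-up inputs: three probes after mitigation and a clean log window.
ok, why = recovery_verdict([200, 200, 200], ["GET /api/users 200"])
print(ok, why)
```

Returning the reasons alongside the boolean gives the incident write-up something concrete to quote when recovery is not yet complete.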

Key Commands Used
Repro & Evidence
# Baseline
curl -i http://localhost:8080/api/users

# Introduce outage (DB rotation only)
docker compose exec db psql -U postgres -d appdb -c "ALTER USER postgres WITH PASSWORD 'WrongNow#1';"

# Failure & logs (aligned by timestamp)
curl -i http://localhost:8080/api/users   # expect 500
docker compose logs --timestamps --tail=50 app | grep -Ei "FATAL|auth|psycopg2"
Mitigation & Validation
# Fast rollback
docker compose exec db psql -U postgres -d appdb -c "ALTER USER postgres WITH PASSWORD 'postgres';"

# OR rotate app secret to the new value, then:
docker compose up -d --build app

# Validate recovery
curl -i http://localhost:8080/api/users   # expect 200
docker compose logs --timestamps --since=2m app
Outcome & Prevention
- Outage localized to a DB credential mismatch between app and Postgres; recovered quickly once secrets were aligned.
- Added a DB-secret rotation checklist: update app secret → restart → smoke test → record timestamp.
- Set a simple alert on 5xx/auth-fail spikes to catch this class of issues early in production.
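
The alert on 5xx/auth-fail spikes could look something like the sliding-window counter below. This is a hypothetical sketch rather than the alert used in the project: the class name `SpikeAlert`, the threshold, and the window length are all assumed values, and a production setup would more likely rely on an existing monitoring tool.

```python
from collections import deque


class SpikeAlert:
    """Fire when at least `threshold` failures occur within `window_s` seconds."""

    def __init__(self, threshold=5, window_s=60):
        self.threshold = threshold
        self.window_s = window_s
        self._events = deque()  # timestamps (epoch seconds) of recent failures

    def record(self, ts, is_failure):
        """Record one observation at time ts; return True if the alert should fire."""
        if is_failure:
            self._events.append(ts)
        # Drop failures that have aged out of the window.
        while self._events and self._events[0] <= ts - self.window_s:
            self._events.popleft()
        return len(self._events) >= self.threshold


# Made-up sequence: three failures in 20 seconds with a threshold of 3.
alert = SpikeAlert(threshold=3, window_s=60)
for ts, failed in [(0, True), (10, True), (20, True)]:
    firing = alert.record(ts, failed)
print("alert firing:", firing)
```

A failure here would be any 5xx response or any log line matching the auth-failure pattern; counting both in one window keeps the rule simple at the cost of not distinguishing the two causes.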