Data Durability And Recovery
Use this page when a deployed-dev change could affect database rows, media objects, runtime data paths, or future recovery
expectations. The current shared dev environment is useful for deployment confidence, smoke proof, and learning drills,
but it is not a staging or production durability promise.
For exact operator procedures, use Runbooks. For lifecycle consequences of each operation, use Deployed Dev Lifecycle.
Current Posture
- Deployed-dev Postgres runs as a container on the shared runtime host. It is disposable and reseedable by policy.
- Host stop/start is cost control. It should preserve the current root EBS volume, but it is not backup, restore, reset, or cleanup.
- Runtime deploy and runtime rollback change app images. They do not roll back database rows or media objects.
- Database reset is explicit and destructive. It can delete non-seed rows while leaving S3 media objects behind.
- The deployed-dev media bucket persists across host stop/start, but it is disposable at the infrastructure lifecycle level.
- Backup and restore work is currently a learning drill. It should teach future staging and production planning without
claiming that deployed dev has RPO, RTO, retention, or disaster-recovery guarantees.
Data Surfaces
| Surface | Current Durability Posture | Main Risk | Primary Reference |
|---|---|---|---|
| Containerized Postgres data | Preserved across ordinary container restart, Compose restart, and host stop/start. | Destructive reset, EC2 replacement, root-volume loss, or full stack teardown. | Deployed Dev Lifecycle |
| Deterministic seed data | Recreated by the database reset runbook. | Seed drift or warnings that are hidden by broad deploy output. | Database Reset |
| S3 media objects | Preserved across host stop/start and database reset. Disposable at teardown. | Orphaned objects after DB reset, missing objects after bucket replacement, forced cleanup. | Media Discrepancy Report |
| Last-good release receipts | App-runtime rollback state, not data backup. | Mistaking app rollback for database or media recovery. | Runtime Rollback |
| Private proof and artifacts | Kept outside public docs unless reduced to reusable lessons. | Publishing raw identifiers, logs, sampled rows, or cloud evidence. | Generated Documentation |
Before Risky Work
Classify the operation before choosing a recovery path:
- Is it read-only inspection, app deploy, destructive reset, host replacement, media bucket change, or stack teardown?
- Is the intended data outcome “accept disposable loss”, “preserve with a manual learning backup”, or “pause until a real restore runbook exists”?
- Which proof is required afterward: endpoint smoke, seeded smoke, media discrepancy report, browser media smoke, or a restored-target health check?
- Which evidence belongs in the workflow or private notes, and which lesson is stable enough to promote into docs?
If the answer depends on preserving non-seed data, do not hide the decision inside runtime deploy, host shutdown, or automatic inactivity handling. Make it an explicit operator choice.
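The classification questions above can be sketched as a small pre-flight check. This is a minimal illustration, not an existing tool: `Operation`, `DataOutcome`, `RiskPlan`, and `check_plan` are hypothetical names, and the rule that every non-read-only operation needs at least one follow-up proof is an assumption drawn from the questions rather than a documented policy.

```python
from dataclasses import dataclass, field
from enum import Enum


class Operation(Enum):
    READ_ONLY_INSPECTION = "read-only inspection"
    APP_DEPLOY = "app deploy"
    DESTRUCTIVE_RESET = "destructive reset"
    HOST_REPLACEMENT = "host replacement"
    MEDIA_BUCKET_CHANGE = "media bucket change"
    STACK_TEARDOWN = "stack teardown"


class DataOutcome(Enum):
    ACCEPT_DISPOSABLE_LOSS = "accept disposable loss"
    PRESERVE_WITH_MANUAL_BACKUP = "preserve with a manual learning backup"
    PAUSE_UNTIL_RESTORE_RUNBOOK = "pause until a real restore runbook exists"


@dataclass
class RiskPlan:
    """Hypothetical record of the answers to the questions above."""
    operation: Operation
    outcome: DataOutcome
    required_proof: list[str] = field(default_factory=list)
    operator_confirmed: bool = False  # explicit operator choice, never implied


def check_plan(plan: RiskPlan) -> list[str]:
    """Return blocking problems; an empty list means the plan may proceed."""
    problems = []
    if plan.outcome is DataOutcome.PAUSE_UNTIL_RESTORE_RUNBOOK:
        problems.append("paused: no real restore runbook exists yet")
    if (plan.outcome is DataOutcome.PRESERVE_WITH_MANUAL_BACKUP
            and not plan.operator_confirmed):
        problems.append("preserving non-seed data needs an explicit operator choice")
    # Assumption: anything beyond read-only inspection names a follow-up proof.
    if (plan.operation is not Operation.READ_ONLY_INSPECTION
            and not plan.required_proof):
        problems.append("no follow-up proof selected")
    return problems
```

For example, a destructive reset that accepts disposable loss and names seeded smoke as its proof returns no problems, while a host replacement that intends to preserve non-seed data is blocked until `operator_confirmed` is set.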
Reset And Media Drift
Database reset is the normal deployed-dev path when losing non-seed rows is acceptable. It runs migrations, base seed, and deterministic dev-data seed inside the deployed API container, then requires seeded smoke.
Because reset does not delete the media bucket, DB rows and S3 objects can drift. Treat that as an explicit follow-up:
- Run the read-only media discrepancy report when DB/S3 divergence matters.
- Treat non-zero discrepancy counts as telemetry, not as a trigger for automatic cleanup.
- Add any future cleanup command as a separate approval-gated mutation that prints the exact rows or object keys it will touch.
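The core comparison behind a discrepancy report can be sketched with plain set differences. This is an illustration of the idea only and does not describe the actual Media Discrepancy Report: `compare_media`, `plan_cleanup`, and `DiscrepancyReport` are hypothetical names, and the two key lists are assumed to have been collected read-only beforehand (media keys referenced by DB rows, and object keys listed from the bucket).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DiscrepancyReport:
    orphaned_objects: list[str]  # in the bucket, referenced by no DB row
    missing_objects: list[str]   # referenced by a DB row, absent from the bucket


def compare_media(db_keys, bucket_keys) -> DiscrepancyReport:
    """Compare two key collections without touching either source."""
    db, bucket = set(db_keys), set(bucket_keys)
    return DiscrepancyReport(
        orphaned_objects=sorted(bucket - db),
        missing_objects=sorted(db - bucket),
    )


def plan_cleanup(report: DiscrepancyReport, approved: bool = False) -> list[str]:
    """Print every object key a cleanup would touch.

    Returns the keys only when approved. This function never deletes anything;
    a real mutation would be a separate, approval-gated step.
    """
    for key in report.orphaned_objects:
        print(f"would delete object: {key}")
    return list(report.orphaned_objects) if approved else []
```

Keeping the comparison separate from any cleanup plan mirrors the bullets above: the report stays read-only telemetry, and the keys that a mutation would touch are printed before anything is approved.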
Backup Learning Drill
The first backup/restore pass should be manual and disposable. It should prove the shape of the work before automation is introduced.
The drill should answer:
- What exact source was backed up?
- Where was the backup artifact stored, and who could read it?
- What disposable target received the restore?
- How long did backup and restore take?
- Which permissions, tools, network paths, or runtime values were missing?
- Which health, readiness, seed, or app smoke checks proved the restored data was usable?
- Which lessons belong in deployed dev, and which belong later in staging or production planning?
Until the first drill has real commands, docs should describe the proof shape rather than inventing a runbook. Once the commands are proven, promote the reviewed operator path into Runbooks and leave raw proof details in private evidence or working notes.
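One way to capture the proof shape without inventing commands is a record whose fields mirror the drill questions. The sketch below is an assumption about how such notes could be structured, not an existing template: `DrillRecord`, its field names, and `unanswered` are all hypothetical.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class DrillRecord:
    """Hypothetical note structure; one field per drill question."""
    source: str = ""                       # what exact source was backed up
    artifact_location: str = ""            # where the backup artifact was stored
    artifact_readers: str = ""             # who could read the artifact
    restore_target: str = ""               # disposable target that received the restore
    backup_seconds: Optional[float] = None
    restore_seconds: Optional[float] = None
    # None means "not yet answered"; an empty list means "nothing was missing".
    missing_prerequisites: Optional[list[str]] = None
    proof_checks: Optional[list[str]] = None
    lessons_deployed_dev: str = ""
    lessons_later_environments: str = ""


def unanswered(record: DrillRecord) -> list[str]:
    """List the questions the drill has not answered yet."""
    return [f.name for f in fields(record)
            if getattr(record, f.name) in ("", None)]
```

A drill could be treated as ready for review once `unanswered(record)` returns an empty list. The raw values would still belong in private evidence or working notes rather than public docs.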
Later Environment Gates
Staging and production will need decisions that deployed dev intentionally does not make yet:
- RPO and RTO expectations.
- Retention windows and backup artifact ownership.
- Restore testing cadence.
- Failed-backup alerting.
- Cross-account or cross-region backup posture.
- Data access, encryption, and deletion policy.
- Disaster-recovery ownership during incidents.
Do not retrofit those promises onto deployed dev by implication. Graduate them when a later environment requires the
operational burden.