Inherited codebase: a 2-week recovery plan
Day-by-day plan for the first two weeks after taking on a system you didn't build. Read-only first, write access second, opinions last.
The dangerous time on an inherited system is not month six. It is the first two weeks.
That is when the team that took it over still has the goodwill of the client, none of the context of the original developer, and every incentive to look productive. It is also when most of the avoidable damage gets done — credentials rotated before they were inventoried, deploys made before rollback was tested, opinions formed before the system was actually read.
This is the schedule we follow. It is read-only first, write access second, opinions last. For the access side specifically, the legacy handover checklist covers the inventory step in more depth.
Days 1–2: Access without changes
The first goal is not to fix anything. The first goal is to know what you have.
- Inventory every credential you were handed: hosting, DNS, registrar, repos, CMS, payment, email, analytics, monitoring, backup storage. Write down who currently owns each account.
- Confirm you can authenticate to each; a read-only probe like the sketch after this list is enough. Do not rotate yet.
- Map which accounts are personal (tied to the previous developer’s email) and which are organizational. Personal accounts are a risk — but rotating them on day one usually breaks something nobody documented.
- Note which integrations have webhooks pointing at the system. These will be the silent failures during any cutover.
- Read the most recent invoices for hosting and SaaS. The bill is often a more honest inventory of the system than the documentation is.
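One way to run the authentication check without touching anything is a script of read-only GETs. A minimal sketch; every service name, URL, and token below is a hypothetical placeholder, and the only verb used is a read:

```python
import requests  # read-only probes only; nothing here mutates state

# Hypothetical inventory -- fill in from the credentials you were handed.
# Each entry: service name, an authenticated read-only endpoint, auth headers.
INVENTORY = [
    ("hosting",    "https://api.hosting.example/v1/account", {"Authorization": "Bearer ..."}),
    ("dns",        "https://api.dns.example/v1/zones",       {"Authorization": "Bearer ..."}),
    ("monitoring", "https://api.monitor.example/v2/checks",  {"Authorization": "Bearer ..."}),
]

for name, url, headers in INVENTORY:
    try:
        resp = requests.get(url, headers=headers, timeout=10)
        status = "ok" if resp.status_code == 200 else f"HTTP {resp.status_code}"
    except requests.RequestException as exc:
        status = f"unreachable ({exc.__class__.__name__})"
    print(f"{name:12} {status}")
```

Run it once on day one and keep the output. It is the first dated evidence of what you did and did not have access to.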
If a credential is missing, ask now, in writing. Missing access discovered on day twelve under pressure costs ten times what missing access discovered on day two costs.
Days 3–4: Read the system
This is the temptation period. You will see code that is obviously bad, and you will want to fix it. Do not.
- Open the repo. Read the README, then the deploy script, then the routing entry points, then the cron jobs. In that order.
- Note framework, language, and dependency versions. Note whether they are out of support.
- Open the database. Look at the largest tables and the columns that have NULL where you would not expect NULL; those are usually the load-bearing edge cases. A scan for them is sketched after this list.
- Open the cron and queue configuration. Cron jobs that fire weekly or monthly are the ones that surprise people on day forty.
- Trace one full user flow end to end — login through the most-used action through the resulting email or webhook. That trace is your first map of the system.
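The NULL hunt can be made systematic. A minimal sketch, assuming the MySQL stack this article keeps referring to, a hypothetical read-only account, and the `pymysql` driver; the same idea ports to any engine with an information schema:

```python
import pymysql  # assumes MySQL/MariaDB; adapt for other engines

# Hypothetical read-only credentials. Run this off-peak or against a copy:
# the per-column COUNTs scan each table in full.
conn = pymysql.connect(host="db.internal", user="readonly",
                       password="...", database="app")
with conn.cursor() as cur:
    # Largest tables first -- these hold the load-bearing edge cases.
    cur.execute("""
        SELECT table_name, table_rows FROM information_schema.tables
        WHERE table_schema = %s ORDER BY data_length DESC LIMIT 5
    """, ("app",))
    for table, approx_rows in cur.fetchall():
        cur.execute(f"SELECT * FROM `{table}` LIMIT 0")  # column names only
        for (col, *_) in cur.description:
            cur.execute(f"SELECT COUNT(*) FROM `{table}` WHERE `{col}` IS NULL")
            nulls = cur.fetchone()[0]
            if nulls:
                print(f"{table}.{col}: {nulls} NULLs (~{approx_rows} rows)")
```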
The goal is to read enough to ask informed questions. Not to refactor. If you find a bug, write it down. Do not fix it.
Days 5–6: Test the backups
Most agency-managed systems have backups. Almost none have tested restores.
- Find every backup destination — database, files, configuration. Note the retention policy.
- Pick the most recent backup. Restore it to an isolated environment that is not production, not staging, and not your laptop.
- Time the restore. Note what failed. Backups that “exist” but fail a clean restore (MySQL collation drift, wrong file permissions, missing config files) are the most common surprise here. A timed-restore sketch follows this list.
- Note the recovery point objective (RPO) and the recovery time objective (RTO) implied by what you found. Compare them to what the client thinks they have.
- Document the gap. Do not yet propose how to close it.
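The restore test itself can be as small as a pipe plus a stopwatch. A sketch, assuming a gzipped mysqldump file and an isolated restore host, both names hypothetical; the point is that the elapsed time gets measured, not guessed:

```python
import pathlib
import subprocess
import time

# Hypothetical paths and hosts -- an isolated restore target, never production.
# Credentials come from ~/.my.cnf on the restore host, not the command line.
DUMP = pathlib.Path("/backups/db/app-latest.sql.gz")
MYSQL = ["mysql", "--host=restore-test.internal", "--user=restore", "app_restore"]

start = time.monotonic()
with subprocess.Popen(["gunzip", "-c", str(DUMP)], stdout=subprocess.PIPE) as gz:
    result = subprocess.run(MYSQL, stdin=gz.stdout)
elapsed = time.monotonic() - start

print(f"restore {'ok' if result.returncode == 0 else 'FAILED'} in {elapsed / 60:.1f} min")
```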
If the restore fails entirely, this is the first thing you will fix. Note it; we will come back to it.
Days 7–8: Watch a real day
The system in writing is not the system in production. The only way to know how it actually behaves is to watch one full business day.
- Tail the application log, the web server log, and the database slow query log during peak hours.
- Note the most-hit endpoints, the slowest queries, and the noisy errors that everyone has learned to ignore. A counting sketch for the access log follows this list.
- Watch a deploy if one is scheduled. Note who runs it, from where, with what command, and whether anyone holds their breath.
- Note the support tickets that come in. The ones that say “this happens sometimes” are usually pointing at a real bug that was being absorbed by the previous team.
- Time the slow paths. Page-load slowness, queue lag, scheduled-job runtime. Numbers, not impressions.
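A baseline does not need tooling to start; counting an access log gets you the most-hit endpoints and the noisy errors in a few lines. A sketch, assuming nginx’s default “combined” log format; adjust the regex to whatever the server actually writes:

```python
import collections
import re
import sys

# Matches the request and status in an nginx/Apache "combined" format line.
# Usage: python baseline.py < /var/log/nginx/access.log
LINE = re.compile(r'"[A-Z]+ (?P<path>\S+) HTTP/[^"]*" (?P<status>\d{3})')

hits = collections.Counter()
errors = collections.Counter()
for raw in sys.stdin:
    m = LINE.search(raw)
    if not m:
        continue
    path = m.group("path").split("?")[0]  # fold query strings into one endpoint
    hits[path] += 1
    if m.group("status").startswith("5"):
        errors[path] += 1

print("most-hit endpoints:", hits.most_common(10))
print("5xx by endpoint:", errors.most_common(10))
```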
By the end of these two days you should be able to describe the system’s normal day in one paragraph. If you cannot, you have not watched long enough.
Days 9–10: Risk register
Now you have material. Convert it into a single sheet.
- One row per finding: ID, description, severity, business impact, effort, owner, suggested phase. One way to keep the rows as data is sketched after this list.
- Severity in three buckets: critical (revenue or data loss within 30 days if unaddressed), high (operational fragility), medium/low.
- Effort in three buckets: hours, days, weeks. Be honest. The effort estimate is the part most teams underprice.
- Phase in three buckets: stabilize (now), upgrade (next 60 days), operate (steady state).
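If the register lives as data rather than prose, the row shape is small enough to state exactly. A sketch with illustrative rows, not real findings:

```python
from dataclasses import dataclass

@dataclass
class Finding:
    id: str           # "R-001", "R-002", ...
    description: str
    severity: str     # critical | high | medium/low
    impact: str       # business language, for the client conversation later
    effort: str       # hours | days | weeks
    owner: str
    phase: str        # stabilize | upgrade | operate

# Illustrative rows only.
register = [
    Finding("R-001", "Restore untested for 18 months; first attempt took 4 hours",
            "critical", "Multi-day outage after any data loss", "days", "us", "stabilize"),
    Finding("R-002", "Framework two major versions out of support",
            "high", "No security patches for new CVEs", "weeks", "us", "upgrade"),
]
```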
Resist the urge to fix things while writing the register. The register’s value is its honesty. If you fix as you go, you will leave the easy items off the list and the client will see a sheet that does not match the work being billed.
The format we use looks like the risk register in the sample audit. It is more structured than a list and less ceremonial than a spreadsheet template. The point is that someone non-technical can read it and rank the items.
Days 11–12: Talk to the client
By now, you know more about the system than the client does. That is a temporary advantage. Use it.
- Walk the client through the register in their language, not yours. “Backups have not been tested in 18 months” not “RPO/RTO undefined.”
- Identify the two or three findings that are genuinely urgent. Get explicit go-ahead to address them.
- Identify the findings that are genuinely cosmetic. Get explicit acknowledgment that they are deferred.
- Convert the rest of the register into options. “Now,” “next quarter,” or “deferred indefinitely.” The client decides; you advise.
- Reframe the engagement. The first two weeks were diagnostic. The next phase is execution against an agreed plan.
Put the agreed phase in writing. The most expensive misunderstanding in agency handovers is the one where the client assumed everything in the register was being fixed and the agency assumed they had only been hired to identify it. Be specific.
If the engagement is a white-label takeover where the agency owns the client relationship, the same conversation happens — the agency’s account lead delivers the register and we stay in the background. The structure does not change.
Days 13–14: First safe change
The last two days are about proving the path. Pick the smallest, most reversible improvement on the register and ship it.
- Use the deploy path you intend to use forever. Not a one-time hack. If the existing path is unsafe, fixing the path is the change you ship — that is acceptable.
- Use staging. If there is no staging, the deploy path includes building one. That counts.
- Confirm rollback before deploying. If rollback is theoretical, do not ship.
- Pick a change with low blast radius and high observability. Adding a missing index to a slow query (sketched after this list). Pinning a dependency that has been floating. Documenting an undocumented environment variable. Not a feature.
- Watch it after it ships. For at least 24 hours, on the metrics you started measuring on day seven.
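For the index case, the discipline matters more than the SQL: the rollback statement is written and exercised before production sees the forward one. A sketch against a hypothetical staging database, using the slow query you timed on day seven; table, column, and host names are placeholders:

```python
import pymysql

# Forward statement, rollback statement, and a probe run on both sides.
FORWARD  = "CREATE INDEX idx_orders_created_at ON orders (created_at)"
ROLLBACK = "DROP INDEX idx_orders_created_at ON orders"
PROBE    = "EXPLAIN SELECT * FROM orders WHERE created_at > NOW() - INTERVAL 7 DAY"

conn = pymysql.connect(host="staging-db.internal", user="deploy",
                       password="...", database="app")
with conn.cursor() as cur:
    cur.execute(PROBE)
    plan_before = cur.fetchall()
    cur.execute(FORWARD)
    cur.execute(PROBE)
    plan_after = cur.fetchall()
    cur.execute(ROLLBACK)  # prove the rollback is real before shipping anything

print("plan before:", plan_before)
print("plan after:", plan_after)
```

On staging this proves both directions work; the production deploy then runs only the forward statement, through the deploy path you intend to keep.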
The point of the first change is not the change. The point is to walk every step of the path you will use for everything else, while the stakes are low. The next twenty changes will use the same path, and they will not be low-stakes.
What this is not
This is not a sprint. The goal of the two weeks is not to finish things. It is to make the next six months safe.
A team that follows this schedule emerges with: an inventory, a risk register, a tested restore, a baseline of normal-day metrics, a documented deploy path, and one safely shipped change. That is enough to quote real work against, and enough to start a structured audit if the engagement calls for one. It is also enough to make the next emergency ten times cheaper than the last one.
The teams that skip this and start fixing things on day three are the ones that, six weeks in, cannot tell whether the system is more stable than when they took it over. Sequence is the difference. Sequence is most of the work.
For the related case where there is no previous team to hand over from — the developer left and the founder is alone with the system — see “sole developer left, what now?”. The shape of the work is similar. The audience is different. So is the urgency.