The Migration Plan

Learn how to migrate legacy monoliths safely using modular decomposition, anti-corruption layers, and controlled cutover strategies that reduce risk and ensure smooth transitions to modern architectures.


Most migrations don’t fail because of bad code. They fail because the migration strategy is flawed.

Migrating a legacy monolith is one of the riskiest, most politically sensitive moves in engineering—and at the Staff+ level, you’re often expected to lead it.

Disasters can stem from issues ranging from overambitious big-bang rewrites to tight coupling to the lack of a rollback plan. But these risks are manageable if you lead with discipline.

This lesson covers three battle-tested tools to help you do exactly that:

  1. Modular decomposition to isolate concerns and reduce risk.

  2. Anti-corruption layers (ACLs) to protect new code from legacy complexity.

  3. Cutover strategies to shift production traffic with observability and rollback.

Let’s break these down.

1. Module-by-module migration

Big-bang rewrites are the most common cause of migration failure. A big-bang rewrite is when a team decides to throw away the existing system (usually a legacy monolith) and rebuild it entirely from scratch in a new language, framework, or architecture. It promises a clean slate, but after months (or years) of work, teams often end up with half-integrated modules, stalled delivery, and frustrated stakeholders asking, “When does it ship?”

You can avoid this trap through module-by-module migration.

Instead of replacing everything at once, you:

  • Gradually decompose the monolith into smaller, independently managed domains, where each module aligns with a meaningful business capability.

  • Extract each module, rewrite it if necessary, and deploy it behind a clear API boundary.

  • Choose the extraction order based on complexity, coupling, risk, and team familiarity.

Here are rules of thumb to follow:

  • Start small: Low-risk, loosely coupled domains first.

  • Own the domain: Every extracted module must fully control its data, logic, and APIs.

  • Keep the monolith alive: It’s not dead weight. It’s your safety net until the migration is complete.
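To make "start small" concrete, a team could score candidate modules before choosing an extraction order. The sketch below is only illustrative: the `Candidate` fields, the 1 to 5 scales, the sample scores, and the equal weighting are all assumptions, not something this lesson prescribes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """A module considered for extraction. Scores are hypothetical: 1 = low, 5 = high."""
    name: str
    complexity: int
    coupling: int
    risk: int
    familiarity: int  # how well the team knows this domain


def extraction_cost(c: Candidate) -> int:
    # Lower is better: simple, loosely coupled, low-risk, well-understood modules go first.
    # Equal weights are an assumption; adjust them to your own context.
    return c.complexity + c.coupling + c.risk + (6 - c.familiarity)


def extraction_order(candidates: list[Candidate]) -> list[str]:
    # sorted() is stable, so modules with equal cost keep their input order.
    return [c.name for c in sorted(candidates, key=extraction_cost)]


# Made-up scores for the e-commerce domains used in this lesson.
candidates = [
    Candidate("Billing", complexity=5, coupling=5, risk=5, familiarity=3),
    Candidate("Search", complexity=2, coupling=1, risk=1, familiarity=4),
    Candidate("Orders", complexity=4, coupling=4, risk=4, familiarity=4),
    Candidate("Product Catalog", complexity=3, coupling=2, risk=2, familiarity=4),
]
print(extraction_order(candidates))
```

A score like this is a conversation starter, not a decision procedure. Its main value is forcing the team to write down why one domain goes before another.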

Example: E-commerce modular migration

Let’s say a team maintains a monolith that handles all business logic—orders, payments, users, and products. They decide to rewrite everything in a new framework and separate services. But after a year, nothing is usable because the integration challenges between modules were too great, and there was no plan to deliver value incrementally. This is a failed big-bang rewrite.

Instead, they could have done a module-by-module migration by:

  • Starting small by extracting the “Search” functionality.

  • Rebuilding Search as an independent service.

  • Routing search-related requests to Search via an API gateway. 

Once Search was stable, they could move on to other modules, such as “Product Catalog,” “Orders,” and “Billing.”
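The routing step can be pictured as a small lookup that sends already-migrated paths to the new service and everything else to the monolith. This is a minimal sketch of the idea, not a real gateway configuration: the URLs, the `MIGRATED_ROUTES` table, and the `resolve_upstream` function are hypothetical names chosen for illustration.

```python
# Hypothetical addresses; a real gateway would read these from its configuration.
MONOLITH_URL = "http://monolith.internal"
MIGRATED_ROUTES = {
    "/search": "http://search.internal",
}


def resolve_upstream(path: str) -> str:
    """Return the backend that should handle a request path (query string excluded)."""
    for prefix, upstream in MIGRATED_ROUTES.items():
        # Match the prefix exactly or as a whole path segment,
        # so "/searching" is not captured by "/search".
        if path == prefix or path.startswith(prefix + "/"):
            return upstream
    # Anything not yet extracted still goes to the monolith, the safety net.
    return MONOLITH_URL


print(resolve_upstream("/search/suggest"))
print(resolve_upstream("/orders/42"))
```

Migrating the next module then becomes a one-line change to the routing table, and rolling back is the same change in reverse.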

2. Anti-corruption layer (ACL)

Tight coupling between new and legacy systems is a silent killer of migrations. Without clear boundaries, legacy quirks such as leaky data models, inconsistent naming, and hidden business logic creep back in until the “new” system behaves just like the old one.

You can prevent this with anti-corruption layers (ACLs)—explicit translation boundaries that decouple legacy dependencies, normalize data, and shield new services from old complexity. The result: clean, consistent interfaces that keep legacy assumptions quarantined where they belong.

The most underestimated threat in migrations isn’t what you move out, but what seeps back in (like undocumented business rules and John’s cryptic DO_NOT_REMOVE field names).

Example: Billing API translation

Let’s say the old billing API returns:

{
  "usr_nm": "jdoe",
  "stat": "A",
  "bal": "00124.0500"
}

Without an ACL, new services might scatter logic everywhere to rename fields, interpret statuses, and parse balances. Over time, you’ve just replicated legacy weirdness in the new system.

With an ACL, you centralize translation:

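# billing_acl.py
from dataclasses import dataclass
from decimal import Decimal


@dataclass
class BillingUser:
    username: str
    status: str
    balance: Decimal


def normalize(raw: dict) -> BillingUser:
    """Translate a legacy billing record into the clean model."""
    return BillingUser(
        username=raw["usr_nm"],
        # Legacy code "A" means active; any other code is treated as inactive.
        status="active" if raw["stat"] == "A" else "inactive",
        # Decimal parses the zero-padded string exactly, with no float rounding.
        balance=Decimal(raw["bal"]),
    )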

Now, all new code consumes clean, typed models. Legacy complexity stays locked inside the ACL. 

Properties of a good ACL

  • Exists in a dedicated module, not spread across services.

  • All dependencies on the legacy system are centralized here.

  • Tested thoroughly to avoid unexpected translations.

  • Used in both read and write paths if needed.

The payoff? Refactoring is safer, and when the legacy system is finally retired, you only rewrite the ACL (not every downstream service).
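The "read and write paths" and "tested thoroughly" properties can be sketched together. The block below repeats the read-path mapping from the billing example and adds a hypothetical write path. Two things are assumptions rather than facts about the legacy API: that inactive users are stored as "I" (the lesson only shows "A"), and that balances are zero-padded to 10 characters with 4 decimal places (inferred from the single sample value).

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass
class BillingUser:
    username: str
    status: str
    balance: Decimal


def normalize(raw: dict) -> BillingUser:
    """Read path: legacy record -> clean model (same mapping as the billing example)."""
    return BillingUser(
        username=raw["usr_nm"],
        status="active" if raw["stat"] == "A" else "inactive",
        balance=Decimal(raw["bal"]),
    )


def denormalize(user: BillingUser) -> dict:
    """Write path: clean model -> legacy record."""
    # ASSUMPTION: the legacy code for inactive is "I"; confirm against the real system.
    stat = "A" if user.status == "active" else "I"
    # ASSUMPTION: legacy balances use 4 decimal places, zero-padded to 10 characters.
    bal = str(user.balance.quantize(Decimal("0.0001"))).zfill(10)
    return {"usr_nm": user.username, "stat": stat, "bal": bal}


legacy = {"usr_nm": "jdoe", "stat": "A", "bal": "00124.0500"}
user = normalize(legacy)

# Pin the translation in both directions so any change fails loudly.
assert user == BillingUser("jdoe", "active", Decimal("124.05"))
assert denormalize(user) == legacy
```

Round-trip assertions like these are cheap, and they catch the most common ACL regression: a translation that silently changes meaning in one direction only.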

3. De-risking cutovers

The most dangerous part of any migration is the cutover: when real users hit the new system. This is where theory meets production traffic, and subtle bugs or edge cases surface. If you treat it as a switch flip, you’re rolling dice with downtime, corrupted data, and late-night fire drills.

Cutovers should be treated as staged experiments.

Techniques

  • Dark launch: The new system receives a copy of production traffic, but its responses are discarded. This reveals performance problems, schema mismatches, and unexpected load without impacting users.

  • Shadow testing: Requests are sent to both old and new systems. Outputs are compared for consistency, catching subtle bugs like rounding errors or missing fields.

  • Gradual rollout: Use feature flags to shift traffic in percentages (1%, 10%, 25%, 50%, 100%).

  • Observability and rollback: Monitor latency, error rates, and mismatches in real time. If something spikes, traffic reverts automatically.
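Shadow testing from the list above can be sketched as a wrapper that always serves the old system's answer and records any disagreement with the new one. Everything here is illustrative: `shadow_call`, the two pricing handlers, and their rounding rules are hypothetical, and a production version would usually run the shadow call asynchronously so it adds no latency.

```python
from typing import Any, Callable

Handler = Callable[[dict], Any]


def shadow_call(request: dict, old: Handler, new: Handler, mismatches: list) -> Any:
    """Serve from the old system; run the new one on the side and record differences."""
    old_response = old(request)
    try:
        new_response = new(request)
        if new_response != old_response:
            mismatches.append({"request": request, "old": old_response, "new": new_response})
    except Exception as exc:
        # A failure in the new system is recorded but never reaches the user.
        mismatches.append({"request": request, "old": old_response, "error": repr(exc)})
    return old_response


# Hypothetical handlers that disagree on rounding a percentage discount (prices in cents).
def old_price(req: dict) -> int:
    return req["cents"] * (100 - req["pct"]) // 100          # rounds down


def new_price(req: dict) -> int:
    return (req["cents"] * (100 - req["pct"]) + 50) // 100   # rounds half up


log: list = []
shadow_call({"cents": 1000, "pct": 10}, old_price, new_price, log)  # both return 900
shadow_call({"cents": 995, "pct": 15}, old_price, new_price, log)   # 845 vs 846
print(len(log))
```

The user always receives the old response, so a mismatch costs nothing except a log entry, which is exactly the kind of rounding difference this technique is meant to surface.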

Example: Staged traffic migration

Your team has just finished migrating the “Catalog” domain. Instead of pointing all traffic at the new service at once, you:

  1. Dark launch for a week—logs reveal unexpected load spikes during promotions.

  2. Shadow test against the old catalog—differences in how discounts are applied surface.

  3. Fix the logic, then ramp traffic: 1% of users on day one, 10% the next, 50% by week’s end.

  4. Only when error rates stay flat do you commit 100%.

The cutover succeeds without a blip in metrics or user experience.
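The ramp-and-rollback logic behind a staged cutover like this can be sketched in a few lines. This is a simplified illustration under stated assumptions: the function names, the hash-based bucketing, and the 0.5 percentage point error tolerance are hypothetical, and a real system would typically rely on a feature-flag service and its monitoring stack for both pieces.

```python
import hashlib

RAMP_STEPS = [1, 10, 25, 50, 100]  # traffic percentages for the gradual rollout


def in_rollout(user_id: str, percent: int) -> bool:
    """Place each user in a stable bucket 0-99; buckets below `percent` get the new system."""
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:4], "big") % 100
    return bucket < percent


def next_percent(current: int, error_rate: float, baseline: float,
                 tolerance: float = 0.005) -> int:
    """Advance one ramp step while errors stay near baseline; otherwise revert to 0%."""
    if error_rate > baseline + tolerance:
        return 0  # automatic rollback: all traffic returns to the old system
    for step in RAMP_STEPS:
        if step > current:
            return step
    return current  # already at 100%


print(next_percent(10, error_rate=0.010, baseline=0.010))  # healthy: move to the next step
print(next_percent(50, error_rate=0.050, baseline=0.010))  # error spike: roll back
```

Because the bucket is derived from the user ID, a user who sees the new system at 10% keeps seeing it at 25% and beyond, which avoids users flip-flopping between old and new behavior during the ramp.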

Test your knowledge

Quiz: Staff+ Migration Strategies

1. What is the main advantage of migrating module by module?

  A. Faster deployment

  B. Easier hiring

  C. Reduces risk by isolating domains

  D. Avoids writing tests

