In This Article
Why Migrations Fail
Five root causes of migration failure: 1. Underestimating data quality issues. The source system has 15 years of data — including: duplicate records, orphaned references, invalid values, and undocumented custom fields. Discovery during migration (not before) causes delays and rework. 2. Incomplete data mapping. 200 source fields mapped to 180 target fields — but 20 fields have no equivalent in the target, 15 fields require transformation, and 8 fields have different data types. Each unmapped field is a defect discovered in testing or production. 3. Inadequate testing. "We migrated 100 records and they looked fine." But the 100 records didn't include: the edge case where a customer has 3 addresses, the scenario where an order has 50 line items, or the data type that truncates at 255 characters. 4. Big-bang cutover without rehearsal. The cutover plan is 40 steps executed by 8 people over 48 hours — rehearsed zero times. Step 23 fails. Nobody knows the contingency. 5. No rollback plan. The migration goes live, problems are discovered Monday morning, and there's no way to go back — because the rollback plan was "we won't need one."
The 6-Phase Framework
| Phase | Duration | Output |
|---|---|---|
| 1. Assessment | 2-3 weeks | Scope, risk register, data profiling results |
| 2. Mapping | 2-4 weeks | Field mapping document, transformation rules |
| 3. Cleansing | 3-6 weeks | Clean source data, quality metrics baseline |
| 4. Testing | 3-4 weeks | Test results, defect resolution, signoff |
| 5. Cutover | 1-2 days | Migration executed, systems switched |
| 6. Validation | 2-4 weeks | Post-migration audit, defect resolution |
Phase 1: Assessment
Data profiling: Before planning the migration, profile every source table: row counts, column statistics (null rates, unique values, min/max, patterns), data quality metrics (duplicates, orphans, invalid values), and volume estimates. Tools: Great Expectations, SQL Profiler, or Purview data profiling. Scope definition: Which data migrates? (all historical? last 5 years? active records only?) Which systems are in scope? What's the cutover strategy? (big-bang, phased, or parallel?) Risk register: Identify risks early: data quality issues discovered during profiling, complex transformations identified during mapping review, tight cutover window, and organizational readiness.
Phase 2: Data Mapping
The mapping document is the migration specification: field-level mapping (source field → target field, data type, transformation rule. Every source field mapped to: a target field, a transformation, or "not migrated" with business justification), reference data mapping (source codes → target codes: source status "A" = target status "Active" — every code value mapped), complex transformations (address parsing, name splitting, date format conversion, currency conversion — each documented with: input format, output format, edge cases, and validation rule), and relationship mapping (source foreign keys → target foreign keys). The mapping document is reviewed by both: technical team (can we implement this?) and business owner (does this preserve business meaning?).
Phase 3: Cleansing
Cleansing before migration — not after: deduplication (identify and merge duplicate records in the source before migration — migrating 5,000 duplicate customers into the new system just makes the new system dirty from day one), standardization (address standardization, phone number formatting, name casing — applied in the source or during migration transformation), validation (referential integrity: every foreign key references a valid parent record. Business rules: every order has at least one line item, every customer has a valid email format), and enrichment (filling missing values where possible: zip codes from city/state, industry codes from company name). Cleansing investment: 3-6 weeks of effort that prevents 3-6 months of post-migration data quality remediation. The organizations that skip cleansing save 3-6 weeks pre-migration and spend 3-6 months post-migration remediating the same issues — at 5-10x the cost because the issues are now distributed across two systems.
Phase 4: Testing
Structured migration testing: unit testing (each transformation rule tested with: normal cases, edge cases, null inputs, and maximum-length values — automated), volume testing (migrate the full dataset: does it complete in the cutover window? does it handle 50M rows without failure?), reconciliation testing (source count = target count for every table. Source sum = target sum for every numeric field. Source distinct values = target distinct values for every reference field — automated reports that flag discrepancies), business validation (business users validate 50-100 representative records in the target: does customer X's data look correct?), and regression testing (business processes executed on migrated data: can users create orders, run reports, process payments?). Coverage: automated reconciliation covers 100% of data. Business validation covers representative samples. Regression testing covers all critical processes.
Phase 5: Cutover
Rehearsal: Execute the full cutover in a test environment at least twice before production. Time every step. Identify bottlenecks. The production cutover should have zero surprises. Cutover runbook: Step-by-step document with: step number, description, responsible person, estimated duration, validation check, and rollback step. Go/no-go decision points: At defined checkpoints: validate previous steps, decide whether to proceed or rollback. Communication: Business stakeholders informed: pre-cutover (system unavailable from X to Y), during cutover (on track / delayed / rolling back), and post-cutover (system available, please validate).
Phase 6: Validation
First 2-4 weeks after cutover: automated reconciliation (scheduled comparison of source and target: are counts still matching?), business user validation (each department validates their critical data), defect triage (critical: data loss — fix immediately. Major: incorrect transformation — fix within 48 hours. Minor: cosmetic — fix in batch), and rollback readiness (maintain rollback capability for 2-4 weeks — if a critical defect is discovered that can't be fixed quickly, rollback is still possible). Defect volume should decline: Week 1: 20-50 issues. Week 2: 10-20. Week 4: under 5. If defect volume isn't declining — there's a systematic issue that needs root cause analysis, not more patches.
Migration Project Planning: Timeline and Resource Estimation
| Migration Size | Tables | Records | Timeline | Team Size |
|---|---|---|---|---|
| Small | 10-30 | Under 10M | 8-12 weeks | 2-3 (1 architect + 1-2 engineers) |
| Medium | 30-100 | 10-100M | 12-20 weeks | 4-6 (1 architect + 2-3 engineers + 1 QA + 1 PM) |
| Large | 100-500 | 100M-1B | 20-36 weeks | 8-12 (2 architects + 4-6 engineers + 2 QA + 1 PM) |
| Enterprise | 500+ | 1B+ | 6-12 months | 15-20+ (multiple workstreams) |
The timeline includes all 6 phases — not just the migration execution. The most common planning mistake: allocating 80% of the timeline to development and 20% to everything else. The correct allocation: assessment 15%, mapping 15%, cleansing 25%, development 15%, testing 20%, cutover + validation 10%. Cleansing and testing together consume 45% of the timeline — and organizations that shortchange these phases pay 10x in post-migration remediation.
Data Migration Governance: Change Control and Audit Trail
Migration governance ensures traceability and accountability: change control (every change to the migration specification — mapping rules, transformation logic, scope adjustments — documented with: what changed, why, who approved, and when. Changes after testing starts trigger re-testing of affected areas), audit trail (every migration run logged: timestamp, records processed, records succeeded, records failed, errors encountered, and who initiated the run. The audit trail proves: what data was migrated, when, and whether it was validated), signoff gates (formal signoff required at: mapping completion (business + technical), testing completion (QA + business), and go/no-go decision (project sponsor + business owners + technical lead). No phase proceeds without documented signoff), and issue management (every migration issue logged with: severity, description, root cause, resolution, and validation that the resolution works. Issues are reviewed daily during active migration phases and tracked to closure). For regulated industries (financial services, healthcare, pharmaceuticals): migration governance documentation is audit evidence — the regulator may review the migration process years later.
Data Migration Anti-Patterns
Anti-patterns that cause migration failures: the "we'll clean it later" approach (migrating dirty data with the plan to clean it in the new system — the dirty data corrupts new system processes, generates incorrect reports, and creates more work than cleansing before migration would have. Clean before migration — always), the "one big script" approach (a single monolithic migration script that processes all tables sequentially — if it fails at table 47 of 200, you re-run all 200 tables. Solution: modular scripts per table with: checkpoint/restart capability, error handling per table, and parallel execution for independent tables), the "weekend warrior" cutover (schedule the cutover for a single weekend with no rehearsal — the team is tired, the runbook is untested, and there's no time buffer for issues. Solution: rehearse twice, schedule for a long weekend with a buffer day), the "IT-only migration" (the migration team doesn't include business users until cutover — defects that business users would have caught in testing are discovered in production. Solution: business users involved from mapping through validation), and the "no rollback" plan (assuming the migration will succeed without preparing for failure — when issues arise, the only option is forward, regardless of severity. Solution: maintain rollback capability for 2-4 weeks post-cutover).
Migration Tools and Technology Selection
| Scenario | Recommended Tool | Why |
|---|---|---|
| SQL Server → Azure SQL | Azure Database Migration Service | Managed, supports online migration |
| Oracle → PostgreSQL | AWS SCT + DMS or ora2pg | Schema conversion + data migration |
| On-prem DW → Fabric | ADF + Fabric pipelines | Azure-native, supports incremental migration |
| Multi-source → lakehouse | Custom Spark + Great Expectations | Flexible transformation + quality validation |
| Large volume (10TB+) | Azure Data Box + CDC | Physical transfer for bulk + CDC for ongoing sync |
The Xylity Approach
We execute data migrations with the 6-phase framework — assessment, mapping, cleansing, testing, cutover, and validation. Our data engineers and data architects prevent the 60% migration failure rate by: profiling quality before migration, cleansing before loading, testing at scale with automated reconciliation, and rehearsing cutover until it's routine.
Go Deeper
Continue building your understanding with these related resources from our consulting practice.
Data Migration Without the 60% Failure Rate
6-phase framework — assessment, mapping, cleansing, testing, cutover, validation. Data migration that's rehearsed, tested, and reconciled.
Start Your Data Migration →