In This Article
Why Migration Testing Is Under-Invested
Migration testing gets squeezed because: the project is already over budget (cleansing took longer than planned), the deadline is fixed (datacenter lease expires, vendor contract ends), and the team assumes "if the script runs, the data is correct." The script can run successfully and still: truncate text fields, lose decimal precision, mismap reference data, break foreign key relationships, and omit records that don't match the expected format. Cutting testing saves 3-4 weeks of project time and creates 3-4 months of post-migration remediation. The economics: invest in testing or pay 10x in production defect resolution.
5-Layer Testing Framework
| Layer | What It Tests | Automation | Coverage |
|---|---|---|---|
| 1. Row count | Same records in source and target | Fully automated | 100% of tables |
| 2. Value reconciliation | Sum, min, max, distinct count for numeric/key fields | Fully automated | 100% of fields |
| 3. Transformation | Each transformation rule produces correct output | Automated unit tests | 100% of rules |
| 4. Business validation | Representative records verified by business users | Manual with checklists | 50-100 records per entity |
| 5. Process regression | Business processes work on migrated data | Manual with scripts | All critical processes |
Layers 1-3 are automated and run on every migration iteration. Layers 4-5 are human-validated and run on the final migration before cutover. All 5 layers must pass before go/no-go decision.
Automated Reconciliation
Per-table automated checks: row count match (source rows = target rows — if not, investigate missing or extra records), numeric field validation (SUM, MIN, MAX, AVG for every numeric column — catches truncation, rounding, precision loss, incorrect data type mapping), key field validation (COUNT DISTINCT of key fields matches. All PKs in source exist in target. All FKs resolve to valid parent records), date field validation (MIN/MAX dates match. No dates converted to NULL. No timezone-related shifts — the most common date migration defect), and string field validation (MAX LENGTH doesn't exceed target column width. Character encoding preserved — UTF-8 special characters not corrupted). Build reconciliation as a parameterized script that: connects to source and target, runs all checks for every table, and produces a pass/fail report with exact discrepancies. Target: 100% pass rate before cutover.
Business Validation
Automated reconciliation validates correct transfer. Business validation validates business sense: golden record validation (50-100 records representing: typical records, edge cases, high-value records, complex relationships — verified by business users in the target system), report comparison (same report in source and target — totals match? breakdown matches? filters produce same results?), calculation validation (calculated fields: account balances, inventory totals, financial aggregations — validated against source calculations. $0.01 discrepancy in financial totals must be investigated), and user acceptance testing (5-10 business users perform daily tasks on migrated data: create orders, process returns, run payroll — each workflow validated end-to-end).
Cutover Rehearsal
Protocol: Execute full cutover in non-production at least 2 times before production. Each rehearsal: follow exact runbook, time every step, validate checkpoints, and practice rollback (at least one rehearsal should include deliberate rollback). Record observations: steps that took longer than expected, steps that failed, steps missing from runbook, undocumented dependencies. After 2 successful rehearsals: the process should feel routine, not stressful. If rehearsal 2 still produces surprises — rehearse again. The cost of a third rehearsal: 1-2 days. The cost of a production cutover failure: weeks of remediation plus business disruption.
Rollback Strategy
Point of no return: Define the step after which rollback is no longer simple — usually when the target system starts accepting new transactions. Before this point: rollback = restore source from backup. After: rollback = reverse-migrate new transactions from target to source — significantly more complex. Rollback window: Maintain rollback capability for 2-4 weeks. Source system remains available, data synchronization keeps source current, rollback procedures documented and rehearsed. Rollback triggers: Critical data loss discovered. >5% reconciliation discrepancy. Critical business process failure. Key stakeholder declares "this doesn't work." Communication plan: If triggered: who is notified, timeline for restoring source, plan for attempt #2.
Post-Migration Monitoring
First 4 weeks: daily reconciliation (automated key metric comparison — catches missed records, ongoing quality issues, transformation defects), defect tracking (every issue logged, categorized, tracked. Week 1: 20-50 issues. Week 4: under 5. If not declining — systematic issue needs root cause analysis), performance monitoring (queries that were fast on source may be slow on target — different indexing, data distribution, query optimizer), and user adoption monitoring (are users working in the new system? are they reverting to old? low adoption at week 2 indicates: data trust issues, functionality gaps, or training needs).
Testing Tools and Automation
| Testing Layer | Tool | Purpose |
|---|---|---|
| Data profiling | Great Expectations, Purview | Source data profiling and quality baseline |
| Reconciliation | Custom SQL/Python scripts | Automated source-target comparison |
| Transformation testing | pytest, dbt tests | Unit testing for transformation rules |
| Business validation | Structured checklists | Golden record verification |
| Performance testing | Query benchmarks | Target system performance validation |
The reconciliation framework is the most valuable investment: build it once, run it on every migration iteration (development, testing, rehearsal, production), and reuse it for post-migration monitoring. The framework pays for itself on the first migration where it catches a defect that would have reached production. For organizations running multiple migrations (ERP replacement + data warehouse modernization + CRM migration): the reconciliation framework becomes a reusable enterprise asset that reduces testing cost on every subsequent migration by 50-70%.
Reconciliation Framework: Reusable Across Migrations
Build the reconciliation framework as a reusable asset — not a one-time script: framework architecture (configuration-driven: define source connection, target connection, and table-level rules in YAML/JSON. The framework reads the configuration and executes all checks automatically. Adding a new table: add a configuration entry, not new code), check types (row count, numeric sum, distinct key count, null count, min/max, pattern validation — each implemented as a reusable function. Custom checks: business-specific validation rules added as plugins), reporting (HTML report showing: table name, check type, source value, target value, pass/fail, and discrepancy percentage. Summary dashboard: total tables, total checks, pass rate, failed checks by category. Email notification on completion with: pass/fail summary and link to detailed report), and scheduling (run reconciliation: on every migration iteration during development, nightly during testing phase, every 15 minutes during cutover, and daily for 4 weeks post-migration). The framework pays for itself on the second migration: development cost amortized, testing coverage thorough from day one, and post-migration monitoring automated.
Edge Cases That Break Migrations
Edge cases that migration testing must cover: NULL handling (source NULL → target empty string? target NULL? target default value? — each NULL mapping explicitly defined and tested), character encoding (Latin-1 source → UTF-8 target: special characters preserved? Unicode source → ASCII target: how are non-ASCII characters handled? — test with actual multi-language data from the source), date/time edge cases (timezone conversions: UTC source → local target; DST transitions: November 3 at 2:00 AM — does the hour repeat or skip? date formats: MM/DD/YYYY vs DD/MM/YYYY — January 2nd vs February 1st), numeric precision (DECIMAL(10,2) source → FLOAT target: $1,000,000.01 may become $1,000,000.009999999 in floating point — financial data must use fixed-precision types), max-length values (VARCHAR(500) source → VARCHAR(255) target: the 300-character customer note is silently truncated — test with maximum-length values from every string column), and referential integrity across load order (child records loaded before parent records → foreign key violation. Solution: load parents first, children second — or disable constraints during load and validate after). These edge cases account for 80% of post-migration defects. Include them in Layer 3 (transformation testing) with automated unit tests for each.
Migration Testing for Regulated Industries
Regulated industries (financial services, healthcare, pharmaceuticals) require additional testing rigor: validation documentation (IQ/OQ/PQ protocol for pharmaceutical GxP systems — Installation Qualification, Operational Qualification, Performance Qualification — each step documented with: expected result, actual result, pass/fail, and investigator signature), data integrity proof (demonstrate that migrated data maintains ALCOA+ properties — Attributable, Legible, Contemporaneous, Original, Accurate + Complete, Consistent, Enduring, Available — each property tested and documented), audit trail migration (audit trail data migrated alongside business data — regulatory requirement that the complete history of data changes is preserved in the new system), 21 CFR Part 11 compliance (for pharmaceutical companies: electronic signatures, access controls, and audit trails in the new system validated against FDA requirements — before migration data enters the system), and parallel operation (regulated migrations often require parallel operation for 30-90 days — both old and new systems running simultaneously, with reconciliation proving equivalent results. This is expensive but regulatory necessity — the cost of a compliance finding far exceeds the cost of parallel operation).
Continuous Migration Testing: Shift-Left Approach
Traditional migration testing: build the migration → test at the end → find defects → fix → retest → ship. Shift-left migration testing: test continuously from day one: mapping reviews (every mapping rule reviewed and approved before implementation — catch mapping errors before they become code defects), transformation unit tests (written alongside transformation code — each rule tested with: normal cases, edge cases, nulls, and maximum values. Tests run on every code change), integration tests on sample data (after every development sprint: run the migration on a representative sample. Reconciliation automated. Defects caught within days of introduction — not weeks later during formal testing), full-volume test early (don't wait for formal testing phase to run at full volume — run a full-volume test at the end of development to catch: performance issues, memory issues, and data patterns that don't appear in samples), and automated regression (every defect fix accompanied by a test that prevents regression. The test suite grows throughout the project — by cutover, the test suite is thorough enough to catch any regression introduced by last-minute changes).
The Xylity Approach
We test data migrations with the 5-layer framework — automated reconciliation (row counts + value validation + key integrity), transformation unit testing, business validation (golden records + report comparison), cutover rehearsal (2+ full rehearsals), and post-migration monitoring (4 weeks daily reconciliation). Our data engineers and ETL developers build testing infrastructure that catches defects before they become Monday-morning emergencies.
Go Deeper
Continue building your understanding with these related resources from our consulting practice.
Testing That Prevents Monday-Morning Surprises
5-layer validation, automated reconciliation, cutover rehearsal, rollback planning. Migration testing that catches defects before production.
Start Your Migration Testing →