Legacy Ceiling: When On-Premises Can't Scale

On-premises warehouses (SQL Server, Oracle, Teradata) hit ceilings hardware can't fix:

Compute ceiling: Nightly ETL takes 8 hours on the largest server. Adding sources pushes past the morning deadline. Cloud solution: elastic compute scaling to 100+ nodes during ETL, processing 8 hours of work in 45 minutes, then scaling down.

Concurrency ceiling: 50 concurrent users is fine. At 200, queries queue — the 10 AM report arrives at 10:45 AM. On-premises fix: buy another $500K appliance. Cloud fix: auto-scaling compute that adds capacity when users increase — no procurement, no 12-week lead time.

Flexibility ceiling: Adding a data source requires capacity planning, ETL development, impact assessment, and a change window. Timeline: 4-8 weeks. Cloud: add to the pipeline, scale compute, deploy. Timeline: days. The cloud platform removes the infrastructure friction that makes every change a project.

The legacy warehouse isn't failing because it's bad technology. It was designed for 10x less data, 4x fewer users, and zero AI workloads. Cloud architecture is designed for the world that actually exists. — Xylity Data Engineering Practice

What Changes in Cloud Architecture

| Dimension | On-Premises | Cloud |
| --- | --- | --- |
| Storage | Fixed, planned years ahead | Unlimited, pay for what you store |
| Compute | Fixed, shared by all workloads | Elastic, independent per workload |
| Scaling | Vertical, months of lead time | Horizontal, minutes |
| ETL | Overnight batch | Continuous / near-real-time |
| Concurrency | Limited by hardware | Limited by budget (auto-scale) |
| AI serving | Separate infrastructure | Integrated; warehouse serves ML features |
| Cost | CapEx (hardware every 3-5 years) | OpEx (monthly consumption) |

Separation of Storage and Compute

The defining shift: storage and compute scale independently. On-premises, they're welded together in the same appliance. In the cloud, you can add 10TB without changing compute, or add 10 query nodes without adding storage. This enables workload isolation (ETL doesn't compete with BI), elastic scaling (extra compute for month-end, scaled down after), cost optimization (compute turned off out of hours), and unlimited growth (no capacity planning). Snowflake pioneered this separation; Fabric, BigQuery, and Databricks implement similar patterns.
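A minimal sketch of what independent scaling means for the bill: storage and compute are separate line items, so growing one never forces you to buy more of the other. All rates here are hypothetical, not any vendor's pricing.

```python
# Illustrative model of separated storage and compute billing.
# Rates are assumptions for the sketch, not real vendor prices.

STORAGE_RATE_TB_MONTH = 23.0   # $/TB/month (assumed)
COMPUTE_RATE_NODE_HOUR = 4.0   # $/node-hour (assumed)

def monthly_cost(storage_tb: float, node_hours: float) -> float:
    """Two independent line items: adding data doesn't change the
    compute charge, and adding nodes doesn't change the storage charge."""
    return (storage_tb * STORAGE_RATE_TB_MONTH
            + node_hours * COMPUTE_RATE_NODE_HOUR)

base         = monthly_cost(storage_tb=50, node_hours=2 * 730)   # 2 nodes, always on
more_data    = monthly_cost(storage_tb=60, node_hours=2 * 730)   # +10 TB, same compute
more_compute = monthly_cost(storage_tb=50, node_hours=2 * 730 + 10 * 40)  # +10 nodes for 40h

print(more_data - base)      # only the storage line item moved
print(more_compute - base)   # only the compute line item moved
```

In a welded on-premises appliance, either change would mean buying a bigger box of both.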

The Lakehouse-Warehouse Convergence

Fabric provides both: Lakehouse (schema-on-read, Spark processing, raw storage in OneLake) and Warehouse (schema-on-write, T-SQL querying, dimensional models). Both access the same storage (OneLake, Delta format). Data engineers process in the lakehouse using Spark; the result is queryable from the warehouse using T-SQL — no data movement.

Databricks implements the same through Delta Lake: data in Delta format serves both Spark (engineering, ML) and SQL (BI, reporting) through SQL Warehouses. The architectural implication: don't design separate lake and warehouse — design a unified platform where the same data serves engineering, analytical, and AI workloads through different compute engines.

Making the Warehouse AI-Ready

Traditional warehouses serve BI. AI-ready warehouses also serve AI/ML workloads.

Feature serving: ML models consume features — calculated attributes from warehouse data (90-day purchase frequency, average order value, days since last interaction). The warehouse serves features through SQL for batch scoring or feature store integration for real-time inference. It becomes a feature computation engine — not just a BI engine.
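The feature examples above can be sketched in a few lines. This is a hypothetical illustration over in-memory records; in the warehouse the same logic would be a SQL aggregation over the orders fact table.

```python
from datetime import date, timedelta

# Hypothetical sketch: the three features named in the text, computed
# from sample order records (customer_id, order_date, amount).
orders = [
    ("C1", date(2024, 3, 1), 120.0),
    ("C1", date(2024, 2, 10), 80.0),
    ("C1", date(2023, 11, 5), 200.0),  # falls outside the 90-day window
]

def customer_features(customer_id: str, as_of: date) -> dict:
    window_start = as_of - timedelta(days=90)
    recent = [(d, a) for c, d, a in orders
              if c == customer_id and window_start <= d <= as_of]
    all_dates = [d for c, d, a in orders if c == customer_id and d <= as_of]
    return {
        "purchase_frequency_90d": len(recent),
        "avg_order_value_90d": (sum(a for _, a in recent) / len(recent)
                                if recent else 0.0),
        "days_since_last_interaction": ((as_of - max(all_dates)).days
                                        if all_dates else None),
    }

print(customer_features("C1", date(2024, 3, 15)))
```

Batch scoring runs this as a scheduled SQL job; a feature store serves the same computed values for real-time inference.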

Training data management: ML needs point-in-time correct data — features as they existed at prediction time, not today. The warehouse's SCD Type 2 dimensions provide this: query customer attributes as of March 15, 2024. Without point-in-time correctness, models train on future data (leakage) and produce artificially high accuracy that collapses in production.
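The point-in-time lookup against SCD Type 2 rows can be sketched as below. Column and value names are hypothetical; the logic is the standard `valid_from <= date <= valid_to` filter.

```python
from datetime import date

# SCD Type 2 rows: (customer_id, segment, valid_from, valid_to).
# The open-ended current row uses a far-future valid_to sentinel.
scd2_rows = [
    ("C1", "bronze", date(2023, 1, 1), date(2024, 2, 29)),
    ("C1", "gold",   date(2024, 3, 1), date(9999, 12, 31)),  # current version
]

def as_of(customer_id: str, when: date):
    """Return the attribute version valid on `when`, i.e.
    WHERE valid_from <= when AND when <= valid_to."""
    for cid, segment, valid_from, valid_to in scd2_rows:
        if cid == customer_id and valid_from <= when <= valid_to:
            return segment
    return None

print(as_of("C1", date(2024, 2, 15)))  # "bronze": what the model should train on
print(as_of("C1", date(2024, 3, 15)))  # "gold": using this for a Feb prediction would leak
```

Training sets built with `as_of(prediction_date)` rather than the current row are what keeps offline accuracy honest.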

Prediction storage: Model outputs (churn scores, demand forecasts, risk ratings) stored in the warehouse alongside historical actuals. Power BI shows both: actual revenue and predicted revenue. The warehouse serves backward-looking reporting AND forward-looking prediction from one platform.

AI-Ready = Analytics + ML From One Platform

An AI-ready warehouse serves BI and ML from the same data, governance, and quality standards. Separate platforms create: data duplication, governance gaps (ML data ungoverned while BI is governed), and consistency issues (different numbers from different copies). The unified platform eliminates these.

Migration Path: On-Premises to Cloud

Phase 1: Assessment (2-4 weeks)

Inventory: schema, volume, queries, ETL complexity, users, baselines. Determine: rehost, replatform, or refactor for lakehouse convergence. Produce migration plan with platform selection and timeline.

Phase 2: Foundation (4-6 weeks)

Deploy Fabric/Synapse/Snowflake. Implement zones. Migrate ETL to cloud-native (ADF, Fabric pipelines, dbt). Migrate 2-3 pilot subject areas.

Phase 3: Migration (8-12 weeks)

Remaining subject areas in waves. Convert ETL to ELT. Validate: results match, performance meets baseline, concurrency acceptable. Cut over BI tools.

Phase 4: Optimization (4-6 weeks)

Right-size compute. Auto-scaling for variable loads. Materialized views. Near-real-time freshness (CDC replacing batch). Decommission on-premises after 30-day parallel run.

Cost Model: CapEx to OpEx

| Component | On-Premises | Cloud |
| --- | --- | --- |
| Hardware | $500K-2M every 3-5 years | $0 (consumed as a service) |
| Storage | $500-1,000/TB/year (SAN) | $20-50/TB/month |
| Compute | Included in hardware (fixed) | $5-25/hour per unit (elastic) |
| DBA/Operations | 1-2 FTE ($150-300K/year) | 0.5 FTE (managed service) |
| Licensing | $100K-500K/year | Included (consumption-based) |

A typical $800K/year on-premises estate migrates to $400-600K/year in the cloud with 3-5x more capacity. The savings come from eliminating hardware CapEx, reducing operations staff, and right-sizing compute. The advantage grows as optimization matures (auto-scaling, reserved instances, off-hours shutdown).
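A rough annual comparison using mid-range figures from the cost table above. Every rate and volume here is an assumption chosen for illustration, not a quote; swap in your own inventory numbers from the assessment phase.

```python
# Back-of-envelope annual cost model; all defaults are assumed mid-range
# values from the cost table, not actual vendor pricing.

def on_prem_annual(hardware=1_000_000, refresh_years=4, storage_tb=50,
                   storage_rate=750, ops_fte=1.5, fte_cost=200_000,
                   licensing=250_000):
    # Hardware amortized over its refresh cycle, plus SAN storage,
    # operations staffing, and licensing.
    return (hardware / refresh_years + storage_tb * storage_rate
            + ops_fte * fte_cost + licensing)

def cloud_annual(storage_tb=50, storage_rate_month=35, compute_hours=20_000,
                 compute_rate=15, ops_fte=0.5, fte_cost=200_000):
    # Consumption storage and compute, plus the reduced ops footprint;
    # licensing is folded into consumption.
    return (storage_tb * storage_rate_month * 12
            + compute_hours * compute_rate + ops_fte * fte_cost)

print(round(on_prem_annual()))  # ~ $837,500/year
print(round(cloud_annual()))    # ~ $421,000/year
```

With these assumptions the model lands near the $800K → $400-600K pattern described above; the cloud figure falls further as auto-scaling and off-hours shutdown cut `compute_hours`.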

Platform Selection for Cloud Warehousing

Microsoft ecosystem → Fabric Warehouse. Unified with lakehouse, Power BI, OneLake. T-SQL compatibility eases SQL Server migration.

Multi-cloud / flexibility → Snowflake. Runs on Azure, AWS, GCP. Independent warehouses for workload isolation. Most mature storage/compute separation.

Google Cloud → BigQuery. Serverless, zero infrastructure. Pay per query or flat-rate. Automatic optimization.

Databricks-centric → Databricks SQL. SQL layer on Delta Lake. Unified with Spark for engineering and ML.

Real-Time Cloud Warehousing: From Batch to Streaming

Legacy warehouses refresh nightly, so users see yesterday's data. Cloud warehouses enable near-real-time freshness through two patterns:

Change Data Capture (CDC): capture source changes as they happen (Debezium for databases, Kafka/Event Hubs for event streams) and apply them incrementally to the warehouse. Latency: seconds to minutes instead of hours.

Micro-batch loading: instead of one nightly load, run smaller loads every 15-30 minutes. Each load processes only changed records, identified by a last_modified timestamp or CDC.

Fabric supports both patterns: Eventstream for real-time ingestion, pipeline scheduling for micro-batch. The business impact: finance sees today's revenue at 2 PM (not tomorrow at 8 AM), operations monitors inventory in near-real-time (not from last night's snapshot), and customer service sees the order placed 5 minutes ago (not the one placed before last night's batch).
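The micro-batch pattern reduces to a watermark query: each run picks up only rows modified since the previous run's high-water mark. A minimal sketch, with hypothetical table and column names:

```python
from datetime import datetime

# Sample source rows with the last_modified column the text describes.
source = [
    {"order_id": 1, "last_modified": datetime(2024, 5, 1, 9, 0)},
    {"order_id": 2, "last_modified": datetime(2024, 5, 1, 9, 20)},
    {"order_id": 3, "last_modified": datetime(2024, 5, 1, 9, 40)},
]

def run_micro_batch(watermark: datetime):
    """Equivalent to: SELECT * FROM orders WHERE last_modified > @watermark.
    Returns the changed rows and the new watermark to persist for the next run."""
    changed = [r for r in source if r["last_modified"] > watermark]
    new_watermark = max((r["last_modified"] for r in changed), default=watermark)
    return changed, new_watermark

batch, wm = run_micro_batch(datetime(2024, 5, 1, 9, 10))
print([r["order_id"] for r in batch])  # only orders 2 and 3 are loaded
```

CDC replaces the timestamp filter with the source's change log, which also captures deletes that a last_modified column misses.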

Multi-Temperature Storage: Hot, Warm, Cold

Not all warehouse data needs the same performance tier. Multi-temperature storage assigns data to tiers by access frequency:

Hot (last 3 months): premium storage, fully indexed, materialized views. Maximum query performance for current operational analytics.

Warm (3-24 months): standard storage, partitioned by month. Good performance for historical analysis at lower cost.

Cold (24+ months): archive storage (Azure Cool/Archive tier). Rarely accessed but available for regulatory compliance and historical deep-dives.

Automatic tiering rules move data between temperatures based on last access date: Hot → Warm after 90 days without access, Warm → Cold after 24 months. This reduces storage costs by 40-60% compared to keeping everything on premium storage, which is significant at 50TB+ scale.
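The tiering rule above is a simple function of idle time. A sketch, assuming the thresholds from the text (90 days and 24 months, approximated as 730 days):

```python
from datetime import date

def storage_tier(last_access: date, today: date) -> str:
    """Assign a temperature tier from days since last access:
    Hot -> Warm after 90 idle days, Warm -> Cold after ~24 months."""
    idle_days = (today - last_access).days
    if idle_days <= 90:
        return "hot"    # premium storage, fully indexed
    if idle_days <= 730:
        return "warm"   # standard storage, partitioned by month
    return "cold"       # archive tier

today = date(2024, 6, 1)
print(storage_tier(date(2024, 5, 20), today))  # hot
print(storage_tier(date(2024, 1, 1), today))   # warm
print(storage_tier(date(2022, 1, 1), today))   # cold
```

In practice a scheduled job evaluates this rule per partition and issues the tier-change operation (e.g., moving a blob to Azure Cool/Archive).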

Disaster Recovery for Cloud Warehouses

Cloud warehouse DR strategies differ from on-premises, where DR meant maintaining a secondary datacenter.

Geo-redundant storage: Azure and AWS replicate storage to a paired region automatically. Data survives regional outages without manual intervention.

Compute failover: in a regional outage, provision compute in the paired region and point it at the geo-replicated storage. Fabric: geo-redundant OneLake provides automatic storage DR, with compute re-provisioned in the failover region. Snowflake: database replication to a secondary region with automatic failover.

Recovery Time Objective (RTO): 1-4 hours for compute failover. Recovery Point Objective (RPO): near-zero for storage (continuous replication). Cloud DR costs a fraction of on-premises because there is no secondary hardware to maintain; DR becomes a configuration setting, not a datacenter investment.

Data Warehouse as a Service: Managed vs Self-Managed

Cloud warehouses range from fully managed (BigQuery: zero infrastructure decisions) to self-managed (PostgreSQL on VMs: full control, full responsibility). The spectrum:

Fully managed (BigQuery, Fabric Warehouse): the provider handles compute scaling, storage optimization, indexing, maintenance, and upgrades. You write SQL and manage data. Operational overhead: minimal. Control: limited (no engine-level tuning).

Semi-managed (Snowflake, Synapse Dedicated): you control compute sizing, clustering keys, and warehouse scheduling; the provider handles infrastructure, patching, and HA. Operational overhead: moderate. Control: substantial.

Self-managed (PostgreSQL/MySQL on VMs, Greenplum): you control everything; the provider handles only VM infrastructure. Operational overhead: high (DBA required). Control: complete.

For most enterprises, semi-managed provides the right balance: enough control to optimize for workload patterns, managed enough to avoid DBA staffing for infrastructure tasks. Fully managed suits organizations that prioritize zero operational overhead over fine-grained control.

Security Architecture for Cloud Warehouses

Cloud warehouse security operates at four levels:

Network security: private endpoints and VNet integration make the warehouse accessible only from the corporate network, not the public internet.

Authentication: Entra ID integration. No local database accounts; all access goes through centralized identity with MFA.

Authorization: RBAC for schema access plus row-level security for data-level restrictions. The Sales team sees only their region's data; Finance sees all.

Encryption: TDE for data at rest, TLS for data in transit, and customer-managed keys in Key Vault for organizations requiring key control.

Column-level security masks sensitive columns (SSN, salary) from unauthorized roles: the column exists in the schema but returns null or masked values for users without the required permission. Audit logging captures every query, every login, and every permission change, providing the evidence trail that SOX, HIPAA, and PCI audits require.
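The column-masking behavior can be illustrated with a small sketch. Role names, the masked-column set, and the masking value are all assumptions for illustration, not any engine's actual syntax (engines like SQL Server and Fabric express this as dynamic data masking rules on the column definition).

```python
# Illustrative column-level masking: the column stays in the result set,
# but unprivileged roles see a masked value instead of the real one.
# Roles and column names here are hypothetical.

MASKED_COLUMNS = {"ssn", "salary"}
PRIVILEGED_ROLES = {"hr_admin", "auditor"}

def apply_column_masking(row: dict, role: str) -> dict:
    """Return the row as the given role is allowed to see it."""
    if role in PRIVILEGED_ROLES:
        return dict(row)  # full visibility
    return {col: ("***MASKED***" if col in MASKED_COLUMNS else val)
            for col, val in row.items()}

row = {"employee_id": 42, "name": "Avery",
       "ssn": "123-45-6789", "salary": 95_000}
print(apply_column_masking(row, "analyst"))   # ssn and salary masked
print(apply_column_masking(row, "hr_admin"))  # full row
```

The key property: queries from unprivileged roles still succeed and return the column, so reports don't break, but the sensitive values never leave the engine.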

The Xylity Approach

We migrate and modernize warehouses with cloud-native architecture — zone design, lakehouse convergence, AI-ready serving, and cost optimization. Our data architects, data engineers, and Fabric architects assess your warehouse, design cloud architecture, execute phased migration, and optimize — delivering a warehouse that serves BI, AI, and real-time workloads from a unified platform.


Warehouse Architecture for the AI Era

Cloud-native, AI-ready, lakehouse-converged. Warehouse architecture that scales from 1TB to 100TB and serves BI and ML from one platform.

Start Your Warehouse Modernization →