ISV Architecture Decisions: Day-One Choices
Three decisions made on day one are expensive to change later: the tenancy model (shared infrastructure vs isolated infrastructure per tenant, which determines cost structure and security posture), the data isolation strategy (shared database with a tenant column vs separate schema per tenant vs separate database per tenant, which determines data security, query complexity, and migration difficulty), and the deployment model (single deployment serving all tenants vs regional deployments vs customer-managed deployment, which determines latency, compliance, and operational complexity). These decisions compound: a shared-database, single-deployment architecture handles 10,000 tenants efficiently but requires rigorous tenant isolation at the application layer. A separate-database, per-tenant-deployment architecture provides strong isolation but can cost roughly 20-100x more per tenant, a gap that is hard to sustain at 10,000 tenants without premium pricing. Choose the model that matches your target market: SMB customers accept shared infrastructure because it is cost-efficient, while enterprise customers require stronger isolation and pay premium pricing that supports the cost.
Multi-Tenancy Patterns: Shared vs Isolated
| Pattern | Infrastructure | Database | Cost/Tenant | Isolation | Best For |
|---|---|---|---|---|---|
| Fully shared | Shared compute | Shared DB, tenant column | Lowest ($1-5/mo) | Application-level | SMB SaaS, high volume |
| Shared compute, isolated DB | Shared compute | Separate DB per tenant | Medium ($10-50/mo) | Database-level | Mid-market, regulated |
| Isolated deployment | Separate per tenant | Separate per tenant | Highest ($100+/mo) | Infrastructure-level | Enterprise, max security |
| Hybrid (tiered) | Shared for SMB, isolated for Enterprise | Matches compute tier | Varies by tier | Tier-dependent | Multi-segment SaaS |
The hybrid model is the most practical for ISVs serving both SMB and Enterprise: SMB tenants share infrastructure (low cost, high density), Enterprise tenants get isolated infrastructure (higher cost, premium pricing). The application code is identical — the deployment configuration determines isolation level. Kubernetes with namespace-per-tenant (shared) or cluster-per-tenant (isolated) provides the infrastructure flexibility for hybrid tenancy.
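As a rough illustration of "identical code, different deployment configuration", the sketch below maps a tenant's tier to a placement. The tier names, the `shared-prod` cluster name, and the naming scheme are all assumptions made for this example, not part of any Kubernetes API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Placement:
    cluster: str     # Kubernetes cluster the tenant's workload runs in
    namespace: str   # namespace inside that cluster
    isolated: bool   # True when the tenant has the cluster to itself


SHARED_CLUSTER = "shared-prod"  # hypothetical cluster name


def resolve_placement(tenant_id: str, tier: str) -> Placement:
    """Pick namespace-per-tenant (shared) or cluster-per-tenant (isolated)."""
    if tier == "enterprise":
        # Isolated tier: a dedicated cluster named after the tenant.
        return Placement(cluster=f"tenant-{tenant_id}", namespace="app", isolated=True)
    if tier == "smb":
        # Shared tier: one namespace per tenant inside the shared cluster.
        return Placement(cluster=SHARED_CLUSTER, namespace=f"tenant-{tenant_id}", isolated=False)
    raise ValueError(f"unknown tier: {tier!r}")
```

A deployment pipeline could call `resolve_placement("acme", "smb")` and feed the result into its manifest templates, so the isolation level stays a configuration concern rather than an application one.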
Data Isolation: Protecting Tenant Data
Tenant data isolation is the #1 security requirement for multi-tenant SaaS. A tenant should never: see another tenant's data, affect another tenant's performance, or access another tenant's configuration. Isolation strategies:
Row-level isolation (shared database): Every table has a tenant_id column, and every query includes WHERE tenant_id = @current_tenant. The filter is enforced at the application framework level rather than written by hand in each query, because a single missed WHERE clause is a data breach. Risk: a code bug that omits the tenant filter exposes all tenants' data. Mitigation: automated testing that verifies tenant isolation on every query, plus database-level row-level security policies (PostgreSQL RLS, SQL Server row-level security) as defense-in-depth behind the application-level filtering.
Schema-level isolation (shared database, separate schemas): Each tenant has their own database schema (tenant_123.orders, tenant_456.orders). Queries don't need tenant filtering — the schema boundary provides isolation. Schema-level access control prevents cross-tenant access at the database level. More isolated than row-level, less expensive than separate databases. Good middle ground for 100-1,000 tenant deployments.
Database-level isolation (separate database per tenant): Each tenant has their own database instance. Maximum isolation — a code bug can't leak data across databases. Supports tenant-specific customization (different indexes, different retention policies). Cost: database licensing or managed service cost per tenant. Practical for: 10-500 tenants at premium pricing. Impractical for: 5,000+ tenants (management overhead exceeds value).
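For the row-level strategy, framework-level enforcement can be sketched as a data-access wrapper that is bound to one tenant at construction time, so callers never write the tenant filter themselves. This is a minimal sketch: the `TenantScopedOrders` class and the `orders` table are hypothetical, and an in-memory SQLite database stands in for the production database (database-level RLS policies are not shown).

```python
import sqlite3


class TenantScopedOrders:
    """Data-access wrapper that binds every statement to a single tenant."""

    def __init__(self, conn: sqlite3.Connection, tenant_id: str):
        if not tenant_id:
            raise ValueError("tenant_id is required")
        self._conn = conn
        self._tenant_id = tenant_id

    def add(self, item: str) -> None:
        # The tenant_id comes from the wrapper, never from the caller.
        self._conn.execute(
            "INSERT INTO orders (tenant_id, item) VALUES (?, ?)",
            (self._tenant_id, item),
        )

    def list_items(self) -> list[str]:
        # The WHERE clause is applied here, in one place, for every read.
        rows = self._conn.execute(
            "SELECT item FROM orders WHERE tenant_id = ? ORDER BY id",
            (self._tenant_id,),
        )
        return [row[0] for row in rows]


def make_demo_db() -> sqlite3.Connection:
    """Create an in-memory database with a shared, tenant-keyed table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE orders ("
        "id INTEGER PRIMARY KEY, tenant_id TEXT NOT NULL, item TEXT NOT NULL)"
    )
    return conn
```

An isolation test then only needs two wrappers over the same connection: rows added through one must never appear in the other's `list_items()` result.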
Billing and Subscription Management
SaaS billing components: subscription management (plan selection, upgrades, downgrades, cancellation — Stripe Billing, Chargebee, or custom), usage metering (for usage-based pricing: API calls, storage consumed, users active, compute hours — metered and reported to the billing system), invoice generation (automated monthly/annual invoices with: subscription charges + usage charges + taxes), and payment processing (credit card, ACH, wire transfer — PCI-compliant payment handling through Stripe, Braintree, or Adyen). For Azure Marketplace distribution: marketplace handles billing and payment — Microsoft collects from the customer and pays the ISV (minus marketplace commission). This simplifies: sales tax, international payments, and enterprise procurement (customers buy through their existing Microsoft agreement).
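The invoice arithmetic (subscription charges + usage charges + taxes) can be sketched as follows. The meter names, unit prices, and the single flat tax rate are assumptions for illustration; real tax calculation varies by jurisdiction and is usually delegated to the billing provider.

```python
from decimal import Decimal, ROUND_HALF_UP

CENT = Decimal("0.01")


def invoice_total(
    subscription: Decimal,
    usage: dict[str, int],
    unit_prices: dict[str, Decimal],
    tax_rate: Decimal,
) -> dict[str, Decimal]:
    """Combine a flat subscription charge with metered usage and tax."""
    # Usage charge: quantity on each meter times that meter's unit price.
    usage_charge = sum(
        (unit_prices[meter] * qty for meter, qty in usage.items()),
        Decimal("0"),
    ).quantize(CENT, rounding=ROUND_HALF_UP)
    subtotal = subscription + usage_charge
    tax = (subtotal * tax_rate).quantize(CENT, rounding=ROUND_HALF_UP)
    return {
        "subscription": subscription,
        "usage": usage_charge,
        "tax": tax,
        "total": subtotal + tax,
    }
```

`Decimal` is used instead of `float` so that currency amounts round predictably to the cent.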
Azure Marketplace: Distribution at Scale
Azure Marketplace provides: distribution (listing visible to millions of Azure customers — product discoverable through marketplace search and Microsoft seller recommendations), co-sell (Microsoft sales teams incentivized to sell marketplace solutions — your product recommended alongside Azure services), billing integration (customers pay through their Azure commitment — no separate procurement process for enterprise customers with Azure MACC), metered billing API (report custom usage meters to marketplace — marketplace handles billing based on reported usage), and transactable offers (SaaS, managed application, or container offer types — each with different deployment and billing models). Marketplace listing requirements: security review, technical validation, and marketing assets. Timeline: 4-8 weeks from application to live listing.
SaaS Security Architecture
SaaS security spans: identity (customer SSO via SAML/OIDC — customers authenticate with their own identity provider. Azure AD B2C for consumer SaaS. Multi-factor authentication for all administrative access), network (WAF for web application protection, DDoS mitigation, TLS 1.2+ for all communications, private endpoints for database access), data (encryption at rest with customer-managed keys for enterprise tenants, encryption in transit, data residency compliance for multi-region deployment), application (OWASP Top 10 mitigation, dependency scanning, container image scanning, secrets management via Key Vault), and compliance (SOC 2 Type II certification for enterprise sales, HIPAA BAA for healthcare, GDPR compliance for European customers). SOC 2 certification: budget 6-9 months and $50-100K for first-time certification. Required for: enterprise B2B SaaS sales (procurement teams won't approve vendors without SOC 2).
Scaling: From 10 to 10,000 Tenants
| Scale | Architecture | Key Challenges |
|---|---|---|
| 1-100 tenants | Single deployment, shared DB | Product-market fit, feature velocity |
| 100-1,000 | Regional deployments, schema isolation | Noisy neighbor, onboarding automation |
| 1,000-10,000 | Full multi-region, tiered tenancy | Operational complexity, cost optimization |
| 10,000+ | Global distribution, sharded everything | Massive scale operations, billing at volume |
SaaS Operations and SRE
SaaS uptime is the product's credibility: SLA targets (99.9% for standard tenants = 43 minutes downtime/month. 99.95% for enterprise = 22 minutes. 99.99% for critical = 4.3 minutes), monitoring (per-tenant health dashboards, aggregate platform health, synthetic monitoring for availability, real-user monitoring for performance), incident management (on-call rotation, runbooks for common incidents, post-incident reviews, customer communication during outages), and deployment practices (blue-green or canary deployments — zero-downtime. Feature flags for gradual rollout. Automated rollback if health checks fail post-deployment). SRE team sizing: 1 SRE per 50-100 tenants for hands-on operations. At 1,000+ tenants: automation must handle 90%+ of operations — SRE focus shifts from manual tasks to building automation.
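The downtime figures above follow directly from the SLA percentage. A small sketch of the arithmetic, assuming a 30-day month (the function name is ours):

```python
def downtime_budget_minutes(sla_percent: float, days: float = 30.0) -> float:
    """Minutes of allowed downtime in a period for a given availability SLA."""
    total_minutes = days * 24 * 60
    return total_minutes * (1 - sla_percent / 100)


if __name__ == "__main__":
    for sla in (99.9, 99.95, 99.99):
        print(f"{sla}% -> {downtime_budget_minutes(sla):.1f} minutes/month")
```

For a 30-day month this gives roughly 43.2, 21.6, and 4.3 minutes, which matches the rounded targets quoted above.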
Onboarding Automation: From Sign-Up to Active Tenant in Minutes
Manual tenant onboarding (create database, configure permissions, seed data, send welcome email) takes 1-4 hours and requires engineering involvement. At 100+ tenants: bottleneck. Automated onboarding: customer signs up, payment processed, provisioning pipeline triggered, database/schema created, default configuration applied, admin user created, welcome email sent, tenant active. Total: 2-5 minutes with zero engineering involvement. Components: subscription webhook (marketplace or Stripe), tenant provisioning service (creates database resources, applies schema migrations), configuration service (tier-specific settings: storage limits, user limits, feature flags), and notification service (welcome email with login instructions). The provisioning pipeline determines whether the product scales from 50 to 5,000 tenants without proportionally scaling operations.
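The provisioning flow can be sketched as an ordered list of steps that records progress, so a failed run can be retried without repeating completed work. Everything here is illustrative: the step bodies are placeholders, and the `TIER_LIMITS` values other than the 50GB standard storage quota are assumptions.

```python
from dataclasses import dataclass, field
from typing import Callable

# Hypothetical tier-specific settings applied by the configuration step.
TIER_LIMITS = {
    "standard": {"storage_gb": 50, "max_users": 25},
    "enterprise": {"storage_gb": 500, "max_users": 1000},
}


@dataclass
class Tenant:
    tenant_id: str
    tier: str
    admin_email: str
    status: str = "pending"
    completed_steps: list[str] = field(default_factory=list)
    settings: dict[str, int] = field(default_factory=dict)


def create_database(tenant: Tenant) -> None:
    pass  # placeholder: create the database or schema, run migrations


def apply_configuration(tenant: Tenant) -> None:
    tenant.settings = dict(TIER_LIMITS[tenant.tier])  # KeyError on unknown tier


def create_admin_user(tenant: Tenant) -> None:
    pass  # placeholder: create the first admin account


def send_welcome_email(tenant: Tenant) -> None:
    pass  # placeholder: hand off to the notification service


Step = Callable[[Tenant], None]
PIPELINE: list[Step] = [
    create_database,
    apply_configuration,
    create_admin_user,
    send_welcome_email,
]


def provision(tenant: Tenant, steps: list[Step] = PIPELINE) -> Tenant:
    """Run each step in order, skipping steps finished on an earlier attempt."""
    for step in steps:
        if step.__name__ in tenant.completed_steps:
            continue
        try:
            step(tenant)
        except Exception:
            tenant.status = "failed"  # leave a clear state for retry or alerting
            raise
        tenant.completed_steps.append(step.__name__)
    tenant.status = "active"
    return tenant
```

In practice the subscription webhook would construct the `Tenant` record and enqueue `provision`, with `completed_steps` persisted so that a retry resumes at the failed step.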
SaaS Metrics: What to Track From Day One
| Category | Metrics | Why |
|---|---|---|
| Revenue | MRR, ARR, ARPU, expansion revenue, churn revenue | Business health and growth |
| Usage | DAU, WAU, feature adoption, API call volume | Product-market fit |
| Retention | Logo churn, net revenue retention, cohort retention | Customer satisfaction |
| Operations | Uptime, P95 latency, error rate, deploy frequency | Platform reliability |
| Cost | COGS per tenant, infra cost per $1 revenue | Unit economics |
Build the metrics dashboard before launch. The metrics inform: pricing decisions (is ARPU sustainable?), product decisions (which features drive retention?), and infrastructure decisions (at what tenant count do we need to re-architect?).
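Several of these metrics reduce to simple ratios. The sketch below uses the commonly cited definitions of net revenue retention, logo churn, and ARPU; the function names and the choice of a monthly period are ours.

```python
def net_revenue_retention(
    starting_mrr: float, expansion: float, contraction: float, churned: float
) -> float:
    """NRR for a period: revenue kept and grown from customers present at the start."""
    if starting_mrr <= 0:
        raise ValueError("starting_mrr must be positive")
    return (starting_mrr + expansion - contraction - churned) / starting_mrr


def logo_churn_rate(customers_at_start: int, customers_lost: int) -> float:
    """Fraction of customers (logos) lost during the period."""
    if customers_at_start <= 0:
        raise ValueError("customers_at_start must be positive")
    return customers_lost / customers_at_start


def arpu(mrr: float, active_tenants: int) -> float:
    """Average revenue per tenant for the period."""
    if active_tenants <= 0:
        raise ValueError("active_tenants must be positive")
    return mrr / active_tenants
```

An NRR above 1.0 means expansion revenue from existing customers outweighed downgrades and cancellations in that period.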
Noisy Neighbor Problem: Preventing Tenant Interference
The noisy neighbor problem: one tenant consumes disproportionate resources (CPU, database connections, API rate) and degrades performance for all other tenants sharing the same infrastructure. Prevention strategies: resource quotas per tenant (API rate limiting: 100 requests/second per tenant. Database connection limits: 20 connections per tenant. Storage quotas: 50GB per tenant on the standard plan. Exceeding a quota returns 429 Too Many Requests to the offending tenant rather than degrading service for the others), workload isolation (heavy operations like report generation and data exports run on separate compute from real-time request serving, so a tenant generating a 10GB export does not affect other tenants' response times), queue-based processing (batch operations are queued with per-tenant fair scheduling, so tenant A submitting 1,000 records for processing does not block tenant B's single-record submission), and auto-scaling that responds to tenant-specific load (if one tenant drives a traffic spike, the auto-scaler adds capacity before other tenants experience latency). Monitor per-tenant resource consumption continuously: the noisy neighbor appears without warning, and the first symptom is usually "why is the application slow for everyone?"
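A per-tenant API quota is often implemented as a token bucket. The sketch below is one possible in-process version: the class name is hypothetical, the default of 100 requests/second mirrors the quota above, and a production deployment would typically keep the buckets in a shared store so that all API instances see the same counts.

```python
import time
from typing import Callable


class TenantRateLimiter:
    """Token bucket per tenant: each tenant refills at rate_per_sec up to burst."""

    def __init__(
        self,
        rate_per_sec: float = 100.0,
        burst: float = 100.0,
        clock: Callable[[], float] = time.monotonic,
    ):
        self._rate = rate_per_sec
        self._burst = burst
        self._clock = clock
        # tenant_id -> (tokens remaining, time of last refill)
        self._buckets: dict[str, tuple[float, float]] = {}

    def check(self, tenant_id: str) -> int:
        """Return the HTTP status for one request: 200 if allowed, 429 if over quota."""
        now = self._clock()
        tokens, last = self._buckets.get(tenant_id, (self._burst, now))
        # Refill in proportion to elapsed time, capped at the burst size.
        tokens = min(self._burst, tokens + (now - last) * self._rate)
        if tokens >= 1.0:
            self._buckets[tenant_id] = (tokens - 1.0, now)
            return 200
        self._buckets[tenant_id] = (tokens, now)
        return 429
```

Because each tenant has its own bucket, one tenant exhausting its quota has no effect on the status another tenant receives.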
Data Residency and Compliance for Multi-Region SaaS
Enterprise SaaS customers increasingly require data residency guarantees: EU customers need GDPR-compliant data storage in EU regions. Healthcare customers need HIPAA-compliant environments. Government customers need sovereign cloud. Multi-region SaaS architecture: regional deployments (separate infrastructure stacks in: US East, EU West, Australia East — tenant data stored and processed in their selected region), global control plane (tenant provisioning, billing, and configuration managed from a single control plane — operational data centralized, customer data regional), data sovereignty compliance (tenant data never leaves the designated region — verified through: Azure Policy region restrictions, network rules preventing cross-region data transfer, and audit logging of data access), and regional pricing (infrastructure costs vary by region — EU West is 10-15% more expensive than US East. Build regional cost differences into pricing or absorb as COGS).
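At the application layer, the split between a global control plane and regional data planes can be backed by a simple guard: the control plane records each tenant's home region, and every regional stack refuses to serve a tenant that belongs elsewhere. This is only a sketch under assumed names (the region identifiers, `TenantRecord`, and `ResidencyViolation` are hypothetical), and it complements rather than replaces policy and network controls.

```python
from dataclasses import dataclass

# Hypothetical region identifiers for the three stacks described above.
ALLOWED_REGIONS = {"us-east", "eu-west", "australia-east"}


class ResidencyViolation(Exception):
    """Raised when tenant data would be handled outside its home region."""


@dataclass(frozen=True)
class TenantRecord:
    tenant_id: str
    home_region: str  # chosen at sign-up, stored in the control plane


def register_tenant(tenant_id: str, region: str) -> TenantRecord:
    """Control-plane step: pin the tenant to one supported region."""
    if region not in ALLOWED_REGIONS:
        raise ValueError(f"unsupported region: {region!r}")
    return TenantRecord(tenant_id=tenant_id, home_region=region)


def check_access(tenant: TenantRecord, serving_region: str) -> None:
    """Data-plane guard: call before reading or writing tenant data."""
    if serving_region != tenant.home_region:
        raise ResidencyViolation(
            f"tenant {tenant.tenant_id} is pinned to {tenant.home_region}, "
            f"not {serving_region}"
        )
```

Logging every `ResidencyViolation` gives the audit trail a concrete signal to alert on.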
The Xylity Approach
We build ISV SaaS products on a multi-tenant architecture: a right-sized tenancy model (shared, isolated, or hybrid), data isolation (row-level + database-level defense-in-depth), Azure Marketplace distribution, and SRE-grade operations. Our cloud architects, application developers, and DevOps engineers build SaaS products that scale from 10 to 10,000 tenants, with the security, billing, and operational practices that enterprise customers require.