ISV Architecture Decisions: Day-One Choices

Three decisions made on day one are expensive to change later: tenancy model (shared infrastructure vs isolated infrastructure per tenant; determines cost structure and security posture), data isolation strategy (shared database with tenant column vs separate schema per tenant vs separate database per tenant; determines data security, query complexity, and migration difficulty), and deployment model (single deployment serving all tenants vs regional deployments vs customer-managed deployment; determines latency, compliance, and operational complexity). These decisions compound: a shared-database, single-deployment architecture handles 10,000 tenants efficiently but requires rigorous tenant isolation at the application layer. A separate-database, per-tenant-deployment architecture provides strong isolation but can cost up to roughly 100x more per tenant (on the order of $100+/mo versus $1-5/mo), which adds up quickly at 10,000 tenants. Choose the model that matches your target market: SMB customers accept shared infrastructure (cost-efficient), while enterprise customers require stronger isolation (premium pricing supports the cost).

SaaS architecture decisions made on day one determine the product's cost structure, security posture, and scalability ceiling for years. Choose your tenancy model based on where you'll be at 10,000 tenants — not where you are at 10.

Multi-Tenancy Patterns: Shared vs Isolated

| Pattern | Infrastructure | Database | Cost/Tenant | Isolation | Best For |
|---|---|---|---|---|---|
| Fully shared | Shared compute | Shared DB, tenant column | Lowest ($1-5/mo) | Application-level | SMB SaaS, high volume |
| Shared compute, isolated DB | Shared compute | Separate DB per tenant | Medium ($10-50/mo) | Database-level | Mid-market, regulated |
| Isolated deployment | Separate per tenant | Separate per tenant | Highest ($100+/mo) | Infrastructure-level | Enterprise, max security |
| Hybrid (tiered) | Shared for SMB, isolated for Enterprise | Matches compute tier | Varies by tier | Tier-dependent | Multi-segment SaaS |

The hybrid model is the most practical for ISVs serving both SMB and Enterprise: SMB tenants share infrastructure (low cost, high density), Enterprise tenants get isolated infrastructure (higher cost, premium pricing). The application code is identical — the deployment configuration determines isolation level. Kubernetes with namespace-per-tenant (shared) or cluster-per-tenant (isolated) provides the infrastructure flexibility for hybrid tenancy.
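As a rough illustration of "the deployment configuration determines isolation level", the sketch below maps a tenant's plan to a placement. The plan names, the `Placement` type, and the target naming scheme are hypothetical and only stand in for whatever a real deployment pipeline would use.

```python
from dataclasses import dataclass
from enum import Enum


class Isolation(Enum):
    SHARED = "namespace-per-tenant"   # shared cluster, one namespace per tenant
    ISOLATED = "cluster-per-tenant"   # dedicated cluster per tenant


@dataclass(frozen=True)
class Placement:
    tenant_id: str
    isolation: Isolation
    target: str  # hypothetical cluster/namespace identifier


# Hypothetical plan names; the article only distinguishes SMB from Enterprise.
PLAN_ISOLATION = {
    "smb": Isolation.SHARED,
    "enterprise": Isolation.ISOLATED,
}


def place_tenant(tenant_id: str, plan: str) -> Placement:
    """Pick a deployment target from the plan; application code stays the same."""
    isolation = PLAN_ISOLATION[plan]  # raises KeyError for an unknown plan
    if isolation is Isolation.SHARED:
        target = f"shared-cluster/ns-{tenant_id}"
    else:
        target = f"cluster-{tenant_id}/ns-default"
    return Placement(tenant_id, isolation, target)
```

Keeping the tier decision in one lookup table means moving a tenant from shared to isolated infrastructure is a configuration change plus a data migration, not a code fork.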

Data Isolation: Protecting Tenant Data

Tenant data isolation is the #1 security requirement for multi-tenant SaaS. A tenant should never: see another tenant's data, affect another tenant's performance, or access another tenant's configuration. Isolation strategies:

Row-level isolation (shared database): Every table has a tenant_id column. Every query includes WHERE tenant_id = @current_tenant. Enforced at the application framework level (not per-query — a missed WHERE clause is a data breach). Additional defense: database-level row-level security policies (PostgreSQL RLS, SQL Server row-level security) as a backup to application-level filtering. Risk: a code bug that omits the tenant filter exposes all tenants' data. Mitigation: automated testing that verifies tenant isolation on every query, plus database-level RLS as defense-in-depth.
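One way to enforce the tenant filter at the framework level rather than per query is a data-access object that binds the tenant once. The sketch below uses Python's built-in sqlite3 purely for illustration; the `TenantScopedOrders` class and the `orders` table layout are assumptions, and a production system would add database-level RLS on top.

```python
import sqlite3


class TenantScopedOrders:
    """Data-access object that binds the tenant once, so callers cannot omit it."""

    def __init__(self, conn: sqlite3.Connection, tenant_id: str):
        self._conn = conn
        self._tenant_id = tenant_id

    def add(self, item: str) -> None:
        self._conn.execute(
            "INSERT INTO orders (tenant_id, item) VALUES (?, ?)",
            (self._tenant_id, item),
        )

    def list_items(self) -> list[str]:
        # The tenant filter lives here, not in calling code.
        rows = self._conn.execute(
            "SELECT item FROM orders WHERE tenant_id = ? ORDER BY id",
            (self._tenant_id,),
        )
        return [row[0] for row in rows]


def make_db() -> sqlite3.Connection:
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE orders ("
        "id INTEGER PRIMARY KEY, tenant_id TEXT NOT NULL, item TEXT NOT NULL)"
    )
    return conn


conn = make_db()
TenantScopedOrders(conn, "tenant_123").add("widget")
TenantScopedOrders(conn, "tenant_456").add("gadget")

# Each tenant sees only its own rows.
print(TenantScopedOrders(conn, "tenant_123").list_items())
print(TenantScopedOrders(conn, "tenant_456").list_items())
```

The same two-tenant setup doubles as the automated isolation test mentioned above: seed rows for two tenants, then assert that each scoped object returns only its own.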

Schema-level isolation (shared database, separate schemas): Each tenant has their own database schema (tenant_123.orders, tenant_456.orders). Queries don't need tenant filtering — the schema boundary provides isolation. Schema-level access control prevents cross-tenant access at the database level. More isolated than row-level, less expensive than separate databases. Good middle ground for 100-1,000 tenant deployments.
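With schema-level isolation, the tenant appears in the table name rather than in a WHERE clause, and identifiers cannot be passed as ordinary query parameters. A minimal sketch of building the qualified name safely follows; the numeric tenant-id format is inferred from the tenant_123 example, and the allow-list of tables is hypothetical.

```python
import re

# Assumption: tenant ids are numeric, as in tenant_123 and tenant_456.
_TENANT_ID = re.compile(r"[0-9]+")
# Hypothetical allow-list of tables that exist in every tenant schema.
_TABLES = frozenset({"orders"})


def qualified_table(tenant_id: str, table: str) -> str:
    """Return a schema-qualified table name such as tenant_123.orders."""
    if not _TENANT_ID.fullmatch(tenant_id):
        raise ValueError(f"invalid tenant id: {tenant_id!r}")
    if table not in _TABLES:
        raise ValueError(f"unknown table: {table!r}")
    return f"tenant_{tenant_id}.{table}"
```

Validating both parts before string formatting keeps a malformed tenant id from turning into SQL injection; schema-level access control in the database remains the actual isolation boundary.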

Database-level isolation (separate database per tenant): Each tenant has their own database instance. Maximum isolation — a code bug can't leak data across databases. Supports tenant-specific customization (different indexes, different retention policies). Cost: database licensing or managed service cost per tenant. Practical for: 10-500 tenants at premium pricing. Impractical for: 5,000+ tenants (management overhead exceeds value).

Billing and Subscription Management

SaaS billing components: subscription management (plan selection, upgrades, downgrades, cancellation — Stripe Billing, Chargebee, or custom), usage metering (for usage-based pricing: API calls, storage consumed, users active, compute hours — metered and reported to the billing system), invoice generation (automated monthly/annual invoices with: subscription charges + usage charges + taxes), and payment processing (credit card, ACH, wire transfer — PCI-compliant payment handling through Stripe, Braintree, or Adyen). For Azure Marketplace distribution: marketplace handles billing and payment — Microsoft collects from the customer and pays the ISV (minus marketplace commission). This simplifies: sales tax, international payments, and enterprise procurement (customers buy through their existing Microsoft agreement).
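The invoice arithmetic (subscription charges + usage charges + taxes) can be sketched as below. The prices, the flat tax rate, and the function name are illustrative assumptions; real tax calculation varies by jurisdiction and is normally delegated to the billing provider.

```python
from decimal import Decimal, ROUND_HALF_UP

CENT = Decimal("0.01")


def invoice_total(
    subscription: Decimal,
    usage_units: int,
    unit_price: Decimal,
    tax_rate: Decimal,
) -> Decimal:
    """Subscription + metered usage + tax, rounded to whole cents."""
    usage_charge = unit_price * usage_units
    subtotal = subscription + usage_charge
    tax = (subtotal * tax_rate).quantize(CENT, rounding=ROUND_HALF_UP)
    return (subtotal + tax).quantize(CENT, rounding=ROUND_HALF_UP)


# Hypothetical month: $99.00 plan, 12,500 API calls at $0.002 each, 8% tax.
total = invoice_total(Decimal("99.00"), 12_500, Decimal("0.002"), Decimal("0.08"))
print(total)
```

Decimal is used instead of float so that cent-level rounding is explicit and reproducible.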

Azure Marketplace: Distribution at Scale

Azure Marketplace provides: distribution (listing visible to millions of Azure customers — product discoverable through marketplace search and Microsoft seller recommendations), co-sell (Microsoft sales teams incentivized to sell marketplace solutions — your product recommended alongside Azure services), billing integration (customers pay through their Azure commitment — no separate procurement process for enterprise customers with Azure MACC), metered billing API (report custom usage meters to marketplace — marketplace handles billing based on reported usage), and transactable offers (SaaS, managed application, or container offer types — each with different deployment and billing models). Marketplace listing requirements: security review, technical validation, and marketing assets. Timeline: 4-8 weeks from application to live listing.

SaaS Security Architecture

SaaS security spans: identity (customer SSO via SAML/OIDC — customers authenticate with their own identity provider. Azure AD B2C for consumer SaaS. Multi-factor authentication for all administrative access), network (WAF for web application protection, DDoS mitigation, TLS 1.2+ for all communications, private endpoints for database access), data (encryption at rest with customer-managed keys for enterprise tenants, encryption in transit, data residency compliance for multi-region deployment), application (OWASP Top 10 mitigation, dependency scanning, container image scanning, secrets management via Key Vault), and compliance (SOC 2 Type II certification for enterprise sales, HIPAA BAA for healthcare, GDPR compliance for European customers). SOC 2 certification: budget 6-9 months and $50-100K for first-time certification. Required for: enterprise B2B SaaS sales (procurement teams won't approve vendors without SOC 2).

Scaling: From 10 to 10,000 Tenants

| Scale | Architecture | Key Challenges |
|---|---|---|
| 1-100 tenants | Single deployment, shared DB | Product-market fit, feature velocity |
| 100-1,000 | Regional deployments, schema isolation | Noisy neighbor, onboarding automation |
| 1,000-10,000 | Full multi-region, tiered tenancy | Operational complexity, cost optimization |
| 10,000+ | Global distribution, sharded everything | Massive scale operations, billing at volume |

SaaS Operations and SRE

SaaS uptime is the product's credibility: SLA targets (99.9% for standard tenants = 43 minutes downtime/month. 99.95% for enterprise = 22 minutes. 99.99% for critical = 4.3 minutes), monitoring (per-tenant health dashboards, aggregate platform health, synthetic monitoring for availability, real-user monitoring for performance), incident management (on-call rotation, runbooks for common incidents, post-incident reviews, customer communication during outages), and deployment practices (blue-green or canary deployments — zero-downtime. Feature flags for gradual rollout. Automated rollback if health checks fail post-deployment). SRE team sizing: 1 SRE per 50-100 tenants for hands-on operations. At 1,000+ tenants: automation must handle 90%+ of operations — SRE focus shifts from manual tasks to building automation.
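The downtime figures above follow from simple arithmetic on a 30-day month (43,200 minutes). A small sketch, with the 30-day month as the stated assumption:

```python
def downtime_minutes_per_month(sla_percent: float, days: int = 30) -> float:
    """Allowed downtime in minutes for a given availability target."""
    total_minutes = days * 24 * 60
    return total_minutes * (1 - sla_percent / 100)


for sla in (99.9, 99.95, 99.99):
    print(f"{sla}% -> {downtime_minutes_per_month(sla):.1f} minutes/month")
```

The results are about 43.2, 21.6, and 4.3 minutes, which the text rounds to 43, 22, and 4.3.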

Onboarding Automation: From Sign-Up to Active Tenant in Minutes

Manual tenant onboarding (create database, configure permissions, seed data, send welcome email) takes 1-4 hours and requires engineering involvement. At 100+ tenants: bottleneck. Automated onboarding: customer signs up, payment processed, provisioning pipeline triggered, database/schema created, default configuration applied, admin user created, welcome email sent, tenant active. Total: 2-5 minutes with zero engineering involvement. Components: subscription webhook (marketplace or Stripe), tenant provisioning service (creates database resources, applies schema migrations), configuration service (tier-specific settings: storage limits, user limits, feature flags), and notification service (welcome email with login instructions). The provisioning pipeline determines whether the product scales from 50 to 5,000 tenants without proportionally scaling operations.
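The provisioning pipeline can be sketched as an ordered list of steps run by one function. Everything here is an in-memory stand-in: the step functions, the `Tenant` type, and the tier settings are hypothetical, and a real pipeline would call the database, identity, and email services and handle retries and partial failure.

```python
from dataclasses import dataclass, field

# Hypothetical tier settings for illustration only.
TIER_DEFAULTS = {
    "standard": {"storage_gb": 50, "max_users": 25},
    "enterprise": {"storage_gb": 500, "max_users": 1000},
}


@dataclass
class Tenant:
    tenant_id: str
    tier: str
    admin_email: str
    status: str = "pending"
    config: dict = field(default_factory=dict)
    steps_done: list = field(default_factory=list)


def create_schema(t: Tenant) -> None:
    # Real code would create the database/schema and apply migrations.
    t.steps_done.append("schema_created")


def apply_config(t: Tenant) -> None:
    t.config = dict(TIER_DEFAULTS[t.tier])
    t.steps_done.append("config_applied")


def create_admin(t: Tenant) -> None:
    t.steps_done.append("admin_created")


def send_welcome(t: Tenant) -> None:
    t.steps_done.append("welcome_sent")


PIPELINE = (create_schema, apply_config, create_admin, send_welcome)


def provision(tenant_id: str, tier: str, admin_email: str) -> Tenant:
    """Run every step in order; the tenant becomes active only if all succeed."""
    tenant = Tenant(tenant_id, tier, admin_email)
    for step in PIPELINE:
        step(tenant)
    tenant.status = "active"
    return tenant
```

In practice `provision` would be triggered by the subscription webhook, and recording completed steps per tenant makes a failed run resumable.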

SaaS Metrics: What to Track From Day One

| Category | Metrics | Why |
|---|---|---|
| Revenue | MRR, ARR, ARPU, expansion revenue, churn revenue | Business health and growth |
| Usage | DAU, WAU, feature adoption, API call volume | Product-market fit |
| Retention | Logo churn, net revenue retention, cohort retention | Customer satisfaction |
| Operations | Uptime, P95 latency, error rate, deploy frequency | Platform reliability |
| Cost | COGS per tenant, infra cost per $1 revenue | Unit economics |

Build the metrics dashboard before launch. The metrics inform: pricing decisions (is ARPU sustainable?), product decisions (which features drive retention?), and infrastructure decisions (at what tenant count do we need to re-architect?).
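For reference, several of the revenue and retention metrics reduce to short formulas. The sketch below uses common definitions (exact definitions vary between companies), and the sample figures are invented for illustration.

```python
def arr(mrr: float) -> float:
    """Annual recurring revenue from monthly recurring revenue."""
    return mrr * 12


def arpu(mrr: float, active_tenants: int) -> float:
    """Average revenue per tenant (account) per month."""
    return mrr / active_tenants


def logo_churn(churned_tenants: int, starting_tenants: int) -> float:
    """Fraction of tenants lost during the period."""
    return churned_tenants / starting_tenants


def net_revenue_retention(
    starting_mrr: float, expansion: float, contraction: float, churned_mrr: float
) -> float:
    """Revenue kept from the starting cohort, including upgrades and downgrades."""
    return (starting_mrr + expansion - contraction - churned_mrr) / starting_mrr


# Invented example: 200 tenants, $10,000 MRR at the start of the month.
print(arr(10_000))
print(arpu(10_000, 200))
print(logo_churn(5, 200))
print(net_revenue_retention(10_000, 1_500, 200, 800))
```

A net revenue retention above 1.0 means expansion from existing tenants outweighs downgrades and churn.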

Noisy Neighbor Problem: Preventing Tenant Interference

The noisy neighbor problem: one tenant consumes disproportionate resources (CPU, database connections, API rate) and degrades performance for all other tenants sharing the same infrastructure. Prevention strategies: resource quotas per tenant (API rate limiting: 100 requests/second per tenant. Database connection limits: 20 connections per tenant. Storage quotas: 50GB per tenant on the standard plan. Exceeding a quota returns 429 Too Many Requests to that tenant instead of degrading service for other tenants), workload isolation (heavy operations like report generation and data exports run on separate compute from real-time request serving, so a tenant generating a 10GB export does not affect other tenants' response times), queue-based processing (batch operations are queued with per-tenant fair scheduling, so tenant A submitting 1,000 records for processing does not block tenant B's single-record submission), and auto-scaling that responds to tenant-specific load (if one tenant drives a traffic spike, the auto-scaler adds capacity before other tenants experience latency). Monitor per-tenant resource consumption continuously; the noisy neighbor appears without warning, and the first symptom is usually the question "why is the application slow for everyone?"
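Two of these strategies, per-tenant rate limiting and per-tenant fair scheduling, can be sketched in a few lines. This is an in-process illustration only: the class names, the token-bucket choice, and the burst size are assumptions, and a multi-node service would need shared state (for example in a cache or gateway).

```python
from collections import deque


class TenantRateLimiter:
    """Token bucket per tenant; a denied request should map to HTTP 429."""

    def __init__(self, rate_per_sec: float, burst: int):
        self.rate = rate_per_sec
        self.burst = burst
        self._buckets: dict[str, tuple[float, float]] = {}  # tenant -> (tokens, last_ts)

    def allow(self, tenant_id: str, now: float) -> bool:
        tokens, last = self._buckets.get(tenant_id, (float(self.burst), now))
        # Refill in proportion to elapsed time, capped at the burst size.
        tokens = min(float(self.burst), tokens + (now - last) * self.rate)
        allowed = tokens >= 1
        self._buckets[tenant_id] = (tokens - 1 if allowed else tokens, now)
        return allowed


class FairQueue:
    """Round-robin across tenants so one large batch cannot starve the others."""

    def __init__(self):
        self._queues: dict[str, deque] = {}
        self._order: deque = deque()  # tenants that currently have pending jobs

    def submit(self, tenant_id: str, job) -> None:
        if tenant_id not in self._queues:
            self._queues[tenant_id] = deque()
            self._order.append(tenant_id)
        self._queues[tenant_id].append(job)

    def next_job(self):
        if not self._order:
            return None
        tenant_id = self._order.popleft()
        queue = self._queues[tenant_id]
        job = queue.popleft()
        if queue:
            self._order.append(tenant_id)  # back of the line for the next turn
        else:
            del self._queues[tenant_id]
        return tenant_id, job
```

With this scheduler, if tenant A submits 1,000 jobs and tenant B then submits one, B's job is served second rather than 1,001st. The clock is passed in as `now` so the limiter is easy to test deterministically.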

Data Residency and Compliance for Multi-Region SaaS

Enterprise SaaS customers increasingly require data residency guarantees: EU customers need GDPR-compliant data storage in EU regions. Healthcare customers need HIPAA-compliant environments. Government customers need sovereign cloud. Multi-region SaaS architecture: regional deployments (separate infrastructure stacks in: US East, EU West, Australia East — tenant data stored and processed in their selected region), global control plane (tenant provisioning, billing, and configuration managed from a single control plane — operational data centralized, customer data regional), data sovereignty compliance (tenant data never leaves the designated region — verified through: Azure Policy region restrictions, network rules preventing cross-region data transfer, and audit logging of data access), and regional pricing (infrastructure costs vary by region — EU West is 10-15% more expensive than US East. Build regional cost differences into pricing or absorb as COGS).
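At the application layer, "tenant data never leaves the designated region" can be backed by a guard that refuses to resolve a data endpoint from the wrong region. The sketch below is illustrative: the endpoint hostnames, region keys, and `TenantRecord` type are hypothetical, and it complements rather than replaces policy and network controls.

```python
from dataclasses import dataclass

# Hypothetical endpoints; the regions follow the examples in the text.
REGION_ENDPOINTS = {
    "us-east": "db.us-east.example.internal",
    "eu-west": "db.eu-west.example.internal",
    "australia-east": "db.australia-east.example.internal",
}


@dataclass(frozen=True)
class TenantRecord:
    tenant_id: str
    region: str  # region selected by the tenant, held in the global control plane


def data_endpoint(tenant: TenantRecord, serving_region: str) -> str:
    """Return the tenant's regional data endpoint, refusing cross-region access."""
    if tenant.region not in REGION_ENDPOINTS:
        raise ValueError(f"unsupported region: {tenant.region!r}")
    if serving_region != tenant.region:
        raise PermissionError(
            f"tenant {tenant.tenant_id} is pinned to {tenant.region}; "
            f"refusing access from {serving_region}"
        )
    return REGION_ENDPOINTS[tenant.region]
```

The control plane only needs the tenant-to-region mapping; customer data itself is reached exclusively through the regional endpoint.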

The Xylity Approach

We build ISV SaaS products on a multi-tenant architecture: a right-sized tenancy model (shared, isolated, or hybrid), data isolation (row-level + database-level defense-in-depth), Azure Marketplace distribution, and SRE-grade operations. Our cloud architects, application developers, and DevOps engineers build SaaS products that scale from 10 to 10,000 tenants, with the security, billing, and operational practices that enterprise customers require.
