Building a SaaS platform is not primarily a coding problem. It is a series of architecture decisions made under uncertainty, and those decisions, made in the first few weeks, determine what the system can and cannot do for years. Here is how we approach them and why.

In this article

Where most SaaS builds go wrong from day one
The seven architecture decisions that matter most
The tech stack we settled on and why
What the build timeline actually looked like
What we would do differently

This article is written from direct experience: the decisions, tradeoffs, and occasional mistakes that came out of building production SaaS platforms for clients across B2B, healthcare, and professional services. The specifics change from project to project. The principles behind the decisions stay consistent.

If you are planning a SaaS build, whether you are doing it in-house or working with an agency, understanding these decisions in advance will make you a significantly better client and significantly reduce the risk of building the wrong foundation.

1. Where most SaaS builds go wrong from day one

The most common failure mode in a SaaS build is not technical incompetence. It is premature optimisation — building for a scale and complexity that does not exist yet, at the cost of the speed and simplicity needed to get to market and learn.

The second most common failure mode is the opposite: building so fast and so simply that the foundation cannot support the product once it starts to grow. User data mixed across tenants. A monolith whose components cannot be scaled independently. An authentication system that works for ten users and breaks for ten thousand.

The art of a good SaaS architecture is building the right level of complexity for where the product is now, with the clear ability to evolve toward where it needs to be. Not tomorrow’s architecture today. Not today’s architecture forever.

The guiding principle we use

Every architecture decision should be made as late as responsibly possible, when you have the most information. Decisions made on day one with incomplete information are the most expensive to reverse. Decisions made at week eight, when you understand the real usage patterns, are almost always better ones.

2. The seven architecture decisions that matter most

Decision 01: Monolith vs microservices

Options considered:

Modular monolith ~~Full microservices~~ ~~Serverless functions~~

For almost every early-stage SaaS, a well-structured modular monolith is the right starting point. Microservices introduce operational complexity — separate deployments, inter-service communication, distributed tracing — that a team of four does not need and cannot effectively manage. A modular monolith with clean internal boundaries can be extracted into services later, when the load and team size actually justify it.

When this changes: When specific components have genuinely different scaling requirements, or when the team grows large enough that independent deployment becomes a bottleneck.
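
To make "clean internal boundaries" concrete, here is a minimal sketch in which one module reaches another only through a declared interface. The names `BillingApi`, `InMemoryBilling`, and `ReportsService` are hypothetical and chosen for illustration; they are not taken from any of the projects described here.

```python
from dataclasses import dataclass
from typing import Protocol


class BillingApi(Protocol):
    """The only surface other modules may use to reach billing."""

    def plan_for(self, tenant_id: str) -> str: ...


@dataclass
class InMemoryBilling:
    """Billing module internals, hidden behind BillingApi."""

    plans: dict[str, str]

    def plan_for(self, tenant_id: str) -> str:
        # Tenants without an explicit plan are treated as free (an assumption).
        return self.plans.get(tenant_id, "free")


@dataclass
class ReportsService:
    """Reports module: depends on the interface, not the implementation."""

    billing: BillingApi

    def can_export(self, tenant_id: str) -> bool:
        return self.billing.plan_for(tenant_id) != "free"


billing = InMemoryBilling(plans={"acme": "pro"})
reports = ReportsService(billing=billing)
```

Because `ReportsService` only knows about `BillingApi`, the billing module could later be replaced by a client that calls a separate service without changing the reports code.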

Decision 02: Multi-tenancy model

Options considered:

Shared database, tenant ID per row ~~Separate database per tenant~~ ~~Separate schema per tenant~~

For most B2B SaaS products in the early phase, a shared database with a tenant identifier on every table is the right balance of simplicity and isolation. A separate database per tenant makes operational complexity explode with every new customer: migrations, backups, and monitoring are all multiplied by tenant count. Separate schemas sit in the middle but add complexity without meaningful benefit at small scale. The shared model works cleanly up to hundreds of tenants when implemented with proper row-level security.

When this changes: When customers have contractual data residency requirements, or when a single tenant’s data volume is large enough to affect others.
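
The shared-table model can be sketched as a data-access class that always filters by tenant. This is an illustration under stated assumptions: it uses an in-memory SQLite database only to stay self-contained, and the `invoices` table and `TenantScopedInvoices` class are hypothetical. Application-level filtering like this is a weaker guarantee than database-enforced row-level security, which in PostgreSQL would be expressed as policies on the table.

```python
import sqlite3


class TenantScopedInvoices:
    """All reads and writes go through this class, which always binds tenant_id."""

    def __init__(self, conn: sqlite3.Connection, tenant_id: str) -> None:
        self._conn = conn
        self._tenant_id = tenant_id

    def add(self, amount_cents: int) -> None:
        self._conn.execute(
            "INSERT INTO invoices (tenant_id, amount_cents) VALUES (?, ?)",
            (self._tenant_id, amount_cents),
        )

    def amounts(self) -> list[int]:
        # The WHERE clause is never optional: callers cannot query across tenants.
        rows = self._conn.execute(
            "SELECT amount_cents FROM invoices WHERE tenant_id = ? ORDER BY id",
            (self._tenant_id,),
        )
        return [row[0] for row in rows]


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE invoices ("
    "id INTEGER PRIMARY KEY, tenant_id TEXT NOT NULL, amount_cents INTEGER NOT NULL)"
)
acme = TenantScopedInvoices(conn, "acme")
globex = TenantScopedInvoices(conn, "globex")
acme.add(1000)
acme.add(2500)
globex.add(400)
```

Each tenant-scoped object sees only its own rows even though both write to the same table.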

Decision 03: Authentication and authorisation

Options considered:

Third-party auth provider ~~Custom-built auth~~

Building custom authentication is one of the most reliable ways to introduce security vulnerabilities into a SaaS platform. Auth providers handle session management, token rotation, MFA, social login, and compliance requirements in ways that would take months to replicate correctly in-house. The cost is dependence on an external service, but the alternative is owning a security-critical system that your team almost certainly lacks the specialised expertise to maintain properly.

When this changes: Almost never for the auth layer itself. Custom authorisation logic (role and permission systems) is always built in-house because it reflects your specific product’s data model.
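
As an illustration of the in-house authorisation layer, a role and permission check might look roughly like the sketch below. The roles, permissions, and the `is_allowed` function are hypothetical examples; a real product would derive them from its own data model.

```python
from enum import Enum


class Permission(Enum):
    VIEW_REPORTS = "view_reports"
    INVITE_USERS = "invite_users"
    MANAGE_BILLING = "manage_billing"


# Hypothetical role table: each role maps to the permissions it grants.
ROLE_PERMISSIONS: dict[str, frozenset[Permission]] = {
    "viewer": frozenset({Permission.VIEW_REPORTS}),
    "admin": frozenset({Permission.VIEW_REPORTS, Permission.INVITE_USERS}),
    "owner": frozenset(Permission),  # every permission
}


def is_allowed(
    role: str, user_tenant: str, resource_tenant: str, needed: Permission
) -> bool:
    """Deny across tenants first, then check the role's permission set."""
    if user_tenant != resource_tenant:
        return False
    # Unknown roles get an empty permission set, so they are denied by default.
    return needed in ROLE_PERMISSIONS.get(role, frozenset())
```

The auth provider answers "who is this user"; a function like this answers "what may they do", and checking the tenant before the role keeps a valid role in one tenant from granting access to another tenant's data.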

Decision 04: Real-time features

Options considered:

WebSockets via managed service ~~Self-hosted WebSocket server~~ ~~Polling~~

Real-time functionality — live notifications, collaborative features, dashboard updates — is expected in modern SaaS products. Building and operating a self-hosted WebSocket infrastructure is a non-trivial operational burden. Managed real-time services handle connection management, scaling, and reliability at a cost that is almost always lower than the engineering time required to do it yourself.

When this changes: At very high connection volumes where managed service costs become significant, or when latency requirements are extreme enough to require a self-hosted solution.

Decision 05: Background job processing

Options considered:

Queue-based job system ~~Cron jobs in the application layer~~ ~~Serverless functions for all async work~~

Any SaaS platform has work that should not happen in the request cycle, such as sending emails, processing uploads, generating reports, and triggering third-party integrations. A proper queue-based system with retry logic, dead letter queues, and job monitoring is essential for reliability. Cron jobs fail silently. Using serverless functions for all async work becomes expensive and hard to debug at scale.

When this changes: The queue system itself rarely changes, but the worker infrastructure it runs on scales with volume.
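
The retry and dead-letter behaviour described above can be sketched with a small in-memory queue. This is a teaching sketch, not how BullMQ or Celery are implemented; `RetryingQueue`, `Job`, and `send_email` are hypothetical names, and a production system would add persistence and a backoff delay between attempts.

```python
from collections import deque
from dataclasses import dataclass
from typing import Callable


@dataclass
class Job:
    name: str
    payload: dict
    attempts: int = 0


class RetryingQueue:
    """In-memory queue: failed jobs are retried, then moved to a dead letter list."""

    def __init__(self, max_attempts: int = 3) -> None:
        self.max_attempts = max_attempts
        self.pending: deque[Job] = deque()
        self.dead_letters: list[Job] = []

    def enqueue(self, job: Job) -> None:
        self.pending.append(job)

    def run(self, handler: Callable[[Job], None]) -> None:
        while self.pending:
            job = self.pending.popleft()
            job.attempts += 1
            try:
                handler(job)
            except Exception:
                if job.attempts >= self.max_attempts:
                    # Keep the job for inspection instead of dropping it silently.
                    self.dead_letters.append(job)
                else:
                    # Re-queue for another attempt (no backoff in this sketch).
                    self.pending.append(job)


def send_email(job: Job) -> None:
    # Hypothetical handler: fails whenever the payload has no recipient.
    if "to" not in job.payload:
        raise ValueError("missing recipient")


queue = RetryingQueue(max_attempts=3)
queue.enqueue(Job("welcome", {"to": "a@example.com"}))
queue.enqueue(Job("broken", {}))
queue.run(send_email)
```

After `run` returns, the failing job sits in `queue.dead_letters` with its attempt count, which is the visibility that a silently failing cron job does not give you.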

Decision 06: Billing and subscription management

Options considered:

Dedicated billing platform ~~Custom billing logic~~

Billing logic is deceptively complex: proration, upgrade and downgrade handling, free trials, usage-based pricing, failed payment recovery, tax calculation, invoice generation, and compliance with payment regulations across jurisdictions. Dedicated billing platforms have solved these problems with significant engineering investment. Custom billing code almost always has edge cases that surface at the worst possible moment: when a customer is trying to pay.

When this changes: When pricing models become complex enough that the billing platform’s opinionated approach does not accommodate them — typically at enterprise scale with highly custom contract terms.
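
To show why even one item on that list is subtle, here is a simplified day-based proration calculation for a mid-cycle upgrade. It is a sketch under stated assumptions: the function name, the day-level granularity, and the rounding rule are illustrative choices, and a real billing platform may use different granularity and rounding.

```python
from datetime import date
from decimal import Decimal, ROUND_HALF_UP


def prorated_upgrade_charge(
    old_price: Decimal,
    new_price: Decimal,
    period_start: date,
    period_end: date,
    change_date: date,
) -> Decimal:
    """Charge the price difference for the unused share of the billing period."""
    total_days = (period_end - period_start).days
    remaining_days = (period_end - change_date).days
    if total_days <= 0 or not 0 <= remaining_days <= total_days:
        raise ValueError("change_date must fall inside the billing period")
    fraction = Decimal(remaining_days) / Decimal(total_days)
    # Decimal avoids float rounding drift; round to whole cents at the end.
    return ((new_price - old_price) * fraction).quantize(
        Decimal("0.01"), rounding=ROUND_HALF_UP
    )


# Upgrade from 30.00 to 90.00 halfway through a 30-day period.
charge = prorated_upgrade_charge(
    old_price=Decimal("30.00"),
    new_price=Decimal("90.00"),
    period_start=date(2024, 6, 1),
    period_end=date(2024, 7, 1),
    change_date=date(2024, 6, 16),
)
# charge == Decimal("30.00"): half of the 60.00 price difference
```

Even this small function has to make decisions about period boundaries and rounding, and it does not yet cover downgrades, trials, or tax.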

Decision 07: Observability and monitoring

Options considered:

Centralised logging + APM from day one ~~Add monitoring after launch~~

The most expensive version of “we will add monitoring later” is discovering a production issue from a customer complaint rather than an alert. Application performance monitoring, error tracking, and centralised logging should be in place before the first real user touches the system, not retrofitted after the first incident. The cost is low. The cost of not having it is high and unpredictable.

When this changes: It does not. This one is non-negotiable on any production system we build.
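
One low-cost piece of this is emitting structured logs from the first commit, so that a centralised logging platform can index them by field. The sketch below uses only Python's standard `logging` module; the `tenant_id` and `request_id` fields and the `app.billing` logger name are assumptions for illustration, and shipping the output to a specific vendor is left out.

```python
import io
import json
import logging


class JsonFormatter(logging.Formatter):
    """Render each record as one JSON object so a log platform can index fields."""

    def format(self, record: logging.LogRecord) -> str:
        entry = {
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
            # Hypothetical context fields, passed by callers via `extra`.
            "tenant_id": getattr(record, "tenant_id", None),
            "request_id": getattr(record, "request_id", None),
        }
        return json.dumps(entry)


stream = io.StringIO()  # stand-in for stdout or a log shipper
handler = logging.StreamHandler(stream)
handler.setFormatter(JsonFormatter())

logger = logging.getLogger("app.billing")
logger.setLevel(logging.INFO)
logger.addHandler(handler)
logger.propagate = False  # keep the example's output in our stream only

logger.error("payment failed", extra={"tenant_id": "acme", "request_id": "req-123"})
logged = json.loads(stream.getvalue())
```

With a tenant and request identifier on every line, an alert on error rate can be traced back to a specific customer and request without guesswork.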

3. The tech stack we settled on and why

Tech stack decisions are context-dependent: the right stack for a team of two senior engineers is not the right stack for a team of ten mid-levels. That said, for a typical B2B SaaS build with a small, experienced team, the following combination has proven reliable across multiple projects.

| Layer | Choice | Why |
| --- | --- | --- |
| Frontend | React + TypeScript | Large talent pool, mature ecosystem, type safety catches errors early |
| Backend | Node.js / Python | Node for API-heavy systems, Python when AI / ML integration is significant |
| Database | PostgreSQL | Reliable, extensible, handles both relational and JSON data cleanly |
| Infrastructure | AWS / GCP | Managed services reduce operational burden; chosen based on client preference and compliance requirements |
| Auth | Auth0 / Clerk | Handles MFA, SSO, and compliance without custom security code |
| Billing | Stripe | Industry standard, extensive documentation, handles most subscription models |
| Jobs | BullMQ / Celery | Robust queue system with retry logic and job monitoring built in |
| Monitoring | Datadog / Sentry | Full-stack observability from day one, not retrofitted after first incident |

On the “best” tech stack debate

There is no universally correct tech stack. The best stack is the one your team knows deeply, that has the ecosystem support you need, and that does not introduce unnecessary complexity for your current scale. A team that knows Django inside out will ship better software in Django than in a trendier framework they learned last month. Consistency and depth beat novelty every time.

4. What the build timeline actually looked like

Weeks 1–2

Discovery and architecture. Requirements finalised, data model designed, tech stack confirmed, development environment set up, CI/CD pipeline in place before the first feature is written.

Weeks 3–6

Core infrastructure. Auth, multi-tenancy layer, billing integration, base API structure, and monitoring configured. Nothing visible to end users yet, but everything that subsequent features depend on is solid.

Weeks 7–14

Core feature development. The primary user workflows built in sprint cycles, with client review at the end of each two-week sprint. Integration work runs in parallel where possible.

Weeks 15–17

QA, performance testing, security review. Full regression testing, load testing against expected traffic profiles, penetration test on auth and data access layers.

Weeks 18–20

Beta and stabilisation. Limited release to a small group of real users. Daily monitoring of error rates, performance metrics, and user feedback. Refinements before full launch.

Week 20+

Launch and ongoing development. Full release, monitoring handover, and the start of the post-launch development cycle: new features, performance optimisation, and the inevitable things that only surface under real usage.

5. What we would do differently

Honesty about what went less well is more useful than a polished success narrative.

We underestimated the onboarding flow

The onboarding experience, meaning the time between a new user signing up and getting genuine value from the product, consistently takes longer to build well than any other part of a SaaS platform. We scoped it as a secondary concern, and it became a primary one after launch, when we saw where users were dropping off. It should have been scoped as a first-class feature from the start.

We added analytics too late

Behavioural analytics, understanding how users actually move through the product, was added after launch rather than instrumented from the beginning. This meant the first few months of real usage generated less usable data than they should have. Instrument everything from day one, even if you do not look at it immediately.

We did not stress-test the billing edge cases hard enough

Billing systems have a long tail of edge cases that only appear in production: concurrent subscription changes, payment method updates mid-cycle, failed payments during trial conversion. These were mostly handled correctly, but a more systematic edge-case testing process during QA would have caught the two or three that made it to production.

The bottom line

Building a SaaS platform from scratch is a sequence of decisions under uncertainty, made with incomplete information, against a deadline. The teams that do it well are not the ones that make the objectively correct technical choices every time; they are the ones that make defensible, reversible decisions quickly, instrument everything so they can learn from production, and move fast enough to get real users generating real feedback before the architecture is locked in.

The seven decisions above are the ones that matter most in the first few weeks. Get them roughly right, build the observability to know when you need to revisit them, and you have a foundation worth building on.

Planning to build a SaaS platform?

SmartWayLabs builds production-ready SaaS platforms from the ground up, with architecture decisions made for where your product is going, not just where it starts.

How we built a SaaS platform from scratch: architecture decisions explained
