How Figma scaled PostgreSQL to millions of users

This newsletter explores Figma’s journey from a small startup to 4M users with Postgres. Learn how simplicity and smart database choices led to massive, sustainable scale.
12 mins read
Aug 27, 2025

Figma began with a refreshingly simple foundation: a single, monolithic PostgreSQL database in an industry that often defaults to microservices and distributed systems for scale. This allowed the team to iterate quickly, simplify the process, and focus on delivering a world-class collaborative design experience.

As usage grew into the millions, so did the architectural demands. Rather than adding complexity prematurely, Figma scaled with precision. First, they vertically optimized their monolith. Later, they introduced Postgres-based sharding. Their journey offers a blueprint for pragmatic, long-term scalability while keeping things simple.

Every smooth cursor movement in a shared Figma file is powered by a system under constant pressure to sync edits in real time, protect data integrity, and stay instantly responsive. Most engineering teams in this position reach for distributed databases or microservices rewrites. Figma didn't. They scaled with discipline, pushing existing tools to the limit before introducing complexity.

A single PostgreSQL database provides the foundation for millions of Figma users

This newsletter breaks down the disciplined journey step by step.

We’ll begin with Figma’s initial architecture and the performance bottlenecks that drove change. Next, we’ll explore the vertical scaling techniques that extended PostgreSQL’s capabilities, then examine application-layer sharding. Finally, we’ll cover the high availability features and the core engineering principles that ensured long-term simplicity and reliability.

The initial architecture and its emerging bottlenecks

In its early stages, Figma’s architecture was a model of simplicity: a single, high-capacity Amazon RDS instance running PostgreSQL. This monolithic design was a strategic choice that allowed the engineering team to focus on product development rather than infrastructure complexity. The database housed all core entities, from user accounts and organization hierarchies to detailed data structures behind design files, such as vector networks and version histories.

This single system architecture served them well in the beginning. PostgreSQL’s maturity supported rapid iteration, and PgBouncer, a lightweight connection pooler for PostgreSQL, was used to manage connections efficiently. The benefits were clear: fast deployment, simple debugging, and minimal coordination overhead. This simple architecture is depicted in the diagram below.

Figma’s initial monolithic (single-database) architecture

As user adoption grew rapidly through 2020 and beyond, the operational pressure on the database mounted. Figma’s real-time collaboration model required every interaction to propagate near-instantly to other users in a session, triggering a cascade of reads, writes, and diff operations on deeply nested, shared data structures. This intense workload turned the monolithic database into a single point of contention, and the team identified four specific bottlenecks:

  • Write contention: The high volume of concurrent edits from thousands of designers placed the primary database under extreme write pressure, causing stability risks and impacting routine maintenance operations like autovacuum.

  • Query latency: As tables grew to hundreds of gigabytes, queries combining large datasets slowed down, impacting UI responsiveness.

  • Connection saturation: The number of active connections from backend services pushed the limits of what PgBouncer could manage effectively.

  • Resource exhaustion: Even on a powerful r5.12xlarge RDS instance (https://aws.amazon.com/ec2/instance-types/r5/), CPU utilization hovered around 65 percent during peak usage, leaving little headroom for growth or traffic surges.

A monolithic database is often the right foundation for early scalability and development velocity. But it requires close monitoring of metrics like query latency, write throughput, and connection load to know when the architecture is reaching its limits. These signals provide time to implement targeted optimizations before a more fundamental rearchitecture becomes unavoidable.

It was clear that simply scaling up the instance was no longer a viable long-term strategy. The team’s decision to continue building on PostgreSQL marked a pivotal moment. The next section explores the rationale behind that commitment.

The rationale for choosing and sticking with PostgreSQL

With growing scale and increasing system pressure, many engineering teams switch to a distributed NoSQL system for easier horizontal scalability. However, Figma’s team chose to continue investing in PostgreSQL. A pragmatic evaluation of product requirements and system behavior drove this choice.

  • Data integrity and ACID guarantees: At the core of every Figma file is a highly structured, versioned graph of vector objects, design components, and collaborative state. Ensuring that every change is applied in order and without corruption is essential. PostgreSQL’s ACID compliance provided strong transactional guarantees under high concurrency—something most eventually consistent NoSQL systems cannot match.

  • Relational power with flexible data modeling: Figma’s data model spans structured and semi-structured domains. User profiles, organizations, and access policies are cleanly relational, while design file content, such as layer hierarchies and shared properties, is more dynamic. PostgreSQL’s support for structured data and its ability to efficiently handle complex, nested data structures within traditional columns allow Figma to manage its diverse data model in a single system. This dual capability eliminates the need for a separate document store. The diagram below illustrates how PostgreSQL manages this unified data model.

A unified data model in one system
  • Operational maturity and optimization headroom: PostgreSQL’s maturity gave Figma’s infrastructure team operational leverage. Tuning tools like EXPLAIN ANALYZE (https://www.postgresql.org/docs/current/sql-explain.html), predictable index strategies, and control over autovacuum (https://www.postgresql.org/docs/current/runtime-config-autovacuum.html) behavior allowed them to fine-tune query performance without introducing risk. PgBouncer was deployed to manage connection pooling, and read replicas were added to offload reporting and analytics traffic from the primary database.

  • Clear, incremental scaling path: Unlike purpose-built distributed databases that require architectural buy-in from the beginning, PostgreSQL offers a predictable, incremental path to scale. This includes vertical scaling, vertical partitioning, and application-layer sharding. It allowed the team to address concrete performance bottlenecks and defer more complex architectural changes until necessary. These fundamental trade-offs are summarized in the comparison matrix below.

Comparing the scale-up architecture of PostgreSQL with the scale-out model of NoSQL

Choose your database based on product needs, not scaling hype. When data accuracy and transactional safety are critical, the guarantees of an ACID-compliant system like PostgreSQL matter more than the early appeal of distributed alternatives.

This commitment to an incremental scaling strategy began with the most straightforward approach: maximizing performance from a single, powerful instance through vertical scaling and deep optimization.

Inside the architecture: How Figma scaled PostgreSQL

Figma’s approach to scaling PostgreSQL is incremental. Rather than introducing distributed complexity early, the engineering team prioritized extracting maximum performance from a single PostgreSQL instance before evolving the architecture. This section outlines how Figma used vertical and horizontal scaling techniques to support a rapidly growing, high-concurrency workload.

Vertical scaling and deep performance tuning

In the first optimization phase, the team focused on pushing their monolithic RDS instance to its limits.

  • Instance upgrades: The database initially ran on an r5.12xlarge instance, later upgraded to r5.24xlarge as CPU utilization approached 65% and memory pressure increased.

  • Connection pooling: As backend services spawned thousands of connections, PgBouncer was deployed to multiplex them efficiently and avoid memory exhaustion on the database.

  • Read replicas: To reduce load on the primary write instance, read replicas were introduced to offload reporting and internal analytics queries, isolating read-heavy traffic from latency-sensitive workloads.

  • Isolating new workloads: To limit the growth of the original database, new features or services were built with dedicated databases, preventing them from adding load to the core monolith.

  • Query optimization: Engineers used EXPLAIN ANALYZE to identify expensive queries, introduced composite indexes for frequent access patterns, and rewrote N+1 query chains (where one query triggers a separate query for each related record) to reduce I/O overhead.

  • Autovacuum tuning: PostgreSQL’s default autovacuum settings were adjusted to prevent table bloat and ensure timely cleanup without causing background contention.
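The N+1 rewrite mentioned in the query-optimization bullet can be made concrete with a small, self-contained sketch. The schema and data here are invented for illustration, with SQLite standing in for Postgres; this is not Figma’s actual code:

```python
import sqlite3

# Hypothetical schema for illustration: files and their comments.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE files (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE comments (id INTEGER PRIMARY KEY, file_id INTEGER, body TEXT);
    INSERT INTO files VALUES (1, 'homepage'), (2, 'logo');
    INSERT INTO comments VALUES (10, 1, 'nice'), (11, 1, 'ship it'), (12, 2, 'resize');
""")

def comments_n_plus_one(file_ids):
    # Anti-pattern: one round trip per file, so N files cost N+1 queries.
    out = {}
    for fid in file_ids:
        rows = conn.execute(
            "SELECT body FROM comments WHERE file_id = ? ORDER BY id", (fid,)
        ).fetchall()
        out[fid] = [r[0] for r in rows]
    return out

def comments_batched(file_ids):
    # Rewrite: a single query with an IN list, grouped in memory.
    placeholders = ",".join("?" * len(file_ids))
    rows = conn.execute(
        f"SELECT file_id, body FROM comments "
        f"WHERE file_id IN ({placeholders}) ORDER BY id",
        file_ids,
    ).fetchall()
    out = {fid: [] for fid in file_ids}
    for fid, body in rows:
        out[fid].append(body)
    return out

# Both forms return the same data; the batched version does it in one query.
assert comments_n_plus_one([1, 2]) == comments_batched([1, 2])
```

The batched form trades N round trips for a single scan plus in-memory grouping, which is the usual shape of this rewrite regardless of the ORM or driver involved.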

The diagram below depicts this architecture, segregating workloads by directing read-heavy analytics traffic to separate replicas.

Vertical scaling with read replicas

Horizontal scaling through application-layer sharding

When vertical scaling reached its limits, Figma turned to horizontal scaling using a technique called application-layer sharding (https://www.figma.com/blog/how-figma-scaled-to-multiple-databases/). This involves splitting data across multiple databases (shards), with the application responsible for directing queries to the correct shard. Unlike database-native sharding, this method gives engineering teams fine-grained control over how data is partitioned and routed, while avoiding the complexity of a fully distributed database engine.

To support this approach, Figma introduced several components:

  • Shard key selection: Figma chose file_id as the shard key. This mapped naturally to collaborative documents, avoided cross-shard operations, and minimized the risk of “hot shards.”

  • Shard map service: A central lookup table was introduced to map each file_id to its corresponding database shard. Applications used this map to route requests efficiently.

  • Custom DB proxy: A lightweight Go-based proxy was built to handle shard resolution and routing. This service directed traffic and managed advanced features like load shedding, request hedging, and scatter-gather queries across shards, making the system more resilient and performant.

  • Zero-downtime migration: Data was migrated incrementally. The shard map was updated in controlled batches, and routing correctness was validated before fully cutting over to the sharded system.
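The shard-map lookup described above can be sketched in a few lines. The hash-based bucketing and shard names here are assumptions for illustration; Figma’s production routing lives in a Go proxy backed by a replicated shard-map table:

```python
import hashlib

# Hypothetical shard map: in production this would be a replicated
# lookup table consulted by the DB proxy, not a hard-coded dict.
SHARD_MAP = {0: "db-shard-0", 1: "db-shard-1", 2: "db-shard-2"}

def shard_for_file(file_id: str) -> str:
    # Hash the shard key (file_id) to a stable bucket, then look up
    # the shard that currently owns that bucket.
    digest = hashlib.sha256(file_id.encode()).hexdigest()
    bucket = int(digest, 16) % len(SHARD_MAP)
    return SHARD_MAP[bucket]

# Every query for the same file lands on the same shard, so a
# collaborative session never needs a cross-shard transaction.
assert shard_for_file("file-abc") == shard_for_file("file-abc")
```

Because the mapping is deterministic in the shard key, rebalancing only requires updating the map and migrating the affected buckets, which is what makes the incremental, zero-downtime migration described above possible.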

The fundamental change from a single database to this sharded model is illustrated in the diagram below.

Transition to horizontal sharding

Application-layer sharding offers precise control and avoids premature architectural complexity, but it shifts operational responsibility to the application tier. Routing logic, schema coordination, and observability must be carefully managed.

Figma’s sharded architecture addressed scale, but it introduced new reliability challenges. The next section explores how the team ensures high availability and operational continuity across the distributed database infrastructure.

Ensuring reliability and high availability at scale

In a real-time system, resilience matters as much as performance. A single failure can result in lost user data. To prevent this, Figma’s team built a high availability strategy using PostgreSQL’s native features, enhanced with custom automation and deep observability.

Streaming replication and automated failover

Figma used PostgreSQL’s built-in streaming replication, shipping Write-Ahead Log (WAL) entries to keep hot standby replicas of each primary shard synchronized within seconds. These standbys also served read traffic for internal tools and analytics, freeing the primary instances to focus on user-facing writes.

This setup enabled an automated failover strategy that used health checks and dynamic routing to promote a standby replica in case of a primary shard failure. To ensure this system worked under pressure, the team regularly ran failover drills, testing the automation and team readiness for real-world incidents. The diagram below depicts this high availability architecture for each shard.

High availability setup for a single database shard
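The promotion decision inside such a failover loop can be reduced to a small, testable function. The consecutive-failure threshold below is an illustrative assumption, not Figma’s actual policy:

```python
def decide_failover(health_checks, max_failures=3):
    """Return True if a standby should be promoted, given an ordered
    sequence of primary health-check results (True = healthy).
    Real systems add fencing, quorum, and routing updates; this shows
    only the debouncing logic that avoids flapping on transient blips."""
    failures = 0
    for healthy in health_checks:
        failures = 0 if healthy else failures + 1
        if failures >= max_failures:
            return True
    return False

# A transient blip does not trigger failover...
assert decide_failover([True, False, True, False, True]) is False
# ...but sustained failure of the primary does.
assert decide_failover([True, False, False, False]) is True
```

Requiring several consecutive failures before promoting is the standard way to keep a health checker from failing over on a single dropped probe, which is exactly the kind of behavior failover drills are meant to exercise.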

Recovery and observability

To guard against data corruption, Figma utilized PostgreSQL’s native point-in-time recovery (PITR) capabilities, leveraging archived WAL so that any shard could be rewound to a precise moment before an incident occurred. The team regularly validated these recovery procedures to ensure they worked when needed.
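For readers unfamiliar with the mechanics, PITR is configured with WAL archiving on the primary plus a recovery target on the restored instance (PostgreSQL 12+, together with a `recovery.signal` file to start the server in recovery mode). The paths and timestamp below are placeholders for illustration, not Figma’s values:

```
# postgresql.conf (primary): continuously archive WAL segments
archive_mode = on
archive_command = 'cp %p /wal_archive/%f'   # typically an upload to object storage in production

# postgresql.conf (restored instance): replay WAL up to a target time
restore_command = 'cp /wal_archive/%f %p'
recovery_target_time = '2025-08-27 12:00:00+00'
```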

Finally, the team built deep observability into the system, continuously monitoring metrics like replication lag and I/O saturation. Deviations from baseline behavior triggered alerts for proactive intervention, while structured logs from the DB proxy provided crucial auditability across the sharded environment.
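Baseline-deviation alerting of this kind ultimately boils down to comparing a measured metric against thresholds. A toy sketch for replication lag, with invented threshold values:

```python
def check_replication_lag(lag_seconds, warn_at=5.0, page_at=30.0):
    """Map a replication-lag measurement to an alert level.
    Thresholds here are illustrative, not Figma's real baselines."""
    if lag_seconds >= page_at:
        return "page"   # standby far behind: failing over now could lose data
    if lag_seconds >= warn_at:
        return "warn"   # investigate before it becomes an incident
    return "ok"

assert check_replication_lag(0.4) == "ok"
assert check_replication_lag(12.0) == "warn"
assert check_replication_lag(45.0) == "page"
```

In a real deployment the lag value would come from a monitoring pipeline (for example, sampled from the database’s replication statistics) rather than being passed in directly.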

Reliability at scale is not accidental. It results from rigorous replication, tested failover strategies, validated backups, and comprehensive observability. Performance gains must never compromise trust or uptime.

This reliability shaped the engineering principles behind Figma’s long-term scaling strategy.

Engineering principles

Figma scaled to millions of users without architectural sprawl by following engineering values that shaped every technical decision. Figma prioritized restraint, clarity, and disciplined iteration in an industry quick to chase trends.

  • Avoiding premature complexity: The team avoided distributed systems until necessary. Rather than designing for a hypothetical scale, they solved real performance bottlenecks based on system needs. This minimized unnecessary complexity and kept operational overhead low.

  • Mastering core tools: Instead of expanding their tech stack, the team developed expertise in PostgreSQL. The familiarity with tuning practices and internal behavior allowed them to optimize performance precisely and confidently.

  • Incremental, measurable change: Architectural changes were introduced in small, testable steps. The shift to sharding was gradual. Routing logic was rolled out behind feature flags. Shards were validated in production, and the monolith was retired only after confidence was built.

  • Pragmatism over dogma: Figma’s engineers deferred complexity until it was justified. Their system never grew beyond what was necessary.

Well-designed systems scale when simplicity is protected. Figma’s architecture changed only in response to measurable, real-world needs.


These core principles guided Figma to its current scale, but infrastructure engineering is never complete. The next section explores the future challenges Figma faces as it continues to grow.

Scaling beyond millions of users

The architecture that scaled Figma to millions of users reflects its engineering principles. While Figma hasn’t published a public road map, we can infer likely areas of evolution based on patterns common to systems operating at this scale.

  • Reducing global latency: Figma already ensures low-latency, real-time collaboration using a distributed Multiplayer service (https://www.figma.com/blog/under-the-hood-of-figmas-infrastructure/) that processes edits locally. Because the primary source-of-truth data may still be centralized, the next evolution is to regionalize it (likely with database replicas or an active-active setup). This would accelerate initial file loads and improve resilience for users worldwide.

  • Scaling cross-shard operations: Figma’s file-based sharding works well for collaboration, but org-wide queries that require scatter-gather operations may become bottlenecks at a larger scale. Teams in similar positions often explore tools like the Citus extension for PostgreSQL or custom distributed query layers to streamline these operations.

  • Managing data volume and cost: With billions of design objects and deep version histories, the sheer volume of data presents both a performance and cost challenge. A likely future strategy involves tiered storage, where older, less-frequently accessed data is moved to cheaper object storage like Amazon S3 using PostgreSQL’s foreign data wrapper capabilities to keep it accessible.

Scaling to the next level often requires new strategies. If Figma’s philosophy holds, those strategies will be adopted incrementally, introducing complexity only when needed.

Disclaimer: This article draws technical inspiration from various publicly available Figma engineering blogs. While original in its narrative and interpretation, certain architectural insights and problem framings are adapted or inferred based on content shared by Figma’s engineering team.

Wrapping up

Figma’s scaling strategy reflects an engineering vision rooted in discipline and pragmatism. By methodically evolving a monolithic PostgreSQL database, Figma achieved massive scale while protecting operational simplicity. Their architecture demonstrates that high performance and architectural simplicity can reinforce each other.

As the pressure to scale quickly pushes more teams toward distributed systems by default, Figma’s model of disciplined, incremental scaling is a good example. It shows how to build infrastructure that delivers high performance and reliability while remaining operationally sustainable and cost-effective over time.

Figma’s journey with PostgreSQL shows what thoughtful, pragmatic System Design looks like in practice.


Written By:
Fahim ul Haq