Design Considerations for Chat Application Mobile System
Explore critical design considerations for mobile chat applications including architecture patterns like MVVM with coordinators, dependency injection, and communication using WebSockets and REST APIs. Learn how to manage message delivery states, user presence, typing indicators, and secure multi-device synchronization. Understand strategies to handle offline usage, background execution, and ensure real-time responsiveness despite mobile platform constraints.
With the functional requirements and constraints laid out in the first lesson, it’s now time to confront the critical design considerations that shape how a mobile chat application behaves, scales, and adapts to real-world usage. Unlike a read-heavy experience like a newsfeed, chat applications demand real-time communication, precise state synchronization, and intuitive conversational flows. These demands require thoughtful decisions at every layer, from protocol selection and UI responsiveness to how the system handles offline usage and background constraints.
This lesson focuses on the foundational decisions that will determine the robustness, responsiveness, and maintainability of your mobile chat system. We’ll revisit key architectural choices, such as the use of coordinators, dependency injection, and MVVM, to evaluate whether they still hold up in the context of chat and, if so, how they must evolve. We’ll also address communication architecture, message delivery mechanics, state synchronization, and interaction patterns like presence and typing indicators.
Before exploring chat-specific mechanics like message streams and real-time sync, it’s important to revisit foundational architectural decisions that can serve as the backbone of the entire application.
Let’s start with a mobile architecture pattern.
Architecture pattern
The choice of architectural patterns becomes critical to maintaining clarity, extensibility, and real-time responsiveness. Messaging workflows span multiple screens, background states, and asynchronous event streams. Without a disciplined structure, the system risks becoming brittle and difficult to reason about as it scales.
We adopt an MVVM-C (Model-View-ViewModel with coordinators) pattern coupled with dependency injection (DI) for a chat application. This pairing offers a powerful combination of modularity, separation of concerns, and flexibility across both UI and data layers.
Let’s understand the logic behind the decision.
MVVM (Model-View-ViewModel): In a chat application, message lists are updated in real time, sometimes multiple times per second. This includes new messages, status indicators, reactions, edits, and typing states. By isolating presentation logic in ViewModels, we ensure that UI components (Views) remain simple and declarative, focused purely on rendering the state exposed by the ViewModel. This not only avoids bloated view controllers, but also enables easier unit testing for scenarios like optimistic sends or failed deliveries.
Coordinators for navigation: Chat applications involve a surprisingly rich set of navigation flows: opening a thread from a push notification, launching media previews, transitioning to profile views, or returning to the inbox after archiving a conversation. Embedding this logic in view controllers leads to tangled code. Coordinators encapsulate these flows, allowing navigation to be declared explicitly and in a testable manner. They also help manage edge cases, like dismissing a media overlay when the underlying conversation is archived remotely.
Dependency injection (DI): Unlike stateless browsing features, chat sessions often involve persistent resources like WebSocket clients, media uploaders, or encryption layers that should exist only during the lifetime of a conversation. Using DI containers allows us to:
Construct per-session service graphs dynamically.
Easily swap or mock dependencies (e.g., during testing or A/B experiments).
Isolate failure domains: a bug in media uploading doesn’t compromise core messaging.
This approach also improves memory efficiency and life cycle management, since unused services are deallocated once the conversation ends.
As the application grows, new engineers or new features like polls or replies can be onboarded into the architecture without requiring deep rewrites. A well-structured MVVM-C + DI setup reduces ramp-up time by making responsibilities discoverable, components independently testable, and behaviors more predictable.
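The MVVM + DI pairing described above can be illustrated with a minimal sketch. The names here (`ChatViewModel`, `Transport`) are hypothetical, not a real SDK; the point is that the ViewModel owns presentation state (including optimistic message status) while the transport is injected, so it can be swapped or mocked in tests.

```typescript
type MessageStatus = "pending" | "sent" | "failed";

interface Message { id: string; text: string; status: MessageStatus; }

// Dependencies are declared as interfaces so they can be injected,
// swapped for A/B experiments, or mocked in unit tests.
interface Transport { send(text: string): Promise<void>; }

// The ViewModel isolates presentation logic; the View only renders `messages`.
class ChatViewModel {
  messages: Message[] = [];
  private nextId = 0;

  constructor(private transport: Transport) {}

  // Optimistic send: the message appears immediately as "pending",
  // then transitions to "sent" or "failed" when the transport resolves.
  async send(text: string): Promise<void> {
    const msg: Message = { id: `local-${this.nextId++}`, text, status: "pending" };
    this.messages.push(msg);
    try {
      await this.transport.send(text);
      msg.status = "sent";
    } catch {
      msg.status = "failed";
    }
  }
}

// A fake transport injected for testing — no real socket required.
const fakeTransport: Transport = { send: async () => {} };
const vm = new ChatViewModel(fakeTransport);
```

Because the transport is injected rather than constructed inside the ViewModel, scenarios like failed deliveries can be unit tested by injecting a transport that rejects.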
Our chat screen uses optimistic UI to show messages immediately after hitting send. Where should the message state (sending, sent, failed) be managed: the ViewModel or a sync service?
Up next, we’ll look at how this architecture interacts with the core of messaging, choosing the right architectural style, communication protocol, and data format for real-time delivery.
Client-server communication considerations
The real-time nature of mobile chat systems imposes a different set of constraints than many other mobile interactions. Message delivery, typing indicators, read receipts, and presence updates all occur under unpredictable network conditions and tight responsiveness expectations. This makes the architecture of the communication layer one of the most decisive factors in achieving a smooth and dependable user experience.
We’ll design the chat system around a REST architectural style and WebSockets. It will communicate over HTTP/2, use UTF-8 encoded JSON and binary formats, and employ a hybrid push-pull data-fetching strategy.
Each of these decisions is made with specific constraints and affordances of mobile chat systems in mind, as discussed below.
API architectural style: While WebSockets offer persistent, low-latency, bidirectional communication, ideal for delivering chat messages, typing indicators, and presence, they do not replace REST APIs. Mobile operating systems can suspend or terminate WebSocket connections when the app is in the background, or under heavy power optimization. This makes WebSockets reliable only during active foreground sessions. Therefore:
WebSockets (WS) are used when the app is active, enabling real-time message flow and presence updates.
REST APIs handle control flows like message history pagination, media metadata exchange, user profile updates, and background-compatible sync tasks.
Communication protocols: For all REST interactions, the system communicates over HTTP/2. This protocol brings clear advantages to mobile environments:
Supports multiple concurrent streams over one connection.
Reduces latency through header compression.
Enables better resource prioritization (e.g., fetching conversations while syncing receipts).
Although HTTP/3 improves further on these capabilities, its limited adoption across mobile networks and libraries makes HTTP/2 the pragmatic choice for compatibility and stability.
Data formats: Mobile platforms benefit from predictable and efficient parsing. For this reason:
UTF-8-encoded JSON is used for all text-based and control-layer payloads, including chat messages, typing indicators, and user metadata. JSON is lightweight, readable, and directly supported by Android and iOS toolkits.
Binary formats for handling media uploads and downloads (images, audio, videos). These are transmitted using pre-signed URLs or multipart HTTP forms. Binary encoding reduces file sizes, speeds up transfer, and minimizes memory usage during transmission.
Unlike server environments, mobile clients do not use binary for messaging payloads due to debugging complexity and the overhead of custom serialization.
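The WebSocket/REST split above amounts to a routing decision per request. A minimal sketch of that decision follows; the request kinds and function name are illustrative assumptions, not part of any real API.

```typescript
type AppState = "foreground" | "background";
type Channel = "websocket" | "rest";

interface ChatRequest { kind: "message" | "typing" | "presence" | "history" | "profile"; }

function chooseChannel(req: ChatRequest, appState: AppState, socketOpen: boolean): Channel {
  const realtime = req.kind === "message" || req.kind === "typing" || req.kind === "presence";
  // Real-time events ride the socket, which is only trustworthy while
  // the app is in the foreground and the connection is open.
  if (realtime && appState === "foreground" && socketOpen) return "websocket";
  // Control flows (history pagination, profile updates) and anything
  // attempted while backgrounded fall back to REST over HTTP/2.
  return "rest";
}
```

Centralizing this choice in one function keeps the fallback behavior consistent: when the OS suspends the socket, every real-time send degrades to the same REST path rather than failing silently.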
Data fetching patterns: For data fetching in a chat application, we’ll use a mix of pull and push patterns.
Pull model: The client actively requests data (like chat history or media) from the server when needed. We’ll use polling to fetch and modify data such as chat history, media downloads, user authentication, etc.
Push model: The server sends data (like new messages or typing indicators) to the client in real time without being asked.
We’ll use WebSockets for real-time chat, and notifications will be pushed through it when the connection is maintained. When the app is not in the foreground, we’ll rely on platform-specific push notifications (via APNs for iOS or FCM for Android) to serve as the primary delivery mechanism.
The following table summarizes what we have decided for different aspects of API design and data:
| Aspect | Decision |
| --- | --- |
| Architecture style | REST |
| Communication protocol | HTTP/2 |
| Data format | UTF-8 encoded JSON, binary |
| Data fetching pattern | Pull (polling), push (WebSockets), push notifications (APNs or FCM) |
Our chat app experiences inconsistent message delivery due to flaky networks. Should we fall back to HTTP polling or introduce WebSocket reconnection and buffering logic?
With a resilient communication foundation in place, we’re ready to tackle the next critical layer of chat architecture: how messages are reliably queued, delivered, and synchronized across client and server states.
Message delivery and queuing semantics
Delivering a message in a chat application might seem like a simple action on the surface, but beneath that tap of a “send” button lies a coordinated sequence of state transitions, reliability guarantees, retries, and acknowledgments. Mobile platforms further complicate this with background limitations, connection drops, and asynchronous state recovery.
To ensure that messages are reliably transmitted and consistently reflected across devices, the chat system must manage a precise message delivery life cycle, from the moment they are typed to the moment they are confirmed as read.
We adopt a queue-backed, stateful delivery model on the client. This is supported by acknowledgment-based synchronization with the server, with clear transitions across these states:
pending → sent → delivered → read.
The flow of the message delivery is as follows:
When a sender initiates a message, it is first added to their local queue with a “pending” status. The message is then dispatched to the connected chat server, which responds with an acknowledgment, updating the status to “sent.”
From there, the message is forwarded to the receiver’s server. If the recipient is online, the message is delivered directly to their local queue, triggering a “delivered” acknowledgment back to the sender. If offline, the server holds the message and issues a push notification. Upon reconnect, the message is delivered and acknowledged.
Each stage includes retry logic, ensuring reliability even under poor connectivity. Read receipts follow the reverse path: once the receiver views the message, the acknowledgment flows back to the sender, marking it as “read.”
This architecture balances real-time responsiveness with offline tolerance, ensuring users always experience timely, accurate message state transitions, as illustrated below:
Note: We use an exponential backoff retry mechanism, a retry strategy where each failed attempt waits progressively longer before the next retry, typically doubling the delay each time (e.g., 1s, 2s, 4s, 8s). This helps reduce network load and prevents overwhelming the server during outages or instability.
As shown in the message delivery flow, both sender and receiver use local queues, which are backed by local persistent storage (e.g., SQLite, Core Data, Room, or local file-based DBs). This local-first design ensures messages are retained, displayed, and retried automatically, allowing chat to continue seamlessly even when offline.
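The retry behavior described in the note above can be sketched as follows. The function names are illustrative; the delay schedule (1s, 2s, 4s, ...) matches the exponential backoff described, with an assumed cap so waits stay bounded.

```typescript
type DeliveryState = "pending" | "sent" | "delivered" | "read";

// Each failed attempt waits twice as long as the last: 1s, 2s, 4s, 8s, ...
// capped (an assumption here) so delays never grow unbounded.
function backoffDelayMs(attempt: number, baseMs = 1000, capMs = 30000): number {
  return Math.min(baseMs * 2 ** attempt, capMs);
}

async function sendWithRetry(
  dispatch: () => Promise<void>,          // pushes the message to the chat server
  maxAttempts = 5,
  sleep: (ms: number) => Promise<void> = (ms) => new Promise((r) => setTimeout(r, ms))
): Promise<DeliveryState> {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    try {
      await dispatch();                   // server ack moves the message to "sent"
      return "sent";
    } catch {
      if (attempt < maxAttempts - 1) await sleep(backoffDelayMs(attempt));
    }
  }
  return "pending";                       // still in the local queue; retried on reconnect
}
```

Leaving the message as "pending" after exhausting retries, rather than dropping it, is what lets the local-first queue resume delivery automatically when connectivity returns.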
We receive a “read” receipt for a message before receiving its “delivered” status. How should the UI respond?
Features like user presence and typing indicators make chat feel instant and engaging, but implementing them effectively on mobile requires careful handling of constraints like limited background execution, network variability, and battery consumption. To maintain responsiveness without sacrificing efficiency, the system must use real-time events and scoped updates that align with the realities of mobile platforms. Let’s explore it.
User statuses in a chat application
In a mobile chat system, user presence (online, offline, or idle) helps create a sense of real-time connection. However, accurately reflecting presence on mobile devices requires careful orchestration, as apps are frequently backgrounded, suspended, or terminated by the OS. The presence system must be efficient, resilient, and lightweight, especially when supporting large numbers of concurrent users.
Here’s how presence tracking is handled in a mobile context.
Establishing a presence connection: When a user opens the app, it establishes a WebSocket connection that communicates with the presence service. This service keeps track of active sessions using a fast, distributed in-memory store (e.g., Redis). On mobile, presence is only considered valid while the app is in the foreground and the socket is active.
Updating presence status: When the user is active (e.g., interacting with the app), the client periodically sends heartbeat signals through the WebSocket to confirm availability. These updates are relayed to other clients to update presence indicators (e.g., online dot, “last seen” timestamp).
Handling disconnections and idle transitions: If the app is backgrounded, closed, or the OS suspends it, the WebSocket disconnects, and the presence service marks the user as offline or idle. Some mobile chat systems also support idle detection (e.g., “away” after a period of inactivity). The system uses last heartbeat timestamps to infer state changes and updates presence accordingly.
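Inferring state from the last heartbeat timestamp, as described above, reduces to a simple threshold check. The specific thresholds here (idle after 60s, offline after 5 minutes) are illustrative assumptions, not values prescribed by the design.

```typescript
type Presence = "online" | "idle" | "offline";

// Assumed thresholds: tune these against heartbeat frequency and battery budget.
const IDLE_AFTER_MS = 60_000;      // no heartbeat for 1 min → "away"
const OFFLINE_AFTER_MS = 300_000;  // no heartbeat for 5 min → offline

function inferPresence(lastHeartbeatMs: number, nowMs: number): Presence {
  const elapsed = nowMs - lastHeartbeatMs;
  if (elapsed >= OFFLINE_AFTER_MS) return "offline";
  if (elapsed >= IDLE_AFTER_MS) return "idle";
  return "online";
}
```

Because the presence service derives state from timestamps rather than explicit "I'm leaving" events, it degrades gracefully when the OS kills the app without warning: the heartbeat simply stops arriving.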
Typing indicators add an extra layer of real-time interaction by letting users see when the other person is composing a message. However, they must be lightweight and optimized to avoid excessive network traffic.
When a user starts typing, the message input component triggers an event to the typing indicator service, notifying the recipient. To prevent unnecessary updates, this information is sent via WebSocket only if the user has typed over a short threshold (e.g., 1 second).
The recipient’s frontend listens for typing events and updates the UI dynamically. If the sender stops typing for a few seconds, the typing event expires, and the UI removes the indicator.
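The throttle-and-expire behavior above can be sketched as a small helper. The 1-second send threshold and 3-second expiry mirror the description; the class and method names are illustrative assumptions.

```typescript
class TypingIndicator {
  private lastSentMs = -Infinity;

  constructor(
    private emit: () => void,        // sends a typing event over the WebSocket
    private throttleMs = 1000        // assumed threshold from the text (~1s)
  ) {}

  // Called on every keystroke; emits at most once per throttle window
  // to avoid flooding the socket with per-character events.
  onKeystroke(nowMs: number): void {
    if (nowMs - this.lastSentMs >= this.throttleMs) {
      this.emit();
      this.lastSentMs = nowMs;
    }
  }

  // The recipient hides the indicator once events stop arriving,
  // rather than waiting for an explicit "stopped typing" message.
  static isVisible(lastEventMs: number, nowMs: number, expiryMs = 3000): boolean {
    return nowMs - lastEventMs < expiryMs;
  }
}
```

Expiry on the recipient side is the key resilience trick: if the sender's app is killed mid-typing, the indicator still disappears on its own.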
Should we store participant metadata inside the conversation document or as a separate relational reference?
Data synchronization strategy
In a mobile chat system, keeping data accurate and up-to-date across sessions, devices, and network conditions is a persistent challenge. Messages might arrive while the app is backgrounded, users may switch devices, or network availability may fluctuate. The system must maintain synchronization guarantees without sacrificing performance or user experience.
To address this, we use a hybrid synchronization model, combining real-time updates with background-aware syncing strategies. Incremental synchronization will be our default approach here, and the sync scenarios outlined below reflect its application.
App resume or reconnect: When the user opens the app or returns from the background, a delta sync is triggered to fetch updates since the last known timestamp or message ID. The sync covers new messages, updated delivery/read statuses, and metadata changes (e.g., thread renames, role changes).
Real-time updates via WebSocket: During active sessions, message delivery, typing indicators, and presence are streamed over WebSocket. The local state is updated immediately and can be persisted for offline use. Moreover, no polling is required, and the socket keeps the UI in sync with minimal delay.
Background sync and silent push: When the app is suspended, updates are paused to conserve resources. If supported (e.g., silent push on iOS, FCM on Android), a background fetch may be triggered to pre-warm the state. On resume, the sync completes and updates are displayed immediately.
Note: Messages deleted or edited on another device need to be reflected accurately. We implement tombstoning, retaining the deleted message as a placeholder with metadata (e.g., “This message was deleted”) until the next sync.
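Putting the delta sync and tombstoning together: the client keeps a checkpoint (here a sequence number, an assumed field name), asks the server for everything after it, and applies deletions as placeholders rather than removals. This is a sketch, not a definitive wire format.

```typescript
interface SyncMessage { seq: number; id: string; text: string; deleted?: boolean; }

// Applies a server-provided delta to the local store and returns the new
// checkpoint to persist for the next sync request.
function applyDelta(local: Map<string, SyncMessage>, delta: SyncMessage[]): number {
  let maxSeq = 0;
  for (const msg of delta) {
    if (msg.deleted) {
      // Tombstone: retain a placeholder so the UI can render
      // "This message was deleted" instead of silently dropping it.
      local.set(msg.id, { ...msg, text: "This message was deleted" });
    } else {
      local.set(msg.id, msg);
    }
    maxSeq = Math.max(maxSeq, msg.seq);
  }
  return maxSeq;
}
```

Persisting `maxSeq` only after the delta is fully written to local storage keeps the sync idempotent: a crash mid-apply just means the same delta is fetched again.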
Multi-device data synchronization
In chat applications where privacy is a foundational requirement, especially those supporting end-to-end encryption (E2EE), messages and media are not stored in plaintext on the server. This design introduces a complex challenge: how to synchronize chats across a user’s devices without compromising security or violating the system’s privacy guarantees.
Even though the server cannot read or persist chat content, it still plays an essential role in relaying encrypted data between devices. The architecture must ensure secure, selective visibility while maintaining delivery reliability, multi-device consistency, and support for rich media.
When a message is sent, it is encrypted locally and sent as ciphertext to the server. The server, acting purely as a passive relay, forwards the encrypted payload to the user’s other devices, each of which decrypts it using its own private key. This allows multi-device sync without giving the server access to message contents.
Media sharing follows a similar path. Files are encrypted on the sender’s device and uploaded securely. A separate encryption key is shared with the recipient’s devices through the encrypted message metadata. Only authorized devices can decrypt and access the content.
To support this, the system uses device-level key management, ensuring only trusted devices, those that have completed a secure key handshake, can access the message history. The server facilitates delivery and retry logic, but never stores or reads the actual content.
Background synchronization
As discussed earlier, mobile chat applications must deliver timely messages, even when the app is closed, suspended, or in the background. Since persistent connections like WebSockets are often paused by the operating system, especially on iOS, the system must rely on push notifications and background sync strategies to keep the user informed and the chat state current.
Push notifications are triggered by the backend; here, we’ll focus on the silent push that wakes the app in the background.
The silent push is sent without user-visible UI; it wakes the app in the background (if permitted) to sync new messages, update unread counts, or preload content. This is especially useful for reducing load time when the user next opens the app. When a silent push is received (or the app is resumed manually), a background task is triggered:
The app queries the server for delta updates (new messages, status changes, etc.).
Updates are written to local storage.
The UI is refreshed once the app returns to the foreground.
This ensures minimal delay between opening the app and seeing the latest state. Moreover, without push and background orchestration, users would only receive messages when actively using the app, breaking the real-time expectation.
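The three-step background task above can be sketched as a single handler. All interfaces here (`LocalStore`, the fetch and refresh callbacks) are illustrative assumptions about how the pieces connect, not a platform API.

```typescript
interface LocalStore {
  write(updates: string[]): void;  // persists delta updates to local storage
  unread: number;                  // unread count surfaced on the app badge
}

async function onSilentPush(
  fetchDeltas: () => Promise<string[]>,  // step 1: query the server for deltas
  store: LocalStore,
  isForeground: () => boolean,
  refreshUI: () => void
): Promise<void> {
  const updates = await fetchDeltas();
  store.write(updates);                  // step 2: persist before touching the UI
  store.unread += updates.length;
  if (isForeground()) refreshUI();       // step 3: refresh only when visible
}
```

Writing to storage before any UI work matters: the OS may suspend the app again at any moment, and persisted deltas survive even if the refresh never runs.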
Our app relies on silent push to sync unread messages while in the background, but some users report inconsistent updates. What could be going wrong, and how should we handle it?
Let’s now summarize our key findings and decisions on design considerations for a chat app in the following table:
| Section | Focus | Key Decisions or Findings |
| --- | --- | --- |
| Architecture pattern | Structural foundation for modular chat UI | MVVM-C with dependency injection for modularity, testability, and per-session service graphs |
| Communication architecture | Real-time messaging protocol and API design | WebSockets for foreground real-time traffic; REST over HTTP/2 for control flows; UTF-8 JSON for text, binary for media |
| Message delivery and queuing | Reliable message state transitions and retries | Queue-backed, stateful delivery (pending → sent → delivered → read) with acknowledgments and exponential backoff retries |
| Local-first state | Offline support and optimistic UX | Local persistent queues (e.g., SQLite, Core Data, Room) retain, display, and retry messages offline |
| Typing and presence | Lightweight real-time interactivity | Heartbeats over WebSocket for presence; throttled typing events that expire on inactivity |
| Data synchronization | Keeping chat state fresh across sessions | Incremental (delta) sync on resume or reconnect; tombstoning for deletes and edits |
| Multi-device sync | Consistent state across all logged-in devices | Server relays E2EE ciphertext only; device-level key management governs access |
| Background sync | Ensuring timely delivery when the app is inactive | Silent push triggers delta fetch; APNs/FCM as primary delivery when the app is backgrounded |
Test your knowledge
Your mobile chat app supports end-to-end encryption and multi-device login. Due to increasing storage usage on user devices and complaints about slow sync after login, your team is considering implementing selective message sync. This is where only part of a user’s message history is fetched when logging in from a new device or reinstalling the app.
Say “Hi” to Ed (the AI persona) in the widget below to test your knowledge of design considerations for a chat application’s mobile System Design, using the above scenario.
If you’re unsure how to do this, click the “Want to know the correct answer?” button.
Conclusion
What makes a chat app feel effortless is the invisible orchestration underneath: messages arriving at the right time, indicators updating in real time, unread counts syncing across devices. Behind that smooth experience lies a system that’s constantly negotiating with unreliable networks, limited power, and tight OS restrictions.
Designing for mobile chat means designing with constraints as a core feature. You don’t always control the connection, the life cycle, or even the device, but you must still deliver clarity, speed, and trust. That’s the real challenge and the real craft, and the design considerations we made here pave the path for designing such a system.
In the next lesson, we’ll explore how these decisions shape the mobile System Design and API design that hold everything together. As we move on to the next lesson, it’s important to remember that a small oversight here can lead to unreliable behaviors or hard-to-scale systems later.