Introduction to the Pub-Sub Service
Explore the fundamentals of pub-sub services and their role in asynchronous communication among microservices. Understand the challenges of the request-response model, the basics of event-driven architectures, and the components of pub-sub middleware. This lesson helps you grasp how pub-sub improves scalability, enables selective event pushing, and reduces client-server coupling in modern product architectures.
Motivation
Modern systems are designed following the microservices architecture, in which multiple services interact to perform a specific task. A microservices architecture needs interconnectivity and interoperability between its subsystems. Communication between the subsystems can be performed either synchronously or asynchronously.
Asynchronous communication benefits us the most in a microservices architecture because services generally do not need to wait for a response. However, a decoupling layer between services is required to enable them to communicate and perform tasks asynchronously.
In the following section, we’ll answer some important questions, such as what happens if the services depend on each other for data processing and how to make communication possible between them. Let's start by highlighting challenges in a request-response architecture that can lead to opting for event-driven architectures.
Request-response model
Most client-server communication follows the request-response model, using HTTP with synchronous communication. Once the client initiates a request, the server responds with content after performing all the required operations. Let's look at YouTube's video upload flow to understand the request-response model.
The client initiates a request to upload the video file. The API gateway forwards the request to the upload service, which routes it to other services, such as the processing and notification services. Throughout this procedure, the client waits for a success or failure status until the entire process completes; at best, it can work on something else while periodically checking whether the response has arrived.
In case of successful execution or failure, the response is sent back via the previous service in the path toward the client, as shown below:
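This synchronous chain can be sketched as plain function calls, where the client is blocked until every downstream step finishes. The service names below are illustrative stand-ins, not a real API:

```python
# Synchronous request-response: the client blocks until every
# downstream service has finished (or one of them fails).

def notification_service(video_id):
    return {"video_id": video_id, "notified": True}

def processing_service(video_id):
    # Processing must finish before notification starts.
    return notification_service(video_id)

def upload_service(video_id):
    # The upload service routes the request onward and waits.
    return processing_service(video_id)

def client_upload(video_id):
    # The client is blocked here until the whole chain completes.
    return upload_service(video_id)

print(client_upload("vid-42"))  # the client waited for every step
```

If any function in the chain raises an error, the failure propagates all the way back to the waiting client, which is exactly the coupling problem discussed next.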
Challenges
An application with request-response architecture becomes challenging in the following scenarios:
Consider a situation where the upload and processing services complete their task successfully, but the notification service fails. In the request-response model, the following two problems will arise:
The awaiting client will receive a failure response, requiring it to retry the task, which increases response time.
While the processing and upload services successfully update their local databases after executing the user's request, the overall request fails because of the notification service. This failure requires reverting the updates on the processing and upload services' databases. Rolling back these operations puts too much overhead on the connected services.
Another drawback of the request-response model is the cost of adding new services. To communicate properly, every new service must be embedded and integrated with all the coordinating services. As the services' complexities and responsibilities grow, integrating and scaling a request-response architecture involves too much overhead.
Similarly, the request-response model is inefficient when the server wants to inform the client about events occurring in the backend, irrespective of the client’s request.
In such cases, event-driven architecture is beneficial, which we’ll discuss in the following section.
Event-driven architecture
Event-driven architecture (EDA) decouples microservices and centers on the events exchanged between clients and services. In the request-response model, updates from the server come only when the client requests them; in EDA, the client places a request once by subscribing to an event that may happen over time, and the server sends a notification whenever it does.
EDA enables asynchronous communication by introducing an event queue between the microservices, decoupling the producer services from the consumer services. All consumers connect to this queue and wait for events of a specific type.
In the illustration, service A triggers the event and adds it to the event queue. All other connected services, such as services B, C, and D, can consume the event generated by service A if they choose to.
One of the advantages is that scaling the system by integrating other services in EDA is easier than in the request-response model. The new service connects to the event queue and can consume the occurring events. Similarly, the updates are broadcast automatically to the consumers whenever an event occurs instead of when explicitly requested.
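The broadcast behavior of a plain event queue can be sketched as follows. This is a toy, in-process model (the class and names are illustrative), showing that every connected consumer receives every published event:

```python
# A toy event queue: one service produces events; all connected
# consumers (here B, C, and D) see every broadcast event.

class EventQueue:
    def __init__(self):
        self.consumers = []

    def connect(self, consumer):
        # Scaling is easy: a new service just connects here.
        self.consumers.append(consumer)

    def publish(self, event):
        # Broadcast: every connected consumer receives the event.
        for consumer in self.consumers:
            consumer(event)

received = {"B": [], "C": [], "D": []}
queue = EventQueue()
for name in received:
    queue.connect(lambda event, name=name: received[name].append(event))

queue.publish("video_uploaded")  # produced by service A
```

Each consumer then decides on its own whether the event is relevant, which is the inefficiency that pub-sub addresses with selective pushing.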
The drawback of EDA that leads us to design the pub-sub service is the lack of selective event pushing. In plain EDA, events are broadcast to all potential listeners. In contrast, pub-sub (an updated implementation of the EDA model, which we'll discuss in subsequent sections) pushes events only to selected destinations.
Pub-sub model
The pub-sub (which stands for publisher-subscriber) model is the implementation of EDA and consists of the following components:
Publishers: These are the content producers that trigger an event sent to the middleware.
Middleware: This is a middleman that decouples publishers and subscribers. It serves as a content distributor that takes the specified content from the publisher and pushes it toward the subscriber. It has many names, like a broker, hub, message queue, etc.
Subscribers: These are the consumers who receive updates for the topics they have subscribed to. Whenever an event related to a specific topic is triggered, the middleware pushes it to that topic's subscribers.
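The three components above can be sketched as a minimal in-process broker. This is an illustrative model, not a real pub-sub product: publishers call `publish`, subscribers register callbacks per topic, and the middleware pushes each event only to that topic's subscribers:

```python
from collections import defaultdict

# Toy pub-sub middleware: subscribers register for specific topics,
# and the broker pushes each event only to that topic's subscribers
# (selective pushing, unlike a plain broadcast queue).

class Broker:
    def __init__(self):
        self.subscriptions = defaultdict(list)

    def subscribe(self, topic, callback):
        self.subscriptions[topic].append(callback)

    def publish(self, topic, event):
        for callback in self.subscriptions[topic]:
            callback(event)

broker = Broker()
inbox = []
broker.subscribe("video.formatted", inbox.append)

broker.publish("video.uploaded", {"id": 1})   # no subscribers for this topic
broker.publish("video.formatted", {"id": 1})  # pushed to the subscriber
```

Note that the subscriber never polls: the broker calls it when a matching event arrives.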
The workings of a video upload service in a pub-sub EDA are shown in the following illustration:
Let's explain each step involved in the illustration above:
The upload service triggers the event by submitting a video to the event queue.
The event queue responds with a success status to the upload service, which informs the client.
The event queue sends the event details to the processing service because it has subscribed to this event.
After processing the video, the processing service generates events related to the compressed and formatted video and submits them to the event queue.
The notification service has subscribed to the event related to the formatted video. As soon as the queue receives the event, it pushes it to the notification service.
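The five steps above form a chain of topic subscriptions, which can be sketched as follows. The topic names (`video.uploaded`, `video.formatted`) are illustrative, and the event queue is reduced to a dictionary of handlers:

```python
from collections import defaultdict

# Sketch of the upload pipeline as a chain of topic subscriptions.

subscriptions = defaultdict(list)
log = []

def publish(topic, event):
    for handler in subscriptions[topic]:
        handler(event)

def processing_service(event):
    log.append("processing " + event)
    publish("video.formatted", event)       # step 4: emit a new event

def notification_service(event):
    log.append("notifying about " + event)  # step 5: push notification

subscriptions["video.uploaded"].append(processing_service)    # step 3
subscriptions["video.formatted"].append(notification_service)

publish("video.uploaded", "vid-42")  # steps 1-2: the upload service
```

The upload service gets its success status as soon as the event is queued (step 2); the rest of the pipeline runs without the client waiting on it.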
Let's discuss the inner structure of the pub-sub next.
The inner structure of the pub-sub
Several design issues regarding middleware can arise in the pub-sub model. The important ones are listed below:
How does the middleware manage multiple events?
How are events and subscribers prioritized?
How are the intended subscribers of an event identified?
This section focuses on answering the questions above by exploring the inner structure of a pub-sub system. Let's start with the illustration below:
Let's break down each component of the middleware:
Cluster: This is the middleware in a pub-sub service that couples all the components required for event-driven communication between publishers and subscribers.
Topics: A topic is a category of events, with subcategories related to a specific activity.
Brokers: These are the event managers. They manage events according to the topic the events belong to. A cluster contains replicas of brokers for parallel communication and to act as a backup for the events. (By parallel communication, we mean events of topic A are sent from broker 1 and events of topic B are sent from broker 2 simultaneously.) Each replica also acts as a leader for a specific topic and is responsible for pushing notifications for its respective topic. (For example, broker 1 is the leader for topic A, and broker 2 is the leader for topic B; events are sent by the leader of the relevant topic.)
Partitions: The partitions (P0 and P1 above) represent subcategories of a topic. The purpose of the partitions is to filter and save the events based on the events' subcategories, subscribers, or priorities.
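One way to picture the topic-leadership and partition layout is the sketch below. The broker and partition names mirror the description (broker 1 leads topic A, partitions P0/P1); the routing function is an illustrative assumption, not a real broker implementation:

```python
# Each broker leads one topic; a topic is split into partitions
# (P0, P1) that group events by subcategory or priority.

cluster = {
    "broker1": {"leads": "topicA", "partitions": {"P0": [], "P1": []}},
    "broker2": {"leads": "topicB", "partitions": {"P0": [], "P1": []}},
}

def route(cluster, topic, partition, event):
    # Find the broker that leads this topic and append the event
    # to the requested partition.
    for broker in cluster.values():
        if broker["leads"] == topic:
            broker["partitions"][partition].append(event)
            return broker
    raise KeyError("no leader for " + topic)

route(cluster, "topicA", "P0", "event-1")  # handled by broker 1
route(cluster, "topicB", "P1", "event-2")  # handled by broker 2 in parallel
```

Because each topic has its own leader, the two `route` calls above could be served by different brokers simultaneously, which is the parallel communication described in the text.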
Note: In the subscriber's table in the illustration above, it’s shown that user 1 has subscribed to topic A and lies under partition P0.
Filtering: This means separating the events based on the category they belong to (topic A or topic B), the priority of the events, and the notification type. (By type, we mean messages, emails, or push notifications.) Priority-based filtering may take into account the precedence of the subscribers who want to receive notifications. Similarly, the middleware filters out subscribers who should not receive a notification because the maximum notification limit has been reached, because of configuration preferences, or because of lesser affinity (subscribers who have not been active for a certain period of time).
Storage: A large storage capacity best serves the middleware because many events can be stored simultaneously. It lets the service persist topics so they can be fetched later when a subscriber explicitly requests them. The storage also records which subscriber has subscribed to which topic to receive notifications; this record is updated whenever a user subscribes to a specific topic.
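The subscriber-side filtering rules can be sketched as a simple predicate. The field names and thresholds below are illustrative assumptions, chosen only to mirror the three conditions in the text (notification limit, preferences, inactivity):

```python
# Sketch of subscriber filtering: drop subscribers who hit their
# notification limit, opted out in their preferences, or have been
# inactive too long. Thresholds are illustrative.

def should_notify(sub, limit=100, max_idle_days=30):
    if sub["sent_today"] >= limit:
        return False  # maximum notification limit reached
    if not sub["prefers_push"]:
        return False  # configuration preference
    if sub["days_inactive"] > max_idle_days:
        return False  # lesser affinity: inactive for too long
    return True

subscribers = [
    {"name": "u1", "sent_today": 3,   "prefers_push": True, "days_inactive": 1},
    {"name": "u2", "sent_today": 100, "prefers_push": True, "days_inactive": 0},
    {"name": "u3", "sent_today": 0,   "prefers_push": True, "days_inactive": 90},
]

targets = [s["name"] for s in subscribers if should_notify(s)]  # ['u1']
```

In a real middleware, this filtering would run per topic partition before events are pushed out.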
The pub-sub service doesn’t allow users to poll for events because it’s based on a publish-subscribe model, where publishers send messages that are pushed to the subscribers. The pub-sub service is designed to be a highly scalable and efficient messaging system for real-time data communication. In contrast, polling can be a relatively inefficient way to consume events in a high-volume system, requiring the client to continually send requests to the server to check for new messages. This creates unnecessary network traffic and additional server load, which can become a performance bottleneck in large-scale systems. The retry mechanism for delivering events is robust enough that clients receive all the events without polling for them. However, a system where clients pull events might scale better than one where the service pushes messages to subscribers, because the service has to manage less state in the pull-based case.
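The pull-versus-push trade-off can be made concrete with a toy timeline. In the sketch below (purely illustrative), the broker has something new on only two of five ticks; polling pays for a round trip on every tick, while push pays only when an event exists:

```python
# Pull (polling): the client asks the broker for news on every
# tick, even when there is nothing new -- wasted round trips.

events = ["e1", None, None, "e2", None]  # broker's view over 5 ticks

poll_requests = 0
pulled = []
for tick in events:
    poll_requests += 1           # one request per tick, always
    if tick is not None:
        pulled.append(tick)

# Push: the broker contacts the subscriber only when an event exists.
pushed = []
for tick in events:
    if tick is not None:
        pushed.append(tick)      # no empty round trips

# pulled == pushed == ["e1", "e2"], but polling cost 5 requests
```

Both styles deliver the same events; the difference is who initiates the traffic and, as the text notes, how much delivery state the service must track.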
Event-driven vs. pub-sub
As we discussed, the pub-sub service follows EDA, but some differences exist because event-driven is an architecture and pub-sub is an updated and advanced implementation of EDA. The following table shows the key differences between the two:
EDA Compared to Pub-Sub Service
| EDA | Pub-sub |
| --- | --- |
| It’s an architecture | It’s an implementation of EDA |
| Events are handled in a single queue with a round-robin structure | Events are managed in different queues |
| Events are broadcast, and the potential listeners listen and act, or are just informed that a certain event has occurred | Events are pushed to the intended subscribers only |
| Because events are broadcast, it’s preferred for live streaming | It’s used for the delivery of content or data to intended subscribers |
| No filtration for priority or category | Filtration according to users and types of events |
| Events are temporarily stored until they are consumed | Events can be stored permanently in the pub-sub storage |
| Faster because no response is needed from event consumers | Slower because it expects an acknowledgment or response from the intended subscribers |
| No parallel broadcasting because events are streamed/broadcast through a single queue | Replication of brokers makes parallel communication possible and speeds up message delivery |
The primary purpose of replication is to guard against failure by acting as a backup for a service. Why, then, do we use replica brokers for parallel communication in the pub-sub service?
We have explored the detailed working of the pub-sub system. In the upcoming lessons, we aim to design an API for the pub-sub system after identifying its functional and non-functional requirements.