Introduction to Redis

Get an overview of Redis and its features, including core data structures, messaging solutions, and other advanced features.

Redis is an open-source, in-memory data structure store. It belongs to the NoSQL category of databases and falls under the key-value database umbrella. It has been one of the leading databases in this category, according to the DB-Engines ranking.

Since its initial release in 2009, its popularity and usage have grown significantly, and Redis has been rated the most-loved database by developers for five years in a row.

Source: Stack Overflow Developer Survey 2021

Redis core data types

Redis’ core data types include String, List, Hash, Set, and Sorted Set. Redis also has specialized features such as Redis Streams, Pub/Sub, geospatial indexes, and HyperLogLog. Although it’s an in-memory store, we can choose from a spectrum of persistence options.

If we need high availability and want to shard our data across multiple Redis servers, we can use Redis Cluster. Redis also has features more commonly found in relational databases, such as transactions (these aren’t the same as ACID transactions) and server-side scripts written in Lua. In addition to these core features, Redis modules let us extend Redis and implement custom data types. Many widely used Redis modules are already available, such as RediSearch and RedisJSON.

These features, coupled with rich data structures, make Redis really versatile. It can act as a high-performance in-memory cache, a message broker, or a streaming engine, and it can be used to solve a wide range of problems.

This course will cover a lot of Redis data structures. Here’s a high-level overview of the fundamental data types in Redis:

  • String: A basic Redis data type commonly used for caching, atomic counters, etc.

  • Hash: A Redis hash can store attribute-value pairs and is commonly used to model objects.

  • Set: A set can only contain unique elements, but doesn’t provide any ordering guarantees. It’s generally used to store data when duplicates can’t be tolerated, and it’s also used to represent relationships and execute operations such as union, intersection, etc.

  • Sorted Set: Each item in a sorted set has a name (the member) and an associated score. Like a set, it only allows unique members. Unlike a set, it provides ordering guarantees: members are ordered by score, and members with the same score are ordered lexicographically by name.

  • Geospatial index: This allows you to store and query latitude and longitude data (coordinates). This is very useful for use cases that need to search for locations within a specific area, for example, finding restaurants within a five-mile radius.

  • HyperLogLog: This is a probabilistic data structure. Its main use case is to estimate the number of unique elements. This sounds like a job for Set, but HyperLogLog is much more space-efficient for high data volumes (millions of elements) because it trades some accuracy for reduced storage.
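The ordering rule for sorted sets can be illustrated without a Redis server. The following is a minimal pure-Python sketch, not how Redis implements sorted sets internally; the function name `sorted_set_order` and the sample leaderboard are hypothetical. It orders members by score and breaks ties by comparing member names as bytes.

```python
from typing import Dict, List


def sorted_set_order(scores: Dict[str, float]) -> List[str]:
    """Return member names in sorted-set order (hypothetical helper).

    Members are ordered by score; members with equal scores are ordered
    lexicographically by name, compared as bytes.
    """
    return sorted(scores, key=lambda name: (scores[name], name.encode("utf-8")))


# A dict keeps one score per member, just as a sorted set keeps members unique.
leaderboard = {"carol": 250.0, "bob": 100.0, "alice": 100.0}
leaderboard["carol"] = 90.0  # re-adding an existing member only updates its score

print(sorted_set_order(leaderboard))  # ['carol', 'alice', 'bob']
```

Here "alice" and "bob" share a score, so the name decides their relative order, while "carol" moves to the front after the score update.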

Note: Bitmaps and bitfields are really interesting data types that are also supported by Redis.
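To make the HyperLogLog trade-off from the list above concrete, here is a simplified sketch of the general HyperLogLog idea in pure Python. It is not Redis’ implementation: the class name `TinyHyperLogLog`, the choice of SHA-1 as the hash, and the precision of 10 bits (1,024 registers) are all assumptions made for illustration. Each element is hashed, the first bits of the hash pick a register, and the register remembers the longest run of leading zeros seen, which is enough to estimate the number of distinct elements.

```python
import hashlib
import math


class TinyHyperLogLog:
    """Simplified HyperLogLog estimator (illustrative, not Redis' version)."""

    def __init__(self, precision: int = 10):
        self.p = precision
        self.m = 1 << precision          # number of registers
        self.registers = [0] * self.m

    def add(self, item: str) -> None:
        # Use the first 8 bytes of SHA-1 as a 64-bit hash value.
        h = int.from_bytes(hashlib.sha1(item.encode("utf-8")).digest()[:8], "big")
        index = h >> (64 - self.p)                  # top p bits pick a register
        rest = h & ((1 << (64 - self.p)) - 1)       # remaining bits
        # Rank = 1-based position of the leftmost 1-bit in the remaining bits.
        rank = (64 - self.p) - rest.bit_length() + 1
        self.registers[index] = max(self.registers[index], rank)

    def count(self) -> int:
        m = self.m
        alpha = 0.7213 / (1 + 1.079 / m)            # bias correction for m >= 128
        raw = alpha * m * m / sum(2.0 ** -r for r in self.registers)
        zeros = self.registers.count(0)
        if raw <= 2.5 * m and zeros:
            # Small cardinalities: fall back to linear counting.
            return round(m * math.log(m / zeros))
        return round(raw)


hll = TinyHyperLogLog()
for i in range(10_000):
    hll.add(f"user-{i}")
estimate = hll.count()

for i in range(10_000):
    hll.add(f"user-{i}")  # duplicates never raise a register, so the estimate is unchanged

print(estimate, hll.count())
```

The estimate is expected to land within a few percent of 10,000 while the structure holds only 1,024 small integers, whereas a set would have to store all 10,000 elements.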

Messaging with Redis

The data types in this section also belong to the core data structures introduced in the previous section. They’re grouped into their own “messaging” category because that’s their primary use case.

  • List: In addition to traditional operations like add, search, and delete, lists can be used to implement a producer-consumer pattern for reliable asynchronous job processing, also known as worker queues.

  • Pub/Sub: This is an implementation of the Publish/Subscribe messaging paradigm. Redis allows producers to send data to channels, which can then be received by one or more consumers (subscribers). It provides a high-performance message bus, but it’s ephemeral. In other words, messages are not persisted in Redis, so consumers that are offline when a message is published won’t receive it when they connect later.

  • Streams: Streams also support the producer-consumer pattern, but messages are retained even after they’re consumed and processed. This is a powerful feature, similar to that of an append-only log structure. It also provides fault tolerance along with the ability to traverse the stream in a flexible way—for example, processing data from the beginning of a stream, processing only new data, and processing data from a specific point in time.
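The difference between ephemeral Pub/Sub delivery and retained stream entries can be modeled in a few lines. This is a toy in-memory sketch, not Redis code: the classes `TinyPubSub` and `TinyStream` are hypothetical, and the integer entry IDs are a simplification (real stream IDs combine a millisecond timestamp and a sequence number).

```python
from collections import defaultdict
from typing import Dict, List, Tuple


class TinyPubSub:
    """Fire-and-forget delivery: only currently subscribed inboxes get a message."""

    def __init__(self) -> None:
        self.subscribers: Dict[str, List[List[str]]] = defaultdict(list)

    def subscribe(self, channel: str) -> List[str]:
        inbox: List[str] = []
        self.subscribers[channel].append(inbox)
        return inbox

    def publish(self, channel: str, message: str) -> int:
        for inbox in self.subscribers[channel]:
            inbox.append(message)
        return len(self.subscribers[channel])  # number of receivers


class TinyStream:
    """Append-only log: entries are kept after they have been read."""

    def __init__(self) -> None:
        self.entries: List[Tuple[int, str]] = []

    def add(self, message: str) -> int:
        entry_id = len(self.entries) + 1  # simplified ID, see note above
        self.entries.append((entry_id, message))
        return entry_id

    def read(self, after_id: int = 0) -> List[Tuple[int, str]]:
        return [entry for entry in self.entries if entry[0] > after_id]


bus = TinyPubSub()
receivers = bus.publish("orders", "order-1")  # nobody is subscribed yet: message is lost
inbox = bus.subscribe("orders")
bus.publish("orders", "order-2")
print(receivers, inbox)                       # 0 ['order-2']

stream = TinyStream()
stream.add("order-1")
stream.add("order-2")
print(stream.read())                          # a late reader still sees both entries
print(stream.read(after_id=1))                # or resumes after a known position
```

The subscriber that connects late never sees "order-1", while a stream reader can start from the beginning or from any earlier position.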

Core, cross-cutting concepts

In addition to the core data structures, certain features are common across Redis as a whole.

  • Redis transactions: A transaction allows us to execute a group of actions (Redis commands) in an isolated way. For example, while the commands in a transaction are being executed, requests from other clients are not served, so we can be sure that the data is only being accessed by a single client.

  • Pipelines: A pipeline can be used to execute multiple operations efficiently. In pipeline mode, the client sends a batch of commands without waiting for the response to each one. Redis queues the responses in memory and returns them after all the commands have been executed, which greatly reduces the number of network round trips.

  • Lua scripts: We can use Lua to write server-side scripts that Redis can execute (similar to stored procedures). This has a few advantages, including atomicity (similar to a transaction, other client requests are blocked while a Lua script executes), efficiency (data is processed where it’s present), and flexibility.
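Pipelining is easier to picture at the protocol level. Redis clients send each command as a RESP array of bulk strings, and a pipeline is several encoded commands written to the connection in one go. The sketch below only builds those bytes; sending them over a socket and parsing the replies is omitted, and the helper names `encode_command` and `encode_pipeline` are hypothetical. In practice, a client library does this work for us.

```python
from typing import Iterable, Tuple


def encode_command(*args: str) -> bytes:
    """Encode one command as a RESP array of bulk strings."""
    parts = [b"*%d\r\n" % len(args)]
    for arg in args:
        data = arg.encode("utf-8")
        parts.append(b"$%d\r\n%s\r\n" % (len(data), data))
    return b"".join(parts)


def encode_pipeline(commands: Iterable[Tuple[str, ...]]) -> bytes:
    """Concatenate several encoded commands into a single payload."""
    return b"".join(encode_command(*command) for command in commands)


single = encode_command("INCR", "visits")
print(single)  # b'*2\r\n$4\r\nINCR\r\n$6\r\nvisits\r\n'

# Three commands, one payload: the client writes once and then reads three replies.
payload = encode_pipeline([
    ("SET", "visits", "0"),
    ("INCR", "visits"),
    ("GET", "visits"),
])

# A transaction can be batched the same way by wrapping commands in MULTI and EXEC.
transaction = encode_pipeline([("MULTI",), ("INCR", "visits"), ("EXEC",)])
```

Without pipelining, each of the three commands would cost its own network round trip; with it, the cost is roughly one round trip for the whole batch.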

Redis modules

Redis modules are an advanced feature. In simple terms, they allow us to implement custom data types specific to our use case without needing to change or update the core Redis server. Implementing a new data type with Redis modules is a nontrivial effort, but there are many popular and widely used Redis modules that solve problems such as full-text search (RediSearch), processing time series data at scale (RedisTimeSeries), native JSON support in Redis (RedisJSON), and much more.

Running Redis across multiple servers

Redis is a high-performance database, and we can go a long way with a single Redis server. But it’s also possible to operate Redis across multiple servers if data requirements exceed the limitations of a single server. Running multiple servers can also provide high availability and redundancy.

There are multiple ways to run Redis in a distributed way:

  • Redis Cluster: A Redis Cluster has multiple nodes, and the data is partitioned (or sharded) across these nodes. There should be a minimum of three primary nodes in a cluster, and each of them can have one or more replica nodes as well.

  • Redis Sentinel: This is a high-availability feature, but it doesn’t provide automatic data partitioning. The setup involves running a separate group of Sentinel nodes that are configured to monitor a set of Redis servers and perform automatic remediation in case of server crashes or faults. This is a more complex system to operate and predates Redis Cluster, which is the recommended solution for running Redis in a scale-out architecture.

  • Proxy: A proxy setup relies on a system that acts as a middleman in front of a fleet of Redis servers. The proxy also takes care of data partitioning using its own custom scheme.

    • Server-side proxy: You can set up an intermediate server that speaks the Redis protocol and fans out requests to the appropriate Redis server. A popular solution is Twemproxy.

    • Client-side proxy: Instead of running a separate fleet of proxy servers, a client-side proxy is aware of all the Redis servers and knows how to partition data for storage and querying. All the logic is implemented in the client library itself.
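To see how a key ends up on a particular server, consider how Redis Cluster partitions data: each key maps to one of 16,384 hash slots, computed as the CRC16 of the key modulo 16384, and each primary node serves a subset of the slots. The sketch below reproduces that calculation with `binascii.crc_hqx` from the Python standard library, which computes the same CRC16 variant when started from 0. The function names and the three-node slot layout are hypothetical; a real cluster assigns and rebalances slots itself. A client-side partitioning scheme follows the same principle with a scheme of its own choosing.

```python
import binascii

HASH_SLOTS = 16384

# Example layout only: three primaries, each owning a contiguous slot range.
SLOT_RANGES = {
    "node-a": range(0, 5461),
    "node-b": range(5461, 10923),
    "node-c": range(10923, HASH_SLOTS),
}


def key_hash_slot(key: str) -> int:
    """Map a key to its hash slot: CRC16(key) mod 16384."""
    data = key.encode("utf-8")
    # Hash tags: if the key contains a non-empty {...} section,
    # only the text between the first "{" and the next "}" is hashed.
    start = data.find(b"{")
    if start != -1:
        end = data.find(b"}", start + 1)
        if end > start + 1:
            data = data[start + 1:end]
    return binascii.crc_hqx(data, 0) % HASH_SLOTS


def node_for_key(key: str) -> str:
    """Look up which (hypothetical) node owns the key's slot."""
    slot = key_hash_slot(key)
    for node, slots in SLOT_RANGES.items():
        if slot in slots:
            return node
    raise ValueError(f"slot {slot} is not assigned to any node")


for key in ("user:1000", "{user:1000}:followers", "{user:1000}:following"):
    print(key, key_hash_slot(key), node_for_key(key))
```

All three keys print the same slot because the last two share the hash tag `{user:1000}`, which is how related keys can be kept on the same node for multi-key operations.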

This was a high-level yet comprehensive overview of the Redis ecosystem and its capabilities. These topics will be covered in depth in the subsequent lessons.