Database Fundamentals for Cloud Architects

In this lesson, we will talk about a few important concepts and terms regarding databases. We’ll discuss the different types of databases and how they handle scalability and high availability.

Database fundamentals

A database is a structured and systematic collection of data (raw facts and figures). A database can be anything that stores information, from a single sheet to a complex database system.

A database stores data electronically.

Database management systems (DBMSs) help us store and effortlessly retrieve data from a database. We can use them to create, manage, and alter the database.

A database management system (DBMS) is software to access and manage data in a database.

In practice, both terms are used loosely: when a software engineer talks about a database, they could mean the database, the DBMS, or both. In this course, we’ll also use the term “database” for both, because in practice the two are inseparable.

Databases are usually accessed through a database-specific protocol or interface, and these interfaces are normally protected by authentication (e.g., username and password). We saw this with our WordPress instance: its configuration contained a username and password for accessing the MariaDB database.

Importance of databases

Because the database stores important, sensitive, and business-critical data, it’s often the centerpiece of the application architecture. Therefore, it should have good security and backup procedures in place. In most cases, a highly available architecture is desirable.

Important terms

Now, let’s talk about a few important terms that are strongly intertwined with considerations of scalability for databases.

Distributed database

A distributed database is a database running on multiple nodes (e.g., EC2 instances) as part of a cluster. A distributed database can have much higher performance than a database running on a single node, but it requires special care regarding data consistency and synchronization.

Transaction

A transaction is a unit of work performed within a database and can consist of several individual actions. Transactions ensure that the database is in a valid state both before and after the transaction, and they can usually be rolled back (reversed) if something within the transaction fails. The concept of transactions is therefore very important for maintaining the consistency of a database.
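To make the rollback behavior concrete, here is a minimal sketch using Python's built-in sqlite3 module with an in-memory database. The accounts table, the names alice and bob, and the transfer and balances helpers are all hypothetical and exist only for this illustration. SQLite is used here only because it needs no setup; it is not the database from the course.

```python
import sqlite3


def make_db():
    """Create an in-memory database with two hypothetical accounts."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE accounts ("
        "name TEXT PRIMARY KEY, "
        "balance INTEGER NOT NULL CHECK (balance >= 0))"
    )
    conn.executemany(
        "INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 50)]
    )
    conn.commit()
    return conn


def balances(conn):
    """Return the current balances as a dict, e.g. {"alice": 100, "bob": 50}."""
    return dict(conn.execute("SELECT name, balance FROM accounts"))


def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst as one transaction.

    Returns True if the transaction was committed, False if it was rolled back.
    """
    try:
        # Used as a context manager, the connection commits if the block
        # succeeds and rolls back every change if an exception is raised.
        with conn:
            # Step 1: credit the receiver.
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            # Step 2: debit the sender. The CHECK constraint rejects a
            # negative balance, which also undoes step 1.
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False


if __name__ == "__main__":
    conn = make_db()
    print(transfer(conn, "alice", "bob", 30), balances(conn))
    print(transfer(conn, "alice", "bob", 500), balances(conn))
```

The second transfer fails at its second step, so the credit already applied to bob is reversed as well. The database never ends up in a state where money was added to one account without being removed from the other.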

Synchronization

Synchronization is the process of replicating data between different instances (or replicas) of a database. In cloud environments, especially with scalable architectures, keeping all instances in the same state is very important but difficult.

There are two forms of synchronization: synchronous and asynchronous. They differ in the state of each database instance after a write operation has been performed: will all instances return the same data, or will some of them return old data for a while?
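The difference can be illustrated with a small in-memory simulation. This is only a sketch under simplifying assumptions: the Primary and Replica classes and the write_sync, write_async, and flush methods are hypothetical names, there is no network or failure handling, and real databases implement replication very differently.

```python
class Replica:
    """A read-only copy of the data (hypothetical, in-memory only)."""

    def __init__(self):
        self.data = {}

    def read(self, key):
        return self.data.get(key)


class Primary:
    """The instance that accepts writes and forwards them to replicas."""

    def __init__(self, replicas):
        self.data = {}
        self.replicas = replicas
        self.pending = []  # changes not yet shipped to the replicas

    def write_sync(self, key, value):
        # Synchronous: the write only returns after every replica has
        # applied it, so all instances return the same data afterwards.
        self.data[key] = value
        for replica in self.replicas:
            replica.data[key] = value

    def write_async(self, key, value):
        # Asynchronous: the write returns immediately and the change is
        # shipped later, so replicas may return old data for a while.
        self.data[key] = value
        self.pending.append((key, value))

    def flush(self):
        # Stands in for the background process that ships queued changes.
        for key, value in self.pending:
            for replica in self.replicas:
                replica.data[key] = value
        self.pending.clear()


if __name__ == "__main__":
    replica = Replica()
    primary = Primary([replica])

    primary.write_sync("title", "v1")
    print(replica.read("title"))   # replica is already up to date

    primary.write_async("title", "v2")
    print(replica.read("title"))   # replica still returns the old value

    primary.flush()
    print(replica.read("title"))   # replica has caught up
```

The trade-off shown here is the usual one: the synchronous write has to wait for every replica, while the asynchronous write returns sooner at the cost of temporarily stale reads.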
