Introduction

In this lesson, we will discuss different types of databases used in designing a system. Moreover, this lesson will enable users to select a database suitable for their applications.

Primarily, databases are divided into two groups; Relational and Non-Relational databases. Non-relational databases are also called NoSQL databases, stands for Not-Only SQL.

There are various factors affecting the choice of database to be used in an application.

  1. The type of data

    The first factor affecting the choice of database is the type of data that needs to be stored. It may include the structured data, which follows some predefined structure, and the non-structured data.

    Structured data can fit into a pre-defined structure, like in tabular format organized by rows and columns. An example of structured data is employee data of an organization, where different attributes related to employees can be stored in tabular format.

    Unstructured data is complex and does not consist of any pre-defined shape; therefore, it is difficult to store in tabular format. An example would be imagery and video data, data collected through sensors, etc. Another kind of data is semi-structured, which does not have any rigid shape, but has some characteristics of structured data. An email can be considered as a good example of unstructured data, where some fields are pre-defined while the text/data can be of varying length.

  2. Query pattern

    Another factor affecting the database choice is the variety of queries. Some queries required lesser data while some require the bulk of data to be transferred to/from a database.

  3. The scale of the data

    Choice of the database also depends on the scale and amount of data that needs to be stored.

Relational databases

Relational databases adhere to particular schemas before storing the data. The data stored in relational databases has some prior structure. Mostly, this model organizes data into one or more relations, with a unique key for each tuple (instance). Each entity of the data consists of instances and attributes where instances are stored in rows and each column representing the attributes of each instance. Since each tuple has a unique key, therefore, a tuple in one table can be linked to another tuple in other tables by storing the primary keys in other tables generally known as foreign keys.

A Structure Query Language (SQL) is used for maintaining the database; insertion, deletion, and retrieval of data.

There are various reasons for the popularity and dominance of relational databases which include simplicity, robustness, flexibility, performance, scalabilityTraditional databases are vertically scalable., and compatibility in managing generic data. Such databases follow the Atomicity, Consistency, Isolation, and Durability (ACID) properties to maintain the integrity of the database, as explained below.

  • Atomicity: A transaction is considered an atomic unit. Hence, all the statements within a transaction will successfully execute, or none of them will execute. If a statement fails within a transaction; it should be aborted and rollback.

  • Consistency: At any given time the database should be in a consistent state, and it should remain in a consistent state after every transaction. For example, if multiple users want to view a record from the database it should return a similar result each time.

  • Isolation: In the case of multiple transactions running concurrently, they should not be affected by each other. The final state of the database should be the same as the transactions were executed sequentially.

  • Durability: The system should guarantee that transactions completed will survive permanently in the database even in system failure events.

There are various databases that come under the category of relational databases, including.

  • MySQL
  • Oracle Database
  • Microsoft SQL Server
  • IBM DB2
  • Postgres
  • SQLite

Why relational database

Relational databases are the default choices of software professionals from the start for serious data storage. There are a number of advantages of these databases, one of the greates power of traditional databases are its abstractions of ACID transactions and related programming semantics that make it very convenient for the end-programmer to use a relational database. Let’s revisit some important features of relational databases.

Flexibility

In the context of SQL, Data Definition Language (DDL) provides us the flexibility to modify the database, including tables, columns, renaming the tables, and other changes. DDL even allows us to modify schema while other queries are happening and the database server is running.

Reduced redundancy

One of the biggest advantages of the relational database is that it eliminates data redundancy. The information related to a specific entity appears in one table while the relevant data to that specific entity appears in the other tables links through foreign keys. This process is called normalization which has the additional benefit of removing an inconsistent dependency.

Concurrency

Concurrency is an important factor while designing an enterprise database. In such a case, the data is read and written by many users at the same time. We need to coordinate such interactions to avoid inconsistency in data for example double booking of hotel rooms. Concurrency in a relational database is handled through transactional access to the data which worked well to contain the complexity of concurrency. As explained earlier, a transaction is considered as an atomic operation so it also works in error handling to either rollback or commit a transaction on successful execution.

Integration

The process of aggregating data from multiple sources is a common practice in enterprise applications. A common way to perform this aggregation is to integrate a shared database where multiple applications store their data. This way all the applications can easily access each others’ data while the concurrency control measures handle the access of multiple applications.

Backup and disaster recovery

Relational databases guarantee the state of data is consistent at any time. The export and import operations make backup and restoration easier. Most commonly cloud-based relational databases perform continuous mirroring to avoid loss of data and make the restoration process easier and quicker.

Impedance mismatch

Impedance mismatch is the difference between the relational model and the in-memory data structures. The relation model organizes data into the tabular structure; relations and tuples. SQL operation on this structured data yields relations aligned with relational algebra; however, it has some limitations. In particular, the values in a table take simple values which can’t be a structure or a list. The case is different for in-memory, where a complex data structure can be stored. To make the complex structures compatible with the relations, a translation of the data in light of relational algebra would be needed. Hence, the impedance mismatch- requiring translation between two representations, as denoted in the following figure.

Create a free account to access the full course.

By signing up, you agree to Educative's Terms of Service and Privacy Policy