Databases are universal - they underlie the technology you use every day. They’re a crucial component of everything from telecommunications systems, banking systems, and video games to just about any other software system that maintains some amount of persistent data.
Today, we’ll look at how databases are used in programming fields like web development, cloud computing and more!
In this tutorial we will cover the following:
A database is a collection of information. When you look at a database it isn’t pretty – it’s raw data that needs a lot of work to be displayed nicely in a user interface. Databases are important because they represent how information is modeled logically.
Data itself is even considered a valuable resource now – many companies make money selling users data to advertisers – like Google.
In this article, the audience are developers new to computer programming.
To go back to web development, in the Server-Client model, the database is the server.
Databases come in two main flavors: Relational Databases which have a table schema, and NoSQL databases which vary.
Imagine if you were running a pet store and you were searching for a count of individual pets that had been in that year:
If you used receipts to store information, you’d have to:
With a database, information is stored in an organized tabular format, so you can query the database to see the count of pets you’ve had in the year. The great thing about a database, is you can easily break the information down further: how many dogs, how many cats, and the count for how many times they visited your pet sitting business are all quickly accessible using a database.
Database management and access is one of the main reasons cloud computing was invented.
A long time ago, in the 2000’s most companies would buy their own server, and store it in their own buildings. On these servers, they would have many databases, holding tables, storing information. This is known as on-prem. The databases live on your premises.
In this on-prem time, things were complicated. Companies had to employ their own DBA’s (database administrators) and figure out what to do if the power went out. Also, all databases were siloed and lonely. Each company had their own servers, databases, and table schemas holding information, but it was near impossible to share data between companies.
Then came Cloud Computing! Cloud computing is paying a database specialist to do the servers and databases for your company. What does this mean for businesses? Businesses can focus on their value proposition instead of computers! For example:
All this Cloud Computing leads into all the acronyms ending in aaS:
Relational databases are a type of database used both over the cloud and on-prem. RDBMS stands for Relational Database Management System - a way to control your database system. Relational databases model data logically using tables – often called tabular relations.
Spreadsheets in Excel are a good, easy metaphor to think about how a relational database works. Relational databases are like tabular - that means they are like tables in Excel. If you imagine a relational database as an Excel workbook:
Let’s imagine you’re running the pet sitting business. How do you keep track of owner’s information? Here’s 2 pieces of information we’ll start with.
Both of these pieces of information can change – people get married, move. So how does a programmer keep track of information in a way that’s quick to update, and flexible for all the ways life changes?
The developer designs a Relational Databases would versioned name table, and a versioned address table. Each table has a primary key – a unique number that points to that row’s information. NameId and AddressId. Primary keys are not repeated - in the Excel screenshots - the primary key is how the developer can find the exact row of information.
The first step to designing a relational database is to define the schema. The schema is a map of where all the data lives in tables – table names, column names. The schema shows how tables relate to each other – from our example above, how to use the name table to look up the address associated with the person.
In a relational database, all information is sorted, structured, defined, and designed using schema. Relational Databases work well for when the developer knows what their data inputs are going to be – for example, if address information comes in on forms, the data has a defined structure already.
Structured Query Language (SQL) is the most common way to access, or “query”, a relational database. Querying is a way of pulling information back from the database. When querying the database, the developer focuses on efficiency. If you imagine a database with billions of addresses, how do we get to the one address to display to the user quickly?
The answer is using keys for efficiency. Primary keys, or clustered indexes, are the unique address pointing to only that data. Primary keys can’t be re-used. Non-clustered indexes are additional keys the developer & database administrator add to the database to make often used queries faster. Secondary keys are often added once the software goes into production, and through monitoring performance, the developers can identify the largest speed bottlenecks, and add secondary keys to alleviate traffic.
CREATE TABLE tblNames (NameId int,LastName varchar(255),FirstName varchar(255),AddressId int,Ver int)CREATE TABLE tblAddress (AddressId int,AddressAsStr varchar(255),ZipCode int(255),Ver int)
SELECT * FROMtblNames n join tblAddress a ONn.AddressId = a.AddressIdAND Ver = 0WHERE NameId = @NameId
In the SQL code snippets above we created 2 SQL tables. The schema defines how the tables are created and linked.
Notice in the query, we joined the tables on AddressId. The star * means pull back all columns so this query pulls back all columns from both tables. We also told SQL to find the correct row of data by using the primary key NameId. For efficiency reasons, it’s essential to use keys whenever possible. Imagine databases with billions of rows of data; finding the right information can take forever when it’s a production amount of information.
Learn database design without scrubbing through videos or documentation. Educative’s text-based courses are easy to skim and feature live coding environments - making learning quick and efficient.
Non-relational databases are another type of database that are used when architects are unsure what type of information the database will recieve. Recently, lots of advancements have been made on Non-Relational databases, which can take unstructured information, and store it. Non-relational databases don’t require as much up-front design, and they are more flexible. The downside of non-relational database is they are generally harder to use – because the developer doesn’t know what kind of information they are going to receive – data could come a picture, or a movie, a .zip file, or plain text for example. After storage, once the developer has to use information from the Non-relational database, it’s harder to write coding logic to process that information because there are so many options.
NoSQL means the database is not SQL. It’s something other than the traditional tabular relations. NoSQL is great for big data, and real-time web applications. No SQL is a bit of an exaggeration. NoSQL can better be thought of as “not-only SQL”, many NoSQL databases use some table relationships, and some other relationships, for example, a picture storage database may take multiple kinds of files, and still have a key to file table relationship. NoSQL compromises consistency – the developer doesn’t know what they’re going to get when they query their NoSQL database – for other benefits.
When developing a NoSQL database different data structures are used. We won’t go into them in-depth here, but here’s a list of common NoSQL data structures so you can get an idea:
NoSQL databases have different query options, querying is asking the database for information. NoSQL databases are often used to store unformatted information. The software can take in the data now, process it later. This is very helpful when you don’t know what kind of information you’re going to get up-front: like when a user can email in a picture, PDF, attachment, or text in an email.
If you are writing a stock trading, banking, or a Finance-based app or you need to store a lot of relationships, for instance, when writing a social networking app like Facebook, then you should pick a relational database. Here’s why:
Transactions & Data Consistency
If you are writing software that has anything to do with money or numbers, that makes transactions, ACID, data consistency super important to you. Relational DBs shine when it comes to transactions & data consistency. They comply with the ACID rule, have been around for ages & are battle-tested.
If your data has a lot of relationships like “friends in Seattle”, “friends who like coding” etc. There is nothing better than a relational database for storing this kind of data.
Relational databases are built to store relationships. They have been tried & tested & are used by big guns in the industry like Facebook as the main user-facing database.
Popular relational databases:
Here are a few reasons why you’d want to pick a NoSQL database:
Handling A Large Number Of Read Write Operations
Look towards NoSQL databases when you need to scale fast. For example, when there are a large number of read-write operations on your website and when dealing with a large amount of data, NoSQL databases fit best in these scenarios. Since they have the ability to add nodes on the fly, they can handle more concurrent traffic and large amounts of data with minimal latency.
Running data analytics NoSQL databases also fit best for data analytics use cases, where we have to deal with an influx of massive amounts of data.
Popular NoSQL databases:
If you’re curious about trying a NoSQL database like MongoDB then I highly suggest checking out Nikola Zivkovic’s course, The Definitive Guide to MongoDB.
There has been a lot covered in this post, but we’ve barely scratched the surface. You should invest time learning about data modeling, normalization, functional dependencies, and SQL.
Database Design Fundamentals for Software Engineers is a great course for learning the key aspects of database design. In this course, you will:
You can check out a free preview by clicking the link above.
Join a community of more than 1.4 million readers. A free, bi-monthly email with a roundup of Educative's top articles and coding tips.