Trusted answers to developer questions

How does indexing work in a database?

Get Started With Machine Learning

Learn the fundamentals of Machine Learning with this free course. Future-proof your career by adding ML skills to your toolkit — or prepare to land a job in AI or Data Science.

Database Indexing allows us to cut down the number of rows/records that need to be examined when a select query with a where clause is executed.

Let’s try to understand what happens when a select query is executed on a database table without an index.

Suppose we have an Employee table, with fields ‘Employee_Id’ and ‘Employee_Name’. We want to execute the following select query on this table:

select * from Employee where Employee_Name = 'abc';

When this query is executed, what goes on behind the scenes? Every single row is checked to see if the employee_name matches with ‘abc’. This effectively means that the entire table will have to be scanned (known as the full table scan).

An index is a data structure that stores the values for a certain specific column of a table and helps us avoid a full table scan. Let’s look at the common data structures used for database indexing:

Data structures for indexing

1. B-trees
B-trees are the most commonly used data structures for indexes as they are time-efficient for lookups, deletions, and insertions. All these operations can be done in logarithmic time. Data that is stored inside of a B-tree can be sorted.

2. Hash Tables
Indexes that use hash tables are generally referred to as hash index. Hash tables are extremely efficient for looking up values, which means that queries that look for an exact match can be executed very quickly. In a hash index, the key is the column value and the value in the hash table is a pointer to the row data in the table. However, hash tables are not sorted data structures, so they may not be efficient for other types of queries.

3. R-tree
R-tree is commonly used with spatial databases. It helps with queries like "find all the coffee shops within 2 miles of my location".

4. Bitmap Index
Bitmap index works for columns that have many instances of those values, i.e., columns with low selectivity. For example, a column with boolean values.

So, what is the cost of having a database index?
The first thing to note is that the index takes up additional space, so the larger the table, the bigger the index. Every time you perform an add, delete, or update operation, the same operation will need to be performed on the index as well.

Creating an index
The following snippet shows how to create an index on a single column and on multiple columns.

CREATE INDEX name_index ON Employee (Employee_Name)
CREATE INDEX name_index ON Employee (Employee_Name, Employee_Age)
}

As a general rule, an index should only be created on a table if the data in the indexed column will be queried frequently.

RELATED TAGS

CONTRIBUTOR

Anjana Shankar
Attributions:
  1. undefined by undefined
Did you find this helpful?