An estimate about B-trees is that

- In the external memory model, the running time of
`find(x)`

,`add(x)`

, and`remove(x)`

in a B-tree is $O(\log_Bn)$. - In the word-RAM model, the running time of
`find(x)`

is $O(\log n)$ and the running time of`add(x)`

and`remove(x)`

is $O(B\log n)$.

The following lemma shows that, so far, we have overestimated the number of merge and split operations performed by B-trees.

**Lemma 1:** Starting with an empty B-tree and performing any sequence
of m `add(x)`

and `remove(x)`

operations results in at most $3m/2$ splits, merges,
and borrows being performed.

**Proof:** The proof of this has already been sketched for the
special case in which $B = 2$. The lemma can be proven using a credit
scheme, in which

- each
`split()`

,`merge()`

, or`borrow()`

operation is paid for with two credits, i.e., a credit is removed each time one of these operations occurs; and - at most three credits are created during any
`add(x)`

or`remove(x)`

operation.

Since at most 3 m credits are ever created and each split, merge, and borrow is paid for with two credits, it follows that at most $3m/2$ splits, merges, and borrows are performed. These credits are illustrated using the $¢$ symbol.

To keep track of these credits the proof maintains the following credit
invariant: Any non-root node with $B − 1$ keys stores one credit and any
node with $2B − 1$ keys stores three credits. A node that stores at least $B$ keys and most $2B − 2$ keys need not store any credits. What remains is to show that we can maintain the credit invariant and satisfy properties $1$
and $2$, above, during each `add(x)`

and `remove(x)`

operation.

## Adding

The `add(x)`

method does not perform any merges or borrows, so
we need only consider split operations that occur as a result of calls to
`add(x)`

.

Each split operation occurs because a key is added to a node, $u$, that
already contains $2B−1$ keys. When this happens, `u`

is split into two nodes,
$u'$ and $u''$ having $B − 1$ and $B$ keys, respectively. Prior to this operation, `u`

was storing $2B − 1$ keys, and hence three credits. Two of these credits can be used to pay for the split and the other credit can be given to $u'$
(which has $B−1$ keys) to maintain the credit invariant. Therefore, we can pay for the split and maintain the credit invariant during any split.

The only other modification to nodes that occur during an `add(x)`

operation happens after all splits, if any, are complete. This modification involves adding a new key to some node $u'$. If, prior to this, $u'$ had $2B − 2$ children, then it now has $2B−1$ children and must therefore receive three credits. These are the only credits given out by the `add(x)`

method.

## Removing

During a call to `remove(x)`

, zero or more merges occur and are possibly followed by a single borrow. Each merge occurs because two nodes, `v`

and `w`

, each of which had exactly $B − 1$ keys prior to calling `remove(x)`

were merged into a single node with exactly $2B − 2$ keys. Each such merge, therefore, frees up two credits that can be used to pay for the merge.

After any merges are performed, at most one borrow operation occurs, after which no further merges or borrows occur. This borrowing operation only occurs if we remove a key from a leaf, `v`

, that has $B − 1$ keys. Therefore, the `v`

node has one credit, and this credit goes towards the cost of the borrowing. This single credit is not enough to pay for the borrowing, so we create one credit to complete the payment.

At this point, we have created one credit and we still need to show that the credit invariant can be maintained. In the worst case, `v`

’s sibling, `w`

, has exactly $B$ keys before the borrow so that, afterward, both `v`

and `w`

have $B − 1$ keys. This means that `v`

and `w`

each should be storing a credit when the operation is complete. Therefore, in this case, we create an additional two credits to give to `v`

and `w`

. Since a borrow happens at most once during a `remove(x)`

operation, this means that we create at most three credits, as required.

If the `remove(x)`

operation does not include a borrow operation, this
is because it finishes by removing a key from some node that, prior to the
operation, had $B$ or more keys. In the worst case, this node had exactly $B$
keys, so that it now has $B−1$ keys and must be given one credit, which we create.

In either case—whether the removal finishes with a borrow operation
or not—at most three credits need to be created during a call to `remove(x)`

to maintain the credit invariant and pay for all borrows and merges that occur. This completes the proof of the lemma.

The purpose of Lemma 1 is to show that, in the word-RAM model the cost of splits, merges, and joins during a sequence of $m$ `add(x)`

and `remove(x)`

operations is only $O(Bm)$. That is, the amortized cost per operation is only $O(B)$, so the amortized cost of `add(x)`

and `remove(x)`

in the word-RAM model is $O(B + \log n)$. This is summarized by the following pair of theorems:

**Theorem 1 (external memory B-trees):** A `BTree`

implements the `SSet`

interface. In the external memory model, a `BTree`

supports the operations
`add(x)`

, `remove(x)`

, and `find(x)`

in $O(\log_Bn)$ time per operation

**Theorem 2 (Word-RAM B-Trees):** A `BTree`

implements the `SSet`

interface. In the word-RAM model, and ignoring the cost of splits, merges, and borrows, a `BTree`

supports the operations `add(x)`

, `remove(x)`

, and `find(x)`

in $O(\log n)$ time per operation. Furthermore, beginning with an empty `BTree`

, any sequence of $m$ `add(x)`

and `remove(x)`

operations results in a total of $O(Bm)$ time spent performing splits, merges, and borrows.

Create a free account to access the full course.

By signing up, you agree to Educative's Terms of Service and Privacy Policy