Basics of Btrees - Database internals 01

Written on : 2023-11-10

Storage engines

Storage engines are characterised mainly by three properties

Often LSM are considered non-mutable and Btree is mutable. LSM doesn't modify your records in place, Btrees do.

Although some variants exist that don't comply with this.

This blog covers some important points from chapter 2 Btree Basics of the Database Internals - Alex Petrov book.

BST - Binary Search Trees

The book begins by discussing BST. A BST has the following :

It is important to note that fan out and height are inversely proportional to each other. Also, Just because a structure is optimised for space and time, it cannot be used as a disk-based structure, it needs to be adapted to the mediums limitations.

Medium

Most of these structures and algorithms were built for spinning disks. These days we are seeing an uptick in structures that optimise for byte addressable non-volative memory

Hard drives

These have physically spinning disks, hence have the following :

Solid-state drives

These are some important behaviours to consider in SSDs :

B-trees

From here on the book covers complexity and algorithm (insertion and deletion) for trees. I have skipped more of it except for some important points. I have considered knowledge on b-trees basic.

Insertion (splits)

Most splits don't propagate for more than 1 level. BTrees are good this way (Happens due to high fanout).

Deletion (merges)

Conclusion

In this chapter, we simply understand how btree works as a data structure and algorithm, on how to use (implement) them in disks is another chapter. Looking forward to it!