As of October 2020, 358 database management products were listed on DB-Engines, the best-known worldwide DBMS popularity ranking. Among them you can find traditional relational and modern NoSQL products, solutions that process data in RAM, and even ones that harness the hardware resources of the graphics card. Competition remains intense: processing speeds and data volumes keep growing.

However, even the most modern software cannot always solve the tasks assigned to it effectively. Michael Stonebraker, a prominent researcher in the field of data management, gave an example in one of his talks: a traditional DBMS spends only about 4% of CPU time on useful work; everything else goes to maintaining the system's own machinery.

Naturally, this figure is largely an abstraction. But the fact remains: today's DBMSs do not fully exploit the capabilities of multi-core hardware architectures; instead, they force the end user to keep scaling up the hardware. Examples were given in the previous article of this series.


Why is this happening? What is it under the hood of modern DBMSs that prevents them from fully exploiting multi-core architectures?

The classic approach: locking

To begin with, recall that the basic architecture of today's most popular classic relational disk-based DBMSs took shape in the 1980s and 1990s, when computers were large, processors single-core, programs single-threaded, and data was stored on hard drives. At that time, the internal structures of a DBMS were built on the principle of mutual exclusion when accessing a shared resource: if a process wants to change a data block, that block must be temporarily locked against other processes that might also try to make their own changes. Access is coordinated with the help of so-called synchronization objects. The simplest example of such an object is a spinlock.

Suppose a process tries to acquire a spinlock. If the attempt succeeds, the process gets exclusive access to the resource and performs the necessary actions. If the attempt fails, the process waits in a loop until the spinlock is released. This busy-waiting consumes CPU time without doing any useful work, so a spinlock is used on the assumption that the wait will be short. A more complex type, the mutex (from "mutual exclusion"), can yield control to the task scheduler (the kernel) and free the CPU for another task or thread. The mutex is more convenient than a spinlock, but acquiring and releasing it costs more computation and time.

Locking mechanisms ensure data consistency and transaction isolation, but they also severely limit opportunities for parallel operation. In an effort to avoid locks when working with data, the 1990s saw the appearance of multiversion concurrency control (MVCC), based on "snapshots" of the database: data changes are not visible to other users until they are committed. Today this mechanism is found in most DBMSs, even in modern NoSQL solutions.

Meanwhile, the entire internal structure of most classic DBMSs is still built around blocking algorithms and mutual exclusion. Examples of such solutions are Oracle, Microsoft SQL Server, PostgreSQL, and MySQL.

While processors were single-core, nothing else was needed. But the arrival of multi-core chips and multithreaded programming quickly exposed the drawbacks of the lock-based architecture.

First, the operation of the locking mechanism itself (reading the lock state, waiting for release, marking it "busy", acquiring, releasing, marking it "free") consumes significant processor time, by the program's standards, whenever there is contention. Second, there is a fundamental problem: no parallel work at all is possible inside the region protected by a lock.

For the end user, the problem shows up as follows: under contention on multi-core hardware, the DBMS, and the application that uses it, can slow down by dozens of times.

What if we don't lock?

The alternative is non-blocking algorithms, which appeared along with the development of multi-core architectures. Their working principle is not to forbid competing threads from accessing a resource, but to ensure that the resource remains consistent for all threads working with it in parallel. The advantage of non-blocking algorithms is better scalability with the number of processors.

Their main disadvantage is that non-blocking algorithms are intended for in-memory data processing only and assume cooperative work by competing threads. Using such algorithms in a classic DBMS is difficult, primarily because of the heavy disk I/O: the long latencies of disk access rule out cooperative work between competing threads once long operations are embedded in non-blocking algorithms. That is why non-blocking algorithms work best in in-memory systems.

For the consumer, this means the whole system runs several times faster than a classic DBMS. However, the database is limited by the amount of RAM in the server or server cluster, and, of course, as the database grows, more RAM will have to be bought.

Examples of such solutions are Redis, Tarantool, Hazelcast, and SAP HANA. They are becoming increasingly popular in a variety of systems and can be used as a cache to accelerate access to a classic relational disk-based database. But the total cost of ownership of such solutions is still quite high.

How to combine the incompatible

Naturally, the developers of both classic disk-based and in-memory DBMSs, while evolving their products from opposite ends, arrived at the same problem: how to combine the advantages of both approaches while getting rid of their disadvantages. Developers of in-memory solutions want to evict rarely used data from memory to disk and thereby increase system efficiency. And the creators of classic DBMSs want their products to run faster on multi-core hardware when the data is processed entirely or mostly in memory.

But so far none of the market giants has managed to create a product, or even a prototype, that meets all these requirements.

Is there a third option? From the authors' point of view, several factors affect DBMS performance. First, there are the approaches and methods of synchronization, which ultimately determine how efficiently modern multi-core processors are used. Second, in real systems not all data is accessed with the same frequency, so caching data in memory allows a more rational use of server resources.

Therefore, in the authors' view, it is interesting to build a DBMS architecture based on a combination of non-blocking algorithms for in-memory data processing and classic synchronization methods for waiting on events with long delays. Such an architecture inherits the best properties of the different DBMS classes: specialized in-memory systems and classic disk-based ones.