TiDB: Performance-tuning a distributed NewSQL database

Performance tuning on distributed systems is no easy task. It is much more complicated than on a single-node server, and bottlenecks can pop up anywhere: in the system resources of a single node or subcomponent, in the cooperation between nodes, or even in network bandwidth. Performance tuning is the practice of finding and addressing these bottlenecks, which in turn reveals the next bottlenecks to address, until the system reaches its optimal level of performance.

In this article, I introduce you to TiDB, a distributed NewSQL database, and share some best practices for tuning write operations in TiDB to achieve maximum performance. TiDB is an open source, hybrid transactional/analytical processing (HTAP) database designed to support both online transaction processing (OLTP) and online analytical processing (OLAP) scenarios.

One TiDB cluster has several TiDB servers, several TiKV servers, and a group of Placement Drivers (PDs), usually three or five nodes. The TiDB server is a stateless SQL layer, the TiKV server is the key-value storage layer, and each PD is a manager component with a “god view” that is responsible for storing metadata and doing load balancing. Below is the architecture of a TiDB cluster. You can find more details on each component in the official TiDB documentation.

[Figure: TiDB cluster architecture. Image credit: PingCAP]
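To make this layout concrete, the sketch below shows how such a cluster might be described in the inventory.ini file used by the TiDB Ansible deployment mentioned in the next section. This is only an illustration: the host addresses are placeholders, and the exact group names may vary between tidb-ansible releases.

```
# Hypothetical inventory.ini for a small TiDB cluster:
# two TiDB servers, three TiKV servers, three PD nodes, and a monitoring host.
[tidb_servers]
10.0.1.1
10.0.1.2

[tikv_servers]
10.0.1.3
10.0.1.4
10.0.1.5

[pd_servers]
10.0.1.6
10.0.1.7
10.0.1.8

# Prometheus and Grafana run on the monitoring host.
[monitoring_servers]
10.0.1.9

[grafana_servers]
10.0.1.9
```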

Gathering TiDB metrics 

We gather many metrics inside each TiDB component. These are periodically sent to Prometheus, an open source system-monitoring solution, and you can easily observe their behavior in Grafana, an open source platform for time-series analytics. If you deploy the TiDB cluster using Ansible, Prometheus and Grafana are installed by default. By observing these metrics, we can see how each component is working, pinpoint the bottlenecks, and address them through tuning. Let's see an example.
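The same metrics can also be queried directly with PromQL, either in the Prometheus UI or from a Grafana panel. The queries below are only a sketch: the metric names are taken from the default TiDB and TiKV Grafana dashboards, and exact names and labels may differ across versions.

```
# 99th-percentile SQL latency as seen by the TiDB server
histogram_quantile(0.99,
  sum(rate(tidb_server_handle_query_duration_seconds_bucket[1m])) by (le))

# Overall QPS handled by the TiDB servers
sum(rate(tidb_server_query_total[1m]))

# 99th-percentile gRPC message duration on TiKV, broken down by request type
histogram_quantile(0.99,
  sum(rate(tikv_grpc_msg_duration_seconds_bucket[1m])) by (le, type))
```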
