Beyond MySQL: Advancing into the New Era of Distributed SQL with TiDB

Audience:

Topic:

In this talk we discuss the history, innovation and evolution of the MySQL ecosystem and how TiDB, an open source distributed SQL database with HTAP capabilities, supports the MySQL protocol and solves a lot of the major pain points facing MySQL users today. A lot of the success of MySQL can be credited to its replication capabilities. It started out by sending SQL statements from a single write node to execute over a network to another replicated MySQL instance. This worked well enough when the datasets were small (e.g., < 128Gb). There has been an explosion in data in the last 15-16 years. The simple statement level replication has undergone several iterations at MySQL. The first change was row based replication, to address performance issues on the leader node when replicating data. However, this too left users with many challenges. Consistency of data was a big challenge because there were no guarantees given by MySQL, failover was manual and a fingers crossed operation. Backup and recovery was another challenge due to the large datasets and this caused unacceptable replication lag between the leader and the replica node. Finally, DDL has always been a challenge, first due to large datasets and secondly due to maintaining consistency across the cluster. So,DDL is still a single node operation, recent instant DDL operations mitigate this problem to some extent but not all e.g., create index. To address some of these challenges MySQL introduced Group Replication. Group Replication relies on Paxos to manage the cluster membership and leader election. However, it’s still an eventually consistent system according to MySQL documentation. Though we can classify it as timeline consistency, which is actually stronger, because there is no reordering of events which is possible in eventual consistency. When datasets grew larger than what MySQL could handle, customers in the early days started to manually shard the data and then build more sophisticated systems around it like Vittese from PlanetScale. Some cloud vendors like AWS went further and scaled out the storage layer of the MySQL/InnoDB storage engine with their cloud and hardware specific changes. There are several other products in this space like PolarDB from Alibaba and Aurora for MySQL from Amazon, unfortunately they are all closed source. In addition, products like AWS Aurora for MySQL have inherited some of the limitations from upstream like a maximum data size (of all tables) of 128Tb. There are still other issues with these systems, and two of the bigger common issues are DDL over large datasets and automatic scale out to Petabyte storage. Finally, for people that have to maintain sharded systems, the frequent resharding everytime a table becomes too large is a major pain point. A major development that influenced the direction of innovation in the database space in the past decade was the advent of the cloud infrastructure and the convenience of its elasticity based model. One of the first innovations in the MySQL ecosystem in the cloud era came from AWS with the disaggregated compute and storage architecture in AWS Aurora for MySQL. This allowed compute and storage to scale independently and provided better end user experience. However, one major limitation that remains is that it’s still a single writer multiple reader eventually consistent system. Another growing challenge in the past decade is doing analytic queries over data gathered from an OLTP database. Users are forced to move their data to specialized OLAP systems to query their data and this ETL step is expensive, time consuming and can lead to consistency and data governance issues. MySQL does have a closed source Oracle cloud only solution called Heatwave but it’s eventually consistent, and you cannot span transactions across MySQL/InnoDB (OLTP) and MySQL/Heatwave (OLAP) unlike in (true) HTAP databases. TIDB an open source distributed SQL and HTAP database was born to take advantage of the elasticity available in the cloud and architected from the beginning to be a disaggregated compute and storage system that shards and reshards automagically and uses the Raft consensus protocol to maintain consistency across the cluster. There are TiDB SQL parser and optimizer compute nodes, TiKV storage nodes and TiFlash column storage nodes. The numbers of each type of node can vary independently. The DDL operations in TiDB are done using a documented and proven algorithm. In addition, with the HTAP capabilities you get the best of both OLTP and OLAP worlds with transactions that are consistent across both the OLTP (TiKV) and OLAP engine (TiFlash). In conclusion, this talk will briefly go over the history of MySQL and MySQL’s attempts to address the challenge and issues around big data. We’ll then present how TiDB addressed the same issues with TiDB’s mutl-cloud and on-premise offering. We will do an in-depth review of the TiDB architecture and its various components and how they come together to solve the challenge of scale out, HTAP, multi-tenancy and DDL.

Room:

Room 106

Time:

Friday, March 15, 2024 - 15:45 to 16:45

Audio/Video: