We are excited to announce that the TPC-C benchmark implementation for YugabyteDB is now open source and ready to use! While this implementation is not officially ratified by the TPC organization, it closely follows the TPC-C v5.11.0 specification.
For those new to TPC-C, the aim of the benchmark is to test how a database performs when handling transactions generated by a real-world OLTP application. This blog post shows the results of running the TPC-C benchmark in addition to outlining our experience of developing and running a TPC-C benchmark against YugabyteDB.
The results of running the above TPC-C benchmark with 10, 100, 1000, and 10,000 warehouses on a YugabyteDB cluster running in a single zone of AWS are shown below.
You can find the instructions to reproduce the above results in the benchmarking section of YugabyteDB docs. The rest of this post goes into some details about the TPC-C workload itself, how we built the benchmark tool, and our considerations when running it in public clouds.
Linear TPC-C scalability, in the context of a distributed relational database, means that a larger number of warehouses can be supported at the same high efficiency simply by adding nodes to the cluster. As shown below, we are excited to demonstrate this property in the context of YugabyteDB. YugabyteDB shows a tpmC value of 12,590 while running 1,000 warehouses on a 3-node cluster of c5d.4xlarge nodes, which is 97.90% of the theoretical maximum of 12,860. To handle scaling the workload up by a factor of 10, from 1,000 to 10,000 warehouses, the cluster was scaled out to 30 nodes. This resulted in roughly 10 times as many transactions per minute being handled, for a tpmC of 125,194 (which is 97.35% of the theoretical maximum possible tpmC value of 128,600).
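The efficiency figures above can be checked with a few lines of arithmetic. The sketch below is illustrative only: the constant of 12.86 tpmC per warehouse is an assumption derived from the numbers in this post (128,600 tpmC for 10,000 warehouses), and the function names are ours, not part of the benchmark tool.

```python
# Assumption: maximum tpmC per warehouse, derived from the figures in this
# post (128,600 tpmC for 10,000 warehouses).
MAX_TPMC_PER_WAREHOUSE = 12.86


def theoretical_max_tpmc(warehouses: int) -> float:
    """Upper bound on tpmC for a given number of warehouses."""
    return warehouses * MAX_TPMC_PER_WAREHOUSE


def efficiency_pct(measured_tpmc: float, warehouses: int) -> float:
    """Measured tpmC as a percentage of the theoretical maximum."""
    return 100.0 * measured_tpmc / theoretical_max_tpmc(warehouses)


# (warehouses, nodes, measured tpmC) as reported in this post.
runs = [(1_000, 3, 12_590), (10_000, 30, 125_194)]

for warehouses, nodes, tpmc in runs:
    print(
        f"{warehouses:>6} warehouses on {nodes:>2} nodes: "
        f"{efficiency_pct(tpmc, warehouses):.2f}% efficiency, "
        f"{tpmc / nodes:.0f} tpmC per node"
    )
```

Efficiency and per-node throughput staying nearly constant while the node count grows tenfold is what we mean by linear scalability here.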
TPC-C models a business that has warehouses, multiple districts served by each warehouse, and inventory for those warehouses, as well as items and orders for those items. The TPC-C benchmark tests five different transaction workloads, which are briefly described below.
The complete entity-relationship diagram for the TPC-C workload is shown below.
The number of warehouses is the key configurable parameter that determines the scale of the benchmark run. Increasing the number of warehouses increases the data set size, the number of concurrent clients, and the number of concurrently running transactions. A warehouse can have up to ten terminals (point-of-sale or point-of-inquiry counters), which generate transactions such as entering a new order, settling payments, and looking up the status of an existing order. TPC-C also models other behind-the-scenes activities at warehouses that result in transactions, such as finding items that need to be restocked or marking items as delivered.
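To make the effect of the warehouse count concrete, here is a rough sizing sketch. It assumes every warehouse runs its full ten terminals, and it uses the initial table cardinalities we understand the TPC-C specification to prescribe (10 districts per warehouse, 3,000 customers per district, 100,000 stock rows per warehouse, and a fixed catalog of 100,000 items). The `Scale` class and `scale_for` helper are hypothetical names for illustration and are not part of our benchmark tool.

```python
from dataclasses import dataclass

# Assumptions: the terminal count comes from the description above; the
# cardinalities are the initial population sizes from the TPC-C specification.
TERMINALS_PER_WAREHOUSE = 10
DISTRICTS_PER_WAREHOUSE = 10
CUSTOMERS_PER_DISTRICT = 3_000
STOCK_ROWS_PER_WAREHOUSE = 100_000
ITEMS = 100_000  # fixed; does not grow with the number of warehouses


@dataclass(frozen=True)
class Scale:
    warehouses: int
    terminals: int
    districts: int
    customers: int
    stock_rows: int
    items: int


def scale_for(warehouses: int) -> Scale:
    """Estimate concurrent terminals and initial row counts for a run."""
    districts = warehouses * DISTRICTS_PER_WAREHOUSE
    return Scale(
        warehouses=warehouses,
        terminals=warehouses * TERMINALS_PER_WAREHOUSE,
        districts=districts,
        customers=districts * CUSTOMERS_PER_DISTRICT,
        stock_rows=warehouses * STOCK_ROWS_PER_WAREHOUSE,
        items=ITEMS,
    )


for w in (10, 100, 1_000, 10_000):
    print(scale_for(w))
```

Everything except the item catalog grows linearly with the number of warehouses, which is why both the data set and the client concurrency scale together.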