蚍蜉实验室

Sysbench OLTP Benchmark

Original link: http://www.storagereview.com/sysbench_oltp_benchmark

The Sysbench OLTP application benchmark runs on top of a MySQL database using the InnoDB storage engine. The storage engine's job is to manage the interface between the on-disk database and the applications reading and writing data through it. To do that, the engine manages IO threads and logs, and keeps an in-memory cache to minimize disk access.

Since the InnoDB engine keeps an in-memory cache called the buffer pool, performance will be directly affected by the ratio of the working set size to the size of the buffer pool. In other words, if the buffer pool is large enough to hold the working set, most operations will never be IO bound. However, if the database is too large to fit in memory, then IO performance will dictate transactional response times and throughput. We are characterizing the performance of the drive in this second situation, where the database does not fit in memory and the InnoDB storage engine therefore generates heavy IO traffic.
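
As a rough illustration of that relationship, the sketch below estimates the buffer pool hit rate for a uniformly random access pattern, using numbers that mirror this article's configuration (roughly a 260GB database against a 24GB buffer pool). It ignores access skew and InnoDB's actual replacement policy, so treat it only as a back-of-envelope model.

    # Back-of-envelope estimate of buffer pool hit rate for a uniformly random
    # access pattern. Real InnoDB behavior depends on access skew and the LRU
    # midpoint-insertion policy, so this is an illustration, not a model of the engine.
    def estimated_hit_rate(working_set_gb: float, buffer_pool_gb: float) -> float:
        """Fraction of page requests served from RAM when every page in the
        working set is equally likely to be touched."""
        if working_set_gb <= buffer_pool_gb:
            return 1.0                      # the whole working set fits in RAM
        return buffer_pool_gb / working_set_gb

    # This article's configuration: ~260GB database versus a 24GB buffer pool.
    print(estimated_hit_rate(260, 24))      # ~0.09, so most reads fall through to disk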

Multiple threads (typically 32 to 128, depending on the database configuration for high-performance applications) read data at random from the backing store in 16KB blocks. These reads correspond to database queries requesting data from the backing store. As reads are processed, the pages are cached in the buffer pool. As the buffer pool fills up, InnoDB uses an LRU (Least Recently Used) policy to evict older pages and make room for newer data.
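
The short Python sketch below shows the general idea of LRU eviction on a page cache. It is a deliberately minimal stand-in for the buffer pool; the real InnoDB implementation uses a midpoint-insertion LRU with young and old sublists plus background page cleaners.

    from collections import OrderedDict

    # Simplified LRU page cache illustrating how a full buffer pool evicts the
    # least recently used 16KB page to make room for new data.
    class SimpleBufferPool:
        def __init__(self, capacity_pages: int):
            self.capacity = capacity_pages
            self.pages = OrderedDict()           # page_id -> 16KB page payload

        def read_page(self, page_id, read_from_disk):
            if page_id in self.pages:
                self.pages.move_to_end(page_id)  # hit: mark as most recently used
                return self.pages[page_id]
            data = read_from_disk(page_id)       # miss: issue a 16KB random read
            self.pages[page_id] = data
            if len(self.pages) > self.capacity:
                self.pages.popitem(last=False)   # evict the least recently used page
            return data

    pool = SimpleBufferPool(capacity_pages=4)
    for pid in [1, 2, 3, 4, 1, 5]:               # page 2 is the eviction victim when 5 arrives
        pool.read_page(pid, lambda p: b"\x00" * 16384)
    print(list(pool.pages))                      # [3, 4, 1, 5]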

Database writes are first directed to the transaction log and to the buffer pool. The transaction log is a sequentially written ring buffer that is updated for every write transaction. Depending on the MySQL configuration, this update can lead to an immediate on-disk write, or it can persist temporarily in RAM as a buffer that is eventually flushed to disk by the file system. The size of this log buffer varies, but is usually around 256MB. Recommended settings for ACID compliance require that the log writes hit the disk on every database commit. These writes are 4K in length.
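
The sketch below contrasts the two durability policies just described: forcing the log to disk on every commit (the ACID-compliant setting, in the spirit of innodb_flush_log_at_trx_commit=1) versus letting records accumulate in a RAM buffer that is flushed later. The file name and record format are invented for the example; this is not InnoDB's actual log layout.

    import os

    # Minimal sketch of the two commit-durability policies: fsync on every
    # commit, or buffer in RAM and flush in the background.
    class RedoLogSketch:
        def __init__(self, path="redo_log_demo.bin", flush_per_commit=True):
            self.fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o644)
            self.flush_per_commit = flush_per_commit
            self.buffer = bytearray()            # stand-in for the in-RAM log buffer

        def commit(self, record: bytes):
            self.buffer += record                # each commit appends a small (~4K) record
            if self.flush_per_commit:
                os.write(self.fd, self.buffer)
                os.fsync(self.fd)                # force the record onto stable storage
                self.buffer.clear()

        def background_flush(self):
            if self.buffer:                      # relaxed policy: flushed later, so a
                os.write(self.fd, self.buffer)   # crash can lose the most recent commits
                os.fsync(self.fd)
                self.buffer.clear()

    log = RedoLogSketch(flush_per_commit=True)
    log.commit(b"redo record for one transaction".ljust(4096, b"\x00"))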

The log only performs “physiological” writes, recording the delta between the previous data and the new data. The data pages themselves are modified in the buffer pool in RAM and have to be drained to disk asynchronously as well. The process of writing this data back to the file system is configurable and occurs in the background; these writes are also 16KB. They are actually double-writes: the engine first writes the data to an intermediate location called the doublewrite buffer, then copies it to its final location in the file system. This is necessary to avoid the “torn page” problem, where a 16KB database page is only partially written to disk due to a power loss or other catastrophic event.
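
A conceptual sketch of the doublewrite sequence is shown below: the full 16KB page is first made durable in a scratch area, and only then written to its final offset, so a crash mid-write always leaves at least one intact copy to recover from. The file names and layout are illustrative only and do not reflect InnoDB's on-disk format.

    import os

    PAGE_SIZE = 16 * 1024

    # Conceptual doublewrite sequence: persist the page twice so that a torn
    # write at either location can be repaired from the other copy.
    def doublewrite(datafile_fd, doublewrite_fd, page_no: int, page: bytes):
        assert len(page) == PAGE_SIZE
        os.pwrite(doublewrite_fd, page, 0)                 # 1. write to the doublewrite area
        os.fsync(doublewrite_fd)                           # 2. make that copy durable
        os.pwrite(datafile_fd, page, page_no * PAGE_SIZE)  # 3. write to the final location
        os.fsync(datafile_fd)                              # 4. make the final copy durable

    dw = os.open("doublewrite_demo", os.O_RDWR | os.O_CREAT, 0o644)
    df = os.open("datafile_demo", os.O_RDWR | os.O_CREAT, 0o644)
    doublewrite(df, dw, page_no=3, page=b"\xab" * PAGE_SIZE)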

Origins of the Sysbench Benchmark

We brought the Sysbench test setup into our lab after several conversations with Micron about how they simulate real-world MySQL application environments for SSD testing and measurement. They developed a test methodology around Sysbench after finding that synthetic storage benchmarks rarely provide a complete view of how a drive will behave under a particular application workload. The Sysbench test allows Micron to simulate an environment that most closely represents a standard MySQL database workload, which is prevalent in applications such as Facebook, Craigslist and Booking.com.

We worked particularly closely with Moussa Ba, who helped co-author this article. Moussa is a software engineer on Micron’s PCIe development team where his work includes application and system software optimization for high performance IO devices.

Sysbench OLTP Benchmark

Sysbench is a system performance benchmark that includes an OnLine Transaction Processing (OLTP) test profile. The OLTP test is not a synthetic approximation of an OLTP workload; it is a true database-backed benchmark that runs transactional queries against an instance of MySQL in a CentOS environment.

The first step in setting up the benchmark is to create the database itself, which is done by specifying the number of tables in the database as well as the number of rows per table. In our test, we defined 100 tables with 10 million rows each, resulting in a database of 1 billion rows that occupied 260GB on disk.
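
For reference, a dataset like this can be generated with sysbench's prepare step. The wrapper below is a hypothetical helper using sysbench 1.0 option names (the 0.5 series used --oltp-tables-count and --oltp-table-size with an oltp.lua script instead); the database credentials are placeholders to adapt to your environment.

    import subprocess

    # Hypothetical helper that prepares a 100-table, 10M-rows-per-table dataset
    # like the one described in this article (sysbench 1.0 oltp_read_write syntax).
    def sysbench_prepare(tables=100, rows_per_table=10_000_000,
                         db="sbtest", user="root", password=""):
        cmd = [
            "sysbench", "oltp_read_write",
            f"--tables={tables}",
            f"--table-size={rows_per_table}",
            f"--mysql-db={db}",
            f"--mysql-user={user}",
            f"--mysql-password={password}",
            "prepare",
        ]
        subprocess.run(cmd, check=True)

    if __name__ == "__main__":
        sysbench_prepare()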

Sysbench has two operating modes: a default mode, which reads and writes to the database, and a read-only mode. The default R/W mode executes the following query types: 5 SELECT queries, 2 UPDATE queries, 1 DELETE query and 1 INSERT query. Looking at IO counts, the observed read/write ratio is about 75% reads and 25% writes.
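
A matching run-phase sketch is shown below, covering both the default read/write profile at 32 threads and the read-only mode. Again the option names follow sysbench 1.0 (older releases used --num-threads and an --oltp-read-only switch), and the duration and credentials are placeholders.

    import subprocess

    # Hypothetical run-phase helper: default read/write profile or read-only
    # profile, 32 client threads, periodic TPS/latency reporting.
    def sysbench_run(read_only=False, threads=32, duration_s=600,
                     db="sbtest", user="root", password=""):
        profile = "oltp_read_only" if read_only else "oltp_read_write"
        cmd = [
            "sysbench", profile,
            "--tables=100", "--table-size=10000000",
            f"--threads={threads}",
            f"--time={duration_s}",
            "--report-interval=10",          # print TPS/latency every 10 seconds
            f"--mysql-db={db}", f"--mysql-user={user}", f"--mysql-password={password}",
            "run",
        ]
        subprocess.run(cmd, check=True)

    if __name__ == "__main__":
        sysbench_run(read_only=False)        # ~75% reads / 25% writes at the IO level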

Sysbench Testing Environment

Storage solutions are tested with the Sysbench OLTP benchmark in the StorageReview Enterprise Test Lab utilizing stand-alone servers. We currently utilize off-the-shelf channel PowerEdge R730s from Dell to show both realistic performance and a solid price/performance ratio, changing only the storage adapter or network interface to connect our R730 to different storage products. The PowerEdge R730 has proven itself to offer great compatibility with third-party devices, making it an excellent go-to platform for this diverse testing environment. The R730 also leverages Intel’s powerful Haswell-class architecture, which gives us the compute power to properly stress a wide range of storage offerings and maximize their performance potential.

First Generation Sysbench Benchmark Environment

Lenovo ThinkServer RD630 - SATA/SAS/PCIe Testing Platform

  • 2594-ABU Topseller Model
  • Dual Intel E5-2650 CPUs (2.0GHz, 8-cores, 20MB Cache)
  • 128GB RAM (8GB x 16 DDR3, 64GB per CPU)
  • 100GB Micron RealSSD P400e SSD (via LSI 9207-8i) Boot Drive
  • 960GB Micron M500 (via 9207-8i) Pre-built Database Storage
  • CentOS 6.3 64-bit
  • Percona XtraDB 5.5.30-rel30.1
    • Database Tables: 100
    • Database Size: 10,000,000 rows per table
    • Database Threads: 32
    • RAM Buffer: 24GB

Second Generation Sysbench Benchmark Environment

Dell PowerEdge R730 - SATA/SAS/PCIe Testing Platform

  • Dual Intel E5-2690 v3 CPUs (2.6GHz, 12-cores, 30MB Cache)
  • 256GB RAM (16GB x 16 DDR4, 128GB per CPU)
  • 100GB Boot SSD, 480GB Database Storage SSD
  • 2 x Mellanox ConnectX-3 InfiniBand Adapter
  • 2 x Emulex 16Gb dual-port FC HBA
  • 2 x Emulex 10GbE dual-port NIC
  • CentOS 6.6 64-bit
  • Percona XtraDB 5.5.30-rel30.1
    • Database Tables: 100
    • Database Size: 10,000,000 rows per table
    • Database Threads: 32
    • RAM Buffer: 24GB

Dell PowerEdge R730 Virtualized Sysbench 4-node Cluster

  • Eight Intel E5-2690 v3 CPUs for 249GHz in cluster (Two per node, 2.6GHz, 12-cores, 30MB Cache)
  • 1TB RAM (256GB per node, 16GB x 16 DDR4, 128GB per CPU)
  • SD Card Boot (Lexar 16GB)
  • 4 x Mellanox ConnectX-3 InfiniBand Adapter (vSwitch for vMotion and VM network)
  • 4 x Emulex 16Gb dual-port FC HBA
  • 4 x Emulex 10GbE dual-port NIC
  • VMware ESXi vSphere 6.0 / Enterprise Plus 8-CPU

The main goal with this platform is to highlight how enterprise storage performs in an actual enterprise environment and workload, instead of relying on synthetic or pseudo-synthetic workloads. Synthetic workload generators are great at showing how well storage devices perform against a clean, repeatable I/O pattern, but they don’t take into account the outside variables that determine how devices actually behave in production environments, and they will never replicate a true production environment. Running an application on top of the storage begins to show how well the storage interacts with its drivers, the local operating system, the application being tested, the network stack, the network switching and external servers. These are variables a synthetic workload generator simply can’t capture. The trade-off is that application-level testing is an order of magnitude more resource- and infrastructure-intensive in terms of the equipment required to execute this particular benchmark.

Overall Sysbench Performance Results

We test a wide range of storage solutions with the Sysbench OLTP benchmark, provided they meet the minimum requirements of the testing environment. To qualify for testing, the storage device must have a usable capacity exceeding 260GB and be geared towards operating under stressful enterprise conditions. Locally-attached storage devices such as SAS, SATA and PCIe SSDs are tested on a bare-metal server with one Sysbench instance. Newer SAN and hyper-converged platforms run either 4, 8, 12 or 16 VMs simultaneously to show how well multiple workloads operate at the same time on each. This testing methodology helps demystify the performance comparison between newer hyper-converged systems and traditional SAN storage arrays.

Hyper-Converged / SAN Virtualized Sysbench Performance Results (16 VM Aggregate)

  • X-IO ISE 860
    Configuration: (4) Dell R730, X-IO ISE 860 AFA, (2) 10TB Volumes
    32-Thread Aggregate TPS: 6625; Average Response Time: 80ms; 99th Percentile Latency: 418ms

Hyper-Converged / SAN Virtualized Sysbench Performance Results (12 VM Aggregate)

  • X-IO ISE 860
    Configuration: (4) Dell R730, X-IO ISE 860 AFA, (2) 10TB Volumes
    32-Thread Aggregate TPS: 7160; Average Response Time: 54ms; 99th Percentile Latency: 177ms

Hyper-Converged / SAN Virtualized Sysbench Performance Results (8 VM Aggregate)

  • X-IO ISE 860
    Configuration: (4) Dell R730, X-IO ISE 860 AFA, (2) 10TB Volumes
    32-Thread Aggregate TPS: 6568; Average Response Time: 39ms; 99th Percentile Latency: 83ms
  • VMware VSAN (ESXi 6.0)
    Configuration: (4) Dell R730xd, 80 1.2TB HDDs, 16 800GB SSDs
    32-Thread Aggregate TPS: 4259; Average Response Time: 60ms; 99th Percentile Latency: 131ms

Hyper-Converged / SAN Virtualized Sysbench Performance Results (4 VM Aggregate)

  • DotHill Ultra48 Hybrid
    Configuration: (4) Dell R730, DotHill Ultra48 Hybrid, (2) 14-disk RAID1 Pools, 40 1.8TB HDDs, 8 400GB SSDs
    32-Thread Aggregate TPS: 4645; Average Response Time: 28ms; 99th Percentile Latency: 51ms; Peak Latency: 676ms
  • X-IO ISE 860
    Configuration: (4) Dell R730, X-IO ISE 860 AFA, (2) 10TB Volumes
    32-Thread Aggregate TPS: 4424; Average Response Time: 29ms; 99th Percentile Latency: 57ms; Peak Latency: 983ms
  • VMware VSAN (ESXi 6.0)
    Configuration: (4) Dell R730xd, 80 1.2TB HDDs, 16 800GB SSDs
    32-Thread Aggregate TPS: 2830; Average Response Time: 45ms; 99th Percentile Latency: 94ms; Peak Latency: 480ms
  • Nutanix NX-8150 (ESXi 6.0)
    Configuration: (4) NX-8150, 80 1TB HDDs, 16 800GB SSDs, RAID0 x 4 vDisk Database Volume
    32-Thread Aggregate TPS: 2390; Average Response Time: 54ms; 99th Percentile Latency: 173ms; Peak Latency: 4784ms
  • Nutanix NX-8150 (ESXi 6.0)
    Configuration: (4) NX-8150, 80 1TB HDDs, 16 800GB SSDs, Default Database Deployment
    32-Thread Aggregate TPS: 1422; Average Response Time: 90ms; 99th Percentile Latency: 216ms; Peak Latency: 5508ms

PCIe Application Accelerator / Multi-SSD/HDD RAID Sysbench Performance Results

  • Huawei ES3000 2.4TB (MLC PCIe SSD x 1)
    32-Thread Average TPS: 2734.69; Average Response Time: 11.7ms; 99th Percentile Latency: 19.84ms
  • Huawei ES3000 1.2TB (MLC PCIe SSD x 1)
    32-Thread Average TPS: 2615.12; Average Response Time: 12.23ms; 99th Percentile Latency: 21.80ms
  • Fusion ioDrive2 Duo 2.4TB (MLC PCIe SSD x 1)
    32-Thread Average TPS: 2521.06; Average Response Time: 12.69ms; 99th Percentile Latency: 23.92ms
  • Micron P320h 700GB (SLC PCIe SSD x 1)
    32-Thread Average TPS: 2443.56; Average Response Time: 13.09ms; 99th Percentile Latency: 22.45ms
  • Micron P420m 1.4TB (MLC PCIe SSD x 1)
    32-Thread Average TPS: 2361.29; Average Response Time: 13.55ms; 99th Percentile Latency: 25.84ms
  • Fusion ioDrive2 1.2TB (MLC PCIe SSD x 1)
    32-Thread Average TPS: 2354.06; Average Response Time: 13.59ms; 99th Percentile Latency: 29.35ms
  • Virident FlashMAX II 2.2TB (MLC PCIe SSD x 1)
    32-Thread Average TPS: 2278.11; Average Response Time: 14.04ms; 99th Percentile Latency: 26.04ms
  • LSI Nytro WarpDrive 800GB (MLC PCIe SSD x 1)
    32-Thread Average TPS: 1977.67; Average Response Time: 16.18ms; 99th Percentile Latency: 39.94ms
  • LSI Nytro WarpDrive 400GB (MLC PCIe SSD x 1)
    32-Thread Average TPS: 1903.14; Average Response Time: 16.81ms; 99th Percentile Latency: 39.30ms

Individual SAS / SATA SSD Results

  • Toshiba HK3R2 960GB (MLC SATA x 1)
    32-Thread Average TPS: 1673.23; Average Response Time: 19.12ms; 99th Percentile Latency: 49.65ms
  • SanDisk CloudSpeed Eco 960GB (cMLC SATA x 1)
    32-Thread Average TPS: 1556.99; Average Response Time: 20.55ms; 99th Percentile Latency: 49.05ms
  • Intel S3700 800GB (eMLC SATA x 1)
    32-Thread Average TPS: 1488.71; Average Response Time: 21.49ms; 99th Percentile Latency: 40ms
  • Toshiba PX02SM 400GB (eMLC SAS 6Gb/s x 1)
    32-Thread Average TPS: 1487.03; Average Response Time: 21.52ms; 99th Percentile Latency: 62.02ms
  • Smart Optimus 400GB (eMLC SAS x 1)
    32-Thread Average TPS: 1477.1; Average Response Time: 21.66ms; 99th Percentile Latency: 52.69ms
  • OCZ Talos 2 C 480GB (MLC SAS x 1)
    32-Thread Average TPS: 1438.7; Average Response Time: 22.24ms; 99th Percentile Latency: 47.32ms
  • OCZ Intrepid 3600 400GB (eMLC SAS x 1)
    32-Thread Average TPS: 1335.3; Average Response Time: 23.96ms; 99th Percentile Latency: 48.42ms
  • OCZ Talos 2 R 400GB (eMLC SAS x 1)
    32-Thread Average TPS: 1421.15; Average Response Time: 22.51ms; 99th Percentile Latency: 45.06ms
  • SanDisk Extreme 480GB (MLC SATA x 1)
    32-Thread Average TPS: 1303.48; Average Response Time: 24.55ms; 99th Percentile Latency: 53.56ms
  • STEC s842 800GB (eMLC SAS x 1)
    32-Thread Average TPS: 1293.56; Average Response Time: 24.74ms; 99th Percentile Latency: 67.20ms
  • Intel S3500 512GB (eMLC SATA x 1)
    32-Thread Average TPS: 1287.65; Average Response Time: 24.85ms; 99th Percentile Latency: 64.15ms
  • Intel S3500 480GB (eMLC SATA x 1)
    32-Thread Average TPS: 1241.59; Average Response Time: 25.77ms; 99th Percentile Latency: 54.27ms
  • Seagate 600 Pro 400GB (eMLC SATA x 1)
    32-Thread Average TPS: 1198.2; Average Response Time: 26.7ms; 99th Percentile Latency: 62.69ms
  • Hitachi SSD400S.B 400GB (SLC SAS x 1)
    32-Thread Average TPS: 1191.47; Average Response Time: 26.85ms; 99th Percentile Latency: 47.87ms
  • OCZ Vector 512GB (MLC SATA x 1)
    32-Thread Average TPS: 1130.03; Average Response Time: 28.32ms; 99th Percentile Latency: 62.67ms
  • SanDisk Extreme II 480GB (MLC SATA x 1)
    32-Thread Average TPS: 981.34; Average Response Time: 32.61ms; 99th Percentile Latency: 102.58ms
  • Hitachi SSD400M (eMLC SAS x 1)
    32-Thread Average TPS: 878.9; Average Response Time: 36.41ms; 99th Percentile Latency: 68.03ms
  • Toshiba eSSD (SLC SAS x 1)
    32-Thread Average TPS: 758.41; Average Response Time: 42.19ms; 99th Percentile Latency: 140.59ms
  • Micron M500 480GB (MLC SATA x 1)
    32-Thread Average TPS: 668.6; Average Response Time: 47.86ms; 99th Percentile Latency: 461.80ms