SSD performance measurement: Best practices
SSDs have pronounced write-history sensitivity, which means that they have unique requirements for accurately measuring their performance. This article documents how to measure enterprise SSD performance, with an emphasis on the following points:
• Main goals of performance measurement (accuracy, consistency, and repeatability)
• Key assumptions about enterprise SSD requirements
• Definitions of commonly used terms
• Tools used
• Metrics of primary interest
• Test mechanics, including:
o Using a common measurement tool to construct a write-saturation plot
This article cites empirical data demonstrating the main NAND-based SSD performance regions—FOB (fresh-out-of-box), transition, and steady state—and shows an example of the markedly different results that can be achieved (on the same drive) depending on the order in which the stimuli are applied. It also shows alignment to the Storage Networking Industry Association (SNIA)'s Performance Test Specification protocol, provides explicit details on our precondition and measurement techniques (along with an example), and shows the relevance of the collected metrics.
Enterprise SSD performance measurement is concerned primarily with accuracy, consistency, and repeatability. It is based on some key assumptions about enterprise SSDs and the enterprise market, specifically:
• Drive fill state
• Access and I/O traffic patterns
• Decision criteria (for purchase and deployment)
Accuracy, Consistency, Repeatability
The testing used is designed to minimize the impact of external elements, specifically the influence of the operating system, host configuration, and target variables, to enable reliable, relative comparisons.
To do this, the tests discussed here use an invariant server-grade operating system and apply only predetermined service packs and hotfixes. Automatic patch installation is disabled, target cache options are set manually, and all host components have the same BIOS settings, firmware version, and drivers installed. This ensures run-to-run, host-to-host, and drive-to-drive measurement consistency. For purposes of the tests discussed, drives are always put in the same state at the beginning of each measurement, preconditioned the same way for each measurement, and stimulated to the same performance state—the test process is deterministic.
When measuring performance for enterprise SSDs, it is assumed that the requirements are very different from those of client SSDs.
Enterprise SSD Performance Measurement Assumptions
• Drive fill state: The drive is always 100% full.
• Accesses: It is being accessed 100% of the time (that is, the drive gets no interface idle time).
• Decisions: The enterprise market chooses enterprise SSDs based on their performance in steady state; note that steady state, full, and worst case are not the same thing.
• Consequences of failure: Failure is catastrophic for multiple users.
Client SSD Performance Measurement Assumptions
• Drive fill state: The drive has less than 50% of its user space occupied.
• Accesses: It is accessed a maximum of 8 hours a day, 5 days a week (but typically is written much, much less frequently).
• Decisions: The client market chooses client SSDs based on their performance in the FOB state.
• Consequences of failure: Failure is catastrophic for a single user.
Furthermore, it is assumed that the focus of enterprise SSD performance measurement should include the intended workload. For example, an enterprise SSD intended to accelerate a database should be measured at least using a typical database workload (typically 8K transfer size; random; two-thirds read, one-third write traffic; fully loaded queue); if the same drive is intended to be a cache for streaming files, it should be measured with that workload (larger blocks, read-intensive, fully loaded queue).
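As an illustration, the database mix described above can be expressed as a stream of I/O descriptors. The sketch below is ours, not part of any specification: the function name, parameters, and tuple format are assumptions, and a real harness would submit these requests at a fully loaded queue through a measurement tool rather than generate them by hand.

```python
import random

def database_workload(drive_bytes, n_ios, block=8192, read_fraction=2/3, seed=0):
    """Sketch of the typical database stimulus: 8K transfers, random
    block-aligned offsets, two-thirds reads / one-third writes.
    Yields (op, offset, length) tuples for a driver to submit."""
    rng = random.Random(seed)
    blocks = drive_bytes // block
    for _ in range(n_ios):
        op = "read" if rng.random() < read_fraction else "write"
        offset = rng.randrange(blocks) * block   # random, aligned to the transfer size
        yield (op, offset, block)
```

The streaming-cache case differs only in the parameters: a larger `block` and a `read_fraction` near 1.0.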
Because SSD performance can change as it is written, enterprise performance measurement focuses on the steady-state performance region. The following definitions are used in measurements.
This paper assumes a precise, two-part definition of steady state, from the SNIA Solid State Storage Initiative’s Performance Test Specification:
• Max(y) – Min(y) within the measurement window is no more than 20% of the Ave(y) within the measurement window, AND
• [Max(y) as defined by the linear curve fit of the data within the measurement window] - [Min(y) as defined by the linear curve fit of the data within the measurement window] is within 10% of the Ave(y) within the measurement window.
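The two criteria can be checked mechanically at the end of each test round. The sketch below is our own illustration, assuming the measurement window is a list of per-round values (e.g., IOPS); because the curve fit is linear, its max-minus-min excursion reduces to |slope| times the window span.

```python
def is_steady_state(window):
    """Check one measurement window (a list of per-round values, e.g. IOPS)
    against the two SNIA PTS steady-state criteria."""
    n = len(window)
    avg = sum(window) / n
    # Criterion 1: data excursion. Max(y) - Min(y) must be no more than 20% of Ave(y).
    if max(window) - min(window) > 0.20 * avg:
        return False
    # Criterion 2: slope excursion. Least-squares linear fit over the window;
    # the fit's max minus min, i.e. |slope| * (n - 1), must be within 10% of Ave(y).
    x_mean = (n - 1) / 2
    slope = sum((x - x_mean) * (y - avg) for x, y in enumerate(window)) \
            / sum((x - x_mean) ** 2 for x in range(n))
    return abs(slope) * (n - 1) <= 0.10 * avg
```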
A full drive is one that has had some multiple (possibly 1X) of the user-accessible LBA space written with a fixed pattern that may differ from the test stimulus (e.g., 2X the user LBA space written sequentially with 128K transfers).
Worst case performance is when the drive has been stimulated over some fixed time with a workload intentionally designed to demonstrate the drive’s worst possible performance. For example, this type of stimulus may include (but is not limited to) small transfers mixed with large transfers and intentionally misaligned writes.
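One way to picture such a stimulus is as a descriptor stream like the one above. The sketch below is purely illustrative: the transfer sizes and the 512-byte misalignment are our assumptions, not part of any specification.

```python
import random

def worst_case_stimulus(drive_bytes, n_ios, seed=0):
    """Illustrative worst-case write stimulus: small and large transfers
    mixed, with offsets intentionally misaligned relative to 4K boundaries."""
    rng = random.Random(seed)
    sizes = [512, 4096, 65536, 1048576]   # mix small transfers with large ones
    for _ in range(n_ios):
        size = rng.choice(sizes)
        # Pick a random 4K block, then shift by one 512B sector so every
        # write is deliberately misaligned relative to 4K boundaries.
        offset = rng.randrange((drive_bytes - size) // 4096) * 4096 + 512
        yield ("write", offset, size)
```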
The mechanics of enterprise SSD performance measurement discussed in this paper include:
Figure 1: Test-flow sequence
Purge: Regardless of what has been done previously to the SSD, put the drive in a known, fixed state that emulates the state in which the drive would be received from the manufacturer. This is the fresh-out-of-box (FOB) state. This paper refers to using the secure erase command to place sample SATA SSDs in this state; other methods may be protocol- or vendor-specific.
Precondition: Following SNIA’s Performance Test Specification for workload-independent preconditioning, write the drive with 128KB sequential transfers aligned to 4K boundaries.
Test: For any metric of interest, test only one metric at a time, and test into steady state.
Collect & Report: At the end of the test run, results are compiled, drive performance from FOB to steady state is analyzed, and then steady-state performance values are recorded.
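The four steps above can be sketched as a single driver loop. This is a hypothetical orchestration, not a real harness: `purge`, `precondition`, and `run_round` stand in for tool- and protocol-specific operations, and `steady` is a simplified stand-in for the full two-part steady-state check (data excursion only).

```python
def steady(window):
    # Simplified steady-state check: max minus min within the window
    # is no more than 20% of the window average (data excursion only).
    avg = sum(window) / len(window)
    return max(window) - min(window) <= 0.20 * avg

def measure(drive, purge, precondition, run_round, max_rounds=25, window=4):
    """Run the test-flow sequence for one metric; purge, precondition, and
    run_round are callables supplied by the harness (e.g. secure erase,
    128KB sequential fill, one timed test round returning IOPS)."""
    purge(drive)                    # known, fixed FOB state
    precondition(drive)             # workload-independent preconditioning
    history = []
    for _ in range(max_rounds):
        history.append(run_round(drive))              # one metric at a time
        if len(history) >= window and steady(history[-window:]):
            break                                     # reached steady state
    return history    # analyze FOB to steady state; report the final window
```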
Setting Up the Test Platform
This section gives an example of host configuration using Microsoft Windows Server 2008; other operating systems should be treated similarly.
To ensure run-to-run consistency, fix a service pack and patch level, and then disable automatic operating system updates: automatic installation may cause a system restart without regard to a test that may be running, which can result in run-to-run variance. Note: This setting may not be in accordance with your personal or corporate security policy. Micron assumes no responsibility for any consequences of this setting or its adjustment; this information is provided for reference only. Because most SSD performance measurement is done on an isolated system, this may be less of an issue, but you should know and understand your corporate security policy.
Figure 2: Change operating system settings
Next, you need to ensure that the SSD is recognized. First, open the Disk Manager, locate the SSD to be tested, and then check that the drive does not contain a partition. Although the disk number assigned to the SSD may vary, ensure that the SSD is visible, as shown in the center of Figure 3. You may have to mark the SSD as online manually, and you may also have to initialize the SSD. (Note: Initializing the SSD is not the same as formatting it. You should see a prompt for initializing the disk; click Yes, and then select the MBR option.)
Figure 3: Locate the SSD and check partitioning
If the SSD being tested does not support a complete power-backup solution (all volatile areas protected by an internal power-backup mechanism), then, for purposes of this discussion, the write cache option is disabled, as shown in Figure 4.
Figure 4: Disable write caching