SSD Benchmarking and Performance Tuning: pitfalls and recommendations

SSDs are complex devices (see this post). Even though most SSDs have HDD-like interfaces and can be used as HDD replacements, they are very different from HDDs, and there are some issues and pitfalls you need to be aware of when you benchmark an SSD. In this post I want to cover some of these issues. Some graphs of real SSD benchmarks are attached to illustrate the described issues.

I. SSD (write) performance is not constant over time

Most SSDs change their write performance over time. This happens mainly due to the internal data mapping and allocation schemes. When the disk is empty (i.e. it is trimmed or secure erased), the placement logic has little difficulty finding a place for the new data. After a while the disk fills up, and the placement logic may need to do some cleaning work to make room for the data.

GSkill's FALCON write performance @ 16k, spans from 80G to 120G

The actual sustained write performance is affected by the write pattern (random or sequential), the write block sizes, the cleaning/allocation logic, and most importantly the write span (the relative area of the disk that the benchmark uses). In general, most (non-enterprise-level) SSDs perform better as you write on a smaller span. The graph above demonstrates how the write performance degrades over time and stabilizes on a different baseline for each span. The disk is a ~120G disk, and the tested spans are 80G to 119G.
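One way to find the baseline a run settles on is to look for the final plateau in the measured throughput. The following is only a minimal sketch: the function name, the 5% tolerance, the minimum plateau length and the sample values are all assumptions made for illustration, not measurements from the graph.

```python
from statistics import mean

def final_plateau_start(samples, tolerance=0.05, min_len=5):
    """Return the earliest index from which ALL remaining samples stay
    within `tolerance` (relative) of their mean, or None if there is no
    such tail of at least `min_len` samples."""
    for start in range(len(samples) - min_len + 1):
        tail = samples[start:]
        avg = mean(tail)
        if avg > 0 and all(abs(s - avg) <= tolerance * avg for s in tail):
            return start
    return None

# Hypothetical MB/s samples, one per measurement interval: fast while the
# disk is fresh, then degrading and settling on a lower baseline.
throughput = [200, 198, 150, 110, 80, 62, 60, 61, 59, 60, 61]
start = final_plateau_start(throughput)
baseline = mean(throughput[start:]) if start is not None else None
print(start, baseline)  # prints: 5 60.5
```

Looking at the tail of the run rather than at the first stable window matters here: a freshly trimmed disk is also "stable" for a while, just at a rate it cannot sustain.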

II. SSD’s read and write performance is asymmetric

Basically, HDD’s read and write performance is symmetric. This is not the case for SSDs mainly due to the following reasons:

  1. The flash media read/write performance is asymmetric: reads are much faster than writes (~50 usec for a single page read operation vs. ~800 usec for a single page write operation, on average).
  2. The flash media does not support a re-write operation. You have to erase the entire block (an area of ~128 pages) before you can rewrite a single page. As the erase operation is relatively slow (~2 msec) and using it requires a costly read/modify/write (RMW) operation, most SSDs avoid such RMW operations by using large write-behind buffers, write combining, and complex mapping, allocation and cleaning logic.
  3. The internal SSD’s write and read logic is asymmetric by design. For example, due to the internal mapping layer, the read logic only uses the mapping meta-data, while the write logic may need to (and in most cases has to) change the mapping.
  4. Due to the placement/cleaning logic, a single user write may require several internal writes to complete (this is the famous “write amplification”).
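
To get a feeling for what the numbers in the list above mean, here is a back-of-the-envelope sketch. The page latencies come from the list; the 4 KiB page size, the write amplification factor of 4, and the idea of a single flash die serving one page at a time are assumptions for illustration only (real drives run several dies and channels in parallel).

```python
PAGE_READ_US = 50     # from the text: ~50 usec per page read
PAGE_WRITE_US = 800   # from the text: ~800 usec per page write
PAGE_SIZE = 4096      # assumption: 4 KiB flash page

def mb_per_second(latency_us, page_size=PAGE_SIZE):
    """Throughput of one die that serves one page per `latency_us`."""
    pages_per_second = 1_000_000 / latency_us
    return pages_per_second * page_size / 1_000_000

def user_write_rate(raw_write_mb_s, write_amplification):
    """User-visible write rate when each user write costs
    `write_amplification` internal writes."""
    return raw_write_mb_s / write_amplification

read_rate = mb_per_second(PAGE_READ_US)
write_rate = mb_per_second(PAGE_WRITE_US)
amplified = user_write_rate(write_rate, write_amplification=4)  # hypothetical factor

print(f"read  {read_rate:.2f} MB/s per die")
print(f"write {write_rate:.2f} MB/s per die")
print(f"write {amplified:.2f} MB/s per die with write amplification 4")
```

Under these assumptions the raw media is already 16 times slower for writes than for reads, and write amplification divides the user-visible write rate further.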

GSkill's read test

GSkill's write test

Worse, the read/write combination (mix) is even more complex due to many internal read/write arbitration issues (that are beyond the scope of this post). Still, as many applications do reads and writes at the same time, the read/write mix patterns may be very important, sometimes much more important than the pure read or pure write patterns.

III. SSD’s performance is stateful

In addition to the data placement problems mentioned above, the SSD’s complex internal data mapping can affect the performance results in many other ways. For example, the sequential write performance of a disk that is filled with random (access) writes may be different compared to the exact same benchmark done on a disk that is filled with sequential writes. In fact, some SSDs’ sequential write performance may be reduced to the random access write performance if the disk is filled with random writes! Another small issue is that reading an unmapped block (i.e. a block that was never written or that was trimmed) is different from reading a mapped one.

IV. SSDs are parallel devices

Unlike HDDs, which have a single moving arm and therefore can only serve one request at a time, most SSDs have multiple parallel data channels (4-10 channels in most desktop-oriented SSDs). This means that in order to reach the SSD’s maximal performance you may need to queue several requests on the disk queue. Note that the SSD’s firmware may utilize its internal channels in many ways, so it is hard to predict what the results of queuing more requests will be. For many SSDs, it is enough to use a relatively small number of parallel (in-flight) requests (4-8 requests).
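
Keeping several requests in flight can be sketched with a thread pool, as below. This is an illustration only, not a benchmark tool: the function name and its parameters are made up for this example, it relies on the POSIX-only `os.pread`, and because it does not use direct IO the OS page cache will serve many of the reads unless you take care of that separately.

```python
import os
import random
import time
from concurrent.futures import ThreadPoolExecutor

BLOCK = 4096  # request size in bytes (assumption: 4k aligned requests)

def timed_random_reads(path, in_flight, requests=1000, block=BLOCK):
    """Issue `requests` random block-aligned reads with up to `in_flight`
    of them outstanding at once; return (bytes_read, elapsed_seconds)."""
    fd = os.open(path, os.O_RDONLY)
    try:
        # lseek to the end works for regular files and for block devices
        blocks = os.lseek(fd, 0, os.SEEK_END) // block
        offsets = [random.randrange(blocks) * block for _ in range(requests)]
        start = time.perf_counter()
        with ThreadPoolExecutor(max_workers=in_flight) as pool:
            # each worker blocks in pread, so up to `in_flight` requests
            # are queued at the same time
            buffers = pool.map(lambda off: os.pread(fd, block, off), offsets)
            total = sum(len(buf) for buf in buffers)
        elapsed = time.perf_counter() - start
    finally:
        os.close(fd)
    return total, elapsed
```

Calling it with `in_flight` set to 1, 2, 4, 8 and so on on the same file or device, and comparing `total / elapsed`, shows where adding more parallel requests stops helping.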

V. SSDs may suffer from HDD oriented optimizations

The entire storage stack in your computer/server is HDD oriented. This means that many related mechanisms and even hardware devices are tuned for HDDs. You have to ensure that these optimizations do not harm SSD performance. For example most OSes do read-ahead/write coalescing, and/or reorder writes to minimize the HDD’s arm movements. Most of these optimizations are not relevant to SSDs, and in some cases can even reduce the SSD’s performance.
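
On Linux, some of these HDD-oriented settings can be inspected through sysfs. The sketch below reads three files under `/sys/block/<dev>/queue`; the helper names are mine, and which values are "right" depends on your kernel and workload, so treat it as a starting point for a pre-benchmark check rather than a rule.

```python
from pathlib import Path

def active_scheduler(scheduler_text):
    """Parse the contents of /sys/block/<dev>/queue/scheduler, where the
    active scheduler is shown in square brackets, e.g. 'noop [cfq]'."""
    names = scheduler_text.split()
    for name in names:
        if name.startswith("[") and name.endswith("]"):
            return name[1:-1]
    # some kernels print a single bare name when there is no choice
    return names[0] if len(names) == 1 else None

def queue_settings(device, sysfs_root="/sys/block"):
    """Collect the queue settings that matter most for SSD benchmarks."""
    queue = Path(sysfs_root) / device / "queue"
    return {
        "scheduler": active_scheduler((queue / "scheduler").read_text()),
        "read_ahead_kb": int((queue / "read_ahead_kb").read_text()),
        "rotational": int((queue / "rotational").read_text()),
    }
```

For example, `queue_settings("sda")` (with `sda` standing in for your device) returns a dictionary that can be logged next to the benchmark results, so that two runs can be compared knowing they used the same OS settings.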

In addition, most HDDs have 512-byte sectors, while most SSDs work internally in units of 4k bytes (or larger). You have to ensure that the benchmark tool and the OS send properly aligned (offset and size) data requests.
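
The alignment check itself is simple arithmetic. A minimal sketch, assuming a 4096-byte alignment unit (the function names are only for illustration):

```python
ALIGNMENT = 4096  # assumption: the SSD works in 4k units

def is_aligned(offset, size, alignment=ALIGNMENT):
    """True if both the offset and the size are multiples of `alignment`."""
    return size > 0 and offset % alignment == 0 and size % alignment == 0

def align_request(offset, size, alignment=ALIGNMENT):
    """Expand (offset, size) to the smallest aligned request covering it."""
    start = offset - offset % alignment
    end = -(-(offset + size) // alignment) * alignment  # round up
    return start, end - start

print(is_aligned(8192, 4096))     # prints: True
print(is_aligned(512, 4096))      # prints: False
print(align_request(512, 4096))   # prints: (0, 8192)
```

The last line shows why misalignment hurts: a 4k request that starts at a 512-byte offset touches two 4k units instead of one.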

Another issue is RAID controllers. Most of them are tuned to HDD performance and behavior, and in many cases they may be a performance bottleneck.

Benchmarking/tuning recommendations

  1. Ensure that your SSD’s firmware is up-to-date.
  2. Ensure that your benchmark tool, OS and hardware are suited for SSD benchmarking:
    1. Make sure AHCI is ON (probably requires BIOS configuration). Without it the OS would not be able to queue several requests at once (i.e. use NCQ).
    2. Make sure that the disk’s queue depth is at least 32.
    3. For Linux (and other OSes) make sure the IO scheduler is off (change it to “noop”, or “none” on newer multi-queue kernels).
    4. Use direct IO to avoid caching effects (both read and write caching).
    5. Most SSDs are 4k block devices. Keep your benchmark’s load aligned (for example, the IOMeter load is 512-byte aligned by default).
    6. The SATA/SAS controller may affect the result. Make sure it is not a bottleneck (this is especially important for multiple disks benchmarks).
    7. Avoid RAID controllers (unless it is critical for your application).
  3. Always trim or secure erase the entire disk (!) before the test.
  4. After the trim, fill the disk with random 4k data, even if you want to benchmark reads.
  5. Ensure that the benchmark duration is long enough to identify write logic changes. As a rule I would test a disk for at least a couple of hours (not including the initial trim/fill phases).
  6. Adjust the used space span to your needs. In most cases you have to balance capacity vs. performance.
  7. Adjust the benchmark parallelism to suit your load.
  8. Remember to test mixed read/write patterns.
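
Several of the recommendations above (direct IO, aligned 4k requests, queue depth, span, duration, read/write mix) can be captured in a single job description. The sketch below generates a job file for fio; fio is not mentioned in this post and is used here only as an example tool, `/dev/sdX` is a placeholder, and the trim and fill phases are not covered. Note that running such a job against a raw device destroys the data on it.

```python
def fio_job(device, span_gb, read_pct, block_size="4k", iodepth=32, hours=2):
    """Build the text of a fio job file for a mixed random read/write test."""
    lines = [
        "[ssd-test]",
        f"filename={device}",       # raw device: its contents will be overwritten
        "ioengine=libaio",          # asynchronous IO, so iodepth takes effect
        "direct=1",                 # bypass the OS cache
        "rw=randrw",                # mixed random reads and writes
        f"rwmixread={read_pct}",    # percentage of reads in the mix
        f"bs={block_size}",
        f"iodepth={iodepth}",
        f"size={span_gb}g",         # the tested span
        "time_based=1",
        f"runtime={hours * 3600}",  # seconds
    ]
    return "\n".join(lines) + "\n"

print(fio_job("/dev/sdX", span_gb=80, read_pct=70))
```

Generating the job from a function makes it easy to sweep the span, the read percentage or the queue depth while keeping everything else identical between runs.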

General recommendations:

  1. If you benchmark the disk to evaluate it for a specific use (case), try to understand what is the relevant pattern or patterns for your use case and focus on these patterns. Try to understand what are your relevant block sizes, alignments, randomness, read vs. write ratio, parallelism, etc.
  2. SSD performance may be very different from model to model, even if the models are from the same manufacturer and/or use the same internal controller.
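
If you can capture a trace of your application's IO requests, the pattern parameters mentioned above can be summarized with a few lines of code. This is a sketch under assumptions: the trace format of `(op, offset, size)` tuples and the function name are invented for this example, and "sequential" is defined simply as a request that starts where the previous one ended.

```python
from collections import Counter

def summarize(requests, alignment=4096):
    """Summarize a trace of (op, offset, size) tuples, op being 'R' or 'W'."""
    requests = list(requests)
    total = len(requests)
    if total == 0:
        raise ValueError("empty trace")
    reads = sum(1 for op, _, _ in requests if op == "R")
    aligned = sum(1 for _, off, size in requests
                  if off % alignment == 0 and size % alignment == 0)
    # a request is sequential if it starts where the previous one ended
    sequential = sum(1 for prev, cur in zip(requests, requests[1:])
                     if cur[1] == prev[1] + prev[2])
    return {
        "read_ratio": reads / total,
        "aligned_ratio": aligned / total,
        "sequential_ratio": sequential / max(total - 1, 1),
        "block_sizes": Counter(size for _, _, size in requests),
    }

# Hypothetical four-request trace
trace = [("R", 0, 4096), ("R", 4096, 4096), ("W", 512, 4096), ("W", 81920, 16384)]
summary = summarize(trace)
print(summary)
```

The resulting ratios and block size histogram are the inputs you need to configure a benchmark that resembles your own load.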

An end note: the attached graphs are just for demonstration. This is not a GSkill-related post and it is not a disk review; the demonstrated behavior is common to many SSDs.
