Tuesday, November 23, 2010

SATA versus SAS - Realistic Expectations

As a Performance Engineer, I do benchmarks all the time, and I frequently get asked about SAS versus SATA drives.  While it is a no-brainer as to which performs better, SATA drives cost about half as much as SAS drives and offer a lot more capacity.

Price and capacity are attractive enough, but under what circumstances can a SATA drive perform comparably to a SAS drive?  As I do benchmarks on Oracle databases, I did a few simple tests to ascertain the performance characteristics of SATA versus SAS.

The drives I used for the tests are from Seagate. The SATA Drive is the Seagate Constellation ES 7200 RPM 2TB Drive with SAS interfaces. The SAS Drive is the Seagate Cheetah 15K RPM 600GB Drive with SAS interfaces. 

Both are 3.5-inch drives.

The table below summarizes some key specs of these drives.

The newer generation of SATA drives is equipped with SAS interfaces. Externally the two drives look the same; the differences are in the internals. Of particular interest is the Areal Density, which shows that the SATA drive is far more densely packed than the SAS drive.

For the test, I created a 64GB partition from the outermost sectors of each drive. From my experience, short stroking is a must for reasonable performance from a hard drive. The degree to which you short stroke a drive depends on how much storage you are willing to sacrifice for performance. To give an example, for optimal performance, a 146GB SAS drive must be short stroked to ~50GB (about 1/3rd of its size).
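To put the short-stroking trade-off in numbers, here is a minimal Python sketch that computes what fraction of each drive the test partition occupies. The function name is made up for illustration, and the capacities are the nominal sizes quoted in this post (2TB taken as 2000GB), not measured usable capacity.

```python
def short_stroke_fraction(partition_gb: float, capacity_gb: float) -> float:
    """Return the fraction of the drive's nominal capacity that is used."""
    if not 0 < partition_gb <= capacity_gb:
        raise ValueError("partition must be positive and no larger than the drive")
    return partition_gb / capacity_gb


# Nominal sizes as quoted in the post; 2TB is assumed to mean 2000GB.
examples = {
    "146GB SAS short stroked to 50GB": (50, 146),
    "64GB partition on 600GB SAS": (64, 600),
    "64GB partition on 2TB SATA": (64, 2000),
}

for label, (partition_gb, capacity_gb) in examples.items():
    fraction = short_stroke_fraction(partition_gb, capacity_gb)
    print(f"{label}: {fraction:.1%} of capacity used")
```

The 64GB test partition covers a much smaller slice of the 2TB SATA drive than of the 600GB SAS drive, so the SATA drive is short stroked far more aggressively in these tests.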

I used the partition as-is with no filesystem - basically a raw device.

I performed 3 tests on the Drives using vdbench. Vdbench is a tool from Oracle and is highly flexible in terms of testing options. It is authored by Henk Vandenbergh from Oracle.

  1. Transaction System - Small reads/writes of 16K IO size simulating single block activity - 70% Reads + 30% Writes
  2. DW System - Large reads/writes of 1024K IO size simulating multiblock activity - 70% Reads + 30% Writes
  3. Hybrid System - Combination of 16K and 1024K IO sizes - 70% Reads + 30% Writes

I generally only focus on response time as this is what an application such as Oracle will report on via the wait events. Ideally a single block IO request would complete in < 5ms (peak) and a large block IO request in < 20ms (peak).

This is hard to meet with spinning media under high concurrency, so I would settle for a peak sustainable response time of < 15-20 ms.

The number of IOPS is dependent on Use Case Scenarios and cannot be generalized.

The graphs below plot IOPS against Response Time.  As the number of requests (IOPS) increases, the response time for each IO request starts to increase.

We are most interested in where exactly the hard drive fails, either in the sense that it can no longer deliver a predictable response time or that it cannot satisfy the number of IO requests.
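One way to read such a curve programmatically is to find the highest IOPS rate reached before response time first crosses a chosen threshold. The sketch below is only an illustration: the function name is hypothetical, and the sample points are made up to show the shape of the calculation. They are not measurements from these tests.

```python
from typing import Optional, Sequence, Tuple


def max_sustainable_iops(
    curve: Sequence[Tuple[float, float]], limit_ms: float
) -> Optional[float]:
    """Return the highest IOPS reached before response time first exceeds limit_ms.

    curve holds (iops, response_time_ms) samples. Returns None if even the
    lowest sampled rate is already over the limit.
    """
    best = None
    for iops, response_ms in sorted(curve):
        if response_ms > limit_ms:
            break  # past the knee; later points are not sustainable
        best = iops
    return best


# Made-up sample points for illustration only, NOT results from this post.
sample_curve = [(20, 6.0), (40, 8.5), (60, 11.0), (80, 14.0), (100, 19.0), (120, 45.0)]

for limit_ms in (15.0, 20.0):
    print(f"limit {limit_ms} ms -> {max_sustainable_iops(sample_curve, limit_ms)} IOPS")
```

Stopping at the first point over the limit, rather than taking the maximum over all points under it, avoids counting a rate beyond the knee where response time happens to dip again.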

Transaction System Test

If you are considering SATA for a Transaction System, then depending on the number of single block requests and your threshold for response times, it may be cost effective to use SATA. Looking at the graph below, for a 16K block, it would appear that 100 IOPS/drive is about the maximum a SATA drive can sustain without falling off a cliff.   With an 8K block size, the drive should do even better.
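A per-drive ceiling like this translates directly into a drive count for a given workload. The following is a rough sizing sketch, assuming IOPS scale linearly across drives and ignoring RAID write penalties and controller limits. The function name and the 1,500 IOPS target are hypothetical; only the 100 IOPS/drive figure comes from the test above.

```python
import math


def drives_needed(target_iops: float, per_drive_iops: float) -> int:
    """Return the minimum number of drives needed to serve target_iops.

    Assumes linear scaling across drives, with no RAID or controller overhead.
    """
    if target_iops < 0 or per_drive_iops <= 0:
        raise ValueError("target_iops must be >= 0 and per_drive_iops must be > 0")
    return math.ceil(target_iops / per_drive_iops)


# 100 IOPS/drive is the 16K SATA ceiling observed above;
# the 1,500 IOPS target is a hypothetical workload.
print(drives_needed(1500, 100))
```

In practice the count would need headroom on top of this, since the 100 IOPS figure is already at the edge of the acceptable response time.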

DW System Test

A DW system is typically characterized by a smaller number of large block IO requests than a transaction system. As you can see below, a SATA drive cannot sustain more than 30-40 IOPS before its response time breaks down, and its absolute ceiling is 68 IOPS.

A SAS drive is a lot more scalable and predictable than a SATA drive. Even doubling the number of SATA drives cannot equal the performance of a single SAS drive. So SAS would be a much better fit for DW than SATA.
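The doubling comparison matters because of price: at roughly half the cost, two SATA drives cost about the same as one SAS drive. A small sketch of that break-even logic follows. The function name is hypothetical, and the SAS IOPS values in the examples are placeholders rather than measured results; only the ~50% price ratio and the SATA figures come from this post.

```python
def sata_is_cheaper_per_iops(
    sata_price_ratio: float, sata_iops: float, sas_iops: float
) -> bool:
    """Return True if SATA costs less per sustainable IOPS than SAS.

    sata_price_ratio is the SATA price as a fraction of the SAS price
    (about 0.5 according to the post). The SAS price is normalized to 1.0.
    """
    if sata_price_ratio <= 0 or sata_iops <= 0 or sas_iops <= 0:
        raise ValueError("all arguments must be positive")
    return sata_price_ratio / sata_iops < 1.0 / sas_iops


# SAS IOPS values below are placeholders for illustration, NOT measured results.
print(sata_is_cheaper_per_iops(0.5, sata_iops=40, sas_iops=120))
print(sata_is_cheaper_per_iops(0.5, sata_iops=100, sas_iops=150))
```

At a 0.5 price ratio, SATA only wins on cost per IOPS when a single SATA drive sustains more than half the IOPS of a single SAS drive, which is exactly what the doubling comparison says it fails to do for DW workloads.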

Hybrid System Test

For a Hybrid System, again, it may not make sense to use SATA. You can squeeze a few more IOPS out of the drive (max of 105) before it fails, but the response times are high.

To summarize, for low to mid volume transaction systems, SATA drives may be quite affordable and also deliver reasonable performance.  For DW and Hybrid systems, the decision depends on the use case, and SATA may not be an effective or efficient choice.