Tuesday, March 25, 2014

I/O Wait, I/O Bottleneck

I/O wait means the processor is sitting idle while it waits for data to be read from (or written to) disk. Since hard disks are mechanical, each request has to wait for the platter to rotate and the head to seek to the required disk sector.

I/O wait measurement is the canary for an I/O bottleneck: it is the percentage of time your processor spends waiting on the disk. For example, let's say it takes 1 second to read 10,000 rows from MySQL. While the disk retrieves the data, the processor sits idle. If the disk access took 700ms of that second, the I/O wait is 70%.
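
Tools like top and sar derive this figure from the counters in /proc/stat. Here is a minimal bash sketch of the same calculation (field positions per the proc(5) man page):

# Take two samples of the aggregate "cpu" line, one second apart. The 5th
# numeric field is cumulative iowait time in jiffies; %iowait over the
# interval is 100 * delta(iowait) / delta(total).
a=$(grep -w cpu /proc/stat); sleep 1; b=$(grep -w cpu /proc/stat)
printf '%s\n%s\n' "$a" "$b" | awk '
  { io[NR] = $6; for (i = 2; i <= NF; i++) tot[NR] += $i }
  END { printf "%%iowait: %.1f\n", 100 * (io[2]-io[1]) / (tot[2]-tot[1]) }'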

I/O Wait Example:

There are different ways of checking I/O wait; one is the top command. Type "top" in your command line and check the line starting with "Cpu(s)":
$ top
Cpu(s):  8.3%us,  3.1%sy,  0.0%ni, 75.8%id,  12.3%wa,  0.0%hi,  0.5%si,  0.0%st

"wa" stands for I/O wait.

If your I/O wait percentage is greater than 1/(number of cores), your CPUs are waiting a significant amount of time: at that point the equivalent of one entire core is doing nothing but waiting on the disk.

In the above example, I/O wait is 12.3%. The server has 8 cores, and 1/8 = 0.125, so at 12.3% the server is right at the threshold.
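
To get this threshold for your own machine, nproc (from GNU coreutils) prints the number of available cores; on the 8-core server above this prints 12.5:

$ echo "scale=1; 100 / $(nproc)" | bc
12.5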

Factors that impact I/O performance:
For random disk access, input/output operations per second (IOPS) is the key metric.

  • Multidisk Arrays – More disks in the array mean greater IOPS. If one disk can perform 150 IOPS, two disks can perform 300 IOPS.
  • Average IOPS per-drive – The greater the number of IOPS each drive can handle, the greater the total IOPS capacity. This is largely determined by the rotational speed of the drive.
  • RAID factor - If you are using RAID, some RAID configurations have a significant penalty for write operations.
  • Read/Write workload - If you have a high percentage of write operations and a RAID setup that performs multiple physical operations for each write request (like RAID 5 or RAID 6), your effective IOPS will be significantly lower (a worked example follows this list).
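
A rough worked example of those last two points, using the common sizing rule of thumb (effective IOPS = raw IOPS / (read% + write% * penalty), with a penalty of about 4 for RAID 5); the array below is hypothetical:

# 8 disks x 150 IOPS each, 70% reads / 30% writes, RAID 5
$ echo "(8 * 150) / (0.70 + 0.30 * 4)" | bc
631

So 1,200 raw IOPS shrink to roughly 631 effective IOPS under this workload.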

Monitor Disk I/O:
You can use the "sar" command from the sysstat package; with no arguments it prints the CPU statistics recorded in today's activity log:
01:40:01 PM     CPU     %user     %nice   %system   %iowait    %steal     %idle
01:50:01 PM     all     12.80      0.00      0.52      0.02      0.00     86.66
02:00:01 PM     all     12.80      0.00      0.58      0.15      0.00     86.46
02:10:01 PM     all     12.81      0.00      0.52      0.03      0.00     86.64
02:20:01 PM     all     13.08      0.00      0.51      0.02      0.00     86.39
02:30:01 PM     all     12.88      0.00      0.52      0.05      0.00     86.55
Average:        all     20.87      0.00      8.49      0.04      0.00     70.59

The %iowait column is the percentage of time the CPUs spent waiting on I/O.
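
sar can also take live samples instead of reading the daily log: for example, five-second intervals with three reports, and -d for the same numbers broken out per device:

$ sar -u 5 3     # CPU utilization, sampled live
$ sar -d 5 3     # per-device I/O statistics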

To see current utilization broken out by device, you can use the iostat command, also from the sysstat package:
$ iostat -x 1
avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           0.00    0.00    0.25    0.00    0.00   99.75

Device:         rrqm/s   wrqm/s     r/s     w/s   rsec/s   wsec/s avgrq-sz avgqu-sz   await  svctm  %util
sda               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
sdc               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
sde               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
sdd               0.00     0.00    8.00    0.00  4096.00     0.00   512.00     0.04    5.25   3.00   2.40
sdb               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00

  • rrqm/s : The number of read requests merged per second that were queued to the hard disk
  • wrqm/s : The number of write requests merged per second that were queued to the hard disk
  • r/s : The number of read requests per second
  • w/s : The number of write requests per second
  • rsec/s : The number of sectors read from the hard disk per second
  • wsec/s : The number of sectors written to the hard disk per second
  • avgrq-sz : The average size (in sectors) of the requests that were issued to the device.
  • avgqu-sz : The average queue length of the requests that were issued to the device
  • await : The average time (in milliseconds) for I/O requests issued to the device to be served. This includes the time spent by the requests in queue and the time spent servicing them.
  • svctm : The average service time (in milliseconds) for I/O requests that were issued to the device
  • %util : Percentage of CPU time during which I/O requests were issued to the device (bandwidth utilization for the device). Device saturation occurs when this value is close to 100%.
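
Once you have spotted the busy device (sdd in the listing above), you can watch it alone, since iostat accepts device names on the command line:

$ iostat -x sdd 1 5     # extended stats for sdd only, every second, five reports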

Performance tuning:
Many people think of "performance tuning" as optimizing loops, algorithms, memory usage and CPU usage. In truth, you don't get huge performance gains from optimizing CPU or memory usage, but from reducing I/O calls. CPUs aren't the bottleneck anymore; your hard drive is. At the hardware level, the hard drive is the slowest component by an incredibly large factor. Today's memory ranges between 3,200 and 10,400 MB/s, while a hard drive manages about 100 MB/s. Few modern hard drives have latencies under 13 milliseconds, while memory latency is usually about 5 nanoseconds, over two million times faster.
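
A quick, crude way to see this throughput gap on your own hardware is a sequential write with dd. Treat it as a sketch only: oflag=direct bypasses the page cache (which would otherwise inflate the number), and the target path must sit on the disk you want to measure (on some systems /tmp is tmpfs, i.e. memory):

$ dd if=/dev/zero of=/tmp/ddtest bs=1M count=512 oflag=direct
$ rm /tmp/ddtest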

Since hard drives are slow, pending operations go into a queue. So even if your app only needs a single byte of data from the hard drive, the request still has to wait its turn.
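
This queue is also why a tiny read can be slow behind a heavy writer. To find out which process is generating the traffic, pidstat (also from sysstat) breaks disk I/O out per process:

$ pidstat -d 1     # per-process read/write rates, one-second intervals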
