Beyond Access Time: Evaluating Storage Performance
By G.A. "Andy" Marken
Because of the proliferation of corporate information and the need to access this data, storage is becoming more important than the computers that access it. If a computer breaks down, other computers can be used to assume the workload. However, when data storage is dead, everyone is impacted.
To keep pace with storage demand, firms almost routinely double their hard drive capacity every 12 to 15 months. But now that storage is growing an average of 98 percent a year in terms of gigabytes shipped, simply adding more hard disk capacity is no longer financially realistic. Faced with this ominous storage and management task, firms are seeking more cost-effective workgroup, departmental and enterprise-wide storage solutions.
To keep pace with demand, optical manufacturers have introduced next-generation direct-overwrite 3.5- and 5.25-inch drives; faster, higher-capacity jukeboxes; and higher-capacity, higher-performance media. Suppliers have unveiled a number of integrated network and Internet-ready subsystems as well as a host of robust, seamlessly integrated software products for workgroup, departmental and enterprise-wide applications.
But the biggest advance has been in the area of performance. Since some manufacturers tout faster seek times and others boast higher disk rotational speeds, the issue of performance becomes confusing at best.
In reality, performance is a composite of a number of different factors, including seek time, controller overhead and data transfer rate as well as read and write speeds. All of these features need to be considered when evaluating stand-alone optical disk drives.
A typical SCSI-based optical disk system is composed of the computer side - which includes the CPU and the SCSI host adapter - and the optical drive side - which includes the SCSI controller, cache memory, the drive electronics and the media. The speed at which data is read from or written to the optical disk is affected by all of these components.
Access time is the time it takes for the requested data to reach the computer. This measurement begins once the CPU requests the information and ends when all of the information is delivered to the CPU. A number of events must take place for this to occur, so access time is usually broken into seek time, latency time, data transfer time and overhead time.
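The breakdown above can be sketched as a simple sum. This is an illustrative model only: the class name, field names and the millisecond figures are assumptions made for the example, not measurements of any drive, and real components can overlap rather than add up neatly.

```python
from dataclasses import dataclass


@dataclass
class AccessTime:
    """Components of one access, in milliseconds (hypothetical names)."""
    seek_ms: float
    latency_ms: float
    transfer_ms: float
    overhead_ms: float

    def total_ms(self) -> float:
        # Model access time as the plain sum of its four components.
        return self.seek_ms + self.latency_ms + self.transfer_ms + self.overhead_ms

    def shares(self) -> dict:
        # Fraction of the total contributed by each component.
        total = self.total_ms()
        return {
            "seek": self.seek_ms / total,
            "latency": self.latency_ms / total,
            "transfer": self.transfer_ms / total,
            "overhead": self.overhead_ms / total,
        }


# Hypothetical figures, chosen only to show the arithmetic.
example = AccessTime(seek_ms=25.0, latency_ms=10.0, transfer_ms=10.0, overhead_ms=5.0)
print(example.total_ms())  # 50.0
print(example.shares())
```

Looking at the shares rather than the total makes it easier to see which component deserves attention for a given workload.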
Seek time is dependent upon how far the laser head has to travel to reach the requested data. Average seek time - the time it takes for the head to be positioned over the data - is usually calculated either by averaging the worst-case and best-case scenarios or as the time it takes for the head to move one-third the radius of the disk.
While the average seek time may provide some insight into performance, it is important to consider the full range of seek times as well as your specific application requirements. For individual users, images, databases and other large files are typically recorded sequentially. As a result, the contribution of seek time to the access time is minimal. On the other hand, multiple users accessing a drive on a network will usually require data that is scattered throughout the media, so seek time becomes more critical.
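The one-third figure has a simple statistical basis: if the starting track and the target track are each picked uniformly at random, the mean distance between them is about one-third of the full stroke. The sketch below checks this by exact enumeration. The function name is hypothetical, and the uniform-request assumption is a simplification; note also that it measures travel distance, which is not strictly proportional to seek time on a real drive.

```python
def mean_seek_fraction(num_tracks: int) -> float:
    """Mean head travel, as a fraction of the full stroke, between two
    tracks chosen independently and uniformly at random."""
    if num_tracks < 2:
        raise ValueError("need at least two tracks")
    # For each distance d >= 1 there are 2 * (num_tracks - d) ordered
    # (start, target) pairs; pairs with d == 0 contribute nothing.
    total_distance = sum(2 * (num_tracks - d) * d for d in range(1, num_tracks))
    mean_distance = total_distance / (num_tracks ** 2)
    # Normalize by the full stroke (first track to last track).
    return mean_distance / (num_tracks - 1)


print(round(mean_seek_fraction(1000), 4))  # 0.3337
```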
Once the head is positioned on the right track, the media must be rotated until the requested data is directly under the head. This is called the rotational latency time. As with seek time, rotational latency is usually given as an average of worst-case and best-case scenarios and is equal to the time it takes for the disk to spin half a rotation.
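Because average rotational latency is half a rotation, it follows directly from the spindle speed. A minimal sketch, with a hypothetical function name and spindle speeds picked only for illustration:

```python
def average_rotational_latency_ms(rpm: float) -> float:
    """Average rotational latency in milliseconds: half of one rotation."""
    if rpm <= 0:
        raise ValueError("rpm must be positive")
    ms_per_rotation = 60_000.0 / rpm  # 60,000 ms in one minute
    return ms_per_rotation / 2.0


# Illustrative spindle speeds, not tied to any particular drive.
for rpm in (2400, 3600, 4500):
    print(rpm, round(average_rotational_latency_ms(rpm), 2))
# 2400 12.5
# 3600 8.33
# 4500 6.67
```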
The data transfer rate (DTR) simply tells you how fast the data moves from one location in the system to another. The problem is that data travels at different rates and at different times from different parts of the drive. For an accurate picture of the drive's performance, the data transfer rate is usually given in two forms: burst and sustained.
The burst rate is the fastest rate that data can be transferred and is generally measured at the SCSI interface. It is at this connection that the fastest data transfer occurs.
The sustained data transfer rate is the rate at which the drive can actually move data from the host to the media or vice versa.
While it is important to consider both rates, the burst rate is typically the same for most SCSI drives. Therefore, the sustained rate usually provides a better indicator of the drive's overall performance.
DTR is also dependent on the particular application. With small files such as letters and memos, the importance of the DTR is minimal. However, when managing larger files such as images, DTR can play a significant role in overall performance.
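The effect of file size can be made concrete with a rough calculation. The sketch below treats seek, latency and overhead as one fixed cost and asks what share of the total is spent transferring data. The function names, the 1,000 KB/s sustained rate and the 40 ms fixed cost are all assumptions chosen for illustration.

```python
def transfer_time_ms(file_kb: float, sustained_kb_per_s: float) -> float:
    """Time to move a file at the sustained rate, in milliseconds."""
    return file_kb / sustained_kb_per_s * 1000.0


def transfer_share(file_kb: float, sustained_kb_per_s: float, fixed_ms: float) -> float:
    """Fraction of total access time spent on data transfer.

    fixed_ms lumps together seek, latency and overhead time.
    """
    transfer_ms = transfer_time_ms(file_kb, sustained_kb_per_s)
    return transfer_ms / (transfer_ms + fixed_ms)


# Hypothetical drive: 1,000 KB/s sustained, 40 ms of fixed cost per access.
print(round(transfer_share(4, 1000, 40), 2))     # 0.09  (a short memo)
print(round(transfer_share(2000, 1000, 40), 2))  # 0.98  (a scanned image)
```

Under these assumed figures, the sustained rate barely matters for the memo but dominates the image retrieval.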
Controller overhead is the time it takes the drive to receive and perform the computer's commands. This includes such variables as command transfer and decoding; buffer, data path and defect management; data verification; error retries; and posting messages.
While overhead may appear to be a small element in the overall performance, application considerations are again a significant factor. When a drive is used for average accessing functions, overhead may account for less than 10 percent of the entire access time. However, if the drive is used to continuously access data and uses a large buffer, overhead may account for as much as 80 percent of total access time.
Another important consideration in overall performance is the driver software. Some driver software is designed to be transparent to the operating system (OS), and you're limited to using the OS commands. However, application programs can be designed to bypass the operating system so you can set up your own file system, thus improving performance.
Adding to the complexity of the situation, most drives will be incorporated into optical jukeboxes with capacities ranging from 20 to 1,000 disks. Optical jukeboxes are very attractive to management because complete storage subsystem costs are only about 11 cents per MB.
Individual drive performance specifications become less meaningful than such items as the number of drives in the jukebox (minimizing data thrashing and wait time), read-ahead cache, total disk swap time, automated scheduling and jukebox partitioning.
Enhanced jukebox mechanical systems and intelligent services can provide requested information to the user in as little as 5 seconds even when searching a full terabyte of stored data. Considering the cost and speed of alternatives including RAID, tape libraries, microfilm or even filing cabinets, the cost is minimal and the retrieval speed is blisteringly fast.
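A crude model shows why the number of drives and the disk swap time matter more than the drive's own access time. It assumes requests are spread uniformly across all disks with no locality, no caching and no queueing, so the chance that the needed disk is already mounted is simply drives divided by disks. The function name and every figure are hypothetical.

```python
def expected_retrieval_s(num_drives: int, num_disks: int,
                         swap_s: float, drive_access_s: float) -> float:
    """Expected time to serve one request from a jukebox, in seconds.

    Assumes uniformly random requests over all disks, so the probability
    that the disk is already in a drive is num_drives / num_disks.
    """
    if num_drives < 1 or num_disks < 1:
        raise ValueError("need at least one drive and one disk")
    mounted_probability = min(1.0, num_drives / num_disks)
    # A miss pays the full swap time before the drive can be accessed.
    return drive_access_s + (1.0 - mounted_probability) * swap_s


# Hypothetical jukebox: 100 disks, 8 s swap, 0.05 s drive access.
for drives in (2, 4, 8):
    print(drives, round(expected_retrieval_s(drives, 100, 8.0, 0.05), 2))
# 2 7.89
# 4 7.73
# 8 7.41
```

Real workloads usually show strong locality, which is why read-ahead cache, scheduling and partitioning can improve on this uniform-case estimate.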
The performance of optical disk drives and jukebox storage systems depends on a number of factors. Relying on access time alone can be misleading - especially when (as is often the case) the reported access time is actually only the seek time.
The most important thing to consider is your specific application. Once you know how the system will be used, you can determine what elements of the performance equation are most critical to you.