By Mike Dziekan
Connecticut Analytical Corporation
An earlier version of this article originally appeared in The Citizen Scientist, 01 June 2007
Everyone reading this article is viewing a stored representation located on the SAS server through an Internet connection via the World Wide Web. We know how to store and read files, such as this article, family pictures, favorite songs, and a plethora of other programs and data. But have you ever wondered how all this information is stored?
You probably know that the information is stored safely on your computer's hard drive, and that's about as far as it goes. How many of you have recently defragmented your hard drive, and were always curious as to why you had to do it? Why did your drive become fragmented in the first place?
Before I answer these questions, let's look back to when DOS (Disk Operating System) ruled the personal computing world. Back then the majority of programs required only a few tens of kilobytes, and a single 5¼" or 3½" floppy diskette was adequate to store them. That was until Microsoft Windows arrived on the scene. Instead of loading a program from a single floppy diskette, we had to use multiple floppies. Later versions of Windows required more and more floppies. I remember loading as many as 26 floppies to install Windows NT v3.5.
As software became increasingly complex and required more storage space, hard disk drives became essential. It wasn't long ago that a 30 MB (megabyte) hard drive made you the big man on campus, but that little capacity is completely inadequate today. My first "real" computer had a 68 MB hard drive, a good-sized drive at the time. Today typical hard drives have capacities from 80 to 250 GB (gigabytes). Recently 500 GB drives have become commercially available, and terabyte (1,000 GB) drives are on the horizon.
Hard Drive Design
Figure 1 shows the construction of a typical hard drive. The drive includes a head actuator mechanism that rapidly moves a series of read/write heads above a spinning platter that is coated with a thin layer of magnetically sensitive material. A small magnetic field from one of the read/write heads will form a directional magnetic field on the surface of its respective magnetically sensitive platter. This non-volatile magnetic polarization is the basis for the hard drive's operation.
I should caution you that opening a hard drive will likely shorten its life expectancy or cause instant failure. The read/write heads typically fly on a cushion of air only about 100 nanometers (1 nanometer = 10⁻⁹ meters) over the rapidly spinning platters. The average human hair is about 100 microns (1 micron = 10⁻⁶ meters) in diameter, which is about 1,000 times larger than the gap between the read/write head and the platter.
This is why hard drives are assembled in highly controlled, clean room environments, for even a fingerprint can cause problems. So take my advice and don't open a hard drive you intend to keep using.
Hard drives include a series of very tiny holes that allow air to pass through a barometric or breathing filter that is "sealed" inside the hard drive assembly. I put "sealed" in quotes to indicate that the hard drive is not really completely sealed. If you plan to take your laptop to the top of a mountain, then it would be beneficial to check the manufacturer's altitude rating to ensure that it will work properly.
I discussed the potential altitude problem with SAS's Forrest M. Mims III about six months ago, and Forrest (already aware of the problem) reported that the only trouble he had ever experienced when working up on Hawaii's Mauna Loa (3,400 meters) was with an external 80 GB drive that worked fine at sea level, but not at altitude. He asked several staff scientists at Mauna Loa and Mauna Kea (4,300 meters), and none reported any hard drive problems.
If the air pressure gets too low (altitude too high), then there might not be enough air pressure to float the read/write head above the platter surface. This could result in the drive not operating, or worse, a hard disk crash!
In addition to the barometric filter, there is also a recirculating filter within the sealed hard drive assembly. The filters are permanent and designed to last the life of the drive. The recirculating filter is primarily to remove any material that might be dislodged from the platter surface or the read/write heads.
Storing and Accessing Data on a Hard Drive
Figure 2 shows how the data is organized on the platter surfaces. When the head is stationary over the spinning platter, a concentric ring of information will be read from or written to the disk. These concentric rings are called tracks.
There are typically thousands of disk tracks on both sides of the platter surfaces, each of which can be rapidly accessed by the head actuator mechanism. A typical track width is only a few microns (1 micron = 10⁻⁶ meters). The narrow gap between disk tracks is omitted from the figure to make the diagram easier to visualize. This small gap is needed to prevent data stored on one track from interfering with adjacent tracks.
Since modern drives include multiple platters, and the read/write heads are rigidly connected together to function as one unit, designers have coined the term "cylinder" to describe a virtual cylinder passing through all the platters at once. Figure 3 shows a three-dimensional representation of the disk cylinders, with portions of the platters cut away. This example shows only four discrete read/write head positions out of many thousands. It is a testament to the designers' understanding of control theory, as they have to balance speed, accuracy, precision, and damping to provide for rapid and accurate data access.
In addition to disk tracks and cylinders, there is a further division of the platters into sectors. Figure 4 is a diagram of how the magnetic information stored on the disk platters is divided into sectors. The number of sectors and tracks can vary from manufacturer to manufacturer and even from one capacity drive to another from the same manufacturer.
Before the advent of Microsoft's "Plug and Play" (or "Plug and Pray," as many people call it), you had to know some characteristics of the hard drive being installed.
For instance, when running the BIOS setup program, it was necessary to manually input the number of cylinders, heads, and sectors. With later versions of Windows, this became more and more abstract, as the common CHS (Cylinders, Heads, Sectors) formatting no longer matched up so nicely.
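To make the cylinder, head, and sector figures concrete, here is a minimal Python sketch of the arithmetic behind a CHS geometry. The 512-byte sector size, the example geometry (1,024 cylinders, 16 heads, 63 sectors per track), and the function names are illustrative assumptions, not values from any particular drive. The second function shows the conventional conversion from a CHS address to a single logical block number (often called LBA, a term not otherwise used in this article), which is one way the physical geometry became more abstract.

```python
SECTOR_BYTES = 512  # assumed sector size for this example


def chs_capacity_bytes(cylinders, heads, sectors_per_track,
                       sector_bytes=SECTOR_BYTES):
    """Total capacity implied by a CHS geometry."""
    return cylinders * heads * sectors_per_track * sector_bytes


def chs_to_lba(cylinder, head, sector, heads, sectors_per_track):
    """Convert a CHS address to a zero-based logical block number.

    Cylinders and heads count from 0; sectors count from 1.
    """
    if not 0 <= head < heads:
        raise ValueError("head out of range")
    if not 1 <= sector <= sectors_per_track:
        raise ValueError("sector out of range")
    return (cylinder * heads + head) * sectors_per_track + (sector - 1)


# Hypothetical geometry, chosen only for illustration.
capacity = chs_capacity_bytes(1024, 16, 63)
print(capacity)                      # 528482304 bytes (504 MiB)
print(chs_to_lba(0, 0, 1, 16, 63))   # 0, the very first sector
print(chs_to_lba(1, 0, 1, 16, 63))   # 1008, first sector of cylinder 1
```

Note that the block number advances through every sector of every head on one cylinder before moving to the next cylinder, so consecutive blocks need as little head movement as possible.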
Improving Hard Drive Storage Efficiency
Referring to Fig. 4, notice that the outer edge tracks are physically larger than the inner edge tracks. This is wasted space, since all sectors contain the same amount of data.
Like the tracks, the sectors contain a small gap between them to prevent interference. To make use of this wasted space, designers created a multiple zone arrangement known as a ZBR (Zoned Bit Recording) structure. This approach provides multiple zones on the platter surface where the tracks are divided into groups or zones of equal numbers of sectors. Instead of a uniform number of sectors over the entire platter surface, there are more and more sectors as zones progress toward the outer edge of the platter surface.
Figure 5 shows how the ZBR process is carried out. The platter surfaces are divided into tracks and cylinders as before. However, now there are six discrete zones containing equal numbers of sectors. Each zone has a different number of sectors than the adjacent zones, while having the same number within the zone. The end result is efficient use of formerly wasted space.
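The gain from zoning can be estimated with a few lines of Python. This is only a sketch: the six zones match the description above, but the number of tracks per zone and the sectors-per-track values are made-up figures for illustration, and real drives use different (and proprietary) zone tables.

```python
def total_sectors(tracks_per_zone, sectors_per_track_by_zone):
    """Sum the sectors on one surface, given a sectors-per-track value per zone."""
    return sum(tracks_per_zone * spt for spt in sectors_per_track_by_zone)


TRACKS_PER_ZONE = 100  # hypothetical
# Hypothetical sectors per track, listed from the innermost zone outward.
zbr_layout = [40, 48, 56, 64, 72, 80]
# Without zoning, every track is limited to what the innermost track can hold.
uniform_layout = [zbr_layout[0]] * len(zbr_layout)

zbr_total = total_sectors(TRACKS_PER_ZONE, zbr_layout)
uniform_total = total_sectors(TRACKS_PER_ZONE, uniform_layout)
print(uniform_total, zbr_total)    # 24000 36000
print(zbr_total / uniform_total)   # 1.5
```

With these assumed numbers, the zoned surface holds half again as many sectors as the uniform one on the same platter.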
Many articles describe various algorithms that can be used to control the read/write heads for a ZBR disk, and some are listed at the end of this article. For now, I am providing only a basic overview of hard drive operation.
A sector can have a storage capacity of anywhere from 512 bytes to 4 kilobytes. If the sector size is large, data access speed will be faster, but more space could be wasted. If the sector size is small, data access speed will be slower, but less space is wasted. To visualize this, see the non-ZBR storage arrangement shown in Fig. 6.
How Data Stored on a Hard Drive Becomes Fragmented
Referring again to Fig. 6, let's assume that each sector is 4 KB. The grey sectors show where data is already stored, and the red portion shows a new piece of data, say a Word document.
In this case 22 sectors are used, so the document is roughly 4 KB x 22 or 88 KB in size. Notice that the sectors are arranged in a nice contiguous pattern. This means that the file can be read rapidly, for no head movement will be required.
Now let's consider what happens when our efficiently stored document is updated with new text. When the updated document is saved, more sectors are now required. But since there is already existing data on the hard drive, new empty sectors will have to be found and used. Figure 7 shows what happens as the document is broken up into groups of sectors (called clusters) and scattered around the platters. As more and more data is modified and stored, there are more and more isolated sectors. The result is that the data becomes increasingly fragmented and slower to access.
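The process can be imitated with a toy model in Python. This sketch assumes a tiny 64-sector disk and a naive "take the first free sectors" allocator; the file names, sizes, and function names are hypothetical, and real file systems use more elaborate allocation strategies.

```python
def allocate(disk, name, count):
    """Give `name` the first `count` free sectors, wherever they happen to be."""
    free = [i for i, owner in enumerate(disk) if owner is None]
    if len(free) < count:
        raise RuntimeError("not enough free sectors")
    for i in free[:count]:
        disk[i] = name


def delete(disk, name):
    """Mark every sector owned by `name` as free again."""
    for i, owner in enumerate(disk):
        if owner == name:
            disk[i] = None


def count_fragments(disk, name):
    """Count the contiguous runs of sectors owned by `name`."""
    runs, previous = 0, False
    for owner in disk:
        current = owner == name
        if current and not previous:
            runs += 1
        previous = current
    return runs


disk = [None] * 64            # hypothetical 64-sector disk, all free
allocate(disk, "a", 6)        # sectors 0-5
allocate(disk, "b", 4)        # sectors 6-9
allocate(disk, "doc", 22)     # sectors 10-31: 22 x 4 KB = 88 KB, contiguous
allocate(disk, "c", 8)        # sectors 32-39
fragments_before = count_fragments(disk, "doc")

delete(disk, "a")             # frees sectors 0-5
allocate(disk, "doc", 10)     # the updated document needs 10 more sectors
fragments_after = count_fragments(disk, "doc")

print(fragments_before, fragments_after)   # 1 3
```

The document starts as one contiguous run. After another file is deleted and the document grows, its sectors end up in three separate runs, so reading it now requires head movement.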
Disk fragmentation will occur with normal usage, so a disk defragmentation program should be run on a regular basis to prevent lost clusters. Lost clusters can occur when a disk becomes heavily fragmented. To keep track of all this information, a map of sectors that correlates to each file is stored in the computer's FAT (File Allocation Table). Keep in mind that if a sector is mapped to a specific file, it does not matter if it utilizes only 1 byte or all 4 KB of space, for it is still unavailable to other files.
Files of 1 byte, 100 bytes, 2 KB, or 4 KB all occupy an entire sector. Just because a sector is marked as used does not mean that it is filled with data. In my example of 1-byte, 100-byte, 2 KB, and 4 KB files, four individual sectors would be required. The 1-byte sector will be almost empty, and the 4 KB sector will be completely filled. Reducing sector size increases storage efficiency, so in this respect 512-byte sectors are better than 4 KB sectors.
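The wasted space in this example is easy to total up. The short Python sketch below uses the four file sizes from the text and compares 4 KB sectors with 512-byte sectors; the function names are my own, and the calculation ignores any bookkeeping overhead the file system itself needs.

```python
def sectors_needed(file_bytes, sector_bytes):
    """Whole sectors required to hold a file (integer ceiling division)."""
    return (file_bytes + sector_bytes - 1) // sector_bytes


def slack_bytes(file_bytes, sector_bytes):
    """Allocated-but-unused bytes left in the file's last sector."""
    return sectors_needed(file_bytes, sector_bytes) * sector_bytes - file_bytes


files = [1, 100, 2 * 1024, 4 * 1024]   # the four example file sizes, in bytes

for sector_bytes in (4096, 512):
    used = sum(sectors_needed(f, sector_bytes) for f in files)
    wasted = sum(slack_bytes(f, sector_bytes) for f in files)
    print(sector_bytes, used, wasted)
# 4096 4 10139
# 512 14 923
```

With 4 KB sectors the four files waste 10,139 bytes; with 512-byte sectors they waste only 923 bytes, at the cost of tracking 14 sectors instead of 4.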
Figure 8 shows how each sector is uniquely identified. The sector ID contains control information for each sector's unique angular position and track number. This information is followed by an Error Correction Code (ECC) to ensure integrity of the sector ID information. A small gap allows for the sector identification to be processed by the disk controller circuitry. After this is the actual data that will be read or written, followed by the same type of Error Correction Code and gap.
The control information is invisible to the user. For example, if you retrieve a Word document, only the data stored in the sector will be viewable and not the relevant sector control information. If the sector size is 512 bytes, the control data and error correction codes take up additional space on the platter beyond those 512 bytes, but only the rated user data capacity is called out in the specifications.
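As a rough illustration of this overhead, the Python sketch below adds up the fields of one sector in the order described for Fig. 8. Every byte count here except the 512 bytes of user data is an invented placeholder (gaps are expressed as byte equivalents); actual field sizes vary by drive and are not given in this article.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SectorFormat:
    """Byte counts for each field of one on-disk sector (all hypothetical)."""
    sector_id: int
    id_ecc: int
    id_gap: int
    user_data: int
    data_ecc: int
    data_gap: int

    def raw_bytes(self):
        """Space the whole sector occupies on the platter."""
        return (self.sector_id + self.id_ecc + self.id_gap
                + self.user_data + self.data_ecc + self.data_gap)

    def efficiency(self):
        """Fraction of the raw space that holds user data."""
        return self.user_data / self.raw_bytes()


# Placeholder field sizes, chosen only to make the arithmetic easy to follow.
fmt = SectorFormat(sector_id=8, id_ecc=4, id_gap=16,
                   user_data=512, data_ecc=40, data_gap=20)
print(fmt.raw_bytes())               # 600
print(round(fmt.efficiency(), 3))    # 0.853
```

Under these assumed sizes, a "512-byte" sector really occupies 600 bytes of platter space, yet only the 512 bytes of user data count toward the advertised capacity.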
When the user accesses specific files, the FAT gives a map of all the sectors that contain all the pieces of that file. The read/write heads read in all the appropriate sectors in the correct order. This data is then reassembled and transferred into the computer's RAM so it can be utilized.
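A simplified version of this lookup can be sketched in Python. In FAT file systems the table works like a linked list: each entry names the next piece of the file. The toy below follows such a chain over a made-up disk with 4-byte sectors; the end-of-chain marker, sector numbers, and function names are assumptions for illustration, and real FAT volumes chain clusters and use reserved values that are not shown here.

```python
END_OF_FILE = -1  # hypothetical end-of-chain marker


def follow_chain(fat, first_sector):
    """Return the file's sector numbers in the order they must be read."""
    chain, seen = [], set()
    current = first_sector
    while current != END_OF_FILE:
        if current in seen:
            raise ValueError("corrupt table: chain loops back on itself")
        seen.add(current)
        chain.append(current)
        current = fat[current]
    return chain


def read_file(fat, sectors, first_sector, length):
    """Reassemble a file from its scattered sectors and trim the padding."""
    data = b"".join(sectors[n] for n in follow_chain(fat, first_sector))
    return data[:length]


# Toy disk with 4-byte sectors; the file's pieces are deliberately out of order.
sectors = {7: b"hard", 2: b" dri", 5: b"ves\x00"}
fat = {7: 2, 2: 5, 5: END_OF_FILE}

print(follow_chain(fat, 7))             # [7, 2, 5]
print(read_file(fat, sectors, 7, 11))   # b'hard drives'
```

The file is stored in sectors 7, 2, and 5, in that order, and the table is the only record of that order. This is why a damaged table makes the data unreadable even when the sectors themselves are intact.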
The FAT is ALWAYS located in a specific portion of the hard drive, and an exact copy is also stored elsewhere on the drive to ensure error free operation. If the FAT and its copy both become corrupted, the disk cannot be read.
The way that data is stored also depends on the operating system and its maximum allowable hard drive capacity. For example, Windows XP uses NTFS (New Technology File System). Each file system (FAT16, FAT32, and NTFS) has different capabilities and features. For a detailed look at different file system technologies, check out this site.
It really is amazing how far hard drive technology has advanced in the last few decades. One can only imagine where we will be a decade from now.
I hope this article has answered some questions that you had (or didn't know you had) about hard disk drives. For additional hard drive information, see http://etutorials.org/Misc/pc+hardware/Chapter+14.+Hard+Disk+Drives/14.3+Installing+a+PATA+Standard+ATA+Hard+Disk/
Detailed OEM specifications for an older 3.5" ATA hard drive can be found at http://www.hitachigst.com/tech/techlib.nsf/techdocs/85256AB8006A31E587256A94005F4219/$file/djaa_spw.pdf