The Development and History of Solid State Drives (SSDs)
Wednesday, September 10, 2014
Early SSDs using RAM and similar technology
Solid state drives (SSDs) had origins in the 1950s with two similar technologies: magnetic core memory and card capacitor read-only store (CCROS). These auxiliary memory units (as contemporaries called them) emerged during the era of vacuum-tube computers. But with the introduction of cheaper drum storage units, their usage ceased.
Later, in the 1970s and 1980s, SSDs were implemented in semiconductor memory for early IBM, Amdahl, and Cray supercomputers, but they were seldom used because of the prohibitively high price. In the late 1970s, General Instruments produced an electrically alterable ROM (EAROM), which operated somewhat like the later NAND flash memory. Unfortunately, a ten-year life was not achievable and many companies abandoned the technology. In 1976 Dataram started selling a product called Bulk Core, which provided up to 2 MB of solid state storage compatible with Digital Equipment Corporation (DEC) and Data General (DG) computers. In 1978, Texas Memory Systems introduced a 16 kilobyte RAM solid-state drive to be used by oil companies for seismic data acquisition. The following year, StorageTek developed the first RAM solid-state drive.
The Sharp PC-5000, introduced in 1983, used 128 KB solid-state storage cartridges containing bubble memory. In 1984 Tallgrass Technologies Corporation introduced a 40 MB tape backup unit with a built-in 20 MB solid-state unit; the 20 MB unit could be used instead of a hard drive. In September 1986, Santa Clara Systems introduced BatRam, a 4 MB mass storage system expandable to 20 MB using 4 MB memory modules. The package included a rechargeable battery to preserve the memory chip contents when the array was not powered. 1987 saw the entry of EMC Corporation (EMC) into the SSD market, with drives introduced for the mini-computer market. However, by 1993 EMC had exited the SSD market.
Software-based RAM disks were still used as of 2009 because they are an order of magnitude faster than other technology, though they consume more CPU resources and cost much more on a per-gigabyte basis.
Flash-based SSDs
In 1983, a mobile computer was the first to include four slots for removable storage in the form of flash-based solid-state disks (flash-memory cards). Flash modules had the limitation of needing to be reformatted entirely to reclaim space from deleted or modified files; old versions of deleted or modified files continued to take up space until the module was reformatted.
In 1991 a 20MB solid state drive (SSD) sold for $1,000.
Early in 1995, the introduction of flash-based solid-state drives was announced. They had the advantage of not requiring batteries to maintain the data in the memory (required by the prior volatile memory systems), but were not as fast as the dynamic random-access memory (DRAM)-based solutions. Since then, SSDs have been used successfully as hard disk drive (HDD) replacements by the military and aerospace industries, as well as for other mission-critical applications. These applications require the exceptional mean time between failures (MTBF) rates that solid-state drives achieve by virtue of their ability to withstand extreme shock, vibration and temperature ranges.
Around 2007 a PCIe-based SSD was introduced with 100,000 input/output operations per second (IOPS) of performance in a single card and capacities up to 320 GB. A 1 terabyte (TB) flash SSD using a PCI Express ×8 interface can achieve a maximum write speed of 654 megabytes per second (MB/s) and maximum read speed of 712 MB/s.
Enterprise Flash Drives
Enterprise flash drives (EFDs) are designed for applications requiring high I/O performance (IOPS), reliability, energy efficiency, and consistent performance. In most cases, an EFD is an SSD with a higher set of specifications, compared with SSDs that would typically be used in notebook computers. There are no standards bodies controlling the definition of EFDs, so any SSD manufacturer may claim to produce EFDs when they may not actually meet the requirements.
Architecture and Function
The key components of an SSD are the controller and the memory to store the data. Though the primary memory component in an SSD was traditionally DRAM volatile memory, it is now more commonly NAND flash non-volatile memory. Other components play a less significant role in the operation of the SSD and vary among manufacturers.
Controller
Every SSD includes a controller that incorporates the electronics that bridge the NAND memory components to the host computer. The controller is an embedded processor that executes firmware-level code and is one of the most important factors of SSD performance. Some of the functions performed by the controller include:
- Error-correcting code (ECC)
- Wear leveling
- Bad block mapping
- Read scrubbing and read disturb management
- Read and write caching
- Garbage collection
- Encryption
The performance of an SSD can scale with the number of parallel NAND flash chips used in the device. A single NAND chip is relatively slow, due to its narrow (8/16-bit) asynchronous I/O interface and the high latency of basic I/O operations (typical for SLC NAND: ~25 μs to fetch a 4 KB page from the array to the I/O buffer on a read, ~250 μs to commit a 4 KB page from the I/O buffer to the array on a write, and ~2 ms to erase a 256 KB block). When multiple NAND devices operate in parallel inside an SSD, the bandwidth scales and the high latencies can be hidden, as long as enough outstanding operations are pending and the load is evenly distributed between devices. Faster SSDs implement data striping (similar to RAID 0) and interleaving in their architecture. This enabled the creation of ultra-fast SSDs with 250 MB/s effective read/write speeds on the SATA 3 Gbit/s interface in 2009. Two years later, consumer-grade SATA 6 Gbit/s SSD controllers could support 500 MB/s read/write speeds.
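As a rough illustration of why this parallelism matters, the sketch below models sustained read throughput for one die versus several interleaved dies spread across independent channels. The page size and array latency follow the figures above; the per-channel bus bandwidth and the channel/die counts are assumptions chosen only for illustration.

```python
# Illustrative read-throughput model for NAND parallelism.
# The 4 KB page and ~25 us array latency come from the text above; the
# per-channel bus speed and channel/die counts are assumed for illustration.
PAGE_BYTES = 4 * 1024            # bytes per NAND page
T_ARRAY_US = 25.0                # array-to-register fetch time, in microseconds
CHANNEL_MB_S = 40.0              # assumed bandwidth of one 8-bit asynchronous channel
T_XFER_US = PAGE_BYTES / CHANNEL_MB_S  # ~102 us to move one page over the channel

def read_throughput(channels: int, dies_per_channel: int) -> float:
    """Approximate sustained read bandwidth in MB/s (1 byte/us ~= 1 MB/s)."""
    # A lone die serializes the array fetch and the bus transfer.
    single_die = PAGE_BYTES / (T_ARRAY_US + T_XFER_US)
    # Interleaved dies overlap their array fetches with each other's bus
    # transfers until the shared channel saturates; independent channels
    # then scale the total bandwidth linearly.
    per_channel = min(dies_per_channel * single_die, CHANNEL_MB_S)
    return channels * per_channel

print(f"1 die, 1 channel:    ~{read_throughput(1, 1):.0f} MB/s")   # latency-bound
print(f"4 dies x 8 channels: ~{read_throughput(8, 4):.0f} MB/s")   # parallelism hides latency
```

Under these assumed figures a single die delivers only around 32 MB/s, while 32 dies spread over 8 channels reach roughly 320 MB/s, which is why striping and interleaving were the path to the 250 MB/s-class drives mentioned above.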
Memory
Flash-Based Memory
Comparison of architectures (SLC relative to MLC, and NAND relative to NOR)

| SLC vs. MLC                                       | NAND vs. NOR                                           |
| 10× more persistent                               | 10× more persistent                                    |
| 3× faster sequential write; same sequential read  | 4× faster sequential write; 5× faster sequential read  |
| 30% more expensive                                | 30% cheaper                                            |
Most SSD manufacturers use non-volatile NAND flash memory in the construction of their SSDs because of its lower cost compared with DRAM and its ability to retain data without a constant power supply, ensuring data persistence through sudden power outages. Flash memory SSDs are slower than DRAM solutions, and some early designs were even slower than HDDs after continued use. Flash memory-based solutions are typically packaged in standard disk drive form factors (1.8-, 2.5-, and 3.5-inch), or in smaller, more compact layouts made possible by the small size of the flash memory itself.
Lower priced drives usually use multi-level cell (MLC) flash memory, which is slower and less reliable than single-level cell (SLC) flash memory. This can be mitigated or even reversed by the internal design structure of the SSD, such as interleaving, changes to writing algorithms, and higher over-provisioning (more excess capacity) with which the wear-leveling algorithms can work.
DRAM-Based Memory
SSDs based on volatile memory such as DRAM are characterized by ultrafast data access (generally less than 10 microseconds) and are used primarily to accelerate applications that would otherwise be held back by the latency of flash SSDs or traditional HDDs. DRAM-based SSDs usually incorporate either an internal battery or an external AC/DC adapter and backup storage systems to ensure data persistence while no power is being supplied to the drive from external sources. If power is lost, the battery provides power while all information is copied from random access memory (RAM) to back-up storage. When the power is restored, the information is copied back to the RAM from the back-up storage, and the SSD resumes normal operation (similar to the hibernate function used in modern operating systems). SSDs of this type are usually fitted with DRAM modules of the same type used in regular PCs and servers, which can be swapped out and replaced by larger modules.
A remote, indirect memory-access disk (RIndMA Disk) uses a secondary computer with a fast network or (direct) InfiniBand connection to act like a RAM-based SSD, but the newer, faster, flash-memory-based SSDs already available in 2014 are making this option less cost-effective.
While the price of DRAM continues to fall, the price of flash memory falls even faster. The “flash becomes cheaper than DRAM” crossover point occurred around 2004.
Other Types of Memory
Some SSDs use MRAM. Others combine DRAM and flash memory: when power is lost, the SSD copies all data from its DRAM to flash, and when power returns, it copies the data back from flash to DRAM. Some drives use a hybrid of spinning disks and flash memory.
Cache or Buffer
A flash-based SSD typically uses a small amount of DRAM as a cache, similar to the cache in hard disk drives. A directory of block placement and wear-leveling data is also kept in the cache while the drive is operating. Data is not permanently stored in the cache. Some designs eliminate the external DRAM entirely, which reduces the footprint and allows even smaller SSDs to be built.
Battery or Super Capacitor
Another component in higher-performing SSDs is a capacitor or some form of battery. These are necessary to maintain data integrity so that the data in the cache can be flushed to the drive when power is lost; some may even hold power long enough to maintain data in the cache until power is resumed. In the case of MLC flash memory, a problem called lower page corruption can occur when MLC flash memory loses power while programming an upper page. The result is that data written previously and presumed safe can be corrupted if the memory is not supported by a super capacitor in the event of a sudden power loss. This problem does not exist with SLC flash memory.
Host Interface
The host interface is not specifically a component of the SSD, but it is a key part of the drive. The interface is usually incorporated into the controller discussed above. The interface is generally one of the interfaces found in HDDs. They include:
- Serial attached SCSI (SAS, > 3.0 Gbit/s) – generally found on servers
- Serial ATA (SATA, > 1.5 Gbit/s)
- PCI Express (PCIe, > 2.0 Gbit/s)
- Fibre Channel (> 200 Mbit/s) – almost exclusively found on servers
- USB (> 1.5 Mbit/s)
- Parallel ATA (IDE, > 26.4 Mbit/s) – mostly replaced by SATA
- (Parallel) SCSI (> 40 Mbit/s) – generally found on servers, mostly replaced by SAS; the last SCSI-based SSD was introduced in 2004
Configurations
The size and shape of any device is largely driven by the size and shape of the components used to make that device. Traditional HDDs and optical drives are designed around the rotating platter or optical disc along with the spindle motor inside. If an SSD is made up of various interconnected integrated circuits (ICs) and an interface connector, then its shape could be virtually anything imaginable because it is no longer limited to the shape of rotating media drives. Some solid state storage solutions come in a larger chassis that may even be a rack-mount form factor with numerous SSDs inside. They would all connect to a common bus inside the chassis and connect outside the box with a single connector.
For general computer use, the 2.5-inch form factor (typically found in laptops) is the most popular. For desktop computers with 3.5-inch hard disk slots, a simple adapter plate can be used to make such a disk fit. Other types of form factors are more common in enterprise applications. An SSD can also be completely integrated in the other circuitry of the device, as in the Apple MacBook Air (starting with the fall 2010 model). As of 2014, mSATA and M.2 form factors are also gaining popularity, primarily in laptops.
Standard HDD form factors
The benefit of using a current HDD form factor would be to take advantage of the extensive infrastructure already in place to mount and connect the drives to the host system. These traditional form factors are known by the size of the rotating media, e.g., 5.25-inch, 3.5-inch, 2.5-inch, 1.8-inch, not by the dimensions of the drive casing.
Standard card form factors
For applications where space is at a premium, like ultrabooks or tablets, a few compact form factors were standardized for flash-based SSDs. One is the mSATA form factor, which uses the PCI Express Mini Card physical layout. It remains electrically compatible with the PCI Express Mini Card interface specification, while requiring an additional connection to the SATA host controller through the same connector.
The M.2 form factor, formerly known as the Next Generation Form Factor (NGFF), is a natural transition from mSATA and the physical layout it used to a more usable and more advanced form factor. While mSATA took advantage of an existing form factor and connector, M.2 was designed to maximize usage of the card space while minimizing the footprint. The M.2 standard allows both SATA and PCI Express SSDs to be fitted onto M.2 modules.
Disk-on-a-Module (DOM) Form Factors
A disk-on-a-module (DOM) is a flash drive with either 40/44-pin Parallel ATA (PATA) or SATA interface, intended to be plugged directly into the motherboard and used as a computer hard disk drive (HDD). The flash-to-IDE converter simulates a HDD, so DOMs can be used without additional software support or drivers. DOMs are usually used in embedded systems, which are often deployed in harsh environments where mechanical HDDs would simply fail, or in thin clients because of small size, low power consumption, and silent operation.
NAND Flash-Based SSD "Wear-Out": The Impact on Endurance and Reliability
Fundamental to the design of NAND flash is the potential for irreparable damage to the floating gate due to repeated program/erase cycles. Simply put, the endurance (the number of times a block can be erased and programmed) is limited. The relatively strong electric fields used during the program/erase cycle can damage the floating gate, and such damage permanently alters the NAND cell's characteristics. The potential for this problem is exacerbated when the SSD has a limited number of NAND blocks, that is, a fixed amount of capacity available to use. Thus, depending on the amount of data written to the device (the workload), how evenly the program cycles are spread over all cells in the flash device (wear leveling), and the ratio of data written to the NAND media to data received from the host (write amplification), the NAND cells can "wear out" prematurely, reducing the endurance of the overall SSD and the reliability and accessibility of the data it contains.
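To make the interplay of these factors concrete, here is a rough back-of-the-envelope lifetime estimate; the capacity, endurance rating, write-amplification factor, and workload figures below are illustrative assumptions, not values from this article.

```python
# Rough SSD lifetime estimate from per-block endurance, capacity,
# write amplification, and host workload (all figures are hypothetical).
capacity_gb = 256              # usable NAND capacity, in GB
pe_cycles = 10_000             # rated program/erase cycles per block (MLC-class)
write_amplification = 2.0      # NAND bytes written per host byte written
host_writes_gb_per_day = 50    # average daily host write workload, in GB

# Total host data that can be written before the rated endurance is exhausted.
total_host_writes_gb = capacity_gb * pe_cycles / write_amplification
lifetime_years = total_host_writes_gb / host_writes_gb_per_day / 365

print(f"Total host writes before wear-out: ~{total_host_writes_gb / 1000:,.0f} TB")
print(f"Estimated lifetime at this workload: ~{lifetime_years:,.0f} years")
```

A higher write-amplification factor or a heavier workload shortens the estimate proportionally, which is why the wear-leveling and over-provisioning techniques discussed below matter so much.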
Because MLC NAND requires additional program steps and operates with a tighter voltage-threshold window, an MLC NAND cell inherently wears out faster than an SLC NAND cell as the signal-to-noise ratio of the NAND media degrades over time. It is important to recognize the difference between these attributes of SLC and MLC flash because it affects the endurance specified for a given block:
- SLC NAND generally is specified at 100,000 write/erase cycles per block.
- MLC NAND typically is specified at 10,000 write/erase cycles per block.
Additionally, data retention (the integrity of the stored data in a flash cell over time) is affected by the state of the floating gate in a NAND cell, where voltage levels are critical. Leakage to or from the floating gate tends to slowly shift the cell's voltage level away from the level set when the cell was programmed or erased. This altered level may be interpreted incorrectly as a different logical value by the system. Because the voltage tolerances between MLC levels are tighter than between SLC levels, MLC flash cells are more likely to be affected by leakage. Consequently, care must be taken to ensure the long-term data retention capabilities of both SLC and MLC NAND when used in enterprise storage. In response to these issues, NAND flash OEMs have recently announced technology (called Enterprise MLC, or eMLC) that dramatically extends the life span of flash-based storage for enterprise applications.
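A simple way to see why MLC is more sensitive to this drift: within the same usable threshold-voltage range, a cell storing more bits must distinguish more states, so each voltage window is proportionally narrower. The sketch below illustrates the idea; the 4 V range is an assumed figure for illustration only.

```python
# Illustrative only: more bits per cell means more voltage states squeezed
# into the same threshold-voltage range, so each window is narrower and a
# small leakage-induced drift is more likely to cross into the wrong state.
V_RANGE = 4.0  # assumed usable threshold-voltage range, in volts

for name, bits_per_cell in [("SLC", 1), ("MLC", 2)]:
    states = 2 ** bits_per_cell
    window = V_RANGE / states
    print(f"{name}: {states} states, ~{window:.1f} V per voltage window")
```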
4 Techniques Used to Improve Enterprise NAND-Based SSD Reliability and Endurance
On the surface, many of the issues associated with NAND as a storage media may appear too overwhelming or challenging for the technology to be used in the enterprise environment. However, enterprise SSDs integrate a number of advanced techniques and intelligence to help overcome the endurance and reliability limitations at the NAND flash media level:
- Error correction code (ECC). ECC is used to detect and correct errors by adding additional bits to the data. ECC algorithms, such as Reed-Solomon codes, Hamming codes, and others, are typically used in storage applications. In general, the more bits of ECC that are used, the higher the level of error correction. Hence, an SSD with effective ECC will be able to correct more errors, which ultimately extends the time to wear-out.
- Wear-leveling techniques. Wear leveling is a process that SSDs use to minimize the impact of the NAND endurance limitation by spreading the program cycles evenly over all cells in a flash device. Two primary techniques, static and dynamic, are commonly used in SSDs to manage access to the NAND media. Static wear leveling prevents infrequently accessed data from remaining stored on any given block for a long period of time; it distributes data evenly over the entire device by searching for the least-used physical blocks and then writing the data to those locations. Dynamic wear leveling distributes data over free or unused blocks. In combination, these wear-leveling techniques increase the life span of an SSD by spreading data evenly across all the cells in the device to avoid individual cell wear-out (a minimal sketch of dynamic block allocation appears after this list).
- Use of spare blocks (or overhead). Providing spare blocks of additional NAND capacity is another way to improve endurance. For example, an SSD marketed as a 25GB SSD may show 25GB of capacity available to the user to store data. Yet, the SSD may be constructed with 32GB of actual NAND capacity. The 7GB of overhead (or spare blocks) in this example can be used to improve wear leveling efficiency and other program/erase operations to increase the endurance and performance at the device level. This is commonly referred to as overprovisioning.
- Buffering the data. In an SSD, and also with an HDD, buffering the data with a small amount of DRAM memory can improve performance. In an SSD, buffering the data also improves device-level endurance by optimizing writes, limiting the program/erase cycles, and eliminating any mismatch between erase block size and the data size.
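The following is a minimal sketch of the dynamic wear-leveling idea described in the list above: new writes are always directed to the free block with the fewest erase cycles, so no single block wears out far ahead of the rest. The data structures and interface are hypothetical, not taken from any particular SSD firmware.

```python
import heapq

class WearLeveler:
    """Minimal dynamic wear-leveling sketch (hypothetical interface): writes
    are steered to the least-erased free block so program/erase cycles stay
    roughly even across the whole device."""

    def __init__(self, num_blocks: int):
        self.erase_counts = [0] * num_blocks
        # Min-heap of (erase_count, block_id) for all currently free blocks.
        self.free_blocks = [(0, block) for block in range(num_blocks)]
        heapq.heapify(self.free_blocks)

    def allocate(self) -> int:
        """Hand out the least-worn free block to receive the next write."""
        _count, block = heapq.heappop(self.free_blocks)
        return block

    def reclaim(self, block: int) -> None:
        """Erase a block freed by garbage collection and return it to the pool."""
        self.erase_counts[block] += 1
        heapq.heappush(self.free_blocks, (self.erase_counts[block], block))

# Example: blocks are recycled in least-worn-first order.
wl = WearLeveler(num_blocks=1024)
block = wl.allocate()
wl.reclaim(block)
```

A production flash translation layer adds static wear leveling, logical-to-physical mapping, and garbage collection on top of this basic allocation policy, but the min-heap allocation captures the core idea.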
Summary
Customers expect that data stored on an SSD or an HDD will always be there and that it will be accurate regardless of the conditions: loss of power, temperature fluctuations, vibration, and shock. They also expect storage to be relatively low cost. The use of NAND-based SSDs is a relatively new phenomenon in the data-critical enterprise storage market. As end users seek higher-performing systems,
SSD adoption will grow, and as the installed base grows and the market matures, the reliability of SSDs will be exposed — for good or for bad. Hence, it is most beneficial for device and system OEMs to define a common set of metrics that characterize solid state storage reliability consistently and appropriately, and to establish these definitions sooner rather than later. The development of a set of standards around reliability will allow customers and system designers to evaluate SSDs for their use in their given application, to set expectations appropriately, and to increase customer satisfaction.
If you still have questions about SSDs, please call Symmetry Electronics at (310) 536-6190, or contact us online.