Hard Disk Selection Criteria in Embedded Applications


Progress in science and technology drives social change, and advances in flash memory chip technology have triggered a surge in the solid-state drive (SSD) market. With excellent performance and steadily falling costs, SSDs are increasingly replacing traditional mechanical hard disk drives (HDDs) in storage. They also play an increasingly important role in embedded computing applications, which are the focus of this article.

Flash memory chips themselves are in a new transition, from 2D NAND to 3D NAND. 3D NAND stacks memory cells vertically in multiple layers to achieve higher storage density and faster read/write operations, and this evolution has been both notable and quick.

At first, SLC (single-level cell) flash was considered the most advanced because of its durability and fast read and write speeds. Over the past decade, however, MLC (multi-level cell) flash has dominated the industrial market because it is cheaper, offers larger capacity, and better meets general market demand. Since 2018, SSDs based on 96-layer 3D NAND (BiCS4) have become commonplace. In 2020, NAND moved from 96 layers to 128 layers, opening up further application opportunities across the market.

SSD

This article discusses several key points in selecting an SSD for embedded computers: the working environment, drive capacity, endurance (write/erase cycles), and power supply.

Environment

Different equipment serves different application fields, and each field has its own ambient conditions, operating temperature among them. Equipment that works indoors typically sees a temperature range of 0~70°C, while equipment that works outdoors may have to cope with subzero or very high temperatures. The market currently divides SSDs into the ranges -20~75°C, -40~85°C, and the more extreme -55~105°C.
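The temperature ranges above can be turned into a simple selection check. The following is a minimal sketch in Python; the grade names and the function `pick_grade` are illustrative assumptions, not vendor terminology, and only the numeric ranges come from the text.

```python
# Temperature ranges from the text (degrees Celsius), narrowest first.
# The grade names are hypothetical labels chosen for this example.
GRADES = [
    ("standard", 0, 70),
    ("extended", -20, 75),
    ("industrial", -40, 85),
    ("extreme", -55, 105),
]


def pick_grade(min_c, max_c):
    """Return the narrowest grade that covers the required range, or None."""
    for name, low, high in GRADES:
        if low <= min_c and max_c <= high:
            return name
    return None


print(pick_grade(5, 45))    # an indoor cabinet
print(pick_grade(-30, 60))  # an outdoor enclosure
```

Choosing the narrowest grade that still covers the requirement is one possible policy; a real design would normally add margin for self-heating inside the enclosure.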

In addition, factors such as air humidity, dust, oxidation, and vibration should also be considered.

Capacity

Proper use of a drive starts with knowing its available capacity. However, inconsistent specifications make this a challenge.

Some SSD manufacturers specify the full flash capacity (that is, the actual flash capacity) as usable, while others specify only part of it and keep the hidden remainder as spare capacity, which is called over-provisioning (OP). The controller uses this extra flash for tasks such as garbage collection, which improves drive efficiency and extends SSD life.

For example, for an SSD with close to 256 GB of flash, manufacturer A specifies the full 256 GB, manufacturer B specifies 240 GB, and manufacturer C specifies 200 GB. The capacity actually available may be even less than the stated capacity, commonly because part of the flash is used for internal processing. Engineers should therefore test under near-real field conditions to analyze SSD performance and life. Manufacturers usually classify drives as "industrial" or "consumer" grade. Industrial-grade drives are often used in data centers or servers and therefore reserve more flash capacity than consumer-grade SSDs, so that performance stays stable over a longer period. For storage arrays in particular, engineers should ensure that the design maintains low latency during peak loads, so the choice of capacity is also important in practice.
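The three-manufacturer example can be expressed as an over-provisioning percentage. The sketch below uses the common definition of OP as spare capacity relative to user capacity and assumes that all three drives carry exactly 256 GB of raw flash, which the text only approximates. The function name `over_provisioning_pct` is hypothetical.

```python
def over_provisioning_pct(raw_gb, user_gb):
    """OP as a percentage of user capacity: (raw - user) / user * 100."""
    if user_gb <= 0 or user_gb > raw_gb:
        raise ValueError("user capacity must be positive and not exceed raw capacity")
    return (raw_gb - user_gb) / user_gb * 100.0


RAW_GB = 256  # assumed raw flash for all three drives in the example

for maker, user_gb in [("A", 256), ("B", 240), ("C", 200)]:
    op = over_provisioning_pct(RAW_GB, user_gb)
    print(f"Manufacturer {maker}: {user_gb} GB usable, OP = {op:.1f}%")
```

Under these assumptions the drives come out at roughly 0%, 6.7%, and 28% OP, which shows how differently three drives built on similar flash can be specified.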

 

Durability

When choosing an SSD, lifetime endurance and sustained write performance are important criteria, as the wrong choice can be costly. Endurance matters little when the SSD serves only as a boot medium, but it is very important in applications that log data continuously.

Information on SSD write endurance can be found in the datasheet, usually expressed in terabytes written (TBW) or drive writes per day (DWPD). For example, a TBW value of 100 means that 100 TB of data can be written to the SSD over its whole life. The DWPD value states how many times the drive's full capacity can be written each day over the rated period, typically three to five years, before the SSD reaches the end of its service life.
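The two ratings are related through the drive capacity and the rated period, so one can be converted into the other. The following sketch assumes decimal units (1 TB = 1000 GB) and 365-day years; the function names and the example drive (240 GB, 100 TBW, three-year rating) are illustrative assumptions, not values from a particular datasheet.

```python
def tbw_from_dwpd(dwpd, capacity_gb, rated_years):
    """Total terabytes written implied by a DWPD rating."""
    return dwpd * (capacity_gb / 1000.0) * 365 * rated_years


def dwpd_from_tbw(tbw, capacity_gb, rated_years):
    """Drive writes per day implied by a TBW rating."""
    return tbw / ((capacity_gb / 1000.0) * 365 * rated_years)


# Hypothetical drive: 240 GB, rated for 100 TBW over three years.
print(f"DWPD: {dwpd_from_tbw(100, 240, 3):.2f}")

# Hypothetical drive: 1000 GB, rated for 1 DWPD over five years.
print(f"TBW: {tbw_from_dwpd(1, 1000, 5):.0f}")
```

Converting both candidates to the same unit makes drives of different capacities directly comparable for a given logging workload.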

Several factors affect an SSD's endurance rating, including how well the firmware implements wear leveling (evenly distributing writes across all blocks in the SSD), the write amplification factor, and the number of write/erase (W/E) cycles the NAND flash can sustain.

Engineers should not rely on datasheet specifications alone. They need to use proper tools to assess capacity, performance, and operating temperature in order to determine the achievable lifespan of an SSD in a given application. In practice, the engineer should test the SSD under conditions similar to the actual application and then read the SMART values with the test tool. For example, an application that writes many small data blocks (4 KB) shortens the lifetime; in this case, it is better to pack the data first and then write it. Additionally, the structure of an SSD causes the firmware to move incoming data multiple times until it eventually finds a place in flash memory. This effect is called write amplification, and it is quantified by the write amplification factor (WAF). The higher the WAF, the more wear the flash memory cells suffer and the faster the lifespan declines. Once the WAF is determined and the corresponding TBW value is obtained from the datasheet, engineers can calculate the approximate write lifetime of the SSD.
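The calculation described above can be sketched as follows. WAF is taken as the ratio of bytes physically written to flash to bytes written by the host, both of which are often available as SMART counters, although attribute names vary by vendor. The lifetime estimate assumes that the datasheet TBW corresponds to a WAF of 1, which is a simplification: real TBW ratings are tied to a specific test workload, so the result is only a rough guide. All function names and input values are hypothetical.

```python
def write_amplification(nand_tb_written, host_tb_written):
    """WAF = data physically written to flash / data written by the host."""
    if host_tb_written <= 0:
        raise ValueError("host writes must be positive")
    return nand_tb_written / host_tb_written


def estimated_lifetime_years(tbw, waf, host_tb_per_day):
    """Rough lifetime, assuming the datasheet TBW was rated at WAF = 1."""
    days = tbw / (waf * host_tb_per_day)
    return days / 365


# Hypothetical counters read after a field-like test run.
waf = write_amplification(nand_tb_written=9.0, host_tb_written=2.0)
years = estimated_lifetime_years(tbw=100, waf=waf, host_tb_per_day=0.02)
print(f"WAF = {waf:.1f}, estimated lifetime = {years:.1f} years")
```

With these example numbers, a drive that logs 20 GB per day at a WAF of 4.5 would last roughly three years. Halving the WAF, for instance by packing small records into larger writes, would roughly double that figure.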

Power

The power supply is a critical part of an industrial embedded application. If the power supply misbehaves, errors can occur during data transmission. To address this, error detection and correction (ECC) is implemented at each data transmission point. ECC protects data transfers between the host system and the NAND flash by detecting and correcting potential errors along the complete end-to-end data path. Engineers need to evaluate the stability of the power supply and also consider the effects of a sudden supply voltage failure. Because of the internal structure of an SSD and the way it programs incoming data, such a failure can cause problems, possibly even data loss.

Thanks to the low-voltage detector integrated in the SSD, there is good protection for such emergencies. When a voltage dip is detected, the SSD controller stops accepting further commands and attempts to save the data currently in transit between the controller, cache, and flash. Some SSD manufacturers take the extra step of placing capacitors on the SSD board to hold up the internal voltage long enough to safely write the data currently in the DRAM cache.

 
