Flash Memory in a Flash

Published in Spider R&D · 9 min read · Jul 30, 2020

Today, flash memory is everywhere, be it the memory card in your digital camera or the on-chip flash that stores your Arduino UNO's program. However, despite sharing the name “flash memory”, the technologies used in these applications are quite different. Flash memory comes in two main technologies: NAND and NOR. NAND flash is the one in your memory cards and MP3 players, while NOR flash is the one found in embedded applications such as cell phones and the microcontroller boards you prototype with. Before you begin to scratch your head over why the types are named after universal logic gates, or what differences between these seemingly similar memories (both are non-volatile) make them suitable for different applications, take a look at the fundamental circuits of these technologies.

Figure: Circuit architecture of NOR and NAND flash. Courtesy: https://www.embedded.com/flash-101-nand-flash-vs-nor-flash/

If you are not familiar with such diagrams, here are a few basics:

Bitline is where you place the data to be stored during a write operation and from where you read the stored data during a read operation.

Wordline is the one that activates cells based on the address input both while reading and writing.

Sourceline connects all the sources to a common ground potential.

Black dots at intersections of lines indicate the lines are in electrical contact at that point.

Each memory cell is a MOSFET with a floating gate capable of storing binary data in the form of an electric charge.

To simplify the differences between NAND and NOR, let’s consider an analogy.

Visualise each cell as a house, and you are the data residing inside it. You coming out of the house to the road is equivalent to a data read operation (where data has to reach Bitline). Now take a look at the diagrams closely again. NOR flash is like living in an independent house from where you can walk to the street directly. On the other hand, NAND flash is like an apartment building with a house on each floor, so you’ll have to pass through the other floors to reach the street.

Considering you walk at the same pace and the only elevator in the apartment is out of order, what takes more time?
Also, which design takes up more space on the earth for the same number of houses?

The architectural distinction, evident from the figure above, is that every cell in NOR flash is connected directly to the Bitline and the Sourceline, while in NAND flash a string of cells is connected to the Bitline at one end and the Sourceline at the other through select transistors.

A direct result of this connection is that a cell in NOR flash can be read much faster than a cell in NAND flash. A typical NOR flash read takes on the order of nanoseconds, similar to DRAM, whereas reading a cell in NAND takes on the order of several microseconds, because all the other cells in the string (typically 16 or 32) have to be activated as well.

In the physical design, NOR requires more space per cell because every cell needs its own contact with the Bitline, making NAND the more compact option. This compactness has made NAND flash popular in storage applications.

The reason behind the names is no mystery. The circuit diagrams for the n-MOS implementations of the NAND and NOR logic gates are as follows:

Figures: n-MOS implementation of a 2-input NAND gate; n-MOS implementation of a 2-input NOR gate

If you have not already guessed it: in the NOR gate, each MOSFET is connected directly to the output line, while in the NAND gate the output sits at the end of a series chain of MOSFETs. The memory architectures mirror these two arrangements, which gives them their names.
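To make the series-versus-parallel idea concrete, here is a minimal sketch of the two pull-down networks as plain boolean logic. The function names are ours, and each transistor is idealised as a switch that conducts when its input is high.

```python
def nmos_nand(a: bool, b: bool) -> bool:
    """2-input n-MOS NAND: the two transistors sit in series."""
    # The output is pulled low only when BOTH series transistors conduct.
    pulled_low = a and b
    return not pulled_low


def nmos_nor(a: bool, b: bool) -> bool:
    """2-input n-MOS NOR: each transistor connects directly to the output."""
    # The output is pulled low when ANY of the parallel transistors conducts.
    pulled_low = a or b
    return not pulled_low


print("a b NAND NOR")
for a in (False, True):
    for b in (False, True):
        print(int(a), int(b), int(nmos_nand(a, b)), "  ", int(nmos_nor(a, b)))
```

The same picture carries over to the memories: a NOR cell pulls its Bitline low by itself, while a NAND cell can do so only if every other cell in its string conducts too.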

How to read data from the memory cells?

Memory cell. Courtesy: https://www.adestotech.com/wp-content/uploads/AN500.pdf

A very common analogy for understanding a MOSFET is a water tap, where the water flow resembles the motion of charge carriers from source to drain. The water flow is regulated by changing the pressure via the valve, while in a MOSFET the electron flow is controlled by changing the voltage applied at the gate terminal. Just as a minimum force is required to turn a tap on, a minimum voltage at the gate terminal, called the threshold voltage (VTH), is required for the MOSFET to start conducting.

The threshold voltage of a floating-gate MOSFET depends on the number of electrons trapped in the floating gate. If the floating gate is charged with electrons, this negative charge shields the channel region and hinders the formation of a channel between the source and drain. As a result, more voltage has to be applied for conduction, which raises the threshold voltage of the MOSFET. This effect is used to determine whether or not a memory cell is programmed.

The programmed cell corresponds to the cell with a floating gate charged with electrons, while the erased cell corresponds to a floating gate with depleted electrons.

**Read operation in a memory cell.** Courtesy: https://www.adestotech.com/wp-content/uploads/AN500.pdf

VGSTH1 is the threshold voltage of the MOSFET in the erased state, and VGSTH2 is the threshold voltage of the MOSFET in the programmed state. A reference voltage VTREF is chosen between VGSTH1 and VGSTH2 and applied to the gate of the cell. If the MOSFET conducts current, it is in the erased state; otherwise, it is in the programmed state.
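The read rule can be sketched in a few lines of Python. The voltage values below are illustrative assumptions, not figures from any datasheet, and the MOSFET is idealised as a switch that conducts whenever the gate voltage exceeds its threshold.

```python
# Illustrative (assumed) threshold voltages, in volts.
V_GSTH1 = 1.0                      # erased cell: low threshold
V_GSTH2 = 5.0                      # programmed cell: high threshold
V_TREF = (V_GSTH1 + V_GSTH2) / 2   # reference chosen between the two


def conducts(cell_threshold: float, gate_voltage: float) -> bool:
    """Idealised MOSFET: conducts once the gate voltage exceeds its threshold."""
    return gate_voltage > cell_threshold


def read_state(cell_threshold: float, v_ref: float = V_TREF) -> str:
    """Apply the reference voltage to the gate and sense whether current flows."""
    return "erased" if conducts(cell_threshold, v_ref) else "programmed"


print(read_state(V_GSTH1))
print(read_state(V_GSTH2))
```

Placing VTREF midway between the two thresholds is just one simple choice; any value that separates them cleanly would work in this idealised model.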

How to program and erase?

The answer is as simple as the question sounds: we transfer electrons to the floating gate to program the cell and remove them from the floating gate to erase it.

The program operation for the cell is based on “Hot-Electron Injection”. To explain this better, imagine that you are walking from your hostel to the mess. On the way, you come across a burger spot. The burger spot is more appealing to you than the mess, so you end up heading there instead. Similarly, creating a potential difference of about 4V between the source and the drain sets the electrons moving from source to drain. Applying a high enough positive voltage of about 9V to the gate puts the floating gate at a higher positive voltage than the drain. Hence, electrons are pulled into the floating gate, just like you are pulled into the burger spot. (Lockdown seems to have a higher pull over us, and hence we are trapped inside our homes :/)

Once inside the floating gate, these electrons no longer have the required energy to escape the confines of the floating gate, and they remain trapped — the cell is programmed.

The erase operation happens through a process known as “Fowler-Nordheim Tunneling”. Continuing the earlier analogy, you are now at the burger spot (the cell has been programmed), but the food you are served is not so good. Fortunately, you remember that today's mess menu is something you love, so you leave the spot and go back to the mess. Similarly, by applying a large negative voltage (approximately -10V) to the gate and a positive voltage (approximately 8V) to the MOSFET's body, the electrons are repelled from the floating gate and tunnel through the oxide layer. This depletes the floating gate of electrons, and the cell is erased.
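Putting program and erase together, a cell can be treated as a tiny state machine. The sketch below is a toy model under our own assumptions: the class name and fields are hypothetical, the physics is reduced to a single flag, and a fresh cell is assumed to start out erased.

```python
from dataclasses import dataclass


@dataclass
class FloatingGateCell:
    """Toy cell that tracks only whether the floating gate holds electrons."""
    charged: bool = False   # assumption: a fresh cell starts erased
    pe_cycles: int = 0      # completed program/erase cycles

    def program(self) -> None:
        # Hot-electron injection (about 9V on the gate, about 4V across
        # source and drain): electrons end up trapped on the floating gate.
        self.charged = True

    def erase(self) -> None:
        # Fowler-Nordheim tunneling (about -10V on the gate, about 8V on
        # the body): electrons leave the floating gate through the oxide.
        if self.charged:
            self.pe_cycles += 1
        self.charged = False

    def state(self) -> str:
        return "programmed" if self.charged else "erased"


cell = FloatingGateCell()
cell.program()
print(cell.state())
cell.erase()
print(cell.state(), cell.pe_cycles)
```

Counting cycles inside erase() is a design choice for this sketch; it simply makes the wear bookkeeping visible.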

Once programmed, data stays forever?

Now that you know how a cell is programmed and erased, the next big question is how long it can hold its programmed state. Typically, we would like the data to be retained for at least a few years. To achieve this, the stored charge must stay within a threshold limit to maintain data integrity. The charge degrades naturally as electrons slowly tunnel back to the channel. This degradation depends on factors like the flash structure, the amount of flash wear (the number of P/E cycles performed on the cell), and the storage temperature.

So an important factor to be considered about a flash cell is its endurance — the number of Program/Erase cycles an individual cell can undergo before the cell becomes unreliable.

Let’s consider a new hairband to demonstrate the endurance of a cell. You’ve got to stretch the new hairband hard to fit your hair because of its small size. With repeated use, the band loses its elasticity and its rest position grows larger. Moreover, the band can no longer reach the maximum stretch that a new, unused band can, until one day there is almost no elasticity left and you cannot tell the difference between the stretched and rest states.

Something similar happens in a flash cell too. With repeated P/E cycles, the dielectric oxide layer degrades as electrons get trapped in it. These trapped electrons leave the oxide layer negatively charged, which weakens the field and raises the erased-state threshold voltage, VGSTH1. During programming, the same trapped electrons hinder other electrons from entering the floating gate, reducing the charge that can be stored there and resulting in a lower VGSTH2. The window between the two thresholds closes, preventing further use of the flash cell.

The trapped electrons also enable trap-assisted tunneling for the electrons stored in the floating gate, so data retention gets worse as the flash wears.
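The closing window can be visualised with a deliberately crude model. Everything numeric here is an assumption made for illustration: the fresh thresholds, the drift rate, the minimum readable window, and above all the linear drift itself, since real oxide wear is not linear.

```python
def threshold_window(cycles, vth1_fresh=1.0, vth2_fresh=5.0, drift_per_kcycle=0.02):
    """Return (VGSTH1, VGSTH2, window) after a number of P/E cycles.

    Assumed toy model: the erased threshold rises and the programmed
    threshold falls by the same amount per thousand cycles.
    """
    drift = drift_per_kcycle * cycles / 1000
    vth1 = vth1_fresh + drift
    vth2 = vth2_fresh - drift
    return vth1, vth2, vth2 - vth1


def cycles_until_unreadable(min_window=1.0, step=1000, **model):
    """Step through P/E cycles until the window drops below min_window."""
    cycles = 0
    while threshold_window(cycles, **model)[2] >= min_window:
        cycles += step
    return cycles


for cycles in (0, 10_000, 50_000):
    vth1, vth2, window = threshold_window(cycles)
    print(f"{cycles:>6} cycles: VGSTH1={vth1:.2f} V, VGSTH2={vth2:.2f} V, window={window:.2f} V")
print("unreadable after about", cycles_until_unreadable(), "cycles (toy numbers)")
```

The absolute cycle count means nothing outside this toy model; the point is only that the reference voltage needs a gap to sit in, and wear steadily eats that gap from both sides.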

Can we store more than 2 states in a single memory cell?

From the above discussion, we have seen that a cell has two states (1 and 0), depending on whether or not there are electrons in the floating gate. You might wonder: why not have more than two states? After all, the threshold voltage can be set to a range of values by varying the charge stored in the floating gate. This leads to multilevel flash memories, where we can store 2-bit values by having four states in a single cell (the erased state and three levels of charge stored in the floating gate). In general, storing n bits in one cell requires 2ⁿ levels (the erased state plus 2ⁿ − 1 charged levels). MLCs (multilevel cells) have a higher bit density, and hence a lower cost per bit. The main drawbacks of MLCs are that the number of write cycles the cell can sustain drops and the chance of error rises. They also have lower retention than single-level cells. Because of these drawbacks, MLCs are mainly used in products that don’t need long-term reliability, such as USB flash drives and portable media players.
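As a rough sketch of multilevel reading: storing n bits needs 2ⁿ distinguishable levels (the erased state plus 2ⁿ − 1 charged levels), and therefore 2ⁿ − 1 reference voltages to tell them apart. The voltage range, the evenly spaced references, and the plain binary mapping from level to bits are all assumptions made for illustration; real devices choose their own spacing and bit encoding.

```python
def levels_for_bits(n_bits: int) -> int:
    """Number of distinct threshold levels needed to store n_bits in one cell."""
    return 2 ** n_bits


def reference_voltages(n_bits: int, v_min: float = 0.0, v_max: float = 6.0) -> list:
    """Evenly spaced references separating the levels (assumed voltage range)."""
    levels = levels_for_bits(n_bits)
    step = (v_max - v_min) / levels
    return [v_min + step * i for i in range(1, levels)]


def read_level(cell_threshold: float, refs: list) -> int:
    """Level index = number of reference voltages the cell threshold exceeds."""
    return sum(cell_threshold > ref for ref in refs)


def level_to_bits(level: int, n_bits: int) -> str:
    """Plain binary mapping from level to bits (an assumption of this sketch)."""
    return format(level, f"0{n_bits}b")


refs = reference_voltages(2)   # three references for a 2-bit cell
for vth in (0.5, 2.0, 3.5, 5.5):
    level = read_level(vth, refs)
    print(f"Vth={vth} V -> level {level} -> bits {level_to_bits(level, 2)}")
```

Notice how the gap between neighbouring levels shrinks as n grows. That shrinking margin is why multilevel cells are more sensitive to the wear and charge loss described earlier.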

How much has Flash grown up?

One of the latest developments in flash memory is 3D NAND, which is literally the “grown up” version: strings of flash cells are built vertically in silicon. Such a design packs more bits into the same surface area, increasing density and reducing the cost per bit. Promising almost 3X the packing capacity of planar NAND, faster reads and writes, and better power efficiency, this technology may replace the current technologies and become “the solution” in cloud and enterprise storage!

References

Prof. Aneesh Nainani’s lecture for Stanford University class available at https://www.youtube.com/watch?v=WXKYLLARQf4
Grossi, M., Lanzoni, M., & Ricco, B. (2003). Program Schemes For Multilevel Flash Memories. Proceedings of the IEEE, 91(4), 594–601.
https://www.adestotech.com/wp-content/uploads/AN500.pdf
https://www.csd.uoc.gr/~hy428/reading/M-Systems_NANDvsNOR.pdf
https://www.embedded.com/flash-101-nand-flash-vs-nor-flash
https://www.ni.com/en-in/support/documentation/supplemental/12/understanding-life-expectancy-of-flash-storage.html#section-1349300005
http://www.cse.scu.edu/~tschwarz/coen180/LN/flash.html

— This article was co-authored by Amritha Baskaran

This article is published as a part of the ‘Hardware Series’ under Spider Research and Development Club, NIT Trichy on a Tronix Thursday!

Written by Bhavya Krishna