Soldered CPU vs. ‘Cheap Paste’

A study into the technical details of TIM to understand why Intel may have abandoned soldered CPU designs.

What is Thermal Interface Material (TIM)

TIM, commonly called ‘Thermal Paste,’ improves heat transfer from one surface to another. Two levels of TIM are generally used in modern CPU cooling setups. TIM1 transfers heat from the die to the internal heat spreader of the CPU, while TIM2 transfers heat from the heat spreader to the heat sink. Even if one were to polish the contact surfaces to a mirror-like finish, microscopic imperfections would still leave less than stellar contact between them. TIM solves this problem:

Taken from [6]

There are several forms of TIM: gel, grease, phase change material, phase change metallic alloy and solder. We will focus on the two most prominently used forms: grease and solder. In this study we will look at some research papers to determine whether the criticism of Intel’s TIM1 choice is valid and why Intel made the change.

Effects of Manufacturing on TIM choice

Let us first consider the manufacturing process of a CPU package to understand the physical effects involved. Flip chip style substrate connections must deal with thermal effects at the first solder process. Underfill is used to protect the interconnects under the die (first level interconnects) so that the Coefficient of Thermal Expansion (CTE) mismatch between the organic substrate and the die does not cause solder fatigue cracks. The solder bumps have been made of tin-silver or tin-silver-copper alloy since the move to lead-free construction [1]. In simple terms, the die and the substrate expand when heated and contract when cooled by different amounts, which puts the solder bumps under stress. The underfill controls the amount of shear stress.
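To get a feel for the magnitudes involved, here is a minimal sketch of the mismatch strain. The numbers are assumptions of my own rather than figures from [1]: a silicon CTE of about 2.6 ppm/K, an assumed 17 ppm/K for the organic substrate, an assumed 195 K cool-down after reflow, and an assumed 13.5 mm long die. The function names are hypothetical.

```python
# Rough estimate of the thermal mismatch strain between a silicon die and an
# organic substrate. All numeric values below are illustrative assumptions.

SILICON_CTE_PPM_PER_K = 2.6     # typical textbook value for silicon
SUBSTRATE_CTE_PPM_PER_K = 17.0  # assumed value for an organic laminate


def mismatch_strain(cte_a_ppm, cte_b_ppm, delta_t_kelvin):
    """Return the unconstrained strain difference (dimensionless)."""
    return abs(cte_a_ppm - cte_b_ppm) * 1e-6 * delta_t_kelvin


def edge_displacement_um(strain, half_length_mm):
    """Relative displacement at the die edge, in micrometres."""
    return strain * half_length_mm * 1000.0


# Assumed cool-down of 195 K and an assumed die length of 13.5 mm.
strain = mismatch_strain(SILICON_CTE_PPM_PER_K, SUBSTRATE_CTE_PPM_PER_K, 195.0)
shift = edge_displacement_um(strain, 13.5 / 2)

print(f"mismatch strain: {strain:.5f}")
print(f"edge displacement: {shift:.1f} um")
```

Under these assumptions the strain is about 0.28%, or roughly 19 um of relative movement at the die edge if nothing restrained it. That movement is what the solder bumps and underfill have to absorb.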

Pentium 3 package clearly shows underfill

This picture of my Pentium 3 shows the large black epoxy blob that holds the die in place (the underfill). Note that this processor did not include an IHS.

Taken from [1]

When the die is attached to the substrate, the CTE mismatch bends the substrate during solder reflow. Tensile stresses appear on the top surface while the bottom surface sees compressive stresses [2].

Taken from [2]

To this stressed die + substrate, a nickel plated copper Internal Heat Spreader (IHS) is applied to disperse the heat into a large area heat sink plate. This process usually requires an indium (In) based solder to connect the die to the IHS at a temperature lower than that required for the first level interconnect solder bumps. However, the same sort of stresses are imparted onto the package as the indium cools, causing further bending of the substrate. This is an exaggerated view, but at the microscopic scale, small effects can cause big problems given enough time:

Taken from [4]

Due to the properties of the indium based solder, thermal cycling forms microcracks in the solder layer. The cracks resemble a layered separation, leaving voids across which thermal energy cannot transfer. This crack pattern is referred to as “delamination.” Over time these cracks contribute to a significant increase in the overall thermal resistance of the TIM1 layer. While the largest thermal swing occurs at production time, thermal cycling over the life of the processor also contributes to delamination. Thus, a freshly soldered processor that is brand new in box (BNIB) is at its peak thermal performance, before it has run hundreds of heating cycles. Larger dies are less prone to stress based delamination at production time.

Heat: A new challenger

A small die has a higher thermal density hotspot. Solder based TIM1 was replaced by thermal grease based TIM1 on mainstream desktop parts with Ivy Bridge, a die shrink of the Sandy Bridge architecture.

Ivy Bridge Power Density heat map — Taken from [3]

The thermal energy is consistently concentrated in the core region (as opposed to the cache, system agent or iGPU) due to the high activity and dynamic power consumption of that region. Also note that Core 1 and 2 have the highest density since they cannot dissipate energy into cooler adjacent silicon. As die sizes decrease, a delamination at such a critical region can be disastrous for die health and performance. Most immediately, the cores will begin to thermal throttle, and the risk of die failure becomes ever present. A larger die would dissipate this heat over more surface area and be less prone to localized delamination failure.

https://www.intechopen.com/source/html/30950/media/image16.jpeg

Above is an example of a delamination occurring at a solder bump to pad connection. The physical effects are similar in solder TIM.

Zeppelin Ryzen R7 and Skylake Quad GT2 Core i7 Die sizes to scale — With hotspots shown on right

In this picture I have attempted to generate a scaled image of the Zen and Skylake dies using measured estimates of die sizes (Skylake = 13.52 mm x 9.05 mm and Zen = 22.01 mm x 8.87 mm). The difference in die sizes is drastic, especially when you consider that the Core Group on the Skylake die makes up only half the die area and that the Ryzen die does not contain an integrated graphics unit.

Furthermore, highlighting the hottest portions of each core group/CCX shows how distributed the power densities are for Ryzen. At 122 mm², the Skylake Quad die is approximately 60% the size of the Zeppelin die. In fact the Zeppelin die is only slightly smaller (10%) than the Sandy Bridge die. With a smaller total die size and a more concentrated hot region, Skylake does not qualify as a good candidate for solder based TIM cooling. Perhaps Ryzen’s die size and more even heat distribution could still satisfy the requirements for being soldered. Going forward, I doubt any 7nm or small die size solutions will have indium solder TIM1. At the time of writing this piece, Ryzen 3 has not been launched and the package has not been analyzed. It will be quite interesting to see what unfolds.
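The area comparison can be reproduced with a few lines of arithmetic. The die dimensions are the measured estimates above. The 216 mm² figure for the quad core Sandy Bridge die is an assumption I am bringing in, not something measured here.

```python
# Die dimensions in millimetres, taken from the measured estimates in the text.
SKYLAKE_MM = (13.52, 9.05)
ZEPPELIN_MM = (22.01, 8.87)

# Assumption: commonly quoted area of the quad core Sandy Bridge die.
SANDY_BRIDGE_MM2 = 216.0


def die_area(dims_mm):
    """Return the die area in mm^2 from (width, height) in mm."""
    width, height = dims_mm
    return width * height


skylake_area = die_area(SKYLAKE_MM)
zeppelin_area = die_area(ZEPPELIN_MM)

skylake_vs_zeppelin = skylake_area / zeppelin_area
zeppelin_vs_sandy = zeppelin_area / SANDY_BRIDGE_MM2

print(f"Skylake:  {skylake_area:.1f} mm^2")
print(f"Zeppelin: {zeppelin_area:.1f} mm^2")
print(f"Skylake / Zeppelin:      {skylake_vs_zeppelin:.0%}")
print(f"Zeppelin / Sandy Bridge: {zeppelin_vs_sandy:.0%}")
```

This gives roughly 122 mm² and 195 mm², which puts Skylake at a little over 60% of Zeppelin and Zeppelin at about 90% of the assumed Sandy Bridge area.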

The TIM of the gaps argument

Taken from [4]

It is generally accepted that the move to paste based TIM1 since the Ivy Bridge days has lowered the performance of the chips by limiting overclocks due to TJ max. This has undoubtedly become a limiting factor in Kaby Lake and Skylake overclocks, and delidding has shown tremendous improvements (20 degrees Celsius or more at high OC levels). Der8auer has insisted that the reason for this is not necessarily the change to ‘cheap paste’, but that the glue holding the IHS to the substrate creates a large gap between the IHS and the die. This gap requires a larger volume of thermal paste and slows down the thermal transfer to the heatsink. Let us evaluate this claim:

Test setup of interest from [5] shows the cross section of the areas we discussed

There are a few important notes to consider when reading the results from [5], which is primarily concerned with the effect of TIM1 thickness, die size and the cup lid type IHS. Unfortunately the die sizes tested in this paper are much smaller than the Ryzen or Skylake dies (they are around the size of a Core Group/CCX), but the results should carry over nonetheless. The thickness of the TIM1 paste/solder is technically referred to as ‘Bond Line Thickness’ (BLT). The TIM1 paste itself is composed of a compound with thermally conductive fillers. Pastes come in a variety of forms that affect their:

1) Thermal resistance (which we will evaluate using Theta-JC junction to case thermal resistance)

2) Surface contact resistance

3) Mechanical Stiffness

4) Void formation susceptibility

All of these properties must be considered when choosing a TIM1 solution. A highly conductive paste, such as one with metallic fillers, loses its adhesive qualities. Mechanical stiffness and surface contact resistance affect its ability to fill microscopic air gaps and hold its position while being thermally cycled. There are several tradeoffs, and the choice isn’t the no-brainer that your neighbor Bob might be implying when he says “Intel uses cheap TIM.” Five TIM substances were chosen which varied by their thermal conductivity:

MAT1 = 0.2 W/m K

MAT2 = 0.8 W/m K

MAT3 = 2 W/m K

MAT4 = 10 W/m K

MAT5 = 20 W/m K

For reference, air has a thermal conductivity of about 0.02 W/m K.

The results of the CFD based simulation are:

Taken from [5]

Increasing the material thermal conductivity results in a lower thermal resistance, just as we would intuitively expect. However, as the conductivity increases, the bond line thickness of the paste plays a less significant role in the heat transfer. MAT4 and MAT5 behave almost the same and show almost no change with increasing thickness! Also note that a larger die has a much smaller Theta-JC, about 10x lower in this case, since larger dies allow for better heat distribution. A cup lid IHS allows for more permissive mechanical properties of the thermally conductive epoxy material. IHS thickness made little difference in the results.
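A rough way to see why thickness stops mattering is to treat the TIM1 layer as a flat slab with one-dimensional conduction, R = BLT / (k x A). The sketch below is a simplification, not a reproduction of the CFD model in [5]. The 1 cm² die area and the 50 um and 200 um thicknesses are assumed values, the function names are hypothetical, and heat spreading and contact resistance are ignored.

```python
# Thermal conductivities of the five simulated materials, in W/m K.
MATERIALS_W_PER_MK = {
    "MAT1": 0.2,
    "MAT2": 0.8,
    "MAT3": 2.0,
    "MAT4": 10.0,
    "MAT5": 20.0,
}

DIE_AREA_M2 = 1.0e-4  # assumed 1 cm^2 contact area


def slab_resistance(blt_m, k_w_per_mk, area_m2):
    """1-D conduction resistance of a flat layer, in K/W."""
    return blt_m / (k_w_per_mk * area_m2)


def blt_penalty(k_w_per_mk, thin_m=50e-6, thick_m=200e-6, area_m2=DIE_AREA_M2):
    """Extra resistance (K/W) from growing the layer from thin_m to thick_m."""
    thick = slab_resistance(thick_m, k_w_per_mk, area_m2)
    thin = slab_resistance(thin_m, k_w_per_mk, area_m2)
    return thick - thin


penalties = {name: blt_penalty(k) for name, k in MATERIALS_W_PER_MK.items()}

for name, extra in penalties.items():
    print(f"{name}: +{extra:.3f} K/W when BLT grows from 50 um to 200 um")
```

With these assumptions, quadrupling the thickness costs MAT1 an extra 7.5 K/W but MAT5 only 0.075 K/W. The penalty scales with 1/k, which is why the high conductivity curves look almost flat.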

Further tests on void formation were conducted and can be summarized as follows: an increase in Theta-JC of 10%, 17% and 23% is observed with voids of 5%, 10% and 15% respectively. A void, which usually only occurs with paste based TIMs, is quite literally an air pocket between the silicon die and the IHS cup lid. Air, being an excellent insulator, must be avoided like the plague when it comes to heat transfer. Void formation is greatly dependent on the mechanical properties of the paste.
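If you want a rough number for a void fraction between the reported points, a simple linear interpolation is enough. This is only a sketch: the (0%, 0%) starting point and the straight-line behaviour between points are my assumptions, and the function name is hypothetical.

```python
# (void percentage, Theta-JC increase in percent).
# The last three points are the values quoted above; (0, 0) is assumed.
VOID_DATA = [(0.0, 0.0), (5.0, 10.0), (10.0, 17.0), (15.0, 23.0)]


def theta_jc_increase_pct(void_pct):
    """Linearly interpolate the Theta-JC increase for a given void percentage."""
    lowest, highest = VOID_DATA[0][0], VOID_DATA[-1][0]
    if not (highest >= void_pct >= lowest):
        raise ValueError("void percentage is outside the 0-15% range covered")
    # Walk consecutive pairs of points and interpolate inside the matching one.
    for (x0, y0), (x1, y1) in zip(VOID_DATA, VOID_DATA[1:]):
        if x1 >= void_pct >= x0:
            return y0 + (y1 - y0) * (void_pct - x0) / (x1 - x0)


for pct in (2.5, 7.5, 12.5):
    print(f"{pct:4.1f}% voids -> about +{theta_jc_increase_pct(pct):.1f}% Theta-JC")
```

Note how the increase per percent of voiding is steepest for the first 5%, so even a small amount of voiding is worth avoiding.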

What is a good TIM?

In one study, a metric called TIM ‘contact quality’ is used to evaluate efficacy. The contact quality of a TIM, according to [6], can be represented by the equation (lower is better):

Thus to improve contact quality we can do 3 things:

1) Increase conductivity (k TIM term)

2) Reduce contact resistance Rc by filling the surfaces more smoothly

3) Reduce BLT by reducing bulk modulus of elasticity of the TIM
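The three levers above correspond to the usual one-dimensional way of writing the thermal resistance of a TIM layer. As a sketch, assuming the standard textbook form rather than quoting [6] verbatim, with R_c standing for the combined contact resistance at the two mating surfaces:

```latex
R_{\mathrm{TIM}} = \frac{\mathrm{BLT}}{k_{\mathrm{TIM}}} + R_{c}
```

For an assumed BLT of 100 um and an assumed k TIM of 4.5 W/m K, the bulk term BLT / k TIM works out to about 2.2 x 10^-5 m² K/W, or roughly 0.22 cm² K/W, before any contact resistance is added.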

The contact resistance due to surface smoothness is a constant factor independent of the TIM, so we can ignore it. So what range of conductivity do we expect out of a TIM grease, and what is an acceptable BLT or grease bulk modulus?

Taken from [8]

Grease BLTs are generally between 0.1 mm and 2 mm, and greases have been shown to provide Theta-TIM values that are competitive with PCM, gel and solder [6]. A research study in [7] reveals that the bulk modulus, a measure of how strongly a substance resists compression, changes based on the polymeric matrix chosen for the grease. The matrix is the base substance to which the high thermal conductivity filler is added. The chemical interaction between the filler and matrix must also be taken into account. A polymer matrix can be chemically modified to a high degree to achieve the required mechanical properties. Epoxy resins have a higher modulus and better adhesion, whereas silicones have a lower modulus and absorb stresses [8]. Even thermal pads, although solid, are a variation of this concept (matrix + filler). Although they are a cheaper solution, they suffer from lower wettability, lower adhesion strength and lower thermal conductivity, but they work fine for lower power ICs such as GPU VRAM chips.

Often, the materials chosen for any design must be adequate to maintain the desired performance level over the life cycle of the product. Essentially this boils down to a reliability issue. Traditionally, the filler materials used are ceramics such as alumina or magnesium oxide, because they conduct heat well but do not conduct electricity. Adding higher conductivity particles like aluminium nitride can raise cost 10x-100x for a performance improvement of 5x-10x [8].

Aluminum Nitride Powder is a cost effective filler for conductive greases

Conduction is half the battle

Thermal paste manufacturers advertise their products as better thermal conductors than the stuff you get on stock coolers. There is usually no mention of wettability, long term pump-out effects, performance consistency and so on. But conduction is not everything. Can your aftermarket paste survive the solder bakes that most TIM1 does?

In manufacturing, tolerances also play a strong role in how a device is put together. We previously discussed the stresses that the die experiences during manufacturing. Various pressure points can cause warping of the mated surfaces, which can affect the wettability [7]. Assembly is highly dependent on the material quality of the die and the substrate (this changes from generation to generation). New process nodes or substrate design changes can produce different levels of CTE warping effects.

What I’m trying to get at here is that perhaps the BLT spacing is consciously chosen to provide a safe level of tolerance against any possible warping. This would allow more products to pass QA. Perhaps the amount of filler used in the TIM1 grease is not up to par with aftermarket solutions. But heavy filler loading, putting large amounts of filler material into the matrix, can cause the grease to form large voids. This reminds me that manufacturers of GPUs sometimes apply a bit too much thermal paste. Based on [7], my guess would be that they are afraid of some sort of pump-out or void formation. Pump-out is a slow process that occurs after several (think hundreds of) thermal cycles.

This particular paper found that a 0–100 C power cycle conducted 7500 times results in a four to sixfold increase in thermal resistance compared to a 0–80 C cycle conducted 2500 times. Degradation grows exponentially with TIM temperature cycling.

A safe bet for a company that produces products designed for up to a 10 year life is to select materials that can:

1) Withstand stress effects of manufacturing (CTE based warping)

2) Withstand thermal cycling without severe performance degradation (delamination and voiding)

3) Keep a low level of product failure rates or return rates (grease pump-out, wetting failure and void formation)

Intel has a section in its support pages describing how to apply TIM2 and lists part number G15816–001 as approved for use. Googling this part number gives us the name Dow Corning TC-1996 Thermal Grease Metal Oxide Compound 0.5G. Dow Corning may well be the supplier for the TIM1 too, since their product catalog shows various silicone based grease options. A popular off the shelf paste like IC Diamond, which many use to replace the TIM when delidding, has a conductivity of 4.5 W/m-K. That is still far lower than the values at which BLT becomes insignificant, and it is similar to the options offered by Dow Corning.

The highest performance compounds do not necessarily have the highest conductivity but have the lowest thermal resistance, as Dow Corning mentions. Dow Corning’s advertised thermal resistances are far lower than those in the research papers I cited earlier, likely due to test setup differences. Even the BLT is much lower (100 um or less). The thermal reliability is also stable through 20,000 of their in-house test power cycles.

Conclusion

The thickness of the TIM1 matters if the TIM does not have high thermal conductivity. The thermal conductivity of OEM and aftermarket greases does not vary by much and is in fact relatively low. There might be slight improvements in performance from shortening the gap. Intel’s thermal solution is designed to meet more requirements than just performance; reliability and manufacturability are just as important from a profit standpoint. As die sizes become smaller, the ability to transfer the maximum amount of thermal energy from the surface of the hot die itself becomes the limitation long before the conduction of the TIM does. Surface chemistry is as important as, if not more important than, volume behaviour. There comes a point where die size alone disqualifies solder based TIM from being a viable solution. More research needs to be conducted in materials science to provide a synthetic solution that better meets all the requirements of a modern TIM.

Hope you learned something exploring this topic as I have. If there are any corrections to be made please notify me. If you happen to work in the package design industry, I’d love to have a chat. Thanks for reading.

[1] Materials Technologies for Thermomechanical Management of Organic Packages (2005)

[2] Performance Characteristics of IC Packages (2000)

[3] Evaluating the impact of scaling on temperature in FinFET-technology multicore processors (2014)

[4] The Truth about CPU Soldering — der8auer (2015)

[5] Interface Thermal Characteristics of flip chip packages — A numerical study (2009)

[6] Interface Material Selection and a Thermal Management Technique in Second-Generation Platforms Built on Intel Centrino Mobile Technology (2005)

[7] Thermal Performance Challenges from Silicon to Systems (2000)

[8] No More Silos Allowed for Thermal and EMI Engineers as Frequencies Rise — www.ecnmag.com (2010)

Like what you read? Give Damien Perrier a round of applause.