FutureBlade: 24 Nodes Per 3U, Standardized Expansion Modules, Water Cooling

Ersun Warncke
Published in Epycly
Feb 12, 2018 · 4 min read

Expanding available computing power relies on simultaneous advances across all facets of the computing infrastructure, including CPUs, memory, storage, and networking.

Whatever the best-in-class components may be at any given time, cloud computing providers face the additional challenge of packing as many of those components as possible into the smallest possible space, cooling them as efficiently as possible, and assembling them as cheaply as possible.

Some of the best high-density systems in the world are currently designed and manufactured by Supermicro.

https://www.supermicro.com/newsroom/news/DataCenter_MicroBlade.cfm

The Supermicro MicroBlade™ series packs 14 nodes into a 3U enclosure, enabling deployments of 196 nodes per standard 42U cabinet.
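That figure follows directly from the cabinet geometry; a quick check, using only the numbers in the sentence above:

```python
# Node density of the MicroBlade baseline: 14 nodes per 3U enclosure.
CABINET_U = 42
ENCLOSURE_U = 3
NODES_PER_ENCLOSURE = 14

enclosures_per_cabinet = CABINET_U // ENCLOSURE_U    # 14 enclosures
print(enclosures_per_cabinet * NODES_PER_ENCLOSURE)  # 196 nodes per cabinet
```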

For this project we took the current best-in-class server design and looked at how we could radically improve it.

The blade of the future will be modular, standardized, and water cooled.

In our design each module is 17.75mm wide, allowing 24 module stacks to be packed into a 3U enclosure.
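A quick packing check on that width (a sketch; the ~450 mm usable interior width of a standard 19-inch enclosure is my assumption, not a figure from the design):

```python
MODULE_WIDTH_MM = 17.75
STACKS_PER_3U = 24
USABLE_WIDTH_MM = 450  # assumed usable interior width of a 19" enclosure

used = MODULE_WIDTH_MM * STACKS_PER_3U
print(f"{used} mm of ~{USABLE_WIDTH_MM} mm")  # 426.0 mm, leaving ~24 mm for rails and manifolds

# Resulting rack density vs. the MicroBlade baseline:
print(STACKS_PER_3U * (42 // 3), "module stacks per 42U cabinet")  # 336, vs. 196 nodes
```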

Each module has a standardized interface that includes a water inlet, a water outlet, and an integrated I/O connector that carries both power and PCI-Express lanes.

The front and back connectors are inverted so that multiple modules can be chained together to combine different components.

Modules come in standard lengths of 300mm, 150mm, and 75mm (full, half, and quarter length) so that different-sized modules from different manufacturers can be mixed and matched.

All modules must provide the same standardized mechanisms for routing water, power, and PCI-Express communications.
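The contract described above can be sketched as a simple data model; the names, the 600 mm bay depth, and the validation rule are illustrative assumptions for the sketch, not part of any published specification:

```python
from dataclasses import dataclass

STANDARD_LENGTHS_MM = (300, 150, 75)  # full, half, quarter length

@dataclass
class Module:
    name: str
    length_mm: int  # must be one of the standard lengths
    # Every module exposes the same interface on each end: water inlet,
    # water outlet, and a combined power + PCI-Express connector, so any
    # module can chain to any other.

    def __post_init__(self):
        if self.length_mm not in STANDARD_LENGTHS_MM:
            raise ValueError(f"{self.name}: non-standard length {self.length_mm} mm")

def stack_length(modules: list[Module], bay_depth_mm: int = 600) -> int:
    """Check that a chained module stack fits an (assumed) 600 mm bay."""
    total = sum(m.length_mm for m in modules)
    if total > bay_depth_mm:
        raise ValueError(f"stack is {total} mm, exceeds {bay_depth_mm} mm bay")
    return total

# A CPU module in the center with expansion modules chained on either end:
stack = [Module("GPU", 150), Module("CPU", 300), Module("SSD", 75)]
print(stack_length(stack), "mm")  # 525 mm
```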

CPU Module

The CPU module sits in the center of a module stack and expansion modules can be attached on either end.

CPU modules must be highly integrated, with CPUs, RAM, and possibly even NAND attached directly to the board.

The outer packaging is a full-cover water block that encases the motherboard and provides cooling for all the components.

GPU Module

GPUs could be packed into half-length modules.

Like the CPU module, the module itself is a full-cover water block that provides cooling for the GPU and other components.

The GPU attaches to PCI-Express lanes provided by the CPU module and passes through any lanes it does not use.
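The pass-through behavior can be sketched as a lane-budget walk along the chain; the function and the lane counts are hypothetical, since the article does not specify how many lanes the CPU module exposes:

```python
# Each module in the chain consumes some PCI-Express lanes and forwards
# the rest to the next module. Lane counts are illustrative.
def allocate_lanes(available: int, chain: list[tuple[str, int]]) -> int:
    """Walk the chain outward from the CPU, consuming lanes per module."""
    for name, wanted in chain:
        used = min(wanted, available)
        available -= used
        print(f"{name}: uses x{used}, passes through x{available}")
    return available

# e.g. a CPU exposing 32 lanes on one side, chained to a GPU and an SSD module
allocate_lanes(32, [("GPU", 16), ("SSD", 4)])
# GPU: uses x16, passes through x16
# SSD: uses x4, passes through x12
```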

SSD Modules

Additional NAND storage can be packed onto standalone SSD modules.

A quarter-length module with NAND attached to both sides of the board could provide the capacity of 8 M.2 2280 SSDs.
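As a rough plausibility check on that capacity claim, compare board areas; the usable board dimensions inside a 3U-high, quarter-length module are my assumptions:

```python
# Board-area check for the 8x M.2 2280 claim (assumed board dimensions).
M2_2280_AREA_MM2 = 22 * 80   # one M.2 2280 footprint: 22 mm x 80 mm
BOARD_AREA_MM2 = 75 * 110    # assumed usable board area per side, quarter-length
both_sides = 2 * BOARD_AREA_MM2

print(both_sides / M2_2280_AREA_MM2)  # ~9.4 M.2 2280 footprints of board area,
# consistent with ~8 drives' worth of NAND once controllers and traces take space
```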

Enclosure

The enclosure provides the water, power, and networking required by each module stack.

Data centers deploying these systems would use DC power distribution, so large power supplies would not be needed.

Data center water distribution systems would pipe pressurized chilled water to each cabinet and provide outlet pipes for the return flow.
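To give a feel for the plumbing, here is a back-of-the-envelope flow estimate using the standard Q = ṁ·c·ΔT relation; the per-stack heat load and temperature rise are assumed, not figures from the design:

```python
# Rough chilled-water flow estimate for one 3U enclosure (assumed numbers).
HEAT_LOAD_W = 24 * 500  # assume ~500 W per module stack, 24 stacks
C_WATER = 4186          # specific heat of water, J/(kg*K)
DELTA_T = 10            # assumed inlet-to-outlet temperature rise, K

flow_kg_s = HEAT_LOAD_W / (C_WATER * DELTA_T)  # Q = m_dot * c * dT
print(f"{flow_kg_s:.2f} kg/s ≈ {flow_kg_s * 60:.0f} L/min")  # ~0.29 kg/s ≈ 17 L/min
```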

Standardization

Standardization is essential for the continued scaling of cloud computing operations.

Once the standards for modules are in place, multiple companies can manufacture the module components, and any company can build a specific module that integrates with the rest of the ecosystem.

The standards proposed here for module size, interconnection, and water cooling are flexible and extensible enough that they could last for decades while the internal components and communication interfaces are switched out over time.

Conclusion

What is proposed here is a system that can increase density 4–5X over current best-in-class systems, make those systems dramatically easier to assemble, and reduce barriers to entry for manufacturers building components for those systems.

Hyperscale cloud computing is currently highly proprietary and consequently inefficient. Like all industries, hyperscale computing must standardize and commoditize in order to mature.

Standardization and commoditization will drive down costs dramatically by eliminating barriers to entry.

Dramatically increasing density means that even if you don’t have the absolute best components, you can still pack more of them into the same space and get greater effective compute capacity.

Increased density along with standardization and commoditization can increase compute power per dollar several times over and establish a trajectory for continued growth.
