```
// Random math between two input values
uint32_t Math(uint32_t a, uint32_t b, uint32_t r)
{
    switch (r % 11)
    {
        case 0: return a + b;
        case 1: return a * b;
        case 2: return mul_hi(a, b);
        case 3: return min(a, b);
        case 4: return ROTL32(a, b);
        case 5: return ROTR32(a, b);
        case 6: return a & b;
        case 7: return a | b;
        case 8: return a ^ b;
        case 9: return clz(a) + clz(b);
        case 10: return popcount(a) + popcount(b);
    }
}
```
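The Math() listing relies on OpenCL-style built-ins (`mul_hi`, `clz`, `popcount`) and rotate macros that are not defined in the text. The following is a minimal sketch in plain C of what those helpers could look like, so the function can be compiled and experimented with on a CPU. The names `math_sketch`, `rotl32`, `rotr32`, `clz32`, `popcount32`, and `min_u32` are hypothetical, and the choices to mask the rotate amount to 5 bits and to return 32 for `clz(0)` are assumptions modeled on OpenCL semantics, not something the source states.

```c
#include <stdint.h>

/* Upper 32 bits of the 64-bit product (assumed to match OpenCL mul_hi). */
static uint32_t mul_hi(uint32_t a, uint32_t b) {
    return (uint32_t)(((uint64_t)a * b) >> 32);
}

static uint32_t min_u32(uint32_t a, uint32_t b) {
    return a < b ? a : b;
}

/* Rotates use only the low 5 bits of the amount (assumption). */
static uint32_t rotl32(uint32_t x, uint32_t n) {
    n &= 31;
    return (x << n) | (x >> ((32 - n) & 31));
}

static uint32_t rotr32(uint32_t x, uint32_t n) {
    n &= 31;
    return (x >> n) | (x << ((32 - n) & 31));
}

/* Count leading zeros; returns 32 for x == 0 (assumption). */
static uint32_t clz32(uint32_t x) {
    uint32_t n = 0;
    if (x == 0) return 32;
    while (!(x & 0x80000000u)) { x <<= 1; n++; }
    return n;
}

/* Count set bits by clearing the lowest set bit each iteration. */
static uint32_t popcount32(uint32_t x) {
    uint32_t n = 0;
    while (x) { x &= x - 1; n++; }
    return n;
}

/* Same case table as Math(), with a default so every path returns. */
uint32_t math_sketch(uint32_t a, uint32_t b, uint32_t r) {
    switch (r % 11) {
        case 0:  return a + b;
        case 1:  return a * b;
        case 2:  return mul_hi(a, b);
        case 3:  return min_u32(a, b);
        case 4:  return rotl32(a, b);
        case 5:  return rotr32(a, b);
        case 6:  return a & b;
        case 7:  return a | b;
        case 8:  return a ^ b;
        case 9:  return clz32(a) + clz32(b);
        case 10: return popcount32(a) + popcount32(b);
        default: return 0; /* unreachable: r % 11 is always 0..10 */
    }
}
```

The loops in `clz32` and `popcount32` are written for readability; they say nothing about how the hardware would implement these operations.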
• 32-bit adder: simple logic, ~300 gates for a fast one.
• 32-bit multiplier: mature IP, ~20K gates for a fast one. Since the multiplier only has a ~4/11 activity rate (multiply, mul_hi, and the two rotations), we can use a two-cycle multiplier to halve the area, with a small chance of increasing delay.
• Rotation maps easily onto the multiplier. For example, to calculate ROTL(0x12345678, 8), compute 0x12345678 * 0x00000100 = 0x0000001234567800, then OR the upper and lower 32-bit words together to get 0x34567812. This costs only ~160 gates of extra logic.
• Logic operations: A & B costs only 32 gates, A | B 32 gates, and A ^ B 96 gates. They look like three different instructions but are extremely small on silicon (<30 µm²).
• clz and popcount are also very small.
• Beyond that, we only need a multiplexer to select the output.
• The total size of Math() is about 0.0015 mm² on the TSMC16ULP process.
• Merge() is similar but even smaller: only a shifter, an adder, and a little logic (no multiplier, because multiplication by a constant can be mapped onto adders).
• The size of Merge() is roughly 0.0005 mm².
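The rotation-through-multiplier trick from the list above can be checked in software. This is a minimal sketch in C, assuming the hardware multiplier produces the full 64-bit product and that the rotate amount is first decoded into a power of two. The function names `rotl32_via_mul` and `rotr32_via_mul` are hypothetical, and masking the amount to 5 bits is an assumption.

```c
#include <stdint.h>

/* Rotate left by multiplying with 2^n and ORing the two product halves.
 * The bits shifted out of the low word land in the high word, so the OR
 * brings them back around. */
static uint32_t rotl32_via_mul(uint32_t x, uint32_t n) {
    uint32_t power_of_two = (uint32_t)1 << (n & 31);  /* decoded rotate amount */
    uint64_t product = (uint64_t)x * power_of_two;    /* full 64-bit product */
    uint32_t high = (uint32_t)(product >> 32);
    uint32_t low = (uint32_t)product;
    return high | low;
}

/* Rotate right by n is rotate left by (32 - n) mod 32. */
static uint32_t rotr32_via_mul(uint32_t x, uint32_t n) {
    return rotl32_via_mul(x, (32 - (n & 31)) & 31);
}
```

With the example from the text, `rotl32_via_mul(0x12345678, 8)` forms the product 0x0000001234567800 and ORs 0x00000012 with 0x34567800. One plausible reading of the ~160 gates of extra logic is the amount decoder plus the 32 OR gates, but the source does not break that figure down.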

• 0.55 V supply voltage (typical for TSMC16ULP)
• ~10T Math() + Merge() operations per second
• Power estimate: roughly 3 mW per pipeline, 30 W in total
• No custom circuits or layout
• All standard cells, automatic place and route
• Mature IP only
• No aggressive overclocking
• No aggressive undervolting
• ~8.4K Merge() and ~4.8K Math() calls per hash
• A pipeline includes 1 Merge() and 1 Math().
• The pipeline is only the logic part, not the memory part.
• The xGB of memory is not included.
• With 10T throughput on a single-chip ASIC, dividing by 8.4K Merge() per hash gives about 1.2 GHash/s.
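The arithmetic behind these figures can be written out as a short sketch in C. Only the input numbers come from the text. The pipeline count (total power divided by per-pipeline power) and the total logic area (pipeline count times the Math() plus Merge() area) are derived here as assumptions about how the estimates fit together, and all function names are hypothetical.

```c
/* Number of pipelines implied by the power budget (derived, not stated). */
static double pipeline_count(double total_power_w, double pipeline_power_w) {
    return total_power_w / pipeline_power_w;
}

/* Hash rate when each hash needs merges_per_hash pipeline iterations. */
static double hashes_per_second(double ops_per_second, double merges_per_hash) {
    return ops_per_second / merges_per_hash;
}

/* Logic-only area, assuming one Math() and one Merge() block per pipeline. */
static double logic_area_mm2(double pipelines, double math_mm2, double merge_mm2) {
    return pipelines * (math_mm2 + merge_mm2);
}

/* Iterations each pipeline must complete per second to reach the total. */
static double per_pipeline_rate(double ops_per_second, double pipelines) {
    return ops_per_second / pipelines;
}
```

Plugging in the figures from the text, 30 W at 3 mW per pipeline implies about 10,000 pipelines, and 10T operations per second over 8.4K Merge() per hash gives roughly 1.19 GHash/s. Under the same assumptions the logic area comes to about 20 mm² and each pipeline would need about 1G iterations per second. Neither of those last two figures is stated in the source.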
