Understanding ProgPoW

Performance and Tuning

Overview

ProgPoW, short for ‘Programmatic Proof-of-Work’ (sometimes colloquially called PorgyPoW, after the eponymous Porg from Star Wars VIII: The Last Jedi), is a GPU-tuned extension of Ethash that minimizes the efficiency advantage available to fixed-function hardware.

Design

“Historically, proof-of-work mining has taken a fixed algorithm and modified the hardware to be ‘efficient’ at executing the algorithm. With ProgPoW, we’ve put this paradigm in reverse: we have taken hardware, and modified the algorithm to match it.”

For an algorithm to run efficiently on a piece of hardware, it needs to match the access patterns and available space of that hardware. This is why AMD GPUs with firmware edits saw large performance gains on Ethereum: the edits matched the access patterns of the memory chips to the access patterns of Ethash.
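To make the idea of matching access patterns concrete, here is a minimal sketch of how much fetched data is wasted when an algorithm's read size does not line up with the size the memory system transfers per request. The function names and the burst sizes in the usage note are hypothetical illustrations, not measured figures for any card discussed here, and the sketch assumes aligned accesses.

```cpp
#include <cstdint>

// Bytes the memory system actually moves for one aligned request, assuming
// (hypothetically) that it can only transfer whole bursts of `burst_bytes`.
constexpr std::uint64_t bytes_transferred(std::uint64_t access_bytes,
                                          std::uint64_t burst_bytes) {
    // Round the request up to the next multiple of the burst size.
    return (access_bytes + burst_bytes - 1) / burst_bytes * burst_bytes;
}

// Fraction of the transferred bytes that the algorithm actually asked for.
// 1.0 means no bandwidth is wasted; 0.5 means half of every fetch is unused.
constexpr double useful_fraction(std::uint64_t access_bytes,
                                 std::uint64_t burst_bytes) {
    return static_cast<double>(access_bytes) /
           static_cast<double>(bytes_transferred(access_bytes, burst_bytes));
}
```

Under these assumptions, a 128-byte read against a hypothetical 256-byte burst gives useful_fraction(128, 256) == 0.5, while the same read against a 64-byte burst gives 1.0.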

A GTX 1070 Executing Ethash — Taken From NSight Profiler
A RX 580 Executing Ethash — Taken from CodeXL
A Titan X (Pascal), similar to a 1080Ti, is extremely inefficient at Ethash due to 128-byte reads.
A GTX 1070 executing Ethash, with Keccak removed.
Summarized from “gtx1070-ethash-source.csv”
  • A keccak engine.
  • A small compute core to do inner loop FNV and modulo operations.
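The "inner loop FNV and modulo operations" in the list above can be sketched as follows. This is a simplified illustration based on the public Ethash specification, not code from this post: the FNV prime and the multiply-then-XOR form come from that specification, while next_page and its parameter names are hypothetical helpers introduced here.

```cpp
#include <cstdint>

// FNV prime used by Ethash's mixing function.
constexpr std::uint32_t FNV_PRIME = 0x01000193;

// Ethash's FNV-style mix: multiply by the prime, then XOR in the second
// operand. Unsigned arithmetic wraps modulo 2^32, as the algorithm expects.
constexpr std::uint32_t fnv(std::uint32_t a, std::uint32_t b) {
    return (a * FNV_PRIME) ^ b;
}

// Hypothetical helper: choose which DAG page the inner loop reads next.
// `round` is the loop counter, `seed_word` the first word of the seed hash,
// `mix_word` one word of the current mix, `num_pages` the number of pages.
constexpr std::uint32_t next_page(std::uint32_t round, std::uint32_t seed_word,
                                  std::uint32_t mix_word,
                                  std::uint32_t num_pages) {
    return fnv(round ^ seed_word, mix_word) % num_pages;
}
```

Each round costs one multiply, a couple of XORs and a modulo, which is why a small compute core is enough for this part of the algorithm.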
ProgPow.h

Results: GDDR5

With the above settings, ProgPoW is able to saturate both compute (the SMs) and memory bandwidth at once.

A GTX 1070 executing ProgPoW.
A GTX 1070’s shared memory utilization executing ProgPoW.
A RX 580 executing ProgPoW.
A GTX 1070 executing ProgPoW, with Keccak and KISS99 removed.
Summarized from “gtx1070-progpow-source.csv”
  • A compute core with a large register file.
  • A compute core with high-throughput integer math.
  • A high-throughput, highly banked cache.
  • Small Keccak + KISS99 engines.
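For readers unfamiliar with KISS99, here is a minimal sketch of the generator as George Marsaglia published it in 1999: two multiply-with-carry streams, a three-step xorshift and a linear congruential generator, combined with XOR and addition. The Kiss99 struct name is an assumption made for illustration, and how ProgPoW seeds and consumes the generator is not shown.

```cpp
#include <cstdint>

// State of the four sub-generators (struct name is illustrative).
struct Kiss99 {
    std::uint32_t z, w, jsr, jcong;
};

// Advance the state once and return the next 32-bit value.
inline std::uint32_t kiss99(Kiss99 &st) {
    // Two 16-bit multiply-with-carry generators.
    st.z = 36969 * (st.z & 65535) + (st.z >> 16);
    st.w = 18000 * (st.w & 65535) + (st.w >> 16);
    const std::uint32_t mwc = (st.z << 16) + st.w;
    // Three-step xorshift register.
    st.jsr ^= st.jsr << 17;
    st.jsr ^= st.jsr >> 13;
    st.jsr ^= st.jsr << 5;
    // Linear congruential generator.
    st.jcong = 69069 * st.jcong + 1234567;
    // Combine the three streams.
    return (mwc ^ st.jcong) + st.jsr;
}
```

Starting from the state {362436069, 521288629, 123456789, 380116160}, the first call returns 769445856. The whole generator is a handful of shifts, multiplies and XORs on four 32-bit words, which is why the hardware needed for it is small.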

Results: GDDR6

We also managed to get our hands on an RTX 2080, which has GDDR6 memory, for initial benchmarking. The CUDA profiler does not fully support the new Turing chip, so a number of performance metrics (including framebuffer utilization) are listed as 0.

RTX 2080 executing ProgPoW. Note that the Memory% does not correspond to FB%, unlike previous images.
RTX 2080 shared memory utilization executing ProgPoW.

Results: Hashrate

All cards were tested at stock frequencies.

Performance Profiling Reports

The following profiler images are attached in .png format:

AMD | NVIDIA

Note: You will need to zoom in to clearly view the data.

We are the team behind ProgPoW, a GPU-tuned extension of Ethash.
