Comprehensive ProgPoW Benchmark
By a miner, for the miners!
As of the last developers call on March 15, 2019, and per r/Ethereum, ProgPoW has been confirmed for Ethereum. As of 03/26/2019 no hard-fork date is set (Update: ProgPoW is scheduled for Istanbul) and the audit by Least Authority is still pending. As an Ethereum miner, and a GPU miner overall, I spent my own time conducting a comprehensive benchmark of ProgPoW. In doing so, I hope I can help the Ethereum community and the mining community at large. I previously benchmarked ProgPoW 0.9.2; this time I benchmarked 0.9.2 against 0.9.3 with more graphics cards and updated findings. So let us begin with the dull stuff.
The Test Bench
1095T 6-core AMD Phenom II
12GB DDR3 RAM
ASUS CROSSHAIR IV FORMULA
160GB RAPTORS in raid 0
EVGA 650w Platinum
The GPUs tested
Gigabyte RX460 4GB GDDR5 (Unlocked Shaders Mod)
MSI RX470 8GB GDDR5 Armor
XFX RX480 8GB GDDR5 GTR
Sapphire RX580 8GB GDDR5 Nitro+
Gigabyte Vega 64 8GB HBM
Zotac GTX1060 6GB GDDR5 SFF
EVGA GTX1070 8GB GDDR5 SC
EVGA GTX1070ti 8GB GDDR5 FTW
PNY GTX1080ti 11GB GDDR5X Blower
EVGA GTX1660ti 6GB GDDR6 SSC
MSI RTX2080ti 11GB GDDR6 Duke
Windows 10 v1809
About the GPUs
I want to be extremely clear here that GPUs vary from manufacturer to manufacturer. An example would be the EVGA GTX1070ti and the MSI GTX1070ti. While they use the same Pascal GPU, their designs, voltages, and overclocking behavior vary greatly. The same goes for AMD: a reference RX480 performs very differently from XFX’s custom RX480 design. It’s impossible to know exactly what each GPU will do. Some will do better and others worse, even on the exact same architecture. Additionally, graphics cards ship with memory from different vendors such as Samsung, Micron, or Hynix. This can all play into performance in ProgPoW or Ethash.
How I tested ProgPoW Performance
I built the miner from Andrea Lanfranchi’s GitHub repo of Ethminer v18 alpha with ProgPoW added. I self-compiled the ProgPoW 0.9.2 spec from the master branch and 0.9.3 from his experiment-knob branch. He has since closed his miner repo due to the lack of funding for an open-source miner.
Original link here https://github.com/AndreaLanfranchi/ethminer
For the Ethash values in the graphs, I used Ethminer v18. I also tested with Claymore v12 for comparison purposes.
Hashrate was measured via benchmark mode.
Program Benchmark Settings
ethminer.exe -M 7169430 -A progpow -diff 5 -HWMON 2
ethminer.exe -M 7169430 -diff 5 -HWMON 2
Claymore v12 Benchmark -240
Block #7169430 (Epoch 240~) was chosen to match the actual Ethereum block height at the time, for a real-world simulation.
Hashrate can vary from period to period, which means every 50 blocks (for 0.9.2) or every 10 blocks (for 0.9.3). This is because every period generates a completely new hashing kernel. If you use a different block height than mine, your results will not be the same.
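To see why the block height matters, here is a minimal Python sketch of how a block number maps to a ProgPoW period. It assumes the period index is simply the block number integer-divided by the period length (50 for 0.9.2, 10 for 0.9.3). The function names are illustrative and are not taken from Ethminer.

```python
def progpow_period(block_number: int, period_length: int) -> int:
    """Index of the ProgPoW period (and thus the kernel) a block falls into."""
    return block_number // period_length


def blocks_left_in_period(block_number: int, period_length: int) -> int:
    """Blocks remaining, including this one, before a new kernel is generated."""
    return period_length - (block_number % period_length)


if __name__ == "__main__":
    block = 7169430  # benchmark block height used in this article
    for spec, length in (("0.9.2", 50), ("0.9.3", 10)):
        print(spec, progpow_period(block, length), blocks_left_in_period(block, length))
```

Two block heights that land in different periods run different kernels, so their benchmark numbers are not directly comparable.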
ProgPoW 0.9.2 vs 0.9.3
The ifdefelse team released an updated spec, ProgPoW 0.9.3, which slightly reduced the core requirements, giving a small hashrate boost to both AMD and Nvidia GPUs.
PROGPOW_PERIOD: 50 -> 10
PROGPOW_CNT_CACHE: 12 -> 11
PROGPOW_CNT_MATH: 20 -> 18
Andrea Lanfranchi, one of the developers of Ethminer, has said: “0.9.3. is more performant and reduces the period to 10 blocks (2 minutes) thus making FPGAs bitstream compilation that much harder.”
Benchmark testing the GPUs
Each GPU was plugged in via a 6-pin powered riser (v008c). The GPU and riser were completely separated from the system and powered by a Platinum-rated power supply for the most accurate results. GPU wattage was taken from software readings, including Ethminer, GPU-Z, and NVinspector. Wattage from the wall was measured with a wattmeter (120v/110amps). Note that in the charts below, wattage from the wall differs from the card’s TDP because it includes idle wattage: the draw of the PSU, riser, and GPU at idle. For most of the GPUs this equals 30 watts, give or take.
To pinpoint the wattage a GPU actually uses, or its real TDP, I’m using what I call ‘true wattage’: the wall wattage reading minus the idle wattage. For example, the RX470 shows 165w from the wall in the charts; to get ‘true wattage’ I simply take (165w - 30w) = 135w, close to its actual rated TDP.
Note: The RX460/GTX1060/GTX1660ti idle wattage was recorded at 20w.
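The ‘true wattage’ calculation is simple enough to sketch in a few lines of Python. The function name and the default idle value are illustrative; the 165w and 30w figures are the RX470 example from the text.

```python
def true_watts(wall_watts: float, idle_watts: float = 30.0) -> float:
    """Wall reading minus the idle draw of the PSU, riser, and GPU."""
    if wall_watts < idle_watts:
        raise ValueError("wall reading should not be below the idle reading")
    return wall_watts - idle_watts


print(true_watts(165, 30))  # RX470 example from the text
```

For the RX460/GTX1060/GTX1660ti, pass idle_watts=20 instead of the default.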
AMD BIOS-modded cards
According to Kristy, aka Ohgodagirl, the BIOS on AMD RX series cards should be left at stock for best performance. However, I’m going to ignore her advice because I know many miners have hundreds of AMD GPUs with modified BIOSes, and many don’t want to take the time to unmodify them. It’s best to see how those cards would perform in a switch from Ethash to ProgPoW.
To BIOS mod I used PolarisBiosEditor (PBE) Ubermix and overclocked accordingly. In my last benchmark I tested non-modded cards. This time I’m going to BIOS mod the RX470/480/580 and run them through both Ethereum and ProgPoW. Let’s see what happens.
With AMD BIOS modding, everyone has different BIOS mods, some allowing for better Ethash rates. So please consider my BIOS mods a best-effort approach, like the one most miners would take.
With AMD cards it’s the GPU lottery. Hashrates vary depending on factors such as memory, cooling, and brand. An RX470 or RX580 could well reach beyond 30mh/s. The same goes for Nvidia GPUs. I’ve tested Ethash at these speeds, but I don’t consider them fully stable; they are merely for testing purposes.
Vega 64 GPU
I borrowed a Vega 64 from a friend and tested it on ProgPoW. Vegas require far too much fine-tuning to get working with the finicky AMD Windows 10 drivers. Linux may prove better suited here; however, I wasn’t able to test it since I only had the card for a couple of hours.
I’m unsure what the issue with this Vega 64 was, but I could not get it to run at stock speeds when mining, which impacted results for both ProgPoW and Ethereum. I tried the 18.4.1, 18.6.1, and 19.3.1 drivers, plus OverdriveNtool, Wattman, etc. I followed some XMR guides and tried registry mods with no luck.
The GPU was stuck at 1000mhz core/800mhz HBM. I was able to trick it into 1280/945, but I couldn’t get a fair wattage reading, since even that required turning up +50 freq in driver 19.3.1. This doesn’t represent the full stock speeds of Vega; the best I was able to get was close to 25mh/s with 0.9.3.
It would seem Vegas will be excellent at ProgPoW, matching the GTX1080ti at stock and, I’m sure, drawing low power with some tuning. I cannot directly confirm this. Because of this I only include the Vega 64 in the ProgPoW spec and value comparisons, nothing else.
Overclocking and Undervolting GPUs
There are plenty of ways to overclock and undervolt GPUs. MSI Afterburner is the most popular tool for both AMD and Nvidia. When benching AMD GPUs it took a bit of trickery to actually achieve the correct core speeds; I’ve written a Medium article about it here. It is similar with Nvidia: Nvidia GPUs start off at full core speed when creating the DAG and then drop, although this doesn’t impact Nvidia as much as it does AMD.
For my benchmarks I used direct control tools: OverdriveNtool for AMD and NVinspector for Nvidia. These programs let me directly control voltages and core speeds, allowing better overclocks and greater efficiency.
- AMD GPUs were tested at stock, then with an Ethereum overclock/undervolt. Memory/core was tuned towards Ethash as best as possible.
- Nvidia GPUs were tested at stock, then at varying voltages for the Ethereum overclock/undervolt. I picked 800mv for most higher-end GPUs and 700mv for the lower end, as it represented a decent medium for Ethereum wattage on each corresponding GPU.
- For the GTX1080ti I used the ETH-PILL created by ifdefelse to achieve the higher Ethereum hashrates most miners are aware of.
- Compute mode was enabled for AMD RX series GPUs.
- My full results, documents, graphs, and pictures will be posted at the end for those who are interested.
Those of you who benchmark using just MSI Afterburner will see very different results than mine. As I stated previously: “…using MSI afterburner to control power % of TDP showed problems. The difference from 70% power and 80% was significant, with 70% power in ProgPow resulting in low power but far lower hashrate. Using power percent of 80–100% showed normal results for both the 1080ti and 1070. Because of this I used NVinspector and set voltages which locked the core.”
Why test stock when no one runs stock GPUs when mining? The answer is a baseline. Testing anything requires establishing a baseline as a starting point. From there, assumptions and conclusions can be drawn, along with how best to adjust. I cannot stress enough that stock benchmarks should be weighted most heavily, as non-negotiable values, since every GPU performs differently once you start applying overclocks (YMMV).
ProgPoW 0.9.3 vs 0.9.2
Starting with stock benchmarks, let’s see how the two specifications compare. From what I observed, hashrate gains show up across the entire range of GPUs. Looking at stock results, AMD GPUs gain about 1mh/s, or about an 8% increase in speed, across the board.
Stock Clock speeds
RX460 1151/1750Mhz 1131mv/850mv
RX470 1230/1650Mhz 968mv/950mv
RX480 1266/2000Mhz 1150mv/1000mv
RX580 1430/2150Mhz 1150mv/1000mv
Vega 64 1287/945Mhz
With Nvidia’s GPUs the gains are less pronounced, with 3% gained on the GTX1070s and 6% on the GTX1060. The exception is the Turing-based GPUs, which gained 9%.
Stock Clock speeds
GTX1060 1696/4000Mhz (900mv)
GTX1660ti 1875/6000Mhz (950mv)
GTX1070 1708/4000Mhz (850mv)
GTX1070ti 1888/4000Mhz (900mv)
GTX1080ti 1556/5500Mhz (850mv)
RTX2080ti 1650/7000Mhz (850mv)
From now on, all ProgPoW benchmark numbers are done with the 0.9.3 spec.
For Ethereum hashrates, Ethminer was used in benchmark mode. Please note this is not a direct comparison! This is merely what to expect from graphics cards when Ethereum moves from Ethash to ProgPoW.
Looking at the difference in hashrates on both AMD and Nvidia, we see about a 50% reduction in performance across the board. The exceptions are the higher-end GPUs: the 1080ti at 28% and the 2080ti at 31%.
Before we get into the wattage charts, which are extremely important for miners, note that it’s quite hard to gauge wattage values due to the many variables. For AMD it’s trickier than for Nvidia, because Nvidia’s software readings represent true values for the GPU as a whole, while AMD’s software readings only show partial values.
Take note that custom designs can far exceed normal GPU TDP ratings because manufacturers force more power through the cards, and AMD’s GPUs are designed differently than Nvidia’s. Nvidia has a “power limiter”, which varies depending on the GPU and board partner. Overclocking allows higher limits if set through software (e.g., MSI Afterburner at 110% power). AMD’s voltage/power limits can be programmed directly into the BIOS, allowing higher voltage at stock speeds. Additional tuning will be up to the end user, the miner, to adjust their GPUs accordingly.
For the GPU wattage charts, wattage was taken from software: GPU-Z/Ethminer/NVinspector.
Wattage Taken from wall.
As you can see from the wall readings compared to the software readings, AMD has far less accurate results. The RX480 and RX580 are custom designs built to overclock beyond normal reference designs, which means they far exceed normal TDP ratings; AMD’s TDP rating for the RX480 is 150w and for the RX580 is 180w. For the RX480, this results in a 19% increase in power from the wall under ProgPoW. Nvidia’s readings are considerably more accurate. The GTX1060 has a 15% increase in power, putting the RX480 and GTX1060 at roughly the same power increase for ProgPoW.
Software cannot accurately measure AMD cards’ power consumption in a way that would also apply to Nvidia’s and give a fair assessment across the board. So I subtracted idle wattage (varying from 20w–40w) from wall wattage (Wall - Idle = ‘TrueWatts’), then divided by hashrate (TrueWatts/Hashrate = Ratio). Stock values will be our baseline to compare how much tuning affects power on AMD and Nvidia.
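As a rough illustration of the ratio described above, here is a small Python sketch. The wall and idle figures reuse the RX470 example from earlier in the article; the 12.0mh/s hashrate is a placeholder chosen only to make the arithmetic easy to follow, not a measured result, and the function name is illustrative.

```python
def watts_per_mh(wall_watts: float, idle_watts: float, hashrate_mh: float) -> float:
    """TrueWatts / Hashrate: watts spent per mh/s (lower is better)."""
    if hashrate_mh <= 0:
        raise ValueError("hashrate must be positive")
    return (wall_watts - idle_watts) / hashrate_mh


# 165w wall and 30w idle come from the text; 12.0 mh/s is a placeholder value.
print(watts_per_mh(wall_watts=165, idle_watts=30, hashrate_mh=12.0))
```

Because the idle draw is removed before dividing, cards measured with different risers or PSUs can still be compared on roughly equal footing.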
The RX480/580 show poor results, as they are custom board partner designs with high wattage. They are right in line with each other in both Ethash and ProgPoW, while the RX470, with its non-custom design, puts up the best showing.
With Nvidia, it becomes clear that the newest Nvidia GPUs, Turing, make excellent gains in performance and wattage.
I took what I consider an “Ethash OC” and benchmarked it. These settings use the lower voltage, lower core, and higher memory clocks that many miners are currently running for Ethereum. This gives a fairly decent picture of the speeds to expect when Ethereum switches over to ProgPoW.
Clock speeds used
RX460 1151/1750Mhz 1131mv/850mv
RX470 1230/2150Mhz 900mv/900mv
RX480 1200/2150Mhz 900mv/900mv
RX580 1200/2200Mhz 900mv/900mv
I know many don’t use Ethminer (even though it’s free and open source) and that Claymore is the most used miner out there. So for comparison’s sake, I tested Claymore against ProgPoW.
Something to note here is that since the RX470’s stock core speed is already a fairly low 1230mhz, I left it alone and just lowered the voltage accordingly. For all AMD GPUs, based on my testing, my recommendation is to set the voltage as low as possible and increase the core as much as possible. Memory speeds for AMD didn’t seem to matter as much; anything around 2000mhz GDDR5 was perfectly fine.
Moving from stock benchmarks to overclock/undervolt benchmarks, we see the AMD cards fall into line. Since the RX580 is nothing more than a refresh of the RX480, similar clock speeds provide zero advantage. I’d hold the same assumption for the RX470/570. The RX580’s only benefit is its higher core clock speeds, but that comes at a high cost in power.
The RX590, which I tested previously, showed the same results. I saw no point in retesting the RX590, since once we downclock to save power, all gains are null. The differences will come down to miners and how much extra power they are willing to trade for higher core clocks and the megahash gains they bring.
Clock speeds used
GTX1060 1328/4500Mhz (700mv)
GTX1660ti 1455/6800Mhz (700mv)
GTX1070 1608/4400Mhz (800mv)
GTX1070ti 1600/4500Mhz (800mv)
GTX1080ti 1595/6000Mhz (800mv)
RTX2080ti 1500/7500Mhz (725mv)
Moving on to Nvidia, you’ll notice the hashrate gains aren’t as pronounced, as Claymore and Ethminer use the same CUDA implementation.
The GTX1660ti performs remarkably well considering its price, with performance far outstripping the GTX1060 it has replaced. The GTX1070 and GTX1070ti both hold up well, achieving roughly half of their Ethereum performance. At the high end you get what you pay for: the best performance in its class.
The RX480 and RX580 fall directly in line, achieving similar watts due to the exact same overclock/undervolt. We see the RX580 becoming ‘tame’, with its 262w wall power draw coming down to a more realistic 150w.
You’ll notice that Ethash and ProgPoW wattage are extremely close. Again, since we’re limiting power, it only has so far to go. Taking percentages into account, the RX480 increased about 10% and the RX470 about 9% from Ethash to ProgPoW at overclock/undervolt. By taking power-saving measures we saved about 8% off the stock wall readings. This means there is only a minor increase in power for AMD GPUs when switching to ProgPoW.
For Nvidia, switching from Ethash to ProgPoW shows a 33% increase for the GTX1070/1070ti. The GTX1060/1660ti show almost no increase due to the fairly low voltage/TDP of the cards. While you can definitely go lower on the millivolts for the GTX1070/1070ti to cut down on wattage, you’ll lose performance on ProgPoW. Allowing Nvidia cards unfettered voltage, similar to AMD, means excellent performance at the cost of power. In the end it all comes down to individual miners and their GPU settings.
I want to reiterate that overclocking and undervolting results will vary! Are there better configurations? More than likely. I chose what I felt gave the best representation of hashrate and wattage, and what I’d run my farm at daily. Others will select what best suits them.
Watt to hash
After some voltage tweaks and overclocking let’s see what gains we had.
After some tuning, we see performance improvements for both ProgPoW and Ethereum. The RX580 is the biggest gainer with a .03 improvement (.06 stock). Again, since for Ethash the RX470/480/580 are essentially the same thing, a glorified memory controller, they all fall within range of each other.
Nvidia’s power savings from more efficient GPUs really show here, in both Ethereum and ProgPoW, with the next-generation Turing taking the top hash/watt ratios.
Hashrate to Price
Here it’s going to be difficult to gauge real values using GPU prices. This is because the RX470/480 and Nvidia’s entire Pascal line are in use in mining operations everywhere, but they’re EOL. I listed them at their original MSRP. For GPUs still in production (RX580, Vega 64, GTX1660ti, RTX2080ti), I used current prices from e-tailers like Amazon and AMD's website.
RX460 4GB $70
RX470 8GB $170
RX480 8GB $220
RX580 8GB $200
Vega 64 8GB $400
GTX1060 6GB $250
GTX1660ti 6GB $280
GTX1070 8GB $400
GTX1070ti 8GB $450
GTX1080ti 11GB $700
RTX2080ti 11GB $1200
For this comparison, I left out stock numbers and used the Ethash overclock/undervolt numbers to represent the value to miners who would use similar clocks to ROI their GPUs. I included the Vega 64 in these results with an estimated Ethash overclock hashrate of 41mh/s and a ProgPoW hashrate of 25mh/s. This is purely for value round-up purposes.
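One way to express the value comparison is dollars per mh/s. The sketch below uses the Vega 64 figures given above ($400, an estimated 41mh/s on Ethash and 25mh/s on ProgPoW). The function name is illustrative, and the charts may plot the inverse ratio (mh/s per dollar) instead.

```python
def usd_per_mh(price_usd: float, hashrate_mh: float) -> float:
    """Dollars paid per mh/s of hashrate (lower is better value)."""
    if hashrate_mh <= 0:
        raise ValueError("hashrate must be positive")
    return price_usd / hashrate_mh


vega64_price = 400  # price listed in this article
print(round(usd_per_mh(vega64_price, 41), 2))  # estimated Ethash overclock hashrate
print(round(usd_per_mh(vega64_price, 25), 2))  # estimated ProgPoW hashrate
```

Swapping in a used-market price instead of MSRP shifts the result in direct proportion, which is why the used market matters so much for ROI.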
It’s clear to see why many miners chose AMD for Ethash: AMD brings the value proposition. This carries over to ProgPoW. Nvidia’s cards are comparatively expensive yet provide better hash-to-watt ratios. So even though AMD hashrates in ProgPoW are a few mh/s slower and the watt/hash ratio isn’t as good as Nvidia’s, every AMD GPU comes out on top in the ProgPoW value comparison charts.
Even though Pascal has been discontinued, there are still plenty of cards floating around used for well under the MSRP prices I have listed here. I’ve seen the GTX1070 for around $250 USD. However, I’ve seen RX580s go for as low as $125 and RX470s for $85. Vega 56 is going for between $270–$320. Even in the used market, AMD holds better value for hashrate.
The Miners Analysis
The change to the 0.9.3 ProgPoW algorithm clearly benefits AMD more than Nvidia. I’d assume this will be the spec going forward when it’s released to the Ethereum testnet Ropsten. This also provides great insight into ProgPoW, because it’s just that: Programmable Work.
The numbers are there, and it’s clear to see that Nvidia has excellent numbers in ProgPoW compared to AMD. But we all need to remind ourselves exactly what we are looking at. It’s not because ProgPoW favors Nvidia, which I hear all too much; it’s because Nvidia is making all the right moves in the GPU sector compared to AMD.
AMD has been competing with Nvidia on the value front, undercutting Nvidia’s mid-range segment with excellent value time and time again. GPU mining is no different. As a new PoW algorithm for Ethereum becomes reality, many AMD miners who built their farms around Ethash are going to have to relearn the entire process. This will require a revamp for many farms, which I can understand many are unwilling to undertake.
It’s going to come down to the miners themselves to figure out the best optimization for their farms. I hope my results here can serve as a guide for future miners and farms in optimizing power savings. In my own experimenting, I was able to get an RX480 to about 11.4mh/s at roughly 100w (GPU).
How would ProgPoW hashrate affect the Ethereum network?
From a theoretical standpoint, let’s just assume ProgPoW is implemented and the network hashrate drops to 1/2 of the current 140Th. We’d then be in the 70Th range. This doesn’t account for what percentage of the network is currently ASICs, among other factors. So it could possibly go lower; however, any gap would quickly be filled by profit-seeking GPU miners, until a balance is reached between difficulty and price.
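The back-of-the-envelope estimate above can be written as a tiny Python sketch. The asic_share parameter is a hypothetical knob added for illustration only: the real ASIC share of the network is unknown, and the sketch assumes ASICs simply drop off the network while GPUs keep half of their Ethash hashrate.

```python
def post_fork_hashrate(network_th: float,
                       asic_share: float = 0.0,
                       gpu_retention: float = 0.5) -> float:
    """Estimated network hashrate after the switch, in the same unit as the input.

    asic_share    -- assumed fraction of current hashrate from ASICs (hypothetical)
    gpu_retention -- fraction of Ethash hashrate GPUs keep on ProgPoW
    """
    gpu_th = network_th * (1.0 - asic_share)
    return gpu_th * gpu_retention


print(post_fork_hashrate(140))                  # no ASICs assumed
print(post_fork_hashrate(140, asic_share=0.2))  # hypothetical 20% ASIC share
```

Any nonzero ASIC share pushes the estimate below 70Th, which is the "it could possibly go lower" case mentioned above.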
Slower Sync and Validation because of ProgPoW Change?
I’ll let Andrea Lanfranchi explain this.
Validation of PoW nonce on node’s side is negligible. Less than 2ms per block on Ethash (on a veeery slow celeron type testing machine). At current figures (ie. 7450000 blocks) the time spent by a node to validate block sealing nonces for all blocks is roughly 4 hours. Assuming a ProgPoW nonce requires 1.5x the time for the same validation we’d get at most 6 hours for the very same task. This is the only “overload” the change of the algo could bring.
Reality is to sync a full node from scratch you need in average several days if not weeks so the marginal increment in PoW validation is less than negligible. The heavy lifting in node sync is validation of state (which is very IO resource intensive) which by no means is affected by the sealing algo.
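The arithmetic in the quote above can be checked with a short Python sketch. It treats 2ms per block as an upper bound for Ethash and applies the assumed 1.5x factor for ProgPoW; the function name is illustrative.

```python
def validation_hours(block_count: int, ms_per_block: float) -> float:
    """Total time to validate the PoW seal of every block, in hours."""
    return block_count * ms_per_block / 1000 / 3600


blocks = 7_450_000     # chain height quoted above
ethash_ms = 2.0        # upper bound quoted for Ethash
progpow_factor = 1.5   # slowdown assumed in the quote

print(round(validation_hours(blocks, ethash_ms), 2))
print(round(validation_hours(blocks, ethash_ms * progpow_factor), 2))
```

Both figures come out in hours, which is small next to the days or weeks a full sync takes.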
The ticking bomb of doom
ProgPoW is set to reduce hashrates, theoretically by 1/2. How will this affect the ice age/difficulty bomb? The difficulty bomb was most recently defused again on Feb 28, with the upgrade to Constantinople. When the difficulty bomb is defused, the baseline for doubling the value is changed to a different “zero” block: zero up to Byzantium, 3MM for Byzantium, and 5MM for Constantinople.
The difficulty bomb is a term added to the difficulty adjustment. It starts out at 1, then doubles about every 2 weeks. Because difficulty is so large (1.7x10¹⁵), it takes a long time for the bomb to impact block times noticeably: about 4.28MM blocks from Byzantium to Constantinople. The question is: will the resulting halving of the hashrate have Ethereum miners feeling the impact weeks earlier or years earlier?
According to ifdefelse:
difficulty bomb adds 2^(periodCount - 2), where periodCount increments every 100,000 blocks so the absolute hashrate being 1/2 means the difficulty bomb hits roughly 17 days earlier at the assumed hashrate of 70Th.
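Here is a rough Python sketch of where the "17 days" figure can come from. It assumes the bomb term is 2^(periodCount - 2) with the Constantinople "zero" block at 5MM as described above, and that block times stay near 15 seconds, which is an assumption for illustration. The idea: halving the difficulty doubles the bomb's relative weight, which is the same as being one 100,000-block period further along.

```python
import math

BOMB_PERIOD_BLOCKS = 100_000  # periodCount increments every 100,000 blocks


def bomb_term(block_number: int, delay_blocks: int) -> int:
    """Difficulty bomb term, counted from the delayed 'zero' block."""
    fake_block = max(0, block_number - delay_blocks)
    period_count = fake_block // BOMB_PERIOD_BLOCKS
    return int(2 ** (period_count - 2))


def days_earlier(hashrate_fraction: float, block_time_s: float = 15.0) -> float:
    """How much sooner the bomb bites if hashrate drops to the given fraction."""
    periods = math.log2(1 / hashrate_fraction)  # doublings needed to compensate
    return periods * BOMB_PERIOD_BLOCKS * block_time_s / 86_400


print(bomb_term(7_200_000, delay_blocks=5_000_000))
print(round(days_earlier(0.5), 1))
```

Under these assumptions a halved hashrate moves the bomb forward by one period, or a bit over two weeks, and not by years.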
To me, as a miner, this causes a bit of concern. While I personally believe the ice age/difficulty bomb needs to be seriously reworked or looked at, it seems ProgPoW may bring the difficulty bomb’s increase sooner than expected.
(Thanks to people in Ethereum Gitter channels for helpful information)
This analysis was purely for research reasons. It is not indicative of what will happen; it looks at what we, as miners, and the developers could expect when the change from Ethash to ProgPoW happens.
Now we all must just sit, wait, and watch for Ethereum’s next move.
If you’re a miner and would like to help fund the audit of ProgPoW, please donate here: ProgPoW Audit
Disclaimer: I am an avid GPU miner, huge crypto fan, and hold Bitcoin and Ethereum. I run a farm of 100~ GPUs. So it’s in my best interests to test, bench, and follow everything GPU and PoW related.