Eth 2.0 Dev Update #42 — “Mainnet-Capable Testnet + Now Hiring!”

Raul Jordan
Prysmatic Labs
10 min read · Jan 13, 2020

Mainnet-Capable Testnet Launched

Our first mainnet-capable test network for Ethereum 2.0 phase 0 launched with a genesis time of Thu Jan 09 2020 18:00:00 GMT-0600 (Central Standard Time), marking a huge milestone for Prysmatic Labs. With 16,384 validators at genesis and a queue of over 10,000 more waiting behind them, we were thrilled to see it function and to see users trying to validate. Unfortunately, right when genesis triggered, disaster struck! Validators were not able to propose blocks for 217 slots, causing all sorts of issues and bottlenecks and, of course, us freaking out. The issue came from a bug in our configuration and in our serialization library, SSZ, that led nodes in the network to compute different genesis state roots. You can read more about the issue here. After rolling out a fix to certain nodes and encouraging users to fetch the latest changes, we got over the hurdle and blocks started being produced right away. This was a massive learning experience for us, and ensuring genesis never breaks again is one of our main priorities moving forward.

Now, our testnet has been running for 4 days with 19,667 active validators and over 9,000 more in the queue. With this number of validators, our Sapphire testnet is the largest public ETH2 testnet of all time.

We’ll be ramping this up over the coming days, improving sync and shipping many more bug fixes to create an even more robust network. Become a validator today at https://prylabs.net

Merged Code, Pull Requests, and Issues

Mainnet-Style Optimizations to Core Logic

Now that we are running a mainnet config, run-time optimizations have become really important. Any slowness could cause a proposer to miss a block proposal or an attester to miss an attestation submission. We are working to optimize on all fronts, from state transition to fork choice to RPC responses. By using Jaeger for tracing and flame graphs for profiling, we were able to solve most of the performance issues. The beacon node has been running a lot smoother than it did just a few days ago when our testnet launched. We’ll continue to make progress on this and aim to have the best-optimized beacon node. To follow our progress, check out: https://github.com/prysmaticlabs/prysm/issues/4508

Store Eth1 Deposit Logs for Node Restarts

Every time a Prysm beacon node launches, it has to read a large number of logs from the eth1 proof-of-work chain corresponding to deposits into the Validator Deposit Contract to figure out which validators are part of the genesis state in the system. This is naturally expensive, as it involves a lot of signature verification, reading via JSON-RPC, and various I/O-intensive operations due to heavy writes to our database. Given our new testnet follows a mainnet-style configuration, users spinning up nodes had to wait a long time on every restart while their process read more than 16,000 deposits from the eth1 Goerli testnet. We added persistence of the last read logs so that frequent node restarts no longer harm UX. Unfortunately, parsing 16k deposits on the first run is still time-consuming, and it is not very clear to the user what is occurring under the hood. We think this can be improved via more careful logging and time estimates based on the number of deposits to read, and we will be improving this flow over the next week.
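To make the idea of persisting the last read logs concrete, here is a minimal sketch in Go. It is not Prysm’s actual implementation: the `depositCheckpoint` type, its JSON encoding, and the `nextStartBlock` helper are all hypothetical names chosen for illustration. The point is only that a node which remembers the last eth1 block it processed can resume from there instead of re-reading every deposit log.

```go
package main

import "encoding/json"

// depositCheckpoint is a hypothetical record of how far the node has read
// deposit logs from the eth1 chain.
type depositCheckpoint struct {
	LastProcessedBlock uint64 `json:"last_processed_block"`
	DepositCount       uint64 `json:"deposit_count"`
}

// encode serializes the checkpoint so it can be written to disk or a database.
func (c depositCheckpoint) encode() ([]byte, error) {
	return json.Marshal(c)
}

// decodeCheckpoint restores a checkpoint. Empty input means no checkpoint has
// been stored yet, which is reported as a nil pointer.
func decodeCheckpoint(data []byte) (*depositCheckpoint, error) {
	if len(data) == 0 {
		return nil, nil
	}
	var c depositCheckpoint
	if err := json.Unmarshal(data, &c); err != nil {
		return nil, err
	}
	return &c, nil
}

// nextStartBlock returns the first eth1 block whose logs still need reading.
// Without a checkpoint, reading starts at the block where the deposit
// contract was deployed.
func nextStartBlock(c *depositCheckpoint, contractDeployBlock uint64) uint64 {
	if c == nil || c.LastProcessedBlock < contractDeployBlock {
		return contractDeployBlock
	}
	return c.LastProcessedBlock + 1
}
```

On startup, a node would decode whatever checkpoint it finds, call `nextStartBlock`, and request logs only from that block onward. After each batch of logs is processed, it would encode and store the updated checkpoint.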

Fuzz Testing Caches

Fuzz testing is an important method for ensuring systems distributed at scale can survive all sorts of inputs from the outside world. Say you have a function that performs an important piece of logic, but anyone can call it. You need to ensure it can handle junk data as well as it handles proper, nicely structured input. Fuzz testing uses large amounts of junk, pseudorandom data to find edge cases and panic scenarios in code. A function that passes fuzz tests is not proven correct, but developers can at least have more confidence that an incorrect input won’t kill their system. Our teammate Terence Tsao has been working on some cache fuzz testing so we can have more confidence in rolling out certain features to production here. We’ll be doing this a lot more over the coming weeks across our repo as a whole.
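As a rough illustration of the approach, the sketch below hammers a small cache with pseudorandom keys and values and reports the first panic or broken invariant. It is a hand-rolled random loop rather than a real fuzzing framework, and `boundedCache` is a hypothetical stand-in, not one of Prysm’s caches.

```go
package main

import (
	"fmt"
	"math/rand"
)

// boundedCache is a hypothetical cache under test: it holds at most maxSize
// entries and evicts the oldest key when full.
type boundedCache struct {
	maxSize int
	order   []string
	items   map[string][]byte
}

func newBoundedCache(maxSize int) *boundedCache {
	return &boundedCache{maxSize: maxSize, items: make(map[string][]byte)}
}

// Put stores a value, evicting the oldest entry if the cache is full.
func (c *boundedCache) Put(key string, value []byte) {
	if c.maxSize <= 0 {
		return
	}
	if _, ok := c.items[key]; !ok {
		if len(c.order) >= c.maxSize {
			oldest := c.order[0]
			c.order = c.order[1:]
			delete(c.items, oldest)
		}
		c.order = append(c.order, key)
	}
	c.items[key] = value
}

// Get returns the value stored under key, if any.
func (c *boundedCache) Get(key string) ([]byte, bool) {
	v, ok := c.items[key]
	return v, ok
}

// fuzzCache feeds pseudorandom keys and values (including empty ones) into the
// cache and returns an error on the first panic or invariant violation.
func fuzzCache(iterations int, seed int64) (err error) {
	defer func() {
		if r := recover(); r != nil {
			err = fmt.Errorf("cache panicked: %v", r)
		}
	}()
	rng := rand.New(rand.NewSource(seed))
	c := newBoundedCache(8)
	for i := 0; i < iterations; i++ {
		key := make([]byte, rng.Intn(4))
		rng.Read(key)
		val := make([]byte, rng.Intn(16))
		rng.Read(val)
		c.Put(string(key), val)
		got, ok := c.Get(string(key))
		if !ok || string(got) != string(val) {
			return fmt.Errorf("iteration %d: value just written was not read back", i)
		}
		if len(c.items) > c.maxSize {
			return fmt.Errorf("iteration %d: cache grew past its bound", i)
		}
	}
	return nil
}
```

Calling `fuzzCache(10000, 42)` runs ten thousand random operations with a fixed seed, so any failure it finds can be reproduced exactly. A coverage-guided fuzzer would explore inputs far more cleverly, but the structure is the same: random input in, invariants checked, panics caught.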

Validator Slashing Protection in Prysm

To be production ready, ensuring a staker’s security is paramount. Ivan Martinez from our team has been working on adding validator slashing protection to Prysm, starting with preventing the validator client from committing a devastating slashable proposal. Slashable votes do not normally occur, but one dangerous way to get slashed is to have 2 clients running at the same time for the same validator, so be careful! To be mainnet-ready, all ETH2 clients should protect their validators from committing anything slashable, in order to prevent a staker’s loss as much as possible.

Once the proposer-side protection is done, we will start working on making sure the attester side is protected as well.
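A minimal sketch of what proposer-side protection could look like is shown here. It assumes the rule being enforced is "never sign two different blocks for the same slot", and the `proposalHistory` type and its method are hypothetical names, not Prysm’s API. It also keeps history in memory only, whereas a real validator client would need to persist it so that protection survives restarts.

```go
package main

import (
	"fmt"
	"sync"
)

// proposalHistory is a hypothetical record of which block a validator has
// already signed at each slot.
type proposalHistory struct {
	mu     sync.Mutex
	signed map[uint64][32]byte // slot -> root of the block already signed
}

func newProposalHistory() *proposalHistory {
	return &proposalHistory{signed: make(map[uint64][32]byte)}
}

// checkAndRecord must be called before signing a proposal. It returns an error
// if a different block was already signed for the slot. Signing the same block
// again is allowed, since it does not create a conflicting proposal.
func (h *proposalHistory) checkAndRecord(slot uint64, root [32]byte) error {
	h.mu.Lock()
	defer h.mu.Unlock()
	if prev, ok := h.signed[slot]; ok && prev != root {
		return fmt.Errorf("refusing to sign: a different block was already signed for slot %d", slot)
	}
	h.signed[slot] = root
	return nil
}
```

The check and the record happen under one lock so that two concurrent signing requests for the same slot cannot both pass. Note that this only helps within a single client: two separate clients running the same validator key do not share this history.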

Slashing Detection

Our work on slashing detection has come a long way from being a fully open research problem to actually detecting slashable events in our testnet! As a result of certain configuration problems and network instability, some of our own running validators produced slashable offenses in the previous iteration of our testnet release. Detection is handled by a service called the `HashSlingingSlasher`, created by our teammate Shay Zluf. The Slasher subscribes to events from the beacon node regarding votes occurring throughout the network and raises alarms if any of them satisfy slashing conditions according to the eth2 specification. Slashing detection is very heavy on memory and on the data required to have a full picture of what constitutes an offense in the network, but our team has been working hard on some strong initial optimizations.
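For readers unfamiliar with the conditions being checked, the sketch below covers the two attester slashing conditions in the eth2 phase 0 specification: a double vote (two different attestations with the same target epoch) and a surround vote (one attestation’s source and target epochs strictly enclose the other’s). The `attestationData` struct is a simplified, hypothetical stand-in for the spec’s type, and this is not the slasher’s actual code.

```go
package main

// attestationData is a simplified stand-in for the spec's AttestationData,
// keeping only the fields needed for the slashing checks.
type attestationData struct {
	SourceEpoch uint64
	TargetEpoch uint64
	BlockRoot   [32]byte
}

// isDoubleVote reports whether a and b are different votes for the same
// target epoch.
func isDoubleVote(a, b attestationData) bool {
	return a != b && a.TargetEpoch == b.TargetEpoch
}

// isSurroundVote reports whether a surrounds b, meaning a's source is earlier
// than b's source and a's target is later than b's target.
func isSurroundVote(a, b attestationData) bool {
	return a.SourceEpoch < b.SourceEpoch && b.TargetEpoch < a.TargetEpoch
}

// isSlashable checks both conditions, trying the surround check in both
// directions since a detector does not know which vote came first.
func isSlashable(a, b attestationData) bool {
	return isDoubleVote(a, b) || isSurroundVote(a, b) || isSurroundVote(b, a)
}
```

Checking one pair is cheap. The cost described above comes from having to compare each incoming vote against a validator’s entire voting history, across every validator in the network.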

In terms of next steps, we need to wire up slasher such that it submits `ProposerSlashing` and `AttesterSlashing` objects to the beacon node, which will then be included in blocks by proposers attached to the node. After that, processing said slashing objects and allowing for whistleblower rewards will come into play. This whole process is a great candidate for an end-to-end test we can leverage our testing suite for. Stay posted for updates on this over the next few weeks.

Upcoming Work

Improving beacon chain sync

We were really happy with the various initial sync optimizations we made for the previous iteration of our testnet, reaching almost 100 blocks per second through a combination of clever caching and tricks that sped up the process based on certain assumptions. However, our previous testnet only ran fewer than 1,000 validators, which is radically different from today. We ended up optimizing a ton of other parts of our beacon node to allow us to run the genesis number of validators, 16,384, according to the official eth2 spec, but that meant a lot of our initial sync code suffered as a consequence. Currently, sync is painfully slow, with many users reporting 0.5–0.8 blocks per second and their CPUs typically maxed out.

Having a chain configuration as big as mainnet has uncovered a lot of other inefficiencies in our codebase. The biggest reason initial sync maxes out computational resources for most people is that far too many DB write operations are fighting to acquire the same mutex. There are also many for loops that perform expensive operations sequentially instead of as batch calls, along with unnecessary verification of certain items. Given there are no perfect solutions to this problem, only trade-offs, we have to find a good balance of caching and optimizations to get us where we want to be and allow users to sync within a reasonable time. Follow this issue thread to get a better idea of our progress.
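The difference between sequential writes and a batch call can be sketched as follows. The `blockStore` type is a hypothetical in-memory stand-in for a database guarded by a single write lock, and `lockCount` exists only to make the contention visible. It is not a measurement of Prysm’s database.

```go
package main

import "sync"

// blockStore is a hypothetical store guarded by a single write lock.
type blockStore struct {
	mu        sync.Mutex
	blocks    map[uint64][]byte
	lockCount int // how many times the write lock was taken, for illustration
}

func newBlockStore() *blockStore {
	return &blockStore{blocks: make(map[uint64][]byte)}
}

// saveBlock writes one block, taking the lock once per call.
func (s *blockStore) saveBlock(slot uint64, enc []byte) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.lockCount++
	s.blocks[slot] = enc
}

// saveBlocks writes a whole batch while holding the lock only once.
func (s *blockStore) saveBlocks(batch map[uint64][]byte) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.lockCount++
	for slot, enc := range batch {
		s.blocks[slot] = enc
	}
}
```

Saving 64 blocks through `saveBlock` takes the lock 64 times, while one `saveBlocks` call takes it once. The trade-off is that a batch holds the lock longer per acquisition and delays when the first block in the batch becomes visible to readers.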

Recursively fetching attestations’ voted blocks

One issue that’s been in the back of our minds ever since we had a testnet has been this pesky error log:

“Error: attestation points to block not found in the database”

This means we received an attestation from somewhere (via p2p, RPC, or any other source) that votes on a block we have not yet seen. The right behavior in this case is to broadcast a request for that block to the network and hold off on processing the attestation until we receive it. We unfortunately do not support this feature today in Prysm, but we have an open issue thread for its implementation here.
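The desired behavior can be sketched as a small pending queue: attestations for unknown blocks are parked by block root, a request for the block is issued once, and the parked attestations are released when the block arrives. All names here (`pendingQueue`, `onAttestation`, `onBlock`) are hypothetical, and the sketch leaves out locking, expiry of stale entries, and the recursive case where the block’s own parent is also missing.

```go
package main

// attestation is a simplified vote that points at a block by its root.
type attestation struct {
	BlockRoot [32]byte
	Slot      uint64
}

// pendingQueue holds attestations whose voted block has not been seen yet.
// It is not safe for concurrent use.
type pendingQueue struct {
	known   map[[32]byte]bool
	waiting map[[32]byte][]attestation
}

func newPendingQueue() *pendingQueue {
	return &pendingQueue{
		known:   make(map[[32]byte]bool),
		waiting: make(map[[32]byte][]attestation),
	}
}

// onAttestation reports whether the attestation can be processed now and
// whether the caller should request the missing block from the network.
// A request is signalled only for the first attestation that misses a block.
func (q *pendingQueue) onAttestation(att attestation) (process bool, requestBlock bool) {
	if q.known[att.BlockRoot] {
		return true, false
	}
	_, alreadyWaiting := q.waiting[att.BlockRoot]
	q.waiting[att.BlockRoot] = append(q.waiting[att.BlockRoot], att)
	return false, !alreadyWaiting
}

// onBlock marks a block as seen and returns the attestations that were
// waiting on it so they can now be processed.
func (q *pendingQueue) onBlock(root [32]byte) []attestation {
	q.known[root] = true
	released := q.waiting[root]
	delete(q.waiting, root)
	return released
}
```

With this shape, the error log turns into a normal code path: the first attestation for an unseen block triggers a network request, later ones simply join the queue, and nothing is dropped while the block is in flight.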

Better UX for onboarding new validators

Currently, onboarding a new validator takes a long time by design of the system. In previous runs of our testnet, we created a basic front-end that allows users to join as validators by following a 6-step process, which culminates in a progress bar indicating how long they have to wait to be activated. This is not sustainable given how activation will really work in the system. Additionally, users have to complete each step sequentially before seeing instructions for the next steps. We want to revamp this whole onboarding flow to make more sense with our new testnet, and we will be offering features such as email notifications for validator activations and enhanced tracking of validators in the future.

Improved documentation

Currently, our official documentation is maintained on Gitbook here. It has been the go-to source we point users to when they want to learn more about how to activate a validator in eth2. Unfortunately, Gitbook has its own full-fledged editing and collaboration system, which makes it a bit difficult for us to adopt given we are used to a fully open source environment. Our docs expert Celeste suggested using Docusaurus instead, an open source alternative to Gitbook that uses Markdown by default, making our lives much easier. Docusaurus would also make it easy for us to host our docs under our own domain name, making them easier to find through search engines. We’re looking forward to pushing this out to production soon, as it will evolve into a more full-fledged docs portal for all things Prysmatic.

Now Hiring

We are currently hiring a full-time, remote software engineer to work on Ethereum 2.0. We’re offering a fully remote working experience, health insurance, and competitive compensation. Prysmatic Labs started in early 2018 after a few of us on the team became curious about contributing to blockchain technology and posted on Reddit and various forums to find like-minded people to contribute to open source code. We realized scalability was the biggest problem the Ethereum blockchain was facing and wanted to take a stab at it, given there was little work on the matter happening at that time.

Our teammates have significant experience as software engineers, having worked on cloud, networking, and systems design at enterprises such as Google and Riverbed Technologies, shipping mission-critical products.

We all began as volunteers writing Go code to implement a minimal sharding specification for Ethereum during our free time, and were noticed by the Ethereum Foundation and a few other groups for the work we were doing. Today, we love to work at Prysmatic because of our motivation to build revolutionary technology while having fun doing so — the same reasons we started this project in the first place.

We’re searching for a highly motivated, team-oriented, and inquisitive learner to build great software with us. If you’re passionate about cutting-edge technology and about working with awesome tools such as Go, Kubernetes, Docker, and more, apply today at https://prysmaticlabs.com/careers.

Interested in Contributing?

We are always looking for devs interested in helping us out. If you know Go or Solidity and want to contribute to the forefront of research on Ethereum, please drop us a line and we’d be more than happy to help onboard you :).

Check out our contributing guidelines and our open projects on GitHub. Each task and issue is grouped into the Phase 0 milestone along with the specific project it belongs to.

As always, follow us on Twitter or join our Discord server and let us know what you want to help with.

Official, Prysmatic Labs Ether Donation Address

0x9B984D5a03980D8dc0a24506c968465424c81DbE

Official, Prysmatic Labs ENS Name

prysmatic.eth
