New node scoring on the aleph.im network

By c.pascariello, published in Aleph.im, Apr 13, 2023

At aleph.im, we are constantly working to improve the quality and reliability of our network. To that end, we are excited to announce a new scoring method for nodes on the aleph.im network. This new system will incentivize good behavior and penalize bad behavior, ultimately leading to a more robust and reliable network. In this post, we will discuss the new scoring method in detail and explain how it will benefit users of the aleph.im network.

TL;DR

The new scoring method improves the decentralization of nodes

In an effort to continually improve the quality of the network and provide better insights for all participants, aleph.im has made some updates to its scoring method for nodes. This comes as part of an ongoing effort to ensure that all nodes on the network are operating at their best and to address any issues that may arise.

While some nodes have been unreliable and others have run obsolete versions, aleph.im is committed to working with all participants to ensure the network runs smoothly and efficiently.

  • New metrics. A program regularly measures the status and performance of the nodes and publishes this data on the network. This program sends multiple requests to each node in order to evaluate how well it behaves.
  • New scoring. A score below 20% indicates that the node is dysfunctional and should not be used. A score above 80% indicates that the node is fully functional and behaves well.
  • New rewards. No reward is distributed when the score is below 20% (Grey ⚪️). A proportional share of the reward is distributed when the score is between 20% and 80% (Orange 🟠). The complete reward is distributed when the score is equal to or greater than 80% (Green 🟢).
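
As a rough illustration of the thresholds above, the following Python sketch maps a score to its colour band. The function name score_tier is hypothetical and exists only for this example, and scores are assumed to be expressed as fractions between 0 and 1.

```python
def score_tier(score: float) -> str:
    """Map a score in [0, 1] to the colour band used in the interface."""
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must be between 0 and 1")
    if score < 0.2:
        return "grey"    # below 20%: no reward
    if score < 0.8:
        return "orange"  # between 20% and 80%: partial reward
    return "green"       # 80% and above: complete reward


if __name__ == "__main__":
    for value in (0.05, 0.5, 0.8):
        print(value, score_tier(value))
```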

We will work closely with the community over the coming weeks to resolve any concerns about node scores, as several nodes presently have a score of zero. Please do not hesitate to contact aleph.im’s team through our official Telegram if you require any assistance.

Why a new scoring method?

In addition to introducing the new scoring method for nodes on the aleph.im network, it is important to understand how node operators can earn rewards. Node operators receive aleph tokens for keeping their node up and running. This incentivizes participation in the network and helps to ensure its continued operation.

The new method is designed to measure the decentralization of nodes on the aleph.im network. The more decentralized a node is, the higher its score will be. This incentivizes node operators to link to multiple resource nodes in different geographic locations, thereby improving the resilience and performance of the network.

Compared to the previous method, which was more simplistic in its approach when checking the status of nodes, the new scoring method is more advanced and comprehensive. It considers factors such as the number of linked resource nodes, the decentralization of the nodes, and performance metrics. This allows for a more accurate and nuanced assessment of node quality and helps to ensure that the most reliable nodes are rewarded for their contributions to the network.

Overall, the new scoring method is a significant improvement over the previous system. By incentivizing decentralization and rewarding reliable node operators, it will help ensure the continued operation and success of the aleph.im network.

While the new scoring method for nodes on the aleph.im network measures the decentralization of all nodes, it is important to note that, in the short term, this will primarily affect Compute Resource Nodes (CRN).

Improving the score of your nodes

If you are a node operator on the aleph.im network and you have seen a significant change in the score of your nodes, you will need to update your system to improve that score. In accordance with the Tokenomics update from November 2022, you must follow these steps to increase the rewards and recognition you receive for your contributions to the network. Here are some tips to help you improve your node’s score:

  1. Keep your node up to date by running the latest version of the node software and installing all system updates. This helps ensure that your node is secure and performing optimally.
  2. Make sure your node is running on performant hardware. This includes having a fast enough CPU, enough RAM, and fast disk and bandwidth connectivity. Using high-quality hardware helps ensure that your node is performing at its best, which can lead to a higher score and increased rewards.

Materials and Methods

On the aleph.im network, the page at https://account.aleph.im/ provides a convenient way to view the nodes registered on the platform. This page retrieves the necessary data in JSON format from an AGGREGATE message available on the Aleph.im network. This message can be accessed from core channel nodes using the path /api/v0/aggregates/0xa1B3bb7d2332383D96b7796B908fB7f7F3c2Be10.json?keys=corechannel&limit=50
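
A minimal sketch of how that aggregate could be fetched with the Python standard library. The helper names are hypothetical, and the base URL used in the usage note is the placeholder node address from the metrics example later in this post, so it must be replaced with the address of a real core channel node. The shape of the returned JSON is not described here, so the sketch returns it unparsed as a dictionary.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Address that owns the aggregate, taken from the path quoted above.
AGGREGATE_ADDRESS = "0xa1B3bb7d2332383D96b7796B908fB7f7F3c2Be10"


def aggregate_url(node_base_url: str, limit: int = 50) -> str:
    """Build the aggregate URL for a given core channel node."""
    query = urlencode({"keys": "corechannel", "limit": limit})
    base = node_base_url.rstrip("/")
    return f"{base}/api/v0/aggregates/{AGGREGATE_ADDRESS}.json?{query}"


def fetch_corechannel_aggregate(node_base_url: str, timeout: float = 10.0) -> dict:
    """Download and decode the aggregate (performs a network request)."""
    with urlopen(aggregate_url(node_base_url), timeout=timeout) as response:
        return json.load(response)
```

Calling aggregate_url("http://12.13.14.15:4024/") only builds the URL, while fetch_corechannel_aggregate performs the actual request.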

old scoring

When the page retrieves data from an AGGREGATE message, it allows for an accurate and current display of all nodes registered on the aleph.im network. This message provides essential information about both registered Core Channel Nodes (CCN) and Compute Resource Nodes (CRN) on the aleph.im network, such as their multiaddress (for CCNs) or URL (for CRNs).

Using this information, the status and performance of each node can be thoroughly analyzed, and a score can then be assigned to each node. The scoring process considers various factors, including node availability, performance metrics, and the number of linked resource nodes. This score provides valuable insight into each node’s reliability and contributions to the network, allowing node operators to identify areas for improvement and potentially receive increased rewards for their efforts.

new scoring

Metrics

To assess the performance and status of each node, a program continuously measures and evaluates its behavior. This data is then published as POST messages on the network using the type aleph-scoring-metrics.

To conduct this evaluation, the program sends multiple HTTP requests to each node, analyzing its behavior and performance metrics. Using this method, the program can accurately measure the availability and reliability of each node on the network, providing node operators with valuable insights into potential areas for improvement.

📘 The documentation for the new metrics can be found in the aleph.im documentation.

Common metrics

Several performance metrics apply to all types of nodes, regardless of whether they are Core Channel Nodes (CCNs) or Compute Resource Nodes (CRNs). These universal metrics help to ensure that all nodes on the aleph.im network meet the minimum standards for performance and reliability.

  1. Software version: We compare the node version to the latest version available. Node operators have a grace period to update their nodes to the latest release.
  2. Autonomous System Number (ASN): Gives a rough estimate of the server's location. This helps us score the decentralization of the nodes.
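
The version check can be pictured with the sketch below. It assumes version tags of the form v0.5.0, as in the metrics example in this post, and treats the length of the grace period as a parameter because this post does not state it. All function names are hypothetical.

```python
from datetime import datetime, timedelta, timezone
from typing import Tuple


def parse_version(tag: str) -> Tuple[int, ...]:
    """Turn 'v0.5.0' into (0, 5, 0). Assumes plain numeric dotted tags."""
    return tuple(int(part) for part in tag.lstrip("v").split("."))


def version_is_acceptable(
    node_version: str,
    latest_version: str,
    latest_released_at: datetime,
    now: datetime,
    grace_period: timedelta,  # ASSUMPTION: actual duration is not given in this post
) -> bool:
    """Accept the latest version, or an older one while the grace period lasts."""
    if parse_version(node_version) >= parse_version(latest_version):
        return True
    return now - latest_released_at <= grace_period


if __name__ == "__main__":
    released = datetime(2023, 4, 1, tzinfo=timezone.utc)
    today = released + timedelta(days=3)
    print(version_is_acceptable("v0.4.9", "v0.5.0", released, today, timedelta(days=7)))
```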

Metrics for Core Channel Nodes

  1. Base latency: The base latency to respond to a request, measured by calling /api/v0/info/public.json (no processing on that page).
  2. Metrics latency: The latency to fetch public node metrics, measured by calling /metrics.json
  3. Aggregate latency: The latency to fetch a large aggregate, measured by calling /api/v0/aggregates/0xa1B3bb7d2332383D96b7796B908fB7f7F3c2Be10.json?keys=corechannel&limit=50.
  4. File download latency: The latency to fetch a 6.7 kB file, measured by calling /api/v0/storage/raw/50645d4ccfddb7540e7bb17ffa5609ec8a980e588e233f0e2c4451f6f9da6ebd

Metrics are only valid if the HTTP response code is a success.
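
The measurement loop described above could look roughly like the following. This is a sketch using only the Python standard library, not the program that actually runs on the network: the function names, the 10 second timeout, and the choice to record an invalid sample as None are all assumptions.

```python
import time
from typing import Callable, Dict, Optional
from urllib.request import urlopen

# Endpoints listed above, keyed by the metric they feed.
CCN_ENDPOINTS = {
    "base_latency": "/api/v0/info/public.json",
    "metrics_latency": "/metrics.json",
    "aggregate_latency": "/api/v0/aggregates/0xa1B3bb7d2332383D96b7796B908fB7f7F3c2Be10.json?keys=corechannel&limit=50",
    "file_download_latency": "/api/v0/storage/raw/50645d4ccfddb7540e7bb17ffa5609ec8a980e588e233f0e2c4451f6f9da6ebd",
}


def http_status(url: str, timeout: float = 10.0) -> int:
    """Fetch the URL, read the whole body, and return the HTTP status code."""
    with urlopen(url, timeout=timeout) as response:
        response.read()
        return response.status


def measure_ccn(
    base_url: str, fetch: Callable[[str], int] = http_status
) -> Dict[str, Optional[float]]:
    """Time each endpoint; keep the latency only when the response is a success."""
    results: Dict[str, Optional[float]] = {}
    for metric, path in CCN_ENDPOINTS.items():
        started = time.perf_counter()
        try:
            status = fetch(base_url.rstrip("/") + path)
        except OSError:  # covers connection errors, timeouts, and HTTP error codes
            results[metric] = None
            continue
        elapsed = time.perf_counter() - started
        results[metric] = elapsed if 200 <= status < 300 else None
    return results
```

The fetch parameter makes it possible to exercise the logic without touching the network, for example measure_ccn("http://12.13.14.15:4024/", fetch=lambda url: 200).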

The metrics for a CCN have the following form:

{
  "measured_at": 1680715202.614388,
  "node_id": "5891b5b522d5df086d0ff0b110fbd9d21bb4fc7163af34d08286a2e846f6be03",
  "url": "http://12.13.14.15:4024/",
  "asn": 12345,
  "as_name": "INTERNET-SERVICE-PROVIDER, AD",
  "version": "v0.5.0",
  "base_latency": 0.0545351505279541,
  "metrics_latency": 0.05013394355773926,
  "aggregate_latency": 0.03859257698059082,
  "file_download_latency": 0.04321122169494629,
  "txs_total": 0,
  "pending_messages": 3430570,
  "eth_height_remaining": 114822
}
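
Such a record can be consumed with a few lines of Python. The sketch below parses the example above, converts measured_at to a UTC datetime (assuming it is a Unix timestamp in seconds), and picks out the slowest of the four latency fields. The variable names are hypothetical.

```python
import json
from datetime import datetime, timezone

SAMPLE = """
{
  "measured_at": 1680715202.614388,
  "node_id": "5891b5b522d5df086d0ff0b110fbd9d21bb4fc7163af34d08286a2e846f6be03",
  "url": "http://12.13.14.15:4024/",
  "asn": 12345,
  "as_name": "INTERNET-SERVICE-PROVIDER, AD",
  "version": "v0.5.0",
  "base_latency": 0.0545351505279541,
  "metrics_latency": 0.05013394355773926,
  "aggregate_latency": 0.03859257698059082,
  "file_download_latency": 0.04321122169494629,
  "txs_total": 0,
  "pending_messages": 3430570,
  "eth_height_remaining": 114822
}
"""

record = json.loads(SAMPLE)

# ASSUMPTION: measured_at is a Unix timestamp expressed in seconds.
measured_at = datetime.fromtimestamp(record["measured_at"], tz=timezone.utc)

# Collect every field whose name ends in "_latency" (values are in seconds).
latencies = {key: value for key, value in record.items() if key.endswith("_latency")}
slowest = max(latencies, key=latencies.get)

print(measured_at.isoformat(), slowest, latencies[slowest])
```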

Metrics for Compute Resource Nodes

The new metrics check that Compute Resource Nodes are reachable over IPv6, a significant step towards improving the accessibility and resilience of the aleph.im network. By prioritizing IPv6 for Compute Resource Nodes, aleph.im aims to make the network accessible to all users, irrespective of their location or network setup. A significant number of current CRNs do not use IPv6 yet and, as a result, have a score of 0 in the interface.
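
Node operators who want a quick first check can look up whether their hostname resolves to an IPv6 address at all. The sketch below uses only the Python standard library, and its function names are hypothetical. Note that a successful DNS lookup is only a precondition: it does not prove that the node actually answers requests over IPv6.

```python
import ipaddress
import socket
from typing import List


def is_ipv6(address: str) -> bool:
    """Return True when the string is a literal IPv6 address."""
    try:
        return ipaddress.ip_address(address).version == 6
    except ValueError:  # not an IP address at all, e.g. a hostname
        return False


def resolve_ipv6(hostname: str, port: int = 443) -> List[str]:
    """Return the IPv6 addresses DNS reports for hostname (empty list if none)."""
    try:
        infos = socket.getaddrinfo(
            hostname, port, family=socket.AF_INET6, type=socket.SOCK_STREAM
        )
    except socket.gaierror:
        return []
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the address.
    return sorted({info[4][0] for info in infos})
```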

  1. Base latency: The base latency to respond to a request, measured by calling /about/login. Should return HTTP code 401 Unauthorized.
  2. Diagnostic VM latency: The latency to call a common user program, measured by calling /vm/67705389842a0a1b95eaa408b009741027964edc805997475e95c505d642edd8
  3. Full check latency: The latency to run a collection of checks on the node, measured by calling /status/check/fastapi.
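
One detail worth noting is that the base latency check expects a 401 response rather than a success. A small lookup table makes that explicit. In this sketch the expected code of 200 for the other two endpoints is an assumption, since the list above only states the expected code for /about/login, and the names CRN_CHECKS and sample_is_valid are hypothetical.

```python
from typing import Dict, Tuple

# metric name -> (path, expected HTTP status)
CRN_CHECKS: Dict[str, Tuple[str, int]] = {
    "base_latency": ("/about/login", 401),  # stated above: 401 Unauthorized
    "diagnostic_vm_latency": (
        "/vm/67705389842a0a1b95eaa408b009741027964edc805997475e95c505d642edd8",
        200,  # ASSUMPTION: a plain success is expected
    ),
    "full_check_latency": ("/status/check/fastapi", 200),  # ASSUMPTION
}


def sample_is_valid(metric: str, status: int) -> bool:
    """A latency sample counts only if the node answered with the expected code."""
    _path, expected = CRN_CHECKS[metric]
    return status == expected
```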

The metrics for a CRN have the following form:

{
  "measured_at": 1680715253.669524,
  "node_id": "8cd07f3a5ff98f2a78cfc366c13fb123eb8d29c1ca37c79df190425d5b9e424d",
  "url": "https://node01.crn.domain.org/",
  "asn": 12345,
  "as_name": "INTERNET-SERVICE-PROVIDER, AD",
  "base_latency": 0.9623174667358398,
  "diagnostic_vm_latency": 0.06729602813720703,
  "full_check_latency": 0.5257446765899658
}

Scores

Once the metrics have been collected for each node, a global score is computed based on the collected data. The score is calculated as a value between 0 and 1 and is rounded to a percentage for display purposes.

A score of 0 indicates that the node is dysfunctional and should not be used, while a score of 1 indicates that the node is fully functional and operating perfectly.

The formula used to calculate the global score considers various factors, such as node performance and availability, and is continually refined to reflect the network's reality. Feedback from the community and node operators is especially welcome in this regard to help ensure the scoring system accurately reflects the network’s overall health and performance.

📘 The full documentation for the new scoring method can be found in the aleph.im documentation.

Aggregation over time

The scoring of a node is determined based on its metrics from the previous four weeks, and the score is published daily. This approach helps to reduce the impact of noise on the metrics and provides a stable score over time. As a result, the score indicates the overall behavior of the node and is not expected to change rapidly.

However, if a node performs poorly, it will have a lasting effect on its score. A certain degree of tolerance is built into the system to allow for short downtimes for maintenance without penalizing the score.

Each numeric metric is compared to a reference value using percentiles when computing the score.
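
The four-week window can be sketched as a simple filter over metric records. The code below assumes each record carries a measured_at field holding a Unix timestamp in seconds, as in the examples in this post. The names are hypothetical.

```python
from datetime import datetime, timedelta, timezone
from typing import Dict, Iterable, List

SCORING_WINDOW = timedelta(weeks=4)  # "the previous four weeks"


def records_in_window(records: Iterable[Dict], now: datetime) -> List[Dict]:
    """Keep only the records measured during the four weeks before `now`."""
    newest = now.timestamp()
    oldest = (now - SCORING_WINDOW).timestamp()
    return [r for r in records if oldest <= r["measured_at"] <= newest]


if __name__ == "__main__":
    now = datetime(2023, 4, 13, tzinfo=timezone.utc)
    records = [
        {"measured_at": 1680715202.614388, "base_latency": 0.05},  # early April 2023
        {"measured_at": 1672531200.0, "base_latency": 0.50},       # 1 January 2023
    ]
    print(len(records_in_window(records, now)))
```

Percentiles and the final score would then be computed over the records that survive this filter.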

Rewards

Contributors to the aleph.im network and its ecosystem, including node operators and stakers, are eligible to receive rewards.

The performance score of a CCN (Core Channel Node) impacts the rewards distributed to the node operator and stakers. Specifically, the following rules apply:

  • If the score is below 20%, no rewards are distributed.
  • If the score is between 20% and 80%, a proportional share of the reward is distributed.
  • The full reward is distributed if the score equals or exceeds 80%.
  • The reward of a node is not affected by the scores of other nodes in the network. When a node receives less than its full reward, fewer tokens are distributed from the pool.

Reminder: node owners who do not have three Compute Resource Nodes (CRN) linked will still incur a penalty of 10% of the rewards for each unfilled CRN spot.
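
To make these rules concrete, here is a hedged sketch of a reward multiplier. Two points are assumptions rather than facts from this post: the share between 20% and 80% is modelled as a linear ramp from 0 to 1, and the CRN penalty is applied by multiplying it with the score-based share. The function names are hypothetical.

```python
REQUIRED_CRNS = 3
PENALTY_PER_MISSING_CRN = 0.10  # 10% of the rewards per unfilled CRN spot


def score_multiplier(score: float) -> float:
    """Share of the reward earned from the score alone."""
    if score < 0.2:
        return 0.0
    if score >= 0.8:
        return 1.0
    # ASSUMPTION: linear ramp between the two thresholds.
    return (score - 0.2) / (0.8 - 0.2)


def crn_multiplier(linked_crns: int) -> float:
    """Share left after the penalty for unfilled CRN spots."""
    missing = max(0, REQUIRED_CRNS - linked_crns)
    return 1.0 - PENALTY_PER_MISSING_CRN * missing


def reward_multiplier(score: float, linked_crns: int) -> float:
    # ASSUMPTION: the two effects combine multiplicatively.
    return score_multiplier(score) * crn_multiplier(linked_crns)


if __name__ == "__main__":
    print(reward_multiplier(0.9, 3), reward_multiplier(0.9, 1), reward_multiplier(0.1, 3))
```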

To illustrate, the base_latency of a CRN contributes to the node’s score in the following manner:

  1. The 25th percentile reflects the base_latency value below which 25% of the samples taken during the sampling period fall.
  2. The 95th percentile reflects the base_latency value below which 95% of the samples taken during the sampling period fall.
  3. If the node fails to respond, a default value of 100 seconds is assigned.
  4. A scaling factor of 1/2 is applied.
  5. The resulting value is bounded between zero and one.

The final formula for the contribution of the base_latency in the score is:

By taking the 25th and 95th percentiles, the base_latency value is calculated in relation to the distribution of samples during the sampling period.

The scaling factor of 1/2 adjusts the base_latency score accordingly, so the score reflects half of the measured latency in seconds.
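
The steps listed above do not say how the two percentiles are combined, so the following Python sketch stops short of a single number and only shows one possible reading of the individual ingredients. It assumes that lower latency is better, that each percentile is turned into a term of the form 1 minus half the latency in seconds, and that a nearest-rank percentile is used. None of these choices is confirmed by this post, and the function names are hypothetical.

```python
import math
from typing import Dict, Optional, Sequence

FAILURE_LATENCY_S = 100.0  # default assigned when the node fails to respond
SCALING_FACTOR = 0.5       # the 1/2 scaling factor


def percentile(samples: Sequence[float], pct: float) -> float:
    """Nearest-rank percentile: smallest sample with at least pct% at or below it."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def latency_term(latency_s: float) -> float:
    """ASSUMPTION: term = 1 - latency / 2, bounded between zero and one."""
    return min(1.0, max(0.0, 1.0 - SCALING_FACTOR * latency_s))


def base_latency_terms(samples: Sequence[Optional[float]]) -> Dict[str, float]:
    """None marks a request the node failed to answer."""
    filled = [FAILURE_LATENCY_S if s is None else s for s in samples]
    return {
        "p25": latency_term(percentile(filled, 25)),
        "p95": latency_term(percentile(filled, 95)),
    }


if __name__ == "__main__":
    print(base_latency_terms([0.2, 0.4, 0.6, None]))
```

With the four samples in the example, the 25th percentile is 0.2 seconds, giving a term of about 0.9, while the single failed request pushes the 95th percentile to 100 seconds and its term to 0. This shows why occasional failures weigh on the upper percentile in particular.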

📘 The full documentation for the rewards can be found in the aleph.im documentation.

Thanks and keep in touch

Join our live conversation on our Telegram Community Chat.

🌴 Linktr.ee | 🌐 Website | 🗞 Blog | 📄 Papers | 🐦 Twitter | 💬 Telegram |
💼 Linkedin | 💻 GitHub | 📒 Dev Docs | 🤖 Reddit
