# Creating a Work Token — Part I

*Worker Selection Algorithms*

*tldr; In this article we outline two approaches to Worker selection in a Work contract: Stake-weighted and Fair. Stake-weighted selection uses a probability proportional to the stake size, while fair has a uniform distribution. We found Stake-weighted worker selection gas usage, at best, grows linearly with the number of workers. Fair worker selection gas usage is constant, making it possible to support a large number of workers.*

# Introduction

A Work contract is a mechanism that determines who is able to participate as a “Worker” in a cryptoeconomic system. To become an eligible Worker, a user must stake tokens. When a new Job is available, a Worker is selected to complete it.

There are two approaches to the selection of Workers from Work Contracts. The first method is Stake-weighted Worker Selection, which permits a flexible stake size and selects workers using weighted probability. The second method is Fair Worker Selection, which ignores the stake size and instead selects workers using a uniform distribution.

# Stake-Weighted Worker Selection

This method uses the proportional size of a worker’s stake to determine the probability that they are selected for the next job.

For example: imagine there are two workers where one has staked 1000 tokens while the other has staked 2000 tokens. The workers can be mapped to the probability space using their tokens:

[0, 1000)[1000, 3000)

The space consists of intervals, whose length is determined by the number of tokens a Worker has staked. When we wish to select a Worker, we generate a pseudo-random number in this space and select the worker to whom the chosen interval belongs.

More formally, we are looking to sample from a discrete probability distribution. This is a well-known problem and there are a number of algorithms we can use, each with their own advantages and disadvantages. I will detail the linear search, binary search, and the alias method.

## Linear Search

Linear search is the most naive approach of the algorithms. Here we store a list of Workers that represents the order of the intervals in the space. When we wish to select a Worker we generate a pseudo-random number then iterate through the list to find the interval that contains it.

We assume we are using a doubly linked list with an index that maps Workers to list nodes.

**Time Complexity**

*Insertion:**Removal:*if we use a doubly linked list and index the nodes then we can perform removal in O(1) by using the index to look up the node and unlink it.*Search:*

**Space Complexity**

*Insertion:*O(1)*Removal:*O(1)

## Binary Search

This method improves on the Worker selection by sorting the list of Workers and subdividing the search space in iterations.

We assume that we are using an array to store the sorted intervals.

**Time Complexity**

*Insertion:**Removal:*First we find the element in the list, then we remove it and shift the elements of the array back. O(log2(n)) + O(n)*Search:*O(log2(n))

**Space Complexity**

*Insertion:*We would make n writes at worst case, so O(n)*Removal:*

## Alias Method

The alias method allows us to select a worker in constant time. The method works by aliasing elements that are weighted below average to ones that are above average. There are two arrays; the first stores the probabilistic weight of each worker, the second stores the alias of each worker.

When a worker is to be selected a pseudo-random number between 0 and 1 is generated. That number is normalized to the number of workers and rounded to the next lowest integer to select a worker from the first array. The remainder from the rounding is then used as a ‘coin toss’ to determine whether to select the first element, or it’s alias. If the remainder is less that the probability of the worker, then select the worker. Otherwise select the worker’s alias.

**Time Complexity**

*Insertion:*The algorithm is known to need O(n) to calculate the arrays*Removal:*We’d need to recalculate the arrays, so O(n)*Selection:*Selection always take the same number of steps, so O(1)

**Space Complexity**

*Insertion:**Removal:*O(n) to re-calculate the two arrays

## Limitations

If we start to consider space complexity in terms of gas, we can see the limitations of each algorithm using rough estimates.

As seen below, implementing a weighted selection algorithm isn’t feasible for high numbers of Workers with the current Ethereum gas limits (8,000,000). Linear search is fast to update, but the selection is expensive. At 1,000 workers linear search costs 200,000 gas, and that estimate only includes the SLOAD opcode. Binary search and the alias method don’t even begin to approach that efficiency; at 100 workers they’re already hitting 2,000,000 in gas costs.

For this reason, most Work protocols that utilize Stake-weighted selection limit the number of Workers `numWorkers = _numWorkers;`

# Fair Worker Selection

Selecting workers using an even probability simplifies the cost dramatically. By removing the weighted probability we no longer need to search to find the matching interval, instead we can simply pseudo-randomly select a worker from an array.

The most equitable way to structure such a Work contract is to explicitly define the stake size per worker `require(msg.value == workerStake);`

This, admittedly, has the down side of requiring Workers who hold large amount of stakes to run multiple Worker accounts and would disqualify Worker candidates who do not meet the minimum required stake.

## Array Storage

Assume we store the workers in an array and have a mapping from worker to array index.

**Time Complexity**

**Insert:**We can push a new worker onto the end of the array in O(1)**Remove:**We can find the array index using the mapping, then put the last element into that slot in O(1)**Select:**We generate a pseudo-random number in the range [0, n) and use that to index the workers array in O(1)

**Space Complexity**

**Insert:**inserting one element would write once onto the end of the array O(1)**Remove:**Removing one element will clear the old value and transfer the last in O(1)

# Conclusion

We have outlined two broad categories of Worker selection: Stake-weighted and Fair. Stake-weighted worker selection selects workers with a probability proportional to the size of the Worker’s stake, while in Fair selection all Workers have an equal chance of being selected. We found that the gas costs of stake-weighted selection algorithms increase along with the number of Workers, whereas Fair selection gas costs remain constant.

Some stake-weighted approaches limit the number of workers, such as LivePeer which limits it’s number of workers to twenty. If the number of workers is small or capped to a few hundred workers or less, then a stake-weighted algorithm may be feasible. Gas costs for linear search would only be in the hundreds of thousands. If you select a worker far more often than you update the workers an algorithm such as the alias method may be more gas-efficient.

Fair worker selection, on the other hand, can support an unlimited number of workers (potentially). Its only drawback is that it’s not possible to favor particular workers.

In summary, the worker selection algorithm you choose will be highly dependent on your use-case. It’s important to consider the limitations of the algorithms before deciding on which one to use.