Distributed Computing with Raven Distribution Framework (RDF)

Anirudh Rajiv Menon
RavenProtocol
Feb 15, 2022

The current release of the Raven Distribution Framework (RDF v0.3) provides an easy-to-use library that allows developers to build mathematical algorithms or models and have the underlying operations computed by distributing them across multiple clients. This yields a significant gain in speed and efficiency when dealing with a large number of mathematical operations.

What is Distributed Computing?

Distributed Computing is the linking of various computing resources, such as PCs and smartphones, to share and coordinate their processing power for a common computational requirement, such as the training of a large Machine Learning model. These resources, or nodes, communicate with a central server and, in some cases, with each other, so that each node receives some data and completes a subset of the overall task. By coordinating their computations, the nodes can complete a large and complex computational requirement quickly and efficiently.
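As a toy illustration of the general idea (not RDF itself), the minimal sketch below splits a large summation across local worker processes; in RDF the "workers" would be remote client nodes coordinated by a central server rather than processes on one machine.

# Toy illustration of distributed computing (not RDF): a central process
# splits a large summation into chunks, each worker computes a partial
# result, and the partial results are combined at the end.
from multiprocessing import Pool

def partial_sum(chunk):
    # Each "node" completes a subset of the overall task.
    return sum(chunk)

if __name__ == '__main__':
    data = list(range(1_000_000))
    chunks = [data[i::4] for i in range(4)]  # distribute the data to 4 workers
    with Pool(processes=4) as pool:
        partials = pool.map(partial_sum, chunks)
    print(sum(partials))  # combined result of all workers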

Distributed computing is a scalable, efficient and reliable approach to solving tasks that would prove tedious and time-consuming for an individual system. Moreover, even large distributed systems can be cost-effective, as they can be built from low-cost, commodity hardware. To learn more, you can refer to the following link.

How does the RDF facilitate Distributed Computing?

One can participate in the RDF either as a Developer or a Contributor. A Developer is someone who wants their computational requirement (like training an ML model or computing a complex algorithm) fulfilled. A Contributor allows their system to form a node in the distribution framework and contribute its processing power for the Developers’ requirements.

The RDF provides Developers with a library called RavOp, which contains a large number of mathematical operations, or ops, on mathematical objects like scalars, vectors, matrices and tensors.

RavOp can be used to build complex algorithms and machine learning models which, when executed, connect to Ravsock (RDF’s central server). Ravsock uses a scheduling algorithm to assign ops to the available Clients (Contributors) and coordinates between them: it converts the Developer’s code into subgraphs and sends each subgraph to the best-suited idle client, chosen on the basis of a benchmarking check. Once all subgraphs are computed, the result is returned to the Developer.
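To make the idea concrete, here is a deliberately simplified sketch of this kind of subgraph scheduling. The client list, benchmark scores and grouping of ops are invented for illustration and do not reflect Ravsock’s actual implementation.

# Simplified sketch of subgraph scheduling -- illustrative only, not the
# actual Ravsock algorithm. Ops are grouped into subgraphs and each
# subgraph is sent to the best-benchmarked idle client.
clients = [
    {'cid': 'client-1', 'benchmark': 0.8, 'idle': True},
    {'cid': 'client-2', 'benchmark': 1.4, 'idle': True},
    {'cid': 'client-3', 'benchmark': 0.5, 'idle': False},
]

ops = ['t(a)', 't(b)', 'add(a, b)', 't(d)', 'mul(c, d)']
subgraphs = [ops[i:i + 2] for i in range(0, len(ops), 2)]  # group ops into subgraphs

for subgraph in subgraphs:
    idle = [c for c in clients if c['idle']]
    if not idle:
        break  # a real scheduler would wait until a client frees up
    best = max(idle, key=lambda c: c['benchmark'])
    best['idle'] = False
    print(f"sending {subgraph} to {best['cid']}")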

Usage

1. Configure RDF

Make sure RDF is configured correctly and the Ravsock server is up and running. Refer to the following article for details.

2. Developer Side

The Developer has to build their Model/Algorithm by declaring a graph followed by the required ops. Ops can be created for various mathematical objects like scalars, vectors, matrices and tensors. The RavOp documentation can be found here.

The following is a simple example for using RavOp:

import ravop as R

graph = R.Graph(name='test', approach='distributed')

a = R.t([1, 2, 3])
b = R.t([5, 22, 7])
c = a + b

print('c: ', c())

graph.end()

The output of c() will be returned once the participating clients have computed their assigned Ops.

The proper way to wrap up the ops in a graph is to call graph.end() at the end of the code. This checks for any failed ops and notifies the Developer.
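One convenient pattern (our suggestion, not something RavOp requires) is to place graph.end() in a finally block, so the graph is always wrapped up and failed ops are reported even if the surrounding code raises an exception:

import ravop as R

graph = R.Graph(name='test', approach='distributed')
try:
    a = R.t([1, 2, 3])
    b = R.t([5, 22, 7])
    c = a + b
    print('c: ', c())
finally:
    # Always wrap up the graph so failed ops are checked and reported.
    graph.end()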

A slightly more complex implementation can be found in distributed_test.py, which is available in the RDF GitHub repository.

3. Client Side

As of this release, distributed computing is supported only by RavJs, Raven’s JavaScript client. The RavJs repository is automatically cloned during RDF configuration.

  • Make sure the Ravsock server is up and running.
  • In the ravjs/raven.js file, update the CID variable to a unique string before opening a new client.
  • In a new browser tab, open the following URL: http://localhost:9999/
  • Once connected, click on the Participate button. This triggers the execution of local benchmarking code on the client and returns its results to the server. The server utilises this data to optimise the scheduling algorithm.

The client will now dynamically receive groups of Ops from the server, compute them and return the results to the server.

The greater the number of available clients, the faster the overall computation will be. For testing on a local machine, multiple Clients with unique CIDs can be opened in different browser tabs to observe RDF’s scheduling algorithm in action.

Conclusion

Our distributed computing tool can now be set up and tested on your own custom algorithms and models. Raven’s GitHub repositories welcome contributions from developers, and we’ll be releasing new versions of RDF and related libraries on a regular basis.

Join our Discord server to get updates on what comes next.

Join us on Telegram.
