Imagine this: in 5–10 years your kid comes up to you and says: “Dad (Mom), yesterday I saw a TV program about a scary disease from the past called AIDS (or cancer, or influenza, you name it). Now we can cure it, and people don’t get it anymore!” And you answer: “I know, right? I took part in developing the cure.” Sounds like a fantasy? Well, it isn’t! Even now, anyone in the world can take part in a great many science projects. This is possible thanks to volunteer distributed computing.
The principle of parallel computation is one of those ideas that are virtually “in the air”. That is natural, because any work is done better in cooperation. Parallel computation appeared long before the first electronic computing device, but the idea flourished in the computer age, when tasks requiring large computational capacity emerged together with devices all over the world that could provide it. With the arrival and boom of the Internet, the idea of voluntarily connecting the PCs of regular users to organize parallel computations became more and more popular.
In 1994, David Gedye suggested performing calculations on a combined network of PCs whose users voluntarily connect to the network, constantly or periodically. Data from radio telescopes is split into thousands of independent fragments covering particular frequency ranges, and these fragments are sent to the computers of users who have agreed to help in the search for extraterrestrial intelligence.
In January 1996, the GIMPS project started, aimed at finding Mersenne primes.
On January 28, 1997, the RSA Data Security contest started. Its goal was to break the RC5 cipher with a 56-bit key by brute force, that is, by simply enumerating the keys. Thanks to good technical and organizational preparation, the project created by the non-profit community distributed.net quickly became popular.
The SETI project dates back to 1959, when Giuseppe Cocconi and Philip Morrison published their article “Searching for Interstellar Communications” in the international science journal “Nature”. In 1999, astronomers from the University of California, Berkeley demonstrated a brand new approach to the problem: they launched the SETI@home project, which relied on a vast number of computers provided by volunteers.
After that, more and more projects based on volunteer computing began to appear.
These projects ask volunteers to install a program on their computers. The program is written so that it uses resources only while the computer is idle, a principle similar to that of a screensaver. When the user resumes activity, for instance launches a video game or converts a video, the program “goes to sleep” and waits for the next idle moment. A task requiring considerable computational capacity is split into simpler parts that are sent to different people. The process is shown schematically in Fig. 1.
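To make the splitting step more concrete, here is a minimal Python sketch. It assumes the big task can be described as a range of independent items (for example, candidate keys in a brute-force search). The function name `split_into_work_units` and its parameters are hypothetical and are not taken from any real platform.

```python
from typing import Iterator, Tuple


def split_into_work_units(total_items: int, unit_size: int) -> Iterator[Tuple[int, int]]:
    """Yield half-open ranges [start, end) that together cover 0..total_items.

    Hypothetical helper: each range is one "simpler part" that could be
    sent to a different volunteer's computer.
    """
    if unit_size <= 0:
        raise ValueError("unit_size must be positive")
    for start in range(0, total_items, unit_size):
        # The last unit may be shorter than unit_size.
        yield start, min(start + unit_size, total_items)


if __name__ == "__main__":
    for unit in split_into_work_units(total_items=10, unit_size=4):
        print(unit)
```

Because the ranges do not overlap, each part can be processed without any knowledge of the others, which is what allows them to be handed to unrelated computers.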
This scheme has a number of flaws. One of them is that a client program may never contact the server again after receiving a task, for example because the participant has lost interest in the project. A more dangerous problem is that the client program may, for various reasons, send back wrong results.
The first problem is solved by setting a deadline for the client program’s response.
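A deadline check on the server side might look like the following sketch. The `Assignment` record and the `expired_assignments` function are illustrative assumptions, not the data model of any real server. The sketch only finds assignments that are overdue, so the server could hand those tasks to other clients.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Assignment:
    """Hypothetical record of one task sent to one client."""
    task_id: int
    client_id: str
    sent_at: float                 # time the task was sent, in seconds
    deadline: float                # seconds the client is allowed to take
    result: Optional[str] = None   # stays None until the client answers


def expired_assignments(assignments: List[Assignment], now: float) -> List[Assignment]:
    """Return assignments with no result whose deadline has passed."""
    return [
        a for a in assignments
        if a.result is None and now - a.sent_at > a.deadline
    ]


if __name__ == "__main__":
    pending = [
        Assignment(task_id=1, client_id="alice", sent_at=0.0, deadline=100.0),
        Assignment(task_id=2, client_id="bob", sent_at=0.0, deadline=100.0, result="ok"),
        Assignment(task_id=3, client_id="carol", sent_at=90.0, deadline=100.0),
    ]
    for a in expired_assignments(pending, now=150.0):
        print(f"task {a.task_id} should be reissued")
```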
The second problem is solved by sending the same task to several client programs. The number of duplicate performers is set separately for each task, but by default it equals five. When a client program returns its result, the server compares it with the answers received earlier. The answer that matches the most times is accepted as the final result, and the others are rejected. The minimum number of matches required is also set per task; by default it equals three.
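The voting rule described above can be sketched in a few lines of Python. The defaults of five replicas and three matches come from the text. The function name `validate` and the use of exact equality to compare answers are assumptions made for this illustration; a real project may need a looser comparison, for example for floating-point output.

```python
from collections import Counter
from typing import Hashable, Optional, Sequence

DEFAULT_REPLICAS = 5  # how many clients receive the same task (from the text)
DEFAULT_QUORUM = 3    # minimum number of matching answers (from the text)


def validate(results: Sequence[Hashable], quorum: int = DEFAULT_QUORUM) -> Optional[Hashable]:
    """Return the most frequent answer if it occurs at least `quorum` times.

    Returns None when no answer has enough matches yet, which means the
    server has to wait for more results or send the task out again.
    """
    if not results:
        return None
    value, count = Counter(results).most_common(1)[0]
    return value if count >= quorum else None


if __name__ == "__main__":
    answers = ["42", "42", "41", "42", "17"]  # results from five clients
    print(validate(answers))
```

With these sample answers, "42" matches three times and is accepted, while the two differing answers are rejected.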
Software platforms consisting of two parts, a server and a client, were developed for volunteer projects. Here are the main ones:
- Apache Hadoop is a project of the Apache Software Foundation: a set of freely distributed utilities, libraries and a framework for developing and running distributed programs on clusters of hundreds and thousands of nodes. It is used to implement the search and contextual mechanisms of many high-load websites, including Yahoo! and Facebook. It was developed around the MapReduce computational paradigm, in which an application is divided into a large number of identical elementary tasks that are executed on the cluster nodes and whose outputs are naturally combined into the final result.
- Condor is a cluster system created by a team of developers from the University of Wisconsin-Madison, where its first configuration was launched more than 10 years ago. Presently, the university has 350 desktop UNIX workstations that are connected to the Condor network and are accessible to users from all over the world.
- Globus Toolkit is a set of programs that makes creating and managing distributed computations considerably easier. It consists of a number of modules for building a virtual organization of distributed computations. Each module defines an interface that is used by higher-level components and can be implemented in various execution environments. Together, the modules form a virtual machine called Globus.
- BOINC (Berkeley Open Infrastructure for Network Computing) is a software suite for quickly organizing distributed computations, distributed under the LGPL license. It consists of two parts: the server and the client. It was initially developed for the largest volunteer computing project, SETI@home, but later the developers from the University of California, Berkeley made the platform available to third-party projects. Presently, BOINC is a universal platform for projects in mathematics, molecular biology, medicine, astrophysics and climatology. BOINC gives researchers the opportunity to use the vast computational capacity of PCs all over the world.
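
The MapReduce paradigm mentioned in the Hadoop entry above can be illustrated with a toy word count in plain Python. This is only a single-process sketch of the idea and does not use Hadoop or its API. The function names `map_phase`, `shuffle`, `reduce_phase` and `word_count` are chosen for this example.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def map_phase(chunk: str) -> List[Tuple[str, int]]:
    """Map: turn one chunk of text into (word, 1) pairs."""
    return [(word.lower(), 1) for word in chunk.split()]


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Shuffle: group all emitted values by their key."""
    grouped: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key: str, values: List[int]) -> Tuple[str, int]:
    """Reduce: combine all values of one key into a single result."""
    return key, sum(values)


def word_count(chunks: Iterable[str]) -> Dict[str, int]:
    # In a real cluster every chunk would be mapped on a different node;
    # here the chunks are simply processed one after another.
    mapped = [pair for chunk in chunks for pair in map_phase(chunk)]
    return dict(reduce_phase(k, v) for k, v in shuffle(mapped).items())


if __name__ == "__main__":
    print(word_count(["to be or not", "to be"]))
```

Each call to `map_phase` depends only on its own chunk, so the chunks play the role of the identical elementary tasks, and the reduce step is where their outputs are combined into the final result.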
We will give a more detailed review of some of the projects based on volunteer computing in our next article.