A Note on Distributed Computing

Rushal Verma
Published in Paper Readings · Mar 19, 2018

I am fascinated by distributed systems and curious to read more about them, so this post is about the paper titled “A Note on Distributed Computing”, published in 1994 by Jim Waldo and others.

The paper deals with objects in Distributed Computing and their relationship to Object Oriented Programming, and it suggests what distributed application developers should take into account before building a distributed application.

Introduction:

The paper starts by stating that current work in Distributed Computing revolves around Object Oriented Programming and objects, specifically a unified view of objects: objects are defined by the interfaces and operations they support. It also says that there are fundamental differences between the interactions of distributed objects and of non-distributed objects. Further, work on distributed object-oriented systems that is based on a model that ignores or denies these differences is doomed to failure, and could easily lead to an industry-wide rejection of the notion of distributed object-based systems.

Local Computing:

It deals with programs that are confined to a single address space (local object invocation).

Distributed Computing:

It deals with programs that can make calls to objects in different address spaces, either on the same machine or on different machines (remote object invocation).

The Vision of Unified Objects:

This vision assumes that the programmer doesn’t have to worry about whether they are calling a local or a remote object. There is a single paradigm of object use and communication no matter where the object is located; the interface takes care of these implementation details.
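As an illustration, here is a minimal Python sketch of that vision using the standard library’s xmlrpc module; the Counter class, the port, and the in-process server thread are illustrative choices of mine, not anything from the paper. The call site looks identical whether the object is local or remote:

```python
import threading
import xmlrpc.client
from xmlrpc.server import SimpleXMLRPCServer

class Counter:
    """A trivial object; increment() is the interface we care about."""
    def __init__(self) -> None:
        self.value = 0
    def increment(self) -> int:
        self.value += 1
        return self.value

# Stand up a server in a background thread to play the "remote"
# address space (in reality: another process or another machine).
server = SimpleXMLRPCServer(("localhost", 8000), logRequests=False)
server.register_instance(Counter())
threading.Thread(target=server.serve_forever, daemon=True).start()

local = Counter()
remote = xmlrpc.client.ServerProxy("http://localhost:8000")

# The unified vision: one syntax, regardless of location.
print(local.increment())   # in-process method call
print(remote.increment())  # a full network round trip behind the dot
```

The rest of the paper argues that this syntactic sameness hides differences that matter.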

This vision is centered around the following principles that may, at first, appear plausible:

1. there is a single natural object-oriented design for a given application, regardless of the context in which that application will be deployed;

2. failure and performance issues are tied to the implementation of the components of an application, and consideration of these issues should be left out of an initial design; and

3. the interface of an object is independent of the context in which that object is used.

But the paper says these principles are false, and tries to show why it is important to understand the implementation details as well as the differences between local and Distributed Computing.

Déjà Vu:

The hard problems in Distributed Computing are not the problems of how to get things on and off the wire.

It says that the percentage of applications that are distributed has not grown over the years, and that even with the vision of unified objects it is not growing, because of the problems we face in developing distributed applications.

Problems in Local and Distributed Computing:

The paper goes on to describe the toughest challenges of building a distributed system:

  1. Latency;
  2. Memory Access; and
  3. Partial Failure and Concurrency.

Latency

The most obvious difference is the gap between the time it takes to invoke an object locally and remotely, which can be four to five orders of magnitude. Ignoring this can lead to performance issues (the sketch after the list below makes the gap concrete). If we have to unify the objects, the paper suggests two ways:

  • Rely on the hardware to get faster with time in order to eliminate the difference in efficiency; and
  • Develop tools which allow us to visualize communication patterns between different objects and move them around as required. Since location is an implementation detail, this shouldn’t be too hard to achieve.
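To make the gap concrete, here is a small self-contained sketch (my own, not from the paper) that times a plain in-process call against a round trip over a socket pair. Even this stand-in for a network, which never leaves the machine, is markedly slower; a real network hop pushes the difference toward the orders of magnitude the paper describes:

```python
import socket
import time

def local_echo(payload: bytes) -> bytes:
    # A plain in-process call: no marshalling, no kernel crossing.
    return payload

# An in-process socket pair stands in for the "remote" hop; a real
# network round trip would be slower still by orders of magnitude.
client, server = socket.socketpair()

payload = b"ping"
N = 10_000

start = time.perf_counter()
for _ in range(N):
    local_echo(payload)
local_us = (time.perf_counter() - start) / N * 1e6

start = time.perf_counter()
for _ in range(N):
    client.sendall(payload)       # "remote" invocation: into the kernel
    server.recv(len(payload))     # the far side reads the request
    server.sendall(payload)       # ...and sends a reply
    client.recv(len(payload))     # the caller blocks on the response
remote_us = (time.perf_counter() - start) / N * 1e6

client.close()
server.close()
print(f"local call:        ~{local_us:.2f} us per call")
print(f"socket round trip: ~{remote_us:.2f} us per call")
```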

Memory Access

Another concerning difference is memory access: a pointer that is valid in the local address space is meaningless in a remote address space. There are only two choices for this:

  • the underlying interface handles all memory access; or
  • the developer must be made aware of the differences between the access patterns.

For unification, we need to rely on the system to handle memory access. We can achieve this by:

  • distributed shared memory; and/or
  • using the object-oriented paradigm, dealing only with object references. The transfer of data between address spaces is then handled by marshalling and unmarshalling the data in the layer underneath. This approach, however, makes the use of address-space-relative pointers obsolete.
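Here is a minimal sketch of the marshalling idea, using Python’s pickle as the layer underneath (the Account class is hypothetical, not from the paper). The object’s state can be flattened into bytes and rebuilt in another address space, but its address-space-relative identity cannot make the trip:

```python
import pickle

class Account:
    """A hypothetical object whose state we want to move."""
    def __init__(self, owner: str, balance: int) -> None:
        self.owner = owner
        self.balance = balance

acct = Account("ada", 100)

# Within one address space, the object's identity (its address) is
# meaningful; this is exactly what cannot cross to another machine.
print(hex(id(acct)))

wire_bytes = pickle.dumps(acct)   # marshal: state -> bytes
copy = pickle.loads(wire_bytes)   # unmarshal in the "remote" space

print(copy.owner, copy.balance)   # the state survived the trip
print(copy is acct)               # False: the identity did not
```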

The danger lies in promoting the myth that “remote access and local access are exactly the same” while being unable to enforce it.

The programmer is therefore quite likely to make the mistake of using a pointer in the wrong context, producing incorrect results. “Remote is just like local,” such programmers think, “so we have just one unified programming model.”

The programmer therefore needs to be made aware that access to a remote address space is very different from access to the local address space.

Partial Failure and Concurrency

Partial failure is a central reality of distributed computing. Both the local and the distributed world contain components that are subject to periodic failure.

The paper says that in local computing such failures are either total, affecting all of the entities in the application, or detectable by a central resource allocator (such as the OS on the local machine). In Distributed Computing, by contrast, one component (a machine or a network link) can fail while the other components keep working; the components are independent. There is no common agent that is able to track everything in the system.

The interface should deal with this partial failure and the inconsistencies it causes. It must be able to state, whenever possible, the cause of a failure, and there must be interfaces that allow the reconstruction of a reasonable state when a failure occurs and the cause cannot be determined (a sketch of such an interface follows).
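As an illustration, here is a sketch of what such an interface might look like in Python; the DEBIT wire protocol, the port, and the exception type are hypothetical, not from the paper. The point is that a remote call can fail in ways no local call can, and the interface surfaces both the cause (when it is known) and the residual uncertainty:

```python
import socket

class RemoteUnavailable(Exception):
    """The remote call failed. The operation may or may not have been
    applied on the server; the caller cannot tell which."""

def remote_debit(host: str, amount: int, timeout: float = 1.0) -> None:
    # Hypothetical wire protocol: send the request, wait for an ACK.
    try:
        with socket.create_connection((host, 9000), timeout=timeout) as s:
            s.sendall(f"DEBIT {amount}\n".encode())
            if s.recv(3) != b"ACK":
                raise RemoteUnavailable("server rejected the request")
    except OSError as err:
        # A timeout or a reset says nothing about *when* we failed:
        # before the debit was applied, or after (with the ACK lost).
        raise RemoteUnavailable(f"no response from {host}: {err}") from err
```

The caller is then forced to decide what a reasonable recovery looks like (retry, reconcile with the server, or surface the error), a burden that a purely local interface never imposes.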

The question is not “can you make remote method invocation look like local method invocation?” but rather “what is the price of making remote method invocation identical to local method invocation?”

There are two ways to deal with Partial Failure and Concurrency:

  1. treat all interfaces and objects as local. The problem with this approach is that it doesn’t take into account the failure modes of distributed systems, so the behaviour of the application under partial failure is indeterminate.
  2. treat all interfaces and objects as remote. The flaw with this approach is that it over-complicates local computing: it adds a ton of work for objects that are never accessed remotely.

A better approach is to accept that there are irreconcilable differences between local and Distributed Computing, and to be conscious of those differences at all stages of the design and implementation of distributed applications.
