Decentralized Computing Language: Features and Goals

John Lakness
Published in Decentralize.Today
Apr 8, 2018
Credit: https://www.youtube.com/watch?v=lFgnrd3G81U

This is part of a series of posts introducing DCL. The first post on April 1st, Decentralizing The Flat Earth Society, introduces some of the motivating high-level problems which this paradigm aims to solve simply and generally. This post goes into more detail about the specific features and goals from the programmer’s perspective.

Preface

This document is the result of ideas that have been in my head since I started writing computer programs 25 years ago, mostly with no understanding or awareness of the computer science discipline whatsoever. Most of the concepts stem from frustration with common, useful computing tasks that are difficult, inefficient, failure-prone, or insecure. The similarly frustrated should hopefully feel some relief, and the more optimistic will hopefully also recognize great opportunities to do things that they had not previously considered. I wish to present these things in such a way as to satisfy both. Of course there are trade-offs to some features of this paradigm, and here too I have attempted to select trade-offs that will become more advantageous as both computing capabilities and scale advance. I must also give general credit to the several existing paradigms that have already demonstrated each component of this separately, but there are too many to list. My only intention is to unify them coherently. Note that I have intentionally avoided semantics specific to particular implementations of other languages, which may be too broad to apply to this paradigm.

Features and Goals

No Repeated Computation

Consider the case where an algorithm is run on the same input twice. There are two ways to handle this situation: either the execution engine re-runs the algorithm each time, or it stores the result and recalls it. The selection of the most efficient path should be transparent to the programmer and applicable at any level of locality: not just within a loop or recursion, as is common in dynamic programming, but across programs, across time, and even remotely. This should happen without any compiler directive or shared knowledge between calls.
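As a rough illustration of the idea (not DCL itself), here is a minimal Python sketch that recalls a stored result whenever the same code is run on the same input. The names `content_key`, `run`, and `_result_cache` are hypothetical, and hashing a function's bytecode is only a stand-in for however a real execution engine would identify "the same algorithm".

```python
import hashlib

# Hypothetical result store; a real engine could keep this on disk or remotely.
_result_cache = {}

def content_key(func, args):
    """Derive a key from what the function does and what it is given.

    Assumption: bytecode + constants + referenced names is a good enough
    identity for a simple function. It is not a complete identity in general.
    """
    code = func.__code__
    h = hashlib.sha256()
    h.update(code.co_code)
    h.update(repr(code.co_consts).encode())
    h.update(repr(code.co_names).encode())
    h.update(repr(args).encode())
    return h.hexdigest()

def run(func, *args):
    """Run func(*args) unless an identical computation was already stored."""
    key = content_key(func, args)
    if key not in _result_cache:
        _result_cache[key] = func(*args)
    return _result_cache[key]

calls = []  # records every real execution, for demonstration only

def square(x):
    calls.append(x)
    return x * x

first = run(square, 12)   # computed
second = run(square, 12)  # recalled from the cache, square is not re-run
```

The caller never asks for caching and the two calls share no knowledge of each other; the key alone decides whether the work is repeated.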

No Repeated Storage

Except for intentional redundancy, the same data or code should not be stored twice at the same level in the cache hierarchy. Even slightly modified versions of data or code should not be repeated if it is more efficient in terms of resources to store the original and apply modifications lazily.
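One familiar way to approximate this is content-addressed storage, where data is filed under a hash of its own bytes. The sketch below is an assumption about how such a store might look, not a description of DCL; `ContentStore` is a hypothetical name, and it covers only the exact-duplicate case, not lazily applied modifications.

```python
import hashlib

class ContentStore:
    """Toy store that keeps each distinct blob exactly once."""

    def __init__(self):
        self._blobs = {}

    def put(self, data: bytes) -> str:
        # The address is derived from the content, so identical data
        # always lands on the same key and is never stored twice.
        digest = hashlib.sha256(data).hexdigest()
        self._blobs.setdefault(digest, data)
        return digest

    def get(self, digest: str) -> bytes:
        return self._blobs[digest]

    def __len__(self):
        return len(self._blobs)

store = ContentStore()
a = store.put(b"hello world")
b = store.put(b"hello world")   # duplicate, no new storage used
c = store.put(b"hello there")   # different content, new entry
```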

Efficiency

Unused code or data should not be loaded. Unused results should not be computed. This should apply at every level of locality. If a program is large, but only sections of it are used by a particular device, only that code should be downloaded and run. The same goes for remote data. There should be no special handling for this to occur.
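Lazy evaluation is the closest everyday analogue. In this hedged sketch, a program is split into sections that are only fetched when something actually asks for them; `Lazy`, `load`, and the section names are made up for illustration, and `load` stands in for a download or disk read.

```python
class Lazy:
    """Wrap a computation so it runs at most once, and only on demand."""

    def __init__(self, compute):
        self._compute = compute
        self._done = False
        self._value = None

    def get(self):
        if not self._done:
            self._value = self._compute()
            self._done = True
        return self._value

loaded = []  # records which sections were really fetched

def load(name):
    # Stand-in for downloading a section of a large program.
    loaded.append(name)
    return f"<contents of {name}>"

# Declaring the sections costs nothing; none of them is loaded yet.
sections = {
    name: Lazy(lambda name=name: load(name))
    for name in ("ui", "physics", "audio")
}

result = sections["ui"].get()  # only "ui" is fetched
```

Here the wrapping is explicit. The goal stated above is for the execution engine to do this without the programmer writing anything like `Lazy` at all.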

Libraries

A program should not have separate dependencies that must be installed separately. It should just run. If a programmer includes the use of a complicated program within the program, it should be automatically loaded by the execution engine wherever it is run. And still, the code should not be repeated or installed prior to use. It should be found and loaded from the most efficient location as needed.

Versioning

Software versioning should not impede computation. It should not be possible to perform a computation that gives a different result based on which version of code was loaded from a dependency. Similarly, if a particular piece of code exists on a machine, regardless of its version, it should run as expected without error. Note: Software versioning causes the number of libraries loaded to grow exponentially with the number of libraries used. It is a ticking time bomb for JavaScript dependency management in web page loads.
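One possible reading of this goal is that a caller refers to a dependency by the hash of its code rather than by a name and version number, so the result cannot change depending on what happens to be installed. The sketch below assumes that reading; `registry`, `publish`, and `call` are hypothetical, and `exec` is used only to keep the toy example short.

```python
import hashlib

# Hypothetical registry mapping a content hash to source code.
registry = {}

def publish(source: str) -> str:
    """Store code under the hash of its text and return that hash."""
    digest = hashlib.sha256(source.encode()).hexdigest()
    registry[digest] = source
    return digest

def call(digest: str, func_name: str, *args):
    """Run a function from exactly the code identified by digest."""
    namespace = {}
    exec(registry[digest], namespace)
    return namespace[func_name](*args)

# Two "versions" coexist without conflict because each has its own address.
v1 = publish("def area(r):\n    return 3.14 * r * r\n")
v2 = publish("def area(r):\n    return 3.14159 * r * r\n")

old_result = call(v1, "area", 1)
new_result = call(v2, "area", 1)
```

A caller that pinned `v1` keeps getting the `v1` answer no matter how many newer revisions are published later.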

Parallelization

The execution engine should parallelize automatically based on knowledge of common state modifications. Any code that is not modifying the same element of state should be parallelized automatically. If two processes modify the same state, the execution engine should maintain a convergent state and kill off divergent branches automatically. In this way atomicity is guaranteed without lockup.
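Optimistic concurrency is one existing technique with this flavor: each branch works from a snapshot, and a commit succeeds only if nobody else changed the state in the meantime. The following is a single-threaded sketch of that idea under my own assumptions, with the hypothetical name `VersionedCell`; a real engine would need the check-and-write step to be atomic.

```python
class VersionedCell:
    """A piece of state that accepts a write only from an up-to-date reader."""

    def __init__(self, value):
        self.value = value
        self.version = 0

    def read(self):
        return self.value, self.version

    def try_commit(self, new_value, based_on_version):
        # A branch that computed from a stale snapshot has diverged,
        # so its result is discarded instead of blocking anyone.
        if based_on_version != self.version:
            return False
        self.value = new_value
        self.version += 1
        return True

cell = VersionedCell(10)

# Two branches read the same snapshot "at the same time".
val_a, ver_a = cell.read()
val_b, ver_b = cell.read()

a_ok = cell.try_commit(val_a + 1, ver_a)  # succeeds, state is now 11
b_ok = cell.try_commit(val_b * 2, ver_b)  # rejected, snapshot is stale

# The discarded branch simply re-runs against the converged state.
val_b, ver_b = cell.read()
b_retry_ok = cell.try_commit(val_b * 2, ver_b)
```

No lock is ever held, so neither branch can stall the other; the cost is that divergent work is thrown away and redone.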

Scale

The same principles of efficient computation should be applied across scales without any special handling. Local processes should not be handled differently than remote threads, and remote data should not be handled differently than memory-resident. The execution engine should manage locality for efficiency, and it should not be a concern to the programmer whether data is in CPU cache or a remote location on the internet. It should also not matter where the actual code is executed on the data, so long as the result is returned to the requesting thread.

Security

Data requested should always match expectation. Data modification should not affect this. Logic exploits aside, most security vulnerabilities are the result of mutable state across execution boundaries, for instance corruption of the stack or writing to another process’s memory.

Privacy

Data should have authorizations applied automatically in the request process, not by local user permissions, which are exploitable by permissioned execution, but by a combination of execution path and access token. This allows any program to access an authorized execution path on a data source without being given access to the data itself or to an arbitrary set of permissions. If arbitrary code execution is requested, the data should have policy applied such that execution happens inside a protected boundary and results are specifically authorized, operation by operation, to be returned across the boundary.
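To make the "execution path plus access token" idea concrete, here is a small sketch under my own assumptions. The data owner exposes one authorized computation (`mean_salary`) and issues a token bound to that path; the requester gets the aggregate but never the records. `SECRET`, `RECORDS`, `AUTHORIZED_PATHS`, `issue_token`, and `request` are all hypothetical names, and an HMAC is just one plausible way to bind a token to a path.

```python
import hashlib
import hmac

SECRET = b"data-owner-secret"  # hypothetical key held only by the data owner

# Private data that must never cross the boundary directly.
RECORDS = [{"name": "a", "salary": 50}, {"name": "b", "salary": 70}]

def mean_salary(records):
    return sum(r["salary"] for r in records) / len(records)

# The only execution paths the owner allows to run on the data.
AUTHORIZED_PATHS = {"mean_salary": mean_salary}

def issue_token(path_name: str) -> str:
    """Owner side: create a token valid for one execution path only."""
    return hmac.new(SECRET, path_name.encode(), hashlib.sha256).hexdigest()

def request(path_name: str, token: str):
    """Run an authorized path inside the boundary and return only its result."""
    expected = issue_token(path_name)
    if path_name not in AUTHORIZED_PATHS or not hmac.compare_digest(expected, token):
        raise PermissionError(f"path {path_name!r} is not authorized for this token")
    return AUTHORIZED_PATHS[path_name](RECORDS)

token = issue_token("mean_salary")
average = request("mean_salary", token)  # the aggregate crosses the boundary

# The same token does not open any other path, such as a raw dump.
try:
    request("raw_dump", token)
    denied = False
except PermissionError:
    denied = True
```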

Licensing

Some elements of software licenses should be built into the compiled code, and the execution engine should enforce or at least notice conformance. For instance, when calling a library with copyleft provisions, it can provide a certificate of the execution path that the results of the function are used within and send the certificate to the library owner. This could be an explicit provision of getting results from a library through remote execution. If there is an access key required, it should be built into the call rather than authorized by some parallel mechanism.

Performance

This is a hopeful consideration, but this paradigm should enhance high-level system performance despite a huge sacrifice in low-level task performance. The performance objective is ease of optimizing for scale. Low-level task performance will suffer, essentially because the execution engine itself will doubtless be massively complex compared to most existing scheduling and memory management overhead.

Compiling

The written code should be a near-exact representation of the compiled code. The compiler should be a one-to-one translation without optimization. Optimization should be performed directly within the program design tool and fully visible to the programmer, or by the execution engine at runtime.

Development

Because it is not a lexical language, nor an abstract compiled language, it should be simple to view the data structures and algorithms as graphs, and edit them visually. I’m calling here for the end of lexical programming languages. The idea that we can program a computer by talking to it with human words in a kind of intermediate language is a poor abstraction of reality. A computer program is nothing but a bunch of bits which are modified in some way by logical instructions. There is no such thing as a variable or object or applicative functor or any other lexical abstraction we’ve used to try to tighten the relationship between language and hardware. All of these concepts, to the extent they are useful, should be reflected as a natural part of the development experience without pedantry.

Cooperation

Bear with me; this is the most outlandish aspiration yet, and perhaps far more complex than the core language, but maybe you’ll see into my world a bit. All computers should be cooperating with all other computers. To do this, they need to be able to trust the results they are getting, and they need an economic incentive to cooperate. The execution engine should be able to find the most efficient way to get the correct result through cooperation and trust amongst all other computers on the internet. This should be automatic, default, and transparent to the programmer.
