Introducing Red Sift, a serverless, full stack, polyglot compute platform

Michael Karliner
Published in Red Sift Outbox
May 10, 2017

Summary

RedSift is a serverless, full stack programming platform for machine learning, IoT and advanced analytics. Applications can be written in a mix of languages, and security is provided by per-user data encryption keys and containerized code modules. While code runs in the cloud, development is done locally, and deployment is simply a matter of committing code to GitHub.

Data Flow Programming

RedSift provides a data flow style programming model. Code resides in containerized nodes, which run independently of each other and communicate through key/value data stores. Each node has exactly one input and multiple outputs; data changing on the input of a node triggers its execution. Nodes are stateless and have no side effects within the system, although they are allowed to contact external systems for data retrieval and asynchronous processing.
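To make the trigger model concrete, here is a toy, in-memory sketch in Python. It is not the RedSift SDK: the `ToyFlow` class, the store names and the node functions are all hypothetical, and real nodes run in separate containers rather than in one process.

```python
from typing import Any, Callable, Dict, List, Tuple

# A node receives (key, value) from its input store and returns a list of
# (output_store, key, value) writes. All names here are illustrative only.
Write = Tuple[str, str, Any]
NodeFn = Callable[[str, Any], List[Write]]


class ToyFlow:
    """In-memory stand-in for key/value stores connected by stateless nodes."""

    def __init__(self) -> None:
        self.stores: Dict[str, Dict[str, Any]] = {}
        self.triggers: Dict[str, List[NodeFn]] = {}

    def add_node(self, input_store: str, fn: NodeFn) -> None:
        # Each node is attached to exactly one triggering input store.
        self.triggers.setdefault(input_store, []).append(fn)

    def put(self, store: str, key: str, value: Any) -> None:
        # A change in a store triggers every node that listens on it,
        # and each node's writes may in turn trigger downstream nodes.
        self.stores.setdefault(store, {})[key] = value
        for fn in self.triggers.get(store, []):
            for out_store, out_key, out_value in fn(key, value):
                self.put(out_store, out_key, out_value)


def tokenize(key: str, text: Any) -> List[Write]:
    # Stateless: the output depends only on the incoming key and value.
    return [("words", key, str(text).lower().split())]


def count(key: str, words: Any) -> List[Write]:
    # One input, two outputs.
    first = words[0] if words else None
    return [("counts", key, len(words)), ("first_word", key, first)]


flow = ToyFlow()
flow.add_node("messages", tokenize)
flow.add_node("words", count)
flow.put("messages", "msg-1", "Hello data flow world")
print(flow.stores["counts"]["msg-1"])
```

Writing one message to the hypothetical `messages` store is enough to run both nodes in sequence, because each write is itself a change on the next node's input.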

The overall processing graph is defined in a JSON file with a simple syntax, sift.json. The SDK provides a visualisation tool to aid in checking the graph.
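As a rough illustration of what checking such a graph involves, the sketch below validates a small JSON graph description in Python. The layout used here (`stores`, `nodes`, `input`, `outputs`) is a made-up assumption for the example and is not the actual sift.json schema; consult the RedSift documentation for the real syntax.

```python
import json
from typing import Any, Dict, List

# Hypothetical graph description. This is NOT the sift.json schema.
GRAPH_JSON = """
{
  "stores": ["messages", "words", "counts"],
  "nodes": [
    {"name": "tokenize", "input": "messages", "outputs": ["words"]},
    {"name": "count", "input": "words", "outputs": ["counts"]}
  ]
}
"""


def check_graph(graph: Dict[str, Any]) -> List[str]:
    """Return a list of problems: references to stores that were never declared."""
    stores = set(graph["stores"])
    problems: List[str] = []
    for node in graph["nodes"]:
        if node["input"] not in stores:
            problems.append(f"{node['name']}: unknown input store {node['input']}")
        for out in node["outputs"]:
            if out not in stores:
                problems.append(f"{node['name']}: unknown output store {out}")
    return problems


graph = json.loads(GRAPH_JSON)
problems = check_graph(graph)
print(problems or "graph looks consistent")
```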

Node to node communication

While a node has only one input, and only one store can trigger its execution, that input may be made up of data from more than one store. A variety of mechanisms, analogous to SQL selects and joins, allow for map/reduce, aggregation, queues and other processing models. Stores can also hold multiple keys per entry and support wildcard selections, which further extends the available processing models.
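The select-and-join idea can be sketched with plain Python dictionaries standing in for stores. The key layout (`month/user`), the helper names `select` and `join`, and the sample data are assumptions made for this example; they do not reflect RedSift's actual selection syntax.

```python
from fnmatch import fnmatchcase
from typing import Any, Dict, List, Tuple


def select(store: Dict[str, Any], pattern: str) -> List[Tuple[str, Any]]:
    """Return the (key, value) pairs whose key matches a wildcard pattern."""
    matches = [(k, v) for k, v in store.items() if fnmatchcase(k, pattern)]
    return sorted(matches, key=lambda kv: kv[0])


def join(left: List[Tuple[str, Any]],
         right: List[Tuple[str, Any]]) -> List[Tuple[str, Any, Any]]:
    """Inner join of two selections on their keys."""
    right_by_key = dict(right)
    return [(k, lv, right_by_key[k]) for k, lv in left if k in right_by_key]


# Two hypothetical stores sharing a "month/user" key layout.
orders = {"2017-05/alice": 3, "2017-05/bob": 1, "2017-04/alice": 7}
emails = {"2017-05/alice": "alice@example.com",
          "2017-04/alice": "alice@example.com"}

may_orders = select(orders, "2017-05/*")            # wildcard selection
joined = join(may_orders, select(emails, "2017-05/*"))
total = sum(value for _, value in may_orders)        # simple aggregation
print(joined, total)
```

Here a node's single input would be the joined view, even though the data comes from two stores.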

RedSift programming model (not to scale)

External Data Sources

There are currently three different mechanisms for feeding data into RedSift: webhooks, email feeds and Slack messaging. Support for the Microsoft Bot Framework, which supports a wide variety of messaging systems, will be released shortly.

External processing

It is sometimes necessary to interrupt a data flow to retrieve data from an external source, such as a web feed, or to use specialized processing systems, like GPU-based neural networks. In this case nodes may return JavaScript Promises (or the equivalent in other languages) instead of directly outputting to the key/value store, to allow for such asynchronous operations. Once the Promise is resolved by the external request completing, RedSift updates the relevant key/value store, thus triggering the next node.
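In Python, the closest counterpart to a Promise is an awaitable. The sketch below shows the general pattern under that assumption: a node hands back an awaitable, and a stand-in runtime writes to the store only once it resolves. `fetch_score`, `node` and `run_node` are hypothetical names, and the sleep simulates the external call.

```python
import asyncio
import inspect
from typing import Any, Dict


async def fetch_score(text: str) -> float:
    # Stand-in for a slow external call (web feed, GPU service, ...).
    await asyncio.sleep(0.01)
    return len(text) / 10.0


def node(key: str, value: str) -> Any:
    # Instead of a finished value, the node returns an awaitable.
    return fetch_score(value)


async def run_node(store: Dict[str, Any], key: str, value: str) -> None:
    result = node(key, value)
    if inspect.isawaitable(result):
        # Wait for the external request to complete before storing.
        result = await result
    # The write happens only after resolution; in the real platform this
    # store update is what would trigger the next node.
    store[key] = result


scores: Dict[str, Any] = {}
asyncio.run(run_node(scores, "msg-1", "hello world"))
print(scores)
```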

Polyglot processing

RedSift nodes can be implemented in several different languages: JavaScript, Python, Java, Clojure and Julia are currently supported. As the execution environments are Docker containers, supporting other languages and code environments (e.g. Python SciPy) is as simple as creating a new Docker image that conforms to the RedSift conventions.

Web Client Support

RedSift provides a mechanism to directly support real-time, single-page web clients. All RedSift applications allow for the definition of browser code that is integrated with the overall data flow model. A special kind of data store, called an export, dynamically mirrors the data into the browser and stores it there in an HTML5 IndexedDB database. The RedSift client library provides a simple publish/subscribe mechanism that can drive modern client-side frameworks. At this point React and Vue.js clients have been written, but RedSift is fundamentally framework agnostic.
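The mirror-and-notify pattern behind exports can be illustrated with a small Python sketch. The real client library runs in the browser and persists to IndexedDB; here a dictionary stands in for that local copy, and `ExportMirror`, `subscribe` and `on_server_update` are hypothetical names rather than the library's API.

```python
from typing import Any, Callable, Dict, List, Tuple

Listener = Callable[[str, Any], None]


class ExportMirror:
    """Keeps a local copy of exported data and notifies subscribers on change."""

    def __init__(self) -> None:
        self.local: Dict[str, Any] = {}      # stand-in for the IndexedDB copy
        self.listeners: List[Listener] = []

    def subscribe(self, listener: Listener) -> Callable[[], None]:
        # Returns a function that removes the subscription again.
        self.listeners.append(listener)
        return lambda: self.listeners.remove(listener)

    def on_server_update(self, key: str, value: Any) -> None:
        # Mirror the change locally, then publish it to every subscriber.
        self.local[key] = value
        for listener in list(self.listeners):
            listener(key, value)


seen: List[Tuple[str, Any]] = []
mirror = ExportMirror()
unsubscribe = mirror.subscribe(lambda k, v: seen.append((k, v)))
mirror.on_server_update("count", 1)   # delivered to the subscriber
unsubscribe()
mirror.on_server_update("count", 2)   # mirrored, but no longer delivered
print(seen, mirror.local)
```

Because the view layer only sees a subscription callback, the same mirror could feed a React component, a Vue.js component or any other framework.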

Deployment and Publishing

RedSift applications are deployed by committing to a GitHub repository. RedSift’s web dashboard detects the commit and allows end users to update to the new version at will. Developers may opt to publish their applications, which allows all users to browse them on our app store.

Scaling

RedSift operates as a serverless cloud infrastructure service. As nodes are constrained to be stateless and run in containers, and the data is kept in a distributed key/value data store, RedSift provides inherently good scaling for your application, with no need to worry about VM sizes or service levels.

Security

Each instance of a RedSift application has its data encrypted with a separate key, so even applications belonging to the same user are sandboxed from each other. Additionally, all node code, even within a single application, runs in separate Docker containers, increasing the isolation further.

Pricing

Developer registration and the creation and hosting of applications are all free. Currently, we are also providing free execution of applications, but when charging does start, it will be on a pay-per-use basis, charged to the end user of an application rather than its author.

Introducing RedSift

RedSift is available for use now. Go to https://docs.redsift.io to find out more.
