Introducing Red Sift, a serverless, full stack, polyglot compute platform
Summary
RedSift is a serverless, full stack programming platform for machine learning, IoT and advanced analytics. Applications can be written in a mix of languages, and security is provided by per user data encryption keys and containerized code modules. While code is run in the cloud, development is done locally and deployment is simply by commiting code to Github.
Data Flow Programming
RedSift provides a data flow style programming model, code resides in containerized nodes which run independently of each other and communicate through key/value data stores. Each node has exactly one input and multiple outputs; data changing on the input of a node triggers it’s execution. Nodes are stateless and have no side effects within the system, although they are allowed to contact external systems for data retrieval and asynchronous processing.
The definition of the overall processing graph is defined in a json file with a simple syntax, sift.json, the SDK provides a visualisation tool to aid in checking the graph.
Node to node communication
While nodes only have one input, and only one store can trigger a node’s execution, a node’s input may be made up of data from more than one store, and there are a variety of mechanism’s analogous to SQL selects and joins that allow for a rich variety of map/reduce, aggregation, queues and other processing models. Stores also have the ability to have multiple keys per entry, and support wildcard selections to further enhance the processing models.
External Data Sources
There are currently three different mechanism’s for feeding data into RedSift, webhooks, email feeds and Slack messaging. Support for the Microsoft Bot Framework, which supports a wide variety of messaging systems, will be released shortly.
External processing
It is sometime necessary to interrupt a data flow to recover data from an external source, such as a web feed, or use specialized processing systems, like GPU based neural networks. In this case nodes may return Javascript Promises (or the equivalent in other languages) instead of directly outputting to the key/value store, to allow for such asynchronous operations. Once the Promise is resolved by the external request completing, RedSift updates the relevant key/value store, thus triggering the next node.
Polyglot processing
RedSift nodes can be implemented in several different languages, Javascript, Python, Java, Clojure or Julia are currently supported. As the execution environments are Docker containers, support for other languages and code environments (eg Python SciPy), is as simple as creating a new Docker image that conforms to the RedSift conventions.
Web Client Support
RedSift directly provides a mechanism to directly support real time, single page web clients. All RedSift applications allow for the definition of browser code that is integrated with the overall dataflow model. A special kind of data store, called an export, dynamically mirrors the data into the browser and stores it there in an HTML5 IndexedDB database. RedSift client library code provides simple publish/subscribe mechanism to drive the most modern client side frameworks. At this point React and Vue.js clients have been written, but RedSift is fundamentally framework agnostic.
Deployment and Publishing
RedSift applications are deployed by commiting to a Github repository. RedSift’s web dashboard will detect the commit and allows end users to update to the new version at will. Developers may opt to publish their applications, which will allow all users to browse them on our app store.
Scaling
RedSift operates as a serverless cloud infrastructure service. As nodes are constrained to be stateless and run in containers, and the data is kept in a distributed key/value data store, RedSift provides inherently good scaling for your application, with no need to worry about VM sizes or service levels.
Security
Each instance of a RedSift application has its data encrypted with a separate key, thus sandboxing even a user’s applications from each other. Additionally, all node code, even within in an application runs in separate Docker containers, increasing the isolation further.
Pricing
Developer registration, creation and hosting of applications are all free. Currently, we are also providing free execution of applications, but when charging does start, it will be on a pay per use basis, charged to the end user of an application rather than its author.Introducing RedSift
RedSift is available for use now. Go to https://docs.redsift.io to find out more.