Serverless and Recurrent Neural Networks with Fn, GraphPipe and TensorFlow

Ralf Mueller
Sep 17, 2018 · 12 min read
Capturing Time (Image © Ralf Mueller)

My previous article, First steps in serverless, marked the start of my journey into serverless computing. My first proof of concept in this area was quite promising, so I have decided to continue on this path and do a couple more experiments. I have a set of use cases in mind where serverless architectures might be beneficial for certain integration scenarios involving systems, people and developers.


  • The function will be invoked with a CloudEvents-conforming event. The vigilant reader might notice that I was already using CloudEvents in my previous example. This is not by accident: I'm envisioning an architecture based on standards, and CloudEvents seems a natural choice here for multiple reasons; it is part of the Cloud Native Computing Foundation (although in Sandbox status at the time of this writing), it's a simple but extensible data format, and so on.
  • The function will extract the data portion of the CloudEvent and call into a Machine Learning model for scoring.
  • The function will create a CloudEvent-based response with the result of scoring against the Machine Learning model.
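As a sketch of these three steps (with a hypothetical event type and a stubbed-out scoring call standing in for the GraphPipe invocation):

```python
import json

def handle(event_json, score):
    """Sketch of the three steps: extract the CloudEvent data,
    score it, and wrap the result in a new CloudEvent."""
    event = json.loads(event_json)
    series = event["data"]                    # 1. extract the data portion
    forecast = score(series)                  # 2. score against the ML model
    return json.dumps({                       # 3. build a CloudEvent response
        "cloudEventsVersion": "0.1",
        "eventType": "com.example.forecast",  # hypothetical event type
        "eventID": event.get("eventID", "0"),
        "source": "gpfn",
        "contentType": "application/json",
        "data": forecast,
    })

# stubbed scorer; the real function calls the GraphPipe server instead
result = handle(json.dumps({"eventID": "42", "data": [1.0, 2.0, 3.0]}),
                score=lambda xs: [x * 2 for x in xs])
```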

As with my previous article, this is a very simple and contained use case. However, it should give some ideas of what can be done in a larger context. Also, since I'm still a newbie in both the Go programming language and serverless, I'd like to keep the examples as small and simple as possible for the moment.

For the use case itself I picked something in the area of time series analysis, more precisely forecasting time series data. A time series is a series of data points indexed in time order; time series analysis comprises methods for extracting meaningful information from such data. Time series are quite powerful and well known; there is a vast body of literature and open-source tooling for dealing with time series data and analysis. Time series are used in a variety of domains, including:

  • Analysis of multi-sensor networks (aircraft, nuclear power plants, manufacturing systems, etc.).
  • Forecasting of financial data: Stock, Mortgage, Utility price, etc.
  • Analysis and prediction of complex IT systems. One prominent example here is Prometheus, which internally uses a time series database to store metrics reported by software systems. Prometheus is also part of the Cloud Native Computing Foundation. It offers only very basic time series analysis, though, mostly range queries and some form of linear regression for forecasting.

Until recently, the methods used for time series analysis and forecasting were purely statistical, such as autoregressive models (e.g. ARIMA) and exponential smoothing.

With the availability of modern machine learning frameworks like TensorFlow, Keras and others, it has almost become standard practice to predict time series data using Recurrent Neural Networks (RNNs) or Long Short-Term Memory networks (LSTMs). I'm not going to go too deep here into which technique should be used for which use case. For this article I just wanted to pick an example with a broad range of use cases that isn't too complex, but also isn't a "Hello Neural Network" kind of thing.

Machine Learning Environment

  • Anaconda with Python 3.x, a rich set of Python machine learning libraries, and Jupyter notebook. I'm not going into the details of how to install and configure Anaconda; there is already plenty of material on this subject, and Anaconda can be downloaded from the Anaconda website.
  • Jupyter Notebook. I'm a big fan of notebooks, and for this example I'm going to use Jupyter notebook, which is quite popular in the Python world. In a future article involving machine learning, graphs or databases I will switch to Oracle Data Labs Studio, though, since it offers a much richer notebook experience. Make sure to watch the Interactive Data Analytics and Visualization with Collaborative Documents video that my Oracle Labs colleagues put together.
  • TensorFlow. For this article we’re going to use TensorFlow as our ML implementation of choice. In subsequent articles I’m going to use a variety of ML and AI libraries, both open-source ones and Oracle specific ones.
  • GraphPipe. This is a protocol and collection of software designed to simplify machine learning model deployment and decouple it from framework-specific model implementations. GraphPipe was recently open-sourced by Oracle and can be cloned from its GitHub repository.

For the serverless infrastructure, I'm using the Fn Project. Make sure to check my previous article on how to set up a complete Fn environment with Docker Compose.


Shell commands to install conda packages
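A minimal sketch of such an install, pinning TensorFlow to the 1.10 line used here (your exact package selection may differ):

```shell
# install the ML stack into the active conda environment
conda install tensorflow=1.10 pandas numpy matplotlib jupyter
```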

This will install the appropriate libraries into your Anaconda environment. In my environment, I’m using the following library versions:

TensorFlow: 1.10.0
Pandas: 0.23.1
Numpy: 1.15.1
Matplotlib: 2.2.3

The problem I had initially was that when importing TensorFlow into my Jupyter notebook, I got tons of exceptions with no clear indication of what was wrong. After some research, I figured out that the dask library was too old. This was easy enough to fix:

> conda update dask

This caused an update of pandas as well, and after that, things started working. You probably won't run into this with a brand-new install of Anaconda, but if you have had Anaconda installed for quite some time, it is worth updating certain libraries to prevent some nasty issues.


GraphPipe Architecture

GraphPipe is NOT a new machine learning framework. GraphPipe is a machine learning model serving protocol and specification. It provides simple and efficient reference servers for models from TensorFlow, Caffe2 and ONNX, and a specification based on flatbuffers for ultra-fast communication between a client and the ML framework of choice. Client implementations exist for Go, Python and Java.

The GraphPipe specification defines a thin protocol for tensors (multi-dimensional arrays of data with a specific shape and type) based on flatbuffers, covering the following:

  • Request and Response of the GraphPipe Server
  • Metadata Request and Response

The protocol is intentionally kept simple so that new GraphPipe server implementations can easily be developed. A single GraphPipe server comes in the form of a Docker image and serves a single model. This is in contrast to some generic ML services with a much heavier input/output contract, manageability layer, etc.

In this regard I consider GraphPipe a good fit for a serverless architecture: it comes with a very efficient communication protocol based on flatbuffers, and it is compact and self-contained, which serves the short-lived serverless paradigm quite well.

Modeling the Time Series Use Case

Import required libraries

Import of required packages

Generate some test data

Create some random data and plot it
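A minimal sketch of such a cell (synthetic trend + seasonality + noise; the article's actual generator and date range may differ):

```python
import numpy as np
import pandas as pd
import matplotlib
matplotlib.use("Agg")  # render off-screen; drop this line inside Jupyter
import matplotlib.pyplot as plt

rng = np.random.RandomState(42)
dates = pd.date_range("2000-01-31", periods=222, freq="M")  # 2000 .. mid 2018
trend = np.linspace(0.0, 10.0, len(dates))                  # upward trend
season = 2.0 * np.sin(np.linspace(0.0, 40.0 * np.pi, len(dates)))
noise = rng.normal(scale=0.8, size=len(dates))
ts = pd.Series(trend + season + noise, index=dates, name="utilization")

ts.plot(title="Sample Time Series Data")
plt.savefig("timeseries.png")
```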

This will show a plot that might look something like this:

Sample Time Series Data

The situation shown here might be a typical IT resource-utilization curve. We can clearly see an upward trend starting around 2007, and by forecasting the curve we might react to an eventual over-utilization before it happens.

Prepare Data for ML

Prepare data for RNN and split into training and test data
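A sketch of this step: carve the series into fixed-length windows where each target window is the input window shifted one step into the future. The names x_batches, y_batches and X_test match the variables referenced later in the text; the window length of 20 is an assumption, and the stand-in data here replaces the generated series:

```python
import numpy as np

n_steps = 20  # length of each training window (assumption)

def make_batches(series, n_steps):
    """Split a 1-D series into (x, y) windows where y is x shifted by one."""
    n = (len(series) - 1) // n_steps * n_steps
    x = series[:n].reshape(-1, n_steps, 1)
    y = series[1:n + 1].reshape(-1, n_steps, 1)
    return x, y

data = np.sin(np.linspace(0, 30, 222))       # stand-in for the series above
train, test = data[:-n_steps - 1], data[-n_steps - 1:]

x_batches, y_batches = make_batches(train, n_steps)
X_test = test[:-1].reshape(-1, n_steps, 1)   # model input for testing
y_test = test[1:].reshape(-1, n_steps, 1)    # ground truth to compare against
```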

Creating the RNN in TensorFlow

Prepare RNN, Optimizer and Loss function
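As orientation for this step, here is the recurrence that a basic RNN cell computes at every time step, sketched in plain numpy (weight shapes assume one input feature and `hidden` units; in TensorFlow these weights are learned, and the full cell is combined with an output projection, an MSE loss and the Adam optimizer):

```python
import numpy as np

hidden, n_inputs, n_steps = 100, 1, 20
rng = np.random.RandomState(0)

# weights the RNN cell would learn during training (randomly initialized here)
W_x = rng.normal(scale=0.1, size=(n_inputs, hidden))
W_h = rng.normal(scale=0.1, size=(hidden, hidden))
b = np.zeros(hidden)

x = rng.normal(size=(n_steps, n_inputs))  # one input window
h = np.zeros(hidden)                      # initial hidden state
states = []
for t in range(n_steps):
    # the core recurrence: new state from current input and previous state
    h = np.tanh(x[t] @ W_x + h @ W_h + b)
    states.append(h)
states = np.stack(states)                 # shape (n_steps, hidden)
```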

Training the RNN model

Illustration of RNN Model Training and Testing

We split the time series data into two pieces: one piece outside of the red rectangle (data from 2000 to roughly mid-2016) and the other piece inside it. We're going to use the data outside of the red rectangle for model training, and then test the model's predictions against the data inside the red rectangle in the left figure. The right figure is a visualization of Actual vs. Forecast, where the blue dots represent the actual data (compare this to the curve in the red rectangle on the left; it is identical, except that it's on a different scale). The predicted values from the RNN model are represented by the red dots and actually come quite close for a first round of training.

But first things first; let's continue with the training of the RNN model code. We start a new TensorFlow session and iterate over the number of epochs, training the RNN on the data stored in x_batches and y_batches.

At the end of the loop we test the RNN model by predicting values for the input X_test.

Finally, we store the model on the local file system so that we can serve it with the GraphPipe server for TensorFlow.

Train the RNN model, predict using the test data and save the model to local file system

While running the epochs, we can monitor the progress of the training by printing the value of the mean squared error (MSE) to the console. A typical output might look like this:

Decreasing MSE by increasing number of epochs

The output shows a decreasing mean squared error with an increasing number of epochs, i.e. the RNN is improving with each epoch.

Testing the RNN model

Plot the comparison Actual vs. Forecast
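A sketch of such a comparison plot, with stand-in arrays where the notebook would use the test window and the model's prediction:

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # render off-screen; drop this line inside Jupyter
import matplotlib.pyplot as plt

# stand-in arrays; in the notebook these would be y_test and y_pred
y_test = np.sin(np.linspace(0, 3, 20))
y_pred = y_test + np.random.RandomState(0).normal(scale=0.05, size=20)

plt.plot(range(20), y_test, "bo", markersize=10, label="Actual")
plt.plot(range(20), y_pred, "r.", markersize=10, label="Forecast")
plt.title("Actual vs. Forecast")
plt.legend()
plt.savefig("actual_vs_forecast.png")
```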

The output might look something like this. Note that the blue-dotted curve is a piece of the actual data from the initial time series data set, namely the piece in the red rectangle in the figure above. The red-dotted curve is the forecast, which comes quite close to the actual curve.

Test of the RNN model by comparing Actual vs. Forecast

This isn’t bad for a first test. One might even further improve the RNN by running more epochs and/or increase the number of (hidden) RNN units in the RNN (variable hidden in the Jupyter notebook). Another parameter to play with is the learning_rate for the Adam Optimizer. In our example we selected a slow learning rate of 0.001.

Starting a GraphPipe-TF Server with the RNN model

docker run -it --rm \
-v "$(pwd)/models:/models/" \
-p 9000:9000 \
sleepsonthefloor/graphpipe-tf:cpu \
--model=/models/rnn_ts_model.pb

That’s it! We have successfully trained a Recurrent Neural Network in a few lines of Python code using TensorFlow and exposed it via GraphPipe Server for predictions!

Implementing the Fn function

Create the Shell Go Fn code and deployment descriptors

fn init --runtime go --trigger http gpfn

This will create a new gpfn directory that contains some Go function boilerplate and the configuration file required to deploy the function.

Import required Go libraries

> go get github.com/fnproject/fdk-go
> go get github.com/oracle/graphpipe-go

Next, the Go function (func.go) should import the required libraries.

Go main package and imports for Fn and GraphPipe

Adding the function handler

Implementing the Handler

  1. Convert the CloudEvent data into something that we can send to the GraphPipe Server.
  2. Call into the GraphPipe Server that serves our RNN time series forecasting model.
  3. Get the result from GraphPipe Server and transform it into a CloudEvent.
Fn function handler with CloudEvent handling and GraphPipe Server invocation.

Tweaking Gopkg.toml

[[constraint]]
  branch = "master"
  name = ""

Fn deployment file
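For reference, the func.yaml that `fn init --runtime go --trigger http gpfn` generates typically looks like this (the schema_version and version values are examples):

```yaml
schema_version: 20180708
name: gpfn
version: 0.0.1
runtime: go
entrypoint: ./func
triggers:
- name: gpfn-trigger
  type: http
  source: /gpfn-trigger
```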

Deploying the function to an Fn Server

> fn deploy --app gpfnapp

You can now list the triggers and their corresponding HTTP endpoints of the 'gpfnapp' app. Just issue the command below

> fn list triggers gpfnapp

which should give something like this…

FUNCTION  NAME          TYPE  SOURCE         ENDPOINT
gpfn      gpfn-trigger  http  /gpfn-trigger  http://localhost:8080/t/gpfnapp/gpfn-trigger

Last but not least, we need to add a config entry for the variable GP_SERVER_URL that we're using in our Fn Go code. This config variable should be set to the URL of the GraphPipe server.

> fn config function gpfnapp gpfn GP_SERVER_URL http://localhost:9000

We can check the config with the following command…

> fn list config function gpfnapp gpfn

and it should produce something like …

KEY           VALUE
GP_SERVER_URL http://localhost:9000

Testing the Function

Sample Input Cloud Event
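A hypothetical ce.json using the CloudEvents 0.1 attribute names (all field values here are made up; the data payload carries one input window for the model):

```json
{
  "cloudEventsVersion": "0.1",
  "eventType": "com.example.timeseries.forecast.request",
  "eventID": "1234",
  "source": "sample-client",
  "contentType": "application/json",
  "data": [[1.2], [1.3], [1.7], [2.1], [2.0]]
}
```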

To test the function, we can pipe the content of ce.json into the gpfn function using the Fn CLI…

cat ce.json | fn invoke gpfnapp gpfn

The output will be another CloudEvent containing the result of the predictions. The numbers should be identical to the red dots in the diagram above.


  • Bring an Event Hub into place (for example Oracle Event Hub) which consumes and publishes CloudEvent conforming events.
  • Work on a Kafka consumer interceptor implementation that takes CloudEvents from a topic, evaluates the "destination" extension of the event and calls into an Fn function.
  • Explore the world of FlatBuffers and Protocol Buffers a bit more and use them as a protocol between the Fn world and some core complementary services (e.g. DMN, machine learning, etc.). I see this as a promising architecture and a good match between the serverless and microservices worlds, where a set of core microservices could be consumed by functions over an ultra-fast, low-latency protocol. This definitely needs further investigation.

There is so much more to do and investigate, interesting times ahead of us!


