LFaaS Serverless in Action: Deploy Your First Data Process in 5 minutes w/JavaScript on LOC Studio

An Illustrated Quick Guide to Building Data Processes and Logging Data Flows for Your Business

Alan Wang
FST Network
15 min read · May 12, 2022



Author’s note: the steps and example code described in this article are deprecated, as FST Network’s LOC is still actively evolving. Please refer to our official blog and documentation for the latest updates.

In our previous article, we gave an overview of a software architecture, LFaaS (Logic Functions as a Service) and Logic Injection, which are used in FST Network’s new data governance solution, Logic Operating Centre (LOC).

LOC Studio (the user interface of LOC) makes it incredibly easy for company developers to create and monitor data processes. But as the old saying goes, actions speak louder than words: nothing is more illustrative than actually doing it yourself. A simple example is nonetheless useful for learning how LOC data processes work and what you can do with them.

Here’s what we are gonna do today:

  • Create and deploy a classic “Hello World” data process
  • Invoke the data process with a public API route
  • Log data trails with events — view data flows in Data Discovery

Note: this article is based on a slightly earlier release of LOC (a new version is coming!), so there may be changes in the future. We will also provide more detailed documentation on our future website.

If you are interested in LOC Studio (including demo requests) or have any questions, please see the contact information at the end of this article.

Also: check out our online LOC Studio Handbook! There you can find more information, data process tutorials and LOC agent documentation.

“Hello World” Data Service


Our “Hello World” service is very simple: it would return a JSON-formatted greeting message based on the user’s request.

For example, if the user sends a POST request with the following JSON payload (the name value below is just an example):
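```json
{
  "name": "Alan"
}
```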

The service would return a JSON like this:
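```json
{
  "message": "Hello, Alan!"
}
```

(The message field name is simply our own choice; we will define it ourselves in the logic code later.)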

If no name field was given, the message would be “Hello, World!” instead.

Of course, due to the nature of LOC data processes, the actual result would be embedded in a larger response. We will see this later in the article.

Project, Scenario and Data Process

Before creating any data process, you need at least one unit, one project and one scenario (that’s the hierarchy, from top to bottom). This helps you manage and group data processes based on different business processes and use cases.

In our LOC Studio environment we already have a unit called Default Unit. Right-click it and select New Project.

Right-click the new project (I named it AW here) and select New Scenario:

Right-click the scenario (Hello World) and now you can add data processes:

The name and description of a data process do not really matter here, but preferably pick something that makes its purpose easy to understand.

After clicking Create, you should see an “Add Aggregator Logic” block appear like this:

Data Process Structure

We’ve mentioned in the previous article that a data process has to have at least one generic logic and exactly one aggregator logic. The generic logic runs first and the aggregator logic runs last.

The Hello World data process has exactly two logics, and we need to implement the code for both. But don’t worry, it’s actually pretty straightforward.

Add Generic Logic

First click the “Add Logic” block in front of the aggregator block:

What’s the difference between Description and Comment? You can add a new comment every time you update a data process or logic (as versioning documentation).

Set a logic name (again, the name doesn’t matter much, but preferably something unique and readable), then click the Logic Body tab:

Every logic has two parts, if OK and if Error. if OK is the “normal” logic, and if Error is the error handler that runs if something goes wrong in if OK.

So let’s look at the if OK code:
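Here’s a minimal sketch of it (the session key "response" and the message field name are simply our own choices for this example):

```javascript
async function run(ctx) {
  // Helper: transform a Unicode byte array into a JavaScript string
  const UTF8ArrToStr = (bytes) => new TextDecoder().decode(new Uint8Array(bytes));

  // Load the JSON request body and convert it into a JavaScript object
  const payload = JSON.parse(UTF8ArrToStr(ctx.payload.http.body));

  // Extract the name; fall back to "World" if no name field was given
  const name = payload?.name ?? "World";

  // Prepare the response object and store it in the LOC session store.
  // The aggregator logic will read it back and send it to the user.
  await ctx.agents.sessionStorage.putJson("response", {
    message: `Hello, ${name}!`,
  });
}
```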

The function has to be named run and accepts a context parameter, which will be supplied by LOC at runtime with all the things you need, like request payloads and agents (built-in drivers for various data sources).

Remember, we’ve mentioned in the previous article that agents are the only way for logic to “contact the outside world”, primarily to ensure that third-party code stays safe. Our dev team has indeed been discussing letting users add custom libraries under certain conditions, but agents are already enough to handle most situations.

Here’s the breakdown:

  • TextDecoder is a built-in web API that is also available in the Deno runtime (see the LFaaS article). We wrote a little helper function that transforms a Unicode byte array into a JavaScript string. (We don’t need to do that again once the data is in LOC.)
  • We load the request body (JSON data) from ctx.payload.http.body and convert it into a JavaScript object with JSON.parse.
  • Next we extract the name from the payload. We used optional chaining and the nullish coalescing operator here: if the user didn’t provide a JSON payload with a name field (the attribute doesn’t exist, so we get undefined), the fallback string “World” is assigned to the local variable name.
  • Finally, we prepare the response object and store it in the LOC session store via ctx.agents.sessionStorage.putJson. We will let the aggregator logic send it; the generic logic is only responsible for preparing the response.

Agent functions are asynchronous so it is recommended to use await to wait until the agent operation is finished.

There are three LOC session store methods available: putJson, putString and putByteArray. You can store data in the form you see fit.

Next is the if Error code:
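A sketch along these lines (the handler name and its error parameter are assumptions; your LOC version may differ):

```javascript
async function handleError(ctx, error) {
  // Log the error message via the logging agent
  await ctx.agents.logging.error(error.message);
}
```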

This part is pretty simple: if anything goes wrong, we log the error message via ctx.agents.logging.error. For now we can’t view the log directly in LOC Studio, but you get the idea: you could also do something like loading backup data or performing rollbacks here.

If one generic logic fails to run normally, the whole data process fails. The rest of the logic (including the aggregator) will still be executed, but only their if Error parts: once one fails, the rest are treated as failed too.

Now click Create in the logic window, and you should see the new generic logic appear, with an indication that both the if OK and if Error code are in place:

Add Aggregator Logic

We can now create the aggregator logic code in the exact same way:

if OK:
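A sketch (how the task ID is read from the context here is an assumption):

```javascript
async function run(ctx) {
  // Read the JSON object the generic logic stored in the session store
  const response = await ctx.agents.sessionStorage.get("response");

  // Generate the final response returned to the caller
  ctx.agents.result.finalize({
    status: "ok",
    taskId: ctx.task.taskId, // assumption: the task ID is exposed on the context
    response: response,
  });
}
```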

Here we use ctx.agents.sessionStorage.get to read the JSON object out of the session store, which we stored in the generic logic. Every data process has its own session store scope: the data is shared between logic in the same data process, but not between different data processes.

You can use events to exchange data between data processes, but in this article we will only discuss how to send them.

Then we call ctx.agents.result.finalize to generate the final response, which includes the status, the task ID of this execution, as well as our greeting message.


The status field here is not the HTTP status code; it merely informs the caller that the data process ran OK. Everything in this JSON object can be customized. In practice, you should probably check whether any errors were “reported” by other logic.

if Error:
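Again, a sketch following the same pattern:

```javascript
async function handleError(ctx, error) {
  // Wrap the error message in the final response so the caller sees it
  ctx.agents.result.finalize({
    status: "error",
    taskId: ctx.task.taskId,
    message: error.message,
  });
}
```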

If something goes wrong in the generic logic, both logics’ if Error handlers will be executed. So we add a little error response in the aggregator, so that the user can know what went wrong right after execution (though it probably won’t get the chance to be executed in our case anyway).

Now the data process would be like this:

Deploy Data Process

Once your data process has at least one generic logic, one aggregator logic and all the necessary code in place, the Deploy Data Process option would be enabled in the right-click menu:

Click it and the data process would be deployed. Simple as that.

The LFaaS architecture will automatically deploy your data process into the cloud, or more specifically, a Kubernetes cluster. See our previous article for details.

Single Data Process Execution

Before invoking the data process with an API route, we can in fact test it first with single data process execution.

You can only update the code when the data process is undeployed.

Create a JSON file with the following content (or change it to anything you like):
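```json
{
  "name": "Alan"
}
```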

Drag the file into the window to upload it (LOC Studio will give you a preview):

The execution result would be like this:

Click the JSON icon to the right of the Success status, and you can see the actual response from the data process itself:
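Roughly like this (the task ID value is a placeholder):

```json
{
  "status": "ok",
  "taskId": "...",
  "response": {
    "message": "Hello, Alan!"
  }
}
```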

The JSON response contains the status, taskId and response fields, which correspond to what we set in the aggregator logic.

Invoke Data Process via API Route


Of course, data processes would be mostly useless if you could only invoke them inside LOC Studio. We can also deploy a public API route to trigger one or more data processes.

Deploy an API Route

Go to the API Route tab and click Create Folder:

Then right-click your folder to create an API route:

Set the API route as follows:

Remember to select POST as the HTTP request method (so that we can send a JSON payload with it), and set the API path; here we’ll use /hello/world as a placeholder example.

Normally we would only use Sync mode, in which the client waits for the full result. In Async mode, you will get back the execution ID (task ID) and have to look up the result later. We won’t discuss async API operations in this article.

Finally link up the API route to your data process:

The API route can trigger multiple data processes; they will be executed in the order you’ve added them here. Also note that only deployed data processes can be added.

Invoke the API Route

Now the public API URL is your LOC Studio’s API server address plus your API route path. It would be something like this:
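```
https://{your LOC API server}/hello/world
```

(with /hello/world being the placeholder path we picked above)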

You can use any REST client to invoke your service, like Postman or Insomnia (remember to use a POST request and add the JSON payload):
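Or, if you prefer plain code, here’s a quick sketch using fetch (the URL is hypothetical; substitute your own server address):

```javascript
// Invoke the API route with a POST request carrying a JSON payload
const res = await fetch("https://your-loc-api-server/hello/world", {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({ name: "Alan" }),
});
console.log(await res.json());
```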

Voilà! You just got yourself a fully functional, cloud-based API service, thanks to the magical LFaaS architecture, in a matter of minutes.

Now we can see that the actual response from a LOC data process contains more than what you saw before: there is an actual HTTP status code 200 along with additional metadata, and our custom response (including the greeting message) is embedded under the data field. But the effect is still essentially the same.
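It looks roughly like this (apart from _status and data, the exact field layout here is illustrative):

```json
{
  "_status": 200,
  "_metadata": {
    "...": "..."
  },
  "data": {
    "status": "ok",
    "taskId": "...",
    "response": {
      "message": "Hello, Alan!"
    }
  }
}
```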

Note that a successfully invoked data process, even if something went wrong inside it, will always return "_status":200 here, since the invoke action itself is valid. This is why you may want to wrap your own error messages in the response.

Events: Data Trails in Action


Of course, LOC data processes are not just your regular FaaS functions. What makes them truly special is the ability to enable data lineage for your business.

What is Data Lineage?

Data lineage is the process of understanding, recording, and visualizing data as it flows from data sources to consumption. This includes all transformations the data underwent along the way: how the data was transformed, what changed, and why.

In LOC, this is realized with events (not to be confused with events in HTML/JavaScript development). A LOC event plays two key roles:

  1. To exchange information between different data processes.
  2. To mark a data flow between a source and a target — who sent what information to whom? Where does the data go in your business process?

In short, it’s like sending memos to other people to let them know what happened during your task, what kind of data you sent to whom, and so on. LOC events essentially log the data trails of your business processes, which can be extremely useful for data lineage and data governance purposes.

In order to demonstrate how LOC events work, we will modify our Hello World example a bit further.

Update Logic Code

Go back to the Data Process interactive map tab, right-click your data process and undeploy it, then click the generic logic block:

Click Edit Logic on the top right, and insert the following code at the end of the if OK function:
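Here’s a sketch of the event-emitting code (the field names follow the five event fields described below, but treat the exact schema as an assumption):

```javascript
// Send one event to the LOC event store; emit() takes an array,
// so you could send several events at once
await ctx.agents.eventStore.emit([
  {
    labelName: "EventGreeting", // the event's name
    sourceDID: "HelloWorld",    // the source node
    targetDID: "Thanos",        // the target node
    meta: JSON.stringify({ message: `Hello, ${name}!` }), // extra payload
    type: "default",
  },
]);
```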

We are now using ctx.agents.eventStore.emit to send an event to the LOC event store. It allows you to send multiple events at once, so the object(s) have to be wrapped in an array.

In LOC, events create named nodes that represent a data source, a target, or both. These node names are called DIDs (digital identities); they do not need to have any direct relationship with data processes or logic. Instead, the five fields of an event can be customized in any way you like.


We also use JSON.stringify to encode the JSON object as a string and put it in the meta field. This can serve as a payload for any other data process that wants the data.

Click Update. LOC will allow you to leave an optional comment for this change. It’s OK to leave it blank for now.

Right-click the data process to deploy it again.

Potential Labelings

If you now look at the data process, you might notice a tag EventGreeting appears beneath the generic logic, and a block called Potential Labelings shows the fields of the event:

Potential Labelings is a mechanism that auto-detects the event schema in your code. You can now check which events a data process may send without having to review its code. Neat!

Writer’s Note: in our latest product development, Potential Labelings in LOC has been renamed to Event Schema, but the effect is the same.

Invoke the Data Process

Since you took down your data process and deployed it again, you either have to re-create the API route or update it.

Simply clicking Update should do the trick. Or you can remove the linked data process and add it again.

After that’s done, invoke the data process again via the API route. Remember the execution ID? We’ll need it to find the event we just sent into the event store.

View the Event, Source and Target

Go to the Data Discovery tab, click Add filter, select Execution ID as the field, and paste in the value:

We can see there is indeed a logged event:

If you flip the Data switch on the top left to Graph, you will see a graph like this:

You may see nothing right after invoking the data service. This is because LOC needs a bit of time to update itself.

This is a representation of the event EventGreeting in this particular execution or task — it was sent from a source called HelloWorld to a target named Thanos. This indicates the data flow happened between these two entities.


In LOC, nodes can be both source and target, which means you can also send events with Thanos as the source and Earth as the target, and so on. It’s just as easy as snapping your fingers, huh?

The Meta Payload

For other data processes, the event name, source and target are themselves information about the event. But sometimes we want to send more than just these fields. This is where the meta field comes in.

You can click on the event name to inspect its details:

Notice that the meta field indeed contains the original JSON message that we put in there. The maximum string length of meta is 2¹⁵ (32,768) bytes, so it should be more than enough in most cases!

Logging Complex Data Trails

If you send multiple events to the same target, or send an event to a different target, the graph will grow depending on your search criteria:

The design of LOC events is incredibly flexible. You can change the source, target and event name to reflect different data flows in any way you’d like. A data process can even send different events under different circumstances.

Although we won’t discuss how to read events in this article, here’s an example result generated by a scenario with two data processes:

This shows the data flows inside a restaurant kitchen: a staff member is responsible for certain food/drink items, and these items may belong to certain orders. We can see how the data flows from one node to another.

Notice that no one is assigned to prepare salad? You would be able to find such missing links quickly with this visualized graph. Orders 200, 201 and 202 won’t be stuck indefinitely just because someone forgot to update the staff task list with the latest menu. A timely response in the modern data world is what makes your company truly better than others.


That’s it for today. More use cases, examples and references will be available on FST Network’s new website.

This article was written with help from FST Network’s product and dev team.

For more info about Logic Operating Centre (LOC) or demonstration request, please contact support@fstk.io.
