One of the biggest problems when developing Big Data applications is figuring out whether or not your components will interact nicely with each other.

Integration testing usually requires setting up a staging environment in the cloud, possibly duplicating the production environment. This is not only expensive, but also cumbersome for development: I’d much rather have everything set up locally, use my favorite editor, and not bother my DevOps team with setting stuff up, granting access, and all that jazz.

The solution seems obvious at first: build a Docker environment that you can use to run tests locally!

This, however, presents several…

Here at Jampp we process and analyze large amounts of data. One of the tools we employ to do so is PrestoDB, which is a “Distributed SQL Query Engine for Big Data”. Presto comes with many native functions, which are usually enough for most use cases. Nevertheless, sometimes you need to implement your own function for a very specific use.

Enter User Defined Functions (UDFs, for short). Writing one for the first time is not as straightforward as it may appear, mainly because the information you need is scattered across the web (and across many Presto versions).

In this blog post, we present our JSON_SUM function, how we wrote it, and some of the lessons we learned along the way. …

Dante Pawlow
