Presto with Kubernetes and S3 — Deployment

Yifeng Jiang · The Startup · Mar 24, 2020 · 7 min read

I have been working in the big data arena for more than ten years. If you ask me what the most popular use case I have seen in this area is, my answer is definitely SQL for big data. Everyone likes SQL. There are many SQL-on-big-data solutions, including Apache Hive, Spark SQL, Impala and Presto, just to name a few. Among these, Presto is becoming my favorite, not only for its simplicity and performance, but also for its design that disaggregates compute and storage. Using its connector mechanism, Presto can query many data sources in place, where the data lives, including S3 object storage.

A modern analytics system typically runs across a large number of servers and storage devices because of the huge amount of data it processes. In such a large and dynamic system, a traditional architecture with DAS (direct-attached storage), e.g., the Hadoop stack, has challenges such as: 1) it is complex to scale, operate and maintain; 2) it is hard to change the compute-to-storage ratio; 3) it has many failure points.

Disaggregating compute and storage, as Presto does, is the key to addressing these challenges. Disaggregation brings several advantages:

  1. Enable analytics as a service.
  2. Scale compute and storage independently. Scale cluster capacity up and down.
  3. Fast provisioning, deploy and upgrade.
  4. Easy maintenance.

Disaggregation also makes it easy to deploy an elastic Presto cluster on Kubernetes. With Kubernetes, deploying a Presto cluster and scaling it up and down is as easy as a single command. In this article, I will walk through the steps to deploy a Presto cluster with Kubernetes and query data in S3 *.

Build Presto Docker Image

I need a Presto Docker image to get started. In my Docker image, I simply download and extract the Presto server binary package, add a minimal configuration, and finally add an entrypoint script to perform some necessary tasks before starting the Presto server.
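A minimal sketch of such a Dockerfile, assuming the PrestoSQL 330 tarball from Maven Central (the base image and paths are illustrative, not my exact build):

# Minimal sketch of a Presto 330 server image
FROM openjdk:11-slim

ENV PRESTO_VERSION=330
ENV PRESTO_HOME=/opt/presto-server

# Presto's launcher script requires Python
RUN apt-get update && apt-get install -y --no-install-recommends curl python \
    && rm -rf /var/lib/apt/lists/*

# Download and extract the Presto server binary package
RUN curl -fsSL https://repo1.maven.org/maven2/io/prestosql/presto-server/${PRESTO_VERSION}/presto-server-${PRESTO_VERSION}.tar.gz \
    | tar -xz -C /opt \
    && ln -s /opt/presto-server-${PRESTO_VERSION} ${PRESTO_HOME}

# Minimum configuration and the entrypoint script, copied from the build context
COPY etc ${PRESTO_HOME}/etc
COPY entrypoint.sh /entrypoint.sh
RUN chmod +x /entrypoint.sh

ENTRYPOINT ["/entrypoint.sh"]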

Because each Presto node needs to have a unique node ID, I set node.id to the pod’s hostname and append it to node.properties in my entrypoint script.

echo "node.id=$HOSTNAME" >> $PRESTO_HOME/etc/node.properties

Another reason I need an entrypoint script is that I want to apply custom settings, especially the sensitive S3 keys, to the Hive connector (required for Presto to query data in S3) at runtime.

# Add additional S3 settings
cat /tmp/hive.properties >> $PRESTO_HOME/etc/catalog/hive.properties

# Add S3 credentials from environment variables
echo "hive.s3.aws-access-key=$AWS_ACCESS_KEY_ID" >> $PRESTO_HOME/etc/catalog/hive.properties
echo "hive.s3.aws-secret-key=$AWS_SECRET_ACCESS_KEY" >> $PRESTO_HOME/etc/catalog/hive.properties

The additional S3 settings above are configured by Kubernetes during deployment; I will explain the details later. My Presto server image is available at:

docker pull uprush/prestosql-server:330

Deploy Presto on Kubernetes

A minimum Presto cluster on Kubernetes includes three components: a Deployment for the Presto coordinator, a Deployment for the workers, and a Service.
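A trimmed sketch of my presto-server.yaml spec (replica counts and labels are illustrative assumptions; the ConfigMap and Secret plumbing described later is omitted for brevity):

apiVersion: apps/v1
kind: Deployment
metadata:
  name: presto-coordinator
  namespace: warehouse
spec:
  replicas: 1
  selector:
    matchLabels:
      app: presto-coordinator
  template:
    metadata:
      labels:
        app: presto-coordinator
    spec:
      containers:
      - name: presto-coordinator
        image: uprush/prestosql-server:330
        ports:
        - containerPort: 8080
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: presto-worker
  namespace: warehouse
spec:
  replicas: 2   # scale the cluster by changing this
  selector:
    matchLabels:
      app: presto-worker
  template:
    metadata:
      labels:
        app: presto-worker
    spec:
      containers:
      - name: presto-worker
        image: uprush/prestosql-server:330
---
# The Service fronts the coordinator, giving the DNS name presto.warehouse
apiVersion: v1
kind: Service
metadata:
  name: presto
  namespace: warehouse
spec:
  selector:
    app: presto-coordinator
  ports:
  - port: 8080
    targetPort: 8080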

With this, I deploy a Presto cluster on Kubernetes via a single command:

kubectl create -f presto-server.yaml

It creates a Presto cluster like this:

Presto on Kubernetes

To scale out the cluster, I increase the number of Presto worker replicas (spec.replicas) in the spec file and apply it.

kubectl apply -f presto-server.yaml
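Alternatively, kubectl can scale the worker Deployment directly, assuming it is named presto-worker as in the sketch above:

kubectl -n warehouse scale deployment presto-worker --replicas=10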

I also want the flexibility to change Presto settings without having to rebuild the Docker image. In my spec file, I use a ConfigMap to pass custom settings to Presto.

volumeMounts:
- name: presto-etc-vol
  mountPath: /opt/presto-server/etc/config.properties
  subPath: config.properties.coordinator
- name: presto-etc-vol
  mountPath: /opt/presto-server/etc/jvm.config
  subPath: jvm.config.coordinator
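The ConfigMap behind presto-etc-vol holds the per-role files, keyed by the names referenced in subPath above. A sketch with illustrative values (the ConfigMap name and property values are assumptions, not my exact settings):

apiVersion: v1
kind: ConfigMap
metadata:
  name: presto-etc       # assumed name behind the presto-etc-vol volume
  namespace: warehouse
data:
  config.properties.coordinator: |
    coordinator=true
    node-scheduler.include-coordinator=false
    http-server.http.port=8080
    discovery-server.enabled=true
    discovery.uri=http://presto.warehouse:8080
  jvm.config.coordinator: |
    -server
    -Xmx16G
    -XX:+UseG1GC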

It is convenient to deploy a Presto CLI pod in the Kubernetes cluster to send SQL queries to the Presto cluster. Here is my Dockerfile to build a Presto CLI image. From the CLI pod, I can connect to the Presto service like this:

presto-cli --server presto.warehouse:8080 --catalog hive
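The CLI pod itself can be as simple as a container that sleeps forever, so I can kubectl exec into it and run presto-cli (the image name below is hypothetical; build it from the CLI Dockerfile):

apiVersion: v1
kind: Pod
metadata:
  name: presto-cli
  namespace: warehouse
spec:
  containers:
  - name: presto-cli
    image: uprush/presto-cli:330   # hypothetical image name
    command: ["sleep", "infinity"]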

Query Data in S3

I cannot emphasize enough how much I like that Presto disaggregates compute and storage. My big data sits in an S3 object store, specifically in FlashBlade S3. Presto uses its Hive connector to access data in S3. The Hive connector relies on the Hive Metastore to manage metadata about how the data files in S3 map to schemas and tables. This metadata is stored in a database, such as PostgreSQL, and is accessed via the Hive Metastore service.

The architecture looks like this:

Presto with Kubernetes and S3

Deploy Apache Hive Metastore

To deploy a Hive Metastore service on Kubernetes, I first deploy PostgreSQL as my metastore database. Because PostgreSQL requires block storage, I use FlashArray, an extremely fast and simple SAN storage. I also use Pure Service Orchestrator, or PSO for short, to automate provisioning FlashArray as a persistent volume for the PostgreSQL pod.

Deploy PostgreSQL database

I like to use an operator for long-running applications on Kubernetes whenever possible. For PostgreSQL, I use the Postgres Operator from Zalando. Below is the spec to deploy a highly available (HA) PostgreSQL database with two instances, backed by FlashArray (the pure-block storage class).

apiVersion: "acid.zalan.do/v1"
kind: postgresql
metadata:
name: reddot-postgres
namespace: default
spec:
teamId: "reddot"
volume:
size: 100Gi
storageClass: pure-block
numberOfInstances: 2
users:
admin: # database owner
- superuser
- createdb
hive: [] # role for application hive metastore
databases:
metastore: hive # dbname: owner
postgresql:
version: "11"
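The operator generates a password for each role and stores it in a Secret. Following the naming pattern the Zalando operator uses ({username}.{clustername}.credentials.postgresql.acid.zalan.do), the hive password can be read back like this:

kubectl get secret hive.reddot-postgres.credentials.postgresql.acid.zalan.do \
  -o 'jsonpath={.data.password}' | base64 -d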

PSO here provides storage-as-a-service to the Postgres pods. Instead of manually creating a volume in FlashArray and mounting it to the pods, I automate these tasks by simply specifying the pure-block storage class. PSO performs the actual storage provisioning by calling FlashArray’s REST APIs.

Build Hive metastore image

Next is to build a Docker image for the Hive Metastore. In my Dockerfile, I download Hadoop and its S3A connector (as my data sits in S3), the Hive binary package and the PostgreSQL JDBC driver, and add a Hive Metastore start script. My Hive image is available at:

docker pull uprush/apache-hive:3.1.2

Initialise Hive metastore database

With this, I am ready to initialise my Hive Metastore database. This is a one-time task, so I use a Kubernetes Job.

apiVersion: batch/v1
kind: Job
metadata:
  name: hive-initschema
  namespace: warehouse
spec:
  template:
    spec:
      containers:
      - name: hive-schema
        image: uprush/apache-hive:3.1.2
        command: ["/opt/hive/bin/schematool"]
        # The JDBC URL points at the Postgres pod IP here; the Zalando operator
        # also creates a Service named after the cluster (reddot-postgres),
        # which is a more stable target than a pod IP.
        args: ["--verbose", "-initSchema", "-dbType", "postgres",
               "-userName", "hive", "-passWord", "mypass",
               "-url", "jdbc:postgresql://10.244.2.108:5432/metastore?createDatabaseIfNotExist=true"]
      restartPolicy: Never
  backoffLimit: 4

Deploy Hive metastore service

I am now ready to deploy the Hive Metastore service. Because my data sits in S3, I need to tell the Hive Metastore where to find my S3 bucket. This configuration goes in core-site.xml. Along with the regular metastore-site.xml, these configurations are mounted into the pod through a Kubernetes ConfigMap.

volumeMounts:
- name: metastore-cfg-vol
  mountPath: /opt/hive/conf/metastore-site.xml
  subPath: metastore-site.xml
- name: metastore-cfg-vol
  mountPath: /opt/hadoop/etc/hadoop/core-site.xml
  subPath: core-site.xml
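A sketch of the S3A portion of that core-site.xml, as it would sit inside the ConfigMap (the ConfigMap name and the FlashBlade data VIP are illustrative assumptions):

apiVersion: v1
kind: ConfigMap
metadata:
  name: metastore-cfg    # assumed name behind the metastore-cfg-vol volume
  namespace: warehouse
data:
  core-site.xml: |
    <configuration>
      <!-- FlashBlade S3 endpoint; the data VIP below is illustrative -->
      <property>
        <name>fs.s3a.endpoint</name>
        <value>http://192.168.170.11</value>
      </property>
      <!-- FlashBlade buckets are addressed path-style, not virtual-host style -->
      <property>
        <name>fs.s3a.path.style.access</name>
        <value>true</value>
      </property>
    </configuration>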

I use Kubernetes Secret to pass S3 key settings to the pod. This makes it possible to manage my code in Git without exposing sensitive information.

env:
- name: AWS_ACCESS_KEY_ID
  valueFrom:
    secretKeyRef:
      name: metastore-s3-keys
      key: access-key
- name: AWS_SECRET_ACCESS_KEY
  valueFrom:
    secretKeyRef:
      name: metastore-s3-keys
      key: secret-key
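The Secret itself is created once, outside of the Git-managed specs, for example:

kubectl -n warehouse create secret generic metastore-s3-keys \
  --from-literal=access-key=<MY_ACCESS_KEY> \
  --from-literal=secret-key=<MY_SECRET_KEY>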

With this, I start my Hive metastore service.

kubectl create -f metastore.yaml

Deploy Presto Hive Connector

The last task is to configure Presto to use the Hive Metastore service to query data in S3. I have already included a basic hive.properties in my Presto image. For the other custom settings, I use a ConfigMap. The mounted custom hive.properties is appended to the base /opt/presto-server/etc/catalog/hive.properties file by the entrypoint script.

- name: presto-catalog-vol
  subPath: hive.properties
  mountPath: /tmp/hive.properties
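For reference, the custom hive.properties delivered through this mount contains settings along these lines (the metastore service address and S3 endpoint are illustrative assumptions):

hive.metastore.uri=thrift://metastore.warehouse:9083
hive.s3.endpoint=http://192.168.170.11
hive.s3.path-style-access=true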

Again, the sensitive S3 key settings are passed to the Hive connector through environment variables using Kubernetes Secrets.

containers:
- name: presto-coordinator
  image: uprush/prestosql-server:330
  env:
  - name: AWS_ACCESS_KEY_ID
    valueFrom:
      secretKeyRef:
        name: metastore-s3-keys
        key: access-key
  - name: AWS_SECRET_ACCESS_KEY
    valueFrom:
      secretKeyRef:
        name: metastore-s3-keys
        key: secret-key

Use Presto to Query Data in S3

After applying the above setup, I start a Presto CLI pod to run a test query against data in S3. In the example below, tpcds_sf1000_orc.store is a table pointing to my FlashBlade S3 bucket, created by the Presto TPC-DS connector. I will describe the TPC-DS setup in the second part of this article.

presto-cli --server presto.warehouse:8080 --catalog hive
presto> select s_store_id, s_store_name from tpcds_sf1000_orc.store limit 3;
    s_store_id    | s_store_name
------------------+--------------
 AAAAAAAABAAAAAAA | ought
 AAAAAAAACAAAAAAA | able
 AAAAAAAACAAAAAAA | able
(3 rows)

I use Kubernetes port forwarding to expose my Presto cluster.

kubectl -n warehouse port-forward --address 0.0.0.0 service/presto 8080:8080

The Presto dashboard is now accessible at localhost:8080 from the Kubernetes node.

Presto Dashboard

Summary

With Presto on Kubernetes, and by putting data in S3, I was able to easily and quickly spin multiple Presto clusters up and down at any time. For me, this is game-changing in a large and dynamic environment. Kubernetes, FlashBlade, FlashArray and PSO together bring a simple, scalable and high-performance disaggregated solution for running modern analytics systems like Presto. Kubernetes is the operating system for the cloud era: it manages a computing pool across many servers for running applications in isolated containers. FlashBlade and FlashArray solve the data hub storage challenge (which is difficult!). PSO seamlessly integrates Kubernetes with FlashBlade and FlashArray, providing storage-as-a-service.

In the next part, I will focus on performance. I will describe the setup and observations of a Presto TPC-DS benchmark running in this environment.

Stay tuned.

* Some content and code in this article are based on my colleague Joshua Robinson’s awesome work.
