SQL Editor for Apache Flink SQL

Romain Rigaux
Data Querying
Published in
4 min readMay 7, 2020

This is the very first version of the SQL Editor for Apache Flink.

Update January 2021: new version of the Flink SQL Editor

The goal is to demo how to execute Flink SQL queries. We use the new Flink SQL gateway project and point to a Flink cluster with live data in a docker container. Hue is used as the SQL Editor for querying Flink tables.

Feel free to read more about Flink SQL and continuous queries.

Setup

Any Flink 1.10 cluster would work and the demo is based on Ververica SQL Training which has some easy setup instructions:

git clone https://github.com/ververica/sql-training.git cd sql-training docker-compose up -d

Then http://localhost:8081 should be up.

Here we start a SQL client container and install the gateway inside (to avoid installing a local Flink as the gateway needs a FLINK_HOME) but this could be done locally or in another containers.

docker-compose exec sql-client bash

We grab a release of the gateway:

cd /opt wget https://github.com/ververica/flink-sql-gateway/releases/download/v0.1-snapshot/flink-sql-gateway-0.1-SNAPSHOT-bin.zip

unzip flink-sql-gateway-0.1-SNAPSHOT-bin.zip
cd flink-sql-gateway-0.1-SNAPSHOT echo $FLINK_HOME

Then from another shell we copy the Flink SQL config to the gateway so that we get the demo tables by default:

wget https://raw.githubusercontent.com/romainr/flink-sql-gateway/master/docs/demo/sql-gateway-defaults.yaml docker cp sql-gateway-defaults.yaml flink-sql-training_sql-client_1:/opt/flink-sql-gateway-0.1-SNAPSHOT/conf/

Now we can go back to the shell in the container and are ready to start it:

cd bin ./sql-gateway.sh --library /opt/sql-client/lib

Putting the server in the background with CTRL-Z and then:

And now we can issue a few commands to validate the setup:

curl localhost:8083/v1/info 
> {"product_name":"Apache Flink","version":"1.10.0"}
curl -X POST localhost:8083/v1/sessions -d '{"planner":"blink","execution_type":"streaming"}'
> {"session_id":"7eea0827c249e5a8fcbe129422f049e8"}

Note

It “should” also be possible to deploy the SQL gateway not in the SQL client container by:

  1. Having a local Flink binary package with FLINK_HOME configured
  2. Updating the jobmanager.rpc.address to the real jobmanager address in $FLINK_HOME/conf/flink-conf.yaml
  3. Changing the two address properties in sql-gateway-defaults.yaml
server: 
# The address that the gateway binds itself.
bind-address: 172.18.0.7
# The address that should be used by clients to connect to the gateway.
address: 172.18.0.7

Query Editor

As detailed in the connector section of Hue, we add a Flink interpreter:

[notebook] [[interpreters]] [[[flink]]] 
name=Flink
interface=flink
options='{"api_url": "http://172.18.0.7:8993"}'

If we are setting up the gateway in the client container and want to access it via your localhost, we need to update its bind IP with the IP of the SQL client container.

The IP of the gateway service is the one of the running container. We inspect the flink-sql-training_sql-client_1 to retrieve its IP:

docker ps > CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES 
> 638574b31cd6 fhueske/flink-sql-training:1-FLINK-1.10-scala_2.11 "/docker-entrypoint...." About a minute ago Up About a minute 6123/tcp, 8081/tcp flink-sql-training_sql-client_1
> 59d1627c412a wurstmeister/kafka:2.12-2.2.1 "start-kafka.sh" About a minute ago Up About a minute 0.0.0.0:9092->9092/tcp flink-sql-training_kafka_1
> 6711c0707f1e flink:1.10.0-scala_2.11 "/docker-entrypoint...." About a minute ago Up About a minute 6121-6123/tcp, 8081/tcp flink-sql-training_taskmanager_1
> 6a8149af6c1e flink:1.10.0-scala_2.11 "/docker-entrypoint...." About a minute ago Up About a minute 6123/tcp, 0.0.0.0:8081->8081/tcp flink-sql-training_jobmanager_1
> 3de8275dff26 wurstmeister/zookeeper:3.4.6 "/bin/sh -c '/usr/sb..." About a minute ago Up About a minute 22/tcp, 2888/tcp, 3888/tcp, 0.0.0.0:2181->2181/tcp flink-sql-training_zookeeper_1
> a28cee7627a0 mysql:8.0.19 "docker-entrypoint.s..." About a minute ago Up About a minute 3306/tcp, 33060/tcp flink-sql-training_mysql_1
docker inspect -f '{{range .NetworkSettings.Networks}}{{.IPAddress}}{{end}}' flink-sql-training_sql-client_1
> 172.18.0.7

And now we can use the left assist to browse the tables, drag and drop one to query it easily as well as leverage the autocomplete for writing more powerful queries:

SELECT 
taxiId, isStart
FROM default_database.Rides
LIMIT 100
;
SELECT
psgCnt, COUNT(*) AS cnt
FROM Rides
WHERE
isInNYC(lon, lat) AND isStart
GROUP BY psgCnt
;

The Flink Dashboard will show the SQL queries running as regular jobs:

Next

There are lot of future iterations on this first version to make it production ready but the base is getting there.

One that should be of popular interest would be to improve the SQL autocomplete which is based on Apache Calcite. Hue comes with a SDK for writing better grammars and even ships with a default Flink SQL dialect.

Another one coming soon will be a more user friendly display of the live data in the result grid.

Romain

Originally published at https://gethue.com.

--

--