Making Airflow Pods Use a Private Google Cloud SQL Connection

Alexa Griffith
Published in The Startup
5 min read · Apr 28, 2020

Bluecore’s Data Science team uses Airflow for our model workflows. Until recently, each of our Airflow pods ran a Cloud SQL proxy as a sidecar container. The proxy handled database connections to Cloud SQL, which is where we get information such as the state of a task or XComs. Google recently made it possible to connect to Cloud SQL over a private IP in a VPC. Because of this, we decided to remove the proxy and use a private connection, both to be more secure and to save resources. As a result, Cloud SQL connections no longer go through localhost, and we no longer create a sidecar container just for them.

Figure: the web pod containing the Cloud SQL proxy (top left) versus the web pod using the private IP (bottom right)
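To make the change concrete, here is a minimal sketch of how the database connection string differs between the two setups. It assumes a PostgreSQL instance, the `psycopg2` driver, and made-up credentials and addresses (`airflow`, `p@ss`, `10.0.0.5`); none of these come from our actual deployment, and `build_sql_alchemy_conn` is a hypothetical helper, not part of Airflow. With the proxy sidecar the host is localhost; with the private connection it is the instance's private IP inside the VPC.

```python
from urllib.parse import quote_plus


def build_sql_alchemy_conn(user, password, host, database,
                           port=5432, driver="postgresql+psycopg2"):
    """Hypothetical helper: build a SQLAlchemy-style connection URI.

    The user and password are URL-quoted so that characters such as '@'
    do not break the URI.
    """
    return (
        f"{driver}://{quote_plus(user)}:{quote_plus(password)}"
        f"@{host}:{port}/{database}"
    )


# Before: the Cloud SQL proxy sidecar listens on localhost inside the pod.
proxy_conn = build_sql_alchemy_conn("airflow", "p@ss", "127.0.0.1", "airflow")

# After: connect straight to the instance's private IP (assumed address).
private_conn = build_sql_alchemy_conn("airflow", "p@ss", "10.0.0.5", "airflow")

print(proxy_conn)
print(private_conn)
```

In Airflow 1.10 a string like this is typically supplied through the `AIRFLOW__CORE__SQL_ALCHEMY_CONN` environment variable, so in a setup like the one assumed here only the host part of that value would change.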

XComs

XComs allow “cross-communication” of information between task instances. You can either “push” or “pull” an XCom.

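from airflow.operators.python_operator import PythonOperator  # Airflow 1.10-style import

def get_result_one(**kwargs):
    # code to calculate some result
    result = None  # placeholder for the calculated result
    # push the result to XCom so that other tasks can pull it
    kwargs['ti'].xcom_push(key='some_key', value=result)

# `dag` is assumed to be a DAG object defined elsewhere in the file
operator_1 = PythonOperator(
    task_id='task1',
    python_callable=get_result_one,
    provide_context=True,
    dag=dag,
)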
def get_result_two(task_ids…
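As a rough illustration of the push and pull sides together, the sketch below runs two callables against a small in-memory stand-in for the task instance, so it can be tried without a scheduler or database. `FakeTaskInstance`, `push_result`, and `pull_result` are hypothetical names invented for this example. Only the `xcom_push(key=..., value=...)` and `xcom_pull(task_ids=..., key=...)` call shapes mirror Airflow's task instance methods, and the real methods accept more arguments than shown here.

```python
class FakeTaskInstance:
    """Hypothetical in-memory stand-in for Airflow's TaskInstance."""

    # Shared across instances: maps (task_id, key) -> pushed value.
    _store = {}

    def __init__(self, task_id):
        self.task_id = task_id

    def xcom_push(self, key, value):
        # Record the value under the pushing task's id and the given key.
        FakeTaskInstance._store[(self.task_id, key)] = value

    def xcom_pull(self, task_ids=None, key="return_value"):
        # Look up what the named upstream task pushed under this key.
        return FakeTaskInstance._store.get((task_ids, key))


def push_result(**kwargs):
    result = 42  # placeholder for a calculated result
    kwargs["ti"].xcom_push(key="some_key", value=result)


def pull_result(**kwargs):
    # Pull the value that task "task1" pushed under "some_key".
    return kwargs["ti"].xcom_pull(task_ids="task1", key="some_key")


push_result(ti=FakeTaskInstance("task1"))
pulled = pull_result(ti=FakeTaskInstance("task2"))
print(pulled)
```

In a real DAG the task instance is supplied by Airflow through the context (hence `provide_context=True` on the operator), and the pushed values are stored in the metadata database, which is why the Cloud SQL connection matters for XComs.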

Alexa Griffith
Software Engineer at Bloomberg. All opinions are my own! :)