Connect Excel to Databricks SQL Warehouse

Franco Patano
4 min readNov 29, 2022

Microsoft Excel is one of the most versatile tools that exist in the modern business stack. With Excel, it's simple for most business users to interact with their data. Databricks SQL Serverless Warehouse is a Simple, Secure, and Fast SQL Warehouse built on open-source technologies like Apache Spark and Linux Delta Lake that can boot and scale in seconds. DBSQL Warehouses are simple, open, and economic in a world of costly proprietary and premium warehouses. With Lakehouse on Databricks, you can serve all your data and AI use cases on the same data because Open Source Delta Lake can be leveraged as files or tables. Files for ETL, Data Science, and Machine Learning, and tables for accessing the data with SQL similar to other warehouses. All secured and audited with the unified catalog we call Unity.

Let's take a look at how to configure your local desktop to connect to Databricks SQL Warehouse so you can leverage your Delta Lake data in Excel. Before we go on, this is a gentle reminder that Excel has limits, so be wary of doing SELECT * type of queries to pull data into Excel. 1,048,576 is the limit of rows in a spreadsheet in the latest version of Excel.

Download ODBC Driver

Navigate to https://databricks.com/spark/odbc-drivers-download and download the appropriate version 32 or 64-bit. You will need to download the version that aligns with your MS Office installation. If you need help determining which version you have, head over here for the directions.

Install and Configure the Driver

Unzip the downloaded file, and Install the included driver.

Navigate to the ODBC Data Source Administrator, ensure you choose the right version (32 or 64-bit, based on your download and MS Office configuration), and open it.

In User DSN, Click Add, then choose the new Spark ODBC Driver we just installed and click Finish.

Next, give this data source a name, in this example, I creatively called it Databricks.

Then, copy the Server Hostname to the Host(s) box, and the Port, to the Port box.

Token Method

Choose the User Name and Password for the Authentication Mechanism.

In User Name, type the word “token”.

In Password, paste your token. If you do not have a token, follow these directions.

Next, click on the HTTP Options button at the bottom.

Then in the HTTP Path box, copy and paste the HTTP Path from the Warehouse Details screen.

Next Click on the SSL Options button, and click Enable SSL. Then Click OK.

If you have custom certificates, This would be where you configure those options.

Finally, Click on Test Connection. If you are using a Serverless SQL Warehouse it should boot in seconds, or if you are using Classic or Pro ensure your Warehouse is On and Active or you will have to wait for it to boot. You should get a Test Successful prompt!

Now let's go to Excel, in the Data tab, click Get Data, then navigate to From ODBC.

Next Click the dropdown, and select the name of the connection you created in the earlier ODBC step, then click OK.

Now use the Navigator to navigate to your table. Alternatively, you could click on Advanced Options in the previous step, and write a SQL statement for the data. It's advisable to put filters on your data and resist the urge to SELECT * FROM table;. This could result in issues if the table has more than 1048576 rows. If you really want all the records up to the limit, you could append LIMIT 1048576 to your SELECT to bring back as many rows as possible without errors.

Happy spreadsheeting with your Databricks Serverless SQL Warehouse powered Excel!

--

--

Franco Patano

I spend my time learning, practicing, and having fun with data in the cloud.