JetBrains DataSpell: A brand new awesome IDE for data scientists

Peter Allen
Published in Analytics Vidhya
3 min read · Sep 10, 2021
DataSpell — a new IDE for Data Scientists

Recently JetBrains, the makers of the much-loved PyCharm and IntelliJ IDEA among various other offerings, released for public trial something that I’ve been looking for for quite a while: DataSpell, an IDE that combines many of the tools I currently use into a single easy-to-use package.

So far during the trial I have been able to pretty much stop using the main tools I rely on as a professional data analyst:

  • TOAD Data Point (or DBeaver) as my SQL Editor
  • Sublime Text (and Jupyter Notebooks) as my Python script editors
  • Git Bash for source control

Over three short articles I will go through how it performs in each of these roles, using a real example built on the Google BigQuery public COVID-19 dataset. I start with the SQL editor, as it was the part I was most excited about having in an IDE.

I cover the background of this dataset and connection in my other post:

Database Integration (SQL Editor):

To test this functionality I connected to the COVID-19 public dataset. This is really easy to do: just open the “Database Explorer” panel on the right-hand side of the workspace, as shown below.

Database Explorer Panel

After this, the following screen appears. Here you just need to download the drivers, then enter the service account email address, the Project ID and the path to the key file. See my article above for how to create these for your own project (especially for creating the project in Google), or check out this really clear page from JetBrains themselves: https://www.jetbrains.com/help/dataspell/connect-to-bigquery.html#connecting-to-bigquery-with-a-google-service-account

Connection Screen — click “Test Connection” to check your connection
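If you are not sure which values to type into the connection screen, the service account email and Project ID are both stored in the JSON key file you downloaded from Google. Below is a minimal sketch in Python that reads them back out. The function name `connection_fields` is made up for this example, and it assumes a standard service account key file containing the `type`, `client_email` and `project_id` fields.

```python
import json
from pathlib import Path


def connection_fields(key_path):
    """Read a service account key file and return the values the
    connection screen asks for. (Hypothetical helper, not part of DataSpell.)"""
    path = Path(key_path).expanduser().resolve()
    with path.open(encoding="utf-8") as fh:
        key = json.load(fh)

    # Assumption: a service account key file has "type" set to "service_account".
    if key.get("type") != "service_account":
        raise ValueError(f"{path} does not look like a service account key file")

    return {
        "service_account_email": key["client_email"],
        "project_id": key["project_id"],
        "key_file_path": str(path),
    }
```

Calling `connection_fields` with the path to your own key file returns a dictionary whose three values can be copied into the matching fields on the connection screen.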

After this you will be connected to the COVID-19 database. One thing that confused me for a little while is that your connection will appear in the Database Explorer but no schemas will be selected, as shown below. Just click on the “0 of 1” and tick the check box to make the schemas active.

The schemas default to unselected when first added

Now you can open a new Database Console to write your queries. To test it out I used the same queries I used in my previous tutorial. You can see the queries are nicely formatted automatically and the output is displayed at the bottom of the screen. If multiple queries are run at the same time, the results of each query will be displayed in a separate tab.

Running a simple query on the BigQuery data, with nice auto formatting

So as you can see, the database connection part of the IDE works quite well and is very simple to set up. I’ve also connected to Redshift for other uses, and that was even more seamless.

In my next post I will test out the second part of the IDE that excites me: using Jupyter notebooks within the IDE without needing to open a browser. See this new post here:

Peter Allen
Analytics Vidhya

Data Analyst in Melbourne Australia. Ex-mechanical engineer who transitioned across due to the love of all things data. Beekeeper. DIY. Tinkerer.