ColabWithMe: A GPT Specialized in Google Colab for Data Analysis & ML

Jeronimo De Leon
4 min read · Jan 12, 2024

Generative AI is redefining the programming landscape, inviting a non-engineering audience into the realm of software development. This shift, highlighted in the Harvard Business Review’s “We’re All Programmers Now,” centers on the concept of ‘citizen developers’: individuals who, despite lacking traditional coding experience, can now build complex applications, work previously reserved for skilled programmers. This evolution is not just about enabling non-technical employees to code; it’s about leveraging their unique insights and perspectives in technology creation, thereby enriching the specialized software landscape.

Citizen Developers and the Evolution of Generative Coding

The emergence of citizen developers presents an opportunity for tech organizations to harness a wider pool of talent and creativity. However, this comes with its own set of challenges. Organizations must navigate how to effectively support and guide these new developers, ensuring quality and governance without stifling innovation. Striking this balance is crucial for reaping the benefits of citizen development while preventing problems such as substandard production applications or breaches in data security.

Citizen data scientists have been a part of the analytics landscape for some time, thanks to tools like Dataiku and Akkio that have made advanced data analysis and machine learning more approachable. These platforms have simplified complex data processes with user-friendly no-code interfaces designed specifically for business use cases. The integration of data analysis functionality into chat interfaces, such as OpenAI’s Data Analyst GPT, represents another leap in making data analysis accessible. ChatGPT is, however, limited in the size of data it can handle directly, so a more effective strategy is to use ChatGPT to generate the code for data analysis rather than to process the data itself. This raises the question: what is the next step in developing and experimenting with AI-generated code, specifically for data analysis and machine learning?

Executing AI-Generated Code through Notebooks

While ChatGPT is good at generating code snippets, finding the right platform for executing and learning from these snippets can be challenging. Jupyter Notebooks are a good solution for those who prefer to run code on their local machine and need easy access to their local files. Well suited to citizen developers and non-technical users, they provide an interactive platform for experimenting with AI-generated code, particularly in data analysis projects.

Building on this, Google Colaboratory, or ‘Colab’ for short, is an online alternative to Jupyter Notebooks that is easily accessible within Google Drive and requires no setup. To start using Colab, navigate to ‘New’ in Google Drive, select ‘More’, and choose ‘Colaboratory’. If it’s not immediately available, it can be added through ‘Connect more apps’ by searching for Colaboratory. A product from Google Research, Colab allows anyone to execute Python code through the browser, making it particularly well suited to data analysis and machine learning. Colab enhances the Jupyter Notebook experience by offering a more user-friendly and shareable environment. Its accessibility and ease of use make it an ideal choice for beginners and experienced users who want to work together on complex data-driven tasks.

Over the past few months, I’ve worked on several GPTs tailored to internal processes and tasks. The latest is designed to help business users get started with data tasks in Google Colab.

Introducing ‘ColabWithMe’: A Specialized GPT for Python Coding in Google Colab

https://chat.openai.com/g/g-FOQy9agEW-colabwithme

“ColabWithMe” is a specialized GPT designed to be a collaborative assistant that generates Python code for data analysis tasks in Google Colab. In my experience, customizing a GPT for a specific programming language and coding environment significantly improves the quality of its code responses. ColabWithMe is instructed to provide a detailed, step-by-step guide to setting up your desired function with Python in Google Colab, accompanying each step with the appropriate generated code.

The GPT’s responses focus on:

  1. Defining the packages that need to be installed for the desired function.
  2. Providing the import code for the function.
  3. Providing a function to import data.
  4. Providing a function to transform the data into the user’s desired output.
  5. Providing a function to print and export the data.
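
To make the five steps concrete, here is a minimal sketch of what a notebook following that structure might look like. It is not actual ColabWithMe output: the function names, the ‘region’ and ‘amount’ columns, and the sample data are all hypothetical, and the sketch sticks to the Python standard library so that it runs without installing anything.

```python
# Step 1: packages to install.
# In a Colab cell this step is usually a line such as `!pip install <package>`.
# This sketch uses only the Python standard library, so nothing is installed.

# Step 2: imports.
import csv
import io
from collections import defaultdict


# Step 3: a function to import data.
def load_rows(csv_text):
    """Parse CSV text into a list of dicts, one per row."""
    return list(csv.DictReader(io.StringIO(csv_text)))


# Step 4: a function to transform the data into the desired output.
def total_sales_by_region(rows):
    """Sum the hypothetical 'amount' column for each 'region'."""
    totals = defaultdict(float)
    for row in rows:
        totals[row["region"]] += float(row["amount"])
    return dict(totals)


# Step 5: a function to print and export the data.
def print_and_export(totals):
    """Print each regional total and return the result as CSV text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["region", "total_amount"])
    for region, total in sorted(totals.items()):
        print(f"{region}: {total:.2f}")
        writer.writerow([region, f"{total:.2f}"])
    return buffer.getvalue()


# Hypothetical sample data, made up for illustration only.
SAMPLE_CSV = "region,amount\nNorth,100.50\nSouth,80.00\nNorth,19.50\n"

rows = load_rows(SAMPLE_CSV)
totals = total_sales_by_region(rows)
exported = print_and_export(totals)
```

In a real notebook, the sample string would typically be replaced by a file uploaded to Colab or read from Google Drive, and the exported text would be written to a file. Keeping each step in its own cell or function makes it easier to rerun and adjust one step at a time.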

Here are a few use cases to try with your data:

  1. User Interaction Data: Leverage usage metrics to refine product features and user experience.
  2. Sales Transaction Data: Analyze historical sales figures to forecast trends and inform strategic decisions.
  3. Demographic and Behavioral Data: Utilize customer data to create targeted segments for personalized marketing.
  4. Customer Review Text: Apply NLP to textual feedback for insights into customer satisfaction and preferences.
  5. Online Review Data: Examine competitor product reviews to identify market trends and opportunities for innovation.

Google Colab serves as a great starting point for business users keen on exploring coding. For more sophisticated application development, platforms like Replit and Streamlit offer intuitive interfaces to convert Python data scripts into shareable web applications.

For deeper data analysis and machine learning, Google Colab can be extended to access more powerful computing through Google Colab Enterprise within the Google Cloud ecosystem. I currently use Colab Enterprise to evaluate, test, and train Large Language Models for our startup Intelas, a real estate finance data management and insights company. Google Cloud has effectively integrated Colab notebooks with their generative AI training pipelines, offering a seamless experience for advanced enterprise users.

Empowering Citizen Developers in the New Era of Coding

In this new age of generative AI, we are witnessing a pivotal shift where the ability to code and harness data science is becoming a universal opportunity. This transformative period marks the beginning of a significant change in how businesses and individuals operate, opening doors for everyone to be a part of this technological revolution as ‘citizen developers’.

Jeronimo De Leon

AI Lead at www.Intelas.com, founder of www.Welcome.AI — Helping companies learn and adopt artificial intelligence within their business.