Introducing BlobCity AutoAI

We created AutoML that writes code so that you don’t have to code AI

Published in World AI · 5 min read · Nov 30, 2021

BlobCity AutoAI is a new framework for Automatic AI. Of the several AutoML frameworks we studied, none produced train + test code. They do provide deployment flexibility, but almost all of them remain largely black-box.

We created BlobCity AutoAI to offer full model transparency. The framework finds the best-performing model and produces extensive, well-documented training + test code. The generated code contains the data pre-processing, feature selection, and training logic exactly as they were used to find the best-fit model, thereby making AutoAI 100% white-box.

And to top it all off, BlobCity AutoAI is 100% Free and Open Source!

If you are working as a Data Science consultant, and are required to submit the source code of a project you are working on, then BlobCity AutoAI will write the entire model development and test code for you.

AutoAI automatically generates such code 👇

Example of actual code generated by AutoAI — Image by BlobCity

Getting Started

pip install blobcity

The first step is to install the BlobCity package. The framework is Python-based and supported on Python 3.6 and above. I recommend using it in a Jupyter Notebook, but regular .py files are also fully supported.

If installing within a Jupyter Notebook, use the following line instead, with the leading exclamation mark.

!pip install blobcity
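To confirm that the install worked, a quick check like the one below can be run from a notebook cell or a script. It uses only the Python standard library; the helper name is_installed is my own and is not part of BlobCity.

```python
import importlib.util


def is_installed(package_name: str) -> bool:
    """Return True if the named top-level package can be found on the import path."""
    return importlib.util.find_spec(package_name) is not None


print("blobcity installed:", is_installed("blobcity"))
```

If this prints False inside Jupyter, the package was most likely installed into a different Python environment than the one the notebook kernel is using.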

AutoAI in Action

import blobcity as bc
model = bc.train(file="data.csv", target="target")
model.spill("generated_code.ipynb")

Just the above three lines of code👆 perform the following operations 👇

Image by BlobCity. Example from running AutoAI.

The internal steps followed are:

Loading the data file
Imputing missing values
Encoding columns
Selecting features based on feature importance
Deciding between Regression / Classification
Searching for the best performing model from amongst the complete model repository
Hyper-parameter tuning
Generating train + test source code as a Jupyter Notebook file. Includes detailed markdown documentation explaining the code
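To make a couple of these steps concrete, here is a minimal sketch of imputing missing values and deciding between Regression and Classification. It is plain Python written for illustration, not AutoAI's actual implementation: the function names, the mean/most-frequent fill rule, and the max_classes threshold are all assumptions of mine.

```python
from collections import Counter
from statistics import mean


def impute(column):
    """Fill None entries: column mean for numbers, most frequent value otherwise."""
    present = [value for value in column if value is not None]
    if all(isinstance(value, (int, float)) for value in present):
        fill = mean(present)
    else:
        fill = Counter(present).most_common(1)[0][0]
    return [fill if value is None else value for value in column]


def problem_type(target, max_classes=10):
    """Assumed heuristic: text labels or few distinct values mean Classification."""
    if any(isinstance(value, str) for value in target):
        return "Classification"
    return "Classification" if len(set(target)) <= max_classes else "Regression"


print(impute([30, None, 50]))
print(impute(["red", None, "red", "blue"]))
print(problem_type(["yes", "no", "yes"]))
print(problem_type([i * 1.5 for i in range(50)]))
```

The point of the sketch is only that each step is a small, inspectable decision, which is what makes generated white-box code useful to read.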

Woah! The world is not the same anymore 😲

Generating Predictions

df = model.predict(file="unseen_data.csv")

Pass unseen data to a trained model for generating predictions. The predict function returns a DataFrame for ease of output handling.

Input data is automatically processed and encoded, so you do not need to perform any pre-processing on unseen_data.csv.
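The idea behind this is that the encoding learned at training time is reused at prediction time. The sketch below shows that pattern in plain Python. It is an illustration under my own assumptions, not AutoAI's code: fit_encoder, transform, and the -1 code for unseen categories are hypothetical choices.

```python
def fit_encoder(column):
    """Learn an integer code for each category, in order of first appearance."""
    codes = {}
    for value in column:
        codes.setdefault(value, len(codes))
    return codes


def transform(column, codes, unknown=-1):
    """Apply codes learned at training time; unseen categories get the unknown code."""
    return [codes.get(value, unknown) for value in column]


# Codes are fitted once on the training column...
training_colors = ["red", "blue", "red"]
codes = fit_encoder(training_colors)

# ...and reused unchanged on unseen data, so both share one mapping.
unseen_colors = ["blue", "green"]
print(transform(unseen_colors, codes))
```

Refitting the encoder on the unseen file instead would silently assign different codes to the same category, which is why the mapping has to travel with the trained model.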

Evaluating Performance

model.stats()

Use the stats() function to access the accuracy/performance of a trained model. You can view the results visually with plot_prediction().

model.plot_prediction()

Image by BlobCity, showing an example of an Actual v/s Predicted plot produced by AutoAI

A line graph is drawn showing the actual value alongside the predicted value.
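The gap between the two lines can also be summarized as numbers. The sketch below computes two common regression metrics, mean absolute error and R², from the same actual and predicted values. I do not know which metrics stats() reports, so treat this as a generic illustration rather than a description of AutoAI's output.

```python
def mean_absolute_error(actual, predicted):
    """Average absolute gap between actual and predicted values."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def r2_score(actual, predicted):
    """Share of the variance in the actual values explained by the predictions."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1 - ss_res / ss_tot


actual = [3.0, 5.0, 7.0]
predicted = [2.0, 5.0, 8.0]
print("MAE:", mean_absolute_error(actual, predicted))
print("R2:", r2_score(actual, predicted))
```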

Sometimes, you might have too many data points to plot. You can pass a row limit to plot_prediction() to restrict how many points are drawn.

model.plot_prediction(100)

Image by BlobCity, showing an example of the same Actual v/s Predicted graph, limited to the first 100 rows of data

Only the first 100 data points are plotted, thereby improving the readability of the chart.

Negative indexes are also supported. You can pass a value of -100 to plot the last 100 records.
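The positive and negative limits behave much like Python slicing. The helper below is a hypothetical sketch of that row selection, not the function AutoAI uses internally.

```python
def select_rows(values, limit=None):
    """Return the first `limit` rows, or the last `abs(limit)` rows if negative."""
    if limit is None:
        return list(values)
    if limit >= 0:
        return list(values[:limit])
    return list(values[limit:])


rows = list(range(1000))
first_hundred = select_rows(rows, 100)   # rows 0 through 99
last_hundred = select_rows(rows, -100)   # rows 900 through 999
print(first_hundred[0], first_hundred[-1], last_hundred[0], last_hundred[-1])
```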

The plot_prediction function is shared across problem types. An actual v/s predicted plot only appears if the model is trained for a Regression problem. In the case of a Classification problem, the same function will automatically plot a confusion matrix.

model.plot_prediction()

Image by BlobCity, showing an example of a confusion matrix produced by AutoAI
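For readers new to confusion matrices, the sketch below builds one by hand from actual and predicted labels. It is a generic illustration in plain Python; the function names are mine, and AutoAI's own plotting code may be organized differently.

```python
from collections import Counter


def confusion_matrix(actual, predicted):
    """Rows are actual labels, columns are predicted labels, both in sorted order."""
    labels = sorted(set(actual) | set(predicted))
    counts = Counter(zip(actual, predicted))
    matrix = [[counts[(a, p)] for p in labels] for a in labels]
    return labels, matrix


def accuracy(actual, predicted):
    """Fraction of rows where the predicted label matches the actual label."""
    return sum(a == p for a, p in zip(actual, predicted)) / len(actual)


actual = ["cat", "dog", "dog", "cat", "dog"]
predicted = ["cat", "dog", "cat", "cat", "dog"]
labels, matrix = confusion_matrix(actual, predicted)
print(labels)
for row in matrix:
    print(row)
print("accuracy:", accuracy(actual, predicted))
```

The diagonal holds the correct predictions; everything off the diagonal is a specific kind of mistake, which a single accuracy figure hides.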

Generating Code

The model.spill() function generates the complete source code corresponding to a trained model. You can choose between an ipynb and a py output file, and generate the code with or without extensive documentation. Examples below 👇

model.spill("code.ipynb")              # Jupyter notebook with full docs
model.spill("code.py")                 # Python file with minimal docs
model.spill("code.py", docs=True)      # Python file with full docs
model.spill("code.ipynb", docs=False)  # Jupyter notebook without docs
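Going by the comments in these examples, the default for docs appears to depend on the file extension: notebooks get full documentation and Python files get minimal documentation unless you say otherwise. The helper below sketches that rule. It is my reading of the examples, not code taken from the library, and resolve_docs is a hypothetical name.

```python
from pathlib import Path


def resolve_docs(path, docs=None):
    """Return the effective docs setting for an output path (assumed defaults)."""
    if docs is not None:
        return docs  # an explicit choice always wins
    suffix = Path(path).suffix.lower()
    if suffix == ".ipynb":
        return True   # assumed default: notebooks get full docs
    if suffix == ".py":
        return False  # assumed default: Python files get minimal docs
    raise ValueError(f"Unsupported output type: {suffix!r}")


for path, docs in [("code.ipynb", None), ("code.py", None),
                   ("code.py", True), ("code.ipynb", False)]:
    print(path, docs, "->", resolve_docs(path, docs))
```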

As my title says, “You don’t have to code AI”; AutoAI truly stands by this bold claim. Let AutoAI write the complete source code of your next AI project.

Yes, DNN is Supported!

AutoAI fully supports Deep Learning Neural Networks, as well as simple Artificial Neural Networks. AutoAI first tries to find the best fit solution from amongst basic statistical models. AutoAI switches to architecting a neural network for improved accuracy if the statistical models fail to provide a satisfactory fit. It is all fully automated, so you don’t need to do a thing. And yes, AutoAI generates full train + test source code for the DNN model as well. It also produces source code for you to perform transfer learning in the future.
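The fallback described here can be pictured as a simple two-stage search. The sketch below is purely illustrative: the 0.8 threshold, the pick_model name, and the rule of keeping the statistical model when the network does not beat it are assumptions of mine, not documented AutoAI behavior.

```python
def pick_model(statistical_results, build_neural_net, threshold=0.8):
    """Return (name, score) of the chosen model.

    statistical_results: list of (name, score) pairs already evaluated.
    build_neural_net: callable that trains a network and returns (name, score).
    """
    best = max(statistical_results, key=lambda result: result[1])
    if best[1] >= threshold:
        return best  # satisfactory fit, so the network is never built
    neural = build_neural_net()
    return max(best, neural, key=lambda result: result[1])


# Statistical models are good enough: the (expensive) network is skipped.
print(pick_model([("linear", 0.90), ("tree", 0.85)], lambda: ("dnn", 0.95)))

# Statistical models fall short: the network is tried and wins.
print(pick_model([("linear", 0.50), ("tree", 0.45)], lambda: ("dnn", 0.70)))
```

Passing the network builder as a callable keeps the costly step lazy, so it only runs when the cheaper models fail to reach the threshold.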

Saving Your Model

model.save("./my_model.pkl")

A trained model can be saved to a pickle file. All model configurations, such as selected features and column encoding techniques, are saved with it and restored when the model is loaded.

model = bc.load("./my_model.pkl")

A saved model can be ported to a production server and deployed. Simply load the model and invoke model.predict() to generate a new prediction.
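If you have not worked with pickle files before, the round trip looks like the sketch below. It uses Python's standard pickle module on a plain dictionary that stands in for a trained model; the dictionary keys are hypothetical and do not reflect the actual contents of a BlobCity model file.

```python
import os
import pickle
import tempfile


def save_artifact(obj, path):
    """Serialize an object to a pickle file."""
    with open(path, "wb") as handle:
        pickle.dump(obj, handle)


def load_artifact(path):
    """Load an object back. Only unpickle files you trust: loading can run code."""
    with open(path, "rb") as handle:
        return pickle.load(handle)


# Hypothetical stand-in for what a trained model might carry with it.
artifact = {
    "problem_type": "Classification",
    "selected_features": ["age", "income"],
    "encoders": {"city": {"Pune": 0, "Delhi": 1}},
}

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "my_model.pkl")
    save_artifact(artifact, path)
    restored = load_artifact(path)

print(restored == artifact)
```

Because unpickling can execute arbitrary code, only load model files on your production server that come from a source you control.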

Accelerated Training

Training a model can be a resource-hungry task. If your dataset is large, chances are a CPU alone won’t suffice. AutoAI supports GPU acceleration. If you are experiencing long runtimes on the CPU, consider running AutoAI on a server with a GPU. AutoAI will automatically discover and utilize any available GPUs to reduce your training time.
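Before moving a long-running job to a GPU server, a rough check like this can confirm whether the NVIDIA driver tools are visible at all. It only looks for the nvidia-smi utility on the PATH, which is a heuristic of my own: it is not how AutoAI discovers GPUs, and it does not prove that any particular framework can use the device.

```python
import shutil


def nvidia_tools_present() -> bool:
    """Rough check: True if the nvidia-smi utility is found on the PATH."""
    return shutil.which("nvidia-smi") is not None


print("NVIDIA driver tools found:", nvidia_tools_present())
```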

Conclusion

Use AutoAI to the fullest in your next AI project. It will save you a ton of time in finding the best-fit model. It works for traditional ML as well as DNN models. We currently support only Regression and Classification problems on structured data. Image and video analysis support is in the works. We recommend that you watch our GitHub repository to be notified of our latest updates and feature releases.

I encourage you to run your next project on BlobCity AutoAI. I would love to hear your feedback and anything that can help us make AutoAI better.

GitHub - blobcity/autoai

Python-based framework for Automatic AI for Regression and Classification over numerical data. Performs model search…

github.com