Let’s Begin! — Showcase your AI demo with Two Clicks!

Heting Li
6 min read · Apr 23, 2024


Photo by Eberhard Grossgasteiger
1. Introduction

Studying the algorithms and formulas behind AI models can be cumbersome, but it is entirely feasible to create an AI program without a deep understanding of how the model is constructed. In addition, the online Jupyter Notebook service (a web-based interactive development environment) lets us run Python code without any local development setup.

2. Experiment

Let us start with a scikit-learn classification example using an SVM model, and run the AI demo on the online Jupyter Notebook service.

The demo is the scikit-learn “Classification” example Recognizing hand-written digits. The purpose of this classification task is to recognize images of hand-written digits, and we will use JupyterLite for the online coding.

Step 1: Open the scikit-learn example page Recognizing hand-written digits.

Step 2: Scroll down to the bottom of the page and click the “launch lite” icon. You will be directed to a JupyterLite web page showing the Jupyter Notebook (file extension .ipynb), which consists of code cells, code descriptions, and the images plotted by the default code execution.

Step 3: Rerun the code yourself. Click the button marked below to “Restart the kernel and run all cells”, then click “Restart” in the prompt dialog. Afterwards, the existing results are cleared.

Step 4: Click the button marked below to “Run the cell and advance” multiple times to step through the code. Also notice that the kernel status indicator at the top right of the page is spinning, which indicates that code execution is ongoing.

Step 5: After a few seconds the execution finishes and the model validation results are generated. Note: make sure only one JupyterLite page is open at a time; otherwise, the page may run into errors during execution.

3. AI Case Walkthrough (Optional Reading)

The Jupyter Notebook already explains the code quite well, and any code cell can be examined further with a Copilot-style tool for more detail. Nevertheless, let us provide a short outline as a complement to the notebook.

The data bundled in scikit-learn’s datasets.load_digits has already been pre-processed at the source of the dataset. At a high level, each image goes through the pre-processing flow shown below.
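To see what the pre-processed dataset looks like, here is a minimal sketch (assuming scikit-learn is available, as it is in JupyterLite) that loads and inspects it:

from sklearn import datasets

# Load the bundled hand-written digits dataset: 1,797 samples of 8x8 grayscale images
digits = datasets.load_digits()
print(digits.images.shape)  # (1797, 8, 8)
print(digits.target[:10])   # the true digit for each image, e.g. [0 1 2 3 4 5 6 7 8 9]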

The outcome is that each pre-processed image becomes an array, which is stored as one row in a CSV-like dataframe. The complete dataset is simply the concatenation of all images represented as such rows, with the true target number appended at the end for training purposes.
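In the notebook this flattening is a single reshape; roughly:

# Flatten each 8x8 image into one row of 64 pixel values,
# giving a 2D table of shape (n_samples, 64)
n_samples = len(digits.images)
data = digits.images.reshape((n_samples, -1))
print(data.shape)  # (1797, 64)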

This image pre-processing flow represents a fundamental data-processing step in AI modeling. Most contemporary graphical AI, including deep learning models, is built on the same preprocessing principle, only at a much larger scale, in higher dimensions, and enhanced by more advanced algorithms.

The next step is the segmentation into a training set and a test set, which is a crucial step in AI development. In this example the data is divided 50%-50%, without shuffling the samples. X_train contains the flattened images, each represented by 64 integers, and y_train contains the corresponding target, i.e. the true digit for each image.

from sklearn.model_selection import train_test_split

# Split the data 50/50 into training and test sets, keeping the original order
X_train, X_test, y_train, y_test = train_test_split(
    data, digits.target, test_size=0.5, shuffle=False
)

Following the dataset segmentation, model selection and optimization is the core of AI modeling. In this example the selected model is a Support Vector Machine (SVM), which scikit-learn supports as documented in 1.4. Support Vector Machines. We will introduce these traditional AI models and how to use them in the following tutorials.

from sklearn import svm

# Create a classifier: a support vector classifier
clf = svm.SVC(gamma=0.001)
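The classifier is then fitted on the training half of the data, as in the notebook:

# Learn to recognize digits from the training subset
clf.fit(X_train, y_train)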

The final part of this example is the prediction step, which evaluates the model. The confusion matrix is one of the methods scikit-learn provides to visualize the quality of predictions. There are various scoring mechanisms supported by scikit-learn, which we will experiment with in the following tutorials.
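A minimal sketch of that evaluation, along the lines of the notebook (assuming the classifier above has been fitted), would be:

from sklearn import metrics

# Predict the digit for each image in the test subset
predicted = clf.predict(X_test)

# Per-class precision, recall and f1-score
print(metrics.classification_report(y_test, predicted))

# Confusion matrix: rows are true digits, columns are predicted digits
disp = metrics.ConfusionMatrixDisplay.from_predictions(y_test, predicted)
disp.figure_.suptitle("Confusion Matrix")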

4. Conclusion

With just a few clicks in a Jupyter Notebook, we have navigated through the most crucial stages of AI programming: Data Preprocessing, AI Model Selection, and AI Model Evaluation.

The sequence of Data Pre-processing, AI Model Selection and Tuning, and AI Model Evaluation outlined in this example is typical for all standard AI projects and often involves several iterations. Frequently, it is necessary to switch models, tune parameters, and even enhance the original data to secure satisfactory evaluation results.

Furthermore, following AI model development, AI model engineering is an essential phase to bring the AI model into production. It generally adheres to the standard product development cycle, although tailored to the specific hardware and software needs of AI programming.

5. What is the Next Step?

Well Begun is Half Done!

Defining a Use Case that is relevant to your specific task is the most critical first step.

Are you excited to explore the AI modeling process in your private development environment and develop your own AI use case? Let’s continue this journey in our next tutorial, Traditional AI — Development Environment Setup.

If you’re unsure about a Use Case, No Worries! The upcoming tutorial, which includes setting up the development environment, will likely inspire you and help guide us deeper into the world of AI!

6. Appendix — Commercial Examples

Last but not least, we would like to share examples of commercial use cases that are developed with or supported by scikit-learn (referenced from Scikit-learn Testimonials). We hope they provide some inspiration for developing our own AI use cases.

  • J.P.Morgan — Banking business, including classification and predictive analytics
  • Inria — Scikit-Learn is used to support leading-edge basic research in many teams: Parietal for neuroimaging, Lear for computer vision, Visages for medical image analysis, Privatics for security
  • Spotify — Music recommendations
  • Hugging Face — Assist NLP AI development
  • Booking.com — Implementing standard algorithms for prediction tasks, including recommending hotels and destinations, detecting fraudulent reservations, or scheduling service agents
  • Birchbox — E-commerce operations assistance, including product recommendation, user clustering, inventory prediction, trends detection
  • PeerIndex — Building the Influence Graph, a unique dataset that allows identifying who is influential in a given social context.
  • Apartmentguide — Recommending the best apartments, including understanding user behavior, improving data quality, and detecting fraud.
  • Mars — Scikit-Learn is integral to the Machine Learning Ecosystem at Mars, which supports responsible business practices for a sustainable future, including climate action, land use, water stewardship, and sustainable food sourcing.
