Data Science Academy Camp 2: Harnessing the Power of Data with Machine Learning

Muhammad Iqbal
Published in COMPFEST
6 min read · Aug 22, 2023

COMPFEST 15, held in Depok, ran the second camp of the Data Science Academy (DSA), Camp 2, from August 14 to 19, 2023. The camp took place online over Zoom and spanned four days. The participants, the top 10 teams pre-selected by COMPFEST, explored machine learning in data science, guided by speakers experienced in their fields. Curious about what happened? Find out in the article below!

Day 1 — Introduction to Machine Learning

Data Science Academy Camp 2 opened with a first day devoted to “Introduction to Machine Learning.” The discussion covered data, machine learning foundations, and feature engineering. Andi Sulasikin, a senior data scientist at Jakarta Smart City, delivered the material. Andi highlighted the exponential growth of data driven by the rapid development of information technology. Collected data must be organized before it becomes valuable information, which in turn forms the basis of the knowledge applied in problem-solving and business processes.

According to Andi, data is the cornerstone of artificial intelligence. Machine learning mirrors human learning: just as humans learn from experience, machines learn from data. Turning to machine learning itself, Andi emphasized a crucial point: not every problem needs a machine learning solution, and not every problem can be solved with traditional programming. We must discern when to use machine learning and when conventional programming suffices. Andi then outlined the types of machine learning: supervised, unsupervised, and reinforcement learning.

The following segment, “Feature Engineering,” explained that this process takes place after data preparation and before modeling. Andi detailed feature engineering techniques such as scaling, log transformation, one-hot encoding, outlier handling, and imputation. The day concluded with a Q&A session between the participants and Andi, along with a quiz guided by the MC.
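To make these five techniques concrete, here is a minimal sketch in plain Python. The toy income and color columns, the cap of 10.0, and all helper names are assumptions made for illustration, not material from Andi's session; in real projects a library such as pandas or scikit-learn would usually handle these steps.

```python
import math
from statistics import median

def impute_median(values):
    """Imputation: replace missing entries (None) with the median of the rest."""
    observed = [v for v in values if v is not None]
    fill = median(observed)
    return [fill if v is None else v for v in values]

def clip_outliers(values, low, high):
    """Outlier handling: pull extreme values back into a chosen range."""
    return [min(max(v, low), high) for v in values]

def min_max_scale(values):
    """Scaling: map values linearly onto the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:  # constant column: avoid dividing by zero
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

def log_transform(values):
    """Log transformation: compress large values with log(1 + x)."""
    return [math.log1p(v) for v in values]

def one_hot(labels):
    """One-hot encoding: one 0/1 column per category, in sorted order."""
    categories = sorted(set(labels))
    return [[1 if label == c else 0 for c in categories] for label in labels]

# Hypothetical toy data (not from the camp).
income = [3.0, 4.0, None, 5.0, 4.0, 100.0]
color = ["red", "blue", "red"]

imputed = impute_median(income)               # None becomes the median, 4.0
clipped = clip_outliers(imputed, 0.0, 10.0)   # 100.0 is capped at 10.0
scaled = min_max_scale(clipped)               # smallest -> 0.0, largest -> 1.0
logged = log_transform(imputed)               # alternative way to tame 100.0
encoded = one_hot(color)                      # columns are ["blue", "red"]
```

The order used here (impute, then handle outliers, then scale) is one reasonable choice; scaling before capping the outlier would squeeze the ordinary values into a tiny part of the range.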

Day 2 — Supervised Learning

The second day of Data Science Academy Camp 2 delved into “Supervised Learning.” This theme explored techniques within supervised learning, including linear regression, logistic regression, and random forest. The material was presented by Muhammad Angga Muttaqin, Co-founder and CEO of Indonesia AI.

Angga elaborated on the distinctions among artificial intelligence, machine learning, and deep learning. He illustrated this relationship as a progression from general to specific. Focusing on supervised learning, Angga then turned to linear and logistic regression, fundamental models within this domain.

Linear and logistic regression, often referred to as the “hello world” of data science, are essential knowledge for every data scientist, according to Angga. Although they are no longer the primary models in the field, they remain crucial foundations. Linear regression predicts continuous values, while logistic regression predicts discrete ones, such as binary outcomes.
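The contrast between the two models can be sketched in a few lines of plain Python. The study-hours data, the learning rate, and the function names below are hypothetical, chosen only for illustration; the hands-on session itself likely relied on a library rather than hand-written fitting code.

```python
import math

def fit_linear(xs, ys):
    """Ordinary least squares for one feature: y ~ slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    slope = cov / var
    return slope, mean_y - slope * mean_x

def sigmoid(z):
    """Squash any real number into a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(xs, labels, lr=0.1, epochs=500):
    """Logistic regression for one feature, trained by gradient descent."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        errors = [sigmoid(w * x + b) - y for x, y in zip(xs, labels)]
        w -= lr * sum(e * x for e, x in zip(errors, xs)) / n
        b -= lr * sum(errors) / n
    return w, b

# Hypothetical toy data: hours studied by six students.
hours = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
scores = [52.0, 54.0, 56.0, 58.0, 60.0, 62.0]  # continuous target
passed = [0, 0, 0, 1, 1, 1]                    # binary target

# Linear regression outputs a number on a continuous scale.
slope, intercept = fit_linear(hours, scores)

# Logistic regression outputs a probability for a discrete outcome.
# Centering the feature keeps the gradient descent well behaved.
mean_hours = sum(hours) / len(hours)
w, b = fit_logistic([h - mean_hours for h in hours], passed)

def pass_probability(h):
    return sigmoid(w * (h - mean_hours) + b)
```

Calling `pass_probability(1.0)` gives a value below 0.5 and `pass_probability(6.0)` a value above it, which is how a probability is turned into a yes/no prediction.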

Moving on to practical applications, Angga introduced the random forest model, which can be used for both regression and classification tasks. The discussion extended to the relationship between entropy and random forest.
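The article does not detail how entropy entered the discussion, but a common connection is that the decision trees inside a random forest can choose their splits by how much they reduce entropy (information gain). The sketch below illustrates that idea under this assumption; the labels and function names are hypothetical.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy in bits: 0 for a pure group, 1 for a 50/50 binary mix."""
    total = len(labels)
    return sum(-(count / total) * math.log2(count / total)
               for count in Counter(labels).values())

def information_gain(parent, left, right):
    """Entropy of the parent minus the size-weighted entropy of its children."""
    n = len(parent)
    children = (len(left) / n) * entropy(left) + (len(right) / n) * entropy(right)
    return entropy(parent) - children

# Hypothetical labels at one tree node.
parent = ["yes", "yes", "no", "no"]

# A split that separates the classes perfectly removes all uncertainty.
good_split = information_gain(parent, ["yes", "yes"], ["no", "no"])

# A split that leaves both children mixed removes none.
poor_split = information_gain(parent, ["yes", "no"], ["yes", "no"])
```

A tree built this way prefers the candidate split with the highest gain, so here it would pick the first split over the second.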

After the material, a hands-on session let participants reinforce their understanding through practice. The day concluded with a captivating Q&A session and an engaging quiz.

Day 3 — Unsupervised Learning

The third day featured Siti Aminah, a lecturer at the Faculty of Computer Science and researcher at the University of Indonesia. Siti delivered unsupervised learning content, delving into topics like clustering, k-means clustering, hierarchical clustering, and hyperparameter tuning.

Unsupervised learning, a branch of machine learning, operates with data lacking clear labels or targets. Its core objective is to discern patterns or structures existing within data, devoid of label-based guidance. Siti emphasized that when data lacks labels, which can be expensive or challenging to assign, unsupervised learning offers a valuable solution.

Clustering is a central technique within unsupervised learning. Unlike supervised classification, where the labels are known, clustering must identify groups whose number and type are not known in advance. It groups data by revealing inherent patterns. Evaluating a clustering, Siti noted, is harder than evaluating a classification, because the outcome can be subjective and depends on the purpose of the clustering.

Partitional clustering divides a dataset into distinct, non-overlapping groups, while hierarchical clustering builds a tiered structure of groups and sub-groups. Among the partitional techniques, k-means clustering is prominent, but its limitations include sensitivity to centroid initialization and to outliers. Hierarchical clustering, on the other hand, suits data with hierarchical structures.
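As an illustration of the partitional approach, here is a minimal k-means sketch in plain Python. The points, the starting centroids, and the function names are hypothetical; the starting centroids are passed in explicitly so that readers can try different ones and observe the sensitivity to initialization for themselves.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def mean_point(points):
    """Coordinate-wise mean of a non-empty list of points."""
    return tuple(sum(coords) / len(points) for coords in zip(*points))

def kmeans(points, centroids, iterations=10):
    """Alternate between assigning points and moving centroids."""
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: squared_distance(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: each centroid moves to the mean of its cluster.
        # An empty cluster keeps its previous centroid.
        centroids = [mean_point(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    return centroids, clusters

# Hypothetical toy data: two visibly separate groups of 2-D points.
points = [(1, 1), (1, 2), (2, 1), (8, 8), (8, 9), (9, 8)]

# k is fixed by the number of starting centroids (here k = 2).
centroids, clusters = kmeans(points, centroids=[(0, 0), (10, 10)])
```

Because each centroid is a mean, a single far-away point can drag it noticeably, which is one way to see the outlier sensitivity mentioned above.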

Just like the prior day, the material presentation was followed by an engaging Q&A session and practical hands-on exercises.

Day 4 — Deep Learning

On the fourth day, the spotlight was on presenter Mikael Alvian Rizki, a seasoned software engineer and former teaching assistant in the Artificial Intelligence & Basic Data Science course. His session delved into deep learning, covering neural networks, activation functions, epochs and batches, and image classification, and concluding with a hands-on session using TensorFlow.

Deep learning handles more abstract data than the structured tabular data typically addressed by classical machine learning. Mikael compared neural networks to the functioning of our own brain, with neurons as the elemental building blocks. He also drew a parallel with the human eye’s perception process: where the eye receives visual input, a neural network takes in data and processes it through its layers. These layers are the input layer (input reception), the hidden layer (internal processing), and the output layer (prediction or classification outcome).
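The flow from input layer to hidden layer to output layer can be sketched as a forward pass in plain Python. The weights, biases, and input values below are made up for illustration, and the sigmoid activation is an assumption; this is not the TensorFlow code used in the session.

```python
import math

def sigmoid(z):
    """Squash a neuron's weighted sum into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def dense_layer(inputs, weights, biases):
    """One layer of neurons: each neuron sums its weighted inputs,
    adds its bias, and applies the activation function."""
    return [sigmoid(sum(w * x for w, x in zip(neuron_weights, inputs)) + bias)
            for neuron_weights, bias in zip(weights, biases)]

# Input layer: two hypothetical feature values.
inputs = [1.0, 2.0]

# Hidden layer: two neurons, each with one weight per input.
hidden_weights = [[0.5, -1.0], [1.0, 1.0]]
hidden_biases = [0.0, -1.0]
hidden = dense_layer(inputs, hidden_weights, hidden_biases)

# Output layer: one neuron that reads the two hidden activations.
output_weights = [[1.0, 0.5]]
output_biases = [0.0]
prediction = dense_layer(hidden, output_weights, output_biases)
```

With these numbers the first hidden neuron receives a negative sum and the second a positive one, so their activations fall on opposite sides of 0.5 before the output neuron combines them into a single value between 0 and 1. Training would adjust the weights and biases; here they are simply fixed.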

The activation function was the next topic. Mikael likened it to a switch, flicked off for negative results and on for positive ones. He then discussed epochs, which represent how many times a model passes over the training data. In deep learning, the number of epochs matters: with too few, the model fails to internalize the patterns in the data, and with too many, it risks overfitting.
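A small sketch can tie these ideas together. ReLU is one widely used activation that matches the switch description, though the article does not say which function Mikael showed, so treat that as an assumption. The training loop then shows how epochs relate to batches, the companion topic in the session's list; the data, learning rate, and one-weight model are hypothetical.

```python
def relu(z):
    """Switch-like activation: off (0) for negative input, passes positive input."""
    return max(0.0, z)

def batches(data, batch_size):
    """Yield the dataset in consecutive chunks of batch_size samples."""
    for start in range(0, len(data), batch_size):
        yield data[start:start + batch_size]

def train(data, epochs, batch_size, lr=0.01):
    """Fit y ~ w * x with mini-batch gradient descent on squared error."""
    w = 0.0
    updates = 0
    for _ in range(epochs):                      # one epoch = one full pass
        for batch in batches(data, batch_size):  # one update per batch
            grad = sum(2 * (w * x - y) * x for x, y in batch) / len(batch)
            w -= lr * grad
            updates += 1
    return w, updates

switched_off = relu(-2.0)  # negative input is blocked
switched_on = relu(3.0)    # positive input passes through unchanged

# Hypothetical data following y = 3x: 8 samples, so 2 batches of 4 per epoch.
data = [(float(x), 3.0 * x) for x in range(1, 9)]
w, updates = train(data, epochs=20, batch_size=4)
```

Twenty epochs over two batches each give 40 weight updates, and `w` ends up very close to 3. Real training code usually shuffles the data every epoch, which is omitted here to keep the run deterministic.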

Subsequently, Mikael explored convolutional neural networks, pooling layers, flattening, and the fully connected layer. These discussions highlighted the intricate world of deep learning and its various components.
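The four components listed above can be chained in a short plain-Python sketch. The 5x5 image, the edge-detecting kernel, and the final weights are hypothetical values chosen by hand, whereas a real network learns its kernels and weights during training; the session itself used TensorFlow rather than code like this.

```python
def convolve2d(image, kernel):
    """Slide the kernel over the image (no padding, stride 1) and take the
    weighted sum at each position, as deep learning frameworks do."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [[sum(image[r + i][c + j] * kernel[i][j]
                 for i in range(kh) for j in range(kw))
             for c in range(out_w)]
            for r in range(out_h)]

def max_pool2d(feature_map, size=2):
    """Keep only the largest value in each non-overlapping size x size window."""
    return [[max(feature_map[r + i][c + j]
                 for i in range(size) for j in range(size))
             for c in range(0, len(feature_map[0]) - size + 1, size)]
            for r in range(0, len(feature_map) - size + 1, size)]

def flatten(feature_map):
    """Turn the 2-D feature map into a flat list for the dense layer."""
    return [value for row in feature_map for value in row]

def fully_connected(inputs, weights, bias):
    """A single output neuron: weighted sum of every flattened value."""
    return sum(w * x for w, x in zip(weights, inputs)) + bias

# Hypothetical 5x5 image: dark on the left (0), bright on the right (1).
image = [[0, 0, 1, 1, 1] for _ in range(5)]

# Hand-picked kernel that responds where brightness rises from left to right.
kernel = [[-1, 1],
          [-1, 1]]

features = convolve2d(image, kernel)   # 4x4 map, each row is [0, 2, 0, 0]
pooled = max_pool2d(features)          # 2x2 map: [[2, 0], [2, 0]]
flat = flatten(pooled)                 # [2, 0, 2, 0]
score = fully_connected(flat, weights=[0.5, -1.0, 0.5, -1.0], bias=0.0)
```

The convolution marks where the vertical edge sits, pooling shrinks the map while keeping that signal, and flattening hands the result to the fully connected layer, which here produces a score of 2.0.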

After the material, the momentum carried into a hands-on session using TensorFlow, where Mikael put theory into practice and let participants engage directly with the concepts explained earlier. The final segment consisted of real-world case studies, which Mikael introduced personally along with instructions for tackling them. Once the committee had divided the participants into groups, they were given two hours to work on the case studies, and one group was then selected to present its findings.

After the conclusion of DSA Camp 2, an interview with one of the Data Science Academy groups, Infinity, consisting of Balqis, Adit, and Amar, shed light on their experience. Balqis remarked, “Data Science Academy COMPFEST is truly rewarding, incredibly engaging, and an entirely novel experience. Particularly during the case study session, where we collaborated with different teams.” Expressing hopes for the continued growth of the Data Science Academy program, they highlighted its significance due to limited data science learning resources in Indonesia. They saw the program as instrumental in democratizing data science knowledge for a broader audience.

DSA will continue with Camp 3, and more COMPFEST events are still to come. So, keep following our journey through our social media accounts, @compfest on Twitter, Instagram, Facebook, and LinkedIn, and our site, compfest.id. For more excitement from COMPFEST Academy, read the complete articles on our Medium page. (Editorial Marketing/Muhammad Iqbal)
