<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0" xmlns:cc="http://cyber.law.harvard.edu/rss/creativeCommonsRssModule.html">
    <channel>
        <title><![CDATA[Stories by Anil Yildiz on Medium]]></title>
        <description><![CDATA[Stories by Anil Yildiz on Medium]]></description>
        <link>https://medium.com/@anil.yildiz36?source=rss-785f76bebacc------2</link>
        <image>
            <url>https://cdn-images-1.medium.com/fit/c/150/150/0*Tzd1_v-GWzUU5e3e</url>
            <title>Stories by Anil Yildiz on Medium</title>
            <link>https://medium.com/@anil.yildiz36?source=rss-785f76bebacc------2</link>
        </image>
        <generator>Medium</generator>
        <lastBuildDate>Sat, 30 May 2026 07:54:58 GMT</lastBuildDate>
        <atom:link href="https://medium.com/@anil.yildiz36/feed" rel="self" type="application/rss+xml"/>
        <webMaster><![CDATA[yourfriends@medium.com]]></webMaster>
        <atom:link href="http://medium.superfeedr.com" rel="hub"/>
        <item>
            <title><![CDATA[Deep Learning with Tabnet]]></title>
            <link>https://medium.com/@anil.yildiz36/deep-learning-with-tabnet-bff55efd47aa?source=rss-785f76bebacc------2</link>
            <guid isPermaLink="false">https://medium.com/p/bff55efd47aa</guid>
            <category><![CDATA[tabnet]]></category>
            <category><![CDATA[deep-learning]]></category>
            <dc:creator><![CDATA[Anil Yildiz]]></dc:creator>
            <pubDate>Fri, 24 Nov 2023 05:34:10 GMT</pubDate>
            <atom:updated>2023-11-24T05:34:10.607Z</atom:updated>
            <content:encoded><![CDATA[<figure><img alt="" src="https://cdn-images-1.medium.com/max/605/0*W61-YPiogF17_s7f.png" /></figure><p>TabNet is a deep learning architecture specifically designed for tabular data, introduced in the <a href="https://arxiv.org/abs/1908.07442">paper</a> “TabNet: Attentive Interpretable Tabular Learning” by Arik and Pfister from Google Research.</p><p>TabNet takes raw tabular data without heavy preprocessing and is trained with gradient descent-based optimization. It uses sequential attention to select features at each decision step. This improves interpretability and learning efficiency, because the learning capacity is spent on the most useful features.</p><p>Because feature selection is performed instance by instance, it can differ for each row of the dataset. TabNet uses a single deep learning architecture for both feature selection and feature extraction, a technique referred to as soft feature selection. Based on these design choices, TabNet provides two types of interpretability: local interpretability, which visualizes the importance of features and shows how they are combined for a specific row; and global interpretability, which quantifies the contribution of each feature to the trained model across the entire dataset.</p><p>The advantages of TabNet are as follows:</p><ul><li>It lets you train a single multi-output model (for example, a multi-target regressor) instead of creating a separate model for each target.</li><li>It uses an attention mechanism to focus on the relevant features of a given data point and can visualize the result, showing which features receive attention for a given selection.
The number of selected features can vary depending on which features are being focused on.</li><li>It is trained with backpropagation, so the decisions and weights are learned end to end, which gives more control over training.</li><li>Standard deep learning techniques such as learning-rate reduction, fine-tuning, and custom loss functions can be applied to it.</li><li>TabNet automates feature selection, so you don’t need to handle it yourself.</li></ul><p>Its disadvantages are as follows:</p><ul><li>TabNet only performs well on tabular datasets. (That was its purpose anyway 😄)</li><li>TabNet, like other deep learning algorithms, is quite complex.</li><li>With the plain PyTorch training loop used in this article, Grid Search and Randomized Search cannot be applied directly when optimizing parameters. Therefore, hyperparameters (EPOCHS, BATCH_SIZE, LEARNING_RATE, etc.) are configured manually.</li><li>The layer and neuron counts used in the neural network architecture must be chosen carefully; otherwise the accuracy will be quite low.</li></ul><p><strong>TabNet Implementation</strong></p><p>The dataset contains packet information generated over the network when smartphone applications are in use. Our objective is to predict the type of application through multi-class classification using the packet information.</p><p>First, let’s import the necessary libraries.</p><pre>import numpy as np<br>import pandas as pd<br>import seaborn as sns<br>from tqdm.notebook import tqdm<br>import matplotlib.pyplot as plt</pre><pre>import torch<br>import torch.nn as nn<br>import torch.optim as optim<br>from torch.utils.data import Dataset, DataLoader, WeightedRandomSampler</pre><pre>from sklearn.preprocessing import MinMaxScaler<br>from sklearn.model_selection import train_test_split<br>from sklearn.metrics import confusion_matrix, classification_report</pre><pre>dataframe = pd.read_csv(&#39;dataset.csv&#39;)</pre><p>Let’s load and visualize the dataset. (Categories are assigned according to the type of application.
For example, 0: Social Media, 1: Multimedia, 3: Communication, 4: Navigation, 5: Game, etc.)</p><pre>df = pd.read_csv(&#39;dataset.csv&#39;)<br>rows = 100<br>dfhead = df.head(rows)<br>dftail = df.tail(rows)<br>dflast = pd.concat([dfhead, dftail])<br>dflast</pre><figure><img alt="" src="https://cdn-images-1.medium.com/max/605/0*Nogpx67h0U8wiplw.png" /></figure><p>We encode our target class.</p><pre>class2idx = {<br>    &#39;Entertainment&#39;:0,<br>    &#39;Social&#39;:1,<br>    &#39;Utility&#39;:2,<br>    &#39;Lifestyle&#39;:3,<br>    &#39;Productivity&#39;:4,<br>    &#39;Game&#39;:5<br>}<br>idx2class = {v: k for k, v in class2idx.items()}<br># Assign the result back instead of using inplace=True on a column selection<br>dataframe[&#39;appcategory&#39;] = dataframe[&#39;appcategory&#39;].replace(class2idx)</pre><p>Since appcategory is our target column, we separate it from the features and then split the data into Train (training), Val (validation) and Test sets.</p><pre>X = dataframe.drop(columns=[&#39;appcategory&#39;])<br>y = dataframe[&#39;appcategory&#39;]<br># 80% train+val / 20% test, then 10% of train+val is held out for validation<br>X_trainval, X_test, y_trainval, y_test = train_test_split(X, y, test_size=0.2, stratify=y, random_state=69)<br>X_train, X_val, y_train, y_val = train_test_split(X_trainval, y_trainval, test_size=0.1, stratify=y_trainval, random_state=21)</pre><p>We normalize the inputs.</p><pre>scaler = MinMaxScaler()<br># Fit the scaler on the training set only, then reuse it for val and test<br>X_train = scaler.fit_transform(X_train)<br>X_val = scaler.transform(X_val)<br>X_test = scaler.transform(X_test)<br>X_train, y_train = np.array(X_train), np.array(y_train)<br>X_val, y_val = np.array(X_val), np.array(y_val)<br>X_test, y_test = np.array(X_test), np.array(y_test)</pre><p>We visualize the class distribution in the Train and Val datasets.</p><pre>def get_class_distribution(obj):<br>    count_dict = {<br>        &quot;rating_ECOMMERCE&quot;: 0,<br>        &quot;rating_OTHER&quot;: 0,<br>        &quot;rating_MULTIMEDIA&quot;: 0,<br>        &quot;rating_SOCIAL&quot;: 0,<br>        &quot;rating_COMMUNICATION&quot;: 0,<br>        &quot;rating_NAVIGATION&quot;: 0,<br>        
&quot;rating_GAME&quot;: 0,<br>    }<br>    <br>    for i in obj:<br>        if i == 0: <br>            count_dict[&#39;rating_ECOMMERCE&#39;] += 1<br>        elif i == 1: <br>            count_dict[&#39;rating_OTHER&#39;] += 1<br>        elif i == 2: <br>            count_dict[&#39;rating_MULTIMEDIA&#39;] += 1<br>        elif i == 3: <br>            count_dict[&#39;rating_SOCIAL&#39;] += 1<br>        elif i == 4: <br>            count_dict[&#39;rating_COMMUNICATION&#39;] += 1  <br>        elif i == 5: <br>            count_dict[&#39;rating_NAVIGATION&#39;] += 1      <br>        elif i == 6: <br>            count_dict[&#39;rating_GAME&#39;] += 1<br>        else:<br>            print(&quot;Check classes.&quot;)<br>            <br>    return count_dict</pre><pre>fig, axes = plt.subplots(nrows=1, ncols=3, figsize=(25,7))</pre><pre># Train<br>sns.barplot(data = pd.DataFrame.from_dict([get_class_distribution(y_train)]).melt(), x = &quot;variable&quot;, y=&quot;value&quot;, hue=&quot;variable&quot;,  ax=axes[0]).set_title(&#39;Class Distribution in Train Set&#39;)<br># Validation<br>sns.barplot(data = pd.DataFrame.from_dict([get_class_distribution(y_val)]).melt(), x = &quot;variable&quot;, y=&quot;value&quot;, hue=&quot;variable&quot;,  ax=axes[1]).set_title(&#39;Class Distribution in Val Set&#39;)</pre><p>We create Train, Val and Test data sets.</p><pre>class ClassifierDataset(Dataset):<br>    <br>    def __init__(self, X_data, y_data):<br>        self.X_data = X_data<br>        self.y_data = y_data<br>        <br>    def __getitem__(self, index):<br>        return self.X_data[index], self.y_data[index]<br>        <br>    def __len__ (self):<br>        return len(self.X_data)</pre><pre>train_dataset = ClassifierDataset(torch.from_numpy(X_train).float(), torch.from_numpy(y_train).long())<br>val_dataset = ClassifierDataset(torch.from_numpy(X_val).float(), torch.from_numpy(y_val).long())<br>test_dataset = ClassifierDataset(torch.from_numpy(X_test).float(), 
torch.from_numpy(y_test).long())</pre><p>We implement the structure we will use as a sampler.</p><pre>target_list = []<br>for _, t in train_dataset:<br>    target_list.append(t)<br>    <br>target_list = torch.tensor(target_list)</pre><pre>class_count = [i for i in get_class_distribution(y_train).values()]<br>class_weights = 1./torch.tensor(class_count, dtype=torch.float)</pre><pre>class_weights_all = class_weights[target_list]</pre><pre>print(class_weights)</pre><pre>weighted_sampler = WeightedRandomSampler(<br>    weights=class_weights_all,<br>    num_samples=len(class_weights_all),<br>    replacement=True<br>)</pre><p>We assign the parameters of the algorithm. We mentioned that these parameters need to be assigned manually. At this stage, we can examine the parameters and what they do and decide on the most suitable parameters for our dataset. You can review it <a href="https://cloud.google.com/vertex-ai/docs/tabular-data/tabular-workflows/tabnet">here</a>.</p><pre>EPOCHS = 300<br>BATCH_SIZE = 32<br>LEARNING_RATE = 0.001<br>NUM_FEATURES = len(X.columns)<br>NUM_CLASSES = 7</pre><p>We define dataloader.</p><pre>train_loader = DataLoader(dataset=train_dataset,<br>                          batch_size=BATCH_SIZE,<br>                          sampler=weighted_sampler<br>)<br>val_loader = DataLoader(dataset=val_dataset, batch_size=1)<br>test_loader = DataLoader(dataset=test_dataset, batch_size=1)</pre><p>We design the neural network architecture. The neuron and layer values assigned at this stage are done manually, just like the parameters defined in the two previous steps. 
If the neuron and layer values are not assigned properly at this stage, the success rate will be quite low.</p><pre>class MulticlassClassification(nn.Module):<br>    def __init__(self, num_feature, num_class):<br>        super(MulticlassClassification, self).__init__()<br>        <br>        self.layer_1 = nn.Linear(num_feature, 512)<br>        self.layer_2 = nn.Linear(512, 256)<br>        self.layer_3 = nn.Linear(256, 128)<br>        self.layer_out = nn.Linear(128, num_class) <br>        <br>        self.relu = nn.ReLU()<br>        self.dropout = nn.Dropout(p=0.2)<br>        self.batchnorm1 = nn.BatchNorm1d(512)<br>        self.batchnorm2 = nn.BatchNorm1d(256)<br>        self.batchnorm3 = nn.BatchNorm1d(128)<br>        <br>    def forward(self, x):<br>        x = self.layer_1(x)<br>        x = self.batchnorm1(x)<br>        x = self.relu(x)<br>        <br>        x = self.layer_2(x)<br>        x = self.batchnorm2(x)<br>        x = self.relu(x)<br>        x = self.dropout(x)<br>        <br>        x = self.layer_3(x)<br>        x = self.batchnorm3(x)<br>        x = self.relu(x)<br>        x = self.dropout(x)<br>        <br>        x = self.layer_out(x)<br>        <br>        return x</pre><pre>device = torch.device(&quot;cuda:0&quot; if torch.cuda.is_available() else &quot;cpu&quot;)    <br>    <br>model = MulticlassClassification(num_feature = NUM_FEATURES, num_class=NUM_CLASSES)<br>model.to(device)</pre><pre>criterion = nn.CrossEntropyLoss(weight=class_weights.to(device))<br>optimizer = optim.Adam(model.parameters(), lr=LEARNING_RATE)</pre><p>As a result of this design, there will be a neural network architecture like this.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/580/0*05ENROiXiEJJXM4e.png" /></figure><p>We train and validate the model.</p><pre>def multi_acc(y_pred, y_test):<br>    y_pred_softmax = torch.log_softmax(y_pred, dim = 1)<br>    _, y_pred_tags = torch.max(y_pred_softmax, dim = 1)    <br>    <br>    correct_pred = (y_pred_tags == 
y_test).float()<br>    acc = correct_pred.sum() / len(correct_pred)<br>    <br>    acc = torch.round(acc * 100)<br>    <br>    return acc</pre><pre>accuracy_stats = {<br>    &#39;train&#39;: [],<br>    &quot;val&quot;: []<br>}<br>loss_stats = {<br>    &#39;train&#39;: [],<br>    &quot;val&quot;: []<br>}</pre><pre>print(&quot;Begin training.&quot;)</pre><pre>for e in tqdm(range(1, EPOCHS+1)):<br>    <br>    # TRAINING<br>    train_epoch_loss = 0<br>    train_epoch_acc = 0<br>    # Switch back to training mode; model.eval() below would otherwise<br>    # leave dropout and batch norm disabled for every later epoch<br>    model.train()<br>    <br>    for X_train_batch, y_train_batch in train_loader:<br>        X_train_batch, y_train_batch = X_train_batch.to(device), y_train_batch.to(device)<br>        optimizer.zero_grad()<br>        <br>        y_train_pred = model(X_train_batch)<br>        <br>        train_loss = criterion(y_train_pred, y_train_batch)<br>        train_acc = multi_acc(y_train_pred, y_train_batch)<br>        <br>        train_loss.backward()<br>        optimizer.step()<br>        <br>        train_epoch_loss += train_loss.item()<br>        train_epoch_acc += train_acc.item()<br>                <br>    # VALIDATION<br>    with torch.no_grad():<br>        <br>        val_epoch_loss = 0<br>        val_epoch_acc = 0<br>        <br>        model.eval()<br>        for X_val_batch, y_val_batch in val_loader:<br>            X_val_batch, y_val_batch = X_val_batch.to(device), y_val_batch.to(device)<br>            <br>            y_val_pred = model(X_val_batch)<br>                        <br>            val_loss = criterion(y_val_pred, y_val_batch)<br>            val_acc = multi_acc(y_val_pred, y_val_batch)<br>            <br>            val_epoch_loss += val_loss.item()<br>            val_epoch_acc += val_acc.item()<br>            <br>    # Store the per-epoch averages<br>    loss_stats[&#39;train&#39;].append(train_epoch_loss/len(train_loader))<br>    loss_stats[&#39;val&#39;].append(val_epoch_loss/len(val_loader))<br>    accuracy_stats[&#39;train&#39;].append(train_epoch_acc/len(train_loader))<br>
    accuracy_stats[&#39;val&#39;].append(val_epoch_acc/len(val_loader))<br>    <br>    print(f&#39;Epoch {e:03}: | Train Loss: {train_epoch_loss/len(train_loader):.5f} | Val Loss: {val_epoch_loss/len(val_loader):.5f} | Train Acc: {train_epoch_acc/len(train_loader):.3f}| Val Acc: {val_epoch_acc/len(val_loader):.3f}&#39;)</pre><p>We create dataframes and visualize the accuracy and loss per epoch.</p><pre># Create dataframes<br>train_val_acc_df = pd.DataFrame.from_dict(accuracy_stats).reset_index().melt(id_vars=[&#39;index&#39;]).rename(columns={&quot;index&quot;:&quot;epochs&quot;})<br>train_val_loss_df = pd.DataFrame.from_dict(loss_stats).reset_index().melt(id_vars=[&#39;index&#39;]).rename(columns={&quot;index&quot;:&quot;epochs&quot;})<br># Plot the dataframes<br>fig, axes = plt.subplots(nrows=1, ncols=2, figsize=(20,7))<br>sns.lineplot(data=train_val_acc_df, x = &quot;epochs&quot;, y=&quot;value&quot;, hue=&quot;variable&quot;,  ax=axes[0]).set_title(&#39;Train-Val Accuracy/Epoch&#39;)<br>sns.lineplot(data=train_val_loss_df, x = &quot;epochs&quot;, y=&quot;value&quot;, hue=&quot;variable&quot;, ax=axes[1]).set_title(&#39;Train-Val Loss/Epoch&#39;)</pre><p>We test the model and print the result.
(Finally 😄)</p><pre>y_pred_list = []<br>with torch.no_grad():<br>    model.eval()<br>    for X_batch, _ in test_loader:<br>        X_batch = X_batch.to(device)<br>        y_test_pred = model(X_batch)<br>        _, y_pred_tags = torch.max(y_test_pred, dim = 1)<br>        y_pred_list.append(y_pred_tags.cpu().numpy())<br>y_pred_list = [a.squeeze().tolist() for a in y_pred_list]<br># Class-based and average results<br>print(classification_report(y_test, y_pred_list))</pre><figure><img alt="" src="https://cdn-images-1.medium.com/max/605/0*loN8FuWCclKQ2L0q.png" /></figure><figure><img alt="" src="https://cdn-images-1.medium.com/max/605/0*_tsn0YWm8_c_r9p5.png" /></figure><p>Training ran for 300 epochs because the EPOCHS parameter was set to 300.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/605/0*PzlCxP--mXPyOrgv.png" /></figure><p>Class-based and average results.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/419/0*SpYj1o0BwQ8zoBV5.png" /></figure><p>Class distributions in the train, test and val datasets.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/605/0*xLeuLjO2KmtvUfAm.png" /></figure><p>Epoch-based plots of accuracy and loss.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/605/0*7eKbKrgomilzh96-.png" /></figure><p><strong>Sources</strong></p><p><a href="https://arxiv.org/abs/1908.07442">TabNet: Attentive Interpretable Tabular Learning</a></p><p><a href="https://cloud.google.com/blog/products/ai-machine-learning/ml-model-tabnet-is-easy-to-use-on-cloud-ai-platform">TabNet on AI Platform: High-performance, Explainable Tabular Learning</a></p><p><a href="https://cloud.google.com/vertex-ai/docs/tabular-data/tabular-workflows/tabnet">Tabular Workflow for TabNet</a></p><p><a href="https://github.com/google-research/google-research/tree/master/tabnet">Implementation Example (Google)</a></p><p><a href="https://www.kaggle.com/code/tanulsingh077/achieving-sota-results-with-tabnet/notebook">Implementation Example (Kaggle)</a></p>]]></content:encoded>
        </item>
        <item>
            <title><![CDATA[Light GBM Light and Powerful Gradient Boost Algorithm]]></title>
            <link>https://medium.com/@anil.yildiz36/light-gbm-light-and-powerful-gradient-boost-algorithm-d1898fa205d2?source=rss-785f76bebacc------2</link>
            <guid isPermaLink="false">https://medium.com/p/d1898fa205d2</guid>
            <category><![CDATA[machine-learning]]></category>
            <category><![CDATA[lightgbm]]></category>
            <dc:creator><![CDATA[Anil Yildiz]]></dc:creator>
            <pubDate>Fri, 24 Nov 2023 05:32:00 GMT</pubDate>
            <atom:updated>2023-12-05T05:38:58.906Z</atom:updated>
            <content:encoded><![CDATA[<figure><img alt="" src="https://cdn-images-1.medium.com/max/605/0*7WyN5d1RBMa3Ie-x.png" /></figure><p>LightGBM, developed by Microsoft, is a gradient boosting algorithm that has rapidly gained popularity. It is widely used on Kaggle, in part because it trains faster than many comparable models while remaining among those that achieve the best results. LightGBM is a type of Gradient Boosting Machine (GBM) built on tree-based learning algorithms.</p><p>LightGBM employs a leaf-wise, decision tree-based gradient boosting method that reduces memory usage while improving training efficiency. It adopts two techniques, Gradient-based One Side Sampling (GOSS) and Exclusive Feature Bundling (EFB), to overcome the limitations of the traditional histogram-based approach used in Gradient Boosting Decision Tree (GBDT) algorithms. Used together, GOSS and EFB shape the characteristics of LightGBM and give it its advantages over other GBDT algorithms.</p><p><strong>LightGBM Algorithm</strong></p><p>Let’s first examine the difference between the leaf-wise tree growth that LightGBM adopts and the level-wise growth used in conventional decision trees.</p><p>LightGBM adopts a leaf-wise growth strategy, as opposed to the traditional level-wise approach.
The fundamental difference lies in how the tree grows and how branches expand.</p><p><strong>Level-Wise</strong></p><figure><img alt="" src="https://cdn-images-1.medium.com/max/483/0*cdwHLQB0LeWg3VQp.png" /></figure><ul><li>The tree grows level by level. In other words, all nodes are expanded at each level, and the children of these nodes are created.</li><li>This approach typically leads to shallower but wider trees.</li><li>Because every node on a level is expanded, even those that yield little gain, the processing time can be prolonged.</li></ul><p><strong>Leaf-Wise</strong></p><figure><img alt="" src="https://cdn-images-1.medium.com/max/583/0*2Wtx-7vmtgBWggvh.png" /></figure><ul><li>The tree grows by splitting the leaf that provides the maximum gain at each expansion step. In other words, only one leaf is expanded at each step.</li><li>This usually results in deeper but narrower trees. Deeper trees can offer more flexibility to capture complex feature relationships.</li><li>It is generally faster compared to the level-wise approach.</li></ul><p>In summary, LightGBM’s leaf-wise strategy focuses on expanding the tree by adding leaves that provide the maximum gain, offering advantages such as increased depth for capturing intricate patterns and faster processing.</p><p><strong>Gradient-based One Side Sampling (GOSS)</strong></p><p>GOSS starts from the observation that different data examples contribute differently to information gain. Examples with larger gradients have a greater impact on information gain. GOSS keeps the examples with large gradients (e.g., those greater than a specific threshold or in the top percentiles) and randomly drops a portion of the examples with small gradients, while keeping the information gain estimate accurate.
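</p><p>The sampling step described above can be sketched in a few lines of NumPy. This is only an illustrative sketch, not LightGBM’s internal implementation: the function name goss_sample and its arguments are hypothetical names chosen here, and the re-weighting factor (1 - top_rate) / other_rate applied to the sampled small-gradient examples is taken from the description in the LightGBM paper, not from this article.</p><pre>import numpy as np


def goss_sample(gradients, top_rate=0.2, other_rate=0.1, seed=0):
    """Return (indices, weights) picked by a GOSS-style sampler (illustrative only)."""
    rng = np.random.default_rng(seed)
    n = len(gradients)
    n_top = int(top_rate * n)      # examples that are always kept
    n_other = int(other_rate * n)  # examples sampled from the remainder

    # Sort by absolute gradient, largest first
    order = np.argsort(-np.abs(gradients))
    top_idx = order[:n_top]
    rest = order[n_top:]

    # Uniformly sample (without replacement) among the small-gradient examples
    other_idx = rng.choice(rest, size=n_other, replace=False)

    # Up-weight the sampled small-gradient examples so that their total
    # contribution to the gain estimate stays roughly unchanged
    weights = np.ones(n_top + n_other)
    weights[n_top:] = (1.0 - top_rate) / other_rate

    return np.concatenate([top_idx, other_idx]), weights


# Hypothetical gradients for ten training examples
grads = np.array([0.9, -0.05, 0.02, -1.3, 0.1, 0.01, -0.03, 0.4, 0.07, -0.02])
idx, w = goss_sample(grads)
print(idx, w)
</pre><p>With top_rate=0.2 and other_rate=0.1, for example, the sketch keeps the 20% of examples with the largest absolute gradients, samples a further 10% of the data from the rest, and gives each sampled example a weight of 8.</p><p>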
This method yields more accurate gain estimates than uniform random sampling at the same sampling rate, especially when the information gain varies widely.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/700/0*BdY4FF7vKE4Vf08m.png" /></figure><p><strong>Exclusive Feature Bundling (EFB)</strong></p><p>EFB provides a nearly lossless way to represent high-dimensional sparse data with fewer features. Especially in a sparse feature space, many features are mutually exclusive, meaning they do not simultaneously take non-zero values. These features can be safely merged into a single feature (a bundle). As a result, the complexity of histogram construction decreases from O(#data × #feature) to O(#data × #bundle), where #bundle &lt; #feature. This speeds up the algorithm while maintaining accuracy.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/700/0*PqNXTRaGsmXrpQ1H.png" /></figure><p><strong>LightGBM Implementation</strong></p><p>The dataset contains packet information generated over the network when smartphone applications are in use. Our objective is to predict the type of application through multi-class classification using the packet information.</p><p>First, let’s import the necessary libraries.</p><pre>import pandas as pd<br>from sklearn.model_selection import train_test_split<br>import lightgbm as lgb<br>from sklearn.metrics import accuracy_score<br>from sklearn.model_selection import GridSearchCV</pre><p>Let’s load and visualize the dataset. (Categories are assigned according to the type of application.
For example, 0: Social Media, 1: Multimedia, 3: Communication, 4: Navigation, 5: Game, etc.)</p><pre>df = pd.read_csv(&#39;dataset.csv&#39;)<br>rows = 100<br>dfhead = df.head(rows)<br>dftail = df.tail(rows)<br>dflast = pd.concat([dfhead, dftail])<br>dflast</pre><figure><img alt="" src="https://cdn-images-1.medium.com/max/605/0*yfmlE2lBoKa31ZHb.png" /></figure><p>Load the data and split it into train and test sets (80%-20%).</p><pre>df = pd.read_csv(&#39;dataset.csv&#39;)<br>X = df.drop(columns=[&#39;appcategory&#39;])<br>y = df[&#39;appcategory&#39;]<br>X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)</pre><p>We create the LightGBM classifier, run hyperparameter optimization and start training. In this example, GridSearchCV is used for hyperparameter optimization. However, the optimal values were found beforehand, so each parameter list contains only that single value.</p><p>For hyperparameter optimization with GridSearchCV and LightGBM hyperparameters, you can check this <a href="https://www.datasnips.com/288/lightgbm-hyperparameter-tuning-with-gridsearch/">article</a>.</p><pre># Use a separate name so the lightgbm module (lgb) is not overwritten<br>model = lgb.LGBMClassifier()<br>parameters = {&#39;num_leaves&#39;:[100], &#39;min_child_samples&#39;:[15],&#39;max_depth&#39;:[20],<br>             &#39;learning_rate&#39;:[0.2],&#39;reg_alpha&#39;:[0.03]}<br>clf = GridSearchCV(model, parameters, cv=2)<br>clf.fit(X=X_train, y=y_train)</pre><p>Finally, we run the predictions and print the results.</p><pre>predictions = clf.predict(X_test)<br>score = accuracy_score(y_test, predictions)<br>a = pd.crosstab(y_test, predictions)<br>print(score)<br># Per category: largest cell of each crosstab row divided by the row total<br># (equals the class accuracy when the most frequent prediction is the correct class)<br>print(a.max(axis=1)/a.sum(axis=1))</pre><p>The model reaches an overall accuracy of 79%; the category-based success rates are shown below.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/149/0*VeNNxhS-83zsyjtO.png" /></figure><p><strong>Sources</strong></p><p><a href="https://medium.com/@ilyurek/light-gbm-a-powerful-gradient-boosting-algorithm-fe145a1cd8a6">Light GBM: A Powerful Gradient Boosting Algorithm</a></p><p><a 
href="https://nikolh92.medium.com/what-makes-lightgbm-sometimes-better-how-to-quickly-implement-it-3265e701e8d2">What is LightGBM (Light Gradient Boosting) + Example Python Code</a></p><p><a href="https://medium.com/@saradmishra28/light-gbm-difference-b-w-lgbm-xg-boost-theory-e41e957c2c63">Light-GBM &amp; difference b/w LGBM &amp; Xg-boost</a></p>]]></content:encoded>
        </item>
    </channel>
</rss>