Streamlit Application

Simplest Leaderboard Using Python

Creating a competition leaderboard app in a single Python Streamlit file

Hervind Philipe
Data Folks Indonesia


Kaggle leaderboard (SIIM-ISIC Melanoma)

I hope you stumbled upon this article because you want to run a “machine learning” competition within your group and need a leaderboard.

This article is for you!!!

I built a leaderboard app with Streamlit that is very easy to use and configure, all within a single Python file.

Of course, you already have other options:

You can host the competition on an online platform like Kaggle, but then you have to struggle with administrative things (a.k.a. forms) and wait for the platform manager to respond to your submission.

Or you can spend extra time building your own website, handling HTML, CSS, and JS plus Flask or Django (not to mention the cool FastAPI), with tons of lines of code.

😊 This app is for you if:

  1. You have no time to do either of the two things mentioned above.
  2. You want the leaderboard app up and running within 1 minute.
  3. You do not know Python (but if you are familiar with Python, that's a plus)

😔 This app is not quite suitable for you if:

  1. You want to run a very long competition, though you can still try it (I will explain later)

Why the app is simple, I even dare say the simplest

  1. Insert a name, upload a file, see the result
  2. The competition admin does not need to change the code
  3. It does not use a database; it just stores data in text files
  4. No password whatsoever, even for submitting results; just enter a username

The code is available on my GitHub

Let’s Get Started

We will discuss the code and how to run it later. First, let's see how the app works once it is already set up.

This is what the leaderboard looks like from the user's view

Leaderboard view

All the user needs to do is prepare a submission CSV file whose index and target column names exactly match the master data, then fill in the username, upload the file, and submit.

Surprisingly, there is no password or user registration. I just want to keep it simple. The drawback is that anyone can submit under someone else's name, but wouldn't that just benefit the other person?

The view above shows the app when the master data and competition configuration are already set

Now let's see how to set them

Init competition

When you first run the app and no master data is registered, this is what you will see

Initial View

What you need to do is g̶o̶ ̶t̶o̶ ̶t̶h̶e̶ ̶c̶o̶d̶e change the username to admin (this is the default, and it can of course be changed). A checkbox labeled Change Master Key then appears; just check it.

See this gif below:

Admin register master data

As admin, you can choose the competition type and metric type. So far I have only prepared binary classification, multiclass classification, and regression competitions, each with its commonly used metrics.

Then upload your file. For example, when I upload the Titanic dataset, two form fields appear for selecting the index column and the target column.

What Needs to be Highlighted

  • The app assumes the admin has already given users a submission template whose index and target column names, as well as total number of rows, match the master data
  • There is no password and no user registration
  • None of the data is stored in a database, only in text files, so problems might occur because of this

Where the data is stored

  • The master config (competition type and metric) and the master data frame (only the index and target columns) are stored inside the master folder as a JSON file and a CSV file, respectively
  • Each submission is stored in the submission folder
  • Submission scores are stored in the CSV file leaderboard.csv, one row per submission. What appears in the app is already aggregated; the displayed leaderboard itself is not stored
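To make the storage layout above more concrete, here is a minimal sketch using only the standard library. It is not the code from the repo: the file name config.json, the config keys, and the Score and Submission Time column names are my assumptions for illustration. Only the master folder, leaderboard.csv, and the Username column come from the description above.

```python
import csv
import json
import os
import tempfile
from datetime import datetime


def save_config(root, competition_type, metric):
    """Write the competition settings to <root>/master/config.json."""
    master_dir = os.path.join(root, "master")
    os.makedirs(master_dir, exist_ok=True)
    path = os.path.join(master_dir, "config.json")  # hypothetical file name
    with open(path, "w") as f:
        json.dump({"competition_type": competition_type, "metric": metric}, f)
    return path


def append_score(root, username, score):
    """Append one row per submission to <root>/leaderboard.csv."""
    path = os.path.join(root, "leaderboard.csv")
    is_new = not os.path.exists(path)
    with open(path, "a", newline="") as f:
        writer = csv.writer(f)
        if is_new:
            # Column names other than Username are assumptions
            writer.writerow(["Username", "Score", "Submission Time"])
        timestamp = datetime.now().isoformat(timespec="seconds")
        writer.writerow([username, score, timestamp])
    return path


# Demo in a throwaway directory so nothing real is overwritten
root = tempfile.mkdtemp()
config_path = save_config(root, "Binary Classification", "accuracy")
append_score(root, "billy", 0.81)
leaderboard_path = append_score(root, "billy", 0.84)
```

Because every submission only appends a line, nothing is ever rewritten. That keeps the file handling trivial, at the cost of having to aggregate the rows every time the leaderboard is displayed.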

What can be customized

I highlighted these places inside the code; just search for the comment #CHANGE HERE AS YOU WANT

  1. Admin name
    The default is admin, but that is too obvious. Change it to a hard-to-guess name inside the code so your users cannot change the master data.
  2. You can customize your metrics
    Just change the scorer variable to a valid scorer (see scikit-learn's make_scorer) and set the corresponding greater_is_better variable. For example, if the metric is accuracy then greater_is_better = True, while for mean squared error it would be greater_is_better = False. Common metrics from the scikit-learn package are already provided.
  3. All the code, obviously :D just in case you need it
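As a rough illustration of pairing a metric with its greater_is_better flag, here is a small sketch. Note that it calls plain scikit-learn metric functions directly instead of wrapping them with make_scorer, and the METRICS dictionary and evaluate helper are hypothetical names of mine, not the variables used in the repo.

```python
from sklearn.metrics import accuracy_score, mean_squared_error

# Hypothetical registry: metric name -> (metric function, greater_is_better)
METRICS = {
    "accuracy": (accuracy_score, True),             # higher accuracy is better
    "mean_squared_error": (mean_squared_error, False),  # lower error is better
}


def evaluate(metric_name, y_true, y_pred):
    """Return the score together with its sorting direction."""
    metric_func, greater_is_better = METRICS[metric_name]
    return metric_func(y_true, y_pred), greater_is_better


# 3 of 4 labels match
acc, acc_higher_wins = evaluate("accuracy", [1, 0, 1, 1], [1, 0, 0, 1])
# squared errors are 1 and 4
mse, mse_higher_wins = evaluate("mean_squared_error", [3.0, 5.0], [2.0, 7.0])
```

Adding your own metric then amounts to adding one more entry: any function that takes the true values and the predictions and returns a number, plus the direction in which it should be sorted.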

Set up the apps

Make sure you have Python 3

  1. Download or clone the repo
  2. Install the packages with pip install -r requirement.txt in the terminal inside the repo directory
    It's essentially just 3 packages: streamlit, pandas, and scikit-learn. You can skip this if you already have them.
  3. Run Streamlit with streamlit run leaderboard.py
  4. Go to http://localhost:8501

Code Tour

It's time to code!!!

Import Packages

I tried to keep it as minimal as possible; it uses only 6 packages: streamlit for the web app, pandas for working with data frames, json to handle the configuration file, os for file management, datetime to record submission times, and sklearn (scikit-learn) to calculate the score.

Title, Username, and Greetings

With just 1 line of code, you can set the title. The next lines create the username input, which defaults to “billy”. The last line simply greets the user by name.

Check Master Data and Config

First, we need to ensure the master folder contains a master data frame and a config, then check that the data is not empty.

Once both checks pass, load the master data and config, then show the config on the page so the users know the competition context.

In addition, the scorer named in the config is loaded, or you can define one yourself. The greater_is_better flag controls how the leaderboard is sorted, so set it according to your metric or scorer.
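The existence and emptiness checks can be sketched roughly like this. It is only an illustration under my own assumptions: the file names master.csv and config.json and the helper name master_is_ready are hypothetical, and the repo may do the check differently.

```python
import json
import os
import tempfile


def master_is_ready(master_dir, data_name="master.csv", config_name="config.json"):
    """Return True when both master files exist and are non-empty."""
    for name in (data_name, config_name):
        path = os.path.join(master_dir, name)
        if not os.path.isfile(path) or os.path.getsize(path) == 0:
            return False
    return True


# Demo: an empty folder is not ready; after writing both files it is
master_dir = tempfile.mkdtemp()
ready_before = master_is_ready(master_dir)

with open(os.path.join(master_dir, "master.csv"), "w") as f:
    f.write("PassengerId,Survived\n1,0\n2,1\n")
with open(os.path.join(master_dir, "config.json"), "w") as f:
    json.dump({"competition_type": "Binary Classification", "metric": "accuracy"}, f)

ready_after = master_is_ready(master_dir)
```

When the check fails, the app can simply skip the submission and leaderboard sections and show the initial view instead.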

Upload Submission

Only CSV files are allowed, since that is the common format for competitions, so I suggest you give users a submission template with the same column names and total rows.

Once the user uploads a file and presses the submit button, the file is saved into the submission folder. The submission data frame is then joined with the master data, the score is calculated, and a row is appended to leaderboard.csv to store it.
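Here is a hedged sketch of the join-then-score step with pandas. The function name score_submission, the _true/_pred suffixes, and the missing-row check are my own choices for illustration; the Titanic-style column names are just sample data.

```python
import pandas as pd
from sklearn.metrics import accuracy_score


def score_submission(df_master, df_submission, index_col, target_col, metric_func):
    """Align a submission with the master data on the index column, then score it."""
    merged = df_master.merge(
        df_submission, on=index_col, how="left", suffixes=("_true", "_pred")
    )
    # A left join leaves NaN wherever the submission has no matching row
    if merged[f"{target_col}_pred"].isna().any():
        raise ValueError("submission is missing some rows of the master data")
    return metric_func(merged[f"{target_col}_true"], merged[f"{target_col}_pred"])


df_master = pd.DataFrame({"PassengerId": [1, 2, 3, 4], "Survived": [0, 1, 1, 0]})
# Rows are deliberately shuffled: the join matches on PassengerId, not position
df_submission = pd.DataFrame({"PassengerId": [4, 3, 2, 1], "Survived": [0, 1, 0, 0]})

score = score_submission(
    df_master, df_submission, "PassengerId", "Survived", accuracy_score
)
```

Joining on the index column rather than comparing row by row means a user can submit the rows in any order and still get the right score.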

Showing Leaderboard

First, check that the leaderboard is not empty. Then aggregate on the Username column to get each user's best score (according to the greater_is_better parameter), number of submissions, and last submission time.

Just show the pandas data frame using st.write(df_leaderboard)
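The aggregation could look roughly like the sketch below. Only the Username column and the greater_is_better flag come from the description above; the other column names (Score, Submission Time, Entries, Last), the sample rows, and the function name are assumptions.

```python
import pandas as pd


def aggregate_leaderboard(df_scores, greater_is_better=True):
    """Collapse one-row-per-submission scores into one row per user."""
    best = "max" if greater_is_better else "min"
    return (
        df_scores.groupby("Username")
        .agg(
            Score=("Score", best),             # best score per user
            Entries=("Score", "count"),        # number of submissions
            Last=("Submission Time", "max"),   # most recent submission
        )
        .reset_index()
        .sort_values("Score", ascending=not greater_is_better)
        .reset_index(drop=True)
    )


df_scores = pd.DataFrame(
    {
        "Username": ["billy", "anna", "billy"],
        "Score": [0.81, 0.90, 0.84],
        "Submission Time": [
            "2024-01-01 09:00:00",
            "2024-01-02 09:00:00",
            "2024-01-03 09:00:00",
        ],
    }
)

df_leaderboard = aggregate_leaderboard(df_scores, greater_is_better=True)
```

The resulting df_leaderboard is an ordinary data frame, so it can be passed straight to st.write.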

Admin Access

This is activated when the input username is admin (you can change it). You can then choose the competition type and metric, adding more as needed. Next, upload the master data in CSV format and choose the required columns, namely the index and target columns. When the CHANGE button is pressed, the code checks that the index column is unique, then saves the master data frame and config.
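The uniqueness check before saving can be sketched with pandas as follows. The helper name validate_master and the sample frames are hypothetical; the only parts taken from the description are that the index column must be unique and that only the index and target columns are kept.

```python
import pandas as pd


def validate_master(df, index_col, target_col):
    """Reject master data with duplicate index values; keep only the two needed columns."""
    if not df[index_col].is_unique:
        raise ValueError(f"index column '{index_col}' contains duplicate values")
    return df[[index_col, target_col]]


df_ok = pd.DataFrame(
    {"PassengerId": [1, 2, 3], "Survived": [0, 1, 1], "Age": [22, 38, 26]}
)
df_dup = pd.DataFrame({"PassengerId": [1, 1, 2], "Survived": [0, 1, 1]})

df_master = validate_master(df_ok, "PassengerId", "Survived")

try:
    validate_master(df_dup, "PassengerId", "Survived")
    duplicate_rejected = False
except ValueError:
    duplicate_rejected = True
```

A unique index matters because submissions are matched to the master data by that column; duplicates would make the match ambiguous.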

Closing

I am very grateful that you read this article to the end; the best outcome would be for this app to be helpful. I am still developing it and adding features while keeping it as simple as possible to use. All feedback is welcome.

If you like this article, please give this article clap(s)

CHEERS!!!
